ML-Agents 1.0 - A Definite Guide

by Gilbert Tanner on Aug 03, 2020 · 6 min read


Unity has just released version 1.0 of its machine learning framework, ML-Agents. This guide gives you an overview of ML-Agents and shows how it can be used to train agents with reinforcement learning.

For more information, check out the excellent ML-Agents documentation.

What is Unity ML-Agents?

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents. Agents can be trained using reinforcement learning, imitation learning, neuroevolution, or other machine learning methods through a simple-to-use Python API. - ML-Agents README


  • 15+ example Unity environments
  • Support for multiple environment configurations and training scenarios
  • Flexible Unity SDK that can be integrated into your game or custom Unity scene
  • Training using two deep reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC)
  • Built-in support for Imitation Learning through Behavioral Cloning or Generative Adversarial Imitation Learning
  • Self-play mechanism for training agents in adversarial scenarios
  • Easily definable Curriculum Learning scenarios for complex tasks
  • Train robust agents using environment randomization
  • Flexible agent control with On Demand Decision Making
  • Train using multiple concurrent Unity environment instances
  • Utilizes the Unity Inference Engine to provide native cross-platform support
  • Unity environment control from Python
  • Wrap Unity learning environments as a gym

Key Components

ML-Agents contains five high-level components:

  • Learning Environment - contains the Unity scene and all the game characters. The Unity scene provides the environment in which agents observe, act, and learn. ML-Agents includes a Unity SDK that enables you to transform any Unity scene into a learning environment.
  • Python Low-Level API - Python interface for interacting with and manipulating a learning environment. This API is available as a dedicated Python package and is used to communicate with and control the Academy during training.
  • External Communicator - Connects the Learning Environment with the Python Low-Level API.
  • Python Trainers - Contains all the machine learning algorithms. Available as a Python package called mlagents that exposes a command-line utility called mlagents-learn.
  • Gym Wrapper - OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. Since its release, Gym has become very popular among reinforcement learning engineers. ML-Agents provides a gym wrapper in the gym-unity Python package.
Figure 1: Simplified block diagram of ML-Agents

Getting started

In this section, I will show you how to install ML-Agents, open an example environment, and train an agent in it.

For this guide, we'll use the 3D Balance Ball environment, which contains a number of agent cubes and balls (which are all copies of each other). Each agent cube tries to keep its ball from falling by rotating either horizontally or vertically.

Figure 2: 3D Balance Ball Environment

Installation and Setup

To work with ML-Agents, you need Unity 2018.4 or later installed. If you haven't installed Unity yet, you can download it from here.

Next, create a new project and install the ML-Agents package using the Unity package manager.

Figure 3: Install ML-Agents
Note: If the package doesn't show up, enable "Show preview packages" under Advanced, next to the search bar.

The example environments, unfortunately, can't be installed through the package manager, so you have to download the ML-Agents repository, decompress it, and copy the ML-Agents folder found under Project/Assets into the Assets folder of your project. You can then find the 3DBall scene under Examples/3DBall/Scenes.

Understanding a Unity Environment

In the context of Unity, an environment is a scene containing one or more agents and other GameObjects that an agent interacts with.

Figure 4: 3D Balance Ball Scene


The agent is the actor that observes the environment and takes actions accordingly. In the case of the 3D Balance Ball environment, the agent is the cube that tries to balance the ball.

Figure 5: 3D Balance Ball Agent

In Unity, every agent must have a behavior. The behavior determines how an agent makes decisions.

There are three types of behavior:

  • Heuristic: This is a hard-coded behavior that is used to test the environment. It can also be used to let the programmer play the game instead of the AI.
  • Training: This mode is active while the agent is being trained. With the default behavior type, pressing the Play button while the Python training script is running starts training; otherwise, the agent performs inference.
  • Inference: Inference is when the learned model is used to make decisions.
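
The way the three behavior types above interact can be sketched in a few lines of Python. This is only an illustration of the decision logic, not ML-Agents code: the names BehaviorType and resolve_behavior are hypothetical, and the final fallback to the heuristic when no model is assigned is an assumption that goes beyond what this article states.

```python
from enum import Enum


class BehaviorType(Enum):
    """Hypothetical mirror of the Behavior Type setting (illustration only)."""
    DEFAULT = "default"
    HEURISTIC_ONLY = "heuristic only"
    INFERENCE_ONLY = "inference only"


def resolve_behavior(behavior_type, trainer_connected, has_model):
    """Return the mode an agent would run in: 'training', 'inference' or 'heuristic'."""
    if behavior_type is BehaviorType.HEURISTIC_ONLY:
        return "heuristic"
    if behavior_type is BehaviorType.INFERENCE_ONLY:
        return "inference"
    # Default: train when the Python training script is listening,
    # otherwise use the assigned model for inference.
    if trainer_connected:
        return "training"
    if has_model:
        return "inference"
    # Assumption: with neither a trainer nor a model, fall back to the heuristic.
    return "heuristic"


print(resolve_behavior(BehaviorType.DEFAULT, trainer_connected=True, has_model=True))
print(resolve_behavior(BehaviorType.DEFAULT, trainer_connected=False, has_model=True))
```

The first call prints training and the second prints inference, matching the description of the default mode given above.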

Running a pre-trained model

Every ML-Agents example environment comes with a pre-trained model, which can be found inside the TFModels folder. If you look at the Behavior Parameters script attached to the Agent GameObject, you can see that the model inside the TFModels folder is assigned to the Model property.

Figure 6: Behavior Parameters Script

If the Behavior Type is set to Default or Inference, the model should control the agent when you click Play.

Figure 7: 3D Balance Ball pre-trained model

Training a new model with Reinforcement Learning

Now that you know how to run a pre-trained model, it's time to train a model of your own. Training in the ML-Agents Toolkit is powered by a dedicated Python package, mlagents.

The package can be installed using pip:

pip install mlagents

This package exposes a command mlagents-learn that is the single entry point for all training workflows.

To view a description of all the accepted arguments of mlagents-learn, use the --help argument.

mlagents-learn --help
usage: mlagents-learn.exe [-h] [--env ENV_PATH]
                          [--curriculum CURRICULUM_CONFIG_PATH]
                          [--lesson LESSON] [--sampler SAMPLER_FILE_PATH]
                          [--keep-checkpoints KEEP_CHECKPOINTS] [--resume]
                          [--force] [--run-id RUN_ID]
                          [--initialize-from RUN_ID] [--save-freq SAVE_FREQ]
                          [--seed SEED] [--inference] [--base-port BASE_PORT]
                          [--num-envs NUM_ENVS] [--no-graphics] [--debug]
                          [--env-args ...] [--cpu] [--version] [--width WIDTH]
                          [--height HEIGHT] [--quality-level QUALITY_LEVEL]
                          [--time-scale TIME_SCALE]
                          [--target-frame-rate TARGET_FRAME_RATE]
                          [--capture-frame-rate CAPTURE_FRAME_RATE]

To train a model, we need to specify at least the trainer-config-file and the run-id.

  • The trainer-config-file is the path to the trainer configuration YAML file, which contains all the hyperparameters. For an in-depth look at the hyperparameters, check out the training configurations section of the training readme.
  • The run-id is a unique name used to identify the training run.

ML-Agents provides a configuration file for all the example environments, which is located in the config directory of the repo you downloaded earlier.

To train the model, navigate into the downloaded repository and execute:

mlagents-learn config/trainer_config.yaml --run-id=3DBall1
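
If you launch training runs from a script, the same invocation can be assembled in Python. The sketch below only builds the argument list and prints it; the helper build_learn_command is a hypothetical name introduced here, and the optional flags it adds (--resume and --seed) are taken from the help output shown earlier.

```python
import shlex


def build_learn_command(config_path, run_id, resume=False, seed=None):
    """Assemble the mlagents-learn argument list without running it."""
    command = ["mlagents-learn", config_path, f"--run-id={run_id}"]
    if resume:
        # Continue a previous run that used the same run-id.
        command.append("--resume")
    if seed is not None:
        # Fix the random seed for the run.
        command.append(f"--seed={seed}")
    return command


command = build_learn_command("config/trainer_config.yaml", "3DBall1")
# Quote each part so the printed line can be pasted into a shell.
print(" ".join(shlex.quote(part) for part in command))
```

This prints the command used above. To actually start training, the list could be handed to subprocess.run(command) from inside the downloaded repository.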

If everything went as expected, you should see "Listening on port 5004. Start training by pressing the Play button in the Unity Editor." on the command line. Press the Play button as described in the message, and training should start.

Figure 8: ML-Agents training model

You can monitor the training progress by launching TensorBoard:

tensorboard --logdir=summaries
Figure 9: Training progress

Unity ML-Agents Gym Wrapper

OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. Since its release, Gym has become very popular among reinforcement learning engineers. ML-Agents provides a gym wrapper in the gym-unity Python package.


The gym wrapper can be installed using:

pip3 install gym_unity

Using the Gym Wrapper

The gym interface is available from gym_unity.envs. To start an environment use:

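from gym_unity.envs import UnityToGymWrapper

# unity_environment: the Unity environment instance to wrap
# uint8_visual: return visual observations as uint8 (0-255) instead of floats (0.0-1.0)
# allow_multiple_obs: return a list of all observations instead of a single array
env = UnityToGymWrapper(unity_environment, uint8_visual, allow_multiple_obs)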
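
Once wrapped, the environment is driven with the classic Gym reset/step loop. The sketch below shows that loop under the assumption that step returns the four values (observation, reward, done, info). So that it runs without Unity, it uses StandInEnv, a hypothetical stand-in class that is not part of gym_unity; the reward of 1.0 per step is made up for illustration.

```python
import random


class StandInEnv:
    """Hypothetical stand-in with a Gym-style interface (NOT part of gym_unity)."""

    def __init__(self, episode_length=5):
        self.episode_length = episode_length
        self.steps = 0

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.steps = 0
        return [0.0, 0.0]

    def step(self, action):
        """Advance one step and return (observation, reward, done, info)."""
        self.steps += 1
        observation = [float(self.steps), float(action)]
        reward = 1.0  # made-up reward, chosen only for illustration
        done = self.steps >= self.episode_length
        return observation, reward, done, {}


def run_episode(env, policy):
    """Run one episode with the given policy and return the total reward."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


def random_policy(observation):
    """Ignore the observation and pick a random action in [-1, 1]."""
    return random.uniform(-1.0, 1.0)


print(run_episode(StandInEnv(episode_length=5), random_policy))
```

With the real wrapper, you would pass env from the snippet above to run_episode instead of StandInEnv, and sample actions from env.action_space rather than using the hand-written random policy.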

For more information, check out the official documentation.


ML-Agents is an open-source project that enables games and simulations to serve as environments for training intelligent agents. In this article, you learned the basics of ML-Agents. If you're interested in creating a custom environment for your own game, check out 'Making a New Learning Environment'.

That's it for this article. If you have any questions or just want to chat with me, feel free to leave a comment below or contact me on social media. If you want to get continuous updates about my blog, make sure to join my newsletter.