The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents. Agents can be trained using reinforcement learning, imitation learning, neuroevolution, or other machine learning methods through a simple-to-use Python API. - ML-Agents README
This guide gives you an overview of ML-Agents and shows how it can be used to train agents with reinforcement learning. For more information about ML-Agents, check out the excellent documentation.
- 18+ example Unity environments
- Support for multiple environment configurations and training scenarios
- Flexible Unity SDK that can be integrated into your game or custom Unity scene
- Support for training single-agent, multi-agent cooperative, and multi-agent competitive scenarios via several Deep Reinforcement Learning algorithms (PPO, SAC, MA-POCA, self-play).
- Support for learning from demonstrations through two Imitation Learning algorithms (BC and GAIL).
- Easily definable Curriculum Learning scenarios for complex tasks
- Train robust agents using environment randomization
- Flexible agent control with On-Demand Decision Making
- Train using multiple concurrent Unity environment instances
- Utilizes the Unity Inference Engine to provide native cross-platform support
- Unity environment control from Python
- Wrap Unity learning environments as a gym
ML-Agents contains five high-level components:
- Learning Environment - contains the Unity scene and all the game characters. The Unity scene provides the environment where agents observe, act, and learn. In addition, ML-Agents includes a Unity SDK that enables you to transform any Unity scene into a learning environment.
- Python Low-Level API - Python interface for interacting with and manipulating a learning environment. This API is available as a dedicated Python package and is used to communicate with and control the Academy during training.
- External Communicator - Connects the Learning Environment with the Python Low-Level API.
- Python Trainers - Contains all the machine learning algorithms. Available as a Python package called mlagents that exposes a command-line utility called mlagents-learn.
- Gym Wrapper - OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. Since its release, Gym has become popular with reinforcement learning engineers. ML-Agents provides a Gym wrapper in the gym-unity Python package.
This section will show you how to install ML-Agents, open an example environment, and train an agent in it.
For this guide, we'll use the 3D Balance Ball environment, which contains several agent cubes and balls (which are all copies of each other). Each agent cube tries to keep its ball from falling by rotating either horizontally or vertically.
Installation and Setup
To work with ML-Agents, you need Unity 2019.4 or later installed. If you haven't installed Unity yet, you can download it from here.
Next, create a new project and install the ML-Agents package using the Unity package manager.
Unfortunately, the example environments can't be installed through the package manager, so you have to download the ML-Agents repository, decompress it, and copy the ML-Agents folder found under Project/Assets into the Assets folder of your project. You can then find the 3DBall scene under Examples/3DBall/Scenes.
Understanding a Unity Environment
In the context of Unity, an environment is a scene containing one or more agents and other GameObjects that an agent interacts with.
The agent is the actor that observes the environment and takes action accordingly. In the case of the 3D Balance Ball environment, the agent is the cube that tries to balance the ball.
In Unity, every agent must have a behavior. The behavior determines how an Agent makes decisions.
There are three types of behavior:
- Heuristic: This is a hard-coded behavior used to test the environment. It can also be used to let the programmer play the game instead of the AI.
- Training: This mode is active while the AI is being trained. By default, if the Python training script is running, training starts when you press the play button. Otherwise, the agent performs inference.
- Inference: Inference is when the learned model is used to make decisions.
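The relationship between the three behavior types can be pictured as a simple dispatch. The sketch below is plain Python, not the Unity C# SDK: `BehaviorType`, `choose_action`, and the policy callables are hypothetical names invented for illustration. It assumes the default described above (train when a Python trainer is connected, otherwise run inference) and, as a further assumption, falls back to the heuristic when neither a trainer nor a model is available.

```python
from enum import Enum
from typing import Callable, Optional, Sequence

Observation = Sequence[float]
Action = Sequence[float]
Policy = Callable[[Observation], Action]


class BehaviorType(Enum):
    """Hypothetical stand-in for the behavior setting on an agent."""
    DEFAULT = "default"
    HEURISTIC = "heuristic"
    INFERENCE = "inference"


def choose_action(
    behavior: BehaviorType,
    observation: Observation,
    heuristic: Policy,
    model: Optional[Policy] = None,
    trainer: Optional[Policy] = None,
) -> Action:
    """Pick the decision source for one step."""
    if behavior is BehaviorType.HEURISTIC:
        # Hard-coded (or player-driven) behavior, useful for testing.
        return heuristic(observation)
    if behavior is BehaviorType.DEFAULT and trainer is not None:
        # Training: the connected Python trainer supplies the action.
        return trainer(observation)
    if model is not None:
        # Inference: the learned model makes the decision.
        return model(observation)
    # Assumption: with no trainer and no model, use the heuristic.
    return heuristic(observation)


# Toy policies that return fixed actions so the chosen source is visible.
heuristic_policy = lambda obs: [0.0]
model_policy = lambda obs: [1.0]
print(choose_action(BehaviorType.DEFAULT, [0.2, -0.1], heuristic_policy, model_policy))
```

With no trainer connected, the call above is answered by the model, which matches the "otherwise inference" rule from the list.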
Running a pre-trained model
Every ML-Agents example environment comes with a pre-trained model, which can be found inside the TFModels folder. If you look at the Behavior Parameters script attached to the Agent GameObject, you can see that the model inside the TFModels folder is connected to the Model property.
If the Behavior Type is set to Default or Inference, the model controls the agent when you click play.
Training a new model with Reinforcement Learning
Now that you know how to run a pre-trained model, it's time to train your own. Training in the ML-Agents Toolkit is powered by a dedicated Python package, mlagents.
The package can be installed using pip:
pip install mlagents
This package exposes a command mlagents-learn that is the single entry point for all training workflows.
To view a description of all the accepted arguments of mlagents-learn, use the --help argument.
usage: mlagents-learn.exe [-h] [--env ENV_PATH] [--curriculum CURRICULUM_CONFIG_PATH] [--lesson LESSON] [--sampler SAMPLER_FILE_PATH] [--keep-checkpoints KEEP_CHECKPOINTS] [--resume] [--force] [--run-id RUN_ID] [--initialize-from RUN_ID] [--save-freq SAVE_FREQ] [--seed SEED] [--inference] [--base-port BASE_PORT] [--num-envs NUM_ENVS] [--no-graphics] [--debug] [--env-args ...] [--cpu] [--version] [--width WIDTH] [--height HEIGHT] [--quality-level QUALITY_LEVEL] [--time-scale TIME_SCALE] [--target-frame-rate TARGET_FRAME_RATE] [--capture-frame-rate CAPTURE_FRAME_RATE] trainer_config_path ...
To train a model, you need to specify at least the trainer-config-file (shown as trainer_config_path in the usage output above) and the run-id.
- The trainer-config-file specifies the file path to the trainer configuration YAML file. It contains all the hyperparameters. For an in-depth look at all the hyperparameters, check out the training configurations section of the training readme.
- The run-id is a unique name used to identify the training run.
ML-Agents provides a configuration file for all the example environments, located in the config directory of the repo you downloaded earlier.
To train the model, navigate into the downloaded repository and execute:
mlagents-learn config/trainer_config.yaml --run-id=3DBall1
If everything went as expected, you should see "Listening on port 5004. Start training by pressing the Play button in the Unity Editor." inside the command line. Press the play button as described in the message, and training should start.
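If you launch training from a script rather than typing the command by hand, it can help to assemble the argument list in one place. The sketch below is a hypothetical helper (`build_learn_command` is not part of ML-Agents). It only uses the positional config path and the `--run-id`, `--env`, and `--resume` options listed in the usage output above, and it prints the command instead of running it.

```python
import shlex
from typing import List, Optional


def build_learn_command(
    trainer_config_path: str,
    run_id: str,
    env_path: Optional[str] = None,
    resume: bool = False,
) -> List[str]:
    """Assemble an mlagents-learn invocation as an argument list."""
    command = ["mlagents-learn", trainer_config_path, f"--run-id={run_id}"]
    if env_path is not None:
        # Optional path to a built Unity executable (the --env option).
        command.append(f"--env={env_path}")
    if resume:
        # Continue a previous run that used the same run-id.
        command.append("--resume")
    return command


command = build_learn_command("config/trainer_config.yaml", "3DBall1")
print(shlex.join(command))
# prints: mlagents-learn config/trainer_config.yaml --run-id=3DBall1
# To actually start training, pass the list to subprocess.run(command, check=True).
```

Keeping the command as a list avoids shell-quoting problems when a path contains spaces.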
You can monitor the training progress by launching TensorBoard.
Unity ML-Agents Gym Wrapper
OpenAI Gym is a popular toolkit for developing and comparing reinforcement learning algorithms. The gym-unity Python package wraps a Unity learning environment so that it can be used through the Gym interface.
The gym wrapper can be installed using:
pip3 install gym_unity
Using the Gym Wrapper
The gym interface is available from gym_unity.envs. To start an environment use:
from gym_unity.envs import UnityToGymWrapper
env = UnityToGymWrapper(unity_env, uint8_visual, flatten_branched, allow_multiple_obs)
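A wrapped environment is driven through the usual Gym loop of `reset` and `step`. The sketch below shows that loop in isolation. `StubBalanceEnv` is a hypothetical stand-in with made-up observations and rewards so the example runs without Unity, and `run_episode` is an illustrative helper. It assumes the classic Gym convention in which `step` returns `(observation, reward, done, info)`; under that assumption the same function should work when given a `UnityToGymWrapper` instance instead of the stub.

```python
import random
from typing import Callable, List, Tuple

Observation = List[float]


class StubBalanceEnv:
    """Hypothetical Gym-style environment; not part of ML-Agents."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self._steps = 0

    def reset(self) -> Observation:
        # Start a new episode and return the first observation.
        self._steps = 0
        return [0.0, 0.0]

    def step(self, action: List[float]) -> Tuple[Observation, float, bool, dict]:
        self._steps += 1
        observation = [float(self._steps), action[0]]
        reward = 1.0  # made-up reward for each step survived
        done = self._steps >= self.max_steps
        return observation, reward, done, {}


def run_episode(env, policy: Callable[[Observation], List[float]]) -> float:
    """Run one episode and return the summed reward."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        observation, reward, done, _info = env.step(policy(observation))
        total_reward += reward
    return total_reward


# A random policy producing one continuous action value in [-1, 1].
random_policy = lambda obs: [random.uniform(-1.0, 1.0)]
print(run_episode(StubBalanceEnv(max_steps=5), random_policy))  # prints 5.0
```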
For more information, check out the official documentation.
ML-Agents is an open-source project that enables games and simulations to serve as environments for training intelligent agents. In this article, you learned the basics of ML-Agents: how to set it up, run a pre-trained model, train a new one, and use the Gym wrapper. If you're interested in creating a custom environment, check out 'Making a New Learning Environment'.