RL Baselines3 Zoo is a training framework for Reinforcement Learning (RL), using Stable Baselines3.
It provides scripts for training, evaluating agents, tuning hyperparameters, plotting results and recording videos.
In addition, it includes a collection of tuned hyperparameters for common environments and RL algorithms, and agents trained with those settings.
We are looking for contributors to complete the collection!
Goals of this repository: provide a simple interface to train and enjoy RL agents, benchmark the different RL algorithms, and provide tuned hyperparameters for each environment and RL algorithm.
This is the SB3 version of the original SB2 rl-zoo.
Note: although SB3 and the RL Zoo are compatible with Numpy>=2.0, you will need Numpy<2 to run agents on pybullet envs (see issue).
Documentation is available online: https://rl-baselines3-zoo.readthedocs.io/
As a python package:

```shell
pip install rl_zoo3
```

Note: you can do `python -m rl_zoo3.train` from any folder and you have access to the `rl_zoo3` command line interface; for instance, `rl_zoo3 train` is equivalent to `python train.py`.

From source:

```shell
apt-get install swig cmake ffmpeg
pip install -r requirements.txt
pip install -e .[plots,tests]
```
Please see the Stable-Baselines3 documentation for alternative ways to install Stable-Baselines3.
The hyperparameters for each environment are defined in `hyperparameters/algo_name.yml`.

If the environment exists in this file, then you can train an agent using:

```shell
python train.py --algo algo_name --env env_id
```
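For illustration, an entry in such a file might look like the following sketch. The values below are hypothetical placeholders, not the tuned hyperparameters shipped with the zoo; see the actual `hyperparameters/` files for the real settings.

```yaml
# Hypothetical PPO entry for CartPole-v1 -- illustrative values only.
CartPole-v1:
  n_envs: 8
  n_timesteps: !!float 1e5
  policy: 'MlpPolicy'
  n_steps: 32
  batch_size: 256
  gamma: 0.98
  learning_rate: 0.001
```

Each top-level key is an environment id, and the nested keys are passed to the algorithm's constructor (plus a few zoo-specific keys such as `n_envs` and `n_timesteps`).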
Evaluate the agent every 10000 steps using 10 episodes for evaluation (using only one evaluation env):

```shell
python train.py --algo sac --env HalfCheetahBulletEnv-v0 --eval-freq 10000 --eval-episodes 10 --n-eval-envs 1
```
More examples are available in the documentation.
The RL Zoo has some integration with other libraries/services like Weights & Biases for experiment tracking or Hugging Face for storing/sharing trained models. You can find out more in the dedicated section of the documentation.
Note: to download the repo with the trained agents, you must use `git clone --recursive https://github.com/DLR-RM/rl-baselines3-zoo` in order to clone the submodule too.
If the trained agent exists, then you can see it in action using:

```shell
python enjoy.py --algo algo_name --env env_id
```

For example, enjoy A2C on Breakout for 5000 timesteps:

```shell
python enjoy.py --algo a2c --env BreakoutNoFrameskip-v4 --folder rl-trained-agents/ -n 5000
```
Please see the dedicated section of the documentation.
Current Collection: 200+ Trained Agents!

Final performance of the trained agents can be found in `benchmark.md`. To compute them, simply run `python -m rl_zoo3.benchmark`.
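At its core, the reported score for an agent is an aggregate of episode returns. A minimal sketch of that aggregation, using hypothetical returns rather than real benchmark data:

```python
import statistics

# Hypothetical episode returns for one trained agent (illustrative only,
# not taken from benchmark.md).
episode_returns = [200.0, 195.5, 210.3, 188.7, 202.1]

# Scores are typically reported in "mean +/- std" form.
mean_return = statistics.mean(episode_returns)
std_return = statistics.stdev(episode_returns)
print(f"{mean_return:.2f} +/- {std_return:.2f}")  # prints "199.32 +/- 8.00"
```

As the note below explains, a single run like this is not a statistically rigorous benchmark; it is a sanity check on maximal performance.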
List and videos of trained agents can be found on our Hugging Face page: https://huggingface.co/sb3
NOTE: this is not a quantitative benchmark as it corresponds to only one run (cf issue #38). This benchmark is meant to check algorithm (maximal) performance, find potential bugs and also allow users to have access to pretrained agents.
7 Atari games from the OpenAI benchmark (NoFrameskip-v4 versions).
| RL Algo | BeamRider | Breakout | Enduro | Pong | Qbert | Seaquest | SpaceInvaders |
|---------|-----------|----------|--------|------|-------|----------|---------------|
| A2C     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| PPO     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| DQN     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| QR-DQN  | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

Additional Atari Games (to be completed):

| RL Algo | MsPacman | Asteroids | RoadRunner |
|---------|----------|-----------|------------|
| A2C     | ✔️ | ✔️ | ✔️ |
| PPO     | ✔️ | ✔️ | ✔️ |
| DQN     | ✔️ | ✔️ | ✔️ |
| QR-DQN  | ✔️ | ✔️ | ✔️ |

Classic Control Environments

| RL Algo | CartPole-v1 | MountainCar-v0 | Acrobot-v1 | Pendulum-v1 | MountainCarContinuous-v0 |
|---------|-------------|----------------|------------|-------------|--------------------------|
| ARS     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| A2C     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| PPO     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| DQN     | ✔️ | ✔️ | ✔️ | N/A | N/A |
| QR-DQN  | ✔️ | ✔️ | ✔️ | N/A | N/A |
| DDPG    | N/A | N/A | N/A | ✔️ | ✔️ |
| SAC     | N/A | N/A | N/A | ✔️ | ✔️ |
| TD3     | N/A | N/A | N/A | ✔️ | ✔️ |
| TQC     | N/A | N/A | N/A | ✔️ | ✔️ |
| TRPO    | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

Box2D Environments

| RL Algo | BipedalWalker-v3 | LunarLander-v2 | LunarLanderContinuous-v2 | BipedalWalkerHardcore-v3 | CarRacing-v0 |
|---------|------------------|----------------|--------------------------|--------------------------|--------------|
| ARS     | ✔️ | ✔️ |    |    |     |
| A2C     | ✔️ | ✔️ | ✔️ | ✔️ |     |
| PPO     | ✔️ | ✔️ | ✔️ | ✔️ |     |
| DQN     | N/A | ✔️ | N/A | N/A | N/A |
| QR-DQN  | N/A | ✔️ | N/A | N/A | N/A |
| DDPG    | ✔️ | N/A | ✔️ |    |     |
| SAC     | ✔️ | N/A | ✔️ | ✔️ |     |
| TD3     | ✔️ | N/A | ✔️ | ✔️ |     |
| TQC     | ✔️ | N/A | ✔️ | ✔️ |     |
| TRPO    | ✔️ | ✔️ |    |    |     |

PyBullet Environments

See https://github.com/bulletphysics/bullet3/tree/master/examples/pybullet/gym/pybullet_envs. Similar to MuJoCo envs but with a free (MuJoCo 2.1.0+ is now free!), easy-to-install simulator: pybullet. We are using the BulletEnv-v0 version.
Note: those environments are derived from Roboschool and are harder than the MuJoCo version (see the PyBullet issue).
| RL Algo | Walker2D | HalfCheetah | Ant | Reacher | Hopper | Humanoid |
|---------|----------|-------------|-----|---------|--------|----------|
| ARS     |    |    |    |    |    |    |
| A2C     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| PPO     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| DDPG    | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| SAC     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| TD3     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| TQC     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| TRPO    | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |

PyBullet Envs (Continued)

| RL Algo | Minitaur | MinitaurDuck | InvertedDoublePendulum | InvertedPendulumSwingup |
|---------|----------|--------------|------------------------|-------------------------|
| A2C     |    |    |    |    |
| PPO     |    |    |    |    |
| DDPG    |    |    |    |    |
| SAC     |    |    |    |    |
| TD3     |    |    |    |    |
| TQC     |    |    |    |    |

MuJoCo Environments

| RL Algo | Walker2d | HalfCheetah | Ant | Swimmer | Hopper | Humanoid |
|---------|----------|-------------|-----|---------|--------|----------|
| ARS     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| A2C     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| PPO     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |
| DDPG    |    |    |    |    |    |    |
| SAC     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| TD3     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| TQC     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| TRPO    | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |    |

Robotics Environments

See https://gym.openai.com/envs/#robotics and #71.
MuJoCo version: 1.50.1.0, Gym version: 0.18.0.
We used the v1 environments.
| RL Algo | FetchReach | FetchPickAndPlace | FetchPush | FetchSlide |
|---------|------------|-------------------|-----------|------------|
| HER+TQC | ✔️ | ✔️ | ✔️ | ✔️ |

See https://github.com/qgallouedec/panda-gym/.
Similar to MuJoCo Robotics Envs but with a free easy to install simulator: pybullet.
We used the v1 environments.
| RL Algo | PandaReach | PandaPickAndPlace | PandaPush | PandaSlide | PandaStack |
|---------|------------|-------------------|-----------|------------|------------|
| HER+TQC | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |

MiniGrid Environments

See https://github.com/Farama-Foundation/Minigrid. A simple, lightweight, and fast implementation of the famous gridworld as Gym environments.
| RL Algo | Empty-Random-5x5 | FourRooms | DoorKey-5x5 | MultiRoom-N4-S5 | Fetch-5x5-N2 | GoToDoor-5x5 | PutNear-6x6-N2 | RedBlueDoors-6x6 | LockedRoom | KeyCorridorS3R1 | Unlock | ObstructedMaze-2Dlh |
|---------|------------------|-----------|-------------|-----------------|--------------|--------------|----------------|------------------|------------|-----------------|--------|---------------------|
| A2C     |    |    |    |    |    |    |    |    |    |    |    |    |
| PPO     | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| DQN     |    |    |    |    |    |    |    |    |    |    |    |    |
| QR-DQN  |    |    |    |    |    |    |    |    |    |    |    |    |
| TRPO    |    |    |    |    |    |    |    |    |    |    |    |    |

There are 22 environment groups (variations for each) in total.
Colab Notebook: Try it Online!

You can train agents online using the Colab notebook.
Passing arguments in an interactive session

The zoo is not meant to be executed from an interactive session (e.g., Jupyter Notebooks, IPython); however, it can be done by modifying `sys.argv` and adding the desired arguments.
Example:

```python
import sys

from rl_zoo3.train import train

sys.argv = ["python", "--algo", "ppo", "--env", "MountainCar-v0"]
train()
```
To run tests, first install pytest, then:
Same for type checking with pytype:
To cite this repository in publications:
```
@misc{rl-zoo3,
  author = {Raffin, Antonin},
  title = {RL Baselines3 Zoo},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/DLR-RM/rl-baselines3-zoo}},
}
```
If you trained an agent that is not present in the RL Zoo, please submit a Pull Request (containing the hyperparameters and the score too).
We would like to thank our contributors: @iandanforth, @tatsubori, @Shade5, @mcres, @ernestum, @qgallouedec