This repository contains a benchmark of model-based reinforcement learning solutions made of probabilistic models and planning agents. The benchmark was used to run the experiments of the paper "Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?" by Balázs Kégl, Gabriel Hurtado, and Albert Thomas, ICLR 2021. You can also check the associated blog post for the general context and a summary of the paper.
The different systems of the benchmark are located in the `benchmark/` folder. Each system has its own folder where one can

- train and evaluate the models on the static datasets with the `ramp-test` command from ramp-workflow (see the `ramp-test` command documentation for more information on how to use this command),
- evaluate the models in a model-based reinforcement learning loop with the `model-based-rl` command implemented in the `mbrl-tools` package provided in this repository.

You can easily install all the required packages with conda by following the procedure below:
1. Create the conda environment from `environment.yml` using conda >= 4.9.2:

   ```bash
   conda env create -f environment.yml
   ```

   By default this will create an environment named `mbrl`. You can specify the name of your choice by adding `-n <environment_name>` to the `conda env create` command.

2. Activate the environment with `conda activate mbrl`.

3. Install the generative regression branch of ramp-workflow by running

   ```bash
   pip install git+https://github.com/paris-saclay-cds/ramp-workflow.git@generative_regression_clean
   ```

4. Install the `mbrl-tools` package by running `pip install .` in the `mbrl-tools/` directory.

With this installation you can run all the models of the ICLR 2021 paper. If you do not want to run all the models, you might only need a subset of the packages listed in `environment.yml`.
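Putting these steps together, a complete installation with the default environment name `mbrl` looks like this:

```bash
# Create and activate the conda environment (requires conda >= 4.9.2).
conda env create -f environment.yml
conda activate mbrl

# Install the generative regression branch of ramp-workflow.
pip install git+https://github.com/paris-saclay-cds/ramp-workflow.git@generative_regression_clean

# Install the mbrl-tools package provided in this repository.
pip install ./mbrl-tools
```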
Finally, if you want to run the inverted pendulum experiments you need MuJoCo 2.0 and mujoco-py. `mujoco-py` can be installed easily with `pip install mujoco-py`.
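Note that `pip install mujoco-py` only installs the Python bindings: `mujoco-py` also needs to find the MuJoCo 2.0 binaries and license key on your machine. The paths below are the conventional mujoco-py locations, given here as an assumption to check against the mujoco-py documentation:

```bash
# Conventional mujoco-py layout (an assumption; see the mujoco-py docs):
#   ~/.mujoco/mujoco200  - unpacked MuJoCo 2.0 binaries
#   ~/.mujoco/mjkey.txt  - MuJoCo license key
pip install mujoco-py
python -c "import mujoco_py"  # quick smoke test of the installation
```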
We will go through the different functionalities using the acrobot system located in `benchmark/acrobot/`. The main structure of this folder is based on the one required by ramp-workflow, with a few additional components for the dynamic evaluation (model-based reinforcement learning loop):

- `problem.py`: file specifying the problem and the training and evaluation protocol
- `env.py`: file for the OpenAI gym environment of the acrobot system
- `reward_function.py`: file for the reward function of the reinforcement learning task
- `generate_static_trace.py`: file used to generate the static datasets from the real system
- `data/`: folder containing the static datasets generated on the real system
- `submissions/`: folder containing the different models
- `agents/`: folder containing the different agents

To train and evaluate a model located in `submissions/` on a static dataset, run `ramp-test --submission <submission_name> --data-label <dataset_name>`. For instance, to run the linear model on the dataset generated with a random policy:

```bash
ramp-test --submission arlin_sigma --data-label random
```
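If you want to run the same model on several static datasets, a simple shell loop does the job. `random` is the only dataset label named in this README, so extend the list with the labels of the datasets in `data/`:

```bash
# Evaluate the linear model on each listed static dataset.
# Add the other dataset labels from benchmark/acrobot/data/ as needed.
for label in random; do
    ramp-test --submission arlin_sigma --data-label "$label"
done
```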
For more information on the `ramp-test` options and the generated outputs, please refer to the ramp-workflow documentation.
To evaluate a model, coupled with a random shooting agent, in a model-based reinforcement learning setup, use the `model-based-rl` command. For instance, to evaluate the linear model you can run

```bash
model-based-rl --submission arlin_sigma --agent-name random_shooting
```

The `--submission` option name is inherited from the terminology used by `ramp-test`. Other options include the number of epochs, the minimum number of steps per epoch, and using an initial trace instead of running a random agent for the first epoch. More information on the different options can be obtained by running `model-based-rl --help`.
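As an illustration only, a customized run could look like the sketch below; the option names other than `--submission` and `--agent-name` are hypothetical placeholders, so check `model-based-rl --help` for the real ones:

```bash
# Hypothetical flags: --n-epochs and --min-steps-per-epoch are assumed
# names for the epoch options mentioned above; verify the actual names
# with `model-based-rl --help`.
model-based-rl --submission arlin_sigma --agent-name random_shooting \
    --n-epochs 20 --min-steps-per-epoch 500
```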
If you use this code, please cite our ICLR 2021 paper:
```bibtex
@inproceedings{Kegl2021,
  title={Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?},
  author={Kégl, Balázs and Hurtado, Gabriel and Thomas, Albert},
  booktitle={9th International Conference on Learning Representations, {ICLR} 2021},
  year={2021}
}
```