ROLL is an efficient and user-friendly RL library designed for Large Language Models (LLMs) that utilizes large-scale GPU resources. It significantly enhances LLM performance in key areas such as human preference alignment, complex reasoning, and multi-turn agentic interaction scenarios.
Leveraging a multi-role distributed architecture with Ray for flexible resource allocation and heterogeneous task scheduling, ROLL integrates cutting-edge technologies like Megatron-Core, SGLang and vLLM to accelerate model training and inference.
Updates
[08/11/2025] Our paper is released; see Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning.
[08/13/2025] ROLL supports AMD GPUs with an out-of-the-box Docker image, a Dockerfile, and specific YAML configs under the examples/ directory. Please refer to Installation.
[08/10/2025] Agentic RL supports stepwise learning, like GiGPO; Distill supports VLMs. Explore the new capabilities!
[07/31/2025] Refactored the agentic RL design. Agentic RL now supports async training. Explore the new capabilities!
[07/31/2025] Support DistillPipeline/DpoPipeline. Support LoRA. Support GSPO.
[06/25/2025] Support thread env for env scaling and support the Qwen2.5 VL agentic pipeline.
[06/13/2025] Support the Qwen2.5 VL RLVR pipeline and upgrade mcore to version 0.12.
[06/09/2025] The ROLL tech report is now available! Access the report here.
[05/30/2025] Training RLVR and Agentic RL with ROLL is now available! Explore the new capabilities.
Quick Start based on alicloud
Installation
Config guide
RLVR Pipeline
Agentic RL Pipeline
domain_batch_size distribution control.
We are continuously working to expand ROLL's capabilities:
ROLL is inspired by the design of OpenRLHF, VeRL, NeMo-Aligner, and RAGEN. The project is developed by Alibaba TAOBAO & TMALL Group and Alibaba Group. The code is distributed under the Apache License (Version 2.0). This product contains various third-party components under other open-source licenses. See the NOTICE file for more information.
The following repositories have been used in ROLL, either in their close-to-original form or as an inspiration:
If you use ROLL in your research or project, please consider citing us:
@article{wang2025reinforcement,
  title={Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library},
  author={Wang, Weixun and Xiong, Shaopan and Chen, Gengru and Gao, Wei and Guo, Sheng and He, Yancheng and Huang, Ju and Liu, Jiaheng and Li, Zhendong and Li, Xiaoyang and others},
  journal={arXiv preprint arXiv:2506.06122},
  year={2025}
}
ROLL is a project jointly developed by Taotian Future Life Lab and Aicheng Technology, with a strong emphasis on pioneering the future of Reinforcement Learning (RL). Our mission is to explore and shape innovative forms of future living powered by advanced RL technologies. If you are passionate about the future of RL and want to be part of its evolution, we warmly welcome you to join us! Learn more about the ROLL Team through our official channels below.
We are HIRING!
We welcome contributions from the community!