Deep reinforcement learning (RL) research has made tremendous progress in recent years, and RL is now widely used across industries as an automated decision-making tool. Many Ray users have presented remarkable RLlib success stories at recent Ray Summits, for example Riot Games, Microsoft, AWS, Two Sigma, and McKinsey/QuantumBlack, to name a few.
Despite this wide adoption, one challenge remains: tuning an RL algorithm to a specific problem environment demands both engineering expertise and compute.
The RLlib team at Anyscale has implemented DreamerV3, a brand-new model-based RL algorithm published by Google DeepMind in January 2023. DreamerV3 is very easy to tune, requiring only a handful of hyperparameters, and is extremely versatile with respect to the environments it can solve: it handles discrete and continuous action spaces, dense and sparse reward functions, and both image and vector observations. Thanks to its model-based nature, it is also highly sample efficient, learning reliably from a minimal number of actual environment interactions. Furthermore, because our implementation builds on RLlib's new multi-node/multi-GPU "Learner API", it achieves linear wall-time scalability over the originally published DeepMind code.
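To make the small tuning surface concrete, here is a minimal sketch of how a DreamerV3 experiment can be set up through RLlib's DreamerV3Config. The parameter names shown (model_size, training_ratio) follow recent RLlib releases and may differ slightly between versions:

```python
# Minimal sketch: configuring and running RLlib's DreamerV3.
# Parameter names follow recent RLlib releases; check your version's docs.
from ray.rllib.algorithms.dreamerv3 import DreamerV3Config

config = (
    DreamerV3Config()
    # A simple discrete-action, vector-observation task; image-based
    # environments are configured the same way.
    .environment("CartPole-v1")
    .training(
        model_size="XS",      # one of the paper's model-size presets (XS ... XL)
        training_ratio=1024,  # replayed/trained steps per sampled env step
    )
)

algo = config.build()
for _ in range(5):
    print(algo.train())
```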
We will explain the basics of model-based RL and the DreamerV3 architecture, then show how anyone can rapidly implement such complex algorithms on top of the new "RLlib light" APIs. Finally, we will demo DreamerV3's linearly scalable learning, running on dozens of GPUs and mastering very challenging example environments.
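As a rough illustration of what the scaling demo boils down to, multi-GPU training is a configuration change on the Learner API. A hedged sketch, assuming the resource options available around RLlib 2.6-2.9 (newer releases rename these to num_learners/num_gpus_per_learner under .learners()):

```python
# Sketch: scaling DreamerV3 across multiple GPU Learners via the Learner API.
# Option names assume RLlib ~2.6-2.9; newer releases move these to
# .learners(num_learners=..., num_gpus_per_learner=...).
from ray.rllib.algorithms.dreamerv3 import DreamerV3Config

config = (
    DreamerV3Config()
    .environment("CartPole-v1")  # placeholder; the talk's demos use harder envs
    .training(model_size="S", training_ratio=512)
    .resources(
        num_learner_workers=4,          # four Learner actors, possibly spread
        num_gpus_per_learner_worker=1,  # across nodes, one GPU each
    )
)

algo = config.build()
```

Roughly, each Learner trains on its own shard of the batch and gradients are synchronized across Learners, which is what enables the linear wall-time scaling described above.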
Takeaways:
• Understand the inner workings and learning capabilities of the SOTA DreamerV3 algorithm and its implementation on top of RLlib's new API stack, offering multi-GPU training (OSS RLlib) and multi-node/multi-GPU training (on Anyscale).
Sven Mika has been working as a machine learning engineer at Anyscale since early 2020, where he is currently the tech lead of the RL/RLlib team. Over the past year, his team has been focusing on performance improvements and large API refactors in the context of the "RLlib light" initiative, which aims to turn the library into a powerful and flexible toolbox for RL practitioners and researchers. These efforts have already led to enhanced support for important industry use cases, such as massive multi-agent algorithms for league-based self-play, scalable multi-node/multi-GPU training, model-based reinforcement learning, and fully fault-tolerant scaling of RL experiments. Before joining Anyscale, he was a leading developer of other successful open-source RL library projects, such as RLgraph and TensorForce.