Ray Deep Dives

Ray Scalability Deep Dive: The Journey to Support 4,000 Nodes

September 19, 1:45 PM - 2:15 PM

In today's dynamic machine learning landscape, Ray has emerged as an essential platform, powering demanding workloads such as training ChatGPT at OpenAI and processing terabytes of data every day at Amazon. This talk examines Ray's pivotal role in handling the exponential growth of modern ML workloads.

We will take a deep dive into Ray's internal scalability, covering tasks, actors, objects, and nodes, and offer concrete examples to guide you in developing scalable code that maximizes Ray's potential.

Furthermore, we will explore the latest post-Ray 2.0 enhancements to health checks, resource broadcasting, and asynchronous actor creation. Join us on this exciting journey as we discuss the challenges and opportunities of building an unprecedented 4,000-node cluster.

Takeaways

• Understand Ray's scalability and the improvements made after Ray 2.0.

About Yi

Yi Cheng is a software engineer at Anyscale and a committer for the Ray project. He is interested in building efficient and reliable computation systems, and has recently focused on Ray's reliability and scalability.

Yi Cheng

Software Engineer, Anyscale


