While Ray seamlessly scales offline and online ML computation, it also brings its own set of challenges, particularly when debugging large-scale ML workloads. To address these challenges, we will highlight recent advancements in observability tooling within Ray and Anyscale and show how users can apply these new tools to effectively debug both offline (preprocessing, training, tuning, inference) and online (serving) ML workloads.
In this talk, we will discuss the tools available to Ray/Anyscale users as they develop an ML application and deploy it to production. We will demo developing an ML workload and bringing it to production, using the tools provided by Anyscale and Ray along the way. By the end of the talk, beginner users will understand the most valuable and fundamental observability tools available, and advanced users will get a glimpse of more advanced functionality for debugging particularly tricky errors.
Takeaways: Introduce new observability tools in Ray/Anyscale to Ray users and teach them how to use these tools when developing real-world workloads.
SangBin Cho is a software engineer at Anyscale and a committer on the open-source Ray project. He has contributed to various parts of Ray's core distributed systems, including the actor scheduler, placement group APIs, data plane improvements (object store and data processing support), stability improvements, and observability infrastructure, tooling, and APIs.
Alan is a Staff Engineer and Tech Lead at Anyscale working on Observability, Endpoints, and the Anyscale Platform. Before Anyscale, he was at LinkedIn working on the LinkedIn Profile page.
Chao Wang is a software engineer at Anyscale, where he works on Observability and the Anyscale Platform. He is passionate about full-stack engineering and building end-to-end solutions for developers. Before joining Anyscale, he pursued a master's degree at UC San Diego.