AI/ML Platform & Applications

Last Mile Data Processing for ML Training using Ray

September 18, 2:30 PM - 3:00 PM

In this talk, we discuss how Pinterest integrated Ray to streamline ML innovation. With Ray, ML engineers at Pinterest can scale the end-to-end ML lifecycle, from data processing to model training, within a single Job, minimizing the need for a separate workflow framework and greatly improving ML developer velocity.

We’ll walk through how we integrated Ray with Pinterest’s in-house Kubernetes infrastructure and how it supports business-critical use cases. We’ll also share the challenges we faced, the lessons we learned, and the benefits Ray brings to our business.

About Chia-Wei

Chia-Wei Chen is a software engineer on Pinterest's Machine Learning Training Platform team. He builds compute infrastructure for ML training, batch inference, and Jupyter.

About Qingxian

Qingxian Lai is a software engineer on Pinterest's Machine Learning Data Platform team. He has expertise in large-scale feature serving and has been solving last-mile data processing problems with Ray.

About Raymond

Raymond Lee is a software engineer at Pinterest working on a large-scale distributed training system and building an efficient ML runtime. He has previously worked on model serving for generative AI and ML infrastructure for Visual Search.

Chia-Wei Chen

Software Engineer, Pinterest

Qingxian Lai

Software Engineer, Pinterest

Raymond Lee

Software Engineer, Pinterest


