Ray Deep Dives

Managed Cloud Infrastructure for LLMs

September 18, 1:45 PM - 2:15 PM

Infrastructure challenges such as high compute costs, limited GPU availability, scalability, and the burden of managing cloud resources slow down LLM and generative AI development. Anyscale provides solutions to these problems so that our customers can focus on building and deploying high-performing custom models and applications. Our infrastructure powers our fast, cost-efficient, and scalable Anyscale Endpoints product. In this talk, you will hear how we:

• Leverage all available GPUs across different clouds to satisfy your compute needs

• Build intelligent features, such as autoscaling and full use of preemptible instances, to cut costs

• Speed up instance start times to accelerate the development cycle

• Manage compute, networking, storage and other cloud resources

Takeaways

• There is growing interest in self-hosting open source LLMs due to their flexibility, data privacy, and cost-effectiveness, but self-hosting comes with challenges.

• The Anyscale platform provides solutions to the infrastructure challenges that come with self-hosting LLMs, such as high compute costs, limited GPU availability, scalability, and the burden of managing cloud resources.

About Yifei

Yifei leads the Infrastructure and SRE teams at Anyscale. Her teams focus on building seamless, cost-effective, and scalable infrastructure for large-scale machine learning workloads. Before Anyscale, she spent a few years at Google working on the open-source machine learning library TensorFlow.

About Allen

Allen is a software engineer at Anyscale working on cloud infrastructure. He currently focuses on simplifying cloud resource management for customers. Before Anyscale, Allen spent one year working on AWS Elasticsearch.

About Bruce

Bruce is a software engineer at Anyscale. He works on the infrastructure and platform at Anyscale, helping to make the Anyscale product faster, cheaper, and more highly available. Before Anyscale, Bruce worked at Uber, building the Uber search platform that serves Uber rides, Uber Eats, and other products.

About Lanbo

Lanbo Chen is a software engineer at Anyscale working on cloud infrastructure. He primarily focuses on the Anyscale data plane deployed to customers. Before Anyscale, Lanbo worked at Google and Amazon, building storage infrastructure.

Yifei Feng

Engineering Manager, Anyscale

Allen Yin

Software Engineer, Anyscale

Bruce Zhang

Software Engineer, Anyscale

Lanbo Chen

Software Engineer, Anyscale


