AI/ML Platform & Applications

SkyPilot: Run AI on Any Cloud

September 19, 1:00 PM - 1:30 PM

SkyPilot is an open-source framework (Apache 2.0) for running LLMs, AI, and batch jobs on any cloud. In this talk, we describe how SkyPilot offers AI practitioners high GPU availability, maximum cost savings, and managed execution, all with a simple cloud-agnostic interface. SkyPilot is under active development at UC Berkeley.

SkyPilot's new approach is to view all clouds and regions that a user has access to as a coherent collection called the Sky. SkyPilot transparently sends a job to the cheapest available location in the Sky and manages its execution. It cuts cloud costs by supporting spot instances with automatic recovery (potentially across zones, regions, or clouds) and by automatically cleaning up idle cloud instances.
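To make the placement idea concrete, here is a minimal Python sketch of choosing the cheapest available location from a pool of clouds and regions, plus a price-ordered fallback list of the kind a recovery step could walk after a spot preemption. This is an illustration only, not SkyPilot's implementation or API: the `Offer` type, the `pick_location` and `failover_order` helpers, the cloud and region names, and the prices are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Offer:
    """One place a job could run (hypothetical; not a SkyPilot type)."""
    cloud: str
    region: str
    hourly_price: float  # made-up price in USD per hour
    available: bool      # whether capacity can be obtained right now


def failover_order(offers: List[Offer]) -> List[Offer]:
    """Return the available offers, cheapest first."""
    return sorted((o for o in offers if o.available),
                  key=lambda o: o.hourly_price)


def pick_location(offers: List[Offer]) -> Optional[Offer]:
    """Return the cheapest available offer, or None if nothing is available."""
    ordered = failover_order(offers)
    return ordered[0] if ordered else None


# A toy "Sky": every cloud/region the user has access to (invented values).
SKY = [
    Offer("cloud-a", "region-1", 3.10, True),
    Offer("cloud-b", "region-2", 2.40, False),  # cheapest, but no capacity
    Offer("cloud-c", "region-3", 2.75, True),
]

best = pick_location(SKY)
print(best)
```

In this toy pool the cheapest offer has no capacity, so the job lands on the next cheapest one. If that location were later preempted, a recovery step could mark it unavailable and call `pick_location` again to move to the next entry.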

We'll also share our experiences deploying SkyPilot to users over the past year. SkyPilot has powered the development of popular LLM projects, including the Vicuna chatbots and the vLLM inference system. It is being used by tens of organizations for diverse use cases, including AI training on GPUs/TPUs, AI serving, CPU batch jobs, and interactive development.

About Zongheng

Zongheng Yang

Postdoctoral Researcher, UC Berkeley Sky Computing Lab
Ray Summit 2023
