Lightning Talks

Parallel inferencing with KServe Ray integration

September 18, 6:00 PM - 6:15 PM

KServe is an open-source, production-ready model inference framework for Kubernetes that builds on Knative features such as canary traffic routing and payload logging. However, its one-model-per-container paradigm limits concurrency and throughput when serving multiple inference requests. With Ray Serve integration, a model can be deployed across individual Python workers, allowing concurrent inference requests to be processed in parallel and improving overall efficiency. In this talk, we will share how you can configure, run, and scale machine learning models on Kubernetes using KServe and Ray.
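For illustration only, here is a minimal Ray Serve sketch of the parallel-worker idea the talk builds on; the replica count and the toy stand-in model are placeholders, not the speakers' actual KServe configuration.

# Minimal sketch: Ray Serve runs a model as several Python worker replicas,
# so incoming inference requests are handled in parallel.
from ray import serve
from starlette.requests import Request

@serve.deployment(num_replicas=4)  # hypothetical: four replicas serve requests concurrently
class Predictor:
    def __init__(self):
        # Stand-in for real model loading (e.g. reading weights from storage).
        self.model = lambda xs: [x * 2 for x in xs]

    async def __call__(self, request: Request) -> dict:
        body = await request.json()
        return {"predictions": self.model(body["instances"])}

# Exposes an HTTP endpoint; Ray Serve load-balances requests across the replicas.
serve.run(Predictor.bind())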

About Ted

Ted Chang is a software engineer in the IBM Cognitive Open Technologies Group, working on software development in the MLOps and Data/AI space. Lately, he has been focusing on Kubeflow, KServe, and Flink.

About Jim

James Busche is a senior software engineer in the IBM Open Technologies Group, currently focused on the open-source CodeFlare project. Previously, he was a DevOps cloud engineer for IBM Watson and the worldwide Watson Kubernetes deployments.

Ted Chang

Software Engineer, IBM

Jim Busche

Software Engineer, IBM

Ready to Register?

Come connect with the global community of thinkers and disruptors who are building and deploying the next generation of AI and ML applications.


Join the Conversation

Ready to get involved in the Ray community before the conference? Ask a question in the forums. Open a pull request. Or share why you’re excited with the hashtag #RaySummit on Twitter.