This case study revisits the key milestones that allowed Amazon to successfully transition core exabyte-scale data catalog management jobs from Spark to Ray. We'll review the challenges encountered, the concessions made, and our vision for incorporating Ray more deeply into critical batch and streaming business intelligence pipelines at Amazon. Topics include techniques for developing large-scale serverless Ray job management infrastructure, build and deploy management, risk mitigation strategies that eased the migration to Ray, and operational excellence methods that ensured a smooth rollout to production.
Software engineer working on data management and optimization for big data technologies at Amazon.