At FlightAware, we collect vast amounts of data about aircraft in motion all around the globe. On our Predictive Technologies crew, we used Ray and AWS to train a new runway prediction model in XGBoost. In this talk, we'll walk through our use case, which begins with a data lake of Parquet files in S3 containing the features of billions of examples and concludes with a cost-effective solution built on a scalable Ray cluster that can quickly shuttle terabytes of training data from S3 into distributed memory. We'll cover the components of a distributed XGBoost training system and how Ray helped make building it as seamless as possible. In particular, we'll share how we organized our training data, configured our fault-tolerant and elastic Ray cluster, used Amazon FSx for Lustre for high-speed data loading, and tracked real-time metrics and evaluation data in MLflow. Along the way, we'll also share tips and tricks we learned that can help keep your costs and training time lower.
Patrick Dolan is a Senior Machine Learning Engineer at FlightAware building industry-leading predictive products in the aviation sector. He's an instrument-rated private pilot in the United States and has been an aviation enthusiast since childhood. He's been very fortunate to be able to combine his knowledge and interests while working at aviation technology companies such as FlightAware, ForeFlight, and CloudAhoy. Additionally, he's a true believer in the power of the cloud and spent time at Amazon Web Services helping others realize their technology vision. He lives in Austin, TX with his wife, three children, and their golden retriever Lucy.