Text-to-image models such as Stable Diffusion have reshaped the landscape of AI-based applications by making it possible to synthesize highly realistic and coherent images. However, putting these models to use is difficult because of several challenges, including:
- Compute requirements, including a mix of CPU and GPU instances,
- The need to stitch together fine-tuning, inference, and model deployment for a more end-to-end MLOps experience,
- The requirement for a capable infrastructure to successfully deploy these models.
This hands-on training aims to address these challenges and demonstrate, in a practical manner, how to fine-tune Stable Diffusion models, run batch inference to generate additional images, and ultimately deploy the model in a production-ready environment.