
Optimizing Machine Learning Models: From Fine-Tuning to Distillation
May 28 @ 7:00 pm - 8:00 pm CDT
In this talk, we'll explore various machine learning model optimization techniques, with a focus on fine-tuning and distillation. As models grow larger and more complex, optimizing them for efficiency is crucial, especially when deploying in resource-constrained environments. We'll dive deep into fine-tuning and distillation, explaining their processes, benefits, and trade-offs. Additionally, we'll introduce LoRA (Low-Rank Adaptation) and QLoRA (Quantized LoRA), two cutting-edge techniques that allow for efficient fine-tuning of large models, particularly large language models (LLMs), with minimal computational overhead. The session will also briefly cover other optimization methods, such as pruning, quantization, and reinforcement learning (RL), providing a comprehensive overview of how to select the right technique for different use cases. By the end of this talk, you'll have a deeper understanding of these optimization strategies and when to apply them to enhance model performance, scalability, and efficiency.
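To give a flavor of why LoRA is parameter-efficient, here is a minimal pure-Python sketch of the core idea: instead of updating a full weight matrix W, LoRA trains two small low-rank factors A and B and adds a scaled product (alpha / r) * B @ A to the frozen W. All names, toy sizes, and values below are illustrative assumptions, not the speaker's material.

```python
import random

def matmul(X, Y):
    """Multiply two matrices represented as lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_effective_weight(W, A, B, alpha, r):
    """Return W + (alpha / r) * (B @ A), the adapted weight used at inference."""
    delta = matmul(B, A)          # low-rank update, rank at most r
    scale = alpha / r
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

d, r = 8, 2  # hidden size and LoRA rank (toy values)
W = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]     # frozen base weight
A = [[random.gauss(0, 0.01) for _ in range(d)] for _ in range(r)]  # trainable, d -> r
B = [[0.0] * r for _ in range(d)]                                  # trainable, r -> d, zero-initialized

# With B initialized to zero, the adapted weight equals W exactly,
# so fine-tuning starts from the unmodified base model.
W_adapted = lora_effective_weight(W, A, B, alpha=4, r=r)

full_params = d * d          # parameters updated by full fine-tuning
lora_params = r * d + d * r  # parameters updated by LoRA (A plus B)
print(full_params, lora_params)  # 64 vs 32 at this toy size; the gap widens rapidly as d grows
```

At realistic sizes (d in the thousands, r around 8-64), the trainable-parameter count drops by orders of magnitude, which is what makes fine-tuning large models tractable on modest hardware; QLoRA pushes this further by keeping the frozen W in quantized form.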
Speaker(s): Raja Krishna
Virtual: https://events.vtools.ieee.org/m/480233