Tutorial: Training & Deploying Multi-Modal Models

Workshop

Deep Learning Theory Workshop and Summer School

Speaker(s)

Brandon McKinzie (Apple)

Location

Calvin Lab Auditorium

Date

Thursday, Aug. 4, 2022

Time

11:15 a.m. – 12:30 p.m. PT

Abstract

Modern multi-modal models such as CLIP require significant engineering effort to efficiently train, evaluate, and deploy. Furthermore, such models typically serve as a backbone feature extractor for many downstream tasks. This talk will provide an overview of how we’ve accomplished this at Apple, where CLIP now powers a large number of user experiences on iOS. We’ll cover concepts such as multi-node, multi-GPU distributed training on billions of examples, transfer learning for downstream tasks, model pruning, efficient on-device inference with transformers, and more.