Diffusion Generative Modeling

About

Diffusion models are now the de facto approach to generative modeling across a wide range of data modalities, including images, audio, video, and visuomotor policies, and they form the backbone of industry-scale systems like AlphaFold and Veo. In recent years, there has been a surge of interest in building the mathematical foundations underlying this family of methods. This research has driven a rich exchange of ideas between the practice of generative modeling on the one hand, and research in physics, theoretical computer science (TCS), statistics, and the foundations of machine learning on the other.

While this body of work has begun to demystify certain algorithmic and statistical aspects of diffusion models, much of the modern pipeline for building them at scale, and the rich behaviors they exhibit, remain beyond the reach of current theory. In addition, these models are now being deployed in domains like text and molecules, which present new challenges well beyond the setting of natural images in which they were conceived. Pretrained diffusion models also offer essential base capabilities that practitioners have tried to steer toward a wide variety of downstream objectives.

The central thesis of this program is that the field is at a critical juncture, with much to gain from increased interaction between these different communities. Such interactions have the potential to yield new principled insights into both algorithm design and the evaluation of newly proposed empirical interventions. This, in turn, can simultaneously enrich the family of questions and abstractions that theorists can study and enable more reliable scaling of diffusion-based generative modeling in practice.

Long-Term Participants (tentative): Giulio Biroli (ENS Paris), Joey Bose (Imperial College London), Joan Bruna (Courant Institute), Sitan Chen (Harvard University), Yuxin Chen (University of Pennsylvania), Sinho Chewi (Yale University), Giannis Daras (MIT), Ahmed El Alaoui (Cornell University), Zahra Kadkhodaie (Flatiron Institute), Frederic Koehler (University of Chicago), Holden Lee (Johns Hopkins University), Gen Li (CUHK), Sidhanth Mohanty (Northwestern University), Andrea Montanari (Stanford University), Andrej Risteski (Carnegie Mellon University), Grant Rotskoff (Stanford University), Molei Tao (Georgia Institute of Technology), Thuy-Duong Vuong (UC San Diego), Mengdi Wang (Princeton University), Yuting Wei (University of Pennsylvania), Andre Wibisono (Yale University), Yuchen Wu (Cornell University)