Abstract

Denoising diffusion models have led to a series of breakthroughs in image and video generation. In this talk, I will explore some of the connections between diffusion models and physics. Diffusion models are rooted in non-equilibrium thermodynamics, and they admit a variety of extensions when lifted into augmented spaces that encompass position, momentum, and potentially additional auxiliary variables. This viewpoint gives rise to a “complete recipe” for constructing invertible diffusion processes, as well as new samplers that significantly reduce the number of sampling steps at test time. Additionally, thermodynamic processes offer a natural playground for generative AI. I will demonstrate how video diffusion models can effectively downscale precipitation patterns to finer resolution while capturing extreme event statistics and local geographical patterns.

Video Recording