The Future of Language Models and Transformers

Program

Special Year on Large Language Models and Transformers, Part 2

Location

Calvin Lab auditorium

Date

Monday, Mar. 31 – Friday, Apr. 4, 2025

Monday, Mar. 31

9 – 9:30 a.m.

Coffee and Check-In

9:30 – 10:30 a.m.

LLM Reasoning

Denny Zhou (Google DeepMind)

10:30 – 11 a.m.

Break

11 a.m. – 12 p.m.

The Key Ingredients of Optimizing Test-Time Compute and What's Still Missing

Aviral Kumar (Carnegie Mellon University)

12 – 1:30 p.m.

Lunch (on your own)

1:30 – 2:30 p.m.

OpenThinker: Curating a Reasoning Post-Training Dataset and Training Open-Data Reasoning Models

Alex Dimakis (UC Berkeley)

2:30 – 3 p.m.

Break

3 – 4 p.m.

LLM Skills and Metacognition: Scaffolding for New Forms of Learning?

Sanjeev Arora (Princeton University)

4 – 5 p.m.

Reception

Tuesday, Apr. 1

9 – 9:30 a.m.

Coffee and Check-In

9:30 – 10:30 a.m.

What will Transformers look like in 2027?

Yoon Kim (Massachusetts Institute of Technology)

10:30 – 11 a.m.

Break

11 a.m. – 12 p.m.

Reducing the Dimension of Language: A Spectral Perspective on Transformers

Elad Hazan (Princeton University)

12 – 1:30 p.m.

Lunch (on your own)

1:30 – 2:30 p.m.

Mixed-modal Language Modeling: Chameleon, Transfusion, and Mixture of Transformers

Luke Zettlemoyer (University of Washington)

2:30 – 3 p.m.

Break

3 – 4 p.m.

The Frontier between Retrieval-augmented and Long-context Language Models

Danqi Chen (Princeton University)

4 – 5 p.m.

Attention to Detail: Fine-Grained Vision-Language Alignment

Kai-Wei Chang (UCLA)

Wednesday, Apr. 2

9 – 9:30 a.m.

Coffee and Check-In

9:30 – 10:30 a.m.

Inference Scaling: A New Frontier for AI Capability

Azalia Mirhoseini (Stanford / DeepMind)

10:30 – 11 a.m.

Break

11 a.m. – 12 p.m.

DeepSeek-R1 Thoughtology: <Thinking> about LLM Reasoning

Siva Reddy (IVADO - Mila - McGill University)

12 – 1:30 p.m.

Lunch (on your own)

1:30 – 2:30 p.m.

On cognitive maps, LLMs, world models, and understanding

Dileep George (Google DeepMind)

2:30 – 3 p.m.

Break

3 – 4 p.m.

Talk by

Zaid Harchaoui (University of Washington)

4 – 4:30 p.m.

Break

4:30 – 5 p.m.

Light Refreshments

5 – 6:15 p.m.

The Move Toward AGI: Why Large Language Models Surprised Almost Everyone, and What’s Coming Next | Theoretically Speaking

Anil Ananthaswamy (Simons Institute),
Dileep George (Google DeepMind),
Azalia Mirhoseini (Stanford),
Luke Zettlemoyer (University of Washington and Meta)

Thursday, Apr. 3

9 – 9:30 a.m.

Coffee and Check-In

9:30 – 10:30 a.m.

On Knowledge Separation and Latent Diffusion for Text

Kilian Weinberger (Cornell University)

10:30 – 11 a.m.

Break

11 a.m. – 12 p.m.

Controllable and Creative Natural Language Generation

Nanyun (Violet) Peng (UCLA)

12 – 1:30 p.m.

Lunch (on your own)

1:30 – 2:30 p.m.

Transformers can learn compositional functions

Jason Lee (Princeton University)

2:30 – 3 p.m.

Break

3 – 4 p.m.

Predicting and optimizing the behavior of large ML models

Andrew Ilyas (Stanford University)

4 – 5 p.m.

Panel Discussion

Friday, Apr. 4

9 – 9:30 a.m.

Coffee and Check-In

9:30 – 10:30 a.m.

Towards sequence-to-sequence models without activation functions

Grigorios Chrysos (University of Wisconsin-Madison)

10:30 – 11 a.m.

Break

11 a.m. – 12 p.m.

The Power of Resets! Learning better, one reset at a time

Kianté Brantley (Harvard University)

12 – 1:30 p.m.

Lunch (on your own)

1:30 – 2:30 p.m.

SILO Open LM: Training LMs on Siloed Datasets

Sewon Min (UC Berkeley)

2:30 – 3 p.m.

Break

3 – 4 p.m.

The Future of Language Models: A Perspective on Evaluation

Swabha Swayamdipta (University of Southern California)

4 – 5 p.m.

Learning Generative Models from Corrupted Data

Giannis Daras (MIT)