About

This is the first part of a special, yearlong program on large language models and transformers, spanning the 2024–2025 academic year. The program is inspired by the success of the Simons Institute's workshop on Large Language Models and Transformers, held in August 2023.

The program's overarching goal is to understand the ongoing revolution in transformers and large language models (LLMs) through a wide lens, in a relaxed setting that facilitates discussion, debate, and intellectual cross-pollination. At a conceptual level, LLMs profoundly change the landscape for theories of human language, of the brain and computation, and of the nature of human intelligence. In linguistics, they provide a new way to think about grammar, semantics, and conceptual representation. In neuroscience, vector models offer a new approach to computational models of the brain. In cognitive science, they challenge our notions of which elements of human intelligence are essential.

The program will explore very concrete questions about transformers as models of computation. These include algorithmic ideas for reducing the complexity of training to nearly linear in the input length, as well as scaling laws that describe how cross-entropy loss varies with model size, dataset size, and amount of compute. It will also examine how scaling laws might help explain high-level outcomes such as the emergence of complex skills in LLMs.
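One well-known example of such a scaling law, given here purely for orientation, is the parametric form fit by Hoffmann et al. (2022):

    L(N, D) = E + A / N^α + B / D^β

Here L is cross-entropy loss, N is the number of model parameters, D is the number of training tokens, and E, A, B, α, and β are empirically fitted constants. Compute enters through the standard approximation that training cost is roughly 6ND floating-point operations, so fixing a compute budget induces a trade-off between model size and dataset size.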

At a practical level, it is clear that LLMs will have a profound impact on human society, and issues of alignment, trust, and security will play a central role. Alignment refers to the problem of bridging the gap between complex human values and the mechanisms that drive AI decision-making. Related issues include trustworthiness (how do we know a model will do what it is intended to do?), interpretability (can we identify with certainty why a machine learning algorithm delivers a specific answer?), safety (can we safeguard against destructive actions by ML algorithms or by the humans using them?), security (can we protect data and systems from adversaries?), and fairness (can we safeguard against bias?). The legal and regulatory dimension of technological developments in AI, as well as its practical interaction with the capabilities and design of large language models, will be another key area of inquiry.

Three workshops will take place in Fall 2024 as part of this program (dates TBD):

  1. Boot Camp
  2. Transformers as a Computational Model
  3. Alignment, Trust, Watermarking, and Copyright Issues in LLMs

Two workshops will take place in Spring 2025 during Part 2 of this program.

Long-Term Participants:
Jacob Andreas (MIT), Boaz Barak (Harvard University), Moses Charikar (Stanford University), Danqi Chen (Princeton University), Yejin Choi (UW, AI2), Sanjoy Dasgupta (UCSD), Costis Daskalakis (MIT), Ev Fedorenko (MIT), Surbhi Goel (University of Pennsylvania), Shafi Goldwasser (UC Berkeley), Michael Kearns (University of Pennsylvania), Florent Krzakala (EPFL), Jason Lee (Princeton University), Tengyu Ma (Stanford University), Jitendra Malik (UC Berkeley), Chris Manning (Stanford University), Ankur Moitra (MIT), Christos Papadimitriou (Columbia University), Ellie Pavlick (Brown University), Aditi Raghunathan (CMU), Sasha Rush (Cornell University), Ludwig Schmidt (UW, Anthropic), Greg Valiant (Stanford University), Umesh Vazirani (UC Berkeley), Lenka Zdeborová (EPFL)

Research Fellows:
Enric Boix, Bingbin Liu, Kaifeng Lyu, Pierre Marion, Binghui Peng, Max Simchowitz.

Welcome Reception

Program visitors are encouraged to arrive on Monday, August 26, 2024, to check in to the program. A welcome reception will take place at 4 p.m. that day.