This workshop will consider the role of transformers as a central building block in the development of large language models (LLMs), as well as their inherent limitations. The main questions posed will concern their effectiveness (What core properties make transformers work so well?), necessity (Are there models that will do even better?), future ability (Can we extrapolate the future capabilities of LLMs as they scale with data and compute?), and computational properties (What TCS models can be used to understand the emergence of complex skills in LLMs?). We will also consider other tools that may illuminate the properties of transformers, such as tools from computational physics. These questions will address both the transformer as a model class and the abilities learned by trained models. While the workshop focuses on transformers, we will also explore alternatives gaining popularity, such as state-space models (SSMs).


Registration is required for in-person attendance, access to the livestream, and early access to the recording. Space may be limited, so you are advised to register early.

For additional information, please visit: https://simons.berkeley.edu/participating-workshop.

Please note: the Simons Institute regularly captures photos and video of activity around the Institute for use in videos, publications, and promotional materials.

Register Now