This workshop will consider the role of transformers as a central building block in the development of large language models (LLMs), as well as their inherent limitations. The main questions posed will concern their effectiveness (What core properties make transformers work so well?), necessity (Are there models that will do even better?), future ability (Can we extrapolate the future capabilities of LLMs as they scale with data and compute?), and computational properties (What TCS models can be used to understand the emergence of complex skills in LLMs?). We will also consider other tools that may illuminate the properties of transformers, such as tools from computational physics. These questions will address both the transformer as a model class and the abilities learned by trained models. While the workshop focuses on transformers, we will also explore alternatives gaining popularity, such as state-space models (SSMs).


Registration is required for in-person attendance, access to the livestream, and early access to the recording. Space may be limited, so you are advised to register early.

For additional information, please visit: https://simons.berkeley.edu/participating-workshop.

Please note: the Simons Institute regularly captures photos and video of activity around the Institute for use in videos, publications, and promotional materials.

Register Now