Abstract

Techniques for evaluating LLMs have regrettably lagged behind advances in LLM development, making it difficult to quantify progress. My research calls for rethinking the fundamental principles underlying the evaluation of Transformer-based language models. I will discuss work on applying language models to real tasks, as well as on selecting test data for efficient and robust evaluation. I will present challenges that arise when we do not know what the ground truth might be. Finally, I will discuss some ideas on evaluating Transformer language models without involving language.
