Abstract
Current large language models are subject to numerous safety concerns and a wide range of attacks. In this talk, we will identify the common root causes of these failures. Using a simple illustrative problem, we will walk through several defense strategies and evaluate their strengths and weaknesses. Finally, we will draw connections to the broader literature on safety and robustness in machine learning.