Alignment, Trust, Watermarking, and Copyright Issues in LLMs

Program: Special Year on Large Language Models and Transformers, Part 1
Location: Calvin Lab auditorium
Date: Monday, Oct. 14 – Thursday, Oct. 17, 2024

Schedule
All talks listed in Pacific Time. Schedule subject to change.
Monday, Oct. 14, 2024

9:15 – 9:40 a.m.
Coffee and Check-In

9:40 – 9:45 a.m.
Opening Remarks

9:45 – 10:30 a.m.
Private Retrieval-Augmented Generation
Raluca Popa (UC Berkeley)

10:30 – 11 a.m.
Break

11 – 11:45 a.m.
Scalable Extraction of Training Data from (Production) Language Models
Nicholas Carlini (Google DeepMind)

11:45 a.m. – 12:30 p.m.
Differentially Private Synthetic Data for Private LLM Training
Andreas Terzis (Google DeepMind)

12:30 – 1:45 p.m.
Lunch (on your own)

1:45 – 2 p.m.
A Sociotechnical Approach to a Safe, Responsible AI Future: A Path for Science- and Evidence-Based AI Policy
Dawn Song (UC Berkeley)

2 – 2:45 p.m.
Defense Against Prompt Injection Attacks
David Wagner (UC Berkeley)

2:45 – 3 p.m.
Break

3 – 3:45 p.m.
Pluralistic Alignment: A Roadmap, Recent Work, and Open Problems
Taylor Sorensen (University of Washington)

3:45 – 4 p.m.
Break

4 – 5 p.m.
Panel Discussion: AI Safety Regulation
Scott Aaronson (UT Austin & OpenAI), Dan Hendrycks (Center for AI Safety), Ion Stoica (UC Berkeley), Martin Casado (a16z), Joseph Gonzalez (UC Berkeley)
Tuesday, Oct. 15, 2024

9:15 – 9:45 a.m.
Coffee and Check-In

9:45 – 10:30 a.m.
Interactive Proofs, Debate, and AI Safety
Jonah Brown-Cohen (Google DeepMind)

10:30 – 11 a.m.
Break

11 – 11:45 a.m.
Prover-Verifier Games Improve Legibility of LLM Outputs
Yining Chen (OpenAI)

11:45 a.m. – 12:30 p.m.
Models That Prove Their Own Correctness
Orr Paradise (UC Berkeley)

12:30 – 2:15 p.m.
Lunch (on your own)

2:15 – 3 p.m.
On Mitigating Backdoors
Jonathan Shafer (MIT)

3 – 3:15 p.m.
Break

3:15 – 4 p.m.
Formal Backdoor Detection Games and Deceptive Alignment
Jacob Hilton (Alignment Research Center)

4 – 5 p.m.
Reception
Wednesday, Oct. 16, 2024

9:15 – 9:45 a.m.
Coffee and Check-In

9:45 – 10:30 a.m.
AI Interactions: Misuse, Markets, and Managers
Jacob Steinhardt (UC Berkeley)

10:30 – 11 a.m.
Break

11 – 11:45 a.m.
Beyond Preferences in AI Alignment: Towards Richer Models of Human Reasons and Decisions
Tan Zhi Xuan (MIT)

11:45 a.m. – 12:30 p.m.
Differential Privacy in the Clean Room: Copyright Protections for Generative AI
Aloni Cohen (University of Chicago)

12:30 – 2 p.m.
Lunch (on your own)

2 – 2:45 p.m.
Veridical Data Science and Alignment in Medical AI
Bin Yu (UC Berkeley)

2:45 – 3 p.m.
Break

3 – 3:45 p.m.
What Should We Align With?
Frauke Kreuter (LMU Munich and University of Maryland)

3:45 – 4:30 p.m.
Open Technical Questions in Generative AI Copyright
Yangsibo Huang (Google)
Thursday, Oct. 17, 2024

9 – 9:30 a.m.
Coffee and Check-In

9:30 – 9:45 a.m.
Watermarking: Where To?
Scott Aaronson (UT Austin)

9:45 – 10:30 a.m.
Pseudorandom Error-Correcting Codes with Applications to Watermarking Generative AI
Miranda Christ (Columbia University)

10:30 – 11 a.m.
Break

11 – 11:45 a.m.
Edit Distance Robust Watermarks: Beyond Substitution Channels
Noah Golowich (MIT)

11:45 a.m. – 12:30 p.m.
Distortion-Free Mechanisms for Language Model Provenance
Rohith Kuditipudi (Stanford University)

12:30 – 2 p.m.
Lunch (on your own)

2:15 – 3 p.m.
Will Copyright Derail Generative AI Technologies?
Pam Samuelson (UC Berkeley)

3 – 3:15 p.m.
Break

3:15 – 4 p.m.
From Risk to Resilience: Risk Assessment, Safety Alignment, and Guardrails for Foundation Models
Bo Li (University of Illinois at Urbana–Champaign)

4 – 5 p.m.
Panel Discussion: Will AI Make Us Stupid and What Can We Do About It?
Umesh Vazirani (UC Berkeley), Vered Shemtov (Stanford), Trevor Darrell (UC Berkeley), Maroussia Lévesque (Harvard Law School), Richard Zemel (Columbia University)