Abstract

A trained Large Language Model (LLM) encodes much of human knowledge. Remarkably, many concepts can be recovered from the internal activations of neural networks via linear "probes", which are, mathematically, single index models. I will discuss how such probes can be constructed and used with Recursive Feature Machines, a feature-learning kernel method originally designed for extracting relevant features from tabular data.
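
To make the "single index" claim concrete: such a model predicts a label from activations only through a single learned direction, f(x) = g(w·x) for a scalar link function g, and a logistic-regression probe is the simplest instance. The sketch below is a minimal illustration under stated assumptions: the "activations" are synthetic Gaussian data with a planted concept direction, and scikit-learn's logistic regression stands in for the probe. It is not the RFM-based construction discussed in the talk, only the linear-probe baseline it builds on.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-in for hidden activations: n examples, d dimensions.
# (Assumption for illustration; real probes use a model's hidden states.)
n, d = 1000, 64
X = rng.standard_normal((n, d))

# Hypothetical "concept direction" w_star: the label depends on the
# activations only through the projection w_star . x (single index structure).
w_star = rng.standard_normal(d)
y = (X @ w_star > 0).astype(int)

# A linear probe. Logistic regression is a single index model:
# f(x) = sigmoid(w . x + b).
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy:", probe.score(X, y))

# The learned weights should recover the concept direction up to scale.
w_hat = probe.coef_.ravel()
cos = w_hat @ w_star / (np.linalg.norm(w_hat) * np.linalg.norm(w_star))
print("cosine similarity with planted direction:", round(float(cos), 3))

When the concept truly has single index structure, the probe's weight vector aligns with the planted direction (cosine similarity near 1), which is the sense in which a concept is "recovered" from activations.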