In this talk I will discuss several ways to go beyond the PAC view of in-distribution generalization, and will argue for abandoning a distributional view altogether. (1) I will discuss how generic out-of-distribution generalization, including "length generalization," is fundamentally more fickle and requires a fundamentally different view of learning; (2) I will discuss dynamic learning, where the basic object is a (non-stationary) process rather than a distribution; and (3) I will discuss how predictors and systems should be compared in a way that does not involve a notion of a population distribution.
A key puzzle in deep learning is how simple gradient methods find generalizable solutions without explicit regularization. This talk discusses the implicit regularization of gradient descent (GD) through the lens of statistical dominance. Using least squares as a clean proxy, we present two surprising findings. First, GD dominates ridge regression. For any well-specified Gaussian least squares problem, the finite-sample excess risk of optimally stopped GD is no more than a constant times that of optimally tuned ridge regression. However, there is a natural subset of these problems where GD achieves a polynomially smaller excess risk. Thus, implicit regularization is statistically superior to explicit regularization, in addition to its computational advantages. Second, GD and online stochastic gradient descent (SGD) are incomparable. We construct a sequence of well-specified Gaussian least squares problems where optimally stopped GD is polynomially worse than online SGD, and similarly vice versa. Our construction leverages a key insight from benign overfitting, revealing a fundamental separation between batch and online learning.
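Below is a minimal numerical sketch of the comparison the abstract describes, not the talk's actual construction: on a synthetic well-specified Gaussian least squares problem, it contrasts the excess risk of optimally stopped gradient descent with that of optimally tuned ridge regression. The problem sizes, step size, and regularization grid are illustrative assumptions.

```python
# Sketch: early-stopped GD vs. tuned ridge on a well-specified Gaussian
# least squares problem (illustrative parameters, not the talk's construction).
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma = 200, 50, 1.0                      # samples, dimension, noise level
w_star = rng.normal(size=d) / np.sqrt(d)        # ground-truth parameter
X = rng.normal(size=(n, d))                     # isotropic Gaussian design
y = X @ w_star + sigma * rng.normal(size=n)     # well-specified responses

def excess_risk(w):
    # For isotropic Gaussian features, the population excess risk is ||w - w*||^2.
    return float(np.sum((w - w_star) ** 2))

# Gradient descent on the empirical squared loss, tracking the best stopping time.
eta = 0.5 / np.linalg.eigvalsh(X.T @ X / n).max()
w, best_gd = np.zeros(d), np.inf
for _ in range(2000):
    w = w - eta * X.T @ (X @ w - y) / n
    best_gd = min(best_gd, excess_risk(w))

# Ridge regression, tuned over a grid of regularization strengths.
best_ridge = min(
    excess_risk(np.linalg.solve(X.T @ X + n * lam * np.eye(d), X.T @ y))
    for lam in np.logspace(-4, 2, 50)
)

print(f"optimally stopped GD excess risk:  {best_gd:.4f}")
print(f"optimally tuned ridge excess risk: {best_ridge:.4f}")
```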
I will talk about some recent results on designing efficient and effective "neural algorithmic modules" for problems such as graph partitioning and clustering.
A landmark result in computational learning theory is the "Low-Degree Algorithm" of Linial, Mansour, and Nisan that brought techniques from harmonic analysis to bear on learning theory. A striking application of their approach is that $\mathsf{AC}^0$ circuits can be learned using quasipolynomial time and samples from the uniform distribution. This remains the state of the art, and their running time is known to be tight under various cryptographic assumptions. In this work, our focus is on the fundamental question: Is their quasipolynomial-time algorithm an isolated phenomenon, or are there less brittle distributional assumptions that enable computationally efficient learning? In particular, can it be extended to natural distributions that allow for correlations? There are major challenges in doing so, most notably that we no longer have the convenience of a nice family of orthogonal functions. Our main result is a quasipolynomial-time learning algorithm for $\mathsf{AC}^0$ circuits under high-temperature \emph{Ising models} of polynomial growth, nearly matching what is known in the product-distribution setting. Our approach relies on constructing approximate Ising model samplers with low circuit complexity that can also be inverted. Such samplers can then be combined with recent advances on learning under nasty noise. The construction of these samplers builds on structural properties of Ising models, uncovering an exciting connection between strong spatial mixing and PAC learnability. Joint work with Gautam Chandrasekaran, Jason Gaitonde, and Ankur Moitra.
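For context, the following is a minimal sketch of the classical Low-Degree Algorithm in its original uniform-distribution setting, not the Ising-model extension described above: it estimates every Fourier coefficient of degree at most $k$ from samples and predicts with the sign of the resulting low-degree polynomial. The target function, degree cutoff, and sample sizes are illustrative assumptions.

```python
# Sketch of the Linial-Mansour-Nisan Low-Degree Algorithm over the uniform
# distribution on {-1,+1}^n (illustrative target and parameters).
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n_vars, degree, n_samples = 8, 2, 20000

def target(x):
    # Illustrative target: the small depth-2 circuit (x1 AND x2) OR x3, in +/-1 form.
    b = (x + 1) // 2                                 # map {-1,+1} -> {0,1}
    return 2 * ((b[:, 0] & b[:, 1]) | b[:, 2]) - 1

X = rng.choice([-1, 1], size=(n_samples, n_vars))    # uniform samples over {-1,+1}^n
y = target(X)

# Estimate the Fourier coefficient f_hat(S) = E[f(x) * prod_{i in S} x_i]
# for every subset S of size at most `degree`.
subsets = [S for k in range(degree + 1) for S in combinations(range(n_vars), k)]
coeffs = {S: np.mean(y * np.prod(X[:, S], axis=1)) for S in subsets}

def predict(x):
    # Sign of the estimated low-degree approximation of f.
    val = sum(c * np.prod(x[:, S], axis=1) for S, c in coeffs.items())
    return np.sign(val + 1e-12)

X_test = rng.choice([-1, 1], size=(5000, n_vars))
acc = np.mean(predict(X_test) == target(X_test))
print(f"low-degree (k={degree}) predictor accuracy: {acc:.3f}")
```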
Let’s Stop Leaving Money on the Table
Katrina Ligett
Richard M. Karp Distinguished Lecture
Thursday, January 29
3:30 – 4:30 p.m.
Calvin Lab auditorium
Fault-tolerant quantum computation (FTQC) is the study of making quantum computation robust against noise, for noise models of interest. In this talk, I will briefly describe recent progress that significantly improves the encoding rate and time efficiency of FTQC against the "local" noise models standard in physics. Then, I will switch gears to other complexity-theory-inspired noise models and propose some open questions that I hope to think about at Simons.