Abstract
We study the effect of gradient-based optimization on feature learning in two-layer neural networks. We consider two settings: (1) in the proportional asymptotic limit, we show that the first gradient update already improves upon the initial random features model in terms of prediction risk; (2) in the non-asymptotic setting, we show that a network trained via SGD learns low-dimensional representations, with applications to learning a single-index model with explicit rates.
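
A minimal numerical sketch of setting (1), under assumptions not taken from the abstract: the dimensions, the tanh activation and target link, the ridge penalty, and the large step size `eta = sqrt(N)` are all illustrative choices. It compares the prediction risk of a frozen random features model against the same network after a single full-batch gradient step on the first layer (the second layer is refit by ridge regression in both cases); this is not the authors' exact construction, only a toy instance of the comparison described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumed, not from the abstract)
d, N, n, n_test = 50, 200, 2000, 2000

# Single-index target: y = phi(<theta, x>) for a fixed direction theta
theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)
phi = np.tanh
X, Xt = rng.standard_normal((n, d)), rng.standard_normal((n_test, d))
y, yt = phi(X @ theta), phi(Xt @ theta)

# Two-layer network f(x) = a . sigma(W x), at random initialization
sigma, dsigma = np.tanh, lambda z: 1 - np.tanh(z) ** 2
W = rng.standard_normal((N, d)) / np.sqrt(d)
a = rng.standard_normal(N) / np.sqrt(N)

def prediction_risk(W, lam=1e-3):
    """Refit the second layer by ridge regression on the training set,
    then report squared-error risk on held-out data."""
    F = sigma(X @ W.T)                          # n x N feature matrix
    a_hat = np.linalg.solve(F.T @ F + lam * np.eye(N), F.T @ y)
    return np.mean((sigma(Xt @ W.T) @ a_hat - yt) ** 2)

# (1) Random features baseline: first layer frozen at initialization
risk_rf = prediction_risk(W)

# (2) One full-batch gradient step on W under squared loss (a large
#     step size is assumed, so the first layer actually moves), then
#     refit the second layer as before
Z = X @ W.T                                     # n x N pre-activations
r = sigma(Z) @ a - y                            # residuals, shape (n,)
grad_W = (2 / n) * ((r[:, None] * dsigma(Z)) * a[None, :]).T @ X
eta = np.sqrt(N)                                # assumed large step size
risk_fl = prediction_risk(W - eta * grad_W)

print(f"random features prediction risk : {risk_rf:.4f}")
print(f"after one gradient step         : {risk_fl:.4f}")
```

In this toy setup the gradient step pushes rows of `W` toward the target direction `theta`, which is why refitting the second layer afterward can beat the frozen random features baseline; the step size scaling with the width `N` is what keeps the first-layer update non-negligible.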