Non-Parametric Convergence Rates for Plain Vanilla Stochastic Gradient Descent

Monday, Dec. 6, 2021, 1:00 pm – 1:15 pm



Raphaël Berthier (École polytechnique fédérale de Lausanne)


Calvin Lab Auditorium

Most theoretical guarantees for stochastic gradient descent (SGD) assume that the iterates are averaged, that the step sizes are decreasing, and/or that the objective is regularized. However, practice shows that these tricks are less necessary than theoretically expected. I will present an analysis of SGD that uses none of them: we analyze the behavior of the last iterate of fixed step-size, non-regularized SGD. Our results apply to kernel regression, i.e., infinite-dimensional linear regression. As a special case, we analyze an online algorithm for estimating a real function on the unit interval from observations of its value at randomly sampled points.
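The setting in the special case can be sketched as follows: online SGD on the squared loss in an RKHS, where each step adds a kernel bump at the freshly sampled point, with a fixed step size, no regularization, and no iterate averaging. This is a minimal illustrative sketch, not code from the talk; the Gaussian kernel, bandwidth, step size, target function, and noiseless observations are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Assumed target function on [0, 1], chosen for illustration only.
    return np.sin(2 * np.pi * x)

def K(x, xs, bandwidth=0.1):
    # Gaussian kernel (an assumed choice; the talk concerns kernel regression generally).
    return np.exp(-((x - xs) ** 2) / (2 * bandwidth ** 2))

gamma = 0.5          # FIXED step size: no decreasing schedule
T = 2000             # number of online samples
centers = np.empty(T)
coefs = np.empty(T)

for t in range(T):
    x = rng.random()                       # point sampled uniformly on [0, 1]
    y = target(x)                          # observe the function value there
    pred = coefs[:t] @ K(x, centers[:t])   # current iterate evaluated at x (0.0 when t == 0)
    # SGD step on the squared loss in the RKHS:
    #   f <- f - gamma * (f(x) - y) * K(x, .)
    # i.e., append one kernel bump centered at x; no regularization term.
    centers[t] = x
    coefs[t] = -gamma * (pred - y)

# Evaluate the LAST iterate (no averaging) on a grid.
grid = np.linspace(0, 1, 200)
est = np.array([coefs @ K(g, centers) for g in grid])
```

The update is the standard stochastic gradient of the squared loss in the RKHS; keeping only the last iterate and a constant `gamma` mirrors the regime analyzed in the talk, though the convergence rates themselves are the subject of the presentation, not of this sketch.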