Description

Iterative algorithms are the workhorses of modern statistical learning, and are widely used to fit large-scale, complex models to random data. While the choice of an algorithm and its hyperparameters determines both the speed and fidelity of the learning pipeline, it is common for this choice to be made heuristically, either by expensive trial-and-error or by comparing rough bounds on the convergence rates of candidate algorithms. Motivated by this, we develop a principled framework that produces sharp, iterate-by-iterate characterizations of solution quality for algorithms run with sample-splitting on a wide range of nonconvex model-fitting problems with Gaussian data. I will present the general framework and highlight several concrete consequences for parameter estimation in some popular statistical models, covering both higher-order algorithms based on alternating updates and first-order algorithms based on subgradient descent. These corollaries reveal multiple nonstandard phenomena and facilitate rigorous comparisons between algorithms.
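
To make the sample-splitting protocol concrete, here is a minimal sketch in which each iteration of a first-order method consumes a fresh batch of Gaussian data, so the current iterate is statistically independent of the samples used to update it. The stand-in problem (robust, ell_1-loss phase retrieval), the Polyak step size, and all variable names are illustrative assumptions for exposition, not necessarily the specific models or algorithms analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, T = 50, 500, 25          # dimension, fresh samples per iteration, iterations

# Hypothetical unit-norm ground-truth parameter.
theta_star = rng.standard_normal(d)
theta_star /= np.linalg.norm(theta_star)

# Warm start at relative distance 0.2 from theta_star (a local analysis
# of this kind typically assumes such an initialization).
delta = rng.standard_normal(d)
theta = theta_star + 0.2 * delta / np.linalg.norm(delta)

for t in range(T):
    # Sample splitting: draw a brand-new Gaussian batch at every iteration.
    X = rng.standard_normal((n, d))
    y = (X @ theta_star) ** 2            # noiseless phase-retrieval responses

    # Subgradient of the nonconvex loss f(theta) = mean_i |(x_i' theta)^2 - y_i|.
    r = X @ theta
    g = (2.0 / n) * X.T @ (np.sign(r ** 2 - y) * r)

    # Polyak step size (valid since the minimal loss value is 0 here).
    fval = np.mean(np.abs(r ** 2 - y))
    theta = theta - (fval / (np.linalg.norm(g) ** 2 + 1e-12)) * g

    # Track per-iterate estimation error, modulo the global sign ambiguity.
    err = min(np.linalg.norm(theta - theta_star),
              np.linalg.norm(theta + theta_star))
    print(f"iteration {t:2d}: error {err:.3e}")
```

An iterate-by-iterate characterization in the sense of the abstract would predict the trajectory of `err` above deterministically as a function of t; an alternating-update method would replace the subgradient step with a per-iteration inner minimization over the same fresh batch.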