Abstract

Double descent refers to the phenomenon that inferential metrics such as generalization error or mean squared error can exhibit two qualitatively different regimes or phases, depending on the parameters of the model and learning pipeline: one in which there is a tradeoff as, e.g., the number of parameters increases; and one in which this tradeoff saturates and the inferential metric improves monotonically. Implicit regularization refers to the phenomenon that an approximation algorithm for a particular objective, in addition to coming with some sort of approximation guarantee, can also exactly optimize a regularized version of the objective nominally being optimized, i.e., one can solve a regularized problem simply by using a (faster) approximation algorithm to approximate the solution of an unregularized problem. Both phenomena have received considerable recent attention in connection with neural networks, and both arise in kernel-based learning and more generally, but neither phenomenon is particularly well understood. Here, we use techniques from Randomized Numerical Linear Algebra (RandNLA) to develop an approach that addresses both of these phenomena, and we apply these techniques to perhaps the simplest and most ubiquitous statistical learning problem: overdetermined/underdetermined least-squares regression. The approach involves constructing what we call a surrogate design, which amounts to using volume/determinantal information to reweight an initial design so that it is more "nice," i.e., more analytically tractable. Given this surrogate design, we obtain exact non-asymptotic expressions for both double descent and implicit regularization. For the latter, in particular, we show that using sampling/projection methods from RandNLA to approximate the solution of an unregularized least-squares regression problem implicitly solves a ridge-regularized least-squares problem exactly.
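
As a schematic illustration of the implicit regularization claim (the notation $X$ for the design matrix, $y$ for the response vector, $w$ for the parameter vector, and $\lambda$ for the ridge parameter is introduced here only for illustration and is defined precisely in the body of the paper), compare the unregularized and ridge-regularized least-squares problems,
\[
\hat{w} \;=\; \operatorname*{argmin}_{w}\ \|Xw - y\|_2^2
\qquad\text{and}\qquad
\hat{w}_\lambda \;=\; \operatorname*{argmin}_{w}\ \|Xw - y\|_2^2 + \lambda\|w\|_2^2 .
\]
If $\hat{w}_S$ denotes the (minimum-norm) least-squares solution of a randomly subsampled/sketched version of the unregularized problem, with $S$ indexing the random sample or sketch, then one way such a statement can be formalized, under conditions spelled out in the body of the paper, is in expectation over the randomness of $S$:
\[
\mathbb{E}_S\big[\hat{w}_S\big] \;=\; \hat{w}_\lambda ,
\qquad\text{for a value of } \lambda \text{ determined by the sampling/sketching distribution.}
\]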