Subsampled natural gradient (SNG) and subsampled Gauss-Newton (SGN) methods have demonstrated impressive performance for parametric optimization problems in scientific machine learning, including neural network wavefunctions and physics-informed neural networks. However, their development has been hindered by a lack of theoretical understanding. To make progress on this problem, we study these algorithms on a simplified parametric optimization problem involving a linear model and a quadratic loss function. In this setting, we show that both SNG and SGN can be interpreted as sketch-and-project methods and can thus enjoy rigorous convergence guarantees for arbitrary batch sizes. This interpretation also explains a recently proposed accelerated variant of SNG and clarifies how this accelerated variant can be extended to the setting of SGN. Finally, these explorations inspire a number of new questions about sketch-and-project algorithms, some of which have been resolved and many of which remain open.
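For readers unfamiliar with the sketch-and-project framework referenced in the abstract, the following minimal NumPy sketch illustrates the basic iteration for a consistent linear system, using row subsampling as the sketch (which recovers block Kaczmarz). All names, parameters, and the specific sketch choice here are illustrative assumptions, not details taken from the talk.

import numpy as np

def sketch_and_project(A, b, batch_size=10, iters=500, seed=0):
    """Minimal sketch-and-project solver for a consistent system Ax = b.

    Each step samples a random row block (the "sketch") and projects the
    current iterate onto the solution set of the sketched system
    A[idx] x = b[idx]. Row subsampling is one member of the
    sketch-and-project family; other sketches (e.g. Gaussian) fit the
    same template.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        idx = rng.choice(m, size=batch_size, replace=False)  # sketch = random row subset
        As, bs = A[idx], b[idx]
        # Projection step: x <- x - As^+ (As x - bs), via the pseudoinverse
        x -= np.linalg.pinv(As) @ (As @ x - bs)
    return x

# Demo on a random consistent system (hypothetical sizes)
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 50))
x_true = rng.standard_normal(50)
b = A @ x_true
x = sketch_and_project(A, b)
print("residual norm:", np.linalg.norm(A @ x - b))

The relevance to the abstract is that, for a linear model with a quadratic loss, the SNG and SGN updates can be cast in this same sketched-projection form, which is what makes convergence guarantees for arbitrary batch sizes available.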
This seminar is part of the Recent Progress and Open Directions in Matrix Computations series.