Abstract
Unconstrained optimization of a smooth nonconvex objective over many variables is a classic problem in optimization. Several effective techniques have been proposed over the years, along with results about global and local convergence. There has been an upsurge of interest recently on techniques with good global complexity properties. (This interest is being driven largely by researchers in machine learning, who want to solve the nonconvex problems arising from neural network training and robust statistics, but it has roots in the optimization literature.) In this talk we describe the algorithmic tools that can be used to design methods with appealing practical behavior as well as provably good global convergence properties. These tools include the conjugate gradient and Lanczos algorithms, Newton's method, cubic regularization, trust regions, and accelerated gradient. We show how these elements can be assembled into a comprehensive method, and compare a number of proposals that have been made to date. If time permits, we will consider the behavior of first-order methods in the vicinity of saddle points, showing that accelerated gradient methods are as unlikely as gradient descent to converge to saddle points, and escapes from such points faster.
This talk presents joint work with Clement Royer and Mike O'Neill (both of U Wisconsin-Madison).