## Abstract
Training neural networks is a difficult non-convex optimization problem, with possibly numerous local optima and saddle points. Empirical evidence nevertheless suggests that simple gradient-based algorithms are effective. In this work, we analyze the properties of stationary points arising when training one-hidden-layer neural networks with ReLU activations, and show that, under certain conditions on the network weights, every stationary point is a global optimum with high probability. Moreover, we introduce semi-random units, whose activation pattern is determined by a fixed random projection of the input rather than by the trainable weights, and show that networks with these units are guaranteed to converge to a global optimum with high probability.
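To make the distinction concrete, here is a minimal NumPy sketch contrasting a standard ReLU hidden layer with a semi-random one. The unit form 1[r^T x > 0] · (w^T x), along with the names (`X`, `W`, `R`, `v`) and dimensions, is our illustrative reading of "activation pattern determined by a random projection of the input" and not necessarily the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 10, 50, 200                  # input dim, hidden units, samples (illustrative)
X = rng.standard_normal((n, d))        # input data

# Standard ReLU unit: max(0, w^T x). The 0/1 activation pattern and the
# unit's value both depend on the trainable weights W, which couples the
# gating to the parameters being optimized.
W = rng.standard_normal((d, k))        # trainable hidden weights
relu_hidden = np.maximum(X @ W, 0.0)

# Semi-random unit: 1[r^T x > 0] * (w^T x). The activation pattern comes
# from a fixed random projection R and never changes during training; only
# the linear part depends on the trainable weights W.
R = rng.standard_normal((d, k))        # fixed random projection, not trained
gates = (X @ R > 0).astype(X.dtype)    # activation pattern from random projection
semi_random_hidden = gates * (X @ W)

# One-hidden-layer network outputs with output weights v.
v = rng.standard_normal(k)
f_relu = relu_hidden @ v
f_semi = semi_random_hidden @ v
```

With `R` held fixed, `semi_random_hidden` is linear in `W` (and the output `f_semi` is linear in `W` for fixed `v`), so the gating no longer moves with the trained parameters; this decoupling is plausibly the structural property that underlies the high-probability global-convergence guarantee.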