Mixture Models, Robustness, and Sum of Squares Proofs

Abstract

We use the Sum of Squares (SoS) method to develop a new efficient algorithm for clustering and mean estimation in well-separated high-dimensional mixture models, substantially improving upon the statistical guarantees achieved by previous efficient algorithms. In particular, we study mixtures of k distributions, where every pair of distributions has means separated by separated by at least k^epsilon for any epsilon > 0. In the special case of spherical Gaussian mixtures, we give a k^O(1/epsilon^2)-time algorithm that learns the means of the components in the mixture and accurately clusters samples from the mixture. This is the first algorithm to improve on greedy (?single-linkage?) and spectral clustering, breaking a long-standing barrier for efficient algorithms at separation k^1/4. Our techniques are based on adapting algorithmic ideas from robust statistics, and are potentially of independent interest. Our main algorithm for learning mixture models provides an entirely SoS interpretation of the convex programming framework of [Diakonikolas et al, FOCS 16]. We show that many of the proofs from that paper can be replaced with much simpler proofs using only basic concentration and Holder's inequality, which allows us to approach this problem via SoS. As a corollary of this, we also obtain improved rates for robust mean estimation in certain regimes. Joint work with Sam Hopkins (Berkeley).

Mixture Models, Robustness, and Sum of Squares Proofs

Abstract

Video Recording