Abstract

We consider the fundamental statistical problem of robustly estimating the mean and covariance of a distribution from i.i.d. samples in n-dimensional space, i.e. in the presence of arbitrary (malicious) noise, with no assumption on the noise. Classical solutions (Tukey point, geometric median) are either intractable to compute efficiently or produce estimates whose error scales as a power of the dimension. In this talk, we present efficient and practical algorithms to compute estimates for the mean, covariance and operator norm of the covariance matrix, for a wide class of distributions. We show that the estimate of the algorithm is higher than the information-theoretic lower bound by a factor of at most the square-root of the logarithm of the dimension. This gives polynomial-time solutions to some basic questions studied in robust statistics. One of the applications is agnostic SVD.

Video Recording