Abstract

Fitting a model to a collection of observations is one of the quintessential problems in statistics. The typical assumption is that the data was generated by a model of a given type (e.g., a mixture model). This is a simplifying assumption that is only approximately valid, as real datasets are typically exposed to some source of contamination. Hence, any estimator designed for a particular model must also be robust in the presence of corrupted data. Until recently, even for the most basic problem of robustly estimating the mean of a high-dimensional dataset, all known robust estimators were hard to compute. A recent line of work in theoretical computer science obtained the first computationally efficient robust estimators for a range of high-dimensional estimation tasks. In this tutorial talk, we will survey the algorithmic techniques underlying these estimators and the connections between them. We will illustrate these techniques for the problems of robust mean and covariance estimation. Finally, we will discuss new directions and opportunities for future work.

Video Recording