Abstract

Robust statistics traditionally focuses on outliers, or perturbations in total variation (TV) distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the robust statistics approach to consider perturbations under any Wasserstein distance, and show that robust estimation is possible whenever a distribution’s population statistics are robust under a certain family of friendly perturbations. This generalizes a property called resilience, previously employed in the special case of mean estimation with outliers. We justify the generalized resilience property by showing that it holds under moment or hypercontractive conditions, and compare results obtained under TV and Wasserstein perturbations. We present two approaches for designing minimum distance estimators with good finite-sample rates: weakening the discrepancy and expanding the set of distributions. Joint work with Banghua Zhu and Jacob Steinhardt. https://arxiv.org/abs/1909.08755