This series of talks is part of the Algorithms and Uncertainty Boot Camp. Videos for each talk will be available through the links above.
Speaker: Eli Upfal, Brown University
Sampling is a powerful technique at the core of statistical data analysis and machine learning. Using a finite, often small, set of observations, we attempt to estimate properties of an entire sample space. How good are the estimates obtained from a sample? Any rigorous application of sampling requires an understanding of the sample complexity of the problem – the minimum sample size needed to obtain the required results. In this tutorial we will cover some of the rich mathematical theory that has been developed in recent years to address this question, in particular in the context of statistical machine learning and rigorous data mining.
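To make the notion of sample complexity concrete, here is a minimal sketch (not from the talk itself) of the classical single-estimate case: by Hoeffding's inequality, estimating the mean of a [0, 1]-valued random variable to within ϵ with probability at least 1 − δ requires about ln(2/δ)/(2ϵ²) samples. The function name and the coin-flip example below are illustrative choices, not part of the tutorial.

```python
import math
import random

def hoeffding_sample_size(eps, delta):
    """Minimum n of i.i.d. samples from a [0, 1]-valued distribution so that
    the empirical mean is within eps of the true mean with probability at
    least 1 - delta, via Hoeffding's inequality: n >= ln(2/delta) / (2*eps^2)."""
    return math.ceil(math.log(2 / delta) / (2 * eps ** 2))

# For a 0.05 error tolerance with 95% confidence, 738 samples suffice.
n = hoeffding_sample_size(eps=0.05, delta=0.05)

# Illustrative use: estimate the bias of a coin with true heads probability 0.3.
random.seed(0)
estimate = sum(random.random() < 0.3 for _ in range(n)) / n
```

Note that this bound covers a single fixed estimate; the uniform convergence results in the tutorial (VC-dimension, Rademacher complexity) extend such guarantees simultaneously over an entire family of events or hypotheses.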
Main topics: uniform convergence; VC-dimension and the ϵ-net and ϵ-sample theorems; Rademacher complexity; applications in machine learning and data mining.