Understanding Generalization in Adaptive Data Analysis

Workshop

Data Privacy: Planning Workshop

Speaker(s)

Vitaly Feldman (IBM Almaden)

Location

Date

Tuesday, May 23, 2017

Time

5 – 5:30 p.m. PT

Abstract

Datasets are often reused to perform multiple statistical analyses adaptively, so that each analysis may depend on the outcomes of previous analyses on the same dataset. Standard statistical guarantees do not account for these dependencies, and little is known about how to provably avoid overfitting. In this talk I'll describe recent work that provides a new framework for addressing this problem. I'll then describe several approaches to the problem based on techniques developed in the context of differential privacy.
Based on joint work with Dwork, Hardt, Pitassi, Reingold, Roth, and Steinke.
