
Abstract: The foundation of learning theory is the statistical learning framework, which assumes that training and test data are drawn from the same distribution.

As machine learning is increasingly deployed in the real world, this assumption often no longer holds. In this talk, we will take a closer theoretical look at two cases where it is violated. The first is robustness to adversarial examples, where test data consists of perturbed versions of legitimate inputs drawn from the underlying data distribution. The second is imbalanced classification, where the training data is highly imbalanced while the test data is less so.

Bio: Kamalika Chaudhuri is a Research Scientist at Meta AI and a Professor at the University of California, San Diego. She received a Bachelor of Technology degree in Computer Science and Engineering from the Indian Institute of Technology, Kanpur, in 2002, and a PhD in Computer Science from the University of California, Berkeley, in 2007. She received an NSF CAREER Award in 2013 and a Hellman Faculty Fellowship in 2012. She has served as program co-chair for AISTATS 2019 and ICML 2019, and as General Chair for ICML 2022.

Kamalika’s research interests lie in the foundations of trustworthy machine learning, or machine learning beyond accuracy. This includes problems such as learning from sensitive data while preserving privacy, learning under sampling bias, learning in the presence of an adversary, and learning from off-distribution data.