Abstract

As machine learning enters wider use in real-world applications, there is increasing concern about the validity of predictors trained at one institution, e.g. a hospital or a state prison system, and then applied at many others. Empirically, researchers have found that predictors' accuracy plummets in new domains. This raises fairness concerns for communities that lack the resources or the data to build their own predictors from scratch. In this talk I will discuss this phenomenon and a model for one source of inter-institution variability, which separates invariant features from institution-specific features. Our approach uses an adversarial neural network to censor institution-specific features from the data, so that prediction is based only on the invariant features. In contrast to much work in the domain adaptation literature, which assumes access to data from both the training and target domains, we require no data from the target domain at all. When invariant features are discoverable in the data, our method can improve accuracy on completely unseen populations.
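The abstract does not spell out the training setup, but the adversarial censoring idea can be illustrated with a standard domain-adversarial sketch: a shared encoder feeds both a task predictor and an institution classifier, and a gradient-reversal layer pushes the encoder to discard whatever the institution classifier finds useful. The code below is a minimal PyTorch sketch under those assumptions; the names (CensoredPredictor, grad_reverse, the lam weight) and all sizes are hypothetical, not taken from the talk.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward
    pass, so the encoder is updated to hurt the institution classifier."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

class CensoredPredictor(nn.Module):
    def __init__(self, n_features, n_institutions, hidden=64):
        super().__init__()
        # Shared encoder: intended to retain only institution-invariant features.
        self.encoder = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU())
        # Main task head (e.g. a clinical outcome).
        self.predictor = nn.Linear(hidden, 1)
        # Adversary tries to recover the institution from the encoding.
        self.adversary = nn.Linear(hidden, n_institutions)

    def forward(self, x, lam=1.0):
        z = self.encoder(x)
        y_hat = self.predictor(z)                        # task prediction
        inst_hat = self.adversary(grad_reverse(z, lam))  # censoring branch
        return y_hat, inst_hat

# One training step on toy data: minimize task loss while the encoder,
# through the reversed gradient, maximizes the adversary's loss.
model = CensoredPredictor(n_features=20, n_institutions=5)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(32, 20)                    # toy batch of source-domain examples
y = torch.randint(0, 2, (32, 1)).float()   # toy binary labels
inst = torch.randint(0, 5, (32,))          # institution id per example

y_hat, inst_hat = model(x)
loss = (nn.functional.binary_cross_entropy_with_logits(y_hat, y)
        + nn.functional.cross_entropy(inst_hat, inst))
opt.zero_grad()
loss.backward()
opt.step()
```

Note that, consistent with the abstract, nothing from the target institution appears in this loop: censoring is driven entirely by institution labels available at training time.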

Video Recording