Abstract

We describe a role for differential privacy in open data repositories handling sensitive data. Archival repositories in the human sciences balance discoverability and replicability with their legal liabilities and ethical constraints to protect sensitive information. The ability to explore differentially private releases of archived data allows a curve-bending change in this trade-off. We further describe PSI, an implementation of a curator system for differentially private queries and statistical models, and its integration with the Dataverse repository. We describe some of the pragmatics of implementing a general purpose curator that works across a wide variety of types of data and types of uses, and of presenting differential privacy to an applied audience new to these concepts.

Video Recording