Spring 2019

Discovery, Replication and Reuse of Sensitive Scientific Data with PSI

Monday, March 4th, 2019, 4:00 pm – 4:45 pm

James Honaker (Harvard University)

We describe a role for differential privacy in open data repositories handling sensitive data. Archival repositories in the human sciences balance discoverability and replicability with their legal liabilities and ethical constraints to protect sensitive information. The ability to explore differentially private releases of archived data allows a curve-bending change in this trade-off. We further describe PSI, an implementation of a curator system for differentially private queries and statistical models, and its integration with the Dataverse repository. We describe some of the pragmatics of implementing a general-purpose curator that works across a wide variety of types of data and types of uses, and of presenting differential privacy to an applied audience new to these concepts.