Abstract

Sampling is a powerful tool that is widely used in statistics and machine learning for nonparametric / distribution-free inference. Interestingly, it is also the centerpiece of many useful algorithms and much fundamental theory in differential privacy, under the name of “privacy amplification” or the “subsampling lemma”. Roughly, the lemma says that if we apply an (ε, δ)-DP mechanism to a **randomly** chosen subsample of the data, then a stronger “amplified” privacy guarantee of (O(γε), γδ)-DP can be proven, where γ is the sampling ratio.
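As a rough illustration of the (O(γε), γδ) statement, the sketch below evaluates one commonly cited closed form of the amplified guarantee, ε' = log(1 + γ(e^ε − 1)) and δ' = γδ. The function name `amplified_guarantee` is hypothetical, γ is taken to be the sampling ratio, and the exact constants depend on the sampling scheme and the neighboring relation, so this is an assumption-laden sketch rather than the result presented in the talk.

```python
import math


def amplified_guarantee(eps, delta, gamma):
    """Return (eps', delta') for an (eps, delta)-DP mechanism applied to a
    random subsample with sampling ratio gamma.

    Assumption: uses the closed form eps' = log(1 + gamma * (exp(eps) - 1))
    and delta' = gamma * delta. This is an illustration only.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    if eps < 0.0 or delta < 0.0:
        raise ValueError("eps and delta must be non-negative")
    # log1p and expm1 stay accurate when gamma or eps is very small.
    eps_amp = math.log1p(gamma * math.expm1(eps))
    return eps_amp, gamma * delta


if __name__ == "__main__":
    eps_amp, delta_amp = amplified_guarantee(eps=1.0, delta=1e-5, gamma=0.01)
    print(eps_amp, delta_amp)
```

For small ε the amplified ε' is close to γε, which is where the O(γε) in the lemma comes from; with γ = 1 (no subsampling) the original guarantee is recovered.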

In this talk, I will present a new privacy amplification result under the formalism of Rényi Differential Privacy (RDP). The result allows us to perform tighter privacy composition over a sequence of heterogeneous subsampled mechanisms, and also to understand the nature of subsampled mechanisms on a more fine-grained scale.
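To make the composition step concrete, here is a minimal sketch of RDP accounting over heterogeneous mechanisms. It relies on two standard facts: RDP guarantees add up under composition at a fixed order α, and an (α, ε(α))-RDP guarantee implies (ε(α) + log(1/δ)/(α − 1), δ)-DP. As a stand-in it uses plain (not subsampled) Gaussian mechanisms with sensitivity 1, whose RDP curve is α/(2σ²); the subsampled bound from the talk is not reproduced here, and all function names are hypothetical.

```python
import math


def gaussian_rdp(alpha, sigma):
    """RDP of order alpha for the Gaussian mechanism with noise scale sigma,
    assuming L2 sensitivity 1 (a stand-in, not the subsampled bound)."""
    return alpha / (2.0 * sigma ** 2)


def compose_rdp(alpha, sigmas):
    """RDP at a fixed order alpha adds up across composed mechanisms."""
    return sum(gaussian_rdp(alpha, s) for s in sigmas)


def rdp_to_dp(sigmas, delta, alphas):
    """Convert the composed RDP curve to an (eps, delta)-DP guarantee by
    taking the best order alpha from a candidate grid (all alpha > 1)."""
    return min(
        compose_rdp(a, sigmas) + math.log(1.0 / delta) / (a - 1.0)
        for a in alphas
    )


if __name__ == "__main__":
    sigmas = [2.0, 4.0, 4.0, 8.0]  # heterogeneous noise scales
    alphas = [1.0 + 0.5 * k for k in range(1, 200)]  # orders 1.5, 2.0, ...
    print(rdp_to_dp(sigmas, delta=1e-5, alphas=alphas))
```

Because the conversion is valid for every order, minimizing over a grid of α values can only tighten the final ε; a finer or wider grid trades computation for tightness.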

I will talk about the mathematics underlying this problem as well as the numerical problems that arise when implementing the bound.

If time permits, I will also talk about recent unpublished work that addresses the same problem (but more precisely) for the case of Poisson sampling, in which each data point is selected independently at random.
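Poisson sampling as described above can be sketched in a few lines: every data point is kept independently with the same probability, so the subsample size is itself random. The function name `poisson_subsample` and the parameter `gamma` (the per-point inclusion probability) are illustrative assumptions.

```python
import random


def poisson_subsample(data, gamma, rng=None):
    """Keep each element of data independently with probability gamma.

    The subsample size is random (Binomial(len(data), gamma)), unlike
    fixed-size sampling without replacement.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    rng = rng or random.Random()
    # rng.random() is uniform on [0, 1), so the test succeeds w.p. gamma.
    return [x for x in data if rng.random() < gamma]


if __name__ == "__main__":
    data = list(range(1000))
    sample = poisson_subsample(data, gamma=0.05, rng=random.Random(0))
    print(len(sample))  # varies with the seed; about 50 on average
```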

Video Recording