Abstract

We review models and algorithms for private analysis of massive dynamic data. We first consider algorithms that satisfy the combined constraints of the streaming model and differential privacy: they make a constant number of passes over the data, keep sublinear space, and output a differentially private statistic. Then we consider a natural further requirement for dynamic data: that the algorithm continuously updates a statistic as new data arrives, while keeping the entire sequence of updates private. Finally, we explore the connections between streaming algorithms and pan-private algorithms, i.e. algorithms whose memory state is private, so that their guarantees hold even in the face of a security breach. While the strong pan-privacy requirement forbids explicitly storing sketches of the data, we show we can store sufficient statistics on a sketch in order to estimate interesting quantities, such as the number of distinct elements or the number of heavy hitters.

Video Recording