Calvin Lab Auditorium
We review models and algorithms for private analysis of massive dynamic data. We first consider algorithms that satisfy the combined constraints of the streaming model and differential privacy: they make a constant number of passes over the data, keep sublinear space, and output a differentially private statistic. Then we consider a natural further requirement for dynamic data: that the algorithm continuously updates a statistic as new data arrives, while keeping the entire sequence of updates private. Finally, we explore the connections between streaming algorithms and pan-private algorithms, i.e. algorithms whose memory state is private, so that their guarantees hold even in the face of a security breach. While the strong pan-privacy requirement forbids explicitly storing sketches of the data, we show we can store sufficient statistics on a sketch in order to estimate interesting quantities, such as the number of distinct elements or the number of heavy hitters.