Abstract

As the sizes of modern datasets grow, many classical algorithms become prohibitively expensive, as the input is often too large to be stored in the memory of a single compute node. Streaming algorithms are designed to handle massive inputs using limited space: they process a stream of data items 'on the fly' while maintaining a small memory footprint at all times. In this talk I will discuss classical techniques for solving basic statistical estimation problems (e.g., moment estimation, distinct elements) on data streams, as well as recent results on graph analysis in the streaming model.

Video Recording