Abstract
Identifying correlations within multiple streams of high-volume time series is a general but challenging problem. A simple exact solution has cost that is linear in the dimensionality of the data and quadratic in the number of streams. In this work, we use dimensionality reduction techniques (sketches), along with ideas derived from coding theory and fast matrix multiplication, to allow fast (subquadratic) recovery of those pairs that display high correlation.
Joint work with Jacques Dark.
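
To make the cost claims in the abstract concrete, the following is a minimal illustrative sketch, not the method of this work. It compares the simple exact all-pairs computation (linear in the series length d, quadratic in the number of streams n) with a generic Gaussian random-projection sketch that shortens each series from d to k values. The projection, the function names, the sizes n, d, k, and the planted correlated pair are all assumptions made for illustration. The sketch only shows the dimensionality-reduction step: it still compares every pair, so it remains quadratic in n, and the coding-theory and fast matrix multiplication ideas that give subquadratic recovery are not shown.

```python
import numpy as np


def standardize(streams):
    """Center each row and scale it to unit Euclidean norm."""
    centered = streams - streams.mean(axis=1, keepdims=True)
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)


def exact_correlations(streams):
    """Exact Pearson correlations for all pairs: O(n^2 * d) work."""
    z = standardize(streams)
    return z @ z.T


def sketched_correlations(streams, k, rng):
    """Approximate correlations from k-dimensional sketches: O(n^2 * k)
    work for the pairwise step, after an O(n * d * k) sketching pass.

    Assumption: a dense Gaussian projection, chosen only for simplicity.
    """
    z = standardize(streams)
    d = z.shape[1]
    projection = rng.standard_normal((d, k)) / np.sqrt(k)
    sketches = z @ projection
    return sketches @ sketches.T


def top_pair(corr):
    """Return the off-diagonal index pair with the largest correlation."""
    masked = corr.copy()
    np.fill_diagonal(masked, -np.inf)
    i, j = np.unravel_index(np.argmax(masked), masked.shape)
    return tuple(sorted((int(i), int(j))))


# Hypothetical data: n independent streams with one planted correlated pair.
rng = np.random.default_rng(0)
n, d, k = 50, 2000, 400
streams = rng.standard_normal((n, d))
streams[1] = streams[0] + 0.1 * rng.standard_normal(d)

exact = exact_correlations(streams)
approx = sketched_correlations(streams, k, rng)

print("exact top pair:   ", top_pair(exact))
print("sketched top pair:", top_pair(approx))
```

The sketched estimates are noisy, with error shrinking roughly as 1/sqrt(k), so a strongly correlated pair stands out from the background while weak correlations are not resolved reliably at this sketch size.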