This series of talks was part of the Big Data Boot Camp. Videos for each talk are available through the links above.
The introduction of randomization over the last decade into the design and analysis of algorithms for matrix computations has provided a new paradigm, particularly appropriate for many very large-scale applications, as well as a complementary perspective to traditional numerical linear algebra approaches to matrix computations. These novel approaches have been motivated by technological developments in many application areas that permit the automatic generation of large data sets, and they are of particular interest for large-scale data analysis since many data sets are naturally modeled as matrices.
This tutorial will describe the theory underlying randomized algorithms for matrix problems (such as matrix multiplication, least-squares regression, least absolute deviations regression, and low-rank matrix approximation) as well as where that theory is headed. Although the use of randomization is sometimes a relatively-straightforward application of Johnson-Lindenstrauss ideas, Euclidean spaces are much more structured objects than general metric spaces, and thus the best—both in theory and in numerical analysis and data analysis practice—randomized matrix algorithms take this into account. An emphasis will be placed on highlighting a few key concepts, several of which have natural statistical interpretations, that are responsible for the improvements in worst case theory and also for the usefulness of these algorithms in large-scale numerical and data applications.