Succinct Data Representations and Applications

Program

Theoretical Foundations of Big Data Analysis

Date

Monday, Sept. 16 – Thursday, Sept. 19, 2013

About

Update:

This workshop will run from Monday, September 16 to Thursday, September 19. There will be no Friday session. All talks will take place in the Chevron Auditorium, International House, UC Berkeley. The reception on Monday afternoon will be held at Calvin Lab.

A broadly successful approach to Big Data analysis involves understanding and manipulating not the raw data, but the essence of the data.

This may apply when we "capture" the data during measurements, as in compressed sensing, sampling or streaming algorithms. Rather than capturing all the data, we capture only a representation suitable for subsequent analyses; in many applications, this representation is succinct (far smaller than the original data) yet adequate, at least for approximate analyses.
It may also apply when we "store and transport" data, as in compression, distributed sensing and data fusion: a succinct summary of the data might well suffice and save significantly on communication between multiple sites.
Finally, it may apply when we store, analyze and mine data, as in signal analysis, statistical analyses, complex query processing, machine learning or optimization: even sophisticated algorithms can often run quickly and with sufficient accuracy on a succinct representation of the data, saving computing time and space.

Thus succinct representations of data, be they for capture, storage, transportation or analysis, not only make dealing with Big Data more efficient, but in some cases even bring a formidable Big Data task into the realm of the feasible. Succinct representations are possible because of the underlying principle of sparsity in nature.

In this workshop, we will cast a broad net and include many of the perspectives on data reduction that have been developed in various fields, including computer science, statistics, applied mathematics and signal processing. We will also highlight the many applications of such techniques in science and engineering. The workshop, which takes place early in the program, will set the stage for the semester-long activity in the general area of resource-constrained Big Data analysis.

Enquiries may be sent to the organizers at this address.

Chairs/Organizers

Petros Drineas

(Purdue University; chair)

Francis Bach

(INRIA and École Normale Supérieure Paris)

Peter Bühlmann

(ETH Zürich)

Emmanuel Candès

(Stanford University)

Piotr Indyk

(Massachusetts Institute of Technology)

Ravi Kannan

(Simons Institute, UC Berkeley)

Muthu Muthukrishnan

(Amazon)

Robert Nowak

(University of Wisconsin-Madison)

Stephen Wright

(University of Wisconsin-Madison)

Invited Participants

Francis Bach (INRIA and ENS Paris), Leonid Barenboim (Ben-Gurion University of the Negev), Ivona Bezáková (Rochester Institute of Technology), Peter Bickel (UC Berkeley), Josh Bloom (UC Berkeley), Sebastien Bubeck (Princeton University), Peter Bühlmann (ETH Zürich), Aydin Buluç (Lawrence Berkeley National Laboratory), Emmanuel Candès (Stanford University), Constantine Caramanis (University of Texas, Austin), Amit Chakrabarti (Dartmouth College), Venkat Chandrasekaran (California Institute of Technology), Xi Chen (Carnegie Mellon University), Artur Czumaj (University of Warwick), Alexandre d'Aspremont (CNRS - ENS Paris), Anindya De (UC Berkeley), Jim Demmel (UC Berkeley), Amit Deshpande (Microsoft Research), Petros Drineas (Rensselaer Polytechnic Institute), Noureddine El Karoui (UC Berkeley), Maryam Fazel (University of Washington), Michael Friedlander (University of British Columbia), Anna Gilbert (University of Michigan), David Gleich (Purdue University), Alex Gray (Georgia Institute of Technology), Sudipto Guha (University of Pennsylvania), Moritz Hardt (IBM Almaden), Steven Heilman (Courant Institute, NYU), Kazuo Iwama (Kyoto University), Martin Jaggi (École Polytechnique), Ming Jin (UC Berkeley), Michael Jordan (UC Berkeley), Sagar Kale (Dartmouth College), Ravi Kannan (Microsoft Research India), Anthony Kim (UC Berkeley), Valerie King (University of Victoria), Mladen Kolar (Carnegie Mellon University), Jakub Konečný (University of Edinburgh), John Lafferty (Carnegie Mellon University), Liza Levina (University of Michigan), Jian Li (Tsinghua University), Lisha Li (UC Berkeley), Ping Li (Rutgers University), Yi Li (University of Michigan), Edo Liberty (Yahoo! Research), Han Liu (Princeton University), Michael Mahoney (Stanford University), Andrew McGregor (University of Massachusetts), Nicolai Meinshausen (University of Oxford and ETH Zürich), Muthu Muthukrishnan (Rutgers University and Microsoft Research India), Jelani Nelson (Harvard University), Jennifer Neville (Purdue University), Rob Nowak (University of Wisconsin-Madison), Sang-Yun Oh (Stanford University), Ely Porat (Bar-Ilan University), Eric Price (Massachusetts Institute of Technology), Chris Ré (Stanford University), Ben Recht (UC Berkeley), Peter Richtarik (University of Edinburgh), Ronitt Rubinfeld (Massachusetts Institute of Technology), Richard Samworth (University of Cambridge), Leonard Schulman (California Institute of Technology), Or Sheffet (Carnegie Mellon University), Nikhil Srivastava (Microsoft Research India), Daniel Štefankovič (University of Rochester), Mario Szegedy (Rutgers University), Justin Thaler (Harvard University), Joel Tropp (California Institute of Technology), David Tse (UC Berkeley), Caroline Uhler (IST Austria), Santosh Vempala (Georgia Institute of Technology), Suresh Venkatasubramanian (University of Utah), Martin Wainwright (UC Berkeley), David Woodruff (IBM Almaden Research), Mary Wootters (University of Michigan), John Wright (Columbia University), Stephen Wright (University of Wisconsin-Madison), Bin Yu (UC Berkeley), Ming Yuan (University of Wisconsin-Madison).