Fall 2013

Theoretical Foundations of Big Data Analysis

Aug. 22 – Dec. 20, 2013

We live in an era of "Big Data": science, engineering and technology are producing ever larger data streams, with petabyte and exabyte scales becoming increasingly common. In scientific fields such data arise in part because tests of standard theories increasingly focus on extreme physical conditions (e.g., particle physics) and in part because science has become more exploratory (e.g., astronomy and genomics). In commerce, massive data arise because so much of human activity is now online, and because business models aim to provide services that are ever more personalized.

The Big Data phenomenon presents opportunities and perils. On the optimistic side of the coin, massive data may amplify the inferential power of algorithms that have been shown to be successful on modest-sized data sets. The challenge is to develop the theoretical principles needed to scale inference and learning algorithms to massive, even arbitrary, scale. On the pessimistic side of the coin, massive data may amplify the error rates that are part and parcel of any inferential algorithm. The challenge is to control such errors even in the face of the heterogeneity and uncontrolled sampling processes underlying many massive data sets. Another major issue is that Big Data problems often come with time constraints, where a high-quality answer that is obtained slowly can be less useful than a medium-quality answer that is obtained quickly. Overall we have a problem in which the classical resources of the theory of computation—e.g., time, space and energy—trade off in complex ways with the data resource.

Various aspects of this general problem are being faced in the theory of computation, statistics and related disciplines—where topics such as dimension reduction, distributed optimization, Monte Carlo sampling, compressed sensing, low-rank matrix factorization, streaming and hardness of approximation are of clear relevance—but the general problem remains untackled. This program will bring together experts from these areas with the aim of laying the theoretical foundations of the emerging field of Big Data.

To subscribe to our announcements email list for this program, send an email to sympa [at] lists [dot] simons [dot] berkeley [dot] edu with the message body "subscribe bd2013announcements@lists.simons.berkeley.edu".

Organizers: 
Michael Jordan (UC Berkeley; chair), Stephen Boyd (Stanford University), Peter Bühlmann (ETH Zürich), Ravi Kannan (Microsoft Research India), Michael Mahoney (Stanford University), Muthu Muthukrishnan (Rutgers University and Microsoft Research India).
Long-Term Participants (including Organizers): 
Alexandr Andoni (Microsoft Research), Ivona Bezáková (Rochester Institute of Technology), Peter Bickel (UC Berkeley), Joshua Bloom (UC Berkeley), Sébastien Bubeck (Princeton University), Aydın Buluç (Lawrence Berkeley National Laboratory), Emmanuel Candès (Stanford University), Amit Chakrabarti (Dartmouth College), James Demmel (UC Berkeley), Petros Drineas (Rensselaer Polytechnic Institute), Noureddine El Karoui (UC Berkeley), Michael Friedlander (University of British Columbia), David Gleich (Purdue University), Alexander Gray (Georgia Institute of Technology and Skytree, Inc.), Moritz Hardt (IBM Almaden), Dorit Hochbaum (UC Berkeley), Kazuo Iwama (Kyoto University), Michael Jordan (UC Berkeley; chair), Ravi Kannan (Microsoft Research India), Valerie King (University of Victoria), Jian Li (Tsinghua University), Michael Mahoney (Stanford University), Andrew McGregor (University of Massachusetts), Muthu Muthukrishnan (Rutgers University and Microsoft Research India), Jennifer Neville (Purdue University), Robert Nowak (University of Wisconsin-Madison), Ely Porat (Bar-Ilan University), Yuval Rabani (Hebrew University of Jerusalem), Chris Ré (Stanford University), Benjamin Recht (UC Berkeley), Peter Richtarik (University of Edinburgh), Richard Samworth (University of Cambridge), Leonard Schulman (California Institute of Technology), Daniel Štefankovič (University of Rochester), Mario Szegedy (Rutgers University), Joel Tropp (California Institute of Technology), David Tse (Stanford University), Suresh Venkatasubramanian (University of Utah), Martin Wainwright (UC Berkeley), David Woodruff (IBM Almaden), Bin Yu (UC Berkeley).
Research Fellows: 
Leonid Barenboim (Weizmann Institute), Xi Chen (New York University), Martin Jaggi (ETH Zürich), Mladen Kolar (University of Chicago), Yi Li (Max Planck Institute, Saarbrücken), Han Liu (Princeton University), Sang-Yun Oh (Lawrence Berkeley National Laboratory), Eric Price (University of Texas, Austin; Google Research Fellow), Or Sheffet (Harvard University), Nikhil Srivastava (Microsoft Research India), Justin Thaler (Yahoo Labs; Microsoft Research Fellow), Caroline Uhler (IST Austria).
Visiting Graduate Students: 
John Duchi (UC Berkeley), Sagar Kale (Dartmouth College), Arindam Khan (Georgia Institute of Technology), Jakub Konečný (University of Edinburgh), Martin Takáč (University of Edinburgh), Gongguo Tang (Colorado School of Mines), Yixin Xu (Rutgers University).

Workshops

Sept. 3 – Sept. 6, 2013
Organizers: Michael Jordan (UC Berkeley)
Sept. 16 – Sept. 19, 2013
Organizers: Petros Drineas (Rensselaer Polytechnic Institute; chair), Francis Bach (INRIA and École Normale Supérieure Paris), Peter Bühlmann (ETH Zürich), Emmanuel Candès (Stanford University), Piotr Indyk (Massachusetts Institute of Technology), Ravi Kannan (Microsoft Research India), Muthu Muthukrishnan (Rutgers University and Microsoft Research India), Robert Nowak (University of Wisconsin-Madison), Stephen Wright (University of Wisconsin-Madison)
Oct. 21 – Oct. 24, 2013
Organizers: Michael Mahoney (Stanford University; chair), Guy Blelloch (Carnegie Mellon University), John Gilbert (UC Santa Barbara), Chris Ré (Stanford University), Martin Wainwright (UC Berkeley)
Nov. 18 – Nov. 21, 2013
Organizers: Michael Kearns (University of Pennsylvania; co-chair), Jennifer Neville (Purdue University; co-chair), Deepak Agarwal (LinkedIn), Edo Airoldi (Harvard University), Ashish Goel (Stanford University), Matt Jackson (Stanford University)
Dec. 11 – Dec. 14, 2013
Organizers: Kunal Talwar (Microsoft Research; chair), Avrim Blum (Carnegie Mellon University), Kamalika Chaudhuri (UC San Diego), Cynthia Dwork (Microsoft Research), Michael Jordan (UC Berkeley)

Those interested in participating in this program should send email to the organizers at bigdata [at] lists [dot] simons [dot] berkeley [dot] edu.

Program image: "Say Big Oh" by Muthu.