Friday, July 31, 2020

Theory at the Institute and Beyond

Amid the biggest pandemic in a century, one that has disrupted lives and livelihoods the world over, it is gratifying to see how people in different walks of life have found ways to cope and carry on. Within the realm of theory research, the pursuit to better understand the foundations of computation and their implications doesn’t seem to have slowed down even a bit.

Make no mistake, the pandemic has disrupted our traditional modes of operation. A typical theorist might spend several hours each day brainstorming in a group, often over a beverage. For most theorists, this is the single most productive and enjoyable activity each day. As these meetings move online, they remain a shadow of what they used to be. Normally, surprising results often arise out of chance encounters between researchers from very different areas. As conferences and workshops shift online, however, these chance encounters become very rare. Finally, on most days, we are struggling along an unforgiving trail in an attempt to scale a seemingly insurmountable peak. Doubts — such as, Are we on the right trail? Even if we are on the right trail, are we strong enough to get through it? — often linger and can easily set one up for failure. Sharing these “theoretical” struggles with other researchers over lunch or in the corridors can be critical to keep us going. Sadly, these opportunities are rare these days.

Yet theory research does not seem to have missed a beat. The ACM STOC conference was held online for the first time. Despite the limitations of the medium, there are many silver linings to an online conference. Participation at the conference nearly doubled from last year, with 606 participants from 33 countries, many of which had not been represented at typical STOC conferences before. The videos of the conference talks are all available online, a fantastic resource for researchers going forward. Finally, as Ronen Eldan’s talk at the conference beautifully demonstrated, video is a really effective medium to communicate a research work in broad strokes in a very short time. In fact, this year’s online conference seemed to have so many advantages that the PC chair, Julia Chuzhoy, suggested holding the STOC conference twice each year, once online and once offline.

As weekly seminars at most universities move online, they have begun to attract participants from across the world. CS Theory Online Talks maintains a list of theory talks that are available online (also see here and here). The PIMS-CRM summer school on probability has morphed into a great set of online courses that I have really enjoyed. The Oxford-Warwick Complexity Meetings are an online lecture series dedicated to complexity theory, while we at the Simons Institute are also hosting a lecture series, on Boolean function analysis (more on this later). This flurry of online activity catalyzed by the pandemic is promising to make theory research broadly accessible to graduate students and undergraduates across the globe.

Meanwhile, fantastic new results keep pouring in. The biggest breakthrough this summer is the work of Karlin, Klein, and Oveis Gharan on the metric traveling salesman problem (metric TSP). They have posted a 1.5 - ε approximation algorithm for metric TSP for some constant ε > 10 ^{^-³⁶}. Metric TSP is a fundamental combinatorial optimization problem wherein the inputs consist of a network of cities and the distances between them. The goal is to find the shortest-length route that visits each city exactly once and returns to the starting point. This problem is called metric TSP if the distances between cities are assumed to satisfy the triangle inequality, namely the distance from City A to City C is at most the sum of the distances from City A to City B and from City B to City C.

The Christofides algorithm, discovered in 1976, achieves a 1.5-approximation to the optimal solution and is a classic algorithm taught in undergraduate classes on algorithms. Despite significant efforts, a better algorithm for metric TSP remained elusive for more than four decades. In 2011, two approaches were developed to improve 1.5 in the case of shortest-path metrics on unweighted graphs: one by Oveis Gharan, Saberi, and Singh and one by Mömke and Svensson. This breakthrough builds on the approach of Oveis Gharan, Saberi, and Singh to design the improved approximation algorithm, breaking the 1.5-approximation barrier that stood for four decades.

As the summer wraps up, the Simons Institute is organizing a series of lectures on the analysis of Boolean functions, which is a fundamental tool in a theorist’s tool kit and one that has innumerable applications in almost all areas of theoretical computer science, including learning theory, hardness of approximation, circuit complexity, pseudorandomness, and quantum computation. Over the past couple of years, the field of analysis of Boolean functions has seen several major advances. Long-standing open questions like the sensitivity conjecture have been resolved, and there has been substantial progress on others such as the sunflower lemma and the Fourier entropy conjecture. There has been an influx of new ideas from areas such as stochastic calculus. This lecture series features weekly two-hour lectures from the key players behind these advances.

In the first week of August, the Institute will host a fascinating interdisciplinary workshop titled Decoding Communication in Nonhuman Species. Communication in nonhuman animals and plant life has fascinated scientists for a long time, but with the recent advances in machine learning, there is a chance that one can actually decode these communications. To make this possible, one needs to bring together and build on the latest discoveries of experts in disparate fields, including machine learning, signal processing, data science, linguistics, and acoustics. The goal of the workshop is to explore the challenges and current state of the art in the study of nonhuman species’ communication as well as disciplines and technologies that are likely to provide solutions to these challenges.

After a brief summer hiatus, the Simons Institute is gearing up for an exciting fall semester. Unfortunately, it appears that events at the Institute will be online at least during the first half of the semester. But judging by the level of excitement among the organizers and potential participants, we expect this fall’s programs to be no less intense. Valuable lessons learned from our experience organizing part of the spring semester online, and through the various conferences and workshops in the past few months, will be incorporated.

Over the fall, the Institute will host two synergistic programs, one titled Probability, Geometry, and Computation in High Dimensions and the other on Theory of Reinforcement Learning.

Probability, Geometry, and Computation in High Dimensions: In the study of algorithms for statistical inference tasks in high dimensions, there is a rich interplay among the areas of probability, geometry, and computation. For example, ideas from statistical physics have been brought to bear in understanding algorithm design and structural properties in probabilistic settings. A large body of computational and structural phase transitions predicted via statistical physics is only recently being rigorously validated. Similarly, the phenomenon of “concentration of measure” lies at the intersection of probability and geometry but is related to the problem of dimension-free guarantees for many important algorithms.

The program aims to bring together computer scientists, mathematicians, physicists, and statisticians toward addressing basic questions such as: Which properties of data can be learned from a small number of samples? Can we characterize general trade-offs between the quality of data (statistical information) and the availability of computational resources? For what kinds of problems is it possible to make algorithmic guarantees that are dimension-free or have minimal dimension dependence?

The first workshop of the program, to be held September 21 to 25, will be devoted to rigorous evidence for abrupt change in structural properties and computational complexity (computational phase transitions) predicted by statistical physics for a variety of statistical problems. The second workshop will be concerned with concentration of measure, a ubiquitous phenomenon in high-dimensional spaces and a key ingredient in several randomized algorithms and Monte Carlo sampling methods. This workshop will bring high-dimensional geometers and probabilists together with computer scientists to share ideas on applications as well as state-of-the-art techniques. The third and final workshop of the program will focus on algorithms for learning and testing that are dimension-free or have a mild dependence on the dimension. With data being increasingly high dimensional, devising algorithms with such guarantees is not only desirable but also necessary.

Theory of Reinforcement Learning: In recent years, reinforcement learning has found exciting new applications to problems in artificial intelligence and robotics. Many of these advances were made possible by a combination of large-scale computation, innovative use of flexible neural network architectures and training methods, and new and classical reinforcement learning algorithms. However, we do not understand the power and limitations of RL-based algorithms. Reinforcement learning’s core issues, such as efficiency of exploration and the trade-off between the scale and the difficulty of learning and planning, have been extensively studied. Yet when it comes to design of scalable algorithms, many fundamental questions remain open. This program aims to reunite researchers across disciplines that have played a role in developing the theory of reinforcement learning.

The first workshop of the program will be devoted to understanding the success of deep neural networks in reinforcement learning, in particular for algorithms that are able to learn in environments previously thought to be much too large. The second workshop will engage theoretical computer scientists to study whether tools from the classical online algorithms literature can be brought to bear on reinforcement learning setups, since online algorithms traditionally have dealt with a dynamic environment and RL is but one approach to model interactions with a dynamic environment.

The third and final workshop will attempt to gather some of the tools needed to satisfactorily find good policies with off-policy data. Typically in RL, it is assumed that one can choose a policy and obtain data generated by that policy (either by running the policy or through simulation). In many applications, however, obtaining on-policy data is impossible, and all one has is a batch set of data that may be generated by a nonstationary and even unknown policy. Estimating the value of new policies in such settings becomes a hard statistical problem, which will be the focus of this workshop.

Both this fall’s programs kick off with boot camps aimed at bringing outsiders up to speed on the area. While the pandemic has stopped us from traveling, it has simultaneously erased geographic barriers in a sense. Wherever you may be, we hope you will join us in what is promising to be an intense and rewarding semester.

Theory at the Institute and Beyond

Related stories