Tuesday, Apr. 28, 2015

From the Inside: Information Theory

By Venkat Guruswami

When Alan Turing visited Bell Labs during WWII, apparently Claude Shannon and he had tea together every day. What they talked about is no longer known, but one of the principal aims of the Simons program on Information Theory is to act as a "channel" that fosters new dialogues and deepens existing connections between the Information Theory community (with Shannon as its founding figure) and the Theoretical Computer Science community (with the Turing machine being the mathematical underpinning for computation).

The subject of information theory was born in Shannon's monumental 1948 work laying the mathematical foundations of communication and identifying the potential and limits of reliable information processing. The subject immediately flourished within the electrical engineering community, alongside the related field of coding theory, which tackled the challenge of constructing explicit schemes to realize the potential shown to be possible by information theory. The algorithmic tasks of encoding and recovering data using error-correcting codes formed a natural link to computation, though computer science as a field wasn't even formally born yet. Indeed Peterson's 1960 algorithm for error-correcting Reed-Solomon codes was one of the earliest non-trivial polynomial time algorithms, well before polynomial time was formalized as the mathematically robust notion of efficient tractability.

Even after theoretical computer science was firmly established and flourishing as a discipline, for a long while communication theory was primarily studied in the electrical engineering world, alongside some work in coding theory in the mathematics community. However, starting in the late 80s and 90s, and continuing actively to this day, computer scientists are also addressing key challenges and setting influential agendas in coding theory (particularly in the more combinatorial setting of worst-case errors). The many-fold motivations for this include algorithmic challenges relating to efficient error-correction, applications of codes in cryptography, and the emergence of algebraic codes as a key tool in complexity theory. The resulting new developments in coding theory include expander codes, list decoding, locally testable codes, sub-linear time decoding, etc. In the other direction, information theory continued to broaden its frontiers, with much exciting work focusing on problems with a significant computational component such as iterative message passing algorithms, network coding, sparse signal recovery, and statistical inference.

Further, information theory, with its uncanny expressiveness and mathematical power, has also extended its influence well beyond its origins in communication theory. Information-theoretic arguments are key to many results in complexity theory, such as lower bounds in communication complexity, data structures, streaming computation, and hashing -- and in combinatorics, such as counting estimates, concentration inequalities, and hypercontractivity bounds. While coding theory progressed largely independently (in terms of technical aspects) from information theory, Arikan's remarkable recent discovery of polar codes shows the power of information-theoretic reasoning in constructing capacity-achieving codes for a whole range of scenarios.

This rich context calls for increased and deeper interactions among information theory, coding theory, and the theory of computing in the years ahead. Due to traditional discipline boundaries, as well as some differences in perspectives and cultures, many of the above-mentioned topics, however, have been largely studied within one community. This has been changing to some extent in recent years, with such examples as list decoding, polar codes, coding for distributed storage, network coding, high-dimensional data analysis, and phase transition phenomena attracting parallel investigation in both electrical engineering (EE) and computer science (CS) communities. The objective of this Simons program is to build upon this recent confluence and facilitate increased collaborations between these scientific communities so that such a cross-disciplinary study becomes the norm rather than the exception.

The program kick-started with a “boot camp” consisting of five mini-courses, covering a breadth of currently active trends in information theory, intended to acquaint program participants with the key themes of the program. The program was successful in genuine transfer of information across traditional EE/CS boundaries, giving just the right start to the program.

During the semester, the Institute hosted three workshops addressing the three broad technical themes that the program focuses on. The first of these workshops focused on emerging trends in coding theory, both on new requirements driven by modern applications (such as in distributed or flash storage, or high-throughput optimal networks), as well as cross-fertilization of concepts recently developed in one of the two communities (e.g., polar codes, spatial coupling, sub-linear decoding, and non-malleable coding). It was quite apparent during the workshop that this was a uniquely exciting period for coding theory, with many mathematically non-trivial challenges emerging out of profoundly influential applications, and increased synergy between the complementary viewpoints of EE/CS communities.

The second workshop will focus on the important role information theory can play in dealing with today's explosion of data in multiple application domains. It will bring together a multidisciplinary group of participants from the information theory, machine learning, and “big data” communities, to compare and combine different techniques and approaches, and pave the way for their adoption in diverse application areas.

The third workshop, titled Information Theory in Complexity Theory and Combinatorics, is a timely one given that information-theoretic methods have revolutionized recent research in diverse topics within theoretical computer science. Many of the notions underlying this trend, for example message compression in interactive settings, information complexity as a resource measure, and coding for interactive communication, should be of general interest to the information theory community, whose rich experience could provide valuable new perspectives on the challenges ahead (for instance in extending these developments to the multiparty as opposed to the two-party setting). Finally, information theory has given gems of proofs for many fundamental results in combinatorics, and the workshop will reflect on and provide further scope for such interactions between information theory and combinatorics. After all, it is entropy that counts.

Letter from the Director
Through the Computational Lens: Interdisciplinary Collaboration at the Simons Institute
Research Vignette: Faster Algorithms for Linear Programming
Research Vignette: Semialgebraic Geometry of Ranks
Program Preview: Cryptography

From the Inside: Information Theory

Related Articles:

Related stories