Abstract

The seminal works of Karger and Benczur and Karger from the 90s showed how graphs can be "sparsified" to size nearly linear in the number of vertices, so that every cut is preserved to within multiplicative factors close to 1. This work motivates the general notion of the ``sparsification of a structure with respect to a class of queries'' which should roughly produce a compressed representation of the structure while answering every query in the class approximately correctly. One may ask what else beyond "graphs with respect to cut queries" can be so sparsified, and the ensuing years have discovered many new structures and sparsifications and derived potent applications. In 2015, Kogan and Krauthgamer suggested an umbrella to unify some of this quest by investigating CSP (Constraint Satisfaction Problem) sparsification: --- here the structure to be compressed is an instance of the CSP on n variables and a query is an assignment to the n variables, and the goal is to approximately determine the number of constraints satisfied by the queried assignment. This naturally captures graph sparsification, but also hypergraph sparsifications and many variants. Several follow up works gave some non-trivial CSPs that allow for near linear (in n) sparsifications, including classification of all binary CSPs (where each constraint applies to two variables) that allow such sparsification, but the general picture remains wide open.
In our works we introduce a new class of sparsification problems, namely code sparsification, where the structure to be preserved is a linear error correcting code; the query is a message, and the goal is to compute the approximate weight of the encoding of the message. We show that this class of problems gives the right language to abstract the techniques of Karger and Benczur and Karger --- and indeed all codes can be sparsified to length nearly linear in the number of message bits. This generalization already resolves some basic questions in CSP sparsification. A further generalization to additive codes over finite abelian groups gives even more powerful results and in particular completely classifies the class of symmetric Boolean CSPs that allow nearly linear sized sparsification. (Prior to our work even the case of sparsification of 3-XOR constraints was open.)
Based on joint works with Sanjeev Khanna (Penn) and Aaron (Louie) Putterman (Harvard).