Kevin Leyton-Brown (University of British Columbia)
Calvin Lab Room 322
Understanding the Empirical Hardness of NP-Complete Problems
Problems are intractable when they “can be solved, but not fast enough for the solution to be usable” [Hopcroft, Motwani & Ullman, 2007]. NP-complete problems are commonly said to be intractable; however, the reality is more complex. All known algorithms for solving NP-complete problems require exponential time in the worst case; however, these algorithms nevertheless solve many problems of practical importance astoundingly quickly, and are hence relied upon in a broad range of applications. The propositional satisfiability problem (SAT) serves as a good example. The FCC's upcoming multi-billion dollar spectrum auctions will be run by solving hundreds of thousands of SAT-encoded graph coloring problems, typically each involving tens of thousands of variables and millions of constraints. These instances can often be solved in seconds, even though the same solvers can be stymied by hand-crafted instances involving only hundreds of variables. Clearly, we could benefit from a more nuanced understanding of algorithm behaviour than is offered by asymptotic, worst-case analysis.
Our work asks the question most relevant to an end user: “How hard is it to solve a given family of problem instances, using the best available methods?” Formal, complexity-theoretic analysis of this question seems hopeless: the best available algorithms are highly complex (and, in some cases, only available in compiled form), and instance distributions representative of practical applications are heterogeneous and richly structured. For this reason, we turn to statistical, rather than combinatorial, analysis. The main claim of this talk is that rigorous statistical methods can characterize algorithm runtime with high levels of confidence. More specifically, this talk surveys over a decade of research showing how to build empirical hardness models (EHMs) that, given a new problem instance, estimate the runtime of an algorithm in low-order polynomial time. We have shown that it is possible to build quite accurate models for different NP-complete problems (we have studied SAT, combinatorial auction winner determination, mixed integer programming, and the traveling salesman problem), distributions of problem instances (we have considered dozens), solvers (again, dozens). We have robustly found that even very succinct EHMs can achieve high accuracies, meaning that they describe simple relationships between instance characteristics and algorithm runtime. This makes our approach important even for theoretically inclined computer scientists who prefer proofs to experimental findings: EHMs can uncover new, simple relationships between instance characteristics and runtime, and thereby catalyze new theoretical work.