ML for Solvers

Kevin Leyton Brown

Title: Predicting Satisfiability at the Phase Transition via End-to-End Learning

Abstract: Strangely enough, it is possible to use machine learning models to predict the satisfiability status of hard SAT problems with accuracy considerably higher than random guessing. We consider random 3-SAT problems at the solubility phase transition, where exactly 50% of problems are satisfiable and where empirical runtimes of known solution methods scale exponentially with problem size. In this domain, we can, for example, learn computationally efficient models that achieve 84% prediction accuracy on 600-variable problems, which take hours to solve with state-of-the-art methods.

In our first work on this problem (AAAI 2012) we showed that we could make such predictions using simple decision tree models that relied on manually engineered polynomial-time computable features (e.g., linear programming relaxations). We also found that our models could generalize across problem sizes, maintaining high accuracy on large problems even when trained only on very small problems.

More recently (AAAI 2020), we showed that this phenomenon does not depend on access to hand-crafted features. Instead, we achieved even better performance (including generalization across problem size) via end-to-end learning methods: models that map directly from raw problem inputs to predictions and that take only linear time to evaluate. These results leveraged a novel deep network architecture that captures the key structural invariance exhibited by SAT problems--satisfiability status is preserved under variable and clause reorderings.

If time permits, the talk will end with a teaser: a discussion of unpublished theoretical work that establishes a performance gap between (1) learning an objective function from data (e.g., travel times in a road network) and then optimizing (e.g., finding fastest routes); and (2) learning the same function end-to-end (i.e., back-propagating through the optimization problem).

Predicting Satisfiability at the Phase Transition. L. Xu, H. Hoos, K. Leyton-Brown. Conference of the Association for the Advancement of Artificial Intelligence (AAAI), (7 pages), 2012. PDF

Predicting Propositional Satisfiability via End-to-End Learning. C. Cameron, R. Chen, J. Hartford, K. Leyton-Brown. Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020. PDF DOI

Elias Khalil

Title: Learning to Schedule Heuristics in Branch and Bound

Abstract: Primal heuristics play a crucial role in exact solvers for Mixed Integer Programming (MIP). While solvers are guaranteed to find optimal solutions given sufficient time, real-world applications typically require finding good solutions early on in the search to enable fast decision-making. While much of MIP research focuses on designing effective heuristics, the question of how to manage multiple MIP heuristics in a solver has not received equal attention. Generally, solvers follow hard-coded rules derived from empirical testing on broad sets of instances. Since the performance of heuristics is instance-dependent, using these general rules for a particular problem might not yield the best performance.

We propose the first data-driven framework for scheduling heuristics in an exact MIP solver. By learning from data describing the performance of primal heuristics, we obtain a problem-specific schedule of heuristics that collectively find many solutions at minimal cost. We provide a formal description of the problem and propose an efficient algorithm for computing such a schedule. Compared to the default settings of a state-of-the-art academic MIP solver, we are able to reduce the average primal integral by up to 49% on two classes of challenging instances, the Generalized Independent Set Problem and the Fixed-Charge Multi-Commodity Network Flow Problem.

Joint work with Antonia Chmiela (Zuse Institute Berlin), Ambros Gleixner (Zuse Institute Berlin, HTW Berlin), Andrea Lodi (Polytechnique Montreal), and Sebastian Pokutta (Zuse Institute Berlin, TU Berlin).

Pre-print available at: https://arxiv.org/abs/2103.10294

Md. Solimul Chowdhury

Title: Extensions of CDCL Branching Heuristics by Exploration during Conflict Depression

Abstract: Boolean Satisfiability (SAT) is a fundamental problem in computer science, with profound implications in Computational Complexity, Logic, and Artificial Intelligence. Given a set of clauses over boolean variables, the SAT solving task is to determine variable assignments which satisfy that set of clauses or report unsatisfiability, in case no such assignments exist. In general, SAT solving is intractable. Despite the theoretical hardness, many large real-world problems from domains, such as hardware design verification, software debugging, planning, and encryption are routinely solved by using SAT solvers. At present, the most dominant SAT solvers are based on Conflict Driven Clause Learning (CDCL). The efficiency CDCL SAT solving depends crucially on finding conflicts at a fast rate. State-of- the-art CDCL branching heuristics such as VSIDS and LRB conform to this goal. In this talk, I will present our study (AAAI 2020) of the VSIDS and LRB branching heuristics and a novel exploration guided CDCL extension named expSAT, which is designed based on the insights obtained from our study.. Our study shows that both of these heuristics typically generate conflicts in short bursts, followed by what we call a conflict depression (CD) phase, in which the search fails to generate any conflicts for a number of consecutive decisions. This lack of conflict indicates that the variables which are currently ranked highest by the branching heuristic fail to generate conflicts. Based on this analysis of conflict depression, we propose expSAT, which randomly samples variable selection sequences amid a CD phase, in order to learn an updated heuristic from the generated conflicts. The goal is to escape from conflict depressions expeditiously. The branching heuristic deployed in expSAT combines these updates with the standard activity scores of VSIDS and LRB. An extensive empirical evaluation with five state-of-the-art CDCL SAT solvers demonstrates good-to-strong performance gains with the expSAT approach for benchmark instances from SAT Competitions 2017 and 2018, and impressive gains over a fixed set of hard bitcoin mining instances.

Guiding CDCL SAT Search via Random Exploration amid Conflict Depression.Md Solimul Chowdhury, Martin Müller, Jia-Huai You: AAAI 2020: 1428-1435

Mislav Balunović

Title: Learning to Solve SMT Formulas

Abstract: In this talk, we will present a new approach for learning to solve SMT formulas. We phrase the challenge of solving SMT formulas as a tree search problem where at each step a transformation is applied to the input formula until the formula is solved. Our approach works in two phases: first, given a dataset of unsolved formulas we learn a policy that for each formula selects a suitable transformation to apply at each step in order to solve the formula, and second, we synthesize a strategy in the form of a loop-free program with branches. This strategy is an interpretable representation of the policy decisions and is used to guide the SMT solver to decide formulas more efficiently, without requiring any modification to the solver itself and without needing to evaluate the learned policy at inference time. We show that our approach is effective in practice – it solves 17% more formulas over a range of benchmarks and achieves up to 100x runtime improvement over a state-of-the-art SMT solver.

Wednesday, Apr. 28, 2021
8:20 – 8:30 a.m.