In computer science, there are classes of algorithms (heuristics) that efficiently solve particular types of the Boolean satisfiability problem (SAT), despite there being no known efficient algorithm for the general case. Although no algorithm is known that solves SAT in polynomial time, there are classes of SAT instances that do admit efficient algorithms.
The classes of problems amenable to SAT heuristics arise from many practical problems in AI planning, circuit testing, and software verification.[1][2] Research on constructing efficient SAT solvers has been based on various principles such as resolution, search, local search and random walk, binary decisions, and Stålmarck's algorithm.
As there exist polynomial-time algorithms, such as Tseitin's algorithm, to convert any Boolean expression to conjunctive normal form (CNF), posing SAT problems in CNF does not change their computational difficulty.
SAT problems are canonically expressed in CNF because CNF has certain properties that can help prune the search space and speed up the search process.
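As an illustration, CNF formulas are commonly stored in the DIMACS convention, where each clause is a list of nonzero integers; evaluation then reduces to simple clause-by-clause checks. The helper names below are our own, a minimal sketch rather than any particular solver's API:

```python
# A CNF formula in the DIMACS convention: each clause is a list of nonzero
# integers, where literal k means variable k and -k means its negation.

def clause_satisfied(clause, assignment):
    """True if any literal in the clause is satisfied.
    `assignment` maps variable index -> bool."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def formula_satisfied(cnf, assignment):
    """A CNF formula is satisfied iff every clause is."""
    return all(clause_satisfied(c, assignment) for c in cnf)

# (x1 or not x2) and (x2 or x3)
cnf = [[1, -2], [2, 3]]
assignment = {1: True, 2: True, 3: False}
print(formula_satisfied(cnf, assignment))  # True
```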
A backtracking solver relies on a branching heuristic to pick the next free variable assignment; branching effectively turns the choice of variable assignments into a decision tree.
Different implementations of this heuristic produce markedly different decision trees, and thus have significant effect on the efficiency of the solver.
The basic premise of early branching heuristics is to choose a free variable assignment that will satisfy the most currently unsatisfied clauses in the Boolean expression.
However, as Boolean expressions get larger, more complicated, or more structured, these heuristics fail to capture useful information about these problems that could improve efficiency; they often get stuck in local maxima or do not consider the distribution of variables.
Additionally, larger problems require more processing, as the operation of counting free variables in unsatisfied clauses comes to dominate the run time.
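A minimal sketch of such a counting-based heuristic (in the spirit of DLIS; the function name and data layout are our own) also shows why counting dominates the run time: every branching decision rescans all clauses:

```python
from collections import Counter

def pick_branch_literal(cnf, assignment):
    """Counting-style branching heuristic: among clauses not yet
    satisfied, count occurrences of each unassigned literal and branch
    on the most frequent one. Note the full scan over all clauses --
    this counting is what dominates run time on large formulas."""
    counts = Counter()
    for clause in cnf:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied; skip it
        for lit in clause:
            if abs(lit) not in assignment:  # literal still free
                counts[lit] += 1
    if not counts:
        return None  # everything satisfied or assigned
    return counts.most_common(1)[0][0]

cnf = [[1, -2], [2, 3], [-2, 3]]
print(pick_branch_literal(cnf, {}))  # -2 or 3: each occurs twice
```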
As VSIDS progresses and searches more parts of the Boolean expression, all scores are periodically divided by a constant.
This discounts the effect of the presence of variables in earlier-found clauses in favor of variables with a greater presence in more recent clauses.
VSIDS will select the variable phase with the highest score to determine where to branch.
Further, VSIDS biases each variable assignment toward satisfying the most recently searched segments of the Boolean expression.
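The scoring scheme above can be sketched as follows; the class name, decay constant, and decay interval are illustrative assumptions, not taken from any particular solver:

```python
# A minimal sketch of VSIDS-style scoring. Each literal keeps an
# activity score: literals in a newly learned conflict clause get
# bumped, and all scores periodically decay, so recent conflicts
# outweigh old ones.

class VSIDS:
    def __init__(self, decay=0.5, decay_interval=2):
        self.scores = {}              # literal -> activity score
        self.decay = decay
        self.decay_interval = decay_interval
        self.conflicts = 0

    def on_conflict(self, learned_clause):
        for lit in learned_clause:    # bump literals in the new clause
            self.scores[lit] = self.scores.get(lit, 0.0) + 1.0
        self.conflicts += 1
        if self.conflicts % self.decay_interval == 0:
            for lit in self.scores:   # periodic decay: divide all
                self.scores[lit] *= self.decay  # scores by a constant

    def pick(self, assignment):
        """Choose the unassigned literal (variable plus phase) with
        the highest activity score."""
        free = {l: s for l, s in self.scores.items()
                if abs(l) not in assignment}
        return max(free, key=free.get) if free else None

v = VSIDS()
v.on_conflict([1, -2])
v.on_conflict([-2, 3])
print(v.pick({}))  # -2, the literal with the highest activity
```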
For MAX-SAT, the version of SAT in which the number of satisfied clauses is maximized, solvers also use probabilistic algorithms. For example, assigning every variable uniformly at random satisfies each 3-literal clause (over distinct variables) with probability 7/8, so on a 3-SAT instance with m clauses the expected number of satisfied clauses is 7m/8. By a consequence of the PCP theorem, this algorithm cannot be further optimized: no polynomial-time algorithm achieves an approximation ratio better than 7/8 unless P = NP.
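The 7/8 expectation can be checked empirically with a small Monte Carlo sketch (the function name and the tiny instance are illustrative):

```python
import random

def expected_fraction_3sat(cnf, n_vars, trials=20000, seed=0):
    """Estimate the expected fraction of clauses satisfied by a
    uniformly random assignment. For 3-literal clauses over distinct
    variables the true value is 7/8: only 1 of the 2^3 assignments
    to a clause's variables falsifies it."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        total += sum(any(a[abs(l)] == (l > 0) for l in c) for c in cnf)
    return total / (trials * len(cnf))

cnf = [[1, 2, 3], [-1, 2, 4], [1, -3, -4]]
print(round(expected_fraction_3sat(cnf, 4), 2))  # close to 7/8 = 0.875
```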
Other stochastic SAT solvers, such as WalkSAT and GSAT, improve on the above procedure.
They start by randomly assigning values to each variable and then traverse the given Boolean expression to identify which variables to flip to minimize the number of unsatisfied clauses.
They may randomly select a variable to flip, or select a new random variable assignment outright, to escape local maxima, much like a simulated annealing algorithm.
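A simplified sketch of the WalkSAT idea (parameter names and defaults here are our own, not those of the original implementation):

```python
import random

def walksat(cnf, n_vars, max_flips=10000, noise=0.5, seed=0):
    """WalkSAT-style local search. Start from a random assignment;
    while some clause is unsatisfied, pick one such clause and either
    flip a random variable in it (with probability `noise`, to escape
    local maxima) or flip the variable whose flip leaves the fewest
    clauses unsatisfied."""
    rng = random.Random(seed)
    a = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def unsat(assign):
        return [c for c in cnf
                if not any(assign[abs(l)] == (l > 0) for l in c)]

    for _ in range(max_flips):
        broken = unsat(a)
        if not broken:
            return a                       # satisfying assignment found
        clause = rng.choice(broken)
        if rng.random() < noise:
            var = abs(rng.choice(clause))  # random-walk step
        else:
            # greedy step: flip the variable that minimizes the
            # number of unsatisfied clauses
            def cost(v):
                a[v] = not a[v]
                n = len(unsat(a))
                a[v] = not a[v]
                return n
            var = min((abs(l) for l in clause), key=cost)
        a[var] = not a[var]
    return None                            # gave up

cnf = [[1, -2], [2, 3], [-1, 3]]
print(walksat(cnf, 3) is not None)  # True: this formula is satisfiable
```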
In the weighted variant of Max-SAT, each clause in a CNF Boolean expression is given a weight. The objective is to maximize or minimize the total weight of the satisfied clauses.
Note that the stochastic solvers above can also be used to find good approximate solutions for Max-SAT.
Variable splitting is a tool to find upper and lower bounds on a Max-SAT problem. It splits occurrences of a variable into fresh copies, which relaxes the problem; the relaxed problem's solutions then represent lower and upper bounds on the minimization and maximization, respectively, of the weights of the satisfied clauses in the original problem.
In instances with hard clauses (which must be satisfied) and weighted soft clauses, the total maximum (or minimum) weight of the soft clauses can be evaluated by fixing a variable assignment that satisfies the hard clauses and then optimizing the free variables (the variables on which satisfaction of the hard clauses does not depend).
The latter step is an instance of Max-SAT with some variables pre-assigned.
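The bounding effect of variable splitting can be illustrated on a tiny weighted instance with a brute-force sketch (all names are ours; splitting drops the constraint that the two copies agree, so the relaxed maximum can only increase):

```python
from itertools import product

def max_sat_weight(clauses):
    """Brute-force weighted Max-SAT: maximize the total weight of the
    satisfied clauses. `clauses` is a list of (literal_tuple, weight)."""
    vars_ = sorted({abs(l) for c, _ in clauses for l in c})
    best = 0
    for bits in product([False, True], repeat=len(vars_)):
        a = dict(zip(vars_, bits))
        best = max(best, sum(w for c, w in clauses
                             if any(a[abs(l)] == (l > 0) for l in c)))
    return best

def split_variable(clauses, x, x_pos, x_neg):
    """Variable splitting: replace positive occurrences of x with a
    fresh variable x_pos and negative occurrences with x_neg. Dropping
    the link x_pos == x_neg relaxes the problem, so the relaxed maximum
    is an upper bound on the original maximum."""
    return [(tuple(x_pos if l == x else -x_neg if l == -x else l
                   for l in c), w)
            for c, w in clauses]

clauses = [((1, 2), 3), ((-1, 2), 2), ((1, -2), 4), ((-1, -2), 1)]
relaxed = split_variable(clauses, 1, 3, 4)  # split var 1 into 3 and 4
print(max_sat_weight(clauses), max_sat_weight(relaxed))  # 9 10
```

Here the relaxed optimum (10) satisfies every clause by setting the two copies of variable 1 inconsistently, bounding the true optimum (9) from above.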
As SAT solvers and practical SAT problems (e.g. circuit verification) get more advanced, the Boolean expressions of interest may involve millions of variables and several million clauses; therefore, efficient data structures to store and evaluate the clauses must be used.
Although pointer-based structures such as linked lists are a natural way to represent clauses, they rely on many pointers, which increases their memory overhead, decreases cache locality, and increases cache misses, rendering them impractical for problems with large clause counts and large clause sizes.
The elimination of pointers and the contiguous memory occupation of arrays serve to decrease memory usage and increase cache locality and cache hits, which offers a run-time speedup compared with pointer-based implementations.
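One possible array-based layout (a hypothetical sketch in the spirit described, not any particular solver's design): all literals stored back to back in one contiguous array, with a second array of clause start offsets:

```python
from array import array

class ClauseStore:
    """Flat clause storage: literals of all clauses live contiguously
    in `lits`; clause i spans lits[starts[i]:starts[i+1]]. No per-node
    pointers, and a clause's literals are adjacent in memory."""
    def __init__(self, clauses):
        self.lits = array('i')           # all literals, back to back
        self.starts = array('i', [0])    # clause start offsets
        for c in clauses:
            self.lits.extend(c)
            self.starts.append(len(self.lits))

    def clause(self, i):
        return self.lits[self.starts[i]:self.starts[i + 1]]

    def satisfied(self, i, assignment):
        return any(assignment[abs(l)] == (l > 0)
                   for l in self.clause(i))

store = ClauseStore([[1, -2], [2, 3, -4]])
print(list(store.clause(1)))  # [2, 3, -4]
```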