Alpha–beta pruning

Alpha–beta pruning is a search algorithm that seeks to decrease the number of nodes evaluated by the minimax algorithm in its search tree. It is an adversarial search algorithm used commonly for machine playing of two-player combinatorial games (Tic-tac-toe, Chess, Connect 4, etc.).[1]

During the Dartmouth Workshop, John McCarthy met Alex Bernstein of IBM, who was writing a chess program; McCarthy recommended this kind of pruning to him, but Bernstein was unconvinced.[2]

Allen Newell and Herbert A. Simon, who used what John McCarthy calls an "approximation"[3] in 1958, wrote that alpha–beta "appears to have been reinvented a number of times".

Richards, Timothy Hart, Michael Levin and/or Daniel Edwards also invented alpha–beta independently in the United States.[5]

McCarthy proposed similar ideas during the Dartmouth workshop in 1956 and suggested them to a group of his students, including Alan Kotok, at MIT in 1961.[6]

Alexander Brudno independently conceived the alpha–beta algorithm, publishing his results in 1963.[8][9]

Judea Pearl proved its optimality in terms of the expected running time for trees with randomly assigned leaf values in two papers.[10][11]

The optimality of the randomized version of alpha–beta was shown by Michael Saks and Avi Wigderson in 1986.[13]

The algorithm maintains two values, alpha and beta, which respectively represent the minimum score that the maximizing player is assured of and the maximum score that the minimizing player is assured of. Initially, alpha is negative infinity and beta is positive infinity, i.e. both players start with their worst possible score.

To illustrate, suppose a player has already found one good move "A", and then begins to examine an alternative move "B", only to find that it allows the opponent to force checkmate in two moves. Thus, other outcomes from playing move "B" no longer need to be considered, since the opponent can force a win. The maximum score that the opponent could force after move "B" is negative infinity: a loss for the player, which is worse than the outcome move "A" already guarantees.
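
The core procedure can be sketched in a few lines of Python. This is a minimal illustration, not canonical pseudocode: game positions are modeled here as nested lists whose leaves are numeric scores, an assumption made purely for the example.

    import math

    def alphabeta(node, alpha, beta, maximizing):
        # A node is either a number (a leaf score) or a list of child nodes.
        if not isinstance(node, list):
            return node
        if maximizing:
            value = -math.inf
            for child in node:
                value = max(value, alphabeta(child, alpha, beta, False))
                alpha = max(alpha, value)   # raise the maximizer's guarantee
                if beta <= alpha:           # beta cutoff: the minimizer already
                    break                   # has a better option elsewhere
            return value
        else:
            value = math.inf
            for child in node:
                value = min(value, alphabeta(child, alpha, beta, True))
                beta = min(beta, value)     # lower the minimizer's guarantee
                if beta <= alpha:           # alpha cutoff
                    break
            return value

    # Both players start from their worst possible scores.
    tree = [[3, 5], [6, 9], [1, 2]]
    print(alphabeta(tree, -math.inf, math.inf, True))   # prints 6

In the third subtree, the first leaf already limits the minimizer to a score of 1, below the 6 the maximizer is assured of, so the remaining leaf (the 2) is pruned without being evaluated.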

The benefit of alpha–beta pruning lies in the fact that branches of the search tree can be eliminated: search time can then be devoted to the more promising subtrees, and a deeper search can be performed in the same time.

With an (average or constant) branching factor of b, and a search depth of d plies, the maximum number of leaf node positions evaluated (when the move ordering is pessimal) is O(b^d) – the same as a simple minimax search. If the move ordering is optimal, meaning the best moves are always searched first, the number of leaf positions evaluated drops to about O(b^(d/2)): the effective branching factor is reduced to its square root, or equivalently, the search can go twice as deep with the same amount of computation.

When the leaf values are chosen independently of each other and drawn from an interval uniformly at random, the expected number of nodes evaluated increases (while remaining well below the pessimal O(b^d)).

For example, in a tree with a branching factor of 35 (roughly that of chess) searched to a depth of 4 plies, plain minimax evaluates 35^4, or about 1.5 million, terminal nodes. An optimal alpha–beta prune would eliminate all but about 2,000 of these terminal nodes, a reduction of 99.8%.

Since the number of positions searched decreases exponentially with each move nearer the current position, it is worth spending considerable effort on sorting the early moves.
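
The effect of ordering can be made concrete with a small experiment on the nested-list trees used above. The sketch below counts leaf evaluations with and without "perfect" ordering; the perfect ordering is deliberately faked by consulting a full minimax search, which a real program cannot afford to do, so this is a measurement aid rather than a practical technique.

    import math
    import random

    def minimax(node, maximizing):
        # Exact value, used only to fake perfect move ordering below.
        if not isinstance(node, list):
            return node
        f = max if maximizing else min
        return f(minimax(child, not maximizing) for child in node)

    def count_leaves(node, alpha, beta, maximizing, counter, ordered):
        # Alpha-beta that counts leaf evaluations, optionally pre-sorting moves.
        if not isinstance(node, list):
            counter[0] += 1
            return node
        children = list(node)
        if ordered:
            # Best child first, judged by its exact minimax value (cheating).
            children.sort(key=lambda c: minimax(c, not maximizing),
                          reverse=maximizing)
        value = -math.inf if maximizing else math.inf
        for child in children:
            score = count_leaves(child, alpha, beta, not maximizing,
                                 counter, ordered)
            if maximizing:
                value = max(value, score)
                alpha = max(alpha, value)
            else:
                value = min(value, score)
                beta = min(beta, value)
            if beta <= alpha:
                break
        return value

    def random_tree(b, d):
        return random.random() if d == 0 else [random_tree(b, d - 1)
                                               for _ in range(b)]

    random.seed(0)
    tree = random_tree(5, 6)                  # 5**6 = 15,625 leaves in total
    for ordered in (False, True):
        n = [0]
        count_leaves(tree, -math.inf, math.inf, True, n, ordered)
        print("perfectly ordered:" if ordered else "unordered:", n[0])
    # With perfect ordering the count is close to the theoretical minimum
    # 5**3 + 5**3 - 1 = 249 of the 15,625 leaves.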

In practice, the move ordering is often determined by the results of earlier, smaller searches, such as through iterative deepening.

Fail-soft alpha–beta may return values (v) that exceed the α and β bounds set by its function call arguments (v < α or v > β). In comparison, fail-hard alpha–beta limits its function return value to the inclusive range of α and β.

The main difference between fail-soft and fail-hard implementations is whether the best value found so far is updated before or after the cutoff check. If it is updated before the check, it can exceed the initial bounds, and the algorithm is fail-soft.
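
The contrast can be sketched as follows, again over the nested-list trees from the example above; these are illustrative implementations, not canonical ones.

    import math

    def fail_soft(node, alpha, beta, maximizing=True):
        # Tracks the best score found and returns it even when it lies
        # outside the (alpha, beta) window passed by the caller.
        if not isinstance(node, list):
            return node
        best = -math.inf if maximizing else math.inf
        for child in node:
            score = fail_soft(child, alpha, beta, not maximizing)
            if maximizing:
                best = max(best, score)
                alpha = max(alpha, best)
            else:
                best = min(best, score)
                beta = min(beta, best)
            if beta <= alpha:   # cutoff happens after `best` was recorded,
                break           # so the return value can leave the window
        return best

    def fail_hard(node, alpha, beta, maximizing=True):
        # Never returns a value outside the inclusive range [alpha, beta].
        if not isinstance(node, list):
            return min(beta, max(alpha, node))   # clamp even leaf scores
        for child in node:
            score = fail_hard(child, alpha, beta, not maximizing)
            if maximizing:
                alpha = max(alpha, score)
            else:
                beta = min(beta, score)
            if beta <= alpha:
                break
        return alpha if maximizing else beta     # return a bound, clamped

Called with the full window (-infinity, +infinity), the two variants return the same value; they differ only when the caller supplies a narrower window.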

Further improvement can be achieved without sacrificing accuracy by using ordering heuristics to search earlier parts of the tree that are likely to force alpha–beta cutoffs.
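
For example, in chess, moves that capture pieces are more likely to swing the score far enough to produce a cutoff than quiet moves, so a program may examine them first. A one-line sketch, where is_capture is a hypothetical predicate supplied by the game's move generator:

    def cutoff_friendly_order(moves, is_capture):
        # Try captures first: a drastic change in material is the most
        # likely way to exceed beta and cut off the rest of the move list.
        return sorted(moves, key=is_capture, reverse=True)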

Searching with a minimal (zero-width) alpha–beta window around a guessed score is particularly useful for win/loss searches near the end of a game, where the extra depth gained from the narrow window and a simple win/loss evaluation function may lead to a conclusive result.
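
Such a zero-window (null-window) probe can be built directly on the fail-soft routine above. Assuming integer scores, the window (guess - 1, guess) contains no integer, so the search can only answer whether the true value reaches the guess, which is exactly what a win/loss search needs; the scoring convention below is an assumption for the example.

    def at_least(node, guess):
        # Zero-window probe: cheaper than a full-window search, but answers
        # only "is the exact minimax value >= guess?" (integer scores assumed).
        return fail_soft(node, guess - 1, guess, True) >= guess

    # With the assumed convention win = 1, draw = 0, loss = -1, a win/loss
    # search reduces to the single probe at_least(root, 1).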

Over time, other improvements have been suggested, and indeed the Falphabeta (fail-soft alpha–beta) idea of John Fishburn is nearly universal and is already incorporated above in a slightly modified form.

Another advantage of using iterative deepening is that searches at shallower depths give move-ordering hints, as well as shallow alpha and beta estimates, that both can help produce cutoffs for higher depth searches much earlier than would otherwise be possible.
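
A sketch of such a driver over the nested-list trees from above. Since a depth limit can now cut the search off inside the tree, a stand-in heuristic (here, simply the mean of the leaves below, purely for illustration) plays the role of the static evaluation function.

    import math
    import random

    def heuristic(node):
        # Stand-in static evaluation for positions at the depth limit:
        # the mean of the subtree's leaves (illustrative only).
        if not isinstance(node, list):
            return node
        values = [heuristic(child) for child in node]
        return sum(values) / len(values)

    def limited(node, depth, alpha, beta, maximizing):
        # Depth-limited alpha-beta over nested-list trees.
        if not isinstance(node, list):
            return node
        if depth == 0:
            return heuristic(node)
        value = -math.inf if maximizing else math.inf
        for child in node:
            score = limited(child, depth - 1, alpha, beta, not maximizing)
            if maximizing:
                value = max(value, score)
                alpha = max(alpha, value)
            else:
                value = min(value, score)
                beta = min(beta, value)
            if beta <= alpha:
                break
        return value

    def iterative_deepening(root, max_depth):
        # Each completed iteration re-sorts the root moves by their scores,
        # so the next, deeper iteration tries the most promising move first
        # and tends to obtain its cutoffs earlier.
        order = list(range(len(root)))
        for depth in range(1, max_depth + 1):
            alpha, scored = -math.inf, []
            for i in order:
                # Later moves are searched with a narrowed window, so their
                # scores may only be bounds; that is fine for ordering.
                score = limited(root[i], depth - 1, alpha, math.inf, False)
                scored.append((score, i))
                alpha = max(alpha, score)
            scored.sort(reverse=True)
            order = [i for _, i in scored]
        return order[0]                       # index of the best root move

    random.seed(1)
    tree = [[[random.randint(-9, 9) for _ in range(3)]
             for _ in range(3)] for _ in range(3)]
    print(iterative_deepening(tree, 3))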

Algorithms such as SSS*, by contrast, use a best-first strategy. This can potentially make them more time-efficient, but typically at a heavy cost in space-efficiency.

An illustration of alpha–beta pruning. The grayed-out subtrees don't need to be explored (when moves are evaluated from left to right), since it is known that the group of subtrees as a whole yields the value of an equivalent subtree or worse, and as such cannot influence the final result. The max and min levels represent the turn of the player and the adversary, respectively.
An animated pedagogical example that attempts to be human-friendly by substituting initial infinite (or arbitrarily large) values for emptiness and by avoiding using the negamax coding simplifications.