Minimax

Minimax (sometimes Minmax, MM[1] or saddle point[2]) is a decision rule used in artificial intelligence, decision theory, game theory, statistics, and philosophy for minimizing the possible loss for a worst case (maximum loss) scenario.

Originally formulated for several-player zero-sum game theory, covering both the cases where players take alternate moves and those where they make simultaneous moves, it has also been extended to more complex games and to general decision-making in the presence of uncertainty.

Then, we determine which action player i can take in order to make sure that this smallest value is the highest possible.

Another way to understand the notation is by reading from right to left: When we write the initial set of outcomes

In two-player zero-sum games, the minimax solution is the same as the Nash equilibrium.

The following example of a zero-sum game, where A and B make simultaneous moves, illustrates maximin solutions.

Minimax is used in zero-sum games to denote minimizing the opponent's maximum payoff.

In a zero-sum game, this is identical to minimizing one's own maximum loss, and to maximizing one's own minimum gain.

"Maximin" is a term commonly used for non-zero-sum games to describe the strategy which maximizes one's own minimum payoff.

In non-zero-sum games, this is not generally the same as minimizing the opponent's maximum gain, nor the same as the Nash equilibrium strategy.

A simple version of the minimax algorithm, stated below, deals with games such as tic-tac-toe, where each player can win, lose, or draw.

The minimax algorithm helps find the best move, by working backwards from the end of the game.

This leads to combinatorial game theory as developed by John H. Conway.

Often this is generally only possible at the very end of complicated games such as chess or go, since it is not computationally feasible to look ahead as far as the completion of the game, except towards the end, and instead, positions are given finite values as estimates of the degree of belief that they will lead to a win for one player or another.

This can be extended if we can supply a heuristic evaluation function which gives values to non-final game states without considering all possible following complete sequences.

For example, the chess computer Deep Blue (the first one to beat a reigning world champion, Garry Kasparov at that time) looked ahead at least 12 plies, then applied a heuristic evaluation function.

It is therefore impractical to completely analyze games such as chess using the minimax algorithm.

The performance of the naïve minimax algorithm may be improved dramatically, without affecting the result, by the use of alpha–beta pruning.

Other heuristic pruning methods can also be used, but not all of them are guaranteed to give the same result as the unpruned search.

The heuristic value is a score measuring the favorability of the node for the maximizing player.

The heuristic value for terminal (game ending) leaf nodes are scores corresponding to win, loss, or draw, for the maximizing player.

The algorithm continues evaluating the maximum and minimum values of the child nodes alternately until it reaches the root node, where it chooses the move with the largest value (represented in the figure with a blue arrow).

One approach is to treat this as a game against nature (see move by nature), and using a similar mindset as Murphy's law or resistentialism, take an approach which minimizes the maximum expected loss, using the same techniques as in the two-person zero-sum games.

In addition, expectiminimax trees have been developed, for two-player games in which chance (for example, dice) is a factor.

is called minimax if it satisfies An alternative criterion in the decision theoretic framework is the Bayes estimator in the presence of a prior distribution

An estimator is Bayes if it minimizes the average risk A key feature of minimax decision making is being non-probabilistic: in contrast to decisions using expected value or expected utility, it makes no assumptions about the probabilities of various outcomes, just scenario analysis of what the possible outcomes are.

Various extensions of this non-probabilistic approach exist, notably minimax regret and Info-gap decision theory.

Compare to expected value analysis, whose conclusion is of the form: "This strategy yields ℰ(X) = n ."

To do so, "voting should not be viewed as a form of personal self-expression or moral judgement directed in retaliation towards major party candidates who fail to reflect our values, or of a corrupt system designed to limit choices to those acceptable to corporate elites," but rather as an opportunity to reduce harm or loss.

[8] Rawls defined this principle as the rule which states that social and economic inequalities should be arranged so that "they are to be of the greatest benefit to the least-advantaged members of society".