Stochastic game

The game is played in a sequence of stages.

The players select actions and each player receives a payoff that depends on the current state and the chosen actions.

The game then moves to a new random state whose distribution depends on the previous state and the actions chosen by the players.

The procedure is repeated at the new state and play continues for a finite or infinite number of stages.
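The stage dynamics described above can be sketched in a few lines of Python. The states, actions, payoffs, and transition probabilities below are illustrative assumptions invented for the example, not taken from any particular game:

```python
import random

# payoffs[state][(a1, a2)] -> (payoff to player 1, payoff to player 2)
payoffs = {
    "s0": {("L", "L"): (1, 1), ("L", "R"): (0, 2),
           ("R", "L"): (2, 0), ("R", "R"): (0, 0)},
    "s1": {("L", "L"): (0, 0), ("L", "R"): (3, 1),
           ("R", "L"): (1, 3), ("R", "R"): (2, 2)},
}

# transitions[state][(a1, a2)] -> probability distribution over next states
transitions = {
    "s0": {("L", "L"): {"s0": 0.5, "s1": 0.5}, ("L", "R"): {"s1": 1.0},
           ("R", "L"): {"s1": 1.0}, ("R", "R"): {"s0": 1.0}},
    "s1": {("L", "L"): {"s0": 1.0}, ("L", "R"): {"s0": 0.5, "s1": 0.5},
           ("R", "L"): {"s1": 1.0}, ("R", "R"): {"s0": 1.0}},
}

def play_stage(state, a1, a2, rng=random):
    """One stage: payoffs for the chosen actions, then a random next state."""
    pay = payoffs[state][(a1, a2)]
    dist = transitions[state][(a1, a2)]
    next_state = rng.choices(list(dist), weights=list(dist.values()))[0]
    return pay, next_state

pay, nxt = play_stage("s0", "L", "R")
print(pay, nxt)  # (0, 2) and the next state drawn from the transition law
```

Iterating `play_stage` from an initial state produces a play of the game; aggregating the stage payoffs (by a discounted sum or a long-run average) gives the notions of total payoff discussed below.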

Stochastic games generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic situations in which the environment changes in response to the players’ choices.[2]

Stochastic two-player games on directed graphs are widely used for modeling and analysis of discrete systems operating in an unknown (adversarial) environment.

Possible configurations of a system and its environment are represented as vertices, and the transitions correspond to actions of the system, its environment, or "nature".

A run of the system then corresponds to an infinite path in the graph.

Thus, a system and its environment can be seen as two players with antagonistic objectives, where one player (the system) aims at maximizing the probability of "good" runs, while the other player (the environment) aims at the opposite.
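The graph model above can be illustrated with a small value-iteration sketch for reachability: the system ("max") tries to reach a target vertex, the environment ("min") tries to avoid it, and "rand" vertices move uniformly at random. The three-vertex graph and its ownership assignment are invented for the example:

```python
# succ[u]: successors of vertex u; owner[u]: who moves at u
succ = {
    "a": ["b", "c"],          # max (the system) chooses a successor
    "b": ["target", "sink"],  # min (the environment) chooses a successor
    "c": ["target", "sink"],  # nature moves uniformly at random
}
owner = {"a": "max", "b": "min", "c": "rand"}

def reach_values(succ, owner, target="target", sink="sink", iters=100):
    """Iteratively approximate the probability of reaching the target."""
    v = {u: 0.0 for u in succ}
    v[target], v[sink] = 1.0, 0.0
    for _ in range(iters):
        for u in succ:
            vals = [v[w] for w in succ[u]]
            if owner[u] == "max":
                v[u] = max(vals)       # system maximizes
            elif owner[u] == "min":
                v[u] = min(vals)       # environment minimizes
            else:
                v[u] = sum(vals) / len(vals)  # uniform chance move
    return v

v = reach_values(succ, owner)
# min at "b" moves to the sink (value 0); chance at "c" averages to 0.5;
# so max at "a" prefers "c" and the value of "a" is 0.5
print(v["a"])
```

This fixed-point computation is the basic algorithmic tool for such reachability objectives; exact algorithms and their complexity are among the questions studied in this area.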

Basic concepts and algorithmic questions studied in this area, together with some long-standing open problems and selected recent results, are surveyed below.

The ingredients of a stochastic game are: a finite set of players; a state space (either a finite set or a measurable space); for each player, a set of actions (either a finite set or a measurable space); a transition probability from state-action profiles to states; and a payoff function assigning each player a stage payoff for every state and action profile.

The game starts at some initial state. At each stage, the players observe the current state and simultaneously choose actions; each player then receives a stage payoff, and a new state is drawn according to the transition probability.

The discounted value of a two-person zero-sum stochastic game is the value of the game in which the stage payoffs are aggregated by a discounted sum. Lloyd Shapley (1953) proved that the discounted value of a two-person zero-sum stochastic game with finitely many states and actions exists, and Truman Bewley and Elon Kohlberg (1976) proved that the discounted values converge to a limit as the discount factor goes to zero.
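Shapley's existence proof is constructive: the discounted value is the fixed point of an operator that, at each state, solves a zero-sum matrix game whose entries combine the stage payoff with the discounted continuation values. Below is a minimal sketch for two states and two actions per player; the payoffs and transitions are illustrative assumptions, and the 2x2 matrix-game value is computed in closed form:

```python
def matrix_game_value(A):
    """Value of a 2x2 zero-sum matrix game (row player maximizes)."""
    (a, b), (c, d) = A
    maximin = max(min(a, b), min(c, d))  # row player's guarantee
    minimax = min(max(a, c), max(b, d))  # column player's guarantee
    if maximin == minimax:               # pure saddle point
        return maximin
    return (a * d - b * c) / (a + d - b - c)  # unique mixed value

# g[s][i][j]: stage payoff to the maximizer in state s under actions (i, j)
g = [[[3, 0], [1, 2]],
     [[0, 1], [2, 0]]]
# p[s][i][j][t]: probability of moving from state s to state t
p = [[[[0.5, 0.5], [0.0, 1.0]], [[1.0, 0.0], [0.5, 0.5]]],
     [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, 0.0]]]]

def shapley_values(g, p, beta=0.9, iters=500):
    """Iterate v(s) <- val[ g(s,a) + beta * E v(next state) ] to a fixed point."""
    n = len(g)
    v = [0.0] * n
    for _ in range(iters):
        v = [matrix_game_value(
                [[g[s][i][j] + beta * sum(p[s][i][j][t] * v[t]
                                          for t in range(n))
                  for j in range(2)] for i in range(2)])
             for s in range(n)]
    return v

print(shapley_values(g, p))  # discounted values of the two states
```

The operator is a contraction with modulus equal to the discount factor, so the iteration converges geometrically; this is the essence of Shapley's existence argument.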

Some precautions are needed in defining the value of a two-person zero-sum game played over infinitely many stages without discounting. The uniform value of a two-person zero-sum stochastic game exists if, for every ε > 0, each player has a strategy that guarantees the value up to ε in every sufficiently long finite game and in every discounted game with a sufficiently small discount factor.

Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a uniform value.[3]

If there is a finite number of players and the action sets and the set of states are finite, then a stochastic game with a finite number of stages always has a Nash equilibrium.

The same is true for a game with infinitely many stages if the total payoff is the discounted sum.

Nicolas Vieille has shown that all two-person stochastic games with finite state and action spaces have a uniform equilibrium payoff.

In the undiscounted infinite game, two payoff notions are used for a given strategy profile: the liminf-average payoff, the expectation of the limit inferior of the averages of the stage payoffs with respect to the probability on plays defined by the strategy profile, and the limsup-average payoff, the expectation of the limit superior of the averages of the stage payoffs with respect to the same probability.
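The distinction matters because the running averages of the stage payoffs need not converge. A small numerical illustration (invented for exposition, not from the article): if the payoffs come in alternating blocks of 1s and 0s of doubling length, the running averages oscillate forever between roughly 1/3 and 2/3, so the limit inferior and limit superior differ:

```python
def block_payoffs(n_blocks):
    """Alternating blocks of 1s and 0s with lengths 1, 2, 4, 8, ..."""
    out = []
    for k in range(n_blocks):
        out += [1 - k % 2] * (2 ** k)
    return out

stream = block_payoffs(16)

# running averages of the stage payoffs
averages = []
total = 0
for t, payoff in enumerate(stream, start=1):
    total += payoff
    averages.append(total / t)

# after a block of 1s the average is near 2/3; after a block of 0s, near 1/3
tail = averages[1000:]
print(min(tail), max(tail))
```

For such a payoff stream the liminf-average and limsup-average evaluations give different values, which is why the two notions are defined separately.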

Jean-François Mertens and Abraham Neyman (1981) proved that every two-person zero-sum stochastic game with finitely many states and actions has a limiting-average value,[3] and Nicolas Vieille has shown that all two-person stochastic games with finite state and action spaces have a limiting-average equilibrium payoff.

Whether every stochastic game with finitely many players, states, and actions has a uniform equilibrium payoff, a limiting-average equilibrium payoff, or even a liminf-average equilibrium payoff is a challenging open question.

A Markov perfect equilibrium is a refinement of the concept of subgame perfect Nash equilibrium to stochastic games.

Stochastic games have also been combined with Bayesian games to model uncertainty over player strategies.[5] The resulting stochastic Bayesian game model is solved via a recursive combination of the Bayesian Nash equilibrium equation and the Bellman optimality equation.

Stochastic games have applications in economics, evolutionary biology and computer networks.[6][7] They are generalizations of repeated games, which correspond to the special case where there is only one state.

A modified game of rock paper scissors played with dice, shown generally in the first graph and more strictly in the second. Each player has a set of 6 dice, where 4 sides of each die show rock, paper or scissors, and 2 sides show one of the other options, so that the set of 6 dice contains every combination. "RP" means a die with 4 rock and 2 paper, and although both players may not choose this die, any two combinations are equivalent to a combination of some basis die and another die, so "RP" is used as a basis. Ties let the players choose a new die if they would like. The more expansive graph is needed because the probability of the players winning is influenced by which die they and their opponents choose. The probability of each outcome is therefore a function of the action profiles of the players during the "choose and roll" stage.
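The "choose and roll" stage can be made concrete by enumerating face pairs. The matchup below (an "RP" die against an "SR" die) is an illustrative choice; the function works for any pair of six-sided dice labelled with rock, paper, and scissors:

```python
from itertools import product
from fractions import Fraction

BEATS = {"R": "S", "P": "R", "S": "P"}  # rock beats scissors, etc.

def die(major, minor):
    """A die with four faces of `major` and two faces of `minor`."""
    return [major] * 4 + [minor] * 2

def matchup(d1, d2):
    """Exact win/tie/loss probabilities for the player rolling d1 against d2."""
    win = tie = 0
    for f1, f2 in product(d1, d2):  # all 36 equally likely face pairs
        if f1 == f2:
            tie += 1
        elif BEATS[f1] == f2:
            win += 1
    n = len(d1) * len(d2)
    return Fraction(win, n), Fraction(tie, n), Fraction(n - win - tie, n)

rp = die("R", "P")   # 4 rock, 2 paper
sr = die("S", "R")   # 4 scissors, 2 rock
print(matchup(rp, sr))  # (Fraction(5, 9), Fraction(2, 9), Fraction(2, 9))
```

Each die pair yields one such outcome distribution, which is exactly why the transition and payoff probabilities of the stochastic game depend on the action profile chosen in the "choose" stage.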