of nonnegative real numbers, each of whose rows and columns sums to 1, i.e., Thus, a doubly stochastic matrix is both left stochastic and right stochastic.
[1] Indeed, any matrix that is both left and right stochastic must be square: if every row sums to 1 then the sum of all entries in the matrix must be equal to the number of rows, and since the same holds for columns, the number of rows and columns must be equal.
Using the matrix entries as Cartesian coordinates, it lies in an
-dimensional Euclidean space defined by
independent linear constraints specifying that the row and column sums all equal 1.
Moreover, the entries are all constrained to be non-negative and less than or equal to 1.
The Birkhoff–von Neumann theorem (often known simply as Birkhoff's theorem[2][3][4]) states that the polytope
is the convex hull of the set of
permutation matrices, and furthermore that the vertices of
are precisely the permutation matrices.
is a doubly stochastic matrix, then there exist
This representation is known as the Birkhoff–von Neumann decomposition, and may not be unique.
It is often described as a real-valued generalization of Kőnig's theorem, where the correspondence is established through adjacency matrices of graphs.
Let X be a doubly stochastic matrix.
Then we will show that there exists a permutation matrix P such that xij ≠ 0 whenever pij ≠ 0.
Thus if we let λ be the smallest xij corresponding to a non-zero pij, the difference X – λP will be a scalar multiple of a doubly stochastic matrix and will have at least one more zero cell than X.
Accordingly we may successively reduce the number of non-zero cells in X by removing scalar multiples of permutation matrices until we arrive at the zero matrix, at which point we will have constructed a convex combination of permutation matrices equal to the original X.
Proof: Construct a bipartite graph in which the rows of X are listed in one part and the columns in the other, and in which row i is connected to column j iff xij ≠ 0.
Let A be any set of rows, and define A' as the set of columns joined to rows in A in the graph.
We want to express the sizes |A| and |A'| of the two sets in terms of the xij.
It follows that the conditions of Hall's marriage theorem are satisfied, and that we can therefore find a set of edges in the graph which join each row in X to exactly one (distinct) column.
These edges define a permutation matrix whose non-zero cells correspond to non-zero cells in X.
There is a simple generalisation to matrices with more columns and rows such that the i th row sum is equal to ri (a positive integer), the column sums are equal to 1, and all cells are non-negative (the sum of the row sums being equal to the number of columns).
Any matrix in this form can be expressed as a convex combination of matrices in the same form made up of 0s and 1s.
The proof is to replace the i th row of the original matrix by ri separate rows, each equal to the original row divided by ri ; to apply Birkhoff's theorem to the resulting square matrix; and at the end to additively recombine the ri rows into a single i th row.
In the same way it is possible to replicate columns as well as rows, but the result of recombination is not necessarily limited to 0s and 1s.
A different generalisation (with a significantly harder proof) has been put forward by R. M. Caron et al.[3]