Maximum-entropy random graph model

Maximum-entropy random graph models are random graph models used to study complex networks subject to the principle of maximum entropy under a set of structural constraints,[1] which may be global, distributional, or local.

Any random graph model (at a fixed set of parameter values) results in a probability distribution on graphs, and those that are maximum entropy within the considered class of distributions have the special property of being maximally unbiased null models for network inference[2] (e.g. biological network inference).

Each model defines a family of probability distributions on the set of graphs of size $n$, parameterized by a collection of constraints on $J$ observables $\{Q_j(G)\}_{j=1}^{J}$ defined for each graph $G$ (such as fixed expected average degree, degree distribution of a particular form, or specific degree sequence), enforced in the graph distribution alongside entropy maximization by the method of Lagrange multipliers.

Note that in this context "maximum entropy" refers not to the entropy of a single graph, but rather to the entropy of the whole probabilistic ensemble of random graphs.

Several commonly studied random network models are in fact maximum entropy, for example the ER graphs $G(n,m)$ and $G(n,p)$ (which each have one global constraint on the number of edges), as well as the configuration model (CM)[3] and soft configuration model (SCM) (which each have $n$ local constraints, one for each nodewise degree-value).

In the two pairs of models mentioned above, an important distinction[4][5] lies in whether the constraint is sharp (i.e. satisfied by every element of the set of size-$n$ graphs with nonzero probability in the ensemble), or soft (i.e. satisfied on average across the whole ensemble).

The former (sharp) case corresponds to a microcanonical ensemble,[6] the condition of maximum entropy yielding all graphs $G$ that satisfy every constraint exactly as equiprobable; the latter (soft) case is canonical,[7] producing an exponential random graph model (ERGM).
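The sharp/soft distinction can be illustrated with the two Erdős–Rényi variants, written here as $G(n,m)$ (exactly $m$ edges) and $G(n,p)$ (each edge present independently with probability $p$). The following is a minimal sketch, not a reference implementation; the function names `sample_gnm` and `sample_gnp`, the sizes, and the seed are assumptions chosen for illustration.

```python
import random
from itertools import combinations


def sample_gnm(n, m, rng):
    """Sharp constraint: choose exactly m of the possible edges uniformly at random."""
    possible_edges = list(combinations(range(n), 2))
    return set(rng.sample(possible_edges, m))


def sample_gnp(n, p, rng):
    """Soft constraint: include each possible edge independently with probability p."""
    return {e for e in combinations(range(n), 2) if rng.random() < p}


n, m = 20, 38
p = m / (n * (n - 1) // 2)  # chosen so that the expected edge count equals m
rng = random.Random(0)      # fixed seed, only for reproducibility of the sketch

sharp_counts = [len(sample_gnm(n, m, rng)) for _ in range(2000)]
soft_counts = [len(sample_gnp(n, p, rng)) for _ in range(2000)]

print(set(sharp_counts))                    # only the value m appears
print(sum(soft_counts) / len(soft_counts))  # close to m; individual samples vary
```

Every sharp sample has exactly `m` edges, whereas the soft samples match `m` only on average across the ensemble.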

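Suppose we are building a random graph model consisting of a probability distribution $P(G)$ on the set $\mathcal{G}_n$ of simple graphs with $n$ vertices.

The Gibbs entropy $S[G]$ of this ensemble will be given by

$$S[G] = -\sum_{G \in \mathcal{G}_n} P(G) \log P(G).$$

We would like the ensemble-averaged values $\{\langle Q_j \rangle\}_{j=1}^{J}$ of observables $\{Q_j(G)\}_{j=1}^{J}$ to be tunable, so we impose $J$ "soft" constraints on the graph distribution:

$$\langle Q_j \rangle = \sum_{G \in \mathcal{G}_n} P(G)\, Q_j(G) = q_j,$$

where $j = 1, \ldots, J$ labels the constraints.

Application of the method of Lagrange multipliers to determine the distribution $P(G)$ that maximizes $S[G]$ while satisfying $\langle Q_j \rangle = q_j$ and the normalization condition $\sum_{G \in \mathcal{G}_n} P(G) = 1$ results in

$$P(G) = \frac{1}{Z} \exp\left[-\sum_{j=1}^{J} \psi_j Q_j(G)\right],$$

where $Z$ is a normalizing constant (the partition function) and $\{\psi_j\}_{j=1}^{J}$ are parameters (Lagrange multipliers) coupled to the correspondingly indexed graph observables, which may be tuned to yield graph samples with desired values of those properties, on average. The result is an exponential family and a canonical ensemble, specifically an ERGM.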
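As a concrete instance of a canonical ensemble, consider a single observable, the number of edges $m(G)$, coupled to one multiplier $\psi$, so that $P(G) \propto \exp[-\psi\, m(G)]$. The sketch below enumerates all simple graphs on a small labeled vertex set by brute force, normalizes the weights, and compares the result with the closed form in which each pair of vertices is connected independently with probability $1/(1+e^{\psi})$. The function name `canonical_edge_ensemble` and the values of `n` and `psi` are assumptions made for illustration; exhaustive enumeration is only feasible for very small $n$.

```python
import math
from itertools import combinations, product


def canonical_edge_ensemble(n, psi):
    """Enumerate all simple graphs on n labeled vertices and return a list of
    (edge_count, probability) pairs under P(G) = exp(-psi * m(G)) / Z."""
    n_pairs = len(list(combinations(range(n), 2)))
    weights = []
    for indicator in product((0, 1), repeat=n_pairs):  # one 0/1 flag per vertex pair
        m_edges = sum(indicator)
        weights.append((m_edges, math.exp(-psi * m_edges)))
    Z = sum(w for _, w in weights)  # partition function
    return [(m_edges, w / Z) for m_edges, w in weights]


n, psi = 4, 0.7
ensemble = canonical_edge_ensemble(n, psi)

total_probability = sum(p for _, p in ensemble)
mean_edges = sum(m_edges * p for m_edges, p in ensemble)
entropy = -sum(p * math.log(p) for _, p in ensemble)

# Closed form: each vertex pair is an edge independently with probability p_edge.
n_pairs = n * (n - 1) // 2
p_edge = 1.0 / (1.0 + math.exp(psi))
binary_entropy = -(p_edge * math.log(p_edge) + (1 - p_edge) * math.log(1 - p_edge))

print(total_probability)                   # normalization check
print(mean_edges, n_pairs * p_edge)        # ensemble average vs. closed form
print(entropy, n_pairs * binary_entropy)   # ensemble entropy vs. closed form
```

Tuning `psi` shifts the expected number of edges, which is the sense in which a multiplier is "coupled" to its observable.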

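In the canonical framework above, constraints were imposed on ensemble-averaged quantities $\langle Q_j \rangle$.

Although these properties will on average take on values specifiable by appropriate setting of the Lagrange multipliers $\{\psi_j\}$, each specific graph $G$ drawn from the ensemble may have $Q_j(G) \neq q_j$, which may be undesirable.

Instead, we may impose a much stricter condition: every graph with nonzero probability must satisfy $Q_j(G) = q_j$ exactly.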

Under these "sharp" constraints, the maximum-entropy distribution is determined.

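We exemplify this with the Erdős–Rényi model $G(n,m)$. The sharp constraint in $G(n,m)$ is that of a fixed number of edges $m$, that is, $|E(G)| = m$ for all graphs $G$ drawn from the ensemble (instantiated with a probability denoted $P_{n,m}(G)$). This restricts the sample space from $\mathcal{G}_n$ (all graphs on $n$ vertices) to the subset $\mathcal{G}_{n,m} = \{ g \in \mathcal{G}_n : |E(g)| = m \} \subset \mathcal{G}_n$.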

This is in direct analogy to the microcanonical ensemble in classical statistical mechanics, wherein the system is restricted to a thin manifold in the phase space of all states of a particular energy value.

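Upon restricting our sample space to the set $\mathcal{G}_{n,m}$ of graphs on $n$ vertices with exactly $m$ edges, we have no external constraints (besides normalization) to satisfy, and thus we select $P_{n,m}(G)$ to maximize the entropy without making use of Lagrange multipliers.

It is well known that the entropy-maximizing distribution in the absence of external constraints is the uniform distribution over the sample space (see maximum entropy probability distribution), from which we obtain:

$$P_{n,m}(G) = \frac{1}{|\mathcal{G}_{n,m}|} = \binom{\binom{n}{2}}{m}^{-1},$$

where the binomial coefficient $\binom{\binom{n}{2}}{m}$ is the number of ways to place $m$ edges among the $\binom{n}{2}$ possible edges, and thus is the cardinality of $\mathcal{G}_{n,m}$.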
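The counting argument can be checked numerically for small sizes. The sketch below lists every graph on $n$ labeled vertices with exactly $m$ edges, compares the count with the nested binomial coefficient, and evaluates the entropy of the uniform distribution, which is the logarithm of the number of graphs. The helper names `gnm_probability` and `enumerate_gnm` and the values of `n` and `m` are assumptions for illustration only.

```python
import math
from itertools import combinations


def gnm_probability(n, m):
    """Probability of any single graph with n vertices and m edges
    under the uniform distribution: 1 / C(C(n, 2), m)."""
    return 1 / math.comb(math.comb(n, 2), m)


def enumerate_gnm(n, m):
    """List every graph on n labeled vertices with exactly m edges, as tuples of edges."""
    possible_edges = list(combinations(range(n), 2))
    return list(combinations(possible_edges, m))


n, m = 5, 3
graphs = enumerate_gnm(n, m)
prob = gnm_probability(n, m)

# Entropy of the uniform distribution over the enumerated graphs.
entropy = -sum(prob * math.log(prob) for _ in graphs)

print(len(graphs))                        # 120, i.e. C(10, 3)
print(prob * len(graphs))                 # the probabilities sum to 1 (up to rounding)
print(entropy, math.log(len(graphs)))     # both equal log(120) up to rounding
```

Sampling from this ensemble amounts to drawing $m$ distinct vertex pairs uniformly at random without replacement.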

A variety of maximum-entropy ensembles have been studied on generalizations of simple graphs.

These include, for example, ensembles of simplicial complexes[9] and weighted random graphs with a given expected degree sequence.[10]