Fisher information metric

In information geometry, the Fisher information metric[1] is a particular Riemannian metric which can be defined on a smooth statistical manifold, i.e., a smooth manifold whose points are probability distributions.

Given coordinates θ = (θ^1, …, θ^n) on the statistical manifold, one writes p(x, θ) for the probability distribution as a function of θ; here x is drawn from the value space R for a (discrete or continuous) random variable X.

The Fisher information metric then takes the form:

g_{jk}(\theta) = \int_R \frac{\partial \log p(x,\theta)}{\partial \theta^j} \, \frac{\partial \log p(x,\theta)}{\partial \theta^k} \, p(x,\theta) \, dx.

The integral is performed over all values x in R. The variable θ is now a coordinate on a Riemannian manifold. The labels j and k index the local coordinate axes on the manifold.
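As a sanity check on the definition, the integral can be evaluated numerically. The following sketch (Python with NumPy; the function name and tolerances are illustrative) computes g_{jk} by quadrature for the univariate normal family in the coordinates (μ, σ), where the metric is known in closed form to be diag(1/σ², 2/σ²):

```python
# Sketch: evaluate the Fisher metric integral by quadrature for the
# univariate normal family p(x; mu, sigma).  The closed form is
# g = diag(1/sigma^2, 2/sigma^2); names and tolerances are illustrative.
import numpy as np

def fisher_metric_normal(mu, sigma, n=200001, half_width=12.0):
    """g_jk = integral of (d_j log p)(d_k log p) p(x) dx, by Riemann sum."""
    x = np.linspace(mu - half_width * sigma, mu + half_width * sigma, n)
    dx = x[1] - x[0]
    p = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    s_mu = (x - mu) / sigma**2                       # d log p / d mu
    s_sigma = ((x - mu) ** 2 - sigma**2) / sigma**3  # d log p / d sigma
    scores = np.stack([s_mu, s_sigma])
    return np.einsum('jx,kx,x->jk', scores, scores, p) * dx

g = fisher_metric_normal(mu=1.0, sigma=0.5)
assert np.allclose(g, np.diag([4.0, 8.0]), atol=1e-6)  # 1/sigma^2, 2/sigma^2
```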

The Fisher information metric is particularly simple for the exponential family, which has

p(x \mid \theta) = \exp\big(\theta^{\mathsf T} T(x) - A(\theta) + B(x)\big).

In the natural parameters θ, the metric is the Hessian of the log-partition function:

g_{jk}(\theta) = \frac{\partial^2 A(\theta)}{\partial \theta^j \, \partial \theta^k}.
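Assuming the natural parametrization just described, the Hessian identity can be checked numerically for the Bernoulli family, whose log-partition function in the logit parameter is A(θ) = log(1 + e^θ); the sketch below is illustrative:

```python
# Sketch: for the Bernoulli family in its natural (logit) parameter,
# the Fisher information should equal A''(theta), the second derivative
# of the log-partition function A(theta) = log(1 + e^theta).
import math

theta = 0.7
p = 1.0 / (1.0 + math.exp(-theta))   # mean parameter, p = sigmoid(theta)

# Fisher information E[(d log p / d theta)^2]; the score is x - p.
fisher = sum(px * (x - p) ** 2 for x, px in [(0, 1 - p), (1, p)])

def A(t):
    return math.log(1.0 + math.exp(t))

h = 1e-4                              # central second difference of A
A_dd = (A(theta + h) - 2 * A(theta) + A(theta - h)) / h**2

assert abs(fisher - p * (1 - p)) < 1e-12   # closed form: p(1 - p)
assert abs(A_dd - fisher) < 1e-6
```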

The shortest paths (geodesics) between two univariate normal distributions are either parallel to the σ axis, or are half-ellipses centered on the μ axis: in the coordinates (μ, σ) the Fisher metric is ds² = (dμ² + 2 dσ²)/σ², so the rescaling μ ↦ μ/√2 turns it into (twice) the Poincaré half-plane metric, whose geodesics are vertical lines and semicircles meeting the boundary orthogonally.

Alternatively, the metric can be obtained as the second derivative of the relative entropy or Kullback–Leibler divergence:

g_{jk}(\theta_0) = \left. \frac{\partial^2}{\partial \theta^j \, \partial \theta^k} D_{\mathrm{KL}}\big( p(\cdot\,; \theta_0) \,\|\, p(\cdot\,; \theta) \big) \right|_{\theta = \theta_0}.

This can be thought of intuitively as: "The distance between two infinitesimally close points on a statistical differential manifold is the informational difference between them."
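This second-derivative characterization can be verified numerically. The sketch below (illustrative, for the Bernoulli family parametrized by its mean, whose Fisher information is 1/(p(1 − p))) compares a finite-difference Hessian of the divergence with the closed form:

```python
# Sketch: the Hessian of the KL divergence at coinciding arguments
# reproduces the Fisher information, shown for the Bernoulli family
# parametrized by its mean p (closed-form information 1/(p(1-p))).
import math

def kl_bernoulli(p0, p):
    return p0 * math.log(p0 / p) + (1 - p0) * math.log((1 - p0) / (1 - p))

p0, h = 0.3, 1e-4
# Central second difference of D_KL(p0 || p) in p, evaluated at p = p0.
d2 = (kl_bernoulli(p0, p0 + h) - 2 * kl_bernoulli(p0, p0)
      + kl_bernoulli(p0, p0 - h)) / h**2

fisher = 1.0 / (p0 * (1 - p0))
assert abs(d2 - fisher) < 1e-5
```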

This observation has resulted in practical applications in chemical and processing industry[citation needed]: in order to minimize the change in free entropy of a system, one should follow the minimum geodesic path between the desired endpoints of the process.

The geodesic minimizes the change in free entropy; this follows from the Cauchy–Schwarz inequality, which shows that the action of a curve is bounded below by the square of its length.
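The inequality in question is the standard bound E ≥ L² relating the action E = ∫₀¹ g_{jk} θ̇^j θ̇^k dt of a curve on [0, 1] to its length L. The sketch below illustrates it for a plane curve with the Euclidean metric (a simplifying assumption made here for brevity, not the Fisher metric itself):

```python
# Sketch: the Cauchy-Schwarz bound E >= L^2, where E is the action and
# L the length of a curve parametrized on [0, 1].  For brevity this uses
# a plane curve with the Euclidean metric; the bound is metric-independent.
import numpy as np

t = np.linspace(0.0, 1.0, 100001)
x = t                                  # a wiggly (non-geodesic) curve
y = 0.1 * np.sin(2 * np.pi * t)
dx, dy = np.gradient(x, t), np.gradient(y, t)
speed2 = dx**2 + dy**2

dt = np.diff(t)                        # trapezoid rule for both integrals
action = np.sum(0.5 * (speed2[1:] + speed2[:-1]) * dt)
speed = np.sqrt(speed2)
length = np.sum(0.5 * (speed[1:] + speed[:-1]) * dt)

assert action >= length**2             # equality only at constant speed
```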

The Fisher metric also allows the action and the curve length to be related to the Jensen–Shannon divergence.[7] Specifically, one has

(b - a) \int_a^b \frac{\partial \theta^j}{\partial t} \, \frac{\partial \theta^k}{\partial t} \, g_{jk} \, dt = 8 (b - a) \int_a^b dJSD

where the integrand dJSD is understood to be the infinitesimal change in the Jensen–Shannon divergence along the path taken. Similarly, for the curve length, one has

\int_a^b \sqrt{ \frac{\partial \theta^j}{\partial t} \, \frac{\partial \theta^k}{\partial t} \, g_{jk} } \; dt = \sqrt{8} \int_a^b \sqrt{dJSD}.
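To leading order, then, the Fisher line element is eight times the infinitesimal Jensen–Shannon divergence. A numerical sketch for a small perturbation of a three-outcome distribution (the particular values are illustrative):

```python
# Sketch: for a small perturbation of a discrete distribution, the
# Fisher line element ds^2 = sum(dp_i^2 / p_i) is eight times the
# Jensen-Shannon divergence, to leading order.  Values are illustrative.
import numpy as np

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def jsd(p, q):
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

p = np.array([0.2, 0.3, 0.5])
v = np.array([1.0, -2.0, 1.0])         # perturbation direction, sums to 0
eps = 1e-5
q = p + eps * v

ds2 = np.sum((eps * v) ** 2 / p)       # Fisher line element
assert abs(8 * jsd(p, q) - ds2) / ds2 < 1e-3
```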

For a discrete probability space, that is, a probability space on a finite set of objects, the Fisher metric can be understood to simply be the Euclidean metric restricted to a positive orthant (e.g. a "quadrant" in R², for the case of two outcomes) of a unit sphere, after appropriate changes of variable.[8]

Consider a flat, Euclidean space, of dimension N + 1, parametrized by points y = (y_0, \ldots, y_N).

Take \partial/\partial y_i as the basis vectors for the tangent space, so that the Euclidean metric may be written as

h^{\mathrm{flat}} = \sum_{i=0}^N dy_i \, dy_i.

The superscript 'flat' is there to remind that, when written in coordinate form, this metric is with respect to the flat-space coordinate y.

An N-dimensional unit sphere embedded in (N + 1)-dimensional Euclidean space may be defined as

\sum_{i=0}^N y_i^2 = 1.

This embedding induces a metric on the sphere; it is inherited directly from the Euclidean metric on the ambient space. Under the change of variable p_i = y_i^2, the positive orthant of the sphere is identified with the probability simplex \sum_i p_i = 1, \; p_i \ge 0, and the induced metric becomes

h = \sum_{i=0}^N dy_i \, dy_i = \frac{1}{4} \sum_{i=0}^N \frac{dp_i \, dp_i}{p_i}.

To complete the process, recall that the probabilities are parametric functions of the manifold variables θ; that is, one has p_i = p_i(\theta). Thus, the above induces a metric on the parameter manifold:

h = \frac{1}{4} \sum_{i=0}^N \frac{dp_i(\theta) \, dp_i(\theta)}{p_i(\theta)}

or, in coordinate form, the Fisher information metric is:

g_{jk}^{\mathrm{fisher}}(\theta) = 4 \sum_{i=0}^N \frac{\partial \sqrt{p_i(\theta)}}{\partial \theta^j} \, \frac{\partial \sqrt{p_i(\theta)}}{\partial \theta^k} = \sum_{i=0}^N p_i(\theta) \, \frac{\partial \log p_i(\theta)}{\partial \theta^j} \, \frac{\partial \log p_i(\theta)}{\partial \theta^k}

where, as before, dp_i(\theta) = \sum_j \frac{\partial p_i}{\partial \theta^j} d\theta^j. The superscript 'fisher' is present to remind that this expression is applicable for the coordinates θ; the metric itself is the same object in either coordinate system.

That is, the Fisher information metric on a statistical manifold is simply (four times) the Euclidean metric restricted to the positive orthant of the sphere, after appropriate changes of variable.
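This identification can be checked directly. The sketch below (illustrative, for the one-parameter family p(θ) = (θ, 1 − θ)) compares four times the Euclidean metric in the √p coordinates with the closed-form Fisher information 1/(θ(1 − θ)):

```python
# Sketch: for the one-parameter family p(theta) = (theta, 1 - theta),
# four times the Euclidean metric in the sqrt(p) coordinates (the
# sphere embedding) matches the Fisher information 1/(theta(1-theta)).
import numpy as np

def p(theta):
    return np.array([theta, 1.0 - theta])

theta, h = 0.3, 1e-6
# Finite-difference tangent vector in the sqrt(p) sphere coordinates.
dy = (np.sqrt(p(theta + h)) - np.sqrt(p(theta - h))) / (2 * h)
g_sphere = 4.0 * np.dot(dy, dy)

g_fisher = 1.0 / (theta * (1 - theta))
assert abs(g_sphere - g_fisher) < 1e-6
```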

The construction above, performed for a finite set of outcomes, can be extended to continuous distributions. One way is to carefully recast all of the above steps in an infinite-dimensional space, being careful to define limits appropriately, etc., in order to make sure that all manipulations are well-defined, convergent, etc.

This should perhaps be no surprise, as the Fubini–Study metric provides the means of measuring information in quantum mechanics.[9]

By setting the phase of the complex coordinate to zero, one obtains exactly one-fourth of the Fisher information metric, exactly as above.

One begins with the same trick, of constructing a probability amplitude, written in polar coordinates:

\psi(x; \theta) = \sqrt{p(x; \theta)} \; e^{i\alpha(x;\theta)}.

Here, \psi(x;\theta) is the probability amplitude; p(x;\theta) and \alpha(x;\theta) are strictly real.

The usual condition that probabilities lie within a simplex, namely that

\int_X p(x;\theta) \, dx = 1,

is equivalently expressed by the idea that the square amplitude be normalized:

\int_X |\psi(x;\theta)|^2 \, dx = 1.

When \alpha(x;\theta) vanishes, \psi is real, and the normalized amplitudes form (the positive orthant of) a unit sphere.

Using the infinitesimal notation, the polar form of the probability above is simply

d\psi = \left( \frac{dp}{2\sqrt{p}} + i \sqrt{p} \, d\alpha \right) e^{i\alpha}.

Inserting the above into the Fubini–Study metric gives (using \int_X dp \, dx = 0, since total probability is conserved):

ds^2_{\mathrm{FS}} = \int_X |d\psi|^2 \, dx - \left| \int_X \bar{\psi} \, d\psi \, dx \right|^2 = \frac{1}{4} \int_X \frac{dp \, dp}{p} \, dx + \int_X p \, d\alpha \, d\alpha \, dx - \left( \int_X p \, d\alpha \, dx \right)^2.

Setting d\alpha = 0 in the above makes it clear that the first term is (one-fourth of) the Fisher information metric.
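For a real amplitude, this reduction can be confirmed numerically. The sketch below (illustrative, for a two-outcome family p(θ) = (θ, 1 − θ)) compares the Fubini–Study line element with one quarter of the Fisher line element:

```python
# Sketch: with a real amplitude psi = sqrt(p) (phase alpha = 0), the
# Fubini-Study line element reduces to one quarter of the Fisher line
# element, checked for the two-outcome family p(theta) = (theta, 1-theta).
import numpy as np

def p(theta):
    return np.array([theta, 1.0 - theta])

theta, h = 0.3, 1e-6
psi = np.sqrt(p(theta))
dpsi = np.sqrt(p(theta + h)) - np.sqrt(p(theta - h))  # real amplitude change
dp = p(theta + h) - p(theta - h)

# Fubini-Study line element for normalized states: |dpsi|^2 - |<psi, dpsi>|^2
ds2_fs = np.dot(dpsi, dpsi) - np.dot(psi, dpsi) ** 2
ds2_fisher = np.sum(dp * dp / p(theta))               # Fisher line element

assert abs(ds2_fs - 0.25 * ds2_fisher) < 1e-15
```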

The full form of the above can be made slightly clearer by changing notation to that of standard Riemannian geometry, so that the metric becomes a symmetric 2-form acting on the tangent space.

The Fisher information metric is then an inner product on the tangent space.

Square integrability is equivalent to saying that a Cauchy sequence converges to a finite value under the weak topology: the space contains its limit points.

If the parameter space is finite-dimensional, then so is the submanifold; likewise, the tangent space has the same dimension as the parameter space.