The probability density function of the von Mises–Fisher distribution for the random p-dimensional unit vector $\mathbf{x}$ is given by:
$$f_p(\mathbf{x};\, \boldsymbol{\mu}, \kappa) = C_p(\kappa)\,\exp\!\left(\kappa\,\boldsymbol{\mu}^\mathsf{T}\mathbf{x}\right),$$
where $\kappa \ge 0$, $\|\boldsymbol{\mu}\| = 1$, and the normalization constant $C_p(\kappa)$ is equal to
$$C_p(\kappa) = \frac{\kappa^{p/2-1}}{(2\pi)^{p/2}\, I_{p/2-1}(\kappa)},$$
where $I_v$ denotes the modified Bessel function of the first kind at order $v$. The parameters $\boldsymbol{\mu}$ and $\kappa$ are called the mean direction and concentration parameter, respectively. The greater the value of $\kappa$, the higher the concentration of the distribution around the mean direction $\boldsymbol{\mu}$; the distribution is unimodal for $\kappa > 0$ and uniform on the sphere for $\kappa = 0$.
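As a concrete illustration of the formulas above, the following Python sketch (assuming NumPy and SciPy are available; the function name is illustrative) evaluates the log-density, using the exponentially scaled Bessel function `scipy.special.ive` for numerical stability at large $\kappa$:

```python
import numpy as np
from scipy.special import ive  # exponentially scaled Bessel: ive(v, x) = I_v(x) * exp(-x)

def vmf_log_pdf(x, mu, kappa):
    """Log-density of the von Mises-Fisher distribution at unit vector(s) x.

    x: array of shape (..., p) of unit vectors; mu: unit mean direction of shape (p,);
    kappa: positive concentration parameter.
    """
    p = mu.shape[0]
    nu = p / 2 - 1
    # log C_p(kappa) = (p/2 - 1) log(kappa) - (p/2) log(2 pi) - log I_{p/2-1}(kappa),
    # with log I_nu(kappa) recovered as log(ive(nu, kappa)) + kappa.
    log_norm = nu * np.log(kappa) - (p / 2) * np.log(2 * np.pi) - (np.log(ive(nu, kappa)) + kappa)
    return log_norm + kappa * (x @ mu)

# Example: density at the mode (a point aligned with the mean direction) for p = 3, kappa = 10
mu = np.array([0.0, 0.0, 1.0])
print(np.exp(vmf_log_pdf(mu, mu, kappa=10.0)))
```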
The von Mises–Fisher distribution for p = 3, also called the Fisher distribution, was first used to model the interaction of electric dipoles in an electric field.[3] Other applications are found in geology, bioinformatics, and text mining.
In the textbook Directional Statistics[3] by Mardia and Jupp, the normalization constant given for the von Mises–Fisher probability density is apparently different from the one given here:
$$C_p^*(\kappa) = \left(\frac{\kappa}{2}\right)^{p/2-1}\frac{1}{\Gamma\!\left(p/2\right)\, I_{p/2-1}(\kappa)}.$$
This is resolved by noting that Mardia and Jupp give the density "with respect to the uniform distribution", while the density here is specified in the usual way, with respect to Lebesgue measure. The density (with respect to Lebesgue measure) of the uniform distribution is the reciprocal of the surface area of the hypersphere, so that the two constants are related by
$$C_p(\kappa) = \frac{C_p^*(\kappa)}{S_{p-1}}, \qquad S_{p-1} = \frac{2\pi^{p/2}}{\Gamma(p/2)}.$$
While the normalization constant $C_p(\kappa)$ was derived above via the surface area, the same result may be obtained by conditioning an isotropic multivariate normal distribution on the unit hypersphere, as follows. Starting from a normal distribution with isotropic covariance $\kappa^{-1}\mathbf{I}$ and mean $\boldsymbol{\mu}$ of length $r > 0$, whose density function is:
$$\Phi_p(\mathbf{x};\, \boldsymbol{\mu}, \kappa) = \left(\frac{\kappa}{2\pi}\right)^{p/2}\exp\!\left(-\frac{\kappa}{2}(\mathbf{x}-\boldsymbol{\mu})^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu})\right),$$
the von Mises–Fisher distribution is obtained by conditioning on $\|\mathbf{x}\| = 1$. By expanding
$$(\mathbf{x}-\boldsymbol{\mu})^\mathsf{T}(\mathbf{x}-\boldsymbol{\mu}) = \mathbf{x}^\mathsf{T}\mathbf{x} + \boldsymbol{\mu}^\mathsf{T}\boldsymbol{\mu} - 2\boldsymbol{\mu}^\mathsf{T}\mathbf{x}$$
and using the fact that the first two right-hand-side terms are fixed, the von Mises–Fisher density $f_p(\mathbf{x};\, \boldsymbol{\mu}/r, r\kappa)$ is recovered by recomputing the normalization constant by integrating over the unit hypersphere.
More succinctly, the restriction of any isotropic multivariate normal density to the unit hypersphere gives a von Mises–Fisher density, up to normalization.
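This proportionality can be checked numerically; a minimal sketch (the choices $p = 5$, $\kappa = 4$ and the variable names are arbitrary) verifies that, on the unit hypersphere, the isotropic normal log-density and $\kappa\,\boldsymbol{\mu}^\mathsf{T}\mathbf{x}$ differ only by a constant:

```python
import numpy as np

rng = np.random.default_rng(0)
p, kappa = 5, 4.0
mu = np.zeros(p)
mu[0] = 1.0                                   # unit-length mean direction (r = 1)

# Random points on the unit hypersphere
x = rng.standard_normal((1000, p))
x /= np.linalg.norm(x, axis=1, keepdims=True)

# Isotropic normal log-density with covariance kappa^{-1} I, up to its own constant
log_gauss = -0.5 * kappa * np.sum((x - mu) ** 2, axis=1)
# Unnormalized vMF log-density with direction mu and concentration kappa
log_vmf = kappa * (x @ mu)

# On the sphere, x'x = mu'mu = 1, so the difference is the constant -kappa
print(np.allclose(log_gauss - log_vmf, -kappa))
```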
Given a series of $N$ independent unit vectors $\mathbf{x}_i$ drawn from a von Mises–Fisher distribution, the maximum likelihood estimate of the mean direction is simply the normalized arithmetic mean, a sufficient statistic:[3]
$$\hat{\boldsymbol{\mu}} = \frac{\bar{\mathbf{x}}}{\bar{R}}, \qquad \bar{\mathbf{x}} = \frac{1}{N}\sum_{i=1}^{N}\mathbf{x}_i, \qquad \bar{R} = \|\bar{\mathbf{x}}\|.$$
Use the modified Bessel function of the first kind to define
$$A_p(\kappa) = \frac{I_{p/2}(\kappa)}{I_{p/2-1}(\kappa)}.$$
Then:
$$A_p(\hat\kappa) = \bar{R}.$$
Thus $\hat\kappa$ is the solution to this equation. A simple approximation to $\hat\kappa$ is (Sra, 2011)
$$\hat\kappa = \frac{\bar{R}\,(p - \bar{R}^2)}{1 - \bar{R}^2}.$$
A more accurate inversion can be obtained by iterating the Newton method a few times:
$$\hat\kappa_{t+1} = \hat\kappa_t - \frac{A_p(\hat\kappa_t) - \bar{R}}{1 - A_p(\hat\kappa_t)^2 - \frac{p-1}{\hat\kappa_t}A_p(\hat\kappa_t)}.$$
For N ≥ 25, the estimated spherical standard error of the sample mean direction can be computed as:[4]
$$\hat\sigma = \left(\frac{d}{N\bar{R}^2}\right)^{1/2},$$
where
$$d = 1 - \frac{1}{N}\sum_{i=1}^{N}\left(\hat{\boldsymbol{\mu}}^\mathsf{T}\mathbf{x}_i\right)^2.$$
It is then possible to approximate a $100(1-\alpha)\%$ confidence cone about $\hat{\boldsymbol{\mu}}$.
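Putting the estimation steps together, a possible implementation looks as follows (a sketch, assuming NumPy and SciPy; `fit_vmf` and `a_p` are illustrative names, and three Newton steps are an arbitrary choice):

```python
import numpy as np
from scipy.special import ive

def a_p(kappa, p):
    """Ratio A_p(kappa) = I_{p/2}(kappa) / I_{p/2-1}(kappa); exponential scaling cancels."""
    return ive(p / 2, kappa) / ive(p / 2 - 1, kappa)

def fit_vmf(x):
    """Estimate (mu, kappa) from samples x of shape (N, p)."""
    n, p = x.shape
    xbar = x.mean(axis=0)
    r_bar = np.linalg.norm(xbar)
    mu_hat = xbar / r_bar
    # Initial approximation (Sra, 2011)
    kappa = r_bar * (p - r_bar ** 2) / (1 - r_bar ** 2)
    # Newton refinement of A_p(kappa) = r_bar, using A_p'(k) = 1 - A_p(k)^2 - (p-1)/k * A_p(k)
    for _ in range(3):
        a = a_p(kappa, p)
        kappa -= (a - r_bar) / (1 - a ** 2 - (p - 1) / kappa * a)
    return mu_hat, kappa
```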
The expected value of the von Mises–Fisher distribution is not on the unit hypersphere, but instead has a length of less than one. For mean direction $\boldsymbol{\mu}$ and concentration $\kappa \ge 0$, the expected value is:
$$\operatorname{E}[\mathbf{x}] = A_p(\kappa)\,\boldsymbol{\mu}.$$
For $\kappa > 0$, the length of the expected value is strictly between zero and one and is a monotonic rising function of $\kappa$.
The empirical mean (arithmetic average) of a collection of points on the unit hypersphere behaves in a similar manner, being close to the origin for widely spread data and close to the sphere for concentrated data.
The expected value can be used to compute differential entropy and KL divergence.
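Since the density is an exponential family in the natural parameter $\kappa\boldsymbol{\mu}$, with $\operatorname{E}[\mathbf{x}] = A_p(\kappa)\,\boldsymbol{\mu}$, both quantities follow directly; as a sketch of the computation:
$$h\bigl(\text{VMF}(\boldsymbol{\mu},\kappa)\bigr) = -\operatorname{E}\!\left[\log f_p(\mathbf{x})\right] = -\log C_p(\kappa) - \kappa\,A_p(\kappa),$$
$$D_{\mathrm{KL}}\bigl(\text{VMF}(\boldsymbol{\mu}_0,\kappa_0)\,\|\,\text{VMF}(\boldsymbol{\mu}_1,\kappa_1)\bigr) = \log\frac{C_p(\kappa_0)}{C_p(\kappa_1)} + \left(\kappa_0\boldsymbol{\mu}_0 - \kappa_1\boldsymbol{\mu}_1\right)^\mathsf{T} A_p(\kappa_0)\,\boldsymbol{\mu}_0.$$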
Von Mises–Fisher (VMF) distributions are closed under orthogonal linear transforms: if $\mathbf{x} \sim \text{VMF}(\boldsymbol{\mu}, \kappa)$ and $\mathbf{U}$ is an orthogonal $p \times p$ matrix, then $\mathbf{U}\mathbf{x} \sim \text{VMF}(\mathbf{U}\boldsymbol{\mu}, \kappa)$.
An algorithm for drawing pseudo-random samples from the Von Mises Fisher (VMF) distribution was given by Ulrich[5] and later corrected by Wood.
A sample from the VMF distribution can be decomposed as
$$\mathbf{x} = W\,\boldsymbol{\mu} + \left(1 - W^2\right)^{1/2}\mathbf{V},$$
where the tangential component, $\mathbf{V}$, must be drawn from the uniform distribution on the tangential subsphere (the unit sphere in the subspace orthogonal to $\boldsymbol{\mu}$); and the radial component, $W = \boldsymbol{\mu}^\mathsf{T}\mathbf{x}$, must be drawn from the distribution with density
$$f_W(w) = \frac{(\kappa/2)^{p/2-1}}{\Gamma\!\left(\tfrac{p-1}{2}\right)\Gamma\!\left(\tfrac12\right) I_{p/2-1}(\kappa)}\,e^{\kappa w}\left(1 - w^2\right)^{(p-3)/2}, \qquad w \in [-1, 1].$$
The normalization constant for this density may be verified by using:
$$I_\nu(\kappa) = \frac{(\kappa/2)^\nu}{\Gamma\!\left(\nu + \tfrac12\right)\Gamma\!\left(\tfrac12\right)}\int_{-1}^{1} e^{\kappa t}\left(1 - t^2\right)^{\nu - 1/2}dt,$$
as given in Appendix 1 (A.3) in Directional Statistics.
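The following Python sketch combines a common form of the Ulrich–Wood rejection scheme for $W$ with the tangent-normal assembly above (a sketch, not a verbatim transcription of the published algorithm; names are illustrative):

```python
import numpy as np

def sample_vmf(mu, kappa, n, rng=None):
    """Draw n samples from VMF(mu, kappa) on the unit sphere in R^p."""
    rng = np.random.default_rng() if rng is None else rng
    p = mu.shape[0]
    # --- radial component W via rejection sampling (Ulrich/Wood-style envelope) ---
    b = (-2 * kappa + np.sqrt(4 * kappa ** 2 + (p - 1) ** 2)) / (p - 1)
    x0 = (1 - b) / (1 + b)
    c = kappa * x0 + (p - 1) * np.log(1 - x0 ** 2)
    w = np.empty(n)
    for i in range(n):
        while True:
            z = rng.beta((p - 1) / 2, (p - 1) / 2)
            wi = (1 - (1 + b) * z) / (1 - (1 - b) * z)
            if kappa * wi + (p - 1) * np.log(1 - x0 * wi) - c >= np.log(rng.uniform()):
                w[i] = wi
                break
    # --- tangential component V: uniform on the subsphere orthogonal to mu ---
    v = rng.standard_normal((n, p))
    v -= np.outer(v @ mu, mu)                      # project out the mu-direction
    v /= np.linalg.norm(v, axis=1, keepdims=True)  # normalize to unit length
    # --- tangent-normal assembly: x = W mu + sqrt(1 - W^2) V ---
    return w[:, None] * mu + np.sqrt(1 - w ** 2)[:, None] * v
```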
[9] To generate a von Mises–Fisher distributed pseudo-random spherical 3-D unit vector,[10][11] the radial component $W$ can be drawn by directly inverting its cumulative distribution function, and the tangential component drawn uniformly from the unit circle orthogonal to $\boldsymbol{\mu}$.
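A minimal sketch of this 3-D case for $\boldsymbol{\mu}$ equal to the north pole (rotate the output for a general mean direction; assumes $\kappa > 0$):

```python
import numpy as np

def sample_vmf3_north_pole(kappa, n, rng=None):
    """Samples from VMF(mu, kappa) on S^2 with mu = (0, 0, 1) and kappa > 0."""
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(size=n)
    # Inverse CDF of W = cos(theta), whose density is proportional to exp(kappa * w) on [-1, 1]
    w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n)    # uniform tangential angle
    s = np.sqrt(1.0 - w ** 2)
    return np.column_stack((s * np.cos(phi), s * np.sin(phi), w))
```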
The distribution of the radial component $W$ may be better understood by highlighting its relation to the beta distribution: for $\kappa = 0$, $(1 + W)/2$ follows a $\text{Beta}\!\left(\tfrac{p-1}{2}, \tfrac{p-1}{2}\right)$ distribution, and for general $\kappa$ the density of $W$ is an exponentially tilted version of it. The Legendre duplication formula is useful to understand the relationships between the normalization constants of the various densities above.
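As a sketch of how the duplication formula enters: with $z = \tfrac{p-1}{2}$, the identity $\Gamma(z)\Gamma\!\left(z + \tfrac12\right) = 2^{1-2z}\sqrt{\pi}\,\Gamma(2z)$ gives
$$\Gamma\!\left(\tfrac{p-1}{2}\right)\Gamma\!\left(\tfrac{p}{2}\right) = 2^{2-p}\sqrt{\pi}\,\Gamma(p-1),$$
so the beta-function constant can be rewritten as
$$B\!\left(\tfrac{p-1}{2}, \tfrac{p-1}{2}\right) = \frac{\Gamma\!\left(\tfrac{p-1}{2}\right)^2}{\Gamma(p-1)} = \frac{\Gamma\!\left(\tfrac12\right)\Gamma\!\left(\tfrac{p-1}{2}\right)}{2^{\,p-2}\,\Gamma\!\left(\tfrac{p}{2}\right)},$$
which ties it to the factor $\Gamma\!\left(\tfrac{p-1}{2}\right)\Gamma\!\left(\tfrac12\right)$ appearing in the density of $W$ above.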
In machine learning, especially in image classification, to-be-classified inputs (e.g. images) are often compared using cosine similarity, which is the dot product between intermediate representations in the form of unit vectors (termed embeddings).
The deep neural networks that extract embeddings for classification should learn to spread the classes as far apart as possible and, ideally, this should give classes that are uniformly distributed on the unit hypersphere.[14] For a better statistical understanding of across-class cosine similarity, the distribution of dot products between unit vectors independently sampled from the uniform distribution may be helpful.
The variances decrease, the distributions of all three variables become more Gaussian, and the final approximation gets better as the dimensionality, $p$, is increased.
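This behaviour can be observed in a quick simulation; the sketch below (sample size and the values of $p$ are arbitrary) draws pairs of independent uniform unit vectors and inspects their dot products, whose mean is $0$ and whose variance is $1/p$:

```python
import numpy as np

rng = np.random.default_rng(0)
for p in (8, 64, 256):
    # Independent uniform unit vectors: normalize isotropic Gaussian draws
    x = rng.standard_normal((20_000, p))
    y = rng.standard_normal((20_000, p))
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    y /= np.linalg.norm(y, axis=1, keepdims=True)
    t = np.sum(x * y, axis=1)            # cosine similarities
    # Mean stays near 0; p * Var(t) stays near 1, i.e. Var(t) shrinks like 1/p
    print(p, float(t.mean()), float(p * t.var()))
```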
The above-mentioned radial-tangential decomposition generalizes to the Saw family, whose member densities depend on $\mathbf{x}$ only through $t = \boldsymbol{\mu}^\mathsf{T}\mathbf{x}$, and the radial component, $t$, then has the marginal density:
$$f_t(t) = \frac{2\pi^{(p-1)/2}}{\Gamma\!\left(\tfrac{p-1}{2}\right)}\left(1 - t^2\right)^{(p-3)/2} f(t), \qquad t \in [-1, 1],$$
where $f(t)$ denotes the common value of the density on the slice $\{\mathbf{x} : \boldsymbol{\mu}^\mathsf{T}\mathbf{x} = t\}$. Also notice that the left-hand factor of the radial density is the surface area of $S^{p-2}$.
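A one-line justification of this form (a sketch): the slice $\{\mathbf{x} : \boldsymbol{\mu}^\mathsf{T}\mathbf{x} = t\}$ is a sphere of radius $\sqrt{1-t^2}$, whose surface measure scales as $(1-t^2)^{(p-2)/2}$, while the co-area factor contributes $(1-t^2)^{-1/2}$:
$$f_t(t)\,dt = \underbrace{\frac{2\pi^{(p-1)/2}}{\Gamma\!\left(\tfrac{p-1}{2}\right)}}_{\text{surface area of } S^{p-2}}\left(1 - t^2\right)^{\frac{p-2}{2}} f(t)\,\frac{dt}{\sqrt{1 - t^2}}.$$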
The definition of the von Mises–Fisher distribution can be extended to include also the case where $p = 1$, so that the support is the 0-dimensional hypersphere, which when embedded into 1-dimensional Euclidean space is the discrete set, $\{-1, +1\}$.
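As a sketch of this degenerate case: taking $\mu = +1$ in the general formula and using $I_{-1/2}(\kappa) = \sqrt{2/(\pi\kappa)}\,\cosh\kappa$ gives $C_1(\kappa) = 1/(2\cosh\kappa)$, so the distribution reduces to two point masses:
$$P(x = +1) = \frac{e^{\kappa}}{e^{\kappa} + e^{-\kappa}}, \qquad P(x = -1) = \frac{e^{-\kappa}}{e^{\kappa} + e^{-\kappa}}.$$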