Directional statistics

More generally, directional statistics deals with observations on compact Riemannian manifolds including the Stiefel manifold.

Other examples of data that may be regarded as directional include statistics involving temporal periods (e.g. time of day, week, month, year, etc.), compass directions, dihedral angles in molecules, orientations, rotations and so on.

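Any probability density function (pdf) $p(x)$ on the line can be "wrapped" around the circumference of a circle of unit radius. That is, the pdf of the wrapped variable $\theta = x \bmod 2\pi$ is

$$p_w(\theta) = \sum_{k=-\infty}^{\infty} p(\theta + 2\pi k).$$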
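A minimal numerical sketch of this wrapping, assuming the wrapped density is obtained by summing the linear pdf over all shifts by multiples of 2π and that the infinite sum can be truncated. The helper names `wrap_pdf` and `normal_pdf`, the normal example, and the truncation level are illustrative assumptions, not part of the source.

```python
import numpy as np

def wrap_pdf(linear_pdf, theta, n_terms=50):
    """Approximate a wrapped pdf by summing linear_pdf(theta + 2*pi*k), |k| <= n_terms."""
    theta = np.asarray(theta, dtype=float)
    k = np.arange(-n_terms, n_terms + 1)
    # Broadcast: one row per angle, one column per winding number k.
    return linear_pdf(theta[..., None] + 2.0 * np.pi * k).sum(axis=-1)

def normal_pdf(x, mu=0.0, sigma=1.5):
    # Hypothetical linear density used only as an example input.
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

theta = np.linspace(-np.pi, np.pi, 2001)
wrapped = wrap_pdf(normal_pdf, theta)

# Trapezoidal integral over one full turn; a valid circular pdf integrates to 1.
dtheta = theta[1] - theta[0]
total = np.sum(0.5 * (wrapped[:-1] + wrapped[1:])) * dtheta
```

The truncation is adequate here because the normal tails decay quickly; heavy-tailed densities would need many more terms or a closed form.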

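This concept can be extended to the multivariate context by extending the single sum over $k$ to $F$ sums, one for each dimension of the feature space:

$$p_w(\boldsymbol{\theta}) = \sum_{k_1=-\infty}^{\infty} \cdots \sum_{k_F=-\infty}^{\infty} p(\boldsymbol{\theta} + 2\pi k_1 \mathbf{e}_1 + \dots + 2\pi k_F \mathbf{e}_F)$$

where $\mathbf{e}_k$ is the $k$-th Euclidean basis vector.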

The following sections show some relevant circular distributions.

The underlying linear probability distribution for the von Mises distribution is mathematically intractable; however, for statistical purposes, there is no need to deal with the underlying linear distribution.

The usefulness of the von Mises distribution is twofold: it is the most mathematically tractable of all circular distributions, allowing simpler statistical analysis, and it is a close approximation to the wrapped normal distribution, which, analogously to the linear normal distribution, is important because it is the limiting case for the sum of a large number of small angular deviations.
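The closeness of the two distributions can be checked numerically. The sketch below assumes the standard von Mises density $\exp(\kappa\cos(\theta-\mu)) / (2\pi I_0(\kappa))$ and matches it to a wrapped normal with $\sigma^2 = 1/\kappa$, which is a large-$\kappa$ approximation chosen here for illustration; the function names and parameter values are likewise assumptions.

```python
import numpy as np

def von_mises_pdf(theta, mu, kappa):
    # np.i0 is the modified Bessel function of the first kind, order 0.
    return np.exp(kappa * np.cos(theta - mu)) / (2.0 * np.pi * np.i0(kappa))

def wrapped_normal_pdf(theta, mu, sigma, n_terms=20):
    # Truncated sum of normal densities shifted by multiples of 2*pi.
    k = np.arange(-n_terms, n_terms + 1)
    z = (np.asarray(theta, dtype=float)[..., None] - mu + 2.0 * np.pi * k) / sigma
    return (np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))).sum(axis=-1)

kappa = 50.0                                   # hypothetical concentration
theta = np.linspace(-np.pi, np.pi, 1001)
vm = von_mises_pdf(theta, 0.0, kappa)
wn = wrapped_normal_pdf(theta, 0.0, 1.0 / np.sqrt(kappa))

# Largest pointwise difference between the two densities on the grid.
max_gap = np.max(np.abs(vm - wn))
```

For small concentrations the simple $\sigma^2 = 1/\kappa$ matching is no longer accurate, so the gap reported by this sketch grows as $\kappa$ decreases.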

The probability density function (pdf) of the circular uniform distribution is given by

$$U(\theta) = \frac{1}{2\pi}.$$

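The pdf of the wrapped normal distribution (WN) is:

$$WN(\theta; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} \exp\left[\frac{-(\theta - \mu + 2\pi k)^2}{2\sigma^2}\right] = \frac{1}{2\pi}\,\vartheta\left(\frac{\theta - \mu}{2\pi}, \frac{i\sigma^2}{2\pi}\right)$$

where μ and σ are the mean and standard deviation of the unwrapped distribution, respectively, and $\vartheta(\theta, \tau)$ is the Jacobi theta function:

$$\vartheta(\theta, \tau) = \sum_{n=-\infty}^{\infty} (w^2)^n q^{n^2}, \quad w \equiv e^{i\pi\theta}, \quad q \equiv e^{i\pi\tau}.$$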

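The pdf of the wrapped Cauchy distribution (WC) is:

$$WC(\theta; \theta_0, \gamma) = \sum_{n=-\infty}^{\infty} \frac{\gamma}{\pi\left(\gamma^2 + (\theta + 2\pi n - \theta_0)^2\right)} = \frac{1}{2\pi}\,\frac{\sinh\gamma}{\cosh\gamma - \cos(\theta - \theta_0)}$$

where $\gamma$ is the scale factor and $\theta_0$ is the peak position.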
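The wrapped Cauchy density has the well-known closed form $\frac{1}{2\pi}\sinh\gamma / (\cosh\gamma - \cos(\theta - \theta_0))$, with scale $\gamma$ and peak position $\theta_0$. As a hedged check, the sketch below compares that closed form with a truncated sum of shifted linear Cauchy densities; the function names, parameter values, and truncation level are illustrative assumptions.

```python
import numpy as np

def wrapped_cauchy_series(theta, theta0, gamma, n_terms=2000):
    # Truncated wrapping sum; Cauchy tails decay slowly, so many terms are used.
    n = np.arange(-n_terms, n_terms + 1)
    x = np.asarray(theta, dtype=float)[..., None] + 2.0 * np.pi * n - theta0
    return (gamma / (np.pi * (gamma**2 + x**2))).sum(axis=-1)

def wrapped_cauchy_closed(theta, theta0, gamma):
    # Closed form of the infinite wrapping sum.
    return np.sinh(gamma) / (2.0 * np.pi * (np.cosh(gamma) - np.cos(theta - theta0)))

theta = np.linspace(-np.pi, np.pi, 181)
series = wrapped_cauchy_series(theta, theta0=0.7, gamma=0.5)
closed = wrapped_cauchy_closed(theta, theta0=0.7, gamma=0.5)

# Remaining discrepancy comes from truncating the series.
max_err = np.max(np.abs(series - closed))
```

In practice the closed form is preferable: it is exact and avoids the slow convergence of the series.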

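The pdf of the wrapped Lévy distribution (WL) is:

$$f_{WL}(\theta; \mu, c) = \sum_{n=-\infty}^{\infty} \sqrt{\frac{c}{2\pi}}\, \frac{e^{-c / \left(2(\theta + 2\pi n - \mu)\right)}}{(\theta + 2\pi n - \mu)^{3/2}}$$

where the value of the summand is taken to be zero when $\theta + 2\pi n - \mu \le 0$, $c$ is the scale factor and $\mu$ is the location parameter.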

The projected normal distribution is a circular distribution representing the direction of a random variable with multivariate normal distribution, obtained by radial projection of the variable over the unit (n-1)-sphere.
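A minimal sketch of this radial projection in two dimensions, assuming samples are drawn from a bivariate normal and divided by their norm so that they land on the unit circle. The mean vector, covariance matrix, and sample size are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

mean = np.array([1.0, 0.5])                  # hypothetical mean vector
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])                 # hypothetical covariance matrix

# Draw from the bivariate normal, then project each point radially
# onto the unit circle by dividing by its Euclidean norm.
x = rng.multivariate_normal(mean, cov, size=5000)
u = x / np.linalg.norm(x, axis=1, keepdims=True)

# Direction of each projected point as an angle in [-pi, pi].
angles = np.arctan2(u[:, 1], u[:, 0])
```

A histogram of `angles` gives an empirical picture of the resulting circular density for the chosen mean and covariance.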

Because it arises from such a projection, and unlike other commonly used circular distributions, it is in general neither symmetric nor unimodal.[8]

The Bingham distribution is a distribution over axes in N dimensions, or equivalently, over points on the (N − 1)-dimensional sphere with the antipodes identified.[9]

For example, if N = 2, the axes are undirected lines through the origin in the plane.

In this case, each axis cuts the unit circle in the plane (which is the one-dimensional sphere) at two points that are each other's antipodes.

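These distributions are used, for example, in geology,[10] crystallography[11] and bioinformatics.[1][12][13]

The raw vector (or trigonometric) moments of a circular distribution are defined as

$$m_n = \operatorname{E}(z^n) = \int_\Gamma P(\theta)\, z^n \, d\theta$$

where $\Gamma$ is any interval of length $2\pi$, $P(\theta)$ is the pdf of the circular distribution, and $z = e^{i\theta}$.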

Sample moments are analogously defined:

$$\overline{m}_n = \frac{1}{N} \sum_{i=1}^{N} z_i^n, \quad z_i = e^{i\theta_i}.$$

The population resultant vector, length, and mean angle are defined in analogy with the corresponding sample parameters.
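As a small illustration, the sample trigonometric moments can be computed by mapping each angle to a point $z_i = e^{i\theta_i}$ on the unit circle and averaging powers of $z_i$. The sketch below assumes that definition; the data values and the names `sample_moment`, `R_bar`, and `theta_bar` are hypothetical.

```python
import numpy as np

def sample_moment(theta, n=1):
    """n-th sample trigonometric moment: mean of exp(i * n * theta)."""
    z = np.exp(1j * np.asarray(theta, dtype=float))
    return np.mean(z ** n)

# Made-up angles clustered around zero, given in degrees.
theta = np.deg2rad([10.0, 20.0, 30.0, 340.0, 350.0])

m1 = sample_moment(theta, 1)    # sample resultant vector (complex number)
R_bar = np.abs(m1)              # sample resultant length, between 0 and 1
theta_bar = np.angle(m1)        # sample mean angle in radians
m2 = sample_moment(theta, 2)    # second moment, used in some dispersion measures
```

A resultant length near 1 indicates tightly clustered angles, while a value near 0 indicates that the directions largely cancel.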

Various measures of central tendency and statistical dispersion may be defined for both the population and a sample drawn from that population.[3]

The most common measure of location is the circular mean.

The population circular mean is simply the first moment of the distribution while the sample mean is the first moment of the sample.
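A short worked example shows why the circular mean is used instead of the arithmetic mean. The sketch assumes the sample mean direction is taken as the angle of the averaged unit vectors; the function name `circular_mean_deg` and the two bearings are illustrative.

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Mean direction in degrees, reported in [0, 360)."""
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    # Average the unit vectors, then take the direction of the result.
    mean_angle = np.arctan2(np.mean(np.sin(theta)), np.mean(np.cos(theta)))
    return np.rad2deg(mean_angle) % 360.0

angles = [350.0, 10.0]                   # two bearings on either side of north

arithmetic = np.mean(angles)             # 180.0, pointing the opposite way
circular = circular_mean_deg(angles)     # close to 0 (mod 360), up to rounding
```

Because of floating-point rounding the circular result may come out as a value just below 360 rather than exactly 0, so comparisons should be made modulo 360.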

When data is concentrated, the median and mode may be defined by analogy to the linear case, but for more dispersed or multi-modal data, these concepts are not useful.

The calculation of the distribution of the mean for most circular distributions is not analytically possible, and in order to carry out an analysis of variance, numerical or mathematical approximations are needed.[14]

The central limit theorem may be applied to the distribution of the sample means (main article: Central limit theorem for directional statistics).

It can be shown that the distribution of $[\overline{C}, \overline{S}]$, where $\overline{C}$ and $\overline{S}$ are the sample means of the cosines and sines of the observed angles, approaches a bivariate normal distribution in the limit of large sample size.
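The limiting behaviour can be explored by simulation. The sketch below assumes a wrapped normal population with $\mu = 0$ and $\sigma = 1$, for which $\operatorname{E}[\cos\theta] = e^{-\sigma^2/2}$ and $\operatorname{E}[\cos 2\theta] = e^{-2\sigma^2}$ follow from the normal characteristic function. It repeatedly draws samples and records the means of the cosines and sines; the sample size, replication count, and variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)
sigma, n, reps = 1.0, 200, 4000          # hypothetical settings

# Wrapped normal draws: normal variates reduced modulo 2*pi.
theta = rng.normal(0.0, sigma, size=(reps, n)) % (2.0 * np.pi)

# One (C_bar, S_bar) pair per replicated sample.
C_bar = np.cos(theta).mean(axis=1)
S_bar = np.sin(theta).mean(axis=1)

emp_mean = np.array([C_bar.mean(), S_bar.mean()])
emp_cov = n * np.cov(C_bar, S_bar)       # scaled covariance of the sample means

# Theoretical values for mu = 0.
exp_mean = np.array([np.exp(-sigma**2 / 2.0), 0.0])
var_c = 0.5 * (1.0 + np.exp(-2.0 * sigma**2)) - np.exp(-sigma**2)
var_s = 0.5 * (1.0 - np.exp(-2.0 * sigma**2))
```

Comparing `emp_mean` with `exp_mean`, and the diagonal of `emp_cov` with `var_c` and `var_s`, gives a rough check of the mean and covariance; a scatter plot of `C_bar` against `S_bar` should look approximately elliptical for large `n`.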

The overall shape of a protein can be parameterized as a sequence of points on the unit sphere. Shown are two views of the spherical histogram of such points for a large collection of protein structures. The statistical treatment of such data is in the realm of directional statistics.[1]
Three point sets sampled from different Kent distributions on the sphere.