[1] The sign of the covariance, therefore, shows the tendency in the linear relationship between the variables.
[2] In the opposite case, when greater values of one variable mainly correspond to lesser values of the other (that is, the variables tend to show opposite behavior), the covariance is negative.
The magnitude of the covariance is the geometric mean of the variances that are in common for the two random variables.
The correlation coefficient normalizes the covariance by dividing by the geometric mean of the total variances for the two random variables.
A distinction must be made between (1) the covariance of two random variables, which is a population parameter that can be seen as a property of the joint probability distribution, and (2) the sample covariance, which in addition to serving as a descriptor of the sample, also serves as an estimated value of the population parameter.
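The sample side of this distinction can be illustrated with a minimal Python sketch. The function name `sample_covariance` and the data are illustrative assumptions, and the divisor n − 1 is the common unbiased-estimator convention, which this section does not itself specify.

```python
def sample_covariance(xs, ys):
    """Sample covariance of paired observations, using the divisor n - 1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two sample means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / (n - 1)

# Hypothetical paired observations from some unknown joint distribution.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
estimate = sample_covariance(xs, ys)  # an estimate of the population covariance
```

The returned number describes this particular sample; a different sample from the same distribution would generally give a different estimate of the same population parameter.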
but this equation is susceptible to catastrophic cancellation (see the section on numerical computation below).
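The cancellation problem can be seen in a small floating-point experiment. This sketch assumes that the equation referred to is the shortcut form E[XY] − E[X]E[Y]; the function names and the data are hypothetical, and both functions compute the population form (divisor n).

```python
def cov_shortcut(xs, ys):
    """E[XY] - E[X]E[Y]: subtracts two large, nearly equal numbers."""
    n = len(xs)
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return mean_xy - (sum(xs) / n) * (sum(ys) / n)

def cov_two_pass(xs, ys):
    """Centers the data first, then averages the products of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

# Small spread around a very large mean; the exact covariance is 2/3.
xs = [1e9 + 1, 1e9 + 2, 1e9 + 3]
ys = list(xs)

shortcut = cov_shortcut(xs, ys)   # products near 1e18 exceed double precision
two_pass = cov_two_pass(xs, ys)   # deviations are -1, 0, 1, so no precision is lost
```

With these inputs the products are of order 10^18, where adjacent double-precision numbers are more than one unit apart, so the shortcut result cannot resolve a covariance of 2/3, while the two-pass result can.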
(In fact, correlation coefficients can simply be understood as a normalized version of covariance.)
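A short sketch of this normalization follows. The helper names and the data are assumptions made for illustration; the covariance is divided by the geometric mean of the two variances, and changing the units of one variable rescales the covariance but leaves the correlation unchanged.

```python
import math

def covariance(xs, ys):
    """Population covariance (divisor n) of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Covariance divided by the geometric mean of the two variances."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

xs = [1.0, 2.0, 3.0, 4.0]
ys_cm = [10.0, 20.0, 25.0, 40.0]      # hypothetical measurements in centimetres
ys_m = [y / 100.0 for y in ys_cm]     # the same measurements in metres

cov_cm, cov_m = covariance(xs, ys_cm), covariance(xs, ys_m)      # differ by a factor of 100
corr_cm, corr_m = correlation(xs, ys_cm), correlation(xs, ys_m)  # agree up to rounding
```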
, the covariance is calculated using a double summation over the indices of the matrix:
The variance is a special case of the covariance in which the two variables are identical:[4]: 121
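Written out, and assuming the standard definition of covariance as the expected product of deviations from the means (the symbols X and E are assumed notation here), the identity reads:

```latex
\operatorname{cov}(X, X)
  = \operatorname{E}\!\big[(X - \operatorname{E}[X])(X - \operatorname{E}[X])\big]
  = \operatorname{E}\!\big[(X - \operatorname{E}[X])^{2}\big]
  = \operatorname{var}(X)
```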
are real-valued constants, then the following facts are a consequence of the definition of covariance:
is the joint cumulative distribution function of the random vector
[4]: 121 Similarly, the components of random vectors whose covariance matrix is zero in every entry outside the main diagonal are also called uncorrelated.
is non-linear, while correlation and covariance are measures of linear dependence between two random variables.
This example shows that if two random variables are uncorrelated, that does not in general imply that they are independent.
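A discrete analogue can be checked with exact arithmetic. The setup below is an assumption chosen for illustration, not necessarily the example discussed above: X is uniform on {−1, 0, 1} and Y = X², a non-linear function of X.

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1}, Y = X**2.
support = [-1, 0, 1]
p = Fraction(1, 3)  # probability of each value of X

e_x = sum(p * x for x in support)           # E[X]
e_y = sum(p * x**2 for x in support)        # E[Y]
e_xy = sum(p * x * x**2 for x in support)   # E[XY] = E[X^3]
cov_xy = e_xy - e_x * e_y                   # exact, since Fractions do not round

# Independence would require P(X=1, Y=1) == P(X=1) * P(Y=1).
p_x1 = p
p_y1 = sum(p for x in support if x**2 == 1)
p_joint = sum(p for x in support if x == 1 and x**2 == 1)
independent = (p_joint == p_x1 * p_y1)
```

Here `cov_xy` is exactly zero, yet `independent` is false: knowing X determines Y completely.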
Many properties of covariance follow elegantly from the observation that it satisfies properties similar to those of an inner product: it is bilinear, symmetric, and positive semi-definite. In fact, these properties imply that the covariance defines an inner product over the quotient vector space obtained by taking the subspace of random variables with finite second moment and identifying any two that differ by a constant.
That quotient vector space is isomorphic to the subspace of random variables with finite second moment and mean zero; on that subspace, the covariance is exactly the L2 inner product of real-valued functions on the sample space.
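As a result, for random variables with finite variance, the inequality $|\operatorname{cov}(X, Y)| \le \sqrt{\operatorname{var}(X)\,\operatorname{var}(Y)}$ holds via the Cauchy–Schwarz inequality.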
[12] The covariance is sometimes called a measure of "linear dependence" between the two random variables.
When the covariance is normalized, one obtains the Pearson correlation coefficient, which gives the goodness of the fit for the best possible linear function describing the relation between the variables.
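The link between the normalized covariance and the quality of a linear fit can be sketched numerically. The data and helper names below are hypothetical; the sketch relies on the standard least-squares result that the best-fitting line has slope cov(X, Y)/var(X), and checks that the fraction of variance it explains equals the squared correlation coefficient.

```python
import math

def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    """Population covariance (divisor n)."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

# Hypothetical, nearly linear data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# Least-squares line y = intercept + slope * x.
slope = cov(xs, ys) / cov(xs, xs)
intercept = mean(ys) - slope * mean(xs)

# Pearson correlation coefficient: the normalized covariance.
r = cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

# Fraction of the variance of y explained by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
unexplained = mean([e * e for e in residuals]) / cov(ys, ys)
r_squared_from_fit = 1.0 - unexplained
```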
Certain DNA sequences are more strongly conserved than others across species. To study the secondary and tertiary structures of proteins or of RNA, sequences are therefore compared across closely related species.
In genetics, covariance serves as a basis for computing the genetic relationship matrix (GRM), also known as the kinship matrix, which enables inference on population structure from samples with no known close relatives, as well as estimation of the heritability of complex traits.
In the theory of evolution and natural selection, the Price equation describes how a genetic trait changes in frequency over time.
The equation uses the covariance between a trait and fitness to give a mathematical description of evolution and natural selection.
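The equation itself is not reproduced in this section. One common way of writing it is shown below, with notation that is assumed here rather than taken from the text: z_i is the trait value of individual (or type) i, w_i its fitness, w̄ the mean fitness, and Δz̄ the change in the mean trait value over one generation.

```latex
\Delta \bar{z}
  = \frac{\operatorname{cov}(w_i, z_i)}{\bar{w}}
  + \frac{\operatorname{E}(w_i \, \Delta z_i)}{\bar{w}}
```

The first term is the covariance between fitness and trait mentioned above; the second collects changes in the trait during transmission from parent to offspring.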
[13][14] Covariances play a key role in financial economics, especially in modern portfolio theory and in the capital asset pricing model.
Covariances among various assets' returns are used to determine, under certain assumptions, the relative amounts of different assets that investors should (in a normative analysis) or are predicted to (in a positive analysis) choose to hold in a context of diversification.
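The role of covariances in diversification can be illustrated with the variance of a two-asset portfolio, computed as the quadratic form of the weights with the covariance matrix. The covariance values and weights below are hypothetical numbers, not data from any market.

```python
def portfolio_variance(weights, cov_matrix):
    """Variance of a weighted sum of returns: sum over i, j of w_i * C_ij * w_j."""
    n = len(weights)
    return sum(
        weights[i] * cov_matrix[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )

# Hypothetical covariance matrix of two assets' returns.
cov_matrix = [
    [0.040, 0.006],
    [0.006, 0.090],
]
weights = [0.5, 0.5]  # equal split between the two assets

var_p = portfolio_variance(weights, cov_matrix)
```

Because the off-diagonal covariance is small relative to the variances, the equally weighted portfolio in this sketch has a lower variance than either asset held alone.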
The covariance matrix is important in estimating the initial conditions required for running weather forecast models, a procedure known as data assimilation.
This is an example of its widespread application to Kalman filtering and more general state estimation for time-varying systems.
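A one-dimensional sketch shows how error covariances weight a forecast against an observation. In the scalar case the covariance matrices reduce to variances; the function name and all numbers are illustrative assumptions, and operational data assimilation systems are far more elaborate.

```python
def kalman_update(prior_mean, prior_var, measurement, measurement_var):
    """Scalar Kalman measurement update for a directly observed state."""
    # The gain weights the measurement by the relative size of the two variances.
    gain = prior_var / (prior_var + measurement_var)
    posterior_mean = prior_mean + gain * (measurement - prior_mean)
    posterior_var = (1.0 - gain) * prior_var
    return posterior_mean, posterior_var

# Hypothetical forecast (prior) and observation of the same quantity.
post_mean, post_var = kalman_update(
    prior_mean=10.0, prior_var=4.0, measurement=12.0, measurement_var=1.0
)
```

The updated estimate lies closer to the more certain of the two sources, and its variance is smaller than both the forecast variance and the measurement variance.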
The eddy covariance technique is a key atmospheric measurement technique in which the covariance between the instantaneous deviation of vertical wind speed from its mean value and the instantaneous deviation of gas concentration is the basis for calculating vertical turbulent fluxes.
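The core calculation can be sketched as the mean product of the two deviation series. The function name, the sample values, and their units are hypothetical, and the sketch omits the corrections and unit conversions that real flux processing requires.

```python
def eddy_flux(vertical_wind, concentration):
    """Mean product of deviations of vertical wind speed and gas concentration."""
    n = len(vertical_wind)
    w_mean = sum(vertical_wind) / n
    c_mean = sum(concentration) / n
    return sum(
        (w - w_mean) * (c - c_mean)
        for w, c in zip(vertical_wind, concentration)
    ) / n

# Hypothetical high-frequency samples over one averaging period.
w = [0.2, -0.1, 0.3, -0.4, 0.0]            # vertical wind speed
c = [410.5, 409.8, 410.9, 409.2, 410.1]    # gas concentration

flux = eddy_flux(w, c)
```

A positive value in this sketch means that updrafts tend to carry higher concentrations than downdrafts, which corresponds to a net upward transport.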
The covariance matrix is used to capture the spectral variability of a signal.
[15] The covariance matrix is used in principal component analysis to reduce feature dimensionality in data preprocessing.
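For two features, the reduction can be sketched without any library, because the eigenvalues of a symmetric 2×2 matrix have a closed form. The function names and the data are assumptions made for illustration; practical implementations typically rely on a numerical linear algebra library.

```python
import math

def covariance_matrix_2d(points):
    """Sample mean and 2x2 sample covariance matrix (divisor n - 1)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), [[sxx, sxy], [sxy, syy]]

def leading_component(cov):
    """Largest eigenvalue and unit eigenvector of a symmetric 2x2 matrix."""
    a, b, c = cov[0][0], cov[0][1], cov[1][1]
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    if b == 0:
        vec = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        vec = (b, lam - a)  # satisfies (cov - lam * I) vec = 0
    norm = math.hypot(vec[0], vec[1])
    return lam, (vec[0] / norm, vec[1] / norm)

# Hypothetical two-feature data lying close to a line.
points = [(2.0, 1.9), (0.0, 0.1), (1.0, 1.2), (3.0, 2.8), (4.0, 4.0)]
center, cov = covariance_matrix_2d(points)
variance_kept, direction = leading_component(cov)

# Project each centered point onto the leading direction: 2 features -> 1.
scores = [
    (x - center[0]) * direction[0] + (y - center[1]) * direction[1]
    for x, y in points
]
```

The ratio of `variance_kept` to the sum of the diagonal entries of `cov` indicates how much of the total variance survives the reduction to a single feature.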