In probability and statistics, a multivariate random variable or random vector is a list or vector of mathematical variables each of whose value is unknown, either because the value has not yet occurred or because there is imperfect knowledge of its value.
The individual variables in a random vector are grouped together because they are all part of a single mathematical system — often they represent different properties of an individual statistical unit.
For example, while a given person has a specific age, height and weight, the representation of these features of an unspecified person from within a group would be a random vector.
Normally each element of a random vector is a real number.
Formally, a multivariate random variable is a column vector $\mathbf{X} = (X_1, \dots, X_n)^T$ (or its transpose, which is a row vector) whose components are random variables on the probability space $(\Omega, \mathcal{F}, P)$, where $\Omega$ is the sample space, $\mathcal{F}$ is the sigma-algebra (the collection of all events), and $P$ is the probability measure (a function returning the probability of each event).
Every random vector gives rise to a probability measure on $\mathbb{R}^n$ with the Borel algebra as the underlying sigma-algebra. This measure is also known as the joint probability distribution, the joint distribution, or the multivariate distribution of the random vector.
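To make this concrete, the following Python/NumPy sketch (the two-component Gaussian distribution and the rectangle are illustrative assumptions, not part of the definition) draws realizations of a random vector and empirically estimates the induced measure of a set in $\mathbb{R}^2$:

```python
import numpy as np

# A 2-component random vector (X1, X2): each draw of the generator is one
# outcome, and the pair of component values is the realization in R^2.
rng = np.random.default_rng(0)
mean = np.array([1.0, -0.5])
cov = np.array([[1.0, 0.6],
                [0.6, 2.0]])
samples = rng.multivariate_normal(mean, cov, size=100_000)   # shape (100000, 2)

# Empirical estimate of the induced measure of the rectangle [0, 2] x [-1, 1].
in_box = ((samples[:, 0] >= 0) & (samples[:, 0] <= 2) &
          (samples[:, 1] >= -1) & (samples[:, 1] <= 1))
print("P(0 <= X1 <= 2, -1 <= X2 <= 1) ~", in_box.mean())
```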
Random vectors can be subjected to the same kinds of algebraic operations as can non-random vectors: addition, subtraction, multiplication by a scalar, and the taking of inner products.
Similarly, a new random vector $\mathbf{Y}$ can be defined by applying an affine transformation to a random vector $\mathbf{X}$: $\mathbf{Y} = A\mathbf{X} + b$, where $A$ is an $n \times n$ matrix and $b$ is an $n \times 1$ column vector. If $A$ is an invertible matrix and $\mathbf{X}$ has a probability density function $f_{\mathbf{X}}$, then the probability density of $\mathbf{Y}$ is

$$f_{\mathbf{Y}}(y) = \frac{f_{\mathbf{X}}(A^{-1}(y - b))}{|\det A|}.$$

More generally, one can study invertible mappings of random vectors.
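A minimal NumPy sketch of these operations, in which the distributions, the invertible matrix $A$, and the shift $b$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 50_000

# Two 3-component random vectors, represented by their sampled realizations
# (one row per draw).
X = rng.normal(size=(n_draws, 3))
Y = rng.exponential(size=(n_draws, 3))

Z = X + Y                              # addition of random vectors
S = 2.5 * X                            # multiplication by a scalar
inner = np.einsum('ij,ij->i', X, Y)    # inner product <X, Y>, a random scalar

# An invertible affine map applied draw by draw: W = A X + b.
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0]])        # det(A) = 1, so the map is invertible
b = np.array([1.0, -1.0, 0.5])
W = X @ A.T + b

# Since E[X] = 0 here, the sample mean of W should be close to b.
print("sample mean of W:", W.mean(axis=0))
```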
The expected value or mean of a random vector $\mathbf{X}$ is a fixed vector $\operatorname{E}[\mathbf{X}]$ whose elements are the expected values of the respective random variables:

$$\operatorname{E}[\mathbf{X}] = (\operatorname{E}[X_1], \dots, \operatorname{E}[X_n])^T.$$
The covariance matrix (also called the variance-covariance matrix) of an $n \times 1$ random vector is an $n \times n$ matrix whose $(i,j)$-th element is the covariance between the $i$-th and the $j$-th random variables. It is the expected value, element by element, of the $n \times n$ matrix $[\mathbf{X} - \operatorname{E}[\mathbf{X}]][\mathbf{X} - \operatorname{E}[\mathbf{X}]]^T$, where the superscript T refers to the transpose of the indicated vector:[2]: p. 464 [3]: p. 335

$$K_{\mathbf{X}\mathbf{X}} = \operatorname{Var}[\mathbf{X}] = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{X} - \operatorname{E}[\mathbf{X}])^T\right].$$

By extension, the cross-covariance matrix between two random vectors $\mathbf{X}$ (with $n$ elements) and $\mathbf{Y}$ (with $p$ elements) is the $n \times p$ matrix

$$K_{\mathbf{X}\mathbf{Y}} = \operatorname{Cov}[\mathbf{X}, \mathbf{Y}] = \operatorname{E}\left[(\mathbf{X} - \operatorname{E}[\mathbf{X}])(\mathbf{Y} - \operatorname{E}[\mathbf{Y}])^T\right],$$

where the matrix expectation is again taken element by element.[3]: p. 337
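These quantities can be estimated from samples by averaging outer products; a minimal NumPy sketch, in which the particular distributions and dimensions are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n_draws = 200_000

# X: 3-component random vector; Y: 2-component random vector correlated with X.
X = rng.multivariate_normal([0.0, 1.0, 2.0],
                            [[2.0, 0.3, 0.0],
                             [0.3, 1.0, 0.5],
                             [0.0, 0.5, 1.5]], size=n_draws)
Y = X[:, :2] + rng.normal(size=(n_draws, 2))

mean_X = X.mean(axis=0)                # estimate of E[X]
mean_Y = Y.mean(axis=0)                # estimate of E[Y]

# K_XX = E[(X - E[X])(X - E[X])^T], estimated by averaging outer products
# of the centred draws; K_XY is the analogous n x p cross-covariance.
Xc = X - mean_X
Yc = Y - mean_Y
K_XX = Xc.T @ Xc / n_draws             # 3 x 3 covariance matrix
K_XY = Xc.T @ Yc / n_draws             # 3 x 2 cross-covariance matrix

print("K_XX estimate:\n", K_XX)
print("agrees with np.cov:", np.allclose(K_XX, np.cov(X, rowvar=False, bias=True)))
print("K_XY estimate:\n", K_XY)
```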
The correlation matrix (also called second moment) of an $n \times 1$ random vector is an $n \times n$ matrix whose $(i,j)$-th element is the correlation between the $i$-th and the $j$-th random variables. It is the expected value, element by element, of the $n \times n$ matrix $\mathbf{X}\mathbf{X}^T$, where the superscript T refers to the transpose of the indicated vector:[4]: p. 190 [3]: p. 334

$$R_{\mathbf{X}\mathbf{X}} = \operatorname{E}[\mathbf{X}\mathbf{X}^T].$$

By extension, the cross-correlation matrix between two random vectors $\mathbf{X}$ (with $n$ elements) and $\mathbf{Y}$ (with $p$ elements) is the $n \times p$ matrix

$$R_{\mathbf{X}\mathbf{Y}} = \operatorname{E}[\mathbf{X}\mathbf{Y}^T].$$
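A corresponding sketch for the correlation and cross-correlation matrices, again with arbitrarily assumed distributions; note that the matrix computed here is the second-moment matrix $\operatorname{E}[\mathbf{X}\mathbf{X}^T]$ defined above, not the normalized matrix returned by tools such as np.corrcoef:

```python
import numpy as np

rng = np.random.default_rng(3)
n_draws = 200_000

X = rng.multivariate_normal([1.0, -1.0], [[1.0, 0.4],
                                          [0.4, 2.0]], size=n_draws)
Y = np.column_stack([X[:, 0] + rng.normal(size=n_draws),
                     rng.normal(size=n_draws),
                     0.5 * X[:, 1]])

# R_XX = E[X X^T]: average of outer products of the *uncentred* draws.
R_XX = X.T @ X / n_draws               # 2 x 2 correlation matrix
R_XY = X.T @ Y / n_draws               # 2 x 3 cross-correlation matrix

# Standard identity relating it to the covariance matrix: K_XX = R_XX - E[X]E[X]^T.
m = X.mean(axis=0)
print("R_XX estimate:\n", R_XX)
print("R_XX - m m^T equals the covariance estimate:",
      np.allclose(R_XX - np.outer(m, m), np.cov(X, rowvar=False, bias=True)))
```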
Two random vectors $\mathbf{X}$ and $\mathbf{Y}$ are called independent if for all $\mathbf{x}$ and $\mathbf{y}$

$$F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y}) = F_{\mathbf{X}}(\mathbf{x}) \cdot F_{\mathbf{Y}}(\mathbf{y}),$$

where $F_{\mathbf{X}}(\mathbf{x})$ and $F_{\mathbf{Y}}(\mathbf{y})$ denote the cumulative distribution functions of $\mathbf{X}$ and $\mathbf{Y}$ and $F_{\mathbf{X},\mathbf{Y}}(\mathbf{x}, \mathbf{y})$ denotes their joint cumulative distribution function.
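As a rough numerical illustration of this factorization (the distributions and the evaluation points are illustrative assumptions, and checking a few points is only a necessary condition, not a proof of independence):

```python
import numpy as np

rng = np.random.default_rng(4)
n_draws = 200_000

# X and Y are generated from separate sources, hence independent.
X = rng.normal(size=(n_draws, 2))
Y = rng.exponential(size=(n_draws, 1))

def ecdf(samples, point):
    """Empirical CDF: fraction of draws with every component <= point."""
    return np.all(samples <= point, axis=1).mean()

x0 = np.array([0.5, -0.2])
y0 = np.array([1.0])

joint = np.all(np.hstack([X, Y]) <= np.concatenate([x0, y0]), axis=1).mean()
factored = ecdf(X, x0) * ecdf(Y, y0)
print("F_XY(x0, y0) estimate:", joint)
print("F_X(x0) * F_Y(y0)    :", factored)   # close to the joint value for independent X, Y
```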
The characteristic function of a random vector $\mathbf{X}$ with $n$ components is a function $\mathbb{R}^n \to \mathbb{C}$ that maps every vector $\mathbf{t} = (t_1, \dots, t_n)^T$ to a complex number. It is defined by[2]: p. 468

$$\varphi_{\mathbf{X}}(\mathbf{t}) = \operatorname{E}\left[e^{i(\mathbf{t}^T \mathbf{X})}\right] = \operatorname{E}\left[e^{i(t_1 X_1 + \cdots + t_n X_n)}\right].$$

One can take the expectation of a quadratic form in the random vector $\mathbf{X}$ as follows:

$$\operatorname{E}[\mathbf{X}^T A \mathbf{X}] = \operatorname{E}[\mathbf{X}]^T A \operatorname{E}[\mathbf{X}] + \operatorname{tr}(A K_{\mathbf{X}\mathbf{X}}),$$

where $K_{\mathbf{X}\mathbf{X}}$ is the covariance matrix of $\mathbf{X}$ and $\operatorname{tr}$ refers to the trace of a matrix, that is, the sum of the elements on its main diagonal (from upper left to lower right).
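Before the formal proof below, the identity can be checked by Monte Carlo simulation; in this sketch the Gaussian distribution, its moments, and the matrix $A$ are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(5)
n_draws = 500_000

mu = np.array([1.0, -2.0, 0.5])            # E[X]
K = np.array([[1.0, 0.2, 0.0],             # covariance matrix K_XX
              [0.2, 2.0, 0.3],
              [0.0, 0.3, 1.5]])
A = np.array([[1.0, 2.0, 0.0],             # A need not be symmetric
              [0.0, 1.0, 1.0],
              [3.0, 0.0, 1.0]])

X = rng.multivariate_normal(mu, K, size=n_draws)

quad = np.einsum('ni,ij,nj->n', X, A, X)   # X^T A X for each draw
print("Monte Carlo E[X^T A X] :", quad.mean())
print("E[X]^T A E[X] + tr(A K):", mu @ A @ mu + np.trace(A @ K))
```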
Proof: Let $\mathbf{z}$ be an $m \times 1$ random vector with $\operatorname{E}[\mathbf{z}] = \mu$ and $\operatorname{Cov}[\mathbf{z}] = V$, and let $A$ be an $m \times m$ non-stochastic matrix. Based on the formula for the covariance, if we denote $\mathbf{z}^T = \mathbf{X}$ and $\mathbf{z}^T A^T = \mathbf{Y}$, we see that:

$$\operatorname{Cov}[\mathbf{X}, \mathbf{Y}] = \operatorname{E}[\mathbf{X}\mathbf{Y}^T] - \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{Y}]^T.$$

Hence

$$\begin{aligned}
\operatorname{E}[\mathbf{X}\mathbf{Y}^T] &= \operatorname{Cov}[\mathbf{X}, \mathbf{Y}] + \operatorname{E}[\mathbf{X}]\operatorname{E}[\mathbf{Y}]^T \\
\operatorname{E}[\mathbf{z}^T A \mathbf{z}] &= \operatorname{Cov}[\mathbf{z}^T, \mathbf{z}^T A^T] + \operatorname{E}[\mathbf{z}^T]\operatorname{E}[\mathbf{z}^T A^T]^T \\
&= \operatorname{Cov}[\mathbf{z}^T, \mathbf{z}^T A^T] + \mu^T (\mu^T A^T)^T \\
&= \operatorname{Cov}[\mathbf{z}^T, \mathbf{z}^T A^T] + \mu^T A \mu,
\end{aligned}$$

which leaves us to show that

$$\operatorname{Cov}[\mathbf{z}^T, \mathbf{z}^T A^T] = \operatorname{tr}(A V).$$

This is true based on the fact that one can cyclically permute matrices when taking a trace without changing the end result (e.g. $\operatorname{tr}(AB) = \operatorname{tr}(BA)$). We see that

$$\begin{aligned}
\operatorname{Cov}[\mathbf{z}^T, \mathbf{z}^T A^T] &= \operatorname{E}\left[\left(\mathbf{z}^T - \operatorname{E}[\mathbf{z}^T]\right)\left(\mathbf{z}^T A^T - \operatorname{E}[\mathbf{z}^T A^T]\right)^T\right] \\
&= \operatorname{E}\left[(\mathbf{z}^T - \mu^T)(\mathbf{z}^T A^T - \mu^T A^T)^T\right] \\
&= \operatorname{E}\left[(\mathbf{z} - \mu)^T A (\mathbf{z} - \mu)\right].
\end{aligned}$$

Since $(\mathbf{z} - \mu)^T A (\mathbf{z} - \mu)$ is a scalar, it equals its own trace. Using the permutation we get:

$$\operatorname{tr}\left((\mathbf{z} - \mu)^T A (\mathbf{z} - \mu)\right) = \operatorname{tr}\left(A (\mathbf{z} - \mu)(\mathbf{z} - \mu)^T\right),$$

and by plugging this into the original formula we get:

$$\begin{aligned}
\operatorname{Cov}[\mathbf{z}^T, \mathbf{z}^T A^T] &= \operatorname{E}\left[(\mathbf{z} - \mu)^T A (\mathbf{z} - \mu)\right] \\
&= \operatorname{E}\left[\operatorname{tr}\left(A (\mathbf{z} - \mu)(\mathbf{z} - \mu)^T\right)\right] \\
&= \operatorname{tr}\left(A \operatorname{E}\left[(\mathbf{z} - \mu)(\mathbf{z} - \mu)^T\right]\right) = \operatorname{tr}(A V).
\end{aligned}$$

One can take the expectation of the product of two different quadratic forms in a zero-mean Gaussian random vector $\mathbf{X}$ as follows:

$$\operatorname{E}\left[(\mathbf{X}^T A \mathbf{X})(\mathbf{X}^T B \mathbf{X})\right] = 2\operatorname{tr}(A K_{\mathbf{X}\mathbf{X}} B K_{\mathbf{X}\mathbf{X}}) + \operatorname{tr}(A K_{\mathbf{X}\mathbf{X}})\operatorname{tr}(B K_{\mathbf{X}\mathbf{X}}),$$

where $K_{\mathbf{X}\mathbf{X}}$ is the covariance matrix of $\mathbf{X}$.
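The zero-mean Gaussian formula can be checked the same way; in this sketch $A$ and $B$ are taken symmetric and the covariance matrix $K$ is an arbitrary illustrative choice, with the Monte Carlo estimate agreeing with the closed form up to simulation error:

```python
import numpy as np

rng = np.random.default_rng(6)
n_draws = 1_000_000

K = np.array([[2.0, 0.5, 0.0],             # covariance matrix of the zero-mean X
              [0.5, 1.0, 0.3],
              [0.0, 0.3, 1.5]])
A = np.array([[1.0, 0.5, 0.0],             # symmetric
              [0.5, 2.0, 1.0],
              [0.0, 1.0, 1.0]])
B = np.array([[0.5, 0.0, 1.0],             # symmetric
              [0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0]])

X = rng.multivariate_normal(np.zeros(3), K, size=n_draws)

qa = np.einsum('ni,ij,nj->n', X, A, X)     # X^T A X per draw
qb = np.einsum('ni,ij,nj->n', X, B, X)     # X^T B X per draw

print("Monte Carlo E[(X^T A X)(X^T B X)]:", (qa * qb).mean())
print("2 tr(AKBK) + tr(AK) tr(BK)       :",
      2 * np.trace(A @ K @ B @ K) + np.trace(A @ K) * np.trace(B @ K))
```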
In portfolio theory in finance, an objective often is to choose a portfolio of risky assets such that the distribution of the random portfolio return has desirable properties. For example, one might want to choose the portfolio return having the lowest variance for a given expected value. Here the random vector is the vector $\mathbf{r}$ of random returns on the individual assets, and the portfolio return $p$ (a random scalar) is the inner product of the vector of random returns with a vector $\mathbf{w}$ of portfolio weights (the fractions of the portfolio placed in the respective assets). Since $p = \mathbf{w}^T \mathbf{r}$, the expected value of the portfolio return is $\mathbf{w}^T \operatorname{E}(\mathbf{r})$ and the variance of the portfolio return can be shown to be $\mathbf{w}^T C \mathbf{w}$, where $C$ is the covariance matrix of $\mathbf{r}$.
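A small numerical sketch with made-up moments for three assets and an arbitrary weight vector $\mathbf{w}$:

```python
import numpy as np

# Illustrative (assumed) moments of the random return vector r for three assets.
expected_r = np.array([0.05, 0.08, 0.12])   # E(r)
C = np.array([[0.010, 0.002, 0.001],        # covariance matrix of r
              [0.002, 0.030, 0.004],
              [0.001, 0.004, 0.060]])

w = np.array([0.5, 0.3, 0.2])               # portfolio weights (fractions summing to 1)

portfolio_mean = w @ expected_r             # w^T E(r)
portfolio_var = w @ C @ w                   # w^T C w
print("expected portfolio return :", portfolio_mean)
print("portfolio return variance :", portfolio_var)
print("portfolio return std. dev.:", np.sqrt(portfolio_var))
```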
In linear regression theory, we have data on $n$ observations on a dependent variable $y$ and $n$ observations on each of $k$ independent variables $x_j$.
The observations on the dependent variable are stacked into a column vector y; the observations on each independent variable are also stacked into column vectors, and these latter column vectors are combined into a design matrix X (not denoting a random vector in this context) of observations on the independent variables.
Then the following regression equation is postulated as a description of the process that generated the data:

$$\mathbf{y} = X\boldsymbol{\beta} + \mathbf{e},$$

where $\boldsymbol{\beta}$ is a postulated fixed but unknown vector of $k$ response coefficients, and $\mathbf{e}$ is an unknown random vector reflecting random influences on the dependent variable.
By some chosen technique such as ordinary least squares, a vector $\hat{\boldsymbol{\beta}}$ is chosen as an estimate of $\boldsymbol{\beta}$, and the estimate of the vector $\mathbf{e}$, denoted $\hat{\mathbf{e}}$, is computed as

$$\hat{\mathbf{e}} = \mathbf{y} - X\hat{\boldsymbol{\beta}}.$$

Then the statistician must analyze the properties of $\hat{\boldsymbol{\beta}}$ and $\hat{\mathbf{e}}$, which are viewed as random vectors since a randomly different selection of $n$ cases to observe would have resulted in different values for them.
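A compact sketch of this workflow, with an assumed data-generating process and NumPy's least-squares solver standing in for the chosen estimation technique:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 200, 3

# Design matrix X: n observations on k independent variables
# (a column of ones makes the first coefficient an intercept).
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

beta_true = np.array([1.0, 2.0, -0.5])      # unknown in practice
e = rng.normal(scale=0.3, size=n)           # random influences on y
y = X @ beta_true + e                       # the postulated regression equation

# Ordinary least squares estimate of beta, and the residual vector e_hat.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
e_hat = y - X @ beta_hat

print("beta_hat:", beta_hat)
print("mean of residuals (close to 0 with an intercept):", e_hat.mean())
```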
The evolution of a $k \times 1$ random vector $\mathbf{X}$ through time can be modelled as a vector autoregression (VAR) as follows:

$$\mathbf{X}_t = c + A_1 \mathbf{X}_{t-1} + A_2 \mathbf{X}_{t-2} + \cdots + A_p \mathbf{X}_{t-p} + \mathbf{e}_t,$$

where the $i$-periods-back vector observation $\mathbf{X}_{t-i}$ is called the $i$-th lag of $\mathbf{X}$, $c$ is a $k \times 1$ vector of constants (intercepts), $A_i$ is a time-invariant $k \times k$ matrix and $\mathbf{e}_t$ is a $k \times 1$ random vector of error terms.
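A minimal simulation of such a process, here a VAR(1) with $k = 2$ and illustrative coefficient values chosen so the process is stable (the steady-state mean formula used in the final line is a standard property of stable VAR(1) processes, not stated above):

```python
import numpy as np

rng = np.random.default_rng(8)
k, T = 2, 500

c = np.array([0.1, -0.2])                   # k x 1 vector of intercepts
A1 = np.array([[0.5, 0.1],                  # time-invariant k x k coefficient matrix
               [0.0, 0.3]])                 # eigenvalues 0.5 and 0.3: stable VAR(1)

X = np.zeros((T, k))                        # X[t] is the random vector at time t
for t in range(1, T):
    e_t = rng.normal(scale=0.2, size=k)     # k x 1 random vector of error terms
    X[t] = c + A1 @ X[t - 1] + e_t

# For a stable VAR(1), the process mean settles near (I - A1)^{-1} c.
print("sample mean of X_t (after burn-in):", X[100:].mean(axis=0))
print("(I - A1)^{-1} c                   :", np.linalg.solve(np.eye(k) - A1, c))
```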