Projection matrix

In statistics, the projection matrix $(\mathbf{P})$, sometimes also called the influence matrix or hat matrix $(\mathbf{H})$, maps the vector of response values (dependent variable values) to the vector of fitted values (or predicted values). It describes the influence each response value has on each fitted value.[3][4] The diagonal elements of the projection matrix are the leverages, which describe the influence each response value has on the fitted value for that same observation.

If the vector of response values is denoted by $\mathbf{y}$ and the vector of fitted values by $\hat{\mathbf{y}}$, then $\hat{\mathbf{y}} = \mathbf{P}\mathbf{y}$. As $\hat{\mathbf{y}}$ is usually pronounced "y-hat", the projection matrix $\mathbf{P}$ is also called the hat matrix, as it "puts a hat on $\mathbf{y}$".

The formula for the vector of residuals $\mathbf{r}$ can also be expressed compactly using the projection matrix: $\mathbf{r} = \mathbf{y} - \hat{\mathbf{y}} = \mathbf{y} - \mathbf{P}\mathbf{y} = (\mathbf{I} - \mathbf{P})\mathbf{y}$, where $\mathbf{I}$ is the identity matrix. The matrix $\mathbf{M} := \mathbf{I} - \mathbf{P}$ is sometimes referred to as the residual maker matrix or the annihilator matrix.
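A minimal numerical sketch of these relations, using NumPy with a made-up toy design matrix and assuming the ordinary least-squares form of $\mathbf{P}$ that is derived later in the article:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(6), rng.normal(size=6)])  # toy design matrix (intercept + one regressor)
y = rng.normal(size=6)                                  # toy response vector

# Hat/projection matrix P = X (X^T X)^{-1} X^T (ordinary least-squares form)
P = X @ np.linalg.inv(X.T @ X) @ X.T

y_hat = P @ y                       # fitted values: "P puts a hat on y"
M = np.eye(len(y)) - P              # residual maker (annihilator) matrix M = I - P
r = M @ y                           # residuals r = (I - P) y
leverages = np.diag(P)              # diagonal elements of P are the leverages

print(np.allclose(y_hat + r, y))    # True: fitted values and residuals add back to y
```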

The covariance matrix of the residuals is $\boldsymbol{\Sigma}_\mathbf{r} = (\mathbf{I}-\mathbf{P})\,\boldsymbol{\Sigma}\,(\mathbf{I}-\mathbf{P})^\mathsf{T}$, where $\boldsymbol{\Sigma}$ is the covariance matrix of the errors. For the case of linear models with independent and identically distributed errors, in which $\boldsymbol{\Sigma} = \sigma^2\mathbf{I}$, this reduces to $(\mathbf{I} - \mathbf{P})\sigma^2$.[3]

From the figure, it is clear that the closest point in the column space of $\mathbf{A}$ to the vector $\mathbf{b}$ is $\mathbf{A}\mathbf{x}$, and it is the one for which the line from $\mathbf{b}$ is orthogonal to the column space of $\mathbf{A}$. A vector that is orthogonal to the column space of a matrix lies in the nullspace of the matrix transpose, so $\mathbf{A}^\mathsf{T}(\mathbf{b} - \mathbf{A}\mathbf{x}) = \mathbf{0}$. Rearranging gives $\mathbf{A}^\mathsf{T}\mathbf{b} = \mathbf{A}^\mathsf{T}\mathbf{A}\mathbf{x}$, so $\mathbf{x} = (\mathbf{A}^\mathsf{T}\mathbf{A})^{-1}\mathbf{A}^\mathsf{T}\mathbf{b}$. Therefore, since the projection of $\mathbf{b}$ onto the column space of $\mathbf{A}$ is $\mathbf{A}\mathbf{x} = \mathbf{A}(\mathbf{A}^\mathsf{T}\mathbf{A})^{-1}\mathbf{A}^\mathsf{T}\mathbf{b}$, the projection matrix is $\mathbf{P} = \mathbf{A}(\mathbf{A}^\mathsf{T}\mathbf{A})^{-1}\mathbf{A}^\mathsf{T}$.
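The same argument can be checked numerically. The sketch below (with $\mathbf{A}$ and $\mathbf{b}$ as small random arrays, chosen only for illustration) solves the normal equations and confirms that the residual $\mathbf{b} - \mathbf{A}\mathbf{x}$ is orthogonal to the column space and that $\mathbf{P}\mathbf{b} = \mathbf{A}\mathbf{x}$:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(5, 2))          # matrix whose column space we project onto
b = rng.normal(size=5)               # vector to be projected

x = np.linalg.solve(A.T @ A, A.T @ b)   # normal equations: A^T A x = A^T b
P = A @ np.linalg.inv(A.T @ A) @ A.T    # projection matrix onto col(A)

print(np.allclose(A.T @ (b - A @ x), 0))  # residual is orthogonal to every column of A
print(np.allclose(P @ b, A @ x))          # P b is exactly the projection A x
```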

Suppose that we wish to estimate a linear model using linear least squares. The model can be written as $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}$, where $\mathbf{X}$ is a matrix of explanatory variables (the design matrix), $\boldsymbol{\beta}$ is a vector of unknown parameters to be estimated, and $\boldsymbol{\varepsilon}$ is the error vector.

Many types of models and techniques are subject to this formulation.

A few examples are linear least squares, smoothing splines, regression splines, local regression, kernel regression, and linear filtering.

When the weights for each observation are identical and the errors are uncorrelated, the estimated parameters are $\hat{\boldsymbol{\beta}} = (\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}$, so the fitted values are $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\mathbf{y}$. Therefore, the projection matrix (and hat matrix) is given by $\mathbf{P} \equiv \mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}$.

The above may be generalized to the cases where the weights are not identical and/or the errors are correlated.
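As a brief illustration (toy data only), the fitted values produced by a standard least-squares solver agree with $\mathbf{P}\mathbf{y}$. In practice $\mathbf{P}$ is an $n \times n$ object, so one would usually compute $\hat{\mathbf{y}} = \mathbf{X}\hat{\boldsymbol{\beta}}$ directly rather than forming $\mathbf{P}$ explicitly:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + two regressors
y = rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares estimate of beta
P = X @ np.linalg.inv(X.T @ X) @ X.T               # hat matrix P = X (X^T X)^{-1} X^T

print(np.allclose(X @ beta_hat, P @ y))            # fitted values agree with P y
```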

Suppose that the covariance matrix of the errors is $\boldsymbol{\Sigma}$. Then, since the generalized least squares estimate is $\hat{\boldsymbol{\beta}} = (\mathbf{X}^\mathsf{T}\boldsymbol{\Sigma}^{-1}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\boldsymbol{\Sigma}^{-1}\mathbf{y}$, the hat matrix becomes $\mathbf{H} = \mathbf{X}(\mathbf{X}^\mathsf{T}\boldsymbol{\Sigma}^{-1}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}\boldsymbol{\Sigma}^{-1}$, which is still idempotent but in general no longer symmetric.
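A sketch of this generalized case, where $\boldsymbol{\Sigma}$ is an arbitrary positive-definite matrix invented purely for illustration; it shows the generalized hat matrix is idempotent but typically not symmetric:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# An arbitrary positive-definite error covariance matrix Sigma (for illustration only)
L = rng.normal(size=(n, n))
Sigma = L @ L.T + n * np.eye(n)

Sigma_inv = np.linalg.inv(Sigma)
H = X @ np.linalg.inv(X.T @ Sigma_inv @ X) @ X.T @ Sigma_inv  # generalized (GLS) hat matrix

print(np.allclose(H @ H, H))    # True: idempotent
print(np.allclose(H, H.T))      # generally False: no longer symmetric
```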

The projection matrix has a number of useful algebraic properties.[5][6] In the language of linear algebra, the projection matrix is the orthogonal projection onto the column space of the design matrix $\mathbf{X}$.

Some facts of the projection matrix in this setting are summarized as follows:[4] $\mathbf{P}$ is symmetric and idempotent, the design matrix is invariant under it ($\mathbf{P}\mathbf{X} = \mathbf{X}$, and hence $\mathbf{M}\mathbf{X} = \mathbf{0}$), and the residuals can be written as $\mathbf{r} = \mathbf{M}\mathbf{y} = \mathbf{M}\boldsymbol{\varepsilon}$.

The projection matrix corresponding to a linear model is symmetric and idempotent, that is, $\mathbf{P}^2 = \mathbf{P}$.

However, this is not always the case; in locally weighted scatterplot smoothing (LOESS), for example, the hat matrix is in general neither symmetric nor idempotent.
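For the ordinary least-squares hat matrix these properties are easy to verify numerically; a minimal sketch with a toy design matrix (the LOESS case would require a local-regression smoother and is not shown):

```python
import numpy as np

rng = np.random.default_rng(4)
X = np.column_stack([np.ones(7), rng.normal(size=(7, 2))])
P = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(7) - P

print(np.allclose(P, P.T))          # symmetric
print(np.allclose(P @ P, P))        # idempotent
print(np.allclose(P @ X, X))        # X is invariant under P
print(np.allclose(M @ X, 0))        # the residual maker annihilates X
```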

For linear models, the trace of the projection matrix is equal to the rank of $\mathbf{X}$, which is the number of independent parameters of the linear model.[8] For other models such as LOESS that are still linear in the observations $\mathbf{y}$, the projection matrix can be used to define the effective degrees of freedom of the model.
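The trace identity can be checked directly; in this sketch the toy design matrix has four independent columns, so both quantities come out to 4:

```python
import numpy as np

rng = np.random.default_rng(5)
X = np.column_stack([np.ones(10), rng.normal(size=(10, 3))])   # 4 independent columns
P = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.trace(P))                  # ~4.0: trace of the projection matrix
print(np.linalg.matrix_rank(X))     # 4: rank of the design matrix
```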

Practical applications of the projection matrix in regression analysis include leverage and Cook's distance, which are concerned with identifying influential observations, i.e. observations which have a large effect on the results of a regression.
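For instance, leverages come directly from the diagonal of $\mathbf{P}$, and Cook's distance combines them with the residuals. A hedged sketch using the usual textbook form $D_i = \frac{e_i^2}{p\,s^2}\,\frac{h_{ii}}{(1-h_{ii})^2}$, with $p$ parameters, mean squared error $s^2$, and made-up data:

```python
import numpy as np

rng = np.random.default_rng(6)
n, p = 12, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)

P = X @ np.linalg.inv(X.T @ X) @ X.T
h = np.diag(P)                                    # leverages
e = y - P @ y                                     # residuals
s2 = e @ e / (n - p)                              # mean squared error
cooks_d = (e**2 / (p * s2)) * h / (1.0 - h)**2    # Cook's distance for each observation

print(h.round(3))
print(cooks_d.round(3))
```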

Suppose that the design matrix $\mathbf{X}$ can be decomposed by columns as $\mathbf{X} = [\mathbf{A}\ \mathbf{B}]$. Define the hat or projection operator as $\mathbf{P}[\mathbf{X}] := \mathbf{X}(\mathbf{X}^\mathsf{T}\mathbf{X})^{-1}\mathbf{X}^\mathsf{T}$. Similarly, define the residual operator as $\mathbf{M}[\mathbf{X}] := \mathbf{I} - \mathbf{P}[\mathbf{X}]$. Then the projection matrix can be decomposed as $\mathbf{P}[\mathbf{X}] = \mathbf{P}[\mathbf{A}] + \mathbf{P}[\mathbf{M}[\mathbf{A}]\mathbf{B}]$, where, for example, $\mathbf{P}[\mathbf{A}] = \mathbf{A}(\mathbf{A}^\mathsf{T}\mathbf{A})^{-1}\mathbf{A}^\mathsf{T}$ and $\mathbf{M}[\mathbf{A}] = \mathbf{I} - \mathbf{P}[\mathbf{A}]$.

There are a number of applications of such a decomposition. In the classical application, $\mathbf{A}$ is a column of all ones, which allows one to analyze the effects of adding an intercept term to a regression.

Another use is in the fixed effects model, where $\mathbf{A}$ is a large sparse matrix of the dummy variables for the fixed effect terms.

One can use this partition to compute the hat matrix of $\mathbf{X}$ without explicitly forming the full matrix $\mathbf{X}$, which might be too large to fit into computer memory.
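The decomposition can be verified numerically. In the sketch below, `hat` is a helper defined only for this example, $\mathbf{A}$ is a column of ones (the intercept case mentioned above), and $\mathbf{B}$ holds the remaining regressors:

```python
import numpy as np

def hat(Z):
    """Projection operator P[Z] = Z (Z^T Z)^{-1} Z^T."""
    return Z @ np.linalg.inv(Z.T @ Z) @ Z.T

rng = np.random.default_rng(7)
n = 9
A = np.ones((n, 1))                  # column of ones (intercept term)
B = rng.normal(size=(n, 2))          # remaining regressors
X = np.hstack([A, B])

M_A = np.eye(n) - hat(A)             # residual operator M[A]
P_blockwise = hat(A) + hat(M_A @ B)  # P[X] = P[A] + P[M[A] B]

print(np.allclose(P_blockwise, hat(X)))   # True: blockwise formula matches the direct hat matrix
```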

The hat matrix was introduced by John Wilder Tukey in 1972.

An article by Hoaglin and Welsch (1978) gives the properties of the matrix and also many examples of its application.

Figure: A matrix $\mathbf{A}$ has its column space depicted as the green line. The projection of some vector $\mathbf{b}$ onto the column space of $\mathbf{A}$ is the vector $\mathbf{x}$.