Then a solution to our minimization problem is given by β̂ = Sy, simply because Xβ̂ = X(Sy) is exactly a sought-for orthogonal projection of y onto the image of X.
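This general recipe can be checked numerically. The following minimal NumPy sketch (the matrix sizes and data are invented for illustration) takes S to be the Moore–Penrose pseudoinverse and verifies that XS is an orthogonal projection and that β̂ = Sy solves the least squares problem:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 3))   # full-rank 6x3 design matrix (example data)
y = rng.standard_normal(6)

S = np.linalg.pinv(X)             # here S = X^+, the Moore-Penrose pseudoinverse
P = X @ S                         # XS: orthogonal projection onto the image of X

# P is idempotent (P P = P) and symmetric, hence an orthogonal projection
assert np.allclose(P @ P, P)
assert np.allclose(P, P.T)

# beta = S y reproduces the standard least squares solution
beta = S @ y
beta_ref, *_ = np.linalg.lstsq(X, y, rcond=None)
assert np.allclose(beta, beta_ref)
```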
The algebraic solution of the normal equations with a full-rank matrix X^T X can be written as β̂ = (X^T X)^(-1) X^T y = X^+ y, where X^+ is the Moore–Penrose pseudoinverse of X. Although this equation is correct and can work in many applications, it is not computationally efficient to invert the normal-equations matrix. An exception occurs in numerical smoothing and differentiation where an analytical expression is required.
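A short NumPy sketch (with invented example data) confirming that the algebraic form and the pseudoinverse form agree for a full-rank design matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 3))   # full-rank example design matrix
y = rng.standard_normal(8)

# Algebraic form: beta = (X^T X)^(-1) X^T y  (correct in exact arithmetic,
# but explicit inversion is rarely the best numerical choice)
beta_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Equivalent form via the Moore-Penrose pseudoinverse X^+
beta_pinv = np.linalg.pinv(X) @ y

assert np.allclose(beta_normal, beta_pinv)
```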
If X^T X is well-conditioned and positive definite, the normal equations can be solved directly using the Cholesky decomposition X^T X = R^T R, where R is upper triangular, giving R^T R β̂ = X^T y. The solution is obtained in two stages: a forward substitution, R^T z = X^T y, followed by a back substitution, R β̂ = z. Both substitutions are facilitated by the triangular nature of R. Orthogonal decomposition methods of solving the least squares problem are slower than the normal equations method but are more numerically stable because they avoid forming the product X^T X.
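The two-stage Cholesky solve can be sketched as follows (a minimal example with invented data, assuming SciPy is available for the triangular solves; NumPy returns the lower-triangular factor L, so R = L^T in the notation above):

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(2)
X = rng.standard_normal((10, 4))   # example design matrix
y = rng.standard_normal(10)

# Normal equations X^T X beta = X^T y via Cholesky: X^T X = L L^T (= R^T R)
L = np.linalg.cholesky(X.T @ X)    # L is lower triangular

z = solve_triangular(L, X.T @ y, lower=True)    # forward substitution
beta = solve_triangular(L.T, z, lower=False)    # back substitution

assert np.allclose(beta, np.linalg.lstsq(X, y, rcond=None)[0])
```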
In the QR decomposition, X = QR with Q an m×m orthogonal matrix and R containing an n×n upper-triangular block R_n above a block of zeros. Left-multiplying the residual vector r = y − Xβ̂ by Q^T and partitioning gives Q^T r = (u, v), where u = (Q^T y)_n − R_n β̂ collects the first n components. Because Q is orthogonal, the sum of squares of the residuals, s, may be written as s = r^T r = u^T u + v^T v. Since v doesn't depend on β, the minimum value of s is attained when the upper block, u, is zero, i.e. by solving the triangular system R_n β̂ = (Q^T y)_n.
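A minimal NumPy sketch of the QR route (example data invented), checking that the minimized sum of squares equals the squared norm of the lower block v:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((10, 4))   # example design matrix, m=10, n=4
y = rng.standard_normal(10)

Q, R = np.linalg.qr(X, mode='complete')  # Q: 10x10 orthogonal, R: 10x4
n = X.shape[1]

Qty = Q.T @ y
c, v = Qty[:n], Qty[n:]            # upper block enters the fit; v is left over

# Back substitution on the triangular block makes u = c - R_n beta zero
beta = np.linalg.solve(R[:n, :], c)

# Sum of squared residuals equals ||v||^2
s = np.sum((y - X @ beta) ** 2)
assert np.allclose(s, v @ v)
assert np.allclose(beta, np.linalg.lstsq(X, y, rcond=None)[0])
```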
Using the singular value decomposition X = UΣV^T, the pseudoinverse is X^+ = VΣ^+U^T, and XX^+ is an orthogonal projection onto the image of X. In accordance with the general approach described in the introduction above (find S such that XS is an orthogonal projection), S = X^+, and thus β̂ = X^+ y is a solution of the least squares problem.
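The SVD route can be sketched in NumPy as follows (example data invented; the pseudoinverse is assembled by inverting only the nonzero singular values):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((8, 3))    # full-rank example design matrix
y = rng.standard_normal(8)

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

# X^+ = V Sigma^+ U^T: invert the (here all nonzero) singular values
X_pinv = Vt.T @ np.diag(1.0 / sigma) @ U.T
beta = X_pinv @ y

assert np.allclose(X_pinv, np.linalg.pinv(X))
assert np.allclose(beta, np.linalg.lstsq(X, y, rcond=None)[0])
```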
This method is the most computationally intensive, but is particularly useful if the normal equations matrix, X^T X, is very ill-conditioned (i.e. if its condition number multiplied by the machine's relative round-off error is appreciably large).
In that case, including the smallest singular values in the inversion merely adds numerical noise to the solution.
This can be cured with the truncated SVD approach: explicitly setting to zero all singular values below a certain threshold, and so ignoring them, gives a more stable and exact answer. The process is closely related to factor analysis.
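A NumPy sketch of truncated SVD on a deliberately ill-conditioned example (the data and the cutoff are invented for illustration; the appropriate threshold is problem-dependent):

```python
import numpy as np

rng = np.random.default_rng(5)
# Badly conditioned design matrix: two nearly collinear columns
base = rng.standard_normal(50)
X = np.column_stack([base,
                     base + 1e-10 * rng.standard_normal(50),
                     rng.standard_normal(50)])
y = rng.standard_normal(50)

U, sigma, Vt = np.linalg.svd(X, full_matrices=False)

# Full inversion amplifies noise through the tiny singular value
beta_full = Vt.T @ ((U.T @ y) / sigma)

# Truncated SVD: drop singular values below a threshold (set them to zero)
threshold = 1e-6 * sigma.max()     # illustrative cutoff
keep = sigma > threshold
beta_tsvd = Vt[keep].T @ ((U[:, keep].T @ y) / sigma[keep])

# The truncated solution avoids the huge, noise-dominated coefficients
assert np.linalg.norm(beta_tsvd) < np.linalg.norm(beta_full)
```

For comparison, `np.linalg.lstsq(X, y, rcond=1e-6)` applies the same kind of relative cutoff to the singular values internally.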
Hence it is appropriate that considerable effort has been devoted to the task of ensuring that these computations are undertaken efficiently and with due regard to round-off error.
Individual statistical analyses are seldom undertaken in isolation, but rather are part of a sequence of investigatory steps.
Some of the topics involved in considering numerical methods for linear least squares relate to this point.
Fitting of linear models by least squares often, but not always, arises in the context of statistical analysis.
An early summary of these effects, regarding the choice of computational methods for matrix inversion, was provided by Wilkinson.