The remainder of the residual sum of squares is attributed to lack of fit of the model, since it would be mathematically possible to eliminate these errors entirely by fitting a curve that passes through the average observed response at each value of the independent variable.
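As a sketch of why this is so (the notation here, group means $\bar{y}_i$, group sizes $n_i$, and fitted values $\widehat{y}_i$ from the least-squares line described below, is introduced only for this illustration), the residual sum of squares splits into a pure-error part and a lack-of-fit part:
$$\sum_{i=1}^{n}\sum_{j=1}^{n_i}\left(y_{ij}-\widehat{y}_i\right)^2 \;=\; \underbrace{\sum_{i=1}^{n}\sum_{j=1}^{n_i}\left(y_{ij}-\bar{y}_i\right)^2}_{\text{pure error}} \;+\; \underbrace{\sum_{i=1}^{n} n_i\left(\bar{y}_i-\widehat{y}_i\right)^2}_{\text{lack of fit}},$$
and the lack-of-fit term vanishes for any fitted curve that passes through every group mean $\bar{y}_i$.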
One takes as estimates of α and β the values that minimize the sum of squared residuals, i.e., the sum of squared differences between the observed and the fitted y-values.
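Written out explicitly (assuming a model of the form $Y_{ij} = \alpha + \beta x_i + \varepsilon_{ij}$, with $j = 1, \dots, n_i$ indexing the replicate observations taken at $x_i$, consistent with the error term $\varepsilon_{ij}$ used here), the criterion being minimized is
$$\operatorname{SSE}(\alpha,\beta) \;=\; \sum_{i=1}^{n}\sum_{j=1}^{n_i}\left(y_{ij}-\alpha-\beta x_i\right)^2 .$$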
Then
$$\widehat{\varepsilon}_{ij} \;=\; y_{ij} - \left(\widehat{\alpha} + \widehat{\beta} x_i\right)$$
are the residuals, which are observable estimates of the unobservable values of the error term $\varepsilon_{ij}$.
Because of the nature of the method of least squares, the whole vector of residuals, with $N = \sum_{i=1}^{n} n_i$ scalar components, necessarily satisfies the two constraints
$$\sum_{i=1}^{n}\sum_{j=1}^{n_i}\widehat{\varepsilon}_{ij} = 0 \qquad\text{and}\qquad \sum_{i=1}^{n}\left(x_i\sum_{j=1}^{n_i}\widehat{\varepsilon}_{ij}\right) = 0 .$$
It is thus constrained to lie in an $(N-2)$-dimensional subspace of $\mathbf{R}^N$, i.e. there are $N-2$ "degrees of freedom for error".
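A brief sketch of where these constraints come from: setting to zero the partial derivatives of the least-squares criterion $\operatorname{SSE}(\alpha,\beta)$ with respect to $\alpha$ and $\beta$, evaluated at $\widehat{\alpha}$ and $\widehat{\beta}$, gives the normal equations
$$-2\sum_{i=1}^{n}\sum_{j=1}^{n_i}\widehat{\varepsilon}_{ij} = 0 \qquad\text{and}\qquad -2\sum_{i=1}^{n} x_i\sum_{j=1}^{n_i}\widehat{\varepsilon}_{ij} = 0 ,$$
which are precisely the two linear conditions stated above.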
In that case, however, the numerator has a noncentral chi-squared distribution, and consequently the quotient as a whole has a non-central F-distribution. One uses this F-statistic to test the null hypothesis that the linear model is correct; since the non-central F-distribution is stochastically larger than the central F-distribution, the null hypothesis is rejected when the observed F-statistic exceeds the critical value for the chosen significance level.
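As an illustrative sketch in Python (the data and variable names here are hypothetical, not taken from the text), the following snippet fits a straight line to replicated observations, splits the residual sum of squares into pure-error and lack-of-fit components, and computes the lack-of-fit F-statistic with $n-2$ and $N-n$ degrees of freedom, where $n$ is the number of distinct $x$-values:

```python
import numpy as np
from scipy.stats import f as f_dist

# Hypothetical replicated data: n_i observations of y at each distinct x_i.
x_levels = np.array([1.0, 2.0, 3.0, 4.0])
y_groups = [np.array([2.1, 1.9, 2.3]),       # replicates at x = 1
            np.array([3.8, 4.2]),            # replicates at x = 2
            np.array([6.3, 5.9, 6.1]),       # replicates at x = 3
            np.array([7.7, 8.1, 8.0, 7.9])]  # replicates at x = 4

# Flatten to paired observations (x_ij, y_ij).
x = np.concatenate([np.full(len(g), xi) for xi, g in zip(x_levels, y_groups)])
y = np.concatenate(y_groups)
N, n = len(y), len(x_levels)

# Ordinary least-squares fit of y = alpha + beta * x.
beta, alpha = np.polyfit(x, y, 1)
fitted = alpha + beta * x_levels                   # fitted value at each x_i
means = np.array([g.mean() for g in y_groups])     # group mean at each x_i
n_i = np.array([len(g) for g in y_groups])

# Decompose the residual sum of squares into pure error and lack of fit.
ss_pure_error = sum(((g - m) ** 2).sum() for g, m in zip(y_groups, means))
ss_lack_of_fit = (n_i * (means - fitted) ** 2).sum()

# Lack-of-fit F-statistic: (SS_lof / (n - 2)) / (SS_pe / (N - n)),
# compared against an F distribution with n - 2 and N - n degrees of freedom.
F = (ss_lack_of_fit / (n - 2)) / (ss_pure_error / (N - n))
p_value = f_dist.sf(F, n - 2, N - n)
print(F, p_value)
```

A large F-statistic (small p-value) indicates that the group means deviate from the fitted line by more than the within-group scatter can explain, i.e. evidence against the straight-line model.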