Freedman's paradox

In statistical analysis, Freedman's paradox,[1][2] named after David Freedman, is a problem in model selection whereby predictor variables with no relationship to the dependent variable can pass tests of significance – both individually via a t-test, and jointly via an F-test for the significance of the regression.

Freedman demonstrated (through simulation and asymptotic calculation) that this is a common occurrence when the number of variables is similar to the number of data points.

Specifically, if the dependent variable and k regressors are independent normal variables, and there are n observations, then as k and n jointly go to infinity in the ratio k/n = ρ:

1. the coefficient of determination R² goes to ρ;
2. the F-statistic for the overall regression goes to 1.0; and
3. the number of spuriously significant regressors goes to αk, where α is the chosen critical probability of the t-test, even though none of the regressors is actually related to the dependent variable.

More recently, new information-theoretic estimators have been developed in an attempt to reduce this problem,[3] in addition to the accompanying issue of model selection bias,[4] whereby estimators of predictor variables that have a weak relationship with the response variable are biased.
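A minimal Monte Carlo sketch of this setup, using only NumPy: the dependent variable and all k regressors are drawn as independent standard normal noise, yet the fitted regression shows an R² near k/n, an overall F-statistic near 1, and roughly 0.05·k regressors passing a two-sided t-test at the 5% level. The parameter choices (n = 100, k = 50, 20 replications) are illustrative, not from Freedman's paper, and 1.96 is used as the large-sample approximation to the t critical value.

```python
import numpy as np

def spurious_regression(n=100, k=50, t_crit=1.96, rng=None):
    """Regress pure-noise y on k pure-noise regressors and report
    R^2, the overall F-statistic, and the count of 'significant' t-tests."""
    rng = np.random.default_rng() if rng is None else rng
    X = rng.standard_normal((n, k))          # regressors: independent noise
    y = rng.standard_normal(n)               # response: independent noise
    Xc = np.column_stack([np.ones(n), X])    # design matrix with intercept
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    resid = y - Xc @ beta
    df = n - k - 1                           # residual degrees of freedom
    sigma2 = resid @ resid / df              # residual variance estimate
    cov = sigma2 * np.linalg.inv(Xc.T @ Xc)  # covariance of the estimates
    t_stats = beta[1:] / np.sqrt(np.diag(cov)[1:])   # skip the intercept
    tss = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - (resid @ resid) / tss
    f_stat = (r2 / k) / ((1.0 - r2) / df)
    n_sig = int((np.abs(t_stats) > t_crit).sum())
    return r2, f_stat, n_sig

rng = np.random.default_rng(0)
runs = [spurious_regression(rng=rng) for _ in range(20)]
r2s, fs, sigs = (np.array(v) for v in zip(*runs))
print(f"mean R^2 over runs: {r2s.mean():.2f}")   # near k/n = 0.5
print(f"mean F over runs:   {fs.mean():.2f}")    # near 1.0
print(f"mean 'significant' regressors: {sigs.mean():.1f}")  # near 0.05*k
```

Averaged over replications, the output tracks Freedman's asymptotic predictions even though, by construction, no regressor has any relationship to y.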
