Behrens–Fisher problem

In statistics, the Behrens–Fisher problem, named after Walter-Ulrich Behrens and Ronald Fisher, is the problem of interval estimation and hypothesis testing concerning the difference between the means of two normally distributed populations when the variances of the two populations are not assumed to be equal, based on two independent samples.

Interpretations of the problem differ, and the differences involve not only what is counted as being a relevant solution, but even the basic statement of the context being considered.

It is well known that an exact test can be obtained by randomly discarding data from the larger dataset until the sample sizes are equal, assembling the data in pairs and taking differences, and then using an ordinary t-test to test whether the mean difference is zero; clearly, this would not be "optimal" in any sense.
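The discard-and-pair construction can be sketched in a few lines. This is only an illustration, assuming Python's standard library; the function name paired_discard_t and the seed parameter are hypothetical and not part of any cited method. It returns the t statistic and its degrees of freedom, which would then be compared with a Student's t quantile.

```python
import math
import random
import statistics


def paired_discard_t(sample_a, sample_b, seed=0):
    """Return (t statistic, degrees of freedom) for the discard-and-pair test.

    Hypothetical helper for illustration; the seed only fixes which
    observations are discarded and how the rest are paired.
    """
    rng = random.Random(seed)
    n = min(len(sample_a), len(sample_b))
    if n < 2:
        raise ValueError("each sample needs at least two observations")
    # Randomly keep n observations from each sample; for the larger sample
    # this discards the surplus. The draw order also randomises the pairing.
    a = rng.sample(list(sample_a), n)
    b = rng.sample(list(sample_b), n)
    # Pair the observations and take differences.
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    # Ordinary one-sample t statistic for H0: the mean difference is zero.
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1
```

The test is exact because, under normality and independence, the paired differences are independent and identically normally distributed whatever the two variances are. The cost is visible in the return value: only min(n1, n2) - 1 degrees of freedom remain, and the result depends on which observations happen to be discarded.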

The task of specifying interval estimates for this problem is one where a frequentist approach fails to provide an exact solution, although some approximations are available.

Thus, study of the problem can be used to elucidate the differences between the frequentist and Bayesian approaches to interval estimation.

Ronald Fisher in 1935 introduced fiducial inference[3][4] in order to apply it to this problem.

Fisher approximated the distribution of the test statistic by ignoring the random variation of the relative sizes of the standard deviations. His solution provoked controversy because it did not have the property that the hypothesis of equal means would be rejected with probability α if the means were in fact equal.

Many other methods of treating the problem have been proposed since, and their effect on the resulting confidence intervals has been investigated.

Nevertheless, the Behrens–Fisher T can be compared with a corresponding quantile of Student's t distribution with these estimated numbers of degrees of freedom.

In this way, the boundary between the acceptance and rejection regions of the test statistic T is calculated from the empirical variances s_i^2, as a smooth function of these.
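As a rough illustration of comparing T with a Student's t quantile at an estimated number of degrees of freedom, the sketch below computes T together with the Welch–Satterthwaite estimate. It assumes that this is the estimate the passage refers to, and the function name welch_t_and_df is hypothetical. Only Python's standard library is used, so the quantile lookup itself is left to a statistics table or package.

```python
import math
import statistics


def welch_t_and_df(sample_1, sample_2):
    """Return (T, estimated degrees of freedom) for two independent samples.

    Hypothetical helper; assumes the Welch-Satterthwaite approximation.
    """
    n1, n2 = len(sample_1), len(sample_2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two observations")
    # g_i = s_i^2 / n_i is the estimated variance of the i-th sample mean.
    g1 = statistics.variance(sample_1) / n1
    g2 = statistics.variance(sample_2) / n2
    # T: difference of sample means divided by its estimated standard error.
    t = (statistics.fmean(sample_1) - statistics.fmean(sample_2)) / math.sqrt(g1 + g2)
    # Estimated degrees of freedom; a smooth function of s_1^2 and s_2^2
    # and generally not an integer.
    df = (g1 + g2) ** 2 / (g1 ** 2 / (n1 - 1) + g2 ** 2 / (n2 - 1))
    return t, df
```

For example, welch_t_and_df([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]) gives an estimate of about 5.88 degrees of freedom, between the 4 available from the smaller of the two per-sample counts and the 8 that a pooled-variance test would use.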

In Dudewicz's comparison of selected methods,[7] the Dudewicz–Ahmed procedure was recommended for practical use.

[13] A follow-up paper showed that the classic paired t-test is a central Behrens–Fisher problem with a non-zero population correlation coefficient and derived its corresponding probability density function by solving its associated non-central Behrens–Fisher problem with a non-zero population correlation coefficient.[14]

It also solved a more general non-central Behrens–Fisher problem with a non-zero population correlation coefficient in the appendix.