Null hypothesis

In scientific research, the null hypothesis (often denoted H0)[1] is the claim that the effect being studied does not exist.

The test of significance is designed to assess the strength of the evidence against the null hypothesis, which is usually a statement of 'no effect' or 'no difference'.

For instance, a test may seek evidence that an effect has occurred, or that samples derive from different batches.

Multiple analyses can be performed to show that the hypothesis should be rejected or excluded, e.g. by attaining a high confidence level, thus demonstrating a statistically significant difference.

This is demonstrated by showing that zero lies outside the specified confidence interval of the measurement on either side.
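As a minimal sketch of this check (with hypothetical measurement data and a normal approximation for the interval), one might compute a 95% confidence interval for a sample mean and ask whether zero falls outside it:

```python
# Sketch: does zero lie outside the 95% confidence interval of a
# sample mean? Data and confidence level are illustrative assumptions.
import math
import statistics

sample = [1.2, 0.9, 1.5, 1.1, 0.8, 1.3, 1.0, 1.4]  # hypothetical measurements
mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error of the mean
z = 1.96  # normal critical value for a 95% confidence level
lower, upper = mean - z * sem, mean + z * sem

# Zero outside the interval on either side -> statistically significant
significant = not (lower <= 0.0 <= upper)
print(lower, upper, significant)
```

With these numbers the whole interval sits above zero, so the 'no effect' hypothesis would be rejected at this confidence level.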

When something is shown to be, for example, bigger than x, it does not necessarily follow that it is plausibly smaller than or equal to x; the measurement may instead be of poor quality with low accuracy.

The confidence level should indicate the likelihood that much more and better data would still exclude the null hypothesis on the same side.

Hypothesis testing requires constructing a statistical model of what the data would look like if chance or random processes alone were responsible for the results.

The model of the result of the random process is called the distribution under the null hypothesis.[13]

If the data-set of a randomly selected representative sample is very unlikely relative to the null hypothesis (i.e., it belongs to a class of data-sets that will only rarely be observed), the experimenter rejects the null hypothesis, concluding it is (probably) false.

This class of data-sets is usually specified via a test statistic, which is designed to measure the extent of apparent departure from the null hypothesis.

The procedure works by assessing whether the observed departure, measured by the test statistic, is larger than a defined threshold, chosen so that the probability of a more extreme value occurring is small under the null hypothesis (usually less than 5% or 1% of similar data-sets in which the null hypothesis does hold).
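This procedure can be sketched by simulating the null distribution directly. In the sketch below the null hypothesis is "the coin is fair", the test statistic is the number of heads, and the observed count and 5% threshold are illustrative assumptions:

```python
# Sketch: build the distribution of a test statistic under the null
# hypothesis "the coin is fair", then check whether an observed result
# is extreme at the 5% significance level.
import random

random.seed(0)
n_tosses, n_sim = 20, 100_000

# Null distribution: number of heads in 20 tosses of a fair coin,
# simulated n_sim times
null_stats = [sum(random.random() < 0.5 for _ in range(n_tosses))
              for _ in range(n_sim)]

observed = 17  # hypothetical observed number of heads
# One-sided p-value: fraction of simulated data-sets at least as extreme
p_value = sum(s >= observed for s in null_stats) / n_sim
reject = p_value < 0.05  # reject H0 at the 5% significance level
print(p_value, reject)
```

Here 17 heads in 20 tosses is far out in the tail of the null distribution (the exact binomial probability is about 0.13%), so the null hypothesis is rejected.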

When the null hypothesis is not rejected, because it could still be either true or false, in some contexts this is interpreted as meaning that the data give insufficient evidence for any conclusion, while in other contexts it is interpreted as meaning that there is not sufficient evidence to support changing from a currently useful regime to a different one.

Nevertheless, if at this point the effect appears likely and/or large enough, there may be an incentive to further investigate, such as running a bigger sample.

For example, to test a new drug, the test of the hypothesis may consist of administering the drug to half of the people in a study group as a controlled experiment.

If the data show a statistically significant change in the people receiving the drug, the null hypothesis is rejected.
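A hedged sketch of such a trial, here implemented as a permutation test with entirely hypothetical outcome data: under the null hypothesis that the drug has no effect, the treatment and control labels are exchangeable, so the observed group difference can be compared against differences from random relabelings.

```python
# Sketch of a drug trial analysed as a permutation test.
# All outcome values below are hypothetical.
import random
import statistics

random.seed(1)
treatment = [5.1, 4.8, 5.5, 5.9, 5.2, 5.7]   # hypothetical drug group outcomes
control   = [4.2, 4.0, 4.5, 4.1, 4.4, 4.3]   # hypothetical control outcomes

observed = statistics.mean(treatment) - statistics.mean(control)
pooled = treatment + control
n = len(treatment)

# Under H0 the labels are exchangeable: shuffle and recompute the difference
count = 0
n_perm = 10_000
for _ in range(n_perm):
    random.shuffle(pooled)
    diff = statistics.mean(pooled[:n]) - statistics.mean(pooled[n:])
    if diff >= observed:
        count += 1

p_value = count / n_perm
print(p_value, p_value < 0.05)  # reject H0 if the difference is significant
```

Because every treatment value exceeds every control value in this made-up data, almost no random relabeling reproduces so large a difference, and the null hypothesis is rejected.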

The numerous uses of significance testing were well known to Fisher, who discussed many of them in his book, written a decade before he defined the null hypothesis.

Fisher mentioned few constraints on the choice and stated that many null hypotheses should be considered and that many tests are possible for each.

David Cox said, "How [the] translation from subject-matter problem to statistical model is done is often the most critical part of an analysis".

Testing hypotheses suggested by the data is circular reasoning that proves nothing; it is a special limitation on the choice of the null hypothesis.

The standard "no difference" null hypothesis may reward the pharmaceutical company for gathering inadequate data.

"Difference" is a better null hypothesis in this case, but statistical significance is not an adequate criterion for reaching a nuanced conclusion, which requires a good numeric estimate of the drug's effectiveness.

A "minor" or "simple" proposed change in the null hypothesis ((new vs old) rather than (new vs placebo)) can have a dramatic effect on the utility of a test for complex non-statistical reasons.

A potential null hypothesis implying a one-tailed test is "this coin is not biased toward heads".

Beware that, in this context, the term "one-tailed" does not refer to the outcome of a single coin toss (i.e., whether or not the coin comes up "tails" instead of "heads"); the term "one-tailed" refers to a specific way of testing the null hypothesis in which the critical region (also known as the "region of rejection") ends up on only one side of the probability distribution.

However, the probability of 5 tosses of the same kind, irrespective of whether these are heads or tails, is twice that of the 5-head occurrence considered alone.
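The arithmetic behind this comparison can be checked directly, assuming 5 tosses of a fair coin:

```python
# Exact probabilities for the coin example under the null hypothesis
# of a fair coin (P(heads) = 0.5 on each independent toss).
p_five_heads = 0.5 ** 5      # one-tailed event: 5 heads in 5 tosses
p_all_same = 2 * 0.5 ** 5    # two-tailed event: 5 heads OR 5 tails
print(p_five_heads, p_all_same)  # 0.03125 vs 0.0625
```

At a 5% significance level the one-tailed event (probability 0.03125) would lead to rejection, while the two-tailed event (probability 0.0625) would not.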

This example illustrates that the conclusion reached from a statistical test may depend on the precise formulation of the null and alternative hypotheses.

Fisher said, "the null hypothesis must be exact, that is free of vagueness and ambiguity, because it must supply the basis of the 'problem of distribution,' of which the test of significance is the solution", implying a more restrictive domain for H0.

The statistical theory required to deal with the simple cases of directionality dealt with here, and more complicated ones, makes use of the concept of an unbiased test.

While Fisher was willing to ignore the unlikely case of the Lady guessing all cups of tea incorrectly (which may have been appropriate for the circumstances), medicine believes that a proposed treatment that kills patients is significant in every sense and should be reported and perhaps explained.