One application of Student's t-test is to test the location of a sequence of independent and identically distributed random variables.
If we want to test the locations of multiple sequences of such variables, Šidák correction should be applied in order to calibrate the level of the Student's t-test.
Moreover, if the number of sequences is very large and grows with the sample size, Šidák correction can still be used, but with caution.
More specifically, the validity of Šidák correction depends on how fast the number of sequences goes to infinity.
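Suppose we are interested in $m$ different hypotheses, $H_1, \ldots, H_m$, and would like to check whether all of them are true.
The hypothesis testing scheme then becomes
$H_{\text{null}}$: all of the $H_i$ are true;
$H_{\text{alternative}}$: at least one of the $H_i$ is false.
Let $\alpha$ be the level of this test (the type-I error), that is, the probability that we falsely reject $H_{\text{null}}$ when it is true.
We aim to design a test with a given level $\alpha$.
Suppose that when testing each hypothesis $H_i$, the test statistic we use is $t_i$. If these $t_i$ are independent, then a test for $H_{\text{null}}$ can be developed by the following procedure, known as Šidák correction.
Step 1: test each of the $m$ individual hypotheses at level $1-(1-\alpha)^{1/m}$.
Step 2: if any of these $m$ hypotheses is rejected, reject $H_{\text{null}}$.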
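As a small illustration of the Šidák correction, the sketch below computes the per-test level $1-(1-\alpha)^{1/m}$ and checks that $m$ independent tests run at that level have overall level $\alpha$. The function names `sidak_level` and `familywise_level` are hypothetical, chosen only for this example.

```python
def sidak_level(alpha: float, m: int) -> float:
    """Per-test level such that m independent tests have overall level alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m)


def familywise_level(per_test_level: float, m: int) -> float:
    """Probability of at least one false rejection among m independent tests."""
    return 1.0 - (1.0 - per_test_level) ** m


if __name__ == "__main__":
    alpha, m = 0.05, 10
    per_test = sidak_level(alpha, m)
    # Running each of the m tests at `per_test` recovers the overall level alpha
    # (up to floating-point rounding), provided the tests are independent.
    print(per_test, familywise_level(per_test, m))
```

The per-test level is slightly larger than the Bonferroni level $\alpha/m$, which is why the Šidák correction is a little less conservative when the independence assumption holds.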
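For finitely many t-tests, suppose $Y_{ij} = \mu_i + \varepsilon_{ij}$ for $i = 1, \ldots, N$ and $j = 1, \ldots, n$, where for each $i$, $\varepsilon_{i1}, \ldots, \varepsilon_{in}$ are independent and identically distributed; for each $j$, $\varepsilon_{1j}, \ldots, \varepsilon_{Nj}$
are independent but not necessarily identically distributed, and $\varepsilon_{ij}$
has finite fourth moment.
Our goal is to design a test for $H_{\text{null}}: \mu_i = 0$ for all $i = 1, \ldots, N$, with level $\alpha$.
This test can be based on the t-statistic of each sequence, that is,
$$t_i = \frac{\bar{Y}_i}{S_i / \sqrt{n}},$$
where
$$\bar{Y}_i = \frac{1}{n} \sum_{j=1}^{n} Y_{ij}, \qquad S_i^2 = \frac{1}{n} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_i)^2.$$
Using Šidák correction, we reject $H_{\text{null}}$
if any of the t-tests based on the t-statistics above rejects at level $1-(1-\alpha)^{1/N}$. More specifically, we reject $H_{\text{null}}$
when
$$\exists\, i \in \{1, \ldots, N\} : |t_i| > \zeta_{\alpha,N},$$
where
$$P(|Z| > \zeta_{\alpha,N}) = 1-(1-\alpha)^{1/N}, \qquad Z \sim N(0,1).$$
The test defined above has asymptotic level $\alpha$, because, by the independence of the $t_i$ and the central limit theorem (with $N$ fixed and $n \to \infty$),
$$\begin{aligned}
\text{level} &= P_{\text{null}}(\text{reject } H_{\text{null}}) \\
&= P_{\text{null}}(\exists\, i : |t_i| > \zeta_{\alpha,N}) \\
&= 1 - P_{\text{null}}(\forall\, i : |t_i| \le \zeta_{\alpha,N}) \\
&= 1 - \prod_{i=1}^{N} P_{\text{null}}(|t_i| \le \zeta_{\alpha,N}) \\
&\to 1 - \prod_{i=1}^{N} P(|Z_i| \le \zeta_{\alpha,N}) = 1 - \left((1-\alpha)^{1/N}\right)^{N} = \alpha,
\end{aligned}$$
where $Z_1, \ldots, Z_N$ are standard normal random variables.
In some cases, the number of sequences,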
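$N$, increases as the data size of each sequence, $n$, increases; in particular, suppose $N = N(n) \to \infty$ as $n \to \infty$.
If this is true, then we need to test a null hypothesis comprising infinitely many hypotheses, that is, $H_{\text{null}}$: all of the $H_i$ are true, $i = 1, 2, \ldots$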
To design a test, Šidák correction may be applied, as in the case of finitely many t-tests.
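A Šidák-corrected t-test on several sequences can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, each t-statistic is the sample mean divided by its estimated standard error (with a $1/n$ variance normalisation, which is asymptotically equivalent to $1/(n-1)$), and the critical value is taken from the standard normal distribution at the two-sided per-test level $1-(1-\alpha)^{1/N}$.

```python
import math
from statistics import NormalDist


def t_statistic(y):
    """t-statistic for H: mean = 0, using a 1/n variance estimate (assumption)."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return mean / math.sqrt(var / n)


def sidak_critical_value(alpha, num_sequences):
    """Two-sided standard normal critical value at the Sidak per-test level."""
    per_test = 1.0 - (1.0 - alpha) ** (1.0 / num_sequences)
    return NormalDist().inv_cdf(1.0 - per_test / 2.0)


def sidak_t_test(sequences, alpha=0.05):
    """Return True if the global null (all means zero) is rejected."""
    zeta = sidak_critical_value(alpha, len(sequences))
    return any(abs(t_statistic(y)) > zeta for y in sequences)


if __name__ == "__main__":
    data = [
        [0.3, -0.2, 0.1, -0.4, 0.2, 0.0],   # centred near zero
        [5.1, 4.8, 5.3, 4.9, 5.0, 5.2],     # clearly shifted away from zero
    ]
    print(sidak_t_test(data))
```

The critical value grows with the number of sequences, so each individual t-test becomes harder to reject as more sequences are tested together.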
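However, when $N(n) \to \infty$ as $n \to \infty$, the Šidák correction for the t-test may not achieve the desired level; that is, the true level of the test may not converge to the nominal level $\alpha$ as $n$ goes to infinity.
This result is related to high-dimensional statistics and was proven by Fan, Hall & Yao (2007).
Specifically, for the true level of the test to converge to the nominal level $\alpha$, the rate at which $N(n) \to \infty$ must be restricted.
The results above are based on the central limit theorem.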
According to the central limit theorem, each of our t-statistics $t_i$ is asymptotically standard normal, so the difference between the distribution of each $t_i$
and the standard normal distribution is asymptotically negligible.
The question is: if we aggregate all the differences between the distribution of each $t_i$
and the standard normal distribution, is this aggregate still asymptotically negligible?
When there are finitely many $t_i$, the answer is yes; when the number of $t_i$ grows without bound, the answer can be no.
This is because in the latter case we are summing infinitely many infinitesimal terms.
If the number of terms goes to infinity too fast, that is, if $N(n) \to \infty$
too fast, then the sum need not vanish, the distribution of the t-statistics cannot be approximated by the standard normal distribution, the true level does not converge to the nominal level $\alpha$,
and the Šidák correction fails.
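One way to get a rough feel for this effect is a small Monte Carlo experiment that estimates the true level of the Šidák-corrected test under the null for a chosen number of sequences and sample size. The sketch below is only illustrative and rests on several assumptions: the function names are hypothetical, the errors are centred exponential variables (a skewed distribution picked for the example), normal critical values are used, and the estimate depends on the seed and the number of replications.

```python
import math
import random
from statistics import NormalDist


def t_statistic(y):
    """t-statistic for H: mean = 0, using a 1/n variance estimate (assumption)."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n
    return mean / math.sqrt(var / n)


def estimated_level(num_sequences, n, alpha=0.05, reps=500, seed=0, skewed=True):
    """Monte Carlo estimate of P(reject global null) when every mean is zero."""
    rng = random.Random(seed)
    per_test = 1.0 - (1.0 - alpha) ** (1.0 / num_sequences)
    zeta = NormalDist().inv_cdf(1.0 - per_test / 2.0)
    rejections = 0
    for _ in range(reps):
        for _ in range(num_sequences):
            if skewed:
                # Exponential(1) shifted to have mean zero: skewed errors.
                y = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
            else:
                y = [rng.gauss(0.0, 1.0) for _ in range(n)]
            if abs(t_statistic(y)) > zeta:
                rejections += 1
                break  # one rejection is enough to reject the global null
    return rejections / reps


if __name__ == "__main__":
    # Hold n fixed and let the number of sequences grow.
    for num_sequences in (1, 10, 100):
        print(num_sequences, estimated_level(num_sequences, n=20, reps=200))
```

Comparing the estimates with the nominal level for increasing numbers of sequences at a fixed sample size gives an informal picture of how the normal approximation errors accumulate; it is not a substitute for the theoretical conditions.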