Newman–Keuls method

The Newman–Keuls or Student–Newman–Keuls (SNK) method is a stepwise multiple comparisons procedure used to identify sample means that are significantly different from each other.

[4] This procedure is often used as a post-hoc test whenever a significant difference between three or more sample means has been revealed by an analysis of variance (ANOVA).

Compared with Tukey's range test, the Newman–Keuls procedure is more likely to reveal significant differences between group means, and also more likely to commit type I errors by incorrectly rejecting a null hypothesis that is true.

In other words, the Newman–Keuls procedure is more powerful but less conservative than Tukey's range test.

The Newman–Keuls method controls the family-wise error rate (FWER) in the weak sense but not in the strong sense:[11][12] it controls the risk of rejecting the null hypothesis when all means are equal (the global null hypothesis), but it does not control the risk of falsely rejecting partial null hypotheses.

For instance, when four means are compared under the partial null hypothesis that μ1=μ2=μ and μ3=μ4=μ+delta with a non-zero delta, the Newman–Keuls procedure has a probability greater than alpha of rejecting μ1=μ2 or μ3=μ4 or both.

[11] In the worst case, the FWER of the Newman–Keuls procedure is 1-(1-alpha)^int(J/2), where J is the total number of groups and int(J/2) is the integer part of J/2.
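The worst-case expression above is easy to evaluate numerically. The following is a minimal Python sketch; the function name `worst_case_fwer` is a hypothetical helper introduced only for illustration.

```python
def worst_case_fwer(alpha: float, n_groups: int) -> float:
    """Evaluate 1 - (1 - alpha)^int(J/2) for J = n_groups."""
    # Integer division gives int(J/2), the number of disjoint pairs of groups.
    return 1.0 - (1.0 - alpha) ** (n_groups // 2)


# Print the bound at alpha = 0.05 for a few group counts.
for j in (2, 3, 4, 6, 10):
    print(j, round(worst_case_fwer(0.05, j), 4))
```

With two or three groups the bound equals alpha itself, and it grows as the number of groups increases, which is consistent with the loss of strong FWER control described above.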

In 1995, Benjamini and Hochberg presented a new, more liberal, and more powerful criterion for this type of problem: false discovery rate (FDR) control.

[13] In 2006, Shaffer showed (by extensive simulation) that the Newman–Keuls method controls the FDR with some constraints.

Violating homogeneity of variance can be more problematic than in the two-sample case since the MSE is based on data from all groups.

The Newman–Keuls method employs a stepwise approach when comparing sample means.

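To determine whether two means with equal sample sizes differ significantly, the Newman–Keuls method uses the same formula as Tukey's range test, which calculates the q value as the difference between the two sample means divided by the standard error:

$$q = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\mathrm{MSE} / n}}$$

where \bar{X}_A and \bar{X}_B are the larger and smaller of the two sample means being compared, MSE is the mean squared error, and n is the common sample size. When the two sample sizes n_A and n_B are unequal, the standard error is commonly adjusted to \sqrt{(\mathrm{MSE}/2)(1/n_A + 1/n_B)}.

In both cases, the MSE (mean squared error) is taken from the ANOVA conducted in the first stage of the analysis.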
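As a rough illustration of the q calculation described above (difference between two sample means divided by the standard error), here is a small Python sketch. The function name `q_statistic` and its parameters are hypothetical. The unequal-sample-size branch uses a Tukey–Kramer-style adjustment, which is an assumption of this sketch rather than something prescribed by the text.

```python
import math
from typing import Optional


def q_statistic(mean_a: float, mean_b: float, mse: float,
                n_a: int, n_b: Optional[int] = None) -> float:
    """Studentized range statistic for two group means.

    mse is the mean squared error from the preceding ANOVA.
    If n_b is omitted, both groups are assumed to have n_a observations.
    """
    if n_b is None:
        n_b = n_a
    # Assumed adjustment for unequal sizes; reduces to sqrt(mse / n)
    # when n_a == n_b.
    std_err = math.sqrt((mse / 2.0) * (1.0 / n_a + 1.0 / n_b))
    return abs(mean_a - mean_b) / std_err


# Example: two groups of 4 observations, MSE = 4, so the standard error is 1.
q_equal = q_statistic(12.0, 8.0, mse=4.0, n_a=4)
print(q_equal)
```

The resulting q would then be compared against a critical value of the studentized range distribution for the relevant number of means and the error degrees of freedom.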

[16] Because the number of means within a range changes with each successive pairwise comparison, the critical value of the q statistic also changes with each comparison, which makes the Newman–Keuls method more lenient and hence more powerful than Tukey's range test.
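The stepwise logic, in which the critical value depends on how many ordered means a range spans, can be sketched in Python as follows. This is a simplified illustration for equal sample sizes, not a reference implementation. The function `newman_keuls` and the caller-supplied `q_crit` callable are hypothetical names. The critical values in the example are approximate 5% table values for 20 error degrees of freedom and should be treated as illustrative; in practice they could come from a studentized range table or from `scipy.stats.studentized_range.ppf(1 - alpha, p, df)`.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def newman_keuls(means: Sequence[float], mse: float, n: int,
                 q_crit: Callable[[int], float]) -> Dict[Tuple[int, int], bool]:
    """Return {(index of smaller mean, index of larger mean): significant?}.

    q_crit(p) must return the critical q value for a range spanning p means.
    """
    k = len(means)
    order = sorted(range(k), key=lambda i: means[i])  # ranks -> group indices
    std_err = math.sqrt(mse / n)
    homogeneous: List[Tuple[int, int]] = []  # rank ranges not rejected so far
    result: Dict[Tuple[int, int], bool] = {}

    # Start with the widest range (all k means) and step down to pairs.
    for p in range(k, 1, -1):
        for lo in range(k - p + 1):
            hi = lo + p - 1
            pair = (order[lo], order[hi])
            # A range nested in a non-significant range is not tested again.
            if any(a <= lo and hi <= b for a, b in homogeneous):
                result[pair] = False
                continue
            q = (means[order[hi]] - means[order[lo]]) / std_err
            if q > q_crit(p):
                result[pair] = True
            else:
                result[pair] = False
                homogeneous.append((lo, hi))
    return result


# Illustrative critical values indexed by the number of means in the range.
crit_table = {2: 2.95, 3: 3.58, 4: 3.96}
outcome = newman_keuls([10.0, 11.0, 15.0, 16.0], mse=4.0, n=4,
                       q_crit=lambda p: crit_table[p])
for pair, significant in outcome.items():
    print(pair, significant)
```

Passing the critical value in as a function keeps the sketch free of external dependencies and makes the dependence on the range size p explicit. Tukey's range test would instead use the value for all k means in every comparison.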

[7] Because of its sequential nature, the Newman–Keuls procedure cannot produce a confidence interval for each mean difference, nor exact multiplicity-adjusted p-values.