Sign test

If X and Y are quantitative variables, the sign test can be used to test the hypothesis that the difference between X and Y has zero median. The test assumes that X and Y are continuous random variables and that paired samples can be drawn from them.

To test the null hypothesis, independent pairs of sample data are collected from the populations: {(x1, y1), (x2, y2), ..., (xn, yn)}.
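The procedure can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes the conventional treatment in which pairs with no difference are dropped before counting, and it doubles the larger tail to get a two-sided p-value. The function name `sign_test` and the sample pairs are hypothetical.

```python
from math import comb

def sign_test(pairs):
    """Two-sided exact sign test for paired data.

    Returns (n_plus, n_minus, p_value). Pairs with x == y are dropped,
    which is a conventional choice and an assumption of this sketch.
    """
    n_plus = sum(1 for x, y in pairs if x > y)
    n_minus = sum(1 for x, y in pairs if x < y)
    m = n_plus + n_minus  # number of pairs with a nonzero difference
    if m == 0:
        return n_plus, n_minus, 1.0
    k = max(n_plus, n_minus)
    # Under the null hypothesis each sign is "+" with probability 1/2,
    # so the count of "+" signs follows Binomial(m, 1/2).
    upper_tail = sum(comb(m, j) for j in range(k, m + 1)) / 2 ** m
    p_value = min(1.0, 2 * upper_tail)  # double the tail for two sides
    return n_plus, n_minus, p_value

# Hypothetical illustrative pairs (not data from any study).
pairs = [(5.1, 4.8), (6.0, 5.5), (4.9, 4.9), (5.7, 5.2), (5.3, 5.6), (6.2, 5.9)]
n_plus, n_minus, p = sign_test(pairs)
print(n_plus, n_minus, p)  # 4 1 0.375
```

With only five informative pairs, even a 4-to-1 split is far from significant, which reflects the low power of the test on small samples.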

For large samples (m > 25, where m is the number of pairs with a nonzero difference), the normal approximation to the binomial distribution can be used.
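As a rough check on how close the approximation is, the sketch below compares the exact binomial p-value with a normal approximation. The continuity correction of 0.5 is one common variant and is an assumption here, as are the function names and the counts (21 plus signs out of 30), which are hypothetical.

```python
from math import comb, erfc, sqrt

def exact_two_sided(k, m):
    """Exact two-sided sign-test p-value for k plus signs out of m."""
    k = max(k, m - k)
    tail = sum(comb(m, j) for j in range(k, m + 1)) / 2 ** m
    return min(1.0, 2 * tail)

def normal_two_sided(k, m):
    """Normal approximation with a 0.5 continuity correction (assumed variant)."""
    # Under H0 the count has mean m/2 and standard deviation sqrt(m/4).
    z = max((abs(k - m / 2) - 0.5) / sqrt(m / 4), 0.0)
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))

m, k = 30, 21  # hypothetical counts
p_exact = exact_two_sided(k, m)
p_normal = normal_two_sided(k, m)
print(round(p_exact, 4), round(p_normal, 4))
```

For these counts the two values differ only in the third decimal place, which is the sense in which the approximation is adequate once m exceeds about 25.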

Zar gives the following example of the sign test for matched pairs.[5]

The null hypothesis is that there is no difference between the hind leg and foreleg length in deer.

With a larger sample size, the evidence might be sufficient to reject the null hypothesis.

Because the observations can be expressed as numeric quantities (actual leg length), the paired t-test or Wilcoxon signed rank test will usually have greater power than the sign test to detect consistent differences.

Had the observed result been 9 positive differences in 10 comparisons, the sign test would have been significant (two-sided p ≈ 0.021, below the conventional 0.05 level).
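The arithmetic behind this statement can be checked with exact fractions. The helper name `two_sided_sign_p` is hypothetical, and the 8-of-10 case is included only as a comparison point for the same number of pairs.

```python
from fractions import Fraction
from math import comb

def two_sided_sign_p(k, n):
    """Exact two-sided sign-test p-value for k positive differences out of n."""
    k = max(k, n - k)
    tail = Fraction(sum(comb(n, j) for j in range(k, n + 1)), 2 ** n)
    return min(Fraction(1), 2 * tail)

p_nine = two_sided_sign_p(9, 10)    # 2 * (10 + 1) / 1024
p_eight = two_sided_sign_p(8, 10)   # 2 * (45 + 10 + 1) / 1024
print(p_nine, float(p_nine))    # 11/512 0.021484375
print(p_eight, float(p_eight))  # 7/64 0.109375
```

One extra positive difference moves the p-value from about 0.109 to about 0.021, which shows how coarse the test is with only 10 pairs.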

Conover[6] gives the following example using a one-sided sign test for matched pairs.

If the null hypothesis that consumers have no preference for B over A is true, what is the probability of a result as extreme as 8 positives in favor of B in 9 pairs?
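Because the test is one-sided, only the upper tail of the binomial distribution is needed. A minimal sketch, with a hypothetical helper name `upper_tail`:

```python
from math import comb

def upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# 8 or 9 preferences for B out of 9 pairs, under no preference (p = 0.5).
p_one_sided = upper_tail(8, 9)
print(p_one_sided)  # 0.01953125, i.e. (9 + 1) / 512
```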

In a clinical trial, survival time (weeks) is collected for 10 subjects with non-Hodgkin's lymphoma.

The exact survival time was not known for one subject who was still alive after 362 weeks, when the study ended.

The researcher wished to determine if the median survival time was less than or greater than 200 weeks.

Because any one observation is equally likely to be above or below the population median, the number of plus scores will have a binomial distribution with success probability p = 0.5.

This is exactly the same as the probability of a result as extreme as 7 heads in 10 tosses of a fair coin.
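The coin-toss probability can be computed directly. Only the sign of each observation relative to 200 weeks matters, so the subject still alive at 362 weeks can be scored even though the exact survival time is unknown. The variable names below are illustrative.

```python
from math import comb

n, k = 10, 7  # 7 "heads" in 10 tosses of a fair coin

# P(X >= 7) for X ~ Binomial(10, 1/2): (120 + 45 + 10 + 1) / 1024
one_sided = sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n
# Doubling gives the two-sided value, since the distribution is symmetric.
two_sided = min(1.0, 2 * one_sided)
print(one_sided, two_sided)  # 0.171875 0.34375
```

Neither value is small, so a 7-to-3 split in 10 observations gives little evidence that the median differs from 200 weeks.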

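The syntax for the R function binom.test is binom.test(x, n, p = 0.5, alternative = c("two.sided", "less", "greater"), conf.level = 0.95), where x is the number of successes, n is the number of trials, and p is the hypothesized probability of success.

Examples of the sign test using the R function binom.test

The sign test example from Zar [5] compared the length of hind legs and forelegs of deer.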

The hypothesized probability of success (defined as hind leg longer than foreleg) is p = 0.5 under the null hypothesis that hind legs and forelegs do not differ in length.

Conover [6] and Sprent [7] describe John Arbuthnot's use of the sign test in 1710.

If the null hypothesis of equal numbers of male and female births is true, the probability of the observed outcome is 1/2^82, leading Arbuthnot to conclude that the probabilities of male and female births were not exactly equal.
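The size of this probability is easier to appreciate numerically. The sketch below assumes 82 consecutive yearly records that all show the same sign. That count comes from the usual account of Arbuthnot's data rather than from the text above, so it is an assumption here.

```python
from fractions import Fraction

# Assumption of this sketch: 82 consecutive years, every one with the same sign.
years = 82
p_all_same_sign = Fraction(1, 2) ** years

print(p_all_same_sign == Fraction(1, 2 ** 82))  # True
print(float(p_all_same_sign))                   # on the order of 2e-25
```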

"Nicholas Bernoulli (1710–1713) completes the analysis of Arbuthnot's data by showing that the larger part of the variation of the yearly number of male births can be explained as binomial with p = 18/35.

Hence we here have a test of significance rejecting the hypothesis p = 0.5 followed by an estimation of p and a discussion of the goodness of fit …"

The sign test requires only that the observations in a pair be ordered, for example x > y.

The paired t-test will generally have greater power to detect differences than the sign test.
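One way to quantify this power gap is the asymptotic relative efficiency. As a sketch, assuming normally distributed differences, the standard value for the sign test relative to the t-test is 2/π. The snippet evaluates it and the sample-size ratio it implies. The variable names are illustrative.

```python
from math import pi

# Asymptotic relative efficiency of the sign test vs. the t-test,
# assuming normally distributed differences (standard result: 2/pi).
are_sign_vs_t = 2 / pi

# Reciprocal: roughly how many times more pairs the sign test needs
# to match the t-test's power in large samples.
pairs_ratio = 1 / are_sign_vs_t

print(round(are_sign_vs_t, 3))  # 0.637
print(round(pairs_ratio, 2))    # 1.57
```

This is a large-sample statement that applies under normality. It says nothing about heavier-tailed distributions.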

The asymptotic relative efficiency of the sign test relative to the paired t-test, when the differences are normally distributed, is 0.637.[5]

Bian, McAleer and Wong[10] proposed in 2011 a non-parametric test for paired data when there are many ties.