Two-proportion Z-test

The two-proportion Z-test (or two-sample proportion Z-test) is a statistical method used to determine whether the difference between the proportions of two groups, each drawn from a binomial distribution, is statistically significant.[1]

The test relies on the fact that a sample proportion, being the average of observations from a Bernoulli distribution, is asymptotically normal under the central limit theorem. This allows the construction of a z-test for hypothesis testing and of confidence intervals for the difference in proportions.

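The test involves two competing hypotheses. The null hypothesis $H_0$ states that the proportions in the two populations are equal, $p_1 = p_2$. The alternative hypothesis $H_1$ states that they differ, $p_1 \neq p_2$ (two-tailed), or that one exceeds the other, $p_1 > p_2$ or $p_1 < p_2$ (one-tailed).

The z-statistic for comparing two proportions is computed using:[2]

$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$$

Where: $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions in the first and second samples, $n_1$ and $n_2$ are the sample sizes, and $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$ is the pooled proportion, with $x_1$ and $x_2$ the numbers of successes in the two samples. The pooled proportion is used to estimate the shared probability of success under the null hypothesis, and the standard error accounts for variability across the two samples.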

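The z-test determines statistical significance by comparing the calculated z-statistic to a critical value. For example, at a significance level of $\alpha = 0.05$, the null hypothesis is rejected in a two-tailed test if $|z| > 1.96$.

Alternatively, one can compute the p-value and reject the null hypothesis if the p-value is smaller than $\alpha$.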
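The computation described above can be sketched in a few lines of Python using only the standard library. The function name `two_proportion_ztest` and the counts (120 of 1000 versus 160 of 1000) are illustrative assumptions, not taken from the cited sources.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x1, n1, x2, n2):
    """Return (z, two-sided p-value) for H0: p1 == p2 (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Pooled proportion: shared success probability under the null hypothesis.
    p_pool = (x1 + x2) / (n1 + n2)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se_pooled
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Illustrative counts (assumed for this example).
z, p_value = two_proportion_ztest(x1=120, n1=1000, x2=160, n2=1000)

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
reject_by_critical_value = abs(z) > z_crit
reject_by_p_value = p_value < alpha
```

Both decision rules agree by construction, since the two-sided p-value falls below $\alpha$ exactly when $|z|$ exceeds the critical value.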

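The confidence interval for the difference between two proportions, based on the definitions above, is:

$$(\hat{p}_1 - \hat{p}_2) \pm z_{1-\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1}+\frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$

Where: $z_{1-\alpha/2}$ is the critical value of the standard normal distribution (e.g., 1.96 for a 95% confidence level). This interval provides a range of plausible values for the true difference between population proportions.[3]: 216–7 [4]: 875

Fisher’s exact test is more suitable when the sample sizes are small.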

Notice that the variance is estimated differently for hypothesis testing and for the confidence interval.

The test uses a pooled variance (based on the null hypothesis), while the interval estimates the variance from each sample separately, so that it can accommodate a range of differences in proportions.
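The contrast between the two standard errors can be made concrete with a small sketch. The helper names `pooled_se`, `unpooled_se` and `diff_proportions_ci`, and the counts used, are assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def pooled_se(x1, n1, x2, n2):
    """Standard error under H0, using one pooled proportion."""
    p_pool = (x1 + x2) / (n1 + n2)
    return sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))


def unpooled_se(x1, n1, x2, n2):
    """Standard error with each sample's variance estimated separately."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    return sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)


def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """Normal-approximation interval for p1 - p2 (unpooled standard error)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = x1 / n1 - x2 / n2
    margin = z_crit * unpooled_se(x1, n1, x2, n2)
    return diff - margin, diff + margin


# Illustrative counts (assumed for this example).
se_test = pooled_se(120, 1000, 160, 1000)
se_interval = unpooled_se(120, 1000, 160, 1000)
lower, upper = diff_proportions_ci(120, 1000, 160, 1000)
```

For these counts the two standard errors are close but not identical, and the 95% interval excludes zero.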

The minimum detectable effect (MDE) is the smallest difference between two proportions ($p_1$ and $p_2$) that a statistical test can detect for a chosen Type I error level ($\alpha$), statistical power ($1-\beta$), and sample sizes ($n_1$ and $n_2$).

It is commonly used in study design to determine whether the sample sizes allow for a test with sufficient sensitivity to detect meaningful differences.

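The MDE for the (two-sided) z-test for comparing two proportions, incorporating critical values for $\alpha$ and $1-\beta$ and the standard errors of the proportions, is:

$$\text{MDE} = |p_1 - p_2| = z_{1-\alpha/2}\sqrt{p_0(1-p_0)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)} + z_{1-\beta}\sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$$

Where: $z_{1-\alpha/2}$ is the critical value for the significance level, $z_{1-\beta}$ is the quantile for the desired power, and $p_0$ is the common proportion ($p_0 = p_1 = p_2$) assumed under the null hypothesis. The MDE depends on the sample sizes, baseline proportions ($p_1$, $p_2$), and test parameters ($\alpha$, $\beta$).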

When the baseline proportions are not known, they need to be assumed or roughly estimated from a small study.

Researchers may use the MDE to assess the feasibility of detecting meaningful differences before conducting a study.
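Because the alternative proportion appears on both sides of the MDE formula, one way to use it in planning is to solve for the difference numerically. The sketch below is one possible approach, not a standard routine: the function names are hypothetical, $p_0$ is taken as the sample-size-weighted average of the two proportions (an assumption), only an increase over the baseline is considered, and bisection assumes the difference crosses the formula's value once on the search interval.

```python
from math import sqrt
from statistics import NormalDist


def mde_formula(p1, p2, n1, n2, alpha=0.05, power=0.80):
    """Evaluate the right-hand side of the MDE formula for assumed p1, p2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Assumption: common proportion under H0 is the weighted average.
    p0 = (n1 * p1 + n2 * p2) / (n1 + n2)
    se_pooled = sqrt(p0 * (1 - p0) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z_alpha * se_pooled + z_beta * se_alt


def solve_mde(p1, n1, n2, alpha=0.05, power=0.80, tol=1e-10):
    """Smallest increase delta with delta >= mde_formula(p1, p1 + delta)."""
    lo, hi = 0.0, 1.0 - p1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        p2 = min(p1 + mid, 1.0)  # guard against rounding above 1
        if mid >= mde_formula(p1, p2, n1, n2, alpha, power):
            hi = mid  # difference is detectable; try a smaller one
        else:
            lo = mid  # not detectable yet; a larger difference is needed
    return hi


# Illustrative planning inputs (assumed): 10% baseline, 1000 per group.
delta = solve_mde(p1=0.10, n1=1000, n2=1000)
```

With these inputs the detectable increase comes out at roughly four percentage points; larger samples shrink it, since both standard errors scale with $1/\sqrt{n}$.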

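The Minimal Detectable Effect (MDE) is the smallest difference, denoted as $\Delta = |p_1 - p_2|$, that satisfies two essential criteria in hypothesis testing: first, the null hypothesis ($H_0: p_1 = p_2$) is rejected at the specified significance level ($\alpha$); second, the statistical power ($1-\beta$) is achieved under the alternative hypothesis ($H_1: p_1 \neq p_2$).

Given that the distribution is normal under both the null and the alternative hypothesis, both criteria hold when $\Delta$ is such that the critical value for rejecting the null ($X_{\text{crit}}$) lies exactly where the probability of a result at least this extreme is ($\alpha$) under the null, and the probability of exceeding it is ($1-\beta$) under the alternative.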

The first criterion establishes the critical value required to reject the null hypothesis.

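Under the null hypothesis, the test statistic is based on the pooled standard error ($SE_{\text{pooled}}$):

$$SE_{\text{pooled}} = \sqrt{p_0(1-p_0)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$$

so the critical value for rejecting $H_0$ is $X_{\text{crit}} = z_{1-\alpha/2}\,SE_{\text{pooled}}$. If the alternative distribution were centered at $X_{\text{crit}}$, the statistical power would be only 50% because the alternative distribution is symmetric about the threshold.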

To achieve a higher power level, an additional component is required in the MDE calculation.

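Under the alternative hypothesis, the standard error is estimated from each sample separately:

$$SE_{\text{alt}} = \sqrt{\frac{p_1(1-p_1)}{n_1}+\frac{p_2(1-p_2)}{n_2}}$$

The center of the alternative distribution must therefore lie at least $z_{1-\beta}\,SE_{\text{alt}}$ beyond the critical value to ensure that the probability of detecting the difference under the alternative hypothesis is at least $1-\beta$.

By summing the critical threshold from the null distribution and the relevant quantile from the alternative distribution, the MDE ensures the test satisfies the dual requirements of rejecting $H_0$ at significance level $\alpha$ and achieving statistical power of at least $1-\beta$.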
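The two-step argument can be checked numerically under the normal approximation. In this sketch the standard errors are hypothetical fixed inputs, the function name `power_at` is an assumption, and the small probability of rejecting in the opposite tail is ignored.

```python
from statistics import NormalDist

nd = NormalDist()


def power_at(delta, se_pooled, se_alt, alpha=0.05):
    """Approximate power when the true difference is delta (one tail only)."""
    x_crit = nd.inv_cdf(1 - alpha / 2) * se_pooled
    # Probability that a normal variable centered at delta exceeds x_crit.
    return 1 - nd.cdf((x_crit - delta) / se_alt)


# Hypothetical standard errors, chosen only for illustration.
se_pooled, se_alt = 0.0145, 0.0145

# Alternative centered exactly on the critical value: power is 50%.
x_crit = nd.inv_cdf(0.975) * se_pooled
power_at_threshold = power_at(x_crit, se_pooled, se_alt)

# Adding the power quantile times the alternative SE lifts power to 80%.
mde = x_crit + nd.inv_cdf(0.80) * se_alt
power_at_mde = power_at(mde, se_pooled, se_alt)
```

The first case reproduces the 50% power described above, and the second shows that the added term is what raises power to the target level.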

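To ensure valid results, the following assumptions must be met: the two samples are independent random samples, each observation has a binary outcome, and the samples are large enough for the normal approximation to hold (a common rule of thumb asks for at least 5 to 10 successes and failures in each group). The z-test is most reliable when sample sizes are large and all assumptions are satisfied.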
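A sample-size check of this kind is easy to automate. The sketch below is a rough screen only: the function name and the threshold `min_count=10` are assumptions (textbooks differ on the exact cutoff), and independence of the samples cannot be verified from counts alone.

```python
def normal_approx_ok(x1, n1, x2, n2, min_count=10):
    """Check that each group has enough successes and failures.

    min_count is a rule-of-thumb threshold (assumed here), not a fixed rule.
    """
    counts = [x1, n1 - x1, x2, n2 - x2]
    return all(count >= min_count for count in counts)


# Illustrative counts (assumed for this example).
large_samples_ok = normal_approx_ok(120, 1000, 160, 1000)
small_samples_ok = normal_approx_ok(2, 20, 5, 25)
```

When the check fails, as in the second call, an exact method such as Fisher's exact test is usually the safer choice.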

In R, use prop.test() with the continuity correction disabled (correct = FALSE). The output includes results equivalent to the z-test: the chi-squared statistic (the square of the z-statistic), the p-value, and a confidence interval.

In Python, use proportions_ztest from statsmodels.
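The equivalence between the z-test and the uncorrected chi-squared test on a 2×2 table can be verified directly. This sketch uses only the Python standard library rather than R or statsmodels; the function names and counts are illustrative assumptions.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(x1, n1, x2, n2):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table."""
    observed = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


def z_statistic(x1, n1, x2, n2):
    """Pooled two-proportion z-statistic."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


# Illustrative counts (assumed for this example).
chi2, chi2_p = pearson_chi2_2x2(120, 1000, 160, 1000)
z = z_statistic(120, 1000, 160, 1000)
z_p = erfc(abs(z) / sqrt(2))  # two-sided normal p-value
```

The chi-squared statistic equals the squared z-statistic, and the two p-values coincide, which is why the chi-squared output can be read as a two-sided z-test.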