In statistics, econometrics, political science, epidemiology, and related disciplines, a regression discontinuity design (RDD) is a quasi-experimental pretest–posttest design that aims to determine the causal effects of interventions by exploiting a cutoff or threshold above or below which an intervention is assigned.
By comparing observations lying closely on either side of the threshold, it is possible to estimate the average treatment effect in environments in which randomisation is infeasible.[2] Recent comparisons of randomised controlled trials (RCTs) and RDDs have empirically demonstrated the internal validity of the design.
The main problem with estimating the causal effect of such an intervention is the endogeneity of performance to the assignment of treatment (e.g., a scholarship award).
Since high-performing students are both more likely to be awarded the merit scholarship and more likely to continue performing well, comparing the outcomes of awardees and non-recipients would bias the estimates upwards.
Despite the absence of an experimental design, an RDD can exploit exogenous characteristics of the intervention to elicit causal effects.
The two most common approaches to estimation using an RDD are non-parametric and parametric (normally polynomial regression).
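To make the parametric approach concrete, the sketch below fits a global polynomial with separate trends on each side of the cutoff and reads the treatment effect off the jump in the intercept at the threshold. The data-generating process, variable names, and polynomial order here are illustrative assumptions, not taken from any particular study.

```python
import numpy as np
import statsmodels.api as sm

# Parametric (polynomial) RDD sketch: regress the outcome on a treatment
# dummy plus polynomial trends in the centred running variable, letting
# the trends differ on each side of the cutoff. All names (x, y, d, c, p)
# are illustrative assumptions.

def poly_rdd_tau(y, x, d, c, p=2):
    xc = x - c                                      # centre the running variable
    trends = [xc**k for k in range(1, p + 1)]       # common polynomial terms
    trends += [d * xc**k for k in range(1, p + 1)]  # side-specific terms
    X = sm.add_constant(np.column_stack([d] + trends))
    return sm.OLS(y, X).fit().params[1]             # jump in intercept at the cutoff

# Toy data with a true discontinuity of 2.0 at a cutoff of 80.
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 2000)
d = (x >= 80).astype(float)
y = 10 + 0.05 * x + 2.0 * d + rng.normal(0, 1, x.size)
print(f"polynomial RDD estimate: {poly_rdd_tau(y, x, d, c=80):.3f}")
```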
The most common non-parametric method used in the RDD context is a local linear regression.[4] The major benefit of using non-parametric methods in an RDD is that they provide estimates based on data closer to the cut-off, which is intuitively appealing.[4]
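A common local linear specification, shown as a hedged sketch below, fits separate linear trends on each side of the cutoff using only observations within a bandwidth h of it (a rectangular kernel); the coefficient on the treatment dummy is the estimated effect at the cutoff. The variable names and bandwidth value are assumptions for illustration.

```python
import numpy as np
import statsmodels.api as sm

# Local linear RDD sketch: keep only observations within a bandwidth h of
# the cutoff c and fit Y = a + tau*D + b1*(X - c) + b2*D*(X - c) + e.
# The names (y, x, d, c, h) are illustrative.

def local_linear_tau(y, x, d, c, h):
    mask = np.abs(x - c) <= h          # rectangular kernel around the cutoff
    xc = x[mask] - c
    X = sm.add_constant(np.column_stack([d[mask], xc, d[mask] * xc]))
    return sm.OLS(y[mask], X).fit().params[1]   # tau: jump at the cutoff
```

Shrinking the bandwidth h trades bias for variance: the estimate relies on observations ever closer to the cut-off, but on fewer of them. In practice h is usually chosen by a data-driven rule rather than fixed by hand.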
It is impossible to definitively test for validity if agents are able to determine their treatment status perfectly.
However, some tests can provide evidence that either supports or discounts the validity of the regression discontinuity design.
For the earlier example, one could test whether those who just barely passed have different characteristics (demographics, family income, etc.) than those who just barely failed.
Consider the example of Carpenter and Dobkin (2011), who studied the effect of legal access to alcohol in the United States: because the minimum legal drinking age creates a cutoff at age 21, predetermined characteristics should evolve smoothly across that threshold if the design is valid.
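One way to implement such a balance check, sketched below under assumed variable names, is to rerun the same local specification with a predetermined covariate as the outcome; a statistically significant jump at the cutoff would cast doubt on the design.

```python
import numpy as np
import statsmodels.api as sm

# Covariate balance check sketch: run the RDD specification with a
# pre-determined characteristic (e.g., family income) as the outcome.
# A significant jump at the cutoff suggests sorting or manipulation.
# All names (covariate, x, c, h) are illustrative assumptions.

def balance_check(covariate, x, c, h):
    mask = np.abs(x - c) <= h
    xc = x[mask] - c
    d = (xc >= 0).astype(float)
    X = sm.add_constant(np.column_stack([d, xc, d * xc]))
    fit = sm.OLS(covariate[mask], X).fit()
    return fit.params[1], fit.pvalues[1]   # estimated jump and its p-value
```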
If parameter estimates are sensitive to removing or adding covariates to the model, then this may cast doubt on the validity of the regression discontinuity design.[4] Recent work has shown how to add covariates, under what conditions doing so is valid, and the potential for increased precision.[14]
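A simple sensitivity check along these lines, again as a sketch with illustrative names, re-estimates the effect with and without predetermined covariates and compares the two estimates.

```python
import numpy as np
import statsmodels.api as sm

# Covariate sensitivity sketch: the RDD estimate should be stable when
# pre-determined covariates are added; their main role is to improve
# precision. Names (y, x, d, c, h, Z) are illustrative assumptions.

def rdd_tau(y, x, d, c, h, Z=None):
    mask = np.abs(x - c) <= h
    xc = x[mask] - c
    cols = [d[mask], xc, d[mask] * xc]
    if Z is not None:                       # optional covariate matrix (n, k)
        cols.append(Z[mask])
    X = sm.add_constant(np.column_stack(cols))
    return sm.OLS(y[mask], X).fit().params[1]

# A large gap between rdd_tau(..., Z=None) and rdd_tau(..., Z=covariates)
# would cast doubt on the design.
```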
The identification of causal effects hinges on the crucial assumption that there is indeed a sharp cut-off, around which the probability of assignment jumps discontinuously from 0 to 1. In reality, however, cutoffs are often not strictly implemented (e.g., discretion may be exercised for students who just fell short of the passing threshold), and the estimates will hence be biased.
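The standard response to such imperfect compliance is the "fuzzy" RDD, which instruments actual treatment with the indicator for crossing the threshold. The sketch below computes the two-stage least squares point estimate by hand; all variable names are illustrative, and the manual second stage does not produce valid standard errors (a proper 2SLS routine would).

```python
import numpy as np
import statsmodels.api as sm

# Fuzzy RDD sketch: compliance is imperfect, so the threshold indicator
# Z = 1{X >= c} serves as an instrument for actual treatment D (2SLS).
# Names are illustrative; two manual OLS stages recover the point
# estimate but not correct standard errors.

def fuzzy_rdd_tau(y, x, d, c, h):
    mask = np.abs(x - c) <= h
    xc = x[mask] - c
    z = (xc >= 0).astype(float)           # crossing the threshold
    exog = np.column_stack([xc, z * xc])  # trends on each side of the cutoff
    # First stage: predict treatment from the instrument and the trends.
    W1 = sm.add_constant(np.column_stack([z, exog]))
    d_hat = sm.OLS(d[mask], W1).fit().fittedvalues
    # Second stage: regress the outcome on predicted treatment.
    X2 = sm.add_constant(np.column_stack([d_hat, exog]))
    return sm.OLS(y[mask], X2).fit().params[1]
```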
A related approach exploits kinks (changes in slope) rather than discontinuities in the assignment rule; this technique was coined the regression kink design by Nielsen, Sørensen, and Taber (2010), though they cite similar earlier analyses.
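In a regression kink design the estimand is a change in slope rather than in level: the kink in the outcome's conditional expectation at the threshold, scaled by the known kink in the policy formula. A minimal sketch, assuming illustrative variable names and that the policy slope change is known:

```python
import numpy as np
import statsmodels.api as sm

# Regression kink design sketch: estimate the change in the SLOPE of the
# outcome at the kink point and scale it by the known change in slope of
# the policy rule. Names and the bandwidth are illustrative assumptions.

def rkd_estimate(y, x, c, h, policy_slope_change):
    mask = np.abs(x - c) <= h
    xc = x[mask] - c
    above = (xc >= 0).astype(float)
    # Common intercept, separate slopes on each side of the kink.
    X = sm.add_constant(np.column_stack([xc, above * xc]))
    kink_in_outcome = sm.OLS(y[mask], X).fit().params[2]
    return kink_in_outcome / policy_slope_change
```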
RDDs also have important limitations. They are typically employed precisely in settings that leave no room for randomised experiments, so the design cannot be checked against an experimental benchmark. In addition, the credibility of the estimates depends on correctly modelling the relationship between the running variable and the outcome; misspecifying this functional form biases the estimated treatment effect.