Difference in differences (DID[1] or DD[2]) is a statistical technique used in econometrics and quantitative research in the social sciences that attempts to mimic an experimental research design using observational study data, by studying the differential effect of a treatment on a 'treatment group' versus a 'control group' in a natural experiment.
Difference in differences requires data measured from a treatment group and a control group at two or more different time periods, specifically at least one time period before "treatment" and at least one time period after "treatment."
DID, therefore, calculates the "normal" difference in the outcome variable between the two groups (the difference that would still exist if neither group had experienced the treatment) and attributes any additional divergence between the groups after treatment to the effect of the treatment itself.
Consider the model

y_{it} = \gamma_{s(i)} + \lambda_t + \delta D_{it} + \varepsilon_{it},

where y_{it} is the dependent variable for individual i at time t, s(i) is the group to which i belongs (the treatment or the control group), \gamma_{s(i)} is a group fixed effect, \lambda_t is a time fixed effect, D_{it} is a dummy variable equal to 1 if i is treated at time t, \delta is the treatment effect, and \varepsilon_{it} is an error term. Consider the average of the dependent variable and dummy indicators by group and time,

\bar{y}_{st} = \frac{1}{n_s}\sum_{i:\,s(i)=s} y_{it}, \qquad \bar{D}_{st} = \frac{1}{n_s}\sum_{i:\,s(i)=s} D_{it}, \qquad \bar{\varepsilon}_{st} = \frac{1}{n_s}\sum_{i:\,s(i)=s} \varepsilon_{it},

where n_s is the number of individuals in group s, and suppose for simplicity that there are two groups (s = 1, 2) and two time periods (t = 1, 2). Averaging the model within each group-period cell and taking differences, the group effects \gamma_s and the time effects \lambda_t cancel, so that

(\bar{y}_{11} - \bar{y}_{12}) - (\bar{y}_{21} - \bar{y}_{22}) = \delta(\bar{D}_{11} - \bar{D}_{12}) + \delta(\bar{D}_{22} - \bar{D}_{21}) + \bar{\varepsilon}_{11} - \bar{\varepsilon}_{12} + \bar{\varepsilon}_{22} - \bar{\varepsilon}_{21}.

The strict exogeneity assumption then implies that

E[(\bar{y}_{11} - \bar{y}_{12}) - (\bar{y}_{21} - \bar{y}_{22})] = \delta(\bar{D}_{11} - \bar{D}_{12}) + \delta(\bar{D}_{22} - \bar{D}_{21}).

Without loss of generality, assume that \bar{D}_{22} - \bar{D}_{21} = 1 and \bar{D}_{11} - \bar{D}_{12} = 0, i.e., group 2 is exposed to the treatment in the second period and group 1 is never exposed. The right-hand side then equals \delta, and the treatment effect can be estimated by the difference in differences

\hat{\delta} = (\bar{y}_{22} - \bar{y}_{21}) - (\bar{y}_{12} - \bar{y}_{11}).
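For concreteness, the plug-in estimator \hat{\delta} can be computed directly from the four group-by-period averages. The short Python sketch below uses made-up averages purely for illustration.

```python
# Minimal sketch of the 2x2 difference-in-differences plug-in estimator.
# The four averages are made-up numbers for illustration only:
# y_bar[s][t] is the mean outcome of group s (1 = control, 2 = treatment)
# in period t (1 = pre-treatment, 2 = post-treatment).
y_bar = {
    1: {1: 10.0, 2: 11.0},   # control group: rises by 1.0 on its own
    2: {1: 12.0, 2: 15.0},   # treatment group: rises by 3.0
}

# DID estimator: (treated group's change) - (control group's change)
delta_hat = (y_bar[2][2] - y_bar[2][1]) - (y_bar[1][2] - y_bar[1][1])
print(delta_hat)  # 2.0 -> estimated treatment effect
```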
Below it is shown how this estimator can be read as a coefficient in an ordinary least squares regression.
The central identifying requirement is the parallel trend assumption: in the absence of treatment, the difference between the treatment and control groups would have remained constant over time. Given that the formal definition above accurately represents reality, this assumption automatically holds.
In order to increase the likelihood of the parallel trend assumption holding, a difference-in-differences approach is often combined with matching.
By defining the outcome variable as a temporal difference (the change in observed outcomes between the pre- and post-treatment periods) and matching multiple units in a large sample on the basis of similar pre-treatment histories, the resulting average treatment effect on the treated (ATT) provides a robust difference-in-differences estimate of the treatment effect.
This serves two statistical purposes: firstly, conditional on pre-treatment covariates, the parallel trends assumption is likely to hold; and secondly, this approach reduces dependence on associated ignorability assumptions necessary for valid inference.
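A rough sketch of this matching-plus-DID idea is given below. It uses simulated unit-level data with hypothetical column names (treated, y_pre, y_post) and simple nearest-neighbour matching on a single pre-treatment outcome, which is only one of many possible matching schemes.

```python
import numpy as np
import pandas as pd

# Hypothetical unit-level data: one row per unit, with pre- and post-treatment
# outcomes and a treatment indicator. Column names are illustrative only.
rng = np.random.default_rng(0)
n = 200
treated = rng.integers(0, 2, n).astype(bool)
y_pre = rng.normal(10, 2, n) + 2 * treated            # groups differ pre-treatment
y_post = y_pre + rng.normal(1, 1, n) + 1.5 * treated  # simulated effect = 1.5
df = pd.DataFrame({"treated": treated, "y_pre": y_pre, "y_post": y_post})

# Outcome defined as a temporal difference (post minus pre) for each unit.
df["dy"] = df["y_post"] - df["y_pre"]

controls = df[~df["treated"]].reset_index(drop=True)
treated_units = df[df["treated"]].reset_index(drop=True)

# Nearest-neighbour matching on the pre-treatment outcome (with replacement):
# each treated unit is paired with the control whose pre-treatment history
# is most similar, and their changes are compared.
att_terms = []
for _, row in treated_units.iterrows():
    match_idx = (controls["y_pre"] - row["y_pre"]).abs().idxmin()
    att_terms.append(row["dy"] - controls.loc[match_idx, "dy"])

att = float(np.mean(att_terms))
print(f"Matched DID estimate of the ATT: {att:.2f}")  # roughly 1.5 here
```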
To guarantee the accuracy of the DID estimate, the composition of the two groups is assumed to remain unchanged over time.
When using a DID model, various issues that may compromise the results, such as autocorrelation[5] and Ashenfelter dips, must be considered and dealt with.
The DID method can be implemented according to the table below, where the lower-right cell is the DID estimator:

                          Before (t = 1)    After (t = 2)    Change over time
Treatment group (s = 2)   \bar{y}_{21}      \bar{y}_{22}     \bar{y}_{22} - \bar{y}_{21}
Control group (s = 1)     \bar{y}_{11}      \bar{y}_{12}     \bar{y}_{12} - \bar{y}_{11}
Difference                                                   (\bar{y}_{22} - \bar{y}_{21}) - (\bar{y}_{12} - \bar{y}_{11})
Running a regression analysis gives the same result. Consider the OLS model

y = \beta_0 + \beta_1 T + \beta_2 S + \beta_3 (T \cdot S) + \varepsilon,

where T is a dummy variable equal to 1 in the post-treatment period (t = 2), S is a dummy variable equal to 1 for the treatment group (s = 2), and the interaction T \cdot S equals 1 only for treated observations in the post-treatment period. Although it is not shown rigorously here, this is a proper parametrization of the model in the formal definition; furthermore, it turns out that the group and period averages in that section relate to the model parameter estimates as follows:

\hat{\beta}_0 = \bar{y}_{11},
\hat{\beta}_1 = \bar{y}_{12} - \bar{y}_{11},
\hat{\beta}_2 = \bar{y}_{21} - \bar{y}_{11},
\hat{\beta}_3 = (\bar{y}_{22} - \bar{y}_{21}) - (\bar{y}_{12} - \bar{y}_{11}),

so that \hat{\beta}_3 is the DID estimator \hat{\delta}.
The estimate \hat{\beta}_1 can be interpreted as the change experienced by the control group, which, under the parallel trend assumption, is also the intervention's (treatment's) counterfactual, i.e., the change the treatment group would have experienced without treatment. Similarly, \hat{\beta}_2 is the pre-treatment differential between the treatment and control groups and, due to the parallel trend assumption, is also the differential that would separate the two groups in the post-treatment period absent treatment. These descriptions should not be construed to mean that \hat{\beta}_1 reflects only the control group, or that \hat{\beta}_2 reflects only the pre-treatment difference between the treatment and control groups. As in Card and Krueger, below, taking a first (time) difference of the outcome variable (the change between the pre- and post-treatment periods) eliminates the need for the common time trend \hat{\beta}_1 in forming an unbiased estimate of \hat{\beta}_3, and, analogously, differencing between the treatment and control groups eliminates the group-specific term \hat{\beta}_2. This nuance is important when weak violations of the parallel pre-trend assumption are suspected, or when the counterfactual approximation is threatened by non-common shocks or confounding events.
To see the relation between this notation and the previous section, consider, as above, only one observation per time period for each group. Then

\bar{y}_{11} = \hat{\beta}_0 (control group before treatment, T = 0 and S = 0),
\bar{y}_{12} = \hat{\beta}_0 + \hat{\beta}_1 (control group after treatment, T = 1 and S = 0),

and so on for the other values of T and S. Solving these relations for \hat{\beta}_3 gives

\hat{\beta}_3 = (\bar{y}_{22} - \bar{y}_{21}) - (\bar{y}_{12} - \bar{y}_{11}).

But this is the expression for the treatment effect that was given in the formal definition and in the above table.
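The following sketch, on simulated two-period data and using the pandas and statsmodels libraries (one possible tooling choice, not prescribed here), illustrates these equivalences: the coefficient on the interaction term, the double difference of the four cell means, and the comparison of first-differenced outcomes across groups all produce the same estimate.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated balanced two-period panel (illustrative only).
rng = np.random.default_rng(1)
n = 500
S = rng.integers(0, 2, n)                          # 1 = treatment group
y0 = 5 + 2.0 * S + rng.normal(0, 1, n)             # pre-treatment outcome
y1 = 6 + 2.0 * S + 3.0 * S + rng.normal(0, 1, n)   # post-treatment; true effect = 3

# Long format with period dummy T and group dummy S.
long = pd.DataFrame({
    "y": np.concatenate([y0, y1]),
    "T": np.repeat([0, 1], n),
    "S": np.tile(S, 2),
})

# (1) OLS with group, period and interaction dummies: the interaction
# coefficient is the DID estimate.
res = smf.ols("y ~ T + S + T:S", data=long).fit()
beta3 = res.params["T:S"]

# (2) Double difference of the four group-by-period cell means, as in the table.
cell = long.groupby(["S", "T"])["y"].mean()
did_cells = (cell.loc[(1, 1)] - cell.loc[(1, 0)]) - (cell.loc[(0, 1)] - cell.loc[(0, 0)])

# (3) First-difference the outcome and compare groups: the common time trend
# drops out and the same estimate is obtained.
dy = y1 - y0
did_fd = dy[S == 1].mean() - dy[S == 0].mean()

print(f"{beta3:.3f}  {did_cells:.3f}  {did_fd:.3f}")  # all three agree
```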
The Card and Krueger article on minimum wage in New Jersey, published in 1994,[6] is considered one of the most famous DID studies; Card was later awarded the 2021 Nobel Memorial Prize in Economic Sciences in part for this and related work.
Card and Krueger compared employment in the fast food sector in New Jersey and in Pennsylvania, in February 1992 and in November 1992, after New Jersey's minimum wage rose from $4.25 to $5.05 in April 1992.
Observing a change in employment in New Jersey only, before and after the treatment, would fail to control for omitted variables such as weather and macroeconomic conditions of the region.
By including Pennsylvania as a control in a difference-in-differences model, any bias caused by variables common to New Jersey and Pennsylvania is implicitly controlled for, even when these variables are unobserved.
Assuming that New Jersey and Pennsylvania have parallel trends over time, Pennsylvania's change in employment can be interpreted as the change New Jersey would have experienced had it not increased the minimum wage, and vice versa.
The evidence suggested that the increased minimum wage did not induce a decrease in employment in New Jersey, contrary to what some economic theory would suggest.
The table below shows Card and Krueger's estimates of the treatment effect on employment, measured in full-time equivalents (FTEs) per store:

                           Feb 1992 (before)   Nov 1992 (after)   Change
New Jersey (treatment)     20.44               21.03              +0.59
Pennsylvania (control)     23.33               21.17              -2.16
Difference (DID)                                                   2.75
Card and Krueger estimate that the $0.80 minimum wage increase in New Jersey led to an average 2.75 FTE increase in employment per store.
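The headline figure is simply the double difference of the four FTE means; taking the values reported in the table above as given, the arithmetic can be written out as follows.

```python
# Worked arithmetic for the Card and Krueger example, using the FTE employment
# means reported in the table above (per-store full-time equivalents).
nj_before, nj_after = 20.44, 21.03   # New Jersey, Feb 1992 and Nov 1992
pa_before, pa_after = 23.33, 21.17   # Pennsylvania, Feb 1992 and Nov 1992

nj_change = nj_after - nj_before     # +0.59
pa_change = pa_after - pa_before     # -2.16
did = nj_change - pa_change          # 0.59 - (-2.16) = 2.75 FTE per store

print(f"NJ change: {nj_change:+.2f}, PA change: {pa_change:+.2f}, DID: {did:.2f}")
```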
A software implementation of this example is available in the Stata command -diff-,[7] authored by Juan Miguel Villa.