Moderation (statistics)

In statistics, moderation occurs when the relationship between two variables depends on a third variable, called the moderator.[1][2] The effect of a moderating variable is characterized statistically as an interaction;[1] that is, a categorical (e.g., sex, ethnicity, class) or continuous (e.g., age, level of reward) variable that is associated with the direction and/or magnitude of the relation between the dependent and independent variables.

In analysis of variance (ANOVA) terms, a basic moderator effect can be represented as an interaction between a focal independent variable and a factor that specifies the appropriate conditions for its operation.

To quantify the effect of a moderating variable Z in multiple regression analyses, regressing random variable Y on X, an additional interaction term is added to the model:[1]

Y = b0 + b1X + b2Z + b3(X*Z) + ε

Moderation is indicated by a statistically significant coefficient b3, meaning that the effect of X on Y depends on the level of Z.

Because the product term X*Z is typically highly correlated with X and Z themselves, moderated regression is prone to multicollinearity, which tends to cause coefficients to be estimated with higher standard errors and hence greater uncertainty.

Mean-centering (subtracting the sample mean from the raw scores) may reduce multicollinearity, resulting in more interpretable regression coefficients.
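The effect of centering on the predictor–product correlation can be sketched with simulated data (a hypothetical numpy example; the means and standard deviations are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(2.0, 1.0, 500)   # hypothetical predictor with a nonzero mean
z = rng.normal(8.0, 1.0, 500)   # hypothetical moderator with a nonzero mean

# Raw product term: strongly correlated with its component X
r_raw = np.corrcoef(x, x * z)[0, 1]

# Mean-centered product term: the correlation largely disappears
xc, zc = x - x.mean(), z - z.mean()
r_centered = np.corrcoef(xc, xc * zc)[0, 1]

print(round(r_raw, 2), round(r_centered, 2))
```

Note that centering changes the interpretation of the first-order coefficients (they become effects at the mean of the other variable) but leaves the interaction coefficient itself unchanged.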

As with simple main effect analysis in ANOVA, post-hoc probing of interactions in regression examines the simple slope of one independent variable at specific values of the other independent variable.

In what follows, the regression equation with two variables A and B and an interaction term A*B will be considered:

Y = b0 + b1A + b2B + b3(A*B) + ε

For example, suppose that both A and B are single dummy coded (0,1) variables, and that A represents ethnicity (0 = European Americans, 1 = East Asians) and B represents the condition in the study (0 = control, 1 = experimental).

The coefficient of A shows the ethnicity effect on Y for the control condition, while the coefficient of B shows the effect of imposing the experimental condition for European American participants.

To test whether there is a significant difference between European Americans and East Asians in the experimental condition, we can simply run the analysis with the condition variable reverse-coded (0 = experimental, 1 = control), so that the coefficient for ethnicity represents the ethnicity effect on Y in the experimental condition.

In a similar vein, if we want to see whether the treatment has an effect for East Asian participants, we can reverse code the ethnicity variable (0 = East Asians, 1 = European Americans).
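The reverse-coding trick can be sketched with simulated data (a hypothetical numpy example; the variable names and effect sizes are invented for illustration). Refitting the model with B recoded yields the simple effect of A in the experimental condition directly, and the two simple effects differ by exactly the interaction coefficient:

```python
import numpy as np

# Hypothetical 2x2 data: A = ethnicity (0/1), B = condition (0/1)
rng = np.random.default_rng(1)
n = 100
A = rng.integers(0, 2, n)
B = rng.integers(0, 2, n)
Y = 2.0 + 1.0 * A + 0.5 * B + 1.5 * A * B + rng.normal(0, 0.1, n)

def fit(a, b, y):
    """OLS for y = b0 + b1*a + b2*b + b3*(a*b) via least squares."""
    X = np.column_stack([np.ones_like(y), a, b, a * b])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_orig = fit(A, B, Y)      # b1 = effect of A in the control condition (B = 0)
b_flip = fit(A, 1 - B, Y)  # b1 = effect of A in the experimental condition

# The two simple effects differ by exactly the interaction coefficient b3
print(b_orig[1], b_flip[1], b_orig[3])
```

Because reverse-coding is an invertible relabeling of the design, the model fit is identical; only the meaning of the coefficients changes.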

Interactions between a dichotomous and a continuous variable can be probed similarly. Suppose, for example, that A codes gender and B is the score on the Satisfaction With Life Scale (SWLS), centered at the sample mean. When the analysis is run with the centered scores, b1 now represents the difference between males and females at the mean level of the SWLS score of the sample.

Cohen et al. (2003) recommended probing the simple effect of gender on the dependent variable (Y) at three levels of the continuous independent variable: high (one standard deviation above the mean), moderate (at the mean), and low (one standard deviation below the mean).

By re-centering the continuous variable at each of these values in turn and re-estimating the model, one can explore the effects of gender on the dependent variable (Y) at high, moderate, and low levels of the SWLS score.
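A minimal sketch of this re-centering procedure, using simulated data (the variable names and effect sizes are hypothetical):

```python
import numpy as np

# Hypothetical data: gender (0/1) and a continuous SWLS score
rng = np.random.default_rng(2)
n = 200
gender = rng.integers(0, 2, n)
swls = rng.normal(20.0, 5.0, n)
Y = 1.0 + 0.8 * gender + 0.3 * swls + 0.2 * gender * swls + rng.normal(0, 0.5, n)

def gender_effect(moderator):
    """Coefficient of gender after re-centering the moderator."""
    X = np.column_stack([np.ones(n), gender, moderator, gender * moderator])
    return np.linalg.lstsq(X, Y, rcond=None)[0][1]

m, s = swls.mean(), swls.std()
# Re-center SWLS at low (-1 SD), moderate (mean), and high (+1 SD) levels;
# each fit's gender coefficient is the simple effect at that level.
low, mod, high = (gender_effect(swls - v) for v in (m - s, m, m + s))
print(low, mod, high)
```

The three simple effects are equally spaced because each re-centering shifts the gender coefficient by exactly b3 times one standard deviation of the moderator.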

Contrast coding is appropriate when researchers have an a priori hypothesis concerning the specific differences among the group means.

Effects coding is used when there is no reference group or set of orthogonal contrasts; it is appropriate when the groups represent natural categories.
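Effects coding for a three-group variable can be illustrated as follows (a hypothetical numpy sketch with invented group means). With equal group sizes, the intercept estimates the unweighted grand mean and each code's coefficient estimates that group's deviation from it:

```python
import numpy as np

# Hypothetical three-group example, effects coded: group 3 is assigned -1 on
# both codes, so no group serves as a reference.
rng = np.random.default_rng(3)
means = {1: 10.0, 2: 12.0, 3: 17.0}
g = np.repeat([1, 2, 3], 50)
y = np.array([means[k] for k in g]) + rng.normal(0, 0.01, 150)

c1 = np.where(g == 1, 1.0, np.where(g == 3, -1.0, 0.0))
c2 = np.where(g == 2, 1.0, np.where(g == 3, -1.0, 0.0))
X = np.column_stack([np.ones(150), c1, c2])
b0, b1, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

# b0 estimates the grand mean (13); b1 and b2 estimate each group's deviation
# from it (-3 and -1 here; group 3's deviation is -(b1 + b2)).
print(b0, b1, b2)
```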

(Centering involves subtracting the overall sample mean from the original score; standardizing does the same and then divides by the overall sample standard deviation.)

Sometimes this is supplemented by simple slope analysis, which determines whether the effect of X on Y is statistically significant at particular values of Z.

A common technique for simple slope analysis is the Johnson-Neyman approach, which identifies the values of the moderator for which the effect of X on Y is statistically significant.
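The approach can be sketched on simulated data (a hypothetical example; the coefficients and the hard-coded critical t value are illustrative). The simple slope of X at moderator value z0 is b1 + b3*z0; setting its squared t statistic equal to the squared critical value gives a quadratic in z0 whose real roots bound the region of significance:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x = rng.normal(0, 1, n)
z = rng.normal(0, 1, n)
y = 0.1 * x + 0.4 * x * z + rng.normal(0, 1, n)

D = np.column_stack([np.ones(n), x, z, x * z])
b, res, *_ = np.linalg.lstsq(D, y, rcond=None)
sigma2 = res[0] / (n - 4)                # residual variance
V = sigma2 * np.linalg.inv(D.T @ D)      # covariance matrix of the coefficients

tcrit = 1.968  # approximate two-sided .05 critical t for ~296 df
# Simple slope of x at z0 is b[1] + b[3]*z0, with variance
# V[1,1] + 2*z0*V[1,3] + z0^2*V[3,3]. Setting slope^2 = tcrit^2 * variance
# yields the quadratic qa*z0^2 + qb*z0 + qc = 0.
qa = b[3] ** 2 - tcrit ** 2 * V[3, 3]
qb = 2 * (b[1] * b[3] - tcrit ** 2 * V[1, 3])
qc = b[1] ** 2 - tcrit ** 2 * V[1, 1]
roots = np.roots([qa, qb, qc])
print(sorted(roots.real))
```

If both roots are real, they mark the moderator values at which the simple slope crosses the significance threshold; if they are complex, the slope is significant (or nonsignificant) over the whole observed range.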

Various internet-based tools exist to help researchers plot and interpret such two-way interactions.[11]

For instance, if we have a three-way interaction between A, B, and C, the regression equation will be as follows:

Y = b0 + b1A + b2B + b3C + b4(A*B) + b5(A*C) + b6(B*C) + b7(A*B*C) + ε

It is worth noting that the reliability of the higher-order terms depends on the reliability of the lower-order terms.

One solution to this problem is to use highly reliable measures for each independent variable.

Another caveat for interpreting interaction effects is that when variable A and variable B are highly correlated, the A * B term will be highly correlated with the omitted variable A^2; consequently, what appears to be a significant moderation effect might actually be a significant nonlinear effect of A alone.

If this is the case, it is worth testing a nonlinear regression model by adding nonlinear terms in individual variables into the moderated regression analysis to see if the interactions remain significant.
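This confound can be demonstrated with simulated data (a hypothetical sketch; the effect sizes are invented). When Y depends only on A^2 but A and B are highly correlated, a model omitting A^2 shows a spuriously large t statistic for A*B, which shrinks once A^2 is added:

```python
import numpy as np

# Data generated with a pure quadratic effect of A and no true interaction
rng = np.random.default_rng(5)
n = 400
a = rng.normal(0, 1, n)
b_var = 0.9 * a + np.sqrt(1 - 0.81) * rng.normal(0, 1, n)  # B correlated ~.9 with A
y = 1.0 + 0.5 * a ** 2 + rng.normal(0, 1, n)               # true effect: A^2 only

def tvals(cols):
    """OLS t statistics for a model with the given predictor columns."""
    X = np.column_stack([np.ones(n)] + cols)
    beta, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    V = (res[0] / (n - X.shape[1])) * np.linalg.inv(X.T @ X)
    return beta / np.sqrt(np.diag(V))

# Without A^2, the A*B term looks "significant" (it proxies for A^2)...
t_int = tvals([a, b_var, a * b_var])[3]
# ...but adding A^2 to the model absorbs the effect.
t_int_adj = tvals([a, b_var, a ** 2, a * b_var])[4]
print(t_int, t_int_adj)
```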

Moderated regression analyses also tend to include additional variables, which are conceptualized as covariates of no interest.

Figures (captions only):
- Conceptual diagram of a simple moderation model in which the effect of the focal antecedent (X) on the outcome (Y) depends on a moderator (W).
- Statistical diagram of a simple moderation model.
- Statistical diagram of a moderation model with X as a multicategorical independent variable.
- Conceptual moderation model with one categorical and one continuous independent variable.
- Statistical diagram of a moderation model with W, a multicategorical independent variable with three levels.
- Effects coding example.
- Conceptual diagram of an additive multiple moderation model.
- Example of a two-way interaction effect plot.
- Conceptual diagram of a moderated moderation model, otherwise known as a three-way interaction.