Influential observation

In statistics, an influential observation is an observation for a statistical calculation whose deletion from the dataset would noticeably change the result of the calculation.[1] In particular, in regression analysis an influential observation is one whose deletion has a large effect on the parameter estimates.
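As a minimal sketch of this definition, the following Python snippet refits an ordinary least squares line with each observation deleted in turn and records how far the estimates move. The data and the helper name `deletion_effects` are made up for illustration and do not come from the cited sources.

```python
import numpy as np

def deletion_effects(X, y):
    """Return the full-data OLS estimates and, for each row i,
    the change in the estimates when row i is deleted."""
    n, k = X.shape
    b_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    changes = np.empty((n, k))
    for i in range(n):
        keep = np.arange(n) != i  # boolean mask that drops row i
        b_minus_i, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        changes[i] = b_full - b_minus_i
    return b_full, changes

# Illustrative (made-up) data: five points near y = x plus one far-off point.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 10.0])
y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 2.0])
X = np.column_stack([np.ones_like(x), x])  # constant column + one regressor

b_full, changes = deletion_effects(X, y)

# Index of the observation whose deletion changes the slope the most.
most_influential = int(np.argmax(np.abs(changes[:, 1])))
print("full-data estimates:", b_full)
print("most influential observation:", most_influential)
```

Refitting n times is the most direct reading of the definition. For larger problems, closed-form deletion formulas avoid the repeated fits.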

Consider a regression with an n×k design matrix of explanatory variables (including a constant) and a k×1 vector of estimates of some population parameter.

Measures of influence such as DFBETA, DFFITS, and Cook's distance are defined in terms of these quantities.

An outlier may be defined as a data point that differs markedly from other observations.[6][7] A high-leverage point is an observation made at an extreme value of the independent variables.[8] Both types of atypical observations will force the regression line to be close to the point.[2] In Anscombe's quartet, the bottom right image has a point with high leverage and the bottom left image has an outlying point.
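The difference between leverage and influence can be made concrete with a short sketch. The code below computes the leverages (the diagonal of the projection matrix formed from the design matrix) and Cook's distance for a small made-up dataset. The helper name `leverage_and_cooks` and the data are illustrative assumptions. The formulas are the standard textbook ones and are not taken from the passage above.

```python
import numpy as np

def leverage_and_cooks(X, y):
    """Leverages and Cook's distances for an OLS fit of y on X."""
    n, k = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T  # projection ("hat") matrix
    h = np.diag(H)                        # leverage of each observation
    resid = y - H @ y                     # ordinary residuals
    s2 = resid @ resid / (n - k)          # residual variance estimate
    cooks = resid**2 * h / (k * s2 * (1.0 - h) ** 2)
    return h, cooks

# Made-up data: index 2 is off the trend at a typical x value,
# index 5 sits at an extreme x value.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 20.0])
y = np.array([1.0, 2.1, 5.0, 3.9, 5.1, 20.0])
X = np.column_stack([np.ones_like(x), x])

h, cooks = leverage_and_cooks(X, y)
for i in range(len(x)):
    print(f"{i}: leverage={h[i]:.3f}, Cook's D={cooks[i]:.3f}")
```

Comparing the two columns shows whether a high-leverage point is also influential. A point with high leverage that lies close to the trend of the other points can still have a small Cook's distance.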

In Anscombe's quartet, the two datasets on the bottom both contain influential points. All four sets are nearly identical when examined using simple summary statistics, but vary considerably when graphed. If the influential point were removed, the fitted line would look very different.
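The effect of removing a point can be checked numerically. The sketch below fits a least squares line to the third and fourth datasets of Anscombe's quartet (assumed here to be the bottom-left and bottom-right panels), then refits the third dataset without its outlying point. The data values are the commonly reproduced ones from Anscombe (1973), typed in by hand, so treat them as an assumption to verify against a trusted copy.

```python
import numpy as np

# Anscombe's quartet, datasets III and IV (values assumed from the 1973 paper).
x3 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
y3 = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float)
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])

def fit_line(x, y):
    """Return (intercept, slope) of the least squares line."""
    slope, intercept = np.polyfit(x, y, deg=1)
    return intercept, slope

full3 = fit_line(x3, y3)
full4 = fit_line(x4, y4)

# Dataset III: drop the point with the largest absolute residual and refit.
resid3 = y3 - (full3[0] + full3[1] * x3)
outlier = int(np.argmax(np.abs(resid3)))
keep = np.arange(len(x3)) != outlier
reduced3 = fit_line(x3[keep], y3[keep])

# Dataset IV: without the extreme-x point every remaining x is the same,
# so a slope can no longer be estimated at all.
x4_rest = np.delete(x4, int(np.argmax(x4)))
slope_identified = bool(x4_rest.max() > x4_rest.min())

print("III full:   ", full3)
print("III reduced:", reduced3)
print("IV full:    ", full4)
print("IV slope identified without the extreme point:", slope_identified)
```

In the fourth dataset the single point at the extreme x value determines the slope on its own, which is why deleting it leaves no line to fit.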