Average absolute deviation

The average absolute deviation (AAD) of a data set is the average of the absolute deviations from a central point.

In the general form, the central point can be a mean, median, mode, or the result of any other measure of central tendency or any reference value related to the given data set.
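
The general form can be sketched in a few lines of Python. This is only an illustration: the function name `average_absolute_deviation` and the data values are assumptions made here, not taken from a cited source. It shows how the result changes with the central point.

```python
from statistics import mean, median, mode

def average_absolute_deviation(data, center):
    """Average of |x - center| over the data set."""
    return sum(abs(x - center) for x in data) / len(data)

data = [2, 2, 3, 4, 14]  # illustrative values only

aad_mean = average_absolute_deviation(data, mean(data))      # central point 5
aad_median = average_absolute_deviation(data, median(data))  # central point 3
aad_mode = average_absolute_deviation(data, mode(data))      # central point 2

print(aad_mean, aad_median, aad_mode)  # 3.6 2.8 3.0
```

The same data set gives three different values, so a reported deviation should always state the central point it was computed around.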

Several measures of statistical dispersion are defined in terms of the absolute deviation.

The statistical literature has not yet adopted a standard notation: both the mean absolute deviation around the mean and the median absolute deviation around the median have been denoted by the initials "MAD". This may lead to confusion, since the two generally have considerably different values.

The choice of measure of central tendency has a marked effect on the value of the average absolute deviation.

"Average absolute deviation" can refer either to the mean absolute deviation around the mean or to the general form with respect to a specified central point (see above).

MAD has been proposed as a replacement for the standard deviation since it corresponds better to real life.[1]

Because the MAD is a simpler measure of variability than the standard deviation, it can be useful in school teaching.[2][3]

As a measure of forecast accuracy, MAD is very closely related to the mean squared error (MSE), which is just the average squared error of the forecasts.

Although these methods are very closely related, MAD is more commonly used because it is both easier to compute (avoiding the need for squaring)[4] and easier to understand.
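
To make the comparison concrete, here is a minimal sketch that computes both measures for a small set of forecasts. The demand figures and the function names are hypothetical and chosen only for illustration.

```python
def forecast_mad(actual, forecast):
    """Mean absolute deviation of the forecast errors."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(abs(e) for e in errors) / len(errors)

def forecast_mse(actual, forecast):
    """Mean squared error of the forecasts."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return sum(e * e for e in errors) / len(errors)

actual = [100, 110, 120, 130]    # hypothetical observed values
forecast = [98, 113, 119, 135]   # hypothetical forecasts

mad = forecast_mad(actual, forecast)  # errors 2, -3, 1, -5 -> 11 / 4
mse = forecast_mse(actual, forecast)  # squares 4, 9, 1, 25 -> 39 / 4

print(mad, mse)  # 2.75 9.75
```

MAD stays in the units of the data, while MSE is in squared units and weights the single large error (5) much more heavily.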

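For a normal distribution, the mean absolute deviation is about 0.8 times the standard deviation. Thus if X is a normally distributed random variable with expected value 0 then, see Geary (1935):[6]

$$ w = \frac{E|X|}{\sqrt{E(X^2)}} = \sqrt{\frac{2}{\pi}} $$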

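However, in-sample measurements deliver values of the ratio $w_n$ of mean absolute deviation to standard deviation, for a given Gaussian sample of size n, within the following bounds:

$$ w_n \in [0, 1], $$

with a bias for small n.[7]

The mean absolute deviation from the mean is less than or equal to the standard deviation; one way of proving this relies on Jensen's inequality.

Jensen's inequality is $\varphi(E[Y]) \le E[\varphi(Y)]$, where φ is a convex function. Applied to $Y = |X - \mu|$ with $\varphi(y) = y^2$, this implies

$$ (E|X - \mu|)^2 \le E(|X - \mu|^2) = \operatorname{Var}(X). $$

Since both sides are non-negative and the square root is monotonically increasing on the non-negative reals,

$$ E|X - \mu| \le \sqrt{\operatorname{Var}(X)}. $$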

For a general case of this statement, see Hölder's inequality.
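
The relationship between the two measures can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a seeded pseudo-random Gaussian sample (seed and sample size are arbitrary choices), the population form of the standard deviation, and a helper name `mad_to_sd_ratio` invented here. It compares the sample ratio with the theoretical Gaussian value $\sqrt{2/\pi}$ and checks that the ratio does not exceed 1 for a strongly skewed sample.

```python
import math
import random
from statistics import fmean, pstdev

def mad_to_sd_ratio(sample):
    """Ratio of mean absolute deviation (about the mean) to standard deviation."""
    m = fmean(sample)
    mad = fmean([abs(x - m) for x in sample])
    return mad / pstdev(sample)  # population standard deviation (divides by n)

rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability
gaussian = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

ratio = mad_to_sd_ratio(gaussian)
target = math.sqrt(2 / math.pi)  # theoretical value, about 0.7979
print(ratio, target)

# The inequality MAD <= SD holds for any sample, not only Gaussian ones.
skewed = [1, 2, 3, 4, 100]
skewed_ratio = mad_to_sd_ratio(skewed)
print(skewed_ratio)
```

For a large Gaussian sample the ratio should land close to the theoretical value, while smaller samples scatter more widely inside the interval from 0 to 1.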

The mean absolute deviation around the median is the maximum likelihood estimator of the scale parameter b of the Laplace distribution.

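Since the median minimizes the average absolute distance, we have

$$ E|X - \operatorname{median}(X)| \le E|X - \mu|, $$

where μ is the mean of X. That is, the mean absolute deviation from the median is less than or equal to the mean absolute deviation from the mean.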

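By using the general dispersion function, Habib (2011) defined MAD about median as

$$ D_{\text{med}} = E|X - \operatorname{median}(X)| = 2\operatorname{Cov}(X, I_O), $$

where the indicator function $I_O$ equals 1 if $x > \operatorname{median}(X)$ and 0 otherwise.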

This representation allows for obtaining MAD median correlation coefficients.[citation needed]

While in principle the mean or any other central point could be taken as the central point for the median absolute deviation, most often the median value is taken instead.

For a symmetric distribution, the median absolute deviation is equal to half the interquartile range.
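
A short sketch can illustrate both the sample statistic and the half-interquartile-range property. The helper name `median_absolute_deviation` and the data values are assumptions made for this example. The standard normal distribution stands in for "a symmetric distribution", using Python's `statistics.NormalDist`.

```python
from statistics import NormalDist, median

def median_absolute_deviation(data):
    """Median of the absolute deviations from the data's median."""
    med = median(data)
    return median([abs(x - med) for x in data])

data = [1, 1, 2, 2, 4, 6, 9]  # illustrative values; the median is 2
sample_mad = median_absolute_deviation(data)
print(sample_mad)  # 1

# For a symmetric distribution, half the interquartile range should cover
# exactly half of the probability mass around the median.
nd = NormalDist(mu=0.0, sigma=1.0)
half_iqr = (nd.inv_cdf(0.75) - nd.inv_cdf(0.25)) / 2
coverage = nd.cdf(half_iqr) - nd.cdf(-half_iqr)
print(half_iqr, coverage)
```

A coverage of one half means `half_iqr` is the median of |X - median(X)|, which is the median absolute deviation of the distribution.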

The maximum absolute deviation around an arbitrary point is the maximum of the absolute deviations of a sample from that point.

While not strictly a measure of central tendency, the maximum absolute deviation can be found using the formula for the average absolute deviation as above with

The measures of statistical dispersion derived from absolute deviation characterize various measures of central tendency as minimizing dispersion: The median is the measure of central tendency most associated with the absolute deviation.
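
The minimizing property of the median can be illustrated with a brute-force search. This is a sketch under assumptions: the data values, the candidate grid, and the function name are chosen here purely for illustration, and a grid search is not a proof.

```python
from statistics import mean, median

def average_absolute_deviation(data, center):
    """Average of |x - center| over the data set."""
    return sum(abs(x - center) for x in data) / len(data)

data = [2, 2, 3, 4, 14]  # illustrative values only

# Candidate central points 0.0, 0.1, ..., 15.0
candidates = [c / 10 for c in range(0, 151)]
best = min(candidates, key=lambda c: average_absolute_deviation(data, c))

about_median = average_absolute_deviation(data, median(data))
about_mean = average_absolute_deviation(data, mean(data))

print(best, about_median, about_mean)  # 3.0 2.8 3.6
```

The grid search lands on the median (3), and the deviation about the median is smaller than the deviation about the mean.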

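Some location parameters can be compared as follows: the mean minimizes the mean squared error (L2 norm statistics), the median minimizes the average absolute deviation (L1 norm statistics), and the mid-range minimizes the maximum absolute deviation (L∞ norm statistics).

The mean absolute deviation of a sample is a biased estimator of the mean absolute deviation of the population.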

In order for the absolute deviation to be an unbiased estimator, the expected value (average) of all the sample absolute deviations must equal the population absolute deviation.

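For the population 1, 2, 3, both the population absolute deviation about the median and the population absolute deviation about the mean are 2/3. The average of all the sample absolute deviations about the mean, over all samples of size 3 that can be drawn with replacement from the population, is 44/81, while the average of all the sample absolute deviations about the median is 4/9.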

Therefore, the absolute deviation is a biased estimator.
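
The figures 44/81 and 4/9 can be reproduced by exhaustive enumeration. The sketch below assumes the population is {1, 2, 3} and that samples of size 3 are drawn with replacement, giving 27 equally likely ordered samples. The helper name `abs_dev` is invented for this example. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product
from statistics import median

population = [1, 2, 3]  # assumed population

def abs_dev(values, center):
    """Exact average absolute deviation of values about center."""
    return sum(abs(Fraction(x) - center) for x in values) / len(values)

# All ordered samples of size 3 drawn with replacement
samples = list(product(population, repeat=3))

avg_about_mean = sum(
    abs_dev(s, Fraction(sum(s), len(s))) for s in samples
) / len(samples)
avg_about_median = sum(abs_dev(s, median(s)) for s in samples) / len(samples)

# The population mean and median are both 2
population_dev = abs_dev(population, 2)

print(avg_about_mean, avg_about_median, population_dev)  # 44/81 4/9 2/3
```

Both sample averages fall below the population value of 2/3, which is the bias the text describes.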

However, this argument is based on the notion of mean-unbiasedness.

Each measure of location has its own form of unbiasedness (see entry on biased estimator).