Exponential smoothing is one of many window functions commonly applied to smooth data in signal processing, acting as a low-pass filter to remove high-frequency noise.
This method is preceded by Poisson's use of recursive exponential window functions in convolutions from the 19th century, as well as Kolmogorov and Zurbenko's use of recursive moving averages from their studies of turbulence in the 1940s.
The use of the exponential window function is first attributed to Poisson[2] as an extension of a numerical analysis technique from the 17th century, and later adopted by the signal processing community in the 1940s.
Exponential smoothing was first suggested in the statistical literature without citation to previous work by Robert Goodell Brown in 1956,[3] and then expanded by Charles C. Holt in 1957.
Values of the smoothing factor α close to 1 have less of a smoothing effect and give greater weight to recent changes in the data, while values of α closer to 0 have a greater smoothing effect and are less responsive to recent changes.[6] Unlike some other smoothing methods, such as the simple moving average, this technique does not require a minimum number of observations before it begins to produce results.
Technically, it can also be classified as an autoregressive integrated moving average ARIMA(0,1,1) model with no constant term.
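As a minimal sketch of the technique (the function name and the s[0] = x[0] initialization are illustrative assumptions; conventions vary):

```python
def simple_exponential_smoothing(x, alpha):
    """Apply s[t] = alpha * x[t] + (1 - alpha) * s[t-1], with s[0] = x[0].

    x is a sequence of observations; 0 < alpha <= 1 is the smoothing factor.
    """
    s = [x[0]]
    for t in range(1, len(x)):
        s.append(alpha * x[t] + (1 - alpha) * s[t - 1])
    return s
```

With alpha near 1 the output tracks the input closely; with alpha near 0 it reacts slowly, matching the behavior described above.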
Exponential smoothing puts substantial weight on past observations, so the initial value of demand will have an unreasonably large effect on early forecasts.
However, a more robust and objective way to obtain values of the unknown parameters included in any exponential smoothing method is to estimate them from the observed data.
The unknown parameters and the initial values for any exponential smoothing method can be estimated by minimizing the sum of squared errors (SSE).
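One way to sketch this estimation is to treat the smoothed value at time t − 1 as the one-step-ahead forecast of x[t] and search over α; the helper names below are hypothetical, and a numerical optimizer would normally replace the plain grid search:

```python
def sse_for_alpha(x, alpha):
    # The smoothed value at time t-1 serves as the one-step-ahead
    # forecast of x[t]; accumulate the squared forecast errors.
    s, sse = x[0], 0.0
    for t in range(1, len(x)):
        sse += (x[t] - s) ** 2
        s = alpha * x[t] + (1 - alpha) * s
    return sse

def fit_alpha(x, steps=100):
    # Evaluate alpha on a grid over (0, 1] and keep the minimizer.
    candidates = [(i + 1) / steps for i in range(steps)]
    return min(candidates, key=lambda a: sse_for_alpha(x, a))
```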
The name 'exponential smoothing' is attributed to the use of the exponential window function during convolution.
The weights assigned to previous observations are proportional to the terms of the geometric progression 1, (1 − α), (1 − α)², (1 − α)³, …. A geometric progression is the discrete version of an exponential function, so this is where the name for this smoothing method originated, according to statistics lore.
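Unrolling the recurrence makes these weights explicit (a sketch assuming the standard form s_t = αx_t + (1 − α)s_{t−1} with initial value s_0):

```latex
s_t = \alpha x_t + (1-\alpha)\, s_{t-1}
    = \alpha \sum_{i=0}^{t-1} (1-\alpha)^{i} \, x_{t-i} + (1-\alpha)^{t} s_0
```

Each observation i steps in the past is thus down-weighted by the factor (1 − α)^i.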
Exponential smoothing and the moving average share the defect of introducing a lag relative to the input data. While this can be corrected for a symmetric kernel, such as a moving average or Gaussian, by shifting the result by half the window length, it is unclear how appropriate this would be for exponential smoothing.
The two methods also have roughly the same distribution of forecast error when α = 2/(k + 1), where k is the number of past data points used by the moving average. They differ computationally in that the moving average requires the past k data points (or the data point at lag k + 1 plus the most recent forecast value) to be kept, whereas exponential smoothing only needs the most recent forecast value to be kept.[11]
In the signal processing literature, the use of non-causal (symmetric) filters is commonplace, and the exponential window function is broadly used in this fashion, but a different terminology is used: exponential smoothing is equivalent to a first-order infinite impulse response (IIR) filter, and the moving average is equivalent to a finite impulse response (FIR) filter with equal weighting factors.
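A sketch of this computational difference, using hypothetical streaming updaters (the class names are illustrative):

```python
from collections import deque

class MovingAverage:
    """Must retain the last k observations (FIR-like state)."""
    def __init__(self, k):
        self.window = deque(maxlen=k)

    def update(self, x):
        self.window.append(x)
        return sum(self.window) / len(self.window)

class ExponentialSmoother:
    """Retains only the most recent smoothed value (IIR-like state)."""
    def __init__(self, alpha):
        self.alpha = alpha
        self.s = None

    def update(self, x):
        self.s = x if self.s is None else self.alpha * x + (1 - self.alpha) * self.s
        return self.s
```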
The basic idea behind double exponential smoothing is to introduce a term that takes into account the possibility of the series exhibiting some form of trend.
One method works as follows:[12] Again, the raw data sequence of observations is represented by {x_t}, beginning at time t = 0. The smoothed level {s_t} and trend estimate {b_t} are updated by the recurrences s_t = αx_t + (1 − α)(s_{t−1} + b_{t−1}) and b_t = β(s_t − s_{t−1}) + (1 − β)b_{t−1}, where 0 ≤ α ≤ 1 is the data smoothing factor and 0 ≤ β ≤ 1 is the trend smoothing factor; the m-step-ahead forecast is F_{t+m} = s_t + mb_t. Note that F_0 is undefined (there is no estimate for time 0); by this definition F_1 = s_0 + b_0, which is well defined, so further values can be evaluated.
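A sketch of these recurrences, assuming the common initialization s_0 = x_0 and b_0 = x_1 − x_0 (other initializations are possible):

```python
def double_exponential_smoothing(x, alpha, beta):
    """Holt's linear recurrences; returns the level and trend sequences.

    Requires at least two observations for the b_0 = x[1] - x[0] start.
    """
    s, b = [x[0]], [x[1] - x[0]]
    for t in range(1, len(x)):
        s.append(alpha * x[t] + (1 - alpha) * (s[t - 1] + b[t - 1]))
        b.append(beta * (s[t] - s[t - 1]) + (1 - beta) * b[t - 1])
    return s, b

def forecast(s, b, m):
    # m-step-ahead forecast from the end of the data: F = s_t + m * b_t.
    return s[-1] + m * b[-1]
```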
Seasonality can be either 'multiplicative' or 'additive' in nature. If every December we sell 10,000 more apartments than we do in November, the seasonality is additive. If instead we sell 10% more apartments in December than in November, the seasonality is multiplicative: it is represented by a constant factor rather than an absolute amount.
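In symbols, writing T_t for the trend component and C_t for the seasonal component (this notation is assumed here for illustration), the two forms model the series roughly as:

```latex
\text{additive:}\quad x_t \approx T_t + C_t
\qquad\qquad
\text{multiplicative:}\quad x_t \approx T_t \times C_t
```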
Holt's novel idea was to repeat filtering an odd number of times greater than 1 and less than 5, which was popular with scholars of previous eras.[15] While recursive filtering had been used previously, it was applied twice and four times to coincide with the Hadamard conjecture, while triple application required more than double the operations of singular convolution.[15] The use of a triple application is considered a rule-of-thumb technique rather than one based on theoretical foundations, and has often been over-emphasized by practitioners.
The method calculates a trend line for the data as well as seasonal indices that weight the values in the trend line based on where that time point falls in the cycle of length L. Here {s_t} represents the smoothed level of the series, {b_t} is the sequence of best estimates of the linear trend that are superimposed on the seasonal changes, and {c_t} is the sequence of seasonal correction factors. A minimum of two full seasons (2L periods) of historical data is needed to initialize a set of seasonal factors.
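As a sketch, one common additive Holt-Winters formulation can be written as follows; the initialization shown is a simple choice among several used in practice, and the smoothing factors alpha, beta, gamma must be supplied:

```python
def triple_exponential_smoothing(x, L, alpha, beta, gamma):
    """One additive Holt-Winters variant (a sketch; initialization
    schemes vary in practice). Requires at least 2 * L observations."""
    # Level from the first-season mean; trend from the difference
    # between the first two season means; crude initial seasonal indices.
    season1 = sum(x[:L]) / L
    season2 = sum(x[L:2 * L]) / L
    s, b = season1, (season2 - season1) / L
    c = [x[i] - season1 for i in range(L)]
    out = []
    for t in range(L, len(x)):
        s_prev = s
        s = alpha * (x[t] - c[t % L]) + (1 - alpha) * (s + b)
        b = beta * (s - s_prev) + (1 - beta) * b
        c[t % L] = gamma * (x[t] - s) + (1 - gamma) * c[t % L]
        out.append(s + b + c[(t + 1) % L])  # one-step-ahead forecast
    return out
```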