Autoregressive integrated moving average

In time series analysis used in statistics and econometrics, autoregressive integrated moving average (ARIMA) and seasonal ARIMA (SARIMA) models are generalizations of the autoregressive moving average (ARMA) model to non-stationary series and periodic variation, respectively.

All these models are fitted to time series data in order to better understand the series and to predict future values.

Specifically, ARMA assumes that the series is stationary, that is, its expected value is constant in time. If instead the series has a trend (but a constant variance), the trend is removed by "differencing", leaving a stationary series. This operation generalizes ARMA and corresponds to the "integrated" part of ARIMA.

Analogously, periodic variation is removed by "seasonal differencing".[2]

As in ARMA, the "autoregressive" (AR) part of ARIMA indicates that the evolving variable of interest is regressed on its prior values.

The "moving average" (MA) part indicates that the regression error is a linear combination of error terms whose values occurred contemporaneously and at various times in the past.

According to Wold's decomposition theorem,[4][5][6] the ARMA model is sufficient to describe a regular (a.k.a. purely nondeterministic[6]) wide-sense stationary time series, so we are motivated to make such a non-stationary time series stationary, e.g., by differencing, before we can use ARMA.

If the time series contains a predictable sub-process (a.k.a. a pure sine or complex-valued exponential process[5]), the predictable component is treated as a non-zero-mean but periodic (i.e., seasonal) component in the ARIMA framework, so that it is eliminated by seasonal differencing.

Non-seasonal ARIMA models are usually denoted ARIMA(p, d, q) where parameters p, d, q are non-negative integers: p is the order (number of time lags) of the autoregressive model, d is the degree of differencing (the number of times the data have had past values subtracted), and q is the order of the moving-average model.
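For concreteness, here is a minimal sketch of specifying and fitting such a model with the statsmodels Python library; the synthetic random-walk series and the order (1, 1, 1) are illustrative choices.

```python
# Minimal sketch: specifying and fitting an ARIMA(p, d, q) model with
# statsmodels. The series and the order (1, 1, 1) are illustrative.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))  # synthetic non-stationary series

# p = 1 AR lag, d = 1 degree of differencing, q = 1 MA lag
model = ARIMA(y, order=(1, 1, 1))
result = model.fit()
print(result.summary())
```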

Given time series data X_t, where t is an integer index and the X_t are real numbers, an ARMA(p′, q) model is given by

$$X_t - \alpha_1 X_{t-1} - \cdots - \alpha_{p'} X_{t-p'} = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q},$$

or equivalently, in terms of the lag operator L,

$$\left(1 - \sum_{i=1}^{p'} \alpha_i L^i\right) X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t,$$

where the α_i are the parameters of the autoregressive part, the θ_i are the parameters of the moving-average part, and the ε_t are error terms, generally assumed to be independent, identically distributed samples from a normal distribution with zero mean. Assume now that the autoregressive polynomial has a unit root (a factor 1 − L) of multiplicity d; then it can be rewritten as:

$$\left(1 - \sum_{i=1}^{p'} \alpha_i L^i\right) = \left(1 - \sum_{i=1}^{p'-d} \varphi_i L^i\right)(1 - L)^d.$$

An ARIMA(p, d, q) process expresses this polynomial factorisation property with p = p′ − d, and is given by:

$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right)(1 - L)^d X_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t,$$

and so is a special case of an ARMA(p + d, q) process whose autoregressive polynomial has d unit roots.
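The factorisation can be checked numerically. The following sketch (using numpy, with an assumed AR coefficient φ₁ = 0.5) expands the AR polynomial of an ARIMA(1, 1, 0) model and confirms that it is an ARMA(2, 0) polynomial with one unit root.

```python
# Minimal sketch: expand the AR polynomial of an ARIMA(1, 1, 0) model to
# show it is an ARMA(2, 0) polynomial with one unit root. Coefficients
# are stored in increasing powers of the lag operator L; the AR
# coefficient phi = 0.5 is an illustrative assumption.
import numpy as np

phi = 0.5
ar_poly = [1.0, -phi]          # 1 - phi*L   (stationary AR(1) factor)
unit_root = [1.0, -1.0]        # 1 - L       (d = 1 differencing factor)

full_poly = np.polymul(ar_poly, unit_root)
print(full_poly)               # [ 1.  -1.5  0.5], i.e. 1 - 1.5L + 0.5L^2

# Roots in L of the expanded polynomial: L = 1 (the unit root) and
# L = 1/phi = 2 (the stationary root outside the unit circle).
print(np.roots(full_poly[::-1]))
```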

(This is why no process that is accurately described by an ARIMA model with d > 0 is wide-sense stationary.)

Such factorisations can include other special factors. For example, a factor (1 − L^s) incorporates a non-stationary seasonality of period s, while a factor of the form 1 − 2cos(2π/s)L + L², whose roots lie on the unit circle at the seasonal frequency, incorporates a non-stationary sinusoidal seasonality. The effect of the first type of factor is to allow each season's value to drift separately over time, whereas with the second type values for adjacent seasons move together.

Identification and specification of appropriate factors in an ARIMA model can be an important step in modeling, as it can allow a reduction in the overall number of parameters to be estimated, while allowing the imposition on the model of types of behavior that logic and experience suggest should be there.

Differencing in statistics is a transformation applied to a non-stationary time series in order to make it stationary in the mean sense (that is, to remove the non-constant trend); it does not, however, remove non-stationarity in the variance or autocovariance.

From the perspective of signal processing, especially the Fourier spectral analysis theory, the trend is a low-frequency part in the spectrum of a series, while the season is a periodic-frequency part.

Therefore, differencing acts as a high-pass (that is, low-stop) filter and seasonal differencing as a comb filter, suppressing the low-frequency trend and the periodic-frequency season, respectively, in the spectral domain (rather than directly in the time domain).
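This filtering interpretation can be illustrated with scipy's freqz; the season length s = 12 below is an assumed value for monthly data.

```python
# Minimal sketch of the filtering interpretation: the first difference
# (1 - L) is a high-pass filter and the seasonal difference (1 - L^s) is
# a comb filter. scipy.signal.freqz evaluates the frequency responses;
# s = 12 is an illustrative season length.
import numpy as np
from scipy.signal import freqz

s = 12
b_diff = [1.0, -1.0]                    # 1 - L
b_seas = np.zeros(s + 1)
b_seas[0], b_seas[s] = 1.0, -1.0        # 1 - L^s

freqs = np.array([0.0, 2 * np.pi / s, np.pi])  # DC, seasonal freq, Nyquist
_, h_diff = freqz(b_diff, worN=freqs)
_, h_seas = freqz(b_seas, worN=freqs)

print(np.abs(h_diff))  # ~0 at DC, ~2 at Nyquist: blocks trend, passes high
print(np.abs(h_seas))  # ~0 at DC and at the seasonal frequency: comb nulls
```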

Mathematically, this is shown as

$$y'_t = y_t - y_{t-1}.$$

It may be necessary to difference the data a second time to obtain a stationary time series, which is referred to as second-order differencing:

$$y''_t = y'_t - y'_{t-1} = (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2y_{t-1} + y_{t-2}.$$

Seasonal differencing involves computing the difference between an observation and the corresponding observation in the previous season, e.g., a year. This is shown as:

$$y'_t = y_t - y_{t-m}, \quad \text{where } m = \text{duration of season}.$$

The differenced data are then used for the estimation of an ARMA model.
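As a minimal sketch, the three kinds of differencing can be computed with pandas; the synthetic series and season length m = 12 are illustrative.

```python
# Minimal sketch of first, second-order, and seasonal differencing with
# pandas. The synthetic trending series and m = 12 are illustrative.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
y = pd.Series(np.cumsum(rng.normal(size=120)))  # synthetic trending series

first_diff = y.diff()           # y'_t  = y_t - y_{t-1}
second_diff = y.diff().diff()   # y''_t = y_t - 2*y_{t-1} + y_{t-2}
seasonal_diff = y.diff(12)      # y_t - y_{t-12}, season length m = 12
```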

Some well-known special cases arise naturally or are mathematically equivalent to other popular forecasting models. For example, an ARIMA(0, 1, 0) model is a random walk, and an ARIMA(0, 1, 1) model without a constant is equivalent to simple exponential smoothing.

The corrected AIC for ARIMA models can be written as

$$\text{AICc} = \text{AIC} + \frac{2(p + q + k + 1)(p + q + k + 2)}{T - p - q - k - 2},$$

where AIC = −2 log(L) + 2(p + q + k + 1) is the Akaike information criterion, L is the likelihood of the data, T is the number of observations, and k = 1 if the model includes a non-zero constant (k = 0 otherwise). The Bayesian information criterion (BIC) can be written as

$$\text{BIC} = \text{AIC} + (\log T - 2)(p + q + k + 1).$$

The objective is to minimize the AIC, AICc or BIC values for a good model.

While the AIC aims to select the model that best approximates the true data-generating process, the BIC aims to identify the true model itself, i.e., a perfect fit.

The BIC approach is often criticized as there never is a perfect fit to real-life complex data; however, it is still a useful method for selection as it penalizes models more heavily for having more parameters than the AIC would.

AICc can only be used to compare ARIMA models with the same orders of differencing.

For ARIMA models with different orders of differencing, RMSE can be used for model comparison.
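A common way to apply these criteria is a small grid search over candidate orders with a fixed d. The sketch below uses the aicc attribute of statsmodels results; the synthetic series and candidate grid are illustrative.

```python
# Minimal sketch: choose (p, q) for a fixed d by minimizing AICc, using
# the aicc attribute of statsmodels results. AICc comparisons assume a
# common order of differencing d, as noted above.
import itertools
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
y = np.cumsum(rng.normal(size=300))   # synthetic non-stationary series
d = 1                                 # fixed order of differencing

best_order, best_aicc = None, np.inf
for p, q in itertools.product(range(3), range(3)):
    result = ARIMA(y, order=(p, d, q)).fit()
    if result.aicc < best_aicc:
        best_order, best_aicc = (p, d, q), result.aicc

print(best_order, best_aicc)
```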

The ARIMA model can be viewed as a "cascade" of two models. The first is non-stationary:

$$Y_t = (1 - L)^d X_t,$$

while the second is wide-sense stationary:

$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) Y_t = \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t.$$

Now forecasts can be made for the process Y_t, using a generalization of the method of autoregressive forecasting.

Forecast intervals for ARIMA models rely on the assumptions that the residuals are uncorrelated and normally distributed; if either assumption fails, the intervals may be incorrect. For this reason, researchers plot the ACF and histogram of the residuals to check the assumptions before producing forecast intervals.
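A minimal sketch of these residual checks, and of producing forecast intervals once they pass, using statsmodels; the series and model order are illustrative.

```python
# Minimal sketch: ACF plot and histogram of the residuals, then forecast
# intervals. The series and the order (1, 1, 1) are illustrative.
import matplotlib.pyplot as plt
import numpy as np
from statsmodels.graphics.tsaplots import plot_acf
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(3)
y = np.cumsum(rng.normal(size=200))

result = ARIMA(y, order=(1, 1, 1)).fit()
resid = result.resid

fig, axes = plt.subplots(1, 2, figsize=(10, 3))
plot_acf(resid, ax=axes[0])   # should show no significant autocorrelation
axes[1].hist(resid, bins=20)  # should look roughly normal
plt.show()

# Forecast intervals are only trustworthy if the checks above pass.
forecast = result.get_forecast(steps=10)
print(forecast.conf_int())    # 95% forecast intervals by default
```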

A number of variations on the ARIMA model are commonly employed.[11]

If the time series is suspected to exhibit long-range dependence, then the d parameter may be allowed to have non-integer values in an autoregressive fractionally integrated moving average model, which is also called a fractional ARIMA (FARIMA or ARFIMA) model.
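As a minimal sketch of the fractional differencing used in such models, the weights below are the coefficients of (1 − L)^d expanded via the generalized binomial series; the value d = 0.4 and the synthetic series are illustrative.

```python
# Minimal sketch of fractional differencing for non-integer d: the
# weights are the generalized binomial coefficients of (1 - L)^d,
# computed by the recurrence w_0 = 1, w_k = -w_{k-1} * (d - k + 1) / k.
import numpy as np

def frac_diff_weights(d, n_weights):
    w = [1.0]
    for k in range(1, n_weights):
        w.append(-w[-1] * (d - k + 1) / k)
    return np.array(w)

def frac_diff(y, d):
    # Apply the weights as a convolution truncated at the sample start.
    w = frac_diff_weights(d, len(y))
    return np.array([w[: t + 1] @ y[t::-1] for t in range(len(y))])

rng = np.random.default_rng(4)
y = rng.normal(size=100)
print(frac_diff(y, 0.4)[:5])  # fractionally differenced series, d = 0.4
```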