Statistical & Financial Consulting by Stanford PhD

A stochastic process X_t is an autoregressive integrated moving average process of order (p,d,q) if the process Y_t = Δ^d X_t is an autoregressive moving average process of order (p,q). That is, Y_t must follow the equation:

Y_t = µ + φ_1 Y_{t-1} + ... + φ_p Y_{t-p} + ε_t + ψ_1 ε_{t-1} + ... + ψ_q ε_{t-q},

where ε_t is white noise: a stationary stochastic process with zero mean and uncorrelated values at different moments of time. The part φ_1 Y_{t-1} + ... + φ_p Y_{t-p} of the equation above is the autoregressive term, while the part ε_t + ψ_1 ε_{t-1} + ... + ψ_q ε_{t-q} is the moving average term. The operator Δ is the difference operator: ΔX_t = X_t - X_{t-1}. Powers of the difference operator are understood recursively: Δ^d X_t = Δ(Δ^{d-1} X_t). For example, Δ^2 X_t = Δ(ΔX_t) = Δ(X_t - X_{t-1}) = (X_t - X_{t-1}) - (X_{t-1} - X_{t-2}) = X_t - 2 X_{t-1} + X_{t-2}.
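The recursive definition of the difference operator can be checked numerically. The following is a minimal sketch in plain Python; the helper name difference and the sample series are illustrative assumptions, not part of the original text. It applies the operator twice and compares the result with the closed form of the second difference, X_t minus 2 X_{t-1} plus X_{t-2}.

```python
def difference(x, d=1):
    """Apply the difference operator d times: (delta x)_t = x_t - x_{t-1}."""
    for _ in range(d):
        x = [curr - prev for prev, curr in zip(x, x[1:])]
    return x

# Illustrative series (an assumption for the example): x_t = t^2 for t = 1..6
x = [1, 4, 9, 16, 25, 36]

first = difference(x, 1)    # first differences
second = difference(x, 2)   # difference of the first differences

# Closed form of the second difference: x_t - 2 x_{t-1} + x_{t-2}
closed_form = [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]

print(first)                  # [3, 5, 7, 9, 11]
print(second)                 # [2, 2, 2, 2]
print(second == closed_form)  # True
```

Each application of the operator shortens the series by one observation, which is why a series of length n yields only n - d values after differencing d times.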

The standard notation for an autoregressive integrated moving average process of order (p,d,q) is ARIMA(p,d,q), while the standard notation for an autoregressive moving average process of order (p,q) is ARMA(p,q). ARMA processes are stationary for many choices of parameters, in particular whenever all roots of the autoregressive polynomial 1 - φ_1 z - ... - φ_p z^p lie outside the unit circle. Therefore, ARIMA processes are used to model real-life processes which are non-stationary but have stationary growth (d=1), stationary speed of growth (d=2), stationary speed of the speed of growth (d=3), etc.
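To make the relation between ARMA and ARIMA concrete, here is a small simulation sketch using only the Python standard library. It assumes Gaussian white noise (the definition only requires zero mean and uncorrelated values), and the function names and parameter values are hypothetical choices for illustration. A stationary ARMA(1,1) series is generated and then cumulatively summed, which produces an ARIMA(1,1,1) series whose first difference is the original ARMA series.

```python
import random

def simulate_arma11(n, mu, phi, psi, sigma=1.0, seed=0):
    """Simulate Y_t = mu + phi*Y_{t-1} + eps_t + psi*eps_{t-1}.

    Assumes Gaussian white noise and |phi| < 1 (stationary case).
    """
    rng = random.Random(seed)
    y_prev = mu / (1.0 - phi)  # start at the stationary mean
    eps_prev = 0.0
    y = []
    for _ in range(n):
        eps = rng.gauss(0.0, sigma)
        y_t = mu + phi * y_prev + eps + psi * eps_prev
        y.append(y_t)
        y_prev, eps_prev = y_t, eps
    return y

def integrate(y, x0=0.0):
    """Cumulative sum: X_t = X_{t-1} + Y_t, so the first difference of X is Y."""
    x = [x0]
    for y_t in y:
        x.append(x[-1] + y_t)
    return x

y = simulate_arma11(n=500, mu=0.1, phi=0.6, psi=0.3)  # stationary growth
x = integrate(y)                                      # non-stationary level

# Differencing the integrated series recovers the ARMA series
recovered = [b - a for a, b in zip(x, x[1:])]
max_err = max(abs(r - v) for r, v in zip(recovered, y))
print(len(x), max_err < 1e-9)  # 501 True
```

Integrating twice instead of once would give a series with d=2, for which only the second difference is stationary.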

Modeling is typically done in two stages. First, the process X_t is differenced d times until the result is stationary according to unit root tests, like the augmented Dickey-Fuller test or the Phillips-Perron test. Second, the result of differencing, Y_t, is modeled as an ARMA process. The order (p,q) of Y_t is informally researched using the observed patterns in its autocorrelation and partial autocorrelation functions. The impulse response function can be of great help as well, especially if the structure of the optimal ARMA model is complex: large values of p and q, negative coefficients, etc. After several candidate models of Y_t have been identified, they are estimated using the method of maximum likelihood or the generalized method of moments. The best model is then identified using one or two model selection criteria. Examples of model selection criteria are AIC, BIC and cross-validation.
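The second stage can be sketched in a simplified form. The example below is an illustration under several stated assumptions, not the full procedure: it considers only pure autoregressive candidates (q=0), estimates them by the Yule-Walker equations (a method-of-moments estimator, swapped in for maximum likelihood) via the Durbin-Levinson recursion, and uses a Gaussian approximation of AIC. All function names are hypothetical, and the test series is simulated from an AR(2) process chosen for the example.

```python
import math
import random

def sample_acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag (r_0 = 1)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / n / c0
            for k in range(max_lag + 1)]

def durbin_levinson(r, max_p):
    """Return (pacf, rel_var): pacf[p] is the lag-p partial autocorrelation,
    rel_var[p] is the AR(p) innovation variance divided by the series variance."""
    pacf, rel_var = [1.0], [1.0]
    phi = []  # Yule-Walker AR coefficients of the current order
    for p in range(1, max_p + 1):
        num = r[p] - sum(phi[j] * r[p - 1 - j] for j in range(p - 1))
        k = num / rel_var[-1]
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        pacf.append(k)
        rel_var.append(rel_var[-1] * (1.0 - k * k))
    return pacf, rel_var

# Simulated stationary series (assumption): Y_t = 0.5 Y_{t-1} - 0.3 Y_{t-2} + eps_t
rng = random.Random(42)
y, prev1, prev2 = [], 0.0, 0.0
for _ in range(2100):
    val = 0.5 * prev1 - 0.3 * prev2 + rng.gauss(0.0, 1.0)
    y.append(val)
    prev1, prev2 = val, prev1
y = y[100:]  # drop burn-in so the start-up values do not matter

max_p = 6
r = sample_acf(y, max_p)
pacf, rel_var = durbin_levinson(r, max_p)
n = len(y)
# Gaussian AIC up to an additive constant: n*log(innovation variance) + 2*p
aic = [n * math.log(rel_var[p]) + 2 * p for p in range(max_p + 1)]
best_p = min(range(max_p + 1), key=lambda p: aic[p])

for p in range(max_p + 1):
    print(p, round(r[p], 3), round(pacf[p], 3), round(aic[p], 1))
print("AR order chosen by AIC:", best_p)
```

For a pure AR(p) process the partial autocorrelations are expected to be close to zero beyond lag p, which is the kind of pattern the informal order search looks for. Candidates with moving average terms require numerical likelihood optimization and are usually fitted with a statistical library rather than by hand.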

In theory, several values of d can be tried provided they ensure stationarity of the differenced series. The optimal value can then be identified using the model selection criteria. In practice, in most cases the smallest d ensuring stationarity is used.

