ARIMA Modeling

Statistical & Financial Consulting by Stanford PhD

ARIMA PROCESS

A stochastic process X_t is an autoregressive integrated moving average process of order (p,d,q) if the process Y_t = Δ^d X_t is an autoregressive moving average process of order (p,q). That is, Y_t must follow the equation:

Y_t = µ + φ₁ Y_{t-1} + ... + φ_p Y_{t-p} + ε_t + ψ₁ ε_{t-1} + ... + ψ_q ε_{t-q},

where ε_t is white noise, that is, a stationary stochastic process with zero mean and uncorrelated values at different moments of time. The terms with coefficients φ₁, ..., φ_p form the autoregressive part of the equation above, while the terms with coefficients ψ₁, ..., ψ_q form the moving average part. Operator Δ is the difference operator: ΔX_t = X_t - X_{t-1}. Powers of the difference operator are defined recursively: Δ^d X_t = Δ(Δ^{d-1} X_t). For example, Δ² X_t = Δ(ΔX_t) = Δ(X_t - X_{t-1}) = (X_t - X_{t-1}) - (X_{t-1} - X_{t-2}) = X_t - 2 X_{t-1} + X_{t-2}.
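The recursive definition of Δ^d can be checked numerically. The sketch below is a minimal illustration in plain Python; the helper name `difference` and the sample data are assumptions made for this example and are not part of the original text.

```python
def difference(x, d=1):
    """Apply the difference operator d times and return the result as a list."""
    result = list(x)
    for _ in range(d):
        # One application of the operator: each value minus its predecessor.
        result = [result[t] - result[t - 1] for t in range(1, len(result))]
    return result


# Illustrative data: X_t = t^2 for t = 1..6.
x = [1, 4, 9, 16, 25, 36]

first = difference(x, 1)
second = difference(x, 2)

# Closed form of the second difference: X_t - 2 X_{t-1} + X_{t-2}.
closed_form = [x[t] - 2 * x[t - 1] + x[t - 2] for t in range(2, len(x))]

print(first)                   # [3, 5, 7, 9, 11]
print(second)                  # [2, 2, 2, 2]
print(second == closed_form)   # True
```

Each application of the operator shortens the series by one observation, so a series of length n leaves n - d values after d differences.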

The standard notation for an autoregressive integrated moving average process of order (p,d,q) is ARIMA(p,d,q), while the standard notation for an autoregressive moving average process of order (p,q) is ARMA(p,q). An ARMA process is stationary only for suitable parameter values: a stationary, causal solution exists when all roots of the autoregressive polynomial 1 - φ₁ z - ... - φ_p z^p lie outside the unit circle (for p = 1 this reduces to |φ₁| < 1). ARIMA processes are therefore used to model real-life processes which are non-stationary but have stationary growth (d=1), stationary speed of growth (d=2), stationary speed of the speed of growth (d=3), etc.
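To make the link between ARMA and ARIMA concrete, the sketch below simulates a stationary ARMA(1,1) series and accumulates it into an ARIMA(1,1,1) series. It is only an illustration under stated assumptions: the function names `simulate_arma` and `integrate`, the Gaussian choice of white noise, the zero pre-sample values with a burn-in period, and the parameter values are all choices made for this example.

```python
import random


def simulate_arma(mu, phi, psi, n, seed=0, burn_in=200):
    """Simulate n values of an ARMA(p,q) process with Gaussian white noise.

    phi holds the autoregressive coefficients, psi the moving average ones.
    """
    rng = random.Random(seed)
    p, q = len(phi), len(psi)
    y = [0.0] * p      # pre-sample values of Y, set to zero (assumption)
    eps = [0.0] * q    # pre-sample shocks, set to zero (assumption)
    for _ in range(burn_in + n):
        e = rng.gauss(0.0, 1.0)
        ar = sum(phi[i] * y[-1 - i] for i in range(p))
        ma = sum(psi[j] * eps[-1 - j] for j in range(q))
        y.append(mu + ar + e + ma)
        eps.append(e)
    return y[-n:]      # discard the pre-sample and burn-in values


def integrate(y, x0=0.0):
    """Undo one differencing step: X_t = X_{t-1} + Y_t, starting from x0."""
    x = [x0]
    for value in y:
        x.append(x[-1] + value)
    return x


# The autoregressive coefficient 0.6 is below 1 in absolute value,
# so the simulated ARMA(1,1) series is stationary.
y = simulate_arma(mu=0.1, phi=[0.6], psi=[0.3], n=500)
x = integrate(y)       # ARIMA(1,1,1) by construction

# Differencing X_t once recovers Y_t up to floating-point rounding.
recovered = [x[t] - x[t - 1] for t in range(1, len(x))]
max_error = max(abs(a - b) for a, b in zip(recovered, y))
print(max_error < 1e-9)
```

Calling `integrate` twice on the same series would give a process with d=2, whose second difference is the stationary series.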

Modeling is typically done in two stages. First, the process X_t is differenced d times until the result is stationary according to unit root tests, such as the augmented Dickey-Fuller test or the Phillips-Perron test. Second, the differenced series Y_t is modeled as an ARMA process. The order (p,q) of Y_t is explored informally using the patterns observed in its autocorrelation and partial autocorrelation functions. The impulse response function can be of great help as well, especially if the structure of the optimal ARMA model is complex: large values of p and q, negative coefficients, etc. After several candidate models of Y_t have been identified, they are estimated using the method of maximum likelihood or the generalized method of moments. The best model is then chosen using one or two model selection criteria, such as AIC, BIC or cross-validation.
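The second stage can be sketched in plain Python for the simplest case. This is a simplified illustration, not the full procedure described above: it considers only pure autoregressive candidates (q = 0), estimates them with the Yule-Walker equations (a method of moments, used here in place of maximum likelihood), and ranks them with Gaussian approximations AIC ≈ n ln(σ²) + 2p and BIC ≈ n ln(σ²) + p ln(n), where σ² is the estimated innovation variance. The function names and the simulated AR(2) data are assumptions made for the example.

```python
import math
import random


def sample_acf(y, max_lag):
    """Return the sample autocorrelations at lags 0..max_lag and the variance."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y) / n
    acf = []
    for k in range(max_lag + 1):
        ck = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n)) / n
        acf.append(ck / c0)
    return acf, c0


def levinson_durbin(acf, max_order):
    """Yule-Walker fits of AR(1)..AR(max_order) via the Levinson-Durbin recursion.

    pacf[k-1] is the lag-k partial autocorrelation, coeffs[k-1] holds the
    AR(k) coefficients, and rel_var[k] is the AR(k) innovation variance
    divided by the variance of the series.
    """
    pacf, coeffs, rel_var = [], [], [1.0]
    phi = []
    for k in range(1, max_order + 1):
        num = acf[k] - sum(phi[j] * acf[k - 1 - j] for j in range(k - 1))
        kappa = num / rel_var[-1]
        phi = [phi[j] - kappa * phi[k - 2 - j] for j in range(k - 1)] + [kappa]
        pacf.append(kappa)
        coeffs.append(phi)
        rel_var.append(rel_var[-1] * (1.0 - kappa ** 2))
    return pacf, coeffs, rel_var


# Illustrative data: a simulated AR(2) series standing in for Y_t.
rng = random.Random(1)
y = [0.0, 0.0]
for _ in range(2200):
    y.append(0.5 * y[-1] - 0.3 * y[-2] + rng.gauss(0.0, 1.0))
y = y[200:]  # drop the burn-in period

n = len(y)
max_order = 5
acf, c0 = sample_acf(y, max_order)
pacf, coeffs, rel_var = levinson_durbin(acf, max_order)


def aic(order):
    return n * math.log(c0 * rel_var[order]) + 2 * order


def bic(order):
    return n * math.log(c0 * rel_var[order]) + order * math.log(n)


for order in range(max_order + 1):
    print(order, round(aic(order), 1), round(bic(order), 1))

best_order = min(range(max_order + 1), key=bic)
print("order chosen by BIC:", best_order)
print("partial autocorrelations:", [round(v, 3) for v in pacf])
```

For a pure autoregressive process of order p, the partial autocorrelations beyond lag p should be close to zero, which is the kind of pattern the informal order search looks for. Candidates with moving average terms need a likelihood-based or similar estimator and are not covered by this sketch.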

In theory, several values of d can be tried provided they ensure stationarity of the differenced series. The optimal value can then be identified using the model selection criteria. In practice, in most cases the smallest d ensuring stationarity is used.
