2.2.5.1.2 Algorithm for ARIMA

ARIMA stands for autoregressive integrated moving average. An ARIMA model may include autoregressive (AR) terms, moving average (MA) terms, and differencing. In this app, the NAG function nag_tsa_multi_inp_model_estim (g13bec) is used to fit an ARIMA model [1], and the NAG function nag_tsa_multi_inp_model_forecast (g13bjc) is used to forecast future values from a known ARIMA model [2].

ARIMA Model

For a general ARIMA model,

\[\begin{equation}\tag{1} \begin{split} \nabla ^d \nabla_s^D y_t &= c + w_t \\ w_t &= \Phi_1 w_{t-s} + \Phi_2 w_{t-2s} + \dots + \Phi_P w_{t-Ps} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2s} - \dots - \Theta_Q e_{t-Qs} \\ e_t &= \phi_1 e_{t-1} + \phi_2 e_{t-2} + \dots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \dots - \theta_q a_{t-q} \end{split} \end{equation}\]

where \(y_t\) is the input time series (\(t = 1, \dots, n\)); \(p\), \(d\) and \(q\) are the orders of autoregression, differencing and moving average; \(P\), \(D\) and \(Q\) are the corresponding seasonal orders; and \(s\) is the seasonal period. \(c\) is the mean of the differenced series. \(\Phi_i\) (\(i = 1, \dots, P\)), \(\Theta_i\) (\(i = 1, \dots, Q\)), \(\phi_i\) (\(i = 1, \dots, p\)) and \(\theta_i\) (\(i = 1, \dots, q\)) are the seasonal autoregressive, seasonal moving average, autoregressive and moving average coefficients, respectively. \(a_t\) is the residual.
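As an illustration of the left-hand side of equation 1, the following minimal sketch applies the differencing operators to a short series. It assumes the usual definitions \(\nabla y_t = y_t - y_{t-1}\) and \(\nabla_s y_t = y_t - y_{t-s}\); the function name `difference` is hypothetical and is not part of the NAG library.

```python
def difference(y, d=0, D=0, s=1):
    """Apply d ordinary differences and D seasonal differences of period s."""
    w = list(y)
    for _ in range(d):
        # Ordinary difference: w_t - w_{t-1}
        w = [w[t] - w[t - 1] for t in range(1, len(w))]
    for _ in range(D):
        # Seasonal difference: w_t - w_{t-s}
        w = [w[t] - w[t - s] for t in range(s, len(w))]
    return w


y = [1, 4, 9, 16, 25, 36, 49, 64]
w = difference(y, d=1, D=1, s=2)
print(len(w), w)
```

Each ordinary difference shortens the series by one value and each seasonal difference by \(s\) values, so the result has \(n - d - s \times D\) elements.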

Estimation

The residual series \(a_t\) can be obtained from \(y_t\) by using equation 1. The sum of squares of the residuals is:

\[\begin{equation}\tag{2}S = \sum_{t=-\infty}^{n} a_t^2 \end{equation}\]
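The residual recursion can be sketched for the non-seasonal case (\(P = Q = 0\), so that \(w_t = e_t\)). This is a simplified illustration, not the NAG implementation: it assumes that all pre-sample values of \(e_t\) and \(a_t\) are zero, so the sum starts at the first observation rather than at \(-\infty\). The function name `arma_residuals` is hypothetical.

```python
def arma_residuals(z, c, phi, theta):
    """Residuals a_t of a non-seasonal ARMA model for a differenced series z.

    Solves e_t = sum(phi_i * e_{t-i}) + a_t - sum(theta_j * a_{t-j}) for a_t,
    with e_t = z_t - c and all pre-sample values assumed to be zero.
    """
    e = [value - c for value in z]
    a = []
    for t in range(len(e)):
        ar = sum(phi[i] * e[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * a[t - 1 - j] for j in range(len(theta)) if t - 1 - j >= 0)
        a.append(e[t] - ar + ma)
    return a


a = arma_residuals([1.0, 0.5, 0.25], c=0.0, phi=[0.5], theta=[])
S = sum(value ** 2 for value in a)  # truncated version of equation 2
print(a, S)
```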

Estimation Criterion

Three criteria are available:

  1. Least squares

\[ D = S\]

The estimates are obtained iteratively by minimizing D.

  2. Exact likelihood

\(y_i, \; i = 0, -1, \dots\) are treated as unobserved random variables with a known distribution.

\[ D = M \times S\]

where the multiplier M is a function calculated from the ARIMA model parameters. Minimizing D is equivalent to maximizing the exact likelihood of the data.

  3. Marginal likelihood

\[ D = M \times S\]

as for exact likelihood, but with a different value of M. This criterion differs from the exact likelihood method only if the mean term is included in the model.

In this app, the Marquardt method [4] is used to minimize the objective function D.
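The idea behind the Marquardt method can be sketched on a deliberately small problem: estimating the single coefficient of an AR(1) model by least squares (\(D = S\)). This is a textbook-style illustration under simplifying assumptions, not the procedure implemented in g13bec; the function name `fit_ar1_marquardt`, the starting damping value and the factor of 10 are all assumptions.

```python
def fit_ar1_marquardt(e, phi=0.0, lam=0.01, iterations=50):
    """Estimate phi in e_t = phi * e_{t-1} + a_t with damped Gauss-Newton steps."""

    def sum_sq(p):
        return sum((e[t] - p * e[t - 1]) ** 2 for t in range(1, len(e)))

    s_old = sum_sq(phi)
    for _ in range(iterations):
        # Gradient and curvature terms of the linearised least squares problem.
        grad = sum(e[t - 1] * (e[t] - phi * e[t - 1]) for t in range(1, len(e)))
        curv = sum(e[t - 1] ** 2 for t in range(1, len(e)))
        step = grad / ((1.0 + lam) * curv)  # damping inflates the curvature
        s_new = sum_sq(phi + step)
        if s_new < s_old:
            # Accept the step and trust the linearisation more next time.
            phi, s_old, lam = phi + step, s_new, lam / 10.0
        else:
            # Reject the step and increase the damping.
            lam *= 10.0
    return phi


phi_hat = fit_ar1_marquardt([1.0, 0.5, 0.25, 0.125])
print(phi_hat)
```

A small damping value makes the step close to a Gauss-Newton step, while a large value shortens it towards a cautious gradient step.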

Quantities

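Residuals \(a_t\) are available for \(t \ge 1 + d + s \times D\). The fitted values are:

\[\hat{y}_t = y_t - a_t\]

The length of the differenced series is \(N = n - d - s \times D\), and the residual degrees of freedom are \(df = N - (\text{number of parameters})\).

The residual variance is:

\[erv = \frac{S}{df}\]

The covariance matrix of the parameter estimates is:

\[C = erv \times H^{-1}\]

where H is the linearised least squares matrix in the final iteration.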
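The arithmetic behind these quantities can be sketched as follows. The numbers are made up for illustration, the function names are hypothetical, and the matrix inverse is written out for a 2 x 2 matrix H only to keep the example free of dependencies.

```python
def summary_quantities(n, d, D, s, n_params, S):
    """Return (N, df, erv) for a series of length n and residual sum of squares S."""
    N = n - d - s * D      # length of the differenced series
    df = N - n_params      # residual degrees of freedom
    erv = S / df           # residual variance
    return N, df, erv


def covariance_2x2(erv, H):
    """Return C = erv * inverse(H) for a 2 x 2 matrix H given as nested lists."""
    (h11, h12), (h21, h22) = H
    det = h11 * h22 - h12 * h21
    inverse = [[h22 / det, -h12 / det], [-h21 / det, h11 / det]]
    return [[erv * value for value in row] for row in inverse]


N, df, erv = summary_quantities(n=120, d=1, D=1, s=12, n_params=3, S=208.0)
C = covariance_2x2(erv, [[4, 0], [0, 2]])
print(N, df, erv, C)
```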

Forecast

To predict the time series \(y_t\) at \(t = n + 1, \dots, n + L\), set \(a_t = 0\) for \(t = n + 1, \dots, n + L\) and calculate the predicted values by reversing equation 1:

\[\begin{equation}\tag{3} \begin{split} e_t &= \phi_1 e_{t-1} + \phi_2 e_{t-2} + \dots + \phi_p e_{t-p} + a_t - \theta_1 a_{t-1} - \theta_2 a_{t-2} - \dots - \theta_q a_{t-q} \\ w_t &= \Phi_1 w_{t-s} + \Phi_2 w_{t-2s} + \dots + \Phi_P w_{t-Ps} + e_t - \Theta_1 e_{t-s} - \Theta_2 e_{t-2s} - \dots - \Theta_Q e_{t-Qs} \\ y_t &= (\nabla ^d \nabla_s^D)^{-1} (c + w_t) \end{split} \end{equation}\]
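The forecast recursion can be sketched for a non-seasonal model (\(P = Q = D = 0\)). This is an illustration under stated assumptions rather than the g13bjc implementation: the residuals of the differenced series are taken as already known, the series is assumed to be long enough to supply the required lagged values, and the function name `forecast_arima` is hypothetical.

```python
def forecast_arima(y, resid, c, phi, theta, d, L):
    """Forecast L values of y from a non-seasonal ARIMA(p, d, q) model.

    resid holds the residuals a_t aligned with the d-times differenced series.
    """
    # levels[k] is y differenced k times; levels[d] is the series that is modelled.
    levels = [list(y)]
    for _ in range(d):
        prev = levels[-1]
        levels.append([prev[t] - prev[t - 1] for t in range(1, len(prev))])
    e = [value - c for value in levels[d]]
    a = list(resid)
    forecasts = []
    for _ in range(L):
        ar = sum(phi[i] * e[-1 - i] for i in range(len(phi)))
        ma = sum(theta[j] * a[-1 - j] for j in range(len(theta)))
        e.append(ar - ma)  # the future residual a_t is set to zero
        a.append(0.0)
        value = c + e[-1]
        # Undo the differencing one level at a time.
        for k in range(d - 1, -1, -1):
            value = levels[k][-1] + value
            levels[k].append(value)
        forecasts.append(value)
    return forecasts


print(forecast_arima([10.0, 12.0, 13.0, 13.5], [2.0, 0.0, 0.0],
                     c=0.0, phi=[0.5], theta=[], d=1, L=2))
```

In this example the differenced series 2, 1, 0.5 halves at every step, so the forecast increments are 0.25 and 0.125.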

The forecast error variance of \(y_{n+L}\) can be calculated as:

\[S_L^2 = V_n \times (\psi_0^2 + \psi_1^2 + ... + \psi_{L-1}^2)\]

where \(V_n\) is the residual variance of the ARIMA model, and \(\psi_i\) are the "psi-weights" of the model as defined in [3].
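A minimal sketch of this calculation is given below, using the standard recursion \(\psi_0 = 1\), \(\psi_j = -\theta_j + \sum_{i=1}^{\min(j,p)} \phi_i \psi_{j-i}\) with \(\theta_j = 0\) for \(j > q\). It assumes that any differencing and seasonal factors have already been multiplied into the coefficient lists `phi` and `theta`; the function names are hypothetical.

```python
def psi_weights(phi, theta, L):
    """Return psi_0 ... psi_{L-1} for the given AR and MA coefficients."""
    psi = [1.0]
    for j in range(1, L):
        value = -theta[j - 1] if j <= len(theta) else 0.0
        value += sum(phi[i - 1] * psi[j - i] for i in range(1, min(j, len(phi)) + 1))
        psi.append(value)
    return psi


def forecast_variance(V_n, phi, theta, L):
    """Forecast error variance at lead time L."""
    return V_n * sum(weight ** 2 for weight in psi_weights(phi, theta, L))


print(psi_weights([0.5], [], 3))
print(forecast_variance(2.0, [0.5], [], 3))
```

For a random walk, which corresponds to `phi = [1.0]` once the difference is folded into the autoregressive part, every psi-weight equals 1 and the variance grows linearly with the lead time.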

Reference

  1. nag_tsa_multi_inp_model_estim (g13bec)
  2. nag_tsa_multi_inp_model_forecast (g13bjc)
  3. George E. P. Box and Gwilym M. Jenkins (1976). Time Series Analysis: Forecasting and Control (Revised Edition). Holden-Day.
  4. D. W. Marquardt (1963). "An algorithm for least squares estimation of nonlinear parameters". J. Soc. Indust. Appl. Math. 11, 431.