Forecasting


Forecasting is extrapolation into the future based on a time series model.

Notation


  • $t_0$ is the horizon of the time series
  • $l$ is the number of time steps to forecast, called the "lead time"
  • $\hat{x}_{t_0} (l)$ is the forecast of $X_{t_0 + l}$ given data up to the horizon $t_0$

Example

Let $t_0 = 10$, then the forecast for $X_{12}$ would be $\hat{X}_{10} (2)$

ARMA Forecasting


Once the model is fit, forecasts are calculated iteratively: the forecast at each lead time is built from the forecasts at earlier lead times.

Recall that $\bar{x}$ is the estimate for $\mu$ of an ARMA process.

Then the forecast for an ARMA(p,q) is described by

$$ \hat{X}_{t_0} \left( {l} \right) = \sum_{i=1}^p \phi_i \hat{X}_{t_0} \left( l - i \right) - \sum_{j=1}^q \theta_j \hat{a}_{t_0 + l - j} + \bar{x} \left[ 1 - \sum_{i = 1}^p \phi_i \right] + \hat{a}_t(l) $$
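The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `arma_forecast` and its arguments are hypothetical, the residuals $\hat{a}_t$ are assumed to have been computed already when the model was fit, and white noise terms at future times are replaced by zero (their expected value).

```python
def arma_forecast(x, resid, phi, theta, xbar, steps):
    """Iterative ARMA(p, q) forecasts from the horizon t0 = len(x).

    x     : observed values X_1..X_t0
    resid : estimated white noise terms a-hat_1..a-hat_t0 (same length as x)
    phi   : AR coefficients [phi_1, ..., phi_p]
    theta : MA coefficients [theta_1, ..., theta_q]
    xbar  : sample mean used as the estimate of mu
    """
    t0 = len(x)
    const = xbar * (1.0 - sum(phi))
    forecasts = []  # forecasts[l - 1] holds X-hat_{t0}(l)

    def value(lead):
        # X-hat_{t0}(lead) is an observed value when lead <= 0, else a forecast.
        return x[t0 + lead - 1] if lead <= 0 else forecasts[lead - 1]

    for l in range(1, steps + 1):
        ar = sum(phi[i - 1] * value(l - i) for i in range(1, len(phi) + 1))
        # Only residuals at or before the horizon (l - j <= 0) are known;
        # residuals at future times are taken to be zero.
        ma = sum(theta[j - 1] * resid[t0 + l - j - 1]
                 for j in range(1, len(theta) + 1) if l - j <= 0)
        forecasts.append(ar - ma + const)
    return forecasts
```

For a pure AR model, `theta` is an empty list and `resid` is not used, so a list of zeros of the same length as `x` is enough.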

Facts About the Forecast

  • The forecast errors are a linear combination of future white noise terms
$$ e_{t_0} (l) = X_{t_0 + l} - \hat{X}_{t_0} \left( {l} \right) = \sum_{k=0}^{l-1} \psi_k a_{t_0 +l -k} $$
  • $e_{t_0} (l)$ is normally distributed when the white noise $a_t$ is normal
  • $E[e_{t_0} (l)] = 0$
  • The variance of the prediction error is given by
$$ var \left[ e_{t_0} (l) \right] = var \left[ \sum_{k=0}^{l-1} \psi_k a_{t_0 +l -k} \right] = \sigma_a^2 \left[ \sum_{i = 0}^{l-1} \psi_i^2 \right] $$
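
As a worked illustration, assume an AR(1) process with $|\phi_1| < 1$, for which the weights are $\psi_k = \phi_1^k$. The variance formula above then reduces to a finite geometric sum:

$$ var \left[ e_{t_0} (l) \right] = \sigma_a^2 \sum_{k=0}^{l-1} \phi_1^{2k} = \sigma_a^2 \frac{1 - \phi_1^{2l}}{1 - \phi_1^2} $$

For $l = 1$ this is $\sigma_a^2$, and as $l \rightarrow \infty$ it rises to $\sigma_a^2 / (1 - \phi_1^2)$, which is the variance of the process itself.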

Forecast Probability Limits

Given $\hat{X}_{t_0} (l)$, an estimate of $X_{t_0 + l}$ based on data up to $t_0$, limits on the forecast are constructed such that the probability that $X_{t_0 + l}$ falls within the limits is $1 - \alpha$ (for example, 95% when $\alpha = 0.05$).

The forecast probability limits for an ARMA process are given by

$$ Forecast \,\, Interval: \,\, \hat{X}_{t_0} \left( {l} \right) \pm z_{1-\alpha / 2} \hat{\sigma}_a \left[ \sum_{i = 0}^{l-1} \psi_i^2 \right]^\frac{1}{2} $$

where $\psi_i$ are the weights of the ARMA model expressed as a general linear process.
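
The limits can be computed once the $\psi$-weights are known. The sketch below is one way to do this, under the assumption that the model is written as $\phi(B) X_t = \theta(B) a_t$ with $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$, which matches the sign of the $\theta_j$ terms in the forecast equation. The function names are hypothetical, and `z=1.96` corresponds to 95% limits.

```python
import math

def psi_weights(phi, theta, n):
    """First n psi-weights (psi_0 .. psi_{n-1}) of an ARMA(p, q) model."""
    psi = [1.0]
    for k in range(1, n):
        # psi_k = sum_i phi_i * psi_{k-i} - theta_k, with theta_k = 0 for k > q
        ar = sum(phi[i - 1] * psi[k - i] for i in range(1, min(k, len(phi)) + 1))
        ma = theta[k - 1] if k <= len(theta) else 0.0
        psi.append(ar - ma)
    return psi[:n]

def forecast_limits(forecast, lead, phi, theta, sigma_a, z=1.96):
    """Lower and upper probability limits for the forecast at a given lead time."""
    psi = psi_weights(phi, theta, lead)
    half_width = z * sigma_a * math.sqrt(sum(w * w for w in psi))
    return forecast - half_width, forecast + half_width
```

At lead time 1 only $\psi_0 = 1$ enters, so the half-width is simply $z \hat{\sigma}_a$; the limits widen as more weights are added at longer lead times.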

Eventual Forecast of an ARMA Process

Recall that since an ARMA process is stationary, its expected value is the mean $\mu$. Then, intuitively, as the lead time grows large, the forecast regresses to the mean, i.e. as $l \rightarrow \infty$, $\hat{X}_{t_0} \left( {l} \right) \rightarrow \mu$ (estimated in practice by $\bar{x}$).

Example - ARMA(1,0)

From the general equation above, the forecast for an ARMA(1,0) is described by

$$ \hat{x}_{t_0} (l) = \phi_1 \hat{x}_{t_0} (l-1) + \bar{x} (1 - \phi_1) + \hat{a}_t(l) $$

Set $\hat{a}_t(l) = 0$, since future white noise terms are unknown and have expected value zero. Then the forecast at lead time $l$ can be calculated as

$$ \hat{x}_{t_0} (l) = \phi_1 \hat{x}_{t_0} (l-1) + \bar{x} (1 - \phi_1) $$

Let $t_0 = 10$ and assume we want to forecast $X_{12}$. The forecast steps are

$$ \hat{x}_{10} (1) = \phi_1 \hat{x}_{10} (0) + \bar{x} (1 - \phi_1) $$
$$ \hat{x}_{10} (2) = \phi_1 \hat{x}_{10} (1) + \bar{x} (1 - \phi_1) $$

where $\hat{x}_{10} (0) = X_{10}$
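
The AR(1) recursion can also be unrolled into a closed form, shown here as a short derivation. Subtracting $\bar{x}$ from both sides of the forecast equation gives

$$ \hat{x}_{t_0} (l) - \bar{x} = \phi_1 \left[ \hat{x}_{t_0} (l-1) - \bar{x} \right] $$

and applying this $l$ times, starting from $\hat{x}_{t_0} (0) = X_{t_0}$, gives

$$ \hat{x}_{t_0} (l) = \bar{x} + \phi_1^l \left( X_{t_0} - \bar{x} \right) $$

For a stationary model $|\phi_1| < 1$, so the second term shrinks geometrically and the forecast approaches $\bar{x}$.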

Example - ARMA(2,0)

From the general equation above, the forecast for an ARMA(2,0) is described by

$$ \hat{x}_{t_0} (l) = \phi_1 \hat{x}_{t_0} (l-1) + \phi_2 \hat{x}_{t_0} (l-2) + \bar{x} (1 - \phi_1 - \phi_2) + \hat{a}_t(l) $$

Using the following model and realization data, provide forecasts for $\hat{x}_{75} (1)$, $\hat{x}_{75} (2)$, and $\hat{x}_{75} (3)$.

Model

$$ X_t = 1.6 X_{t-1} - 0.8 X_{t-2} + 30 (1 - 1.6 + 0.8) + a_t $$

Realization Data

  • $t_0 = 75$
  • $X_{74} = 27.7$
  • $X_{75} = 23.4$
  • $\bar{X} = 29.5$

Calculation for $\hat{x}_{75} (1)$, where $\hat{x}_{75} (0) = X_{75}$ and $\hat{x}_{75} (-1) = X_{74}$

$$ \hat{x}_{75} (1) = 1.6 \hat{x}_{75} (0) - 0.8 \hat{x}_{75} (-1) + \bar{x} (1 - 1.6 + 0.8) $$
$$ \hat{x}_{75} (1) = 1.6 (23.4) - 0.8 (27.7) + 29.5 (1 - 1.6 + 0.8) = 21.2 $$

Calculation for $\hat{x}_{75} (2)$

$$ \hat{x}_{75} (2) = 1.6 \hat{x}_{75} (1) - 0.8 \hat{x}_{75} (0) + \bar{x} (1 - 1.6 + 0.8) $$
$$ \hat{x}_{75} (2) = 1.6 (21.2) - 0.8 (23.4) + 29.5 (1 - 1.6 + 0.8) = 21.1 $$

Calculation for $\hat{x}_{75} (3)$

$$ \hat{x}_{75} (3) = 1.6 \hat{x}_{75} (2) - 0.8 \hat{x}_{75} (1) + \bar{x} (1 - 1.6 + 0.8) $$
$$ \hat{x}_{75} (3) = 1.6 (21.1) - 0.8 (21.2) + 29.5 (1 - 1.6 + 0.8) = 22.7 $$
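
The three hand calculations can be checked with a short script. This is only an arithmetic check of the example, using the coefficients, $\bar{x}$, and realization values given above; the variable names are illustrative. It carries full precision between steps and rounds only for display.

```python
phi1, phi2 = 1.6, -0.8           # X_t = 1.6 X_{t-1} - 0.8 X_{t-2} + ...
xbar = 29.5                      # sample mean of the realization
values = [27.7, 23.4]            # X_74, X_75

const = xbar * (1 - phi1 - phi2)
for lead in range(1, 4):
    # Each forecast uses the two most recent values, observed or forecast.
    values.append(phi1 * values[-1] + phi2 * values[-2] + const)

forecasts = values[2:]
print([round(f, 1) for f in forecasts])  # [21.2, 21.1, 22.7]
```

Rounding each intermediate forecast to one decimal, as in the hand calculation, gives the same results here, although in general the two approaches can differ in the last digit.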

Checking Forecasts


Model forecasts are often validated by training the model on only the first section of the data and using a later, withheld section to calculate forecast errors. The mean square error of these forecast errors can be used to compare models.

To do this, the forecast horizon is set back $k$ steps. The model is fit to the first $n-k$ time steps and then used to forecast the $k$ withheld time steps, whose values are known.
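
A minimal sketch of this check follows. The helper `holdout_mse` and the two toy forecasters are hypothetical stand-ins: in practice the `forecaster` argument would fit a time series model to the training section and return its $k$ forecasts.

```python
def holdout_mse(series, k, forecaster):
    """Mean square error of k forecasts made from the first n - k points."""
    if not 1 <= k < len(series):
        raise ValueError("k must be between 1 and len(series) - 1")
    train, test = series[:-k], series[-k:]
    forecasts = forecaster(train, k)
    return sum((obs - f) ** 2 for obs, f in zip(test, forecasts)) / k

def mean_forecaster(train, steps):
    # Toy model: always forecast the training mean.
    return [sum(train) / len(train)] * steps

def last_value_forecaster(train, steps):
    # Toy model: always forecast the last observed value.
    return [train[-1]] * steps

series = [1, 2, 3, 4, 5, 6]
for name, model in [("mean", mean_forecaster), ("last value", last_value_forecaster)]:
    print(name, holdout_mse(series, 2, model))
```

The model with the smaller error on the withheld section is preferred; on this trending toy series the last-value forecaster does better than the mean.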

Forecasting with ARIMA


Independence from the Realization Mean

Consider the following ARIMA model

$$ (1-B) \left( X_t - \mu \right) = a_t $$

Then

$$ \hat{X}_{t_0} (l) = \hat{X}_{t_0} (l - 1) + \bar{X} ( 1 - 1 ) $$

Notice that the mean in the equation above is multiplied by 0 due to the root on the unit circle.

Effects of Roots on the Unit Circle

The forecast depends on the order of differencing $d$ in the ARIMA model.

  • With a single factor of $(1-B)$ and no stationary ARMA component, every forecast is just the value at the horizon, $X_{t_0}$.
  • If $d$ is greater than 1, a trend will be preserved in the forecast.
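
Both cases can be seen in a small sketch. It assumes the pure difference model $(1-B)^d X_t = a_t$ with no stationary AR or MA factors, so the forecast follows from expanding $(1-B)^d$ with binomial coefficients; the function name `difference_forecast` is hypothetical.

```python
from math import comb

def difference_forecast(x, d, steps):
    """Forecasts for (1 - B)^d X_t = a_t, with future noise set to zero."""
    if len(x) < d:
        raise ValueError("need at least d observations")
    values = list(x)
    for _ in range(steps):
        # (1 - B)^d X_t = 0  =>  X_t = sum_{j=1..d} (-1)^(j+1) C(d, j) X_{t-j}
        nxt = sum((-1) ** (j + 1) * comb(d, j) * values[-j] for j in range(1, d + 1))
        values.append(nxt)
    return values[len(x):]

print(difference_forecast([3, 5, 4], d=1, steps=3))  # [4, 4, 4]
print(difference_forecast([3, 5, 4], d=2, steps=3))  # [3, 2, 1]
```

With $d = 1$ the last value is repeated, while with $d = 2$ the forecast continues the straight line through the last two observations.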

Forecast Probability Limits

Since the model is not stationary, the $\psi$-weights do not decay to zero, so the forecast limits widen without bound as $l$ increases.
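
For example, for the model $(1-B) X_t = a_t$ every weight is $\psi_k = 1$, so the forecast error variance grows linearly with the lead time:

$$ var \left[ e_{t_0} (l) \right] = \sigma_a^2 \sum_{k=0}^{l-1} 1 = l \, \sigma_a^2 $$

The corresponding limits are $\hat{X}_{t_0} (l) \pm z_{1-\alpha/2} \hat{\sigma}_a \sqrt{l}$, whose width increases without bound as $l$ grows.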

Forecasting with Seasonality


For the purely seasonal model $(1 - B^s) X_t = a_t$, where $s$ is the length of the season, the forecast at lead time $l$ equals the value one season earlier: $\hat{X}_{t_0} (l) = \hat{X}_{t_0} (l - s)$.
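
As a small illustration, assuming the purely seasonal model $(1 - B^s) X_t = a_t$ with no other factors, the forecast simply repeats the last observed season. The function name `seasonal_forecast` is hypothetical.

```python
def seasonal_forecast(x, s, steps):
    """Forecasts for (1 - B^s) X_t = a_t: each value copies the one s steps back."""
    if len(x) < s:
        raise ValueError("need at least one full season of data")
    values = list(x)
    for _ in range(steps):
        values.append(values[-s])
    return values[len(x):]

print(seasonal_forecast([1, 2, 3, 4, 5, 6, 7, 8], s=4, steps=6))  # [5, 6, 7, 8, 5, 6]
```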

Special Case: Airline Models

Seasonal models of the following form are typically known as "Airline Models."

$$ (1-B) \left( 1- B^s \right) X_t = a_t $$

Combining the $(1-B)$ factor with the seasonal factor $(1 - B^s)$ preserves trends in the data as well as the seasonal pattern.
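
Expanding the airline model gives $X_t = X_{t-1} + X_{t-s} - X_{t-s-1} + a_t$, which leads to the sketch below. It assumes exactly the form shown above, with no additional AR or MA terms, and the function name `airline_forecast` is hypothetical.

```python
def airline_forecast(x, s, steps):
    """Forecasts for (1 - B)(1 - B^s) X_t = a_t, with future noise set to zero."""
    if len(x) < s + 1:
        raise ValueError("need at least s + 1 observations")
    values = list(x)
    for _ in range(steps):
        values.append(values[-1] + values[-s] - values[-s - 1])
    return values[len(x):]

# Toy data: linear trend t = 1..6 plus an alternating pattern 0, 1 (s = 2).
x = [1, 3, 3, 5, 5, 7]
print(airline_forecast(x, s=2, steps=3))  # [7, 9, 9]
```

On this toy series the forecasts continue both the upward trend and the period-2 pattern.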

Signal + Noise Models


A signal plus noise model is expressed in the following form:

$$ X_t = \lambda (t) + Z_t $$

where $\lambda$ is a deterministic function and $Z_t$ represents some noise model (generally modeled by an ARMA model).

Typically, the following strategy is used to forecast with a signal-plus-noise model:

  1. Fit the deterministic function $\lambda (t)$ to the data
  2. Calculate the residuals, which are the estimates $\hat{Z}_t$ of the noise
  3. Fit an ARMA model to $\hat{Z}_t$ to estimate the structure of the noise

Then, the forecast is given by

$$ \hat{X}_{t_0} (l) = \lambda \left( t_0 + l \right) + \hat{Z}_{t_0} (l) $$

where $\lambda$ represents the deterministic function.

Linear Trend

For a linear trend model,

  • A line is fit to $X_t$ by ordinary least squares (OLS)
  • $\hat{Z}_t$ is calculated as $\hat{Z}_t = X_t - \hat{\beta}_0 - \hat{\beta}_1 t$, where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the OLS estimates
  • An ARMA model is fit to $\hat{Z}_t$

Then, the forecasts for a linear trend are

$$ \hat{X}_{t_0} (l) = \hat{\beta}_0 + \hat{\beta}_1 (t_0 + l) + \hat{Z}_{t_0} (l) $$

And the forecast limits are given by

$$ Forecast \,\, Interval: \,\, \hat{\beta}_0 + \hat{\beta}_1 (t_0 + l) + \hat{Z}_{t_0} (l) \pm z_{1-\alpha/2} \hat{\sigma}_a \left[ \sum_{k = 0}^{l-1} \psi_k^2 \right]^\frac{1}{2} $$

where $\hat{\sigma}_a$ and $\psi_k$ are based on the ARMA fit to $\hat{Z}_t$.
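
The linear trend strategy can be sketched end to end as follows. Several simplifications are assumptions made for illustration only: the noise is modeled as an AR(1) rather than a general ARMA, its coefficient is taken from the lag-1 sample autocorrelation of the residuals, and the function names are hypothetical. Only point forecasts are computed.

```python
def fit_line(x):
    """OLS intercept and slope for x regressed on t = 1..n."""
    n = len(x)
    t_mean = (n + 1) / 2
    x_mean = sum(x) / n
    sxx = sum((t - t_mean) ** 2 for t in range(1, n + 1))
    sxy = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x, start=1))
    b1 = sxy / sxx
    return x_mean - b1 * t_mean, b1

def trend_plus_ar1_forecast(x, steps):
    """Forecast = fitted line at t0 + l, plus an AR(1) forecast of the residual."""
    n = len(x)
    b0, b1 = fit_line(x)
    z = [v - b0 - b1 * t for t, v in enumerate(x, start=1)]  # Z-hat_t
    # Assumed AR(1) fit: lag-1 sample autocorrelation of the residuals.
    denom = sum(v * v for v in z)
    phi = sum(z[i] * z[i - 1] for i in range(1, n)) / denom if denom > 0 else 0.0
    forecasts, z_hat = [], z[-1]
    for l in range(1, steps + 1):
        z_hat = phi * z_hat  # OLS residuals have mean zero, so no mean term
        forecasts.append(b0 + b1 * (n + l) + z_hat)
    return forecasts

print(trend_plus_ar1_forecast([2, 4, 6, 8, 10], steps=2))  # [12.0, 14.0]
```

For data lying exactly on a line the residuals vanish and the forecast is the extended line; with noisy data the residual forecast decays toward zero, so the forecasts settle onto the fitted trend.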

References


[1] W. Woodward and B. Salder, "Forecasting", SMU, 2019