Forecasting¶

Forecasting is extrapolation into the future based on a time series model.

Notation¶

$t_0$ is the horizon of the time series
$l$ is the number of time steps ahead to forecast, called the "lead time"
$\hat{x}_{t_0} (l)$ is the forecast of $X_{t_0 + l}$ given data up to the horizon $t_0$

Example¶

Let $t_0 = 10$. Then the forecast for $X_{12}$ is $\hat{X}_{10} (2)$.

ARMA Forecasting¶

Once the model is fit, the forecast is calculated iteratively, one lead time at a time: each forecast is built from the forecasts at earlier lead times.

Recall that $\bar{x}$ is the estimate for $\mu$ of an ARMA process.

Then the forecast for an ARMA(p,q) is described by

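$$ \hat{X}_{t_0} \left( {l} \right) = \sum_{i=1}^p \phi_i \hat{X}_{t_0} \left( l - i \right) - \sum_{j=1}^q \theta_j \hat{a}_{t_0 + l - j} + \bar{x} \left[ 1 - \sum_{i = 1}^p \phi_i \right] + \hat{a}_{t_0}(l) $$

where $\hat{X}_{t_0} (k) = X_{t_0 + k}$ for $k \le 0$ (observed values are used where they are available), and residuals for times after $t_0$, including $\hat{a}_{t_0}(l)$, are set to zero, their expected value.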

Facts About the Forecast¶

The forecast errors are related to the white noise

$$ e_{t_0} (l) = X_{t_0 + l} - \hat{X}_{t_0} \left( {l} \right) = \sum_{k=0}^{l-1} \psi_k a_{t_0 +l -k} $$

$e_{t_0} (l)$ is normally distributed when the white noise $a_t$ is normally distributed
$E[e_{t_0} (l)] = 0$
The variance of the prediction error is given by

$$ var \left[ e_{t_0} (l) \right] = var \left[ \sum_{k=0}^{l-1} \psi_k a_{t_0 +l -k} \right] = \sigma_a^2 \left[ \sum_{i = 0}^{l-1} \psi_i^2 \right] $$

Forecast Probability Limits¶

Given $\hat{X}_{t_0} (l)$, an estimate of $X_{t_0 + l}$ based on data up to $t_0$, limits on the forecast are constructed such that the probability that $X_{t_0 + l}$ falls within the limits is $(1 - \alpha) \times 100\%$, for example 95% when $\alpha = 0.05$.

The forecast probability limits for an ARMA process are given by

$$ Forecast \,\, Interval: \,\, \hat{X}_{t_0} \left( {l} \right) \pm z_{1-\alpha / 2} \hat{\sigma}_a \left[ \sum_{i = 0}^{l-1} \psi_i^2 \right]^\frac{1}{2} $$

where $\psi_i$ are the weights of the ARMA model expressed as a general linear process.
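
The limits can be computed with a short recursion for the $\psi$ weights. The Python sketch below is illustrative only: it assumes the sign convention $\phi(B) X_t = \theta(B) a_t$ with $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ (matching the minus sign on the $\theta$ terms in the forecast equation), under which $\psi_0 = 1$ and $\psi_k = \sum_i \phi_i \psi_{k-i} - \theta_k$. The function names, the value of `sigma_a`, and the default `z = 1.96` (the 95% normal quantile) are assumptions made for the example.

```python
import math


def psi_weights(phi, theta, n):
    """Return the first n psi-weights (psi_0 .. psi_{n-1}) of an ARMA(p, q).

    phi   -- AR coefficients [phi_1, ..., phi_p]
    theta -- MA coefficients [theta_1, ..., theta_q], with
             theta(B) = 1 - theta_1 B - ... - theta_q B^q (assumed convention)
    """
    psi = [1.0]
    for k in range(1, n):
        # AR part: sum of phi_i * psi_{k-i} for i = 1 .. min(k, p)
        val = sum(phi[i - 1] * psi[k - i] for i in range(1, min(k, len(phi)) + 1))
        # MA part: subtract theta_k while k <= q
        if k <= len(theta):
            val -= theta[k - 1]
        psi.append(val)
    return psi


def forecast_half_width(phi, theta, sigma_a, lead, z=1.96):
    """Half-width of the forecast limits at the given lead time."""
    psi = psi_weights(phi, theta, lead)
    return z * sigma_a * math.sqrt(sum(w * w for w in psi))


# Hypothetical AR(2) with sigma_a = 1.0; the limits widen as the lead time grows.
for lead in (1, 2, 3):
    print(lead, forecast_half_width([1.6, -0.8], [], 1.0, lead))
```

At lead time 1 only $\psi_0 = 1$ enters the sum, so the half-width is simply $z_{1-\alpha/2} \hat{\sigma}_a$.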

Eventual Forecast of an ARMA Process¶

Recall that since an ARMA process is stationary, its expected value is the mean. Then, intuitively, as the lead time of the forecast grows large, the forecast will regress to the mean, i.e. as $l \rightarrow \infty$, $\hat{X}_{t_0} \left( {l} \right) \rightarrow \mu$ (estimated in practice by $\bar{x}$).

Example - ARMA(1,0)¶

From the general equation above, the forecast for an ARMA(1,0) is described by

$$ \hat{x}_{t_0} (l) = \phi_1 \hat{x}_{t_0} (l-1) + \bar{x} (1 - \phi_1) + \hat{a}_{t_0}(l) $$

Let $\hat{a}_{t_0}(l) = 0$, since future white noise is random with an expected value of zero. Then the forecast at lead time $l$ can be calculated as

$$ \hat{x}_{t_0} (l) = \phi_1 \hat{x}_{t_0} (l-1) + \bar{x} (1 - \phi_1) $$

Let $t_0 = 10$ and assume we want to forecast $X_{11}$ and $X_{12}$. The forecast steps are

$$ \hat{x}_{10} (1) = \phi_1 \hat{x}_{10} (0) + \bar{x} (1 - \phi_1) $$

$$ \hat{x}_{10} (2) = \phi_1 \hat{x}_{10} (1) + \bar{x} (1 - \phi_1) $$

where $\hat{x}_{10} (0) = X_{10}$
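
As a short worked derivation (not part of the original notes), subtracting $\bar{x}$ from both sides of the ARMA(1,0) recursion and unrolling it back to the horizon gives a closed form:

$$ \hat{x}_{t_0} (l) - \bar{x} = \phi_1 \left[ \hat{x}_{t_0} (l-1) - \bar{x} \right] = \phi_1^l \left( X_{t_0} - \bar{x} \right) $$

Since $|\phi_1| < 1$ for a stationary process, the right-hand side shrinks toward zero as $l$ grows, which illustrates the eventual forecast: $\hat{x}_{t_0} (l) \rightarrow \bar{x}$.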

Example - ARMA(2,0)¶

From the general equation above, the forecast for an ARMA(2,0) is described by

$$ \hat{x}_{t_0} (l) = \phi_1 \hat{x}_{t_0} (l-1) + \phi_2 \hat{x}_{t_0} (l-2) + \bar{x} (1 - \phi_1 - \phi_2) + \hat{a}_{t_0}(l) $$

Using the following realization data and model, provide forecasts for $\hat{x}_{75} (1)$, $\hat{x}_{75} (2)$, and $\hat{x}_{75} (3)$.

Model

$$ X_t = 1.6 X_{t-1} - 0.8 X_{t-2} + 30 (1 - 1.6 + 0.8) + a_t $$

Realization Data

$t_0 = 75$
$X_{74} = 27.7$
$X_{75} = 23.4$
$\bar{X} = 29.5$

Calculation for $\hat{x}_{75} (1)$

$$ \hat{x}_{75} (1) = 1.6 \hat{x}_{75} (0) - 0.8 \hat{x}_{75} (-1) + \bar{x} (1 - 1.6 + 0.8) $$

$$ \hat{x}_{75} (1) = 1.6 (23.4) - 0.8 (27.7) + 29.5 (1 - 1.6 + 0.8) = 21.2 $$

Calculation for $\hat{x}_{75} (2)$

$$ \hat{x}_{75} (2) = 1.6 \hat{x}_{75} (1) - 0.8 \hat{x}_{75} (0) + \bar{x} (1 - 1.6 + 0.8) $$

$$ \hat{x}_{75} (2) = 1.6 (21.2) - 0.8 (23.4) + 29.5 (1 - 1.6 + 0.8) = 21.1 $$

Calculation for $\hat{x}_{75} (3)$

$$ \hat{x}_{75} (3) = 1.6 \hat{x}_{75} (2) - 0.8 \hat{x}_{75} (1) + \bar{x} (1 - 1.6 + 0.8) $$

$$ \hat{x}_{75} (3) = 1.6 (21.1) - 0.8 (21.2) + 29.5 (1 - 1.6 + 0.8) = 22.7 $$
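
The three calculations above follow the same recursion, which can be sketched in Python. This is a minimal illustration, not a library routine: the function name `forecast_ar` and its arguments are hypothetical, and it covers only the pure AR case with future white noise set to zero.

```python
def forecast_ar(phi, history, xbar, steps):
    """Recursive AR(p) forecasts with future white noise set to zero.

    phi     -- AR coefficients [phi_1, ..., phi_p]
    history -- observed values, most recent last (at least p of them)
    xbar    -- sample mean, used as the estimate of mu
    steps   -- number of lead times to forecast
    """
    p = len(phi)
    const = xbar * (1 - sum(phi))
    values = list(history[-p:])  # observed values, then forecasts as they are made
    forecasts = []
    for _ in range(steps):
        # phi_1 multiplies the most recent value, phi_2 the one before it, ...
        nxt = const + sum(phi[i] * values[-1 - i] for i in range(p))
        forecasts.append(nxt)
        values.append(nxt)
    return forecasts


# Data from the example: X_74 = 27.7, X_75 = 23.4, sample mean 29.5
forecasts = forecast_ar([1.6, -0.8], [27.7, 23.4], 29.5, 3)
print([round(f, 1) for f in forecasts])  # [21.2, 21.1, 22.7]
```

The sketch carries full precision between steps rather than rounding each forecast, which in this case still reproduces the hand-calculated values to one decimal place.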

Checking Forecasts¶

Model forecasts are often validated by only using the first section of the data for model training and using a later section to calculate residuals. Mean square error of the residuals can be used to compare models.

To do this, the forecast horizon is set back $k$ steps: the model is fit to the first $n-k$ time steps and then used to forecast the $k$ withheld time steps, whose true values are known.
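
A minimal Python sketch of this check follows. The helper `holdout_mse`, the two toy forecasters, and the data are all made up for illustration; in practice the forecaster would be a fitted ARMA model.

```python
def holdout_mse(series, k, forecaster):
    """Mean squared error of forecasts over the last k observations.

    forecaster(train, k) must return k forecasts made from the
    horizon at the end of `train`.
    """
    train, test = series[:-k], series[-k:]
    preds = forecaster(train, k)
    return sum((x - f) ** 2 for x, f in zip(test, preds)) / k


def mean_forecast(train, k):
    """Forecast every lead time with the sample mean of the training data."""
    xbar = sum(train) / len(train)
    return [xbar] * k


def last_value_forecast(train, k):
    """Forecast every lead time with the last observed value."""
    return [train[-1]] * k


series = [5.0, 6.0, 7.0, 6.0, 5.0, 6.0, 7.0, 8.0]  # made-up data
print("mean forecast MSE:      ", holdout_mse(series, 2, mean_forecast))
print("last-value forecast MSE:", holdout_mse(series, 2, last_value_forecast))
```

The candidate with the smaller mean square error on the withheld points would be preferred under this criterion.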

Forecasting with ARIMA¶

Independence from the Realization Mean¶

Consider the following ARIMA model

$$ (1-B) \left( X_t - \mu \right) = a_t $$

Then

$$ \hat{X}_{t_0} (l) = \hat{X}_{t_0} (l - 1) + \bar{X} ( 1 - 1 ) $$

Notice that the mean in the equation above is multiplied by 0 due to the root on the unit circle.

Effects of Roots on the Unit Circle¶

The forecast depends on the order of differencing $d$ of the ARIMA model.

With one factor of $(1-B)$, the forecast for the next value is just the value at the horizon.
If the order $d$ is greater than 1, a trend will be preserved in the forecast.
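
Both behaviours can be seen in a small Python sketch. It assumes the simplest case $(1-B)^d X_t = a_t$ with no other ARMA terms and future white noise set to zero; the function name `forecast_differenced` and the sample values are hypothetical.

```python
from math import comb


def forecast_differenced(history, d, steps):
    """Forecasts for (1 - B)^d X_t = a_t with future noise set to zero."""
    # Expanding (1 - B)^d gives X_t = sum_j coef_j * X_{t-j} + a_t with
    # coef_j = (-1)^(j+1) * C(d, j) for j = 1 .. d
    coefs = [(-1) ** (j + 1) * comb(d, j) for j in range(1, d + 1)]
    values = list(history[-d:])
    out = []
    for _ in range(steps):
        nxt = sum(c * values[-j] for j, c in enumerate(coefs, start=1))
        out.append(nxt)
        values.append(nxt)
    return out


history = [3, 5, 8]  # made-up observations, most recent last
print(forecast_differenced(history, 1, 3))  # [8, 8, 8]: the value at the horizon
print(forecast_differenced(history, 2, 3))  # [11, 14, 17]: the last slope continues
```

With $d = 2$ the recursion is $\hat{X}_{t_0}(l) = 2 \hat{X}_{t_0}(l-1) - \hat{X}_{t_0}(l-2)$, so the forecasts extend the line through the last two observations.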

Forecast Probability Limits¶

Since the model is not stationary, the forecast limits widen without bound as $l$ increases.
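
As a worked case, take the random walk $(1-B) X_t = a_t$. Its forecast error is the sum of the $l$ future white noise terms, so every weight $\psi_k$ equals 1 and the half-width of the limits becomes

$$ z_{1-\alpha / 2} \hat{\sigma}_a \left[ \sum_{k = 0}^{l-1} 1^2 \right]^\frac{1}{2} = z_{1-\alpha / 2} \hat{\sigma}_a \sqrt{l} $$

which grows in proportion to $\sqrt{l}$ and never levels off, unlike the stationary case.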

Forecasting with Seasonality¶

A purely seasonal model of the form $(1 - B^s) X_t = a_t$ forecasts the value at lead time $l$ to be equal to the value at $l-s$, where $s$ is the length of the season.
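
The rule of repeating the value one season back can be sketched in a few lines of Python. The function name `forecast_seasonal` and the quarterly-looking data are made up for illustration, and the sketch assumes at least one full season of observations is available.

```python
def forecast_seasonal(history, s, steps):
    """Forecasts that repeat the value observed (or forecast) s steps earlier."""
    values = list(history[-s:])  # the most recent full season
    out = []
    for _ in range(steps):
        nxt = values[-s]  # value one season back
        out.append(nxt)
        values.append(nxt)
    return out


history = [1, 2, 3, 4, 1, 2, 3, 4]  # made-up data with s = 4
print(forecast_seasonal(history, 4, 6))  # [1, 2, 3, 4, 1, 2]
```

The last observed season is simply repeated for as many lead times as requested.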

Special Case: Airline Models¶

Seasonal models of the following form are typically known as "Airline Models."

$$ (1-B) \left( 1- B^s \right) X_t = a_t $$

The addition of the $(1-B)$ with the seasonality term preserves trends in the data.
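
To see why, the operator can be expanded (a short derivation added for illustration):

$$ (1-B) \left( 1- B^s \right) = 1 - B - B^s + B^{s+1} $$

Setting future white noise to zero then gives the forecast recursion

$$ \hat{X}_{t_0} (l) = \hat{X}_{t_0} (l-1) + \hat{X}_{t_0} (l-s) - \hat{X}_{t_0} (l-s-1) $$

where $\hat{X}_{t_0} (k) = X_{t_0 + k}$ for $k \le 0$. Each forecast equals the previous forecast plus the change observed over the same step one season earlier, so both the seasonal pattern and the trend carry forward.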

Signal + Noise Models¶

A signal plus noise model is expressed in the following form:

$$ X_t = \lambda (t) + Z_t + a_t $$

where $\lambda$ is a deterministic function and $Z_t$ represents some noise model (generally modeled by an ARMA model).

Typically, the following strategy is used to forecast with a deterministic model:

Fit the model to the data
Calculate the residuals, which are termed $Z_t$
Fit an ARMA model to $Z_t$ to estimate the structure of the noise

Then, the forecast is given by

$$ \hat{X}_{t_0} (l) = \lambda \left( t_0 + l \right) + \hat{Z}_{t_0} (l) $$

where $\lambda$ represents the deterministic function.

Linear Trend¶

For a linear model,

$X_t$ is regressed on $t$ using OLS
$\hat{Z}_t$ is calculated as $\hat{Z}_t = X_t - \hat{\beta}_0 - \hat{\beta}_1 t$ where $\hat{\beta}_0$ and $\hat{\beta}_1$ are estimates from OLS.
Fit an ARMA model to $\hat{Z}_t$

Then, the forecasts for a linear trend are

$$ \hat{X}_{t_0} (l) = \hat{\beta}_0 + \hat{\beta}_1 \left( t_0 + l \right) + \hat{Z}_{t_0} (l) $$

And the forecast limits are given by

$$ Forecast \,\, Interval: \,\, \hat{\beta}_0 + \hat{\beta}_1 \left( t_0 + l \right) + \hat{Z}_{t_0} (l) \pm z_{1-\alpha/2} \hat{\sigma}_a \left[ \sum_{k = 0}^{l-1} \psi_k^2 \right]^\frac{1}{2} $$

where $\hat{\sigma}_a$ and $\psi_k$ are based on the ARMA fit to $\hat{Z}_t$
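
The whole procedure for a linear trend can be sketched in plain Python. Several choices here are assumptions made to keep the example short: the noise is modeled as an AR(1), its coefficient is estimated by a lag-1 least-squares ratio rather than a full ARMA fit, and the function names and data are made up.

```python
def ols_line(x):
    """Least-squares intercept and slope for x_t regressed on t = 1..n."""
    n = len(x)
    t = range(1, n + 1)
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x))
    b1 = sxy / sxx
    b0 = x_bar - b1 * t_bar
    return b0, b1


def forecast_trend_plus_ar1(x, steps):
    """Forecast = fitted line at t0 + l plus the AR(1) forecast of the residual."""
    n = len(x)
    b0, b1 = ols_line(x)
    # Residuals from the fitted line play the role of Z_t
    z = [xi - b0 - b1 * t for t, xi in enumerate(x, start=1)]
    # Assumed AR(1) estimate: lag-1 least squares; OLS residuals have
    # mean zero, so no mean term is included
    phi = sum(z[i] * z[i - 1] for i in range(1, n)) / sum(v * v for v in z[:-1])
    out = []
    z_hat = z[-1]
    for l in range(1, steps + 1):
        z_hat = phi * z_hat  # AR(1) forecast of the noise at lead l
        out.append(b0 + b1 * (n + l) + z_hat)
    return out


x = [1.0, 3.0, 4.0, 6.0, 9.0, 10.0]  # made-up realization, t = 1..6
print(forecast_trend_plus_ar1(x, 3))
```

As the lead time grows, the noise forecast decays toward zero whenever the estimated coefficient is below 1 in absolute value, and the forecasts approach the fitted line.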

References¶

[1] W. Woodward and B. Sadler, "Forecasting", SMU, 2019