Cross-validation for time series

December 5, 2016

(This article was originally published at Hyndsight, and syndicated at StatsBlogs.)

I’ve added a couple of new functions to the forecast package for R which implement two types of cross-validation for time series.

K-fold cross-validation for autoregression

The first is regular k-fold cross-validation for autoregressive models. Although cross-validation is sometimes not valid for time series models, it does work for autoregressions, which includes many machine learning approaches to time series. The theoretical background is provided in Bergmeir, Hyndman and Koo (2015). So cross-validation can be applied to any model where the predictors are lagged values of the response variable.

This is implemented for NNAR models (neural network autoregressions) in R as follows:

modelcv <- CVar(lynx, k=5, lambda=0.15)

The output is a summary of the accuracy across folds:

5-fold cross-validation
                  Mean          SD
ME        -32.88142801  98.0725227
RMSE      931.90966858 352.8705338
MAE       608.99488205 272.1244879
MPE       -17.84710226  15.2700638
MAPE       53.99760978  12.7264054
ACF1        0.04842174   0.1480883
Theil's U   0.82984737   0.1487229

The CVar function is rather limited at this stage, and will only handle cross-validation for models computed using nnetar. If there is enough interest, I might add more functionality at a later stage.

Time series cross-validation

In this procedure, there is a series of test sets, each consisting of a single observation. The corresponding training set consists only of observations that occurred prior to the observation that forms the test set. Thus, no future observations can be used in constructing the forecast. The following diagram illustrates the series of training and test sets, where the blue observations form the training sets, and the red observations form the test sets.


The forecast accuracy is computed by averaging over the test sets. This procedure is sometimes known as “evaluation on a rolling forecasting origin” because the “origin” at which the forecast is based rolls forward in time.

With time series forecasting, one-step forecasts may not be as relevant as multi-step forecasts. In this case, the cross-validation procedure based on a rolling forecasting origin can be modified to allow multi-step errors to be used. Suppose that we are interested in models that produce good 4-step-ahead forecasts. Then the corresponding diagram is shown below.


Time series cross validation is implemented with the tsCV function. In the following example, we compare the residual RMSE with the RMSE obtained via time series cross-validation.

e <- tsCV(dj, rwf, drift=TRUE, h=1)
sqrt(mean(e^2, na.rm=TRUE))
## [1] 22.68249
sqrt(mean(residuals(rwf(dj, drift=TRUE))^2, na.rm=TRUE))
## [1] 22.49681

Here I apply a random walk with drift to the Dow-Jones index time series dj. The first calculation implements a one-step time series cross-validation where the drift parameter is re-estimated at every forecast origin. The second calculation estimates the drift parameter once for the whole data set, and then computes the RMSE from the one-step forecasts. As expected, the RMSE from the residuals is smaller, as the corresponding “forecasts” are based on a model fitted to the entire data set, rather than being true forecasts.

The tsCV function is very general, and will work for any forecasting function that returns an object of class forecast. You don’t even have to specify the minimum sample size for model fitting, as it will silently fit models beginning with a single observation, and return a missing value whenever the model cannot be estimated.

These functions are only on the github version for now, but will migrate to CRAN next time I release a new stable version.

Please comment on the article here: Hyndsight

Tags: , , , ,