GapRollForward

This page presents the GapRollForward class.

Rolling Forward, better known as Walk Forward, is a popular cross-validation method for time series. In contrast to GapLeavePOut and GapKFold, which both allow training sets on both sides of the test set, Walk Forward requires that the training set come before the test set. That is, a model inferred from (past) data can only be validated on future data.

[Figure: Walk Forward cross-validation (../_images/walk_forward.svg)]

The GapRollForward class simply introduces gaps into vanilla Walk Forward.

[Figure: Walk Forward with gaps, as produced by GapRollForward (../_images/gap_walk_forward.svg)]

The following code snippet produces the cross-validation setup shown in the image above:

>>> from tscv import GapRollForward
>>> cv = GapRollForward(min_train_size=3, gap_size=1, max_test_size=2)
>>> for train, test in cv.split(range(10)):
...     print("train:", train, "test:", test)
...
train: [0 1 2] test: [4 5]
train: [0 1 2 3 4] test: [6 7]
train: [0 1 2 3 4 5 6] test: [8 9]

In the code sample, GapRollForward is a class provided by this package. It has a split method, which takes in the whole data set and produces the indices of the training and test sets.
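
As an illustration, here is a minimal sketch of slicing the data with these indices. The feature matrix X and target y below are toy data invented for this example (one row per time step), not part of the package:

>>> import numpy as np
>>> from tscv import GapRollForward
>>> X = np.arange(20).reshape(10, 2)   # 10 time steps, 2 features (toy data)
>>> y = np.arange(10)                  # toy target, one value per time step
>>> cv = GapRollForward(min_train_size=3, gap_size=1, max_test_size=2)
>>> for train, test in cv.split(X):
...     X_train, y_train = X[train], y[train]
...     X_test, y_test = X[test], y[test]
...     print(X_train.shape, X_test.shape)
...
(3, 2) (2, 2)
(5, 2) (2, 2)
(7, 2) (2, 2)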

Tip

By calling the split method directly, you can verify that your configuration is what you want. In practice, though, you will rarely use these indices directly. Instead, you will pass an instance of GapRollForward as the cv argument to scikit-learn utilities such as cross_val_score.
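
For instance, a minimal sketch of passing GapRollForward to scikit-learn's cross_val_score, assuming scikit-learn is installed; the LinearRegression estimator and the toy data are illustrative choices, not part of this package:

>>> import numpy as np
>>> from sklearn.linear_model import LinearRegression
>>> from sklearn.model_selection import cross_val_score
>>> from tscv import GapRollForward
>>> X = np.arange(20).reshape(10, 2)   # toy feature matrix
>>> y = np.arange(10)                  # toy target
>>> cv = GapRollForward(min_train_size=3, gap_size=1, max_test_size=2)
>>> scores = cross_val_score(LinearRegression(), X, y, cv=cv)
>>> len(scores)                        # one score per roll-forward split
3

Here cross_val_score drives the fitting and scoring itself; GapRollForward only decides which indices go into each training and test set.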