GapKFold

class tscv.GapKFold(n_splits=5, gap_before=0, gap_after=0)

K-fold cross-validator with gaps.

Provides train/test indices to split data into train/test sets. Splits the dataset into k consecutive folds (without shuffling).

Each fold is then used once as a validation set, while the k - 1 remaining folds, minus the samples that fall inside the gaps, form the training set.

Parameters
n_splits : int, default=5

Number of folds. Must be at least 2.

gap_before : int, default=0

Number of samples excluded from the training set immediately before each test set.

gap_after : int, default=0

Number of samples excluded from the training set immediately after each test set.

Notes

The first n_samples % n_splits folds have size n_samples // n_splits + 1; the remaining folds have size n_samples // n_splits, where n_samples is the number of samples.
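
The fold-size rule above can be checked with a few lines of plain Python. This is only an illustrative sketch: `fold_sizes` is a hypothetical helper written for this note, not a function provided by tscv.

```python
def fold_sizes(n_samples, n_splits):
    """Hypothetical helper (not part of tscv): sizes of the consecutive folds."""
    # base is n_samples // n_splits, extra is n_samples % n_splits
    base, extra = divmod(n_samples, n_splits)
    # The first `extra` folds receive one additional sample each.
    return [base + 1 if i < extra else base for i in range(n_splits)]


print(fold_sizes(10, 5))  # [2, 2, 2, 2, 2]
print(fold_sizes(11, 3))  # [4, 4, 3]
```

The sizes always add up to n_samples, so every sample lands in exactly one test fold.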

Examples

>>> import numpy as np
>>> from tscv import GapKFold
>>> kf = GapKFold(n_splits=5, gap_before=3, gap_after=4)
>>> kf.get_n_splits(np.arange(10))
5
>>> print(kf)
GapKFold(gap_after=4, gap_before=3, n_splits=5)
>>> for train_index, test_index in kf.split(np.arange(10)):
...    print("TRAIN:", train_index, "TEST:", test_index)
TRAIN: [6 7 8 9] TEST: [0 1]
TRAIN: [8 9] TEST: [2 3]
TRAIN: [0] TEST: [4 5]
TRAIN: [0 1 2] TEST: [6 7]
TRAIN: [0 1 2 3 4] TEST: [8 9]
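
The splits printed above are consistent with a simple rule: for each test fold, drop the `gap_before` samples preceding it and the `gap_after` samples following it, and train on whatever is left. The sketch below reproduces those index sets in plain Python under that assumption. `gap_kfold_indices` is a hypothetical name used only for illustration; it is not the library's implementation and it yields lists rather than ndarrays.

```python
def gap_kfold_indices(n_samples, n_splits, gap_before=0, gap_after=0):
    """Hypothetical sketch (not tscv's code): yield (train, test) index lists."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        # Consecutive test fold; the first `extra` folds get one more sample.
        stop = start + base + (1 if i < extra else 0)
        test = list(range(start, stop))
        # Assumed gap rule: exclude [start - gap_before, stop + gap_after).
        lo = start - gap_before
        hi = stop + gap_after
        train = [j for j in range(n_samples) if j < lo or j >= hi]
        yield train, test
        start = stop


for train, test in gap_kfold_indices(10, 5, gap_before=3, gap_after=4):
    print("TRAIN:", train, "TEST:", test)
```

With large gaps relative to the fold size, the training sets shrink quickly, as the middle split above shows (a single training sample remains).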

get_n_splits(X=None, y=None, groups=None)

Returns the number of splitting iterations in the cross-validator.

Parameters
X : object

Always ignored, exists for compatibility.

y : object

Always ignored, exists for compatibility.

groups : object

Always ignored, exists for compatibility.

Returns
n_splits : int

The number of splitting iterations in the cross-validator.

split(X, y=None, groups=None)

Generate indices to split data into training and test sets.

Parameters
X : array-like, shape (n_samples, n_features)

Training data, where n_samples is the number of samples and n_features is the number of features.

y : array-like, of length n_samples

The target variable for supervised learning problems.

groups : array-like, with shape (n_samples,), optional

Group labels for the samples used while splitting the dataset into train/test sets.

Yields
train : ndarray

The training set indices for that split.

test : ndarray

The testing set indices for that split.