[![Downloads](https://pepy.tech/badge/tscv/month)](https://pepy.tech/project/tscv)
[![Build Status](https://travis-ci.com/WenjieZ/TSCV.svg?branch=master)](https://travis-ci.com/WenjieZ/TSCV)
[![codecov](https://codecov.io/gh/WenjieZ/TSCV/branch/master/graph/badge.svg?token=dcGlEfHCw2)](https://codecov.io/gh/WenjieZ/TSCV)
[![Documentation Status](https://readthedocs.org/projects/tscv/badge/?version=latest)](https://tscv.readthedocs.io/en/latest/?badge=latest)
[![DOI](https://zenodo.org/badge/186586661.svg)](https://zenodo.org/badge/latestdoi/186586661)
![](train-gap-test.svg)
# TSCV: Time Series Cross-Validation
This repository is a [scikit-learn](https://scikit-learn.org) extension for time series cross-validation.
It introduces **gaps** between the training set and the test set. The gaps reduce the effect of temporal dependence between neighboring observations and help prevent information leakage from the test set into the training set.
## Installation
```bash
pip install tscv
```
or
```bash
conda install -c conda-forge tscv
```
## Usage
This extension defines three cross-validator classes and one function:
- `GapLeavePOut`
- `GapKFold`
- `GapRollForward`
- `gap_train_test_split`
The three classes can all be passed, as the `cv` argument, to
scikit-learn functions such as `cross_validate`, `cross_val_score`,
and `cross_val_predict`, just like the native cross-validator classes.
The one function is an alternative to the `train_test_split` function in `scikit-learn`.
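To make the idea of a gap concrete, here is a minimal pure-Python sketch of K-fold splitting with gaps on both sides of each test fold. It is a conceptual illustration only, not the library's implementation: the function name `gap_kfold_indices` is hypothetical, and it assumes contiguous test folds with `gap_before` samples excluded before each fold and `gap_after` samples excluded after it.

```python
def gap_kfold_indices(n_samples, n_splits, gap_before=0, gap_after=0):
    """Yield (train, test) index lists for contiguous folds with gaps.

    Hypothetical helper for illustration; not part of tscv.
    """
    # Spread the remainder over the first folds, as plain K-fold does.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for i in range(n_splits):
        stop = start + base + (1 if i < extra else 0)
        test = list(range(start, stop))
        # Samples inside the gaps on either side of the test fold
        # are dropped from the training set.
        lo = max(0, start - gap_before)
        hi = min(n_samples, stop + gap_after)
        train = list(range(0, lo)) + list(range(hi, n_samples))
        yield train, test
        start = stop


folds = list(gap_kfold_indices(n_samples=20, n_splits=4, gap_before=2, gap_after=2))
for train, test in folds:
    print("train:", train, "test:", test)
```

In every fold the training indices stay at least two positions away from the test indices, which is the property the gap parameters are meant to guarantee.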
## Examples
The following example uses `GapKFold` instead of `KFold` as the cross-validator.
```python
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import cross_val_score
from tscv import GapKFold
iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)
# use GapKFold as the cross-validator
cv = GapKFold(n_splits=5, gap_before=5, gap_after=5)
scores = cross_val_score(clf, iris.data, iris.target, cv=cv)
```
The following example uses `gap_train_test_split` to split the data set into the training set and the test set.
```python
import numpy as np
from tscv import gap_train_test_split
X, y = np.arange(20).reshape((10, 2)), np.arange(10)
X_train, X_test, y_train, y_test = gap_train_test_split(X, y, test_size=2, gap_size=2)
```
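The index arithmetic behind such a split can be sketched in plain Python. This is an assumption about the general technique rather than a description of the library's internals: the hypothetical helper `split_with_gap` keeps the last `test_size` samples for testing, drops the `gap_size` samples just before them, and trains on everything earlier.

```python
def split_with_gap(n_samples, test_size, gap_size):
    """Return (train, test) index lists separated by a gap.

    Hypothetical helper for illustration; not part of tscv.
    """
    train_end = n_samples - test_size - gap_size
    if train_end <= 0:
        raise ValueError("test_size + gap_size leaves no training samples")
    train = list(range(train_end))                        # earliest samples
    test = list(range(n_samples - test_size, n_samples))  # latest samples
    return train, test


train_idx, test_idx = split_with_gap(n_samples=10, test_size=2, gap_size=2)
print(train_idx)  # [0, 1, 2, 3, 4, 5]
print(test_idx)   # [8, 9]
```

Indices 6 and 7 belong to neither set: they form the gap between training and test data.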
## Contributing
- Report bugs in the issue tracker
- Express your use cases in the issue tracker
## Documentation
- [tscv.readthedocs.io](https://tscv.readthedocs.io)
## Acknowledgments
- I would like to thank Jeffrey Racine and Christoph Bergmeir for the helpful discussion.
## License
BSD-3-Clause
## Citation
Wenjie Zheng. (2021). Time Series Cross-Validation (TSCV): an extension for scikit-learn. Zenodo. http://doi.org/10.5281/zenodo.4707309
```latex
@software{zheng_2021_4707309,
title={{Time Series Cross-Validation (TSCV): an extension for scikit-learn}},
author={Zheng, Wenjie},
month={april},
year={2021},
publisher={Zenodo},
doi={10.5281/zenodo.4707309},
url={http://doi.org/10.5281/zenodo.4707309}
}
```