tscv

- Name: tscv
- Version: 0.1.3
- Home page: https://github.com/WenjieZ/TSCV
- Summary: Time series cross-validation
- Upload time: 2023-01-23 18:50:02
- Author: Wenjie Zheng
- Requires Python: >=3.6
- License: new BSD
- Keywords: model selection, hyperparameter optimization, backtesting
- Requirements: No requirements were recorded.

[![Downloads](https://pepy.tech/badge/tscv/month)](https://pepy.tech/project/tscv)
[![Build Status](https://travis-ci.com/WenjieZ/TSCV.svg?branch=master)](https://travis-ci.com/WenjieZ/TSCV)
[![codecov](https://codecov.io/gh/WenjieZ/TSCV/branch/master/graph/badge.svg?token=dcGlEfHCw2)](https://codecov.io/gh/WenjieZ/TSCV)
[![Documentation Status](https://readthedocs.org/projects/tscv/badge/?version=latest)](https://tscv.readthedocs.io/en/latest/?badge=latest)
[![DOI](https://zenodo.org/badge/186586661.svg)](https://zenodo.org/badge/latestdoi/186586661)

![](train-gap-test.svg)

# TSCV: Time Series Cross-Validation

This repository is a [scikit-learn](https://scikit-learn.org) extension for time series cross-validation.
It introduces **gaps** between the training set and the test set, which mitigate the effect of temporal dependence in time series and help prevent information leakage.
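
To make the idea of a gap concrete, here is a minimal pure-Python sketch. It is not the package's implementation: the helper `train_indices_with_gaps` is a hypothetical name, and the parameter names are borrowed from the `GapKFold` example further down. Samples next to the test block are withheld from training, so the model never trains on observations that are temporally close to those it is tested on.

```python
def train_indices_with_gaps(n_samples, test_start, test_stop,
                            gap_before=0, gap_after=0):
    """Return training indices that stay clear of the test block and its gaps.

    Hypothetical helper for illustration only; not part of tscv.
    The test block is the half-open range [test_start, test_stop).
    """
    # Everything in [excluded_start, excluded_stop) is either test data
    # or gap, so it is withheld from training.
    excluded_start = max(0, test_start - gap_before)
    excluded_stop = min(n_samples, test_stop + gap_after)
    return [i for i in range(n_samples)
            if i < excluded_start or i >= excluded_stop]


# 12 samples, test block = {5, 6}, two gap samples on each side.
test = list(range(5, 7))
train = train_indices_with_gaps(12, test_start=5, test_stop=7,
                                gap_before=2, gap_after=2)
print(train)  # [0, 1, 2, 9, 10, 11]
print(test)   # [5, 6]
```

Indices 3, 4, 7, and 8 belong to neither set; they form the gaps.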

## Installation

```bash
pip install tscv
```

or

```bash
conda install -c conda-forge tscv
```

## Usage

This extension defines three cross-validator classes and one function:
- `GapLeavePOut`
- `GapKFold`
- `GapRollForward`
- `gap_train_test_split`

The three classes can all be passed, as the `cv` argument, to
scikit-learn functions such as `cross_validate`, `cross_val_score`,
and `cross_val_predict`, just like the native cross-validator classes.

The one function is an alternative to the `train_test_split` function in `scikit-learn`.
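
For readers unfamiliar with the roll-forward style of validation that `GapRollForward` is named after, the sketch below shows a generic expanding-window scheme with a gap. It is an assumption-laden illustration, not the class's actual behavior or signature: the function `walk_forward_splits` and its parameters are hypothetical, and the class's real options are described in the package documentation.

```python
def walk_forward_splits(n_samples, min_train_size, gap_size, test_size=1):
    """Yield (train, test) index lists for an expanding training window.

    Hypothetical helper for illustration only; not part of tscv.
    Each test block starts `gap_size` samples after the training window ends.
    """
    train_end = min_train_size
    while train_end + gap_size + test_size <= n_samples:
        test_start = train_end + gap_size
        train = list(range(train_end))
        test = list(range(test_start, test_start + test_size))
        yield train, test
        train_end += test_size  # roll forward by one test block


splits = list(walk_forward_splits(8, min_train_size=3, gap_size=2))
for train, test in splits:
    print(train, test)
# [0, 1, 2] [5]
# [0, 1, 2, 3] [6]
# [0, 1, 2, 3, 4] [7]
```

In every split the training data lies strictly in the past of the test data, separated by two unused samples.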

## Examples

The following example uses `GapKFold` instead of `KFold` as the cross-validator.
```python
from sklearn import datasets
from sklearn import svm
from sklearn.model_selection import cross_val_score
from tscv import GapKFold

iris = datasets.load_iris()
clf = svm.SVC(kernel='linear', C=1)

# use GapKFold as the cross-validator, leaving a gap of 5 samples
# before and after each test fold
cv = GapKFold(n_splits=5, gap_before=5, gap_after=5)
# one score per fold
scores = cross_val_score(clf, iris.data, iris.target, cv=cv)
```

The following example uses `gap_train_test_split` to split the data set into the training set and the test set.
```python
import numpy as np
from tscv import gap_train_test_split

X, y = np.arange(20).reshape((10, 2)), np.arange(10)
X_train, X_test, y_train, y_test = gap_train_test_split(X, y, test_size=2, gap_size=2)
```
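
The idea behind such a split can be sketched in plain Python. This is only an illustration of the train-gap-test layout under the assumption that the test set is taken from the end of the series; `split_with_gap` is a hypothetical helper, not the package's function, and the package documentation remains the reference for what `gap_train_test_split` returns.

```python
def split_with_gap(samples, test_size, gap_size):
    """Split a sequence into (train, test), dropping a gap in between.

    Hypothetical helper for illustration only; not part of tscv.
    The last `test_size` items form the test set, and the `gap_size`
    items just before them are discarded.
    """
    n = len(samples)
    train_size = n - test_size - gap_size
    if train_size <= 0:
        raise ValueError("test_size + gap_size leaves no training data")
    return samples[:train_size], samples[n - test_size:]


series = list(range(10))
train, test = split_with_gap(series, test_size=2, gap_size=2)
print(train)  # [0, 1, 2, 3, 4, 5]
print(test)   # [8, 9]
```

Samples 6 and 7 are used for neither training nor testing.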

## Contributing
- Report bugs in the issue tracker
- Share your use cases in the issue tracker

## Documentation
- [tscv.readthedocs.io](https://tscv.readthedocs.io)

## Acknowledgments

- I would like to thank Jeffrey Racine and Christoph Bergmeir for their helpful discussions.

## License
BSD-3-Clause

## Citation

Wenjie Zheng. (2021). Time Series Cross-Validation (TSCV): an extension for scikit-learn. Zenodo. http://doi.org/10.5281/zenodo.4707309

```bibtex
@software{zheng_2021_4707309,
  title={{Time Series Cross-Validation (TSCV): an extension for scikit-learn}},
  author={Zheng, Wenjie},
  month={april},
  year={2021},
  publisher={Zenodo},
  doi={10.5281/zenodo.4707309},
  url={http://doi.org/10.5281/zenodo.4707309}
}
```

            

        