tsflex


| Field           | Value                                                                       |
| :-------------- | :-------------------------------------------------------------------------- |
| Name            | tsflex                                                                      |
| Version         | 0.4.0                                                                       |
| home_page       | https://github.com/predict-idlab/tsflex                                     |
| Summary         | Toolkit for flexible processing & feature extraction on time-series data    |
| upload_time     | 2024-04-04 10:23:04                                                         |
| maintainer      | None                                                                        |
| docs_url        | None                                                                        |
| author          | Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost                  |
| requires_python | <3.13,>=3.7.1                                                               |
| license         | MIT                                                                         |
| keywords        | time-series, processing, feature-extraction, data-science, machine learning |
| requirements    | No requirements were recorded.                                              |

# <p align="center"> <a href="https://predict-idlab.github.io/tsflex"><img alt="tsflex" src="https://raw.githubusercontent.com/predict-idlab/tsflex/main/docs/_static/logo.png" width="66%"></a></p>

[![PyPI Latest Release](https://img.shields.io/pypi/v/tsflex.svg)](https://pypi.org/project/tsflex/)
[![Conda Latest Release](https://img.shields.io/conda/vn/conda-forge/tsflex?label=conda)](https://anaconda.org/conda-forge/tsflex)
[![support-version](https://img.shields.io/pypi/pyversions/tsflex)](https://img.shields.io/pypi/pyversions/tsflex)
[![codecov](https://img.shields.io/codecov/c/github/predict-idlab/tsflex?logo=codecov)](https://codecov.io/gh/predict-idlab/tsflex)
[![CodeQL](https://github.com/predict-idlab/tsflex/actions/workflows/codeql.yml/badge.svg)](https://github.com/predict-idlab/tsflex/actions/workflows/codeql.yml)
[![Downloads](https://static.pepy.tech/badge/tsflex)](https://pepy.tech/project/tsflex)
[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?)](http://makeapullrequest.com)
[![Documentation](https://github.com/predict-idlab/tsflex/actions/workflows/deploy-docs.yml/badge.svg)](https://github.com/predict-idlab/tsflex/actions/workflows/deploy-docs.yml)
[![Testing](https://github.com/predict-idlab/tsflex/actions/workflows/test.yml/badge.svg)](https://github.com/predict-idlab/tsflex/actions/workflows/test.yml)

<!-- ![Downloads](https://img.shields.io/conda/dn/conda-forge/tsflex?logo=anaconda) -->

> _tsflex_ is a toolkit for _**flex**ible **t**ime **s**eries_ [processing](https://predict-idlab.github.io/tsflex/processing) & [feature extraction](https://predict-idlab.github.io/tsflex/features), that is efficient and makes few assumptions about sequence data.

#### Useful links

- [Paper](https://www.sciencedirect.com/science/article/pii/S2352711021001904)
- [Documentation](https://predict-idlab.github.io/tsflex/)
- [Example (machine learning) notebooks](https://github.com/predict-idlab/tsflex/tree/main/examples)

#### Installation

|                                                      | command                               |
| :--------------------------------------------------- | :------------------------------------ |
| [**pip**](https://pypi.org/project/tsflex/)          | `pip install tsflex`                  |
| [**conda**](https://anaconda.org/conda-forge/tsflex) | `conda install -c conda-forge tsflex` |

## Usage

_tsflex_ is built to be intuitive, so we encourage you to copy-paste this code and toy with some parameters!

### <a href="https://predict-idlab.github.io/tsflex/features/#getting-started">Feature extraction</a>

```python
import numpy as np
import pandas as pd
import scipy.stats as ss
from tsflex.features import MultipleFeatureDescriptors, FeatureCollection
from tsflex.utils.data import load_empatica_data

# 1. Load sequence-indexed data (in this case a time-index)
df_tmp, df_acc, df_ibi = load_empatica_data(['tmp', 'acc', 'ibi'])

# 2. Construct your feature extraction configuration
fc = FeatureCollection(
    MultipleFeatureDescriptors(
        functions=[np.min, np.mean, np.std, ss.skew, ss.kurtosis],
        series_names=["TMP", "ACC_x", "ACC_y", "IBI"],
        windows=["15min", "30min"],
        strides="15min",
    )
)

# 3. Extract features
fc.calculate(data=[df_tmp, df_acc, df_ibi], approve_sparsity=True)
```

Note that the feature extraction is performed on multivariate data with varying sample rates:

| signal | columns                     |         sample rate |
| :----- | :-------------------------- | ------------------: |
| df_tmp | ["TMP"]                     |                4 Hz |
| df_acc | ["ACC_x", "ACC_y", "ACC_z"] |               32 Hz |
| df_ibi | ["IBI"]                     | irregularly sampled |
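
To make the idea of time-based windows and strides over series with different sample rates more concrete, here is a minimal standard-library sketch. It is not how _tsflex_ is implemented: the helper `strided_windows`, the half-open window convention, the choice to label each window by its end time, and the synthetic data are all assumptions made for illustration.

```python
from bisect import bisect_left
from datetime import datetime, timedelta
from statistics import mean


def strided_windows(timestamps, values, window, stride):
    """Yield (window_end, samples) for time-based windows over sorted timestamps.

    Hypothetical helper for illustration only; it is not part of tsflex.
    Each window covers the half-open interval [start, start + window).
    """
    if not timestamps:
        return
    start, last = timestamps[0], timestamps[-1]
    while start + window <= last:
        lo = bisect_left(timestamps, start)           # first sample >= window start
        hi = bisect_left(timestamps, start + window)  # first sample >= window end
        yield start + window, values[lo:hi]
        start += stride


t0 = datetime(2024, 1, 1)

# Regularly sampled series: one sample per minute for one hour (61 samples).
tmp_t = [t0 + timedelta(minutes=m) for m in range(61)]
tmp_v = [20.0 + 0.1 * m for m in range(61)]

# Irregularly sampled series: samples arrive at uneven offsets (in minutes).
ibi_t = [t0 + timedelta(minutes=m) for m in (0, 2, 3, 11, 17, 18, 40, 44, 59, 60)]
ibi_v = [0.8, 0.9, 0.85, 0.7, 0.75, 0.8, 0.95, 0.9, 0.85, 0.8]

window, stride = timedelta(minutes=30), timedelta(minutes=15)

# The same time-based segmentation is applied to both series,
# regardless of how many samples fall inside each window.
tmp_feats = [(end, mean(v)) for end, v in strided_windows(tmp_t, tmp_v, window, stride) if v]
ibi_counts = [(end, len(v)) for end, v in strided_windows(ibi_t, ibi_v, window, stride)]
```

Because the windows are defined in time rather than in number of samples, both series produce features on the same three window end times, even though the number of samples per window differs.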

### <a href="https://predict-idlab.github.io/tsflex/processing/index.html#getting-started">Processing</a>

[Working example in our docs](https://predict-idlab.github.io/tsflex/processing/index.html#working-example)

## Why tsflex? ✨

- `Flexible`:
  - handles multivariate/multimodal time series
  - versatile function support
    => **integrates** with many packages for:
    - processing (e.g., [scipy.signal](https://docs.scipy.org/doc/scipy/reference/tutorial/signal.html), [statsmodels.tsa](https://www.statsmodels.org/stable/tsa.html#time-series-filters))
    - feature extraction (e.g., [numpy](https://numpy.org/doc/stable/reference/routines.html), [scipy.stats](https://docs.scipy.org/doc/scipy/reference/tutorial/stats.html), [antropy](https://raphaelvallat.com/antropy/build/html/api.html), [nolds](https://cschoel.github.io/nolds/nolds.html#algorithms), [seglearn](https://dmbee.github.io/seglearn/feature_functions.html)¹, [tsfresh](https://tsfresh.readthedocs.io/en/latest/text/list_of_features.html)¹, [tsfel](https://tsfel.readthedocs.io/en/latest/descriptions/feature_list.html)¹)
  - feature extraction handles **multiple strides & window sizes**
- `Efficient`:<br>
  - view-based operations for processing & feature extraction => extremely **low memory peak** & **fast execution time**<br>
    - see: [feature extraction benchmark visualization](https://predict-idlab.github.io/tsflex/#benchmark)
- `Intuitive`:<br>
  - maintains the sequence-index of the data
  - feature extraction constructs interpretable output column names
  - intuitive API
- `Few assumptions` about the sequence data:
  - no assumptions about sampling rate
  - able to deal with multivariate asynchronous data<br>i.e. data with small time-offsets between the modalities
- `Advanced functionalities`:
  - apply [FeatureCollection.**reduce**](https://predict-idlab.github.io/tsflex/features/index.html#tsflex.features.FeatureCollection.reduce) after feature selection for faster inference
  - use **function execution time logging** to discover processing and feature extraction bottlenecks
  - embedded [SeriesPipeline](http://predict-idlab.github.io/tsflex/processing/#tsflex.processing.SeriesPipeline.serialize) & [FeatureCollection](https://predict-idlab.github.io/tsflex/features/index.html#tsflex.features.FeatureCollection.serialize) **serialization**
  - time series [**chunking**](https://predict-idlab.github.io/tsflex/chunking/index.html)

¹ These integrations are shown in [integration-example notebooks](https://github.com/predict-idlab/tsflex/tree/main/examples).
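
The idea behind function execution time logging can be sketched with the standard library alone. The snippet below is only an illustration of the concept and does not use the logging API of _tsflex_: the `timed` decorator and the `timings` registry are hypothetical names, and the feature functions and windows are toy stand-ins.

```python
import time
from collections import defaultdict
from functools import wraps
from statistics import mean, pstdev

# Hypothetical registry: function name -> list of elapsed seconds, one per call.
timings = defaultdict(list)


def timed(func):
    """Wrap func so that each call appends its duration to `timings`."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper


# Toy feature functions and windows standing in for a real extraction run.
feature_funcs = [timed(min), timed(mean), timed(pstdev)]
windows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
results = {f.__name__: [f(w) for w in windows] for f in feature_funcs}

# Total time per function, slowest first, to spot bottlenecks.
report = sorted(((sum(t), name) for name, t in timings.items()), reverse=True)
for total, name in report:
    print(f"{name}: {total:.6f}s over {len(timings[name])} calls")
```

Sorting by total duration puts the most expensive function first, which is usually the one worth optimizing or dropping.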

## Future work 🔨

- scikit-learn integration for both processing and feature extraction<br>
  **note**: this is actively being developed on the [sklearn integration](https://github.com/predict-idlab/tsflex/tree/sklearn_integration) branch.
- Support time series segmentation (exposing the under-the-hood strided-rolling functionality) - [see this issue](https://github.com/predict-idlab/tsflex/issues/15)
- Support for multi-indexed dataframes

=> Also see the [enhancement issues](https://github.com/predict-idlab/tsflex/issues?q=is%3Aissue+is%3Aopen+label%3Aenhancement+)

## Contributing 👪

We are thrilled to see your contributions to further enhance `tsflex`.<br>
See [this guide](CONTRIBUTING.md) for more instructions on how to contribute.

## Referencing our package

If you use `tsflex` in a scientific publication, we would highly appreciate it if you cite us as:

```bibtex
@article{vanderdonckt2021tsflex,
    author = {Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
    title = {tsflex: flexible time series processing \& feature extraction},
    journal = {SoftwareX},
    year = {2021},
    url = {https://github.com/predict-idlab/tsflex},
    publisher={Elsevier}
}
```

Link to the paper: https://www.sciencedirect.com/science/article/pii/S2352711021001904

---

<p align="center">
👤 <i>Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost</i>
</p>

            

        