wavetrainer


Name: wavetrainer
Version: 0.0.9
Home page: https://github.com/8W9aG/wavetrainer
Summary: A library for automatically finding the optimal model within feature and hyperparameter space.
Upload time: 2025-03-22 00:24:36
Author: Will Sackfield
License: MIT
Keywords: machine-learning, ml, hyperparameter, features
Requirements: pandas, optuna, scikit-learn, feature-engine, tqdm, numpy, scipy, catboost, venn-abers, mapie

# wavetrainer

<a href="https://pypi.org/project/wavetrainer/">
    <img alt="PyPi" src="https://img.shields.io/pypi/v/wavetrainer">
</a>

A library for automatically finding the optimal model within feature and hyperparameter space on time series models.

<p align="center">
    <img src="wavetrain.png" alt="wavetrain" width="200"/>
</p>

## Dependencies :globe_with_meridians:

Python 3.11.6:

- [pandas](https://pandas.pydata.org/)
- [optuna](https://optuna.readthedocs.io/en/stable/)
- [scikit-learn](https://scikit-learn.org/)
- [feature-engine](https://feature-engine.trainindata.com/en/latest/)
- [tqdm](https://github.com/tqdm/tqdm)
- [numpy](https://numpy.org/)
- [scipy](https://scipy.org/)
- [catboost](https://catboost.ai/)
- [venn-abers](https://github.com/ip200/venn-abers)
- [mapie](https://mapie.readthedocs.io/en/stable/)

## Raison D'être :thought_balloon:

`wavetrainer` splits the various aspects of creating a good model into composable pieces and searches the space of those pieces to find an optimal model. It came about after writing this kind of code repeatedly across multiple projects. It is specifically geared towards time series models and validates them through walk-forward analysis.
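To make the idea of searching a space under walk-forward validation concrete, here is a minimal sketch using only the standard library. It is not `wavetrainer`'s implementation: the moving-average "model", the function names, and the plain loop over candidate windows are all assumptions chosen for illustration. Each candidate lookback window is scored by its one-step-ahead error, and every prediction uses only data that precedes the point being predicted.

```python
from statistics import mean
from typing import Dict, Sequence


def moving_average_forecast(history: Sequence[float], window: int) -> float:
    """Stand-in model (hypothetical): predict the mean of the last `window` values."""
    return mean(history[-window:])


def walk_forward_error(series: Sequence[float], window: int, min_train: int) -> float:
    """Mean absolute one-step-ahead error, predicting each point from past data only."""
    errors = [
        abs(series[t] - moving_average_forecast(series[:t], window))
        for t in range(min_train, len(series))
    ]
    return mean(errors)


def search_windows(
    series: Sequence[float], windows: Sequence[int], min_train: int
) -> Dict[int, float]:
    """Score every candidate lookback window with walk-forward validation."""
    return {w: walk_forward_error(series, w, min_train) for w in windows}


# Toy upward trend: shorter windows track it more closely.
series = [float(x) for x in range(1, 21)]
scores = search_windows(series, windows=[1, 3, 5], min_train=5)
best_window = min(scores, key=scores.get)
print(scores, best_window)
```

A real search would cover far more than one parameter, but the shape is the same: propose a candidate, score it on data the model has not yet seen, and keep the best.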

## Architecture :triangular_ruler:

`wavetrainer` is an object-oriented library. The entities are organised like so:

* **Trainer**: An sklearn-compatible object that can fit and predict data.
    * **Reducer**: An object that can reduce the feature space based on heuristics.
    * **Weights**: An object that adds weights to the features.
    * **Selector**: An object that can select which features to include from the training set.
    * **Calibrator**: An object that can calibrate the probabilities produced by the model.
    * **Model**: An object that represents the underlying model architecture being used.
    * **Windower**: An object that represents the lookback window of the data.
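The composition described above can be sketched as a chain of small stages that each learn something from the data and then transform it. The classes below (`ConstantReducer`, `TopKSelector`, `ToyTrainer`) are hypothetical stand-ins written for this illustration; they are not `wavetrainer`'s classes and make no claim about its internal interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol

Row = Dict[str, float]


class Stage(Protocol):
    """Anything that can learn from rows and then transform them."""

    def fit(self, rows: List[Row]) -> "Stage": ...

    def transform(self, rows: List[Row]) -> List[Row]: ...


@dataclass
class ConstantReducer:
    """Hypothetical reducer: drops columns whose value never changes."""

    keep: List[str] = field(default_factory=list)

    def fit(self, rows: List[Row]) -> "ConstantReducer":
        self.keep = [c for c in rows[0] if len({r[c] for r in rows}) > 1]
        return self

    def transform(self, rows: List[Row]) -> List[Row]:
        return [{c: r[c] for c in self.keep} for r in rows]


@dataclass
class TopKSelector:
    """Hypothetical selector: keeps the k columns with the largest value range."""

    k: int
    keep: List[str] = field(default_factory=list)

    def fit(self, rows: List[Row]) -> "TopKSelector":
        spread = {c: max(r[c] for r in rows) - min(r[c] for r in rows) for c in rows[0]}
        self.keep = sorted(spread, key=spread.get, reverse=True)[: self.k]
        return self

    def transform(self, rows: List[Row]) -> List[Row]:
        return [{c: r[c] for c in self.keep} for r in rows]


@dataclass
class ToyTrainer:
    """Hypothetical trainer: runs each stage in order on the output of the last."""

    stages: List[Stage]

    def fit_transform(self, rows: List[Row]) -> List[Row]:
        for stage in self.stages:
            rows = stage.fit(rows).transform(rows)
        return rows


rows = [
    {"a": 1.0, "b": 5.0, "c": 0.0},
    {"a": 2.0, "b": 5.0, "c": 10.0},
    {"a": 3.0, "b": 5.0, "c": 20.0},
]
trainer = ToyTrainer(stages=[ConstantReducer(), TopKSelector(k=1)])
result = trainer.fit_transform(rows)
print(result)
```

Because every stage shares the same two methods, stages can be swapped or reordered independently, which is what makes a search over combinations of pieces practical.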

## Installation :inbox_tray:

This is a Python package hosted on PyPI, so to install it simply run the following command:

`pip install wavetrainer`

Or install it from a local checkout of this repository:

`python setup.py install --old-and-unmanageable`

## Usage example :eyes:

`wavetrainer` is a library, so it is used entirely through code. It hides most of its complexity from the user and exposes only a few functions in its public API.

### Training

To train a model:

```python
import random

import numpy as np
import pandas as pd
import wavetrainer as wt

# Build a small daily time series with one integer feature and a boolean target.
data_size = 10
df = pd.DataFrame(
    np.random.randint(0, 30, size=data_size),
    columns=["X"],
    index=pd.date_range("20180101", periods=data_size),
)
df["Y"] = [random.choice([True, False]) for _ in range(data_size)]

X = df["X"]
Y = df["Y"]

# Create a trainer named "my_wavetrain" and fit it on the data.
wavetrainer = wt.create("my_wavetrain")
wavetrainer = wavetrainer.fit(X, y=Y)
```

This saves the fitted trainer to the folder `my_wavetrain`.

### Load

To load a trainer (as well as its composite states):

```python
import wavetrainer as wt

wavetrainer = wt.load("my_wavetrain")
```

### Predict

To make a prediction from new data:

```python
import numpy as np
import pandas as pd
import wavetrainer as wt

# Load the trainer previously saved in the "my_wavetrain" folder.
wavetrainer = wt.load("my_wavetrain")

# Build a single new observation with the same "X" feature used in training.
data_size = 1
df = pd.DataFrame(
    np.random.randint(0, 30, size=data_size),
    columns=["X"],
    index=pd.date_range("20180101", periods=data_size),
)
X = df["X"]

preds = wavetrainer.predict(X)
```

`preds` will now contain both the predictions and the probabilities associated with those predictions.

## License :memo:

The project is available under the [MIT License](LICENSE).

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/8W9aG/wavetrainer",
    "name": "wavetrainer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "machine-learning, ML, hyperparameter, features",
    "author": "Will Sackfield",
    "author_email": "will.sackfield@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/a2/ff/1d4e168a0faf3f554eb6b3be4de3e39efd8df1782fe92a1250458af74ff8/wavetrainer-0.0.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A library for automatically finding the optimal model within feature and hyperparameter space.",
    "version": "0.0.9",
    "project_urls": {
        "Homepage": "https://github.com/8W9aG/wavetrainer"
    },
    "split_keywords": [
        "machine-learning",
        " ml",
        " hyperparameter",
        " features"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a2ff1d4e168a0faf3f554eb6b3be4de3e39efd8df1782fe92a1250458af74ff8",
                "md5": "c9edc18c0b90d95587ce38ae2a188164",
                "sha256": "3bb45e072b66e86b0352747ef09c0aa50ac1769daab2d1b3a1e76d8c094f1219"
            },
            "downloads": -1,
            "filename": "wavetrainer-0.0.9.tar.gz",
            "has_sig": false,
            "md5_digest": "c9edc18c0b90d95587ce38ae2a188164",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 20488,
            "upload_time": "2025-03-22T00:24:36",
            "upload_time_iso_8601": "2025-03-22T00:24:36.808984Z",
            "url": "https://files.pythonhosted.org/packages/a2/ff/1d4e168a0faf3f554eb6b3be4de3e39efd8df1782fe92a1250458af74ff8/wavetrainer-0.0.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-03-22 00:24:36",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "8W9aG",
    "github_project": "wavetrainer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "2.2.3"
                ]
            ]
        },
        {
            "name": "optuna",
            "specs": [
                [
                    ">=",
                    "4.2.1"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.5.2"
                ]
            ]
        },
        {
            "name": "feature-engine",
            "specs": [
                [
                    ">=",
                    "1.8.3"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.67.1"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.26.4"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.15.2"
                ]
            ]
        },
        {
            "name": "catboost",
            "specs": [
                [
                    ">=",
                    "1.2.7"
                ]
            ]
        },
        {
            "name": "venn-abers",
            "specs": [
                [
                    ">=",
                    "1.4.6"
                ]
            ]
        },
        {
            "name": "mapie",
            "specs": [
                [
                    ">=",
                    "0.9.2"
                ]
            ]
        }
    ],
    "lcname": "wavetrainer"
}
        