sklearn-utilities


Name: sklearn-utilities
Version: 0.5.12
Summary: Utilities for scikit-learn.
Upload time: 2025-01-29 09:30:34
Author: 34j
Requires Python: <3.13,>=3.9
License: MIT

# Sklearn Utilities

<p align="center">
  <a href="https://github.com/34j/sklearn-utilities/actions/workflows/ci.yml?query=branch%3Amain">
    <img src="https://img.shields.io/github/actions/workflow/status/34j/sklearn-utilities/ci.yml?branch=main&label=CI&logo=github&style=flat-square" alt="CI Status" >
  </a>
  <a href="https://sklearn-utilities.readthedocs.io">
    <img src="https://img.shields.io/readthedocs/sklearn-utilities.svg?logo=read-the-docs&logoColor=fff&style=flat-square" alt="Documentation Status">
  </a>
  <a href="https://codecov.io/gh/34j/sklearn-utilities">
    <img src="https://img.shields.io/codecov/c/github/34j/sklearn-utilities.svg?logo=codecov&logoColor=fff&style=flat-square" alt="Test coverage percentage">
  </a>
</p>
<p align="center">
  <a href="https://python-poetry.org/">
    <img src="https://img.shields.io/badge/packaging-poetry-299bd7?style=flat-square&logo=" alt="Poetry">
  </a>
  <a href="https://github.com/ambv/black">
    <img src="https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square" alt="black">
  </a>
  <a href="https://github.com/pre-commit/pre-commit">
    <img src="https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white&style=flat-square" alt="pre-commit">
  </a>
</p>
<p align="center">
  <a href="https://pypi.org/project/sklearn-utilities/">
    <img src="https://img.shields.io/pypi/v/sklearn-utilities.svg?logo=python&logoColor=fff&style=flat-square" alt="PyPI Version">
  </a>
  <img src="https://img.shields.io/pypi/pyversions/sklearn-utilities.svg?style=flat-square&logo=python&amp;logoColor=fff" alt="Supported Python versions">
  <img src="https://img.shields.io/pypi/l/sklearn-utilities.svg?style=flat-square" alt="License">
</p>

Utilities for scikit-learn.

## Installation

Install this via pip (or your favourite package manager):

```shell
pip install sklearn-utilities
```

## API

See [Docs](https://sklearn-utilities.readthedocs.io/en/latest/sklearn_utilities.html) for more information.

- `EstimatorWrapperBase`: base class for wrappers. Redirects all attributes which are not in the wrapper to the wrapped estimator.
- `DataFrameWrapper`: tries to convert every estimator output to a pandas DataFrame or Series.
- `FeatureUnionPandas`: a `FeatureUnion` that works with pandas DataFrames.
- `IncludedColumnTransformerPandas`, `ExcludedColumnTransformerPandas`: select columns by name.
- `AppendPredictionToX`: appends the prediction of y to X.
- `AppendXPredictionToX`: appends the prediction of X to X.
- `DropByNoisePrediction`: drops columns that have high importance in predicting noise.
- `DropMissingColumns`: drops columns with missing values above a threshold.
- `DropMissingRowsY`: drops rows with missing values in y. Use `feature_engine.DropMissingData` for X.
- `IntersectXY`: drops rows where the index of X and y do not intersect. Use with `feature_engine.DropMissingData`.
- `ReindexMissingColumns`: reindexes columns of X in `transform()` to match the columns of X in `fit()`.
- `ReportNonFinite`: reports non-finite values in X and/or y.
- `IdTransformer`: a transformer that does nothing.
- `RecursiveFitSubtractRegressor`: a regressor that recursively fits a regressor and subtracts the prediction from the target.
- `SmartMultioutputEstimator`: a `MultiOutputEstimator` that supports a tuple of arrays in `predict()` and supports pandas `Series` and `DataFrame`.
- `until_event()`, `since_event()`: calculate the time until or since events (`Series[bool]`).
- `ComposeVarEstimator`: composes mean and std/var estimators.
- `DummyRegressorVar`: `DummyRegressor` that returns 1.0 for std/var.
- `TransformedTargetRegressorVar`: `TransformedTargetRegressor` with std/var support.
- `StandardScalerVar`: `StandardScaler` with std/var support.
- `EvalSetWrapper`, `CatBoostProgressBarWrapper`: wrappers that pass `eval_set` to `fit()` using `train_test_split()`, mainly for `CatBoost`. The latter also shows a progress bar (using `tqdm`). Useful for early stopping. For LightGBM, see [`lightgbm-callbacks`](https://github.com/34j/lightgbm-callbacks).
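
The attribute redirection described for `EstimatorWrapperBase` can be illustrated with Python's `__getattr__` hook. This is a minimal sketch of the general pattern, not the library's implementation: `AttributeForwardingWrapper` and `MeanEstimator` are hypothetical names invented for this example, and the real base class may behave differently.

```python
class AttributeForwardingWrapper:
    """Hypothetical wrapper; NOT the library's EstimatorWrapperBase."""

    def __init__(self, estimator):
        self.estimator = estimator

    def __getattr__(self, name):
        # __getattr__ is only called when normal lookup fails, so
        # attributes defined on the wrapper itself take precedence.
        if name == "estimator":
            # Avoid infinite recursion if `estimator` is not set yet
            # (for example while copying or unpickling).
            raise AttributeError(name)
        return getattr(self.estimator, name)

    def fit(self, X, y=None):
        self.estimator.fit(X, y)
        return self


class MeanEstimator:
    """Tiny stand-in estimator so the sketch has no dependencies."""

    def fit(self, X, y=None):
        self.mean_ = sum(X) / len(X)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


wrapped = AttributeForwardingWrapper(MeanEstimator()).fit([1.0, 2.0, 3.0])
print(wrapped.mean_)            # looked up on the inner estimator
print(wrapped.predict([0, 0]))  # method call is forwarded as well
```

Because `__getattr__` is a fallback, the wrapper can override selected methods (here `fit()`) while everything else falls through to the wrapped object.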

### `sklearn_utilities.dataset`

- `add_missing_values()`: adds missing values to a dataset.
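
To show what adding missing values to a dataset can look like, here is a minimal NumPy sketch. It does not call `add_missing_values()`, whose signature is not documented here; `mask_at_random`, `missing_rate`, and `seed` are hypothetical names, and masking uniformly at random is an assumption about the strategy.

```python
import numpy as np


def mask_at_random(X, missing_rate=0.1, seed=0):
    """Return a float copy of X with roughly `missing_rate` of entries set to NaN."""
    rng = np.random.default_rng(seed)
    X_missing = np.array(X, dtype=float)  # copy; NaN requires a float dtype
    # Each entry is masked independently with probability `missing_rate`.
    mask = rng.random(X_missing.shape) < missing_rate
    X_missing[mask] = np.nan
    return X_missing


X = np.arange(20, dtype=float).reshape(5, 4)
X_missing = mask_at_random(X, missing_rate=0.3, seed=42)
print(np.isnan(X_missing).mean())  # fraction of entries that were masked
```

The input array is left untouched, so the masked copy can be compared against the original when evaluating imputers.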

### `sklearn_utilities.torch`

- `PCATorch`: faster PCA using PyTorch with GPU support.

#### `sklearn_utilities.torch.skorch`

- `SkorchReshaper`, `SkorchCNNReshaper`: reshape X and y for `nn.Linear` and `nn.Conv1d/2d` respectively. (For `nn.Conv2d`, uses `np.lib.stride_tricks.sliding_window_view()`.)
- `AllowNaN`: wraps a loss module and assigns 0 to y and y_hat at indices where y contains NaN in `forward()`.
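
The masking idea behind `AllowNaN` can be sketched with plain NumPy and a mean squared error. This illustrates the described behaviour only; it is not the library's PyTorch code, and `nan_tolerant_mse` is a hypothetical name used for this example.

```python
import numpy as np


def nan_tolerant_mse(y_hat, y):
    """Mean squared error where positions with a NaN target contribute 0."""
    y = np.asarray(y, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    missing = np.isnan(y)
    # Zero both sides where the target is missing, so the error there is 0.
    y_filled = np.where(missing, 0.0, y)
    y_hat_filled = np.where(missing, 0.0, y_hat)
    return float(np.mean((y_hat_filled - y_filled) ** 2))


y = np.array([1.0, np.nan, 3.0, np.nan])
y_hat = np.array([1.5, 10.0, 2.0, -7.0])
print(nan_tolerant_mse(y_hat, y))  # (0.25 + 0 + 1.0 + 0) / 4 = 0.3125
```

Note that the mean in this sketch still divides by the total number of elements, so a batch with many missing targets yields a proportionally smaller loss.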

## See also

- [ml-tooling/best-of-ml-python](https://github.com/ml-tooling/best-of-ml-python)

## Contributors ✨

Thanks goes to these wonderful people ([emoji key](https://allcontributors.org/docs/en/emoji-key)):

<!-- prettier-ignore-start -->
<!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section -->
<!-- markdownlint-disable -->
<!-- markdownlint-enable -->
<!-- ALL-CONTRIBUTORS-LIST:END -->
<!-- prettier-ignore-end -->

This project follows the [all-contributors](https://github.com/all-contributors/all-contributors) specification. Contributions of any kind welcome!


            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "sklearn-utilities",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "34j",
    "author_email": "34j.95a2p@simplelogin.com",
    "download_url": "https://files.pythonhosted.org/packages/5f/81/31392aae2f6682c97b09a10a23ff5fe03e9b259fb56902ffd62af1bc68e4/sklearn_utilities-0.5.12.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Utilities for scikit-learn.",
    "version": "0.5.12",
    "project_urls": {
        "Bug Tracker": "https://github.com/34j/sklearn-utilities/issues",
        "Changelog": "https://github.com/34j/sklearn-utilities/blob/main/CHANGELOG.md",
        "Documentation": "https://sklearn-utilities.readthedocs.io",
        "Repository": "https://github.com/34j/sklearn-utilities"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a989e0291902df3a32451567fece583b5aa583d7360404593ba7d31ce0d414d3",
                "md5": "92ec3a4cc8bee0f9de45bcb09c703a7f",
                "sha256": "1ad65663137774c50f17291ca73a71d847b98c6d17a3bb3e6cb696002b93505f"
            },
            "downloads": -1,
            "filename": "sklearn_utilities-0.5.12-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "92ec3a4cc8bee0f9de45bcb09c703a7f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.9",
            "size": 40154,
            "upload_time": "2025-01-29T09:30:33",
            "upload_time_iso_8601": "2025-01-29T09:30:33.188148Z",
            "url": "https://files.pythonhosted.org/packages/a9/89/e0291902df3a32451567fece583b5aa583d7360404593ba7d31ce0d414d3/sklearn_utilities-0.5.12-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5f8131392aae2f6682c97b09a10a23ff5fe03e9b259fb56902ffd62af1bc68e4",
                "md5": "00ffe5da2f10a337d2a1f4c476cd9225",
                "sha256": "fdce39fe23e55930be77bea2e1e6676e3e5b7c87da4088fb641d5a1031b0dfe5"
            },
            "downloads": -1,
            "filename": "sklearn_utilities-0.5.12.tar.gz",
            "has_sig": false,
            "md5_digest": "00ffe5da2f10a337d2a1f4c476cd9225",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.9",
            "size": 29337,
            "upload_time": "2025-01-29T09:30:34",
            "upload_time_iso_8601": "2025-01-29T09:30:34.398646Z",
            "url": "https://files.pythonhosted.org/packages/5f/81/31392aae2f6682c97b09a10a23ff5fe03e9b259fb56902ffd62af1bc68e4/sklearn_utilities-0.5.12.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-29 09:30:34",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "34j",
    "github_project": "sklearn-utilities",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sklearn-utilities"
}
        