sklearn-smithy


- Name: sklearn-smithy
- Version: 0.2.0
- Summary: Toolkit to forge scikit-learn compatible estimators.
- Upload time: 2024-06-15 20:32:04
- Author: Francesco Bruzzesi
- Requires Python: >=3.10
- Keywords: cli, data-science, machine-learning, python, scikit-learn, tui, webui
- License: MIT License. Copyright (c) 2024 Francesco Bruzzesi. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
<img src="https://raw.githubusercontent.com/FBruzzesi/sklearn-smithy/main/docs/img/sksmith-logo.svg" width=150 height=150 align="right">

# Scikit-learn Smithy

Scikit-learn Smithy is a tool that helps you forge scikit-learn compatible estimators with ease.

---

[WebUI](https://sklearn-smithy.streamlit.app/) | [Documentation](https://fbruzzesi.github.io/sklearn-smithy) | [Repository](https://github.com/fbruzzesi/sklearn-smithy) | [Issue Tracker](https://github.com/fbruzzesi/sklearn-smithy/issues)

---

How can you use it?

<details><summary>✅ Directly from the browser via a Web UI. </summary>

- Available at [sklearn-smithy.streamlit.app](https://sklearn-smithy.streamlit.app/).
- It requires no installation.
- Powered by [streamlit](https://streamlit.io/).

<img src="https://raw.githubusercontent.com/FBruzzesi/sklearn-smithy/main/docs/img/webui.png" align="right">

</details>

<details><summary>✅ As a CLI (command line interface) in the terminal.</summary>

- Available via the `smith forge` command.
- It requires [installation](#installation): `python -m pip install sklearn-smithy`
- Powered by [typer](https://typer.tiangolo.com/).

<img src="https://raw.githubusercontent.com/FBruzzesi/sklearn-smithy/main/docs/img/cli.png" align="right">

</details>

<details><summary>✅ As a TUI (terminal user interface) in the terminal.</summary>

- Available via the `smith forge-tui` command.
- It requires installing [extra dependencies](#extra-dependencies): `python -m pip install "sklearn-smithy[textual]"`
- Powered by [textual](https://textual.textualize.io/).

<img src="https://raw.githubusercontent.com/FBruzzesi/sklearn-smithy/main/docs/img/tui.png" align="right">

</details>

Each of these tools asks a series of questions about the estimator you want to create, and then generates the boilerplate code for you.

## Why ❓

Writing scikit-learn compatible estimators might be harder than expected.

While everyone knows about `fit` and `predict`, scikit-learn may expect other behaviours, methods and attributes
from your estimator, depending on:

- The type of estimator you're writing.
- The signature of the estimator.
- The signature of the `.fit(...)` method.

Scikit-learn Smithy to the rescue: this tool aims to help you craft your own estimator by asking a few
questions about it, and then generating the boilerplate code.

This way you can focus fully on the core implementation logic, and not on the nitty-gritty details
of the scikit-learn API.
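To make the three points above more concrete, here is a minimal sketch of the conventions involved. It is not the code that sklearn-smithy generates: `MeanRegressor` and its `shift` parameter are hypothetical names chosen for illustration, and the sketch is not claimed to pass every scikit-learn check. It shows an estimator type (a regressor, via `RegressorMixin`), an estimator signature (one constructor argument stored unchanged), and a `.fit(...)` signature that accepts an optional `sample_weight`.

```python
import numpy as np
from sklearn.base import BaseEstimator, RegressorMixin
from sklearn.utils.validation import check_array, check_is_fitted, check_X_y


class MeanRegressor(RegressorMixin, BaseEstimator):
    """Hypothetical regressor: predicts the (weighted) training mean plus `shift`."""

    def __init__(self, shift=0.0):
        # Constructor arguments are stored unchanged, with no validation here,
        # so that get_params / set_params / clone can work.
        self.shift = shift

    def fit(self, X, y, sample_weight=None):
        X, y = check_X_y(X, y)
        # Attributes learned during fit end with a trailing underscore.
        self.n_features_in_ = X.shape[1]
        self.mean_ = float(np.average(y, weights=sample_weight)) + self.shift
        return self  # fit returns the estimator itself

    def predict(self, X):
        check_is_fitted(self)
        X = check_array(X)
        return np.full(X.shape[0], self.mean_)


X = np.array([[0.0], [1.0], [2.0]])
y = np.array([1.0, 2.0, 3.0])
model = MeanRegressor(shift=1.0).fit(X, y)
print(model.get_params())
print(model.predict(X))
```

Other estimator types come with their own expectations; a classifier, for instance, is expected to expose a `classes_` attribute after fitting.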

### Sanity check

Once the core logic is implemented, the estimator should be ready to test against the _somewhat official_
[`parametrize_with_checks`](https://scikit-learn.org/dev/modules/generated/sklearn.utils.estimator_checks.parametrize_with_checks.html#sklearn.utils.estimator_checks.parametrize_with_checks)
pytest-compatible decorator:

```py
from sklearn.utils.estimator_checks import parametrize_with_checks

# Placeholder names: import your own estimators here.
# The decorator expects estimator instances, not classes.
@parametrize_with_checks([
    YourAwesomeRegressor(),
    MoreAwesomeClassifier(),
    EvenMoreAwesomeTransformer(),
])
def test_sklearn_compatible_estimator(estimator, check):
    check(estimator)
```

The estimator should then also be compatible with scikit-learn's `Pipeline`, `GridSearchCV`, etc.
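As a rough illustration of what that compatibility means in practice, the sketch below plugs a custom transformer into a `Pipeline` and tunes its parameter with `GridSearchCV`. The `Shrinker` class, its `factor` parameter and the synthetic data are assumptions made up for this example; they are not part of sklearn-smithy.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.utils.validation import check_array, check_is_fitted


class Shrinker(TransformerMixin, BaseEstimator):
    """Hypothetical transformer that multiplies every feature by `factor`."""

    def __init__(self, factor=1.0):
        self.factor = factor

    def fit(self, X, y=None):
        X = check_array(X)
        self.n_features_in_ = X.shape[1]
        return self

    def transform(self, X):
        check_is_fitted(self)
        return check_array(X) * self.factor


# Synthetic regression data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5])

pipe = Pipeline([("shrink", Shrinker()), ("model", Ridge())])
# The "<step>__<param>" syntax relies on get_params / set_params,
# which BaseEstimator derives from the __init__ signature.
search = GridSearchCV(pipe, param_grid={"shrink__factor": [0.5, 1.0, 2.0]}, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Because `Shrinker` keeps its constructor argument as a plain attribute, `GridSearchCV` can clone the pipeline and set `shrink__factor` on each candidate without any extra code.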

### Official guide

See the scikit-learn documentation on how to
[develop estimators](https://scikit-learn.org/dev/developers/develop.html#developing-scikit-learn-estimators).

## Supported estimators

The following types of scikit-learn estimators are supported:

- ✅ Classifier
- ✅ Regressor
- ✅ Outlier Detector
- ✅ Clusterer
- ✅ Transformer
  - ✅ Feature Selector
- 🚧 Meta Estimator
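In scikit-learn itself, these estimator types roughly correspond to mixin classes such as `ClassifierMixin`, `RegressorMixin`, `OutlierMixin`, `ClusterMixin`, `TransformerMixin`, `SelectorMixin` and `MetaEstimatorMixin`. Whether sklearn-smithy's generated code uses exactly these mixins is an assumption here. As one example, a feature selector built on `SelectorMixin` might look like the sketch below, where `NonConstantSelector` and `tol` are hypothetical names:

```python
import numpy as np
from sklearn.base import BaseEstimator
from sklearn.feature_selection import SelectorMixin
from sklearn.utils.validation import check_array, check_is_fitted


class NonConstantSelector(SelectorMixin, BaseEstimator):
    """Hypothetical selector that keeps features whose range exceeds `tol`."""

    def __init__(self, tol=0.0):
        self.tol = tol

    def fit(self, X, y=None):
        X = check_array(X)
        self.n_features_in_ = X.shape[1]
        # Per-feature range (max - min) observed during fit.
        self.ranges_ = X.max(axis=0) - X.min(axis=0)
        return self

    def _get_support_mask(self):
        # SelectorMixin builds get_support() and transform() on top of this mask.
        check_is_fitted(self)
        return self.ranges_ > self.tol


X = np.array([[1.0, 5.0], [2.0, 5.0], [3.0, 5.0]])
selector = NonConstantSelector().fit(X)
print(selector.get_support())
```

The only selector-specific method is `_get_support_mask`; the mixin supplies the rest.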

## Installation

sklearn-smithy is available on [PyPI](https://pypi.org/project/sklearn-smithy), so you can install it directly from there:

```bash
python -m pip install sklearn-smithy
```

**Remark:** The minimum Python version required is 3.10.

This will make the `smith` command available in your terminal, and you should be able to run the following:

```bash
smith version
```

> sklearn-smithy=...
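If you want to run the same check from a script (for instance in a setup or CI step), a small sketch along these lines could work. The helper name `smith_version` is hypothetical; it only relies on the `smith version` command shown above and on the Python standard library.

```python
import shutil
import subprocess


def smith_version(command="smith"):
    """Return the output of `<command> version`, or None if the command is not on PATH."""
    executable = shutil.which(command)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip()


version = smith_version()
if version is None:
    print("The `smith` command was not found; is sklearn-smithy installed?")
else:
    print(version)
```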

### Extra dependencies

To run the TUI, you need to install the `textual` dependency as well:

```bash
python -m pip install "sklearn-smithy[textual]"
```

## User guide 📚

Please refer to the dedicated [user guide](https://fbruzzesi.github.io/sklearn-smithy/user-guide/) documentation section.

## Origin story

The idea for this tool originated from [scikit-lego #660](https://github.com/koaning/scikit-lego/pull/660), which I cannot explain better than by quoting the PR description itself:

> So the story goes as the following:
>
> - The CI/CD fails for scikit-learn==1.5rc1 because of a change in the `check_estimator` internals
> - In the [scikit-learn issue](https://github.com/scikit-learn/scikit-learn/issues/28966) I got a better picture of how to run test for compatible components
> - In particular, [rolling your own estimator](https://scikit-learn.org/dev/developers/develop.html#rolling-your-own-estimator) suggests to use [`parametrize_with_checks`](https://scikit-learn.org/dev/modules/generated/sklearn.utils.estimator_checks.parametrize_with_checks.html#sklearn.utils.estimator_checks.parametrize_with_checks), and of course I thought "that is a great idea to avoid dealing manually with each test"
> - Say no more, I enter a rabbit hole to refactor all our tests - which would be fine
> - Except that these tests failures helped me figure out a few missing parts in the codebase

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "sklearn-smithy",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": null,
    "keywords": "cli, data-science, machine-learning, python, scikit-learn, tui, webui",
    "author": "Francesco Bruzzesi",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/4f/f6/00e9fca8e50fe7b5c279790d2bd020c2f172b41b7f504ec70efc9eda18ce/sklearn_smithy-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "summary": "Toolkit to forge scikit-learn compatible estimators.",
    "version": "0.2.0",
    "project_urls": {
        "Documentation": "https://fbruzzesi.github.io/sklearn-smithy",
        "Issues": "https://github.com/FBruzzesi/sklearn-smithy/issues",
        "Repository": "https://github.com/FBruzzesi/sklearn-smithy",
        "Website": "https://sklearn-smithy.streamlit.app/"
    },
    "split_keywords": [
        "cli",
        " data-science",
        " machine-learning",
        " python",
        " scikit-learn",
        " tui",
        " webui"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f4920b8d6b01fbb639b16560c18506cd1b66c6f168972d34bc614ca7b9cfbd16",
                "md5": "aa549711383a44811ec2b2d8041bd050",
                "sha256": "1769f7c128d0c43f6143a5ced01cc7298ba943d3fcb13953c14f94e91910ca98"
            },
            "downloads": -1,
            "filename": "sklearn_smithy-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "aa549711383a44811ec2b2d8041bd050",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 24529,
            "upload_time": "2024-06-15T20:32:02",
            "upload_time_iso_8601": "2024-06-15T20:32:02.683706Z",
            "url": "https://files.pythonhosted.org/packages/f4/92/0b8d6b01fbb639b16560c18506cd1b66c6f168972d34bc614ca7b9cfbd16/sklearn_smithy-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4ff600e9fca8e50fe7b5c279790d2bd020c2f172b41b7f504ec70efc9eda18ce",
                "md5": "80cae5969a57d614f812c6450b0e102c",
                "sha256": "aa61505872cfd40ffe8695d711688a2b9d8c5cc7762241f92567f0e187575d07"
            },
            "downloads": -1,
            "filename": "sklearn_smithy-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "80cae5969a57d614f812c6450b0e102c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 18919,
            "upload_time": "2024-06-15T20:32:04",
            "upload_time_iso_8601": "2024-06-15T20:32:04.621241Z",
            "url": "https://files.pythonhosted.org/packages/4f/f6/00e9fca8e50fe7b5c279790d2bd020c2f172b41b7f504ec70efc9eda18ce/sklearn_smithy-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-06-15 20:32:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "FBruzzesi",
    "github_project": "sklearn-smithy",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "sklearn-smithy"
}
        