scikit-transformers


Name: scikit-transformers
Version: 0.3.1
Home page: https://alexandregazagnes.github.io/scikit-transformers/
Summary: scikit-transformers is a package that provides custom transformers such as LogColumnTransformer, BoolColumnTransformer and others.
Upload time: 2024-02-09 23:42:52
Maintainer: AlexandreGazagnes
Docs URL: None
Author: AlexandreGazagnes
Requires Python: >=3.6
License: GPL-3.0
Keywords: python, machine learning, sklearn, transformers, scikit-learn, tools, data, pandas
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
![image](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/assets/img/img.png?raw=true)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
![Python](https://img.shields.io/badge/python-3.10.x-green.svg)
![Repo Size](https://img.shields.io/github/repo-size/AlexandreGazagnes/scikit-transformers)
[![PEP8](https://img.shields.io/badge/code%20style-pep8-orange.svg)](https://www.python.org/dev/peps/pep-0008/)
[![Poetry](https://img.shields.io/endpoint?url=https://python-poetry.org/badge/v0.json)](https://python-poetry.org/)
![Coverage](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/assets/img/cov.svg?raw=true)
![Tests](https://github.com/AlexandreGazagnes/scikit-transformers/actions/workflows/tests.yaml/badge.svg)
![Statics](https://github.com/AlexandreGazagnes/scikit-transformers/actions/workflows/statics.yaml/badge.svg)
![Doc](https://github.com/AlexandreGazagnes/scikit-transformers/actions/workflows/docs.yaml/badge.svg)
![Pypi](https://github.com/AlexandreGazagnes/scikit-transformers/actions/workflows/publish.yaml/badge.svg)
![GitHub commit activity](https://img.shields.io/github/commit-activity/m/AlexandreGazagnes/scikit-transformers)

# Scikit-transformers : Scikit-learn + Custom transformers


## About

**scikit-transformers** is a package that provides custom transformers such as ```LogColumnTransformer``` and ```BoolColumnTransformer```, among others.

It was created to provide a simple way to use custom transformers in ```scikit-learn``` pipelines, so that they can be combined with a ```scikit-learn``` model and tuned with ```GridSearchCV```.

The starting point was a simple ```LogColumnTransformer```, a thin wrapper around the numpy log function. It takes a skew threshold and applies the log transformation only to columns whose skew exceeds that threshold.
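
The idea of a skew threshold can be illustrated with plain pandas and numpy. This is only a minimal sketch, not the library's implementation: the function name ```log_skewed_columns```, its ```threshold``` parameter, and the choice of ```np.log1p``` (used here so that zeros do not produce infinities) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def log_skewed_columns(df: pd.DataFrame, threshold: float = 1.0) -> pd.DataFrame:
    """Hypothetical helper: log-transform only the columns whose skew exceeds threshold."""
    out = df.copy()
    # Sample skewness of each numeric column
    skew = out.skew(numeric_only=True)
    # Keep the names of the columns that are more skewed than the threshold
    skewed = skew[skew > threshold].index
    for col in skewed:
        # log1p(x) = log(1 + x); an assumption, chosen so that x = 0 stays finite
        out[col] = np.log1p(out[col])
    return out


df = pd.DataFrame(
    {
        "flat": np.arange(1.0, 11.0),        # symmetric values, skew close to 0
        "heavy": np.exp(np.arange(10.0)),    # exponential growth, strongly right-skewed
    }
)

df_logged = log_skewed_columns(df, threshold=1.0)
print(df_logged)
```

With this data only the ```heavy``` column is above the threshold, so ```flat``` is returned unchanged.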

With ```scikit-transformers```, this ```LogColumnTransformer``` can be used as a step in a ```GridSearchCV```, with the skew threshold as a hyperparameter, to find which columns are worth log-transforming.
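
To make the tuning idea concrete, here is a hedged sketch that does not use the package itself. The class ```SkewLogSketch``` and its ```threshold``` parameter are hypothetical stand-ins written against the standard ```scikit-learn``` estimator interface; the actual parameter name exposed by ```LogColumnTransformer``` may differ, so check the notebooks before copying the grid key.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline


class SkewLogSketch(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in, NOT the library's LogColumnTransformer."""

    def __init__(self, threshold=1.0):
        # Stored unchanged so that GridSearchCV can clone and set it
        self.threshold = threshold

    def fit(self, X, y=None):
        skew = pd.DataFrame(X).skew(numeric_only=True)
        # Remember which columns were skewed in the training data
        self.columns_ = list(skew[skew > self.threshold].index)
        return self

    def transform(self, X):
        X = pd.DataFrame(X).copy()
        for col in self.columns_:
            X[col] = np.log1p(X[col])  # log1p is an assumption, see lead-in
        return X


# Synthetic data: the target depends on the log of the skewed column
rng = np.random.default_rng(0)
X = pd.DataFrame(
    {
        "normal": rng.normal(10.0, 1.0, size=200),
        "lognormal": rng.lognormal(0.0, 1.0, size=200),
    }
)
y = X["normal"] + np.log1p(X["lognormal"]) + rng.normal(0.0, 0.1, size=200)

pipe = Pipeline([("log", SkewLogSketch()), ("model", LinearRegression())])

# A low threshold logs the skewed column; a very high one logs nothing
grid = GridSearchCV(pipe, param_grid={"log__threshold": [0.5, 100.0]}, cv=5)
grid.fit(X, y)

print(grid.best_params_)
```

The grid key follows the usual ```<step name>__<parameter>``` convention of ```scikit-learn``` pipelines, which is what lets the threshold be searched like any other hyperparameter.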

```LogColumnTransformer``` is one of the many transformers implemented in ```scikit-transformers```.



## Installation

Using regular pip and venv tools:

```bash
python3 -m venv .venv
source .venv/bin/activate
pip install scikit-transformers
```


## Usage

For very basic usage:
```python
import pandas as pd

from sktransf.transformer import LogColumnTransformer

df = pd.DataFrame(
    { "a": range(10),
      "b": range(10)
    }
)

logger = LogColumnTransformer()
df_transf = logger.fit_transform(df)  # fit and transform in one call
```

Using common transformers:

```python
import pandas as pd

from sktransf.transformer import LogColumnTransformer, BoolColumnTransformer
from sktransf.selector import DropUniqueColumnSelector

df = pd.DataFrame(
    { "a": range(10),
      "b": range(10)
    }
)

df_bool = BoolColumnTransformer().fit_transform(df)
df_unique = DropUniqueColumnSelector().fit_transform(df)
df_logged = LogColumnTransformer().fit_transform(df)
```

Using a pipeline with a scikit-learn model:

```python
import pandas as pd
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression

from sktransf.transformer import LogColumnTransformer, BoolColumnTransformer
from sktransf.selector import DropUniqueColumnSelector

pipe = Pipeline([
    ('bool', BoolColumnTransformer()),
    ('unique', DropUniqueColumnSelector()),
    ('log', LogColumnTransformer()),
    ('model', LinearRegression())
])

X = pd.DataFrame(
    { "a": range(10),
      "b": range(10)
    }
)

y = range(10)

pipe.fit(X, y)

y_pred = pipe.predict(X)
```


## Documentation

For more specific information, please refer to the notebooks: 

* Transformers : 
  * [LogColumnTransformer notebook](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/notebooks/transformer/LogColumnTransformer.ipynb)
  * [BoolColumnTransformer notebook](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/notebooks/transformer/BoolColumnTransformer.ipynb)
* Selectors : 
  * [DropUniqueColumnSelector notebook](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/notebooks/selector/DropUniqueColumnSelector.ipynb)
  * [DropSkuColumnSelector notebook](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/notebooks/selector/DropSkuColumnSelector.ipynb)
* Pipelines :
  * [Pipelines notebook](https://github.com/AlexandreGazagnes/scikit-transformers/blob/main/docs/notebooks/Pipelines.ipynb)


Complete documentation is available on the [GitHub page](https://alexandregazagnes.github.io/scikit-transformers/).


## Changelog, Releases and Roadmap

Please refer to the [changelog](https://alexandregazagnes.github.io/scikit-transformers/CHANGELOG/) page for more information.


## Contributing

Pull requests are welcome.

For major changes, please open an issue first to discuss what you would like to change.

For more information, please refer to the [contributing](https://alexandregazagnes.github.io/scikit-transformers/CONTRIBUTING/) page.


## License

[GPLv3](https://raw.githubusercontent.com/AlexandreGazagnes/scikit-transformers/main/LICENSE)

            

Raw data

{
    "_id": null,
    "home_page": "https://alexandregazagnes.github.io/scikit-transformers/",
    "name": "scikit-transformers",
    "maintainer": "AlexandreGazagnes",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "alex@gazagnes.net",
    "keywords": "python,machine learning,sklearn,transformers,scikit-learn,tools,data,pandas",
    "author": "AlexandreGazagnes",
    "author_email": "alex@gazagnes.net",
    "download_url": "https://files.pythonhosted.org/packages/dd/01/b95d328f3dfcd3313590a21ac6842780c7739c0ae3b1b8e148e23b18f110/scikit_transformers-0.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL-3.0",
    "summary": "scikit-transformers is a very usefull package to enable and provide custom transformers such as LogColumnTransformer, BoolColumnTransformers and others fancy transformers.",
    "version": "0.3.1",
    "project_urls": {
        "Changelog": "https://alexandregazagnes.github.io/scikit-transformers/CHANGELOG/",
        "Code": "https://github.com/AlexandreGazagnes/scikit-transformers/tree/main",
        "Documentation": "https://alexandregazagnes.github.io/scikit-transformers/",
        "Homepage": "https://alexandregazagnes.github.io/scikit-transformers/",
        "Issues": "https://github.com/AlexandreGazagnes/scikit-transformers/issues",
        "Repository": "https://github.com/AlexandreGazagnes/scikit-transformers/tree/main"
    },
    "split_keywords": [
        "python",
        "machine learning",
        "sklearn",
        "transformers",
        "scikit-learn",
        "tools",
        "data",
        "pandas"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "71fbaf1077afa931cc6a69e3a48e1e8a094eafe9bc5e7e9efe381f7a7bb1ef6e",
                "md5": "be4cdb53b097488ccd0b0a00d8f32143",
                "sha256": "750a47393836b2a74ffb2febf1fbbe947339c124d058d30085ad842a72db786a"
            },
            "downloads": -1,
            "filename": "scikit_transformers-0.3.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "be4cdb53b097488ccd0b0a00d8f32143",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 26090,
            "upload_time": "2024-02-09T23:42:50",
            "upload_time_iso_8601": "2024-02-09T23:42:50.680501Z",
            "url": "https://files.pythonhosted.org/packages/71/fb/af1077afa931cc6a69e3a48e1e8a094eafe9bc5e7e9efe381f7a7bb1ef6e/scikit_transformers-0.3.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dd01b95d328f3dfcd3313590a21ac6842780c7739c0ae3b1b8e148e23b18f110",
                "md5": "ea985016a40b9af0db50e592a6bc259d",
                "sha256": "5c1578daf6c0a93f0f015a7db4ecb675f2a59b3e0ed243f53fc6ee23eb030138"
            },
            "downloads": -1,
            "filename": "scikit_transformers-0.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "ea985016a40b9af0db50e592a6bc259d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 22632,
            "upload_time": "2024-02-09T23:42:52",
            "upload_time_iso_8601": "2024-02-09T23:42:52.550266Z",
            "url": "https://files.pythonhosted.org/packages/dd/01/b95d328f3dfcd3313590a21ac6842780c7739c0ae3b1b8e148e23b18f110/scikit_transformers-0.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-09 23:42:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "AlexandreGazagnes",
    "github_project": "scikit-transformers",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "scikit-transformers"
}
        