Name | skrub |
Version | 0.4.1 |
home_page | None |
Summary | Prepping tables for machine learning |
upload_time | 2024-12-11 19:28:08 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.9 |
license | None |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | |
skrub
=====

.. image:: https://skrub-data.github.io/stable/_static/skrub.svg
   :align: center
   :width: 50 %
   :alt: skrub logo

|py_ver| |pypi_var| |pypi_dl| |codecov| |circleci| |black|

.. |py_ver| image:: https://img.shields.io/pypi/pyversions/skrub
.. |pypi_var| image:: https://img.shields.io/pypi/v/skrub?color=informational
.. |pypi_dl| image:: https://img.shields.io/pypi/dm/skrub
.. |codecov| image:: https://img.shields.io/codecov/c/github/skrub-data/skrub/main
.. |circleci| image:: https://img.shields.io/circleci/build/github/skrub-data/skrub/main?label=CircleCI
.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg

**skrub** (formerly *dirty_cat*) is a Python
library that facilitates prepping your tables for machine learning.
If you like the package, spread the word and ⭐ this repository!
You can also join the `discord server <https://discord.gg/ABaPnm7fDC>`_.
Website: https://skrub-data.org/
What can skrub do?
------------------
The goal of skrub is to bridge the gap between tabular data sources and machine-learning models.
skrub provides high-level tools for joining dataframes (``Joiner``, ``AggJoiner``, ...),
encoding columns (``MinHashEncoder``, ``ToCategorical``, ...), building a pipeline
(``TableVectorizer``, ``tabular_learner``, ...), and exploring your data interactively (``TableReport``).
.. figure::
   https://github.com/rcap107/skrub-datasets/blob/master/data/output.gif?raw=true
   :alt: An animation showing how TableReport works

   An animation showing how TableReport works

>>> from skrub.datasets import fetch_employee_salaries
>>> dataset = fetch_employee_salaries()
>>> df = dataset.X
>>> y = dataset.y
>>> df.iloc[0]
gender                                                                     F
department                                                               POL
department_name                                         Department of Police
division                   MSB Information Mgmt and Tech Division Records...
assignment_category                                         Fulltime-Regular
employee_position_title                          Office Services Coordinator
date_first_hired                                                  09/22/1986
year_first_hired                                                        1986

>>> from sklearn.model_selection import cross_val_score
>>> from skrub import tabular_learner
>>> cross_val_score(tabular_learner('regressor'), df, y)
array([0.89370447, 0.89279068, 0.92282557, 0.92319094, 0.92162666])
See our `examples <https://skrub-data.org/stable/auto_examples>`_.
Installation
------------
skrub can easily be installed via ``pip`` or ``conda``. For more installation information, see
the `installation instructions <https://skrub-data.org/stable/install.html>`_.
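
After installing, a quick sanity check can confirm that the environment is suitable. The sketch below uses only the standard library; the helper name ``skrub_install_status`` is hypothetical, and the ``(3, 9)`` default reflects the ``requires_python`` value of ``>=3.9`` listed for this release.

```python
import sys
from importlib import metadata


def skrub_install_status(min_python=(3, 9)):
    """Return (python_ok, installed_version_or_None) for this environment."""
    # Compare the running interpreter against the minimum supported version.
    python_ok = sys.version_info >= min_python
    try:
        installed = metadata.version("skrub")
    except metadata.PackageNotFoundError:
        installed = None  # skrub is not installed in this environment
    return python_ok, installed


python_ok, installed = skrub_install_status()
print(f"Python >= 3.9: {python_ok}; skrub version: {installed}")
```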
Contributing
------------
The best way to support the development of skrub is to spread the word!
Also, if you are already a skrub user, we would love to hear about your use cases and challenges in the `Discussions <https://github.com/skrub-data/skrub/discussions>`_ section.
To report a bug or suggest enhancements, please
`open an issue <https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue>`_.
If you want to contribute directly to the library, then check the
`how to contribute <https://skrub-data.org/stable/CONTRIBUTING.html>`_ page on
the website for more information.
Raw data
{
"_id": null,
"home_page": null,
"name": "skrub",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "Patricio Cerda <patricio.cerda@inria.fr>",
"download_url": "https://files.pythonhosted.org/packages/95/b4/947b51a9b47fb5301ac14a6759f4d4fc2baa09e0059167de482a5779b822/skrub-0.4.1.tar.gz",
"platform": null,
"description": "skrub\n=====\n\n.. image:: https://skrub-data.github.io/stable/_static/skrub.svg\n :align: center\n :width: 50 %\n :alt: skrub logo\n\n\n|py_ver| |pypi_var| |pypi_dl| |codecov| |circleci| |black|\n\n.. |py_ver| image:: https://img.shields.io/pypi/pyversions/skrub\n.. |pypi_var| image:: https://img.shields.io/pypi/v/skrub?color=informational\n.. |pypi_dl| image:: https://img.shields.io/pypi/dm/skrub\n.. |codecov| image:: https://img.shields.io/codecov/c/github/skrub-data/skrub/main\n.. |circleci| image:: https://img.shields.io/circleci/build/github/skrub-data/skrub/main?label=CircleCI\n.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg\n\n\n**skrub** (formerly *dirty_cat*) is a Python\nlibrary that facilitates prepping your tables for machine learning.\n\nIf you like the package, spread the word and \u2b50 this repository!\nYou can also join the `discord server <https://discord.gg/ABaPnm7fDC>`_.\n\nWebsite: https://skrub-data.org/\n\nWhat can skrub do?\n------------------\n\nThe goal of skrub is to bridge the gap between tabular data sources and machine-learning models.\n\nskrub provides high-level tools for joining dataframes (``Joiner``, ``AggJoiner``, ...),\nencoding columns (``MinHashEncoder``, ``ToCategorical``, ...), building a pipeline\n(``TableVectorizer``, ``tabular_learner``, ...), and explore interactively your data (``TableReport``).\n\n.. figure::\n https://github.com/rcap107/skrub-datasets/blob/master/data/output.gif?raw=true\n :alt: An animation showing how TableReport works\n\n An animation showing how TableReport works\n\n\n>>> from skrub.datasets import fetch_employee_salaries\n>>> dataset = fetch_employee_salaries()\n>>> df = dataset.X\n>>> y = dataset.y\n>>> df.iloc[0]\ngender F\ndepartment POL\ndepartment_name Department of Police\ndivision MSB Information Mgmt and Tech Division Records...\nassignment_category Fulltime-Regular\nemployee_position_title Office Services Coordinator\ndate_first_hired 09/22/1986\nyear_first_hired 1986\n\n>>> from sklearn.model_selection import cross_val_score\n>>> from skrub import tabular_learner\n>>> cross_val_score(tabular_learner('regressor'), df, y)\narray([0.89370447, 0.89279068, 0.92282557, 0.92319094, 0.92162666])\n\nSee our `examples <https://skrub-data.org/stable/auto_examples>`_.\n\nInstallation\n------------\n\nskrub can easily be installed via ``pip`` or ``conda``. For more installation information, see\nthe `installation instructions <https://skrub-data.org/stable/install.html>`_.\n\nContributing\n------------\n\nThe best way to support the development of skrub is to spread the word!\n\nAlso, if you already are a skrub user, we would love to hear about your use cases and challenges in the `Discussions <https://github.com/skrub-data/skrub/discussions>`_ section.\n\nTo report a bug or suggest enhancements, please\n`open an issue <https://docs.github.com/en/issues/tracking-your-work-with-issues/creating-an-issue>`_.\n\nIf you want to contribute directly to the library, then check the\n`how to contribute <https://skrub-data.org/stable/CONTRIBUTING.html>`_ page on\nthe website for more information.\n",
"bugtrack_url": null,
"license": null,
"summary": "Prepping tables for machine learning",
"version": "0.4.1",
"project_urls": {
"Homepage": "https://skrub-data.org/",
"Issues": "https://github.com/skrub-data/skrub/issues",
"Source": "https://github.com/skrub-data/skrub"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e69ab77226bf12a8690a5d8fa7f1198bc4fdd967dc0138f14549d687ea94daea",
"md5": "e1b49e823425590c8d0ba8833337d71d",
"sha256": "011940ec1a0c79cbaaf0cd18e83aad09f7071011b8e3e2cebe658c8bfa969d64"
},
"downloads": -1,
"filename": "skrub-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e1b49e823425590c8d0ba8833337d71d",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 327645,
"upload_time": "2024-12-11T19:28:02",
"upload_time_iso_8601": "2024-12-11T19:28:02.364073Z",
"url": "https://files.pythonhosted.org/packages/e6/9a/b77226bf12a8690a5d8fa7f1198bc4fdd967dc0138f14549d687ea94daea/skrub-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "95b4947b51a9b47fb5301ac14a6759f4d4fc2baa09e0059167de482a5779b822",
"md5": "1d492f8569b1a80c9299331e57fe8184",
"sha256": "2d32267fcae3aec0af187f209039d78b283fe37ddbee112862b7cefc51f0c2d4"
},
"downloads": -1,
"filename": "skrub-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "1d492f8569b1a80c9299331e57fe8184",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 6510113,
"upload_time": "2024-12-11T19:28:08",
"upload_time_iso_8601": "2024-12-11T19:28:08.007064Z",
"url": "https://files.pythonhosted.org/packages/95/b4/947b51a9b47fb5301ac14a6759f4d4fc2baa09e0059167de482a5779b822/skrub-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-11 19:28:08",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "skrub-data",
"github_project": "skrub",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"circle": true,
"lcname": "skrub"
}
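
The ``digests`` entries in the record above can be used to check a downloaded file. The following is a minimal sketch using only the standard library; the helper names (``sha256_of``, ``expected_sha256``, ``matches_record``) are hypothetical, and ``record`` is a hand-copied excerpt of the raw data reduced to the fields used here. The demonstration runs on a stand-in file rather than the real archive, so the check is expected to fail.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_sha256(record, filename):
    """Look up the recorded SHA-256 for `filename` in a record shaped like the raw data."""
    for entry in record["urls"]:
        if entry["filename"] == filename:
            return entry["digests"]["sha256"]
    raise KeyError(filename)


def matches_record(path, record):
    """Return True if the file's digest equals the one recorded for its name."""
    return sha256_of(path) == expected_sha256(record, Path(path).name)


# Excerpt of the raw data above (only filename and sha256 are kept).
record = {
    "urls": [
        {
            "filename": "skrub-0.4.1-py3-none-any.whl",
            "digests": {"sha256": "011940ec1a0c79cbaaf0cd18e83aad09f7071011b8e3e2cebe658c8bfa969d64"},
        },
        {
            "filename": "skrub-0.4.1.tar.gz",
            "digests": {"sha256": "2d32267fcae3aec0af187f209039d78b283fe37ddbee112862b7cefc51f0c2d4"},
        },
    ]
}

# Stand-in file with the sdist's name but different contents.
with tempfile.TemporaryDirectory() as tmp:
    fake = Path(tmp) / "skrub-0.4.1.tar.gz"
    fake.write_bytes(b"not the real archive")
    is_valid = matches_record(fake, record)

print(is_valid)
```

With a genuine download, pass the path of the saved file to ``matches_record`` instead of the stand-in.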