recipies

Name: recipies
Version: 1.2.0
Summary: A modular preprocessing package for Pandas Dataframe
Upload time: 2025-07-24 10:32:53
Author: Hendrik Schmidt, Patrick Rockenschaub
License: MIT license
Keywords: recipies, pandas, dataframe, polars, preprocessing, recipys

<div align="center">
  <img src="https://github.com/rvandewater/ReciPies/blob/development/docs/figures/recipies_logo.svg?raw=true" 
alt="recipies logo" height="300">
</div>

# ReciPies 🥧

[![CI](https://github.com/rvandewater/ReciPies/actions/workflows/ci.yml/badge.svg)](https://github.com/rvandewater/ReciPies/actions/workflows/ci.yml)
![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)
![Platform](https://img.shields.io/badge/platform-linux--64%20|%20win--64%20|%20osx--64-lightgrey)
[![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![PyPI version shields.io](https://img.shields.io/pypi/v/recipies.svg)](https://pypi.python.org/pypi/recipies/)
[![Python Version](https://img.shields.io/pypi/pyversions/recipies.svg)](https://pypi.python.org/pypi/recipies/)
[![Downloads](https://pepy.tech/badge/recipies)](https://pepy.tech/project/recipies)
[![arXiv](https://img.shields.io/badge/arXiv-2306.05109-b31b1b.svg)](http://arxiv.org/abs/2306.05109)

ReciPies is a preprocessing framework that operates on [Polars](https://github.com/pola-rs/polars)
and [Pandas](https://github.com/pandas-dev/pandas) dataframes; the user chooses the backend.
Its design is inspired by the R package [recipes](https://recipes.tidymodels.org/).
It lets the user apply a number of extensible operations for imputation, feature generation/extraction,
scaling, and encoding.
It operates on modified DataFrame objects from the established data science package Pandas.

## Installation

You can install ReciPies from pip using:

```
pip install recipies
```

> Note that the package is called `recipies` on pip.

You can install ReciPies from source to ensure you have the latest version:

```
conda env update -f environment.yml
conda activate ReciPies
pip install -e .
```

> Note that the last command installs the package called `recipies`.
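
As a quick sanity check after either installation route, the following minimal sketch reports which version of the `recipies` distribution is visible to the current Python environment. It relies only on the standard-library `importlib.metadata` module; the helper name `installed_version` is a hypothetical name chosen for this example and is not part of ReciPies.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "recipies" is the distribution name used on pip (see the note above).
found = installed_version("recipies")
if found is None:
    print("recipies is not installed in this environment")
else:
    print(f"recipies {found} is installed")
```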

## Usage

To define preprocessing operations, you assign _roles_ to the columns of the DataFrame.
Roles let you group columns that serve a particular function.
ReciPies then provides several "steps" that can be applied to the dataset, including historical accumulation,
resampling of the time resolution, a number of imputation methods, and a wrapper for any
[Scikit-learn](https://github.com/scikit-learn/scikit-learn) preprocessing step.
We believe these steps cover the basic preprocessing needs for prepared datasets.
Any missing step can be added by following the step interface.
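
The idea of roles and steps can be illustrated with a small, self-contained sketch. This is **not** the ReciPies API: the names `ToyRecipe`, `FillMissingWithMean`, and `MinMaxScale` are hypothetical, and a plain dictionary of columns stands in for a dataframe so that the example runs with the standard library alone. It only shows the concept: each column gets a role, and each step is applied to the columns that carry the role it targets.

```python
from dataclasses import dataclass, field
from statistics import mean

# Stand-in for a dataframe: column name -> list of values (assumption for this sketch).
Table = dict


@dataclass
class FillMissingWithMean:
    """Hypothetical imputation step: replace None with the column mean."""
    role: str = "predictor"

    def transform(self, values: list) -> list:
        fill = mean(v for v in values if v is not None)
        return [fill if v is None else v for v in values]


@dataclass
class MinMaxScale:
    """Hypothetical scaling step: map values linearly onto [0, 1]."""
    role: str = "predictor"

    def transform(self, values: list) -> list:
        lo, hi = min(values), max(values)
        span = hi - lo
        return [0.0 if span == 0 else (v - lo) / span for v in values]


@dataclass
class ToyRecipe:
    """Hypothetical recipe: holds column roles and an ordered list of steps."""
    roles: dict  # column name -> role
    steps: list = field(default_factory=list)

    def add_step(self, step) -> "ToyRecipe":
        self.steps.append(step)
        return self

    def apply(self, table: Table) -> Table:
        # Work on a copy so the input table is left untouched.
        out = {name: list(values) for name, values in table.items()}
        for step in self.steps:
            # Each step only touches columns whose role matches its target role.
            for column, role in self.roles.items():
                if role == step.role:
                    out[column] = step.transform(out[column])
        return out


table = {
    "patient_id": [1, 1, 2, 2],
    "heart_rate": [60.0, None, 80.0, 100.0],
    "label": [0, 0, 1, 1],
}
roles = {"patient_id": "group", "heart_rate": "predictor", "label": "outcome"}

recipe = ToyRecipe(roles).add_step(FillMissingWithMean()).add_step(MinMaxScale())
result = recipe.apply(table)
print(result["heart_rate"])  # [0.0, 0.5, 0.5, 1.0]
```

In this sketch, adding a new step only requires a class with a `role` attribute and a `transform` method, which mirrors in spirit how a missing step would be added by following a step interface. Consult the ReciPies documentation for the actual class names and signatures.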

# 📄 Paper

If you use this code in your research, please cite the following publication, which uses ReciPys extensively to create a
customisable preprocessing pipeline (a standalone paper is in preparation):

```
@inproceedings{vandewaterYetAnotherICUBenchmark2024,
  title = {Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML},
  shorttitle = {Yet Another ICU Benchmark},
  booktitle = {The Twelfth International Conference on Learning Representations},
  author = {van de Water, Robin and Schmidt, Hendrik Nils Aurel and Elbers, Paul and Thoral, Patrick and Arnrich, Bert and Rockenschaub, Patrick},
  year = {2024},
  month = oct,
  urldate = {2024-02-19},
  langid = {english},
}

```

This paper can also be found on arXiv: https://arxiv.org/pdf/2306.05109.pdf





            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "recipies",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "recipies, pandas, dataframe, polars, preprocessing, recipys",
    "author": "Hendrik Schmidt, Patrick Rockenschaub",
    "author_email": "Robin van de Water <robin.vandewater@hpi.de>",
    "download_url": "https://files.pythonhosted.org/packages/4d/50/165a97aab5b4d01574d80cbc62703b78dc3650ebf5ebee089af3321f088c/recipies-1.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT license",
    "summary": "A modular preprocessing package for Pandas Dataframe",
    "version": "1.2.0",
    "project_urls": null,
    "split_keywords": [
        "recipies",
        " pandas",
        " dataframe",
        " polars",
        " preprocessing",
        " recipys"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d691b232b99fdccb349f5b220daee5f8d89e63eaccc8357acf969849e918c61c",
                "md5": "5dc1521426ddac16c8dfd923d68aa6d7",
                "sha256": "8feaaa2f577aee7ae2b8745387c72d7f15dbc18d0ae0ef17a4bdf351ab6b28a3"
            },
            "downloads": -1,
            "filename": "recipies-1.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5dc1521426ddac16c8dfd923d68aa6d7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 17043,
            "upload_time": "2025-07-24T10:32:52",
            "upload_time_iso_8601": "2025-07-24T10:32:52.170803Z",
            "url": "https://files.pythonhosted.org/packages/d6/91/b232b99fdccb349f5b220daee5f8d89e63eaccc8357acf969849e918c61c/recipies-1.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4d50165a97aab5b4d01574d80cbc62703b78dc3650ebf5ebee089af3321f088c",
                "md5": "ce1db6aa7a5394110b8ede226f799b47",
                "sha256": "56a762264f7cfba42ad903af24c1a279cd31dd591bf3717a677b532b2c0de343"
            },
            "downloads": -1,
            "filename": "recipies-1.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "ce1db6aa7a5394110b8ede226f799b47",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 3730807,
            "upload_time": "2025-07-24T10:32:53",
            "upload_time_iso_8601": "2025-07-24T10:32:53.601602Z",
            "url": "https://files.pythonhosted.org/packages/4d/50/165a97aab5b4d01574d80cbc62703b78dc3650ebf5ebee089af3321f088c/recipies-1.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-24 10:32:53",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "recipies"
}
        