
# 🥧ReciPies🐍
[CI](https://github.com/rvandewater/ReciPies/actions/workflows/ci.yml)
[Code style: Black](https://github.com/psf/black)
[License](LICENSE)
[PyPI](https://pypi.python.org/pypi/recipies/)
[arXiv](http://arxiv.org/abs/2306.05109)

ReciPies is a preprocessing framework that operates on [Polars](https://github.com/pola-rs/polars)
and [Pandas](https://github.com/pandas-dev/pandas) dataframes. The user chooses the backend.
Its design is inspired by the R package [recipes](https://recipes.tidymodels.org/).
The package lets the user apply a number of extensible operations for imputation, feature generation/extraction,
scaling, and encoding.
It operates on modified DataFrame objects from the established data science package Pandas.
## Installation
You can install ReciPies from PyPI using pip:
```
pip install recipies
```
> Note that the package is called `recipies` on pip.
>
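You can also install ReciPies from source to ensure you have the latest version. Run the following from the root of a clone of the [repository](https://github.com/rvandewater/ReciPies):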
```
conda env update -f environment.yml
conda activate ReciPies
pip install -e .
```
> Note that the last command installs the package called `recipies`.
## Usage
To define preprocessing operations, you assign _roles_ to the columns of the DataFrame.
Roles let you group columns that serve a particular function.
We then provide several "steps" that can be applied to the datasets, including historical accumulation,
resampling of the time resolution, a number of imputation methods, and a wrapper for any
[Scikit-learn](https://github.com/scikit-learn/scikit-learn) preprocessing step.
We believe these steps cover most basic preprocessing needs for prepared datasets.
Any missing step can be added by following the step interface.
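
The idea of roles and steps can be illustrated with a minimal, standard-library-only sketch. It is not the ReciPies API: `MiniRecipe`, `set_role`, `add_step`, `bake`, `impute_mean`, and `scale_min_max` are hypothetical names invented for this example, and the role labels (`group`, `predictor`, `outcome`) are assumptions. It only shows how assigning roles lets a step target a whole group of columns at once.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for illustration only; none of these names come from ReciPies.
Column = List[Optional[float]]
Step = Callable[[Column], Column]


@dataclass
class MiniRecipe:
    """Toy recipe: a table, one role per column, and an ordered list of steps."""

    data: Dict[str, Column]
    roles: Dict[str, str] = field(default_factory=dict)
    steps: List[Tuple[Step, str]] = field(default_factory=list)

    def set_role(self, columns: List[str], role: str) -> "MiniRecipe":
        for name in columns:
            self.roles[name] = role
        return self

    def add_step(self, step: Step, role: str) -> "MiniRecipe":
        # A step is applied to every column that carries the given role.
        self.steps.append((step, role))
        return self

    def bake(self) -> Dict[str, Column]:
        # Work on copies so the input table is left untouched.
        out = {name: list(values) for name, values in self.data.items()}
        for step, role in self.steps:
            for name in list(out):
                if self.roles.get(name) == role:
                    out[name] = step(out[name])
        return out


def impute_mean(values: Column) -> Column:
    """Replace missing entries (None) with the mean of the observed ones."""
    fill = mean(v for v in values if v is not None)
    return [fill if v is None else v for v in values]


def scale_min_max(values: Column) -> Column:
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


table = {"stay_id": [1, 1, 2], "heart_rate": [60.0, None, 80.0], "label": [0, 0, 1]}
recipe = MiniRecipe(table)
recipe.set_role(["stay_id"], "group").set_role(["heart_rate"], "predictor")
recipe.set_role(["label"], "outcome")
recipe.add_step(impute_mean, "predictor").add_step(scale_min_max, "predictor")

baked = recipe.bake()
print(baked["heart_rate"])  # [0.0, 0.5, 1.0]
```

Only the `predictor` column is imputed and scaled; the `group` and `outcome` columns pass through unchanged. In this toy version, a new step is simply another function that takes a column and returns a column.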
# 📄Paper
If you use this code in your research, please cite the following publication (a standalone paper is in preparation):
```
@inproceedings{vandewaterYetAnotherICUBenchmark2024,
title = {Yet Another ICU Benchmark: A Flexible Multi-Center Framework for Clinical ML},
shorttitle = {Yet Another ICU Benchmark},
booktitle = {The Twelfth International Conference on Learning Representations},
author = {van de Water, Robin and Schmidt, Hendrik Nils Aurel and Elbers, Paul and Thoral, Patrick and Arnrich, Bert and Rockenschaub, Patrick},
year = {2024},
month = oct,
urldate = {2024-02-19},
langid = {english},
}
```
This paper can also be found on arXiv: https://arxiv.org/pdf/2306.05109.pdf
Raw data
{
"_id": null,
"home_page": null,
"name": "recipies",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "recipies, pandas, dataframe, polars, preprocessing, recipys",
"author": "Hendrik Schmidt, Patrick Rockenschaub",
"author_email": "Robin van de Water <robin.vandewater@hpi.de>",
"download_url": "https://files.pythonhosted.org/packages/f4/71/249b5da9ecd6baa60012be42c4a056b55487859347993f0399c19cced197/recipies-1.1.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT license",
"summary": "A modular preprocessing package for Pandas Dataframe",
"version": "1.1.5",
"project_urls": null,
"split_keywords": [
"recipies",
" pandas",
" dataframe",
" polars",
" preprocessing",
" recipys"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "7ced7ab44a3561766afae50b53b041bb61e2b9466a6db377df81e71229f853b8",
"md5": "e835c380e64c94b38db51cedf37021df",
"sha256": "2a94f5e6242b5f2a7d5a4a6e6868c4938792a66451d585c073e7192faddbfd23"
},
"downloads": -1,
"filename": "recipies-1.1.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e835c380e64c94b38db51cedf37021df",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 3788,
"upload_time": "2025-07-14T07:55:14",
"upload_time_iso_8601": "2025-07-14T07:55:14.699877Z",
"url": "https://files.pythonhosted.org/packages/7c/ed/7ab44a3561766afae50b53b041bb61e2b9466a6db377df81e71229f853b8/recipies-1.1.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "f471249b5da9ecd6baa60012be42c4a056b55487859347993f0399c19cced197",
"md5": "0b9cb38ed332e8b32929db4630e17acd",
"sha256": "b322a01eaf2e02303e298ab2bbffa62e1f499e69d134f12f542786b64bd47bca"
},
"downloads": -1,
"filename": "recipies-1.1.5.tar.gz",
"has_sig": false,
"md5_digest": "0b9cb38ed332e8b32929db4630e17acd",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 3470011,
"upload_time": "2025-07-14T07:55:16",
"upload_time_iso_8601": "2025-07-14T07:55:16.506360Z",
"url": "https://files.pythonhosted.org/packages/f4/71/249b5da9ecd6baa60012be42c4a056b55487859347993f0399c19cced197/recipies-1.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-14 07:55:16",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "recipies"
}