Name | cleandat |
Version | 0.0.3 |
home_page | https://github.com/tiadams/cleandat |
Summary | Python functions to facilitate the pre-processing of data for ML tasks in a clinical context. |
upload_time | 2024-01-17 15:21:21 |
maintainer | |
docs_url | None |
author | Tim Adams |
requires_python | |
license | MIT License |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# CleanDat
Python functions to facilitate pre-processing data for ML tasks, particularly data from a clinical context.

---

Major features include heuristic-based data cleaning and feature engineering, such as:
- Automatic detection of encoding strings (e.g. 1=m) and application of the detected encoding to unencoded values in the same column
- Automatic detection of date strings in different formats (e.g. 2019-01-01, 01/01/2019, January 2022) and conversion to a unified format
- Encoding of date strings into decomposed date features (e.g. year, month, day, weekday)
- Heuristics for unifying different number formats, e.g. 1,000.00 vs. 1.000,00, or exponential notations like 1e3 vs. 10x10^2
- Detection and replacement of inconsistent data values
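
The heuristics listed above can be illustrated with a minimal standard-library sketch. It does not use or reproduce cleandat's API: the function names (`apply_detected_encoding`, `decompose_date`, `normalize_number`), the list of accepted date formats, and the tie-breaking rules (day-first dates, a lone comma read as a decimal mark) are all assumptions made for illustration only.

```python
import re
from datetime import datetime

# Hypothetical helpers for illustration; NOT cleandat's actual functions.

ENCODING_RE = re.compile(r"\s*(\d+)\s*=\s*(.+?)\s*")           # e.g. "1=m"
SCI_RE = re.compile(r"([+-]?\d+(?:[.,]\d+)?)[xX*]10\^([+-]?\d+)")  # e.g. "10x10^2"
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %Y")               # assumed, day-first


def apply_detected_encoding(column):
    """Learn code=label pairs found in the column, then encode bare labels."""
    mapping = {}
    for value in column:
        match = ENCODING_RE.fullmatch(value)
        if match:
            mapping[match.group(2)] = int(match.group(1))
    encoded = []
    for value in column:
        match = ENCODING_RE.fullmatch(value)
        if match:
            encoded.append(int(match.group(1)))
        else:
            # Unknown labels are passed through unchanged.
            encoded.append(mapping.get(value.strip(), value))
    return encoded


def decompose_date(text):
    """Parse a date string and split it into features; None if no format fits."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue
        return {
            "iso": parsed.date().isoformat(),
            "year": parsed.year,
            "month": parsed.month,
            "day": parsed.day,
            "weekday": parsed.weekday(),  # Monday is 0
        }
    return None


def normalize_number(text):
    """Convert differently formatted number strings to float."""
    s = text.strip().replace(" ", "")
    sci = SCI_RE.fullmatch(s)
    if sci:
        mantissa = float(sci.group(1).replace(",", "."))
        return mantissa * 10 ** int(sci.group(2))
    if "," in s and "." in s:
        # The separator that appears last is taken as the decimal mark.
        if s.rfind(",") > s.rfind("."):
            s = s.replace(".", "").replace(",", ".")
        else:
            s = s.replace(",", "")
    elif s.count(",") == 1:
        # Ambiguous case: a lone comma is read as a decimal mark (assumption).
        s = s.replace(",", ".")
    else:
        # Several commas (or none): treat commas as thousands separators.
        s = s.replace(",", "")
    return float(s)


if __name__ == "__main__":
    print(apply_detected_encoding(["1=m", "2=f", "m", "f"]))
    print(decompose_date("01/01/2019"))
    print([normalize_number(v) for v in ("1,000.00", "1.000,00", "1e3", "10x10^2")])
```

A real implementation would probably decide ambiguous cases, such as 01/02/2019 or 1,000, by looking at the whole column rather than a single value; the sketch settles them with fixed rules to stay short.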
# Setup
Install via pip:

    pip install cleandat
Raw data
{
"_id": null,
"home_page": "https://github.com/tiadams/cleandat",
"name": "cleandat",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "",
"author": "Tim Adams",
"author_email": "tim-adams@gmx.net",
"download_url": "https://files.pythonhosted.org/packages/0a/ad/2111a9159e4fa098d6253e15f5aaec992e08287266b94089bd221c6e48f5/cleandat-0.0.3.tar.gz",
"platform": null,
"description": "\n# CleanDat\nPython functions to facilitate the pre-processing of data to prepare them for ML tasks, especially suitable for data in a clinical context.\n\n---\n\nMajor functionalities include heuristic based data cleaning and feature engineering like:\n- Automatic detection of encoding strings (e.g. 1=m) and application of the corresponding encoding to un-encoded data of the corresponding column\n- Automatic detection of date strings of different formats (e.g. 2019-01-01, 01/01/2019, January 2022) and conversion to a unified format\n- Encoding of date strings into decomposed date features (e.g. year, month, day, weekday, etc.)\n- Heuristics for unification of different number formats, e.g. 1,000.00 vs. 1.000,00 or exponential notations like 1e3 vs 10x10^2\n- Detection and replacement of inconsistent data values\n\n# Setup\n\nInstall via pip:\n\n pip install cleandat\n",
"bugtrack_url": null,
"license": "MIT License",
"summary": "Python functions to facilitate the pre-processing of data for ML tasks in a clinical context.",
"version": "0.0.3",
"project_urls": {
"Homepage": "https://github.com/tiadams/cleandat"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "79eeb95719512cce8b823143520db0bf89c4f305f2d16de592a97db90d7feb33",
"md5": "91ba3fc9baf2f03641a284eb7b8e480e",
"sha256": "e4910f0d1907fdf00c95f4606520357c09b1d0e87e8d448b97c1f5b2f037b8f9"
},
"downloads": -1,
"filename": "cleandat-0.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "91ba3fc9baf2f03641a284eb7b8e480e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 13935,
"upload_time": "2024-01-17T15:21:20",
"upload_time_iso_8601": "2024-01-17T15:21:20.191295Z",
"url": "https://files.pythonhosted.org/packages/79/ee/b95719512cce8b823143520db0bf89c4f305f2d16de592a97db90d7feb33/cleandat-0.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0aad2111a9159e4fa098d6253e15f5aaec992e08287266b94089bd221c6e48f5",
"md5": "1ee4d2b84a8f32ded75f0aa9ec7d5dce",
"sha256": "e85c54f195429135076066ac8136391ff9e12586b1dd202e0bc4fbd06e0613ce"
},
"downloads": -1,
"filename": "cleandat-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "1ee4d2b84a8f32ded75f0aa9ec7d5dce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10505,
"upload_time": "2024-01-17T15:21:21",
"upload_time_iso_8601": "2024-01-17T15:21:21.315110Z",
"url": "https://files.pythonhosted.org/packages/0a/ad/2111a9159e4fa098d6253e15f5aaec992e08287266b94089bd221c6e48f5/cleandat-0.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-17 15:21:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "tiadams",
"github_project": "cleandat",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "cleandat"
}
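
The `digests` entries in the raw data make it possible to check a downloaded file against the recorded hash. The following is a minimal sketch using only the standard library; the helper names `sha256_of_file` and `matches_expected` and the local file path in the usage comment are hypothetical, while the expected digest is the `sha256` value listed above for `cleandat-0.0.3.tar.gz`.

```python
import hashlib
from pathlib import Path

# sha256 digest recorded in the raw data for cleandat-0.0.3.tar.gz
EXPECTED_SDIST_SHA256 = (
    "e85c54f195429135076066ac8136391ff9e12586b1dd202e0bc4fbd06e0613ce"
)


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of_file(path) == expected_hex.lower()


# Usage, assuming the sdist was saved to the current directory:
# matches_expected("cleandat-0.0.3.tar.gz", EXPECTED_SDIST_SHA256)
```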