# Features
Data Processing processes data obtained from MinIO, databases, Web APIs, and similar sources. It handles the following data types:
- txt
- json
- doc
- html
- excel
- csv
- pdf
- markdown
- ppt
## Text Type Processing
Processing of text data includes: cleaning abnormal data, filtering, de-duplication, and anonymization.
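
The four steps can be illustrated with a minimal sketch. This is not the package's actual API: the function names (`clean`, `keep`, `anonymize`, `process`), the `min_chars` threshold, and the choice to anonymize only e-mail addresses are all assumptions made for illustration, using only the Python standard library.

```python
import re
import unicodedata

# Hypothetical patterns chosen for this sketch, not taken from the package.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
CONTROL_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean(text: str) -> str:
    """Cleaning: normalize Unicode, drop control characters, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text)
    text = CONTROL_RE.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def keep(text: str, min_chars: int = 10) -> bool:
    """Filtering: keep only records with at least min_chars characters (assumed rule)."""
    return len(text) >= min_chars


def anonymize(text: str) -> str:
    """Anonymization: mask e-mail addresses with a placeholder."""
    return EMAIL_RE.sub("[EMAIL]", text)


def process(records, min_chars: int = 10):
    """Run cleaning, filtering, de-duplication, and anonymization in that order."""
    seen = set()
    result = []
    for raw in records:
        text = clean(raw)
        if not keep(text, min_chars):
            continue
        key = text.casefold()  # case-insensitive exact-match de-duplication
        if key in seen:
            continue
        seen.add(key)
        result.append(anonymize(text))
    return result


if __name__ == "__main__":
    sample = [
        "Contact  alice@example.com\x00 for details",
        "contact alice@example.com for details",
        "too short",
        "A second, longer record.",
    ]
    for line in process(sample):
        print(line)
```

De-duplication runs before anonymization here so that duplicates are detected on the cleaned text itself; a real pipeline might prefer fuzzy matching or a different order.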
## Raw data
{
"_id": null,
"home_page": "https://github.com/kubeagi/arcadia",
"name": "one-data-processing",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": "",
"keywords": "PDF WORD WEB parsing preprocessing",
"author": "",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/2a/34/3d570a1bf6452d713fc62da845f5eea3d2f38955f65dcd10653c4be27735/one_data_processing-0.0.14.tar.gz",
"platform": null,
"description": "# Features\n\nData Processing is used for data processing through MinIO, databases, Web APIs, etc. The data types handled include:\n- txt\n- json \n- doc\n- html\n- excel\n- csv\n- pdf\n- markdown\n- ppt\n\n## Text Type Processing \n\nThe data processing process includes: cleaning abnormal data, filtering, de-duplication, and anonymization.\n",
"bugtrack_url": null,
"license": "",
"summary": "Data Processing is used for data processing through MinIO, databases, Web APIs, etc.",
"version": "0.0.14",
"project_urls": {
"Homepage": "https://github.com/kubeagi/arcadia"
},
"split_keywords": [
"pdf",
"word",
"web",
"parsing",
"preprocessing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "2bcae169eb977331e3f7587dc0f26e203c36eb473cb25c419cf7aabeb5557083",
"md5": "67322d46e370f2e8092d14cb3e5d83f3",
"sha256": "6ea402a0951d2079510d74389cdd4a97ca36610e5b9fbb09c7f0a8482d04a446"
},
"downloads": -1,
"filename": "one_data_processing-0.0.14-py3-none-any.whl",
"has_sig": false,
"md5_digest": "67322d46e370f2e8092d14cb3e5d83f3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 15977,
"upload_time": "2024-02-02T09:52:34",
"upload_time_iso_8601": "2024-02-02T09:52:34.951894Z",
"url": "https://files.pythonhosted.org/packages/2b/ca/e169eb977331e3f7587dc0f26e203c36eb473cb25c419cf7aabeb5557083/one_data_processing-0.0.14-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2a343d570a1bf6452d713fc62da845f5eea3d2f38955f65dcd10653c4be27735",
"md5": "99a6552b7b4b7677e8e14f4ea5581614",
"sha256": "62551b7d4ab8e11a189b12cfa391acdb04b0ec15da077e2b89e8cc7e59714235"
},
"downloads": -1,
"filename": "one_data_processing-0.0.14.tar.gz",
"has_sig": false,
"md5_digest": "99a6552b7b4b7677e8e14f4ea5581614",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 10688,
"upload_time": "2024-02-02T09:52:37",
"upload_time_iso_8601": "2024-02-02T09:52:37.215483Z",
"url": "https://files.pythonhosted.org/packages/2a/34/3d570a1bf6452d713fc62da845f5eea3d2f38955f65dcd10653c4be27735/one_data_processing-0.0.14.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-02 09:52:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "kubeagi",
"github_project": "arcadia",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "one-data-processing"
}