# DPK Ray Transforms
## Installation
The [transforms](https://github.com/IBM/data-prep-kit/blob/dev/transforms/README.md) are delivered as a standard Python library available on PyPI and can be installed with pip:

`python -m pip install data-prep-toolkit-transforms-ray`

Installing the Ray transforms also installs `data_prep_toolkit_transforms` and `data-prep-toolkit-ray`.
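
As a quick sanity check after installation, the sketch below looks up the three distributions named above and tests the interpreter against the `requires_python` constraint (`<3.12,>=3.10`) listed in the package metadata. It uses only the standard library; the helper names `python_is_supported` and `installed_versions` are illustrative and are not part of the toolkit.

```python
import sys
from importlib import metadata

# Distribution names copied from the installation notes above.
EXPECTED = (
    "data-prep-toolkit-transforms-ray",
    "data_prep_toolkit_transforms",
    "data-prep-toolkit-ray",
)


def python_is_supported(version=None):
    """Return True when the (major, minor) version satisfies >=3.10,<3.12."""
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 10) <= major_minor < (3, 12)


def installed_versions(names=EXPECTED):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry means pip did not install that distribution into the active environment, which usually points to a different interpreter or virtual environment than the one used for `pip install`.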
## List of Ray Transforms available in the current package
Note: This list covers the transforms that are part of the current release, 0.2.1.dev3. It is maintained on a best-effort basis and may not always be up to date. Users are encouraged to open an issue in the GitHub repository when they discover missing components.
* code
    * [code2parquet](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code2parquet/ray/README.md)
    * [proglang_select](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/proglang_select/ray/README.md)
    * [header_cleanser (Not available on MacOS)](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code2parquet/ray/README.md)
    * [code_quality](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code_quality/ray/README.md)
    * [repo_level_ordering](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/repo_level_ordering/ray/README.md)
* language
    * [doc_quality](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_quality/ray/README.md)
    * [doc_chunk](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_chunk/ray/README.md)
    * [lang_id](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/lang_id/ray/README.md)
    * [text_encoder](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/text_encoder/ray/README.md)
    * [pdf2parquet](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/pdf2parquet/ray/README.md)
    * [pii_redactor](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/pii_redactor/ray/README.md)
* universal
    * [fdedup](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/fdedup/ray/README.md)
    * [tokenization](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/tokenization/ray/README.md)
    * [ededup](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/ededup/ray/README.md)
    * [profiler](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/profiler/ray/README.md)
    * [doc_id](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_id/ray/README.md)
    * [filter](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/filter/ray/README.md)
    * [resize](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/resize/ray/README.md)
Raw data
{
"_id": null,
"home_page": null,
"name": "data-prep-toolkit-transforms-ray",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.12,>=3.10",
"maintainer_email": null,
"keywords": "transforms, data preprocessing, data preparation, llm, generative, ai, fine-tuning, llmapps",
"author": null,
"author_email": "Maroun Touma <touma@us.ibm.com>",
"download_url": "https://files.pythonhosted.org/packages/b9/a6/a7ce93e1d947257fd2f0963170f1f35581ebcd274b9970381f6dfd98d93c/data_prep_toolkit_transforms_ray-0.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Data Preparation Toolkit Transforms using Ray",
"version": "0.2.1",
"project_urls": null,
"split_keywords": [
"transforms",
" data preprocessing",
" data preparation",
" llm",
" generative",
" ai",
" fine-tuning",
" llmapps"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "3c1fa9f82d6480a08bc5643d565dae8b6d7d85818f8e8cd8cd2a1202a277a0cb",
"md5": "bd3cd74c45c1669c867e7d2c1c29af93",
"sha256": "6de3c72c399117704d6899700491ed3207d6a844863e41ad49250ca9b3a813eb"
},
"downloads": -1,
"filename": "data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "bd3cd74c45c1669c867e7d2c1c29af93",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.12,>=3.10",
"size": 107025,
"upload_time": "2024-09-26T11:49:17",
"upload_time_iso_8601": "2024-09-26T11:49:17.235712Z",
"url": "https://files.pythonhosted.org/packages/3c/1f/a9f82d6480a08bc5643d565dae8b6d7d85818f8e8cd8cd2a1202a277a0cb/data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b9a6a7ce93e1d947257fd2f0963170f1f35581ebcd274b9970381f6dfd98d93c",
"md5": "3c57deb5d8eb7b0f2d53b1ac709c26b5",
"sha256": "d25903b03516987114014c4774b914accbaf8fd5243bf31a1f56e173d9faa243"
},
"downloads": -1,
"filename": "data_prep_toolkit_transforms_ray-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "3c57deb5d8eb7b0f2d53b1ac709c26b5",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.12,>=3.10",
"size": 53449,
"upload_time": "2024-09-26T11:49:18",
"upload_time_iso_8601": "2024-09-26T11:49:18.926820Z",
"url": "https://files.pythonhosted.org/packages/b9/a6/a7ce93e1d947257fd2f0963170f1f35581ebcd274b9970381f6dfd98d93c/data_prep_toolkit_transforms_ray-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-09-26 11:49:18",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "data-prep-toolkit-transforms-ray"
}
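
The `digests` entries above can be used to confirm that a downloaded artifact is intact. The following is a minimal sketch, assuming the wheel has already been downloaded to a local path; the constant `WHEEL_SHA256` is copied from the wheel's `sha256` field, while the helper names `sha256_of` and `matches` are hypothetical.

```python
import hashlib
from pathlib import Path

# sha256 digest of data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl,
# copied from the "digests" entry in the metadata above.
WHEEL_SHA256 = "6de3c72c399117704d6899700491ed3207d6a844863e41ad49250ca9b3a813eb"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path, expected=WHEEL_SHA256):
    """Return True when the file's sha256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

Calling `matches("data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl")` on the downloaded wheel should return `True` if the file is unmodified; for the source archive, pass the `sha256` value from the second `digests` entry as `expected`.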