data-prep-toolkit-transforms


* Name: data-prep-toolkit-transforms
* Version: 0.2.1
* Summary: Data Preparation Toolkit Transforms
* Upload time: 2024-09-26 06:32:44
* Requires Python: <3.12,>=3.10
* License: Apache-2.0
* Keywords: transforms, data preprocessing, data preparation, llm, generative, ai, fine-tuning, llmapps

# DPK Python Transforms

## Installation

The [transforms](https://github.com/IBM/data-prep-kit/blob/dev/transforms/README.md) are delivered as a standard Python library available on PyPI and can be installed with pip:

`python -m pip install data-prep-toolkit-transforms`

Installing the Python transforms also installs `data-prep-toolkit`.
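
To confirm that the install worked, the interpreter version and the two installed distributions can be checked from Python. The following is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are illustrative and not part of the package, and the version bounds are taken from the package's `requires_python` value (`<3.12,>=3.10`).

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter satisfies '<3.12,>=3.10'."""
    return (3, 10) <= tuple(version_info[:2]) < (3, 12)


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    # Installing the transforms is expected to pull in data-prep-toolkit as well.
    for name in ("data-prep-toolkit-transforms", "data-prep-toolkit"):
        print(name, "->", installed_version(name) or "not installed")
```

If either distribution is reported as not installed, rerunning the pip command above in the same environment is the first thing to try.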

## List of Transforms in current package

Note: This list covers the transforms that are part of release 0.2.1.dev3. It is maintained on a best-effort basis and may not always be up to date. Users are encouraged to raise an issue in the git repository when they discover missing components.

* code
    * [code2parquet](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code2parquet/python/README.md)
    * [header_cleanser (Not available on MacOS)](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/header_cleanser/python/README.md)
    * [code_quality](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code_quality/python/README.md)
    * [proglang_select](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/proglang_select/python/README.md)
* language
    * [doc_chunk](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_chunk/python/README.md)
    * [doc_quality](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_quality/python/README.md)
    * [lang_id](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/lang_id/python/README.md)
    * [pdf2parquet](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/pdf2parquet/python/README.md)
    * [text_encoder](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/text_encoder/python/README.md)
    * [pii_redactor](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/pii_redactor/python/README.md)
* universal
    * [ededup](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/ededup/python/README.md)
    * [filter](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/filter/python/README.md)
    * [resize](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/resize/python/README.md)
    * [tokenization](https://github.com/IBM/data-prep-kit/blob/dev/transforms/tokenization/doc_chunk/python/README.md)
    * [doc_id](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_id/python/README.md)

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "data-prep-toolkit-transforms",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.12,>=3.10",
    "maintainer_email": null,
    "keywords": "transforms, data preprocessing, data preparation, llm, generative, ai, fine-tuning, llmapps",
    "author": null,
    "author_email": "Maroun Touma <touma@us.ibm.com>",
    "download_url": "https://files.pythonhosted.org/packages/d1/d0/55afe358b58efc20472f33245b237f939119d9211f78da2a8cc39ccffc17/data_prep_toolkit_transforms-0.2.1.tar.gz",
    "platform": null,
    "description": "# DPK Python Transforms\n\n## installation\n\nThe [transforms](https://github.com/IBM/data-prep-kit/blob/dev/transforms/README.md) are delivered as a standard pyton library available on pypi and can be installed using pip install:\n\n`python -m pip install data-prep-toolkit-transforms`\n\ninstalling the python transforms will also install  `data-prep-toolkit`\n\n## List of Transforms in current package\n\nNote: This list includes the transforms that are part of the current release for 0.2.1.dev3 and will be maintained on best effort but may may not be always up to date. users are encourage to raise an issue in git when they discover missing components\n\n* code\n    * [code2parquet](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code2parquet/python/README.md)\n    * [header_cleanser (Not available on MacOS)](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/header_cleanser/python/README.md)\n    * [code_quality](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/code_quality/python/README.md)\n    * [proglang_select](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/proglang_select/python/README.md)\n* language\n    * [doc_chunk](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_chunk/python/README.md)\n\t* [doc_quality](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_quality/python/README.md)\n\t* [lang_id](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/lang_id/python/README.md)\n\t* [pdf2parquet](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/pdf2parquet/python/README.md)\n\t* [text_encoder](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/text_encoder/python/README.md)\n\t* [pii_redactor](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/pii_redactor/python/README.md)\n* universal\n    * [ededup](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/ededup/python/README.md)\n\t* [filter](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/filter/python/README.md)\n\t* [resize](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/resize/python/README.md)\n\t* [tokenization](https://github.com/IBM/data-prep-kit/blob/dev/transforms/tokenization/doc_chunk/python/README.md)\n\t* [doc_id](https://github.com/IBM/data-prep-kit/blob/dev/transforms/code/doc_id/python/README.md)\n\n\t\n\n\n\n\n \n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Data Preparation Toolkit Transforms",
    "version": "0.2.1",
    "project_urls": null,
    "split_keywords": [
        "transforms",
        " data preprocessing",
        " data preparation",
        " llm",
        " generative",
        " ai",
        " fine-tuning",
        " llmapps"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "26921ec59f0039fff679fde5ab3cc6042b13fd9a352f2852c8903fecc327792d",
                "md5": "b162aff4fd748665623980a8236f31fd",
                "sha256": "47ca7364032fea576cab5feee734cc6e58e6963127788ae2d23917eff6221f5e"
            },
            "downloads": -1,
            "filename": "data_prep_toolkit_transforms-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b162aff4fd748665623980a8236f31fd",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.12,>=3.10",
            "size": 115712,
            "upload_time": "2024-09-26T06:32:42",
            "upload_time_iso_8601": "2024-09-26T06:32:42.635104Z",
            "url": "https://files.pythonhosted.org/packages/26/92/1ec59f0039fff679fde5ab3cc6042b13fd9a352f2852c8903fecc327792d/data_prep_toolkit_transforms-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d1d055afe358b58efc20472f33245b237f939119d9211f78da2a8cc39ccffc17",
                "md5": "dd4cfc6a632d13c66d3fde9892b6d3df",
                "sha256": "0cbdd23ed9b63149f01c6e4564c06b3433e1dd2ac664ce55465a10562f66ddc7"
            },
            "downloads": -1,
            "filename": "data_prep_toolkit_transforms-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "dd4cfc6a632d13c66d3fde9892b6d3df",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.12,>=3.10",
            "size": 57316,
            "upload_time": "2024-09-26T06:32:44",
            "upload_time_iso_8601": "2024-09-26T06:32:44.671585Z",
            "url": "https://files.pythonhosted.org/packages/d1/d0/55afe358b58efc20472f33245b237f939119d9211f78da2a8cc39ccffc17/data_prep_toolkit_transforms-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-26 06:32:44",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "data-prep-toolkit-transforms"
}
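
The `sha256` digests recorded in the raw data above can be used to check a manually downloaded wheel or source archive before installing it. The following is a minimal sketch with the standard library; the helper names `sha256_of` and `matches_published_digest` are hypothetical and not part of the package, and the expected values are copied from the `urls` entries above.

```python
import hashlib
from pathlib import Path

# SHA-256 digests copied from the "urls" entries in the raw data above.
EXPECTED_SHA256 = {
    "data_prep_toolkit_transforms-0.2.1-py3-none-any.whl":
        "47ca7364032fea576cab5feee734cc6e58e6963127788ae2d23917eff6221f5e",
    "data_prep_toolkit_transforms-0.2.1.tar.gz":
        "0cbdd23ed9b63149f01c6e4564c06b3433e1dd2ac664ce55465a10562f66ddc7",
}


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path):
    """Return True only if the file name is known and its SHA-256 matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_digest("data_prep_toolkit_transforms-0.2.1.tar.gz")` would be called on the downloaded archive, and a `False` result means the file should not be installed.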
        