Name | cdef-utils |
Version | 1.0.0 |
home_page | None |
Summary | Utility for converting CSV to Parquet files |
upload_time | 2024-10-05 10:47:34 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.9 |
license | None |
keywords | csv, data processing, parquet |
requirements | No requirements were recorded. |
# cdef-utils
cdef-utils is a Python package that converts CSV and Parquet files to a standardized Parquet format, tailored to processing register data. It provides utilities for batch-processing files, generating summaries, and detecting the encoding of CSV files.
## Features
- Convert CSV and Parquet files to a standardized Parquet format
- Automatic encoding detection for CSV files
- Batch processing of multiple files
- Generation of summary reports
- Progress tracking and resumable processing
- Rich console output with logging
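The README does not describe how encoding detection works. As a rough illustration only, one common approach is to try a short list of candidate encodings and keep the first one that decodes a sample of the file without error. The function name `detect_encoding` and the candidate list below are assumptions for this sketch, not part of cdef-utils.

```python
import codecs

# Hypothetical candidate list; the encodings cdef-utils actually tries are not documented.
# latin-1 accepts every byte sequence, so it acts as a last-resort fallback.
CANDIDATE_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def detect_encoding(path, sample_size=65536):
    """Return the first candidate encoding that decodes a sample of the file."""
    with open(path, "rb") as handle:
        sample = handle.read(sample_size)
    # If the sample was cut short, tolerate a multi-byte character split at its end.
    is_whole_file = len(sample) < sample_size
    for encoding in CANDIDATE_ENCODINGS:
        decoder = codecs.getincrementaldecoder(encoding)()
        try:
            decoder.decode(sample, final=is_whole_file)
        except UnicodeDecodeError:
            continue
        return encoding
    return None
```

Trial decoding is cheap and dependency-free, but it cannot distinguish between single-byte encodings that both accept the same bytes, so the order of the candidate list matters.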
## Installation
Install cdef-utils with pip:
```
pip install cdef-utils
```
## Usage
You can use cdef-utils as a command-line tool:
```
python -m cdef_utils /path/to/input/directory --summary_file output_summary.json
```
### Arguments
- `input_directory`: Path to the directory containing CSV and Parquet files to process
- `--summary_file`: (Optional) Path to save the summary JSON file (default: "register_summary.json")
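The two arguments above map directly onto a standard `argparse` parser. The sketch below assumes the package uses `argparse`, which the README does not confirm; `build_parser` is a hypothetical name, and only the argument names and the default value come from the documentation.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented arguments; the real CLI may differ.
    parser = argparse.ArgumentParser(prog="cdef_utils")
    parser.add_argument(
        "input_directory",
        help="Directory containing CSV and Parquet files to process",
    )
    parser.add_argument(
        "--summary_file",
        default="register_summary.json",
        help="Path to save the summary JSON file",
    )
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["/path/to/input/directory"])
print(args.input_directory, args.summary_file)
```

When `--summary_file` is omitted, as here, `args.summary_file` falls back to the documented default `register_summary.json`.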
## Output
The script will:
1. Convert all CSV and Parquet files in the input directory to Parquet format
2. Save the converted files in a structured directory tree under the `registers` subdirectory of `OUTPUT_DIRECTORY` (`/path/to/your/fixed/output/directory/registers` with the path set in the script)
3. Generate a summary JSON file with details about each processed register
4. Display a summary table in the console
5. Log processing details and any errors
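The steps above, together with the resumable processing mentioned under Features, can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's implementation: `process_directory` is a hypothetical name, the conversion itself is passed in as a `convert` callable (which would wrap the polars calls in practice), and the summary is keyed by file path because the actual summary layout is not documented.

```python
import json
from pathlib import Path


def process_directory(input_directory, summary_file, convert):
    """Run `convert` on each CSV/Parquet file and record results in a JSON summary."""
    summary_path = Path(summary_file)
    # Reload an earlier summary so files handled in a previous run are skipped.
    summary = json.loads(summary_path.read_text()) if summary_path.exists() else {}
    files = sorted(
        p for p in Path(input_directory).rglob("*")
        if p.is_file() and p.suffix.lower() in {".csv", ".parquet"}
    )
    for path in files:
        key = str(path)
        if summary.get(key, {}).get("status") == "ok":
            continue  # already converted in an earlier run
        try:
            # `convert` must return something JSON-serializable.
            summary[key] = {"status": "ok", "details": convert(path)}
        except Exception as exc:  # record the failure and keep going
            summary[key] = {"status": "error", "details": str(exc)}
        # Persist after every file so an interrupted run can resume.
        summary_path.write_text(json.dumps(summary, indent=2))
    return summary
```

Writing the summary after each file trades some I/O for the ability to restart a long batch without repeating finished work.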
## Requirements
- Python 3.9+
- polars
- rich
## Configuration
- `OUTPUT_DIRECTORY` is hard-coded in the script as `/path/to/your/fixed/output/directory`. Change this path to a real location before running.
- Logging is configured to save logs in a `logs` directory in the current working directory.
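A logging setup of the kind described above could look like the following sketch. The helper name `configure_logging`, the logger name, and the log file name `processing.log` are assumptions for illustration; the README only states that logs go to a `logs` directory in the current working directory.

```python
import logging
from pathlib import Path


def configure_logging(base_dir="."):
    """Send INFO-level messages to a file inside `<base_dir>/logs`."""
    log_dir = Path(base_dir) / "logs"
    log_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical logger and file names; the package's own names are not documented.
    logger = logging.getLogger("cdef_utils_example")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_dir / "processing.log", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

Calling `configure_logging()` with no argument uses the current working directory, matching the documented behavior. Calling it more than once would attach duplicate handlers, so a real setup would guard against that.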
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## License
This project is licensed under the MIT License.
Raw data
{
"_id": null,
"home_page": null,
"name": "cdef-utils",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "csv, data processing, parquet",
"author": null,
"author_email": "Tobias Kragholm <tkragholm@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/3a/c6/84ef6087e4f52e588f93df787fda05afdf0b9662844faed14cce1233585f/cdef_utils-1.0.0.tar.gz",
"platform": null,
"description": "# cdef-utils\n\ncdef-utils is a Python package designed to convert CSV and Parquet files to a standardized Parquet format, specifically tailored for processing register data. It provides utilities for batch processing files, generating summaries, and handling various encoding issues.\n\n## Features\n\n- Convert CSV and Parquet files to a standardized Parquet format\n- Automatic encoding detection for CSV files\n- Batch processing of multiple files\n- Generation of summary reports\n- Progress tracking and resumable processing\n- Rich console output with logging\n\n## Installation\n\nTo install cdef-utils, you can use pip:\n\n```\npip install cdef-utils\n```\n\n## Usage\n\nYou can use cdef-utils as a command-line tool:\n\n```\npython -m cdef_utils /path/to/input/directory --summary_file output_summary.json\n```\n\n### Arguments\n\n- `input_directory`: Path to the directory containing CSV and Parquet files to process\n- `--summary_file`: (Optional) Path to save the summary JSON file (default: \"register_summary.json\")\n\n## Output\n\nThe script will:\n\n1. Convert all CSV and Parquet files in the input directory to Parquet format\n2. Save the converted files in a structured directory format under `/path/to/your/fixed/output/directory/registers`\n3. Generate a summary JSON file with details about each processed register\n4. Display a summary table in the console\n5. Log processing details and any errors\n\n## Requirements\n\n- Python 3.7+\n- polars\n- rich\n\n## Configuration\n\n- The `OUTPUT_DIRECTORY` is set to `/path/to/your/fixed/output/directory` in the script. Modify this path as needed.\n- Logging is configured to save logs in a `logs` directory in the current working directory.\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## License\n\nThis project is licensed under the MIT License.\n",
"bugtrack_url": null,
"license": null,
"summary": "Utility for converting CSV to Parquet files",
"version": "1.0.0",
"project_urls": {
"Bug Tracker": "https://github.com/tkragholm/cdef-utils/issues",
"Homepage": "https://github.com/tkragholm/cdef-utils"
},
"split_keywords": [
"csv",
" data processing",
" parquet"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "448cc7f10ab920c3467c71976927ff6de12b07b4f73ffcf9157c338e0588b0ba",
"md5": "d19aad489333219bee2ce24684583c9f",
"sha256": "80cccbb7042d885bad59d0ad48f847a84bead84b09a8aa84303c0f3481fbc7f8"
},
"downloads": -1,
"filename": "cdef_utils-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d19aad489333219bee2ce24684583c9f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 5992,
"upload_time": "2024-10-05T10:47:32",
"upload_time_iso_8601": "2024-10-05T10:47:32.919007Z",
"url": "https://files.pythonhosted.org/packages/44/8c/c7f10ab920c3467c71976927ff6de12b07b4f73ffcf9157c338e0588b0ba/cdef_utils-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3ac684ef6087e4f52e588f93df787fda05afdf0b9662844faed14cce1233585f",
"md5": "9952e55f975639c50bbfd911150668e3",
"sha256": "698ee3dd23e2aeca35aa41f4e3794653e5402b855d5dc383eae7ccd2e0afb20b"
},
"downloads": -1,
"filename": "cdef_utils-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "9952e55f975639c50bbfd911150668e3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 33979,
"upload_time": "2024-10-05T10:47:34",
"upload_time_iso_8601": "2024-10-05T10:47:34.255237Z",
"url": "https://files.pythonhosted.org/packages/3a/c6/84ef6087e4f52e588f93df787fda05afdf0b9662844faed14cce1233585f/cdef_utils-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-05 10:47:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "tkragholm",
"github_project": "cdef-utils",
"github_not_found": true,
"lcname": "cdef-utils"
}