| Field | Value |
| --- | --- |
| Name | parquetconv |
| Version | 0.2.1 |
| Summary | A command-line tool for converting between Parquet and CSV file formats |
| Upload time | 2025-08-25 20:58:42 |
| Requires Python | >=3.9 |
| License | MIT |
| Keywords | conversion, csv, data, pandas, parquet |
# ParquetConv
A command-line tool for converting between Parquet and CSV file formats using pandas.
## Features
- **Automatic format detection**: Detects whether the input file is Parquet or CSV
- **Bidirectional conversion**: Convert Parquet to CSV or CSV to Parquet
- **Flexible output naming**: Auto-generates output filenames or allows custom naming
- **Error handling**: Reports missing files, unsupported formats, read/write failures, and invalid file content with informative messages
- **Force conversion**: Option to force conversion even with uncertain file formats
## Installation
### Option 1: Install from PyPI (Recommended)
```bash
pip install parquetconv
```
After installation, you can use the `parquetconv` command directly from anywhere in your terminal.
### Option 2: Install from source
Clone the repository and install:
```bash
git clone https://github.com/ToyokoLabs/parquetconv.git
cd parquetconv
pip install -e .
```
### Option 3: Development setup with uv
The project uses `uv` for dependency management. From the root of the cloned repository, install dependencies with:
```bash
uv sync
```
## Usage
### After pip installation
Convert a Parquet file to CSV:
```bash
parquetconv input.parquet
```
Convert a CSV file to Parquet:
```bash
parquetconv input.csv
```
### From source or development
```bash
python -m parquetconv.cli input.parquet
python -m parquetconv.cli input.csv
```
### Advanced Usage
Specify a custom output filename:
```bash
parquetconv input.parquet -o custom_output.csv
parquetconv input.csv -o custom_output.parquet
```
Force conversion (useful when file format detection is uncertain):
```bash
parquetconv input_file --force
```
### Command Line Options
- `input_file`: Path to the input file (required)
- `-o, --output`: Custom output file path (optional)
- `--force`: Force conversion even if file format detection is uncertain
- `-h, --help`: Show help message
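The options above map naturally onto Python's `argparse`. The following is a minimal sketch of such a parser, not the project's actual code: the function name `build_parser` and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (hypothetical sketch)."""
    parser = argparse.ArgumentParser(
        prog="parquetconv",
        description="Convert between Parquet and CSV file formats.",
    )
    # Required positional argument: the file to convert.
    parser.add_argument("input_file", help="Path to the input file")
    # Optional output path; None means "derive it from the input name".
    parser.add_argument("-o", "--output", default=None, help="Custom output file path")
    # Boolean switch that stays False unless the flag is present.
    parser.add_argument(
        "--force",
        action="store_true",
        help="Force conversion even if file format detection is uncertain",
    )
    # argparse adds -h/--help automatically.
    return parser


args = build_parser().parse_args(["data.csv", "-o", "out.parquet"])
print(args.input_file, args.output, args.force)  # data.csv out.parquet False
```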
## Examples
```bash
# Convert Parquet to CSV with auto-generated filename
parquetconv data.parquet
# Output: data.csv
# Convert CSV to Parquet with custom filename
parquetconv data.csv -o processed_data.parquet
# Convert with force flag
parquetconv unknown_file --force
# Get help
parquetconv --help
```
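To convert many files in one go, the command can be driven from a short Python script. This is a sketch under assumptions: the helper names `build_command` and `convert_all` are hypothetical, and only the flags documented above (`-o` and `--force`) are used. It requires the `parquetconv` command to be on your `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_command(
    input_file: Path, output: Optional[Path] = None, force: bool = False
) -> List[str]:
    """Assemble a parquetconv command line from the documented options."""
    cmd = ["parquetconv", str(input_file)]
    if output is not None:
        cmd += ["-o", str(output)]
    if force:
        cmd.append("--force")
    return cmd


def convert_all(directory: Path, pattern: str = "*.csv") -> None:
    """Run parquetconv once per matching file, stopping at the first failure."""
    for path in sorted(directory.glob(pattern)):
        # check=True raises CalledProcessError if the command exits non-zero.
        subprocess.run(build_command(path), check=True)
```

Calling `convert_all(Path("exports"))` would then convert every CSV file in a directory named `exports`, relying on the auto-generated output filenames.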
## Requirements
- Python 3.9+
- pandas >= 2.3.2
- pyarrow >= 21.0.0
## How It Works
1. **File Detection**: The tool first checks the file extension, then attempts to read the file to determine its format
2. **Format Conversion**: Uses pandas to read the input file and convert it to the opposite format
3. **Output Generation**: Creates the output file with an appropriate extension if not specified
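The three steps can be sketched in a few lines of Python. This is an illustration under assumptions, not the tool's implementation: the function names are hypothetical, and the content check here looks for the `PAR1` magic bytes that open every Parquet file, which may differ from how the tool actually inspects a file.

```python
from pathlib import Path
from typing import Optional

PARQUET_MAGIC = b"PAR1"  # Parquet files begin (and end) with these four bytes


def detect_format(path: Path) -> Optional[str]:
    """Return "parquet", "csv", or None when the format is uncertain."""
    suffix = path.suffix.lower()
    if suffix in (".parquet", ".csv"):
        return suffix.lstrip(".")  # ".parquet" -> "parquet"
    # No recognised extension: fall back to inspecting the file content.
    with path.open("rb") as handle:
        if handle.read(4) == PARQUET_MAGIC:
            return "parquet"
    return None


def default_output(path: Path, fmt: str) -> Path:
    """Swap the extension: Parquet input gives .csv, CSV input gives .parquet."""
    return path.with_suffix(".csv" if fmt == "parquet" else ".parquet")


def convert(path: Path, output: Optional[Path] = None) -> Path:
    """Convert the file to the opposite format and return the output path."""
    import pandas as pd  # imported lazily; needs pandas and pyarrow installed

    fmt = detect_format(path)
    if fmt is None:
        raise ValueError(f"Cannot determine the format of {path}")
    target = output or default_output(path, fmt)
    if fmt == "parquet":
        pd.read_parquet(path).to_csv(target, index=False)
    else:
        pd.read_csv(path).to_parquet(target, index=False)
    return target
```

Passing `index=False` keeps the pandas row index out of the output, so a round trip does not add an extra column.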
## Error Handling
The tool provides clear error messages for:
- Missing input files
- Unsupported file formats
- Read/write errors during conversion
- Invalid file content
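One way to produce messages like these is to wrap the conversion in a function that turns exceptions into a short message and an exit code. The sketch below is an assumption about how this could be done, not the tool's own code: the name `run_conversion`, the message wording, and the exit codes are all hypothetical.

```python
import sys
from pathlib import Path
from typing import Callable


def run_conversion(input_file: str, convert: Callable[[Path], Path]) -> int:
    """Run convert() and translate failures into a message and an exit code."""
    path = Path(input_file)
    if not path.is_file():
        print(f"Error: input file not found: {path}", file=sys.stderr)
        return 1
    try:
        target = convert(path)
    except ValueError as exc:
        # Unsupported format or invalid content; pandas' ParserError,
        # for example, is a ValueError subclass.
        print(f"Error: could not convert {path}: {exc}", file=sys.stderr)
        return 1
    except OSError as exc:
        # Read or write failure, such as a permission problem or a full disk.
        print(f"Error: I/O failure while converting {path}: {exc}", file=sys.stderr)
        return 1
    print(f"Wrote {target}")
    return 0
```

Returning an exit code instead of calling `sys.exit` directly keeps the function easy to test; a CLI entry point can pass the result to `sys.exit`.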
## Development
To contribute to the project:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests (if available)
5. Submit a pull request
## License
This project is open source and available under the GNU General Public License v3.0.
## Author
**Sebastian Bassi** - [sebastian@toyoko.io](mailto:sebastian@toyoko.io)
## Repository
- **Homepage**: https://github.com/ToyokoLabs/parquetconv
- **Repository**: https://github.com/ToyokoLabs/parquetconv
- **Issues**: https://github.com/ToyokoLabs/parquetconv/issues
Raw data
{
"_id": null,
"home_page": null,
"name": "parquetconv",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "conversion, csv, data, pandas, parquet",
"author": null,
"author_email": "Sebastian Bassi <sebastian@toyoko.io>",
"download_url": "https://files.pythonhosted.org/packages/22/64/44b2a1e1803dd93420928db6234927af9907d6389b8d07c322b6c1913d71/parquetconv-0.2.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A command-line tool for converting between Parquet and CSV file formats",
"version": "0.2.1",
"project_urls": {
"Homepage": "https://github.com/ToyokoLabs/parquetconv",
"Issues": "https://github.com/ToyokoLabs/parquetconv/issues",
"Repository": "https://github.com/ToyokoLabs/parquetconv"
},
"split_keywords": [
"conversion",
" csv",
" data",
" pandas",
" parquet"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b1e324a2273a85d44ba57342db7ad079635043d541982d416ffdc71570a392bd",
"md5": "5c37d30c2a3f9babb0aa2b3ed22c5fc3",
"sha256": "123b4e05ab2956ed77a919a64045227b82bbba11e622f0cc590c4735c72456f2"
},
"downloads": -1,
"filename": "parquetconv-0.2.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5c37d30c2a3f9babb0aa2b3ed22c5fc3",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 16608,
"upload_time": "2025-08-25T20:58:41",
"upload_time_iso_8601": "2025-08-25T20:58:41.773915Z",
"url": "https://files.pythonhosted.org/packages/b1/e3/24a2273a85d44ba57342db7ad079635043d541982d416ffdc71570a392bd/parquetconv-0.2.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "226444b2a1e1803dd93420928db6234927af9907d6389b8d07c322b6c1913d71",
"md5": "0fadc36c57dbfe0c98f80e371e34b39c",
"sha256": "b48f03ff42de9636949f2d6552c79f7fdba1c08959d526ec6ec6b0738c549b6d"
},
"downloads": -1,
"filename": "parquetconv-0.2.1.tar.gz",
"has_sig": false,
"md5_digest": "0fadc36c57dbfe0c98f80e371e34b39c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 48106,
"upload_time": "2025-08-25T20:58:42",
"upload_time_iso_8601": "2025-08-25T20:58:42.998981Z",
"url": "https://files.pythonhosted.org/packages/22/64/44b2a1e1803dd93420928db6234927af9907d6389b8d07c322b6c1913d71/parquetconv-0.2.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-25 20:58:42",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ToyokoLabs",
"github_project": "parquetconv",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "parquetconv"
}