# Parquet Viewer
A command-line tool for viewing, analyzing, and manipulating Parquet files.
## Features
- 📊 View Parquet files in various table formats
- 📤 Export to different formats (CSV, Excel, JSON, HTML)
- 📈 Display dataset statistics and summaries
- 🔍 Filter and sort data
- 📉 Analyze correlations and missing values
- 🎲 Sample data randomly
- 💾 Memory-efficient handling of large files
- 🎨 Multiple display format options
## Installation
```bash
pip install parquet-viewer
```
## Usage
### Basic Commands
#### View Parquet File
```bash
# Basic viewing
pqview view data.parquet
# Customize display
pqview view data.parquet --max-rows 20 --format github
pqview view data.parquet -n 50 -f pretty --no-stats
```
#### Export to Other Formats
```bash
# Export to CSV
pqview export data.parquet output.csv
# Export to other formats
pqview export data.parquet output.xlsx --format excel
pqview export data.parquet output.json --format json
pqview export data.parquet output.html --format html
```
### Analysis Commands
#### Summary Statistics
```bash
# Show summary statistics for numerical columns
pqview stats data.parquet
```
#### Value Counts
```bash
# Show value counts for a specific column
pqview counts data.parquet column_name
```
#### Missing Values Analysis
```bash
# Show statistics about missing values
pqview missing data.parquet
```
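To make this kind of report concrete, here is a minimal sketch of a per-column missing-value summary in plain Python. It is not the tool's implementation: the `missing_summary` helper, the sample rows, and the choice of `None` as the missing-value marker are all assumptions made for illustration.

```python
from typing import Any


def missing_summary(rows: list[dict[str, Any]]) -> dict[str, tuple[int, float]]:
    """Return {column: (missing_count, missing_percent)} for a list of row dicts."""
    if not rows:
        return {}
    total = len(rows)
    summary = {}
    # Column names are taken from the first row (assumes uniform rows).
    for col in rows[0].keys():
        missing = sum(1 for row in rows if row.get(col) is None)
        summary[col] = (missing, 100.0 * missing / total)
    return summary


# Hypothetical sample data; None marks a missing value.
rows = [
    {"age": 31, "department": "IT"},
    {"age": None, "department": "HR"},
    {"age": 45, "department": None},
    {"age": None, "department": "IT"},
]

for col, (count, pct) in missing_summary(rows).items():
    print(f"{col}: {count} missing ({pct:.1f}%)")
```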
#### Correlation Analysis
```bash
# Show correlation matrix
pqview correlations data.parquet
# Use different correlation methods
pqview correlations data.parquet --method spearman
```
### Data Manipulation Commands
#### Filter Data
```bash
# Filter data using pandas query syntax
pqview filter data.parquet "age > 25 and department == 'IT'"
```
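Since the filter expression uses pandas query syntax, it can be tried out directly with `DataFrame.query` before running it on a file. The sketch below uses a made-up DataFrame; whether `pqview` passes the string to `DataFrame.query` unchanged is an assumption.

```python
import pandas as pd

# Hypothetical data with the columns used in the command above.
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cho", "Dev"],
        "age": [24, 31, 42, 29],
        "department": ["IT", "IT", "HR", "Sales"],
    }
)

# Same expression as in the command above.
result = df.query("age > 25 and department == 'IT'")
print(result)

# Column names containing spaces must be wrapped in backticks.
renamed = df.rename(columns={"age": "age in years"})
older = renamed.query("`age in years` > 40")
print(older)
```

Note the quoting: the whole expression sits in double quotes for the shell, and string literals inside it use single quotes.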
#### Sort Data
```bash
# Sort by single column
pqview sort data.parquet "salary"
# Sort by multiple columns
pqview sort data.parquet "department,salary" --descending
```
#### Sample Data
```bash
# Sample specific number of rows
pqview sample data.parquet --rows 100
# Sample by fraction
pqview sample data.parquet --fraction 0.1 --seed 42
```
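The effect of a seed on sampling can be sketched with the standard library: the same seed yields the same sample on every run. The `sample_rows` helper is hypothetical, and its details (sampling without replacement, rounding the fraction to a row count) are assumptions rather than documented behavior of `pqview sample`.

```python
import random


def sample_rows(rows, n=None, fraction=None, seed=None):
    """Sample without replacement by row count or by fraction of the input."""
    if (n is None) == (fraction is None):
        raise ValueError("pass exactly one of n or fraction")
    if fraction is not None:
        n = round(len(rows) * fraction)
    rng = random.Random(seed)  # private generator, global random state untouched
    return rng.sample(rows, n)


rows = list(range(1000))
first = sample_rows(rows, fraction=0.1, seed=42)
second = sample_rows(rows, fraction=0.1, seed=42)

# Same seed and same input give an identical sample.
print(len(first), first == second)
```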
## Display Formats
The `--format` (`-f`) option of `pqview view` accepts the following table formats:
| Format | Description |
|---------|-------------|
| grid | ASCII grid table |
| pipe | Markdown-compatible table |
| orgtbl | Org-mode table |
| github | GitHub-flavored Markdown table |
| pretty | Pretty printed table |
| html | HTML table |
| latex | LaTeX table |
## Export Formats
Supported export formats:
- CSV
- Excel
- JSON
- HTML
## File Size Limits
By default, the tool enforces a 5 MB file size limit to prevent memory issues. The limit can be adjusted in the configuration.
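A size guard of this kind can be sketched as a check that runs before the file is loaded. This is an illustration, not the tool's code: the `check_file_size` helper and `DEFAULT_LIMIT_BYTES` constant are hypothetical, and reading the limit as 5 × 1024 × 1024 bytes is an assumption.

```python
import os
import tempfile

DEFAULT_LIMIT_BYTES = 5 * 1024 * 1024  # assumes the 5 MB limit means 5 MiB


def check_file_size(path, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return the file size in bytes, or raise ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > limit_bytes:
        raise ValueError(
            f"{path} is {size / 1_048_576:.1f} MiB, "
            f"over the {limit_bytes / 1_048_576:.1f} MiB limit"
        )
    return size


# Demonstrate on a small temporary file standing in for a Parquet file.
with tempfile.NamedTemporaryFile(suffix=".parquet", delete=False) as tmp:
    tmp.write(b"\0" * 1024)

print(check_file_size(tmp.name))
os.remove(tmp.name)
```

Checking the size before reading keeps the failure cheap: the error is raised without loading any data into memory.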
## Error Handling
The tool provides clear error messages for common issues:
- File not found
- Invalid file format
- Memory limitations
- Invalid query syntax
- Data type conversion errors
## Contributing
Contributions are welcome. Feel free to submit a pull request.
## License
MIT License
## Author
Ashutosh Bele
## Changelog
### v0.1.0
- Initial release
- Basic viewing and export functionality
- Statistical analysis features
- Data manipulation capabilities
Raw data
{
"_id": null,
"home_page": "https://github.com/Ashlo/ParquetViewer",
"name": "parquet-viewer",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "Ashutosh Bele",
"author_email": "ashutoshbele5@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/f9/02/5d5594ed8d208a56c2511b49d7a5d53e3c870ab084b73f3e8f5dab7a7142/parquet_viewer-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A powerful command-line tool for viewing Parquet files",
"version": "0.1.3",
"project_urls": {
"Homepage": "https://github.com/Ashlo/ParquetViewer"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c56c85bda3205358e9e5f3f629d065dfb312f10c5c42e532f0b21f2d66ef0a10",
"md5": "dba7b24717bbe948ab6113d0b50aeb50",
"sha256": "e183426cbd388fdd956bd04c3724b56ebf9573956eb07a24556c712a6f32913a"
},
"downloads": -1,
"filename": "parquet_viewer-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "dba7b24717bbe948ab6113d0b50aeb50",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 7510,
"upload_time": "2024-10-29T07:58:31",
"upload_time_iso_8601": "2024-10-29T07:58:31.580220Z",
"url": "https://files.pythonhosted.org/packages/c5/6c/85bda3205358e9e5f3f629d065dfb312f10c5c42e532f0b21f2d66ef0a10/parquet_viewer-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f9025d5594ed8d208a56c2511b49d7a5d53e3c870ab084b73f3e8f5dab7a7142",
"md5": "3eac7be2a2d10e76035b3a4b16d5ad3c",
"sha256": "f30ca89cadf4161e7eee4e1bf043c35d30960b23a43b3cb70214a3ac615642d2"
},
"downloads": -1,
"filename": "parquet_viewer-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "3eac7be2a2d10e76035b3a4b16d5ad3c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 7600,
"upload_time": "2024-10-29T07:58:33",
"upload_time_iso_8601": "2024-10-29T07:58:33.000212Z",
"url": "https://files.pythonhosted.org/packages/f9/02/5d5594ed8d208a56c2511b49d7a5d53e3c870ab084b73f3e8f5dab7a7142/parquet_viewer-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-29 07:58:33",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Ashlo",
"github_project": "ParquetViewer",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "parquet-viewer"
}