# EasyData - Data Science Function Runner
EasyData is a Python library that makes it easy for data scientists to create reusable functions and run them on datasets through a beautiful terminal interface.
## Features
- 🎯 **Simple Decorator**: Wrap your data science functions with `@data_function`
- 🖥️ **Terminal UI**: Browse directories and select datasets interactively
- 📊 **Progress Tracking**: Built-in progress bars for long-running operations
- 📁 **Multiple Formats**: Support for CSV, Excel, JSON, Parquet, and TSV files
- 💾 **Easy Output**: Save results back to a file directly from the terminal UI
- 🔍 **File Preview**: See file information and sample data before processing
## Installation
### From PyPI
```bash
pip install easydata-ds
```
### From Source
```bash
git clone https://github.com/coleragone/easydata.git
cd easydata
pip install -e .
```
### Development Installation
```bash
git clone https://github.com/coleragone/easydata.git
cd easydata
pip install -e ".[dev]"
```
## Quick Start
1. **Create a Python script with decorated functions:**
```python
import pandas as pd
from easyData import data_function, run_data_functions


@data_function(
    description="Add True/False column based on condition",
    input_types=['csv', 'xlsx'],
    output_types=['csv'],
    progress_enabled=True
)
def tag_condition(data):
    """Tag rows where value > 100"""
    data['is_high_value'] = data['value'] > 100
    return data


if __name__ == "__main__":
    run_data_functions()
```
2. **Run your script:**
```bash
python your_script.py
```
3. **Use the terminal UI:**
- Select your function from the list
- Browse directories to find your dataset
- Preview file information and sample data
- Run the function with progress tracking
- Save results to a new file
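To try the steps above you need a dataset with a numeric `value` column, because `tag_condition` reads `data['value']`. The sketch below writes such a file using only the standard library. The file name `sample_values.csv` and the `id` column are placeholders chosen for illustration, not something EasyData requires.

```python
import csv
from pathlib import Path


def write_sample_csv(path="sample_values.csv"):
    """Write a tiny CSV with a numeric `value` column for trying tag_condition."""
    rows = [
        {"id": 1, "value": 50},
        {"id": 2, "value": 100},  # not tagged: the condition is strictly > 100
        {"id": 3, "value": 250},
    ]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["id", "value"])
        writer.writeheader()
        writer.writerows(rows)
    return Path(path)


if __name__ == "__main__":
    print(f"Wrote {write_sample_csv()}")
```

Run it once, then pick the generated file when browsing directories in the terminal UI.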
## Decorator Parameters
The `@data_function` decorator accepts several parameters:
- `description`: Human-readable description of what the function does
- `input_types`: List of supported input file types (e.g., `['csv', 'xlsx', 'json']`)
- `output_types`: List of supported output file types (e.g., `['csv', 'xlsx']`)
- `progress_enabled`: Whether to show progress bars (default: `True`)
- `batch_size`: Number of rows to process at once for progress tracking (default: `1000`)
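These parameters describe the function instead of changing what it computes. One common way for a decorator to carry such metadata is to record it in a registry that a runner can list later. The sketch below illustrates that general pattern only: `data_function_sketch` and `REGISTRY` are hypothetical names, and this is not EasyData's actual implementation.

```python
import functools

# Hypothetical registry; EasyData's real internals may differ.
REGISTRY = {}


def data_function_sketch(description="", input_types=None, output_types=None,
                         progress_enabled=True, batch_size=1000):
    """Illustrative stand-in for @data_function: store metadata, keep the function callable."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(data):
            return func(data)

        # Record the metadata under the function's name so a runner could list it.
        REGISTRY[func.__name__] = {
            "function": wrapper,
            "description": description,
            "input_types": list(input_types or []),
            "output_types": list(output_types or []),
            "progress_enabled": progress_enabled,
            "batch_size": batch_size,
        }
        return wrapper
    return decorator


@data_function_sketch(description="Double every number", input_types=["csv"], batch_size=500)
def double_values(data):
    return [x * 2 for x in data]
```

In this sketch, omitted parameters fall back to the defaults listed above (`progress_enabled=True`, `batch_size=1000`).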
## Example Functions
### Data Tagging
```python
from easyData import data_function


@data_function(
    description="Tag high-value customers",
    input_types=['csv', 'xlsx'],
    progress_enabled=True
)
def tag_high_value_customers(data):
    # Flag rows whose revenue exceeds 10,000
    data['is_high_value'] = data['revenue'] > 10000
    return data
```
### Data Cleaning
```python
from easyData import data_function


@data_function(
    description="Clean and standardize text data",
    input_types=['csv'],
    batch_size=500
)
def clean_text(data):
    # Lowercase and trim every text (object-dtype) column
    text_cols = data.select_dtypes(include=['object']).columns
    for col in text_cols:
        data[col] = data[col].str.lower().str.strip()
    return data
```
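`batch_size` is described as the number of rows processed at once for progress tracking. How EasyData splits the data internally is not documented here, so the following is only a rough illustration of row-wise batching. `batch_ranges` is a hypothetical helper, not part of the library.

```python
def batch_ranges(n_rows, batch_size=1000):
    """Yield (start, stop) index pairs covering n_rows in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)


if __name__ == "__main__":
    n_rows = 1200
    for start, stop in batch_ranges(n_rows, batch_size=500):
        # With a DataFrame, this batch would be data.iloc[start:stop].
        print(f"rows {start}-{stop - 1} of {n_rows} ({stop / n_rows:.0%} done)")
```

A smaller `batch_size` gives finer-grained progress updates at the cost of more iterations.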
### Statistical Analysis
```python
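from easyData import data_function


@data_function(
    description="Calculate summary statistics",
    input_types=['csv', 'xlsx', 'json'],
    progress_enabled=False
)
def calculate_stats(data):
    # Returns count, mean, std, min, quartiles, and max for numeric columns
    return data.describe()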
```
## Supported File Formats
**Input Formats:**
- CSV (`.csv`)
- Excel (`.xlsx`, `.xls`)
- JSON (`.json`)
- Parquet (`.parquet`)
- TSV (`.tsv`)
**Output Formats:**
- CSV (`.csv`)
- Excel (`.xlsx`)
- JSON (`.json`)
- Parquet (`.parquet`)
- TSV (`.tsv`)
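The lists above amount to a mapping from file extension to format, with `.xls` accepted for input only. A minimal sketch of that mapping follows. `detect_format` and `load_table` are hypothetical helpers, and the choice of pandas readers is an assumption based on pandas being a listed requirement, not a description of EasyData's internals.

```python
from pathlib import Path

# Extension -> format name, following the lists above.
INPUT_FORMATS = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".json": "json",
    ".parquet": "parquet",
    ".tsv": "tsv",
}
# .xls is listed for input only.
OUTPUT_FORMATS = {ext: fmt for ext, fmt in INPUT_FORMATS.items() if ext != ".xls"}


def detect_format(path, writing=False):
    """Return the format name for a path, or raise ValueError if unsupported."""
    table = OUTPUT_FORMATS if writing else INPUT_FORMATS
    suffix = Path(path).suffix.lower()
    try:
        return table[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}") from None


def load_table(path):
    """Load a file with the matching pandas reader (assumed mapping)."""
    import pandas as pd  # imported lazily; Excel and Parquet need extra engines

    readers = {
        "csv": pd.read_csv,
        "excel": pd.read_excel,
        "json": pd.read_json,
        "parquet": pd.read_parquet,
        "tsv": lambda p: pd.read_csv(p, sep="\t"),
    }
    return readers[detect_format(path)](path)
```

Matching on the lowercased suffix means `DATA.CSV` and `data.csv` are treated the same way in this sketch.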
## Terminal UI Features
- **Function Selection**: Choose from available decorated functions
- **Directory Browsing**: Navigate through folders to find datasets
- **File Preview**: See file size, type, columns, and sample data
- **Progress Tracking**: Visual progress bars for long operations
- **Result Saving**: Save processed data to new files
- **Error Handling**: Clear error messages and recovery options
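As an illustration of what a file preview involves, the sketch below gathers the same kinds of details (size, type, columns, sample rows) for a CSV file using only the standard library. `preview_csv` is a hypothetical helper, and EasyData's own preview may collect or display this differently.

```python
import csv
import os
from itertools import islice


def preview_csv(path, sample_rows=5):
    """Collect size, type, column names, and the first few rows of a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        columns = next(reader, [])  # header row, or [] for an empty file
        sample = list(islice(reader, sample_rows))
    return {
        "size_bytes": os.path.getsize(path),
        "type": os.path.splitext(path)[1].lstrip(".").lower(),
        "columns": columns,
        "sample": sample,
    }
```

Reading only the first few rows keeps the preview fast even for large files.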
## Requirements
- Python 3.7+
- pandas
- rich (for beautiful terminal UI)
- click (for command-line interface)
- tqdm (for progress bars)
## Publishing
To build and publish this package:
1. **Build the package:**
```bash
python build.py
```
2. **Publish to PyPI:**
```bash
python publish.py
```
3. **Test installation:**
```bash
pip install easydata-ds
```
## Development
### Setup Development Environment
```bash
git clone https://github.com/coleragone/easydata.git
cd easydata
pip install -e ".[dev]"
```
### Run Tests
```bash
pytest
```
### Code Formatting
```bash
black easydata/
flake8 easydata/
```
## Contributing
This is a startup project! Feel free to contribute by:
- Adding new file format support
- Improving the terminal UI
- Adding more example functions
- Enhancing error handling
- Writing tests
- Improving documentation
## License
MIT License - feel free to use this in your projects!
## Changelog
### v0.1.0 (2024-01-XX)
- Initial release
- Basic decorator functionality
- Terminal UI for file browsing
- Support for CSV, Excel, JSON, Parquet, TSV files
- Progress tracking for long operations
- Command-line interface
Raw data
{
"_id": null,
"home_page": "https://github.com/coleragone/easydata",
"name": "easydata-ds",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "Cole Ragone <coleragone@example.com>",
"keywords": "data science, pandas, terminal ui, data processing, decorator",
"author": "Cole Ragone",
"author_email": "Cole Ragone <coleragone@example.com>",
"download_url": "https://files.pythonhosted.org/packages/a7/b8/45dba388d85630acbf7fac773df9af9ed727ff27af40824f520f48e7da74/easydata_ds-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python library for data scientists to easily apply functions to datasets with a terminal UI",
"version": "0.1.0",
"project_urls": {
"Bug Tracker": "https://github.com/coleragone/easydata/issues",
"Documentation": "https://github.com/coleragone/easydata#readme",
"Homepage": "https://github.com/coleragone/easydata",
"Repository": "https://github.com/coleragone/easydata"
},
"split_keywords": [
"data science",
" pandas",
" terminal ui",
" data processing",
" decorator"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "3a033fa815523d912246e77ebaf5e908cd28ff417e46241e104aa011742183e6",
"md5": "ff5fd031c66d9f573f994bb6e25e7c48",
"sha256": "69f95461f5e6c73be22a9c7be07a928d81cd8a44740d5274e5d5f42c5f455121"
},
"downloads": -1,
"filename": "easydata_ds-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "ff5fd031c66d9f573f994bb6e25e7c48",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 12210,
"upload_time": "2025-10-24T03:34:12",
"upload_time_iso_8601": "2025-10-24T03:34:12.872498Z",
"url": "https://files.pythonhosted.org/packages/3a/03/3fa815523d912246e77ebaf5e908cd28ff417e46241e104aa011742183e6/easydata_ds-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a7b845dba388d85630acbf7fac773df9af9ed727ff27af40824f520f48e7da74",
"md5": "587b20e86c441792217d3a771f0687b7",
"sha256": "1f096806f7dc63071db009282830233473bdf4292d31804f3e2054ed0ee3c30d"
},
"downloads": -1,
"filename": "easydata_ds-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "587b20e86c441792217d3a771f0687b7",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 14372,
"upload_time": "2025-10-24T03:34:14",
"upload_time_iso_8601": "2025-10-24T03:34:14.228507Z",
"url": "https://files.pythonhosted.org/packages/a7/b8/45dba388d85630acbf7fac773df9af9ed727ff27af40824f520f48e7da74/easydata_ds-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-24 03:34:14",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "coleragone",
"github_project": "easydata",
"github_not_found": true,
"lcname": "easydata-ds"
}