mizuio


- Name: mizuio
- Version: 0.1.0
- Home page: https://github.com/mertskzc/mizu
- Summary: A comprehensive Python data processing tool for cleaning, visualization, and analysis
- Upload time: 2025-08-30 16:22:04
- Author: Mert Sakızcı
- Requires Python: >=3.7
- License: MIT
- Keywords: data-science, data-analysis, data-cleaning, data-visualization, pandas, numpy, matplotlib, seaborn, machine-learning, data-processing
- Requirements: none recorded
            
# mizuio - Python Data Processing Toolkit

mizuio is a Python toolkit for data cleaning, visualization, and analysis. It provides a command-line interface and a Python API, both built on pandas, NumPy, Matplotlib, seaborn, and scikit-learn.

---

## 🚀 Features

### Data Cleaning (`DataCleaner`)
- Handle missing values: drop, fill, or interpolate
- Remove duplicates by columns
- Automatic data type conversion
- Outlier detection and removal (IQR, Z-score)
- Text normalization (case, whitespace)
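
This README does not document the `DataCleaner` method names, so the following is only a minimal pandas sketch of the techniques listed above (duplicate removal, median fill, IQR-based outlier removal). The helper names `remove_outliers_iqr` and `clean` are hypothetical and are not part of mizuio's API.

```python
import numpy as np
import pandas as pd


def remove_outliers_iqr(df: pd.DataFrame, column: str, k: float = 1.5) -> pd.DataFrame:
    """Keep rows whose `column` value lies within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1
    mask = df[column].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]


def clean(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Drop duplicate rows, fill missing values with the median, remove IQR outliers."""
    out = df.drop_duplicates().copy()
    out[column] = out[column].fillna(out[column].median())
    return remove_outliers_iqr(out, column).reset_index(drop=True)


# Hypothetical sample data: one duplicate row, one missing value, one extreme value.
raw = pd.DataFrame({"age": [25, 25, 31, np.nan, 29, 27, 400]})
cleaned = clean(raw, "age")
print(cleaned)
```

The order of the steps matters: filling missing values before computing the quartiles changes the IQR bounds, so a real pipeline should choose that order deliberately.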

### Data Visualization (`DataVisualizer`)
- Histograms and distribution plots
- Box plots for outlier analysis
- Scatter plots for variable relationships
- Correlation heatmaps
- Bar and line charts (categorical/time series)
- Missing value visualization

### Utility Tools (`DataUtils`)
- Multi-format support: CSV, JSON, Excel, Parquet, Pickle
- Data validation (columns, types, value ranges)
- Data sampling (random, systematic, stratified)
- Data splitting (train/validation/test)
- Categorical encoding (label, one-hot, ordinal)
- Feature scaling (standard, minmax, robust)
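
As a rough illustration of the splitting, scaling, and encoding features above, here is a sketch in plain pandas. It does not use `DataUtils`, whose signatures are not shown in this README; `split_frame` and `minmax_scale` are hypothetical helpers, and the 70/15/15 ratio is an assumption.

```python
import numpy as np
import pandas as pd


def split_frame(df: pd.DataFrame, train: float = 0.7, val: float = 0.15, seed: int = 0):
    """Shuffle rows, then cut them into train/validation/test partitions."""
    shuffled = df.sample(frac=1.0, random_state=seed).reset_index(drop=True)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (
        shuffled.iloc[:n_train],
        shuffled.iloc[n_train:n_train + n_val],
        shuffled.iloc[n_train + n_val:],
    )


def minmax_scale(series: pd.Series) -> pd.Series:
    """Rescale values linearly to [0, 1]; a constant column maps to 0."""
    span = series.max() - series.min()
    if span == 0:
        return pd.Series(0.0, index=series.index)
    return (series - series.min()) / span


# Hypothetical sample data with one numeric and one categorical column.
df = pd.DataFrame({"x": np.arange(20, dtype=float), "label": ["a", "b"] * 10})
train_df, val_df, test_df = split_frame(df)
scaled = minmax_scale(train_df["x"])                  # min-max feature scaling
onehot = pd.get_dummies(df["label"], prefix="label")  # one-hot encoding
print(len(train_df), len(val_df), len(test_df))
```

In practice the scaling parameters (minimum and maximum) should be computed on the training partition only and then reused for the validation and test partitions.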

---

## 📦 Installation

### Requirements
- Python 3.7+
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn

### Steps
1. **Clone the repository:**
	```sh
	git clone https://github.com/mertskzc/mizuio.git
	cd mizuio
	```
2. **Install dependencies:**
	```sh
	pip install -r requirements.txt
	```
3. **Install in development mode (optional):**
	```sh
	pip install -e .
	```

---

## 🖥️ Usage

### Command Line Interface

mizuio provides a CLI for common data tasks:

```sh
# Clean a dataset
mizuio clean data.csv --output cleaned_data.csv --remove-duplicates --fill-missing --remove-outliers

# Visualize a column
mizuio visualize data.csv --plot histogram --column age --output age_hist.png

# Show data info
mizuio info data.csv
```

#### CLI Commands
- `clean`: Clean data (remove duplicates, fill missing, remove outliers)
- `visualize`: Visualize data (histogram, boxplot, scatter, correlation)
- `info`: Show data summary (shape, memory, columns, missing values, duplicates)
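
The fields reported by `info` (shape, memory, columns, missing values, duplicates) can all be derived with standard pandas calls. The sketch below is an approximation of that kind of summary, not mizuio's implementation; `summarize` is a hypothetical name and the sample data is made up.

```python
import pandas as pd


def summarize(df: pd.DataFrame) -> dict:
    """Collect a basic summary of a DataFrame."""
    return {
        "shape": df.shape,
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
        "columns": {name: str(dtype) for name, dtype in df.dtypes.items()},
        "missing": df.isna().sum().to_dict(),
        "duplicates": int(df.duplicated().sum()),
    }


# Hypothetical sample data: one fully duplicated row and one missing value per column.
df = pd.DataFrame(
    {"age": [25, 25, None, 40], "city": ["Izmir", "Izmir", "Ankara", None]}
)
info = summarize(df)
for key, value in info.items():
    print(f"{key}: {value}")
```

For a file on disk, the same function could be applied to the result of `pd.read_csv("data.csv")`.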

---

## 🧪 Testing

Run all tests:
```sh
python -m pytest tests/
```
Run a specific test file:
```sh
python -m pytest tests/test_cleaner.py
```

---

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/your-feature`)
3. Commit your changes (`git commit -m 'Add feature'`)
4. Push your branch (`git push origin feature/your-feature`)
5. Open a Pull Request

---

## 📝 License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

---

## 📞 Contact

- **Project Link:** [https://github.com/mertskzc/mizuio](https://github.com/mertskzc/mizuio)
- **E-mail:** mertskzc@gmail.com

---

## 🙏 Acknowledgements

mizuio uses the following open source libraries:
- [pandas](https://pandas.pydata.org/)
- [numpy](https://numpy.org/)
- [matplotlib](https://matplotlib.org/)
- [seaborn](https://seaborn.pydata.org/)
- [scikit-learn](https://scikit-learn.org/)

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/mertskzc/mizu",
    "name": "mizuio",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "Mert Sak\u0131zc\u0131 <mertskzc@gmail.com>",
    "keywords": "data-science, data-analysis, data-cleaning, data-visualization, pandas, numpy, matplotlib, seaborn, machine-learning, data-processing",
    "author": "Mert Sak\u0131zc\u0131",
    "author_email": "Mert Sak\u0131zc\u0131 <mertskzc@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/76/c2/1d4f363b5d811a609158b53aae5a713eb3b8fb54fdb6190ef319e64b606c/mizuio-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A comprehensive Python data processing tool for cleaning, visualization, and analysis",
    "version": "0.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/mertskzc/mizuio/issues",
        "Homepage": "https://github.com/mertskzc/mizuio",
        "Repository": "https://github.com/mertskzc/mizuio"
    },
    "split_keywords": [
        "data-science",
        " data-analysis",
        " data-cleaning",
        " data-visualization",
        " pandas",
        " numpy",
        " matplotlib",
        " seaborn",
        " machine-learning",
        " data-processing"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e8125d04bc80579c23d3a820d7d029ade79207f9e520c6f502cb250167d1e4f5",
                "md5": "6a4f103caae3a4056ece65b9db751fa1",
                "sha256": "398e36fd046afa6210cd1adcc65bd9f155a924edd72e1afa98abf73360903e9a"
            },
            "downloads": -1,
            "filename": "mizuio-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6a4f103caae3a4056ece65b9db751fa1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 14348,
            "upload_time": "2025-08-30T16:22:02",
            "upload_time_iso_8601": "2025-08-30T16:22:02.963748Z",
            "url": "https://files.pythonhosted.org/packages/e8/12/5d04bc80579c23d3a820d7d029ade79207f9e520c6f502cb250167d1e4f5/mizuio-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "76c21d4f363b5d811a609158b53aae5a713eb3b8fb54fdb6190ef319e64b606c",
                "md5": "16c36841b357330a8ed8dc585c295860",
                "sha256": "76d06f79f16b1ce5705a2ede07c63284aa4a021f1d2dfb9d3fbdbfc8ab256e47"
            },
            "downloads": -1,
            "filename": "mizuio-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "16c36841b357330a8ed8dc585c295860",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 17700,
            "upload_time": "2025-08-30T16:22:04",
            "upload_time_iso_8601": "2025-08-30T16:22:04.065270Z",
            "url": "https://files.pythonhosted.org/packages/76/c2/1d4f363b5d811a609158b53aae5a713eb3b8fb54fdb6190ef319e64b606c/mizuio-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-30 16:22:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mertskzc",
    "github_project": "mizu",
    "github_not_found": true,
    "lcname": "mizuio"
}
        