# MLFCrafter
> **ML Pipeline Automation Framework - Chain together data processing, model training, and deployment with minimal code**
---
## ⭐ **If you find MLFCrafter useful, please consider starring this repository!**
<a href="https://github.com/brkcvlk/mlfcrafter/stargazers">
<img src="https://img.shields.io/github/stars/brkcvlk/mlfcrafter?style=social" alt="GitHub stars">
</a>
Your support helps us continue developing and improving MLFCrafter for the ML community.
---
## What is MLFCrafter?
MLFCrafter is a Python framework that simplifies machine learning pipeline creation through chainable "crafter" components. Build, train, and deploy ML models with minimal code and maximum flexibility.
## Key Features
- **🔗 Chainable Architecture** - Connect multiple processing steps seamlessly
- **📊 Smart Data Handling** - Automatic data ingestion from CSV, Excel, JSON
- **🧹 Intelligent Cleaning** - Multiple strategies for missing value handling
- **📏 Flexible Scaling** - MinMax, Standard, and Robust scaling options
- **🤖 Multiple Models** - Random Forest, XGBoost, Logistic Regression support
- **📈 Comprehensive Metrics** - Accuracy, Precision, Recall, F1-Score
- **💾 Easy Deployment** - One-click model saving with metadata
- **🔄 Context-Based** - Seamless data flow between pipeline steps
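The "chainable" and "context-based" ideas in the list above can be pictured with a small standalone sketch. It does not use MLFCrafter itself: the step functions, the `run_chain` helper, and the context keys are all hypothetical names chosen for illustration, and the library's real internals may differ.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Step = Callable[[Context], Context]


def ingest(context: Context) -> Context:
    # Hypothetical step: place some values into the shared context.
    return {**context, "rows": [1.0, 2.0, 3.0, 4.0]}


def scale(context: Context) -> Context:
    # Hypothetical step: read what the previous step wrote, then add a new key.
    top = max(context["rows"])
    return {**context, "scaled": [value / top for value in context["rows"]]}


def run_chain(steps: List[Step], context: Context) -> Context:
    # Each step receives the context returned by the step before it.
    for step in steps:
        context = step(context)
    return context


result = run_chain([ingest, scale], {"target_column": "species"})
print(result["scaled"])  # [0.25, 0.5, 0.75, 1.0]
```

Because every step takes and returns the same dictionary shape, steps can be reordered or dropped without changing the others, which is the property a chain of crafters relies on.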
## Quick Start
### Installation
```bash
pip install mlfcrafter
```
### Basic Usage
```python
from mlfcrafter import (
    MLFChain, DataIngestCrafter, CleanerCrafter, ScalerCrafter,
    ModelCrafter, ScorerCrafter, DeployCrafter,
)

# Build the whole pipeline in a single MLFChain call
chain = MLFChain(
    DataIngestCrafter(data_path="data/iris.csv"),
    CleanerCrafter(strategy="auto"),
    ScalerCrafter(scaler_type="standard"),
    ModelCrafter(model_name="random_forest"),
    ScorerCrafter(),
    DeployCrafter()
)

# Run the entire pipeline
results = chain.run(target_column="species")
print(f"Test Score: {results['test_score']:.4f}")
```
### Advanced Configuration
```python
chain = MLFChain(
    DataIngestCrafter(data_path="data/titanic.csv", source_type="csv"),
    CleanerCrafter(strategy="mean", str_fill="Unknown"),
    ScalerCrafter(scaler_type="minmax", columns=["age", "fare"]),
    ModelCrafter(
        model_name="xgboost",
        model_params={"n_estimators": 200, "max_depth": 6},
        test_size=0.25
    ),
    ScorerCrafter(),
    DeployCrafter(model_path="models/titanic_model.joblib")
)

results = chain.run(target_column="survived")
```
## Components (Crafters)
### DataIngestCrafter
Loads data from various file formats:
```python
DataIngestCrafter(
    data_path="path/to/data.csv",
    source_type="auto"  # auto, csv, excel, json
)
```
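One plausible reading of `source_type="auto"` is that the format is inferred from the file extension. The sketch below shows that idea only; `infer_source_type` and the extension table are hypothetical, and the extensions MLFCrafter actually recognises may differ.

```python
from pathlib import Path

# Hypothetical mapping from file extension to source type (an assumption).
EXTENSION_TO_SOURCE = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".json": "json",
}


def infer_source_type(data_path: str) -> str:
    # Compare extensions case-insensitively so "Sales.XLSX" is handled too.
    suffix = Path(data_path).suffix.lower()
    try:
        return EXTENSION_TO_SOURCE[suffix]
    except KeyError:
        raise ValueError(f"Cannot infer source type from {data_path!r}") from None


print(infer_source_type("data/iris.csv"))       # csv
print(infer_source_type("reports/Sales.XLSX"))  # excel
```

Passing an explicit `source_type` avoids this guesswork when a file has an unusual or missing extension.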
### CleanerCrafter
Handles missing values intelligently:
```python
CleanerCrafter(
    strategy="auto",     # auto, mean, median, mode, drop, constant
    str_fill="missing",  # Fill value for strings
    int_fill=0.0         # Fill value for numbers
)
```
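To make the strategy names concrete, here is a rough standard-library sketch of what each one conventionally does to a single numeric column. `fill_missing` is a hypothetical helper, not MLFCrafter's implementation, and the `auto` strategy is left out because the README does not describe how it chooses.

```python
import statistics
from typing import List, Optional


def fill_missing(
    values: List[Optional[float]], strategy: str, fill: float = 0.0
) -> List[float]:
    # Values that are actually present; None marks a missing entry.
    present = [v for v in values if v is not None]
    if strategy == "drop":
        return present
    if strategy == "mean":
        replacement = statistics.mean(present)
    elif strategy == "median":
        replacement = statistics.median(present)
    elif strategy == "mode":
        replacement = statistics.mode(present)
    elif strategy == "constant":
        replacement = fill
    else:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    return [replacement if v is None else v for v in values]


ages = [22.0, None, 30.0, 30.0, 50.0]
print(fill_missing(ages, "mean"))    # [22.0, 33.0, 30.0, 30.0, 50.0]
print(fill_missing(ages, "median"))  # [22.0, 30.0, 30.0, 30.0, 50.0]
print(fill_missing(ages, "drop"))    # [22.0, 30.0, 30.0, 50.0]
```

The mean is pulled upward by the outlier `50.0`, while the median and mode are not, which is the usual reason to prefer `median` for skewed columns.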
### ScalerCrafter
Scales numerical features:
```python
ScalerCrafter(
    scaler_type="standard",    # standard, minmax, robust
    columns=["age", "income"]  # Specific columns or None for all numeric
)
```
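The three scaler types correspond to well-known formulas. The sketch below writes them out with the standard library for a single column; the function names are hypothetical, and it assumes the conventional definitions (population standard deviation for `standard`, interquartile range for `robust`) rather than anything confirmed about MLFCrafter's internals.

```python
import statistics
from typing import List


def standard_scale(values: List[float]) -> List[float]:
    # (x - mean) / standard deviation, giving mean 0 and unit variance.
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def minmax_scale(values: List[float]) -> List[float]:
    # (x - min) / (max - min), mapping the column onto [0, 1].
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]


def robust_scale(values: List[float]) -> List[float]:
    # (x - median) / IQR, which is less sensitive to outliers.
    median = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [(v - median) / (q3 - q1) for v in values]


ages = [10.0, 20.0, 30.0, 40.0, 50.0]
print(minmax_scale(ages))  # [0.0, 0.25, 0.5, 0.75, 1.0]
print(robust_scale(ages))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
print(standard_scale(ages))
```

All three divide by a measure of spread, so a constant column (zero spread) would raise `ZeroDivisionError` in this sketch and needs separate handling.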
### ModelCrafter
Trains ML models:
```python
ModelCrafter(
    model_name="random_forest",  # random_forest, xgboost, logistic_regression
    model_params={"n_estimators": 100},
    test_size=0.2,
    stratify=True
)
```
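For readers unfamiliar with `test_size` and `stratify`, the following standard-library sketch shows what a stratified train/test split conventionally means: each class contributes the same fraction of its rows to the test set. `stratified_split` is a hypothetical helper for illustration and is not how MLFCrafter necessarily performs the split.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[str], test_size: float, seed: int = 0
) -> Tuple[List[int], List[int]]:
    # Group row indices by class label.
    by_class: Dict[str, List[int]] = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    train: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        # Shuffle within each class, then cut off the test share.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


labels = ["yes"] * 10 + ["no"] * 30
train_idx, test_idx = stratified_split(labels, test_size=0.2)
print(len(train_idx), len(test_idx))              # 32 8
print(sum(labels[i] == "yes" for i in test_idx))  # 2
```

With a 25% / 75% class balance in the full data, the test set keeps the same 2-to-6 ratio, which keeps scores on imbalanced targets more stable.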
### ScorerCrafter
Calculates performance metrics:
```python
ScorerCrafter(
    metrics=["accuracy", "precision", "recall", "f1"]  # Default: all metrics
)
```
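As a reminder of what the four metric names measure, here is a small standard-library sketch for the binary case, with `1` as the positive class. `binary_metrics` is a hypothetical helper and the labels are made-up example data; how MLFCrafter averages these metrics for multi-class targets is not stated in this README.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]
scores = binary_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Here precision (0.6) is lower than recall (0.75) because the predictions contain two false positives but only one false negative.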
### DeployCrafter
Saves trained models:
```python
DeployCrafter(
    model_path="model.joblib",
    save_format="joblib",  # joblib or pickle
    include_scaler=True,
    include_metadata=True
)
```
## Alternative Usage Patterns
### Step-by-Step Building
```python
chain = MLFChain()
chain.add_crafter(DataIngestCrafter(data_path="data.csv"))
chain.add_crafter(CleanerCrafter(strategy="median"))
chain.add_crafter(ModelCrafter(model_name="xgboost"))
results = chain.run(target_column="target")
```
### Loading Saved Models
```python
artifacts = DeployCrafter.load_model("model.joblib")
model = artifacts["model"]
metadata = artifacts["metadata"]
```
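The idea of saving a model together with its metadata and reading both back can be sketched with the standard library's `pickle`. Only the `"model"` and `"metadata"` keys come from the snippet above; the stand-in model object and the metadata fields are made up for illustration, and the real file layout written by `DeployCrafter` may differ.

```python
import pickle
import tempfile
from pathlib import Path

# Stand-in for a trained model; any picklable object works in this sketch.
model = {"weights": [0.4, -1.2], "bias": 0.1}

# Bundle the model with descriptive metadata (hypothetical fields).
artifacts = {
    "model": model,
    "metadata": {"model_name": "random_forest", "features": ["age", "fare"]},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"
    # Write the whole bundle as a single file.
    with path.open("wb") as handle:
        pickle.dump(artifacts, handle)
    # Read it back and unpack by key.
    with path.open("rb") as handle:
        loaded = pickle.load(handle)

print(loaded["metadata"]["model_name"])  # random_forest
```

Pickle-based formats, including joblib files, can execute arbitrary code when loaded, so only load model files from sources you trust.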
## Requirements
- **Python**: 3.9 or higher
- **Core Dependencies**: pandas, scikit-learn, numpy, xgboost, joblib
## Development
### Setup Development Environment
```bash
git clone https://github.com/brkcvlk/mlfcrafter.git
cd mlfcrafter
pip install -r requirements-dev.txt
pip install -e .
```
### Run Tests
```bash
# Run all tests
python -m pytest tests/ -v
# Run tests with coverage
python -m pytest tests/ -v --cov=mlfcrafter --cov-report=html
# Check code quality
ruff check .
# Auto-fix code issues
ruff check --fix .
# Format code
ruff format .
```
### Run Examples
```bash
python example.py
```
## Documentation
Complete documentation is available at [MLFCrafter Docs](https://brkcvlk.github.io/mlfcrafter/).
## Contributing
We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Support
- 📖 **Documentation**: [MLFCrafter Docs](https://brkcvlk.github.io/mlfcrafter/)
- 🐛 **Bug Reports**: [GitHub Issues](https://github.com/brkcvlk/mlfcrafter/issues)
- 💬 **Discussions**: [GitHub Discussions](https://github.com/brkcvlk/mlfcrafter/discussions)
---
**Made for the ML Community**
Raw data
{
"_id": null,
"home_page": null,
"name": "mlfcrafter",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": "Burak Civelek <burakcivelekk61@gmail.com>",
"keywords": "machine-learning, pipeline, automation, data-science, ml-ops, automl, scikit-learn, data-processing",
"author": null,
"author_email": "Burak Civelek <burakcivelekk61@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/0c/d5/020b6f64e01cdfc803bc0f38da858089a03f5947fe9e87f4a40a46856215/mlfcrafter-0.1.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "ML Pipeline Automation Framework - Chain together data processing, model training, and deployment with minimal code",
"version": "0.1.1",
"project_urls": {
"Homepage": "https://github.com/brkcvlk/mlfcrafter",
"Repository": "https://github.com/brkcvlk/mlfcrafter",
"bug-report": "https://github.com/brkcvlk/mlfcrafter/issues"
},
"split_keywords": [
"machine-learning",
" pipeline",
" automation",
" data-science",
" ml-ops",
" automl",
" scikit-learn",
" data-processing"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a4025d0b545fd2105e0eed5de07ef22d1f0bb39ba42714c23bdf0eafa4d568eb",
"md5": "328ddb93cb1606467dff22713a4874f1",
"sha256": "a6336a67e0c45bd063c13a7e48303f0d17eac2336e4bd3c89cd3cabe6fbe789f"
},
"downloads": -1,
"filename": "mlfcrafter-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "328ddb93cb1606467dff22713a4874f1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 22566,
"upload_time": "2025-07-26T10:48:33",
"upload_time_iso_8601": "2025-07-26T10:48:33.369103Z",
"url": "https://files.pythonhosted.org/packages/a4/02/5d0b545fd2105e0eed5de07ef22d1f0bb39ba42714c23bdf0eafa4d568eb/mlfcrafter-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "0cd5020b6f64e01cdfc803bc0f38da858089a03f5947fe9e87f4a40a46856215",
"md5": "4933a0c30f6556a0f789e6d49c77e7f3",
"sha256": "7501546696fbf815c5714780f2d0a08dd1dd22ef5cd8af7207de5eaf939558b8"
},
"downloads": -1,
"filename": "mlfcrafter-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "4933a0c30f6556a0f789e6d49c77e7f3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 23543,
"upload_time": "2025-07-26T10:48:34",
"upload_time_iso_8601": "2025-07-26T10:48:34.592615Z",
"url": "https://files.pythonhosted.org/packages/0c/d5/020b6f64e01cdfc803bc0f38da858089a03f5947fe9e87f4a40a46856215/mlfcrafter-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-26 10:48:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "brkcvlk",
"github_project": "mlfcrafter",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "pandas",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "scikit-learn",
"specs": [
[
">=",
"1.3.0"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.24.0"
]
]
},
{
"name": "xgboost",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "joblib",
"specs": [
[
">=",
"1.2.0"
]
]
}
],
"lcname": "mlfcrafter"
}