automl-lite


- **Name**: automl-lite
- **Version**: 0.1.0
- **Summary**: A simplified automated machine learning package for non-experts
- **Upload time**: 2025-07-15 09:13:15
- **Requires Python**: >=3.8
- **License**: MIT
- **Keywords**: machine-learning, automl, scikit-learn, data-science, ml
- **Requirements**: numpy, pandas, scikit-learn, optuna, shap, matplotlib, seaborn, joblib, plotly, jinja2, tqdm, scipy, category-encoders, imbalanced-learn

# AutoML Lite 🤖

**Automated Machine Learning Made Simple**

[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)
![Status](https://img.shields.io/badge/Status-Production%20Ready-brightgreen.svg)

A lightweight, production-ready automated machine learning library that simplifies the entire ML pipeline from data preprocessing to model deployment.

## 📋 Table of Contents

- [Features](#-features)
- [Installation](#-installation)
- [Quick Start](#-quick-start)
- [CLI Commands](#-cli-commands)
- [Python API](#-python-api)
- [Advanced Features](#-advanced-features)
- [Use Cases](#-use-cases)
- [Examples](#-examples)
- [Configuration](#-configuration)
- [Contributing](#-contributing)
- [License](#-license)

## ✨ Features

### 🎯 Core Features
- **Automated Model Selection**: Tests multiple algorithms and selects the best performer
- **Hyperparameter Optimization**: Uses Optuna for efficient parameter tuning
- **Cross-Validation**: Robust model evaluation with customizable folds
- **Feature Engineering**: Automatic preprocessing and feature selection
- **Model Persistence**: Save and load trained models easily

### 🚀 Advanced Features
- **Ensemble Methods**: Automatic ensemble creation with voting classifiers
- **Early Stopping**: Optimized training with patience and early stopping
- **Feature Selection**: Intelligent feature importance and selection
- **Model Interpretability**: SHAP values and feature effects analysis
- **Comprehensive Reporting**: HTML reports with interactive visualizations

### 🛠️ Production Ready
- **CLI Interface**: Complete command-line interface
- **Error Handling**: Robust error handling and fallback mechanisms
- **Logging**: Comprehensive logging for debugging
- **Type Hints**: Full type annotations for better development experience

## 🚀 Installation

### Prerequisites
- Python 3.8 or higher
- pip package manager

### Install from Source
```bash
# Clone the repository
git clone https://github.com/Sherin-SEF-AI/AutoML-Lite.git
cd AutoML-Lite

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install the package
pip install -e .
```

### Dependencies
Installing the package with pip also installs its declared dependencies:
- scikit-learn
- pandas
- numpy
- optuna
- plotly
- seaborn
- matplotlib
- jinja2
- joblib
- shap
- tqdm
- scipy
- category-encoders
- imbalanced-learn

## 🚀 Quick Start

### Using CLI (Recommended for Beginners)

1. **Train a Model**
```bash
python -m automl_lite.cli.main train data.csv --target target_column --output model.pkl
```

2. **Make Predictions**
```bash
python -m automl_lite.cli.main predict model.pkl test_data.csv --output predictions.csv
```

3. **Generate Report**
```bash
python -m automl_lite.cli.main report model.pkl --output report.html
```

### Using Python API

```python
from automl_lite import AutoMLite
import pandas as pd

# Load data
data = pd.read_csv('data.csv')

# Initialize AutoML
automl = AutoMLite(
    problem_type='classification',
    enable_ensemble=True,
    enable_feature_selection=True,
    enable_interpretability=True
)

# Train model
automl.fit(data, target_column='target')

# Make predictions on new data with the same feature columns
test_data = pd.read_csv('test_data.csv')
predictions = automl.predict(test_data)

# Generate report
automl.generate_report('report.html')

# Save model
automl.save_model('model.pkl')
```

## 📖 CLI Commands

### Training Command

```bash
python -m automl_lite.cli.main train [OPTIONS] DATA
```

**Arguments:**
- `DATA`: Path to training data file (CSV format)

**Options:**
- `--target TEXT`: Target column name (required)
- `--output PATH`: Output model file path (default: model.pkl)
- `--config PATH`: Configuration file path
- `--time-budget INTEGER`: Time budget in seconds (default: 300)
- `--max-models INTEGER`: Maximum number of models (default: 10)
- `--cv-folds INTEGER`: Cross-validation folds (default: 5)
- `--enable-ensemble`: Enable ensemble methods
- `--enable-feature-selection`: Enable feature selection
- `--enable-interpretability`: Enable model interpretability
- `--verbose`: Verbose output

**Examples:**
```bash
# Basic training
python -m automl_lite.cli.main train iris.csv --target species --output iris_model.pkl

# Advanced training with all features
python -m automl_lite.cli.main train data.csv --target target --output model.pkl \
    --enable-ensemble --enable-feature-selection --enable-interpretability \
    --time-budget 600 --max-models 15 --verbose
```
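
One way to read `--time-budget` and `--max-models` together is as two independent stop conditions on the model search. The sketch below is a stdlib-only illustration of that idea, not AutoML Lite's actual implementation: `search`, `candidates`, `evaluate`, and the scores are all hypothetical stand-ins for real cross-validated training.

```python
import time
from typing import Callable, Iterable, List, Tuple


def search(
    candidates: Iterable[str],
    evaluate: Callable[[str], float],
    time_budget: float = 300.0,
    max_models: int = 10,
) -> List[Tuple[str, float]]:
    """Evaluate candidates until the time budget or the model cap is reached.

    Hypothetical helper: returns (name, score) pairs sorted best-first.
    """
    deadline = time.monotonic() + time_budget
    results: List[Tuple[str, float]] = []
    for name in candidates:
        # Stop on whichever limit is hit first.
        if len(results) >= max_models or time.monotonic() >= deadline:
            break
        results.append((name, evaluate(name)))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Stand-in scores; a real run would train and cross-validate each model.
fake_scores = {"RandomForest": 0.91, "SVM": 0.88, "NaiveBayes": 0.84}
leaderboard = search(
    fake_scores, lambda name: fake_scores[name], time_budget=5.0, max_models=2
)
print(leaderboard)
```

Because the limits are only checked between candidates, a single slow model can still overrun the budget in this sketch.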

### Prediction Command

```bash
python -m automl_lite.cli.main predict [OPTIONS] MODEL DATA
```

**Arguments:**
- `MODEL`: Path to trained model file
- `DATA`: Path to prediction data file

**Options:**
- `--output PATH`: Output predictions file path (default: predictions.csv)
- `--proba`: Output prediction probabilities

**Examples:**
```bash
# Regular predictions
python -m automl_lite.cli.main predict model.pkl test_data.csv --output predictions.csv

# Probability predictions
python -m automl_lite.cli.main predict model.pkl test_data.csv --output probabilities.csv --proba
```

### Report Command

```bash
python -m automl_lite.cli.main report [OPTIONS] MODEL
```

**Arguments:**
- `MODEL`: Path to trained model file

**Options:**
- `--output PATH`: Output report file path (default: report.html)

**Examples:**
```bash
python -m automl_lite.cli.main report model.pkl --output comprehensive_report.html
```

### Interactive Mode

```bash
python -m automl_lite.cli.main interactive
```

Launches an interactive session for guided model training and analysis.

## 🐍 Python API

### AutoMLite Class

The main class for automated machine learning.

```python
from automl_lite import AutoMLite

automl = AutoMLite(
    problem_type='classification',  # or 'regression'
    time_budget=300,
    max_models=10,
    cv_folds=5,
    enable_ensemble=True,
    enable_feature_selection=True,
    enable_interpretability=True,
    random_state=42
)
```
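
To make the `cv_folds` parameter concrete, here is a minimal stdlib sketch of how k-fold cross-validation partitions row indices. It assumes plain, unshuffled folds; whether AutoML Lite shuffles or stratifies its folds is not stated here, and `k_fold_indices` is a hypothetical helper, not part of the package.

```python
from typing import Iterator, List, Tuple


def k_fold_indices(
    n_samples: int, n_folds: int = 5
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each fold, without shuffling."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, validation
        start = stop


folds = list(k_fold_indices(n_samples=12, n_folds=5))
for train, validation in folds:
    print(len(train), validation)
```

Every row lands in exactly one validation fold, so each model is scored on data it did not see during that fold's training.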

### Methods

#### `fit(X, y=None, target_column=None)`
Train the AutoML model.

```python
# Using DataFrame with target column
automl.fit(data, target_column='target')

# Using separate X and y
automl.fit(X, y)
```

#### `predict(X)`
Make predictions on new data.

```python
predictions = automl.predict(test_data)
```

#### `predict_proba(X)`
Get prediction probabilities (classification only).

```python
probabilities = automl.predict_proba(test_data)
```

#### `save_model(path)`
Save the trained model.

```python
automl.save_model('model.pkl')
```

#### `load_model(path)`
Load a saved model.

```python
automl.load_model('model.pkl')
```

#### `generate_report(path)`
Generate comprehensive HTML report.

```python
automl.generate_report('report.html')
```

#### `get_leaderboard()`
Get model performance leaderboard.

```python
leaderboard = automl.get_leaderboard()
```

#### `get_feature_importance()`
Get feature importance scores.

```python
importance = automl.get_feature_importance()
```

## 🚀 Advanced Features

### Ensemble Methods

When `enable_ensemble=True`, AutoML Lite combines the best-performing models into an ensemble:

```python
automl = AutoMLite(enable_ensemble=True)
automl.fit(data, target_column='target')

# The ensemble model is automatically created and used for predictions
predictions = automl.predict(test_data)
```

**Features:**
- Automatic detection of `predict_proba` support
- Soft voting for compatible models
- Hard voting fallback for incompatible models
- Top-K model selection
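
The difference between soft and hard voting is easiest to see on a single sample. The following is a stdlib-only sketch of the two rules, not the package's own code: `hard_vote`, `soft_vote`, and the three models' probabilities are made up for illustration.

```python
from collections import Counter
from typing import Dict, Sequence


def hard_vote(labels: Sequence[str]) -> str:
    """Return the label predicted by the most models (ties: first seen wins)."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities: Sequence[Dict[str, float]]) -> str:
    """Average per-class probabilities across models and return the top class."""
    totals: Dict[str, float] = {}
    for model_probs in probabilities:
        for label, p in model_probs.items():
            totals[label] = totals.get(label, 0.0) + p
    means = {label: total / len(probabilities) for label, total in totals.items()}
    return max(means, key=means.get)


# Three hypothetical models scoring one sample.
model_probs = [
    {"spam": 0.55, "ham": 0.45},
    {"spam": 0.60, "ham": 0.40},
    {"spam": 0.10, "ham": 0.90},
]
model_labels = [max(p, key=p.get) for p in model_probs]

print(hard_vote(model_labels))  # majority of predicted labels
print(soft_vote(model_probs))   # highest mean probability
```

Here the two rules disagree: two weakly confident models outvote one confident model under hard voting, while soft voting lets the confident model win. Soft voting needs every member to expose probabilities, which is why a hard-voting fallback is useful.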

### Feature Selection

Intelligent feature selection based on importance scores:

```python
automl = AutoMLite(enable_feature_selection=True)
automl.fit(data, target_column='target')

# Get selected features (data.columns includes the target, so subtract one)
selected_features = automl.selected_features
print(f"Selected {len(selected_features)} features out of {len(data.columns) - 1}")
```
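
As a rough illustration of importance-based selection, the sketch below keeps features whose score clears a threshold and then caps the count. The parameter names `threshold` and `max_features` mirror the keys in the sample `config.yaml` in the Configuration section; the function itself and the scores are hypothetical, and the package's real selection logic may differ.

```python
from typing import Dict, List


def select_features(
    importance: Dict[str, float],
    threshold: float = 0.01,
    max_features: int = 20,
) -> List[str]:
    """Keep features scoring at least `threshold`, capped at `max_features`."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    kept = [name for name, score in ranked if score >= threshold]
    return kept[:max_features]


# Made-up importance scores for five columns.
scores = {"age": 0.30, "income": 0.22, "zip_code": 0.004, "tenure": 0.15, "id": 0.0}
selected = select_features(scores, threshold=0.01, max_features=2)
print(f"Selected {len(selected)} of {len(scores)} features: {selected}")
```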

### Model Interpretability

Comprehensive model interpretability using SHAP values:

```python
automl = AutoMLite(enable_interpretability=True)
automl.fit(data, target_column='target')

# Get interpretability results
interpretability = automl.get_interpretability_results()
```

**Available Interpretability Features:**
- SHAP values for feature importance
- Feature effects analysis
- Model complexity metrics
- Individual prediction explanations
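
For intuition about what a SHAP value measures, the sketch below computes exact Shapley values for a three-feature toy function by averaging each feature's marginal contribution over every feature ordering. It is a simplified, stdlib-only illustration: "missing" features are replaced by a single baseline value, the cost is factorial in the number of features, and the `shap` library uses far more efficient estimators. `shapley_values` and `toy_model` are hypothetical names.

```python
from itertools import permutations
from typing import Callable, Dict, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    sample: Sequence[float],
    baseline: Sequence[float],
) -> Dict[int, float]:
    """Exact Shapley values, averaging marginal contributions over all orders.

    Features not yet revealed keep their baseline value.
    """
    n = len(sample)
    totals = {i: 0.0 for i in range(n)}
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = sample[i]         # reveal feature i
            value = predict(current)
            totals[i] += value - previous  # marginal contribution of i
            previous = value
    return {i: total / len(orders) for i, total in totals.items()}


def toy_model(x: Sequence[float]) -> float:
    # Hypothetical model with an interaction between features 1 and 2.
    return 2.0 * x[0] + x[1] * x[2]


sample, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, sample, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `sample` and the prediction for `baseline`, and the interaction term is split evenly between the two features that produce it.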

### Early Stopping

Optimized training with early stopping:

```python
automl = AutoMLite(
    enable_early_stopping=True,
    patience=10,
    min_delta=0.001
)
```
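
The `patience` and `min_delta` parameters follow a common early-stopping pattern, sketched below with the standard library only. This is an assumption about how such parameters are usually interpreted, not a description of AutoML Lite's internals; `stopping_round` and the score history are hypothetical.

```python
from typing import Iterable, Optional


def stopping_round(
    scores: Iterable[float],
    patience: int = 10,
    min_delta: float = 0.001,
) -> Optional[int]:
    """Return the 0-based round at which training would stop, or None.

    A round counts as an improvement only if it beats the best score so far
    by more than `min_delta` (higher scores are better).
    """
    best = float("-inf")
    rounds_without_improvement = 0
    for round_index, score in enumerate(scores):
        if score > best + min_delta:
            best = score
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                return round_index
    return None


# Validation scores that plateau after the third round.
history = [0.70, 0.75, 0.78, 0.7805, 0.7802, 0.779, 0.781]
print(stopping_round(history, patience=3, min_delta=0.001))
```

A larger `patience` tolerates longer plateaus, while a larger `min_delta` ignores small gains that are likely noise.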

## 📊 Use Cases

### 1. Classification Problems

**Customer Churn Prediction**
```python
# Load customer data
customer_data = pd.read_csv('customer_data.csv')

# Train model
automl = AutoMLite(problem_type='classification', enable_ensemble=True)
automl.fit(customer_data, target_column='churned')

# Predict churn probability
churn_prob = automl.predict_proba(new_customers)
```

**Spam Detection**
```python
# Email classification
automl = AutoMLite(
    problem_type='classification',
    enable_feature_selection=True,
    enable_interpretability=True
)
automl.fit(email_data, target_column='is_spam')

# Generate report for analysis
automl.generate_report('spam_detection_report.html')
```

### 2. Regression Problems

**House Price Prediction**
```python
# Real estate data
automl = AutoMLite(
    problem_type='regression',
    enable_ensemble=True,
    time_budget=600
)
automl.fit(house_data, target_column='price')

# Predict house prices
predictions = automl.predict(new_houses)
```

**Sales Forecasting**
```python
# Sales forecasting framed as regression on tabular features
automl = AutoMLite(
    problem_type='regression',
    enable_feature_selection=True
)
automl.fit(sales_data, target_column='sales_volume')
```

### 3. Production Deployment

**Batch Processing**
```bash
# Train model
python -m automl_lite.cli.main train historical_data.csv --target target --output production_model.pkl

# Batch predictions
python -m automl_lite.cli.main predict production_model.pkl new_data.csv --output batch_predictions.csv
```

**API Integration**
```python
# Load trained model
automl = AutoMLite()
automl.load_model('production_model.pkl')

# API endpoint
def predict_endpoint(data):
    return automl.predict(data)
```
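
In a real service the endpoint usually needs to parse and validate the request before calling `predict`. The sketch below shows one way to do that with the standard library. Everything in it is hypothetical: `StubModel` stands in for a loaded model, `REQUIRED_COLUMNS` is an invented schema, and with AutoML Lite the validated rows would presumably be converted to a DataFrame first.

```python
import json
from typing import Any, Dict, List


class StubModel:
    """Stand-in for a loaded model; predicts 1 when `tenure` is under 12."""

    def predict(self, rows: List[Dict[str, Any]]) -> List[int]:
        return [1 if row["tenure"] < 12 else 0 for row in rows]


REQUIRED_COLUMNS = {"tenure", "monthly_charges"}  # hypothetical schema


def predict_endpoint(body: str, model: StubModel) -> str:
    """Parse a JSON request body, validate it, and return a JSON response."""
    try:
        rows = json.loads(body)
    except json.JSONDecodeError:
        return json.dumps({"error": "body is not valid JSON"})
    if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
        return json.dumps({"error": "expected a JSON array of objects"})
    for index, row in enumerate(rows):
        missing = REQUIRED_COLUMNS.difference(row)
        if missing:
            return json.dumps({"error": f"row {index} is missing {sorted(missing)}"})
    return json.dumps({"predictions": model.predict(rows)})


request = json.dumps([
    {"tenure": 3, "monthly_charges": 70.5},
    {"tenure": 40, "monthly_charges": 20.0},
])
print(predict_endpoint(request, StubModel()))
```

Rejecting malformed input before it reaches the model keeps error messages specific and avoids failures deep inside the prediction pipeline.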

## 📚 Examples

### Basic Classification Example

```python
from automl_lite import AutoMLite
import pandas as pd
from sklearn.datasets import load_iris

# Load iris dataset
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['target'] = iris.target

# Initialize AutoML
automl = AutoMLite(
    problem_type='classification',
    time_budget=60,
    max_models=5
)

# Train model
automl.fit(data, target_column='target')

# Make predictions
predictions = automl.predict(data.iloc[:10])

# Generate report
automl.generate_report('iris_report.html')

print(f"Best model: {automl.best_model_name}")
print(f"Best score: {automl.best_score:.4f}")
```

### Advanced Regression Example

```python
from automl_lite import AutoMLite
import pandas as pd
from sklearn.datasets import load_diabetes

# Load the diabetes dataset (load_boston was removed in scikit-learn 1.2)
diabetes = load_diabetes()
data = pd.DataFrame(diabetes.data, columns=diabetes.feature_names)
data['target'] = diabetes.target

# Initialize AutoML with all features
automl = AutoMLite(
    problem_type='regression',
    enable_ensemble=True,
    enable_feature_selection=True,
    enable_interpretability=True,
    time_budget=300,
    max_models=10
)

# Train model
automl.fit(data, target_column='target')

# Get feature importance, highest score first
importance = automl.get_feature_importance()
print("Top 5 features:")
top_features = sorted(importance.items(), key=lambda item: item[1], reverse=True)[:5]
for feature, score in top_features:
    print(f"{feature}: {score:.4f}")

# Generate comprehensive report
automl.generate_report('diabetes_report.html')
```

### CLI Workflow Example

```bash
# 1. Train model with all features
python -m automl_lite.cli.main train customer_data.csv \
    --target churn \
    --output customer_churn_model.pkl \
    --enable-ensemble \
    --enable-feature-selection \
    --enable-interpretability \
    --time-budget 600 \
    --max-models 15 \
    --verbose

# 2. Generate comprehensive report
python -m automl_lite.cli.main report customer_churn_model.pkl \
    --output customer_churn_report.html

# 3. Make predictions on new data
python -m automl_lite.cli.main predict customer_churn_model.pkl \
    new_customers.csv \
    --output churn_predictions.csv \
    --proba
```

## ⚙️ Configuration

### Configuration File

Create a `config.yaml` file for custom settings:

```yaml
# AutoML Configuration
problem_type: classification
time_budget: 600
max_models: 15
cv_folds: 5
random_state: 42

# Advanced Features
enable_ensemble: true
enable_feature_selection: true
enable_interpretability: true
enable_early_stopping: true

# Model Parameters
models:
  - RandomForest
  - XGBoost
  - LightGBM
  - SVM
  - NeuralNetwork

# Feature Selection
feature_selection:
  method: mutual_info
  threshold: 0.01
  max_features: 20

# Ensemble
ensemble:
  method: voting
  top_k: 3
  voting: soft
```

### Using Configuration

```bash
python -m automl_lite.cli.main train data.csv --target target --config config.yaml
```

```python
automl = AutoMLite.from_config('config.yaml')
automl.fit(data, target_column='target')
```

## 📊 Supported Algorithms

XGBoost and LightGBM are not among the package's declared requirements, so they may need to be installed separately before those models are available.

### Classification
- Random Forest
- XGBoost
- LightGBM
- Support Vector Machine (SVM)
- Logistic Regression
- Naive Bayes
- Neural Network (MLP)
- Extra Trees
- Linear Discriminant Analysis

### Regression
- Random Forest
- XGBoost
- LightGBM
- Support Vector Regression (SVR)
- Linear Regression
- Ridge Regression
- Lasso Regression
- Neural Network (MLP)
- Extra Trees

## 📈 Performance Metrics

### Classification Metrics
- Accuracy
- Precision
- Recall
- F1-Score
- ROC-AUC
- Precision-Recall AUC
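
For reference, the threshold-based metrics above can be computed directly from confusion-matrix counts. The sketch below covers the binary case only, using the standard definitions; it is a stdlib illustration rather than the package's scoring code, and `binary_metrics` and the labels are hypothetical.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

ROC-AUC and Precision-Recall AUC are computed from predicted probabilities rather than hard labels, so they are not covered by this sketch.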

### Regression Metrics
- R² Score
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
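
The regression metrics follow directly from their textbook definitions, as the stdlib sketch below shows. It is an illustration only, with a hypothetical `regression_metrics` helper and made-up values, not the package's scoring code.

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """MAE, MSE, RMSE, and R² computed directly from their definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R² = 1 - SS_res / SS_tot; undefined when the target is constant.
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
print(regression_metrics(y_true, y_pred))
```

MAE and RMSE are in the same units as the target, while R² is unitless and compares the model against always predicting the mean.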

## 🔧 Troubleshooting

### Common Issues

**1. Memory Issues**
```bash
# Reduce number of models
python -m automl_lite.cli.main train data.csv --target target --max-models 5
```

**2. Time Budget Exceeded**
```bash
# Increase time budget
python -m automl_lite.cli.main train data.csv --target target --time-budget 1200
```

**3. Model Compatibility**
```python
# Check model support
automl = AutoMLite(enable_ensemble=False)  # Disable ensemble if issues occur
```

**4. Feature Selection Issues**
```python
# Disable feature selection for debugging
automl = AutoMLite(enable_feature_selection=False)
```

### Debug Mode

Enable verbose logging for debugging:

```bash
python -m automl_lite.cli.main train data.csv --target target --verbose
```

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.

### Development Setup

```bash
# Clone repository
git clone https://github.com/Sherin-SEF-AI/AutoML-Lite.git
cd AutoML-Lite

# Install development dependencies
pip install -r requirements-dev.txt

# Run tests
pytest tests/

# Run linting and formatting
flake8 src/
black src/
```

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 👨‍💻 Author

**Sherin Joseph Roy**
- Email: sherin.joseph2217@gmail.com
- GitHub: [@Sherin-SEF-AI](https://github.com/Sherin-SEF-AI)

## 🙏 Acknowledgments

- [scikit-learn](https://scikit-learn.org/) for the machine learning algorithms
- [Optuna](https://optuna.org/) for hyperparameter optimization
- [Plotly](https://plotly.com/) for interactive visualizations
- [Pandas](https://pandas.pydata.org/) for data manipulation

## 📞 Support

- **Issues**: [GitHub Issues](https://github.com/Sherin-SEF-AI/AutoML-Lite/issues)
- **Email**: sherin.joseph2217@gmail.com
- **Documentation**: [GitHub Wiki](https://github.com/Sherin-SEF-AI/AutoML-Lite/wiki)

---

**Made with ❤️ by Sherin Joseph Roy** 

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "automl-lite",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "sherin joseph roy <sherin.joseph2217@gmail.com>",
    "keywords": "machine-learning, automl, scikit-learn, data-science, ml",
    "author": null,
    "author_email": "sherin joseph roy <sherin.joseph2217@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/7b/46/c1f90b7b551a18e4eb50336df7907b2d3ef8a6f5adb37109e6b66c9e5866/automl_lite-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A simplified automated machine learning package for non-experts",
    "version": "0.1.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/Sherin-SEF-AI/AutoML-Lite/issues",
        "Documentation": "https://github.com/Sherin-SEF-AI/AutoML-Lite#readme",
        "Homepage": "https://github.com/Sherin-SEF-AI/AutoML-Lite",
        "Repository": "https://github.com/Sherin-SEF-AI/AutoML-Lite"
    },
    "split_keywords": [
        "machine-learning",
        " automl",
        " scikit-learn",
        " data-science",
        " ml"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b2e0b2a3e4d894dd52f7a6f39f014c3a75b70a7812c41f10900ca1de7aab9c6a",
                "md5": "07c8710ad3a1aae82d13e8a74a763109",
                "sha256": "37d8a8ac1bdf4362b3a276cc760eee8660554660f53cf08ef7821949fd8055d3"
            },
            "downloads": -1,
            "filename": "automl_lite-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "07c8710ad3a1aae82d13e8a74a763109",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 61637,
            "upload_time": "2025-07-15T09:13:14",
            "upload_time_iso_8601": "2025-07-15T09:13:14.099660Z",
            "url": "https://files.pythonhosted.org/packages/b2/e0/b2a3e4d894dd52f7a6f39f014c3a75b70a7812c41f10900ca1de7aab9c6a/automl_lite-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7b46c1f90b7b551a18e4eb50336df7907b2d3ef8a6f5adb37109e6b66c9e5866",
                "md5": "cd435567e0927cfc610d2fe9bdf121da",
                "sha256": "4959df3803d2448cf5b5465a03424a10a35bba109903e92e8951c6927cc51c58"
            },
            "downloads": -1,
            "filename": "automl_lite-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "cd435567e0927cfc610d2fe9bdf121da",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 81483,
            "upload_time": "2025-07-15T09:13:15",
            "upload_time_iso_8601": "2025-07-15T09:13:15.650379Z",
            "url": "https://files.pythonhosted.org/packages/7b/46/c1f90b7b551a18e4eb50336df7907b2d3ef8a6f5adb37109e6b66c9e5866/automl_lite-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-15 09:13:15",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Sherin-SEF-AI",
    "github_project": "AutoML-Lite",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    ">=",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "optuna",
            "specs": [
                [
                    ">=",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "shap",
            "specs": [
                [
                    ">=",
                    "0.41.0"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    ">=",
                    "3.5.0"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    ">=",
                    "0.11.0"
                ]
            ]
        },
        {
            "name": "joblib",
            "specs": [
                [
                    ">=",
                    "1.1.0"
                ]
            ]
        },
        {
            "name": "plotly",
            "specs": [
                [
                    ">=",
                    "5.0.0"
                ]
            ]
        },
        {
            "name": "jinja2",
            "specs": [
                [
                    ">=",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    ">=",
                    "4.62.0"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    ">=",
                    "1.7.0"
                ]
            ]
        },
        {
            "name": "category-encoders",
            "specs": [
                [
                    ">=",
                    "2.3.0"
                ]
            ]
        },
        {
            "name": "imbalanced-learn",
            "specs": [
                [
                    ">=",
                    "0.8.0"
                ]
            ]
        }
    ],
    "lcname": "automl-lite"
}
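
The `requirements` array in the metadata above stores each dependency as a name plus a list of `[operator, version]` pairs. As a minimal sketch of how those entries map to pip-style requirement strings, the Python below rebuilds them from a trimmed copy of that array. The helper name `to_requirement_lines` and the `METADATA` variable are hypothetical, introduced only for this illustration; they are not part of automl-lite or of any PyPI tooling.

```python
import json

# Trimmed copy of the "requirements" array from the metadata above
# (three of the fourteen entries, values unchanged).
METADATA = json.loads("""
{
  "requirements": [
    {"name": "numpy", "specs": [[">=", "1.21.0"]]},
    {"name": "scikit-learn", "specs": [[">=", "1.0.0"]]},
    {"name": "optuna", "specs": [[">=", "3.0.0"]]}
  ]
}
""")


def to_requirement_lines(requirements):
    """Render each {"name", "specs"} entry as a pip-style requirement string.

    Hypothetical helper: multiple specs are joined with commas, and an
    entry with no specs yields just the bare package name.
    """
    lines = []
    for req in requirements:
        spec = ",".join(op + version for op, version in req["specs"])
        lines.append(req["name"] + spec)
    return lines


REQUIREMENT_LINES = to_requirement_lines(METADATA["requirements"])
print("\n".join(REQUIREMENT_LINES))
```

Writing the resulting lines to a file gives something `pip install -r` can consume, assuming the full array is used rather than this trimmed copy.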
        