# AutoML Lite 🤖
**Automated Machine Learning Made Simple**
A lightweight, production-ready automated machine learning library that simplifies the entire ML pipeline from data preprocessing to model deployment.
## 📋 Table of Contents
- [Features](#-features)
- [Installation](#-installation)
- [Quick Start](#-quick-start)
- [CLI Commands](#-cli-commands)
- [Python API](#-python-api)
- [Advanced Features](#-advanced-features)
- [Use Cases](#-use-cases)
- [Examples](#-examples)
- [Configuration](#-configuration)
- [Contributing](#-contributing)
- [License](#-license)
## ✨ Features
### 🎯 Core Features
- **Automated Model Selection**: Tests multiple algorithms and selects the best performer
- **Hyperparameter Optimization**: Uses Optuna for efficient parameter tuning
- **Cross-Validation**: Robust model evaluation with customizable folds
- **Feature Engineering**: Automatic preprocessing and feature selection
- **Model Persistence**: Save and load trained models easily
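To make the idea of cross-validated model selection concrete, here is a minimal standard-library sketch: it scores two toy candidates with k-fold cross-validation and keeps the one with the best mean score. The names `k_fold_indices`, `cross_val_score`, `select_best`, and the two toy trainers are hypothetical and are not AutoML Lite's API or its actual implementation.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Predictor = Callable[[float], float]
Trainer = Callable[[Sequence[float], Sequence[float]], Predictor]


def k_fold_indices(n_samples: int, n_folds: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, validation) index lists for contiguous, unshuffled folds."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("n_folds must be between 2 and n_samples")
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if not start <= i < stop]
        folds.append((train, validation))
        start = stop
    return folds


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Baseline candidate: always predict the mean of the training targets."""
    average = mean(ys)
    return lambda x: average


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Predictor:
    """Second candidate: one-dimensional least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    variance = sum((x - x_bar) ** 2 for x in xs)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = covariance / variance if variance else 0.0
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def cross_val_score(trainer: Trainer, xs: Sequence[float], ys: Sequence[float],
                    n_folds: int = 5) -> float:
    """Mean negative MAE over the folds, so that higher is better."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), n_folds):
        model = trainer([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        mae = mean(abs(model(xs[i]) - ys[i]) for i in val_idx)
        scores.append(-mae)
    return mean(scores)


def select_best(candidates: Dict[str, Trainer], xs: Sequence[float],
                ys: Sequence[float], n_folds: int = 5) -> Tuple[str, Dict[str, float]]:
    """Score every candidate and return the best name plus all scores."""
    scores = {name: cross_val_score(trainer, xs, ys, n_folds)
              for name, trainer in candidates.items()}
    return max(scores, key=scores.get), scores


xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 for x in xs]
best_name, all_scores = select_best({"mean_baseline": fit_mean, "linear": fit_line}, xs, ys)
print(best_name, all_scores)
```

A real pipeline would shuffle or stratify the folds and tune hyperparameters inside each fold; the sketch leaves both out to keep the selection logic visible.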
### 🚀 Advanced Features
- **Ensemble Methods**: Automatic ensemble creation with voting classifiers
- **Early Stopping**: Optimized training with patience and early stopping
- **Feature Selection**: Intelligent feature importance and selection
- **Model Interpretability**: SHAP values and feature effects analysis
- **Comprehensive Reporting**: HTML reports with interactive visualizations
### 🛠️ Production Ready
- **CLI Interface**: Complete command-line interface
- **Error Handling**: Robust error handling and fallback mechanisms
- **Logging**: Comprehensive logging for debugging
- **Type Hints**: Full type annotations for better development experience
## 🚀 Installation
### Prerequisites
- Python 3.8 or higher
- pip package manager
### Install from Source
```bash
# Clone the repository
git clone https://github.com/Sherin-SEF-AI/AutoML-Lite.git
cd AutoML-Lite
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Install the package
pip install -e .
```
### Dependencies
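The package's required dependencies are installed automatically (minimum versions taken from the package metadata):
- numpy >= 1.21.0
- pandas >= 1.3.0
- scikit-learn >= 1.0.0
- optuna >= 3.0.0
- shap >= 0.41.0
- matplotlib >= 3.5.0
- seaborn >= 0.11.0
- joblib >= 1.1.0
- plotly >= 5.0.0
- jinja2 >= 3.0.0
- tqdm >= 4.62.0
- scipy >= 1.7.0
- category-encoders >= 2.3.0
- imbalanced-learn >= 0.8.0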
## 🚀 Quick Start
### Using CLI (Recommended for Beginners)
1. **Train a Model**
```bash
python -m automl_lite.cli.main train data.csv --target target_column --output model.pkl
```
2. **Make Predictions**
```bash
python -m automl_lite.cli.main predict model.pkl test_data.csv --output predictions.csv
```
3. **Generate Report**
```bash
python -m automl_lite.cli.main report model.pkl --output report.html
```
### Using Python API
```python
from automl_lite import AutoMLite
import pandas as pd
# Load data
data = pd.read_csv('data.csv')
test_data = pd.read_csv('test_data.csv')
# Initialize AutoML
automl = AutoMLite(
    problem_type='classification',
    enable_ensemble=True,
    enable_feature_selection=True,
    enable_interpretability=True
)
# Train model
automl.fit(data, target_column='target')
# Make predictions
predictions = automl.predict(test_data)
# Generate report
automl.generate_report('report.html')
# Save model
automl.save_model('model.pkl')
```
## 📖 CLI Commands
### Training Command
```bash
python -m automl_lite.cli.main train [OPTIONS] DATA
```
**Arguments:**
- `DATA`: Path to training data file (CSV format)
**Options:**
- `--target TEXT`: Target column name (required)
- `--output PATH`: Output model file path (default: model.pkl)
- `--config PATH`: Configuration file path
- `--time-budget INTEGER`: Time budget in seconds (default: 300)
- `--max-models INTEGER`: Maximum number of models (default: 10)
- `--cv-folds INTEGER`: Cross-validation folds (default: 5)
- `--enable-ensemble`: Enable ensemble methods
- `--enable-feature-selection`: Enable feature selection
- `--enable-interpretability`: Enable model interpretability
- `--verbose`: Verbose output
**Examples:**
```bash
# Basic training
python -m automl_lite.cli.main train iris.csv --target species --output iris_model.pkl
# Advanced training with all features
python -m automl_lite.cli.main train data.csv --target target --output model.pkl \
    --enable-ensemble --enable-feature-selection --enable-interpretability \
    --time-budget 600 --max-models 15 --verbose
```
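When the training command is driven from a script, assembling the argument list in code avoids shell-quoting mistakes. The helper below is a hypothetical convenience wrapper (`build_train_command` is not part of AutoML Lite); it only assembles the flags documented above and leaves execution to `subprocess.run`.

```python
import sys
from typing import List, Optional


def build_train_command(
    data: str,
    target: str,
    output: str = "model.pkl",
    time_budget: Optional[int] = None,
    max_models: Optional[int] = None,
    cv_folds: Optional[int] = None,
    enable_ensemble: bool = False,
    enable_feature_selection: bool = False,
    enable_interpretability: bool = False,
    verbose: bool = False,
) -> List[str]:
    """Build the argument list for the documented `train` subcommand."""
    cmd = [sys.executable, "-m", "automl_lite.cli.main", "train", data,
           "--target", target, "--output", output]
    # Valued options are added only when set, so the CLI defaults apply otherwise.
    for flag, value in (("--time-budget", time_budget),
                        ("--max-models", max_models),
                        ("--cv-folds", cv_folds)):
        if value is not None:
            cmd += [flag, str(value)]
    # Boolean switches are appended only when enabled.
    for flag, enabled in (("--enable-ensemble", enable_ensemble),
                          ("--enable-feature-selection", enable_feature_selection),
                          ("--enable-interpretability", enable_interpretability),
                          ("--verbose", verbose)):
        if enabled:
            cmd.append(flag)
    return cmd


command = build_train_command("iris.csv", target="species", output="iris_model.pkl",
                              time_budget=600, enable_ensemble=True)
print(" ".join(command))
# To execute it (requires AutoML Lite to be installed):
#   import subprocess
#   subprocess.run(command, check=True)
```

Passing a list to `subprocess.run` (rather than a single shell string) means file names with spaces need no extra quoting.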
### Prediction Command
```bash
python -m automl_lite.cli.main predict [OPTIONS] MODEL DATA
```
**Arguments:**
- `MODEL`: Path to trained model file
- `DATA`: Path to prediction data file
**Options:**
- `--output PATH`: Output predictions file path (default: predictions.csv)
- `--proba`: Output prediction probabilities
**Examples:**
```bash
# Regular predictions
python -m automl_lite.cli.main predict model.pkl test_data.csv --output predictions.csv
# Probability predictions
python -m automl_lite.cli.main predict model.pkl test_data.csv --output probabilities.csv --proba
```
### Report Command
```bash
python -m automl_lite.cli.main report [OPTIONS] MODEL
```
**Arguments:**
- `MODEL`: Path to trained model file
**Options:**
- `--output PATH`: Output report file path (default: report.html)
**Examples:**
```bash
python -m automl_lite.cli.main report model.pkl --output comprehensive_report.html
```
### Interactive Mode
```bash
python -m automl_lite.cli.main interactive
```
Launches an interactive session for guided model training and analysis.
## 🐍 Python API
### AutoMLite Class
The main class for automated machine learning.
```python
from automl_lite import AutoMLite
automl = AutoMLite(
    problem_type='classification',  # or 'regression'
    time_budget=300,
    max_models=10,
    cv_folds=5,
    enable_ensemble=True,
    enable_feature_selection=True,
    enable_interpretability=True,
    random_state=42
)
```
### Methods
#### `fit(X, y=None, target_column=None)`
Train the AutoML model.
```python
# Using DataFrame with target column
automl.fit(data, target_column='target')
# Using separate X and y
automl.fit(X, y)
```
#### `predict(X)`
Make predictions on new data.
```python
predictions = automl.predict(test_data)
```
#### `predict_proba(X)`
Get prediction probabilities (classification only).
```python
probabilities = automl.predict_proba(test_data)
```
#### `save_model(path)`
Save the trained model.
```python
automl.save_model('model.pkl')
```
#### `load_model(path)`
Load a saved model.
```python
automl.load_model('model.pkl')
```
#### `generate_report(path)`
Generate a comprehensive HTML report.
```python
automl.generate_report('report.html')
```
#### `get_leaderboard()`
Get model performance leaderboard.
```python
leaderboard = automl.get_leaderboard()
```
#### `get_feature_importance()`
Get feature importance scores.
```python
importance = automl.get_feature_importance()
```
## 🚀 Advanced Features
### Ensemble Methods
AutoML Lite automatically creates ensemble models by combining the best-performing models:
```python
automl = AutoMLite(enable_ensemble=True)
automl.fit(data, target_column='target')
# The ensemble model is automatically created and used for predictions
predictions = automl.predict(test_data)
```
**Features:**
- Automatic detection of `predict_proba` support
- Soft voting for compatible models
- Hard voting fallback for incompatible models
- Top-K model selection
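The difference between soft voting, hard voting, and top-K selection is easiest to see on a tiny example. The following is an illustrative standard-library sketch, not the library's ensemble code; `top_k`, `soft_vote`, and `hard_vote` are hypothetical helper names, and the scores and probabilities are made up for the demonstration.

```python
from collections import Counter
from typing import Dict, List, Sequence


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Names of the k highest-scoring models, best first."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def soft_vote(probabilities: Sequence[Dict[str, float]]) -> str:
    """Average per-class probabilities across models and pick the largest."""
    totals: Dict[str, float] = {}
    for model_probs in probabilities:
        for label, p in model_probs.items():
            totals[label] = totals.get(label, 0.0) + p
    return max(totals, key=lambda label: totals[label] / len(probabilities))


def hard_vote(predictions: Sequence[str]) -> str:
    """Pick the label predicted by the most models (first seen wins ties)."""
    return Counter(predictions).most_common(1)[0][0]


# Made-up validation scores for four models; keep the best three.
members = top_k({"rf": 0.91, "svm": 0.85, "lr": 0.88, "nb": 0.80}, k=3)

# Made-up class probabilities from the three members for one email.
member_probs = [
    {"spam": 0.90, "ham": 0.10},
    {"spam": 0.40, "ham": 0.60},
    {"spam": 0.45, "ham": 0.55},
]
# Each model's own label is its most probable class.
member_labels = [max(p, key=p.get) for p in member_probs]

soft_result = soft_vote(member_probs)   # one confident model can outweigh two unsure ones
hard_result = hard_vote(member_labels)  # every model counts equally
print(members, soft_result, hard_result)
```

Soft voting needs every member to expose probabilities, which is why a hard-voting fallback is useful when some models only return labels.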
### Feature Selection
Intelligent feature selection based on importance scores:
```python
automl = AutoMLite(enable_feature_selection=True)
automl.fit(data, target_column='target')
# Get selected features
selected_features = automl.selected_features
print(f"Selected {len(selected_features)} features out of {len(data.columns) - 1}")  # minus the target column
```
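One common way to select features from importance scores is to drop everything below a threshold and then cap the count. The sketch below assumes the scores are available as a plain dict and borrows the parameter names `threshold` and `max_features` from the sample configuration file; `select_features` itself is a hypothetical helper, and AutoML Lite's own selection logic may differ.

```python
from typing import Dict, List


def select_features(importance: Dict[str, float],
                    threshold: float = 0.01,
                    max_features: int = 20) -> List[str]:
    """Keep features scoring at least `threshold`, best first, capped at `max_features`."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    kept = [name for name, score in ranked if score >= threshold]
    return kept[:max_features]


# Made-up importance scores for a churn dataset.
scores = {
    "tenure": 0.42,
    "monthly_charges": 0.31,
    "contract": 0.20,
    "customer_id": 0.004,
    "signup_day": 0.066,
}
selected = select_features(scores, threshold=0.01, max_features=3)
print(selected)
```

With the default cap of 20, only `customer_id` would be dropped here, since it is the only feature below the threshold.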
### Model Interpretability
Comprehensive model interpretability using SHAP values:
```python
automl = AutoMLite(enable_interpretability=True)
automl.fit(data, target_column='target')
# Get interpretability results
interpretability = automl.get_interpretability_results()
```
**Available Interpretability Features:**
- SHAP values for feature importance
- Feature effects analysis
- Model complexity metrics
- Individual prediction explanations
### Early Stopping
Optimized training with early stopping:
```python
automl = AutoMLite(
    enable_early_stopping=True,
    patience=10,
    min_delta=0.001
)
```
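The interaction between `patience` and `min_delta` can be sketched in a few lines: a round counts as an improvement only if the score beats the best so far by more than `min_delta`, and training stops after `patience` consecutive rounds without one. The `EarlyStopping` class below is a generic illustration under those assumptions, not AutoML Lite's internal implementation.

```python
from typing import Optional


class EarlyStopping:
    """Stop after `patience` consecutive rounds without a meaningful improvement."""

    def __init__(self, patience: int = 10, min_delta: float = 0.001) -> None:
        self.patience = patience
        self.min_delta = min_delta
        self.best_score: Optional[float] = None
        self.stale_rounds = 0

    def should_stop(self, score: float) -> bool:
        """Record one round's score (higher is better) and report whether to stop."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.stale_rounds = 0
        else:
            self.stale_rounds += 1
        return self.stale_rounds >= self.patience


# Made-up validation scores: progress stalls after the third round.
scores = [0.70, 0.75, 0.78, 0.7805, 0.7801, 0.779, 0.78]
stopper = EarlyStopping(patience=3, min_delta=0.001)
stopped_at = None
for round_number, score in enumerate(scores, start=1):
    if stopper.should_stop(score):
        stopped_at = round_number
        break
print(stopped_at, stopper.best_score)
```

Note that the fourth score (0.7805) is higher than 0.78 but by less than `min_delta`, so it still counts as a stale round.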
## 📊 Use Cases
### 1. Classification Problems
**Customer Churn Prediction**
```python
# Load customer data
customer_data = pd.read_csv('customer_data.csv')
# Train model
automl = AutoMLite(problem_type='classification', enable_ensemble=True)
automl.fit(customer_data, target_column='churned')
# Predict churn probability
churn_prob = automl.predict_proba(new_customers)
```
**Spam Detection**
```python
# Email classification
automl = AutoMLite(
    problem_type='classification',
    enable_feature_selection=True,
    enable_interpretability=True
)
automl.fit(email_data, target_column='is_spam')
# Generate report for analysis
automl.generate_report('spam_detection_report.html')
```
### 2. Regression Problems
**House Price Prediction**
```python
# Real estate data
automl = AutoMLite(
    problem_type='regression',
    enable_ensemble=True,
    time_budget=600
)
automl.fit(house_data, target_column='price')
# Predict house prices
predictions = automl.predict(new_houses)
```
**Sales Forecasting**
```python
# Time series forecasting
automl = AutoMLite(
    problem_type='regression',
    enable_feature_selection=True
)
automl.fit(sales_data, target_column='sales_volume')
```
### 3. Production Deployment
**Batch Processing**
```bash
# Train model
python -m automl_lite.cli.main train historical_data.csv --target target --output production_model.pkl
# Batch predictions
python -m automl_lite.cli.main predict production_model.pkl new_data.csv --output batch_predictions.csv
```
**API Integration**
```python
# Load trained model
automl = AutoMLite()
automl.load_model('production_model.pkl')
# API endpoint
def predict_endpoint(data):
    return automl.predict(data)
```
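In a real service, the endpoint usually has to validate incoming payloads before calling `predict`. The sketch below is framework-agnostic and uses a stand-in model so that it runs without AutoML Lite; `StubModel`, `make_predict_endpoint`, and the `{"rows": [...]}` payload shape are assumptions made for illustration.

```python
import json
from typing import Any, Callable, Dict, List, Sequence


class StubModel:
    """Stand-in for a loaded model; any object with a `predict(rows)` method works."""

    def predict(self, rows: List[Dict[str, Any]]) -> List[int]:
        # Toy rule in place of a trained model.
        return [1 if row["monthly_charges"] > 80 else 0 for row in rows]


def make_predict_endpoint(model: Any, required_columns: Sequence[str]) -> Callable[[str], str]:
    """Wrap `model.predict` in a handler that takes and returns JSON strings."""

    def predict_endpoint(payload: str) -> str:
        try:
            rows = json.loads(payload)["rows"]
        except (json.JSONDecodeError, KeyError, TypeError):
            return json.dumps({"error": "payload must be a JSON object with a 'rows' list"})
        if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
            return json.dumps({"error": "'rows' must be a list of objects"})
        for index, row in enumerate(rows):
            missing = [col for col in required_columns if col not in row]
            if missing:
                return json.dumps({"error": f"row {index} is missing columns: {missing}"})
        return json.dumps({"predictions": list(model.predict(rows))})

    return predict_endpoint


endpoint = make_predict_endpoint(StubModel(), required_columns=["monthly_charges"])
print(endpoint('{"rows": [{"monthly_charges": 95}, {"monthly_charges": 20}]}'))
print(endpoint('{"rows": [{"tenure": 3}]}'))
```

With a real AutoML Lite model, the validated rows would most likely be converted to a pandas DataFrame before being passed to `predict`, and NumPy results would need converting to plain Python types before JSON serialization.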
## 📚 Examples
### Basic Classification Example
```python
from automl_lite import AutoMLite
import pandas as pd
from sklearn.datasets import load_iris
# Load iris dataset
iris = load_iris()
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['target'] = iris.target
# Initialize AutoML
automl = AutoMLite(
    problem_type='classification',
    time_budget=60,
    max_models=5
)
# Train model
automl.fit(data, target_column='target')
# Make predictions
predictions = automl.predict(data.iloc[:10])
# Generate report
automl.generate_report('iris_report.html')
print(f"Best model: {automl.best_model_name}")
print(f"Best score: {automl.best_score:.4f}")
```
### Advanced Regression Example
```python
from automl_lite import AutoMLite
import pandas as pd
from sklearn.datasets import fetch_california_housing
# Load the California housing dataset (downloaded on first use)
housing = fetch_california_housing()
data = pd.DataFrame(housing.data, columns=housing.feature_names)
data['target'] = housing.target
# Initialize AutoML with all features
automl = AutoMLite(
    problem_type='regression',
    enable_ensemble=True,
    enable_feature_selection=True,
    enable_interpretability=True,
    time_budget=300,
    max_models=10
)
# Train model
automl.fit(data, target_column='target')
# Get feature importance
importance = automl.get_feature_importance()
print("Top 5 features:")
for feature, score in list(importance.items())[:5]:
    print(f"{feature}: {score:.4f}")
# Generate comprehensive report
automl.generate_report('california_housing_report.html')
```
### CLI Workflow Example
```bash
# 1. Train model with all features
python -m automl_lite.cli.main train customer_data.csv \
    --target churn \
    --output customer_churn_model.pkl \
    --enable-ensemble \
    --enable-feature-selection \
    --enable-interpretability \
    --time-budget 600 \
    --max-models 15 \
    --verbose
# 2. Generate comprehensive report
python -m automl_lite.cli.main report customer_churn_model.pkl \
    --output customer_churn_report.html
# 3. Make predictions on new data
python -m automl_lite.cli.main predict customer_churn_model.pkl \
    new_customers.csv \
    --output churn_predictions.csv \
    --proba
```
## ⚙️ Configuration
### Configuration File
Create a `config.yaml` file for custom settings:
```yaml
# AutoML Configuration
problem_type: classification
time_budget: 600
max_models: 15
cv_folds: 5
random_state: 42
# Advanced Features
enable_ensemble: true
enable_feature_selection: true
enable_interpretability: true
enable_early_stopping: true
# Model Parameters
models:
  - RandomForest
  - XGBoost
  - LightGBM
  - SVM
  - NeuralNetwork
# Feature Selection
feature_selection:
  method: mutual_info
  threshold: 0.01
  max_features: 20
# Ensemble
ensemble:
  method: voting
  top_k: 3
  voting: soft
```
### Using Configuration
```bash
python -m automl_lite.cli.main train data.csv --target target --config config.yaml
```
```python
automl = AutoMLite.from_config('config.yaml')
automl.fit(data, target_column='target')
```
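Because a typo in a configuration file can go unnoticed until training fails, it can help to check the parsed settings first. The sketch below validates a dict shaped like the sample `config.yaml`; `EXPECTED_TYPES` and `validate_config` are hypothetical names, the accepted keys are simply those shown in the sample file, and whether AutoML Lite performs similar checks itself is not documented here.

```python
from typing import Any, Dict, List

# Top-level keys from the sample config.yaml and the Python type each should parse to.
EXPECTED_TYPES: Dict[str, type] = {
    "problem_type": str,
    "time_budget": int,
    "max_models": int,
    "cv_folds": int,
    "random_state": int,
    "enable_ensemble": bool,
    "enable_feature_selection": bool,
    "enable_interpretability": bool,
    "enable_early_stopping": bool,
    "models": list,
    "feature_selection": dict,
    "ensemble": dict,
}


def validate_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the config looks sane."""
    problems: List[str] = []
    for key, value in config.items():
        expected = EXPECTED_TYPES.get(key)
        if expected is None:
            problems.append(f"unknown key: {key}")
        elif not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            # bool is a subclass of int in Python, so reject it explicitly for int fields.
            problems.append(f"{key}: expected {expected.__name__}, got {type(value).__name__}")
    if config.get("problem_type") not in (None, "classification", "regression"):
        problems.append("problem_type: must be 'classification' or 'regression'")
    for key, minimum in (("time_budget", 1), ("max_models", 1), ("cv_folds", 2)):
        value = config.get(key)
        if isinstance(value, int) and not isinstance(value, bool) and value < minimum:
            problems.append(f"{key}: must be >= {minimum}")
    return problems


# A dict equivalent to part of the sample file; with PyYAML installed it could
# instead come from yaml.safe_load(open("config.yaml")).
good_config = {
    "problem_type": "classification",
    "time_budget": 600,
    "max_models": 15,
    "cv_folds": 5,
    "enable_ensemble": True,
    "ensemble": {"method": "voting", "top_k": 3, "voting": "soft"},
}
bad_config = {"problem_type": "clustering", "cv_folds": 1, "time_budget": "600", "folds": 5}

print(validate_config(good_config))
for problem in validate_config(bad_config):
    print(problem)
```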
## 📊 Supported Algorithms
### Classification
- Random Forest
- XGBoost
- LightGBM
- Support Vector Machine (SVM)
- Logistic Regression
- Naive Bayes
- Neural Network (MLP)
- Extra Trees
- Linear Discriminant Analysis
### Regression
- Random Forest
- XGBoost
- LightGBM
- Support Vector Regression (SVR)
- Linear Regression
- Ridge Regression
- Lasso Regression
- Neural Network (MLP)
- Extra Trees
## 📈 Performance Metrics
### Classification Metrics
- Accuracy
- Precision
- Recall
- F1-Score
- ROC-AUC
- Precision-Recall AUC
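For reference, the first four metrics above can be computed directly from true and predicted labels. This is a plain-Python sketch for the binary case using the standard definitions (the function name `binary_classification_metrics` is hypothetical); in practice the equivalent functions in `sklearn.metrics` would normally be used.

```python
from typing import Dict, Hashable, Sequence


def binary_classification_metrics(y_true: Sequence[Hashable],
                                  y_pred: Sequence[Hashable],
                                  positive: Hashable = 1) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    correct = sum(1 for t, p in pairs if t == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_classification_metrics(y_true, y_pred)
print(metrics)
```

ROC-AUC and Precision-Recall AUC are omitted because they need predicted scores or probabilities rather than hard labels.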
### Regression Metrics
- R² Score
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
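The regression metrics are closely related: RMSE is the square root of MSE, and R² compares the residual sum of squares with the variance of the targets. The sketch below computes all four from their standard definitions in plain Python (`regression_metrics` is a hypothetical helper name, and the numbers are made up).

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """MAE, MSE, RMSE, and R² from true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

R² is undefined when all targets are identical, which the sketch reports as NaN rather than raising an error.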
## 🔧 Troubleshooting
### Common Issues
**1. Memory Issues**
```bash
# Reduce number of models
python -m automl_lite.cli.main train data.csv --target target --max-models 5
```
**2. Time Budget Exceeded**
```bash
# Increase time budget
python -m automl_lite.cli.main train data.csv --target target --time-budget 1200
```
**3. Model Compatibility**
```python
# Rule out ensemble incompatibilities by training without ensembling
automl = AutoMLite(enable_ensemble=False) # Disable ensemble if issues occur
```
**4. Feature Selection Issues**
```python
# Disable feature selection for debugging
automl = AutoMLite(enable_feature_selection=False)
```
### Debug Mode
Enable verbose logging for debugging:
```bash
python -m automl_lite.cli.main train data.csv --target target --verbose
```
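To reason about how the `--time-budget` and `--max-models` limits interact, it can help to picture a loop that checks both before starting each candidate. The following is a simplified sketch of that idea, not AutoML Lite's scheduler; `run_within_budget` and the injectable `clock` parameter are assumptions made so the behavior can be demonstrated deterministically.

```python
import time
from typing import Callable, Dict, Iterable, Tuple


def run_within_budget(
    candidates: Iterable[Tuple[str, Callable[[], float]]],
    time_budget: float,
    max_models: int,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Evaluate candidates in order until the time or model budget is used up."""
    start = clock()
    results: Dict[str, float] = {}
    for name, evaluate in candidates:
        # Both limits are checked before a candidate starts, never in the middle of one.
        if len(results) >= max_models or clock() - start >= time_budget:
            break
        results[name] = evaluate()
    return results


# Made-up candidates whose "training" returns a fixed score immediately.
candidates = [
    ("random_forest", lambda: 0.80),
    ("logistic_regression", lambda: 0.90),
    ("svm", lambda: 0.70),
]
results = run_within_budget(candidates, time_budget=60, max_models=2)
print(results)
```

Because the limits are only checked between candidates in this sketch, a single slow model can still overrun the budget; lowering the number of models shortens the run, while raising the time budget lets more candidates finish.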
## 🤝 Contributing
We welcome contributions! Please see our [Contributing Guidelines](CONTRIBUTING.md) for details.
### Development Setup
```bash
# Clone repository
git clone https://github.com/Sherin-SEF-AI/AutoML-Lite.git
cd AutoML-Lite
# Install development dependencies
pip install -r requirements-dev.txt
# Run tests
pytest tests/
# Run linting
flake8 src/
black src/
```
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 👨‍💻 Author
**Sherin Joseph Roy**
- Email: sherin.joseph2217@gmail.com
- GitHub: [@Sherin-SEF-AI](https://github.com/Sherin-SEF-AI)
## 🙏 Acknowledgments
- [scikit-learn](https://scikit-learn.org/) for the machine learning algorithms
- [Optuna](https://optuna.org/) for hyperparameter optimization
- [Plotly](https://plotly.com/) for interactive visualizations
- [Pandas](https://pandas.pydata.org/) for data manipulation
## 📞 Support
- **Issues**: [GitHub Issues](https://github.com/Sherin-SEF-AI/AutoML-Lite/issues)
- **Email**: sherin.joseph2217@gmail.com
- **Documentation**: [GitHub Wiki](https://github.com/Sherin-SEF-AI/AutoML-Lite/wiki)
---
**Made with ❤️ by Sherin Joseph Roy**