noventis


| Field | Value |
| --- | --- |
| Name | noventis |
| Version | 0.1.1 |
| home_page | https://github.com/bccfilkom/noventis |
| Summary | An all-in-one automation library that simplifies data cleaning, exploratory data analysis, and machine learning, from raw data to ready-to-deploy models. |
| upload_time | 2025-10-10 16:33:42 |
| maintainer | None |
| docs_url | None |
| author | Richard, Fatoni Murfid Syafii, Ahmad Nafi Mubarok, Orie Abyan Maulana, Grace Wahyuni, Rimba Nevada, Alexander Angelo, Jason Surya Winata, Nada Musyaffa Bilhaqi |
| requires_python | >=3.8 |
| license | MIT |
| keywords | machine learning, automl, automated machine learning, data science, artificial intelligence, data cleaning, feature engineering, eda, exploratory data analysis, predictor, scikit-learn, xgboost, lightgbm, catboost, optuna, shap, flaml |
| requirements | pandas, numpy, scipy, matplotlib, seaborn, ipython, scikit-learn, statsmodels, optuna, shap, flaml, xgboost, lightgbm, catboost, category_encoders, joblib, imbalanced-learn, Jinja2, nbformat |
            <div align="center">
  
<h1 align="center">
  <img src="https://github.com/user-attachments/assets/8d64296a-55f2-4eb4-bc55-275f5d75ef75" alt="Noventis Logo" width="40" height="40" style="vertical-align: middle;"/>
  Noventis
</h1>

### Intelligent Automation for Your Data Analysis

[![PyPI version](https://badge.fury.io/py/noventis.svg)](https://badge.fury.io/py/noventis)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

[Website](https://noventis-fe.vercel.app/) • [Documentation](https://github.com/bccfilkom/noventis)

<img width="1247" height="637" alt="Screenshot From 2025-10-02 09-44-31" src="https://github.com/user-attachments/assets/264f13ce-4f5a-477a-a89d-73f0c9a585bd" />

</div>

---

## 🚀 Overview

**Noventis** is a Python library that automates the routine steps of a data analysis workflow. Built with data scientists and analysts in mind, it provides tools for automated exploratory data analysis, predictive modeling, and data cleaning, all with minimal code.

### ✨ Key Features

- **🔍 EDA Auto** - Automated exploratory data analysis with comprehensive visualizations and statistical insights
- **🎯 Predictor** - Intelligent ML model selection and training with automated hyperparameter tuning
- **🧹 Data Cleaner** - Smart data preprocessing and cleaning with advanced imputation strategies
- **⚡ Fast & Efficient** - Optimized for performance with large datasets
- **📊 Rich Visualizations** - Beautiful, publication-ready charts and reports
- **🔧 Highly Customizable** - Fine-tune every aspect to match your needs

---

## 📦 Installation

### Quick Installation

```bash
pip install noventis
```

### Install from Source

```bash
git clone https://github.com/bccfilkom/noventis.git
cd noventis
pip install -e .
```

### Verify Installation

```python
import noventis
print(noventis.__version__)
noventis.print_info()  # Show detailed installation info
```

---

## 🎯 Quick Start

### 1️⃣ Data Cleaner

Get started with intelligent data preprocessing and cleaning.

```python
import pandas as pd
from noventis.data_cleaner import AutoCleaner

# Load your data
df = pd.read_csv('your_data.csv')

# Automatic data cleaning
cleaner = AutoCleaner()
df_clean = cleaner.fit_transform(df)

# The cleaned data is ready for analysis!
# DataFrame.info() prints its summary directly and returns None
df_clean.info()
```

👉 [Read the Data Cleaner Guide](https://github.com/bccfilkom/noventis/blob/main/docs/data_cleaner.md)

### 2️⃣ EDA Auto

Automatically generate comprehensive exploratory data analysis reports.

```python
from noventis.eda_auto import EDAuto

# Create EDA report
eda = EDAuto(df_clean)

# Generate comprehensive analysis
eda.generate_report()

# Show specific analyses
eda.show_distributions()
eda.show_correlations()
eda.show_missing_patterns()
```

👉 [Read the EDA Auto Guide](https://github.com/bccfilkom/noventis/blob/main/docs/eda_auto.md)

### 3️⃣ Predictor

Build and train machine learning models with automated optimization.

```python
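from sklearn.model_selection import train_test_split
from noventis.predictor import PredictorAuto

# Prepare features and label (assumes the label column is named 'target')
X = df_clean.drop('target', axis=1)
y = df_clean['target']

# Hold out a test set so predictions are made on unseen rows
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Automatic model training
predictor = PredictorAuto()
predictor.fit(X_train, y_train, task='classification')

# Make predictions
predictions = predictor.predict(X_test)

# Get model performance
print(predictor.get_metrics())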
```

[Read the Predictor Guide →](https://github.com/bccfilkom/noventis/blob/main/docs/predictor.md)

### 4️⃣ Complete Pipeline Example

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from noventis.data_cleaner import AutoCleaner
from noventis.eda_auto import EDAuto
from noventis.predictor import PredictorAuto

# 1. Load data
df = pd.read_csv('your_data.csv')

# 2. Clean data
cleaner = AutoCleaner()
df_clean = cleaner.fit_transform(df)

# 3. Explore data
eda = EDAuto(df_clean)
eda.generate_report()

# 4. Train model on a held-out split (assumes the label column is 'target')
X = df_clean.drop('target', axis=1)
y = df_clean['target']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

predictor = PredictorAuto()
predictor.fit(X_train, y_train, task='classification')

# 5. Evaluate on the rows the model has not seen
print(f"Model Accuracy: {predictor.score(X_test, y_test):.2%}")
```

---

## 📚 Core Modules

### 🧹 Data Cleaner

Intelligent data preprocessing and cleaning with advanced strategies:

- **Missing Data Handling** - Multiple imputation strategies (mean, median, KNN, iterative)
- **Outlier Treatment** - Statistical and ML-based detection (IQR, Z-score, Isolation Forest)
- **Feature Scaling** - Normalization and standardization techniques
- **Encoding** - Automatic categorical variable encoding (One-Hot, Label, Target)
- **Data Type Detection** - Intelligent type inference and conversion
- **Duplicate Removal** - Smart duplicate detection and handling

[Learn more →](docs/data_cleaner.md)

### 🔍 EDA Auto
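
To make two of the strategies listed above concrete, here is a minimal standard-library sketch of median imputation and IQR-based outlier detection. It is an illustration only and not Noventis's implementation: the helper names `median_impute` and `iqr_outliers`, and the fence multiplier `k=1.5`, are assumptions chosen for this example.

```python
import statistics


def median_impute(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    # statistics.quantiles sorts the data internally; n=4 yields the quartiles.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


filled = median_impute([1, None, 3])
flagged = iqr_outliers([1, 2, 3, 4, 5, 100])
print(filled, flagged)
```

Imputing before outlier detection keeps missing entries from breaking the quantile computation, which is one reason cleaning steps are usually run in a fixed order.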

Comprehensive exploratory data analysis automation:

- **Statistical Summary** - Descriptive statistics for all features
- **Distribution Analysis** - Histograms, KDE plots, and normality tests
- **Correlation Analysis** - Heatmaps and correlation matrices
- **Missing Data Analysis** - Visualization and patterns of missing values
- **Outlier Detection** - Automatic identification of anomalies
- **Feature Relationships** - Scatter plots and pairwise analysis

[Learn more →](docs/eda_auto.md)
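
As a rough illustration of what the missing-data and correlation analyses compute, the sketch below counts missing entries and evaluates a Pearson correlation coefficient using only the standard library. The function names `missing_count` and `pearson` are hypothetical and are not part of the Noventis API. With pandas, `DataFrame.isna().sum()` and `DataFrame.corr()` produce the equivalent results per column.

```python
import math


def missing_count(values):
    """Count entries that are None (treated here as missing)."""
    return sum(1 for v in values if v is None)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Raises ZeroDivisionError when either sequence has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


print(missing_count([1, None, 3, None]))
print(round(pearson([1, 2, 3], [2, 4, 6]), 6))
```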

### 🎯 Predictor

Automated machine learning with intelligent model selection:

- **Auto Model Selection** - Automatically selects the best algorithm for your data
- **Hyperparameter Tuning** - Optimizes model parameters using advanced search algorithms
- **Feature Engineering** - Creates and selects relevant features automatically
- **Cross-Validation** - Robust model evaluation with k-fold validation
- **Model Explainability** - SHAP values and feature importance analysis
- **Ensemble Methods** - Combines multiple models for better performance

**Supported Algorithms:**

- Scikit-learn: Random Forest, Gradient Boosting, Logistic Regression, SVM
- XGBoost: Extreme Gradient Boosting
- LightGBM: Light Gradient Boosting Machine
- CatBoost: Categorical Boosting
- And many more...

[Learn more →](docs/predictor.md)
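
The idea behind combining model selection with k-fold cross-validation can be sketched in plain Python. This is a toy illustration under stated assumptions, not how Noventis selects models: the two candidates (`fit_mean`, `fit_line`), the mean-absolute-error score, and all function names are hypothetical.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate: one-variable least-squares line."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return lambda x: slope * x + (my - slope * mx)


def cv_error(fit, xs, ys, k=3):
    """Mean absolute error averaged over k held-out folds."""
    errors = []
    for train_idx, test_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors.append(mean([abs(model(xs[i]) - ys[i]) for i in test_idx]))
    return mean(errors)


def select_best(candidates, xs, ys, k=3):
    """Return the name of the candidate with the lowest cross-validated error."""
    return min(candidates, key=lambda name: cv_error(candidates[name], xs, ys, k))


xs = [1, 2, 3, 4, 5, 6]
ys = [2 * x + 1 for x in xs]
print(select_best({"mean": fit_mean, "line": fit_line}, xs, ys))
```

Scoring each candidate only on folds it was not trained on is what keeps the comparison honest; a model that merely memorizes the training rows gains nothing.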

---

## 🛠️ Requirements

### System Requirements

- Python 3.8 or higher
- 4GB RAM minimum (8GB+ recommended for large datasets)
- Windows, macOS, or Linux
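
A quick way to confirm the interpreter satisfies the Python requirement before installing is sketched below. The helper name `meets_minimum` is an assumption for illustration; only `sys.version_info` comes from the standard library.

```python
import sys

MINIMUM_PYTHON = (3, 8)  # the minimum version stated in the requirements


def meets_minimum(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


print("Python version OK:", meets_minimum())
```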

### Core Dependencies

Noventis automatically installs these dependencies:

- **Data Processing**: pandas, numpy, scipy
- **Visualization**: matplotlib, seaborn
- **Machine Learning**: scikit-learn, xgboost, lightgbm, catboost
- **AutoML**: optuna, flaml, shap
- **Feature Engineering**: category_encoders, statsmodels

See [requirements.txt](requirements.txt) for the complete list.

---

## 🤝 Contributing

We welcome contributions from the community! Here's how you can help:

### Ways to Contribute

1. **🐛 Report Bugs** - Found a bug? [Open an issue](https://github.com/bccfilkom/noventis/issues)
2. **💡 Suggest Features** - Have ideas? We'd love to hear them!
3. **📖 Improve Documentation** - Help us make the docs better
4. **🔧 Submit Pull Requests** - Fix bugs or add features

### Development Setup

```bash
# Clone the repository
git clone https://github.com/bccfilkom/noventis.git
cd noventis

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode (quoted so shells such as zsh do not expand the brackets)
pip install -e ".[dev]"

# Run tests
pytest tests/

# Run linting
flake8 noventis/
black noventis/
```

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

---

## 👥 Contributors

This project exists thanks to all the people who contribute:

| Contributor               | Role                |
| ------------------------- | ------------------- |
| **Richard**               | Product Manager     |
| **Fatoni Murfid Syafii**  | AI Product Manager  |
| **Ahmad Nafi Mubarok**    | Lead Data Scientist |
| **Orie Abyan Maulana**    | Lead Data Analyst   |
| **Grace Wahyuni**         | Data Analyst        |
| **Alexander Angelo**      | Data Scientist      |
| **Rimba Nevada**          | Data Scientist      |
| **Jason Surya Winata**    | Frontend Engineer   |
| **Nada Musyaffa Bilhaqi** | Product Designer    |

### Special Thanks

A huge thank you to the maintainers of our dependencies:

- pandas, numpy, scikit-learn, and the entire Python scientific computing community
- XGBoost, LightGBM, and CatBoost teams for excellent gradient boosting libraries
- Optuna and FLAML teams for amazing AutoML frameworks

---

## 📂 Project Structure

The folder structure of the **Noventis** project:

```text
.
├── 📁 dataset_for_examples/     # Sample datasets for testing
├── 📁 docs/                     # Documentation files
├── 📁 examples/                 # Example notebooks and scripts
├── 📁 noventis/                 # Main library code
│   ├── 📁 __pycache__/
│   ├── 📁 asset/               # Asset files (if any)
│   ├── 📁 core/                # Core functionality
│   ├── 📁 data_cleaner/        # Data cleaning module
│   │   ├── 📄 __init__.py
│   │   ├── 📄 auto.py
│   │   ├── 📄 data_quality.py
│   │   ├── 📄 encoding.py
│   │   ├── 📄 imputing.py
│   │   ├── 📄 orchestrator.py
│   │   ├── 📄 outlier_handling.py
│   │   └── 📄 scaling.py
│   ├── 📁 eda_auto/            # EDA automation module
│   │   ├── 📄 __init__.py
│   │   └── 📄 eda_auto.py
│   ├── 📁 predictor/           # Prediction module
│   │   ├── 📄 __init__.py
│   │   ├── 📄 auto.py
│   │   └── 📄 manual.py
│   └── 📄 __init__.py          # Main package init
├── 📁 noventis.egg-info/       # Package metadata
│   ├── 📄 dependency_links.txt
│   ├── 📄 PKG-INFO
│   ├── 📄 SOURCES.txt
│   └── 📄 top_level.txt
├── 📁 tests/                   # Unit tests
├── 📄 .gitignore               # Git ignore rules
├── 📄 LICENSE                  # MIT License
├── 📄 MANIFEST.in              # Package manifest
├── 📄 pyproject.toml           # Modern Python packaging config
├── 📄 README.md                # This file
├── 📄 requirements.txt         # Production dependencies
├── 📄 requirements-dev.txt     # Development dependencies
└── 📄 setup.py                 # Package setup script
```

### 📌 Notes

- The `noventis/` folder contains the **main library code**
- The `tests/` folder is dedicated to **unit testing and integration testing**
- `setup.py` and `pyproject.toml` are used for **packaging and distribution**
- `requirements.txt` lists the **external dependencies** needed for the project

🚀 With this structure, the project is ready for development, testing, and publishing on **PyPI or GitHub**.

---

## 🔧 Troubleshooting

### Common Issues

**Problem**: `ModuleNotFoundError: No module named 'noventis'`

```bash
# Solution: Reinstall the package
pip uninstall noventis
pip install noventis
```

**Problem**: Dependency conflicts

```bash
# Solution: Create a fresh virtual environment
python -m venv fresh_env
source fresh_env/bin/activate  # On Windows: fresh_env\Scripts\activate
pip install noventis
```

**Problem**: Import errors after installation

```python
# Solution: Verify installation
import noventis
print(noventis.__version__)
noventis.print_info()  # Check all dependencies
```
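
If importing `noventis` itself fails, the check above cannot run. As a fallback, the sketch below reports which required distributions are not installed, using only `importlib.metadata` from the standard library (Python 3.8+). The helper name `missing_distributions` is hypothetical, and the `REQUIRED` list is copied from the requirements listed for this package; name normalization (for example `category_encoders` versus `category-encoders`) may behave differently on older Python versions.

```python
from importlib import metadata

# Distribution names copied from the package's listed requirements.
REQUIRED = [
    "pandas", "numpy", "scipy", "matplotlib", "seaborn", "ipython",
    "scikit-learn", "statsmodels", "optuna", "shap", "flaml", "xgboost",
    "lightgbm", "catboost", "category_encoders", "joblib",
    "imbalanced-learn", "Jinja2", "nbformat",
]


def missing_distributions(names):
    """Return the names for which no installed distribution is found."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


print("Missing:", missing_distributions(REQUIRED))
```

An empty list suggests the environment has every dependency and the import error lies elsewhere, for example in a shadowing local file named `noventis.py`.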

### Getting Help

- 📖 [Documentation](https://docs.noventis.dev)
- 🐛 [GitHub Issues](https://github.com/bccfilkom/noventis/issues)

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

### Third-Party Licenses

Noventis uses several open-source libraries. We are grateful to their maintainers:

- **Data Processing**: pandas (BSD), numpy (BSD), scipy (BSD)
- **Visualization**: matplotlib (PSF), seaborn (BSD)
- **Machine Learning**: scikit-learn (BSD), xgboost (Apache 2.0), lightgbm (MIT), catboost (Apache 2.0)
- **AutoML**: optuna (MIT), flaml (MIT), shap (MIT)
- **Feature Engineering**: category_encoders (BSD), statsmodels (BSD)

All dependencies are licensed under permissive open-source licenses (BSD, MIT, Apache 2.0, PSF).

---

## 📚 Citation

If you use Noventis in your research, please cite:

```bibtex
@software{noventis2025,
  author = {Noventis Team},
  title = {Noventis: Intelligent Automation for Data Analysis},
  year = {2025},
  url = {https://github.com/bccfilkom/noventis}
}
```

---

## 🌟 Star History

[![Star History Chart](https://api.star-history.com/svg?repos=bccfilkom/noventis&type=Date)](https://star-history.com/#bccfilkom/noventis&Date)

---

<div align="center">

Made with ❤️ by [Noventis Team](https://noventis.dev)

If you find Noventis useful, please consider giving it a ⭐ on [GitHub](https://github.com/bccfilkom/noventis)!

</div>

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/bccfilkom/noventis",
    "name": "noventis",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "machine learning, automl, automated machine learning, data science, artificial intelligence, data cleaning, feature engineering, eda, exploratory data analysis, predictor, scikit-learn, xgboost, lightgbm, catboost, optuna, shap, flaml",
    "author": "Richard, Fatoni Murfid Syafii, Ahmad Nafi Mubarok, Orie Abyan Maulana, Grace Wahyuni, Rimba Nevada, Alexander Angelo, Jason Surya Winata, Nada Musyaffa Bilhaqi",
    "author_email": "Noventis BCC FILKOM <noventis.bccfilkom@gmail.com>, Richard <richardcen05@gmail.com>, Fatoni Murfid syaafii <fatonimurfids@gmail.com>, Ahmad Nafi Mubarok <ahmadnafim30@gmail.com>, Orie Abyan Maulana <orieabyanm@gmail.com>, Grace Wahyuni <gracewahyuni06@student.ub.ac.id>, Rimba Nevada <nevadakenzie@gmail.com>, Alexander Angelo <alexanderangelo700@gmail.com>, Jason Surya Wijaya <jasonsurya17@gmail.com>, Nada Musyaffa Bilhaqi <nadabilhaqi18@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/f7/35/e9218a61874879a77520651f9a23a7c89c20280b04fb0e9324822967a497/noventis-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "An all-in-one automation library that simplifies data cleaning, exploratory data analysis, and machine learning \u2014 from raw data to ready-to-deploy models.",
    "version": "0.1.1",
    "project_urls": {
        "Bug Tracker": "https://github.com/bccfilkom/noventis/issues",
        "Changelog": "https://github.com/bccfilkom/noventis/blob/main/CONTRIBUTING.md",
        "Documentation": "https://noventis.readthedocs.io",
        "Homepage": "https://github.com/bccfilkom/noventis",
        "Repository": "https://github.com/bccfilkom/noventis"
    },
    "split_keywords": [
        "machine learning",
        " automl",
        " automated machine learning",
        " data science",
        " artificial intelligence",
        " data cleaning",
        " feature engineering",
        " eda",
        " exploratory data analysis",
        " predictor",
        " scikit-learn",
        " xgboost",
        " lightgbm",
        " catboost",
        " optuna",
        " shap",
        " flaml"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4ff480f6b7234c76a05745a01489aacf38b6e54bea4331e9d29fe340c2af1dae",
                "md5": "1ea12860669b7d4dff713c31fcfa2e59",
                "sha256": "64a5123268d5bf1a3e7848700fd5bd57c7a491447be7300e40eaf80b112316b7"
            },
            "downloads": -1,
            "filename": "noventis-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1ea12860669b7d4dff713c31fcfa2e59",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 140080,
            "upload_time": "2025-10-10T16:33:40",
            "upload_time_iso_8601": "2025-10-10T16:33:40.417758Z",
            "url": "https://files.pythonhosted.org/packages/4f/f4/80f6b7234c76a05745a01489aacf38b6e54bea4331e9d29fe340c2af1dae/noventis-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f735e9218a61874879a77520651f9a23a7c89c20280b04fb0e9324822967a497",
                "md5": "12c7c491dac00bb15c00c40ae04f1f00",
                "sha256": "e874726771e430f18fdad3c61b7abf2d7e8960f5e5e41ab6791dfc788feeca2c"
            },
            "downloads": -1,
            "filename": "noventis-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "12c7c491dac00bb15c00c40ae04f1f00",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 141953,
            "upload_time": "2025-10-10T16:33:42",
            "upload_time_iso_8601": "2025-10-10T16:33:42.149144Z",
            "url": "https://files.pythonhosted.org/packages/f7/35/e9218a61874879a77520651f9a23a7c89c20280b04fb0e9324822967a497/noventis-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-10 16:33:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "bccfilkom",
    "github_project": "noventis",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "1.5.3"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "1.23.5"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.10.1"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.7.3"
                ]
            ]
        },
        {
            "name": "seaborn",
            "specs": [
                [
                    "==",
                    "0.12.2"
                ]
            ]
        },
        {
            "name": "ipython",
            "specs": [
                [
                    "==",
                    "8.12.3"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    "==",
                    "1.3.2"
                ]
            ]
        },
        {
            "name": "statsmodels",
            "specs": [
                [
                    "==",
                    "0.14.1"
                ]
            ]
        },
        {
            "name": "optuna",
            "specs": [
                [
                    "==",
                    "3.3.0"
                ]
            ]
        },
        {
            "name": "shap",
            "specs": [
                [
                    "==",
                    "0.44.1"
                ]
            ]
        },
        {
            "name": "flaml",
            "specs": [
                [
                    "==",
                    "2.1.1"
                ]
            ]
        },
        {
            "name": "xgboost",
            "specs": [
                [
                    "==",
                    "1.7.6"
                ]
            ]
        },
        {
            "name": "lightgbm",
            "specs": [
                [
                    "==",
                    "4.1.0"
                ]
            ]
        },
        {
            "name": "catboost",
            "specs": [
                [
                    "==",
                    "1.2.3"
                ]
            ]
        },
        {
            "name": "category_encoders",
            "specs": [
                [
                    "==",
                    "2.6.2"
                ]
            ]
        },
        {
            "name": "joblib",
            "specs": [
                [
                    "==",
                    "1.3.2"
                ]
            ]
        },
        {
            "name": "imbalanced-learn",
            "specs": [
                [
                    "==",
                    "0.10.1"
                ]
            ]
        },
        {
            "name": "Jinja2",
            "specs": [
                [
                    "==",
                    "3.1.6"
                ]
            ]
        },
        {
            "name": "nbformat",
            "specs": [
                [
                    "==",
                    "5.7.2"
                ]
            ]
        }
    ],
    "lcname": "noventis"
}
        