# hgboost - Hyperoptimized Gradient Boosting
[Python versions](https://img.shields.io/pypi/pyversions/hgboost)
[PyPI](https://pypi.org/project/hgboost/)
[License](https://github.com/erdogant/hgboost/blob/master/LICENSE)
[Forks](https://github.com/erdogant/hgboost/network)
[Issues](https://github.com/erdogant/hgboost/issues)
[Project status: active](http://www.repostatus.org/#active)
[Downloads per month](https://pepy.tech/project/hgboost/month)
[Downloads](https://pepy.tech/project/hgboost)
[DOI](https://zenodo.org/badge/latestdoi/257025146)
[Documentation](https://erdogant.github.io/hgboost/)
[Colab notebook](https://erdogant.github.io/hgboost/pages/html/Documentation.html#colab-classification-notebook)
[Medium blog](https://erdogant.github.io/hgboost/pages/html/Documentation.html#medium-blog)
<!---[](https://www.buymeacoffee.com/erdogant)-->
<!---[](https://erdogant.github.io/donate/?currency=USD&amount=5)-->
--------------------------------------------------------------------
``hgboost`` is short for **Hyperoptimized Gradient Boosting** and is a Python package for hyperparameter optimization of *xgboost*, *catboost* and *lightboost* using cross-validation, with evaluation of the results on an independent validation set.
``hgboost`` can be applied to both classification and regression tasks.
``hgboost`` is fun because:
* 1. Hyperoptimization of the parameter space using a Bayesian approach.
* 2. Determines the best scoring model(s) using k-fold cross-validation.
* 3. Evaluates the best model on an independent evaluation set.
* 4. Fits the final model on the entire input data using the best parameters.
* 5. Works for classification and regression.
* 6. Creates a super-hyperoptimized model by ensembling all individually optimized models.
* 7. Returns the model, search space, and test/evaluation results.
* 8. Makes insightful plots.
--------------------------------------------------------------------
**⭐️ Star this repo if you like it ⭐️**
--------------------------------------------------------------------
### Blogs
Medium Blog 1:
[The Best Boosting Model using Bayesian Hyperparameter Tuning but without Overfitting.](https://erdogant.github.io/hgboost/pages/html/Documentation.html#medium-blog)
Medium Blog 2:
[Create Explainable Gradient Boosting Classification models using Bayesian Hyperparameter Optimization.](https://erdogant.github.io/hgboost/pages/html/Documentation.html#medium-blog)
--------------------------------------------------------------------
### [Documentation pages](https://erdogant.github.io/hgboost/)
On the [documentation pages](https://erdogant.github.io/hgboost/) you can find detailed information about how ``hgboost`` works, together with many examples.
--------------------------------------------------------------------
## Colab Notebooks
* <a href="https://erdogant.github.io/hgboost/pages/html/Documentation.html#colab-regression-notebook"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open regression example In Colab"/> </a> Regression example
* <a href="https://erdogant.github.io/hgboost/pages/html/Documentation.html#colab-classification-notebook"> <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open classification example In Colab"/> </a> Classification example
--------------------------------------------------------------------
### Schematic overview of hgboost
<p align="center">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/schematic_overview.png" width="600" />
</p>
### Installation Environment
```bash
conda create -n env_hgboost python=3.8
conda activate env_hgboost
```
### Install from PyPI
```bash
pip install hgboost
pip install -U hgboost # Force update
```
#### Import hgboost package
```python
from hgboost import hgboost
```
#### Examples
* [Example: Fit catboost by hyperoptimization and cross-validation](https://erdogant.github.io/hgboost/pages/html/Examples.html#catboost)
#
* [Example: Fit lightboost by hyperoptimization and cross-validation](https://erdogant.github.io/hgboost/pages/html/Examples.html#lightboost)
#
* [Example: Fit xgboost by hyperoptimization and cross-validation](https://erdogant.github.io/hgboost/pages/html/Examples.html#xgboost-two-class)
#
* [Example: Plot searched parameter space](https://erdogant.github.io/hgboost/pages/html/Examples.html#plot-params)
<p align="left">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_params_clf_1.png" width="400" />
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_params_clf_2.png" width="400" />
</p>
#
* [Example: Plot summary](https://erdogant.github.io/hgboost/pages/html/Examples.html#plot-summary)
<p align="left">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_clf.png" width="600" />
</p>
#
* [Example: Tree plot](https://erdogant.github.io/hgboost/pages/html/Examples.html#treeplot)
<p align="left">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/treeplot_clf_1.png" width="400" />
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/treeplot_clf_2.png" width="400" />
</p>
#
* [Example: Plot the validation results](https://erdogant.github.io/hgboost/pages/html/Examples.html#plot-validation)
<p align="left">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_validation_clf_1.png" width="600" />
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_validation_clf_2.png" width="400" />
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_validation_clf_3.png" width="600" />
</p>
#
* [Example: Plot the cross-validation results](https://erdogant.github.io/hgboost/pages/html/Examples.html#plot-cv)
<p align="left">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_cv_clf.png" width="600" />
</p>
#
* [Example: use the learned model to make new predictions](https://erdogant.github.io/hgboost/pages/html/hgboost.hgboost.html?highlight=predict#hgboost.hgboost.hgboost.predict)
#
* [Example: Create ensemble model for Classification](https://erdogant.github.io/hgboost/pages/html/Examples.html#ensemble-classification)
#
* [Example: Create ensemble model for Regression](https://erdogant.github.io/hgboost/pages/html/Examples.html#ensemble-regression)
#
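#### Regression example (sketch)

A regression fit follows the same pattern as the classification example below. This is a minimal sketch, assuming the ``xgboost_reg``, ``catboost_reg`` and ``lightboost_reg`` methods referenced by the regression Colab notebook and documentation links above; the parameter values are illustrative only.

```python
# Load library
from hgboost import hgboost

# Initialization for a regression task (illustrative parameter values)
hgb = hgboost(max_eval=250, cv=5, test_size=0.2, val_size=0.2, top_cv_evals=10, random_state=42)

# Fit xgboost on a continuous target y by hyperoptimization and cross-validation
results = hgb.xgboost_reg(X, y, eval_metric='rmse')

# catboost and lightboost regressors are assumed to follow the same pattern:
# results = hgb.catboost_reg(X, y, eval_metric='rmse')
# results = hgb.lightboost_reg(X, y, eval_metric='rmse')
```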
#### Classification example for xgboost, catboost and lightboost:
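The snippet below assumes a feature matrix ``X`` and target ``y`` that are not defined in this README. A purely hypothetical way to prepare them with pandas is shown here; the ``titanic.csv`` file, column names and label mapping are illustrative only.

```python
import pandas as pd

# Hypothetical data preparation; 'titanic.csv' is an assumed local dataset with a 'Survived' column.
df = pd.read_csv('titanic.csv')
y = df['Survived'].map({1: 'survived', 0: 'not survived'}).values  # string labels matching pos_label below
X = pd.get_dummies(df.drop(columns=['Survived']))                  # one-hot encoded feature matrix
```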
```python
# Load library
from hgboost import hgboost
# Initialization
hgb = hgboost(max_eval=10, threshold=0.5, cv=5, test_size=0.2, val_size=0.2, top_cv_evals=10, random_state=42)
# Fit xgboost by hyperoptimization and cross-validation
results = hgb.xgboost(X, y, pos_label='survived')
# [hgboost] >Start hgboost classification..
# [hgboost] >Collecting xgb_clf parameters.
# [hgboost] >Number of variables in search space is [11], loss function: [auc].
# [hgboost] >method: xgb_clf
# [hgboost] >eval_metric: auc
# [hgboost] >greater_is_better: True
# [hgboost] >pos_label: True
# [hgboost] >Total dataset: (891, 204)
# [hgboost] >Hyperparameter optimization..
# 100% |----| 500/500 [04:39<05:21, 1.33s/trial, best loss: -0.8800619834710744]
# [hgboost] >Best performing [xgb_clf] model: auc=0.881198
# [hgboost] >5-fold cross validation for the top 10 scoring models, Total nr. tests: 50
# 100%|██████████| 10/10 [00:42<00:00, 4.27s/it]
# [hgboost] >Evalute best [xgb_clf] model on independent validation dataset (179 samples, 20.00%).
# [hgboost] >[auc] on independent validation dataset: -0.832
# [hgboost] >Retrain [xgb_clf] on the entire dataset with the optimal parameters settings.
```
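The validation plot in the next snippet shows results for an ensemble model. Fitting one is sketched here, assuming the ``ensemble()`` method described in the ensemble example links above; it combines the individually optimized models into a single classifier.

```python
# Create an ensemble of the individually hyperoptimized models (sketch; see the
# ensemble classification example linked above).
results = hgb.ensemble(X, y, pos_label='survived')
```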
```python
# Plot the ensemble classification validation results
hgb.plot_validation()
```
<p align="center">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_ensemble_clf_1.png" width="600" />
</p>
<p align="center">
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_ensemble_clf_2.png" width="400" />
<img src="https://github.com/erdogant/hgboost/blob/master/docs/figs/plot_ensemble_clf_3.png" width="400" />
</p>
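The remaining plots and the prediction step linked in the examples above can be produced from the same fitted object. A short sketch, assuming the method names referenced by those documentation links:

```python
# Visualize the searched hyperparameter space, the model summary, the best tree,
# and the cross-validation results.
hgb.plot_params()   # searched parameter space
hgb.plot()          # summary of all evaluated models
hgb.treeplot()      # tree of the best performing model
hgb.plot_cv()       # k-fold cross-validation results of the top models

# Score new samples with the final model that was refitted on the entire dataset.
# Two return values (predictions and probabilities) are assumed here, based on the
# predict() documentation link above.
y_pred, y_proba = hgb.predict(X)
```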
<hr>
**References**
* http://hyperopt.github.io/hyperopt/
* https://github.com/dmlc/xgboost
* https://github.com/microsoft/LightGBM
* https://github.com/catboost/catboost
**Maintainers**
* Erdogan Taskesen, github: [erdogant](https://github.com/erdogant)
**Contribute**
* Contributions are welcome.
**License**
See [LICENSE](LICENSE) for details.
**Coffee**
* If you wish to buy me a <a href="https://www.buymeacoffee.com/erdogant">coffee</a> for this work, it is very much appreciated :)