# Catboost-extensions
---
This library provides an easy-to-use interface for hyperparameter tuning of CatBoost models using Optuna. The `OptunaTuneCV` class simplifies the process of defining parameter spaces, configuring trials, and running cross-validation with CatBoost.
## Installation
To install the library, use pip:
```bash
pip install catboost-extensions
```
## Quick Start Guide
### OptunaTuneCV
Here is an example of how to use the library to tune a [CatBoost](https://catboost.ai/en/docs/) model using [Optuna](https://optuna.org/):
#### 1. Import necessary libraries
```python
from pprint import pprint
import pandas as pd
from catboost_extensions.optuna import (
    OptunaTuneCV,
    CatboostParamSpace,
)
from catboost import CatBoostRegressor
from sklearn.datasets import fetch_california_housing
import optuna
```
#### 2. Load and prepare your data
```python
# Load dataset
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target
```
#### 3. Define your CatBoost model
```python
model = CatBoostRegressor(verbose=False, task_type='CPU')
```
#### 4. Define the parameter space
The `CatboostParamSpace` class allows you to define a parameter space for your CatBoost model. You can remove parameters that you don't want to tune using the `del_params` method.
```python
param_space = CatboostParamSpace(params_preset='general', task_type='CPU')
param_space.del_params(['depth', 'l2_leaf_reg'])
pprint(param_space.get_params_space())
```
Output:
```python
{'bootstrap_type': CategoricalDistribution(choices=('Bayesian', 'MVS', 'Bernoulli', 'No')),
 'grow_policy': CategoricalDistribution(choices=('SymmetricTree', 'Depthwise', 'Lossguide')),
 'iterations': IntDistribution(high=5000, log=False, low=100, step=1),
 'learning_rate': FloatDistribution(high=0.1, log=True, low=0.001, step=None),
 'max_bin': IntDistribution(high=512, log=False, low=8, step=1),
 'random_strength': FloatDistribution(high=10.0, log=True, low=0.01, step=None),
 'rsm': FloatDistribution(high=1.0, log=False, low=0.01, step=None),
 'score_function': CategoricalDistribution(choices=('Cosine', 'L2'))}
```
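Several entries in this output, such as `learning_rate` and `random_strength`, use `log=True`. In Optuna this means values are sampled in the log domain, so small values are explored as thoroughly as large ones. The following is a minimal standard-library sketch of that idea, not code from this library or from Optuna; the helper names `sample_log_uniform` and `sample_uniform` are made up for illustration.

```python
import math
import random


def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value that is uniform on [low, high]."""
    return rng.uniform(low, high)


rng = random.Random(0)
low, high = 0.001, 0.1  # bounds of `learning_rate` in the output above

log_draws = [sample_log_uniform(low, high, rng) for _ in range(10_000)]
lin_draws = [sample_uniform(low, high, rng) for _ in range(10_000)]

# Share of draws below 0.01, the geometric midpoint of the range.
log_share = sum(v < 0.01 for v in log_draws) / len(log_draws)
lin_share = sum(v < 0.01 for v in lin_draws) / len(lin_draws)
print(f"log-uniform: {log_share:.2f} of draws below 0.01")
print(f"uniform:     {lin_share:.2f} of draws below 0.01")
```

With log-domain sampling roughly half of the draws fall below 0.01, while plain uniform sampling puts only about 9% there. That is why a log scale is a common choice for parameters like the learning rate that span several orders of magnitude.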
You can also change the default values of the parameters, for example the bounds used for `iterations`:
```python
param_space.iterations = (1000, 2000)
```
#### 5. Set up the `OptunaTuneCV` objective
The `OptunaTuneCV` class builds the objective function that Optuna optimizes. You pass it the CatBoost model, the parameter space, and the dataset, along with options such as the trial timeout and the scoring metric.
```python
objective = OptunaTuneCV(model, param_space, X, y, trial_timeout=10, scoring='r2')
```
#### 6. Create an Optuna study and optimize
You can choose an Optuna sampler (e.g., `TPESampler`) and then create a study to optimize the objective function.
```python
sampler = optuna.samplers.TPESampler(seed=20, multivariate=True)
study = optuna.create_study(direction='maximize', sampler=sampler)
study.optimize(objective, n_trials=10)
```
#### 7. View the results
After the study completes, you can analyze the results to see the best hyperparameters found during the optimization.
```python
print("Best trial:")
trial = study.best_trial
print(f"  Value: {trial.value}")
print("  Params:")
for key, value in trial.params.items():
    print(f"    {key}: {value}")
```
## Contributing
If you want to contribute to this library, please open an issue or submit a pull request on GitHub.
## License
This project is licensed under the MIT License.