# baytune

* Name: baytune
* Version: 0.5.0
* Summary: Bayesian Tuning and Bandits
* Upload time: 2023-07-28 16:54:26
* Requires Python: <4,>=3.8
* License: MIT License
* Keywords: data science, machine learning, hyperparameters, tuning, classification

<p align="left">
<img width="15%" src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="BTB" />
<i>An open source project from Data to AI Lab at MIT.</i>
</p>

![](https://raw.githubusercontent.com/MLBazaar/BTB/master/docs/images/BTB-Icon-small.png)

A simple, extensible backend for developing auto-tuning systems.

[![Development Status](https://img.shields.io/badge/Development%20Status-2%20--%20Pre--Alpha-yellow)](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
[![PyPi Shield](https://img.shields.io/pypi/v/baytune.svg)](https://pypi.python.org/pypi/baytune)
[![Travis CI Shield](https://travis-ci.com/MLBazaar/BTB.svg?branch=master)](https://travis-ci.com/MLBazaar/BTB)
[![Coverage Status](https://codecov.io/gh/MLBazaar/BTB/branch/master/graph/badge.svg)](https://codecov.io/gh/MLBazaar/BTB)
[![Downloads](https://pepy.tech/badge/baytune)](https://pepy.tech/project/baytune)
[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/MLBazaar/BTB/master?filepath=tutorials)

* License: [MIT](https://github.com/MLBazaar/BTB/blob/master/LICENSE)
* Development Status: [Pre-Alpha](https://pypi.org/search/?c=Development+Status+%3A%3A+2+-+Pre-Alpha)
* Documentation: https://mlbazaar.github.io/BTB
* Homepage: https://github.com/MLBazaar/BTB

# Overview

BTB ("Bayesian Tuning and Bandits") is a simple, extensible backend for developing auto-tuning
systems such as AutoML systems. It provides an easy-to-use interface for *tuning* models and
*selecting* between models.

It is currently being used in several AutoML systems:

- [ATM](https://github.com/HDI-Project/ATM), a distributed, multi-tenant AutoML system for
classifier tuning
- [MIT's system](https://github.com/HDI-Project/mit-d3m-ta2/) for the DARPA
[Data-driven discovery of models](https://www.darpa.mil/program/data-driven-discovery-of-models) (D3M) program
- [AutoBazaar](https://github.com/MLBazaar/AutoBazaar), a flexible, general-purpose
AutoML system

## Try it out now!

If you want to quickly discover **BTB**, simply click the button below and follow the tutorials!

[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/MLBazaar/BTB/master?filepath=tutorials)

# Install

## Requirements

**BTB** requires [Python](https://www.python.org/downloads/) 3.8 or later: the 0.5.0 release
declares its supported Python range as `<4,>=3.8`.

Although it is not strictly required, using a
[virtualenv](https://virtualenv.pypa.io/en/latest/) is highly recommended in order to avoid
interfering with other software installed on the system where **BTB** is run.

## Install with pip

The easiest and recommended way to install **BTB** is using [pip](
https://pip.pypa.io/en/stable/):

```bash
pip install baytune
```

This will pull and install the latest stable release from [PyPI](https://pypi.org/).
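
To confirm which release ended up in your environment, you can query the installed
distribution metadata. The snippet below is a minimal sketch that only uses the standard
library; `installed_version` is a helper name introduced here for illustration. Note that the
distribution is called `baytune`, while the package you import is `btb`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints a version string such as the one shown on PyPI, or None if not installed.
print(installed_version('baytune'))
```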

If you want to install from source or contribute to the project, please read the
[Contributing Guide](https://mlbazaar.github.io/BTB/contributing.html#get-started).

# Quickstart

In this short tutorial we will guide you through the necessary steps to get started using BTB
to `select` between models and `tune` a model to solve a Machine Learning problem.

In particular, in this example we will be using ``BTBSession`` to solve the [Wine](
https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data) classification problem
by selecting between the `DecisionTreeClassifier` and the `SGDClassifier` models from
[scikit-learn](https://scikit-learn.org/) while also searching for their best `hyperparameter`
configuration.

## Prepare a scoring function

The first step in order to use the `BTBSession` class is to develop a `scoring` function.

This is a Python function that, given a model name and a `hyperparameter` configuration,
evaluates the performance of the model on your data and returns a score.

```python3
from sklearn.datasets import load_wine
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier


dataset = load_wine()
models = {
    'DTC': DecisionTreeClassifier,
    'SGDC': SGDClassifier,
}

def scoring_function(model_name, hyperparameter_values):
    model_class = models[model_name]
    model_instance = model_class(**hyperparameter_values)
    scores = cross_val_score(
        estimator=model_instance,
        X=dataset.data,
        y=dataset.target,
        scoring=make_scorer(f1_score, average='macro')
    )
    return scores.mean()
```
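
Before handing a scoring function to a session, it can help to call it once by hand and
check that it returns a single finite number. The following is a minimal sketch of such a
check in plain Python. `check_scorer` and `toy_scorer` are hypothetical names introduced here,
and `toy_scorer` is only a stand-in so the snippet runs without scikit-learn; in practice you
would pass your own `scoring_function`.

```python
import math
import numbers


def check_scorer(scorer, model_name, config):
    """Call the scorer once and verify that it returns a finite real number."""
    score = scorer(model_name, config)
    if isinstance(score, bool) or not isinstance(score, numbers.Real):
        raise TypeError(f'scorer returned {type(score).__name__}, expected a number')
    if not math.isfinite(score):
        raise ValueError(f'scorer returned a non-finite score: {score!r}')
    return float(score)


def toy_scorer(model_name, hyperparameter_values):
    # Hypothetical stand-in: the score peaks when max_depth equals 10.
    depth = hyperparameter_values['max_depth']
    return 1.0 / (1.0 + abs(depth - 10))


print(check_scorer(toy_scorer, 'DTC', {'max_depth': 10}))
```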

## Define the tunable hyperparameters

The second step is to define the `hyperparameters` that we want to `tune` for each model as
`Tunables`.

```python3
from btb.tuning import Tunable
from btb.tuning import hyperparams as hp

tunables = {
    'DTC': Tunable({
        'max_depth': hp.IntHyperParam(min=3, max=200),
        'min_samples_split': hp.FloatHyperParam(min=0.01, max=1)
    }),
    'SGDC': Tunable({
        'max_iter': hp.IntHyperParam(min=1, max=5000, default=1000),
        'tol': hp.FloatHyperParam(min=1e-3, max=1, default=1e-3),
    })
}
```
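
Conceptually, each entry above describes a range from which candidate values are drawn.
The sketch below illustrates that idea with plain Python stand-ins. `IntRange`, `FloatRange`
and `sample_config` are hypothetical names and are not part of BTB, and uniform random
sampling is used here only for illustration; it is not how BTB's Bayesian tuners propose
values.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    min: int
    max: int

    def sample(self, rng):
        # randint includes both endpoints.
        return rng.randint(self.min, self.max)


@dataclass(frozen=True)
class FloatRange:
    min: float
    max: float

    def sample(self, rng):
        return rng.uniform(self.min, self.max)


def sample_config(space, rng):
    """Draw one value per hyperparameter, producing a keyword-argument dict."""
    return {name: spec.sample(rng) for name, spec in space.items()}


# Same ranges as the 'DTC' entry above.
dtc_space = {
    'max_depth': IntRange(3, 200),
    'min_samples_split': FloatRange(0.01, 1.0),
}

config = sample_config(dtc_space, random.Random(0))
print(config)
```

A dictionary shaped like `config` is what the scoring function receives as
`hyperparameter_values`.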

## Start the searching process

Once you have defined a `scoring` function and the tunable `hyperparameters` specification of your
models, you can start searching for the best model and `hyperparameter` configuration by using
the `btb.BTBSession`.

All you need to do is create an instance, passing the tunable `hyperparameters` specification
and the scoring function.

```python3
from btb import BTBSession

session = BTBSession(
    tunables=tunables,
    scorer=scoring_function
)
```

Then call the `run` method, indicating how many tuning iterations you want the `BTBSession` to
perform:


```python3
best_proposal = session.run(20)
```

The result will be a dictionary indicating the name of the best model that could be found
and the `hyperparameter` configuration that was used:

```
{
    'id': '826aedc2eff31635444e8104f0f3da43',
    'name': 'DTC',
    'config': {
        'max_depth': 21,
        'min_samples_split': 0.044010284821858835
    },
    'score': 0.907229308339589
}
```
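
To make the select-then-tune loop more concrete, here is a plain-Python sketch of the general
idea: a bandit rule picks which model to try next, a configuration is proposed for it, the
scorer evaluates it, and the best result is tracked. This is an illustration under stated
assumptions, not BTB's implementation: `ucb1_choose`, `run_search`, `spaces` and `toy_scorer`
are hypothetical names, UCB1 is used as one common bandit rule, and uniform random sampling
stands in for a Bayesian tuner.

```python
import math
import random


def ucb1_choose(history, total_trials):
    """Pick the model with the highest UCB1 bound, trying untested models first."""
    for name, scores in history.items():
        if not scores:
            return name

    def bound(scores):
        mean = sum(scores) / len(scores)
        return mean + math.sqrt(2 * math.log(total_trials) / len(scores))

    return max(history, key=lambda name: bound(history[name]))


def run_search(spaces, scorer, iterations, seed=0):
    """Alternate between selecting a model and scoring a sampled configuration."""
    rng = random.Random(seed)
    history = {name: [] for name in spaces}
    best = None
    for trial in range(iterations):
        name = ucb1_choose(history, trial)
        # Random proposal: a stand-in for a tuner's informed proposal.
        config = {key: sampler(rng) for key, sampler in spaces[name].items()}
        score = scorer(name, config)
        history[name].append(score)
        if best is None or score > best['score']:
            best = {'name': name, 'config': config, 'score': score}
    return best


# Hypothetical search spaces: each hyperparameter maps to a sampling function.
spaces = {
    'A': {'x': lambda rng: rng.uniform(0.0, 1.0)},
    'B': {'x': lambda rng: rng.uniform(0.0, 1.0)},
}


def toy_scorer(name, config):
    # Hypothetical scorer with rewards in [0, 1]; model 'A' can reach higher scores.
    if name == 'A':
        return 1.0 - abs(config['x'] - 0.3)
    return 0.5 - abs(config['x'] - 0.5)


print(run_search(spaces, toy_scorer, iterations=30))
```

The returned dictionary mirrors the `name`, `config` and `score` keys of the result shown
above, so it can be unpacked the same way.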

# How does BTB perform?

We have a comprehensive [benchmarking framework](https://github.com/MLBazaar/BTB/tree/master/benchmark)
that we use to evaluate the performance of our `Tuners`. For every release, we benchmark
against hundreds of challenges, comparing tuners against each other in terms of number of wins.
The leaderboard from the latest release is presented below:

## Number of Wins on latest Version

| tuner                   | with ties | without ties |
|-------------------------|-----------|--------------|
| `Ax.optimize`           |    220    |           32 |
| `BTB.GCPEiTuner`        |    139    |            2 |
| `BTB.GCPTuner`          |  **252**  |       **90** |
| `BTB.GPEiTuner`         |    208    |           16 |
| `BTB.GPTuner`           |    213    |           24 |
| `BTB.UniformTuner`      |    177    |            1 |
| `HyperOpt.tpe`          |    186    |            6 |
| `SMAC.HB4AC`            |    180    |            4 |
| `SMAC.SMAC4HPO_EI`      |    220    |           31 |
| `SMAC.SMAC4HPO_LCB`     |    205    |           16 |
| `SMAC.SMAC4HPO_PI`      |    221    |           35 |

- Detailed results from which this summary emerged are available [here](https://docs.google.com/spreadsheets/d/15a-pAV_t7CCDvqDyloYmdVNFhiKJFOJ7bbgpmYIpyTs/edit?usp=sharing).
- If you want to compare your own tuner, follow the steps in our benchmarking framework [here](https://github.com/MLBazaar/BTB/tree/master/benchmark).
- If you have a proposal for a tuner that we should include in our benchmarking, get in touch
with us at [dailabmit@gmail.com](mailto:dailabmit@gmail.com).
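
As a rough illustration of how the two columns in the table can differ, the sketch below
counts wins from per-challenge scores. It assumes one plausible reading of the headers: "with
ties" credits every tuner that reaches the best score on a challenge, and "without ties"
credits a tuner only when it is the sole best. The benchmarking framework may define these
differently; `count_wins`, the tuner names and the scores are all hypothetical.

```python
from collections import Counter


def count_wins(results):
    """Count wins per tuner, with shared wins included and excluded."""
    with_ties = Counter()
    without_ties = Counter()
    for scores in results.values():
        best = max(scores.values())
        # Exact equality is used to detect ties in this sketch.
        winners = [tuner for tuner, score in scores.items() if score == best]
        with_ties.update(winners)
        if len(winners) == 1:
            without_ties.update(winners)
    return with_ties, without_ties


# Hypothetical scores: challenge -> tuner -> score (higher is better).
results = {
    'challenge_1': {'tuner_a': 0.9, 'tuner_b': 0.9, 'tuner_c': 0.7},
    'challenge_2': {'tuner_a': 0.8, 'tuner_b': 0.6, 'tuner_c': 0.5},
    'challenge_3': {'tuner_a': 0.4, 'tuner_b': 0.4, 'tuner_c': 0.4},
}

with_ties, without_ties = count_wins(results)
print(dict(with_ties))
print(dict(without_ties))
```

Under this reading, a tuner's "with ties" count is always at least its "without ties" count,
which matches the pattern in the table.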

# More tutorials

1. To just `tune` `hyperparameters` - see our `tuning` tutorial [here](
https://github.com/MLBazaar/BTB/blob/master/tutorials/01_Tuning.ipynb) and
[documentation here](https://mlbazaar.github.io/BTB/tutorials/01_Tuning.html).
2. To see the types of `hyperparameters` we support, see our
[documentation here](https://mlbazaar.github.io/BTB/tutorials/01_Tuning.html#What-is-a-Hyperparameter?).
3. You can read about [our benchmarking framework here](https://mlbazaar.github.io/BTB/benchmark.html#).
4. See our [tutorial on `selection` here](https://github.com/MLBazaar/BTB/blob/master/tutorials/02_Selection.ipynb)
and [documentation here](https://mlbazaar.github.io/BTB/tutorials/02_Selection.html).

For more details about **BTB** and all its possibilities and features, please check the
[project documentation site](https://mlbazaar.github.io/BTB/)!

Also do not forget to have a look at the [notebook tutorials](tutorials).

# Citing BTB

If you use **BTB**, please consider citing the following [paper](
https://arxiv.org/pdf/1905.08942.pdf):

```
@article{smith2019mlbazaar,
  author = {Smith, Micah J. and Sala, Carles and Kanter, James Max and Veeramachaneni, Kalyan},
  title = {The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development},
  journal = {arXiv e-prints},
  year = {2019},
  eid = {arXiv:1905.08942},
  pages = {arXiv:1905.08942},
  archivePrefix = {arXiv},
  eprint = {1905.08942},
}
```

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "baytune",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "<4,>=3.8",
    "maintainer_email": "MIT Data to AI Lab <dailabmit@mit.edu>",
    "keywords": "data science,machine learning,hyperparameters,tuning,classification",
    "author": "",
    "author_email": "MIT Data to AI Lab <dailabmit@mit.edu>",
    "download_url": "https://files.pythonhosted.org/packages/1f/92/62299cdae8539ebf877eb3b8b41295f0394d8ee0078af4c8c1d807d6e793/baytune-0.5.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Bayesian Tuning and Bandits",
    "version": "0.5.0",
    "project_urls": {
        "Issue Tracker": "https://github.com/MLBazaar/BTB/issues",
        "Source Code": "https://github.com/MLBazaar/BTB/",
        "Twitter": "https://twitter.com/lab_dai"
    },
    "split_keywords": [
        "data science",
        "machine learning",
        "hyperparameters",
        "tuning",
        "classification"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2fae3cb956891a7a1dafeb85fce50d84ca9a7917d372e2dcec5dd3587d43f752",
                "md5": "411039059449c927d6cf0fbe3809d834",
                "sha256": "fd226d739cbfb2086901345e58f807ab70cc4f8ed4ecd10578b4b7a470ded0b3"
            },
            "downloads": -1,
            "filename": "baytune-0.5.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "411039059449c927d6cf0fbe3809d834",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4,>=3.8",
            "size": 75174,
            "upload_time": "2023-07-28T16:54:24",
            "upload_time_iso_8601": "2023-07-28T16:54:24.388169Z",
            "url": "https://files.pythonhosted.org/packages/2f/ae/3cb956891a7a1dafeb85fce50d84ca9a7917d372e2dcec5dd3587d43f752/baytune-0.5.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1f9262299cdae8539ebf877eb3b8b41295f0394d8ee0078af4c8c1d807d6e793",
                "md5": "62bc01be0f3f6fb0e0f810937bbb0da2",
                "sha256": "b46e42ad3f18acc59746ed7db604c8ab0a8e2daae42588c5649ab1097717f075"
            },
            "downloads": -1,
            "filename": "baytune-0.5.0.tar.gz",
            "has_sig": false,
            "md5_digest": "62bc01be0f3f6fb0e0f810937bbb0da2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4,>=3.8",
            "size": 58793,
            "upload_time": "2023-07-28T16:54:26",
            "upload_time_iso_8601": "2023-07-28T16:54:26.854286Z",
            "url": "https://files.pythonhosted.org/packages/1f/92/62299cdae8539ebf877eb3b8b41295f0394d8ee0078af4c8c1d807d6e793/baytune-0.5.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-28 16:54:26",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MLBazaar",
    "github_project": "BTB",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "baytune"
}
        