# Bayesian Multi-Armed Bandit
## 🚀 Overview
A Bayesian Multi-Armed Bandit is a statistical model for decision-making under uncertainty. It is a variation of the classic multi-armed bandit problem, in which you have multiple options (each represented as an arm of a bandit, or slot machine) and must choose which to pursue to maximize your rewards. The Bayesian aspect comes from using Bayesian inference to update the probability distribution of each arm's rewards based on prior knowledge and observed outcomes. This allows for a more nuanced and dynamically adaptive decision-making process, as the model continuously updates its beliefs about the performance of each option in real time. It is especially useful when the environment changes or when information is limited.
Among its many use cases, we can highlight:
- Online advertising
- Website optimization
- Personalization
- Clinical trials
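For binary rewards such as clicks, a common Bayesian treatment models each arm's success probability with a Beta posterior: a success increments one parameter, a failure the other, and arms are chosen by Thompson sampling. The sketch below illustrates that idea in plain NumPy; the names and the simulated click-through rates are illustrative, not part of this package's API.
```python
import numpy as np

# Each arm keeps a Beta(alpha, beta) posterior over its success probability,
# starting from a uniform Beta(1, 1) prior.
alpha = np.ones(3)  # 1 + number of successes per arm
beta = np.ones(3)   # 1 + number of failures per arm

def choose_arm():
    # Thompson sampling: draw one sample from each arm's posterior
    # and play the arm whose sample is largest.
    return int(np.argmax(np.random.beta(alpha, beta)))

def update(arm, reward):
    # Bayesian update for a binary reward: 1 increments alpha, 0 increments beta.
    alpha[arm] += reward
    beta[arm] += 1 - reward

true_rates = [0.2, 0.5, 0.8]  # hypothetical click-through rates
for _ in range(1000):
    arm = choose_arm()
    update(arm, np.random.binomial(1, p=true_rates[arm]))

print("Posterior means per arm:", alpha / (alpha + beta))
```
After enough rounds the posterior mass concentrates on the best arm, so exploration tapers off automatically.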
## 💻 Example Usage
To use a Bayesian Multi-Armed Bandit (MAB), first initialize the `BayesianMAB`, providing it with the arm objects; each arm carries an index and a name, which will be useful for identifying the winner later on.
We then run for loops that give positive (1) and negative (0) rewards to each arm; adapt them as needed.
At any moment, we can check whether we already have a winner using the `BayesianMAB.check_for_end` method.
```python
# BayesianArm is used below but was missing from the original import;
# it is assumed to be exported by the same package.
from bayesian_mab import BayesianMAB, BayesianArm, BinaryReward
import numpy as np

binary_reward = BinaryReward()

# One arm per option; the name identifies the winner later on.
bayesian_mab = BayesianMAB(
    arms=[
        BayesianArm(index=0, arm_name="Ad #1"),
        BayesianArm(index=1, arm_name="Ad #2"),
        BayesianArm(index=2, arm_name="Ad #3"),
    ]
)

# Simulate traffic: arm 0 gets only 4 pulls, arms 1 and 2 get 1500 each,
# with success probabilities of 0.9, 0.3, and 0.9 respectively.
for i in range(4):
    binary_reward.update_reward(np.random.binomial(1, p=0.9))
    bayesian_mab.update_arm(chosen_arm=0, reward_agent=binary_reward)

for i in range(1500):
    binary_reward.update_reward(np.random.binomial(1, p=0.3))
    bayesian_mab.update_arm(chosen_arm=1, reward_agent=binary_reward)

for i in range(1500):
    binary_reward.update_reward(np.random.binomial(1, p=0.9))
    bayesian_mab.update_arm(chosen_arm=2, reward_agent=binary_reward)

# Declare a winner once one arm is best with probability >= 0.80.
flg_end, winner_arm = bayesian_mab.check_for_end(winner_prob_threshold=0.80)
print("Is there a winner? {}. Winner: {}".format(flg_end, winner_arm))
```
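Conceptually, a winner check such as `check_for_end(winner_prob_threshold=0.80)` asks whether some arm is the best one with probability at least 0.80. With Beta posteriors, that probability can be computed exactly for two arms (Cook, 2005) or estimated by Monte Carlo for any number of arms. Below is a minimal sketch of the Monte Carlo estimate, with hypothetical posterior parameters; it illustrates the idea and is not the package's internal code.
```python
import numpy as np

# Hypothetical Beta posterior parameters (1 + successes, 1 + failures)
# for three arms after some traffic.
alphas = np.array([4.0, 451.0, 1352.0])
betas = np.array([2.0, 1051.0, 150.0])

# Draw many joint samples from the posteriors and count how often
# each arm has the highest sampled success rate.
samples = np.random.beta(alphas, betas, size=(100_000, 3))
p_best = np.bincount(np.argmax(samples, axis=1), minlength=3) / samples.shape[0]

winner = int(np.argmax(p_best))
if p_best[winner] >= 0.80:
    print(f"Winner: arm {winner} with P(best) = {p_best[winner]:.3f}")
else:
    print("No winner yet; keep collecting data.")
```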
## Acknowledgments and References
- Cook, J., 2005. **Exact calculation of beta inequalities**. Houston: University of Texas, MD Anderson Cancer Center. Available [here](https://www.johndcook.com/UTMDABTR-005-05.pdf)
- Slivkins, A., 2019. **Introduction to multi-armed bandits**. Foundations and Trends® in Machine Learning, 12(1-2), pp.1-286. Available [here](https://www.nowpublishers.com/article/Details/MAL-068)
- White, J., 2013. **Bandit algorithms for website optimization**. O'Reilly Media.
- Bruce, P., Bruce, A. and Gedeck, P., 2020. **Practical statistics for data scientists: 50+ essential concepts using R and Python**. O'Reilly Media.
- Thanks to Vincenzo Lavorini for [this](https://towardsdatascience.com/bayesian-a-b-testing-with-python-the-easy-guide-d638f89e0b8a) Towards Data Science blog post.