Automatic Stopping for Batch-mode Experimentation
================
<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->
Created with nbdev by Zoltan Puha
## Install
`python -m pip install git+https://github.com/puhazoli/asbe`
## How to use
ASBE builds on the modular, workflow-style design of modAL, where an AL
algorithm is assembled from pieces. You need the following ingredients:

- an ITE estimator (e.g. `BaseITEEstimator()`),
- an acquisition function,
- an assignment function,
- and, optionally, a stopping criterion for your model.

If all of the above are defined, you can construct an `ASLearner`,
which will guide you through the active learning process.
``` python
from asbe.base import *
from asbe.models import *
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
import numpy as np
```
``` python
N = 1000
# Two covariates, a random binary treatment, and a binary outcome whose
# probability depends on the second covariate and the treatment
X = np.random.normal(size=N * 2).reshape((-1, 2))
t = np.random.binomial(n=1, p=0.5, size=N)
y = np.random.binomial(n=1, p=1 / (1 + np.exp(X[:, 1] * 2 + t * 3)))
# Difference in outcome probability with and without the assigned treatment
# (used later as the evaluation target `ite_test`)
ite = 1 / (1 + np.exp(X[:, 1] * 2 + t * 3)) - 1 / (1 + np.exp(X[:, 1] * 2))
a = BaseITEEstimator(LogisticRegression(solver="lbfgs"))
a.fit(X_training=X, t_training=t, y_training=y)
```
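As a quick sanity check, we can compare the fitted estimator's predictions with
the simulated treatment effects. This assumes `BaseITEEstimator` exposes a
`predict(X)` method analogous to the `BaseActiveLearner.predict` call used
further below, returning one ITE estimate per row; check the estimator's API if
this differs.

``` python
ite_hat = a.predict(X)  # assumed: one ITE estimate per row of X
print(np.corrcoef(np.ravel(ite_hat), ite)[0, 1])  # correlation with the simulated effects
```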
## Learning actively
Similarly, you can create a
[`BaseActiveLearner`](https://puhazoli.github.io/asbe/base.html#baseactivelearner),
for which you initialize the dataset and set the preferred modeling
options. Let’s see how it works:

- we will use XBART to model the treatment effect with a one-model approach,
- we will use expected model change maximization (EMCM) as the query strategy,
- and for that we need an approximate model; we will use `SGDRegressor`
  (see the conceptual sketch after this list).
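EMCM itself is not spelled out above, so here is a conceptual sketch of the
idea under squared loss - not the asbe implementation: score each pool point by
how much it would change the surrogate model's parameters, averaged over draws
of the unknown ITE (which is why the estimator needs to return all predicted
treatment effects, as discussed below). The function name and inputs here are
illustrative only.

``` python
import numpy as np
from sklearn.linear_model import SGDRegressor

def emcm_scores(surrogate, X_pool, ite_draws):
    """Rank pool points by the expected gradient norm of the surrogate's
    squared loss, averaged over sampled ITE values (the labels we do not
    yet know)."""
    preds = surrogate.predict(X_pool)              # (n_pool,) surrogate ITE estimates
    residuals = ite_draws - preds                  # (n_draws, n_pool)
    # For squared loss the parameter gradient at x is (prediction - y) * x,
    # so its norm is |residual| * ||x||.
    grad_norm = np.abs(residuals) * np.linalg.norm(X_pool, axis=1)
    return grad_norm.mean(axis=0)                  # (n_pool,) acquisition scores

# Usage sketch: fit the surrogate on current ITE estimates, then score the pool.
# surrogate = SGDRegressor().fit(X_train, ite_hat_train)
# scores = emcm_scores(surrogate, X_pool, ite_draws)
# query_idx = np.argsort(scores)[-10:]             # the ten most informative units
```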
You can call `.fit()` on the
[`BaseActiveLearner`](https://puhazoli.github.io/asbe/base.html#baseactivelearner),
which by default fits the training data supplied. To select new
units from the pool, you just need to call the `query()` method, which
returns the selected `X` and the `query_idx` of these units. The
`query()` method takes a `no_query` argument, which specifies how many
units are queried at once; for sequential AL, we can set this to 1.
Additionally, some query strategies require different treatment effect
estimates - EMCM needs uncertainty around the ITE. We can explicitly
tell the
[`BaseITEEstimator`](https://puhazoli.github.io/asbe/base.html#baseiteestimator)
to return all the predicted treatment effects. Then, we can teach the
newly acquired units to the learner by calling the `teach` function.
The `score` function provides an evaluation of the given learner.
``` python
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from copy import deepcopy
import pandas as pd
```
``` python
X_train, X_test, t_train, t_test, y_train, y_test, ite_train, ite_test = train_test_split(
    X, t, y, ite, test_size=0.8, random_state=1005)
ds = {"X_training": X_train,
      "y_training": y_train,
      "t_training": t_train,
      "ite_training": np.zeros_like(y_train),
      "X_pool": deepcopy(X_test),
      "y_pool": deepcopy(y_test),
      "t_pool": deepcopy(t_test),
      "ite_pool": np.zeros_like(y_test),
      "X_test": X_test,
      "y_test": y_test,
      "t_test": t_test,
      "ite_test": ite_test
      }
asl = BaseActiveLearner(estimator=BaseITEEstimator(model=RandomForestClassifier(),
                                                   two_model=False),
                        acquisition_function=BaseAcquisitionFunction(),
                        assignment_function=BaseAssignmentFunction(),
                        stopping_function=None,
                        dataset=ds)
asl.fit()
X_new, query_idx = asl.query(no_query=10)
asl.teach(query_idx)
preds = asl.predict(asl.dataset["X_test"])
asl.score()
```
0.34842037641629464
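Since the text above mentions sequential AL with one unit per query, here is a
minimal sketch of that loop, assuming `query`, `teach`, and `score` behave as
in the batch example; the fixed budget of 20 steps is just a hypothetical
stand-in for a real stopping function.

``` python
scores = []
for step in range(20):                         # fixed budget instead of a stopping rule
    X_new, query_idx = asl.query(no_query=1)   # pick a single unit from the pool
    asl.teach(query_idx)                       # add it to the training data, as above
    scores.append(asl.score())                 # track performance after each step
```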
``` python
asl = BaseActiveLearner(estimator=BaseITEEstimator(model=RandomForestClassifier(),
                                                   two_model=True),
                        acquisition_function=[BaseAcquisitionFunction(),
                                              BaseAcquisitionFunction(no_query=20)],
                        assignment_function=BaseAssignmentFunction(),
                        stopping_function=None,
                        dataset=ds,
                        al_steps=3)
resd = pd.DataFrame(asl.simulate(metric="decision"))
```
``` python
resd.plot()
```
<AxesSubplot:>
![](index_files/figure-commonmark/cell-7-output-2.png)
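The plot compares the chosen metric across the AL steps. Assuming `simulate`
returns one column per acquisition function, you can also summarize the run
numerically:

``` python
resd.mean()  # average metric per acquisition function over the AL steps
```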