# pyopenbt
This Python package is the Python interface for Dr. Matthew Pratola's [OpenBT project](https://bitbucket.org/mpratola/openbt/wiki/Home). Currently, its only module is openbt, which contains the OPENBT class. This class allows the user to create fit objects in a scikit-learn style.
[![Build](https://github.com/cavan33/openbt_py/actions/workflows/python-package.yml/badge.svg)](https://github.com/cavan33/openbt_py/actions/workflows/python-package.yml)
[![PyPI version](https://badge.fury.io/py/pyopenbt.svg)](https://badge.fury.io/py/pyopenbt)
[![Anaconda-Server Badge](https://anaconda.org/conda-forge/pyopenbt/badges/version.svg)](https://anaconda.org/conda-forge/pyopenbt)
### About:
OpenBT is a flexible and extensible C++ framework for implementing Bayesian regression tree models. Currently a number of models and inference tools are available for use in the released code with additional models/tools under development. The code makes use of MPI for parallel computing. Apart from this package, an R interface is provided via the ROpenbt package to demonstrate use of the software.
### How to utilize this package (and its module and class):
1. Install the package from the command line by typing:
`$ python -m pip install pyopenbt`.
2. In Python3 (or a Python script), import the OPENBT class from the openbt module by typing:
`from pyopenbt.openbt import OPENBT`.
This gives Python access to the OPENBT class. Typing
`from pyopenbt.openbt import *`
or
`from pyopenbt import openbt`
would also work, but with the former, the obt_load() function is imported unnecessarily (unless you wish to use that function, of course). With the latter, the class would be referred to as `openbt.OPENBT`, not simply OPENBT.
3. To use the OPENBT class in Python 3 to conduct and interpret fits, create an estimator object such as
`m = OPENBT(model = "bart", ...)`.
The estimator object is an instance of the class. Here's an example of calling one of the class's methods:
`fitp = m.predict(preds)`
4. See the example scripts in the "examples" folder of this package's repository, which show the OPENBT class used on data.
### Example:
To start, let's create a test function. A popular one is the [Branin](https://www.sfu.ca/~ssurjano/branin.html) function:
```
# Test Branin function, rescaled
import math
import numpy as np

def braninsc(xx):
    x1 = xx[0]
    x2 = xx[1]

    # Map the unit square onto the usual Branin input domain
    x1bar = 15 * x1 - 5
    x2bar = 15 * x2

    term1 = x2bar - 5.1*x1bar**2/(4*math.pi**2) + 5*x1bar/math.pi - 6
    term2 = (10 - 10/(8*math.pi)) * math.cos(x1bar)

    y = (term1**2 + term2 - 44.81) / 51.95
    return y


# Simulate Branin data for testing
np.random.seed(99)
n = 500
p = 2
x = np.random.uniform(size=n*p).reshape(n,p)
y = np.zeros(n)
for i in range(n):
    y[i] = braninsc(x[i,])
```
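As a quick sanity check on the test function, the sketch below evaluates it at a known minimizer. It relies on two facts about the standard (unscaled) Branin function: one of its global minima sits at (π, 2.275), and the rescaled version above maps the unit square onto the usual domain via `x1bar = 15*x1 - 5` and `x2bar = 15*x2`. The function is redefined here using only the standard library so the snippet runs on its own; the names `x_min` and `expected` are illustrative.

```python
import math

def braninsc(xx):
    """Rescaled Branin function on the unit square, as defined above."""
    x1bar = 15 * xx[0] - 5
    x2bar = 15 * xx[1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * math.pi**2) + 5 * x1bar / math.pi - 6
    term2 = (10 - 10 / (8 * math.pi)) * math.cos(x1bar)
    return (term1**2 + term2 - 44.81) / 51.95

# Map the unscaled minimizer (pi, 2.275) back to the unit square
x_min = [(math.pi + 5) / 15, 2.275 / 15]

# At this point term1 vanishes and cos(x1bar) = -1, so the value is known in closed form
expected = -(10 - 10 / (8 * math.pi) + 44.81) / 51.95

assert math.isclose(braninsc(x_min), expected, abs_tol=1e-9)
print(braninsc(x_min))  # roughly -1.0474
```

If the assertion passes, the scaling constants in the function were transcribed consistently.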
Note that x and y are numpy arrays; this is the intended format. Now we can load the openbt module and fit a BART model. Here we set the model type as model="bart", which fits a homoscedastic BART model. The number of MPI threads to use is specified as tc=4. For a list of all optional parameters, see `m.__dict__` (after creating m) or `help(OPENBT)`.
```
from pyopenbt.openbt import OPENBT, obt_load
m = OPENBT(model = "bart", tc = 4, modelname = "branin")
fit = m.fit(x, y)
```
Next we can construct predictions and make a simple plot comparing our predictions to the training data. Here, we are calculating the in-sample predictions since we passed the same x array to the predict() function.
```
# Calculate in-sample predictions
fitp = m.predict(x, tc = 4)
# Make a simple plot
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,9)); ax = fig.add_subplot(111)
ax.plot(y, fitp['mmean'], 'ro')
ax.set_xlabel("Observed"); ax.set_ylabel("Fitted")
ax.axline([0, 0], [1, 1])
```
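The plot gives a visual check; a single-number summary such as the root-mean-square error (RMSE) can complement it. The sketch below is a minimal, standard-library helper and is not part of pyopenbt. It assumes that `fitp['mmean']` is a one-dimensional array with the same length as `y`; the `rmse` function and the stand-in numbers are purely illustrative.

```python
import math

def rmse(observed, fitted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(fitted):
        raise ValueError("observed and fitted must have the same length")
    squared_errors = [(o - f) ** 2 for o, f in zip(observed, fitted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Stand-in values; with the fit above one would pass y and fitp['mmean'] instead
observed = [0.10, -0.35, 0.80, 0.42]
fitted = [0.12, -0.30, 0.75, 0.40]
print(round(rmse(observed, fitted), 4))
```

Because these are in-sample predictions, a small RMSE here says little about out-of-sample accuracy; the same helper could be applied to predictions on held-out data.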
To save the model, use OPENBT's obt_save() function. Similarly, load the model using obt_load(). Because the posterior can be large in sample-based models such as these, the fitted model is saved in a compressed file format with the extension .obt. Additionally, the estimator object can be saved and loaded (see below).
```
#--------------------------------------------------------------------------------------------
# Save fitted MODEL object (not the estimator object, m) as test.obt in the working directory
m.obt_save(fit, "test", est = False)
# Load fitted model object (AKA fit object) to a new object
fit2 = obt_load("test", est = False)
# We can also save/load the ESTIMATOR object by specifying est = True in obt_save()/obt_load().
# The estimator object has all our settings and properties, but not fit results.
# This is similar to scikit-learn saving/loading its estimators.
m.obt_save("test_fit_est", est = True)
m2 = obt_load("test_fit_est", est = True)
#--------------------------------------------------------------------------------------------
```
The standard variable activity information, calculated as the proportion of splitting rules involving each variable, can be computed using OPENBT's vartivity() function.
```
# Calculate variable activity information
fitv = m.vartivity()
print(fitv['mvdraws'])
```
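To make the definition concrete, here is a toy calculation of "the proportion of splitting rules involving each variable". It is only an illustration of the idea, not how OpenBT computes its output: the `split_proportions` helper and the `split_vars` list (a made-up record of which variable each splitting rule uses) are hypothetical.

```python
from collections import Counter

def split_proportions(split_vars, p):
    """Proportion of splitting rules that use each of p variables (indexed 0..p-1)."""
    if not split_vars:
        return [0.0] * p
    counts = Counter(split_vars)
    total = len(split_vars)
    return [counts[j] / total for j in range(p)]

# Hypothetical record: variable index used by each of 8 splitting rules
split_vars = [0, 1, 0, 0, 1, 0, 0, 1]
print(split_proportions(split_vars, p=2))  # [0.625, 0.375]
```

A variable that appears in many splitting rules gets a high proportion, which is why this quantity is read as a rough measure of how active the variable is in the fitted trees.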
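A more accurate alternative is to calculate the Sobol indices, which measure how much of the variance in the model's predictions is attributable to each input variable and to interactions between variables.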
```
# Calculate Sobol indices
fits = m.sobol(cmdopt = 'MPI', tc = 4)
print(fits['msi'])
print(fits['mtsi'])
print(fits['msij'])
```
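For readers unfamiliar with Sobol indices, the sketch below estimates first-order indices for a simple function by Monte Carlo, using the pick-freeze estimator of Saltelli et al. (2010). It is meant only to show what a first-order index measures; it makes no claim about how `m.sobol()` works internally, and the function name `first_order_sobol` and its parameters are illustrative.

```python
import random

def first_order_sobol(f, p, n=20000, seed=0):
    """Estimate first-order Sobol indices of f on [0, 1]^p by pick-freeze sampling."""
    rng = random.Random(seed)
    # Two independent sample matrices with n rows and p columns
    A = [[rng.random() for _ in range(p)] for _ in range(n)]
    B = [[rng.random() for _ in range(p)] for _ in range(n)]
    fA = [f(a) for a in A]
    fB = [f(b) for b in B]

    # Total output variance, estimated from both sample sets
    all_f = fA + fB
    mean = sum(all_f) / len(all_f)
    var = sum((v - mean) ** 2 for v in all_f) / (len(all_f) - 1)

    indices = []
    for i in range(p):
        # Rows of A with column i replaced by the matching entry of B
        fABi = [f(a[:i] + [b[i]] + a[i + 1:]) for a, b in zip(A, B)]
        # Estimates Var(E[Y | X_i]): f(B) and f(AB_i) share only input i
        vi = sum(fb * (fab - fa) for fa, fb, fab in zip(fA, fB, fABi)) / n
        indices.append(vi / var)
    return indices

# Additive test function: the exact first-order indices are 0.2 and 0.8
s = first_order_sobol(lambda x: x[0] + 2 * x[1], p=2)
print(s)
```

For this additive function the two indices sum to one because there is no interaction; for a function such as Branin the first-order indices sum to less than one, and the remainder is attributed to interactions.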
Again, for more examples of using OpenBT, explore the examples folder in the [GitHub repo](https://github.com/cavan33/openbt_py).
### See Also:
[Github "Homepage" for this package](https://github.com/cavan33/openbt_py)
PyPI [Package Home](https://pypi.org/project/pyopenbt/)
### Contributions
All contributions are welcome. You can help improve this project by reporting issues and bugs,
or by forking the repo and creating a pull request.
------------------------------------------------------------------------------
### License
The package is licensed under the BSD 3-Clause License. A copy of the
[license](LICENSE) can be found along with the code.
Raw data
{
"_id": null,
"home_page": "",
"name": "pyopenbt",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "bayesian additive regession trees",
"author": "",
"author_email": "Clark Van Lieshout <clarkvan33@gmail.com>, \"J. Derek Tucker\" <jdtuck@sandia.gov>",
"download_url": "https://files.pythonhosted.org/packages/2e/f0/eebc53c93d361d189143af37350604cfbe6bee075a0d89b22717e1530198/pyopenbt-0.0.8.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD 3-Clause",
"summary": "python interface to openbt",
"version": "0.0.8",
"project_urls": {
"homepage": "https://github.com/cavan33/openbt_py",
"repository": "https://github.com/cavan33/openbt_py"
},
"split_keywords": [
"bayesian",
"additive",
"regession",
"trees"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "03c36782ae377641ef7f2782aa283e0830f0ea339d3785b52d543de62df1c927",
"md5": "9afd7a4cbc70ffd1ec5101d686211fe7",
"sha256": "0075fbcf3ebaa9d49a7f1195ee19fcadbdac82c50855f7a1c15aa73680411d35"
},
"downloads": -1,
"filename": "pyopenbt-0.0.8-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9afd7a4cbc70ffd1ec5101d686211fe7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 24373,
"upload_time": "2024-01-03T17:05:36",
"upload_time_iso_8601": "2024-01-03T17:05:36.774078Z",
"url": "https://files.pythonhosted.org/packages/03/c3/6782ae377641ef7f2782aa283e0830f0ea339d3785b52d543de62df1c927/pyopenbt-0.0.8-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2ef0eebc53c93d361d189143af37350604cfbe6bee075a0d89b22717e1530198",
"md5": "fbf6a317280e29356cd581172bb09a34",
"sha256": "111caf73073c397c531b68b2ded50c5f4a2b751b14a3dd4c4c546a9f54318362"
},
"downloads": -1,
"filename": "pyopenbt-0.0.8.tar.gz",
"has_sig": false,
"md5_digest": "fbf6a317280e29356cd581172bb09a34",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 26899,
"upload_time": "2024-01-03T17:05:37",
"upload_time_iso_8601": "2024-01-03T17:05:37.891303Z",
"url": "https://files.pythonhosted.org/packages/2e/f0/eebc53c93d361d189143af37350604cfbe6bee075a0d89b22717e1530198/pyopenbt-0.0.8.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-01-03 17:05:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cavan33",
"github_project": "openbt_py",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "joblib",
"specs": []
},
{
"name": "matplotlib",
"specs": []
},
{
"name": "numpy",
"specs": []
},
{
"name": "pandas",
"specs": []
},
{
"name": "scikit_learn",
"specs": []
},
{
"name": "scipy",
"specs": []
}
],
"lcname": "pyopenbt"
}