pyopenbt


| Field | Value |
| --- | --- |
| Name | pyopenbt |
| Version | 0.0.8 |
| download | |
| home_page | |
| Summary | python interface to openbt |
| upload_time | 2024-01-03 17:05:37 |
| maintainer | |
| docs_url | None |
| author | |
| requires_python | >=3.6 |
| license | BSD 3-Clause |
| keywords | bayesian additive regession trees |
| VCS | |
| bugtrack_url | |
| requirements | joblib matplotlib numpy pandas scikit_learn scipy |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# pyopenbt
This Python package is the Python interface for Dr. Matthew Pratola's [OpenBT project](https://bitbucket.org/mpratola/openbt/wiki/Home). Currently, its only module is openbt, which contains the OPENBT class. This class allows the user to create fit objects in a scikit-learn style.

[![Build](https://github.com/cavan33/openbt_py/actions/workflows/python-package.yml/badge.svg)](https://github.com/cavan33/openbt_py/actions/workflows/python-package.yml)
[![PyPI version](https://badge.fury.io/py/pyopenbt.svg)](https://badge.fury.io/py/pyopenbt)
[![Anaconda-Server Badge](https://anaconda.org/conda-forge/pyopenbt/badges/version.svg)](https://anaconda.org/conda-forge/pyopenbt)

### About:  
OpenBT is a flexible and extensible C++ framework for implementing Bayesian regression tree models. Currently a number of models and inference tools are available for use in the released code with additional models/tools under development. The code makes use of MPI for parallel computing. Apart from this package, an R interface is provided via the ROpenbt package to demonstrate use of the software.

### How to utilize this package (and its module and class):  
1. Install the package from the command line by typing:  
`$ python -m pip install pyopenbt`.   
2. In Python 3 (or a Python script), import the OPENBT class from the openbt module by typing:  
`from pyopenbt.openbt import OPENBT`.  
This gives Python access to the OPENBT class. Typing  
`from pyopenbt.openbt import *`  
or  
`from pyopenbt import openbt`  
would also work. The former also imports the obt_load() function, which is unnecessary unless you wish to use it. With the latter, the class is referred to as `openbt.OPENBT`, not simply OPENBT.  
3. To use the OPENBT class in Python 3 to conduct and interpret fits, create an estimator object such as  
`m = OPENBT(model = "bart", ...)`.  
The estimator object is an instance of the class. Here's an example of calling one of the class's methods:  
`fitp = m.predict(preds)`  
4. See the example scripts in the "examples" folder, which show the OPENBT class applied to data.

### Example:  
To start, let's create a test function. A popular one is the [Branin](https://www.sfu.ca/~ssurjano/branin.html) function:
```
# Test Branin function, rescaled
import math

def braninsc(xx):
    """Rescaled Branin function for a point xx = (x1, x2) in [0, 1]^2."""
    x1, x2 = xx[0], xx[1]

    # Map the unit square onto the usual Branin domain [-5, 10] x [0, 15]
    x1bar = 15 * x1 - 5
    x2bar = 15 * x2

    term1 = x2bar - 5.1*x1bar**2/(4*math.pi**2) + 5*x1bar/math.pi - 6
    term2 = (10 - 10/(8*math.pi)) * math.cos(x1bar)

    return (term1**2 + term2 - 44.81) / 51.95


# Simulate branin data for testing
import numpy as np
np.random.seed(99)
n = 500
p = 2
x = np.random.uniform(size=n*p).reshape(n,p)
# Evaluate the test function on each row of x
y = np.array([braninsc(xi) for xi in x])
```
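For larger simulated designs, the loop over rows can be replaced by array operations. The following is a minimal sketch, not part of pyopenbt: the name `braninsc_vec` is hypothetical, and the demo data use a different random generator than the snippet above. As a sanity check, it evaluates the function at the rescaled location of one known Branin minimizer, (π, 2.275) in the original coordinates, where the rescaled value should be roughly -1.047.

```python
import numpy as np

def braninsc_vec(x):
    """Rescaled Branin function evaluated row-wise on an (n, 2) array in [0, 1]^2.

    Hypothetical helper for illustration; not part of pyopenbt.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    # Map the unit square onto the usual Branin domain [-5, 10] x [0, 15]
    x1bar = 15 * x[:, 0] - 5
    x2bar = 15 * x[:, 1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * np.pi**2) + 5 * x1bar / np.pi - 6
    term2 = (10 - 10 / (8 * np.pi)) * np.cos(x1bar)
    return (term1**2 + term2 - 44.81) / 51.95

# Demo design: 500 uniform points on the unit square
rng = np.random.default_rng(99)
x_demo = rng.uniform(size=(500, 2))
y_demo = braninsc_vec(x_demo)

# One known Branin minimizer, mapped back to the unit square
x_min = np.array([[(np.pi + 5) / 15, 2.275 / 15]])
y_min = braninsc_vec(x_min)[0]
print(y_demo.shape, round(y_min, 4))
```

The vectorized form returns the same kind of one-dimensional response array as the loop, so it can be passed to `fit()` in the same way.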
Note that x and y are NumPy arrays, which is the intended format. Now we can load the openbt module and fit a BART model. Here we set the model type as model="bart", which fits a homoscedastic BART model. The number of MPI threads to use is specified as tc=4. For a list of all optional parameters, see `m.__dict__` (after creating m) or `help(OPENBT)`.

```
from pyopenbt.openbt import OPENBT, obt_load
m = OPENBT(model = "bart", tc = 4, modelname = "branin")
fit = m.fit(x, y)
```
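Since NumPy arrays are the intended input format, data held in lists or other array-like containers can be coerced before fitting. The helper below is a hypothetical sketch (`as_design_arrays` is not a pyopenbt function), and the specific checks it performs are our assumptions about what is sensible to verify before calling `fit()`.

```python
import numpy as np

def as_design_arrays(x, y):
    """Hypothetical helper (not part of pyopenbt): coerce inputs to float NumPy arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    if x.ndim == 1:
        x = x.reshape(-1, 1)  # a single predictor becomes one column
    if x.ndim != 2:
        raise ValueError("x must be 2-D with shape (n, p)")
    if x.shape[0] != y.shape[0]:
        raise ValueError("x and y must have the same number of rows")
    if not (np.isfinite(x).all() and np.isfinite(y).all()):
        raise ValueError("x and y must not contain NaN or infinite values")
    return x, y

# Example: nested lists become an (n, p) design matrix and a length-n response
x_list = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
y_list = [1.0, 2.0, 3.0]
x_arr, y_arr = as_design_arrays(x_list, y_list)
print(x_arr.shape, y_arr.shape)
```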
Next we can construct predictions and make a simple plot comparing our predictions to the training data. Here, we are calculating the in-sample predictions since we passed the same x array to the predict() function.
```
# Calculate in-sample predictions
fitp = m.predict(x, tc = 4)

# Make a simple plot
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(16,9)); ax = fig.add_subplot(111)
ax.plot(y, fitp['mmean'], 'ro')
ax.set_xlabel("Observed"); ax.set_ylabel("Fitted")
ax.axline([0, 0], [1, 1])
```
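Alongside the plot, a numeric summary of the in-sample fit can be useful. The sketch below computes the RMSE and R² between observed and fitted values. It assumes that `fitp['mmean']` is a one-dimensional array of posterior mean predictions, as its use in the plot suggests; here a synthetic stand-in replaces it so the snippet runs without a fitted model. The function name `fit_summary` is hypothetical.

```python
import numpy as np

def fit_summary(y_obs, y_fit):
    """Return RMSE and R^2 between observed and fitted values (illustrative helper)."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_fit = np.asarray(y_fit, dtype=float)
    resid = y_obs - y_fit
    rmse = float(np.sqrt(np.mean(resid**2)))
    ss_tot = np.sum((y_obs - y_obs.mean())**2)
    r2 = float(1.0 - np.sum(resid**2) / ss_tot)
    return {"rmse": rmse, "r2": r2}

# Synthetic stand-in for (y, fitp['mmean']); replace with the real arrays after fitting
rng = np.random.default_rng(0)
y_obs = rng.normal(size=200)
y_fit = y_obs + rng.normal(scale=0.1, size=200)
summary = fit_summary(y_obs, y_fit)
print(summary)
```

With a real fit, the call would be `fit_summary(y, fitp['mmean'])`. Because these are in-sample predictions, the numbers describe how closely the model reproduces the training data, not its out-of-sample accuracy.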
To save the model, use OPENBT's obt_save() function. Similarly, load the model using obt_load(). Because the posterior can be large in sample-based models such as these, the fitted model is saved in a compressed file format with the extension .obt. Additionally, the estimator object can be saved and loaded (see below).
```
#--------------------------------------------------------------------------------------------
# Save fitted MODEL object (not the estimator object, m) as test.obt in the working directory
m.obt_save(fit, "test", est = False)
# Load fitted model object (AKA fit object) to a new object
fit2 = obt_load("test", est = False)

# We can also save/load the fit ESTIMATOR object by specifying est = True in obt_save()/load().
# The estimator object has all our settings and properties, but not fit results. 
# This is similar to scikit-learn saving/loading its estimators.
m.obt_save("test_fit_est", est = True)
m2 = obt_load("test_fit_est", est = True)
#--------------------------------------------------------------------------------------------
```
The standard variable activity information, calculated as the proportion of splitting rules involving each variable, can be computed using OPENBT's vartivity() function.
```
# Calculate variable activity information
fitv = m.vartivity()
print(fitv['mvdraws'])
```
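To make "proportion of splitting rules involving each variable" concrete, here is a small sketch with made-up split counts. It does not use pyopenbt internals: the count array and the names `vdraws` and `vdraws_sd` are hypothetical, and reading `mvdraws` as the posterior mean of the per-draw proportions is our assumption from the key name.

```python
import numpy as np

# Hypothetical split counts: rows are posterior draws, columns are variables.
split_counts = np.array([
    [12, 8],
    [10, 10],
    [15, 5],
    [9, 11],
])

# Proportion of splitting rules that use each variable, per draw.
vdraws = split_counts / split_counts.sum(axis=1, keepdims=True)

# Posterior mean and spread of the activity proportions across draws.
mvdraws = vdraws.mean(axis=0)
vdraws_sd = vdraws.std(axis=0, ddof=1)
print(mvdraws)  # [0.575 0.425]
```

Each row of `vdraws` sums to 1, so the proportions only describe how often a variable is split on relative to the others, not how much of the response variance it explains.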
A more accurate alternative is to calculate the Sobol indices.
```
# Calculate Sobol indices
fits = m.sobol(cmdopt = 'MPI', tc = 4)
print(fits['msi'])
print(fits['mtsi'])
print(fits['msij'])
```
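Because the Branin test function is cheap to evaluate, its Sobol indices can also be estimated directly by Monte Carlo and used as a rough reference for the model-based values. The sketch below is independent of pyopenbt: it uses a pick-freeze scheme with the Saltelli et al. (2010) estimator for first-order indices and the Jansen estimator for total indices. The function names are hypothetical, and interpreting `msi` and `mtsi` as first-order and total indices is our assumption from the key names.

```python
import numpy as np

def braninsc_vec(x):
    """Rescaled Branin function, row-wise on an (n, 2) array in [0, 1]^2."""
    x1bar = 15 * x[:, 0] - 5
    x2bar = 15 * x[:, 1]
    term1 = x2bar - 5.1 * x1bar**2 / (4 * np.pi**2) + 5 * x1bar / np.pi - 6
    term2 = (10 - 10 / (8 * np.pi)) * np.cos(x1bar)
    return (term1**2 + term2 - 44.81) / 51.95

def sobol_pick_freeze(f, p, n, rng):
    """Monte Carlo first-order and total Sobol indices for f on the unit cube."""
    a = rng.uniform(size=(n, p))
    b = rng.uniform(size=(n, p))
    fa, fb = f(a), f(b)
    var_y = np.var(np.concatenate([fa, fb]), ddof=1)
    first = np.empty(p)
    total = np.empty(p)
    for i in range(p):
        ab = a.copy()
        ab[:, i] = b[:, i]  # matrix A with column i taken from B
        fab = f(ab)
        first[i] = np.mean(fb * (fab - fa)) / var_y        # first-order (Saltelli et al.)
        total[i] = 0.5 * np.mean((fa - fab) ** 2) / var_y  # total effect (Jansen)
    return first, total

rng = np.random.default_rng(2024)
first, total = sobol_pick_freeze(braninsc_vec, p=2, n=50_000, rng=rng)
print("first-order:", first)
print("total:      ", total)
```

These are estimates for the true function, whereas `m.sobol()` reports indices for the fitted model, so the two should be close only when the fit is good.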
Again, for more examples of using OpenBT, explore the examples folder in the [GitHub repo](https://github.com/cavan33/openbt_py).

### See Also:  
[GitHub "Homepage" for this package](https://github.com/cavan33/openbt_py)  
PyPI [Package Home](https://pypi.org/project/pyopenbt/)  

### Contributions
All contributions are welcome. You can help improve this project by reporting issues and bugs,
or by forking the repo and creating a pull request.

------------------------------------------------------------------------------

### License
The package is licensed under the BSD 3-Clause License. A copy of the
[license](LICENSE) can be found along with the code.

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "pyopenbt",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "bayesian additive regession trees",
    "author": "",
    "author_email": "Clark Van Lieshout <clarkvan33@gmail.com>, \"J. Derek Tucker\" <jdtuck@sandia.gov>",
    "download_url": "https://files.pythonhosted.org/packages/2e/f0/eebc53c93d361d189143af37350604cfbe6bee075a0d89b22717e1530198/pyopenbt-0.0.8.tar.gz",
    "platform": null,
    "description": "# pyopenbt\nThis Python package is the Python interface for Dr. Matthew Pratola's [OpenBT project](https://bitbucket.org/mpratola/openbt/wiki/Home). Currently, its only module is openbt, which contains the OPENBT class. This class allows the user to create fit objects in a scikit-learn style.\n\n[![Build](https://github.com/cavan33/openbt_py/actions/workflows/python-package.yml/badge.svg)](https://github.com/cavan33/openbt_py/actions/workflows/python-package.yml)\n[![PyPI version](https://badge.fury.io/py/pyopenbt.svg)](https://badge.fury.io/py/pyopenbt)\n[![Anaconda-Server Badge](https://anaconda.org/conda-forge/pyopenbt/badges/version.svg)](https://anaconda.org/conda-forge/pyopenbt)\n\n### About:  \nOpenBT is a flexible and extensible C++ framework for implementing Bayesian regression tree models. Currently a number of models and inference tools are available for use in the released code with additional models/tools under development. The code makes use of MPI for parallel computing. Apart from this package, an R interface is provided via the ROpenbt package to demonstrate use of the software.\n\n### How to utilize this package (and its module and class):  \n1. Install the package from the command line by typing:  \n`$ python -m pip install pyopenbt`.   \n2. In Python3 (or a Python script), import the OPENBT class from the openbt module by typing:  \n`from pyopenbt.openbt import OPENBT`.  \nThis gives Python access to the OPENBT class. Typing  \n`from pyopenbt.openbt import *`  \nor  \n`from pyopenbt import openbt`  \nwould also work, but for the former, the obt_load() function is loaded unnecesarily (unless you wish to use that function, of course). For the latter, the class would be referred to as `pyopenbt.OPENBT`, not simply OPENBT.  \n3. To utilize the OPENBT class/functions in Python 3 to conduct and interpret fits: create a fit object such as  \n`m = OPENBT(model = \"bart\", ...)`.  \nThe fit object is an instance of the class. 
Here's an example of running a functions from the class:  \n`fitp = m.predict(preds)`\n4. See example scripts (in the \"examples\" folder), showing the usage of the OPENBT class on data, to this package. \n\n### Example:  \nTo start, let's create a test function. A popular one is the [Branin](https://www.sfu.ca/~ssurjano/branin.html) function:\n```\n# Test Branin function, rescaled\ndef braninsc (xx):\n    x1 = xx[0]\n    x2 = xx[1]\n    \n    x1bar = 15 * x1 - 5\n    x2bar = 15 * x2\n    \n    import math\n    term1 = x2bar - 5.1*x1bar**2/(4*math.pi**2) + 5*x1bar/math.pi - 6\n    term2 = (10 - 10/(8*math.pi)) * math.cos(x1bar)\n    \n    y = (term1**2 + term2 - 44.81) / 51.95\n    return(y)\n\n\n# Simulate branin data for testing\nimport numpy as np\nnp.random.seed(99)\nn = 500\np = 2\nx = np.random.uniform(size=n*p).reshape(n,p)\ny = np.zeros(n)\nfor i in range(n):\n    y[i] = braninsc(x[i,])\n```\nNote that the x and y data is a numpy array - this is the intended format. Now we can load the openbt package and fit a BART model. Here we set the model type as model=\"bart\" which ensures we fit a homoscedastic BART model. The number of MPI threads to use is specified as tc=4. For a list of all optional parameters, see `m._dict__` (after creating m) or `help(OPENBT)`.\n\n```\nfrom pyopenbt.openbt import OPENBT, obt_load\nm = OPENBT(model = \"bart\", tc = 4, modelname = \"branin\")\nfit = m.fit(x, y)\n```\nNext we can construct predictions and make a simple plot comparing our predictions to the training data. 
Here, we are calculating the in-sample predictions since we passed the same x array to the predict() function.\n```\n# Calculate in-sample predictions\nfitp = m.predict(x, tc = 4)\n\n# Make a simple plot\nimport matplotlib.pyplot as plt\nfig = plt.figure(figsize=(16,9)); ax = fig.add_subplot(111)\nax.plot(y, fitp['mmean'], 'ro')\nax.set_xlabel(\"Observed\"); ax.set_ylabel(\"Fitted\")\nax.axline([0, 0], [1, 1])\n```\nTo save the model, use OPENBT's obt_save() function. Similarly, load the model using obt_load(). Because the posterior can be large in sample-based models such as these, the fitted model is saved in a compressed file format with the extension .obt. Additionally, the estimator object can be saved and loaded (see below).\n```\n#--------------------------------------------------------------------------------------------\n# Save fitted MODEL object (not the estimator object, m) as test.obt in the working directory\nm.obt_save(fit, \"test\", est = False)\n# Load fitted model object (AKA fit object) to a new object\nfit2 = obt_load(\"test\", est = False)\n\n# We can also save/load the fit ESTIMATOR object by specifying est = True in obt_save()/load().\n# The estimator object has all our settings and properties, but not fit results. 
\n# This is similar to scikit-learn saving/loading its estimators.\nm.obt_save(\"test_fit_est\", est = True)\nm2 = obt_load(\"test_fit_est\", est = True)\n#--------------------------------------------------------------------------------------------\n```\nThe standard variable activity information, calculated as the proportion of splitting rules involving each variable, can be computed using OPENBT's vartivity() function.\n```\n# Calculate variable activity information\nfitv = m.vartivity()\nprint(fitv['mvdraws'])\n```\nA more accurate alternative is to calculate the Sobol indices.\n```\n# Calculate Sobol indices\nfits = m.sobol(cmdopt = 'MPI', tc = 4)\nprint(fits['msi'])\nprint(fits['mtsi'])\nprint(fits['msij'])\n```\nAgain, for more examples of using OpenBT, explore the examples folder in the [Github repo](https://github.com/cavan33/openbt_py) .\n\n### See Also:  \n[Github \"Homepage\" for this package](https://github.com/cavan33/openbt_py)  \nPyPI [Package Home](https://pypi.org/project/pyopenbt/)  \n\n### Contributions\nAll contributions are welcome. You can help this project be better by reporting issues, bugs, \nor forking the repo and creating a pull request.\n\n------------------------------------------------------------------------------\n\n### License\nThe package is licensed under the BSD 3-Clause License. A copy of the\n[license](LICENSE) can be found along with the code.\n",
    "bugtrack_url": null,
    "license": "BSD 3-Clause",
    "summary": "python interface to openbt",
    "version": "0.0.8",
    "project_urls": {
        "homepage": "https://github.com/cavan33/openbt_py",
        "repository": "https://github.com/cavan33/openbt_py"
    },
    "split_keywords": [
        "bayesian",
        "additive",
        "regession",
        "trees"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "03c36782ae377641ef7f2782aa283e0830f0ea339d3785b52d543de62df1c927",
                "md5": "9afd7a4cbc70ffd1ec5101d686211fe7",
                "sha256": "0075fbcf3ebaa9d49a7f1195ee19fcadbdac82c50855f7a1c15aa73680411d35"
            },
            "downloads": -1,
            "filename": "pyopenbt-0.0.8-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9afd7a4cbc70ffd1ec5101d686211fe7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 24373,
            "upload_time": "2024-01-03T17:05:36",
            "upload_time_iso_8601": "2024-01-03T17:05:36.774078Z",
            "url": "https://files.pythonhosted.org/packages/03/c3/6782ae377641ef7f2782aa283e0830f0ea339d3785b52d543de62df1c927/pyopenbt-0.0.8-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "2ef0eebc53c93d361d189143af37350604cfbe6bee075a0d89b22717e1530198",
                "md5": "fbf6a317280e29356cd581172bb09a34",
                "sha256": "111caf73073c397c531b68b2ded50c5f4a2b751b14a3dd4c4c546a9f54318362"
            },
            "downloads": -1,
            "filename": "pyopenbt-0.0.8.tar.gz",
            "has_sig": false,
            "md5_digest": "fbf6a317280e29356cd581172bb09a34",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 26899,
            "upload_time": "2024-01-03T17:05:37",
            "upload_time_iso_8601": "2024-01-03T17:05:37.891303Z",
            "url": "https://files.pythonhosted.org/packages/2e/f0/eebc53c93d361d189143af37350604cfbe6bee075a0d89b22717e1530198/pyopenbt-0.0.8.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-03 17:05:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "cavan33",
    "github_project": "openbt_py",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "joblib",
            "specs": []
        },
        {
            "name": "matplotlib",
            "specs": []
        },
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "pandas",
            "specs": []
        },
        {
            "name": "scikit_learn",
            "specs": []
        },
        {
            "name": "scipy",
            "specs": []
        }
    ],
    "lcname": "pyopenbt"
}
        