pwasopt

Name: pwasopt
Version: 0.1.2
Summary: PWAS/PWASp - Global and Preference-based Optimization with Mixed Variables using (P)iece(w)ise (A)ffine (S)urrogates
Homepage: https://github.com/mjzhu-p/PWAS
Author: Mengjia Zhu <mengjia.zhu@imtlucca.it>, Alberto Bemporad <alberto.bemporad@imtlucca.it>
Upload time: 2024-03-15 16:54:27
Requires Python: >=3.7
Keywords: mixed-variable optimization, mixed-integer optimization, global optimization, preference-based optimization, black-box optimization

# Global and Preference-based Optimization with Mixed Variables using Piecewise Affine Surrogates (PWAS/PWASp)


# Contents

* [Package description](#description)

* [Installation](#install)

* [Basic usage](#basic-usage)

* [Contributors](#contributors)

* [Citing PWAS/PWASp](#bibliography)

* [License](#license)

<a name="description"></a>
## Package description

We propose a novel surrogate-based global optimization algorithm, called PWAS, based on constructing a **p**iece**w**ise **a**ffine **s**urrogate of the objective function over feasible samples. We introduce two types of exploration functions to efficiently search the feasible domain via mixed integer linear programming (MILP) solvers. We also provide a preference-based version of the algorithm, called PWASp, which can be used when only pairwise comparisons between samples can be acquired while the objective function
remains unquantified. For more details on the method, please read our paper [Global and Preference-based Optimization with Mixed Variables using Piecewise Affine Surrogates](http://arxiv.org/abs/2302.04686). 

<a name="cite-ZB23"><a>
> [1] M. Zhu and A. Bemporad, "[Global and preference-based optimization with mixed variables using piecewise affine surrogates](http://arxiv.org/abs/2302.04686)," *Submitted for publication*, 2023. [[bib entry](#ref1)]

<a name="install"></a>
## Installation

~~~code
pip install pwasopt
~~~
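
After installation, a quick import of the two solver classes used later in this README verifies that the package is available:

~~~python
from pwasopt.main_pwas import PWAS    # function-evaluation-based solver
from pwasopt.main_pwasp import PWASp  # preference-based solver
~~~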


### Dependencies:
* python 3
* numpy
* scipy
* pulp
* scikit-learn
* [pyparc](https://pypi.org/project/pyparc/)
* [pyDOE](https://pythonhosted.org/pyDOE/)
* [pycddlib](https://pypi.org/project/pycddlib/)

### External dependencies:
MILP solver:
- PWAS/PWASp use `GUROBI` as the default solver for the MILP problem of acquisition optimization,
which we found to be the most robust during benchmark testing. Alternatively, we also include `GLPK`, which may occasionally
produce errors depending on the test problem and initial samples. Users can also switch to another MILP solver by editing the
relevant code in `acquisition.py` and `sample.py`. Check the compatibility of the MILP solver with `pulp` (the LP modeler)
at the [project webpage](https://pypi.org/project/PuLP/); a quick check of the solvers available to `pulp` is sketched after this list.
- `GUROBI`: [academic licenses](https://www.gurobi.com/academia/academic-program-and-licenses/)
  - [configure GUROBI with python](https://support.gurobi.com/hc/en-us/articles/360044290292-How-do-I-install-Gurobi-for-Python-)
- `GLPK`: [project webpage](https://www.gnu.org/software/glpk/)
  - [step-by-step explanation for the installation on Stack Overflow](https://stackoverflow.com/questions/17513666/installing-glpk-gnu-linear-programming-kit-on-windows)
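
To see which MILP backends `pulp` can reach on your machine before running PWAS/PWASp, a short check along the following lines can help (a minimal sketch using `pulp`'s standard API; `GUROBI` and `GLPK_CMD` are the solver names `pulp` uses for these backends):

~~~python
import pulp

# list the solvers pulp detects as installed and usable on this machine
print(pulp.listSolvers(onlyAvailable=True))

# instantiate a specific backend if it appears in the list, e.g.:
# solver = pulp.getSolver('GUROBI')
# solver = pulp.getSolver('GLPK_CMD')
~~~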


<a name="basic-usage"></a>
## Basic usage

### Examples
Examples of benchmark testing using PWAS/PWASp can be found in the `examples` folder:
* `mixed_variable_benchmarks.py`: benchmark testing on constrained/unconstrained mixed-variable problems
  * Test results are reported in the [paper](http://arxiv.org/abs/2302.04686)
  * _Note_: to test benchmark `NAS-CIFAR10`
    * download the data from its [GitHub repository](https://storage.googleapis.com/nasbench/nasbench_only108.tfrecord)
    * indicate the `data_path` in `mixed_variable_benchmarks.py`
    * since the dataset was compiled with `TensorFlow` 1.x, **Python < 3.8** is required (with `TensorFlow` < 2.x)
* `other_benchmarks.py`: various NLP, MIP, and INLP benchmarks tested with PWAS/PWASp
  * Test results are reported in [test_results_on_other_benchmarks.pdf](https://github.com/mjzhu-p/PWAS/blob/main/examples/test_results_on_other_benchmarks.pdf) under the `examples` folder 


Here, we show a detailed example using PWAS/PWASp to optimize the parameters of the [`xgboost` algorithm](https://xgboost.readthedocs.io/en/stable/) for the [`MNIST` classification](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_digits.html) task.


### Problem description

**_Objective_**:
Maximize the classification accuracy on the test data.
Note that PWAS/PWASp solves **minimization** problems, so we minimize the negative of the classification accuracy.

**_Optimization variables_**:
$n_c = 4$ (number of continuous variables), $n_{\rm int} = 1$ (number of integer variables, **ordinal**), 
and $n_d = 3$ (number of categorical variables, **non-ordinal**) with $n_{i} = 2$, for $i = 1, 2, 3$. 
Each categorical variable ($n_{di}$) can be either 0 or 1. 
The bounds are $\ell_x = [10^{-6}\ 10^{-6}\ 0.001\ 10^{-6}]'$, $u_x = [1\  10\  1\  5]'$; $\ell_y = 1$, $u_y = 10$.

**_Notes_**:
A 0.7/0.3 stratified train/test split is applied.
The `xgboost` package is used for `MNIST` classification.
The optimization variables in this problem are the parameters of the `xgboost` algorithm.
Specifically, the continuous variables $x_1$, $x_2$, $x_3$, and $x_4$ refer to the following `xgboost` parameters,
respectively: `learning_rate`, `min_split_loss`, `subsample`, and `reg_lambda`.
The integer variable $y$ stands for `max_depth`. As for the categorical variables, $n_{d1}$ indicates the booster type in
`xgboost`, where $n_{d1} \in \{0, 1\}$ corresponds to {`gbtree`, `dart`}. $n_{d2}$ represents the `grow_policy`,
where $n_{d2} \in \{0, 1\}$ corresponds to {`depthwise`, `lossguide`}.
$n_{d3}$ refers to the `objective`, where $n_{d3} \in \{0, 1\}$ corresponds to {`multi:softmax`, `multi:softprob`}.


### Problem specification in Python
~~~python
import xgboost
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn import metrics
import numpy as np

# info of optimization variables 
nc = 4  # number of continuous variables
nint = 1  # number of integer variables, ordinal
nd = 3  # number of categorical variables, non-ordinal
X_d = [2, 2, 2]  # number of possible classes for each categorical variable

lb_cont_int = np.array([1e-6, 1e-6, 0.001, 1e-6, 1])  # lower bounds for continuous and integer variables
ub_cont_int = np.array([1, 10, 0.99999, 5, 10])  # upper bounds for continuous and integer variables
lb_binary = np.zeros((nd))  # lower bounds for categorical variables; the dimension equals nd here and is expanded internally during encoding
ub_binary = np.array([1, 1, 1])  # upper bounds for categorical variables: number of classes - 1 (class indices start at 0 in the one-hot encoder)
lb = np.hstack((lb_cont_int, lb_binary)) # combined lower and upper bounds for the optimization variables
ub = np.hstack((ub_cont_int, ub_binary))

# load dataset
# example code: https://github.com/imrekovacs/XGBoost/blob/master/XGBoost%20MNIST%20digits%20classification.ipynb
mnist = load_digits()  
X, y = mnist.data, mnist.target
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, test_size=0.3, stratify=y,
                                                    random_state=1)  # random_state used for reproducibility
dtrain = xgboost.DMatrix(X_train, label=y_train)
dtest = xgboost.DMatrix(X_test, label=y_test)

# define the objective function, x collects all the optimization variables, ordered as [continuous, integer, categorical]
def fun(x):  
    xc = x[:nc]  # continuous variables
    xint = x[nc:nc + nint]  # integer variables
    xd = x[nc + nint:]  # categorical variables

    if xd[0] == 0:
        mnist_booster = 'gbtree'
    else:
        mnist_booster = 'dart'

    if xd[1] == 0:
        mnist_grow_policy = 'depthwise'
    else:
        mnist_grow_policy = 'lossguide'

    if xd[2] == 0:
        mnist_obj = 'multi:softmax'
    else:
        mnist_obj = 'multi:softprob'
    param = {
        'booster': mnist_booster,
        'grow_policy': mnist_grow_policy,
        'objective': mnist_obj,
        'learning_rate': xc[0],
        'min_split_loss': xc[1],
        'subsample': xc[2],
        'reg_lambda': xc[3],
        'max_depth': int(round(xint[0])),
        'num_class': 10  # the number of classes in this dataset
    }

    bstmodel = xgboost.train(param, dtrain)

    y_pred = bstmodel.predict(dtest)  # depending on the objective, predict may return per-class probabilities rather than class labels

    try:
        acc = metrics.accuracy_score(y_test, y_pred)
    except ValueError:
        # predict returned per-class probabilities; convert them to class labels first
        y_pred = np.argmax(y_pred, axis=1)
        acc = metrics.accuracy_score(y_test, y_pred)

    return -acc  # maximize the accuracy by minimizing -acc

# Specify the maximum number of function evaluations (including initial samples) and the number of initial samples
maxevals = 100
nsamp = 20  # number of initial samples, passed to PWAS/PWASp as nsamp below

# default setting for the benchmarks
isLin_eqConstrained = False  # specify whether linear equality constraints are present
isLin_ineqConstrained = False  # specify whether linear inequality constraints are present
Aeq = np.array([])  # linear equality constraints
beq = np.array([])
Aineq = np.array([])  # linear inequality constraints
bineq = np.array([])

~~~

### Solve using PWAS

One can solve the optimization problem either by explicitly passing the function handle `fun` to PWAS, or by passing
the evaluations of `fun` step by step.

#### Solve by explicitly passing the function handle
~~~python
from pwasopt.main_pwas import PWAS  

key = 0
np.random.seed(key)  # rng default for reproducibility

delta_E = 0.05  # trade-off hyperparameter in the acquisition function between exploitation of the surrogate and exploration
acq_stage = 'multi-stage'  # solve the acquisition step in one stage or multiple stages (see Section 3.4 of the paper [1]). Default: 'multi-stage'
feasible_sampling = True  # specify whether infeasible initial samples are allowed. Default: True
K_init = 20  # number of initial PWA partitions

# initialize the PWAS solver
optimizer1 = PWAS(fun, lb, ub, delta_E, nc, nint, nd, X_d, nsamp, maxevals,  # pass fun to PWAS
                 feasible_sampling= feasible_sampling,
                 isLin_eqConstrained=isLin_eqConstrained, Aeq=Aeq, beq=beq,
                 isLin_ineqConstrained=isLin_ineqConstrained, Aineq=Aineq, bineq=bineq,
                 K=K_init, categorical=False,
                 acq_stage=acq_stage)

xopt1, fopt1 = optimizer1.solve()
X1 = np.array(optimizer1.X)
fbest_seq1 = optimizer1.fbest_seq
~~~

#### Solve by passing the function evaluations step by step
~~~python
optimizer2 = PWAS(fun, lb, ub, delta_E, nc, nint, nd, X_d, nsamp, maxevals,  # here, fun is a placeholder passed to PWAS, not used
                 feasible_sampling= feasible_sampling,
                 isLin_eqConstrained=isLin_eqConstrained, Aeq=Aeq, beq=beq,
                 isLin_ineqConstrained=isLin_ineqConstrained, Aineq=Aineq, bineq=bineq,
                 K=K_init, categorical=False,
                 acq_stage=acq_stage)

x2 = optimizer2.initialize()
for k in range(maxevals):
    f = fun(x2)  # evaluate fun
    x2 = optimizer2.update(f) # feed function evaluation step by step to PWAS
X2 = np.array(optimizer2.X[:-1])  # drop the last queried point: update() computes the next point to query, so the final x2 corresponds to evaluation maxevals + 1
xopt2 = optimizer2.xbest
fopt2 = optimizer2.fbest
fbest_seq2 = optimizer2.fbest_seq

~~~
Below we show the sequence of best values `fbest_seq1` found by PWAS.

<p align = "center">
<img src="https://github.com/mjzhu-p/PWAS/blob/main/figures/PWAS_XG-MNIST.png" alt="drawing" width=60%/>
</p>
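
The convergence plot above can be reproduced with a short snippet along the following lines (a minimal sketch, assuming `matplotlib` is installed; the axis labels and title are our own choices):

~~~python
import matplotlib.pyplot as plt

# fbest_seq1 holds the best objective value found up to each function evaluation
plt.plot(fbest_seq1)
plt.xlabel("number of function evaluations")
plt.ylabel("best objective value found")
plt.title("PWAS on XG-MNIST")
plt.grid(True)
plt.show()
~~~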


### Solve using PWASp
When solving with PWASp, instead of using function evaluations, we solve a preference-based optimization problem
with preference function $\pi(x_1,x_2)$, $x_1,x_2\in\mathbb{R}^n$,
within the finite bounds `lb` $\leq x\leq$ `ub` (see Section 4 of [[1]](#ref1)).

As with PWAS, one can solve the optimization problem either by
explicitly passing the preference indicator/synthetic preference function to PWASp, or by passing
the expressed preferences `pref_eval` step by step.

_Note_: for this example, we use `fun` as a **synthetic decision maker** (`synthetic_dm = True`) to express preferences. However, the explicit evaluation of `fun` is unknown to PWASp.

When solving by **explicitly** passing the preference indicator:
- If `synthetic_dm = True`, the two included preference indicator functions `pref_fun.py` and `pref_fun1.py`
provide preferences based on function evaluations and constraint satisfaction.
- If `synthetic_dm = False`, one needs to pass a `fun` that, given two decision vectors,
outputs -1, 0, or 1 depending on the expressed preference (a sketch of such a function follows this list).
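
Below is a minimal sketch of such a custom preference function. The four-argument signature mirrors the `pref(...)` lambda used later in this section; the convention that -1 means the first sample is preferred, 1 the second, and 0 a tie is our assumption about `pref_fun.eval`, not stated explicitly above. Here `fun` merely stands in for the decision maker's hidden judgment.

~~~python
comparetol = 1e-4  # tolerance below which the two samples are considered equally good

# Hypothetical custom preference indicator (only needed when synthetic_dm = False).
# Assumed convention: return -1 if x1 is preferred, 1 if x2 is preferred, 0 for a tie.
def my_pref(x1, x2, x1_encoded=None, x2_encoded=None):
    f1, f2 = fun(x1), fun(x2)  # stand-in for the decision maker's hidden judgment
    if f1 < f2 - comparetol:
        return -1
    elif f1 > f2 + comparetol:
        return 1
    return 0
~~~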

#### Solve by explicitly passing the preference indicator 

~~~python
from pwasopt.main_pwasp import PWASp 

key = 0
np.random.seed(key)  # rng default for reproducibility

delta_E = 1  # trade-off hyperparameter in the acquisition function between exploitation of the surrogate and exploration
optimizer1 = PWASp(fun, lb, ub, delta_E, nc, nint, nd, X_d, nsamp, maxevals, feasible_sampling= feasible_sampling,  
                 isLin_eqConstrained=isLin_eqConstrained, Aeq=Aeq, beq=beq,
                 isLin_ineqConstrained=isLin_ineqConstrained, Aineq=Aineq, bineq=bineq,
                 K=K_init, categorical=False,
                 acq_stage=acq_stage, synthetic_dm = True)  

xopt1 = optimizer1.solve()
X1 = np.array(optimizer1.X)
fbest_seq1 = list(map(fun, X1[optimizer1.ibest_seq]))  # for synthetic problems, we can obtain the function evaluation for assessment of the solver
fbest1 = min(fbest_seq1)
~~~

#### Solve by passing the expressed preferences step by step
~~~python
from pwasopt.pref_fun1 import PWASp_fun1  # import the preference indicator function
from pwasopt.pref_fun import PWASp_fun

optimizer2 = PWASp(fun, lb, ub, delta_E, nc, nint, nd, X_d, nsamp, maxevals, feasible_sampling= feasible_sampling,  # fun is a placeholder here
                 isLin_eqConstrained=isLin_eqConstrained, Aeq=Aeq, beq=beq,
                 isLin_ineqConstrained=isLin_ineqConstrained, Aineq=Aineq, bineq=bineq,
                 K=K_init, categorical=False,
                 acq_stage=acq_stage, synthetic_dm = True)

comparetol = 1e-4
if isLin_ineqConstrained or isLin_eqConstrained:
    pref_fun = PWASp_fun1(fun, comparetol, optimizer2.prob.Aeq, optimizer2.prob.beq, optimizer2.prob.Aineq, optimizer2.prob.bineq)  # preference function object
else:
    pref_fun = PWASp_fun(fun, comparetol)
pref = lambda x, y, x_encoded, y_encoded: pref_fun.eval(x, y, x_encoded, y_encoded)
pref_fun.clear()

xbest2, x2, xsbest2, xs2 = optimizer2.initialize()  # get first two random samples
for k in range(maxevals-1):
    pref_eval = pref(x2, xbest2, xs2, xsbest2)  # evaluate preference
    x2 = optimizer2.update(pref_eval)
    xbest2 = optimizer2.xbest
X2 = np.array(optimizer2.X[:-1])
xopt2 = xbest2
fbest_seq2 = list(map(fun, X2[optimizer2.ibest_seq]))
fbest2 = min(fbest_seq2)

~~~
Below we show the sequence of best values `fbest_seq1` found by PWASp. Note that the function evaluations here are shown solely for demonstration purposes; they are unknown to PWASp during the solution process.

<p align = "center">
<img src="https://github.com/mjzhu-p/PWAS/blob/main/figures/PWASp_XG-MNIST.png" alt="drawing" width=60%/>
</p>


### Include constraints

Below we show how to include constraints, if present.

_Note_: the current package only supports **linear** equality/inequality constraints.

- `Aeq`: np array of shape (# of linear eq. constraints, n_encoded), the coefficient matrix of the linear equality constraints,
where n_encoded is the length of the optimization variable **AFTER** being **encoded**
- `beq`: np array of shape (# of linear eq. constraints, 1), the right-hand side of the linear equality constraints
- `Aineq`: np array of shape (# of linear ineq. constraints, n_encoded), the coefficient matrix of the linear inequality constraints
- `bineq`: np array of shape (# of linear ineq. constraints, 1), the right-hand side of the linear inequality constraints
- **Make sure to update `Aeq` and `Aineq` for the one-hot encoded categorical variables, if present** (a sketch of the encoded dimension follows the example below).

~~~python
# if there are linear equality constraints
isLin_eqConstrained = True  # (Aeq x = beq)

# specify the constraint matrix and right-hand-side vector
if isLin_eqConstrained:  # an example
    Aeq = np.array([1.6295, 1])
    beq = np.array([3.0786])


# if there are linear inequality constraints
isLin_ineqConstrained = True  # (Aineq x <= bineq)
# specify the constraint matrix and right-hand-side vector
if isLin_ineqConstrained:  # an example
    Aineq = np.array([[1.6295, 1],
                   [0.5, 3.875],
                   [-4.3023, -4],
                   [-2, 1],
                   [0.5, -1]])
    bineq = np.array([[3.0786],
                   [3.324],
                   [-1.4909],
                   [0.5],
                   [0.5]])

~~~
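
As a rough illustration of `n_encoded` mentioned in the list above, our reading is that each categorical variable is one-hot encoded into `X_d[i]` binary entries while continuous and integer variables keep one entry each (the example matrices above belong to a generic two-variable problem, not XG-MNIST). Under that assumption, the encoded length for the XG-MNIST example would be:

~~~python
# Assumed encoding: continuous and integer variables kept as-is, each categorical
# variable expanded into X_d[i] one-hot entries.
n_encoded = nc + nint + sum(X_d)   # 4 + 1 + (2 + 2 + 2) = 11 for the XG-MNIST example

# The constraint matrices passed to PWAS/PWASp would then have shapes
#   Aeq:   (# eq. constraints,   n_encoded)    beq:   (# eq. constraints,   1)
#   Aineq: (# ineq. constraints, n_encoded)    bineq: (# ineq. constraints, 1)
~~~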

<a name="contributors"><a>
## Contributors

This package was coded by Mengjia Zhu with supervision from Prof. Alberto Bemporad.


This software is distributed without any warranty. Please cite the paper below if you use this software.

<a name="bibliography"><a>
## Citing PWAS/PWASp

<a name="ref1"></a>
```
@article{ZB23,
    author={M. Zhu and A. Bemporad},
    title={Global and Preference-based Optimization with Mixed Variables using Piecewise Affine Surrogates},
    journal={arXiv preprint arXiv:2302.04686},
    year=2023
}
```

<a name="license"><a>
## License

Apache 2.0

(C) 2021-2023 M. Zhu, A. Bemporad

 

            
