# `kpsearch` -- A Python Package for K-Portfolio Search
The package `kpsearch` contains methods to search for portfolios of algorithms (e.g., SAT solvers).
A portfolio is a set of algorithms,
where the portfolio's runtime on a problem instance equals the runtime of the fastest algorithm from the portfolio.
This VBS (virtual best solver) approach corresponds to running all the portfolio's algorithms in parallel
and only recording the runtime of the fastest algorithm per problem instance.
(In reality, one could use a prediction model to choose a (hopefully fast) algorithm for each problem instance.)
Given the runtimes of multiple algorithms on multiple problem instances,
one wants to find the overall optimal (fastest) portfolio of size `k`, i.e., containing `k` algorithms.
This document provides:
- Steps for [setting up](#setup) the package.
- A short [overview](#functionality) of the (portfolio-search) functionality.
- A [demo](#demo) of the functionality.
- [Guidelines for developers](#developer-info) who want to modify or extend the code base.
If you use this package for a scientific publication, please cite [our paper](https://doi.org/10.4230/LIPIcs.SAT.2022.2)
```
@inproceedings{bach2022comprehensive,
author={Bach, Jakob and Iser, Markus and B\"{o}hm, Klemens},
title={A Comprehensive Study of k-Portfolios of Recent {SAT} Solvers},
booktitle={25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022)},
location={Haifa, Israel},
year={2022},
doi={10.4230/LIPIcs.SAT.2022.2}
}
```
## Setup
You can install our package from [PyPI](https://pypi.org/):
```
python -m pip install kpsearch
```
Alternatively, you can install the package from GitHub:
```bash
python -m pip install git+https://github.com/Jakob-Bach/Small-Portfolios.git#subdirectory=code/kpsearch_package
```
If you already have the source code for the package (i.e., the directory in which this `README` resides)
as a local directory on your computer (e.g., after cloning the project), you can also perform a local install:
```bash
python -m pip install kpsearch_package/
```
## Functionality
`kpsearch.py` provides several functions for searching k-portfolios:
- exact: `exhaustive_search()`, `mip_search()`, `smt_search()`, `smt_search_nof()`
- heuristic: `beam_search()`, `kbest_search()`, `random_search()`
`mip_search()` is a novel contribution of our paper, using a MIP solver to find optimal k-portfolios.
All other search methods are straightforward and/or adaptations from related work.
`mip_search()` is usually the fastest exact search method except for very small or large `k`
(where the plain `exhaustive_search()` may be faster).
All functions share two parameters:
- A `pandas.DataFrame`, where the rows represent problem (e.g., SAT) instances,
the columns represent solvers (column names are solver names), and the cells represent runtimes.
- The portfolio size `k`, i.e., the maximum number of solvers in the portfolio
(some search methods may return smaller portfolios if adding more solvers is not beneficial).
The existence of further parameters depends on the search method;
see the individual functions' documentation for details.
All search methods return a list, where each entry is a portfolio described by
- the solver names (column names from the runtime data) and
- the objective value (average runtime of the VBS, i.e., virtual best solver, which assumes the
lowest runtime of the portfolio members for each instance).
## Demo
Let's create a small demo dataset with runtimes of three solvers on four problem instances:
```python
import pandas as pd
import kpsearch
runtimes = pd.DataFrame({'Solver1': [1, 2, 3, 4],
'Solver2': [2, 2, 5, 1],
'Solver3': [5, 3, 2, 1]})
```
I.e., the data looks as follows:
```
Solver1 Solver2 Solver3
0 1 2 5
1 2 2 3
2 3 5 2
3 4 1 1
```
Let's try exhaustive search:
```python
print(kpsearch.exhaustive_search(runtimes=runtimes, k=2))
```
This search procedure returns all portfolios of the desired size `k`
(so if you only wanted the optimal portfolio, you would need to search in these results accordingly):
```
[(['Solver1', 'Solver2'], 1.75), (['Solver1', 'Solver3'], 1.5), (['Solver2', 'Solver3'], 1.75)]
```
Let's try greedy search, i.e., a beam search with a beam width of one:
```python
print(kpsearch.beam_search(runtimes=runtimes, k=3, w=1))
```
This search procedure does not only yield a `k`-portfolio, but also all intermediate results,
i.e., smaller portfolio sizes, since the procedure iteratively adds solvers to the portfolio:
```
[(['Solver1'], 2.5), (['Solver1', 'Solver3'], 1.5), (['Solver1', 'Solver2', 'Solver3'], 1.5)]
```
We can see that the third iteration does not improve the portfolio's VBS, i.e.,
Solver 2 cannot solve any instance faster than both other solvers.
## Developer Info
Though there is no formal superclass or interface, all existing search methods follow specific conventions,
which makes them compatible with each other.
Thus, if you want to add another portfolio-search method, it may be beneficial to follow these conventions as well.
All search methods share two parameters:
The solver `runtimes` as `DataFrame` and the number of solvers `k` as `int`.
You can add further method-specific parameters as you like.
For example, beam search has the beam width `w` as another parameter.
The result of each search method is a list of tuples of
- solver names (list of strings, corresponding to column names in `runtimes`) and
- portfolio performance (float) in terms of VBS score.
The list may have a length of one in case the search only returns one portfolio.
Raw data
{
"_id": null,
"home_page": null,
"name": "kpsearch",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "portfolio search, portfolios, algorithms, solvers",
"author": null,
"author_email": "Jakob Bach <jakob.bach.ka@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/f2/76/64ccddbf77385d83533244322dbe11a41f618f3816323ce69238cf5963e5/kpsearch-1.0.0.tar.gz",
"platform": null,
"description": "# `kpsearch` -- A Python Package for K-Portfolio Search\r\n\r\nThe package `kpsearch` contains methods to search for portfolios of algorithms (e.g., SAT solvers).\r\nA portfolio is a set of algorithms,\r\nwhere the portfolio's runtime on a problem instance equals the runtime of the fastest algorithm from the portfolio.\r\nThis VBS (virtual best solver) approach corresponds to running all the portfolio's algorithms in parallel\r\nand only recording the runtime of the fastest algorithm per problem instance.\r\n(In reality, one could use a prediction model to choose a (hopefully fast) algorithm for each problem instance.)\r\nGiven the runtimes of multiple algorithms on multiple problem instances,\r\none wants to find the overall optimal (fastest) portfolio of size `k`, i.e., containing `k` algorithms.\r\n\r\nThis document provides:\r\n\r\n- Steps for [setting up](#setup) the package.\r\n- A short [overview](#functionality) of the (portfolio-search) functionality.\r\n- A [demo](#demo) of the functionality.\r\n- [Guidelines for developers](#developer-info) who want to modify or extend the code base.\r\n\r\nIf you use this package for a scientific publication, please cite [our paper](https://doi.org/10.4230/LIPIcs.SAT.2022.2)\r\n\r\n```\r\n@inproceedings{bach2022comprehensive,\r\n author={Bach, Jakob and Iser, Markus and B\\\"{o}hm, Klemens},\r\n title={A Comprehensive Study of k-Portfolios of Recent {SAT} Solvers},\r\n booktitle={25th International Conference on Theory and Applications of Satisfiability Testing (SAT 2022)},\r\n location={Haifa, Israel},\r\n year={2022},\r\n doi={10.4230/LIPIcs.SAT.2022.2}\r\n}\r\n```\r\n\r\n## Setup\r\n\r\nYou can install our package from [PyPI](https://pypi.org/):\r\n\r\n```\r\npython -m pip install kpsearch\r\n```\r\n\r\nAlternatively, you can install the package from GitHub:\r\n\r\n```bash\r\npython -m pip install git+https://github.com/Jakob-Bach/Small-Portfolios.git#subdirectory=code/kpsearch_package\r\n```\r\n\r\nIf you already have the source code for the package (i.e., the directory in which this `README` resides)\r\nas a local directory on your computer (e.g., after cloning the project), you can also perform a local install:\r\n\r\n```bash\r\npython -m pip install kpsearch_package/\r\n```\r\n\r\n## Functionality\r\n\r\n`kpsearch.py` provides several functions for searching k-portfolios:\r\n\r\n- exact: `exhaustive_search()`, `mip_search()`, `smt_search()`, `smt_search_nof()`\r\n- heuristic: `beam_search()`, `kbest_search()`, `random_search()`\r\n\r\n`mip_search()` is a novel contribution of our paper, using a MIP solver to find optimal k-portfolios.\r\nAll other search methods are straightforward and/or adaptations from related work.\r\n`mip_search()` is usually the fastest exact search method except for very small or large `k`\r\n(where the plain `exhaustive_search()` may be faster).\r\n\r\nAll functions share two parameters:\r\n\r\n- A `pandas.DataFrame`, where the rows represent problem (e.g., SAT) instances,\r\n the columns represent solvers (column names are solver names), and the cells represent runtimes.\r\n- The portfolio size `k`, i.e., the maximum number of solvers in the portfolio\r\n (some search methods may return smaller portfolios if adding more solvers is not beneficial).\r\n\r\nThe existence of further parameters depends on the search method;\r\nsee the individual functions' documentation for details.\r\n\r\nAll search methods return a list, where each entry is a portfolio described by\r\n\r\n- the solver names (column names from the runtime data) and\r\n- the objective value (average runtime of the VBS, i.e., virtual best solver, which assumes the\r\n lowest runtime of the portfolio members for each instance).\r\n\r\n## Demo\r\n\r\nLet's create a small demo dataset with runtimes of three solvers on four problem instances:\r\n\r\n```python\r\nimport pandas as pd\r\nimport kpsearch\r\n\r\nruntimes = pd.DataFrame({'Solver1': [1, 2, 3, 4],\r\n 'Solver2': [2, 2, 5, 1],\r\n 'Solver3': [5, 3, 2, 1]})\r\n```\r\n\r\nI.e., the data looks as follows:\r\n\r\n```\r\n Solver1 Solver2 Solver3\r\n0 1 2 5\r\n1 2 2 3\r\n2 3 5 2\r\n3 4 1 1\r\n```\r\n\r\nLet's try exhaustive search:\r\n\r\n```python\r\nprint(kpsearch.exhaustive_search(runtimes=runtimes, k=2))\r\n```\r\n\r\nThis search procedure returns all portfolios of the desired size `k`\r\n(so if you only wanted the optimal portfolio, you would need to search in these results accordingly):\r\n\r\n```\r\n[(['Solver1', 'Solver2'], 1.75), (['Solver1', 'Solver3'], 1.5), (['Solver2', 'Solver3'], 1.75)]\r\n```\r\n\r\nLet's try greedy search, i.e., a beam search with a beam width of one:\r\n\r\n```python\r\nprint(kpsearch.beam_search(runtimes=runtimes, k=3, w=1))\r\n```\r\n\r\nThis search procedure does not only yield a `k`-portfolio, but also all intermediate results,\r\ni.e., smaller portfolio sizes, since the procedure iteratively adds solvers to the portfolio:\r\n\r\n```\r\n[(['Solver1'], 2.5), (['Solver1', 'Solver3'], 1.5), (['Solver1', 'Solver2', 'Solver3'], 1.5)]\r\n```\r\n\r\nWe can see that the third iteration does not improve the portfolio's VBS, i.e.,\r\nSolver 2 cannot solve any instance faster than both other solvers.\r\n\r\n## Developer Info\r\n\r\nThough there is no formal superclass or interface, all existing search methods follow specific conventions,\r\nwhich makes them compatible with each other.\r\nThus, if you want to add another portfolio-search method, it may be beneficial to follow these conventions as well.\r\n\r\nAll search methods share two parameters:\r\nThe solver `runtimes` as `DataFrame` and the number of solvers `k` as `int`.\r\nYou can add further method-specific parameters as you like.\r\nFor example, beam search has the beam width `w` as another parameter.\r\n\r\nThe result of each search method is a list of tuples of\r\n\r\n- solver names (list of strings, corresponding to column names in `runtimes`) and\r\n- portfolio performance (float) in terms of VBS score.\r\n\r\nThe list may have a length of one in case the search only returns one portfolio.\r\n",
"bugtrack_url": null,
"license": null,
"summary": "K-portfolio search",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/Jakob-Bach/Small-Portfolios/tree/master/code/kpsearch_package",
"Issues": "https://github.com/Jakob-Bach/Small-Portfolios/issues"
},
"split_keywords": [
"portfolio search",
" portfolios",
" algorithms",
" solvers"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "aa450820debea66ce098c2c55ce9abf2eac17b224c6c768a83252cb6cee24019",
"md5": "cbee61457d6318a5178cee7e783132e2",
"sha256": "095f34b61c82bdf691ee23a2c205713ced9466ad1408d8fe26ea6af3245614fb"
},
"downloads": -1,
"filename": "kpsearch-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "cbee61457d6318a5178cee7e783132e2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 8221,
"upload_time": "2024-10-04T07:29:10",
"upload_time_iso_8601": "2024-10-04T07:29:10.419724Z",
"url": "https://files.pythonhosted.org/packages/aa/45/0820debea66ce098c2c55ce9abf2eac17b224c6c768a83252cb6cee24019/kpsearch-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f27664ccddbf77385d83533244322dbe11a41f618f3816323ce69238cf5963e5",
"md5": "1606d66d64d385cf3849c7117d7a748c",
"sha256": "a4ecb0b702a2bf91826ad5a206ecf64fc46f066441a304fef046fd695627729c"
},
"downloads": -1,
"filename": "kpsearch-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "1606d66d64d385cf3849c7117d7a748c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 8724,
"upload_time": "2024-10-04T07:29:12",
"upload_time_iso_8601": "2024-10-04T07:29:12.408111Z",
"url": "https://files.pythonhosted.org/packages/f2/76/64ccddbf77385d83533244322dbe11a41f618f3816323ce69238cf5963e5/kpsearch-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-04 07:29:12",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Jakob-Bach",
"github_project": "Small-Portfolios",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "kpsearch"
}