pylapy

Name	pylapy JSON
Version	0.1.1 JSON
	download
home_page	https://github.com/raphaelreme/pylapy
Summary	Pythonic wrapper around Linear Assignement Problem solvers
upload_time	2024-04-26 17:02:03
maintainer	None
docs_url	None
author	Raphael Reme
requires_python	>=3.7
license	MIT
keywords	lap linear programming optimization association problem
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            # Pylapy

[![Lint and Test](https://github.com/raphaelreme/pylapy/actions/workflows/tests.yml/badge.svg)](https://github.com/raphaelreme/pylapy/actions/workflows/tests.yml)

We provide a solver for the assignement problem with Hungarian algorithm (Jonker-Volgenant variants [1])

The main class (`pylapy.LapSolver`) is a wrapper around different implementations you can find in python: lap, lapjv, scipy, lapsolver [2, 3, 4, 5].

It unifies the functionality of each implementation and allows you to use the one which is the fastest
on your problem. Note that to solve the same problem, an implementation/method can be more than 10 times slower than an other one.

It also helps you to handle non square matrices and setting a soft threshold on assignements (usually leads
to better performances than hard thresholding).

We provide some benchmark (but they are a bit hard to read and messy). Given your lap problem, try to make it work with the simplest implementation (scipy) and then (if needed) try to install lap or lapjv and see if they yield better computational performances.


## Install

```bash
$ pip install pylapy
$ # By default it does not install any backend solver
$ # You can either install by hand your favorite solver (scipy, lap, lapjv, lapsolver)
$ pip install pylapy[scipy]  # or pylapy[lap] etc
$ # Note that some backend requires numpy to be installed correctly [for instance, the old lap distribution]
$ # You may need to install numpy before
$ pip install numpy
```

As of today, lapjv do not support macos, lapsolver do not support python > 3.10. We now use [lapx](https://github.com/rathaROG/lapx) which distributes correctly the lap package.


## Getting started

```python

import numpy as np
import pylapy

# Simulate data
n, m = (2000, 2000)
sparsity = 0.5

dist = np.random.uniform(0, 1, (2000, 2000))
dist[np.random.uniform(0, 1, (2000, 2000)) < sparsity] = np.inf

# Create the solver and solves

solver = pylapy.LapSolver()  # Choose the current most efficient method that is installed
# solver = pylapy.LapSolver("scipy"|"lap"|"lapjv"|"lapsolver")  # You can choose which method you rather use

links = solver.solve(dist)

# Find the final cost

print(dist[links[:, 0], links[:, 1]])
```

## Benchmarks

We provide several scripts (and corresponding plots) in the `benchmark/` folder. They compare different implementations
and dist extensions (for non square matrix or soft thresholding). We have only tested them on a intel core i7 with Ubuntu 20.04
and python 3.10. Thus we do not guarantee that the choice we make by default are the fastest for you. 

### TLDR
Lapjv seems to usually outperform other implementations (Up to 2 times faster). Lap and Scipy are also pretty fast and can sometimes be faster than Lapjv. Lapsolver is usually much slower and should be avoided.

To handle soft thresholding and non-square matrices, we use by default the fastest options of our benchmark. This can be changed by setting
`LapSolver.cost_extension` and `LapSolver.shape_extension`.


### Handling non square matrices

With a rectangular assignment problem, you have to extend the distance matrix into a square one. There are multiple ways
to perform this shape extension. All yield the same final links and cost some are much faster/slower than others.

We provide several shape extension functions and benchmark them. By default, the solver uses the fastest one for the implementation you use.
You can build your own or enforce another one by setting the `LapSolver.shape_extension` attribute. Please have a look at
`shape_extension` module and `benchmark_shape_extension` script for more details.

![lap](./benchmark/images/shape_extension/lap.png) ![lapjv](./benchmark/images/shape_extension/lapjv.png) ![scipy](./benchmark/images/shape_extension/scipy.png)
![lapsolver](./benchmark/images/shape_extension/lapsolver.png)

According to our benchmark, we use `smallest_fill_inf` for scipy [4] and smallest_fill_0 for other implementations. (Note that lap [2] provide its own implementation displayed as `ref` here.)

### Handling soft thresholding

Rather than applying hard thresholding and cut links that are above a threshold `eta`, it is common and usually
better to assign a row or a column to "no one" with a cost `eta`. This is done by adding "sink" rows and columns.
When a true row/column is linked to a "sink" column/row, it is considered non linked.

Adding these sink nodes can also be done multiple ways resulting in equivalent links/cost but different run time.
We provide several cost extension functions and benchmark them. By default, the solver uses the expected fastest one
for the implementation you use. You can build your own extension or enforce another one by setting the `LapSolver.cost_extension`
attribute. Please have a look at `cost_extension` module and `benchmark_cost_extension` script for more details.

![lap](./benchmark/images/cost_extension/lap.png) ![lapjv](./benchmark/images/cost_extension/lapjv.png) ![scipy](./benchmark/images/cost_extension/scipy.png)
![lapsolver](./benchmark/images/cost_extension/lapsolver.png)

It is less obvious to descriminate between the cost extension functions (Even more if you add sparsity: more plots in `benchmark/images/cost_extension`). Nonetheless,
we decided to go with `diag_split_cost` for lapsolver [5] and `row_cost` for other implementations that usually leads to the best performances.

### Choosing the implementations

First, some implementations are not available on some operating system or python version, our wrapper allows you to switch between implementations without
changing your code. It seems that for instance, lap is not available for python 3.6, lapjv is not available in macOs, etc.

Also you want to choose the fastest one for your problem. We have compared all implementations on several cases (using sparsity, rectangular problems, and cost limit):

![full_square_example](./benchmark/images/full_square.png) ![sparse_square](./benchmark/images/sparse_full_square.png) ![rectangular0.8](./benchmark/images/full_rectangular_0.8.png)
![rectangular1.2](./benchmark/images/full_rectangular_1.2.png)![cost_limit_full](./benchmark/images/partial_square.png)![cost_limit_sparse](./benchmark/images/sparse_partial_square.png)

It seems that lapjv is usually faster than other implementations. Scipy and lap are also pretty fast and can be faster than lapjv depending on your use case. Lapsolver is always outperformed and should be avoided.

We have also tested `lapmod` from lap [2] for sparse matrices. It can sometimes be faster than other implementations but it is less stable and we do not add the support of this algorithm in the wrapper. (Note that for unfeasible problems, it yields a segfault).

Warning: For rectangular matrices, lapjv seems to sometimes output a non-optimal cost (though very close to the optimal one)

## References

* [1] R. Jonker and A. Volgenant, "A Shortest Augmenting Path Algorithm for Dense and Sparse Linear Assignment Problems", Computing 38, 325-340 (1987)
* [2] "lap: Linear Assignment Problem solver", https://github.com/gatagat/lap (Now maintained at https://github.com/rathaROG/lapx)
* [3] "lapjv: Linear Assignment Problem solver using Jonker-Volgenant algorithm", https://github.com/src-d/lapjv
* [4] "scipy: linear_sum_assignment", https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linear_sum_assignment.html#scipy.optimize.linear_sum_assignment
* [5] "py-lapsolver", https://github.com/cheind/py-lapsolver

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/raphaelreme/pylapy",
    "name": "pylapy",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "lap, linear programming, optimization, association problem",
    "author": "Raphael Reme",
    "author_email": "raphaelreme-dev@protonmail.com",
    "download_url": "https://files.pythonhosted.org/packages/53/f5/d75e53c6117c4402366e755e1b2bd46bbee50f3374921ee44f24b9750989/pylapy-0.1.1.tar.gz",
    "platform": null,
    "description": "# Pylapy\n\n[![Lint and Test](https://github.com/raphaelreme/pylapy/actions/workflows/tests.yml/badge.svg)](https://github.com/raphaelreme/pylapy/actions/workflows/tests.yml)\n\nWe provide a solver for the assignement problem with Hungarian algorithm (Jonker-Volgenant variants [1])\n\nThe main class (`pylapy.LapSolver`) is a wrapper around different implementations you can find in python: lap, lapjv, scipy, lapsolver [2, 3, 4, 5].\n\nIt unifies the functionality of each implementation and allows you to use the one which is the fastest\non your problem. Note that to solve the same problem, an implementation/method can be more than 10 times slower than an other one.\n\nIt also helps you to handle non square matrices and setting a soft threshold on assignements (usually leads\nto better performances than hard thresholding).\n\nWe provide some benchmark (but they are a bit hard to read and messy). Given your lap problem, try to make it work with the simplest implementation (scipy) and then (if needed) try to install lap or lapjv and see if they yield better computational performances.\n\n\n## Install\n\n```bash\n$ pip install pylapy\n$ # By default it does not install any backend solver\n$ # You can either install by hand your favorite solver (scipy, lap, lapjv, lapsolver)\n$ pip install pylapy[scipy]  # or pylapy[lap] etc\n$ # Note that some backend requires numpy to be installed correctly [for instance, the old lap distribution]\n$ # You may need to install numpy before\n$ pip install numpy\n```\n\nAs of today, lapjv do not support macos, lapsolver do not support python > 3.10. We now use [lapx](https://github.com/rathaROG/lapx) which distributes correctly the lap package.\n\n\n## Getting started\n\n```python\n\nimport numpy as np\nimport pylapy\n\n# Simulate data\nn, m = (2000, 2000)\nsparsity = 0.5\n\ndist = np.random.uniform(0, 1, (2000, 2000))\ndist[np.random.uniform(0, 1, (2000, 2000)) < sparsity] = np.inf\n\n# Create the solver and solves\n\nsolver = pylapy.LapSolver()  # Choose the current most efficient method that is installed\n# solver = pylapy.LapSolver(\"scipy\"|\"lap\"|\"lapjv\"|\"lapsolver\")  # You can choose which method you rather use\n\nlinks = solver.solve(dist)\n\n# Find the final cost\n\nprint(dist[links[:, 0], links[:, 1]])\n```\n\n## Benchmarks\n\nWe provide several scripts (and corresponding plots) in the `benchmark/` folder. They compare different implementations\nand dist extensions (for non square matrix or soft thresholding). We have only tested them on a intel core i7 with Ubuntu 20.04\nand python 3.10. Thus we do not guarantee that the choice we make by default are the fastest for you. \n\n### TLDR\nLapjv seems to usually outperform other implementations (Up to 2 times faster). Lap and Scipy are also pretty fast and can sometimes be faster than Lapjv. Lapsolver is usually much slower and should be avoided.\n\nTo handle soft thresholding and non-square matrices, we use by default the fastest options of our benchmark. This can be changed by setting\n`LapSolver.cost_extension` and `LapSolver.shape_extension`.\n\n\n### Handling non square matrices\n\nWith a rectangular assignment problem, you have to extend the distance matrix into a square one. There are multiple ways\nto perform this shape extension. All yield the same final links and cost some are much faster/slower than others.\n\nWe provide several shape extension functions and benchmark them. By default, the solver uses the fastest one for the implementation you use.\nYou can build your own or enforce another one by setting the `LapSolver.shape_extension` attribute. Please have a look at\n`shape_extension` module and `benchmark_shape_extension` script for more details.\n\n![lap](./benchmark/images/shape_extension/lap.png) ![lapjv](./benchmark/images/shape_extension/lapjv.png) ![scipy](./benchmark/images/shape_extension/scipy.png)\n![lapsolver](./benchmark/images/shape_extension/lapsolver.png)\n\nAccording to our benchmark, we use `smallest_fill_inf` for scipy [4] and smallest_fill_0 for other implementations. (Note that lap [2] provide its own implementation displayed as `ref` here.)\n\n### Handling soft thresholding\n\nRather than applying hard thresholding and cut links that are above a threshold `eta`, it is common and usually\nbetter to assign a row or a column to \"no one\" with a cost `eta`. This is done by adding \"sink\" rows and columns.\nWhen a true row/column is linked to a \"sink\" column/row, it is considered non linked.\n\nAdding these sink nodes can also be done multiple ways resulting in equivalent links/cost but different run time.\nWe provide several cost extension functions and benchmark them. By default, the solver uses the expected fastest one\nfor the implementation you use. You can build your own extension or enforce another one by setting the `LapSolver.cost_extension`\nattribute. Please have a look at `cost_extension` module and `benchmark_cost_extension` script for more details.\n\n![lap](./benchmark/images/cost_extension/lap.png) ![lapjv](./benchmark/images/cost_extension/lapjv.png) ![scipy](./benchmark/images/cost_extension/scipy.png)\n![lapsolver](./benchmark/images/cost_extension/lapsolver.png)\n\nIt is less obvious to descriminate between the cost extension functions (Even more if you add sparsity: more plots in `benchmark/images/cost_extension`). Nonetheless,\nwe decided to go with `diag_split_cost` for lapsolver [5] and `row_cost` for other implementations that usually leads to the best performances.\n\n### Choosing the implementations\n\nFirst, some implementations are not available on some operating system or python version, our wrapper allows you to switch between implementations without\nchanging your code. It seems that for instance, lap is not available for python 3.6, lapjv is not available in macOs, etc.\n\nAlso you want to choose the fastest one for your problem. We have compared all implementations on several cases (using sparsity, rectangular problems, and cost limit):\n\n![full_square_example](./benchmark/images/full_square.png) ![sparse_square](./benchmark/images/sparse_full_square.png) ![rectangular0.8](./benchmark/images/full_rectangular_0.8.png)\n![rectangular1.2](./benchmark/images/full_rectangular_1.2.png)![cost_limit_full](./benchmark/images/partial_square.png)![cost_limit_sparse](./benchmark/images/sparse_partial_square.png)\n\nIt seems that lapjv is usually faster than other implementations. Scipy and lap are also pretty fast and can be faster than lapjv depending on your use case. Lapsolver is always outperformed and should be avoided.\n\nWe have also tested `lapmod` from lap [2] for sparse matrices. It can sometimes be faster than other implementations but it is less stable and we do not add the support of this algorithm in the wrapper. (Note that for unfeasible problems, it yields a segfault).\n\nWarning: For rectangular matrices, lapjv seems to sometimes output a non-optimal cost (though very close to the optimal one)\n\n## References\n\n* [1] R. Jonker and A. Volgenant, \"A Shortest Augmenting Path Algorithm for Dense and Sparse Linear Assignment Problems\", Computing 38, 325-340 (1987)\n* [2] \"lap: Linear Assignment Problem solver\", https://github.com/gatagat/lap (Now maintained at https://github.com/rathaROG/lapx)\n* [3] \"lapjv: Linear Assignment Problem solver using Jonker-Volgenant algorithm\", https://github.com/src-d/lapjv\n* [4] \"scipy: linear_sum_assignment\", https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linear_sum_assignment.html#scipy.optimize.linear_sum_assignment\n* [5] \"py-lapsolver\", https://github.com/cheind/py-lapsolver\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Pythonic wrapper around Linear Assignement Problem solvers",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/raphaelreme/pylapy"
    },
    "split_keywords": [
        "lap",
        " linear programming",
        " optimization",
        " association problem"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "062b54a38d02ecaa3912d0d0644a968a02017728e25853a30b26d9157878c75e",
                "md5": "6650a06b80759b93617fe10c414774a5",
                "sha256": "aa55a830ea71fa2145e1389c31736ff4017ea7d815805516ecdc124143dc7f22"
            },
            "downloads": -1,
            "filename": "pylapy-0.1.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6650a06b80759b93617fe10c414774a5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 11368,
            "upload_time": "2024-04-26T17:02:01",
            "upload_time_iso_8601": "2024-04-26T17:02:01.839415Z",
            "url": "https://files.pythonhosted.org/packages/06/2b/54a38d02ecaa3912d0d0644a968a02017728e25853a30b26d9157878c75e/pylapy-0.1.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "53f5d75e53c6117c4402366e755e1b2bd46bbee50f3374921ee44f24b9750989",
                "md5": "c8229b6955934a7609f374fa5922e598",
                "sha256": "6a3391cba5d8f5047595de51f960319467d9e21cd72ea52ac56597fe2886f0b8"
            },
            "downloads": -1,
            "filename": "pylapy-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "c8229b6955934a7609f374fa5922e598",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 12303,
            "upload_time": "2024-04-26T17:02:03",
            "upload_time_iso_8601": "2024-04-26T17:02:03.305628Z",
            "url": "https://files.pythonhosted.org/packages/53/f5/d75e53c6117c4402366e755e1b2bd46bbee50f3374921ee44f24b9750989/pylapy-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-26 17:02:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "raphaelreme",
    "github_project": "pylapy",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pylapy"
}

Raphael Reme