sparseconverter


| Field | Value |
| --- | --- |
| Name | sparseconverter |
| Version | 0.4.0 |
| Home page | None |
| Summary | Converter matrix and type determination for a range of array formats, focusing on sparse arrays |
| Upload time | 2024-10-22 12:35:06 |
| Maintainer | None |
| Docs URL | None |
| Author | None |
| Requires Python | >=3.8 |
| License | MIT License Copyright (c) 2022 LiberTEM Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
| Keywords | numpy, scipy.sparse, sparse, array, matrix, cupy, cupyx.scipy.sparse |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

# sparseconverter
Format detection, identifiers and converter matrix for a range of numerical array formats (backends) in Python, focusing on sparse arrays.

## Usage

Basic usage:

```python
import numpy as np
import sparseconverter as spc

a1 = np.array([
    (1, 0, 3),
    (0, 0, 6)
])

# array conversion
a2 = spc.for_backend(a1, spc.SPARSE_GCXS)

# format determination
print("a1 is", spc.get_backend(a1), "and a2 is", spc.get_backend(a2))
```

```
a1 is numpy and a2 is sparse.GCXS
```


See the `examples/` directory for more!

## Description

This library helps to implement algorithms that support a wide range of array formats as input, output, or
for internal calculations. Dense and sparse array libraries already support format detection, creation, and export from and to various formats,
but with different APIs, different sets of formats, and different sets of supported features such as dtypes, shapes, and device classes.

This project creates a unified API for all conversions between the supported formats and takes care of details such as reshaping,
dtype conversion, and using an efficient intermediate format for multi-step conversions.
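
The idea of routing a multi-step conversion through an efficient intermediate format can be sketched as a shortest-path search over a table of direct conversion costs. This is only an illustration under stated assumptions, not the library's implementation: the `COSTS` table, its numbers, the identifier strings other than those shown in the usage output, and the function `cheapest_route()` are all hypothetical.

```python
import heapq

# Hypothetical direct-conversion costs between format identifiers.
# Names and numbers are made up for illustration only.
COSTS = {
    ("numpy", "scipy.sparse.csr"): 2.0,
    ("scipy.sparse.csr", "numpy"): 1.0,
    ("scipy.sparse.csr", "scipy.sparse.csc"): 1.0,
    ("scipy.sparse.csc", "scipy.sparse.csr"): 1.0,
    ("numpy", "sparse.COO"): 3.0,
    ("sparse.COO", "scipy.sparse.csc"): 4.0,
}


def cheapest_route(source, target, costs=COSTS):
    """Return (total_cost, [formats...]) for the cheapest chain of conversions."""
    queue = [(0.0, source, [source])]
    done = set()
    while queue:
        # Always expand the cheapest partial route first (Dijkstra's algorithm).
        cost, fmt, path = heapq.heappop(queue)
        if fmt == target:
            return cost, path
        if fmt in done:
            continue
        done.add(fmt)
        for (src, dst), step in costs.items():
            if src == fmt and dst not in done:
                heapq.heappush(queue, (cost + step, dst, path + [dst]))
    raise ValueError(f"no conversion route from {source} to {target}")


# With the table above, going through CSR is cheaper than going through COO.
print(cheapest_route("numpy", "scipy.sparse.csc"))
```

In this toy table there is no direct conversion from `numpy` to `scipy.sparse.csc`, so the search picks CSR as the intermediate format because its two steps cost less in total than the route through COO.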

## Features
* Supports Python 3.8 - (at least) 3.12
* Defines constants for format identifiers
* Various sets to group formats into categories:
  * Dense vs sparse
  * CPU vs CuPy-based
  * nD vs 2D backends
* Efficiently detect format of arrays, including support for subclasses
* Get converter function for a pair of formats
* Convert to a target format
* Find most efficient conversion pair for a range of possible inputs and/or outputs

That way it can help to implement format-specific optimized versions of an algorithm,
to specify which formats are supported by a specific routine, to adapt to
availability of CuPy on a target machine,
and to perform efficient conversion to supported formats as needed.
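
Format detection with subclass support can be pictured as a lookup along the method resolution order of the array's type. The following is a minimal sketch using only NumPy, assuming a hypothetical `KNOWN_TYPES` table and a hypothetical `detect_format()` helper; the identifier `"numpy.matrix"` is an assumption as well. The library provides `get_backend()` for this purpose, and its internals may differ.

```python
import numpy as np

# Hypothetical identifiers for this sketch; the library defines its own constants.
KNOWN_TYPES = {
    np.ndarray: "numpy",
    np.matrix: "numpy.matrix",
}


def detect_format(arr):
    """Return the identifier of the closest known base class of ``arr``."""
    # The MRO lists the most specific class first, so numpy.matrix is found
    # before its base class numpy.ndarray.
    for cls in type(arr).__mro__:
        if cls in KNOWN_TYPES:
            return KNOWN_TYPES[cls]
    raise ValueError(f"unknown array type {type(arr)!r}")


class TaggedArray(np.ndarray):
    """A user-defined ndarray subclass, used to exercise the fallback."""


plain = np.zeros((2, 3))
tagged = plain.view(TaggedArray)
matrix = np.asmatrix(plain)

for obj in (plain, tagged, matrix):
    print(type(obj).__name__, "->", detect_format(obj))
```

Walking the MRO means a user-defined subclass is reported as its nearest supported base format, while a more specific known type such as `numpy.matrix` still takes precedence over `numpy.ndarray`.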

## Supported array formats
* [`numpy.ndarray`](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html)
* [`numpy.matrix`](https://numpy.org/doc/stable/reference/generated/numpy.matrix.html) -- to support the result of aggregation operations on `scipy.sparse` matrices
* [`cupy.ndarray`](https://docs.cupy.dev/en/stable/reference/generated/cupy.ndarray.html)
* [`sparse.COO`](https://sparse.pydata.org/en/stable/generated/sparse.COO.html)
* [`sparse.GCXS`](https://sparse.pydata.org/en/stable/generated/sparse.GCXS.html)
* [`sparse.DOK`](https://sparse.pydata.org/en/stable/generated/sparse.DOK.html)
* [`scipy.sparse.coo_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_matrix.html)
* [`scipy.sparse.csr_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_matrix.html)
* [`scipy.sparse.csc_matrix`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_matrix.html)
* [`scipy.sparse.coo_array`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.coo_array.html)
* [`scipy.sparse.csr_array`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csr_array.html)
* [`scipy.sparse.csc_array`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.csc_array.html)
* [`cupyx.scipy.sparse.coo_matrix`](https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.sparse.coo_matrix.html)
* [`cupyx.scipy.sparse.csr_matrix`](https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.sparse.csr_matrix.html)
* [`cupyx.scipy.sparse.csc_matrix`](https://docs.cupy.dev/en/stable/reference/generated/cupyx.scipy.sparse.csc_matrix.html)

## Still TODO

* PyTorch arrays
* More detailed cost metric based on more real-world use cases and parameters.

## Changelog

### 0.5.0 (in development)

* No changes yet

### 0.4.0

* Better error message in case of unknown array type: https://github.com/LiberTEM/sparseconverter/pull/37
* Support for SciPy sparse arrays: https://github.com/LiberTEM/sparseconverter/pull/52
* Drop support for Python 3.7: https://github.com/LiberTEM/sparseconverter/pull/51

### 0.3.4

* Support for Python 3.12 https://github.com/LiberTEM/sparseconverter/pull/26
* Packaging update: Tests for conda-forge https://github.com/LiberTEM/sparseconverter/pull/27

### 0.3.3

* Perform feature checks lazily https://github.com/LiberTEM/sparseconverter/issues/15

### 0.3.2

* Detection and workaround for https://github.com/pydata/sparse/issues/602.
* Detection and workaround for https://github.com/cupy/cupy/issues/7713.
* Test with duplicates and scrambled indices.
* Test correctness of basic array operations.

### 0.3.1

* Include version constraint for `sparse`.

### 0.3.0

* Introduce `conversion_cost()` to obtain a value roughly proportional to the conversion cost
  between two backends.

### 0.2.0

* Introduce `result_type()` to find the smallest NumPy dtype that accommodates
  all parameters. Allowed as parameters are all valid arguments to
  `numpy.result_type(...)` plus backend specifiers.
* Support `cupyx.scipy.sparse.csr_matrix` with `dtype=bool`.

### 0.1.1

Initial release

## Known issues

* `conda install -c conda-forge cupy` on Python 3.7 and Windows 11 may install `cudatoolkit` 10.1 and `cupy` 8.3, which have sporadically produced invalid data structures for `cupyx.scipy.sparse.csc_matrix` for unknown reasons. This doesn't happen with current versions. Running the benchmark function `benchmark_conversions()` can help to debug such issues since it performs all pairwise conversions and checks for correctness.
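
The kind of check described above, converting between every pair of formats and comparing against a reference, can be approximated on the CPU with NumPy and SciPy alone. This sketch is a stand-in and does not call `benchmark_conversions()`; the `FORMATS` table, `to_dense()`, and `check_pairwise()` are hypothetical names introduced for illustration.

```python
import itertools

import numpy as np
import scipy.sparse as sp


def to_dense(a):
    """Return a dense NumPy array for dense or scipy.sparse input."""
    return a.toarray() if sp.issparse(a) else np.asarray(a)


# Constructors for a few CPU formats. The scipy.sparse constructors accept
# dense arrays as well as other scipy.sparse matrices.
FORMATS = {
    "numpy": to_dense,
    "scipy.sparse.coo": sp.coo_matrix,
    "scipy.sparse.csr": sp.csr_matrix,
    "scipy.sparse.csc": sp.csc_matrix,
}


def check_pairwise(reference):
    """Convert between all format pairs and return the pairs that lose data."""
    failures = []
    for src, dst in itertools.product(FORMATS, repeat=2):
        source = FORMATS[src](reference)
        converted = FORMATS[dst](source)
        if not np.array_equal(to_dense(converted), reference):
            failures.append((src, dst))
    return failures


reference = np.array([(1, 0, 3), (0, 0, 6)])
print("failing pairs:", check_pairwise(reference))
```

A non-empty result would point at the specific source and target pair that produced wrong data, which narrows down where an installation problem lies.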

## Notes

This project is developed primarily for sparse data support in [LiberTEM](https://libertem.github.io). For that reason it includes
the backend `CUDA`, which indicates a NumPy array that is intended for execution on a CUDA device.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "sparseconverter",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "numpy, scipy.sparse, sparse, array, matrix, cupy, cupyx.scipy.sparse",
    "author": null,
    "author_email": "Dieter Weber <d.weber@fz-juelich.de>",
    "download_url": "https://files.pythonhosted.org/packages/37/5c/3a6f0aec3a2712ed3e687a5e39576a0e545a788a2fbe065042d945fdad8c/sparseconverter-0.4.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2022 LiberTEM  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Converter matrix and type determination for a range of array formats, focusing on sparse arrays",
    "version": "0.4.0",
    "project_urls": {
        "repository": "https://github.com/LiberTEM/sparseconverter"
    },
    "split_keywords": [
        "numpy",
        " scipy.sparse",
        " sparse",
        " array",
        " matrix",
        " cupy",
        " cupyx.scipy.sparse"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "12790399c906162d90ef5dc5f3ff43674a82351f794d9c54f60740e40170d5df",
                "md5": "7f9ba1d2718a3f338f932c24cb6eca7a",
                "sha256": "10e45d07bd50af88d5041af56eb2882ab2eb47b602b7d6e119490eaedbf24aac"
            },
            "downloads": -1,
            "filename": "sparseconverter-0.4.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7f9ba1d2718a3f338f932c24cb6eca7a",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 17646,
            "upload_time": "2024-10-22T12:35:04",
            "upload_time_iso_8601": "2024-10-22T12:35:04.968527Z",
            "url": "https://files.pythonhosted.org/packages/12/79/0399c906162d90ef5dc5f3ff43674a82351f794d9c54f60740e40170d5df/sparseconverter-0.4.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "375c3a6f0aec3a2712ed3e687a5e39576a0e545a788a2fbe065042d945fdad8c",
                "md5": "df67824382e61f8558e8dab7ff104b92",
                "sha256": "60cc87d8b18fe740101a8320226a25f4a25b1513659e887f1a4699aba2a8bcee"
            },
            "downloads": -1,
            "filename": "sparseconverter-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "df67824382e61f8558e8dab7ff104b92",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 23868,
            "upload_time": "2024-10-22T12:35:06",
            "upload_time_iso_8601": "2024-10-22T12:35:06.044898Z",
            "url": "https://files.pythonhosted.org/packages/37/5c/3a6f0aec3a2712ed3e687a5e39576a0e545a788a2fbe065042d945fdad8c/sparseconverter-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-22 12:35:06",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "LiberTEM",
    "github_project": "sparseconverter",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sparseconverter"
}
        