|Travis| |PyPI| |Anaconda|

Numpy indexed operations
========================

This package provides efficient, vectorized indexed operations on numpy ndarrays, such as grouping and set operations.

* Rich and efficient grouping functionality:

  - splitting of values by key-group
  - reductions of values by key-group

* Generalization of existing array set operations to nd-arrays, such as:

  - unique
  - union
  - difference
  - exclusive (xor)
  - contains / in (in1d)

* Some new functions:

  - indices: numpy equivalent of list.index
  - count: numpy equivalent of collections.Counter
  - mode: find the most frequently occurring items in a set
  - multiplicity: number of occurrences of each key in a sequence
  - count\_table: like R's table or pandas crosstab, or an ndim version of np.bincount
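
To make the descriptions of the new functions concrete, here is a minimal sketch of what count, multiplicity, mode and indices compute, written in plain numpy. It is not this package's implementation and does not call its API; the variable names are illustrative, and the ``indices`` part assumes the reference array has no duplicates and contains every query.

```python
import numpy as np

keys = np.array([4, 7, 4, 9, 4, 7])

# One call yields the distinct keys, each element's position among
# those keys, and how often every distinct key occurs.
unique_keys, inverse, counts = np.unique(
    keys, return_inverse=True, return_counts=True
)

# "count": distinct keys with their frequencies (like collections.Counter).
print(unique_keys, counts)   # [4 7 9] [3 2 1]

# "multiplicity": for every element, how often its key occurs overall.
multiplicity = counts[inverse]
print(multiplicity)          # [3 2 3 1 3 2]

# "mode": the most frequently occurring key.
mode = unique_keys[np.argmax(counts)]
print(mode)                  # 4

# "indices": position of each query in a reference (like list.index).
# Assumption: reference items are unique and every query is present.
reference = np.array([30, 10, 20])
queries = np.array([20, 30])
sorter = np.argsort(reference)
positions = sorter[np.searchsorted(reference, queries, sorter=sorter)]
print(positions)             # [2 0]
```

The package generalizes these ideas to keys that are rows of an ndarray or zipped sequences, which plain ``np.unique`` on a 1-D array does not cover.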

A few brief examples to give an impression:

.. code:: python

    import numpy as np
    from numpy_indexed import contains, exclusive, group_by, indices, union

    # three sets of graph edges (each edge is a pair of ints)
    edges = np.random.randint(0, 9, (3, 100, 2))
    # find graph edges exclusive to one of the three sets
    ex = exclusive(*edges)
    print(ex)
    # which edges are exclusive to the first set?
    print(contains(edges[0], ex))
    # where are the exclusive edges relative to the totality of them?
    print(indices(union(*edges), ex))
    # group values by identical keys and reduce each group to its median
    values = np.random.rand(100, 20)
    print(group_by(edges[0]).median(values))
    # and so on...

Installation
------------

.. code:: bash

    conda install numpy-indexed -c conda-forge

or

.. code:: bash

    pip install numpy-indexed

See: https://pypi.python.org/pypi/numpy-indexed

Design decisions:
-----------------

This package builds upon a generalization of the design pattern found
in numpy.unique: by argsorting an ndarray, many subsequent operations
can be implemented efficiently and in a vectorized manner.
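
The argsort pattern can be illustrated with a small group-by-sum in plain numpy. This is only a sketch of the general idea, not the code this package uses; ``group_sum`` is a hypothetical helper, and it assumes non-empty 1-D keys of a sortable dtype.

```python
import numpy as np

def group_sum(keys, values):
    """Sum `values` per distinct key using one stable argsort.

    Hypothetical helper for illustration; assumes non-empty 1-D input.
    """
    keys = np.asarray(keys)
    values = np.asarray(values)
    # Sort once; afterwards equal keys form contiguous runs.
    order = np.argsort(keys, kind="stable")
    sorted_keys = keys[order]
    sorted_values = values[order]
    # A run starts at position 0 and wherever the sorted key changes.
    starts = np.flatnonzero(np.r_[True, sorted_keys[1:] != sorted_keys[:-1]])
    # Reduce every contiguous run in a single vectorized call.
    sums = np.add.reduceat(sorted_values, starts)
    return sorted_keys[starts], sums

keys = np.array([3, 1, 3, 2, 1, 3])
values = np.array([10.0, 1.0, 20.0, 5.0, 2.0, 30.0])
unique_keys, sums = group_sum(keys, values)
print(unique_keys)  # [1 2 3]
print(sums)         # [ 3.  5. 60.]
```

Once the sort order and run boundaries are known, other reductions (counts, means, first/last element per group) can reuse them without sorting again, which is presumably why caching them in an index object pays off.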

The sorting and related low-level operations are encapsulated in a
hierarchy of Index classes, which allows efficient lookup of many
properties for a variety of key types. The public API of this
package is a fairly thin wrapper around these Index objects.

Beyond standard sequences of sortable primitive types, two complex key
types are currently supported: ndarray keys (i.e., finding unique
rows/columns of an array) and composite keys (zipped sequences). For the
exact casting rules that map sequences of key objects to index
objects, see as\_index().
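
The two complex key types can be pictured with plain numpy as follows. This is a sketch under the assumption that "ndarray keys" means treating each row as one key and "composite keys" means element-wise zipped sequences; it does not use this package's Index classes or as\_index(), and all variable names are illustrative.

```python
import numpy as np

# ndarray keys: each row of a 2-D array acts as one key.
rows = np.array([[0, 1], [2, 3], [0, 1], [4, 5], [2, 3]])
unique_rows, row_counts = np.unique(rows, axis=0, return_counts=True)
print(unique_rows.tolist())  # [[0, 1], [2, 3], [4, 5]]
print(row_counts.tolist())   # [2, 2, 1]

# Composite keys: several aligned sequences, zipped element-wise.
names = np.array(["a", "b", "a", "b", "a"])
years = np.array([2020, 2020, 2021, 2020, 2020])
# np.lexsort treats the LAST key as primary, so `names` is primary here.
order = np.lexsort((years, names))
s_names, s_years = names[order], years[order]
# A new group starts wherever either component of the key changes.
changed = (s_names[1:] != s_names[:-1]) | (s_years[1:] != s_years[:-1])
starts = np.flatnonzero(np.r_[True, changed])
group_sizes = np.diff(np.r_[starts, len(order)])
composite_keys = list(zip(s_names[starts].tolist(), s_years[starts].tolist()))
print(composite_keys)        # [('a', 2020), ('a', 2021), ('b', 2020)]
print(group_sizes.tolist())  # [2, 1, 2]
```

In both cases the key is reduced to a sort order plus run boundaries, so the same grouping and set logic can be applied regardless of the key type.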
Todo and open questions:
------------------------

- There may be further generalizations that could be built on top of
  these abstractions. Merge/join functionality, perhaps?

.. |Travis| image:: https://travis-ci.org/EelcoHoogendoorn/Numpy_arraysetops_EP.svg?branch=master
   :target: https://travis-ci.org/EelcoHoogendoorn/Numpy_arraysetops_EP
.. |PyPI| image:: https://badge.fury.io/py/numpy-indexed.svg
   :target: https://pypi.org/project/numpy-indexed/
.. |Anaconda| image:: https://anaconda.org/conda-forge/numpy-indexed/badges/version.svg
   :target: https://anaconda.org/conda-forge/numpy-indexed

Raw data
{
"_id": null,
"home_page": "https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP",
"name": "numpy-indexed",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "numpy group_by set-operations indexing",
"author": "Eelco Hoogendoorn",
"author_email": "hoogendoorn.eelco@gmail.com",
"download_url": "",
"platform": "Any",
"bugtrack_url": null,
"license": "Freely Distributable",
"summary": "This package contains functionality for indexed operations on numpy ndarrays, providing efficient vectorized functionality such as grouping and set operations.",
"version": "0.3.7",
"project_urls": {
"Homepage": "https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP"
},
"split_keywords": [
"numpy",
"group_by",
"set-operations",
"indexing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "51a228e87c9255c4a2ead7a1253f48296faa1e5a86273f99da74a0ff9619f583",
"md5": "2806faff660f9edcc5ede43a888ac5e3",
"sha256": "3e9f8f5ca453e49809618b3717b8ce07551b616a4ae43069c46aaad286386a9e"
},
"downloads": -1,
"filename": "numpy_indexed-0.3.7-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "2806faff660f9edcc5ede43a888ac5e3",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 19703,
"upload_time": "2023-02-28T10:22:07",
"upload_time_iso_8601": "2023-02-28T10:22:07.866836Z",
"url": "https://files.pythonhosted.org/packages/51/a2/28e87c9255c4a2ead7a1253f48296faa1e5a86273f99da74a0ff9619f583/numpy_indexed-0.3.7-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-02-28 10:22:07",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "EelcoHoogendoorn",
"github_project": "Numpy_arraysetops_EP",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"appveyor": true,
"lcname": "numpy-indexed"
}