:Name: SharedArray
:Version: 3.2.4
:Home page: https://gitlab.com/tenzing/shared-array
:Summary: Share numpy arrays between processes
:Upload time: 2024-07-18 10:10:53
:Author: Mathieu Mirmont
:Keywords: numpy array shared memory shm

SharedArray python/numpy extension
==================================

This is a simple python extension that lets you share numpy arrays with
other processes on the same computer. It uses either shared files or
POSIX shared memory as data stores and therefore should work on most
operating systems.

Example
-------

Here’s a simple example to give an idea of how it works. This example
does everything from a single python interpreter for the sake of
clarity, but the real point is to share arrays between python
interpreters.

.. code:: python

   import numpy as np
   import SharedArray as sa

   # Create an array in shared memory.
   a = sa.create("shm://test", 10)

   # Attach it as a different array. This can be done from another
   # python interpreter as long as it runs on the same computer.
   b = sa.attach("shm://test")

   # See how they are actually sharing the same memory.
   a[0] = 42
   print(b[0])

   # Destroying a does not affect b.
   del a
   print(b[0])

   # See how "test" is still present in shared memory even though we
   # destroyed the array a. This method only works on Linux.
   print(sa.list())

   # Now destroy the array "test" from memory.
   sa.delete("test")

   # The array b is still there, but once you destroy it then the
   # data is gone for real.
   print(b[0])

Functions
---------

SharedArray.create(name, shape, dtype=float)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This function creates an array in shared memory and returns a numpy
array that uses the shared memory as data backend.

The shared memory is identified by ``name``, which can use the
``file://`` prefix to indicate that the data backend will be a file, or
``shm://`` to indicate that the data backend shall be a POSIX shared
memory object. For backward compatibility ``shm://`` is assumed when no
prefix is given. Most operating systems implement strong file caching so
using a file as a data backend won’t usually affect performance.
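
The naming rule above can be made concrete with a small sketch. The
helper ``split_name`` is hypothetical and is not part of SharedArray,
which does its own parsing internally; it only mirrors the documented
rule that ``shm://`` is assumed when no prefix is given.

.. code:: python

   def split_name(name):
       """Return (backend, location) for a SharedArray-style name.

       Hypothetical helper for illustration only.
       """
       for backend in ("file", "shm"):
           prefix = backend + "://"
           if name.startswith(prefix):
               return backend, name[len(prefix):]
       # No prefix: POSIX shared memory is assumed (backward compatibility).
       return "shm", name


   print(split_name("shm://test"))            # ('shm', 'test')
   print(split_name("file:///tmp/test.dat"))  # ('file', '/tmp/test.dat')
   print(split_name("test"))                  # ('shm', 'test')

Under this rule ``"test"`` and ``"shm://test"`` refer to the same array,
which is why the example at the top can create ``"shm://test"`` and
later delete ``"test"``.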

The ``shape`` and ``dtype`` arguments are identical to those of the
numpy function ``numpy.zeros()``, and the returned array is indeed
initialized to zeros.

The content of the array lives in shared memory and/or in a file and
won’t be lost when the numpy array is deleted, nor when the python
interpreter exits. To delete a shared array and reclaim system resources
use the ``SharedArray.delete()`` function.

SharedArray.attach(name, ro=False)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This function attaches a previously created array in shared memory
identified by ``name``, which can use the ``file://`` prefix to indicate
that the array is stored as a file, or ``shm://`` to indicate that the
array is stored as a POSIX shared memory object. For backward
compatibility ``shm://`` is assumed when no prefix is given. The
optional parameter ``ro`` indicates that the array shall be attached
read-only.

An array may be simultaneously attached from multiple different
processes (i.e. python interpreters).

The content of the array lives in shared memory and/or in a file and
won’t be lost when the numpy array is deleted, nor when the python
interpreter exits. To delete a shared array and reclaim system resources
use the ``SharedArray.delete()`` function.

SharedArray.delete(name)
~~~~~~~~~~~~~~~~~~~~~~~~

This function destroys the previously created array identified by
``name``, which can use the ``file://`` prefix to indicate that the
array is stored as a file, or ``shm://`` to indicate that the array is
stored as a POSIX shared memory object. For backward compatibility
``shm://`` is assumed when no prefix is given.

After calling ``delete``, the array will not be attachable anymore, but
existing attachments will remain valid until they are themselves
destroyed. The data is reclaimed by the system when the very last
attachment is deleted.
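
This "the name goes away, the mappings stay" behaviour is ordinary POSIX
unlink semantics. The sketch below reproduces it with the standard
``mmap`` module and a temporary file instead of SharedArray, so it is an
analogy rather than the library's implementation, and it assumes a POSIX
system.

.. code:: python

   import mmap
   import os
   import tempfile

   # Create a small file and map it, standing in for a shared array.
   fd, path = tempfile.mkstemp()
   os.ftruncate(fd, mmap.PAGESIZE)
   mapping = mmap.mmap(fd, mmap.PAGESIZE)
   os.close(fd)  # the mapping stays valid after the descriptor is closed

   mapping[0] = 42

   # Remove the name: nothing new can open or attach it from now on.
   os.unlink(path)
   name_still_exists = os.path.exists(path)

   # The existing mapping is still readable and writable.
   value_after_unlink = mapping[0]
   print(name_still_exists, value_after_unlink)  # False 42

   # Closing the last mapping lets the system reclaim the storage.
   mapping.close()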

SharedArray.list()
~~~~~~~~~~~~~~~~~~

This function returns a list of previously created arrays stored as
POSIX SHM objects, along with their name, data type and dimensions. This
function only works on Linux because it directly accesses files exposed
under ``/dev/shm``. There doesn’t seem to be a portable method of
achieving this.
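
To see what is exposed under ``/dev/shm`` without SharedArray, a minimal
sketch using only the standard library is shown below. The helper
``shm_entries`` is hypothetical. Unlike ``SharedArray.list()`` it returns
every object in the directory, not only SharedArray arrays, and it does
not decode data types or dimensions.

.. code:: python

   import os


   def shm_entries(directory="/dev/shm"):
       """Return the sorted entry names in directory, or [] if unavailable."""
       try:
           return sorted(os.listdir(directory))
       except (FileNotFoundError, NotADirectoryError, PermissionError):
           # Not Linux, or the directory is not accessible.
           return []


   print(shm_entries())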

Constants
---------

SharedArray.MS_ASYNC
~~~~~~~~~~~~~~~~~~~~

Flag for the ``msync()`` method of the base object of the returned numpy
array (see below). Specifies that an update be scheduled, but the call
returns immediately.

SharedArray.MS_SYNC
~~~~~~~~~~~~~~~~~~~

Flag for the ``msync()`` method of the base object of the returned numpy
array (see below). Requests an update and waits for it to complete.

SharedArray.MS_INVALIDATE
~~~~~~~~~~~~~~~~~~~~~~~~~

Flag for the ``msync()`` method of the base object of the returned numpy
array (see below). Asks to invalidate other mappings of the same file
(so that they can be updated with the fresh values just written).

Base object
-----------

SharedArray registers its own python object as the
`base <https://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.base.html>`__
object of the returned numpy array. This base object exposes the
following methods and attributes:

msync(array, flags)
~~~~~~~~~~~~~~~~~~~

This method is a wrapper around ``msync(2)`` and is only useful when
using file-backed arrays (i.e. not POSIX shared memory). ``msync(2)``
flushes the mapped memory region back to the filesystem. The ``flags``
are exported as constants in the module definition (see above) and map
1:1 to the ``msync(2)`` flags. Refer to the manual page of ``msync(2)``
for details.
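
For readers unfamiliar with ``msync(2)``, the sketch below shows the same
idea with the standard ``mmap`` module rather than SharedArray: on Unix,
``mmap.flush()`` calls ``msync(2)`` with ``MS_SYNC``. The temporary file
stands in for a file-backed array.

.. code:: python

   import mmap
   import os
   import tempfile

   fd, path = tempfile.mkstemp()
   try:
       os.ftruncate(fd, mmap.PAGESIZE)
       with mmap.mmap(fd, mmap.PAGESIZE) as mapping:
           mapping[:5] = b"hello"
           # Returns only once the modified pages have been written
           # back to the underlying file.
           mapping.flush()
       with open(path, "rb") as f:
           flushed = f.read(5)
   finally:
       os.close(fd)
       os.unlink(path)

   print(flushed)  # b'hello'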

mlock(array)
~~~~~~~~~~~~

This method is a wrapper around ``mlock(2)``: lock the memory map into
RAM, preventing that memory from being paged to the swap area.
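
For unprivileged processes, ``mlock(2)`` is subject to the
``RLIMIT_MEMLOCK`` resource limit, so locking a large array may fail if
that limit is small. The sketch below only inspects the limit with the
standard ``resource`` module; it does not call SharedArray, and the
helper ``describe`` is hypothetical.

.. code:: python

   import resource

   soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)


   def describe(limit):
       """Format a resource limit as text."""
       if limit == resource.RLIM_INFINITY:
           return "unlimited"
       return f"{limit} bytes"


   print("soft:", describe(soft), "hard:", describe(hard))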

munlock(array)
~~~~~~~~~~~~~~

This method is a wrapper around ``munlock(2)``: unlock the memory map,
allowing that memory to be paged to the swap area.

name
~~~~

This constant string is the name of the array as passed to
``SharedArray.create()`` or ``SharedArray.attach()``. It may be passed
to ``SharedArray.delete()``.

addr
~~~~

Base address of the array in memory.

size
~~~~

Size of the array in memory.

Requirements
------------

-  Python 2.7 or 3+
-  Numpy 1.8+
-  POSIX shared memory interface

SharedArray uses the POSIX shm interface (``shm_open`` and
``shm_unlink``) and so should work on most POSIX operating systems
(Linux, BSD, etc.). It has been reported to work on macOS, and it is
unlikely to work on Windows.

Installation
------------

The extension uses the ``distutils`` python package that should be
familiar to most python users. To test the extension directly from the
source tree, without installing, type:

.. code:: sh

   python setup.py build_ext --inplace

To build and install the extension system-wide, type:

.. code:: sh

   python setup.py build
   sudo python setup.py install

The package is also available on PyPI and can be installed using the pip
tool.

FAQ
---

On Linux, I get segfaults when working with very large arrays.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A few people have reported segfaults with very large arrays using POSIX
shared memory. This is not a bug in SharedArray but rather an indication
that the system ran out of POSIX shared memory.

On Linux a ``tmpfs`` virtual filesystem is used to provide POSIX shared
memory, and by default it is given only about 20% of the total available
memory, depending on the distribution. That amount can be changed by
re-mounting the ``tmpfs`` filesystem with the ``size=100%`` option:

.. code:: sh

   sudo mount -o remount,size=100% /run/shm

To make the change permanent across reboots on recent Debian
installations, set ``SHM_SIZE=100%`` in ``/etc/default/tmpfs``.
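
One way to avoid running out of space is to compare the size of the
array with the free space of the ``tmpfs`` before creating it. The
sketch below uses only the standard library. The helpers
``shm_free_bytes`` and ``array_bytes`` are hypothetical, the mount point
is assumed to be ``/dev/shm``, and the estimate ignores the small amount
of metadata SharedArray stores alongside the data.

.. code:: python

   import math
   import os
   import shutil


   def shm_free_bytes(path="/dev/shm"):
       """Free bytes on the filesystem at path, or None if it is missing."""
       if not os.path.isdir(path):
           return None
       return shutil.disk_usage(path).free


   def array_bytes(shape, itemsize=8):
       """Payload size of an array; itemsize=8 matches the default float."""
       return math.prod(shape) * itemsize


   needed = array_bytes((1000, 1000))
   free = shm_free_bytes()
   print(needed, free)
   if free is not None and needed > free:
       print("not enough POSIX shared memory for this array")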

On Linux, I get “Cannot allocate memory” when creating many arrays.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

SharedArray uses one memory map per array that is attached (or created).
By default the maximum number of memory maps per process is set by the
Linux kernel to 65530. If you want to create more arrays than that you
need to tune the kernel parameter ``vm.max_map_count`` and set it to a
higher value.

.. code:: sh

   /sbin/sysctl vm.max_map_count=655300

Note that for the change to be permanent you need to add this line to
``/etc/sysctl.conf``:

.. code:: sh

   vm.max_map_count=655300
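
To see how close a process is to the limit, the current value and the
number of mappings the process holds can be read from ``/proc`` on
Linux. This is a minimal sketch with hypothetical helper names; both
helpers return ``None`` where the files are unavailable.

.. code:: python

   def read_max_map_count(path="/proc/sys/vm/max_map_count"):
       """Return the kernel limit on memory maps per process, or None."""
       try:
           with open(path) as f:
               return int(f.read())
       except (OSError, ValueError):
           return None


   def count_own_maps(path="/proc/self/maps"):
       """Return the number of memory maps of this process, or None."""
       try:
           with open(path) as f:
               # Each line of /proc/self/maps describes one mapping.
               return sum(1 for _ in f)
       except OSError:
           return None


   print(read_max_map_count(), count_own_maps())

Since each attached array uses one memory map, the difference between
the two numbers is a rough upper bound on how many more arrays the
process can attach.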

I can’t attach old (pre 0.4) arrays anymore.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since version 0.4 all arrays are page aligned in memory so that they can
be used with SIMD instructions (e.g. by the fftw library). As a side
effect, arrays created with a previous version of SharedArray aren’t
compatible with the new version (the location of the metadata changed).
Save your work before upgrading.

Contact
-------

This package is hosted on `GitLab <https://gitlab.com>`__ at:
https://gitlab.com/tenzing/shared-array

Packages are also available on PyPI at:
https://pypi.python.org/pypi/SharedArray

For bug reports, feature requests, suggestions, patches and everything
else related to SharedArray, feel free to raise issues on the `project
page <https://gitlab.com/tenzing/shared-array>`__. You can also contact
the maintainer directly by email at mat@parad0x.org.

            
