cereggii


- Name: cereggii
- Version: 0.2.0
- Summary: Thread synchronization utilities for free-threaded Python.
- Upload time: 2024-05-09 15:25:41
- Requires Python: ~=3.9
- Keywords: multithreading

# cereggii

Thread synchronization utilities for free-threaded Python.

This library provides atomic data types which, in a multithreaded context, generally perform better than
CPython's built-in types.

## Cereus greggii

<img src="./.github/cereggii.jpg" align="right">

*Peniocereus greggii* (also known as *Cereus greggii*) is a cactus native to Arizona, New Mexico, Texas, and some
parts of northern Mexico.

Its flowers bloom on just one summer night each year and, in any given area, they all bloom in synchrony.

[Wikipedia](https://en.wikipedia.org/wiki/Peniocereus_greggii)

_Image credits: Patrick Alexander, Peniocereus greggii var. greggii, south of Cooke's Range, Luna County, New Mexico, 10
May 2018, CC0. [source](https://www.flickr.com/photos/aspidoscelis/42926986382)_

## Installing

*This library is experimental*

*Arm disclaimer:* `aarch64` processors are generally not supported, but this library was successfully used with Apple
Silicon.

This library requires [@colesbury's original nogil fork](https://github.com/colesbury/nogil?tab=readme-ov-file#installation).
You can install it with pyenv:

```shell
pyenv install nogil-3.9.10-1
```

Then, you may fetch this library [from PyPI](https://pypi.org/project/cereggii):

```shell
pip install cereggii
```

If you use a non-free-threaded interpreter, this library may not run at all, and if it does, you will
see poor performance.

## AtomicInt

In Python (be it free-threaded or not), the following piece of code is not thread-safe:

```python
i = 0
i += 1
```

That is, if `i` is shared among multiple threads and they modify it concurrently, updates can be lost: the value
of `i` after more than one write is unpredictable.

The following piece of code is instead thread-safe:

```python
import cereggii


i = cereggii.AtomicInt(0)
i += 1
print(i.get())
```
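
For comparison, without cereggii the same guarantee usually requires an explicit lock. The following is a minimal
standard-library sketch, not taken from this project; the names `increment_many`, `N_THREADS`, and `N_INCREMENTS` are
made up for illustration.

```python
import threading

N_THREADS = 4
N_INCREMENTS = 10_000

counter = 0
lock = threading.Lock()


def increment_many():
    global counter
    for _ in range(N_INCREMENTS):
        # Holding the lock makes the read-modify-write of `counter` one indivisible step.
        with lock:
            counter += 1


threads = [threading.Thread(target=increment_many) for _ in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter == N_THREADS * N_INCREMENTS)  # True: no increment was lost
```

The lock serializes every increment, which is the kind of contention the atomic types in this library aim to avoid.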

Also, consider the [counter example](./examples/atomic_int/counter.py) where three counter implementations are compared:

1. using a built-in `int`,
2. using `AtomicInt`, and
3. using `AtomicInt` with `AtomicIntHandle`.

```text
A counter using the built-in int.
spam.counter=0
spam.counter=5019655
Took 39.17 seconds.

A counter using cereggii.AtomicInt.
spam.counter.get()=0
spam.counter.get()=15000000
Took 36.78 seconds (1.07x faster).

A counter using cereggii.AtomicInt and cereggii.AtomicIntHandle.
spam.counter.get()=0
spam.counter.get()=15000000
Took 2.64 seconds (14.86x faster).
```

Notice that when using `AtomicInt` the count is correctly computed, and that using `AtomicInt.get_handle`
to access the counter greatly improves performance.
When using `AtomicIntHandle`, you should see your CPUs being fully used, because no implicit lock
prevents the execution of any thread. [^implicitlock]

`AtomicInt` borrows part of its API from Java's `AtomicInteger`, so it should feel familiar if you're
coming to Python from Java.
It also implements most numeric magic methods, so it should feel comfortable for Pythonistas.
It tries to mimic Python's `int` as closely as possible, with some caveats:

- it is limited to 64-bit integers, so you may encounter `OverflowError`;
- hashing is based on the `AtomicInt`'s address in memory, so two distinct `AtomicInt`s will have distinct hashes, even
  when they hold the same value (bonus feature: an `AtomicIntHandle` has the same hash as its
  corresponding `AtomicInt`); [^1]
- the following operations are not supported:
    - `__itruediv__` (an `AtomicInt` cannot be used to store floats)
    - `as_integer_ratio`
    - `bit_length`
    - `conjugate`
    - `from_bytes`
    - `to_bytes`
    - `denominator`
    - `numerator`
    - `imag`
    - `real`
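
The first two caveats can be illustrated with a pure-Python stand-in. This is only a sketch: `Int64Box` is a
hypothetical class, not cereggii's implementation, and the signed 64-bit range it enforces is an assumption.

```python
class Int64Box:
    """Hypothetical stand-in used to illustrate the caveats; not part of cereggii."""

    # Assumed bounds: a signed 64-bit integer.
    MIN = -(2**63)
    MAX = 2**63 - 1

    def __init__(self, value=0):
        self._check(value)
        self.value = value

    @classmethod
    def _check(cls, value):
        if not cls.MIN <= value <= cls.MAX:
            raise OverflowError(f"{value} does not fit in 64 bits")

    def __iadd__(self, other):
        result = self.value + other
        self._check(result)  # refuse results outside the 64-bit range
        self.value = result
        return self

    def __hash__(self):
        # Identity-based hashing: depends on the object, not on the value it holds.
        return id(self)


a, b = Int64Box(7), Int64Box(7)
same_hash = hash(a) == hash(b)
print(same_hash)  # False: same value, distinct objects

big = Int64Box(Int64Box.MAX)
try:
    big += 1
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)  # True
```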

[^implicitlock]: Put simply, in a free-threaded build,
the [global interpreter lock](https://docs.python.org/3/glossary.html#term-global-interpreter-lock) is replaced by
many per-object locks.

[^1]: This behavior ensures the hashing property that identity implies hash equality.

### An explanation of these claims

First, `a += 1` in CPython actually translates to more than one bytecode instruction, namely:

```text
LOAD_CONST               2 (1)
INPLACE_ADD              0 (a)
STORE_FAST               0 (a)
```

This means that between the `INPLACE_ADD` and the `STORE_FAST` instructions, the value of `a` may have been changed by
another thread, so that one or multiple increments may be lost.
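
You can inspect this on your own interpreter with the standard `dis` module. The sketch below is only an
illustration (the function `increment` is made up for it); opcode names differ between CPython versions and the nogil
fork, so the output will not necessarily match the listing above.

```python
import dis


def increment(a):
    a += 1
    return a


# Collect the opcode names the running interpreter generates for `increment`.
opnames = [instruction.opname for instruction in dis.get_instructions(increment)]
print(opnames)
```

Whatever the exact names, the increment is spread over several instructions, and a thread switch can happen between
any two of them.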

Second, the performance question: where does the speedup come from?

If we look again at [the example code](./examples/atomic_int/counter.py), there are a few implicit memory
locations that the threads contend for:

1. the reference count of `spam.count`;
2. the reference counts of the `int` objects themselves; and
3. the lock protecting the implicit `spam.__dict__`.

`AtomicInt` eliminates these points of contention, respectively:

1. `spam.count` is accessed indirectly through an `AtomicIntHandle` which avoids contention on its reference count (it
   is contended only during the `.get_handle()` call);
2. this is avoided by not creating `int` objects during the increment;
3. again, using the handle instead of the `AtomicInt` itself avoids spurious contention, because `spam.__dict__` is not
   modified.

Also see colesbury/nogil#121.

## AtomicDict

Looking at the execution of [the count keys example](./examples/atomic_dict/count_keys.py), you can see that there is
more performance to be gained by simply using `AtomicDict`.

```text
Counting keys using the built-in dict.
Took 35.46 seconds.

Counting keys using cereggii.AtomicDict.
Took 7.51 seconds (4.72x faster).
```

Notice that the performance gain was achieved by simply wrapping the dictionary initialization with `AtomicDict`
(compare `Spam.d` and `AtomicSpam.d` in [the example source](./examples/atomic_dict/count_keys.py)).

Using `AtomicInt` provides correctness, regardless of the hashmap implementation.
Using `AtomicDict` instead of `dict` improves performance, even without handles: writes to distinct keys do
not generate contention.

### Pre-sized dictionary, with partitioned iterations

`AtomicDict` provides some features that Python's `dict` lacks. Two of them are shown in
the [partitioned iteration example](./examples/atomic_dict/partitioned_iter.py):

1. pre-sizing, which avoids the expensive dynamic resizing of a hash table, and
2. partitioned iterations, which split the elements among threads.

```text
Insertion into builtin dict took 36.81s
Builtin dict iter took 17.56s with 1 thread.
----------
Insertion took 17.17s
Partitioned iter took 8.80s with 1 threads.
Partitioned iter took 5.03s with 2 threads.
Partitioned iter took 3.92s with 3 threads.
```
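
The idea behind partitioned iteration can be sketched with the standard library alone. This is not `AtomicDict`'s
API: it uses a plain `dict`, and the names `sum_partition`, `partial_sums`, and `N_THREADS` are made up for the
illustration.

```python
import threading
from itertools import islice

data = {key: key * 2 for key in range(1_000)}
N_THREADS = 3
partial_sums = [0] * N_THREADS


def sum_partition(partition):
    # Take every N_THREADS-th item, starting at this thread's offset.
    items = islice(data.items(), partition, None, N_THREADS)
    partial_sums[partition] = sum(value for _, value in items)


threads = [threading.Thread(target=sum_partition, args=(p,)) for p in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(partial_sums)
print(total == sum(data.values()))  # True: every item was visited exactly once
```

Each thread here still steps over the items it skips, so the sketch shows only how the work is divided, not the
performance benefit measured above.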

## AtomicRef

You can use an `AtomicRef` when you have a shared variable that points to an object, and you need to change the
referenced object concurrently.

This is not available in the Python standard library and was initially implemented as part of `AtomicDict`.

```python
import cereggii


o = object()
d = {}
i = 0

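r = cereggii.AtomicRef()  # starts out referencing None
assert r.get() is None
assert r.compare_and_set(None, o)  # succeeds because the reference was None
assert r.get_and_set(d) is o  # returns the previously referenced object
r.set(i)  # always returns None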
```
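
A common way to use `compare_and_set` is a retry loop: read the current object, build a replacement, and swap it in
only if nobody else changed the reference in the meantime. The sketch below shows the pattern with a hypothetical
lock-based stand-in, `LockedRef`, so that it runs without cereggii; that the comparison is by identity is an
assumption of the sketch.

```python
import threading


class LockedRef:
    """Hypothetical lock-based stand-in mimicking the methods shown above."""

    def __init__(self, value=None):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new):
        # Swap only if the current object is still `expected` (identity check, assumed).
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False


ref = LockedRef(())


def append(item):
    while True:
        current = ref.get()
        if ref.compare_and_set(current, current + (item,)):
            return  # the swap succeeded
        # Another thread changed the reference first: re-read and retry.


threads = [threading.Thread(target=append, args=(n,)) for n in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(ref.get()) == list(range(8)))  # True: no append was lost
```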

## Experimental

This library is experimental and should not be used in a production environment.

After all, as of now, it requires an unofficial fork in order to run.

Porting to free-threaded Python 3.13 (3.13t) is planned.


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "cereggii",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "~=3.9",
    "maintainer_email": null,
    "keywords": "multithreading",
    "author": null,
    "author_email": "dpdani <git@danieleparmeggiani.me>",
    "download_url": "https://files.pythonhosted.org/packages/f3/c3/76caecc706e7b07d357ea4c3b55b67af3865612f2bcb06d6ef884b1c20b5/cereggii-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Thread synchronization utilities for free-threaded Python.",
    "version": "0.2.0",
    "project_urls": {
        "Documentation": "https://github.com/dpdani/cereggii#readme",
        "Issues": "https://github.com/dpdani/cereggii/issues",
        "Source": "https://github.com/dpdani/cereggii"
    },
    "split_keywords": [
        "multithreading"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f3c376caecc706e7b07d357ea4c3b55b67af3865612f2bcb06d6ef884b1c20b5",
                "md5": "520797e1742fbabb81581c863d74bcb2",
                "sha256": "38172b8993488f68d8213247a1ca0055c9cff8f4e81f4825397aab0336cc7293"
            },
            "downloads": -1,
            "filename": "cereggii-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "520797e1742fbabb81581c863d74bcb2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "~=3.9",
            "size": 51940,
            "upload_time": "2024-05-09T15:25:41",
            "upload_time_iso_8601": "2024-05-09T15:25:41.685357Z",
            "url": "https://files.pythonhosted.org/packages/f3/c3/76caecc706e7b07d357ea4c3b55b67af3865612f2bcb06d6ef884b1c20b5/cereggii-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-09 15:25:41",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "dpdani",
    "github_project": "cereggii#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "cereggii"
}
        