| Field | Value |
| --- | --- |
| Name | lmdb-dict-full |
| Version | 1.0.2 |
| Summary | Full-featured Python dict interface to the LMDB "Lightning" Database. |
| Upload time | 2023-03-30 18:42:21 |
| Requires Python | >=3.8 |
| Keywords | dict, lmdb |

# lmdb-dict-full

[![PyPI - Version](https://img.shields.io/pypi/v/lmdb-dict-full.svg)](https://pypi.org/project/lmdb-dict-full)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/lmdb-dict-full.svg)](https://pypi.org/project/lmdb-dict-full)

**The full-featured `dict` interface to the LMDB "Lightning" Database.**

* Internally optimized via `lmdb` library cursors. Optional LRU caching of deserialized values. Thread-safe operations. No added reserved keys, *etc.*

* Provides value-serializing `SafeLmdbDict` and str-only `StrLmdbDict`, as well as abstract base class `LmdbDict` for customization of database encoding.

* Unique-key, labeled and unlabeled databases and read-write sessions supported.

-----

**Table of Contents**

- [Installation](#installation)
- [Use](#use)
  - [General use](#general-use)
  - [Caching](#caching)
  - [Str-only](#str-only)
- [License](#license)

## Installation

```console
pip install lmdb-dict-full
```

## Use

### General use

`SafeLmdbDict` provides the full `dict` interface to an LMDB database at a given filesystem path. (If the directory does not already contain a database, an empty one is provisioned automatically.)

Values are automatically serialized (deserialized) and compressed (decompressed) using [PyYAML](https://pypi.org/project/PyYAML/) and [zlib](https://docs.python.org/3/library/zlib.html).

```python
from lmdb_dict import SafeLmdbDict

dbdict = SafeLmdbDict('/path/to/db/directory/0/')

dbdict['aaa'] = {'values': [0, 1, 'x']}
```
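
The compression half of the serialization described above can be illustrated with the standard library alone. This is only a rough sketch: the YAML text is written by hand rather than produced by PyYAML, and the exact bytes that `SafeLmdbDict` stores are an assumption, not something documented here.

```python
import zlib

# Hand-written YAML text for {'values': [0, 1, 'x']}; in practice PyYAML
# would produce the serialized form (assumption: UTF-8 encoded text).
yaml_text = "values:\n- 0\n- 1\n- x\n"

# Compress the encoded text before it is written to the database ...
stored = zlib.compress(yaml_text.encode("utf-8"))

# ... and reverse both steps when the value is read back.
restored = zlib.decompress(stored).decode("utf-8")

assert restored == yaml_text
```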

One or more named databases are also supported.

LMDB requires that the maximum number of named databases be specified up front. The example below needs only two.

```python
users = SafeLmdbDict('/path/to/db/directory/1/', name='users', max_dbs=2)

hats = SafeLmdbDict('/path/to/db/directory/1/', name='hats', max_dbs=2)
```

Note that it would be unsafe to hold open multiple `lmdb` client objects for the same path within a single process at once. `lmdb-dict-full` handles this automatically: a weak reference is kept to the client opened for each filesystem path, and that client is reused by each `LmdbDict` requiring it.

### Caching

Caching of LMDB itself *should not be necessary*. The database "fully exploits the operating system’s buffer cache" and memory mapping [[ref]](https://lmdb.readthedocs.io/en/release/).

Moreover, `lmdb-dict-full` makes every effort to use `lmdb` efficiently, such that the user need not be concerned with undue overhead of interacting with the database-backed dictionary.

That said: the value serialization layer of `SafeLmdbDict` is another matter. Given sufficiently hefty values to deserialize, it *may* be worthwhile to engage the `lmdb-dict-full` caching layer, along with the trade-offs that it entails.

#### Caveats

**`lmdb-dict-full` caching is thread-safe**

This is achieved with behind-the-scenes locking, narrowly applied to individual keys where feasible. The small overhead of this locking applies only when caching is enabled.
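
To make "narrowly applied" concrete, here is a minimal sketch of per-key locking around a cache. The names `KeyLocks` and `cached_load` are hypothetical, and the library's actual locking scheme may well differ.

```python
import threading
from collections import defaultdict


class KeyLocks:
    """Hypothetical registry handing out one lock per key."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the registry itself
        self._locks = defaultdict(threading.Lock)

    def for_key(self, key):
        with self._guard:
            return self._locks[key]


locks = KeyLocks()
cache = {}


def cached_load(key, load):
    """Populate the cache for `key` while holding only that key's lock."""
    with locks.for_key(key):
        if key not in cache:
            cache[key] = load(key)
        return cache[key]
```

Threads working on different keys do not block one another here, while threads racing on the same key deserialize its value only once.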

**`lmdb-dict-full` caching is *not* (yet) *automatically* process-safe**

Caching is thread-safe thanks to thread locks and (again) weak references to caches which must be shared across dictionaries backed by the same databases.

Achieving the same under a multiprocessing regime would be another matter.

Users may nonetheless make use of `lmdb-dict-full` while multiprocessing, either without caching or with thoughtful application of caches across processes.

#### Options

Caching is built into all concrete subclasses of `LmdbDict`; however, it is disabled by default, in that it is set to `DummyCache` – a mapping capable of storing zero items.
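
A mapping "capable of storing zero items" might look like the following. `ZeroCache` is a hypothetical illustration of the idea, not the source of `DummyCache`.

```python
from collections.abc import MutableMapping


class ZeroCache(MutableMapping):
    """Hypothetical mapping that accepts writes but retains nothing."""

    maxsize = 0

    def __getitem__(self, key):
        raise KeyError(key)  # nothing is ever stored

    def __setitem__(self, key, value):
        pass  # silently discard the value

    def __delitem__(self, key):
        raise KeyError(key)

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


cache = ZeroCache()
cache['aaa'] = {'values': [0, 1, 'x']}
assert 'aaa' not in cache and len(cache) == 0
```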

Subclasses of `LmdbDict` check their cache for its maximum capacity by means of: `getattr(cache, 'maxsize', …)`. A cache reporting `maxsize=0` – such as the `DummyCache` – will be given *dummy locks*, such that locking is disabled for this dictionary.

A cache reporting any other `maxsize` – or lacking this property – is treated as a proper cache, and locking will be applied.
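
The capacity check can be sketched as below. The function `select_lock` is hypothetical, and the `None` default passed to `getattr` is only a placeholder chosen for this sketch, since the default the library uses is elided above.

```python
import contextlib
import threading


def select_lock(cache):
    """Return a no-op lock for zero-capacity caches, a real lock otherwise."""
    # Placeholder default (assumption): a missing attribute yields None,
    # which is not 0, so the cache is treated as a proper cache.
    maxsize = getattr(cache, 'maxsize', None)
    if maxsize == 0:
        return contextlib.nullcontext()  # "dummy lock": locking disabled
    return threading.Lock()


class Zero:
    maxsize = 0


class Bounded:
    maxsize = 128


assert isinstance(select_lock(Zero()), contextlib.nullcontext)
assert not isinstance(select_lock(Bounded()), contextlib.nullcontext)
assert not isinstance(select_lock({}), contextlib.nullcontext)  # no maxsize
```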

Caching may be specified – to `SafeLmdbDict` for example – via an instance, a class, or any callable returning an instance of a mapping for use as a deserialization cache. Either an instance or a class is strongly recommended, as these enable checking any cache retrieved from the weak reference registry against the user's instantiation argument.

```python
from lmdb_dict.cache import LRUCache128

SafeLmdbDict('/path/to/db/directory/', cache=LRUCache128)
```

Above, we've specified that our `SafeLmdbDict` should cache deserialized values using an instance of `LRUCache128` – that is, a subclass of the `LRUCache` provided by [cachetools](https://pypi.org/project/cachetools/). `LRUCache128` distinguishes itself only in that it requires no initialization arguments – a requirement of supplying a callable in lieu of a cache instance – and it sets `maxsize=128`.
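
The three accepted forms of the `cache` argument (instance, class, or other callable) could be normalized along these lines. Both `resolve_cache` and `SmallCache` are hypothetical names used for illustration; this is a sketch of the idea, not the library's code.

```python
from collections.abc import MutableMapping


def resolve_cache(spec):
    """Turn an instance, a class, or a zero-argument callable into a cache."""
    if isinstance(spec, MutableMapping):
        return spec    # already a cache instance
    if callable(spec):
        return spec()  # class or factory: must require no arguments
    raise TypeError(f"unsupported cache specification: {spec!r}")


class SmallCache(dict):
    """Hypothetical cache class requiring no initialization arguments."""

    maxsize = 128


instance = SmallCache()
assert resolve_cache(instance) is instance
assert isinstance(resolve_cache(SmallCache), SmallCache)
assert isinstance(resolve_cache(lambda: SmallCache()), SmallCache)
```

This also shows why a class passed as a callable must need no initialization arguments: it is called with none.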

As a shortcut to the above, `lmdb-dict-full` provides `CachedLmdbDict`:

```python
from lmdb_dict import CachedLmdbDict

CachedLmdbDict('/path/to/db/directory/')
```

`CachedLmdbDict` differs from other subclasses of `LmdbDict` in that it defaults to caching via `LRUCache128`. Other caches may be specified via the `cache` argument. Supplying an entity with property `maxsize=0` – such as the `DummyCache` – will raise a `TypeError`.

### Str-only

The above concrete subclasses of `LmdbDict` support arbitrary serializable values in order to best mimic the functionality of the Python `dict`.

For use-cases supporting str-only (and/or bytes-only) values, all of the above concerns over serialization, caching and locking may be sidestepped.

`StrLmdbDict` provides the same full-featured `dict` interface to LMDB, but only for values of type `str` and `bytes`.

```python
from lmdb_dict import StrLmdbDict

StrLmdbDict('/path/to/db/directory/')
```

`StrLmdbDict` further differs from other subclasses of `LmdbDict` in that it accepts no `cache` argument and performs no caching.

## License

`lmdb-dict-full` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "lmdb-dict-full",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "dict,lmdb",
    "author": null,
    "author_email": "Jesse London <jesselondon@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/4d/90/bad2239ee964b402e587251a5aa02c75967a98b608a1c79d47fec804778f/lmdb_dict_full-1.0.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Full-featured Python dict interface to the LMDB \"Lightning\" Database.",
    "version": "1.0.2",
    "split_keywords": [
        "dict",
        "lmdb"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "698ba59f0f74e7e7948d099cf0c9f4917a39ee3460b734e00bddd193386952c8",
                "md5": "bd2a51effd7e2ab9d6894e1b6d2ba612",
                "sha256": "58f34190f8eda8415dac8c0f4a7868597f190281ccd738edecbccb7c3e53852b"
            },
            "downloads": -1,
            "filename": "lmdb_dict_full-1.0.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "bd2a51effd7e2ab9d6894e1b6d2ba612",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 18480,
            "upload_time": "2023-03-30T18:42:24",
            "upload_time_iso_8601": "2023-03-30T18:42:24.095464Z",
            "url": "https://files.pythonhosted.org/packages/69/8b/a59f0f74e7e7948d099cf0c9f4917a39ee3460b734e00bddd193386952c8/lmdb_dict_full-1.0.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4d90bad2239ee964b402e587251a5aa02c75967a98b608a1c79d47fec804778f",
                "md5": "bc77f82de121fccc5119357a8a6185e1",
                "sha256": "9bd14a30ab3667e3d9e46707c31e9c62dbeb88ba72a30122beefd5635695f92c"
            },
            "downloads": -1,
            "filename": "lmdb_dict_full-1.0.2.tar.gz",
            "has_sig": false,
            "md5_digest": "bc77f82de121fccc5119357a8a6185e1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 14611,
            "upload_time": "2023-03-30T18:42:21",
            "upload_time_iso_8601": "2023-03-30T18:42:21.942549Z",
            "url": "https://files.pythonhosted.org/packages/4d/90/bad2239ee964b402e587251a5aa02c75967a98b608a1c79d47fec804778f/lmdb_dict_full-1.0.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-30 18:42:21",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "lmdb-dict-full"
}
        