| Field | Value |
| --- | --- |
| Name | lmdb-dict-full |
| Version | 1.0.2 |
| Summary | Full-featured Python dict interface to the LMDB "Lightning" Database. |
| upload_time | 2023-03-30 18:42:21 |
| requires_python | >=3.8 |
| keywords | dict, lmdb |
# lmdb-dict-full
[![PyPI - Version](https://img.shields.io/pypi/v/lmdb-dict-full.svg)](https://pypi.org/project/lmdb-dict-full)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/lmdb-dict-full.svg)](https://pypi.org/project/lmdb-dict-full)
**The full-featured `dict` interface to the LMDB "Lightning" Database.**
* Internally optimized via `lmdb` library cursors. Optional LRU caching of deserialized values. Thread-safe operations. No added reserved keys, *etc.*
* Provides value-serializing `SafeLmdbDict` and str-only `StrLmdbDict`, as well as abstract base class `LmdbDict` for customization of database encoding.
* Unique-key, labeled and unlabeled databases and read-write sessions supported.
-----
**Table of Contents**
- [Installation](#installation)
- [Use](#use)
  - [General use](#general-use)
  - [Caching](#caching)
  - [Str-only](#str-only)
- [License](#license)
## Installation
```console
pip install lmdb-dict-full
```
## Use
### General use
`SafeLmdbDict` provides the full `dict` interface to an LMDB database at a given filesystem path. (If the directory does not yet contain a database, an empty one is provisioned automatically.)
Values are automatically serialized (deserialized) and compressed (decompressed) using [PyYAML](https://pypi.org/project/PyYAML/) and [zlib](https://docs.python.org/3/library/zlib.html).
```python
from lmdb_dict import SafeLmdbDict
dbdict = SafeLmdbDict('/path/to/db/directory/0/')
dbdict['aaa'] = {'values': [0, 1, 'x']}
```
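To make the serialize-then-compress step concrete, here is a minimal sketch of such a round trip using only the standard library. It is not the library's code: `json` stands in for PyYAML so that the example runs without extra dependencies, and the helper names `encode_value` and `decode_value` are hypothetical.

```python
import json
import zlib


def encode_value(value):
    """Serialize a Python value to text, then compress it to bytes."""
    # json stands in for PyYAML here (an assumption made for portability).
    return zlib.compress(json.dumps(value).encode('utf-8'))


def decode_value(blob):
    """Reverse encode_value: decompress, then deserialize."""
    return json.loads(zlib.decompress(blob).decode('utf-8'))


stored = encode_value({'values': [0, 1, 'x']})   # bytes, as a database would hold them
restored = decode_value(stored)                  # equal to the original value
```

Every read of a serialized value pays for the decode step, which is the cost that the caching layer described below is meant to offset.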
One or more named databases are also supported.
LMDB requires that the maximum number of named databases is specified up-front. Below we'll only need two named databases.
```python
from lmdb_dict import SafeLmdbDict

users = SafeLmdbDict('/path/to/db/directory/1/', name='users', max_dbs=2)
hats = SafeLmdbDict('/path/to/db/directory/1/', name='hats', max_dbs=2)
```
Ordinarily, it would be unsafe to hold open multiple `lmdb` client objects for the same database path within a single process at once. `lmdb-dict-full` handles this automatically: a weak reference is kept to the client opened for each filesystem path, and that client is reused by every `LmdbDict` requiring it.
### Caching
Caching of LMDB itself *should not be necessary*. The database "fully exploits the operating system’s buffer cache" and memory mapping [[ref]](https://lmdb.readthedocs.io/en/release/).
Moreover, `lmdb-dict-full` makes every effort to use `lmdb` efficiently, such that the user need not be concerned with undue overhead of interacting with the database-backed dictionary.
That said: the value serialization layer of `SafeLmdbDict` is another matter. Given sufficiently hefty values to deserialize, it *may* be worthwhile to engage the `lmdb-dict-full` caching layer, along with the trade-offs that it entails.
#### Caveats
**`lmdb-dict-full` caching is thread-safe**
This is achieved with behind-the-scenes locking, narrowly applied to individual keys where feasible. The small overhead of this locking is incurred only when caching is enabled.
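As an illustration of what locking "narrowly applied" to a key can look like, here is a minimal sketch of a per-key lock table. It is an assumption about the general technique rather than a description of the library's internals; the `KeyLocks` class and the `bump` function are hypothetical.

```python
import threading
from collections import defaultdict


class KeyLocks:
    """Hand out one lock per key, so that unrelated keys never contend."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the lock table itself
        self._locks = defaultdict(threading.Lock)   # key -> its own lock

    def for_key(self, key):
        with self._guard:
            return self._locks[key]


locks = KeyLocks()
counts = {'aaa': 0}


def bump(key):
    # Only threads touching the same key wait on one another here.
    with locks.for_key(key):
        counts[key] += 1


threads = [threading.Thread(target=bump, args=('aaa',)) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Note that this simple table grows by one lock per distinct key and never shrinks; a production design would need some way to discard unused locks.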
**`lmdb-dict-full` caching is *not* (yet) *automatically* process-safe**
Caching is thread-safe thanks to thread locks and (again) weak references to caches, which must be shared across dictionaries backed by the same database.
Neither mechanism extends across process boundaries, so achieving the same under a multiprocessing regime would be another matter.
Users may nonetheless make use of `lmdb-dict-full` while multiprocessing, either without caching or with thoughtful application of caches across processes.
#### Options
Caching is built into all concrete subclasses of `LmdbDict`; however, it is disabled by default, in that it is set to `DummyCache` – a mapping capable of storing zero items.
Subclasses of `LmdbDict` check their cache for its maximum capacity by means of: `getattr(cache, 'maxsize', …)`. A cache reporting `maxsize=0` – such as the `DummyCache` – will be given *dummy locks*, such that locking is disabled for this dictionary.
A cache reporting any other `maxsize` – or lacking this property – is treated as a proper cache, and locking will be applied.
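The capacity check described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `ZeroCapacityCache` is a hypothetical stand-in for `DummyCache`, `make_lock` is a hypothetical helper, and because the default passed to `getattr` is elided in the text above, `None` is used here purely for illustration.

```python
import contextlib
import threading


class ZeroCapacityCache(dict):
    """Hypothetical stand-in for DummyCache: a mapping that stores nothing."""

    maxsize = 0

    def __setitem__(self, key, value):
        pass   # silently discard every item


def make_lock(cache):
    """Pick a real lock unless the cache reports a capacity of zero."""
    # A cache lacking `maxsize` gets the default (None here, an assumption)
    # and is therefore treated as a proper cache.
    if getattr(cache, 'maxsize', None) == 0:
        return contextlib.nullcontext()   # dummy lock: entering it does nothing
    return threading.Lock()


empty = ZeroCapacityCache()
empty['aaa'] = 1                  # discarded; the cache stays empty

dummy_lock = make_lock(empty)     # no-op context manager
real_lock = make_lock({})         # a plain dict has no `maxsize`, so it is locked
```

Both results can be used in a `with` statement, so code that takes the lock need not know which kind it was given.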
Caching may be specified – to `SafeLmdbDict` for example – via an instance, a class, or any callable returning an instance of a mapping for use as a deserialization cache. Supplying either an instance or a class is strongly recommended, as this enables checking any cache retrieved from the weak reference registry against the user's instantiation argument.
```python
from lmdb_dict import SafeLmdbDict
from lmdb_dict.cache import LRUCache128

SafeLmdbDict('/path/to/db/directory/', cache=LRUCache128)
```
Above, we've specified that our `SafeLmdbDict` should cache deserialized values using an instance of `LRUCache128` – that is, a subclass of the `LRUCache` provided by [cachetools](https://pypi.org/project/cachetools/). `LRUCache128` distinguishes itself only in that it requires no initialization arguments – a requirement of supplying a callable in lieu of a cache instance – and it sets `maxsize=128`.
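To show why a zero-argument class is convenient here, the sketch below builds a comparable pair of classes from the standard library alone. It is not how `LRUCache128` is defined: `SimpleLRU` is a hypothetical stand-in for the cachetools `LRUCache`, adapted from the `OrderedDict` recipe in the Python documentation, and `SimpleLRU128` is likewise hypothetical.

```python
from collections import OrderedDict


class SimpleLRU(OrderedDict):
    """Tiny LRU mapping; a stand-in for the cachetools LRUCache."""

    def __init__(self, maxsize):
        super().__init__()
        self.maxsize = maxsize

    def __getitem__(self, key):
        value = super().__getitem__(key)
        self.move_to_end(key)            # mark as most recently used
        return value

    def __setitem__(self, key, value):
        if key in self:
            self.move_to_end(key)
        super().__setitem__(key, value)
        if len(self) > self.maxsize:
            oldest = next(iter(self))    # least recently used key
            del self[oldest]


class SimpleLRU128(SimpleLRU):
    """Fixes maxsize so that the class itself can be called with no arguments."""

    def __init__(self):
        super().__init__(maxsize=128)


cache = SimpleLRU128()                   # no arguments needed
for number in range(130):
    cache[number] = str(number)          # the two oldest entries are evicted
```

A class like `SimpleLRU128` can be passed around uninstantiated and called later with no arguments, which is the property the text above requires of any callable supplied in lieu of a cache instance.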
As a shortcut to the above, `lmdb-dict-full` provides `CachedLmdbDict`:
```python
from lmdb_dict import CachedLmdbDict
CachedLmdbDict('/path/to/db/directory/')
```
`CachedLmdbDict` differs from other subclasses of `LmdbDict` in that it defaults to caching via `LRUCache128`. Other caches may be specified via the `cache` argument. Supplying an entity with property `maxsize=0` – such as the `DummyCache` – will raise a `TypeError`.
### Str-only
The above concrete subclasses of `LmdbDict` support arbitrary serializable values in order to best mimic the functionality of the Python `dict`.
For use cases involving only `str` (and/or `bytes`) values, all of the above concerns over serialization, caching and locking may be sidestepped.
`StrLmdbDict` provides the same full-featured `dict` interface to LMDB, but only for values of type `str` and `bytes`.
```python
from lmdb_dict import StrLmdbDict
StrLmdbDict('/path/to/db/directory/')
```
`StrLmdbDict` further differs from other subclasses of `LmdbDict` in that it accepts no `cache` argument and does not perform caching.
## License
`lmdb-dict-full` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.
Raw data
{
"_id": null,
"home_page": null,
"name": "lmdb-dict-full",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "dict,lmdb",
"author": null,
"author_email": "Jesse London <jesselondon@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/4d/90/bad2239ee964b402e587251a5aa02c75967a98b608a1c79d47fec804778f/lmdb_dict_full-1.0.2.tar.gz",
"platform": null,
"description": "# lmdb-dict-full\n\n[![PyPI - Version](https://img.shields.io/pypi/v/lmdb-dict-full.svg)](https://pypi.org/project/lmdb-dict-full)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/lmdb-dict-full.svg)](https://pypi.org/project/lmdb-dict-full)\n\n**The full-featured `dict` interface to the LMDB \"Lightning\" Database.**\n\n* Internally optimized via `lmdb` library cursors. Optional LRU caching of deserialized values. Thread-safe operations. No added reserved keys, *etc.*\n\n* Provides value-serializing `SafeLmdbDict` and str-only `StrLmdbDict`, as well as abstract base class `LmdbDict` for customization of database encoding.\n\n* Unique-key, labeled and unlabeled databases and read-write sessions supported.\n\n-----\n\n**Table of Contents**\n\n- [Installation](#installation)\n- [Use](#use)\n - [General use](#general-use)\n - [Caching](#caching)\n - [Str-only](#str-only)\n- [License](#license)\n\n## Installation\n\n```console\npip install lmdb-dict-full\n```\n\n## Use\n\n### General use\n\n`SafeLmdbDict` provides the full `dict` interface to a LMDB database at a given filesystem path. (An empty database is automatically provisioned within a directory without one.)\n\nValues are automatically serialized (deserialized) and compressed (decompressed) using [PyYAML](https://pypi.org/project/PyYAML/) and [zlib](https://docs.python.org/3/library/zlib.html).\n\n```python\nfrom lmdb_dict import SafeLmdbDict\n\ndbdict = SafeLmdbDict('/path/to/db/directory/0/')\n\ndbdict['aaa'] = {'values': [0, 1, 'x']}\n```\n\nOne or more named databases are also supported.\n\nLMDB requires that the maximum number of named databases is specified up-front. 
Below we'll only need two named databases.\n\n```python\nusers = SafeLmdbDict('/path/to/db/directory/1/', name='users', max_dbs=2)\n\nhats = SafeLmdbDict('/path/to/db/directory/1/', name='hats', max_dbs=2)\n```\n\nNote that it would otherwise be unsafe to hold open multiple `lmdb` client objects within a single process at once. This is handled automatically: a weak reference is kept to the client opened for each filesystem path and reused for each `LmdbDict` requiring it.\n\n### Caching\n\nCaching of LMDB itself *should not be necessary*. The database \"fully exploits the operating system\u2019s buffer cache\" and memory mapping [[ref]](https://lmdb.readthedocs.io/en/release/).\n\nMoreover, `lmdb-dict-full` makes every effort to use `lmdb` efficiently, such that the user need not be concerned with undue overhead of interacting with the database-backed dictionary.\n\nThat said: the value serialization layer of `SafeLmdbDict` is another matter. Given sufficiently hefty values to deserialize, it *may* be worthwhile to engage the `lmdb-dict-full` caching layer, along with the trade-offs that it entails.\n\n#### Caveats\n\n**`lmdb-dict-full` caching is thread-safe**\n\nThis is achieved with behind-the-scenes locking \u2013 narrowly applied to singular keys where feasible \u2013 but the small overhead of which applies when caching.\n\n**`lmdb-dict-full` caching is *not* (yet) *automatically* process-safe**\n\nCaching is thread-safe thanks to thread locks and (again) weak references to caches which must be shared across dictionaries backed by the same databases.\n\nAchieving the same under a multiprocessing regime would be another matter.\n\nUsers may nonetheless make use of `lmdb-dict-full` while multiprocessing, either without caching or with thoughtful application of caches across processes.\n\n#### Options\n\nCaching is built into all concrete subclasses of `LmdbDict`; however, it is disabled by default, in that it is set to `DummyCache` \u2013 a mapping capable of 
storing zero items.\n\nSubclasses of `LmdbDict` check their cache for its maximum capacity by means of: `getattr(cache, 'maxsize', \u2026)`. A cache reporting `maxsize=0` \u2013 such as the `DummyCache` \u2013 will be given *dummy locks*, such that locking is disabled for this dictionary.\n\nA cache reporting any other `maxsize` \u2013 or lacking this property \u2013 is treated as a proper cache, and locking will be applied.\n\nCaching may be specified \u2013 to `SafeLmdbDict` for example \u2013 via an instance, a class, or any callable returning an instance of a mapping for use as a deserialization cache. Either an instance or a class are strongly recommended, as these enable checking any cache retrieved from the weak reference registry against the user's instantiation argument.\n\n```python\nfrom lmdb_dict.cache import LRUCache128\n\nSafeLmdbDict('/path/to/db/directory/', cache=LRUCache128)\n```\n\nAbove, we've specified that our `SafeLmdbDict` should cache deserialized values using an instance of `LRUCache128` \u2013 that is, a subclass of the `LRUCache` provided by [cachetools](https://pypi.org/project/cachetools/). `LRUCache128` distinguishes itself only in that it requires no initialization arguments \u2013 a requirement of supplying a callable in lieu of a cache instance \u2013 and it sets `maxsize=128`.\n\nAs a shortcut to the above, `lmdb-dict-full` provides `CachedLmdbDict`:\n\n```python\nfrom lmdb_dict import CachedLmdbDict\n\nCachedLmdbDict('/path/to/db/directory/')\n```\n\n`CachedLmdbDict` differs from other subclasses of `LmdbDict` in that it defaults to caching via `LRUCache128`. Other caches may be specified via the `cache` argument. 
Supplying an entity with property `maxsize=0` \u2013 such as the `DummyCache` \u2013 will raise a `TypeError`.\n\n### Str-only\n\nThe above concrete subclasses of `LmdbDict` support arbitrary serializable values in order to best mimic the functionality of the Python `dict`.\n\nFor use-cases supporting str-only (and/or bytes-only) values, all of the above concerns over serialization, caching and locking may be sidestepped.\n\n`StrLmdbDict` provides the same full-featured `dict` interface to LMDB, but only for values of type `str` and `bytes`.\n\n```python\nfrom lmdb_dict import StrLmdbDict\n\nStrLmdbDict('/path/to/db/directory/')\n```\n\n`StrLmdbDict` further differs from other subclasses of `LmdbDict` in that it accepts no `cache` argument, and may not perform caching.\n\n## License\n\n`lmdb-dict-full` is distributed under the terms of the [MIT](https://spdx.org/licenses/MIT.html) license.\n",
"bugtrack_url": null,
"license": null,
"summary": "Full-featured Python dict interface to the LMDB \"Lightning\" Database.",
"version": "1.0.2",
"split_keywords": [
"dict",
"lmdb"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "698ba59f0f74e7e7948d099cf0c9f4917a39ee3460b734e00bddd193386952c8",
"md5": "bd2a51effd7e2ab9d6894e1b6d2ba612",
"sha256": "58f34190f8eda8415dac8c0f4a7868597f190281ccd738edecbccb7c3e53852b"
},
"downloads": -1,
"filename": "lmdb_dict_full-1.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "bd2a51effd7e2ab9d6894e1b6d2ba612",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 18480,
"upload_time": "2023-03-30T18:42:24",
"upload_time_iso_8601": "2023-03-30T18:42:24.095464Z",
"url": "https://files.pythonhosted.org/packages/69/8b/a59f0f74e7e7948d099cf0c9f4917a39ee3460b734e00bddd193386952c8/lmdb_dict_full-1.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "4d90bad2239ee964b402e587251a5aa02c75967a98b608a1c79d47fec804778f",
"md5": "bc77f82de121fccc5119357a8a6185e1",
"sha256": "9bd14a30ab3667e3d9e46707c31e9c62dbeb88ba72a30122beefd5635695f92c"
},
"downloads": -1,
"filename": "lmdb_dict_full-1.0.2.tar.gz",
"has_sig": false,
"md5_digest": "bc77f82de121fccc5119357a8a6185e1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 14611,
"upload_time": "2023-03-30T18:42:21",
"upload_time_iso_8601": "2023-03-30T18:42:21.942549Z",
"url": "https://files.pythonhosted.org/packages/4d/90/bad2239ee964b402e587251a5aa02c75967a98b608a1c79d47fec804778f/lmdb_dict_full-1.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-30 18:42:21",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "lmdb-dict-full"
}