flaxkv


Name: flaxkv
Version: 0.2.8
Summary: A high-performance dictionary database.
Upload time: 2024-04-27 15:09:34
Requires Python: >=3.8
Keywords: machine learning, nlp, leveldb, lmdb, on-disk dict, persistent-storage
Requirements: none recorded
            
<h1 align="center">
    <br>
    🗲  FlaxKV
</h1>


<p align="center">
A high-performance dictionary database.
</p>
<p align="center">
    <a href="https://pypi.org/project/flaxkv/">
        <img src="https://img.shields.io/pypi/v/flaxkv?color=brightgreen&style=flat-square" alt="PyPI version" >
    </a>
    <a href="https://github.com/KenyonY/flaxkv/blob/main/LICENSE">
        <img alt="License" src="https://img.shields.io/github/license/KenyonY/flaxkv.svg?color=blue&style=flat-square">
    </a>
    <a href="https://github.com/KenyonY/flaxkv/releases">
        <img alt="Release (latest by date)" src="https://img.shields.io/github/v/release/KenyonY/flaxkv?&style=flat-square">
    </a>
    <a href="https://github.com/KenyonY/flaxkv/actions/workflows/ci.yml">
        <img alt="tests" src="https://img.shields.io/github/actions/workflow/status/KenyonY/flaxkv/ci.yml?style=flat-square&label=tests">
    </a>
    <a href="https://pypistats.org/packages/flaxkv">
        <img alt="pypi downloads" src="https://img.shields.io/pypi/dm/flaxkv?style=flat-square">
    </a>
</p>

<h4 align="center">
    <p>
        <b>English</b> |
        <a href="https://github.com/KenyonY/flaxkv/blob/main/README_ZH.md">简体中文</a> 
    </p>
</h4>

<p >
<br>
</p>


`flaxkv` provides a dictionary-like interface for interacting with high-performance key-value databases. More importantly, although it is a persistent database, its performance is close to that of a native in-memory dictionary.  
You can use it just like a Python dictionary, without worrying about database operations blocking your process.

---

## Key Features

- **Always Up-to-date, Never Blocking**: Designed from the ground up so that write operations never block the user process, while reads always return the most recently written data.

- **Ease of Use**: Interacting with the database feels just like using a Python dictionary! You don't even have to worry about releasing resources.

- **Buffered Writing**: Data is buffered and scheduled to be written to the database, reducing the overhead of frequent database writes.

- **High-Performance Database Backend**: Uses the high-performance key-value database LevelDB as its default backend.

- **Atomic Operations**: Ensures that write operations are atomic, safeguarding data integrity.

- **Thread-Safety**: Uses only the locks that are strictly necessary, ensuring safe concurrent access without sacrificing performance.
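
To make the buffered, non-blocking write idea concrete, here is a minimal conceptual sketch using only the standard library. It is not `flaxkv`'s implementation: the `BufferedStore` class, its `flush_interval` parameter, and the plain `dict` standing in for the database backend are all hypothetical names chosen for illustration.

```python
import threading


class BufferedStore:
    """Toy write-behind store (hypothetical, for illustration only).

    Writes land in an in-memory buffer; a background thread periodically
    flushes the buffer to the backend in one batch.
    """

    def __init__(self, backend, flush_interval=0.05):
        self._backend = backend      # any dict-like object standing in for the database
        self._buffer = {}            # pending writes not yet flushed
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._interval = flush_interval
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def __setitem__(self, key, value):
        # Only touches memory, so the caller does not wait on the backend.
        with self._lock:
            self._buffer[key] = value

    def __getitem__(self, key):
        # Check the buffer first so readers see the most recent write.
        with self._lock:
            if key in self._buffer:
                return self._buffer[key]
        return self._backend[key]

    def flush(self):
        # Simplification: the lock is held for the whole batch write.
        # A real implementation would likely swap buffers to avoid stalling writers.
        with self._lock:
            self._backend.update(self._buffer)
            self._buffer.clear()

    def _run(self):
        # Flush every `flush_interval` seconds until close() is called.
        while not self._stop.wait(self._interval):
            self.flush()

    def close(self):
        self._stop.set()
        self._thread.join()
        self.flush()  # write out anything still pending


backend = {}
store = BufferedStore(backend)
store["k"] = "v"
value_seen = store["k"]  # readable immediately, flushed or not
store.close()            # after close, everything is in the backend
```

The read path consults the buffer before the backend, which is what lets a write become visible before it has reached persistent storage.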

---

## Quick Start

### Installation

```bash
pip install flaxkv
# To also install the server components:
# pip install "flaxkv[server]"
```
### Usage

```python
from flaxkv import FlaxKV
import numpy as np
import pandas as pd

db = FlaxKV('test_db')
# Alternatively, run flaxkv as a server from a shell:
#   flaxkv run --port 8000
# and connect to it from a client:
#   db = FlaxKV('test_db', root_path_or_url='http://localhost:8000')

db[1] = 1
db[1.1] = 1 / 3
db['key'] = 'value'
db['a dict'] = {'a': 1, 'b': [1, 2, 3]}
db['a list'] = [1, 2, 3, {'a': 1}]
db[(1, 2, 3)] = [1, 2, 3]
db['numpy array'] = np.random.randn(100, 100)
db['df'] = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# setdefault does not overwrite a key that already exists
db.setdefault('key', 'value_2')
assert db['key'] == 'value'

db.update({"key1": "value1", "key2": "value2"})

assert 'key2' in db

db.pop("key1")
assert 'key1' not in db

for key, value in db.items():
    print(key, value)

print(len(db))
```

### Tips
- As a persistent database, `flaxkv` provides performance close to native in-memory dictionary access (see the benchmark below).
- You may have noticed that the example above never calls `db.close()` to release resources. `flaxkv` handles this automatically. You can still call `db.close()` manually to release resources immediately.

### Benchmark
![benchmark](.github/img/benchmark.png)

Test content: write N NumPy vectors (each 1000-dimensional), then traverse and read them back.

Run the benchmark:
```bash
cd benchmark/
pytest -s -v run.py
```
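
The workload described above can be approximated with a small timing harness. This is only a sketch and not the repository's `benchmark/run.py`: the function name `bench_write_read` is hypothetical, plain Python lists replace NumPy arrays so that it runs with the standard library alone, and a `dict` is used as the store. Any object that supports `store[key] = value` and `store[key]` could be passed in instead.

```python
import random
import time


def bench_write_read(store, n=1000, dim=1000):
    """Time writing n vectors into `store`, then reading them all back."""
    vectors = {f"vec_{i}": [random.random() for _ in range(dim)] for i in range(n)}

    # Write phase: insert every vector under its key.
    t0 = time.perf_counter()
    for key, vec in vectors.items():
        store[key] = vec
    write_s = time.perf_counter() - t0

    # Read phase: fetch every vector by key and count the complete ones.
    t0 = time.perf_counter()
    count = sum(1 for key in vectors if len(store[key]) == dim)
    read_s = time.perf_counter() - t0

    return write_s, read_s, count


write_s, read_s, count = bench_write_read({}, n=100, dim=1000)
print(f"write: {write_s:.4f}s  read: {read_s:.4f}s  vectors read: {count}")
```

Timings from a harness like this depend heavily on the machine and on the value type, so they are not comparable with the figure above.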


### Use Cases
- **Key-Value Structure:**
Storing simple key-value data.
- **High-Frequency Writing:**
Well suited to workloads that insert or update data at high frequency.
- **Machine Learning:**
`flaxkv` is well suited to storing large machine learning datasets such as embeddings, images, text, and other key-value data.

### Limitations
* In the current version, because writes are delayed, a process in a multi-process environment cannot read data written by another process in real time (the delay is usually a few seconds). If you need the data written immediately, call the `.write_immediately()` method. This limitation does not exist in a single-process environment.
* By default, values of type `tuple` and `set` are not supported. If you store them anyway, they are deserialized as a `list`.
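
The `tuple`/`set` limitation resembles what happens with any serializer that has no dedicated tuple or set type. The sketch below uses the standard library `json` module purely as an analogy. It is an assumption that `flaxkv`'s serializer behaves similarly, and the `roundtrip` helper is a hypothetical stand-in for a write followed by a read. It also shows the usual workaround: convert explicitly when reading the value back.

```python
import json


def roundtrip(value):
    """Serialize then deserialize, standing in for a write followed by a read."""
    return json.loads(json.dumps(value))


point = (1, 2, 3)
restored = roundtrip(point)      # comes back as a list, not a tuple
point_again = tuple(restored)    # convert explicitly after reading

tags = {"nlp", "ml"}
stored_tags = roundtrip(sorted(tags))  # json cannot encode sets, so store a list
tags_again = set(stored_tags)          # rebuild the set after reading
```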
 
## Citation
If `FlaxKV` has been helpful to your research, please cite:
```bibtex
@misc{flaxkv,
    title={FlaxKV: An Easy-to-use and High Performance Key-Value Database Solution},
    author={K.Y},
    howpublished = {\url{https://github.com/KenyonY/flaxkv}},
    year={2023}
}
```

## Contributions
Contributions are welcome: submit a pull request or open an issue in the repository.

## License
`FlaxKV` is licensed under the [Apache-2.0 License](./LICENSE).



            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "flaxkv",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "Machine Learning, NLP, leveldb, lmdb, on-disk dict, persistent-storage",
    "author": null,
    "author_email": "\"K.Y\" <beidongjiedeguang@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/1c/b9/3a6553fe3e3c98464f016c614d9807dfc2f46d74a0418cfcfc14cf1dfc1b/flaxkv-0.2.8.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A high-performance dictionary database.",
    "version": "0.2.8",
    "project_urls": {
        "Documentation": "https://github.com/KenyonY/flaxkv#flaxkv",
        "Homepage": "https://github.com/KenyonY/flaxkv",
        "Issues": "https://github.com/KenyonY/flaxkv/issues",
        "Source": "https://github.com/KenyonY/flaxkv"
    },
    "split_keywords": [
        "machine learning",
        " nlp",
        " leveldb",
        " lmdb",
        " on-disk dict",
        " persistent-storage"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1e0d285d9b1fa64f3b3d42e11e0bae49d0a48ea010fecfd454606b58646118eb",
                "md5": "749c3b3ebe0b2dfe09d86c7eb8503f8e",
                "sha256": "8767e18b0bcade6fb9550d9abb1c0b988842e231fe303c79ac28e05f68999a75"
            },
            "downloads": -1,
            "filename": "flaxkv-0.2.8-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "749c3b3ebe0b2dfe09d86c7eb8503f8e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 33070,
            "upload_time": "2024-04-27T15:09:32",
            "upload_time_iso_8601": "2024-04-27T15:09:32.906155Z",
            "url": "https://files.pythonhosted.org/packages/1e/0d/285d9b1fa64f3b3d42e11e0bae49d0a48ea010fecfd454606b58646118eb/flaxkv-0.2.8-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1cb93a6553fe3e3c98464f016c614d9807dfc2f46d74a0418cfcfc14cf1dfc1b",
                "md5": "7d56d048d4889c3ac3494de6bab5b440",
                "sha256": "fe768f71f7a64dc54cf8094a749ebc7bedd3e0f6782481a21d423bc155037a56"
            },
            "downloads": -1,
            "filename": "flaxkv-0.2.8.tar.gz",
            "has_sig": false,
            "md5_digest": "7d56d048d4889c3ac3494de6bab5b440",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 25134,
            "upload_time": "2024-04-27T15:09:34",
            "upload_time_iso_8601": "2024-04-27T15:09:34.201213Z",
            "url": "https://files.pythonhosted.org/packages/1c/b9/3a6553fe3e3c98464f016c614d9807dfc2f46d74a0418cfcfc14cf1dfc1b/flaxkv-0.2.8.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-27 15:09:34",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "KenyonY",
    "github_project": "flaxkv#flaxkv",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "flaxkv"
}
        