dynafile


- Name: dynafile
- Version: 0.1.3
- Summary: NoSQLDB following the Dynamo concept, but for a filebased embedded db.
- Upload time: 2024-04-24 12:38:55
- Requires Python: >=3.8
- Keywords: database, nosql

> Consider the project as a proof of concept! Definitely not production ready!

# Dynafile

Embedded pure Python NoSQL database following DynamoDB concepts.

```bash
pip install dynafile

# with string filter support using filtration
pip install "dynafile[filter]"

# bleeding edge
pip install git+https://github.com/eruvanos/dynafile.git
pip install filtration
```

## Overview

Dynafile stores items in partitions, and each partition is persisted as a separate file. Each partition contains a
`SortedDict` from `sortedcontainers`, ordered by the sort key attribute.
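
To make the layout concrete, here is a minimal in-memory sketch of the idea, not Dynafile's actual code. It swaps `SortedDict` for the standard library's `bisect` so that it runs without dependencies. The class names `PartitionSketch` and `TableSketch` and the `starts_with` / `forward` parameters are hypothetical and only mirror the features listed in this README.

```python
import bisect
from collections import defaultdict


class PartitionSketch:
    """One partition: items kept ordered by sort key (stand-in for SortedDict)."""

    def __init__(self):
        self._keys = []   # sort keys, always kept in sorted order
        self._items = {}  # sort key -> item

    def put(self, sk, item):
        if sk not in self._items:
            bisect.insort(self._keys, sk)
        self._items[sk] = item

    def query(self, starts_with="", forward=True):
        # Keys sharing a prefix are contiguous in a sorted list,
        # so jump to the first candidate and stop at the first mismatch.
        start = bisect.bisect_left(self._keys, starts_with)
        matched = []
        for sk in self._keys[start:]:
            if not sk.startswith(starts_with):
                break
            matched.append(self._items[sk])
        return matched if forward else matched[::-1]


class TableSketch:
    """Routes each item to the partition named by its partition key."""

    def __init__(self, pk_attribute="PK", sk_attribute="SK"):
        self.pk_attribute = pk_attribute
        self.sk_attribute = sk_attribute
        self._partitions = defaultdict(PartitionSketch)

    def put_item(self, item):
        partition = self._partitions[item[self.pk_attribute]]
        partition.put(item[self.sk_attribute], item)

    def query(self, pk, starts_with="", forward=True):
        return self._partitions[pk].query(starts_with, forward)


table = TableSketch()
table.put_item({"PK": "user#1", "SK": "user#1", "name": "Bob"})
table.put_item({"PK": "user#1", "SK": "role#1", "TYPE": "sender"})
table.put_item({"PK": "user#2", "SK": "user#2", "name": "Alice"})

sort_keys = [item["SK"] for item in table.query("user#1")]
print(sort_keys)  # ['role#1', 'user#1']
```

Because the keys are ordered, a prefix query only touches the matching slice of the partition instead of every item.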

Dynafile does not implement the interface or functionality of DynamoDB, but provides familiar API patterns.

Differences:

- Embedded, file based
- No pagination

## Features

- persistence
- put item
- get item
- delete item
- scan - without parameters
- query - starts_with
- query - index direction
- query - filter
- scan - filter
- batch writer
- atomic file write
- event stream hooks (put, delete)
- TTL
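
The "atomic file write" feature is commonly built on a write-to-temporary-file-then-rename pattern. The sketch below shows that general technique with the standard library only. Whether Dynafile implements it this way is an assumption, and `atomic_write_pickle` is a hypothetical helper name.

```python
import os
import pickle
import tempfile


def atomic_write_pickle(path, obj):
    """Write obj to path so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives next to the target: a rename is only atomic
    # when source and destination are on the same filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            pickle.dump(obj, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the new file in, overwriting any existing target.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "data.pickle")
    atomic_write_pickle(target, {"role#1": {"TYPE": "sender"}})
    atomic_write_pickle(target, {"role#1": {"TYPE": "receiver"}})
    with open(target, "rb") as fh:
        restored = pickle.load(fh)
    leftovers = [name for name in os.listdir(workdir) if name.endswith(".tmp")]
```

A crash in the middle of a write then leaves either the old file or the new one, never a truncated mix.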

## Roadmap

- [ ] GSI - global secondary index
- [ ] update item
- [ ] batch get
- [ ] thread safety
- [ ] ~~LSI - local secondary index~~
- [ ] split partitions
- [ ] parallel scans - pre defined scan segments
- [ ] transactions
- [ ] optimise disc load time (cache partitions in memory, invalidate on file change)
- [ ] conditional put item
- [ ] improve file consistency (options: acidfile)

## API

```python
from dynafile import *

# init DB interface
db = Dynafile(path=".", pk_attribute="PK", sk_attribute="SK")

# put items
db.put_item(item={"PK": "user#1", "SK": "user#1", "name": "Bob"})
db.put_item(item={"PK": "user#1", "SK": "role#1", "TYPE": "sender"})
db.put_item(item={"PK": "user#2", "SK": "user#2", "name": "Alice"})

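# more performant batch operation: route the calls through the writer
# (assumes the writer exposes the same put_item/delete_item methods as db)
with db.batch_writer() as writer:
    writer.put_item(item={"PK": "user#3", "SK": "user#3", "name": "Steve"})
    writer.delete_item(key={"PK": "user#3", "SK": "user#3"})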

# retrieve items
item = db.get_item(key={
    "PK": "user#1",
    "SK": "user#1"
})

# query item collection by pk
items = list(db.query(pk="user#1"))

# scan full table
items = list(db.scan())

# add event stream listener to retrieve item modification
def print_listener(event: Event):
    print(event.action)
    print(event.old)
    print(event.new)


db.add_stream_listener(print_listener)

```
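
The listener in the example above receives an event with `action`, `old`, and `new` fields. As a rough illustration of how such a stream can be wired up, the following self-contained sketch emits one event per put or delete. `EventSketch`, `StreamingStoreSketch`, and the action labels `"PUT"` / `"DELETE"` are assumptions for illustration, not Dynafile's real types or values.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class EventSketch:
    action: str           # "PUT" or "DELETE" (labels are assumptions)
    old: Optional[dict]   # item before the change, if any
    new: Optional[dict]   # item after the change, if any


class StreamingStoreSketch:
    """Tiny key-value store that notifies listeners about every change."""

    def __init__(self):
        self._items = {}
        self._listeners: List[Callable[[EventSketch], None]] = []

    def add_stream_listener(self, listener):
        self._listeners.append(listener)

    def _emit(self, event):
        for listener in self._listeners:
            listener(event)

    def put_item(self, key, item):
        old = self._items.get(key)
        self._items[key] = item
        self._emit(EventSketch("PUT", old, item))

    def delete_item(self, key):
        old = self._items.pop(key, None)
        self._emit(EventSketch("DELETE", old, None))


events = []
store = StreamingStoreSketch()
store.add_stream_listener(events.append)
store.put_item(("user#1", "user#1"), {"name": "Bob"})
store.put_item(("user#1", "user#1"), {"name": "Bobby"})
store.delete_item(("user#1", "user#1"))
```

After these calls, `events` holds three entries: the second one carries the previous item in `old`, and the last one has `new` set to `None`.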

### Filter

`query` and `scan` support filters. A filter can be any callable, such as a lambda expression.

Another option is a [filtration](https://pypi.org/project/filtration/) expression, which supports the following operators:

* Equal ("==")
* Not equal ("!=")
* Less than ("<")
* Less than or equal ("<=")
* Greater than (">")
* Greater than or equal (">=")
* Contains ("in")
    * RHS must be a list or a Subnet
* Regular expression ("=~")
    * RHS must be a regex token

Examples:

* `SK =~ /^a/` - SK starts with a
* `SK == 1` - SK equals 1
* `nested.a == 1` - accesses nested structure `item.nested.a`
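
For the callable variant, a filter is just a predicate over an item. The sketch below uses plain dictionaries and no Dynafile calls to show two predicates that correspond to the examples above. `resolve` and `apply_filter` are hypothetical helpers, and the keyword Dynafile uses to accept a filter is not shown here because this README does not state it.

```python
from functools import reduce


def resolve(item, dotted_path):
    """Follow a dotted path such as 'nested.a' through nested dicts; None if missing."""
    def step(current, part):
        return current.get(part) if isinstance(current, dict) else None
    return reduce(step, dotted_path.split("."), item)


def apply_filter(items, predicate):
    """Yield only the items for which the callable returns a truthy value."""
    return (item for item in items if predicate(item))


items = [
    {"PK": "p", "SK": "apple", "nested": {"a": 1}},
    {"PK": "p", "SK": "banana", "nested": {"a": 2}},
    {"PK": "p", "SK": "avocado"},
]

# Counterpart of `SK =~ /^a/`
starts_with_a = list(apply_filter(items, lambda item: item["SK"].startswith("a")))
# Counterpart of `nested.a == 1`
nested_is_one = list(apply_filter(items, lambda item: resolve(item, "nested.a") == 1))
```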

### TTL - Time To Live

TTL provides the option to expire items at read time (get, query, scan).

```python
import time
from dynafile import *

db = Dynafile(path=".", pk_attribute="PK", sk_attribute="SK", ttl_attribute="ttl")

item = {"PK": "1", "SK": "2", "ttl": time.time() - 1000} # expired ttl
db.put_item(item=item)

list(db.scan()) # -> []

```
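
Read-time expiry can be pictured as a check applied to every item before it is returned. The following sketch assumes the TTL attribute holds a Unix timestamp in seconds (as the `time.time()` example suggests) and that items without the attribute never expire. Both the helper names and the "expired when `ttl <= now`" rule are assumptions.

```python
import time


def is_expired(item, ttl_attribute="ttl", now=None):
    """An item counts as expired once its TTL timestamp is not in the future."""
    now = time.time() if now is None else now
    ttl = item.get(ttl_attribute)
    return ttl is not None and ttl <= now


def visible(items, ttl_attribute="ttl", now=None):
    """Drop expired items, as a read operation would."""
    return [item for item in items if not is_expired(item, ttl_attribute, now)]


now = 1_700_000_000  # fixed clock so the example is deterministic
stored = [
    {"PK": "1", "SK": "1", "ttl": now - 1000},  # already expired
    {"PK": "1", "SK": "2", "ttl": now + 1000},  # still valid
    {"PK": "1", "SK": "3"},                     # no TTL set
]
live = visible(stored, now=now)
```

Note that in this scheme an expired item is only hidden from reads. Physically removing it from disk would be a separate step.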

## Architecture

![architecture.puml](https://github.com/eruvanos/dynafile/blob/9bf858e83ff5761cffca10a18b4554fe5ba2d3c7/architecture.png?raw=true)

### File Structure

```text

--- ROOT ---
./db/

--- MAIN DB ---

|- meta.json - meta information
|- _partitions/
    |- <hash>/
        |- data.pickle - Contains partition data by sort key (SortedDict)
        |- lsi-attr1.pickle - Contains partition data by lsi attr (SortedDict)

--- GSI ---
|- _gsi-<gsi-name>/
    |- _partitions/
        |- <hash>/
            |- data.pickle - Contains partition data by sort key (SortedDict)

```
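
The directory layout above can be reproduced with a few lines of `pathlib` and `pickle`. This is a sketch under assumptions: the README does not say which hash function produces `<hash>`, so SHA-256 is used here purely for illustration, and `partition_dir`, `save_partition`, and `load_partition` are hypothetical names.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def partition_dir(root, pk):
    """Map a partition key to <root>/_partitions/<hash>/ (hash choice is an assumption)."""
    digest = hashlib.sha256(pk.encode("utf-8")).hexdigest()
    return Path(root) / "_partitions" / digest


def save_partition(root, pk, data):
    directory = partition_dir(root, pk)
    directory.mkdir(parents=True, exist_ok=True)
    (directory / "data.pickle").write_bytes(pickle.dumps(data))


def load_partition(root, pk):
    # A partition that was never written is treated as empty.
    file = partition_dir(root, pk) / "data.pickle"
    return pickle.loads(file.read_bytes()) if file.exists() else {}


with tempfile.TemporaryDirectory() as root:
    save_partition(root, "user#1", {"role#1": {"TYPE": "sender"}})
    loaded = load_partition(root, "user#1")
    missing = load_partition(root, "user#2")
    relative = partition_dir(root, "user#1").relative_to(root).parts
```

Since `pickle` can execute arbitrary code while loading, files in such a layout should only be read from trusted locations.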
            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "dynafile",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "database, nosql",
    "author": null,
    "author_email": "Maic Siemering <maic@siemering.tech>",
    "download_url": "https://files.pythonhosted.org/packages/c4/d9/0d7e79caf7d3c7ac66dac24871c567bc4a40ffd3b53a3603238e9be018da/dynafile-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "NoSQLDB following the Dynamo concept, but for a filebased embedded db.",
    "version": "0.1.3",
    "project_urls": null,
    "split_keywords": [
        "database",
        " nosql"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b4aa7c02df390215aa2dcd24519e2651ea490d8ab1ddbfcbb70cd25f5e2ae1df",
                "md5": "a7f84493b3d8c7f39443e8fbd76d6ad9",
                "sha256": "09ead0032825d5e83662d469891069d0e63a8630f420bd1429e2cf208052b0a9"
            },
            "downloads": -1,
            "filename": "dynafile-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a7f84493b3d8c7f39443e8fbd76d6ad9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 6889,
            "upload_time": "2024-04-24T12:38:53",
            "upload_time_iso_8601": "2024-04-24T12:38:53.946239Z",
            "url": "https://files.pythonhosted.org/packages/b4/aa/7c02df390215aa2dcd24519e2651ea490d8ab1ddbfcbb70cd25f5e2ae1df/dynafile-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c4d90d7e79caf7d3c7ac66dac24871c567bc4a40ffd3b53a3603238e9be018da",
                "md5": "3b2718945156f5b7235faa287009b6dd",
                "sha256": "f9402213a375358b4a22d9cdf3b423c4c99016f92a7505c893d0ade399f74aae"
            },
            "downloads": -1,
            "filename": "dynafile-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "3b2718945156f5b7235faa287009b6dd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 55891,
            "upload_time": "2024-04-24T12:38:55",
            "upload_time_iso_8601": "2024-04-24T12:38:55.701993Z",
            "url": "https://files.pythonhosted.org/packages/c4/d9/0d7e79caf7d3c7ac66dac24871c567bc4a40ffd3b53a3603238e9be018da/dynafile-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-24 12:38:55",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "dynafile"
}
        