lokii

- Name: lokii
- Version: 1.1.6
- Home page: https://github.com/dorukerenaktas/lokii
- Summary: Generate, Load, Develop and Test with consistent relational datasets!
- Upload time: 2023-06-13 06:29:44
- Author: Doruk Eren Aktaş
- License: MIT License
- Keywords: data generation, relational datasets, development environment, testing, database
- Requirements: No requirements were recorded.
![lokii-logo](https://github.com/dorukerenaktas/lokii/assets/20422563/fe774eba-ddd0-4bad-a093-553bb980f54c)

![PyPI](https://img.shields.io/pypi/v/lokii)
![PyPI - Downloads](https://img.shields.io/pypi/dm/lokii)
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/dorukerenaktas/lokii/python-app.yml)
![Libraries.io dependency status for GitHub repo](https://img.shields.io/librariesio/github/dorukerenaktas/lokii)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Licence](https://img.shields.io/pypi/l/lokii.svg)](https://github.com/dorukerenaktas/lokii)

**`lokii`** is a powerful package that enables the generation of relational datasets, specifically tailored to
facilitate the creation of robust development environments. With **`lokii`**, you can effortlessly generate diverse
datasets that mimic real-world scenarios, allowing for comprehensive end-to-end testing of your applications.

![lokii_animated](https://github.com/dorukerenaktas/lokii/assets/20422563/9145c764-2db2-4c16-9019-e1feca323ae8)

# Project structure

**`lokii`** leverages the hierarchical structure of the file system to discover groups and nodes. Each dataset
consists of nodes, which are defined in `.node.py` files. In the context of a database, for instance, each
node represents a table, and nodes can be grouped the way tables are grouped under database schemas. Groups
define how the generated node data will be exported; you can recognize group files by their `.group.py` file extension.

```shell
# example project directory structure
proj_dir
    ├── group_1
    │   ├── group_1.group.py
    │   ├── node_1.node.py
    │   └── node_2.node.py
    ├── group_2
    │   ├── node_3.node.py
    │   └── node_4.node.py
    ├── group_3.group.py
    ├── node_5.node.py
    └── node_6.node.py
```

## Node Definition

A node file defines how each item will be generated. Node definition files support the following special
variables and functions:
- `name`: Name of the node; the filename is used if not provided
- `source`: Source query that retrieves the dependent parameters for each item
- `item`: Generation function that returns each item in the node

```python
# offices.node.py
from faker import Faker

# use your favorite tools to generate data
# you can even use database connection, filesystem or AI
fake = Faker()

# you can override the node name; if not provided, the filename will be used
# the name can be referenced in source queries to retrieve rows that depend on another node
# name = "business.offices"

# define a query that returns one or more rows
source = "SELECT * FROM range(10)"


# item function will be called for each row in `source` query result
def item(args):
    address = fake.address().split("\n")
    return {
        "officeCode": args["id"],
        "city": fake.city(),
        "phone": fake.phone_number(),
        "addressLine1": address[0],
        "addressLine2": address[1],
        "state": fake.city(),
        "country": fake.country(),
        "postalCode": fake.postcode(),
        "territory": fake.administrative_unit(),
    }
```
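The `source` query can also reference another node by its name, so each generated row depends on previously generated data. The sketch below is illustrative only: the `employees` node and the `officeCode` column it reads are assumptions for the example, and `random` stands in for whatever generator you prefer.

```python
# employees.node.py — illustrative sketch; the node and column names here
# (employees, officeCode) are assumptions, not taken from the project
import random

# each row returned by this query produces one call to `item`
source = "SELECT * FROM offices"


def item(args):
    # args is one row of the source query result, so every generated
    # employee is attached to an office that was actually generated
    return {
        "employeeNumber": random.randint(1000, 9999),
        "officeCode": args["officeCode"],
    }
```

Because `item` receives one source row at a time, foreign-key consistency between nodes falls out naturally from the query.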

## Group Definition

A group file defines how the data of each node will be exported. Group definition files support the following special functions:
- `before`: Called once before the export operation
- `export`: Called for every node in the group
- `after`: Called once after the export operation

```python
# filesystem.group.py
import os
import shutil
from csv import DictWriter

out_path = "out_data"


def before(args):
    """
    Executed before the export operation.
    :param args: contains the node names that belong to this group
    :type args: {"nodes": list[str]}
    """
    if os.path.exists(out_path):
        # always clear your storage before starting a new export
        shutil.rmtree(out_path)
    os.makedirs(out_path)


def export(args):
    """
    Executed for every node that belongs to this group.
    :param args: contains the node name, node columns and a batch iterator
    :type args: {"name": str, "cols": list[str], "batches": list[dict]}
    """
    node_name = args["name"]
    node_cols = args["cols"]
    batches = args["batches"]
    # out_data/offices.csv
    out_file_path = os.path.join(out_path, node_name + ".csv")
    with open(out_file_path, 'w+', newline='', encoding='utf-8') as outfile:
        writer = DictWriter(outfile, fieldnames=node_cols)
        writer.writeheader()
        for batch in batches:
            writer.writerows(batch)


def after(args):
    """
    Executed after the export operation.
    :param args: contains the node names that belong to this group
    :type args: {"nodes": list[str]}
    """
    pass
```
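Because a group only needs `before`, `export` and `after`, switching the output format is just a different group file. Below is a minimal sketch that writes each node as JSON Lines instead of CSV, assuming the same `args` shapes as in the example above; the `out_data_jsonl` path is an arbitrary choice for the example.

```python
# jsonl.group.py — minimal sketch, assuming the same args shapes as the CSV group
import json
import os
import shutil

out_path = "out_data_jsonl"


def before(args):
    # start every export from a clean output directory
    if os.path.exists(out_path):
        shutil.rmtree(out_path)
    os.makedirs(out_path)


def export(args):
    # write one JSON object per line for each row of the node
    out_file_path = os.path.join(out_path, args["name"] + ".jsonl")
    with open(out_file_path, "w", encoding="utf-8") as outfile:
        for batch in args["batches"]:
            for row in batch:
                outfile.write(json.dumps(row) + "\n")


def after(args):
    pass
```

JSON Lines keeps each row self-describing, so downstream tools can load the output without knowing the column order in advance.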


## Upload to PyPI

You can create the source distribution of the package by running:

```shell
python3 setup.py sdist
```

Install twine and upload to PyPI with the `finnetdevlab` username.

```shell
pip3 install twine
twine upload dist/*
```

## Requirements

Package requirements are handled using pip. To install them, run:

```shell
pip install -r requirements.txt
pip install -r requirements.dev.txt
```

## Tests

Testing is set up using [pytest](http://pytest.org) and coverage is handled
with the pytest-cov plugin.

Run your tests with `py.test` in the root directory.

Coverage is run by default and is set in the `pytest.ini` file.

            
