![lokii-logo](https://github.com/dorukerenaktas/lokii/assets/20422563/fe774eba-ddd0-4bad-a093-553bb980f54c)
![PyPI](https://img.shields.io/pypi/v/lokii)
![PyPI - Downloads](https://img.shields.io/pypi/dm/lokii)
![GitHub Workflow Status](https://img.shields.io/github/actions/workflow/status/dorukerenaktas/lokii/python-app.yml)
![Libraries.io dependency status for GitHub repo](https://img.shields.io/librariesio/github/dorukerenaktas/lokii)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Licence](https://img.shields.io/pypi/l/lokii.svg)](https://github.com/dorukerenaktas/lokii)
**`lokii`** is a package for generating relational datasets, designed to help you build robust development
environments. With **`lokii`**, you can generate diverse datasets that mimic real-world scenarios and use them
for end-to-end testing of your applications.
![lokii_animated](https://github.com/dorukerenaktas/lokii/assets/20422563/9145c764-2db2-4c16-9019-e1feca323ae8)
# Project structure
**`lokii`** uses the hierarchical structure of the file system to discover groups and nodes. Each dataset
consists of nodes, which are defined using `.node.py` files. For instance, in the context of a database, each
node represents a table. You can also group nodes, much like tables are grouped under schemas within a database.
Groups define how the generated node data will be exported. You can recognize group files by their `.group.py` file extension.
```shell
# example project directory structure
proj_dir
├── group_1
│ ├── group_1.group.py
│ ├── node_1.node.py
│ └── node_2.node.py
├── group_2
│ ├── node_3.node.py
│ └── node_4.node.py
├── group_3.group.py
├── node_5.node.py
└── node_6.node.py
```
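As a rough illustration of discovery by file extension, the sketch below walks a project directory and collects files ending in `.node.py` and `.group.py`. The `discover` function is a hypothetical helper written for this example; it is not part of **`lokii`** and does not show how **`lokii`** assigns nodes to groups.

```python
from pathlib import Path


def discover(root):
    """Hypothetical helper: find node and group files under `root`.

    Returns two lists of paths sorted by location: (node_files, group_files).
    """
    root = Path(root)
    # rglob searches the directory tree recursively
    node_files = sorted(root.rglob("*.node.py"))
    group_files = sorted(root.rglob("*.group.py"))
    return node_files, group_files
```

Calling `discover("proj_dir")` on the layout above would return the node files in one list and `group_1.group.py` and `group_3.group.py` in the other.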
## Node Definition
A node file defines how each item will be generated. Node definition files support the following special
variables and functions:
- `name`: Name of the node; the filename is used if not provided
- `source`: Source query that retrieves the parameters each item depends on
- `item`: Generation function that returns one item of the node for each source row
```python
# offices.node.py
from faker import Faker

# use your favorite tools to generate data
# you can even use a database connection, the filesystem or AI
fake = Faker()

# optionally override the node name; if not provided, the filename is used
# the name can be used in source queries when rows depend on another node
# name = "business.offices"

# define a query that returns one or more rows
source = "SELECT * FROM range(10)"


# item function will be called for each row in the `source` query result
def item(args):
    address = fake.address().split("\n")
    return {
        "officeCode": args["id"],
        "city": fake.city(),
        "phone": fake.phone_number(),
        "addressLine1": address[0],
        "addressLine2": address[1],
        "state": fake.city(),
        "country": fake.country(),
        "postalCode": fake.postcode(),
        "territory": fake.administrative_unit(),
    }
```
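To try an `item` function without running a full generation, you can call it yourself with hand-written rows. The sketch below assumes only what the example above shows: `item` receives one dict-like row per call and returns a dict. The `preview` helper, the `CITIES` list and the sample rows are made up for illustration, and the standard library `random` module stands in for `faker` so the snippet has no dependencies. It does not reproduce how **`lokii`** executes nodes internally.

```python
import random

# seeded generator so repeated runs produce the same values
rng = random.Random(42)

# made-up sample values, used here instead of faker
CITIES = ["Ankara", "Berlin", "Lima"]


def item(args):
    # `args` is assumed to be a dict-like row from the `source` query
    return {
        "officeCode": args["id"],
        "city": rng.choice(CITIES),
    }


def preview(item_fn, rows):
    """Hypothetical helper: apply an item function to each source row."""
    return [item_fn(row) for row in rows]


# three hand-written rows standing in for a source query result
rows = [{"id": i} for i in range(3)]
offices = preview(item, rows)
```

Each entry in `offices` is a dict with the same keys that `item` returns, one per input row.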
## Group Definition
A group file defines how each node's data will be exported. Group definition files support the following special functions:
- `before`: Called once before export operation
- `export`: Called for every node in the group
- `after`: Called once after export operation
```python
# filesystem.group.py
import os
import shutil
from csv import DictWriter

out_path = "out_data"


def before(args):
    """
    Executed before the export function.
    :param args: contains the names of the nodes that belong to this group
    :type args: {"nodes": list[str]}
    """
    if os.path.exists(out_path):
        # always clear your storage before starting a new export
        shutil.rmtree(out_path)
    os.makedirs(out_path)


def export(args):
    """
    Executed for every node that belongs to this group.
    :param args: contains node name, node columns and a batch iterator
    :type args: {"name": str, "cols": list[str], "batches": list[dict]}
    """
    node_name = args["name"]
    node_cols = args["cols"]
    batches = args["batches"]
    # out_data/offices.csv
    out_file_path = os.path.join(out_path, node_name + ".csv")
    with open(out_file_path, 'w+', newline='', encoding='utf-8') as outfile:
        writer = DictWriter(outfile, fieldnames=node_cols)
        writer.writeheader()
        # each batch is a list of row dicts
        for batch in batches:
            writer.writerows(batch)


def after(args):
    """
    Executed after the export function.
    :param args: contains the names of the nodes that belong to this group
    :type args: {"nodes": list[str]}
    """
    pass
```
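The same three hooks can target other formats. As a minimal sketch, the group below writes each node to a JSON Lines file instead of CSV. It relies only on the `args` shape documented in the docstrings above (`name` and `batches`, where each batch is an iterable of row dicts). The file name `jsonl.group.py` and the `out_jsonl` directory are hypothetical choices for this example.

```python
# jsonl.group.py (hypothetical file name)
import json
import os

out_path = "out_jsonl"


def before(args):
    # make sure the output directory exists before any node is exported
    os.makedirs(out_path, exist_ok=True)


def export(args):
    # out_jsonl/<node name>.jsonl, one JSON object per line
    out_file_path = os.path.join(out_path, args["name"] + ".jsonl")
    with open(out_file_path, "w", encoding="utf-8") as outfile:
        for batch in args["batches"]:
            for row in batch:
                # default=str turns values json cannot encode (e.g. dates) into strings
                outfile.write(json.dumps(row, default=str) + "\n")


def after(args):
    # nothing to clean up for plain files
    pass
```

Unlike the CSV example, this sketch does not clear the output directory first, so files of nodes that no longer exist would remain from earlier runs.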
## Upload to PyPI
You can create the source distribution of the package by running the command given below:
```shell
python3 setup.py sdist
```
Install twine and upload the distribution to PyPI using the `finnetdevlab` username.
```shell
pip3 install twine
twine upload dist/*
```
## Requirements
Package requirements are handled using pip. To install them, run:
```shell
pip install -r requirements.txt
pip install -r requirements.dev.txt
```
## Tests
Testing is set up using [pytest](http://pytest.org) and coverage is handled
with the pytest-cov plugin.
Run your tests with `pytest` (or the legacy `py.test` alias) in the root directory.
Coverage runs by default and is configured in the `pytest.ini` file.
Raw data
{
"_id": null,
"home_page": "https://github.com/dorukerenaktas/lokii",
"name": "lokii",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "data generation,relational datasets,development environment,testing,database",
"author": "Doruk Eren Akta\u015f",
"author_email": "dorukerenaktas@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/9f/00/55894d76b853409465a08f89f6db65b90244ea1b44ae2512d2afa5217680/lokii-1.1.6.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT License",
"summary": "Generate, Load, Develop and Test with consistent relational datasets!",
"version": "1.1.6",
"project_urls": {
"Homepage": "https://github.com/dorukerenaktas/lokii"
},
"split_keywords": [
"data generation",
"relational datasets",
"development environment",
"testing",
"database"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "9f0055894d76b853409465a08f89f6db65b90244ea1b44ae2512d2afa5217680",
"md5": "965012413d68d611645f2d4e039f9a4d",
"sha256": "6b73973e73d9deae19ae05b22441a383ad10a2ae80049d1cb01a7ca12f324b30"
},
"downloads": -1,
"filename": "lokii-1.1.6.tar.gz",
"has_sig": false,
"md5_digest": "965012413d68d611645f2d4e039f9a4d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 24315,
"upload_time": "2023-06-13T06:29:44",
"upload_time_iso_8601": "2023-06-13T06:29:44.762876Z",
"url": "https://files.pythonhosted.org/packages/9f/00/55894d76b853409465a08f89f6db65b90244ea1b44ae2512d2afa5217680/lokii-1.1.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-13 06:29:44",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "dorukerenaktas",
"github_project": "lokii",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [],
"lcname": "lokii"
}