<h1 style="width: 100%; text-align: center; margin-bottom: 20px; border-bottom: 0px;">BeETL: Extensible Python/Polars-based ETL Framework</h1>
<p style="text-align: center; margin-bottom: 30px;"><img src="./docs/images/beetl.jpg" style="max-width: 400px;" alt=" "><br/></p>
BeETL grew out of our work as integration developers, where most of the integrations we build follow the same pattern: get data here, transform it a little, and put it there (the middle step is frequently missing altogether).

After building our 16th integration between the same two systems from yet another hand-written template, we decided to build BeETL. Each sync is currently limited to one source datasource and one destination datasource, but this will be expanded in the future. One configuration can contain multiple syncs.

Note: Configuration can be written in YAML, in JSON, or as a Python dictionary. The example below uses a Python dictionary.
## TOC
- [Minimal example](#minimal-example)
- [Installation](#installation)
  - [From PyPI](#from-pypi)
  - [From Source](#from-source)
- [Getting Started](#getting-started)
- [Development Environment](#development-environment)
- [Documentation](https://beetl.docs.hoglan.dev/)
- [Source Code](https://github.com/hoglandets-it/beetl)
## Minimal example
```python
# Sync users from one table to another in the same database.
# The `src.` prefix assumes the script runs from a checkout of the repository.
from src.beetl.beetl import Beetl, BeetlConfig

config = BeetlConfig({
    "version": "V1",
    "sources": [
        {
            "name": "Sqlserver",
            "type": "Sqlserver",
            "connection": {
                "settings": {
                    "connection_string": "Server=myServerAddress;Database=myDataBase;User Id=myUsername;Password=myPassword;"
                }
            }
        }
    ],
    "sync": [
        {
            "name": "Sync between two tables in a sql server",
            # "source" and "destination" refer to the source named above
            "source": "Sqlserver",
            "sourceConfig": {
                "query": "SELECT id, name, email FROM users"
            },
            "destination": "Sqlserver",
            "destinationConfig": {
                "table": "users",
                "unique_columns": ["id"]
            },
            "comparisonColumns": [
                {
                    "name": "id",
                    "type": "Int32",
                    "unique": True
                },
                {
                    "name": "name",
                    "type": "Utf8"
                },
                {
                    "name": "email",
                    "type": "Utf8"
                }
            ]
        }
    ]
})

# Run the syncs defined in the configuration
Beetl(config).sync()
```
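
Because each sync refers to its source and destination by name, a spelling mismatch such as `SqlServer` versus `Sqlserver` is easy to introduce. The following is a minimal sketch of a pre-flight check on a plain configuration dictionary, here loaded from JSON text. The helper `find_unknown_sources` is hypothetical and not part of BeETL, and the case-sensitive comparison is an assumption about how names are matched.

```python
import json


def find_unknown_sources(config: dict) -> list:
    """Return (sync name, key, value) for each sync that references an undefined source.

    Hypothetical helper for illustration; it is not part of BeETL.
    """
    # Names declared under "sources" (compared case-sensitively, by assumption)
    defined = {source["name"] for source in config.get("sources", [])}
    problems = []
    for sync in config.get("sync", []):
        for key in ("source", "destination"):
            ref = sync.get(key)
            if ref not in defined:
                problems.append((sync.get("name"), key, ref))
    return problems


# A trimmed configuration with the same layout as the minimal example, as JSON text.
config_json = """
{
  "version": "V1",
  "sources": [{"name": "Sqlserver", "type": "Sqlserver"}],
  "sync": [
    {"name": "users", "source": "Sqlserver", "destination": "SqlServer"}
  ]
}
"""

config_dict = json.loads(config_json)
problems = find_unknown_sources(config_dict)
print(problems)  # [('users', 'destination', 'SqlServer')]
```

An empty list means every sync points at a declared source, and the dictionary can then be passed to `BeetlConfig` as in the example above.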
## Installation
### From PyPI
```bash
#!/bin/bash
python -m pip install beetl
```
### From Source
```bash
#!/bin/bash
# Clone and enter the repository
git clone https://github.com/Hoglandets-IT/beetl.git
cd ./beetl
# Install the build tools
python -m pip install build
# Build beetl
python -m build
# Install beetl from locally built package
python -m pip install ./dist/*.tar.gz
```
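
After either installation route, it can be useful to confirm which version ended up in the active environment. This is a small sketch using only the standard library's `importlib.metadata`, which is available on Python 3.8 and later. The helper name `installed_version` is illustrative and not part of BeETL.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string, or None if the package is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the installed beetl version, or None if beetl is missing from this environment
print(installed_version("beetl"))
```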
## Getting Started
The latest information on how to use BeETL is in the [official docs](https://beetl.docs.hoglan.dev/getting-started.html).
## Development Environment
The easiest way to get started is to use the included devcontainer.
### Requirements
- Docker
- Visual Studio Code
### Steps
1. Clone the repository.
1. Open the repository in Visual Studio Code.
1. Install the recommended extensions.
1. Open the command palette (`Ctrl+Shift+P`), search for `reopen in container`, and run it.
    - The devcontainer will now be provisioned in your local Docker instance, and Visual Studio Code will automatically connect to it.
1. You can now use the included launch profiles to either open the docs or run the tests file.
1. You can also use the built-in test explorer to run the available tests.
Raw data
{
"_id": null,
"home_page": null,
"name": "beetl",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "python, template, package, module, cli",
"author": null,
"author_email": "Lars Scheibling <lars.scheibling@hoglandet.se>",
"download_url": "https://files.pythonhosted.org/packages/a5/14/a355431f84f5e6a8719d5d5beaef2a616594872f18a7f0f7a36bbe191a67/beetl-1.0.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GnuPG 3.0",
"summary": "BeETL is a Python package for extracting data from one datasource, transforming it and loading it into another datasource.",
"version": "1.0.2",
"project_urls": {
"github": "https://github.com/Hoglandets-IT/beetl"
},
"split_keywords": [
"python",
" template",
" package",
" module",
" cli"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d3fadd213358528749fe4f6ae78f16d4b5bd1a2eac2eabec79b4055f28c6bacc",
"md5": "1e0275f8e95755668533f573b64f6f16",
"sha256": "005666244bb1ce83df17421f5cf1b10c210095ab817b22c97f43f897ab7017a8"
},
"downloads": -1,
"filename": "beetl-1.0.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1e0275f8e95755668533f573b64f6f16",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 38294,
"upload_time": "2024-12-19T15:31:54",
"upload_time_iso_8601": "2024-12-19T15:31:54.341821Z",
"url": "https://files.pythonhosted.org/packages/d3/fa/dd213358528749fe4f6ae78f16d4b5bd1a2eac2eabec79b4055f28c6bacc/beetl-1.0.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "a514a355431f84f5e6a8719d5d5beaef2a616594872f18a7f0f7a36bbe191a67",
"md5": "96475a1561e7da263634b0edd933daca",
"sha256": "a77f3d8b5d0ea95e760d236232838d27e8a445d99ab0b177e1f78680223afd98"
},
"downloads": -1,
"filename": "beetl-1.0.2.tar.gz",
"has_sig": false,
"md5_digest": "96475a1561e7da263634b0edd933daca",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 30170,
"upload_time": "2024-12-19T15:31:56",
"upload_time_iso_8601": "2024-12-19T15:31:56.643328Z",
"url": "https://files.pythonhosted.org/packages/a5/14/a355431f84f5e6a8719d5d5beaef2a616594872f18a7f0f7a36bbe191a67/beetl-1.0.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-19 15:31:56",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Hoglandets-IT",
"github_project": "beetl",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "requests",
"specs": []
},
{
"name": "polars",
"specs": [
[
"==",
"1.14.0"
]
]
},
{
"name": "sqlalchemy",
"specs": []
},
{
"name": "pandas",
"specs": [
[
"==",
"2.2.3"
]
]
},
{
"name": "psycopg",
"specs": []
},
{
"name": "pyyaml",
"specs": []
},
{
"name": "pymysql",
"specs": []
},
{
"name": "pyodbc",
"specs": []
},
{
"name": "pymssql",
"specs": []
},
{
"name": "mysql-connector-python",
"specs": []
},
{
"name": "alive-progress",
"specs": []
},
{
"name": "tabulate",
"specs": []
},
{
"name": "pymongo",
"specs": [
[
"==",
"4.10.1"
]
]
},
{
"name": "cryptography",
"specs": []
},
{
"name": "testcontainers",
"specs": [
[
"==",
"4.8.2"
]
]
},
{
"name": "faker",
"specs": []
}
],
"lcname": "beetl"
}