<p align="center"><img src="https://github.com/slingdata-io/sling-python/raw/main/logo-with-text.png" alt="logo" width="250"/></p>
<p align="center">Slings from a data source to a data target.</p>
## Installation
`pip install sling`
Then you should be able to run `sling --help` from the command line.
## Running an Extract-Load Task
### CLI
```shell
sling run --src-conn MY_PG --src-stream myschema.mytable \
  --tgt-conn YOUR_SNOWFLAKE --tgt-object yourschema.yourtable \
  --mode full-refresh
```
Or pass a YAML/JSON string or file:
```shell
# write the replication config to a file
cat > /path/to/replication.yaml <<'EOF'
source: MY_POSTGRES
target: MY_SNOWFLAKE

# default config options which apply to all streams
defaults:
  mode: full-refresh
  object: new_schema.{stream_schema}_{stream_table}

streams:
  my_schema.*:
EOF

sling run -r /path/to/replication.yaml
```
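The `object` value under `defaults` uses the placeholders `{stream_schema}` and `{stream_table}`, which are filled in for each matched stream. As a rough illustration only, the sketch below mimics that substitution with plain Python string formatting. It is not sling's implementation, and `expand_object` is a hypothetical helper introduced here.

```python
def expand_object(template: str, stream_name: str) -> str:
    """Fill {stream_schema} and {stream_table} for a 'schema.table' stream name.

    Hypothetical helper: it only imitates the naming pattern, assuming the
    stream name has exactly the form 'schema.table'.
    """
    schema, table = stream_name.split(".", 1)
    return template.format(stream_schema=schema, stream_table=table)


# A table my_schema.orders matched by the wildcard stream `my_schema.*`
print(expand_object("new_schema.{stream_schema}_{stream_table}", "my_schema.orders"))
# prints: new_schema.my_schema_orders
```

Under this assumption, every table matched by `my_schema.*` would land in `new_schema` with the source schema name as a prefix.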
### From Lib
Run a replication from file:
```python
import yaml
from sling import Replication

# load the replication config from a YAML file
with open('path/to/replication.yaml') as file:
    config = yaml.safe_load(file)

# the top-level config keys become keyword arguments
replication = Replication(**config)
replication.run()
```
Build a replication dynamically:
```python
from sling import Replication, ReplicationStream

# hypothetical example input: (source folder, target table) pairs
folders = [
    ('my_folder/', 'my_schema.my_table'),
]

# build one stream per folder
streams = {}
for folder, table_name in folders:
    streams[folder] = ReplicationStream(mode='full-refresh', object=table_name, primary_key='_hash_id')

# build and run the sling replication
replication = Replication(
    source='aws_s3',
    target='snowflake',
    streams=streams,
    env=dict(SLING_STREAM_URL_COLUMN='true', SLING_LOADED_AT_COLUMN='true'),
    debug=True,
)
replication.run()
```
## Config Schema
`--src-conn`/`source.conn` and `--tgt-conn`/`target.conn` can be a connection name or a URL:
- `MY_PG` (connection ref in db, profile or env)
- `postgresql://user:password!@host.loc:5432/database`
- `s3://my_bucket/my_folder/file.csv`
- `gs://my_google_bucket/my_folder/file.json`
- `file:///tmp/my_folder/file.csv` (local storage)
`--src-stream`/`source.stream` can be an object name, a SQL query or a path to a SQL file to stream from:
- `TABLE1`
- `SCHEMA1.TABLE2`
- `OBJECT_NAME`
- `select * from SCHEMA1.TABLE3`
- `/path/to/file.sql` (if source conn is DB)
`--tgt-object`/`target.object` can be an object name to write to:
- `TABLE1`
- `SCHEMA1.TABLE2`
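
To make the distinction between a connection name and a URL concrete, here is a minimal sketch that classifies the example values above using only the standard library. It is an illustration of the two forms, not how sling resolves connections, and `describe_conn` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def describe_conn(value: str) -> str:
    """Return 'url:<scheme>' for URL-style values, else 'name'.

    Hypothetical helper: a value counts as a URL only if it contains '://'
    and has a parseable scheme; anything else is treated as a connection name.
    """
    scheme = urlsplit(value).scheme
    if "://" in value and scheme:
        return f"url:{scheme}"
    return "name"


for conn in ["MY_PG", "s3://my_bucket/my_folder/file.csv", "file:///tmp/my_folder/file.csv"]:
    print(conn, "->", describe_conn(conn))
```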
### Example as JSON
```json
{
  "source": {
    "conn": "MY_PG_URL",
    "stream": "select * from my_table",
    "options": {}
  },
  "target": {
    "conn": "s3://my_bucket/my_folder/new_file.csv",
    "options": {
      "header": false
    }
  }
}
```
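
The Config Schema section pairs each config key with a CLI flag (`source.conn` with `--src-conn`, and so on). The sketch below shows that pairing by turning the JSON example into a `sling run` argument list. It is a sketch under assumptions: `config_to_args` and `FLAG_FOR_KEY` are hypothetical names, only the four flags listed in this README are mapped, and the `options` blocks are deliberately ignored. The command is printed, not executed.

```python
import json
import shlex

# Pairing of (section, key) to CLI flag, as listed in the Config Schema section
FLAG_FOR_KEY = {
    ("source", "conn"): "--src-conn",
    ("source", "stream"): "--src-stream",
    ("target", "conn"): "--tgt-conn",
    ("target", "object"): "--tgt-object",
}


def config_to_args(config: dict) -> list:
    """Build a `sling run` argument list from a config dict (options are skipped)."""
    args = ["sling", "run"]
    for (section, key), flag in FLAG_FOR_KEY.items():
        value = config.get(section, {}).get(key)
        if value is not None:
            args += [flag, value]
    return args


config = json.loads("""
{
  "source": {"conn": "MY_PG_URL", "stream": "select * from my_table", "options": {}},
  "target": {"conn": "s3://my_bucket/my_folder/new_file.csv", "options": {"header": false}}
}
""")

# shlex.join quotes the SQL query so the line can be pasted into a shell
print(shlex.join(config_to_args(config)))
```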
## Package Metadata
- Name: `sling`
- Version: 1.3.3
- Summary: Slings data from a source to a target
- Author: Fritz Larco (fritz@slingdata.io)
- Homepage: https://github.com/slingdata-io/sling-python
- Keywords: sling, etl, elt, extract, load
- Source distribution: `sling-1.3.3.tar.gz` (6931 bytes), uploaded 2024-12-16T12:18:53Z
- Download URL: https://files.pythonhosted.org/packages/89/c2/8477f106b4cc73d4bf99ab7f5b71167aa3f333c6d11ee11b98aef937cb68/sling-1.3.3.tar.gz
- SHA256: `f43e86c696d18b04bc3a20775f61782046fa08dcc31736d72434542b966a8c52`