<p align="center"><img src="https://github.com/slingdata-io/sling-python/raw/main/logo-with-text.png" alt="logo" width="250"/></p>
<p align="center">Slings from a data source to a data target.</p>
## Installation
`pip install sling`
Then you should be able to run `sling --help` from the command line.
## Running an Extract-Load Task
### CLI
```shell
sling run --src-conn MY_PG --src-stream myschema.mytable \
--tgt-conn YOUR_SNOWFLAKE --tgt-object yourschema.yourtable \
--mode full-refresh
```
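The same command can also be launched from a Python script through `subprocess`. The sketch below is one way to do that, assuming the `sling` binary is on `PATH`. `build_sling_cmd` and `run_sling` are hypothetical helpers, not part of the sling package, and they use only the flags shown above.

```python
import subprocess


def build_sling_cmd(src_conn, src_stream, tgt_conn, tgt_object, mode="full-refresh"):
    """Return the argv list for `sling run` (hypothetical helper, not a sling API)."""
    return [
        "sling", "run",
        "--src-conn", src_conn,
        "--src-stream", src_stream,
        "--tgt-conn", tgt_conn,
        "--tgt-object", tgt_object,
        "--mode", mode,
    ]


def run_sling(cmd):
    """Execute the command; raises CalledProcessError on a non-zero exit status."""
    # Passing a list (no shell=True) avoids shell-quoting issues with SQL streams.
    return subprocess.run(cmd, check=True)


cmd = build_sling_cmd("MY_PG", "myschema.mytable", "YOUR_SNOWFLAKE", "yourschema.yourtable")
print(" ".join(cmd))
```

Calling `run_sling(cmd)` would then start the task; it is left out of the sketch so the snippet can be read and tested without a configured connection.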
Or pass a YAML/JSON string or file:
```shell
# write the replication config to a file
echo '
source: MY_POSTGRES
target: MY_SNOWFLAKE

# default config options which apply to all streams
defaults:
  mode: full-refresh
  object: new_schema.{stream_schema}_{stream_table}

streams:
  my_schema.*:
' > /path/to/replication.yaml

sling run -r /path/to/replication.yaml
```
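The `object` default above is a name pattern with `{stream_schema}` and `{stream_table}` placeholders. As a rough illustration of how such a pattern maps each matched stream to a target table, here is a minimal sketch using plain string formatting. Sling performs this expansion itself; `expand_object` and the table names `orders` and `customers` are hypothetical and only serve the example.

```python
def expand_object(pattern, stream):
    """Fill the schema/table placeholders for one stream (illustrative only)."""
    # A stream such as "my_schema.orders" splits into schema and table parts.
    schema, table = stream.split(".", 1)
    return pattern.format(stream_schema=schema, stream_table=table)


pattern = "new_schema.{stream_schema}_{stream_table}"

# Hypothetical tables that the wildcard `my_schema.*` might match.
targets = {s: expand_object(pattern, s) for s in ["my_schema.orders", "my_schema.customers"]}

for stream, target in targets.items():
    print(f"{stream} -> {target}")
```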
### From Lib
Run a replication from file:
```python
import yaml
from sling import Replication
# From a YAML file
replication = Replication(file_path="path/to/replication.yaml")
replication.run()

# Or load the file into a dict and pass it as keyword arguments
with open('path/to/replication.yaml') as file:
    config = yaml.load(file, Loader=yaml.FullLoader)

replication = Replication(**config)
replication.run()
```
Build a replication dynamically:
```python
from sling import Replication, ReplicationStream

# Build one stream per source folder.
# `folders` is assumed to be an iterable of (folder, table_name) pairs defined elsewhere.
streams = {}
for folder, table_name in folders:
    streams[folder] = ReplicationStream(mode='full-refresh', object=table_name, primary_key='_hash_id')

replication = Replication(
    source='aws_s3',
    target='snowflake',
    streams=streams,
    env=dict(SLING_STREAM_URL_COLUMN='true', SLING_LOADED_AT_COLUMN='true'),
    debug=True,
)

replication.run()
```
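The loop in the snippet above iterates over `folders`, which is not defined there. One way to build it is sketched below, with made-up folder paths and a hypothetical `raw` target schema; the naming rule in `to_table_name` is an assumption for illustration, not something sling prescribes.

```python
import re

# Hypothetical S3 folder paths; replace with your own listing.
folder_paths = ["sales/2024/orders/", "sales/2024/line-items/"]


def to_table_name(folder, schema="raw"):
    """Derive a target table name from a folder path (illustrative naming rule)."""
    # Lower-case, then collapse every run of non-alphanumerics into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", folder.lower()).strip("_")
    return f"{schema}.{slug}"


# List of (folder, table_name) pairs, the shape the replication loop expects.
folders = [(path, to_table_name(path)) for path in folder_paths]

for folder, table_name in folders:
    print(folder, "->", table_name)
```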
Run a [Pipeline](https://docs.slingdata.io/concepts/pipeline):
```python
from sling import Pipeline

# From a YAML file
pipeline = Pipeline(file_path="path/to/pipeline.yaml")
pipeline.run()

# Or programmatically
pipeline = Pipeline(
    steps=[
        {"type": "log", "message": "Hello world"},
        {"type": "copy", "from": "sftp//path/to/file", "to": "aws_s3/path/to/file"},
        {"type": "replication", "path": "path/to/replication.yaml"},
        {"type": "http", "url": "https://trigger.webhook.com"},
        {"type": "command", "command": ["ls", "-l"], "print": True}
    ],
    env={"MY_VAR": "value"}
)
pipeline.run()
```
## Config Schema
`--src-conn`/`source.conn` and `--tgt-conn`/`target.conn` can be a connection name or a URL (database, cloud storage or local file):
- `MY_PG` (connection ref in db, profile or env)
- `postgresql://user:password!@host.loc:5432/database`
- `s3://my_bucket/my_folder/file.csv`
- `gs://my_google_bucket/my_folder/file.json`
- `file:///tmp/my_folder/file.csv` (local storage)
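
To make the difference between a connection name and a URL concrete, the sketch below classifies the example values with the standard library's `urllib.parse`. `conn_kind` is a hypothetical helper for illustration only; it does not reproduce how sling itself resolves connections.

```python
from urllib.parse import urlsplit


def conn_kind(conn):
    """Return the URL scheme if `conn` looks like a URL, otherwise 'name'."""
    # Values without "://" are treated as named connection references.
    if "://" not in conn:
        return "name"
    return urlsplit(conn).scheme


examples = [
    "MY_PG",
    "postgresql://user:password!@host.loc:5432/database",
    "s3://my_bucket/my_folder/file.csv",
    "gs://my_google_bucket/my_folder/file.json",
    "file:///tmp/my_folder/file.csv",
]

kinds = {conn: conn_kind(conn) for conn in examples}

for conn, kind in kinds.items():
    print(f"{kind:<12}{conn}")
```
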
`--src-stream`/`source.stream` can be an object name to stream from:
- `TABLE1`
- `SCHEMA1.TABLE2`
- `OBJECT_NAME`
- `select * from SCHEMA1.TABLE3`
- `/path/to/file.sql` (if source conn is DB)
`--tgt-object`/`target.object` can be an object name to write to:
- `TABLE1`
- `SCHEMA1.TABLE2`
### Example as JSON
```json
{
"source": {
"conn": "MY_PG_URL",
"stream": "select * from my_table",
"options": {}
},
"target": {
"conn": "s3://my_bucket/my_folder/new_file.csv",
"options": {
"header": false
}
}
}
```
## Testing
```bash
pytest sling/tests/tests.py -v
```
Raw data
{
"_id": null,
"home_page": "https://github.com/slingdata-io/sling-python",
"name": "sling-linux-amd64",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "sling, etl, elt, extract, load",
"author": "Fritz Larco",
"author_email": "fritz@slingdata.io",
"download_url": "https://files.pythonhosted.org/packages/ec/80/d8c45385fcb9d71885fc3550b2e37fdc78214077404ca18a4e489b98176d/sling_linux_amd64-1.4.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Sling Binary for Linux (AMD64)",
"version": "1.4.2",
"project_urls": {
"Download": "https://github.com/slingdata-io/sling-python/archive/master.zip",
"Homepage": "https://github.com/slingdata-io/sling-python"
},
"split_keywords": [
"sling",
" etl",
" elt",
" extract",
" load"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "83828776fecbdcdbbd02ddcecae61cd39227532fb85d091ce5b1d14e6c696e2a",
"md5": "c7d5883b906076193c92e00165e91f3a",
"sha256": "cf9a41e4eb2bb5028e69313188eeb8b6328e21aa4e9718a528d2e6a92ee7d617"
},
"downloads": -1,
"filename": "sling_linux_amd64-1.4.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "c7d5883b906076193c92e00165e91f3a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 64201462,
"upload_time": "2025-02-03T21:43:28",
"upload_time_iso_8601": "2025-02-03T21:43:28.084796Z",
"url": "https://files.pythonhosted.org/packages/83/82/8776fecbdcdbbd02ddcecae61cd39227532fb85d091ce5b1d14e6c696e2a/sling_linux_amd64-1.4.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "ec80d8c45385fcb9d71885fc3550b2e37fdc78214077404ca18a4e489b98176d",
"md5": "59a4bfc59d8abea625bbb315467f758e",
"sha256": "4935332ccefa56abafccb89ead7b743ecaafaf6059b8f5d9ce9e9ca21b9b2b36"
},
"downloads": -1,
"filename": "sling_linux_amd64-1.4.2.tar.gz",
"has_sig": false,
"md5_digest": "59a4bfc59d8abea625bbb315467f758e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 63901425,
"upload_time": "2025-02-03T21:43:38",
"upload_time_iso_8601": "2025-02-03T21:43:38.001399Z",
"url": "https://files.pythonhosted.org/packages/ec/80/d8c45385fcb9d71885fc3550b2e37fdc78214077404ca18a4e489b98176d/sling_linux_amd64-1.4.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-03 21:43:38",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "slingdata-io",
"github_project": "sling-python",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "sling-linux-amd64"
}