sling


| Field | Value |
| --- | --- |
| Name | sling |
| Version | 1.2.14 |
| Home page | https://github.com/slingdata-io/sling-python |
| Summary | Slings data from a source to a target |
| Upload time | 2024-07-25 13:44:41 |
| Maintainer | None |
| Docs URL | None |
| Author | Fritz Larco |
| Requires Python | None |
| License | None |
| Keywords | sling, etl, elt, extract, load |
<p align="center"><img src="https://github.com/slingdata-io/sling-python/raw/main/logo-with-text.png" alt="logo" width="250"/></p>

<p align="center">Slings from a data source to a data target.</p>

## Installation

`pip install sling`

Then you should be able to run `sling --help` from the command line.

## Running an Extract-Load Task

### CLI

```shell
sling run --src-conn MY_PG --src-stream myschema.mytable \
  --tgt-conn YOUR_SNOWFLAKE --tgt-object yourschema.yourtable \
  --mode full-refresh
```
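The same command can be assembled from Python and handed to `subprocess`. This is only a sketch: the helper names `build_run_args` and `run_if_available` are hypothetical, the flags are the ones shown in the command above, and it assumes the `sling` binary installed by `pip install sling` is on your `PATH`.

```python
import shlex
import shutil
import subprocess


def build_run_args(src_conn, src_stream, tgt_conn, tgt_object, mode):
    """Build the argument list for `sling run` from the flags shown above."""
    return [
        "sling", "run",
        "--src-conn", src_conn,
        "--src-stream", src_stream,
        "--tgt-conn", tgt_conn,
        "--tgt-object", tgt_object,
        "--mode", mode,
    ]


def run_if_available(args):
    """Run the command only if the sling binary is found; return True if it ran."""
    if shutil.which(args[0]) is None:
        return False
    subprocess.run(args, check=True)  # raises CalledProcessError on failure
    return True


args = build_run_args(
    "MY_PG", "myschema.mytable",
    "YOUR_SNOWFLAKE", "yourschema.yourtable",
    "full-refresh",
)
# Show the shell-quoted command without executing it.
print(shlex.join(args))
```

Call `run_if_available(args)` once the `MY_PG` and `YOUR_SNOWFLAKE` connections are defined in your environment.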

Or pass a YAML/JSON string or file:

```shell
# write the replication config to a file
cat > /path/to/replication.yaml <<'EOF'
source: MY_POSTGRES
target: MY_SNOWFLAKE

# default config options which apply to all streams
defaults:
  mode: full-refresh
  object: new_schema.{stream_schema}_{stream_table}

streams:
  my_schema.*:
EOF

sling run -r /path/to/replication.yaml
```
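The `object` default above uses the `{stream_schema}` and `{stream_table}` placeholders, so each stream matched by `my_schema.*` gets its own target table name. The sketch below only illustrates what the resulting names look like. It is not how sling implements the substitution; `expand_object` is a hypothetical helper and `orders` is a made-up table name.

```python
def expand_object(template: str, stream: str) -> str:
    """Fill {stream_schema} and {stream_table} from a 'schema.table' stream name."""
    schema, table = stream.split(".", 1)
    return template.format(stream_schema=schema, stream_table=table)


# 'my_schema.orders' stands in for one table matched by the 'my_schema.*' wildcard.
target = expand_object("new_schema.{stream_schema}_{stream_table}", "my_schema.orders")
print(target)  # new_schema.my_schema_orders
```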

### From Lib

Run a replication from file:

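```python
import yaml
from sling import Replication

# load the replication config (source, target, defaults, streams) from the YAML file
with open('/path/to/replication.yaml') as file:
  config = yaml.safe_load(file)

# the top-level YAML keys are passed as keyword arguments
replication = Replication(**config)

replication.run()
```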

Build a replication dynamically:

```python
from sling import Replication, ReplicationStream

# hypothetical (folder, table name) pairs; replace with your own
folders = [
  ('my_folder_1/', 'my_schema.table_1'),
  ('my_folder_2/', 'my_schema.table_2'),
]

# build sling replication: one stream per source folder
streams = {}
for folder, table_name in folders:
  streams[folder] = ReplicationStream(mode='full-refresh', object=table_name, primary_key='_hash_id')

replication = Replication(
  source='aws_s3',
  target='snowflake',
  streams=streams,
  env=dict(SLING_STREAM_URL_COLUMN='true', SLING_LOADED_AT_COLUMN='true'),
  debug=True,
)

replication.run()
```

## Config Schema

`--src-conn`/`source.conn` and `--tgt-conn`/`target.conn` can be a connection name or a URL:
- `MY_PG` (connection ref in db, profile or env)
- `postgresql://user:password!@host.loc:5432/database`
- `s3://my_bucket/my_folder/file.csv`
- `gs://my_google_bucket/my_folder/file.json`
- `file:///tmp/my_folder/file.csv` (local storage)
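To make the distinction between a named connection and a URL concrete, here is a small sketch that classifies the example values above. It is an illustration only and does not reflect how sling resolves connections; `describe_conn` is a hypothetical helper.

```python
from urllib.parse import urlparse


def describe_conn(value: str) -> str:
    """Label a connection value as a named reference or a URL with its scheme."""
    if "://" not in value:
        return "named connection"
    return f"URL ({urlparse(value).scheme})"


examples = [
    "MY_PG",
    "postgresql://user:password!@host.loc:5432/database",
    "s3://my_bucket/my_folder/file.csv",
    "gs://my_google_bucket/my_folder/file.json",
    "file:///tmp/my_folder/file.csv",
]
for value in examples:
    print(f"{value} -> {describe_conn(value)}")
```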

`--src-stream`/`source.stream` can be a table or object name, a SQL query, or a path to a SQL file to stream from:
- `TABLE1`
- `SCHEMA1.TABLE2`
- `OBJECT_NAME`
- `select * from SCHEMA1.TABLE3`
- `/path/to/file.sql` (if the source connection is a database)

`--tgt-object`/`target.object` can be an object name to write to:
- `TABLE1`
- `SCHEMA1.TABLE2`

### Example as JSON

```json
{
  "source": {
    "conn": "MY_PG_URL",
    "stream": "select * from my_table",
    "options": {}
  },
  "target": {
    "conn": "s3://my_bucket/my_folder/new_file.csv",
    "options": {
      "header": false
    }
  }
}
```
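If the configuration is generated by a script, the same structure can be built as a Python dictionary and serialized with the standard `json` module. This is a minimal sketch that mirrors the example above; it only produces the JSON text and does not call sling.

```python
import json

# Same structure as the JSON example: read from a query, write a CSV without a header row.
config = {
    "source": {
        "conn": "MY_PG_URL",
        "stream": "select * from my_table",
        "options": {},
    },
    "target": {
        "conn": "s3://my_bucket/my_folder/new_file.csv",
        "options": {"header": False},
    },
}

# Python's False is written as JSON false.
text = json.dumps(config, indent=2)
print(text)
```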



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/slingdata-io/sling-python",
    "name": "sling",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "sling, etl, elt, extract, load",
    "author": "Fritz Larco",
    "author_email": "fritz@slingdata.io",
    "download_url": "https://files.pythonhosted.org/packages/09/93/024a248619239072b76957923d79c113a9517fb102caf55e4143fabcb979/sling-1.2.14.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Slings data from a source to a target",
    "version": "1.2.14",
    "project_urls": {
        "Download": "https://github.com/slingdata-io/sling-python/archive/master.zip",
        "Homepage": "https://github.com/slingdata-io/sling-python"
    },
    "split_keywords": [
        "sling",
        " etl",
        " elt",
        " extract",
        " load"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0993024a248619239072b76957923d79c113a9517fb102caf55e4143fabcb979",
                "md5": "a917eb78ccfed9518479f1ab78fd17d6",
                "sha256": "4c4a76bea2cdc3a24e508a3ffc700d78e0a461e3e95326bca80131c7ef1aea85"
            },
            "downloads": -1,
            "filename": "sling-1.2.14.tar.gz",
            "has_sig": false,
            "md5_digest": "a917eb78ccfed9518479f1ab78fd17d6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 6924,
            "upload_time": "2024-07-25T13:44:41",
            "upload_time_iso_8601": "2024-07-25T13:44:41.551826Z",
            "url": "https://files.pythonhosted.org/packages/09/93/024a248619239072b76957923d79c113a9517fb102caf55e4143fabcb979/sling-1.2.14.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-25 13:44:41",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "slingdata-io",
    "github_project": "sling-python",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "sling"
}
        