kedro-snowflake


* Name: kedro-snowflake
* Version: 0.2.1
* Home page: https://github.com/getindata/kedro-snowflake
* Summary: Kedro plugin with Snowflake / Snowpark support
* Upload time: 2023-06-20 15:56:24
* Maintainer: GetInData MLOPS
* Author: GetInData MLOPS
* Requires Python: >=3.8,<3.9
* License: Apache-2.0
* Keywords: kedro, snowflake, snowpark, mlops

# Kedro Snowflake Pipelines plugin

[![Python Version](https://img.shields.io/pypi/pyversions/kedro-snowflake)](https://github.com/getindata/kedro-snowflake)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![SemVer](https://img.shields.io/badge/semver-2.0.0-green)](https://semver.org/)
[![PyPI version](https://badge.fury.io/py/kedro-snowflake.svg)](https://pypi.org/project/kedro-snowflake/)
[![Downloads](https://pepy.tech/badge/kedro-snowflake)](https://pepy.tech/project/kedro-snowflake)

[![Maintainability Rating](https://sonarcloud.io/api/project_badges/measure?project=getindata_kedro-snowflake&metric=sqale_rating)](https://sonarcloud.io/summary/new_code?id=getindata_kedro-snowflake)
[![Coverage](https://sonarcloud.io/api/project_badges/measure?project=getindata_kedro-snowflake&metric=coverage)](https://sonarcloud.io/summary/new_code?id=getindata_kedro-snowflake)
[![Documentation Status](https://readthedocs.org/projects/kedro-snowflake/badge/?version=latest)](https://kedro-snowflake.readthedocs.io/en/latest/?badge=latest)

<p align="center">
  <a href="https://getindata.com/solutions/ml-platform-machine-learning-reliable-explainable-feature-engineering"><img height="150" src="https://getindata.com/img/logo.svg"></a>
  <h3 align="center">We help companies turn their data into assets</h3>
</p>

## About
This plugin lets you run complete Kedro pipelines in Snowflake. It currently supports:
* a Kedro starter, to get you up to speed fast
* automatically creating Snowflake Stored Procedures from Kedro nodes (using the Snowpark SDK)
* translating a Kedro pipeline into a Snowflake tasks graph
* running a Kedro pipeline entirely within Snowflake, without any external system
* using Kedro's official `SnowparkTableDataSet`
* automatically storing intermediate data as Transient Tables (if Snowpark DataFrames are used)
* <span style="color:yellow;float:left;margin: 0px 7px 0px 0px">**(New!)**</span> [MLflow](https://mlflow.org/) integration with Snowflake, with example usage in the _Snowflights_ Kedro starter
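
To illustrate what translating a pipeline into a tasks graph can look like, here is a minimal sketch in plain Python. It is not the plugin's implementation: the function names, the `_sproc` procedure suffix, and the task naming scheme are assumptions made for illustration only. It orders nodes so that upstream tasks come first, then emits one `CREATE TASK ... AFTER ...` statement per node.

```python
from typing import Dict, List, Set


def topological_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Order node names so that each node follows all of its upstream nodes.

    Every node, including those without upstream nodes, must be a key.
    """
    remaining = {node: set(deps) for node, deps in dependencies.items()}
    ordered: List[str] = []
    while remaining:
        # Nodes whose upstream nodes have all been emitted already.
        ready = sorted(node for node, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("pipeline contains a cycle")
        for node in ready:
            ordered.append(node)
            del remaining[node]
        for deps in remaining.values():
            deps.difference_update(ready)
    return ordered


def task_statements(
    dependencies: Dict[str, Set[str]], warehouse: str, prefix: str = "default"
) -> List[str]:
    """Build one CREATE TASK statement per node (hypothetical naming scheme)."""
    statements = []
    for node in topological_order(dependencies):
        task = f"{prefix}_{node}"
        upstream = ", ".join(f"{prefix}_{dep}" for dep in sorted(dependencies[node]))
        after_clause = f" AFTER {upstream}" if upstream else ""
        # Assumption: each node was registered as a stored procedure "<task>_sproc".
        statements.append(
            f"CREATE OR REPLACE TASK {task} WAREHOUSE = {warehouse}{after_clause} "
            f"AS CALL {task}_sproc()"
        )
    return statements


# Hypothetical three-node pipeline: node name -> names of its upstream nodes.
example_pipeline = {
    "preprocess": set(),
    "train": {"preprocess"},
    "evaluate": {"train", "preprocess"},
}

for statement in task_statements(example_pipeline, warehouse="COMPUTE_WH"):
    print(statement)
```

Emitting statements in topological order matters because a task named in an `AFTER` clause has to exist before its child is created. The sketch ignores pipelines with several root nodes, which a Snowflake task graph with a single root task would need to handle separately.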


## Documentation
For detailed documentation, see https://kedro-snowflake.readthedocs.io/

## Usage
### With starter
1. Install the plugin
    ```bash
    pip install "kedro-snowflake>=0.1.0" 
    ```
2. Create a new project with our Kedro starter ❄️ _Snowflights_ 🚀:
    ```bash
    kedro new --starter=snowflights --checkout=master
    ```
    <details>
        <summary>Then answer the interactive prompts ⬇️ (click to expand) </summary>
    
    ```
    Project Name
    ============
    Please enter a human readable name for your new project.
    Spaces, hyphens, and underscores are allowed.
     [Snowflights]: 
    
    Snowflake Account
    =================
    Please enter the name of your Snowflake account.
    This is the part of the URL before .snowflakecomputing.com
     []: abc-123
    
    Snowflake User
    ==============
    Please enter the name of your Snowflake user.
     []: user2137
    
    Snowflake Warehouse
    ===================
    Please enter the name of your Snowflake warehouse.
     []: compute-wh
    
    Snowflake Database
    ==================
    Please enter the name of your Snowflake database.
     [DEMO]: 
    
    Snowflake Schema
    ================
    Please enter the name of your Snowflake schema.
     [DEMO]: 
    
    Snowflake Password Environment Variable
    =======================================
    Please enter the name of the environment variable that contains your Snowflake password.
    Alternatively, you can re-configure the plugin later to use Kedros credentials.yml
     [SNOWFLAKE_PASSWORD]:       
    
    Pipeline Name Used As A Snowflake Task Prefix
    =============================================

     [default]:

    Enable Mlflow Integration (See Documentation For The Configuration Instructions)
    ================================================================================

     [False]: 

    The project name 'Snowflights' has been applied to: 
    - The project title in /tmp/snowflights/README.md
    - The folder created for your project in /tmp/snowflights
    - The project's python package in /tmp/snowflights/src/snowflights
    ```
    </details>

3. Run the project
    ```bash
    cd snowflights
    kedro snowflake run --wait-for-completion
    ```
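
The starter asks for the name of an environment variable that holds the Snowflake password (`SNOWFLAKE_PASSWORD` by default), so a run can fail if that variable is missing. The following pre-flight check is a minimal sketch and not part of the plugin; the function name `password_env_problem` is hypothetical.

```python
import os
from typing import Mapping, Optional


def password_env_problem(
    var_name: str = "SNOWFLAKE_PASSWORD",
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return a description of the problem, or None if the variable looks usable."""
    env = os.environ if environ is None else environ
    value = env.get(var_name)
    if value is None:
        return f"environment variable {var_name} is not set"
    if not value.strip():
        return f"environment variable {var_name} is empty"
    return None


# Check the current process environment before launching the pipeline.
problem = password_env_problem()
if problem is not None:
    print(f"Cannot run the pipeline yet: {problem}")
```

If the check passes, `kedro snowflake run --wait-for-completion` can be started from the same shell session.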

### In an existing Kedro project
1. Install the plugin
    ```bash
    pip install "kedro-snowflake>=0.1.0" 
    ```
2. Initialize the plugin
    ```bash
    kedro snowflake init <ACCOUNT> <USER> <PASSWORD_FROM_ENV> <DATABASE> <SCHEMA> <WAREHOUSE>
    ```
3. Run the project
    ```bash
    kedro snowflake run --wait-for-completion
    ```
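
Because `kedro snowflake init` takes six positional arguments, it is easy to pass them in the wrong order. The sketch below assembles the command from named settings in the order shown above. The helper `build_init_command` and the dictionary keys are hypothetical, and it assumes that `<PASSWORD_FROM_ENV>` is the name of the environment variable holding the password, as in the starter prompts.

```python
import shlex
from typing import Dict, List

# Positional argument order taken from the usage line of `kedro snowflake init`.
INIT_ARGUMENT_ORDER = [
    "account",
    "user",
    "password_from_env",
    "database",
    "schema",
    "warehouse",
]


def build_init_command(settings: Dict[str, str]) -> List[str]:
    """Return the argv list for `kedro snowflake init`, validating the settings."""
    missing = [key for key in INIT_ARGUMENT_ORDER if not settings.get(key)]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")
    return ["kedro", "snowflake", "init"] + [settings[key] for key in INIT_ARGUMENT_ORDER]


# Example values mirror the sample answers from the starter prompts.
command = build_init_command(
    {
        "account": "abc-123",
        "user": "user2137",
        "password_from_env": "SNOWFLAKE_PASSWORD",
        "database": "DEMO",
        "schema": "DEMO",
        "warehouse": "COMPUTE_WH",
    }
)
print(shlex.join(command))
```

The resulting list can be handed to `subprocess.run(command, check=True)` from the project root, or the printed line can be pasted into a shell.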
   
### Kedro pipeline in Snowflake Tasks

<img src="./docs/images/kedro-snowflake-tasks-graph.png" alt="Kedro Snowflake Plugin" title="Kedro Snowflake Plugin" />

Execution:

<img src="./docs/images/snowflake_running_pipeline.gif" alt="Kedro Snowflake Plugin CLI" title="Kedro Snowflake Plugin CLI" />


            

Raw data

    {
    "_id": null,
    "home_page": "https://github.com/getindata/kedro-snowflake",
    "name": "kedro-snowflake",
    "maintainer": "GetInData MLOPS",
    "docs_url": null,
    "requires_python": ">=3.8,<3.9",
    "maintainer_email": "mlops@getindata.com",
    "keywords": "kedro,snowflake,snowpark,mlops",
    "author": "GetInData MLOPS",
    "author_email": "mlops@getindata.com",
    "download_url": "https://files.pythonhosted.org/packages/8d/4b/31d7f72c78a66c0c176cf6e617b1ae5696762d3856e9e5073ec21a6b77ed/kedro_snowflake-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Kedro plugin with Snowflake / Snowpark support",
    "version": "0.2.1",
    "project_urls": {
        "Documentation": "https://kedro-snowflake.readthedocs.io/",
        "Homepage": "https://github.com/getindata/kedro-snowflake",
        "Repository": "https://github.com/getindata/kedro-snowflake"
    },
    "split_keywords": [
        "kedro",
        "snowflake",
        "snowpark",
        "mlops"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0d8bf5d302604a05e85d051d74ab3a759102faa10d59c0bfd772aafade9e8e9b",
                "md5": "8eb969ceb2dfee260ccdafa54f34b00b",
                "sha256": "7ea42950503a487fbdc070578b5f333f42dccf4ddaf2016ced676c6224af3fca"
            },
            "downloads": -1,
            "filename": "kedro_snowflake-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8eb969ceb2dfee260ccdafa54f34b00b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<3.9",
            "size": 4980229,
            "upload_time": "2023-06-20T15:56:22",
            "upload_time_iso_8601": "2023-06-20T15:56:22.322585Z",
            "url": "https://files.pythonhosted.org/packages/0d/8b/f5d302604a05e85d051d74ab3a759102faa10d59c0bfd772aafade9e8e9b/kedro_snowflake-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8d4b31d7f72c78a66c0c176cf6e617b1ae5696762d3856e9e5073ec21a6b77ed",
                "md5": "ac67054db0dbbbfdf5167a6c8889062d",
                "sha256": "ad47231ed9004001738b13cbbba0012e6b8170bedacd8fd27d439d639bd95d25"
            },
            "downloads": -1,
            "filename": "kedro_snowflake-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "ac67054db0dbbbfdf5167a6c8889062d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<3.9",
            "size": 4911004,
            "upload_time": "2023-06-20T15:56:24",
            "upload_time_iso_8601": "2023-06-20T15:56:24.728227Z",
            "url": "https://files.pythonhosted.org/packages/8d/4b/31d7f72c78a66c0c176cf6e617b1ae5696762d3856e9e5073ec21a6b77ed/kedro_snowflake-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-20 15:56:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "getindata",
    "github_project": "kedro-snowflake",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "kedro-snowflake"
}
        