# SnowConn
This repository is a wrapper around the [snowflake SQLAlchemy](https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html)
library. It manages the creation of connections and provides a few convenience functions that should cover most
use cases, while remaining flexible enough for additional wrappers to be written around it to serve more specific
use cases for different teams.
---
## Installation
To install the latest version released to PyPI with pip:
```bash
pip install snowconn
```
To install the latest version directly from the repo:
```bash
pip install 'git+ssh://git@github.com/Daltix/SnowConn.git@master#egg=snowconn'
```
If you want to use pandas functionality (read/write from/to pandas dataframes) you can install
as follows:
```bash
pip install snowconn[pandas]
```
If you want to enable SSO authentication you can install as follows:
```bash
pip install snowconn[storage]
```
If you want to install all functionality (AWS secrets manager connection, SSO, pandas) you can install as follows:
```bash
pip install snowconn[all]
```
---
## Connection
Everything is implemented in a single `SnowConn` class. The import is always the same:
```py
from snowconn import SnowConn
```
### (1) Connection using your own personal creds
Install [snowsql](https://docs.snowflake.net/manuals/user-guide/snowsql-install-config.html)
and configure `~/.snowsql/config` as per the instructions.
You can then test that it is correctly installed by executing `snowsql`
from the command line.
*WARNING* Be sure to configure your account name like the following:
```
accountname = ACCOUNT_ID.REGION
```
*(example `accountname = eq90000.eu-west-1`)*
If you don't include the region part (`eu-west-1` in the example above), the connection will hang for about a minute and then fail with a permission denied error.
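As a quick sanity check before connecting, the account name can be validated from Python. This is a minimal sketch, not part of SnowConn: it assumes the snowsql config is an INI-style file with an `accountname` key in a `[connections]` section, and the helper name `account_has_region` is hypothetical.

```python
import configparser


def account_has_region(config_text, section="connections"):
    """Return True if `accountname` in the section looks like ACCOUNT_ID.REGION."""
    # interpolation=None so '%' characters (e.g. in passwords) are left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # fallback="" covers both a missing section and a missing key
    account = parser.get(section, "accountname", fallback="").strip().strip("\"'")
    account_id, _, region = account.partition(".")
    return bool(account_id) and bool(region)


sample = """
[connections]
accountname = eq90000.eu-west-1
username = jane
"""
print(account_has_region(sample))  # True
```

To check your real configuration, pass in the contents of the file, for example `pathlib.Path('~/.snowsql/config').expanduser().read_text()`.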
Now that you can connect successfully with `snowsql`, you are ready to use the `SnowConn.connect` function:
```py
with SnowConn.connect() as conn:
    # your conn. code here
```
That's it, you are connected! You can connect to a specific database / schema with the following:
```py
with SnowConn.connect('daltix_database', 'public') as conn:
    # your conn. code here
```
**NOTE: Connect using SSO**
If you are using SSO (Okta or others), you need to update your `~/.snowsql/config` with the following modifications:
- Include an `authenticator` line ([see here](https://docs.snowflake.com/en/user-guide/admin-security-fed-auth-use.html#using-sso-with-client-applications-that-connect-to-snowflake) for possible values
and their meaning).
- Replace `username` with your SSO username (instead of your Snowflake username).
### (2) Connection using AWS Secrets Manager
You need to have boto3 installed, which you can do with the following:
```
pip install boto3
```
Now you must satisfy the following requirements:
1. Have a secret stored in an accessible AWS account
1. The secret must have the following keys:
    - `USERNAME`
    - `PASSWORD`
    - `ACCOUNT`
    - `ROLE`
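The key requirement above can be checked before attempting a connection. The following is a hedged sketch rather than anything SnowConn itself provides: it assumes the secret is stored as a JSON key/value string, and the helper names `missing_secret_keys` and `fetch_secret_string` are hypothetical. `boto3` is imported lazily so the validation part runs without AWS access.

```python
import json

REQUIRED_KEYS = ("USERNAME", "PASSWORD", "ACCOUNT", "ROLE")


def missing_secret_keys(secret_string):
    """Return the required keys that are absent from a JSON-encoded secret."""
    secret = json.loads(secret_string)
    return [key for key in REQUIRED_KEYS if key not in secret]


def fetch_secret_string(secret_name):
    """Fetch the raw secret string with boto3 (requires AWS credentials)."""
    import boto3  # imported here so the rest of the sketch works without boto3

    client = boto3.client("secretsmanager")
    return client.get_secret_value(SecretId=secret_name)["SecretString"]


# A made-up secret that lacks the ROLE key
example = json.dumps({"USERNAME": "u", "PASSWORD": "p", "ACCOUNT": "eq90000.eu-west-1"})
print(missing_secret_keys(example))  # ['ROLE']
```

With valid AWS credentials, `missing_secret_keys(fetch_secret_string('price_plotter'))` should return an empty list for a correctly configured secret.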
For this example, we will assume that `price_plotter` is the name of the secret that we will be using.
Now that you know the name of the secret, you MUST make sure that the context in which your code is running has access to read
that secret. Once this is done, you can execute the following code:
```py
with SnowConn.connect(methods=['secretsmanager'], credsman_name='price_plotter') as conn:
    # your conn. code here
```
Alternatively you can use the specific `connect_secretsmanager` method:
```py
with SnowConn.connect_secretsmanager('price_plotter') as conn:
    # your conn. code here
```
And you are connected! You can also pass the database and schema along:
```py
with SnowConn.connect_secretsmanager('price_plotter', 'daltix', 'public') as conn:
    # your conn. code here
```
An example of a policy that gives access to the `price_plotter` secret looks like this:
```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetResourcePolicy",
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret",
                "secretsmanager:ListSecretVersionIds"
            ],
            "Resource": "arn:aws:secretsmanager:eu-west-1:<your-account-number>:secret:price_plotter-AdcNpp"
        }
    ]
}
```
And an example of this in a serverless.yml looks like this:
```
iamRoleStatements:
  - Effect: Allow
    Action:
      - secretsmanager:DescribeSecret
      - secretsmanager:List*
    Resource:
      - "*"
  - Effect: Allow
    Action:
      - secretsmanager:*
    Resource:
      - { Fn::Sub: "arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:secret:price_plotter-??????" }
```
---
## API
Now that you're connected, there are a few low-level functions that you can use to programmatically interact with
the Snowflake tables that you have access to.
The rest of these examples assume that you have used one of the above methods to connect and have access to the
`daltix.public.price` table.
### Creating a connection
Creating a connection is very easy (see examples above for connection options):
```py
with SnowConn.connect() as conn:
    # your conn. code here
```
You can also create connections manually without using a context manager. This is not recommended (see the *Known issues* section below); if you do, make sure to close the connection when you are done:
```py
conn = SnowConn.connect()
# your conn. code here
conn.close() # close the connection when done
```
### execute_simple
The `execute_simple` function is for when you have a single statement to execute and the result set fits into memory. It
takes a single argument, a string containing the SQL statement that you wish to execute. Take the following for example:
```py
>>> conn.execute_simple('select * from price limit 1;')
[{'DALTIX_ID': '0d3c30353035a6ab5747237a1f2600bbf5ddd27401372c5effe0f2790a88ad56', 'SHOP': 'ahed', 'COUNTRY': 'de', 'PRODUCT_ID': '616846.0', 'LOCATION': 'base', 'PRICE': 37.99, 'PROMO_PRICE': None, 'PRICE_STD': None, 'PROMO_PRICE_STD': None, 'UNIT': None, 'UNIT_STD': None, 'IS_MAIN': True, 'VENDOR': None, 'VENDOR_STD': None, 'DOWNLOADED_ON': datetime.datetime(2018, 11, 18, 0, 0, 1), 'DOWNLOADED_ON_LOCAL': datetime.datetime(2018, 11, 18, 1, 0, 1), 'DOWNLOADED_ON_DATE': datetime.date(2018, 11, 18), 'IS_LATEST_PRICE': False}]
```
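Because the result is a plain list of dictionaries keyed by column name (upper-case in the output above), it can be processed with ordinary Python. The sketch below shows one way to do that; the helper `group_prices_by_shop` and the sample rows are made up for illustration and are not part of SnowConn.

```python
from collections import defaultdict


def group_prices_by_shop(rows):
    """Group non-null PRICE values by SHOP from execute_simple-style rows."""
    grouped = defaultdict(list)
    for row in rows:
        if row["PRICE"] is not None:
            grouped[row["SHOP"]].append(row["PRICE"])
    return dict(grouped)


# Made-up rows in the same shape as the execute_simple output
rows = [
    {"SHOP": "ahed", "PRICE": 37.99},
    {"SHOP": "ahed", "PRICE": None},
    {"SHOP": "other", "PRICE": 9.99},
]
print(group_prices_by_shop(rows))  # {'ahed': [37.99], 'other': [9.99]}
```

In real use the rows would come from something like `conn.execute_simple('select shop, price from price limit 100;')`.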
### execute_string
If you have multiple SQL statements in a single string that you want to execute, or the result set is larger than
will fit into memory, this is the function to use. It returns a list of cursors, one for each
of the statements contained in the string. See [here](https://docs.snowflake.net/manuals/user-guide/python-connector-api.html#execute_string) for the full documentation.
```py
>>> conn.execute_string('create temporary table price_small as (select * from price limit 1); select * from price_small;')
[<snowflake.connector.cursor.SnowflakeCursor object at 0x10f537898>, <snowflake.connector.cursor.SnowflakeCursor object at 0x10f52c588>]
```
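When a result set is too large to load at once, the returned cursors can be read in batches. The following sketch relies only on the standard DB-API `fetchmany` method, which the Snowflake cursor provides; the helper name `iter_rows` and the default batch size are assumptions for illustration.

```python
def iter_rows(cursor, batch_size=1000):
    """Yield rows from a DB-API style cursor, holding one batch in memory at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            # an empty batch means the cursor is exhausted
            break
        yield from batch


# Typical use with the last cursor returned by execute_string:
#
#     cursors = conn.execute_string('select * from price;')
#     for row in iter_rows(cursors[-1]):
#         ...  # handle one row at a time
```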
### execute_file
If you have an SQL file that you want to execute, you can use this function. For example:
```bash
echo "select * from price limit 1;" > query.sql
```
```py
>>> conn.execute_file('query.sql')
[<snowflake.connector.cursor.SnowflakeCursor object at 0x1188d6390>]
```
This also returns a list of cursors the same as `execute_string` does. In fact, this function is nothing more than a very
simple wrapper around `execute_string`.
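To make the "simple wrapper" idea concrete, here is a sketch of what such a wrapper could look like. It is not the library's actual implementation; the function name `execute_sql_file` and the `encoding` parameter are hypothetical, and it only assumes that `conn` exposes `execute_string` as described above.

```python
from pathlib import Path


def execute_sql_file(conn, path, encoding="utf-8"):
    """Read an SQL file and pass its contents to conn.execute_string."""
    sql = Path(path).read_text(encoding=encoding)
    return conn.execute_string(sql)
```

A variant like this can be handy if you need to preprocess the SQL text (for example, substituting a schema name) before it is executed.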
### read_df
Use this function to read the results of a query into a dataframe. Note that pandas is NOT a required dependency of this repo, so
if you want to use it you must install it yourself (for example with the `pandas` extra shown under *Installation*).
It takes one SQL string as an argument and returns a dataframe.
```py
>>> conn.read_df('select daltix_id, downloaded_on, price from price limit 5;')
                                        daltix_id       downloaded_on  price
0  0d3c30353035a6ab5747237a1f2600bbf5ddd27401372c 2018-11-18 00:00:01  37.99
1  f5be8a5da3bde2da6a63fcad4e5c30823027324092234c 2018-11-18 00:00:02   9.99
2  f5be8a5da3bde2da6a63fcad4e5c30823027324092234c 2018-11-18 00:00:02   0.40
3  807e2a7706b8c515264fa55bed3891d5685ac5ee0148f0 2018-11-18 00:00:04   3.70
4  1e56339f99dc866cd4b87679aa686556a5ad2398d00c95 2018-11-18 00:00:06   3.76
```
### write_df
Use this to write a dataframe to Snowflake. This is a very thin wrapper around the pandas [DataFrame.to_sql()](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_sql.html) function.
Unfortunately, it doesn't play nice with dictionaries and arrays so the use cases are quite limited. Hopefully
we will improve upon this in the future.
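One possible workaround for the dictionary and array limitation is to serialize nested values to JSON strings before building the dataframe. This is only a sketch under the assumption that storing such values as text is acceptable for your table; the helper `jsonify_nested` is hypothetical and is not something `write_df` does for you.

```python
import json


def jsonify_nested(records):
    """Return copies of the records with dict/list values serialized to JSON strings."""
    return [
        {
            key: json.dumps(value) if isinstance(value, (dict, list)) else value
            for key, value in record.items()
        }
        for record in records
    ]


records = [{"id": 1, "tags": ["a", "b"], "meta": {"k": 1}}]
print(jsonify_nested(records))
# [{'id': 1, 'tags': '["a", "b"]', 'meta': '{"k": 1}'}]
```

The cleaned records can then be turned into a dataframe with `pandas.DataFrame(...)` before writing.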
### get_current_role
Returns the current role.
### close
Use this to cleanly close all connections that have ever been associated with this instance of SnowConn. If you don't
use this your process will hang for a while without saying anything before it actually exits.
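If a connection object is created manually and passed around, the standard library's `contextlib.closing` is one way to make sure `close()` is always called. The sketch below is generic: `run_and_close` is a hypothetical helper, and it only assumes that the object has a `close()` method as documented here.

```python
from contextlib import closing


def run_and_close(conn, work):
    """Run work(conn) and call conn.close() afterwards, even if work raises."""
    with closing(conn):
        return work(conn)


# Example use with a manually created connection:
#
#     conn = SnowConn.connect()
#     role = run_and_close(conn, lambda c: c.get_current_role())
```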
## Accessing the connection objects directly
These functions are mostly wrappers around 2 connection libraries:
- [The snowflake python connector](https://docs.snowflake.net/manuals/user-guide/python-connector-api.html)
- [The snowflake SQLAlchemy library](https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html)
Should you need to use either of these directly, you can obtain the connection objects with the following
functions:
### get_raw_connection
This will return the instance of a snowflake connector which is documented [here](https://docs.snowflake.net/manuals/user-guide/python-connector-api.html#connect). It is a good choice if you have very simple needs and for some reason none
of the functions in the rest of this repo are serving your needs.
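As an illustration of working with the raw connector, the sketch below fetches a single value through a cursor. It assumes `get_raw_connection` is called on the connected `SnowConn` object and uses only the DB-API methods `cursor()`, `execute()`, `fetchone()` and `close()`; the helper name `fetch_scalar` is hypothetical.

```python
def fetch_scalar(snow, sql):
    """Return the first column of the first row of a query, or None if there is no row."""
    raw = snow.get_raw_connection()
    cursor = raw.cursor()
    try:
        cursor.execute(sql)
        row = cursor.fetchone()
        return row[0] if row else None
    finally:
        # always release the cursor, even if the query fails
        cursor.close()


# Example use:
#
#     with SnowConn.connect() as conn:
#         version = fetch_scalar(conn, 'select current_version();')
```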
### get_alchemy_engine
This is the result of [create_engine()](https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html#connection-parameters),
which was called during `connect()`. It does not represent an active connection to the database
but rather acts as a factory for connections.
This is useful for using the most commonly abstracted things in other libraries such as dashboards, pandas, etc.
However, like SQLAlchemy in general, despite being very widely supported and feature-complete, it is not the simplest
API so it should probably not be your first choice unless you know exactly that you need it.
### get_connection
This returns the result of creating the SQLAlchemy engine and then calling `connect()` on it. Unlike the result
of `get_alchemy_engine`, this represents an active connection to Snowflake and has a session associated with it.
You can see the object documentation [here](https://docs.snowflake.net/manuals/user-guide/sqlalchemy.html#parameters-and-behavior).
## Known issues
There is a bug with `snowflake-connector` which causes some connections to Snowflake to not close properly in certain circumstances. This can cause timeout errors.
You can handle this in two ways: the first is to wrap usage of the connection in a `try/finally` block to ensure the connection is explicitly closed, like this:
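```py
from snowconn import SnowConn

conn = SnowConn.connect(...)
try:
    # call the methods on the connection object
    result = conn.execute_string(query)  # or result = conn.read_df(query), etc
finally:
    # always close the connection, even if the query raised
    conn.close()
```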
The second way is to use SnowConn with the `with` syntax, as follows:
```
with SnowConn.connect() as conn:
    conn.read_df(...)
```