| Field | Value |
|---|---|
| Name | fosforio |
| Version | 1.0.3 |
| home_page | None |
| Summary | FOSFOR-IO: To read and write dataframe from different connectors. |
| upload_time | 2024-06-26 09:37:27 |
| maintainer | None |
| docs_url | None |
| author | Abhishek Chaurasia |
| requires_python | None |
| license | None |
| keywords | fosforio |
| requirements | No requirements were recorded. |
# Installation:
## Without any dependencies:
```commandline
pip install fosforio
```
## With all dependencies:
```commandline
pip install fosforio[all]
```
## With snowflake:
```commandline
pip install fosforio[snowflake]
```
## With s3:
```commandline
pip install fosforio[s3]
```
## With azureblob:
```commandline
pip install fosforio[azureblob]
```
## With local:
```commandline
pip install fosforio[local]
```
## With sftp:
```commandline
pip install fosforio[sftp]
```
## With mysql:
```commandline
pip install fosforio[mysql]
```
## With hive:
```commandline
pip install fosforio[hive]
```
## With sqlserver:
```commandline
pip install fosforio[sqlserver]
```
## With postgres:
```commandline
pip install fosforio[postgres]
```
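Several connectors can be requested in a single command, because pip accepts a comma-separated list of extras. The sketch below builds such a command from the extras named above. `pip_install_command` and `KNOWN_EXTRAS` are hypothetical names used for illustration and are not part of fosforio. The requirement is quoted because some shells, zsh for example, treat square brackets as glob characters.

```python
# Extras named in the installation section above.
KNOWN_EXTRAS = {
    "all", "snowflake", "s3", "azureblob", "local",
    "sftp", "mysql", "hive", "sqlserver", "postgres",
}


def pip_install_command(extras=()):
    """Return a pip command that installs fosforio with the given extras.

    Hypothetical helper for illustration; it is not part of fosforio.
    """
    unknown = sorted(set(extras) - KNOWN_EXTRAS)
    if unknown:
        raise ValueError(f"unknown extras: {unknown}")
    if not extras:
        return "pip install fosforio"
    joined = ",".join(extras)
    # Quote the requirement so the shell does not expand the brackets.
    return f'pip install "fosforio[{joined}]"'


print(pip_install_command(["snowflake", "s3"]))
```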
#### Source code is available at: https://gitlab.fosfor.com/fosfor-decision-cloud/intelligence/refract-sdk
# Usage:
## To read dataframe with dataset name only -
```python
from fosforio import get_dataframe

# To read a dataset by name.
get_dataframe("dataset_name")

# To read the top 3 records, filtered on col1='value1' and col2>value2 and sorted by col1.
get_dataframe("dataset_name", row_count=3, filter_condition="where col1='value1' and col2>value2 order by col1")

# To read data from a connection type not listed here, pip install the mosaic-connector-python package.
```
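The `filter_condition` argument in the example above is a SQL fragment passed as a plain string. A minimal sketch of assembling that string from separate predicates follows. `build_filter_condition` is a hypothetical helper, not part of fosforio, and it performs no escaping, so it should only receive trusted input.

```python
def build_filter_condition(conditions, order_by=None):
    """Join SQL predicates into a string shaped like the filter_condition examples.

    Hypothetical helper; it performs no escaping, so pass trusted input only.
    """
    clause = ""
    if conditions:
        clause = "where " + " and ".join(conditions)
    if order_by:
        clause = f"{clause} order by {order_by}".strip()
    return clause


condition = build_filter_condition(["col1='value1'", "col2>value2"], order_by="col1")
print(condition)
```

The resulting string could then be passed as `get_dataframe("dataset_name", row_count=3, filter_condition=condition)`.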
## To read dataframe with filename from local storage -
```python
from fosforio import get_local_dataframe
get_local_dataframe("local_file_name_with_absolute_path", row_count=3)
```
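The example above expects an absolute path. A small sketch, using only the standard library, of turning a relative or `~`-prefixed path into an absolute one before passing it on. `to_absolute_path` and the file name are hypothetical.

```python
from pathlib import Path


def to_absolute_path(file_name):
    """Expand '~' and resolve a possibly relative path to an absolute one."""
    return str(Path(file_name).expanduser().resolve())


path = to_absolute_path("data/sample.csv")  # hypothetical file name
print(path)
```

The value of `path` could then be used as the first argument of `get_local_dataframe`.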
## To use snowflake related operations -
```python
from fosforio import snowflake

# To get a snowflake connection object using the user's default snowflake connection, if one is available.
snowflake.get_connection()

# To get a snowflake connection object for a specific connection name.
snowflake.get_connection(connection_name="snowflake_con_name")

# To read a specific dataset published from a snowflake connection.
snowflake.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from a snowflake connection.
snowflake.get_dataframe("dataset_name", row_count=3)

# To read only the top few records of a dataset, with filter conditions applied.
snowflake.get_dataframe("dataset_name", row_count=3, filter_condition="where col1='value1' and col2>value2 order by col1")

# To execute a user-specified query in snowflake with the specified connection name.
snowflake.execute_query(query="user_query", database="db_name", schema="schema", connection_name="connection_name")

# To execute a user-specified query in snowflake with the current connection object, or with the user's default connection.
snowflake.execute_query(query="user_query", database="db_name", schema="schema")

# To close the snowflake connection. Always close the connection after use.
snowflake.close_connection()
```
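Since the connection should always be closed after use, one option is to wrap the `get_connection()` / `close_connection()` pair in a context manager so the close also happens when an error is raised. This is a sketch under assumptions: `open_connection` is a hypothetical helper, and `FakeConnector` is a stand-in so the example runs without a database. It assumes `close_connection()` needs no arguments, as in the examples above.

```python
from contextlib import contextmanager


@contextmanager
def open_connection(connector, **kwargs):
    """Yield connector.get_connection(**kwargs) and always close afterwards.

    `connector` is any object exposing get_connection() and close_connection(),
    as the connector modules in the examples above do. Hypothetical helper.
    """
    connection = connector.get_connection(**kwargs)
    try:
        yield connection
    finally:
        connector.close_connection()


class FakeConnector:
    """Stand-in for a connector module so the sketch runs without a database."""

    def __init__(self):
        self.closed = False

    def get_connection(self, **kwargs):
        return {"connection_name": kwargs.get("connection_name", "default")}

    def close_connection(self):
        self.closed = True


fake = FakeConnector()
with open_connection(fake, connection_name="snowflake_con_name") as conn:
    print(conn["connection_name"])
print(fake.closed)
```

With the real package, `with open_connection(snowflake, connection_name="snowflake_con_name") as conn:` would be the intended usage, provided the module behaves as the examples suggest.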
## To use mysql related operations -
```python
from fosforio import mysql

# To get a mysql connection object using the user's default mysql connection, if one is available.
mysql.get_connection()

# To get a mysql connection object for a specific connection name.
mysql.get_connection(connection_name="mysql_con_name")

# To read a specific dataset published from a mysql connection.
mysql.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from a mysql connection.
mysql.get_dataframe("dataset_name", row_count=3)

# To read only the top few records of a dataset, with filter conditions applied.
mysql.get_dataframe("dataset_name", row_count=3, filter_condition="where col1='value1' and col2>value2 order by col1")

# To execute a user-specified query in mysql with the specified connection name.
mysql.execute_query(query="user_query", connection_name="connection_name")

# To execute a user-specified query in mysql with the current connection object, or with the user's default connection.
mysql.execute_query(query="user_query")

# To close the mysql connection. Always close the connection after use.
mysql.close_connection()
```
## To use sqlserver related operations -
### Requires sqlserver driver library
```python
# Create a custom template and add the following commands to its "Pre Init Script" section:
# sudo curl -o /etc/yum.repos.d/mssql-release.repo https://packages.microsoft.com/config/rhel/9.0/prod.repo
# sudo ACCEPT_EULA=Y yum install -y msodbcsql18
from fosforio import sqlserver

# To get a sqlserver connection object using the user's default sqlserver connection, if one is available.
sqlserver.get_connection()

# To get a sqlserver connection object for a specific connection name.
sqlserver.get_connection(connection_name="sqlserver_con_name")

# To read a specific dataset published from a sqlserver connection.
sqlserver.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from a sqlserver connection.
sqlserver.get_dataframe("dataset_name", row_count=3)

# To read only the top few records of a dataset, with filter conditions applied.
sqlserver.get_dataframe("dataset_name", row_count=3, filter_condition="where col1='value1' and col2>value2 order by col1")

# To execute a user-specified query in sqlserver with the specified connection name.
sqlserver.execute_query(query="user_query", database="db_name", connection_name="connection_name")

# To execute a user-specified query in sqlserver with the current connection object, or with the user's default connection.
sqlserver.execute_query(query="user_query", database="db_name")

# To close the sqlserver connection. Always close the connection after use.
sqlserver.close_connection()
```
## To use hive related operations -
```python
from fosforio import hive

# Every hive call below requires user_id (1001 is the default user_id).

# To get a hive connection object using the user's default hive connection, if one is available.
hive.get_connection(user_id=1001)

# To get a hive connection object for a specific connection name.
hive.get_connection(connection_name="hive_con_name", user_id=1001)

# To read a specific dataset published from a hive connection.
hive.get_dataframe("dataset_name", user_id="1001")

# To read only the top few records of a dataset published from a hive connection.
hive.get_dataframe("dataset_name", user_id="1001", row_count=3)

# To read only the top few records of a dataset, with filter conditions applied.
hive.get_dataframe("dataset_name", user_id="1001", row_count=3, filter_condition="where col1='value1' and col2>value2 order by col1")

# To execute a user-specified query in hive with the specified connection name.
hive.execute_query(query="user_query", connection_name="connection_name", user_id="1001")

# To execute a user-specified query in hive with the current connection object, or with the user's default connection.
hive.execute_query(query="user_query", user_id="1001")

# To close the hive connection. Always close the connection after use.
hive.close_connection()
```
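Because the hive calls above repeat `user_id` on nearly every line, the argument can be bound once with `functools.partial`. The sketch below illustrates the idea against a stand-in object. `bind_user_id` and `FakeHive` are hypothetical, and only the function names are taken from the hive examples above.

```python
from functools import partial
from types import SimpleNamespace


def bind_user_id(connector, user_id):
    """Return a namespace whose calls pass user_id automatically.

    Hypothetical helper; the function names mirror the hive examples above.
    """
    return SimpleNamespace(
        get_connection=partial(connector.get_connection, user_id=user_id),
        get_dataframe=partial(connector.get_dataframe, user_id=user_id),
        execute_query=partial(connector.execute_query, user_id=user_id),
        close_connection=connector.close_connection,
    )


class FakeHive:
    """Stand-in that records the arguments it receives."""

    def __init__(self):
        self.calls = []

    def get_connection(self, **kwargs):
        self.calls.append(("get_connection", kwargs))

    def get_dataframe(self, dataset_name, **kwargs):
        self.calls.append(("get_dataframe", dataset_name, kwargs))

    def execute_query(self, **kwargs):
        self.calls.append(("execute_query", kwargs))

    def close_connection(self):
        self.calls.append(("close_connection",))


fake_hive = FakeHive()
hive_for_user = bind_user_id(fake_hive, user_id=1001)
hive_for_user.get_dataframe("dataset_name", row_count=3)
print(fake_hive.calls)
```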
## To use postgres related operations -
```python
from fosforio import postgres

# To get a postgres connection object using the user's default postgres connection, if one is available.
postgres.get_connection()

# To get a postgres connection object for a specific connection name.
postgres.get_connection(connection_name="postgres_con_name")

# To read a specific dataset published from a postgres connection.
postgres.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from a postgres connection.
postgres.get_dataframe("dataset_name", row_count=3)

# To read only the top few records of a dataset, with filter conditions applied.
postgres.get_dataframe("dataset_name", row_count=3, filter_condition="where col1='value1' and col2>value2 order by col1")

# To execute a user-specified query in postgres with the specified connection name.
postgres.execute_query(query="user_query", connection_name="connection_name")

# To execute a user-specified query in postgres with the current connection object, or with the user's default connection.
postgres.execute_query(query="user_query")

# To close the postgres connection. Always close the connection after use.
postgres.close_connection()
```
## To use sftp related operations -
```python
from fosforio import sftp

# To get an sftp connection object using the user's default sftp connection, if one is available.
sftp.get_connection()

# To get an sftp connection object for a specific connection name.
sftp.get_connection(connection_name="sftp_con_name")

# To read a specific dataset published from an sftp connection.
sftp.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from an sftp connection.
sftp.get_dataframe("dataset_name", row_count=3)

# Use the sftp connection object c for any sftp operation (get, put, listdir, etc.).
c = sftp.get_connection()

# To close the sftp connection. Always close the connection after use.
sftp.close_connection()
```
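The connection object is described above as supporting operations such as `get`, `put` and `listdir`. The sketch below shows one way `listdir` might be used to pick out files by extension. It assumes `listdir(path)` returns a list of file names, as `paramiko.SFTPClient.listdir` does. The actual type returned by `sftp.get_connection()` is not stated here, so `list_remote_files` and `FakeSftp` are hypothetical.

```python
def list_remote_files(connection, remote_dir, suffix=".csv"):
    """Return the sorted names in remote_dir that end with suffix.

    Assumes connection.listdir(path) returns a list of file names; the type
    of the object returned by sftp.get_connection() is an assumption.
    """
    names = connection.listdir(remote_dir)
    return sorted(name for name in names if name.endswith(suffix))


class FakeSftp:
    """Stand-in connection so the sketch runs without a server."""

    def listdir(self, path):
        return ["b.csv", "notes.txt", "a.csv"]


print(list_remote_files(FakeSftp(), "/upload"))
```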
## To use amazon S3 related operations -
```python
from fosforio import s3

# To get an s3 connection object using the user's default s3 connection, if one is available.
s3.get_connection()

# To get an s3 connection object for a specific connection name.
s3.get_connection(connection_name="s3_con_name")

# To read a specific dataset published from an s3 connection.
s3.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from an s3 connection.
s3.get_dataframe("dataset_name", row_count=3)

# Use the s3 connection object c for any s3 operation.
c = s3.get_connection()
```
## To use azure blob related operations -
```python
from fosforio import azure

# To get an azure blob connection object using the user's default azure connection, if one is available.
azure.get_connection()

# To get an azure blob connection object for a specific connection name.
azure.get_connection(connection_name="azureblob_con_name")

# To read a specific dataset published from an azureblob connection.
azure.get_dataframe("dataset_name")

# To read only the top few records of a dataset published from an azureblob connection.
azure.get_dataframe("dataset_name", row_count=3)

# Use the azure connection object c for any azure blob operation.
c = azure.get_connection()
```
*Note: Currently supported native connectors - snowflake, mysql, hive, sqlserver, postgres, sftp, s3, azureblob, local(NAS)*
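The connector names in the note do not always match the name imported in the usage examples: `azureblob` is imported as `azure`, and local storage is read through `get_local_dataframe`. The lookup table below is a small sketch that records the mapping as it appears in the examples above. `CONNECTOR_IMPORTS` and `import_statement` are hypothetical names, not part of fosforio.

```python
# Connector name from the note -> name imported from fosforio in the usage examples.
CONNECTOR_IMPORTS = {
    "snowflake": "snowflake",
    "mysql": "mysql",
    "hive": "hive",
    "sqlserver": "sqlserver",
    "postgres": "postgres",
    "sftp": "sftp",
    "s3": "s3",
    "azureblob": "azure",
    "local": "get_local_dataframe",
}


def import_statement(connector):
    """Return the import line used in the examples for a supported connector."""
    try:
        name = CONNECTOR_IMPORTS[connector]
    except KeyError:
        raise ValueError(f"unsupported connector: {connector!r}") from None
    return f"from fosforio import {name}"


print(import_statement("azureblob"))
```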
Raw data
{
"_id": null,
"home_page": null,
"name": "fosforio",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "fosforio",
"author": "Abhishek Chaurasia",
"author_email": "<abhishek1.chaurasia@fosfor.com>",
"download_url": "https://files.pythonhosted.org/packages/00/9d/f4a5c5a6299921d0e67daa6c09fdcf085a662ee6ee70484060ad68a71fab/fosforio-1.0.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "FOSFOR-IO: To read and write dataframe from different connectors.",
"version": "1.0.3",
"project_urls": {
"Product": "https://www.fosfor.com/",
"Source": "https://gitlab.fosfor.com/fosfor-decision-cloud/intelligence/refract-sdk"
},
"split_keywords": [
"fosforio"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e703ea361a2ed5ca7538e5d181a9aa0eda1dddb4c5f16cf2d34e1c892b60612d",
"md5": "34193c100be25775309a9b147f26d5cb",
"sha256": "fdc769523e9bcfc4c58512fba96ffbde6a88ed6c74bf7eab1eabc96c5d965f02"
},
"downloads": -1,
"filename": "fosforio-1.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "34193c100be25775309a9b147f26d5cb",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 20971,
"upload_time": "2024-06-26T09:37:26",
"upload_time_iso_8601": "2024-06-26T09:37:26.152879Z",
"url": "https://files.pythonhosted.org/packages/e7/03/ea361a2ed5ca7538e5d181a9aa0eda1dddb4c5f16cf2d34e1c892b60612d/fosforio-1.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "009df4a5c5a6299921d0e67daa6c09fdcf085a662ee6ee70484060ad68a71fab",
"md5": "367b99f15b01ed12f6891c582a549393",
"sha256": "6fafe09cc2ad5ca6025255789914e55ef17efdb22c7d7b08bd3f49eb1281d982"
},
"downloads": -1,
"filename": "fosforio-1.0.3.tar.gz",
"has_sig": false,
"md5_digest": "367b99f15b01ed12f6891c582a549393",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 14513,
"upload_time": "2024-06-26T09:37:27",
"upload_time_iso_8601": "2024-06-26T09:37:27.977466Z",
"url": "https://files.pythonhosted.org/packages/00/9d/f4a5c5a6299921d0e67daa6c09fdcf085a662ee6ee70484060ad68a71fab/fosforio-1.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-26 09:37:27",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "fosforio"
}