# ibmcloudsql

 * Name: ibmcloudsql
 * Version: 0.5.13
 * Home page: https://github.com/IBM-Cloud/sql-query-clients
 * Summary: Python client for interacting with IBM Cloud Data Engine service
 * Upload time: 2022-12-02 08:02:00
 * Author: IBM Corp.
 * Requires Python: >=2.7, <4
 * License: Apache 2.0
 * Keywords: sql, cloud, object_storage, ibm

Allows you to run SQL statements in the IBM Cloud on data stored in object storage.

## Building and testing the library locally
### Set up Python environment
Run `source ./setup_env.sh`, which creates and activates a clean Python virtual environment. It uses Python 2.7 by default; adapt line 2 of the script if you want a different version.
### Install the local code in your Python environment
Run `./_install.sh`.
### Test the library locally
1. Create a file `ibmcloudsql/test_credentials.py` with the following three lines, filled in with your own values:
```
apikey='<your IBM Cloud API key>'
instance_crn='<your Data Engine instance CRN>'
result_location='<COS URI of default result location for your SQL result sets>'
...
```
See the template file for details.

2. Run `python ibmcloudsql/test.py`.
### Packaging and publishing distribution
1. Make sure to increase `version=...` in `setup.py` before creating a new package.
2. Run `package.sh`. It prompts for a user name and password, which must be authorized for the package `ibmcloudsql` on pypi.org.

## Example usage
```
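import ibmcloudsql
my_ibmcloud_apikey = '<your api key here>'
my_instance_crn = '<your ibm cloud Data Engine instance CRN here>'
my_target_cos_url = '<Cloud Object Storage URL for the SQL result target. Format: cos://<endpoint>/<bucket>/[<prefix>]>'
# SQLQuery lives in the ibmcloudsql package, so qualify it with the module name
sqlClient = ibmcloudsql.SQLQuery(my_ibmcloud_apikey, my_instance_crn)
# logon() needs to be called before any other method
sqlClient.logon()
# Run the statement and show the first rows of the result data frame
sqlClient.run_sql('SELECT * FROM cos://us-geo/sql/orders.parquet STORED AS PARQUET LIMIT 5 INTO {} STORED AS CSV'.format(my_target_cos_url)).head()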
```
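
The target URL format `cos://<endpoint>/<bucket>/[<prefix>]` can be checked before a statement is submitted. The following is a minimal sketch; `parse_cos_url` is a hypothetical helper and not part of ibmcloudsql, and it only validates the shape of the URL, not whether the endpoint or bucket exist.

```python
def parse_cos_url(cos_url):
    """Split a COS URL of the form cos://<endpoint>/<bucket>/[<prefix>].

    Hypothetical helper for illustration; not part of ibmcloudsql.
    Returns a tuple (endpoint, bucket, prefix); prefix may be empty.
    """
    scheme = "cos://"
    if not cos_url.startswith(scheme):
        raise ValueError("COS URL must start with cos://")
    # Split into at most three parts: endpoint, bucket, and the rest as prefix
    parts = cos_url[len(scheme):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("COS URL must contain an endpoint and a bucket")
    endpoint, bucket = parts[0], parts[1]
    prefix = parts[2] if len(parts) == 3 else ""
    return endpoint, bucket, prefix


endpoint, bucket, prefix = parse_cos_url("cos://us-geo/sql/orders.parquet")
print(endpoint, bucket, prefix)
```

Failing early on a malformed URL avoids submitting a job that cannot write its result.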

## Demo notebook
You can use IBM Watson Studio with the following [demo notebook](https://dataplatform.cloud.ibm.com/analytics/notebooks/v2/440b3665-367f-4fc9-86d8-4fe7eae13b18/view?access_token=3c1471a6970890fe28cadf118215df44e82c2472a83c4051e3ff80fe505448ed) that shows some elaborate examples of using various aspects of ibmcloudsql.

## SQLQuery method list
 * `SQLQuery(api_key, instance_crn, target_cos_url=None, token=None, client_info='')` Constructor
 * `logon(force=False, token=None)` Needs to be called before any other method below. It exchanges the `api_key` set at initialization for a temporary OAuth token. The invocation is a no-op if the previous logon was less than 5 minutes ago. You can force a logon anyway with the optional parameter `force=True`. If you initialized the client without an `api_key` and specified a custom `token` instead, you can pass a fresh `token` to `logon` to update the client with it.
 * `submit_sql(sql_text, pagesize=None)` Returns the `jobId` as a string. The optional `pagesize` parameter (in rows) produces paginated result objects.
 * `wait_for_job(jobId)` Waits for job to end and returns job completion state (either `completed` or `failed`)
 * `get_result(jobId, pagenumber=None)` Returns the SQL result as a data frame, either for the entire result or for the specified page of results.
 * `list_results(jobId)` Returns a data frame with the list of result objects written
 * `delete_result(jobId)` Deletes all result set objects in cloud object storage for the given jobId
 * `rename_exact_result(jobId)` Renames single partitioned query result to exact single object name without folder hierarchy.
 * `get_job(jobId)` Returns details for the given SQL job as a json object
 * `get_jobs()` Returns the 30 most recently submitted SQL jobs with all details as a data frame
 * `run_sql(sql_text)` Compound method that calls `submit_sql`, `wait_for_job` and `get_result` in sequence
 * `sql_ui_link()` Returns browser link for Data Engine web console for currently configured instance
 * `get_cos_summary(cos_url)` Returns a summary of the number of stored objects and their volume for the given COS URL as JSON
 * `list_cos_objects(cos_url)` Returns a data frame with the list of objects found in the given cos url
 * `export_job_history(cos_url)` Exports new jobs as parquet file to the given `cos_url`.
 * `export_tags_for_cos_objects(cos_url, export_target_cos_file)` Exports all objects in the given `cos_url` that have tags configured, along with the value for each tag, as a parquet file to `export_target_cos_file`.
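
The compound flow of `run_sql` can also be spelled out by hand, which leaves room to react to a `failed` job before fetching the result. This is a sketch under assumptions: `run_sql_stepwise` is a hypothetical helper, and `client` is assumed to be an `SQLQuery` instance on which `logon()` has already been called. It uses only methods from the list above.

```python
def run_sql_stepwise(client, sql_text):
    """Submit a statement, wait for it, and fetch the result only on success.

    Hypothetical helper for illustration; `client` is assumed to expose
    submit_sql, wait_for_job, get_job and get_result as listed above.
    """
    job_id = client.submit_sql(sql_text)
    state = client.wait_for_job(job_id)  # either 'completed' or 'failed'
    if state != "completed":
        # get_job returns the job details, which help to diagnose the failure
        details = client.get_job(job_id)
        raise RuntimeError(
            "SQL job {} ended in state {}: {}".format(job_id, state, details)
        )
    return client.get_result(job_id)
```

Keeping the steps separate also makes it possible to store the `jobId` and call `get_result` again later.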

## Exceptions
 * `RateLimitedException(message)` Raised when a job can't be submitted because the service responds with status code `429` (the plan limit for concurrent queries has been reached)

## Constructor options
 * `api_key`: IAM API key. When this parameter is set to `None`, you must specify your own valid IAM OAuth token in the parameter `token`.
 * `instance_crn`: Data Engine instance CRN identifier
 * `target_cos_url`: Optional default target URL. Don't use when you want to provide target URL in SQL statement text.
 * `token`: Optional custom IAM OAuth token. When you specify this, you must set the `api_key` parameter to `None`.
 * `client_info`: Optional string to identify your client application in IBM Cloud for problem determination (PD) purposes.
 * `max_tries`: Optional integer to specify the maximum number of attempts when dealing with the request rate limit. The default value is `1`, which means the client throws `RateLimitedException` as soon as a response has status code `429`. Specifying any number greater than `1` enables _exponential backoff_. For instance, with `max_tries=5`, if the first 4 attempts get status code `429` and the 5th attempt gets `201`, the wait times after the failed attempts are `2s`, `4s`, `8s` and `16s`.
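
The wait times in the `max_tries` example can be reproduced with a small sketch. This is not the library's implementation: `backoff_delays` and `call_with_backoff` are hypothetical helpers, and `request` is assumed to be any callable that returns an HTTP status code.

```python
import time


def backoff_delays(max_tries, base_seconds=2):
    """Wait times between attempts: 2s, 4s, 8s, ... (max_tries - 1 entries)."""
    return [base_seconds * 2 ** i for i in range(max_tries - 1)]


def call_with_backoff(request, max_tries=1, sleep=time.sleep):
    """Call `request` until it stops returning 429 or attempts run out.

    Hypothetical helper for illustration; `request` returns a status code.
    """
    delays = backoff_delays(max_tries)
    for attempt in range(max_tries):
        status = request()
        if status != 429:
            return status
        # Wait only if another attempt is still allowed
        if attempt < len(delays):
            sleep(delays[attempt])
    raise RuntimeError(
        "still rate limited after {} attempt(s)".format(max_tries)
    )


print(backoff_delays(5))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.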

## Limitations
The Data Engine Python SDK does not support PyInstaller.

            
