superb-data-klient


- Name: superb-data-klient
- Version: 1.5.0
- Summary: A Python API wrapping services of the Superb Data Kraken (SDK)
- Upload time: 2024-11-04 12:28:57
- Maintainer: Team SDK (sdk@efs-techhub.com)
- Requires Python: >=3.7
- License: Apache-2.0
- Keywords: sdk, superb data kraken, superbdataklient, super data klient, superb data klient, superb-data-klient

![PyPI - License](https://img.shields.io/pypi/l/superb-data-klient)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/superb-data-klient)
![PyPI](https://img.shields.io/pypi/v/superb-data-klient?label=version)
![PyPI - Downloads](https://img.shields.io/pypi/dm/superb-data-klient)


# superb-data-klient


**superb-data-klient** offers a streamlined interface to various services of the *Superb Data Kraken platform* (**SDK**). With the library, you can fetch and index data
and manage indices, spaces, and organizations on the **SDK**.

The library is designed primarily for the JupyterHub environment within the platform, but it can also be set up in other environments.


## Installation and Supported Versions

```console
$ python -m pip install superb-data-klient
```

## Usage


### Authentication


To begin, authenticate against the SDK's OIDC provider. Authentication takes place when the client object is instantiated, using one of the following methods:

1. **System Environment Variables** (recommended for Jupyter environments):
    ```python
    import superbdataklient as sdk
    client = sdk.SDKClient()
    ```
   This approach reads the tokens from the environment variables **SDK_ACCESS_TOKEN** and **SDK_REFRESH_TOKEN**.


2. **Login Credentials**:
    ``` python
    import superbdataklient as sdk
    client = sdk.SDKClient(username='hasslethehoff', password='lookingforfreedom')
    ```

3. **Authentication Code Flow**:

   If neither of the methods above applies, authentication falls back to the code flow.

   **CAUTION:** This method only works in an environment where a browser is available.
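
Before relying on the first option outside the platform's Jupyter environment, it can help to check that both token variables are actually set. The following is a minimal sketch using only the standard library; the helper name `missing_token_vars` is hypothetical and not part of superb-data-klient.

```python
import os

# Variable names as documented for the environment-variable login.
REQUIRED_TOKEN_VARS = ("SDK_ACCESS_TOKEN", "SDK_REFRESH_TOKEN")


def missing_token_vars(environ=os.environ):
    """Return the names of the token variables that are unset or empty."""
    return [name for name in REQUIRED_TOKEN_VARS if not environ.get(name)]


missing = missing_token_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Both token variables are set.")
```

If the list is not empty, one of the other two login methods is probably the better choice.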

**NOTE:** If your user account was linked from an external identity provider, your account in the SDK identity provider (Keycloak) does not have a password by default. To enable login via basic authentication, you need to set a password through self-service first.

Follow these steps to set your password:

1. Go to the self-service portal for your environment:
   - [https://{domain}/auth/realms/{realm}/account/](https://{domain}/auth/realms/{realm}/account/).
   - e.g. [https://app.sdk-cloud.de/auth/realms/efs-sdk/account/](https://app.sdk-cloud.de/auth/realms/efs-sdk/account/).
2. Set a password for your account.
3. Once the password is set, you can log in using basic authentication (option 2).
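
The self-service URL from step 1 can be built from the domain and realm of your instance. This is a small sketch of the URL pattern shown above; the function name `account_url` is an illustrative assumption.

```python
def account_url(domain: str, realm: str) -> str:
    """Fill the documented self-service URL pattern with a domain and realm."""
    return f"https://{domain}/auth/realms/{realm}/account/"


# Values taken from the example in step 1.
print(account_url("app.sdk-cloud.de", "efs-sdk"))
# prints https://app.sdk-cloud.de/auth/realms/efs-sdk/account/
```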

### Configuration


The default settings target the standard SDK instance. Other instances can be selected with the `env` parameter, and individual settings can be overridden.


#### Setting Environment

``` python
import superbdataklient as sdk
client = sdk.SDKClient(env='sdk-dev')
client = sdk.SDKClient(env='sdk')
```

#### Overriding Settings

``` python
client = sdk.SDKClient(domain='mydomain.ai', realm='my-realm', client_id='my-client-id', api_version='v13.37')
```


#### Proxy
To use the SDK client behind a company proxy, pass the following parameters to the constructor.  
**NOTE**: The environment variables `http_proxy` and `https_proxy` override the proxy settings passed to the `SDKClient`,
so unset them before configuring the client.
```python
from superbdataklient import SDKClient

client = SDKClient(username='hasslethehoff',
                   password='lookingforfreedom',
                   proxy_http="http://proxy.example.com:8080",
                   proxy_https="https://proxy.example.com:8080",
                   proxy_user="proxyusername",
                   proxy_pass="proxyuserpassword")
```
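
Because the two environment variables take precedence, they have to be removed from the running process before the client is created. One way to do this is sketched below; `clear_proxy_env` is a hypothetical helper, not part of superb-data-klient, and the demonstration runs on a sample dictionary so that it does not touch the real environment.

```python
# Variable names taken from the note above.
PROXY_VARS = ("http_proxy", "https_proxy")


def clear_proxy_env(environ):
    """Remove the proxy variables from `environ` and return what was removed."""
    removed = {}
    for name in PROXY_VARS:
        if name in environ:
            removed[name] = environ.pop(name)
    return removed


# Demonstration on a sample mapping instead of the real environment.
sample_env = {"http_proxy": "http://old-proxy:3128", "PATH": "/usr/bin"}
removed = clear_proxy_env(sample_env)
print("Removed:", removed)
print("Remaining:", sample_env)
```

In a real session, call `clear_proxy_env(os.environ)` (after `import os`) before instantiating the `SDKClient`.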

#### Logging
You can pass a user-defined logger to integrate the client's log output into an existing logging setup.
Pass the logger as the `logger` argument when initializing the `SDKClient` instance. The client's methods then forward their log messages to this logger; if no logger is passed, log output is printed to `stdout` / `stderr`.

```python
import logging
from superbdataklient import SDKClient

# Configure the logger
my_logger = logging.getLogger('sdk_logger')
my_logger.setLevel(logging.DEBUG)
console_handler = logging.StreamHandler()
console_handler.setLevel(logging.DEBUG)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
console_handler.setFormatter(formatter)
my_logger.addHandler(console_handler)

# Pass the logger to the SDKClient
client = SDKClient(logger=my_logger)
```
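
In a notebook, re-running a cell that calls `addHandler` attaches another handler each time, so every message is printed multiple times. A minimal sketch that avoids this by only adding a handler when none is attached yet; the helper name `get_sdk_logger` is an illustrative assumption.

```python
import logging


def get_sdk_logger(name="sdk_logger", level=logging.DEBUG):
    """Return a configured logger, adding a console handler only once."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    if not logger.handlers:  # skip if a handler was already attached
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger


my_logger = get_sdk_logger()
my_logger = get_sdk_logger()  # a second call reuses the existing handler
print(len(my_logger.handlers))
```

The resulting logger can be passed to the client as `SDKClient(logger=my_logger)`.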

---
### Examples


#### Organizations


Get details of all organizations, or retrieve by ID or name:

``` python
client.organization_get_all()
client.organization_get_by_id(1337)
client.organization_get_by_name('my-organization')
```

#### Spaces


To retrieve spaces related to an organization:

``` python
organization_id = 1234
space_id = 5678  # example space ID
space = 'my-space'  # example space name
client.space_get_all(organization_id)
client.space_get_by_id(organization_id, space_id)
client.space_get_by_name(organization_id, space)
```

#### Index


<!--
TODO: implement after search service works without all_access ()

List all accessible indices:

``` python
indices = client.index_get_all()
```
-->

Retrieve a specific document:

``` python
document = client.index_get_document(index_name, doc_id)
``` 

Fetch all documents within an index:

``` python
documents = client.index_get_all_documents("index-name")
```

Iterate through documents using a generator:

``` python
documents = client.index_get_documents("index-name")
for document in documents:
   print(document)
```
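
A generator makes it possible to look at only the first few documents without loading the whole index. The sketch below assumes that `index_get_documents` returns a regular Python generator, as described above; `fake_documents` is a stand-in so that the example runs without a connection to the SDK.

```python
from itertools import islice


def fake_documents():
    """Stand-in generator; replace with client.index_get_documents(...)."""
    for i in range(1000):
        yield {"_id": i, "name": f"document{i:02d}"}


# islice stops after three items, so the remaining documents are never produced.
first_three = list(islice(fake_documents(), 3))
print(first_three)
```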

Index multiple documents:

``` python
documents = [
   {"_id": 123, "name": "document01", "value": "value"},
   {"_id": 1337, "name": "document02", "value": "value"}
]
index_name = "index"
client.index_documents(documents, index_name)
``` 

Note: The optional **_id** field is used as the document ID for indexing in OpenSearch.
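
If your records already carry a unique field, it can be copied into **_id** before indexing. A minimal sketch; the helper `with_ids` and the field name `serial` are hypothetical and only serve as illustration.

```python
def with_ids(records, key):
    """Return copies of `records` with `_id` set from the field `key`."""
    return [{**record, "_id": record[key]} for record in records]


records = [
    {"serial": 123, "name": "document01"},
    {"serial": 1337, "name": "document02"},
]
documents = with_ids(records, "serial")
print(documents)
```

The resulting list has the same shape as the `documents` list in the indexing example above, and the original records are left unchanged.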

Filter indices by organization, space, and type:

``` python
client.index_filter_by_space("my-organization", "my-space", "index-type")
```

To match all spaces in an organization, pass `*` instead of a space name. Valid **index_type** values are **ANALYSIS** and **MEASUREMENTS**.

Create an application index:

``` python
mapping = {
   ...
}
client.application_index_create("my-application-index", "my-organization", "my-space", mapping)
```

Remove an application index by its name:

``` python
client.application_index_delete("my-organization_my-space_analysis_my-application-index")
```

#### Storage


List files in Storage:

``` python
files = client.storage_list_blobs("my-organization", "space")
```

Download specific files from Storage:

``` python
files = ['file01.txt', 'directory/file02.json']
client.storage_download_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')
```

Use regex patterns for file downloads:

``` python
files = ['file01.txt', 'directory/file02.json']
client.storage_download_files_with_regex(organization='my-organization', space='my-space', files=files, local_dir='tmp', regex=r'.*json$')
```
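
To see which names a pattern selects before starting a download, the pattern can be tested locally with Python's `re` module. This sketch only illustrates the pattern itself; how the library combines the `files` list with the pattern is not specified here, and matching from the start of the name is an assumption.

```python
import re

files = ['file01.txt', 'directory/file02.json']
pattern = re.compile(r'.*json$')

# Keep only the names that the pattern matches.
matching = [name for name in files if pattern.match(name)]
print(matching)  # prints ['directory/file02.json']
```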

Upload files from a local directory. Include a valid `meta.json` unless the `metadataGenerate` property of the space is set to `true`:

``` python
files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files(organization='my-organization', space='my-space', files=files, local_dir='tmp')
```
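
A quick local check can catch a missing or malformed `meta.json` before the upload starts. The sketch below only verifies that the file exists and parses as JSON; it does not validate the metadata schema expected by the SDK, and `check_meta` is a hypothetical helper.

```python
import json
import tempfile
from pathlib import Path


def check_meta(local_dir):
    """Return True if local_dir/meta.json exists and parses as JSON."""
    meta_path = Path(local_dir) / "meta.json"
    try:
        with meta_path.open(encoding="utf-8") as handle:
            json.load(handle)
    except (OSError, ValueError):  # missing/unreadable file or invalid JSON
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    before = check_meta(tmp)  # no meta.json yet
    # Placeholder content, not the real metadata schema.
    (Path(tmp) / "meta.json").write_text('{"name": "example"}', encoding="utf-8")
    after = check_meta(tmp)

print(before, after)
```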

To monitor the status of an upload, pass a `progress_callback` function with the following signature:

``` python
def progress_callback(uploaded: int, total: int) -> None: ...
```

where:
- `uploaded`: The number of bytes that have been uploaded so far.
- `total`: The total size of the file in bytes.

``` python
def progress_callback(uploaded, total):
    # do something to update the progress bar, e.g. print the current state
    print(f"{uploaded} of {total} bytes uploaded")

files = ['meta.json', 'file01.txt', 'file02.txt']
client.storage_upload_files(organization='my-organization', space='my-space', files=files, local_dir='tmp', progress_callback=progress_callback)
```
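
A callback with this signature could, for example, render a percentage on a single console line. The following is a minimal sketch that can be run without the SDK by simulating a few callback invocations; `format_progress` is a hypothetical helper.

```python
def format_progress(uploaded: int, total: int) -> str:
    """Render upload progress as a percentage string."""
    # Treat an empty file as complete to avoid dividing by zero.
    percent = 100.0 * uploaded / total if total else 100.0
    return f"{percent:.1f}% ({uploaded}/{total} bytes)"


def progress_callback(uploaded: int, total: int) -> None:
    # "\r" returns to the start of the line so each call overwrites the last.
    print(f"\r{format_progress(uploaded, total)}", end="", flush=True)


# Simulate three callback invocations for a 200-byte file.
for done in (50, 100, 200):
    progress_callback(done, 200)
print()
```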

            
