# Compute Module Lib
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/foundry-compute-modules)
[![PyPI](https://img.shields.io/pypi/v/foundry-compute-modules)](https://pypi.org/project/foundry-compute-modules/)
[![License](https://img.shields.io/badge/License-Apache%202.0-lightgrey.svg)](https://opensource.org/licenses/Apache-2.0)
> [!WARNING]
> This SDK is incubating and subject to change.
An open-source Python library for Compute Modules that handles tasks such as service discovery, retrieving a token, and accessing external source credentials.
## Functions Mode
Sources store secrets for use within a Compute Module, so you do not have to put secrets in your container or in plaintext in the job specification.
Retrieving a source credential with this library is simple: in Functions Mode, sources are passed to your function in the `context`.
### Basic usage
#### Option 1 - `@function` annotation on each endpoint
```python
# app.py
from typing import List

from compute_modules.annotations import function

@function
def add(context, event) -> int:
    return event["x"] + event["y"]

@function
def get_sources(context, event) -> List[str]:
    # Sources are passed in through the context in Functions Mode
    return list(context["sources"].keys())
```
#### Option 2 - Explicitly register function(s) & start the Compute Module
You can either add a single function via `add_function` or several at once with `add_functions`.
```python
# functions/add.py
def add(context, event) -> int:
    return event["x"] + event["y"]

# functions/hello.py
def hello(context, event) -> str:
    return "Hello " + event["x"] + "!"

# app.py
from compute_modules import add_functions, start_compute_module

from functions.add import add
from functions.hello import hello

if __name__ == "__main__":
    add_functions(
        hello,
        add,
    )
    start_compute_module()
```
### Advanced Usage 1 - streaming result
This library can stream results back when a function is called. If the function's return type is `Iterable`, you may pass `streaming=True` to `@function` or `add_function` to enable result streaming. The result is posted as a stream of JSON dumps, so make sure every element of the `Iterable` result is JSON serializable.
#### Use `@function`
```python
# app.py
from compute_modules.annotations import function

@function(streaming=True)
def get_strings(context, event) -> list[str]:
    return [f'string {i}' for i in range(10)]
```
#### Use `add_function`
```python
# functions/get_strings.py
def get_strings(context, event) -> list[str]:
    return [f'string {i}' for i in range(10)]

# app.py
from compute_modules import add_function, start_compute_module

from functions.get_strings import get_strings

if __name__ == "__main__":
    add_function(get_strings, streaming=True)
    start_compute_module()
```
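A generator should also satisfy the `Iterable` requirement. The standalone sketch below illustrates what "a stream of JSON dumps" could look like. `iter_json_chunks` is a hypothetical helper written for this example, not part of `compute_modules`, and the library's actual wire format may differ.

```python
import json
from typing import Any, Iterable, Iterator


def get_strings(context, event) -> Iterable[str]:
    # A generator yields elements lazily instead of building the whole list
    for i in range(3):
        yield f"string {i}"


def iter_json_chunks(items: Iterable[Any]) -> Iterator[str]:
    # Hypothetical helper: each element is serialized on its own, which is
    # why every element must be JSON serializable
    for item in items:
        yield json.dumps(item)


chunks = list(iter_json_chunks(get_strings(None, {})))
```

An element that `json.dumps` cannot handle (for example a `set`) would raise a `TypeError` partway through the stream, so it is worth validating element types up front.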
### Advanced Usage 2 - automatic function discovery
This library includes functionality that will inspect the functions registered for the Compute Module, inspect the input/output types of those functions, and then convert those to FunctionSpecs that can be imported as a Foundry Function without any modifications needed. Below are some considerations to ensure this feature works as expected.
#### 1. The Input class must be a complex type
Foundry function specs require the input type of a Function to be a complex type. If your function takes only a single primitive type as input, make sure to wrap that param in a complex type to have your function schema inferred properly.
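As a minimal sketch of that wrapping, the example below puts a single `int` inside a `TypedDict`. The names `DoubleInput` and `double` are made up for illustration; in a real Compute Module you would register the function with `@function` or `add_function` as shown under Basic usage.

```python
from typing import TypedDict


class DoubleInput(TypedDict):
    # Wraps the single primitive the function actually needs
    value: int


def double(context, event: DoubleInput) -> int:
    return event["value"] * 2


result = double(None, {"value": 21})
```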
#### 2. Input type definition
**✅ TypedDict as input type**
```python
from typing import TypedDict
from compute_modules.annotations import function

class HelloInput(TypedDict):
    planet: str

@function
def hello(context, event: HelloInput) -> str:
    return "Hello " + event["planet"] + "!"
```
**✅ dataclass as input type**
```python
import datetime
import decimal
from dataclasses import dataclass

from compute_modules.annotations import function

@dataclass
class TypedInput:
    bytes_value: bytes
    bool_value: bool
    date_value: datetime.date
    decimal_value: decimal.Decimal
    float_value: float
    int_value: int
    str_value: str
    datetime_value: datetime.datetime
    other_date_value: datetime.datetime

@function
def typed_function(context, event: TypedInput) -> str:
    diff = event.other_date_value - event.datetime_value
    return f"The diff between dates provided is {diff}"
```
**✅ regular class with both class AND constructor type hints**
```python
from compute_modules.annotations import function

class GoodExample:
    some_flag: bool
    some_value: int

    def __init__(self, some_flag: bool, some_value: int) -> None:
        self.some_flag = some_flag
        self.some_value = some_value

@function
def typed_function(context, event: GoodExample) -> int:
    return event.some_value
```
**❌ AVOID python class with no class type hints**
```python
# This will raise an exception
class BadClassNoTypeHints:
    def __init__(self, arg1: str, arg2: int):
        ...
```
**❌ AVOID python class with no constructor type hints**
```python
# This will raise an exception
class BadClassNoInitHints:
    arg1: str
    arg2: int

    def __init__(self, arg1, arg2):
        ...
```
**❌ AVOID python class with `args` in constructor**
```python
# This will raise an exception
class BadClassArgsInit:
    arg1: str
    arg2: int

    def __init__(self, arg1: str, arg2: int, *args):
        ...
```
**❌ AVOID python class with `kwargs` in constructor**
```python
# This will raise an exception
class BadClassKwargsInit:
    arg1: str
    arg2: int

    def __init__(self, arg1: str, arg2: int, **kwargs):
        ...
```
**❌ AVOID using dict/Dict with no type params**
```python
import typing
from dataclasses import dataclass

# These both will raise an exception
@dataclass
class MyPayload:
    data: dict

@dataclass
class MyPayload:
    data: typing.Dict
```
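The heading above implies that a `dict` with explicit type parameters is acceptable, although this README does not show one. The sketch below is therefore an assumption: `MyTypedPayload` is a hypothetical name, and the snippet only checks with the standard library that the key and value types are visible to type inspection.

```python
from dataclasses import dataclass
from typing import Dict, get_type_hints


@dataclass
class MyTypedPayload:
    # Key and value types are spelled out, unlike bare `dict` / `typing.Dict`
    data: Dict[str, int]


# The parameterized annotation is what schema inference can inspect
hints = get_type_hints(MyTypedPayload)
payload = MyTypedPayload(data={"a": 1})
```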
#### 3. Serialization/De-serialization of various types
| Python Type | Foundry Type | Serialized over HTTP as |
| ----------------------- | ------------ | ----------------------- |
| int | Integer | int |
| str | Byte | string |
| bool | Boolean | boolean |
| bytes | Binary | string |
| datetime.date | Date | string |
| datetime.datetime | Timestamp | int (Unix timestamp) |
| decimal.Decimal | Decimal | string |
| float | Float | float |
| list | Array | array |
| set | Array | array |
| dict | Map | JSON |
| class/TypedDict | Struct | JSON |
| Iterable (w/ streaming) | Array | stream of JSON |
### `QueryContext` typing
You can annotate the `context` param in any function with the `QueryContext` type to make it statically typed:
```python
from typing import TypedDict

from compute_modules.context import QueryContext
from compute_modules.annotations import function

class HelloInput(TypedDict):
    x: str

@function
def hello(context: QueryContext, event: HelloInput) -> str:
    return f"Hello {event['x']}! Your job ID is: {context.jobId}"
```
If left un-annotated, the `context` param will be a `dict`.
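For comparison, here is a sketch of the same function with an un-annotated `context`, using key lookup instead of attribute access. It assumes the dict uses the same key name (`jobId`) as the `QueryContext` attribute, which this README does not confirm; the decorator is omitted so the snippet runs on its own.

```python
def hello(context, event) -> str:
    # Un-annotated context is a plain dict, so use key lookup.
    # The "jobId" key name is an assumption based on the QueryContext field.
    return f"Hello {event['x']}! Your job ID is: {context['jobId']}"


message = hello({"jobId": "job-123"}, {"x": "world"})
```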
## Pipelines Mode
### Retrieving source credentials
Sources allow you to store secrets securely for use within a Compute Module, eliminating the need to include secrets in your container or in plaintext within the job specification. Retrieving a source credential using this library is straightforward:
```python
from compute_modules.sources import get_sources, get_source_secret

# retrieve a dict with all sources
sources = get_sources()

# retrieve the credentials of a specific source
my_creds = get_source_secret("mySourceApiName", "MyCredential")
```
### Retrieving pipeline resources
The SDK offers a convenient method for retrieving information on the resources configured for your pipeline module. This allows you to obtain the rid (& branch, if present) of a Foundry resource via the alias provided for that resource in the Configure tab of your compute module.
```python
from compute_modules.resources import PipelineResource, get_pipeline_resources
resources: dict[str, PipelineResource] = get_pipeline_resources()
print(f"My resource's rid is: {resources['your-alias-name'].rid}")
```
### Retrieving pipeline token
To obtain an auth token for interacting with Foundry resources in Pipelines Mode, use the following function:
```python
from compute_modules.auth import retrieve_pipeline_token
import requests
pipeline_token = retrieve_pipeline_token()
requests.post(..., headers={"Authorization": f"Bearer {pipeline_token}"})
```
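If you make several requests, it can help to build the header in one place. `bearer_headers` below is a hypothetical helper for illustration, not part of `compute_modules`; in practice you would pass it the value returned by `retrieve_pipeline_token()`.

```python
from typing import Dict


def bearer_headers(token: str) -> Dict[str, str]:
    # Hypothetical helper: builds the Authorization header shown above
    return {"Authorization": f"Bearer {token}"}


# "example-token" is a placeholder, not a real token
headers = bearer_headers("example-token")
```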
## Application's permissions / Third Party App
If you have configured your Compute Module (CM) to use Application's permissions, your application will use a service user for permissions instead of relying on the user's permissions. This configuration requires you to obtain the client ID and credentials to grant permission to the service token. This library facilitates this process:
```python
from compute_modules.auth import retrieve_third_party_id_and_creds, oauth
CLIENT_ID, CLIENT_CREDS = retrieve_third_party_id_and_creds()
# get a scoped token for your 3pa
HOSTNAME = "myenvironment.palantirfoundry.com"
access_token = oauth(HOSTNAME, ["api:datasets-read"])
```
## Retrieving Arguments
This SDK provides utilities for retrieving arguments passed into the compute module. There are two different functions available: `get_raw_arguments` and `get_parsed_arguments`. Below is an example showing the difference between the two.
For a Compute Module with the following arguments configured: ![Compute Module Arguments](./assets/arguments_example.png)
If we log the result of both as such:
```python
# app.py
import logging as log
from compute_modules.annotations import function
from compute_modules.arguments import get_raw_arguments, get_parsed_arguments

log.basicConfig(level=log.INFO)

@function
def hello(context, event) -> str:
    raw_args = get_raw_arguments()
    parsed_args = get_parsed_arguments()
    log.info(f"raw_args: {raw_args}")
    log.info(f"parsed_args: {parsed_args}")
    ...
```
We would then receive the following log output:
```stdout
INFO: raw_args: ['--test', 'hello', '--another-param', 'world']
INFO: parsed_args: Namespace(test='hello', another_param='world')
```
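The parsed output has the shape of an `argparse.Namespace`. Whether the library uses `argparse` internally is an assumption; the sketch below only reproduces the same raw-to-parsed mapping with the standard library, declaring the two arguments by hand.

```python
import argparse

# The raw argument list from the log output above
raw_args = ["--test", "hello", "--another-param", "world"]

parser = argparse.ArgumentParser()
parser.add_argument("--test")
# argparse converts the dash to an underscore: dest becomes "another_param"
parser.add_argument("--another-param")

parsed_args = parser.parse_args(raw_args)
```

Note that the dashed flag `--another-param` is read back as the attribute `parsed_args.another_param`.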
## Logging
To ensure your logs are emitted properly, we recommend using the `get_logger` utility function provided by the SDK. It returns a normal `logging.Logger` instance, so you can use it as a drop-in replacement for `logging.getLogger`.
```python
import logging

from compute_modules.logging import get_logger

logger = get_logger(__name__)
logger.setLevel(logging.INFO)

logger.debug("Can't see me")
logger.info("Peekaboo!")
logger.warning("Peekaboo!")
logger.error("Peekaboo!")
logger.critical("Peekaboo!")
```
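Because the returned object is a standard `logging.Logger`, the usual level rules apply. The sketch below uses `logging.getLogger` as a stand-in for `get_logger` so that it runs without the SDK, and the logger name is arbitrary; it shows why the `debug` call above produces no output.

```python
import logging

# Stand-in for get_logger(__name__); the logger name is arbitrary
logger = logging.getLogger("example_module")
logger.setLevel(logging.INFO)

# DEBUG is below the INFO threshold, so debug records are dropped
debug_enabled = logger.isEnabledFor(logging.DEBUG)
info_enabled = logger.isEnabledFor(logging.INFO)
```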
### Surfacing logs from the `compute_modules` library
By default, the logs emitted from within the `compute_modules` library have a level of `ERROR`, meaning only error- or critical-level logs will be emitted. If for any reason you want to see other logs being emitted from within `compute_modules` you can use the `set_internal_log_level` function.
```python
import logging

from compute_modules.logging import set_internal_log_level

set_internal_log_level(logging.DEBUG)
```