[![PyPI version](https://badge.fury.io/py/odd-collector-sdk.svg)](https://badge.fury.io/py/odd-collector-sdk)
# ODD Collector SDK
Root project for ODD collectors
### Domain
* `CollectorConfig`
_Main configuration model for the collector_
``` python
class CollectorConfig(pydantic.BaseSettings):
    default_pulling_interval: int  # pulling interval in minutes
    token: str  # token for requests to odd-platform
    plugins: Any
    platform_host_url: str
```
* `Collector`
Args:
`config_path`: str - path to `collector_config.yaml` (e.g. `'/collector_config.yaml'`)
`root_package`: str - root package from which adapters are loaded (e.g. `'my_collector.adapters'`)
`plugins_union_type` - annotated union type of the available plugin models, used by pydantic to parse the `plugins` section.
* `Plugin`
Configuration model for an adapter
```python
class Plugin(pydantic.BaseSettings):
    name: str
    description: Optional[str] = None
    namespace: Optional[str] = None
```
The `Plugin` class inherits from Pydantic's `BaseSettings`, so any field omitted from `collector_config.yaml` can be read from environment variables.
The field `type: Literal["custom_adapter"]` is mandatory for every plugin. By convention, the literal **MUST** match the name of the adapter package.
Plugins example:
```python
# plugins.py
from typing import List, Literal, Optional, Union

import pydantic
from typing_extensions import Annotated

from odd_collector_sdk.domain.plugin import Plugin


class AwsPlugin(Plugin):
    aws_secret_access_key: str
    aws_access_key_id: str
    aws_region: str


class S3Plugin(AwsPlugin):
    type: Literal["s3"]
    buckets: Optional[List[str]] = []


class GluePlugin(AwsPlugin):
    type: Literal["glue"]


# For Collector's plugins_union_type argument
AvailablePlugin = Annotated[
    Union[
        GluePlugin,
        S3Plugin,
    ],
    pydantic.Field(discriminator="type"),
]
```
* `AbstractAdapter`
Abstract base class which **MUST** be implemented by every adapter
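
The environment-variable fallback described for `Plugin` can be pictured with a small standard-library sketch. This is not how pydantic implements `BaseSettings` (its lookup rules, such as prefixes and case sensitivity, are configurable). `resolve_field` and the demo values are hypothetical and exist only for illustration.

```python
import os
from typing import Any, Dict, Optional


def resolve_field(config: Dict[str, Any], field: str) -> Optional[Any]:
    """Return the value from the config, falling back to an environment variable."""
    if field in config:
        return config[field]
    # Try the exact field name first, then its upper-case spelling.
    return os.environ.get(field, os.environ.get(field.upper()))


# Demo value standing in for a variable exported in the shell (assumption).
os.environ["AWS_REGION"] = "eu-central-1"

# A plugin entry as it might appear in collector_config.yaml, without aws_region.
plugin_section = {"type": "s3", "name": "s3_adapter"}

print(resolve_field(plugin_section, "aws_region"))  # taken from the environment
print(resolve_field(plugin_section, "name"))  # taken from the config entry
```

Values present in the config win over the environment in this sketch; check the pydantic settings documentation for the precedence the SDK actually gets.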
## Collector example
### Requirements
Use the [poetry](https://python-poetry.org/) package manager to add odd-collector-sdk to your project. `asyncio` is part of the Python standard library and needs no separate installation.
```bash
poetry add odd-collector-sdk
```
### A typical top-level directory layout for a collector (using a poetry project as an example)
    .
    ├── my_collector
    │   ├── adapters                    # Adapters
    │   │   ├── custom_adapter          # Some adapter package
    │   │   │   ├── adapter.py          # Entry file for adapter
    │   │   │   └── __init__.py
    │   │   ├── other_custom_adapter
    │   │   ├── ...                     # Other adapters
    │   │   └── __init__.py
    │   ├── domain                      # Domain models
    │   │   ├── ...
    │   │   ├── plugins.py              # Models for available plugins
    │   │   └── __init__.py
    │   ├── __init__.py
    │   └── __main__.py                 # Entry file for collector
    ├── ...
    ├── collector_config.yaml
    ├── pyproject.toml
    ├── LICENSE
    └── README.md
### Adapters folder
Each adapter package inside the adapters folder must contain an `adapter.py` file with an `Adapter` class that implements `AbstractAdapter`.
```python
# custom_adapter/adapter.py example
from typing import Any

from odd_collector_sdk.domain.adapter import AbstractAdapter
from odd_models.models import DataEntityList


class Adapter(AbstractAdapter):
    def __init__(self, config: Any) -> None:
        # `config` is the plugin configuration for this adapter
        super().__init__()

    def get_data_entity_list(self) -> DataEntityList:
        # Placeholder result; a real adapter returns the entities it collected
        return DataEntityList(data_source_oddrn="test")

    def get_data_source_oddrn(self) -> str:
        # Placeholder ODDRN of the data source
        return "oddrn"
```
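The naming convention above (plugin `type` equals the adapter package name, and each package exposes `adapter.py` with an `Adapter` class) suggests how adapters can be located at runtime. The following is a minimal sketch using `importlib`, assuming that lookup scheme; it is not the SDK's actual loader. `load_adapter_class` and the throwaway `demo_collector` package are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_adapter_class(root_package: str, plugin_type: str):
    """Import `<root_package>.<plugin_type>.adapter` and return its `Adapter` class."""
    module = importlib.import_module(f"{root_package}.{plugin_type}.adapter")
    return module.Adapter


# Build a throwaway package on disk so the sketch runs without a real project.
tmp_dir = tempfile.mkdtemp()
adapter_dir = Path(tmp_dir, "demo_collector", "adapters", "custom_adapter")
adapter_dir.mkdir(parents=True)
for package_dir in (adapter_dir, adapter_dir.parent, adapter_dir.parent.parent):
    (package_dir / "__init__.py").write_text("")
(adapter_dir / "adapter.py").write_text(
    "class Adapter:\n"
    "    def get_data_source_oddrn(self):\n"
    "        return 'oddrn'\n"
)
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

# The plugin `type` from the config doubles as the adapter package name.
adapter_cls = load_adapter_class("demo_collector.adapters", "custom_adapter")
adapter = adapter_cls()
print(adapter.get_data_source_oddrn())
```

A plugin whose `type` does not match any package under the root package would fail here with `ModuleNotFoundError`, which is why the literal and the package name have to agree.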
### Plugins
Each plugin must extend the `Plugin` class from the SDK.
```python
# domain/plugins.py
from typing import Literal, Union
from typing_extensions import Annotated

import pydantic
from odd_collector_sdk.domain.plugin import Plugin


class CustomPlugin(Plugin):
    type: Literal["custom_adapter"]


class OtherCustomPlugin(Plugin):
    type: Literal["other_custom_adapter"]


# This union type is passed to Collector as plugins_union_type
AvailablePlugins = Annotated[
    Union[CustomPlugin, OtherCustomPlugin],
    pydantic.Field(discriminator="type"),
]
```
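The `discriminator="type"` setting tells pydantic to pick the plugin model by looking at the `type` value of each config entry. The idea can be sketched with the standard library alone. This is an illustration of the dispatch, not pydantic's implementation, and `PluginSketch`, its subclasses, `PLUGIN_BY_TYPE` and `parse_plugin` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Type


@dataclass
class PluginSketch:
    """Stand-in for the SDK's Plugin model (illustration only)."""

    name: str
    type: str
    description: Optional[str] = None
    namespace: Optional[str] = None


@dataclass
class CustomPluginSketch(PluginSketch):
    pass


@dataclass
class OtherCustomPluginSketch(PluginSketch):
    pass


# The discriminator value maps each config entry to exactly one model class.
PLUGIN_BY_TYPE: Dict[str, Type[PluginSketch]] = {
    "custom_adapter": CustomPluginSketch,
    "other_custom_adapter": OtherCustomPluginSketch,
}


def parse_plugin(raw: Dict[str, Any]) -> PluginSketch:
    """Choose the model class from the `type` field, then build it."""
    plugin_type = raw.get("type")
    if plugin_type not in PLUGIN_BY_TYPE:
        raise ValueError(f"Unknown plugin type: {plugin_type!r}")
    return PLUGIN_BY_TYPE[plugin_type](**raw)


plugin = parse_plugin({"type": "other_custom_adapter", "name": "other_custom_adapter_name"})
print(type(plugin).__name__)
```

Because the choice depends on a single field, an unknown `type` can be rejected immediately instead of being tried against every model in the union.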
### collector_config.yaml
```yaml
default_pulling_interval: 10 # pulling interval in minutes
token: "" # token for requests to odd-platform
platform_host_url: "http://localhost:8080"
plugins:
  - type: custom_adapter
    name: custom_adapter_name
  - type: other_custom_adapter
    name: other_custom_adapter_name
```
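To make the shape of this file concrete, here is a rough sketch of the structure a YAML parser would typically produce for it, loaded into a plain dataclass that mirrors the `CollectorConfig` fields. `CollectorConfigSketch` is a hypothetical stand-in, not the SDK class, and the checks are only examples of what one might verify.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class CollectorConfigSketch:
    """Hypothetical stand-in mirroring the fields of CollectorConfig."""

    default_pulling_interval: int  # minutes
    token: str
    platform_host_url: str
    plugins: List[Dict[str, Any]]


# The parsed form of the YAML file above (assumed parser output).
raw_config = {
    "default_pulling_interval": 10,
    "token": "",
    "platform_host_url": "http://localhost:8080",
    "plugins": [
        {"type": "custom_adapter", "name": "custom_adapter_name"},
        {"type": "other_custom_adapter", "name": "other_custom_adapter_name"},
    ],
}

config = CollectorConfigSketch(**raw_config)

# Every plugin entry needs at least `type` and `name`.
incomplete = [p for p in config.plugins if not {"type", "name"}.issubset(p)]

# The interval is given in minutes; convert it for APIs that expect seconds.
interval_seconds = config.default_pulling_interval * 60

print(incomplete)
print(interval_seconds)
print([p["type"] for p in config.plugins])
```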
## Usage
```python
# __main__.py
import asyncio
import logging
from os import path

from odd_collector_sdk.collector import Collector

# Union type of available plugins
from my_collector.domain.plugins import AvailablePlugins

logging.basicConfig(
    level=logging.INFO, format="[%(asctime)s] %(levelname)s in %(module)s: %(message)s"
)

# Created outside the try block so the except handler can always reference it
loop = asyncio.get_event_loop()

try:
    cur_dirname = path.dirname(path.realpath(__file__))
    config_path = path.join(cur_dirname, "../collector_config.yaml")
    root_package = "my_collector.adapters"

    collector = Collector(config_path, root_package, AvailablePlugins)

    # Register data sources first, then start polling and keep the loop alive
    loop.run_until_complete(collector.register_data_sources())

    collector.start_polling()
    loop.run_forever()
except Exception as e:
    logging.error(e, exc_info=True)
    loop.stop()
```
Then run the collector:
```bash
poetry run python -m my_collector
```
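Once started, the collector repeatedly pulls data at the configured interval. The general pattern of running a coroutine periodically can be sketched with plain `asyncio`. This is an assumption-laden illustration, not the scheduler the SDK uses; `poll`, `fake_ingest` and `max_runs` are hypothetical, and `max_runs` exists only so the demo terminates.

```python
import asyncio
from typing import Awaitable, Callable, List


async def poll(
    job: Callable[[], Awaitable[None]], interval_seconds: float, max_runs: int
) -> None:
    """Run `job`, wait `interval_seconds`, and repeat up to `max_runs` times."""
    for run in range(max_runs):
        await job()
        if run < max_runs - 1:
            await asyncio.sleep(interval_seconds)


collected: List[str] = []


async def fake_ingest() -> None:
    # Stand-in for fetching entities from an adapter and sending them on.
    collected.append("data_entity_list")


# A tiny interval keeps the demo fast; a real interval would be minutes * 60.
asyncio.run(poll(fake_ingest, interval_seconds=0.01, max_runs=3))
print(len(collected))
```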
Raw data
{
"_id": null,
"home_page": "https://github.com/opendatadiscovery/odd-collector-sdk",
"name": "odd-collector-sdk",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": "odd-collector-sdk, odd_collector_sdk, opendatadiscovery",
"author": "Open Data Discovery",
"author_email": "pypi@opendatadiscovery.org",
"download_url": "https://files.pythonhosted.org/packages/ea/0b/2f0c6d080ad28fe938d226973f3a929b9f69ad71f7ea5ea28505c486f870/odd_collector_sdk-0.3.60.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "ODD Collector",
"version": "0.3.60",
"project_urls": {
"Homepage": "https://github.com/opendatadiscovery/odd-collector-sdk",
"Repository": "https://github.com/opendatadiscovery/odd-collector-sdk"
},
"split_keywords": [
"odd-collector-sdk",
" odd_collector_sdk",
" opendatadiscovery"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d563d368a3621641aa664a541b84c2408de50b2fff630ef16f3e789b9853d85a",
"md5": "a77afcb26cb80e5e8d229c089dbdd29c",
"sha256": "0773bf6e51d9c0569ff34a8b560ce9dea2d9ff2877e7b4b2e8f87df75419e993"
},
"downloads": -1,
"filename": "odd_collector_sdk-0.3.60-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a77afcb26cb80e5e8d229c089dbdd29c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 32432,
"upload_time": "2024-11-06T21:20:18",
"upload_time_iso_8601": "2024-11-06T21:20:18.885416Z",
"url": "https://files.pythonhosted.org/packages/d5/63/d368a3621641aa664a541b84c2408de50b2fff630ef16f3e789b9853d85a/odd_collector_sdk-0.3.60-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ea0b2f0c6d080ad28fe938d226973f3a929b9f69ad71f7ea5ea28505c486f870",
"md5": "34786b3b564dd18c6a308ce2ecd3d26e",
"sha256": "4e57756863f9f08c61be82c216c030fb727807accbdd0b16575a34e16eae1c70"
},
"downloads": -1,
"filename": "odd_collector_sdk-0.3.60.tar.gz",
"has_sig": false,
"md5_digest": "34786b3b564dd18c6a308ce2ecd3d26e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 33954,
"upload_time": "2024-11-06T21:20:20",
"upload_time_iso_8601": "2024-11-06T21:20:20.877660Z",
"url": "https://files.pythonhosted.org/packages/ea/0b/2f0c6d080ad28fe938d226973f3a929b9f69ad71f7ea5ea28505c486f870/odd_collector_sdk-0.3.60.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-06 21:20:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "opendatadiscovery",
"github_project": "odd-collector-sdk",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "odd-collector-sdk"
}