| Field | Value |
| --- | --- |
| Name | dbt-loom |
| Version | 0.7.2 |
| Summary | A dbt-core plugin to import public nodes in multi-project deployments. |
| Author | Nicholas Yager |
| Upload time | 2025-01-22 02:40:59 |
| Maintainer | None |
| Requires Python | <4.0,>=3.9 |
| License | None |
| Home page | None |
| Docs URL | None |
| Keywords | None |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |
# dbt-loom
[![pypi version shield](https://img.shields.io/pypi/v/dbt-loom)](https://img.shields.io/pypi/v/dbt-loom)
dbt-loom is a dbt Core plugin that weaves together multi-project deployments. dbt-loom works by fetching public model definitions from your dbt artifacts, and injecting those models into your dbt project.
```mermaid
flowchart LR
classDef black fill:#f2f2ebff, stroke:#000, color:#000
classDef background fill:#f2f2ebff, stroke:#000, color:#000
classDef hidden fill:#BADC3F, stroke:#BADC3F, color:#BADC3F
style TOP fill:#BADC3F, stroke:#000
subgraph TOP[Your Infrastructure]
direction TB
dbt_runtime[dbt Core]:::background
proprietary_plugin[Open Source Metadata Plugin]:::background
files[Local and Remote Files]:::background
object_storage[Object Storage]:::background
discovery_api[dbt Cloud APIs]:::background
discovery_api --> proprietary_plugin
files --> proprietary_plugin
object_storage --> proprietary_plugin
proprietary_plugin --> dbt_runtime
end
Project:::black --> TOP --> Warehouse:::black
```
dbt-loom currently supports obtaining model definitions from:
- Local manifest files
- Remote manifest files via http(s)
- dbt Cloud
- GCS
- S3-compatible object storage services
- Azure Storage
## Getting Started
To begin, install the `dbt-loom` Python package.
```console
pip install dbt-loom
```
Next, create a `dbt-loom` configuration file. This configuration file provides the paths for your
upstream project's manifest files.
```yaml
manifests:
- name: project_name # This should match the project's real name
type: file
config:
# A path to your manifest. This can be either a local path, or a remote
# path accessible via http(s).
path: path/to/manifest.json
```
By default, `dbt-loom` will look for `dbt_loom.config.yml` in your working directory. You can also set the
`DBT_LOOM_CONFIG` environment variable.
## How does it work?
As of dbt-core 1.6.0-b8, there exists a `dbtPlugin` class that defines functions which can
be called by dbt-core's `PluginManager`. At different points in the dbt-core lifecycle (such as graph linking and
manifest writing), the `PluginManager` invokes all plugins registered with the appropriate hook.
dbt-loom implements the `get_nodes` hook and uses a configuration file to parse manifests, identify public models, and
inject those public models when called by `dbt-core`.
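As a rough illustration of the core step behind that hook (a hypothetical sketch, not dbt-loom's actual implementation), consider reading an upstream manifest and selecting its public models. The sketch assumes dbt's standard manifest layout, where model nodes live under `"nodes"` and carry an `access` field:

```python
import json


def get_public_models(manifest_path: str) -> list[dict]:
    """Read a dbt manifest file and return its public model nodes.

    Hypothetical sketch: dbt manifests store model nodes under "nodes",
    and models with `access: public` are the ones eligible for
    cross-project references.
    """
    with open(manifest_path) as f:
        manifest = json.load(f)
    return [
        node
        for node in manifest.get("nodes", {}).values()
        if node.get("resource_type") == "model" and node.get("access") == "public"
    ]
```

A real plugin would then convert each selected node into the argument objects dbt-core expects, but the filtering shown here is the essential part.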
## Advanced Features
### Loading artifacts from remote sources
`dbt-loom` supports automatically fetching manifest artifacts from a variety
of remote sources.
#### Using dbt Cloud as an artifact source
You can use dbt-loom to fetch model definitions from dbt Cloud by setting up a `dbt-cloud` manifest in your `dbt-loom` config, and setting the `DBT_CLOUD_API_TOKEN` environment variable in your execution environment.
```yaml
manifests:
- name: project_name
type: dbt_cloud
config:
account_id: <YOUR DBT CLOUD ACCOUNT ID>
# Job ID pertains to the job that you'd like to fetch artifacts from.
job_id: <REFERENCE JOB ID>
api_endpoint: <DBT CLOUD ENDPOINT>
# dbt Cloud has multiple regions with different URLs. Update this to
# your appropriate dbt cloud endpoint.
step_id: <JOB STEP>
# If your job generates multiple artifacts, you can set the step from
# which to fetch artifacts. Defaults to the last step.
```
#### Using an S3-compatible object store as an artifact source
You can use dbt-loom to fetch manifest files from S3-compatible object stores
by setting up an `s3` manifest in your `dbt-loom` config. This approach supports all standard
boto3-compatible environment variables and authentication mechanisms; see the [boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#environment-variables) for more details.
```yaml
manifests:
- name: project_name
type: s3
config:
bucket_name: <YOUR S3 BUCKET NAME>
# The name of the bucket where your manifest is stored.
object_name: <YOUR OBJECT NAME>
# The object name of your manifest file.
```
#### Using GCS as an artifact source
You can use dbt-loom to fetch manifest files from Google Cloud Storage by setting up a `gcs` manifest in your `dbt-loom` config.
```yaml
manifests:
- name: project_name
type: gcs
config:
project_id: <YOUR GCP PROJECT ID>
# The alphanumeric ID of the GCP project that contains your target bucket.
bucket_name: <YOUR GCS BUCKET NAME>
# The name of the bucket where your manifest is stored.
object_name: <YOUR OBJECT NAME>
# The object name of your manifest file.
credentials: <PATH TO YOUR SERVICE ACCOUNT JSON CREDENTIALS>
# The OAuth2 Credentials to use. If not passed, falls back to the default inferred from the environment.
```
#### Using Azure Storage as an artifact source
You can use dbt-loom to fetch manifest files from Azure Storage
by setting up an `azure` manifest in your `dbt-loom` config. The `azure` type implements
the [DefaultAzureCredential](https://learn.microsoft.com/en-us/python/api/azure-identity/azure.identity.defaultazurecredential?view=azure-python)
class, supporting all environment variables and authentication mechanisms.
Alternatively, set the `AZURE_STORAGE_CONNECTION_STRING` environment variable to
authenticate via a connection string.
```yaml
manifests:
- name: project_name
type: azure
config:
account_name: <YOUR AZURE STORAGE ACCOUNT NAME> # The name of your Azure Storage account
container_name: <YOUR AZURE STORAGE CONTAINER NAME> # The name of your Azure Storage container
object_name: <YOUR OBJECT NAME> # The object name of your manifest file.
```
### Using environment variables
You can easily incorporate your own environment variables into the config file. This allows for dynamic configuration values that can change based on the environment. To specify an environment variable in the `dbt-loom` config file, use one of the following formats:
`${ENV_VAR}` or `$ENV_VAR`
#### Example:
```yaml
manifests:
- name: revenue
type: gcs
config:
project_id: ${GCP_PROJECT}
bucket_name: ${GCP_BUCKET}
object_name: ${MANIFEST_PATH}
```
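The substitution described above can be approximated with Python's standard library (a hedged sketch, not dbt-loom's actual parser): `os.path.expandvars` handles both the `${ENV_VAR}` and `$ENV_VAR` forms.

```python
import os


def expand_config_value(raw: str) -> str:
    """Expand $VAR and ${VAR} references using the process environment.

    Sketch only: dbt-loom's real parser may differ, but os.path.expandvars
    implements the same two syntaxes described above. References to unset
    variables are left in place rather than raising an error.
    """
    return os.path.expandvars(raw)
```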
### Gzipped files
`dbt-loom` natively supports decompressing gzipped manifest files. This is useful to reduce object storage size and to minimize loading times when reading manifests from object storage. Compressed file detection is triggered when the file path for the manifest is suffixed
with `.gz`.
```yaml
manifests:
- name: revenue
type: s3
config:
bucket_name: example_bucket_name
object_name: manifest.json.gz
```
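The suffix-based detection described above amounts to something like the following (a simplified sketch, not the plugin's actual loader): decompress only when the path ends in `.gz`, then parse the JSON either way.

```python
import gzip
import json
from pathlib import Path


def load_manifest_bytes(path: str, raw: bytes) -> dict:
    """Parse manifest bytes, decompressing first when the path ends in .gz.

    Sketch under the documented behavior: compression is inferred purely
    from the file suffix, not from inspecting the file contents.
    """
    if Path(path).suffix == ".gz":
        raw = gzip.decompress(raw)
    return json.loads(raw)
```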
### Exclude nested packages
In some circumstances, such as when running `dbt-project-evaluator`, you may not want a
given package in an upstream project to be imported into a downstream project.
You can exclude an upstream package from injection by adding its name to the
`excluded_packages` list in the downstream project's manifest configuration.
```yaml
manifests:
- name: revenue
type: file
config:
path: ../revenue/target/manifest.json
excluded_packages:
# Provide the string name of the package to exclude during injection.
- dbt_project_evaluator
```
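The exclusion step above can be sketched as a simple filter over candidate nodes (a hypothetical illustration; it assumes each node dict carries dbt's standard `package_name` field identifying the package that defines it):

```python
def filter_excluded_packages(nodes: list[dict], excluded_packages: list[str]) -> list[dict]:
    """Drop nodes whose owning package appears in the exclusion list.

    Sketch only: `package_name` is dbt's standard manifest field naming
    the package a node belongs to.
    """
    excluded = set(excluded_packages)
    return [node for node in nodes if node.get("package_name") not in excluded]
```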
## Known Caveats
Cross-project dependencies are a relatively new development, and dbt-core plugins
are still in beta. As such, there are a number of caveats to be aware of when using
this tool.
1. dbt plugins are only supported in dbt-core version 1.6.0-b8 and newer. This means you must be using a dbt adapter
compatible with this version.
2. `PluginNodeArgs` are not fully-realized dbt `ManifestNode`s, so documentation generated by `dbt docs generate` may
be sparse when viewing injected models.