| Field | Value |
| --- | --- |
| Name | dbt-af |
| Version | 0.12.0 |
| Summary | Distributed dbt runs on Apache Airflow |
| Author | Nikita Yurasov |
| Upload time | 2025-02-19 12:59:07 |
| Requires Python | <3.12,>=3.10 |
| License | Apache-2.0 |
| Keywords | python, airflow, dbt |
# dbt-af: distributed run of dbt models using Airflow
## Overview
**_dbt-af_** is a tool that allows you to run dbt models in a distributed manner using Airflow.
It acts as a wrapper around the Airflow DAG,
allowing you to run the models independently while preserving their dependencies.

### Why?
1. **_dbt-af_** is [domain-driven](https://www.datamesh-architecture.com/#what-is-data-mesh).
   It is designed to separate models from different domains into different DAGs,
   which allows you to run models from different domains in parallel.
2. **_dbt-af_** is a **dbt-first** solution.
   It is designed to make analysts' lives easier.
   End users don't even need to know that Airflow is used to schedule their models:
   the dbt model's config is the entry point for all settings and customizations.
3. **_dbt-af_** brings scheduling to dbt: from `@monthly` to `@hourly` and even [more](examples/manual_scheduling.md)
   (see the sketch after this list).
4. **_dbt-af_** is an ETL-driven tool.
   You can separate your models into tiers or ETL stages
   and build graphs showing the dependencies between models within each tier or stage.
5. **_dbt-af_** adds support for using different dbt targets simultaneously, different test scenarios, and
   maintenance tasks.
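
Point 3 in practice: the schedule lives next to the model itself. Below is a minimal sketch, assuming dbt-af picks
the schedule up from the model's `config` block; the `schedule` key and its accepted values should be checked against
the [examples](examples/README.md), as they are an assumption here:

```sql
-- models/my_domain/my_hourly_model.sql (hypothetical model)
-- NOTE: the `schedule` config key is an assumption; consult the
-- dbt-af examples for the exact per-model scheduling contract
{{
    config(
        materialized='table',
        schedule='@hourly',
    )
}}

select 1 as id
```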
## Installation
To install `dbt-af`, run `pip install dbt-af`.

To contribute, we recommend using `poetry` to install the package dependencies: run `poetry install --with=dev` to
install all of them.
## _dbt-af_ by Example
All tutorials and examples are located in the [examples](examples/README.md) folder.
To get basic Airflow DAGs for your dbt project, you need to put the following code into your `dags` folder:
```python
# LABELS: dag, airflow (it's required for airflow dag-processor)
from dbt_af.dags import compile_dbt_af_dags
from dbt_af.conf import Config, DbtDefaultTargetsConfig, DbtProjectConfig

# specify here all settings for your dbt project
config = Config(
    dbt_project=DbtProjectConfig(
        dbt_project_name='my_dbt_project',
        dbt_project_path='/path/to/my_dbt_project',
        dbt_models_path='/path/to/my_dbt_project/models',
        dbt_profiles_path='/path/to/my_dbt_project',
        dbt_target_path='/path/to/my_dbt_project/target',
        dbt_log_path='/path/to/my_dbt_project/logs',
        dbt_schema='my_dbt_schema',
    ),
    dbt_default_targets=DbtDefaultTargetsConfig(default_target='dev'),
    is_dev=False,  # set to True if you want to turn on dry-run mode
)

dags = compile_dbt_af_dags(
    manifest_path='/path/to/my_dbt_project/target/manifest.json',
    config=config,
)
for dag_name, dag in dags.items():
    globals()[dag_name] = dag
```
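
The final loop publishes each generated DAG into the module's global namespace. Airflow's dag-processor discovers
DAGs by scanning the top-level objects of files in the `dags` folder, so this step is what actually registers them.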
In _dbt_project.yml_ you need to set up default targets for all nodes in your project
(see [example](examples/dags/dbt_project.yml)):
```yaml
sql_cluster: "dev"
daily_sql_cluster: "dev"
py_cluster: "dev"
bf_cluster: "dev"
```
This will create Airflow DAGs for your dbt project.
Check out the documentation for more details [here](docs/docs.md).
## Features
1. **_dbt-af_** is designed, first and foremost, to work with large projects (1000+ models).
   When dealing with a significant number of dbt objects across different domains,
   it becomes crucial to have all DAGs auto-generated.
   **_dbt-af_** takes care of this by generating all the necessary DAGs for your dbt project and structuring them by
   domain.
2. Each dbt run is separated into its own Airflow task. All tasks receive a date interval from the Airflow DAG
   context. By using the passed date interval in your dbt models, you ensure the *idempotency* of your dbt runs
   (see the sketch after this list).
3. **_dbt-af_** lowers the entry threshold for non-infrastructure team members:
   analytics professionals, data scientists,
   and data engineers can focus on their dbt models and important business logic
   rather than spending time on Airflow DAGs.
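
To make point 2 concrete, here is a hypothetical incremental model that filters its source by the interval passed in
from Airflow. The variable names below are illustrative assumptions, not dbt-af's actual contract; check the
[docs](docs/docs.md) for the variables that are really passed to dbt:

```sql
-- hypothetical model: the var names below are illustrative only
{{ config(materialized='incremental') }}

select *
from {{ source('raw', 'events') }}
-- process only the slice for the current Airflow data interval,
-- so re-running the task for the same interval is idempotent
where event_dttm >= '{{ var("start_dttm") }}'
  and event_dttm <  '{{ var("end_dttm") }}'
```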
## Project Information
- [Docs](docs/docs.md)
- [PyPI](https://pypi.org/project/dbt-af/)
- [Contributing](CONTRIBUTING.md)