dbt-airflow-factory

Name: dbt-airflow-factory
Version: 0.35.0
Home page: https://github.com/getindata/dbt-airflow-factory/
Summary: Library to convert DBT manifest metadata to Airflow tasks
Upload time: 2023-09-08 14:48:14
Author: Piotr Pekala
Requires Python: >=3
License: Apache Software License (Apache 2.0)
Keywords: dbt, airflow, manifest, parser, python
Requirements: none recorded
# DBT Airflow Factory

[![Python Version](https://img.shields.io/badge/python-3.8%20%7C%203.9%20%7C%203.10%20%7C%203.11-blue)](https://github.com/getindata/dbt-airflow-factory)
[![PyPI Version](https://badge.fury.io/py/dbt-airflow-factory.svg)](https://pypi.org/project/dbt-airflow-factory/)
[![Downloads](https://pepy.tech/badge/dbt-airflow-factory)](https://pepy.tech/project/dbt-airflow-factory)
[![Maintainability](https://api.codeclimate.com/v1/badges/47fd3570c858b6c166ad/maintainability)](https://codeclimate.com/github/getindata/dbt-airflow-factory/maintainability)
[![Test Coverage](https://api.codeclimate.com/v1/badges/47fd3570c858b6c166ad/test_coverage)](https://codeclimate.com/github/getindata/dbt-airflow-factory/test_coverage)
[![Documentation Status](https://readthedocs.org/projects/dbt-airflow-factory/badge/?version=latest)](https://dbt-airflow-factory.readthedocs.io/en/latest/?badge=latest)

Library to convert DBT manifest metadata to Airflow tasks

## Documentation

Read the full documentation at [https://dbt-airflow-factory.readthedocs.io/](https://dbt-airflow-factory.readthedocs.io/en/latest/index.html)

## Installation

Use the package manager [pip][pip] to install the library:

```bash
pip install dbt-airflow-factory
```

## Usage

The library is intended to run inside an Airflow environment, together with a Kubernetes image that contains **dbt**.

**dbt-airflow-factory**'s main task is to parse `manifest.json` and create an Airflow DAG out of it. It also reads config
files from the `config` directory and is therefore highly customizable (e.g., the user can set the path to `manifest.json`).

To start, create a directory with the following structure, where `manifest.json` is the file generated by **dbt**:
```
.
├── config
│   ├── base
│   │   ├── airflow.yml
│   │   ├── dbt.yml
│   │   └── k8s.yml
│   └── dev
│       └── dbt.yml
├── dag.py
└── manifest.json
```
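`manifest.json` is written by **dbt** into its `target/` directory whenever you run a command such as `dbt compile` or `dbt run`. As a rough sketch (the `dbt_project` location and the copy step below are assumptions for illustration, not part of the library), you could place it next to `dag.py` like this:
```python
# Sketch: copy dbt's generated manifest next to dag.py.
# Assumption: the dbt project lives in ./dbt_project and `dbt compile`
# has already produced dbt_project/target/manifest.json.
import shutil
from pathlib import Path

project_dir = Path("dbt_project")   # hypothetical dbt project location
dag_dir = Path(".")                 # the directory shown in the tree above

shutil.copy(project_dir / "target" / "manifest.json", dag_dir / "manifest.json")
```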

Then, put the following code into `dag.py`:
```python
from dbt_airflow_factory.airflow_dag_factory import AirflowDagFactory
from os import path

dag = AirflowDagFactory(path.dirname(path.abspath(__file__)), "dev").create()
```

When uploaded to the Airflow DAGs directory, the file is picked up by Airflow; the factory then parses `manifest.json` and prepares a DAG to run.
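
The `"dev"` argument selects the environment-specific configuration directory (here `config/dev`) layered on top of `config/base`. If you prefer to pick the environment at parse time instead of hard-coding it, one option (an assumption for illustration; only `AirflowDagFactory(...).create()` is shown above) is to read it from an Airflow Variable:
```python
# Sketch: choose the config environment from an Airflow Variable named "env".
# Assumption: such a Variable exists; we fall back to "dev" otherwise.
from os import path

from airflow.models import Variable
from dbt_airflow_factory.airflow_dag_factory import AirflowDagFactory

env = Variable.get("env", default_var="dev")
dag = AirflowDagFactory(path.dirname(path.abspath(__file__)), env).create()
```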

### Configuration files

It is best to look at the example configuration files in the [tests directory][tests] to get a glimpse of correct configs.

You can use [Airflow template variables][airflow-vars] in your `dbt.yml` and `k8s.yml` files, as long as they are inside
quotation marks:
```yaml
target: "{{ var.value.env }}"
some_other_field: "{{ ds_nodash }}"
```

Analogously, you can use `"{{ var.value.VARIABLE_NAME }}"` in `airflow.yml`, but only the Airflow variable getter is supported there;
any other Airflow template variable will not work in `airflow.yml`.
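
For illustration, any Airflow Variable referenced this way (e.g. `env` in the snippet above) has to exist in your Airflow instance first; it can be created with the CLI (`airflow variables set env dev`) or programmatically (a minimal sketch, assuming it runs inside the Airflow environment with access to the metadata database):
```python
# Sketch: define the "env" Variable that "{{ var.value.env }}" resolves to.
# Equivalent to running `airflow variables set env dev` from the CLI.
from airflow.models import Variable

Variable.set("env", "dev")
assert Variable.get("env") == "dev"
```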

### Creation of the directory with data-pipelines-cli

**DBT Airflow Factory** works best in tandem with the [data-pipelines-cli][dp-cli] tool. **dp** not only prepares the directory
for the library to digest, but also automates Docker image building and pushes the generated directory to the cloud storage
of your choice.

[airflow-vars]: https://airflow.apache.org/docs/apache-airflow/stable/templates-ref.html#variables
[dp-cli]: https://pypi.org/project/data-pipelines-cli/
[pip]: https://pip.pypa.io/en/stable/
[tests]: https://github.com/getindata/dbt-airflow-factory/tree/develop/tests/config
            
