- Name: dbt_coves
- Version: 1.7.1
- Home page: https://datacoves.com
- Summary: CLI tool for dbt users adopting analytics engineering best practices.
- Upload time: 2024-02-19 21:08:42
- Author: Datacoves
- Requires Python: >=3.8,<3.12
- License: Apache 2.0
- Keywords: data engineering, analytics engineering, dbt, etl, data modelling

# dbt-coves

## Sponsor

<picture>
  <source media="(prefers-color-scheme: dark)" srcset="images/datacoves-dark.png">
  <img alt="Datacoves" src="images/datacoves-light.png" width="150">
</picture>

Hosted VS Code, dbt-core, SQLFluff, and Airflow. Find out more at [Datacoves.com](https://datacoves.com/product).

## What is dbt-coves?

dbt-coves is a CLI tool that automates certain tasks for [dbt](https://www.getdbt.com), making life simpler for the dbt user.

dbt-coves generates dbt sources, staging models, and property (yml) files by analyzing information from the data warehouse and creating the necessary files (sql and yml).

Finally, dbt-coves includes functionality to bootstrap a dbt project and to extract and load configurations from Airbyte and Fivetran.

## Supported dbt versions

| Version | Status           |
| ------- | ---------------- |
| \< 1.0  | ❌ Not supported |
| >= 1.0  | ✅ Tested        |

From `dbt-coves` 1.4.0 onwards, our major and minor versions match those of [dbt-core](https://github.com/dbt-labs/dbt-core).
This means we release a new major/minor version once its dbt-core equivalent is tested.
The patch suffix (1.4.X) is exclusive to our continuous development and does not reflect a version match with dbt.
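
The versioning rule above can be checked mechanically. The following is a minimal sketch, assuming plain `major.minor.patch` version strings; `same_minor_series` is a hypothetical helper written for illustration and is not part of dbt-coves.

```python
def same_minor_series(dbt_coves_version: str, dbt_core_version: str) -> bool:
    """Return True when both versions share the same major and minor numbers."""
    coves_major, coves_minor = dbt_coves_version.split(".")[:2]
    core_major, core_minor = dbt_core_version.split(".")[:2]
    return (coves_major, coves_minor) == (core_major, core_minor)


# The patch number is ignored: it only tracks dbt-coves' own releases.
print(same_minor_series("1.7.1", "1.7.4"))  # True
print(same_minor_series("1.7.1", "1.6.9"))  # False
```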

## Supported adapters

| Feature                           | Snowflake | Redshift  | BigQuery  |
| --------------------------------- | --------- | --------- | --------- |
| dbt project setup                 | ✅ Tested | ✅ Tested | ✅ Tested |
| source model (sql) generation     | ✅ Tested | ✅ Tested | ✅ Tested |
| model properties (yml) generation | ✅ Tested | ✅ Tested | ✅ Tested |

NOTE: Other database adapters may work, although we have not tested them. Feel free to try them and let us know so we can update the table above.

### Here's the tool in action

[![image](https://cdn.loom.com/sessions/thumbnails/74062cf71cbe4898805ca508ea2d9455-1624905546029-with-play.gif)](https://www.loom.com/share/74062cf71cbe4898805ca508ea2d9455)

# Installation

```console
pip install dbt-coves
```

We recommend using [Python
virtualenvs](https://docs.python.org/3/tutorial/venv.html) and creating
a separate environment per project.

# Command Reference

For a complete list of options, please run:

```console
dbt-coves -h
dbt-coves <command> -h
```

## Environment setup

You can configure different components:

Set up `git` repository of dbt-coves project

```console
dbt-coves setup git
```

Set up `dbt` within the project (delegates to dbt init)

```console
dbt-coves setup dbt
```

Set up SSH keys for the dbt project. Supports the `--open_ssl_public_key` argument, which generates an extra public key in OpenSSL format, useful for configuring certain providers (e.g. Snowflake authentication)

```console
dbt-coves setup ssh
```

Set up pre-commit for your dbt project. Here you can configure the tools that we consider essential for proper dbt usage: `sqlfluff`, `yaml-lint`, and `dbt-checkpoint`

```console
dbt-coves setup precommit
```

## Models generation

```console
dbt-coves generate <resource>
```

Where _\<resource\>_ could be _sources_, _properties_, _metadata_, _docs_ or _airflow-dags_.

```console
dbt-coves generate sources
```

This command generates the dbt source configuration as well as the initial dbt staging model(s). By default it looks in the database defined in your `profiles.yml` file. Alternatively, you can pass the `--database` argument or set up default configuration options (see below):

```console
dbt-coves generate sources --database raw
```

Jinja templates can be used to adjust how the resources are generated. See below for examples.

Every `dbt-coves generate <resource>` command supports the `--no-prompt` flag, which silently generates all sources/models/properties/metadata without prompting the user.

### Source Generation Arguments

dbt-coves can be used to create the initial staging models. It will do the following:

1. Create / Update the source yml file
2. Create the initial staging model(sql) file and offer to flatten VARIANT(JSON) fields
3. Create the staging model's property(yml) file.

`dbt-coves generate sources` supports the args listed below. To see the full list, run:

```console
dbt-coves generate sources -h
```

```console
--database
# Database to inspect
```

```console
--schemas
# Schema(s) to inspect. Accepts wildcards (must be enclosed in quotes if used)
```

```console
--select-relations
# List of relations where raw data resides. The parameter must be enclosed in quotes. Accepts wildcards.
```

```console
--exclude-relations
# Filter relation(s) to exclude from source file(s) generation. The parameter must be enclosed in quotes. Accepts wildcards.
```

```console
--sources-destination
# Where sources yml files will be generated, default: 'models/staging/{{schema}}/sources.yml'
```

```console
--models-destination
# Where models sql files will be generated, default: 'models/staging/{{schema}}/{{relation}}.sql'
```

```console
--model-props-destination
# Where models yml files will be generated, default: 'models/staging/{{schema}}/{{relation}}.yml'
```

```console
--update-strategy
# Action to perform when a property file already exists: 'update', 'recreate', 'fail', 'ask' (per file)
```

```console
--templates-folder
# Folder with jinja templates that override default sources generation templates, i.e. 'templates'
```

```console
--metadata
# Path to csv file containing metadata, i.e. 'metadata.csv'
```

```console
--flatten-json-fields
# Action to perform when JSON fields exist: 'yes', 'no', 'ask' (per file)
```

```console
--overwrite-staging-models
# Flag: overwrite existing staging (SQL) files
```

```console
--skip-model-props
# Flag: don't create model's property (yml) files
```

```console
--no-prompt
# Silently generate source dbt models
```
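
To illustrate how wildcard patterns such as those accepted by `--select-relations` and `--exclude-relations` narrow down a list of relations, here is a minimal sketch. It assumes shell-style matching as implemented by Python's `fnmatch` module; the exact matching rules used by dbt-coves may differ, and `filter_relations` is a hypothetical helper. The patterns must be quoted on the command line so the shell does not expand them before dbt-coves receives them.

```python
from fnmatch import fnmatchcase


def filter_relations(relations, select=("*",), exclude=()):
    """Keep relations matching any `select` pattern and no `exclude` pattern."""
    return [
        name
        for name in relations
        if any(fnmatchcase(name, pattern) for pattern in select)
        and not any(fnmatchcase(name, pattern) for pattern in exclude)
    ]


relations = ["orders", "order_items", "customers", "tmp_orders"]
print(filter_relations(relations, select=["order*"], exclude=["*_items"]))
# ['orders']
```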

### Properties Generation Arguments

You can use dbt-coves to generate and update the properties (yml) file for a given dbt model (sql) file.

`dbt-coves generate properties` supports the following args:

```console
--destination
# Where models yml files will be generated, default: '{{model_folder_path}}/{{model_file_name}}.yml'
```

```console
--update-strategy
# Action to perform when a property file already exists: 'update', 'recreate', 'fail', 'ask' (per file)
```

```console
-s --select
# Filter model(s) to generate property file(s)
```

```console
--exclude
# Filter model(s) to exclude from property file(s) generation
```

```console
--selector
# Specify dbt selector for more complex model filtering
```

```console
--templates-folder
# Folder with jinja templates that override default properties generation templates, i.e. 'templates'
```

```console
--metadata
# Path to csv file containing metadata, i.e. 'metadata.csv'
```

```console
--no-prompt
# Silently generate dbt models property files
```

Note: `--select` (or `-s`), `--exclude` and `--selector` work exactly as `dbt ls` selectors do. For usage details, visit the [dbt list docs](https://docs.getdbt.com/reference/commands/list).

### Metadata Generation Arguments

You can use dbt-coves to generate metadata file(s) containing the basic structure of the csv that the `dbt-coves generate sources` and `dbt-coves generate properties` commands above accept.
Usage of these metadata files is described in [metadata](https://github.com/datacoves/dbt-coves#metadata) below.

`dbt-coves generate metadata` supports the following args:

```console
--database
# Database to inspect
```

```console
--schemas
# Schema(s) to inspect. Accepts wildcards (must be enclosed in quotes if used)
```

```console
--select-relations
# List of relations where raw data resides. The parameter must be enclosed in quotes. Accepts wildcards.
```

```console
--exclude-relations
# Filter relation(s) to exclude from source file(s) generation. The parameter must be enclosed in quotes. Accepts wildcards.
```

```console
--destination
# Where csv file(s) will be generated, default: 'metadata.csv'
# Supports using the Jinja tags `{{relation}}` and `{{schema}}`
# if creating one csv per relation/table in schema, i.e: "metadata/{{relation}}.csv"
```

```console
--no-prompt
# Silently generate metadata
```
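
The `{{relation}}` and `{{schema}}` tags in destination paths are resolved once per relation. dbt-coves uses Jinja for this; the sketch below imitates the substitution with the standard library only, so it stays dependency-free. `render_destination` is a hypothetical helper, not a dbt-coves function.

```python
import re


def render_destination(template: str, **values: str) -> str:
    """Replace each `{{name}}` tag with the matching keyword value."""

    def substitute(match: re.Match) -> str:
        # Raises KeyError if the template uses a tag that was not supplied.
        return values[match.group(1)]

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(render_destination("metadata/{{relation}}.csv", relation="orders"))
# metadata/orders.csv
print(
    render_destination(
        "models/staging/{{schema}}/{{relation}}.sql", schema="raw", relation="orders"
    )
)
# models/staging/raw/orders.sql
```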

### Metadata

dbt-coves supports the argument `--metadata` which allows users to specify a csv file containing field types and descriptions to be used when creating the staging models and property files.

```console
dbt-coves generate sources --metadata metadata.csv
```

Metadata format (you can download a [sample csv file](sample_metadata.csv) as a reference):

| database | schema | relation                          | column          | key  | type    | description                                     |
| -------- | ------ | --------------------------------- | --------------- | ---- | ------- | ----------------------------------------------- |
| raw      | raw    | \_airbyte_raw_country_populations | \_airbyte_data  | Year | integer | Year of country population measurement          |
| raw      | raw    | \_airbyte_raw_country_populations | \_airbyte_data  |      | variant | Airbyte data columns (VARIANT) in Snowflake     |
| raw      | raw    | \_airbyte_raw_country_populations | \_airbyte_ab_id |      | varchar | Airbyte unique identifier used during data load |
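
A metadata file with the columns shown in the table can be produced with Python's `csv` module. This is a sketch only: it assumes the csv header uses exactly the column names from the table above, so treat the linked sample file as the authoritative reference.

```python
import csv
import io

FIELDS = ["database", "schema", "relation", "column", "key", "type", "description"]

rows = [
    {
        "database": "raw",
        "schema": "raw",
        "relation": "_airbyte_raw_country_populations",
        "column": "_airbyte_data",
        "key": "Year",  # nested key inside the VARIANT column
        "type": "integer",
        "description": "Year of country population measurement",
    },
    {
        "database": "raw",
        "schema": "raw",
        "relation": "_airbyte_raw_country_populations",
        "column": "_airbyte_ab_id",
        "key": "",  # empty for regular (non-nested) columns
        "type": "varchar",
        "description": "Airbyte unique identifier used during data load",
    },
]

# Write to an in-memory buffer; use open("metadata.csv", "w", newline="") for a file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
metadata_csv = buffer.getvalue()
print(metadata_csv)
```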

### Docs generation arguments

You can use dbt-coves to improve the standard dbt docs generation process. It generates your dbt docs and updates external links so they always open in a new tab. It can also merge the production `catalog.json` into the local environment when running in deferred mode, so you can run [dbt-checkpoint](https://github.com/dbt-checkpoint/dbt-checkpoint) hooks even when the model has not been run locally.

`dbt-coves generate docs` supports the following args:

```console
--merge-deferred
# Merge a deferred catalog.json into your generated one.
# Flag: no value required.
```

```console
--state
# Directory where your production catalog.json is located
# Mandatory when using --merge-deferred
```
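
Conceptually, merging a deferred catalog fills the gaps in the locally generated `catalog.json` with entries from the production one. The sketch below shows that idea on plain dictionaries. It assumes that local entries take precedence and that only the `nodes` and `sources` sections are merged; the actual merge logic in dbt-coves may differ, and `merge_deferred_catalog` is a hypothetical name.

```python
def merge_deferred_catalog(local: dict, production: dict) -> dict:
    """Fill gaps in the local catalog with entries from the production one."""
    merged = dict(local)
    for section in ("nodes", "sources"):
        # Local entries win; production only supplies what is missing locally.
        merged[section] = {**production.get(section, {}), **local.get(section, {})}
    return merged


local = {"nodes": {"model.demo.orders": {"columns": {"id": {}}}}, "sources": {}}
production = {
    "nodes": {
        "model.demo.orders": {"columns": {}},
        "model.demo.customers": {"columns": {"id": {}}},
    },
    "sources": {},
}
merged = merge_deferred_catalog(local, production)
print(sorted(merged["nodes"]))
# ['model.demo.customers', 'model.demo.orders']
```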

### Generate airflow-dags

```console
dbt-coves generate airflow-dags
```

Translate YML files into their Airflow Python code equivalent. With this, DAGs can be easily written with some `key:value` pairs.

The basic structure of these YMLs must consist of:

- Global configurations (description, schedule_interval, tags, catchup, etc.)
- `default_args`
- `nodes`: where tasks and task groups are defined
  - each node is a nested object, with its `name` as key and its configuration as values.
    - this configuration must cover:
      - `type`: 'task' or 'task_group'
      - `operator`: Airflow operator that will run the tasks (full _module.class_ naming)
      - `dependencies`: the task(s) this node depends on, if any
      - any `key:value` pair of [Operator arguments](https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/index.html)

#### Airflow DAG Generators

When a YML Dag `node` is of type `task_group`, **Generators** can be used instead of `Operators`.

They are custom classes that receive YML `key:value` pairs and return one or more tasks for the respective task group. Any pair other than `type: task_group` is passed to the specified `generator`, which is responsible for returning any number of `task_name = Operator(params)` statements.

We provide some prebuilt Generators:

- `AirbyteGenerator`: creates `AirbyteTriggerSyncOperator` tasks (one per Airbyte connection)
  - It must receive Airbyte's `host` and `port`, `airbyte_conn_id` (Airbyte's connection name on Airflow) and a `connection_ids` list of Airbyte connections to sync
- `FivetranGenerator`: creates `FivetranOperator` tasks (one per Fivetran connection)
  - It must receive Fivetran's `api_key`, `api_secret` and a `connection_ids` list of Fivetran connectors to sync. It can optionally receive `wait_for_completion: true`, in which case two tasks are created for each sync: a `FivetranOperator` and its respective `FivetranSensor`, which monitors the status of the sync.
- `AirbyteDbtGenerator` and `FivetranDbtGenerator`: instead of receiving Airbyte or Fivetran connections, they use dbt to discover those IDs. Apart from their parent Generators' mandatory fields, they can receive:
  - `dbt_project_path`: dbt/project/folder
  - `virtualenv_path`: path to a virtualenv in case dbt has to be run with another Python executable
  - `run_dbt_compile`: true/false
  - `run_dbt_deps`: true/false

#### Basic YML DAG example:

```yaml
description: "dbt-coves DAG"
schedule_interval: "@hourly"
tags:
  - version_01
default_args:
  start_date: 2023-01-01
catchup: false
nodes:
  airbyte_dbt:
    type: task_group
    tooltip: "Sync dbt-related Airbyte connections"
    generator: AirbyteDbtGenerator
    host: http://localhost
    port: 8000
    dbt_project_path: /path/to/dbt_project
    virtualenv_path: /virtualenvs/dbt_160
    run_dbt_compile: true
    run_dbt_deps: false
    airbyte_conn_id: airbyte_connection
  task_1:
    operator: airflow.operators.bash.BashOperator
    bash_command: "echo 'This runs after airbyte tasks'"
    dependencies: ["airbyte_dbt"]
```
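
The `dependencies` key determines the order in which nodes run. As an illustration of that ordering (not of how dbt-coves generates the DAG), the sketch below represents the two nodes from the example as a Python dict and sorts them with the standard library's `graphlib`, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Trimmed-down copy of the `nodes` section from the YML example above.
nodes = {
    "airbyte_dbt": {"type": "task_group", "generator": "AirbyteDbtGenerator"},
    "task_1": {
        "operator": "airflow.operators.bash.BashOperator",
        "dependencies": ["airbyte_dbt"],
    },
}

# Map each node to the set of nodes that must finish before it starts.
graph = {name: set(config.get("dependencies", [])) for name, config in nodes.items()}
execution_order = list(TopologicalSorter(graph).static_order())
print(execution_order)  # ['airbyte_dbt', 'task_1']
```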

##### Create your custom Generator

You can create your own DAG Generator. Any `key:value` pair specified in the YML DAG will be passed to its constructor.

This Generator needs:

- an `imports` attribute: a list of the _module.class_ paths of the Operators used by the tasks it outputs
- a `generate_tasks` method that returns the set of `"task_name = Operator()"` strings to write as the task group's tasks.

```python
class PostgresGenerator():
    def __init__(self) -> None:
        """ Any key:value pair in the YML Dag will get here """
        self.imports = ["airflow.providers.postgres.operators.postgres.PostgresOperator"]

    def generate_tasks(self):
        """ Use your custom logic and return N `name = PostgresOperator()` strings """
        raise NotImplementedError
```
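
For a more concrete picture, here is a sketch of a filled-in generator. It rests on two assumptions that should be checked against the prebuilt generators: that the YML `key:value` pairs arrive as keyword arguments, and that `generate_tasks` may return a list of statement strings. `EchoGenerator` and its `messages` parameter are hypothetical names used only for illustration.

```python
class EchoGenerator:
    """Hypothetical generator that emits one BashOperator task per message."""

    def __init__(self, messages=None, **kwargs) -> None:
        # Assumption: YML key:value pairs arrive here as keyword arguments.
        self.imports = ["airflow.operators.bash.BashOperator"]
        self.messages = list(messages or [])
        self.extra_config = kwargs  # any other pairs from the YML node

    def generate_tasks(self):
        # Assumption: a list of "task_name = Operator(...)" strings is accepted.
        tasks = []
        for index, message in enumerate(self.messages):
            command = f"echo {message}"
            tasks.append(
                f"echo_{index} = BashOperator(task_id='echo_{index}', bash_command={command!r})"
            )
        return tasks


generator = EchoGenerator(messages=["hello"], tooltip="Echo tasks")
for statement in generator.generate_tasks():
    print(statement)
# echo_0 = BashOperator(task_id='echo_0', bash_command='echo hello')
```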

### airflow-dags generation arguments

`dbt-coves generate airflow-dags` supports the following args:

```console
--yml-path --yaml-path
# Path to the folder containing YML files to translate into Python DAGs

--dag-path
# Path to the folder where Python DAGs will be generated.

--validate-operators
# Ensure Airflow operators are installed by trying to import them before writing to Python.
# Flag: no value required

--generators-folder
# Path to your Python module with custom Generators

--generators-params
# Object with default values for the desired Generator(s)
# For example: {"AirbyteGenerator": {"host": "http://localhost", "port": "8000"}}

--secrets-path
# Secret files location for DAG configuration, i.e. 'yml_path/secrets/'
# Secret content must match the YML dag spec of `nodes -> node_name -> config`
```

## Extract configuration from Airbyte

```console
dbt-coves extract airbyte
```

Extracts the configuration from your Airbyte sources, connections and destinations (excluding credentials) and stores it in the specified folder. The main goal of this feature is to keep track of configuration changes in your git repo and to roll back to a specific version when needed.

Full usage example:

```console
dbt-coves extract airbyte --host http://airbyte-server --port 8001 --path /config/workspace/load/airbyte
```

## Load configuration to Airbyte

```console
dbt-coves load airbyte
```

Loads the Airbyte configuration generated with `dbt-coves extract airbyte` on an Airbyte server. Secrets folder needs to be specified separately. You can use [git-secret](https://git-secret.io/) to encrypt secrets and make them part of your git repo.

### Loading secrets

Secret credentials can be supplied in two different ways: locally or remotely (through a provider/manager).

In order to load encrypted fields locally:

```console
dbt-coves load airbyte --secrets-path /path/to/secret/directory

# This directory must have 'sources', 'destinations' and 'connections' folders nested inside, and inside them the respective JSON files with unencrypted fields.
# Naming convention: JSON unencrypted secret files must be named exactly as the extracted ones.
```
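
The directory layout described in the comments above can be prepared with a few lines of Python. This is a sketch under assumptions: the file name `snowflake.json` and the `password` field are hypothetical placeholders, and real secret files must be named exactly like the extracted JSON files they complement.

```python
import json
import tempfile
from pathlib import Path

SECRET_KINDS = {"sources", "destinations", "connections"}


def write_secret(secrets_dir: Path, kind: str, file_name: str, fields: dict) -> Path:
    """Write one unencrypted secret file under sources/, destinations/ or connections/."""
    if kind not in SECRET_KINDS:
        raise ValueError(f"Unknown secret kind: {kind}")
    target = secrets_dir / kind / file_name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(fields, indent=2))
    return target


# A temporary directory stands in for /path/to/secret/directory.
secrets_dir = Path(tempfile.mkdtemp())
path = write_secret(secrets_dir, "destinations", "snowflake.json", {"password": "[PASSWORD]"})
print(path.relative_to(secrets_dir))
```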

To load encrypted fields through a manager (in this case we are connecting to Datacoves' Service Credentials):

```console
--secrets-manager datacoves
```

```console
--secrets-url https://api.datacoves.localhost/service-credentials/airbyte
```

```console
--secrets-token <secret token>
```

Full usage example:

```console
dbt-coves load airbyte --host http://airbyte-server --port 8001 --path /config/workspace/load/airbyte --secrets-path /config/workspace/secrets
```

## Extract configuration from Fivetran

```console
dbt-coves extract fivetran
```

Extracts the configuration from your Fivetran destinations and connectors (excluding credentials) and stores it in the specified folder. The main goal of this feature is to keep track of configuration changes in your git repo and to roll back to a specific version when needed.

Full usage example:

```console
dbt-coves extract fivetran --credentials /config/workspace/secrets/fivetran/credentials.yml --path /config/workspace/load/fivetran
```

## Load configuration to Fivetran

```console
dbt-coves load fivetran
```

Loads the Fivetran configuration generated with `dbt-coves extract fivetran` on a Fivetran instance. Secrets folder needs to be specified separately. You can use [git-secret](https://git-secret.io/) to encrypt secrets and make them part of your git repo.

### Credentials

For extract/load fivetran to work properly, you need to provide an API key/secret pair (you can generate them [here](https://fivetran.com/account/settings/account)).

These credentials can be passed with `--api-key [key] --api-secret [secret]`, or by specifying a YML file with `--credentials /path/to/credentials.yml`. The required structure of this file is the following:

```yaml
account_name: # Any name, used by dbt-coves to ask which to use when more than one is found
  api_key: [key]
  api_secret: [secret]
account_name_2:
  api_key: [key]
  api_secret: [secret]
```

This YML file approach allows you to both work with multiple Fivetran accounts, and treat this credentials file as a secret.

> :warning: **Warning**: --api-key/secret and --credentials flags are mutually exclusive, don't use them together.

### Loading secrets

Secret credentials can be supplied via the `--secrets-path` flag:

```console
dbt-coves load fivetran --secrets-path /path/to/secret/directory
```

#### Field naming convention

Although secret files can have any name, unencrypted JSON files must follow a simple structure:

- Keys should match their corresponding Fivetran destination ID: two words automatically generated by Fivetran, which can be found in previously extracted data.
- Each key holds a nested dictionary of the fields that should be overwritten

For example:

```json
{
  "extract_muscle": {
    // Internal ID that Fivetran gave to a Snowflake warehouse Destination
    "password": "[PASSWORD]" // Field:Value pair
  },
  "centre_straighten": {
    "password": "[PASSWORD]"
  }
}
```
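
The overwrite behaviour described above amounts to merging the secret fields into the matching destination by ID. The sketch below shows this on plain dictionaries; the flat shape of the extracted configuration and the `user` field are assumptions made for illustration, and `apply_secrets` is a hypothetical helper rather than a dbt-coves function.

```python
import copy


def apply_secrets(destinations: dict, secrets: dict) -> dict:
    """Return a copy of `destinations` with secret fields overwritten per destination ID."""
    merged = copy.deepcopy(destinations)
    for destination_id, fields in secrets.items():
        merged.setdefault(destination_id, {}).update(fields)
    return merged


# Hypothetical extracted configuration with a masked password.
extracted = {"extract_muscle": {"user": "loader", "password": "******"}}
secrets = {"extract_muscle": {"password": "[PASSWORD]"}}
print(apply_secrets(extracted, secrets))
# {'extract_muscle': {'user': 'loader', 'password': '[PASSWORD]'}}
```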

## Run dbt commands

```shell
dbt-coves dbt <arguments> -- <command>
```

Run dbt commands in special environments such as Airflow or CI workers, with the option of changing the dbt project location and activating a specific virtual environment in which to run the commands.

### Arguments

`dbt-coves dbt` supports the following arguments:

```shell
--project-dir
# Path of the dbt project where command will be executed, i.e.: /opt/user/dbt_project
```

```shell
--virtualenv
# Virtual environment path. i.e.: /opt/user/virtualenvs/airflow
```

### Sample usage

```shell
dbt-coves dbt --project-dir /opt/user/dbt_project --virtualenv /opt/user/virtualenvs/airflow -- run -s model --vars \"{key: value}\"
# Make sure to escape special characters such as quotation marks
# The double dash (--) between <arguments> and <command> is mandatory
```
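
When the command is launched from Python (for example from a CI script), building it as an argument list keeps the mandatory `--` separator in the right place. This is a minimal sketch using only the flags documented above; `build_dbt_coves_command` is a hypothetical helper, and dbt arguments that contain quotation marks still need the escaping described in the sample usage.

```python
import shlex


def build_dbt_coves_command(project_dir: str, virtualenv: str, dbt_command: list) -> list:
    """Assemble `dbt-coves dbt <arguments> -- <command>` as an argument list."""
    return [
        "dbt-coves",
        "dbt",
        "--project-dir",
        project_dir,
        "--virtualenv",
        virtualenv,
        "--",  # separates dbt-coves arguments from the dbt command
        *dbt_command,
    ]


command = build_dbt_coves_command(
    "/opt/user/dbt_project",
    "/opt/user/virtualenvs/airflow",
    ["run", "-s", "model"],
)
print(shlex.join(command))
# To execute it: subprocess.run(command, check=True)
```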

# Settings

dbt-coves reads settings from `.dbt_coves/config.yml`. A standard settings file could look like this:

```yaml
generate:
  sources:
    database: "RAW" # Database where to look for source tables
    schemas: # List of schema names where to look for source tables
      - RAW
    select_relations: # list of relations where raw data resides
      - TABLE_1
      - TABLE_2
    exclude_relations: # Filter relation(s) to exclude from source file(s) generation
      - TABLE_1
      - TABLE_2
    sources_destination: "models/staging/{{schema}}/{{schema}}.yml" # Where sources yml files will be generated
    models_destination: "models/staging/{{schema}}/{{relation}}.sql" # Where models sql files will be generated
    model_props_destination: "models/staging/{{schema}}/{{relation}}.yml" # Where models yml files will be generated
    update_strategy: ask # Action to perform when a property file already exists. Options: update, recreate, fail, ask (per file)
    templates_folder: ".dbt_coves/templates" # Folder where source generation jinja templates are located. Override default templates creating  source_props.yml, source_model_props.yml, and source_model.sql under this folder
    metadata: "metadata.csv" # Path to csv file containing metadata
    flatten_json_fields: ask

  properties:
    destination: "{{model_folder_path}}/{{model_file_name}}.yml" # Where models yml files will be generated
    # You can specify a different path by declaring it explicitly, i.e.: "models/staging/{{model_file_name}}.yml"
    update_strategy: ask # Action to perform when a property file already exists. Options: update, recreate, fail, ask (per file)
    select: "models/staging/bays" # Filter model(s) to generate property file(s)
    exclude: "models/staging/bays/test_bay" # Filter model(s) to exclude from property file(s) generation
    selector: "selectors/bay_selector.yml" # Specify dbt selector for more complex model filtering
    templates_folder: ".dbt_coves/templates" # Folder where properties generation jinja templates are located. Override default template creating model_props.yml under this folder
    metadata: "metadata.csv" # Path to csv file containing metadata

  metadata:
    database: RAW # Database where to look for source tables
    schemas: # List of schema names where to look for source tables
      - RAW
    select_relations: # list of relations where raw data resides
      - TABLE_1
      - TABLE_2
    exclude_relations: # Filter relation(s) to exclude from source file(s) generation
      - TABLE_1
      - TABLE_2
    destination: # Where metadata file will be generated, default: 'metadata.csv'

  docs:
    merge_deferred: true
    state: logs/
    dbt_args: "--no-compile --select foo --exclude bar"

  airflow_dags:
    yml_path:
    dags_path:
    generators_params:
      AirbyteDbtGenerator:
        host: "{{ env_var('AIRBYTE_HOST_NAME') }}"
        port: "{{ env_var('AIRBYTE_PORT') }}"
        airbyte_conn_id: airbyte_connection

        dbt_project_path: "{{ env_var('DBT_HOME') }}"
        run_dbt_compile: true
        run_dbt_deps: false

extract:
  airbyte:
    path: /config/workspace/load/airbyte # Where json files will be generated
    host: http://airbyte-server # Airbyte's API hostname
    port: 8001 # Airbyte's API port
  fivetran:
    path: /config/workspace/load/fivetran # Where Fivetran export will be generated
    api_key: [KEY] # Fivetran API Key
    api_secret: [SECRET] # Fivetran API Secret
    credentials: /opt/fivetran_credentials.yml # Fivetran set of key:secret pairs
    # 'api_key' + 'api_secret' are mutually exclusive with 'credentials', use one or the other

load:
  airbyte:
    path: /config/workspace/load
    host: http://airbyte-server
    port: 8001
    secrets_manager: datacoves # (optional) Secret credentials provider (secrets_path OR secrets_manager should be used, can't load secrets locally and remotely at the same time)
    secrets_path: /config/workspace/secrets # (optional) Secret files location if secrets_manager was not specified
    secrets_url: https://api.datacoves.localhost/service-credentials/airbyte # Secrets url if secrets_manager is datacoves
    secrets_token: <TOKEN> # Secrets auth token if secrets_manager is datacoves
  fivetran:
    path: /config/workspace/load/fivetran # Where previous Fivetran export resides, subject of import
    api_key: [KEY] # Fivetran API Key
    api_secret: [SECRET] # Fivetran API Secret
    secrets_path: /config/workspace/secrets/fivetran # Fivetran secret fields
    credentials: /opt/fivetran_credentials.yml # Fivetran set of key:secret pairs
    # 'api_key' + 'api_secret' are mutually exclusive with 'credentials', use one or the other
```

## env_var

From `dbt-coves 1.6.28` onwards, you can consume environment variables in your config file using `"{{env_var('VAR_NAME', 'DEFAULT VALUE')}}"`. For example:

```yaml
generate:
  sources:
    database: "{{env_var('MAIN_DATABASE', 'dev_database')}}"
    schemas:
      - "{{env_var('DEV_SCHEMA', 'John')}}"
      - "{{env_var('STAGING_SCHEMA', 'Staging')}}"
```
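
The lookup performed by `env_var` behaves like reading an environment variable with a fallback. The sketch below mirrors that behaviour in plain Python; raising an error when the variable is unset and no default is given is an assumption, not documented dbt-coves behaviour.

```python
import os
from typing import Optional


def env_var(name: str, default: Optional[str] = None) -> str:
    """Return the environment variable `name`, or `default` when it is unset."""
    value = os.environ.get(name, default)
    if value is None:
        # Assumption: a missing variable without a default is treated as an error.
        raise KeyError(f"Environment variable {name!r} is not set and has no default")
    return value


os.environ.pop("MAIN_DATABASE", None)
print(env_var("MAIN_DATABASE", "dev_database"))  # dev_database

os.environ["MAIN_DATABASE"] = "analytics"
print(env_var("MAIN_DATABASE", "dev_database"))  # analytics
```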

## Telemetry

dbt-coves has telemetry built in to help the maintainers from Datacoves understand which commands are being used and which are not to prioritize future development of dbt-coves. We do not track credentials nor details of your dbt execution such as model names. The one detail we do use related to dbt is the anonymous user_id to help us identify distinct users.

Telemetry is turned on by default. You can opt out of event tracking at any time by adding the following to your dbt-coves `config.yml` file:

```yaml
disable-tracking: true
```

## Override generation templates

Customizing generated models and model properties requires placing
template files under the `.dbt_coves/templates` folder.

There are different variables available in the templates:

- `adapter_name` refers to the Adapter's class name being used by the target, e.g. `SnowflakeAdapter` when using [Snowflake](https://github.com/dbt-labs/dbt-snowflake/blob/21b52127e7d221db8b92114aae066fb8a7151bba/dbt/adapters/snowflake/impl.py#L33).
- `columns` contains the list of relation columns that don't contain nested (JSON) data; its type is `List[Item]`.
- `nested` contains a dict of nested columns, grouped by column name; its type is `Dict[column_name, Dict[nested_key, Item]]`.

`Item` is a `dict` with the keys `id`, `name`, `type`, and `description`, where `id` contains a slugified id generated from `name`.
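
To make these shapes concrete, the sketch below builds sample `columns` and `nested` values in Python, reusing the Airbyte example from the metadata section. The `id` values are illustrative: the actual slug format is whatever dbt-coves generates.

```python
from typing import Dict, List

Item = Dict[str, str]  # keys: id, name, type, description

columns: List[Item] = [
    {
        "id": "_airbyte_ab_id",
        "name": "_airbyte_ab_id",
        "type": "varchar",
        "description": "Airbyte unique identifier used during data load",
    },
]

# Outer key: name of the nested (JSON) column; inner key: key inside that column.
nested: Dict[str, Dict[str, Item]] = {
    "_airbyte_data": {
        "Year": {
            "id": "year",  # illustrative slug of "Year"
            "name": "Year",
            "type": "integer",
            "description": "Year of country population measurement",
        },
    },
}

for column_name, keys in nested.items():
    for nested_key, item in keys.items():
        print(f"{column_name}.{nested_key} -> {item['id']} ({item['type']})")
# _airbyte_data.Year -> year (integer)
```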

### dbt-coves generate sources

#### Source property file (.yml) template

This file is used to create the sources yml file

[source_props.yml](dbt_coves/templates/source_props.yml)

#### Staging model file (.sql) template

This file is used to create the staging model (sql) files.

[staging_model.sql](dbt_coves/templates/staging_model.sql)

#### Staging model property file (.yml) template

This file is used to create the model properties (yml) file

[staging_model_props.yml](dbt_coves/templates/staging_model_props.yml)

### dbt-coves generate properties

This file is used to create the properties (yml) files for models

[model_props.yml](dbt_coves/templates/model_props.yml)

# Thanks

The project's main structure was inspired by [dbt-sugar](https://github.com/bitpicky/dbt-sugar). Special thanks to [Bastien Boutonnet](https://github.com/bastienboutonnet) for the great work done.

# Authors

- Sebastian Sassi [\@sebasuy](https://twitter.com/sebasuy) -- [Datacoves](https://datacoves.com/)
- Noel Gomez [\@noel_g](https://twitter.com/noel_g) -- [Datacoves](https://datacoves.com/)
- Bruno Antonellini -- [Datacoves](https://datacoves.com/)

# About

Learn more about [Datacoves](https://datacoves.com).

⚠️ **dbt-coves is still in development. Make sure to test it with your dbt project version and data warehouse before using it in production, and please submit any issues you find. We also welcome contributions from the community.**

# Metrics

[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/datacoves/dbt-coves/graphs/commit-activity)
[![PyPI version
fury.io](https://badge.fury.io/py/dbt-coves.svg)](https://pypi.python.org/pypi/dbt-coves/)
[![Code
Style](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![Imports:
isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![Imports:
python](https://img.shields.io/badge/python-3.8%20%7C%203.9-blue)](https://img.shields.io/badge/python-3.8%20%7C%203.9-blue)
[![Build](https://github.com/datacoves/dbt-coves/actions/workflows/main_ci.yml/badge.svg)](https://github.com/datacoves/dbt-coves/actions/workflows/main_ci.yml/badge.svg)

<!-- [![codecov](https://codecov.io/gh/datacoves/dbt-coves/branch/main/graph/badge.svg?token=JB0E0LZDW1)](https://codecov.io/gh/datacoves/dbt-coves) -->

[![Maintainability](https://api.codeclimate.com/v1/badges/1e6a887de605ef8e0eca/maintainability)](https://codeclimate.com/github/datacoves/dbt-coves/maintainability)
[![Downloads](https://pepy.tech/badge/dbt-coves)](https://pepy.tech/project/dbt-coves)

            

For usage details, visit [dbt list docs](https://docs.getdbt.com/reference/commands/list)\n\n### Metadata Generation Arguments\n\nYou can use dbt-coves to generate the metadata file(s) containing the basic structure of the csv that can be used in the above `dbt-coves generate sources/properties` commands.\nUsage of these metadata files can be found in [metadata](https://github.com/datacoves/dbt-coves#metadata) below.\n\n`dbt-coves generate metadata` supports the following args:\n\n```console\n--database\n# Database to inspect\n```\n\n```console\n--schemas\n# Schema(s) to inspect. Accepts wildcards (must be enclosed in quotes if used)\n```\n\n```console\n--select-relations\n# List of relations where raw data resides. The parameter must be enclosed in quotes. Accepts wildcards.\n```\n\n```console\n--exclude-relations\n# Filter relation(s) to exclude from source file(s) generation. The parameter must be enclosed in quotes. Accepts wildcards.\n```\n\n```console\n--destination\n# Where csv file(s) will be generated, default: 'metadata.csv'\n# Supports using the Jinja tags `{{relation}}` and `{{schema}}`\n# if creating one csv per relation/table in schema, i.e: \"metadata/{{relation}}.csv\"\n```\n\n```console\n--no-prompt\n# Silently generate metadata\n```\n\n### Metadata\n\ndbt-coves supports the argument `--metadata` which allows users to specify a csv file containing field types and descriptions to be used when creating the staging models and property files.\n\n```console\ndbt-coves generate sources --metadata metadata.csv\n```\n\nMetadata format:\nYou can download a [sample csv file](sample_metadata.csv) as reference\n\n| database | schema | relation                          | column          | key  | type    | description                                     |\n| -------- | ------ | --------------------------------- | --------------- | ---- | ------- | ----------------------------------------------- |\n| raw      | raw    | \\_airbyte_raw_country_populations | 
\\_airbyte_data  | Year | integer | Year of country population measurement          |\n| raw      | raw    | \\_airbyte_raw_country_populations | \\_airbyte_data  |      | variant | Airbyte data columns (VARIANT) in Snowflake     |\n| raw      | raw    | \\_airbyte_raw_country_populations | \\_airbyte_ab_id |      | varchar | Airbyte unique identifier used during data load |\n\n### Docs generation arguments\n\nYou can use dbt-coves to improve the standard dbt docs generation process. It generates your dbt docs, updates external links so they always open in a new tab. It also has the option to merge production `catalog.json` into the local environment when running in deferred mode, so you can run [dbt-checkpoint](https://github.com/dbt-checkpoint/dbt-checkpoint) hooks even when the model has not been run locally.\n\n`dbt-coves generate docs` supports the following args:\n\n```console\n--merge-deferred\n# Merge a deferred catalog.json into your generated one.\n# Flag: no value required.\n```\n\n```\n--state\n# Directory where your production catalog.json is located\n# Mandatory when using --merge-deferred\n```\n\n### Generate airflow-dags\n\n```console\ndbt-coves generate airflow-dags\n```\n\nTranslate YML files into their Airflow Python code equivalent. 
With this, DAGs can be easily written with some `key:value` pairs.\n\nThe basic structure of these YMLs must consist of:\n\n- Global configurations (description, schedule_interval, tags, catchup, etc.)\n- `default_args`\n- `nodes`: where tasks and task groups are defined\n  - each Node is a nested object, with its `name` as key and its configuration as values.\n    - this configuration must cover:\n      - `type`: 'task' or 'task_group'\n      - `operator`: Airflow operator that will run the tasks (full _module.class_ naming)\n      - `dependencies`: the other node(s) that the task depends on\n      - any `key:value` pair of [Operator arguments](https://airflow.apache.org/docs/apache-airflow/stable/_api/airflow/operators/index.html)\n\n#### Airflow DAG Generators\n\nWhen a YML Dag `node` is of type `task_group`, **Generators** can be used instead of `Operators`.\n\nThey are custom classes that receive YML `key:value` pairs and return one or more tasks for the respective task group. Any pair specified other than `type: task_group` will be passed to the specified `generator`, which is responsible for returning N `task_name = Operator(params)` strings.\n\nWe provide some prebuilt Generators:\n\n- `AirbyteGenerator`: creates `AirbyteTriggerSyncOperator` tasks (one per Airbyte connection)\n\n  - It must receive Airbyte's `host` and `port`, `airbyte_conn_id` (Airbyte's connection name on Airflow) and a `connection_ids` list of Airbyte Connections to Sync\n\n- `FivetranGenerator`: creates `FivetranOperator` tasks (one per Fivetran connection)\n  - It must receive Fivetran's `api_key`, `api_secret` and a `connection_ids` list of Fivetran Connectors to Sync. 
It can optionally receive `wait_for_completion: true` and 2 tasks will be created for each sync: a `FivetranOperator` and its respective `FivetranSensor` that monitors the status of the sync.\n- `AirbyteDbtGenerator` and `FivetranDbtGenerator`: instead of passing them Airbyte or Fivetran connections, they use dbt to discover those IDs. Apart from their parent Generators' mandatory fields, they can receive:\n  - `dbt_project_path`: dbt/project/folder\n  - `virtualenv_path`: path to a virtualenv in case dbt has to be run with another Python executable\n  - `run_dbt_compile`: true/false\n  - `run_dbt_deps`: true/false\n\n#### Basic YML DAG example:\n\n```yaml\ndescription: \"dbt-coves DAG\"\nschedule_interval: \"@hourly\"\ntags:\n  - version_01\ndefault_args:\n  start_date: 2023-01-01\ncatchup: false\nnodes:\n  airbyte_dbt:\n    type: task_group\n    tooltip: \"Sync dbt-related Airbyte connections\"\n    generator: AirbyteDbtGenerator\n    host: http://localhost\n    port: 8000\n    dbt_project_path: /path/to/dbt_project\n    virtualenv_path: /virtualenvs/dbt_160\n    run_dbt_compile: true\n    run_dbt_deps: false\n    airbyte_conn_id: airbyte_connection\n  task_1:\n    operator: airflow.operators.bash.BashOperator\n    bash_command: \"echo 'This runs after airbyte tasks'\"\n    dependencies: [\"airbyte_dbt\"]\n```\n\n##### Create your custom Generator\n\nYou can create your own DAG Generator. 
Any `key:value` specified in the YML DAG will be passed to its constructor.\n\nThis Generator needs:\n\n- an `imports` attribute: a list of _module.class_ Operator of the tasks it outputs\n- a `generate_tasks` method that returns the set of `\"task_name = Operator()\"` strings to write as the task group tasks.\n\n```python\nclass PostgresGenerator():\n    def __init__(self) -> None:\n        \"\"\" Any key:value pair in the YML Dag will get here \"\"\"\n        self.imports = [\"airflow.providers.postgres.operators.postgres.PostgresOperator\"]\n\n    def generate_tasks(self):\n        \"\"\" Use your custom logic and return N `name = PostgresOperator()` strings \"\"\"\n        raise NotImplementedError\n```\n\n### airflow-dags generation arguments\n\n`dbt-coves generate airflow-dags` supports the following args:\n\n```console\n--yml-path --yaml-path\n# Path to the folder containing YML files to translate into Python DAGs\n\n--dag-path\n# Path to the folder where Python DAGs will be generated.\n\n--validate-operators\n# Ensure Airflow operators are installed by trying to import them before writing to Python.\n# Flag: no value required\n\n--generators-folder\n# Path to your Python module with custom Generators\n\n--generators-params\n# Object with default values for the desired Generator(s)\n# For example: {\"AirbyteGenerator\": {\"host\": \"http://localhost\", \"port\": \"8000\"}}\n\n--secrets-path\n# Secret files location for DAG configuration, i.e. 'yml_path/secrets/'\n# Secret content must match the YML dag spec of `nodes -> node_name -> config`\n```\n\n## Extract configuration from Airbyte\n\n```console\ndbt-coves extract airbyte\n```\n\nExtracts the configuration from your Airbyte sources, connections and destinations (excluding credentials) and stores it in the specified folder. 
The main goal of this feature is to keep track of the configuration changes in your git repo, and rollback to a specific version when needed.\n\nFull usage example:\n\n```console\ndbt-coves extract airbyte --host http://airbyte-server --port 8001 --path /config/workspace/load/airbyte\n```\n\n## Load configuration to Airbyte\n\n```console\ndbt-coves load airbyte\n```\n\nLoads the Airbyte configuration generated with `dbt-coves extract airbyte` on an Airbyte server. Secrets folder needs to be specified separately. You can use [git-secret](https://git-secret.io/) to encrypt secrets and make them part of your git repo.\n\n### Loading secrets\n\nSecret credentials can be approached in two different ways: locally or remotely (through a provider/manager).\n\nIn order to load encrypted fields locally:\n\n```console\ndbt-coves load airbyte --secrets-path /path/to/secret/directory\n\n# This directory must have 'sources', 'destinations' and 'connections' folders nested inside, and inside them the respective JSON files with unencrypted fields.\n# Naming convention: JSON unencrypted secret files must be named exactly as the extracted ones.\n```\n\nTo load encrypted fields through a manager (in this case we are connecting to Datacoves' Service Credentials):\n\n```console\n--secrets-manager datacoves\n```\n\n```console\n--secrets-url https://api.datacoves.localhost/service-credentials/airbyte\n```\n\n```console\n--secrets-token <secret token>\n```\n\nFull usage example:\n\n```console\ndbt-coves load airbyte --host http://airbyte-server --port 8001 --path /config/workspace/load/airbyte --secrets-path /config/workspace/secrets\n```\n\n## Extract configuration from Fivetran\n\n```console\ndbt-coves extract fivetran\n```\n\nExtracts the configuration from your Fivetran destinations and connectors (excluding credentials) and stores it in the specified folder. 
The main goal of this feature is to keep track of the configuration changes in your git repo, and rollback to a specific version when needed.\n\nFull usage example:\n\n```console\ndbt-coves extract fivetran --credentials /config/workspace/secrets/fivetran/credentials.yml --path /config/workspace/load/fivetran\n```\n\n## Load configuration to Fivetran\n\n```console\ndbt-coves load fivetran\n```\n\nLoads the Fivetran configuration generated with `dbt-coves extract fivetran` on a Fivetran instance. Secrets folder needs to be specified separately. You can use [git-secret](https://git-secret.io/) to encrypt secrets and make them part of your git repo.\n\n### Credentials\n\nIn order for extract/load fivetran to work properly, you need to provide an api key-secret pair (you can generate them [here](https://fivetran.com/account/settings/account)).\n\nThese credentials can be used with `--api-key [key] --api-secret [secret]`, or by specifying a YML file with `--credentials /path/to/credentials.yml`. 
The required structure of this file is the following:\n\n```yaml\naccount_name: # Any name, used by dbt-coves to ask which to use when more than one is found\n  api_key: [key]\n  api_secret: [secret]\naccount_name_2:\n  api_key: [key]\n  api_secret: [secret]\n```\n\nThis YML file approach allows you to both work with multiple Fivetran accounts, and treat this credentials file as a secret.\n\n> :warning: **Warning**: --api-key/secret and --credentials flags are mutually exclusive, don't use them together.\n\n### Loading secrets\n\nSecret credentials can be approached via `--secrets-path` flag\n\n```console\ndbt-coves load fivetran --secrets-path /path/to/secret/directory\n```\n\n#### Field naming convention\n\nAlthough secret files can have any name, unencrypted JSON files must follow a simple structure:\n\n- Keys should match their corresponding Fivetran destination ID: two words automatically generated by Fivetran, which can be found in previously extracted data.\n- Inside those keys, a nested dictionary of which fields should be overwritten\n\nFor example:\n\n```json\n{\n  \"extract_muscle\": {\n    // Internal ID that Fivetran gave to a Snowflake warehouse Destination\n    \"password\": \"[PASSWORD]\" // Field:Value pair\n  },\n  \"centre_straighten\": {\n    \"password\": \"[PASSWORD]\"\n  }\n}\n```\n\n## Run dbt commands\n\n```shell\ndbt-coves dbt <arguments> -- <command>\n```\n\nRun dbt commands on special environments such as Airflow, or CI workers, with the possibility of changing dbt project location and activating a specific virtual environment in which running commands.\n\n### Arguments\n\n`dbt-coves dbt` supports the following arguments\n\n```shell\n--project-dir\n# Path of the dbt project where command will be executed, i.e.: /opt/user/dbt_project\n```\n\n```shell\n--virtualenv\n# Virtual environment path. 
i.e.: /opt/user/virtualenvs/airflow\n```\n\n### Sample usage\n\n```shell\ndbt-coves dbt --project-dir /opt/user/dbt_project --virtualenv /opt/user/virtualenvs/airflow -- run -s model --vars \\\"{key: value}\\\"\n# Make sure to escape special characters such as quotation marks\n# Double dash (--) between <arguments> and <command> are mandatory\n```\n\n# Settings\n\ndbt-coves will read settings from `.dbt_coves/config.yml`. A standard settings files could look like this:\n\n```yaml\ngenerate:\n  sources:\n    database: \"RAW\" # Database where to look for source tables\n    schemas: # List of schema names where to look for source tables\n      - RAW\n    select_relations: # list of relations where raw data resides\n      - TABLE_1\n      - TABLE_2\n    exclude_relations: # Filter relation(s) to exclude from source file(s) generation\n      - TABLE_1\n      - TABLE_2\n    sources_destination: \"models/staging/{{schema}}/{{schema}}.yml\" # Where sources yml files will be generated\n    models_destination: \"models/staging/{{schema}}/{{relation}}.sql\" # Where models sql files will be generated\n    model_props_destination: \"models/staging/{{schema}}/{{relation}}.yml\" # Where models yml files will be generated\n    update_strategy: ask # Action to perform when a property file already exists. Options: update, recreate, fail, ask (per file)\n    templates_folder: \".dbt_coves/templates\" # Folder where source generation jinja templates are located. 
Override default templates creating  source_props.yml, source_model_props.yml, and source_model.sql under this folder\n    metadata: \"metadata.csv\" # Path to csv file containing metadata\n    flatten_json_fields: ask\n\n  properties:\n    destination: \"{{model_folder_path}}/{{model_file_name}}.yml\" # Where models yml files will be generated\n    # You can specify a different path by declaring it explicitly, i.e.: \"models/staging/{{model_file_name}}.yml\"\n    update-strategy: ask # Action to perform when a property file already exists. Options: update, recreate, fail, ask (per file)\n    select: \"models/staging/bays\" # Filter model(s) to generate property file(s)\n    exclude: \"models/staging/bays/test_bay\" # Filter model(s) to generate property file(s)\n    selector: \"selectors/bay_selector.yml\" # Specify dbt selector for more complex model filtering\n    templates_folder: \".dbt_coves/templates\" # Folder where source generation jinja templates are located. Override default template creating model_props.yml under this folder\n    metadata: \"metadata.csv\" # Path to csv file containing metadata\n\n  metadata:\n    database: RAW # Database where to look for source tables\n    schemas: # List of schema names where to look for source tables\n      - RAW\n    select_relations: # list of relations where raw data resides\n      - TABLE_1\n      - TABLE_2\n    exclude_relations: # Filter relation(s) to exclude from source file(s) generation\n      - TABLE_1\n      - TABLE_2\n    destination: # Where metadata file will be generated, default: 'metadata.csv'\n\n  docs:\n    merge_deferred: true\n    state: logs/\n    dbt_args: \"--no-compile --select foo --exclude bar\"\n\n  airflow_dags:\n    yml_path:\n    dags_path:\n    generators_params:\n      AirbyteDbtGenerator:\n        host: \"{{ env_var('AIRBYTE_HOST_NAME') }}\"\n        port: \"{{ env_var('AIRBYTE_PORT') }}\"\n        airbyte_conn_id: airbyte_connection\n\n        dbt_project_path: \"{{ 
env_var('DBT_HOME') }}\"\n        run_dbt_compile: true\n        run_dbt_deps: false\n\nextract:\n  airbyte:\n    path: /config/workspace/load/airbyte # Where json files will be generated\n    host: http://airbyte-server # Airbyte's API hostname\n    port: 8001 # Airbyte's API port\n  fivetran:\n    path: /config/workspace/load/fivetran # Where Fivetran export will be generated\n    api_key: [KEY] # Fivetran API Key\n    api_secret: [SECRET] # Fivetran API Secret\n    credentials: /opt/fivetran_credentials.yml # Fivetran set of key:secret pairs\n    # 'api_key' + 'api_secret' are mutually exclusive with 'credentials', use one or the other\n\nload:\n  airbyte:\n    path: /config/workspace/load\n    host: http://airbyte-server\n    port: 8001\n    secrets_manager: datacoves # (optional) Secret credentials provider (secrets_path OR secrets_manager should be used, can't load secrets locally and remotely at the same time)\n    secrets_path: /config/workspace/secrets # (optional) Secret files location if secrets_manager was not specified\n    secrets_url: https://api.datacoves.localhost/service-credentials/airbyte # Secrets url if secrets_manager is datacoves\n    secrets_token: <TOKEN> # Secrets auth token if secrets_manager is datacoves\n  fivetran:\n    path: /config/workspace/load/fivetran # Where previous Fivetran export resides, subject of import\n    api_key: [KEY] # Fivetran API Key\n    api_secret: [SECRET] # Fivetran API Secret\n    secrets_path: /config/workspace/secrets/fivetran # Fivetran secret fields\n    credentials: /opt/fivetran_credentials.yml # Fivetran set of key:secret pairs\n    # 'api_key' + 'api_secret' are mutually exclusive with 'credentials', use one or the other\n```\n\n## env_var\n\nFrom `dbt-coves 1.6.28` onwards, you can consume environment variables in your config file using `\"{{env_var('VAR_NAME', 'DEFAULT VALUE')}}\"`. 
For example:\n\n```yaml\ngenerate:\n  sources:\n    database: \"{{env_var('MAIN_DATABASE', 'dev_database')}}\"\n    schemas:\n      - \"{{env_var('DEV_SCHEMA', 'John')}}\"\n      - \"{{env_var('STAGING_SCHEMA', 'Staging')}}\"\n```\n\n## Telemetry\n\ndbt-coves has telemetry built in to help the maintainers from Datacoves understand which commands are being used and which are not to prioritize future development of dbt-coves. We do not track credentials nor details of your dbt execution such as model names. The one detail we do use related to dbt is the anonymous user_id to help us identify distinct users.\n\nBy default this is turned on \u2013 you can opt out of event tracking at any time by adding the following to your dbt-coves `config.yaml` file:\n\n```yaml\ndisable-tracking: true\n```\n\n## Override generation templates\n\nCustomizing generated models and model properties requires placing\ntemplate files under the `.dbt-coves/templates` folder.\n\nThere are different variables available in the templates:\n\n- `adapter_name` refers to the Adapter's class name being used by the target, e.g. 
`SnowflakeAdapter` when using [Snowflake](https://github.com/dbt-labs/dbt-snowflake/blob/21b52127e7d221db8b92114aae066fb8a7151bba/dbt/adapters/snowflake/impl.py#L33).\n- `columns` contains the list of relation columns that don't contain nested (JSON) data; its type is `List[Item]`.\n- `nested` contains a dict of nested columns, grouped by column name; its type is `Dict[column_name, Dict[nested_key, Item]]`.\n\n`Item` is a `dict` with the keys `id`, `name`, `type`, and `description`, where `id` contains a slugified id generated from `name`.\n\n### dbt-coves generate sources\n\n#### Source property file (.yml) template\n\nThis file is used to create the sources yml file\n\n[source_props.yml](dbt_coves/templates/source_props.yml)\n\n#### Staging model file (.sql) template\n\nThis file is used to create the staging model (sql) files.\n\n[staging_model.sql](dbt_coves/templates/staging_model.sql)\n\n#### Staging model property file (.yml) template\n\nThis file is used to create the model properties (yml) file\n\n[staging_model_props.yml](dbt_coves/templates/staging_model_props.yml)\n\n### dbt-coves generate properties\n\nThis file is used to create the properties (yml) files for models\n\n[model_props.yml](dbt_coves/templates/model_props.yml)\n\n# Thanks\n\nThe project's main structure was inspired by [dbt-sugar](https://github.com/bitpicky/dbt-sugar). Special thanks to [Bastien Boutonnet](https://github.com/bastienboutonnet) for the great work done.\n\n# Authors\n\n- Sebastian Sassi [\\@sebasuy](https://twitter.com/sebasuy) -- [Datacoves](https://datacoves.com/)\n- Noel Gomez [\\@noel_g](https://twitter.com/noel_g) -- [Datacoves](https://datacoves.com/)\n- Bruno Antonellini -- [Datacoves](https://datacoves.com/)\n\n# About\n\nLearn more about [Datacoves](https://datacoves.com).\n\n\u26a0\ufe0f **dbt-coves is still in development, make sure to test it for your dbt project version and DW before using in production and please submit any issues you find. 
We also welcome any contributions from the community**\n\n# Metrics\n\n[![Maintenance](https://img.shields.io/badge/Maintained%3F-yes-green.svg)](https://github.com/datacoves/dbt-coves/graphs/commit-activity)\n[![PyPI version\nfury.io](https://badge.fury.io/py/dbt-coves.svg)](https://pypi.python.org/pypi/dbt-coves/)\n[![Code\nStyle](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)\n[![Imports:\nisort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)\n[![Imports:\npython](https://img.shields.io/badge/python-3.8%20%7C%203.9-blue)](https://img.shields.io/badge/python-3.8%20%7C%203.9-blue)\n[![Build](https://github.com/datacoves/dbt-coves/actions/workflows/main_ci.yml/badge.svg)](https://github.com/datacoves/dbt-coves/actions/workflows/main_ci.yml/badge.svg)\n\n<!-- [![codecov](https://codecov.io/gh/datacoves/dbt-coves/branch/main/graph/badge.svg?token=JB0E0LZDW1)](https://codecov.io/gh/datacoves/dbt-coves) -->\n\n[![Maintainability](https://api.codeclimate.com/v1/badges/1e6a887de605ef8e0eca/maintainability)](https://codeclimate.com/github/datacoves/dbt-coves/maintainability)\n[![Downloads](https://pepy.tech/badge/dbt-coves)](https://pepy.tech/project/dbt-coves)\n",
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "CLI tool for dbt users adopting analytics engineering best practices.",
    "version": "1.7.1",
    "project_urls": {
        "Documentation": "https://github.com/datacoves/dbt-coves/blob/main/README.md",
        "Homepage": "https://datacoves.com",
        "Repository": "https://github.com/datacoves/dbt-coves"
    },
    "split_keywords": [
        "data engineering",
        "analytics engineering",
        "dbt",
        "etl",
        "data modelling"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "46e3885d1cf0c56b0028243760a1303f3e6674e37fefd6e017afee3e8b75b313",
                "md5": "62758c24202ce7039b1c33af179dd711",
                "sha256": "1853439d6e6ab32631c762070e7189494c0b4c282dfb86ef29153776ad4f952c"
            },
            "downloads": -1,
            "filename": "dbt_coves-1.7.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "62758c24202ce7039b1c33af179dd711",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<3.12",
            "size": 95636,
            "upload_time": "2024-02-19T21:08:38",
            "upload_time_iso_8601": "2024-02-19T21:08:38.677059Z",
            "url": "https://files.pythonhosted.org/packages/46/e3/885d1cf0c56b0028243760a1303f3e6674e37fefd6e017afee3e8b75b313/dbt_coves-1.7.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "ac3375d8cbc51716c569c522fe82bb4cd71fc5b1c18bfe41cb28291343ab84ed",
                "md5": "c2d26af39c9bff33fa26cd88bf47c16c",
                "sha256": "7a720cdbf7fa4bcdaea59a6996ebc70406f5d44495c7a5454b0ce1c129423407"
            },
            "downloads": -1,
            "filename": "dbt_coves-1.7.1.tar.gz",
            "has_sig": false,
            "md5_digest": "c2d26af39c9bff33fa26cd88bf47c16c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<3.12",
            "size": 79958,
            "upload_time": "2024-02-19T21:08:42",
            "upload_time_iso_8601": "2024-02-19T21:08:42.002532Z",
            "url": "https://files.pythonhosted.org/packages/ac/33/75d8cbc51716c569c522fe82bb4cd71fc5b1c18bfe41cb28291343ab84ed/dbt_coves-1.7.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-02-19 21:08:42",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "datacoves",
    "github_project": "dbt-coves",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "tox": true,
    "lcname": "dbt_coves"
}
        