# dbt-decodable

| Field | Value |
| --- | --- |
| Name | dbt-decodable |
| Version | 1.3.5 |
| Home page | https://github.com/decodableco/dbt-decodable |
| Summary | The Decodable adapter plugin for DBT |
| Upload time | 2024-01-16 19:15:17 |
| Author | Decodable |
| Requires Python | >=3.7.2 |
| License | Apache2.0 |
| Keywords | decodable, dbt |

[dbt] adapter for [Decodable].

[dbt] enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Decodable is a fully managed stream processing service, based on [Apache Flink]® and using SQL as the primary means of defining data streaming pipelines.

## Installation

`dbt-decodable` is available on [PyPI]. To install the latest version via `pip` (optionally using a virtual environment),
run:

```nofmt
python3 -m venv dbt-venv         # create the virtual environment
source dbt-venv/bin/activate     # activate the virtual environment
pip install dbt-decodable        # install the adapter
```

## Getting Started

Once you've installed the adapter in a virtual environment, we recommend trying out the example project provided by Decodable:

```bash
# Clone the example project
git clone https://github.com/decodableco/dbt-decodable.git
cd dbt-decodable/example_project/example/

# Ensure you can connect to Decodable via the Decodable CLI.
# If you haven't installed the Decodable CLI yet,
# install it following these instructions: https://docs.decodable.co/docs/setup#install-the-cli-command-line-interface
decodable connection list

# Ensure you have a ~/.dbt/profiles.yml file:
cat ~/.dbt/profiles.yml
```

The file should contain a profile like the following:

```yml
dbt-decodable: # this name must match the 'profile' from dbt_project.yml
  outputs:
    dev:
      account_name: <fill in your decodable account name>
      profile_name: default # fill in any profile defined in ~/.decodable/config
      type: decodable
      database: db
      schema: demo
      local_namespace: dbt_demo
  target: dev
```

Then launch the example project:

```bash
dbt run
```

Note that this dbt adapter ignores the `active-profile` setting in `~/.decodable/config`. You must name the Decodable profile you want to use in the `profile_name` setting of your `~/.dbt/profiles.yml` file.
The adapter does not support a custom Decodable `base-url` (e.g. for local development or proxies).

## Configuring your profile

Profiles in dbt describe a set of configurations specific to a connection with the underlying data warehouse. Each dbt project should have a corresponding profile (though profiles can be reused for different projects). Within a profile, multiple targets can be described to further control dbt's behavior. For example, it's very common to have a `dev` target for development and a `prod` target for production-related configurations.

Most of the profile configuration options available can be found inside the [`dbt documentation`](https://docs.getdbt.com/reference/profiles.yml). Additionally, `dbt-decodable` defines a few adapter-specific ones that can be found below.

```yml
dbt-decodable:        # the name of the profile
  target: dev         # the default target to run commands with
  outputs:            # the list of all defined targets under this profile
    dev:              # the name of the target
      type: decodable
      database: None  # Ignored by this adapter, but required properties
      schema: None    # Ignored by this adapter, but required properties

      # decodable specific settings
      account_name: [your account]          # Decodable account name
      profile_name: [name of the profile]   # Decodable profile name
      materialize_tests: [true | false]     # whether to materialize tests as a pipeline/stream pair, default is `false`
      timeout: [ms]                         # maximum accumulative time a preview request should run for, default is `60000`
      preview_start: [earliest | latest]    # whether preview should be run with `earliest` or `latest` start position, default is `earliest`
      local_namespace: [namespace prefix]   # prefix added to all entities created on Decodable, default is `None`, meaning no prefix gets added.
```

dbt looks for the `profiles.yml` file in the `~/.dbt` directory. This file contains all user profiles.
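
As a rough illustration of how the adapter-specific settings and their documented defaults fit together, here is a minimal Python sketch. The `resolve_target` helper and the `my-account` value are hypothetical and not part of `dbt-decodable`. Treating every setting without a documented default as required is also an assumption made for this sketch.

```python
# Defaults as documented in the profile reference above. The helper itself is
# hypothetical and only mirrors the documentation; it is not adapter code.
ADAPTER_DEFAULTS = {
    "materialize_tests": False,
    "timeout": 60000,  # milliseconds
    "preview_start": "earliest",
    "local_namespace": None,
}

# Assumption: settings without a documented default are treated as required.
REQUIRED_KEYS = ("type", "database", "schema", "account_name", "profile_name")


def resolve_target(target: dict) -> dict:
    """Return the target with documented defaults filled in, or raise on bad input."""
    missing = [key for key in REQUIRED_KEYS if key not in target]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    if target["type"] != "decodable":
        raise ValueError("type must be 'decodable' for this adapter")
    resolved = {**ADAPTER_DEFAULTS, **target}
    if resolved["preview_start"] not in ("earliest", "latest"):
        raise ValueError("preview_start must be 'earliest' or 'latest'")
    return resolved


dev = resolve_target({
    "type": "decodable",
    "database": "None",
    "schema": "None",
    "account_name": "my-account",  # placeholder value
    "profile_name": "default",
})
print(dev["timeout"], dev["preview_start"], dev["materialize_tests"])
```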

## Supported Features

### Materializations

Only table [materialization](https://docs.getdbt.com/docs/build/materializations) is supported for dbt models at the moment. A dbt table model translates to a pipeline/stream pair on Decodable, both sharing the same name. Pipelines for models are automatically activated upon materialization.

To materialize your models, simply run the [`dbt run`](https://docs.getdbt.com/reference/commands/run) command, which performs the following steps for each model:

1. Create a stream with the model's name and schema inferred by Decodable from the model's SQL.

2. Create a pipeline that inserts the SQL's results into the newly created stream.

3. Activate the pipeline.

By default, the adapter will not tear down and recreate the model on Decodable if no changes to the model have been detected. However, if changes to a Decodable stream have been detected, the stream will be deleted and recreated. We recommend configuring a `local_namespace` for dbt-managed resources to prevent accidental deletion of streams.

Invoking dbt with the `--full-refresh` flag set, or setting that configuration option for a specific model, will cause the corresponding resources on Decodable to be destroyed and built from scratch. See the [docs](https://docs.getdbt.com/reference/resource-configs/full_refresh) for more information on using this option.

### Custom model configuration

A `watermark` option can be configured to specify the [watermark](https://docs.decodable.co/docs/streams#managing-streams) to be set for the model's respective Decodable stream. See the [http events example](example_project/example/models/example/http_events.sql).

A `primary_key` option can be configured to specify the primary key if the target stream is a [change stream](https://docs.decodable.co/docs/streams#stream-types). See the [group by example](example_project/example/models/example/http_events_bytes_sent.sql).

More on specifying configuration options per model can be found [here](https://docs.getdbt.com/reference/model-configs).

### Seeds

[`dbt seed`](https://docs.getdbt.com/reference/commands/seed/) will perform the following steps for each specified seed:

1. Create a REST connection and an associated stream with the same name (reflecting the seed's name).

2. Activate the connection.

3. Send the data stored in the seed's `.csv` file to the connection as events.

4. Deactivate the connection.

After these steps are completed, you can access the seed's data on the newly created stream.
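
To make step 3 more concrete, the following Python sketch shows one way the rows of a seed's `.csv` file could be turned into individual events. It does not reproduce the adapter's actual code: the `csv_to_events` helper and the sample data are hypothetical, and how the adapter serializes and sends events to the REST connection is not covered here.

```python
import csv
import io
import json
from typing import Dict, List


def csv_to_events(csv_text: str) -> List[Dict[str, str]]:
    """Parse CSV text into one dict per row, keyed by the header names."""
    return [dict(row) for row in csv.DictReader(io.StringIO(csv_text))]


# Hypothetical seed content; a real seed would be read from the project's seed directory.
seed = "id,name\n1,alice\n2,bob\n"
events = csv_to_events(seed)

# Print one JSON document per event. All values are strings at this point,
# since the csv module does not infer column types.
for event in events:
    print(json.dumps(event))
```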

### Sources

[`Sources`](https://docs.getdbt.com/docs/build/sources) in dbt correspond to Decodable's source connections. However, the `dbt source` command is not supported at the moment.

### Documentation

[`dbt docs`](https://docs.getdbt.com/reference/commands/cmd-docs) is not supported at the moment. You can check your Decodable account for details about your models.

### Testing

Based on the `materialize_tests` option set for the current target, [`dbt test`](https://docs.getdbt.com/reference/commands/test) will behave differently:

* `materialize_tests = false` will cause dbt to run the specified tests as previews and return the results after they finish. How long a preview runs, and whether it starts from the `earliest` or `latest` position, can be changed using the `timeout` and `preview_start` target configurations, respectively.

* `materialize_tests = true` will cause dbt to persist the specified tests as pipeline/stream pairs on Decodable. This configuration is designed to allow continuous testing of your models. You can then run a preview on the created stream (for example using the [Decodable CLI]) to monitor the results.

### Snapshots

Neither the `dbt snapshot` command nor the notion of snapshots is supported at the moment.

### Additional Operations

`dbt-decodable` provides a set of commands for managing the project's resources on Decodable. Those commands can be run using [`dbt run-operation {name} --args {args}`](https://docs.getdbt.com/reference/commands/run-operation).

Example invocation of the `delete_streams` operation detailed below:

```bash
$ dbt run-operation delete_streams --args '{streams: [stream1, stream2], skip_errors: True}'
```
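
If you want to script such invocations, a small Python helper can assemble the command. This is only a sketch: `run_operation_command` is a hypothetical helper, and it relies on the fact that dbt's `--args` option accepts a YAML string, of which JSON is a subset.

```python
import json
import shlex
from typing import Any, Dict, List


def run_operation_command(name: str, args: Dict[str, Any]) -> List[str]:
    """Build the argv list for `dbt run-operation <name> --args <yaml>`."""
    # JSON is valid YAML, so json.dumps yields a string that --args can parse.
    return ["dbt", "run-operation", name, "--args", json.dumps(args)]


cmd = run_operation_command(
    "delete_streams", {"streams": ["stream1", "stream2"], "skip_errors": True}
)

# Show the shell-quoted command. To execute it instead, pass the list to
# subprocess.run(cmd, check=True) from within your dbt project directory.
print(" ".join(shlex.quote(part) for part in cmd))
```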

___

#### **`stop_pipelines(pipelines)`**

**pipelines** : Optional list of names. Default value is `None`.

Deactivate pipelines for resources defined within the project. If the `pipelines` arg is provided, the command only considers the listed resources. Otherwise, it deactivates all pipelines associated with the project.

___

#### **`delete_pipelines(pipelines)`**

**pipelines** : Optional list of names. Default value is `None`.

Delete pipelines for resources defined within the project. If the `pipelines` arg is provided, the command only considers the listed resources. Otherwise, it deletes all pipelines associated with the project.

___

#### **`delete_streams(streams, skip_errors)`**

**streams** : Optional list of names. Default value is `None`. <br>
**skip_errors** : Whether to treat errors as warnings. Default value is `true`.

Delete streams for resources defined within the project. Note that this does not delete pipelines associated with those streams, and a stream that still has an associated pipeline cannot be removed. For a complete removal of stream/pipeline pairs, see the `cleanup` operation. <br>
If the `streams` arg is provided, the command only considers the listed resources. Otherwise, it attempts to delete all streams associated with the project. <br>
If `skip_errors` is set to `true`, failure to delete a stream (e.g. due to an associated pipeline) will be reported as a warning. Otherwise, the operation stops upon the first error encountered.
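
The effect of `skip_errors` can be pictured with the following Python sketch. It illustrates the documented behavior only and is not the adapter's implementation: `delete_all`, `fake_delete` and the stream names are hypothetical, and `RuntimeError` stands in for whatever error the real API call raises.

```python
from typing import Callable, Iterable, List


def delete_all(
    names: Iterable[str],
    delete: Callable[[str], None],
    skip_errors: bool = True,
) -> List[str]:
    """Delete each stream; collect warnings, or stop at the first error."""
    warnings = []
    for name in names:
        try:
            delete(name)
        except RuntimeError as err:
            if not skip_errors:
                raise  # stop upon the first error encountered
            warnings.append(f"could not delete {name}: {err}")
    return warnings


def fake_delete(name: str) -> None:
    # Stand-in for the real API call: pretend 'orders' still has a pipeline.
    if name == "orders":
        raise RuntimeError("stream has an associated pipeline")


print(delete_all(["orders", "clicks"], fake_delete))
```

With `skip_errors=False`, the same call would raise on `orders` and never reach `clicks`.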

___

#### **`cleanup(list, models, seeds, tests)`**

**list** : Optional list of names. Default value is `None`. <br>
**models** : Whether to include models during cleanup. Default value is `true`. <br>
**seeds** : Whether to include seeds during cleanup. Default value is `true`. <br>
**tests** : Whether to include tests during cleanup. Default value is `true`.

Delete all Decodable entities resulting from the materialization of the project's resources, i.e. connections, streams and pipelines. <br>
If the `list` arg is provided, the command only considers the listed resources. Otherwise, it deletes all entities associated with the project. <br>
The `models`, `seeds` and `tests` arguments specify whether those resource types should be included in the cleanup. Note that cleanup does nothing for tests that have not been materialized.

## Known limitations

The `dbt-decodable` adapter does not allow managing Decodable connectors via dbt. You can only create streams and pipelines with dbt.

## Contributions

Contributions to this repository are more than welcome.
Please create any pull requests against the `main` branch.

Each release is maintained in a `releases/*` branch, such as `releases/v1.3.2`, and there's a tag for it.

### Build local version
```bash
pip install .
```

### How to create a release

This is based on an example release called `v1.3.3`.

```bash
# We assume we are on 'main'.
# Branch off into a release branch
git checkout -b releases/v1.3.3
# Edit pyproject.toml and set: version = "1.3.3"
vi pyproject.toml
# create release commit
git commit -am "[#2] Set version to v1.3.3"
# Create a release with a tag from the GitHub UI pointing to the commit we just created.
# CI will do the rest.
```

## License

This code base is available under the Apache License, version 2.0.

Apache Flink, Flink®, Apache®, the squirrel logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation.

[Apache Flink]: https://flink.apache.org/
[dbt]: https://www.getdbt.com/
[Decodable]: https://www.decodable.co/
[Decodable CLI]: https://docs.decodable.co/docs/command-line-interface
[develop]: https://github.com/decodableco/dbt-decodable/tree/develop
[gitflow]: https://nvie.com/posts/a-successful-git-branching-model/
[PyPI]: https://pypi.org/project/dbt-decodable/


            
