dbt-databricks

Name	dbt-databricks
Version	1.9.0
home_page	None
Summary	The Databricks adapter plugin for dbt
upload_time	2024-12-09 19:34:45
maintainer	None
docs_url	None
author	None
requires_python	>=3.9
license	Apache-2.0

<p align="center">
  <img src="https://bynder-public-us-west-2.s3.amazonaws.com/styleguide/ABB317701CA31CB7F29268E32B303CAE-pdf-column-1.png" alt="databricks logo" width="50%" />
  <img src="https://raw.githubusercontent.com/dbt-labs/dbt/ec7dee39f793aa4f7dd3dae37282cc87664813e4/etc/dbt-logo-full.svg" alt="dbt logo" width="250"/>
</p>
<p align="center">
  <a href="https://github.com/databricks/dbt-databricks/actions/workflows/main.yml">
    <img src="https://github.com/databricks/dbt-databricks/actions/workflows/main.yml/badge.svg?event=push" alt="Unit Tests Badge"/>
  </a>
  <a href="https://github.com/databricks/dbt-databricks/actions/workflows/integration.yml">
    <img src="https://github.com/databricks/dbt-databricks/actions/workflows/integration.yml/badge.svg?event=push" alt="Integration Tests Badge"/>
  </a>
</p>

**[dbt](https://www.getdbt.com/)** enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

The **[Databricks Lakehouse](https://www.databricks.com/)** provides one simple platform to unify all your data, analytics and AI workloads.

# dbt-databricks

The `dbt-databricks` adapter contains all of the code enabling dbt to work with Databricks. This adapter is based on the amazing work done in [dbt-spark](https://github.com/dbt-labs/dbt-spark). Some key features include:

- **Easy setup**. No need to install an ODBC driver as the adapter uses pure Python APIs.
- **Open by default**. For example, it uses the open and performant [Delta](https://delta.io/) table format by default. This has many benefits, including letting you use `MERGE` as the default incremental materialization strategy.
- **Support for Unity Catalog**. dbt-databricks>=1.1.1 supports the 3-level namespace of Unity Catalog (catalog / schema / relations) so you can organize and secure your data the way you like.
- **Performance**. The adapter generates SQL expressions that are automatically accelerated by the native, vectorized [Photon](https://databricks.com/product/photon) execution engine.
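
As a rough illustration of the `MERGE` default mentioned in the list above, an incremental dbt Python model might look like the sketch below. The upstream model name `stg_orders` and the columns `order_id` and `updated_at` are hypothetical. `dbt.config`, `dbt.ref`, `dbt.is_incremental`, and `dbt.this` come from dbt's Python model interface, and the sketch assumes the adapter merges on `unique_key` when no `incremental_strategy` is set.

```python
def model(dbt, session):
    # Incremental materialization; per the feature list, MERGE is the default
    # strategy, keyed here on a hypothetical `order_id` column.
    dbt.config(materialized="incremental", unique_key="order_id")

    # `stg_orders` is a placeholder upstream model name.
    orders = dbt.ref("stg_orders")

    if dbt.is_incremental:
        # On incremental runs, keep only rows newer than what the target holds.
        latest = session.sql(
            f"select max(updated_at) from {dbt.this}"
        ).collect()[0][0]
        orders = orders.filter(orders.updated_at > latest)

    return orders
```

dbt calls `model` itself during `dbt run`; the function is not meant to be invoked directly.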

## Choosing between dbt-databricks and dbt-spark
If you are developing a dbt project on Databricks, we recommend using `dbt-databricks` for the reasons noted above.

`dbt-spark` is an actively developed adapter that works with Databricks as well as with Apache Spark wherever it is hosted, for example on AWS EMR.

## Getting started

### Installation

Install using pip:
```nofmt
pip install dbt-databricks
```

Upgrade to the latest version:
```nofmt
pip install --upgrade dbt-databricks
```
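
Because Unity Catalog support depends on the installed version (`dbt-databricks>=1.1.1`), it can help to confirm what pip actually installed. The following is a minimal sketch using only the standard library; the helper names `installed_version`, `release_tuple`, and `supports_unity_catalog` are made up for this example and are not part of the adapter.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="dbt-databricks"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Extract the leading numeric components, e.g. '1.9.0rc1' -> (1, 9, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognised version: {version_string!r}")
    return tuple(int(part) for part in match.group().split("."))


def supports_unity_catalog(version_string):
    """True if the release is at least 1.1.1 (pre-release suffixes are ignored)."""
    return release_tuple(version_string) >= (1, 1, 1)


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("dbt-databricks is not installed")
    else:
        print(found, "Unity Catalog support:", supports_unity_catalog(found))
```

The comparison deliberately ignores pre-release suffixes to stay dependency-free; a stricter check would use a full version parser.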

### Profile Setup

```nofmt
your_profile_name:
  target: dev
  outputs:
    dev:
      type: databricks
      catalog: [optional catalog name, if you are using Unity Catalog, only available in dbt-databricks>=1.1.1]
      schema: [database/schema name]
      host: [your.databrickshost.com]
      http_path: [/sql/your/http/path]
      token: [dapiXXXXXXXXXXXXXXXXXXXXXXX]
```
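
Before writing a target into `profiles.yml`, a quick sanity check of its fields can catch omissions early. The sketch below mirrors the keys in the template above and assumes token-based authentication as shown there; other authentication methods would need a different set of keys. `REQUIRED_KEYS`, `missing_profile_keys`, and the example values `main` and `analytics` are illustrative, not part of dbt.

```python
# Keys taken from the profile template above; `catalog` is optional there,
# so it is not checked.
REQUIRED_KEYS = ("type", "schema", "host", "http_path", "token")


def missing_profile_keys(target):
    """Return the required keys that are absent or empty in a target mapping."""
    return [key for key in REQUIRED_KEYS if not target.get(key)]


dev_target = {
    "type": "databricks",
    "catalog": "main",  # optional, Unity Catalog only
    "schema": "analytics",
    "host": "your.databrickshost.com",
    "http_path": "/sql/your/http/path",
    "token": "dapiXXXXXXXXXXXXXXXXXXXXXXX",
}

if __name__ == "__main__":
    missing = missing_profile_keys(dev_target)
    print("missing keys:", missing if missing else "none")
```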

### Quick Starts

The following quick starts will get you up and running with the `dbt-databricks` adapter:
- [Developing your first dbt project](https://github.com/databricks/dbt-databricks/blob/main/docs/local-dev.md)
- Using dbt Cloud with Databricks ([Azure](https://docs.microsoft.com/en-us/azure/databricks/integrations/prep/dbt-cloud) | [AWS](https://docs.databricks.com/integrations/prep/dbt-cloud.html))
- [Running dbt production jobs on Databricks Workflows](https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-workflows.md)
- [Using Unity Catalog with dbt-databricks](https://github.com/databricks/dbt-databricks/blob/main/docs/uc.md)
- [Using GitHub Actions for dbt CI/CD on Databricks](https://github.com/databricks/dbt-databricks/blob/main/docs/github-actions.md)
- [Loading data from S3 into Delta using the databricks_copy_into macro](https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-copy-into-macro-aws.md)
- [Contribute to this repository](CONTRIBUTING.MD)

### Compatibility

The `dbt-databricks` adapter has been tested:

- with Python 3.9 or above.
- against `Databricks SQL` and `Databricks runtime releases 9.1 LTS` and later.

## Tips and Tricks

### Choosing compute for a Python model
You can override the compute used for a specific Python model by setting the `http_path` property in model configuration. This can be useful if, for example, you want to run a Python model on an All Purpose cluster, while running SQL models on a SQL Warehouse. Note that this capability is only available for Python models.

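```python
def model(dbt, session):
    # Override the compute for this model only (path elided as in the original).
    dbt.config(
        http_path="sql/protocolv1/..."
    )
    # A Python model must return a DataFrame; `my_upstream_model` is a placeholder.
    return dbt.ref("my_upstream_model")
```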

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "dbt-databricks",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": null,
    "author_email": "Databricks <feedback@databricks.com>",
    "download_url": "https://files.pythonhosted.org/packages/67/2f/10a35a73f440e218651bd5375ac5caba4988fa5416b6d9235e9696d20e10/dbt_databricks-1.9.0.tar.gz",
    "platform": null,
    "description": "<p align=\"center\">\n  <img src=\"https://bynder-public-us-west-2.s3.amazonaws.com/styleguide/ABB317701CA31CB7F29268E32B303CAE-pdf-column-1.png\" alt=\"databricks logo\" width=\"50%\" />\n  <img src=\"https://raw.githubusercontent.com/dbt-labs/dbt/ec7dee39f793aa4f7dd3dae37282cc87664813e4/etc/dbt-logo-full.svg\" alt=\"dbt logo\" width=\"250\"/>\n</p>\n<p align=\"center\">\n  <a href=\"https://github.com/databricks/dbt-databricks/actions/workflows/main.yml\">\n    <img src=\"https://github.com/databricks/dbt-databricks/actions/workflows/main.yml/badge.svg?event=push\" alt=\"Unit Tests Badge\"/>\n  </a>\n  <a href=\"https://github.com/databricks/dbt-databricks/actions/workflows/integration.yml\">\n    <img src=\"https://github.com/databricks/dbt-databricks/actions/workflows/integration.yml/badge.svg?event=push\" alt=\"Integration Tests Badge\"/>\n  </a>\n</p>\n\n**[dbt](https://www.getdbt.com/)** enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.\n\nThe **[Databricks Lakehouse](https://www.databricks.com/)** provides one simple platform to unify all your data, analytics and AI workloads.\n\n# dbt-databricks\n\nThe `dbt-databricks` adapter contains all of the code enabling dbt to work with Databricks. This adapter is based off the amazing work done in [dbt-spark](https://github.com/dbt-labs/dbt-spark). Some key features include:\n\n- **Easy setup**. No need to install an ODBC driver as the adapter uses pure Python APIs.\n- **Open by default**. For example, it uses the the open and performant [Delta](https://delta.io/) table format by default. This has many benefits, including letting you use `MERGE` as the the default incremental materialization strategy.\n- **Support for Unity Catalog**. dbt-databricks>=1.1.1 supports the 3-level namespace of Unity Catalog (catalog / schema / relations) so you can organize and secure your data the way you like.\n- **Performance**. 
The adapter generates SQL expressions that are automatically accelerated by the native, vectorized [Photon](https://databricks.com/product/photon) execution engine.\n\n## Choosing between dbt-databricks and dbt-spark\nIf you are developing a dbt project on Databricks, we recommend using `dbt-databricks` for the reasons noted above.\n\n`dbt-spark` is an actively developed adapter which works with Databricks as well as Apache Spark anywhere it is hosted e.g. on AWS EMR.\n\n## Getting started\n\n### Installation\n\nInstall using pip:\n```nofmt\npip install dbt-databricks\n```\n\nUpgrade to the latest version\n```nofmt\npip install --upgrade dbt-databricks\n```\n\n### Profile Setup\n\n```nofmt\nyour_profile_name:\n  target: dev\n  outputs:\n    dev:\n      type: databricks\n      catalog: [optional catalog name, if you are using Unity Catalog, only available in dbt-databricks>=1.1.1]\n      schema: [database/schema name]\n      host: [your.databrickshost.com]\n      http_path: [/sql/your/http/path]\n      token: [dapiXXXXXXXXXXXXXXXXXXXXXXX]\n```\n\n### Quick Starts\n\nThese following quick starts will get you up and running with the `dbt-databricks` adapter:\n- [Developing your first dbt project](https://github.com/databricks/dbt-databricks/blob/main/docs/local-dev.md)\n- Using dbt Cloud with Databricks ([Azure](https://docs.microsoft.com/en-us/azure/databricks/integrations/prep/dbt-cloud) | [AWS](https://docs.databricks.com/integrations/prep/dbt-cloud.html))\n- [Running dbt production jobs on Databricks Workflows](https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-workflows.md)\n- [Using Unity Catalog with dbt-databricks](https://github.com/databricks/dbt-databricks/blob/main/docs/uc.md)\n- [Using GitHub Actions for dbt CI/CD on Databricks](https://github.com/databricks/dbt-databricks/blob/main/docs/github-actions.md)\n- [Loading data from S3 into Delta using the databricks_copy_into 
macro](https://github.com/databricks/dbt-databricks/blob/main/docs/databricks-copy-into-macro-aws.md)\n- [Contribute to this repository](CONTRIBUTING.MD)\n\n### Compatibility\n\nThe `dbt-databricks` adapter has been tested:\n\n- with Python 3.7 or above.\n- against `Databricks SQL` and `Databricks runtime releases 9.1 LTS` and later.\n\n### Tips and Tricks\n## Choosing compute for a Python model\nYou can override the compute used for a specific Python model by setting the `http_path` property in model configuration. This can be useful if, for example, you want to run a Python model on an All Purpose cluster, while running SQL models on a SQL Warehouse. Note that this capability is only available for Python models.\n\n```\ndef model(dbt, session):\n    dbt.config(\n      http_path=\"sql/protocolv1/...\"\n    )\n```\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "The Databricks adapter plugin for dbt",
    "version": "1.9.0",
    "project_urls": {
        "changelog": "https://github.com/databricks/dbt-databricks/blob/main/CHANGELOG.md",
        "documentation": "https://docs.getdbt.com/reference/resource-configs/databricks-configs",
        "homepage": "https://github.com/databricks/dbt-databricks",
        "issues": "https://github.com/databricks/dbt-databricks/issues",
        "repository": "https://github.com/databricks/dbt-databricks"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "24cbfbc0c2049053693b42745dd70d11627a2a2fb69a8e7c2d3c322bd122b2c5",
                "md5": "b7dacfb63cacc897c7d42c9d007d54af",
                "sha256": "241c339269216d4605bd7fd273d3904d051c6d0849cef3106cc5745374db1c41"
            },
            "downloads": -1,
            "filename": "dbt_databricks-1.9.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b7dacfb63cacc897c7d42c9d007d54af",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 97376,
            "upload_time": "2024-12-09T19:34:43",
            "upload_time_iso_8601": "2024-12-09T19:34:43.017454Z",
            "url": "https://files.pythonhosted.org/packages/24/cb/fbc0c2049053693b42745dd70d11627a2a2fb69a8e7c2d3c322bd122b2c5/dbt_databricks-1.9.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "672f10a35a73f440e218651bd5375ac5caba4988fa5416b6d9235e9696d20e10",
                "md5": "b3f648a9a7cb6da1664285e539f49e85",
                "sha256": "82f6f95f818ecb6de5264d89670724ebb041fc03782106b643a2ba7694acb9cb"
            },
            "downloads": -1,
            "filename": "dbt_databricks-1.9.0.tar.gz",
            "has_sig": false,
            "md5_digest": "b3f648a9a7cb6da1664285e539f49e85",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 62832,
            "upload_time": "2024-12-09T19:34:45",
            "upload_time_iso_8601": "2024-12-09T19:34:45.073278Z",
            "url": "https://files.pythonhosted.org/packages/67/2f/10a35a73f440e218651bd5375ac5caba4988fa5416b6d9235e9696d20e10/dbt_databricks-1.9.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-09 19:34:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "databricks",
    "github_project": "dbt-databricks",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "dbt-databricks"
}
