dbdemos

Name: dbdemos
Version: 0.6.26
Home page: https://github.com/databricks-demos/dbdemos
Summary: Install Databricks demos: notebooks, Delta Live Table pipelines, DBSQL dashboards, ML models, etc.
Upload time: 2025-07-19 08:14:22
Author: Databricks
Requires Python: >=3.7
License: Databricks License
# dbdemos

DBDemos is a toolkit to easily install Lakehouse demos for Databricks.

**Looking for the dbdemos notebooks and content?** Check out [https://github.com/databricks-demos/dbdemos-notebooks](https://github.com/databricks-demos/dbdemos-notebooks).

Simply deploy & share demos on any workspace. dbdemos is packaged with a list of demos:

- Lakehouse, end-to-end demos (ex: Lakehouse Retail Churn)
- Product demos (ex: Delta Live Table, CDC, ML, DBSQL Dashboard, MLOps...)

**Please visit [dbdemos.ai](https://www.dbdemos.ai) to explore all our demos.**

## Installation 
**Do not clone the repo, just pip install dbdemos wheel:**

```
%pip install dbdemos
```

## Usage within Databricks

See [demo video](https://drive.google.com/file/d/12Iu50r7hlawVN01eE_GoUKBQ4kvUrR56/view?usp=sharing) 
```python
import dbdemos
dbdemos.help()
dbdemos.list_demos()

dbdemos.install('lakehouse-retail-c360', path='./', overwrite = True)
```

![Dbdemos install](https://github.com/databricks-demos/dbdemos/raw/main/resources/dbdemos-screenshot.png)

## Requirements

`dbdemos` requires the current user to have:
* Cluster creation permission
* DLT Pipeline creation permission 
* DBSQL dashboard & query creation permission
* For UC demos: a Unity Catalog metastore must be available (without it, the demo will still install but won't work)


## Features

* Load demo notebooks (pre-run) to the given path
* Start a job to load the dataset the demo requires
* Start a demo cluster customized for the demo & the current user
* Set up DLT pipelines
* Set up DBSQL dashboards
* Create ML models
* Update demo links with the created resources for easy navigation

## Feedback

Demo not working? Can't use dbdemos? Please open a GitHub issue. <br/>
Make sure you mention the name of the demo.

# DBDemos Developer options

## Adding an AI/BI demo to dbdemos
Open [README_AIBI.md](README_AIBI.md) for more details on how to contribute and add an AI/BI demo.

Read the following if you want to add a new demo bundle.

## Packaging a demo with dbdemos

Your demo must contain a `_resources` folder where you include all initialization scripts and your bundle configuration file.

### Links & tags
dbdemos will dynamically override the links to point to the resources it creates.

**Always use links relative to the local path to support multiple workspaces. Do not include the workspace id.**

#### DLT pipelines:
Your DLT pipeline must be added in the bundle file (see below).
Within your notebook, identify your pipeline using its id from the bundle file by specifying `dbdemos-pipeline-id="<id>"` as follows:

`<a dbdemos-pipeline-id="dlt-churn" href="#joblist/pipelines/a6ba1d12-74d7-4e2d-b9b7-ca53b655f39d" target="_blank">Delta Live Table pipeline</a>`

#### Workflows:
Your workflows must be added in the bundle file (see below).
Within your notebook, identify your workflow using its id from the bundle file by specifying `dbdemos-workflow-id="<id>"` as follows:

`<a dbdemos-workflow-id="credit-job" href="#joblist/pipelines/a6ba1d12-74d7-4e2d-b9b7-ca53b655f39d" target="_blank">Access your workflow</a>`


#### DBSQL dashboards:
Similar to workflows, your dashboard id must match the one in the bundle file.

Dashboard definitions should be added to the `_dashboards` folder (make sure the file name matches the dashboard id: `churn-prediction.lvdash.json`).

`<a dbdemos-dashboard-id="churn-prediction" href="/sql/dashboardsv3/19394330-2274-4b4b-90ce-d415a7ff2130" target="_blank">Churn Analysis Dashboard</a>`
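The id-based link rewriting described in these three sections can be sketched as a small regex pass. This is a hypothetical illustration (the `rewrite_links` helper is not dbdemos' actual code; only the `dbdemos-*-id` attribute convention comes from this README):

```python
import re

# Hypothetical helper showing how an anchor's href can be rewritten based
# on its dbdemos-*-id attribute; not the library's actual implementation.
def rewrite_links(html, attr, resource_urls):
    pattern = re.compile(attr + r'="([^"]+)" href="[^"]*"')
    def repl(match):
        rid = match.group(1)
        url = resource_urls.get(rid)
        if url is None:
            return match.group(0)  # unknown id: leave the link untouched
        return f'{attr}="{rid}" href="{url}"'
    return pattern.sub(repl, html)

link = ('<a dbdemos-pipeline-id="dlt-churn" '
        'href="#joblist/pipelines/a6ba1d12" target="_blank">DLT pipeline</a>')
rewritten = rewrite_links(link, "dbdemos-pipeline-id",
                          {"dlt-churn": "#joblist/pipelines/new-uuid-1234"})
print(rewritten)
```

The same pass would run once per tag type (`dbdemos-pipeline-id`, `dbdemos-workflow-id`, `dbdemos-dashboard-id`) with the ids mapped to the resources created at install time.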



### bundle_config
The demo must contain a `./_resources/bundle_config` file containing your bundle definition.
This must be a notebook, not a .json file (due to a current API limitation).

```json
{
  "name": "<Demo name, used in dbdemos.install('xxx')>",
  "category": "<Category, like data-engineering>",
  "title": "<Title>.",
  "description": "<Description>",
  "bundle": <Will bundle when True, skip when False>,
  "tags": [{"dlt": "Delta Live Table"}],
  "notebooks": [
    {
      "path": "<notebook path from the demo folder (ex: resources/00-load-data)>", 
      "pre_run": <Will start a job to run it before packaging to get the cells results>, 
      "publish_on_website": <Will add the notebook in the public website (with the results if it's pre_run=True)>, 
      "add_cluster_setup_cell": <if True, add a cell with the name of the demo cluster>,
      "title":  "<Title>", 
      "description": "<Description (will be in minisite also)>",
      "parameters": {"<key>": "<value. Will be sent to the pre_run job>"}
    }
  ],
  "init_job": {
    "settings": {
        "name": "demos_dlt_cdc_init_{{CURRENT_USER_NAME}}",
        "email_notifications": {
            "no_alert_for_skipped_runs": False
        },
        "timeout_seconds": 0,
        "max_concurrent_runs": 1,
        "tasks": [
            {
                "task_key": "init_data",
                "notebook_task": {
                    "notebook_path": "{{DEMO_FOLDER}}/_resources/01-load-data-quality-dashboard",
                    "source": "WORKSPACE"
                },
                "job_cluster_key": "Shared_job_cluster",
                "timeout_seconds": 0,
                "email_notifications": {}
            }
        ]
        .... Full standard job definition
    }
  },
  "pipelines": <list of DLT pipelines if any>
  [
    {
      "id": "dlt-cdc", <id, used in the notebook links to go to the generated notebook: <a dbdemos-pipeline-id="dlt-cdc" href="#joblist/pipelines/xxxx">installed DLT pipeline</a> >
      "run_after_creation": True,
      "definition": {
        ... Any DLT pipeline configuration ...
        "libraries": [
            {
                "notebook": {
                    "path": "{{DEMO_FOLDER}}/_resources/00-Data_CDC_Generator"
                }
            }
        ],
        "name": "demos_dlt_cdc_{{CURRENT_USER_NAME}}",
        "storage": "/demos/dlt/cdc/{{CURRENT_USER_NAME}}",
        "target": "demos_dlt_cdc_{{CURRENT_USER_NAME}}"
      }
    }
  ],
  "workflows": [{
    "start_on_install": False,
    "id": "credit-job",
    "definition": {
        "settings": {
        ... full pipeline settings
    }
  }],
  "dashboards": [{"name": "[dbdemos] Retail Churn Prediction Dashboard", "id": "churn-prediction"}]
}
```
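As a minimal sketch, a hypothetical pre-flight check over a parsed bundle definition could look like this (the field names follow the template above; the `validate_bundle` helper itself is not part of dbdemos):

```python
# Hypothetical validation of a parsed bundle config; field names come
# from the template above, but this helper is not part of dbdemos.
def validate_bundle(conf):
    errors = []
    for field in ("name", "category", "title", "description", "notebooks"):
        if field not in conf:
            errors.append(f"missing required field: {field}")
    for i, nb in enumerate(conf.get("notebooks", [])):
        if "path" not in nb:
            errors.append(f"notebook #{i} has no 'path'")
    for dash in conf.get("dashboards", []):
        if "id" not in dash:
            errors.append(f"dashboard {dash.get('name')} has no 'id'")
    return errors

conf = {"name": "dlt-cdc", "category": "data-engineering",
        "title": "CDC pipeline", "description": "DLT CDC demo",
        "notebooks": [{"path": "01-dlt-cdc"}]}
print(validate_bundle(conf))  # []
```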

dbdemos will replace the values defined as `{{<KEY>}}` based on who installs the demo. Supported keys:
* TODAY
* CURRENT_USER (email)
* CURRENT_USER_NAME (derived from the email)
* DEMO_NAME
* DEMO_FOLDER
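The substitution above can be sketched in a few lines, assuming simple string replacement (the real dbdemos implementation may differ):

```python
import re

# Minimal sketch of {{KEY}} substitution as described above; unknown
# keys are left untouched. Not dbdemos' actual code.
def substitute(template, values):
    return re.sub(r"\{\{(\w+)\}\}",
                  lambda m: values.get(m.group(1), m.group(0)),
                  template)

rendered = substitute(
    "demos_dlt_cdc_{{CURRENT_USER_NAME}} in {{DEMO_FOLDER}}",
    {"CURRENT_USER_NAME": "quentin_ambard",
     "DEMO_FOLDER": "/Users/quentin.ambard@databricks.com/dbdemos"},
)
print(rendered)
```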


# DBDemos Installer configuration

The following describes how to package the demos you created.

The installer needs to fetch data from a workspace and start jobs. To do so, it requires the following information in a `local_conf.json` file (set `repo_url` to your own clone):
```json
{
  "pat_token": "xxx",
  "username": "xx.xx@databricks.com",
  "url": "https://xxx.databricks.com",
  "repo_staging_path": "/Repos/xx.xx@databricks.com",
  "repo_name": "dbdemos-notebooks",
  "repo_url": "https://github.com/databricks-demos/dbdemos-notebooks.git",
  "branch": "master",
  "current_folder": "<Used to mock the current folder outside of a notebook, ex: /Users/quentin.ambard@databricks.com/test_install_demo>"
}
```
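Before running the bundler, it can help to verify the file is complete. The sketch below is a hypothetical sanity check (the `load_local_conf` helper and `REQUIRED_KEYS` set are assumptions based on the template above, not part of dbdemos):

```python
import json
import tempfile

# Keys taken from the local_conf.json template above.
REQUIRED_KEYS = {"pat_token", "username", "url", "repo_staging_path",
                 "repo_name", "repo_url", "branch"}

# Hypothetical sanity check run before starting the bundler.
def load_local_conf(path):
    with open(path) as f:
        conf = json.load(f)
    missing = REQUIRED_KEYS - conf.keys()
    if missing:
        raise ValueError(f"local_conf.json is missing keys: {sorted(missing)}")
    return conf

# Quick self-check with a throwaway file.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump({k: "placeholder" for k in REQUIRED_KEYS}, f)
conf = load_local_conf(f.name)
print(sorted(conf))
```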

### Creating the bundles:
```python
bundler = JobBundler(conf)
# the bundler will use a staging repo dir in the workspace to analyze & run content.
bundler.reset_staging_repo(skip_pull=False)
# Discover bundles from repo:
bundler.load_bundles_conf()
# Or manually add bundle to run faster:
#bundler.add_bundle("product_demos/Auto-Loader (cloudFiles)")

# Run the jobs (only if there is a new commit since the last time, or failure, or force execution)
bundler.start_and_wait_bundle_jobs(force_execution = False)

packager = Packager(conf, bundler)
packager.package_all()
```


## License
See LICENSE file.

## Data collection
To improve the user experience and dbdemos asset quality, dbdemos sends usage reports and captures views in the installed notebooks (usually in the first cell) and dashboards. This information is collected for product improvement only, not for marketing purposes, and doesn't contain any PII. By using `dbdemos` and the assets it provides, you consent to this data collection. If you wish to disable it, you can set `Tracker.enable_tracker` to `False` in the `tracker.py` file.

## Resource creation
To simplify your experience, `dbdemos` will create and start resources for you. For example, a demo could start (non-exhaustive):
- A cluster to run your demo
- A Delta Live Table Pipeline to ingest data
- A DBSQL endpoint to run DBSQL dashboard
- An ML model

While `dbdemos` does its best to limit consumption and enforce resource auto-termination, you remain responsible for the resources created and the potential consumption associated with them.

## Support
Databricks does not offer official support for `dbdemos` and the associated assets.
For any issue with `dbdemos` or the installed demos, please open an issue and the demo team will take a look on a best-effort basis.


            
