view-selection-python


* Name: view-selection-python
* Version: 0.1.5
* Home page: https://github.com/bramreinders97/view_selection_tool_python
* Summary: View Selection Tool for dbt (python part)
* Upload time: 2024-08-08 14:23:20
* Author: Bram Reinders
* Requires Python: `<4.0.0,>=3.11.5`
* License: MIT
* Keywords: view, selection

# ViewSelectionAdvisor

Welcome to `ViewSelectionAdvisor`, a tool that helps dbt users decide how to materialize their models.
The tool consists of two separate packages, each hosted in its own GitHub repository:
* [A dbt package](https://github.com/bramreinders97/view_selection_tool_dbt), which depends on another dbt package: [Elementary](https://docs.elementary-data.com/guides/modules-overview/dbt-package)
* [A python package](https://github.com/bramreinders97/view_selection_tool_python)

### What does it do?
Most dbt projects are structured with a DAG that includes staging, intermediate, and marts models. Typically, staging and intermediate models are stored as views, while marts models are stored as tables. However, this default configuration may not always be the most efficient from a performance perspective. Determining which models should be materialized and which should not can be challenging.
This is where `ViewSelectionAdvisor` comes in: it advises you on the best
materialization strategy for your models in dbt.


### How does it work?
`ViewSelectionAdvisor` determines the optimal configuration of materialized models by evaluating all possible configurations. For each configuration, it estimates the total cost of building your entire DAG using PostgreSQL's `EXPLAIN` command.
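
To make the cost estimate concrete, here is a minimal sketch of reading the planner's estimate from `EXPLAIN (FORMAT JSON)` output. How `ViewSelectionAdvisor` itself reads the output of `EXPLAIN` is not described here, so the function name `total_cost` and the sample plan (including its numbers) are assumptions for illustration only.

```python
import json

# Sample shaped like PostgreSQL's `EXPLAIN (FORMAT JSON)` output.
# The relation name and all numbers are made up for this example.
SAMPLE_EXPLAIN_OUTPUT = """
[
  {
    "Plan": {
      "Node Type": "Seq Scan",
      "Relation Name": "orders",
      "Startup Cost": 0.0,
      "Total Cost": 155.0,
      "Plan Rows": 10000,
      "Plan Width": 12
    }
  }
]
"""


def total_cost(explain_json: str) -> float:
    """Return the planner's estimated total cost of the top-level plan node."""
    plans = json.loads(explain_json)
    # The JSON format is a list with one entry per statement; each entry
    # holds the root plan node under the "Plan" key.
    return float(plans[0]["Plan"]["Total Cost"])


print(total_cost(SAMPLE_EXPLAIN_OUTPUT))
```

Note that PostgreSQL reports these costs in arbitrary planner units, not in seconds, so they are meaningful for comparing configurations rather than for predicting run times.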

Note: `ViewSelectionAdvisor` assumes that all [destination nodes](## "Destination nodes are nodes in your DAG without an outgoing edge. In most cases, these nodes correspond to mart tables.") are already materialized as tables. Consequently, these nodes will not appear in the provided advice.

Note 2: By default, `ViewSelectionAdvisor` only considers configurations that materialize at most 2 models.
This limit can be changed with the `--max_materializations` option (see [overview of variables](#possible-variables-for-vst-advise)).
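
The search described above can be sketched as a brute-force enumeration. This is an illustration under assumptions, not the package's actual code: the function names, the model names, and the toy cost function are hypothetical, and whether the tool also considers the "materialize nothing" configuration is an assumption.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, Iterator


def candidate_configurations(
    models: Iterable[str], max_materializations: int = 2
) -> Iterator[FrozenSet[str]]:
    """Yield every set of models to materialize, from none up to the limit."""
    ordered = sorted(models)
    for size in range(max_materializations + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def cheapest_configuration(
    models: Iterable[str],
    estimate_cost: Callable[[FrozenSet[str]], float],
    max_materializations: int = 2,
) -> FrozenSet[str]:
    """Return the configuration with the lowest estimated cost."""
    return min(
        candidate_configurations(models, max_materializations),
        key=estimate_cost,
    )


# Hypothetical stand-in for an EXPLAIN-based estimate: each materialized
# model saves some recomputation but costs 10 units to build.
SAVINGS = {"stg_orders": 30.0, "int_orders": 15.0, "stg_customers": 5.0}


def toy_cost(config: FrozenSet[str]) -> float:
    return 100.0 - sum(SAVINGS[m] for m in config) + 10.0 * len(config)


configs = list(candidate_configurations(SAVINGS))
best = cheapest_configuration(SAVINGS, toy_cost)
print(len(configs), sorted(best))
```

The number of candidates grows combinatorially with the limit, which is why a higher `max_materializations` can noticeably increase the runtime.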


### A note on Elementary's default materializations warning
When you run any of the `dbt run` commands in the following steps, dbt may warn that Elementary is trying to override default materializations.
This warning can be ignored: [the developers of Elementary are aware of this and working on a solution](https://docs.elementary-data.com/oss/quickstart/quickstart-cli-package#important-allowing-elementary-to-override-dbts-default-materializations-relevant-from-dbt-1-8).
Furthermore, the parts of the Elementary package that are affected are not relevant for `ViewSelectionAdvisor`.

## Installation Instructions

### dbt part
1. Include `ViewSelectionAdvisor` in your `packages.yml` file (add the entry under the existing `packages:` key if you already have one):
    ```yaml
      packages:
        - git: "https://github.com/bramreinders97/view_selection_tool_dbt.git"
          revision: 4b7990736f651ae08f8d4a7a260f2c10ad1d862b
    ```
 
2. Update your `dbt_project.yml` file:

    - **Schema Configuration**:
      Specify the schema suffix that dbt should use for the relevant tables:
      ```yaml
        models:
          elementary:
            +schema: elementary
          view_selection_tool:
            +schema: view_selection_tool
      ```
      These settings ensure that if your project's tables are stored in schema `x`, then the tables from `elementary` will be stored in `x_elementary`, and those from `ViewSelectionAdvisor` will be stored in `x_view_selection_tool`.

    - **Variable Configuration**:
      Set the following variables:
      ```yaml
        vars:
          view_selection_tool:
            # Database where the elementary tables are located
            # (same as in your target profile from profiles.yml)
            elementary_src_db:  

            # Schema where the elementary tables are stored (e.g., `x_elementary`)
            src_schema:  

            # Name of your project as specified in this dbt_project.yml
            relevant_package:  
      ```
      This information allows `ViewSelectionAdvisor` to identify the data sources (`elementary_src_db` and `src_schema`) and the models to focus on (`relevant_package`).


3. Import the packages and build Elementary models
   ```shell
   dbt deps
   dbt run --select elementary
   ```
   This installs both the `view_selection_tool` and `elementary` packages and creates empty tables for Elementary to fill (in schema `x_elementary`).
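
The schema names used in step 2 (`x_elementary`, `x_view_selection_tool`) follow dbt's default naming rule for custom schemas. A minimal sketch of that rule is shown below; the function name `resolved_schema` is hypothetical, and the sketch assumes your project does not override dbt's `generate_schema_name` macro.

```python
from typing import Optional


def resolved_schema(target_schema: str, custom_schema: Optional[str] = None) -> str:
    """Mimic dbt's default rule: <target schema>_<custom schema>."""
    if custom_schema is None:
        # Models without a `+schema` setting stay in the target schema.
        return target_schema
    return f"{target_schema}_{custom_schema.strip()}"


print(resolved_schema("x", "elementary"))
print(resolved_schema("x", "view_selection_tool"))
```

The value printed for `elementary` is what you would fill in for the `src_schema` variable.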



### Python part

4. Install the package using your preferred method:
   ```shell
   pip install view-selection-python
   ```
   or
   ```shell
   poetry add view-selection-python
   ```


## Usage Instructions
Because `ViewSelectionAdvisor` relies entirely on the tables created by Elementary, it is crucial to ensure these tables are populated with the necessary information before running `ViewSelectionAdvisor`. Whenever you want to receive advice on the materialization of a DAG in dbt, follow these steps:

1. Populate Elementary tables with the latest information:
   ```shell
   dbt run --select <your_project_name>
   ```
   Running your project populates the Elementary tables with the data required by `ViewSelectionAdvisor`.

   _Note: This command only runs the models in your project, not the individual models from Elementary. However, the on-run-end hook of Elementary will execute automatically and provide all the necessary data._


2. Run `ViewSelectionAdvisor`:
   
   - **Transform Info From Elementary**:
   ```shell
   dbt run --select view_selection_tool
   ```
   This transforms the information in the Elementary tables and
   fills the database schema `x_view_selection_tool` with everything the
   Python part of `ViewSelectionAdvisor` needs to give its advice.

   - **Get the Advice**:
   ```shell
   vst-advise
   ```
   This command compares all possible materialization configurations, and advises on the configuration with 
   the lowest estimated cost. 
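
If you request advice regularly, the steps above can be chained in a small script. The sketch below is one possible wrapper, not part of the package: the function names are hypothetical, and it assumes `dbt` and `vst-advise` are on your `PATH` and that it is run from your dbt project directory.

```python
import subprocess
from typing import Callable, List


def advice_commands(project_name: str) -> List[List[str]]:
    """Return the three commands from the usage steps, in order."""
    return [
        ["dbt", "run", "--select", project_name],           # populate Elementary tables
        ["dbt", "run", "--select", "view_selection_tool"],  # transform Elementary info
        ["vst-advise"],                                      # compute the advice
    ]


def run_advice_pipeline(
    project_name: str, runner: Callable[..., object] = subprocess.run
) -> None:
    """Run each step and stop at the first failure (check=True raises)."""
    for command in advice_commands(project_name):
        runner(command, check=True)


# Printing instead of executing keeps this example free of side effects.
for command in advice_commands("my_project"):
    print(" ".join(command))
```

Calling `run_advice_pipeline("my_project")` executes the commands for real. The `runner` parameter exists only so the sequence can be tested without invoking dbt.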

### Possible Variables for `vst-advise`
The following command-line options change the behavior of `vst-advise`:

| Option                                                                        | Description                                                                                                                                  |
|-------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------|
| `-h`, `--help`                                                                | Show this help message and exit                                                                                                              |
| `-mm <MAX_MATERIALIZATIONS>`, `--max_materializations <MAX_MATERIALIZATIONS>` | Set the maximum number of models to consider for materialization. Higher values provide more options but may increase runtime. Default is 2. |
| `-p <PROFILE>`, `--profile <PROFILE>`                                         | Select the profile to use                                                                                                                    |
| `-t <TARGET>`, `--target <TARGET>`                                            | Select the target profile to use                                                                                                             |
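
As an illustration of how these options combine, the following sketch builds a `vst-advise` command line from optional settings. The helper `vst_advise_args` is hypothetical. Only the flags listed in the table are used, and the profile and target names in the example are placeholders.

```python
import shlex
from typing import List, Optional


def vst_advise_args(
    max_materializations: Optional[int] = None,
    profile: Optional[str] = None,
    target: Optional[str] = None,
) -> List[str]:
    """Build the argument list, adding only the options that were set."""
    args = ["vst-advise"]
    if max_materializations is not None:
        args += ["--max_materializations", str(max_materializations)]
    if profile is not None:
        args += ["--profile", profile]
    if target is not None:
        args += ["--target", target]
    return args


# Placeholder values; substitute your own profile and target names.
print(shlex.join(vst_advise_args(max_materializations=3, target="dev")))
```

Leaving every argument unset yields the plain `vst-advise` call, which uses the default limit of 2 materializations.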


            
