:Name: bquest
:Version: 0.5.0
:Home page: https://github.com/ottogroup/bquest
:Summary: Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.
:Upload time: 2024-02-20 18:26:57
:Author: Otto Group data.works GmbH
:Requires Python: >=3.10,<4.0
:License: Apache Software License
:Keywords: open-source, google-big-query, query, sql, testing, pandas

.. image:: https://raw.githubusercontent.com/ottogroup/bquest/main/docs/assets/logo.svg
    :alt: BQuest Logo

BQuest
######

Effortlessly validate and test your Google BigQuery queries with the power of pandas DataFrames in Python.

We would like to thank `Mike Czech <https://github.com/mikeczech>`_, the original creator of BQuest!

**Warning**

This library is a work in progress!

Breaking changes should be expected until a 1.0 release, so version pinning is recommended.

.. image:: https://github.com/ottogroup/bquest/workflows/Tests/badge.svg
   :target: https://github.com/ottogroup/bquest/actions?workflow=Tests
   :alt: CI: Overall outcome
.. image:: https://github.com/ottogroup/bquest/actions/workflows/pages/pages-build-deployment/badge.svg?branch=gh-pages
   :target: https://github.com/ottogroup/bquest/actions/workflows/pages/pages-build-deployment
   :alt: CD: gh-pages documentation
.. image:: https://img.shields.io/pypi/v/bquest.svg
   :target: https://pypi.org/project/bquest/
   :alt: PyPI version
.. image:: https://img.shields.io/pypi/status/bquest.svg
   :target: https://pypi.python.org/pypi/bquest/
   :alt: Project status (alpha, beta, stable)
.. image:: https://static.pepy.tech/personalized-badge/bquest?period=month&units=international_system&left_color=grey&right_color=blue&left_text=PyPI%20downloads/month
   :target: https://pepy.tech/project/bquest
   :alt: PyPI downloads
.. image:: https://img.shields.io/github/license/ottogroup/bquest
   :target: https://github.com/ottogroup/bquest/blob/main/LICENSE
   :alt: Project license
.. image:: https://img.shields.io/pypi/pyversions/bquest.svg
   :target: https://pypi.python.org/pypi/bquest/
   :alt: Python version compatibility
.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
   :target: https://github.com/psf/black
   :alt: Code style: Black

Overview
********

* Use BQuest in combination with your favorite testing framework (e.g. pytest).
* Create temporary test tables from JSON_ or `pandas DataFrame`_.
* Run BQ configurations and plain SQL queries on your test tables and check the result.

.. _JSON: https://cloud.google.com/bigquery/docs/loading-data
.. _pandas DataFrame: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html

Installation
************

Via PyPI (standard):

.. code-block:: bash

    pip install bquest


Via GitHub (most recent):

.. code-block:: bash

    pip install git+https://github.com/ottogroup/bquest


BQuest also requires a dedicated BigQuery dataset for storing test tables. For example, with Terraform:

.. code-block:: terraform

    resource "google_bigquery_dataset" "bquest" {
      dataset_id    = "bquest"
      friendly_name = "bquest"
      description   = "Source tables for bquest tests"
      location      = "EU"
      default_table_expiration_ms = 3600000
    }

We recommend setting an `expiration time`_ for tables in the bquest dataset to ensure that test tables are removed
automatically after the tests have run.

.. _`expiration time`: https://www.terraform.io/docs/providers/google/r/bigquery_dataset.html#default_table_expiration_ms
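
As a quick sanity check on the value used above: ``default_table_expiration_ms`` is given in milliseconds, so
``3600000`` corresponds to one hour. A minimal sketch using only the Python standard library (the variable names are
illustrative, not part of BQuest):

.. code-block:: python

    from datetime import timedelta

    # Value taken from the dataset definition above (milliseconds).
    default_table_expiration_ms = 3_600_000

    expiration = timedelta(milliseconds=default_table_expiration_ms)
    print(expiration)  # 1:00:00
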

Example
*******

Given a pandas DataFrame

.. list-table::
   :widths: 30 30 30
   :header-rows: 1

   * - foo
     - weight
     - prediction_date
   * - bar
     - 23
     - 20190301
   * - my
     - 42
     - 20190301

and its table definition

.. code-block:: python

    from bquest.tables import BQTableDefinitionBuilder

    # GOOGLE_PROJECT_ID and df (the DataFrame above) are assumed to be defined by your test setup.
    table_def_builder = BQTableDefinitionBuilder(GOOGLE_PROJECT_ID, dataset="bquest", location="EU")
    table_definition = table_def_builder.from_df("abc.feed_latest", df)

you can use the config file *./abc/config.py*

.. code-block:: python

    {
        "query": """
            SELECT
                foo,
                PARSE_DATE('%Y%m%d', prediction_date)
            FROM
                `{source_table}`
            WHERE
                weight > {THRESHOLD}
        """,
        "start_date": "prediction_date",
        "end_date": "prediction_date",
        "source_tables": {"source_table": "abc.feed_latest"},
        "feature_table_name": "abc.myid",
    }

and the runner

.. code-block:: python

    from bquest.runner import BQConfigFileRunner, BQConfigRunner

    # bq_client and bq_executor_func are assumed to be provided by your test setup.
    runner = BQConfigFileRunner(
        BQConfigRunner(bq_client, bq_executor_func),
        "config/bq_config",
    )

    result_df = runner.run_config(
        "20190301",
        "20190308",
        [table_definition],
        "abc/config.py",
        # Fills the {THRESHOLD} placeholder in the query.
        templating_vars={"THRESHOLD": "30"},
    )

to assert the result table

.. code-block:: python

    assert result_df.shape == (1, 2)
    assert result_df.iloc[0]["foo"] == "my"
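
To see why these assertions hold, the input table and the query's filter can be reproduced locally with pandas alone.
This is only a sketch under stated assumptions: ``prediction_date`` is assumed to be stored as a string (``PARSE_DATE``
expects one), and ``expected`` is a hypothetical local stand-in for the query result, not something BQuest produces.

.. code-block:: python

    import pandas as pd

    # The example input table from above.
    df = pd.DataFrame(
        {
            "foo": ["bar", "my"],
            "weight": [23, 42],
            "prediction_date": ["20190301", "20190301"],
        }
    )

    THRESHOLD = 30

    # Local equivalent of: SELECT foo, PARSE_DATE(...) ... WHERE weight > {THRESHOLD}
    expected = df.loc[df["weight"] > THRESHOLD, ["foo", "prediction_date"]].copy()
    # The column name is kept here for readability; the query itself does not alias it.
    expected["prediction_date"] = pd.to_datetime(
        expected["prediction_date"], format="%Y%m%d"
    ).dt.date

    assert expected.shape == (1, 2)
    assert expected.iloc[0]["foo"] == "my"

Only the row with ``weight`` 42 passes the threshold of 30, and the query selects two columns, hence the shape
``(1, 2)``.
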

Testing
*******

For the actual testing, bquest relies on an accessible BigQuery project, which can be configured
with the gcloud_ client. The corresponding ``GOOGLE_PROJECT_ID`` is taken from this project
and used with pandas-gbq_ to write temporary tables to the bquest dataset, which must be set up
in that project before the tests run.

For GitHub CI, we have configured an identity provider in our testing project that allows
only core members of this repository to access the testing project's resources.

.. _gcloud: https://cloud.google.com/sdk/docs/install?hl=de
.. _pandas-gbq: https://github.com/googleapis/python-bigquery-pandas
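
One possible way to obtain a project ID for local test runs is sketched below. This is an assumption for illustration,
not BQuest's own mechanism: ``resolve_project_id`` is a hypothetical helper that first checks the environment
variables ``GOOGLE_PROJECT_ID`` and ``GOOGLE_CLOUD_PROJECT`` and then falls back to the project configured in the
gcloud CLI via ``gcloud config get-value project``.

.. code-block:: python

    import os
    import shutil
    import subprocess


    def resolve_project_id(env=None):
        """Return a Google Cloud project ID, or None if none can be found."""
        env = os.environ if env is None else env

        # 1. Prefer an explicit environment variable.
        for name in ("GOOGLE_PROJECT_ID", "GOOGLE_CLOUD_PROJECT"):
            value = env.get(name)
            if value:
                return value

        # 2. Fall back to the project configured in the gcloud CLI, if installed.
        gcloud = shutil.which("gcloud")
        if gcloud is None:
            return None
        completed = subprocess.run(
            [gcloud, "config", "get-value", "project"],
            capture_output=True,
            text=True,
            check=False,
        )
        return completed.stdout.strip() or None

The result could then be passed wherever the example above uses ``GOOGLE_PROJECT_ID``.
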

Important Links
***************

- Full documentation: https://ottogroup.github.io/bquest/

            
