weiser-ai

Name	weiser-ai
Version	0.2.2
home_page	None
Summary	Enterprise-grade data quality framework with YAML configuration, LLM-friendly design, and advanced statistical validation
upload_time	2025-08-14 18:42:51
maintainer	None
docs_url	None
author	None
requires_python	<3.13,>=3.10
license	None
keywords	data-quality data-validation data-testing sql yaml llm ai
requirements	alembic annotated-types asn1crypto boto3 botocore cachetools certifi cffi charset-normalizer click colorama cryptography databricks-sql-connector databricks-sqlalchemy duckdb duckdb-engine et-xmlfile exceptiongroup filelock google-api-core google-auth google-cloud-bigquery google-cloud-core google-crc32c google-resumable-media googleapis-common-protos greenlet grpcio grpcio-status idna iniconfig jinja2 jmespath lz4 mako markdown-it-py markupsafe mdurl numpy oauthlib openpyxl packaging pandas platformdirs pluggy proto-plus protobuf psycopg2 pyarrow pyasn1 pyasn1-modules pycparser pydantic pydantic-core pygments pyjwt pymysql pyopenssl pytest python-dateutil python-dotenv pytz pyyaml requests rich rsa s3transfer shellingham six slack-sdk snowflake-connector-python snowflake-sqlalchemy sortedcontainers sqlalchemy sqlalchemy-bigquery sqlglot sqlglotrs sqlmodel thrift tomli tomlkit typer typing-extensions tzdata urllib3

# Weiser

Data Quality Framework

## Introduction

Weiser is a data quality framework designed to help you ensure the integrity and accuracy of your data. It provides a set of tools and checks to validate your data and detect anomalies. It also includes a dashboard to visualize the results of the checks.

## Installation

To install Weiser, use the following command:

```sh
pip install weiser-ai
```

## Usage

### Run example checks

Connections are defined in the `datasources` section of the config file; see `examples/example.yaml`.

Run checks in verbose mode:

```sh
weiser run examples/example.yaml -v
```

[![Watch the CLI Demo](https://cdn.loom.com/sessions/thumbnails/ce75ad760c324733a36c637a9f8fe826-401f2819c5918c19-full-play.gif)](https://www.loom.com/share/ce75ad760c324733a36c637a9f8fe826)

Compile the checks only, in verbose mode:

```sh
weiser compile examples/example.yaml -v
```

### Run dashboard

```sh
cd weiser-ui
pip install -r requirements.txt
streamlit run app.py
```

[![Watch the Dashboard Demo](https://cdn.loom.com/sessions/thumbnails/3154b4ce21ea4aaa917066991eaf1fb6-aca9c23da977e100-full-play.gif)](https://www.loom.com/share/3154b4ce21ea4aaa917066991eaf1fb6)

## Configuration

Simple count check definition

```yaml
- name: test row_count
  dataset: orders
  type: row_count
  condition: gt
  threshold: 0
```
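
The `condition` and `threshold` fields together describe a comparison applied to the measured value. As a rough illustration (not Weiser's actual implementation), that comparison could be sketched in Python as follows. Only `gt`, `le`, and `between` appear in this README; the other condition names, the `passes` helper, and the inclusive treatment of `between` are assumptions.

```python
import operator

# Comparison functions keyed by condition name. Only `gt`, `le`, and `between`
# appear in this README; the remaining names are assumptions for illustration.
COMPARATORS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
    "eq": operator.eq,
}


def passes(condition, value, threshold):
    """Return True when `value` satisfies `condition` against `threshold`."""
    if condition == "between":
        # `between` takes a [low, high] pair; inclusive bounds are an assumption.
        low, high = threshold
        return low <= value <= high
    return COMPARATORS[condition](value, threshold)


# A table with 1200 rows satisfies `gt 0`; an empty table does not.
print(passes("gt", 1200, 0))
print(passes("gt", 0, 0))
```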

Custom SQL definition

```yaml
- name: test numeric
  dataset: orders
  type: numeric
  measure: sum(budgeted_amount::numeric::float)
  condition: gt
  threshold: 0
```

Target multiple datasets with the same check definition

```yaml
- name: test row_count
  dataset: [orders, vendors]
  type: row_count
  condition: gt
  threshold: 0
```
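
One way to picture a multi-dataset check is as shorthand for several single-dataset checks that share every other field. The sketch below shows that expansion in plain Python; `expand_datasets` is a hypothetical helper, and Weiser may handle the list differently internally.

```python
def expand_datasets(check):
    """Yield one single-dataset check per entry in check["dataset"]."""
    datasets = check["dataset"]
    if isinstance(datasets, str):
        # A plain string names a single dataset.
        datasets = [datasets]
    for dataset in datasets:
        # Copy the check and override only the dataset field.
        yield {**check, "dataset": dataset}


check = {
    "name": "test row_count",
    "dataset": ["orders", "vendors"],
    "type": "row_count",
    "condition": "gt",
    "threshold": 0,
}

for single in expand_datasets(check):
    print(single["dataset"], single["type"], single["condition"], single["threshold"])
```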

Check each group-by value individually

```yaml
- name: test row_count groupby
  dataset: vendors
  type: row_count
  dimensions:
    - tenant_id
  condition: gt
  threshold: 0
```
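
Conceptually, a check with `dimensions` evaluates the condition once per group rather than once per table. The following sketch reproduces that idea with Python's built-in `sqlite3` module and a made-up `vendors` table; the table contents and the query shape are assumptions, not Weiser's generated SQL.

```python
import sqlite3

# In-memory stand-in for the `vendors` dataset (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendors (id INTEGER, tenant_id TEXT)")
conn.executemany(
    "INSERT INTO vendors VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b")],
)

# One row count per tenant_id value.
rows = conn.execute(
    "SELECT tenant_id, COUNT(*) FROM vendors GROUP BY tenant_id ORDER BY tenant_id"
).fetchall()

# Apply `condition: gt` with `threshold: 0` to each group separately.
results = {tenant: count > 0 for tenant, count in rows}
print(results)
```

Note that a plain `GROUP BY` only returns groups that have at least one row, so a tenant with no vendors at all would not show up in this result.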

Time aggregation check with granularity

```yaml
- name: test numeric gt sum yearly
  dataset: orders
  type: sum
  measure: budgeted_amount::numeric::float
  condition: gt
  threshold: 0
  time_dimension:
    name: _updated_at
    granularity: year
```
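
A time dimension with a granularity can be read as "bucket the rows by period, then evaluate the measure per bucket". The sketch below illustrates yearly bucketing using `sqlite3` and SQLite's `strftime`; the sample rows are invented, and the SQL Weiser actually generates for a given warehouse will likely differ.

```python
import sqlite3

# In-memory stand-in for the `orders` dataset (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (budgeted_amount REAL, _updated_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(100.0, "2023-03-01"), (50.0, "2023-09-15"), (75.0, "2024-01-10")],
)

# Sum the measure per calendar year of the time dimension.
query = """
SELECT strftime('%Y', _updated_at) AS yr, SUM(budgeted_amount) AS total
FROM orders
GROUP BY yr
ORDER BY yr
"""
yearly = dict(conn.execute(query).fetchall())

# Apply `condition: gt` with `threshold: 0` to each yearly bucket.
results = {year: total > 0 for year, total in yearly.items()}
print(yearly)
print(results)
```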

Custom SQL expression as the dataset, combined with a filter

```yaml
- name: test numeric completed
  dataset: >
    SELECT * FROM orders o LEFT JOIN orders_status os ON o.order_id = os.order_id
  type: numeric
  measure: sum(budgeted_amount::numeric::float)
  condition: gt
  threshold: 0
  filter: status = 'FULFILLED'
```
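
To make the roles of `dataset`, `measure`, and `filter` concrete, here is a minimal sketch of how such a check might be turned into a single query. It uses plain string formatting, and both `compile_numeric_check` and the rule "anything starting with SELECT is a subquery" are assumptions for illustration; they do not describe how `weiser compile` works.

```python
def compile_numeric_check(check):
    """Build a SQL string from a check's dataset, measure, and optional filter."""
    dataset = check["dataset"].strip()
    # Assumption: a dataset that starts with SELECT is wrapped as a subquery,
    # anything else is used as a table name.
    if dataset.upper().startswith("SELECT"):
        source = f"({dataset}) AS t"
    else:
        source = dataset
    sql = f"SELECT {check['measure']} AS value FROM {source}"
    if check.get("filter"):
        sql += f" WHERE {check['filter']}"
    return sql


check = {
    "name": "test numeric completed",
    # A YAML folded scalar (`>`) keeps a trailing newline, hence the strip() above.
    "dataset": "SELECT * FROM orders o LEFT JOIN orders_status os ON o.order_id = os.order_id\n",
    "type": "numeric",
    "measure": "sum(budgeted_amount::numeric::float)",
    "condition": "gt",
    "threshold": 0,
    "filter": "status = 'FULFILLED'",
}

print(compile_numeric_check(check))
```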

Missing values check

```yaml
- name: customer data quality
  dataset: orders
  type: not_empty
  dimensions: ["customer_id", "product_id", "order_date"]
  condition: le
  # Allow up to 5 NULL values per dimension
  threshold: 5
  filter: "status = 'active'"
```
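
Following the comment in the example above, the check can be read as "count the NULL values in each listed column, after applying the filter, and compare each count to the threshold". A small sketch of that reading with `sqlite3` and invented sample rows follows; it is an interpretation of the README, not Weiser's own query.

```python
import sqlite3

# In-memory stand-in for the `orders` dataset (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, product_id INTEGER, order_date TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2024-01-01", "active"),
        (None, 11, "2024-01-02", "active"),
        (3, None, None, "active"),
        (None, None, None, "inactive"),  # excluded by the filter
    ],
)

dimensions = ["customer_id", "product_id", "order_date"]
threshold = 5

null_counts = {}
for column in dimensions:
    # Column names come from the fixed list above, so interpolation is safe here.
    query = f"SELECT COUNT(*) FROM orders WHERE {column} IS NULL AND status = 'active'"
    null_counts[column] = conn.execute(query).fetchone()[0]

# Apply `condition: le` with `threshold: 5` to each column's NULL count.
results = {column: count <= threshold for column, count in null_counts.items()}
print(null_counts)
print(results)
```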

Anomaly detection check

```yaml
- name: test anomaly
  # Anomaly checks should always target the metrics metadata dataset
  dataset: metrics
  type: anomaly
  # References Orders row count.
  check_id: c5cee10898e30edd1c0dde3f24966b4c47890fcf247e5b630c2c156f7ac7ba22
  condition: between
  # Z-score bounds; values outside this range fall in the far tails of a normal distribution.
  threshold: [-3.5, 3.5]
```
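
For intuition about the `[-3.5, 3.5]` threshold, the sketch below computes a plain Z-score of the latest value against a history of earlier values using Python's `statistics` module. The history values are invented, and the mean and sample standard deviation used here are an assumption; Weiser may compute its score differently.

```python
from statistics import mean, stdev


def z_score(history, latest):
    """Return how many sample standard deviations `latest` is from the mean of `history`."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        raise ValueError("history has no variance; Z-score is undefined")
    return (latest - mu) / sigma


def is_anomaly(history, latest, low=-3.5, high=3.5):
    """Flag `latest` when its Z-score falls outside the [low, high] range."""
    return not (low <= z_score(history, latest) <= high)


# Hypothetical daily row counts recorded by earlier runs of a row_count check.
history = [1000, 1020, 980, 1010, 990]

print(is_anomaly(history, 1005))  # close to the historical mean
print(is_anomaly(history, 1200))  # far above the historical range
```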

## Contributing

We welcome contributions!

## License

This project is licensed under the Apache 2.0 License. See the `LICENSE` file for more details.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "weiser-ai",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.10",
    "maintainer_email": "Paco Valdez <paco.valdez@berkeley.edu>",
    "keywords": "data-quality, data-validation, data-testing, sql, yaml, llm, ai",
    "author": null,
    "author_email": "Paco Valdez <paco.valdez@berkeley.edu>",
    "download_url": "https://files.pythonhosted.org/packages/b3/b6/b90e198a828f3db233460ea4f472fbc85332a35925f925856bc43955940b/weiser_ai-0.2.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Enterprise-grade data quality framework with YAML configuration, LLM-friendly design, and advanced statistical validation",
    "version": "0.2.2",
    "project_urls": {
        "Bug Tracker": "https://github.com/weiser-ai/weiser-ai/issues",
        "Documentation": "https://weiser.ai",
        "Homepage": "https://weiser.ai",
        "Repository": "https://github.com/weiser-ai/weiser-ai"
    },
    "split_keywords": [
        "data-quality",
        " data-validation",
        " data-testing",
        " sql",
        " yaml",
        " llm",
        " ai"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "afec0174d76e7bb1b8955951afb556dedf78f48f8552bb8416115d055e1835b4",
                "md5": "e5bef065959f696a2851ea656ebfc58d",
                "sha256": "4866c4e82abd103a4396d5ed8ed6ff575b8ebc8e1f65bc59a10d274d4a58385e"
            },
            "downloads": -1,
            "filename": "weiser_ai-0.2.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e5bef065959f696a2851ea656ebfc58d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.10",
            "size": 46739,
            "upload_time": "2025-08-14T18:42:49",
            "upload_time_iso_8601": "2025-08-14T18:42:49.846266Z",
            "url": "https://files.pythonhosted.org/packages/af/ec/0174d76e7bb1b8955951afb556dedf78f48f8552bb8416115d055e1835b4/weiser_ai-0.2.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b3b6b90e198a828f3db233460ea4f472fbc85332a35925f925856bc43955940b",
                "md5": "65a90435aef4e5eafc2f9db5b6f4fd0b",
                "sha256": "ae95d51d85b1cd73cf4a3d0929fecfcc652e5701f4e6921d15e76277ce449093"
            },
            "downloads": -1,
            "filename": "weiser_ai-0.2.2.tar.gz",
            "has_sig": false,
            "md5_digest": "65a90435aef4e5eafc2f9db5b6f4fd0b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.10",
            "size": 35117,
            "upload_time": "2025-08-14T18:42:51",
            "upload_time_iso_8601": "2025-08-14T18:42:51.307823Z",
            "url": "https://files.pythonhosted.org/packages/b3/b6/b90e198a828f3db233460ea4f472fbc85332a35925f925856bc43955940b/weiser_ai-0.2.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-14 18:42:51",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "weiser-ai",
    "github_project": "weiser-ai",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "alembic",
            "specs": [
                [
                    "==",
                    "1.16.4"
                ]
            ]
        },
        {
            "name": "annotated-types",
            "specs": [
                [
                    "==",
                    "0.6.0"
                ]
            ]
        },
        {
            "name": "asn1crypto",
            "specs": [
                [
                    "==",
                    "1.5.1"
                ]
            ]
        },
        {
            "name": "boto3",
            "specs": [
                [
                    "==",
                    "1.35.49"
                ]
            ]
        },
        {
            "name": "botocore",
            "specs": [
                [
                    "==",
                    "1.35.49"
                ]
            ]
        },
        {
            "name": "cachetools",
            "specs": [
                [
                    "==",
                    "5.5.2"
                ]
            ]
        },
        {
            "name": "certifi",
            "specs": [
                [
                    "==",
                    "2025.6.15"
                ]
            ]
        },
        {
            "name": "cffi",
            "specs": [
                [
                    "==",
                    "1.17.1"
                ]
            ]
        },
        {
            "name": "charset-normalizer",
            "specs": [
                [
                    "==",
                    "3.4.2"
                ]
            ]
        },
        {
            "name": "click",
            "specs": [
                [
                    "==",
                    "8.1.7"
                ]
            ]
        },
        {
            "name": "colorama",
            "specs": [
                [
                    "==",
                    "0.4.6"
                ]
            ]
        },
        {
            "name": "cryptography",
            "specs": [
                [
                    "==",
                    "45.0.5"
                ]
            ]
        },
        {
            "name": "databricks-sql-connector",
            "specs": [
                [
                    "==",
                    "4.0.5"
                ]
            ]
        },
        {
            "name": "databricks-sqlalchemy",
            "specs": [
                [
                    "==",
                    "2.0.7"
                ]
            ]
        },
        {
            "name": "duckdb",
            "specs": [
                [
                    "==",
                    "1.3.2"
                ]
            ]
        },
        {
            "name": "duckdb-engine",
            "specs": [
                [
                    "==",
                    "0.17.0"
                ]
            ]
        },
        {
            "name": "et-xmlfile",
            "specs": [
                [
                    "==",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "exceptiongroup",
            "specs": [
                [
                    "==",
                    "1.2.0"
                ]
            ]
        },
        {
            "name": "filelock",
            "specs": [
                [
                    "==",
                    "3.18.0"
                ]
            ]
        },
        {
            "name": "google-api-core",
            "specs": [
                [
                    "==",
                    "2.25.1"
                ]
            ]
        },
        {
            "name": "google-auth",
            "specs": [
                [
                    "==",
                    "2.40.3"
                ]
            ]
        },
        {
            "name": "google-cloud-bigquery",
            "specs": [
                [
                    "==",
                    "3.34.0"
                ]
            ]
        },
        {
            "name": "google-cloud-core",
            "specs": [
                [
                    "==",
                    "2.4.3"
                ]
            ]
        },
        {
            "name": "google-crc32c",
            "specs": [
                [
                    "==",
                    "1.7.1"
                ]
            ]
        },
        {
            "name": "google-resumable-media",
            "specs": [
                [
                    "==",
                    "2.7.2"
                ]
            ]
        },
        {
            "name": "googleapis-common-protos",
            "specs": [
                [
                    "==",
                    "1.70.0"
                ]
            ]
        },
        {
            "name": "greenlet",
            "specs": [
                [
                    "==",
                    "3.0.3"
                ]
            ]
        },
        {
            "name": "grpcio",
            "specs": [
                [
                    "==",
                    "1.73.1"
                ]
            ]
        },
        {
            "name": "grpcio-status",
            "specs": [
                [
                    "==",
                    "1.73.1"
                ]
            ]
        },
        {
            "name": "idna",
            "specs": [
                [
                    "==",
                    "3.10"
                ]
            ]
        },
        {
            "name": "iniconfig",
            "specs": [
                [
                    "==",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "jinja2",
            "specs": [
                [
                    "==",
                    "3.1.3"
                ]
            ]
        },
        {
            "name": "jmespath",
            "specs": [
                [
                    "==",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "lz4",
            "specs": [
                [
                    "==",
                    "4.4.4"
                ]
            ]
        },
        {
            "name": "mako",
            "specs": [
                [
                    "==",
                    "1.3.10"
                ]
            ]
        },
        {
            "name": "markdown-it-py",
            "specs": [
                [
                    "==",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "markupsafe",
            "specs": [
                [
                    "==",
                    "2.1.5"
                ]
            ]
        },
        {
            "name": "mdurl",
            "specs": [
                [
                    "==",
                    "0.1.2"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "2.1.0"
                ]
            ]
        },
        {
            "name": "oauthlib",
            "specs": [
                [
                    "==",
                    "3.3.1"
                ]
            ]
        },
        {
            "name": "openpyxl",
            "specs": [
                [
                    "==",
                    "3.1.5"
                ]
            ]
        },
        {
            "name": "packaging",
            "specs": [
                [
                    "==",
                    "25.0"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.2.3"
                ]
            ]
        },
        {
            "name": "platformdirs",
            "specs": [
                [
                    "==",
                    "4.3.8"
                ]
            ]
        },
        {
            "name": "pluggy",
            "specs": [
                [
                    "==",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "proto-plus",
            "specs": [
                [
                    "==",
                    "1.26.1"
                ]
            ]
        },
        {
            "name": "protobuf",
            "specs": [
                [
                    "==",
                    "6.31.1"
                ]
            ]
        },
        {
            "name": "psycopg2",
            "specs": [
                [
                    "==",
                    "2.9.9"
                ]
            ]
        },
        {
            "name": "pyarrow",
            "specs": [
                [
                    "==",
                    "20.0.0"
                ]
            ]
        },
        {
            "name": "pyasn1",
            "specs": [
                [
                    "==",
                    "0.6.1"
                ]
            ]
        },
        {
            "name": "pyasn1-modules",
            "specs": [
                [
                    "==",
                    "0.4.2"
                ]
            ]
        },
        {
            "name": "pycparser",
            "specs": [
                [
                    "==",
                    "2.22"
                ]
            ]
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    "==",
                    "2.5.3"
                ]
            ]
        },
        {
            "name": "pydantic-core",
            "specs": [
                [
                    "==",
                    "2.14.6"
                ]
            ]
        },
        {
            "name": "pygments",
            "specs": [
                [
                    "==",
                    "2.17.2"
                ]
            ]
        },
        {
            "name": "pyjwt",
            "specs": [
                [
                    "==",
                    "2.10.1"
                ]
            ]
        },
        {
            "name": "pymysql",
            "specs": [
                [
                    "==",
                    "1.1.1"
                ]
            ]
        },
        {
            "name": "pyopenssl",
            "specs": [
                [
                    "==",
                    "25.1.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "==",
                    "7.4.4"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    "==",
                    "2.9.0.post0"
                ]
            ]
        },
        {
            "name": "python-dotenv",
            "specs": [
                [
                    "==",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    "==",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "pyyaml",
            "specs": [
                [
                    "==",
                    "6.0.2"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    "==",
                    "2.32.4"
                ]
            ]
        },
        {
            "name": "rich",
            "specs": [
                [
                    "==",
                    "13.7.0"
                ]
            ]
        },
        {
            "name": "rsa",
            "specs": [
                [
                    "==",
                    "4.9.1"
                ]
            ]
        },
        {
            "name": "s3transfer",
            "specs": [
                [
                    "==",
                    "0.10.3"
                ]
            ]
        },
        {
            "name": "shellingham",
            "specs": [
                [
                    "==",
                    "1.5.4"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    "==",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "slack-sdk",
            "specs": [
                [
                    "==",
                    "3.34.0"
                ]
            ]
        },
        {
            "name": "snowflake-connector-python",
            "specs": [
                [
                    "==",
                    "3.16.0"
                ]
            ]
        },
        {
            "name": "snowflake-sqlalchemy",
            "specs": [
                [
                    "==",
                    "1.7.5"
                ]
            ]
        },
        {
            "name": "sortedcontainers",
            "specs": [
                [
                    "==",
                    "2.4.0"
                ]
            ]
        },
        {
            "name": "sqlalchemy",
            "specs": [
                [
                    "==",
                    "2.0.41"
                ]
            ]
        },
        {
            "name": "sqlalchemy-bigquery",
            "specs": [
                [
                    "==",
                    "1.15.0"
                ]
            ]
        },
        {
            "name": "sqlglot",
            "specs": [
                [
                    "==",
                    "20.5.0"
                ]
            ]
        },
        {
            "name": "sqlglotrs",
            "specs": [
                [
                    "==",
                    "0.1.0"
                ]
            ]
        },
        {
            "name": "sqlmodel",
            "specs": [
                [
                    "==",
                    "0.0.24"
                ]
            ]
        },
        {
            "name": "thrift",
            "specs": [
                [
                    "==",
                    "0.20.0"
                ]
            ]
        },
        {
            "name": "tomli",
            "specs": [
                [
                    "==",
                    "2.0.1"
                ]
            ]
        },
        {
            "name": "tomlkit",
            "specs": [
                [
                    "==",
                    "0.13.3"
                ]
            ]
        },
        {
            "name": "typer",
            "specs": [
                [
                    "==",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "typing-extensions",
            "specs": [
                [
                    "==",
                    "4.14.1"
                ]
            ]
        },
        {
            "name": "tzdata",
            "specs": [
                [
                    "==",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    "==",
                    "2.2.3"
                ]
            ]
        }
    ],
    "lcname": "weiser-ai"
}
