weiser-ai


- Name: weiser-ai
- Version: 0.2.0
- Summary: Enterprise-grade data quality framework with YAML configuration, LLM-friendly design, and advanced statistical validation
- Upload time: 2025-07-09 23:00:59
- Requires Python: <3.13,>=3.10
- Keywords: data-quality, data-validation, data-testing, sql, yaml, llm, ai
- Requirements: annotated-types, asn1crypto, boto3, botocore, certifi, cffi, charset-normalizer, click, colorama, cryptography, duckdb, exceptiongroup, filelock, greenlet, idna, iniconfig, jinja2, jmespath, markdown-it-py, markupsafe, mdurl, numpy, packaging, pandas, platformdirs, pluggy, psycopg2, pycparser, pydantic, pydantic-core, pygments, pyjwt, pyopenssl, pytest, python-dateutil, python-dotenv, pytz, pyyaml, requests, rich, s3transfer, shellingham, six, slack-sdk, snowflake-connector-python, snowflake-sqlalchemy, sortedcontainers, sqlalchemy, sqlglot, sqlglotrs, tomli, tomlkit, typer, typing-extensions, tzdata, urllib3

# Weiser

Data Quality Framework

## Introduction

Weiser is a data quality framework designed to help you ensure the integrity and accuracy of your data. It provides a set of tools and checks to validate your data and detect anomalies. It also includes a dashboard to visualize the results of the checks.

## Installation

To install Weiser, use the following command:

```sh
pip install weiser-ai
```

## Usage

### Run example checks

Connections are defined in the `datasources` section of the config file; see `examples/example.yaml`.

Run checks in verbose mode:

```sh
weiser run examples/example.yaml -v
```

[![Watch the CLI Demo](https://cdn.loom.com/sessions/thumbnails/ce75ad760c324733a36c637a9f8fe826-401f2819c5918c19-full-play.gif)](https://www.loom.com/share/ce75ad760c324733a36c637a9f8fe826)

Compile the checks only, in verbose mode:

```sh
weiser compile examples/example.yaml -v
```

### Run dashboard

```sh
cd weiser-ui
pip install -r requirements.txt
streamlit run app.py
```

[![Watch the Dashboard Demo](https://cdn.loom.com/sessions/thumbnails/3154b4ce21ea4aaa917066991eaf1fb6-aca9c23da977e100-full-play.gif)](https://www.loom.com/share/3154b4ce21ea4aaa917066991eaf1fb6)

## Configuration

Simple count check definition

```yaml
- name: test row_count
  dataset: orders
  type: row_count
  condition: gt
  threshold: 0
```
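
To make the `condition` and `threshold` fields concrete, here is a minimal Python sketch of how a measured value could be compared against a threshold. It is not Weiser's implementation: the `passes` helper and the `COMPARATORS` table are hypothetical, and only `gt`, `le`, and `between` appear in this README; the other condition names are assumptions.

```python
import operator

# Only `gt`, `le`, and `between` appear in this README's examples;
# the entries marked "assumed" are illustrative guesses.
COMPARATORS = {
    "gt": operator.gt,
    "ge": operator.ge,  # assumed
    "lt": operator.lt,  # assumed
    "le": operator.le,
    "eq": operator.eq,  # assumed
}


def passes(value, condition, threshold):
    """Return True when `value` satisfies `condition` against `threshold`."""
    if condition == "between":
        low, high = threshold
        # Inclusive bounds are an assumption of this sketch.
        return low <= value <= high
    return COMPARATORS[condition](value, threshold)


print(passes(42, "gt", 0))  # a table with 42 rows passes `gt 0`
print(passes(0, "gt", 0))   # an empty table does not
```

Under these assumptions, the check above fails only when `orders` has no rows.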

Custom SQL definition

```yaml
- name: test numeric
  dataset: orders
  type: numeric
  measure: sum(budgeted_amount::numeric::float)
  condition: gt
  threshold: 0
```

Target multiple datasets with the same check definition

```yaml
- name: test row_count
  dataset: [orders, vendors]
  type: row_count
  condition: gt
  threshold: 0
```
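
One way to read a list-valued `dataset` is as shorthand for one check per dataset. The sketch below illustrates that reading with a hypothetical `expand_datasets` helper; how Weiser actually handles the list internally is an assumption here.

```python
def expand_datasets(check):
    """Return one copy of `check` per dataset (hypothetical helper)."""
    datasets = check["dataset"]
    if isinstance(datasets, str):
        datasets = [datasets]  # a single name behaves like a one-item list
    return [{**check, "dataset": name} for name in datasets]


check = {
    "name": "test row_count",
    "dataset": ["orders", "vendors"],
    "type": "row_count",
    "condition": "gt",
    "threshold": 0,
}

expanded = expand_datasets(check)
for item in expanded:
    print(item["dataset"], item["type"], item["condition"], item["threshold"])
```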

Check each group-by value individually

```yaml
- name: test row_count groupby
  dataset: vendors
  type: row_count
  dimensions:
    - tenant_id
  condition: gt
  threshold: 0
```
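
The effect of `dimensions` can be pictured as a `GROUP BY` in which every group is compared against the threshold on its own. The sketch below uses Python's built-in `sqlite3` with made-up rows; the table contents and the generated SQL are assumptions, not what Weiser emits.

```python
import sqlite3

# In-memory table with illustrative data (not from the Weiser examples).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vendors (id INTEGER, tenant_id TEXT)")
conn.executemany(
    "INSERT INTO vendors VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b")],
)

# One row count per tenant_id value.
rows = conn.execute(
    "SELECT tenant_id, COUNT(*) FROM vendors "
    "GROUP BY tenant_id ORDER BY tenant_id"
).fetchall()

# Apply `condition: gt`, `threshold: 0` to each group separately.
results = {tenant: count > 0 for tenant, count in rows}
print(results)
```

Note that a plain `GROUP BY` only returns groups that have at least one row, so under this reading a tenant with no rows would simply be absent from the results rather than fail the check.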

Time aggregation check with granularity

```yaml
- name: test numeric gt sum yearly
  dataset: orders
  type: sum
  measure: budgeted_amount::numeric::float
  condition: gt
  threshold: 0
  time_dimension:
    name: _updated_at
    granularity: year
```
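
Conceptually, a `time_dimension` with `granularity: year` buckets rows by the year of `_updated_at` and evaluates the measure per bucket. The following sketch shows that idea in plain Python with invented rows; it is an illustration of the aggregation, not Weiser's code.

```python
from collections import defaultdict
from datetime import datetime

# Illustrative rows; the values are made up for this sketch.
orders = [
    {"_updated_at": datetime(2023, 3, 1), "budgeted_amount": 100.0},
    {"_updated_at": datetime(2023, 9, 15), "budgeted_amount": 50.0},
    {"_updated_at": datetime(2024, 1, 2), "budgeted_amount": 75.0},
]

# Truncate each timestamp to its year and sum the measure per bucket.
totals = defaultdict(float)
for row in orders:
    totals[row["_updated_at"].year] += row["budgeted_amount"]

# Apply `condition: gt`, `threshold: 0` to every yearly total.
results = {year: total > 0 for year, total in sorted(totals.items())}
print(dict(totals))
print(results)
```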

Custom SQL expression for dataset and filter usage

```yaml
- name: test numeric completed
  dataset: >
    SELECT * FROM orders o LEFT JOIN orders_status os ON o.order_id = os.order_id
  type: numeric
  measure: sum(budgeted_amount::numeric::float)
  condition: gt
  threshold: 0
  filter: status = 'FULFILLED'
```
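
A check like this one combines three pieces: the dataset (here a full `SELECT`), the measure, and the filter. The sketch below shows one plausible way to combine them into a single query, using a hypothetical `compose_query` helper and `sqlite3` for a runnable demo. The query shape is an assumption, and the Postgres-style casts are dropped from the measure because SQLite does not support `::`.

```python
import sqlite3


def compose_query(dataset, measure, filter_expr=None):
    """Build a query from check fields (hypothetical; not Weiser's compiler)."""
    source = dataset.strip()
    if source.upper().startswith("SELECT"):
        # Wrap a custom SQL dataset as a subquery.
        source = f"({source}) AS dataset"
    query = f"SELECT {measure} AS value FROM {source}"
    if filter_expr:
        query += f" WHERE {filter_expr}"
    return query


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, budgeted_amount REAL)")
conn.execute("CREATE TABLE orders_status (order_id INTEGER, status TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.executemany(
    "INSERT INTO orders_status VALUES (?, ?)",
    [(1, "FULFILLED"), (2, "PENDING")],
)

query = compose_query(
    "SELECT * FROM orders o LEFT JOIN orders_status os "
    "ON o.order_id = os.order_id\n",
    "sum(budgeted_amount)",
    "status = 'FULFILLED'",
)
value = conn.execute(query).fetchone()[0]
print(query)
print(value, value > 0)
```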

Missing values check

```yaml
- name: customer data quality
  dataset: orders
  type: not_empty
  dimensions: ["customer_id", "product_id", "order_date"]
  condition: le
  # Allow up to 5 NULL values per dimension
  threshold: 5
  filter: "status = 'active'"
```
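
Following the comment in the example, a `not_empty` check can be read as counting NULL values per listed column, after the filter, and comparing each count to the threshold. The sketch below illustrates that reading with `sqlite3` and invented rows; the SQL is an assumption, not the query Weiser generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders "
    "(customer_id INTEGER, product_id INTEGER, order_date TEXT, status TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2025-01-01", "active"),
        (None, 11, "2025-01-02", "active"),
        (2, None, None, "active"),
        (None, None, None, "cancelled"),  # excluded by the filter
    ],
)

columns = ["customer_id", "product_id", "order_date"]
null_counts = {}
for col in columns:
    # Column names come from the fixed list above, never from user input.
    sql = f"SELECT COUNT(*) FROM orders WHERE status = 'active' AND {col} IS NULL"
    null_counts[col] = conn.execute(sql).fetchone()[0]

# Apply `condition: le`, `threshold: 5` to each column's NULL count.
results = {col: count <= 5 for col, count in null_counts.items()}
print(null_counts)
print(results)
```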

Anomaly detection check

```yaml
- name: test anomaly
  # An anomaly check should always target the metrics metadata dataset.
  dataset: metrics
  type: anomaly
  # References the orders row count check.
  check_id: c5cee10898e30edd1c0dde3f24966b4c47890fcf247e5b630c2c156f7ac7ba22
  condition: between
  # Z-score bounds; values outside them lie in the long tails of the normal distribution.
  threshold: [-3.5, 3.5]
```
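
To show what a Z-score threshold of `[-3.5, 3.5]` means in practice, the sketch below scores the latest value of a metric against its history. It uses the classical z-score (mean and sample standard deviation); Weiser may compute a different variant, such as a median-based modified z-score, so treat the `z_score` helper and the sample history as assumptions.

```python
import math
from statistics import mean, stdev


def z_score(history, latest):
    """Classical z-score of `latest` relative to past values (illustrative)."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs >= 2 values
    if sigma == 0:
        # A flat history: any deviation is infinitely unusual.
        return 0.0 if latest == mu else math.copysign(math.inf, latest - mu)
    return (latest - mu) / sigma


history = [100, 102, 98, 101, 99]  # made-up past row counts
low, high = -3.5, 3.5

for latest in (101, 120):
    z = z_score(history, latest)
    print(latest, round(z, 2), low <= z <= high)
```

With this history, a row count of 101 stays well inside the bounds, while 120 falls far outside and would be flagged.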

## Contributing

We welcome contributions!

## License

This project is licensed under the Apache 2.0 License. See the `LICENSE` file for more details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "weiser-ai",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.10",
    "maintainer_email": "Paco Valdez <paco.valdez@berkeley.edu>",
    "keywords": "data-quality, data-validation, data-testing, sql, yaml, llm, ai",
    "author": null,
    "author_email": "Paco Valdez <paco.valdez@berkeley.edu>",
    "download_url": "https://files.pythonhosted.org/packages/0e/a4/432eb158a3b4bedb26a798b6276b9a86babcc21a04445e07cba536c07a74/weiser_ai-0.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Enterprise-grade data quality framework with YAML configuration, LLM-friendly design, and advanced statistical validation",
    "version": "0.2.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/weiser-ai/weiser-ai/issues",
        "Documentation": "https://weiser.ai",
        "Homepage": "https://weiser.ai",
        "Repository": "https://github.com/weiser-ai/weiser-ai"
    },
    "split_keywords": [
        "data-quality",
        " data-validation",
        " data-testing",
        " sql",
        " yaml",
        " llm",
        " ai"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "786cbe9e4811d410247e307f8cbbcf082fe7054d8c58fa60f3c3e13ef8cbd470",
                "md5": "e6a6332560c58c975243a7dc0cd6ef77",
                "sha256": "332a63cbfb123c3078c2e488a456410ff21647eb2446e8bf5f6cd790e14bbda1"
            },
            "downloads": -1,
            "filename": "weiser_ai-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e6a6332560c58c975243a7dc0cd6ef77",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.10",
            "size": 29891,
            "upload_time": "2025-07-09T23:00:57",
            "upload_time_iso_8601": "2025-07-09T23:00:57.635315Z",
            "url": "https://files.pythonhosted.org/packages/78/6c/be9e4811d410247e307f8cbbcf082fe7054d8c58fa60f3c3e13ef8cbd470/weiser_ai-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0ea4432eb158a3b4bedb26a798b6276b9a86babcc21a04445e07cba536c07a74",
                "md5": "bc1907f20da8d1e89fb584f7c8610c2d",
                "sha256": "7c35b66da22cb1acc5c41a46563969f427b517547f9aa6802f2458f50b38d670"
            },
            "downloads": -1,
            "filename": "weiser_ai-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "bc1907f20da8d1e89fb584f7c8610c2d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.10",
            "size": 22739,
            "upload_time": "2025-07-09T23:00:59",
            "upload_time_iso_8601": "2025-07-09T23:00:59.043131Z",
            "url": "https://files.pythonhosted.org/packages/0e/a4/432eb158a3b4bedb26a798b6276b9a86babcc21a04445e07cba536c07a74/weiser_ai-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-09 23:00:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "weiser-ai",
    "github_project": "weiser-ai",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "annotated-types",
            "specs": [
                [
                    "==",
                    "0.6.0"
                ]
            ]
        },
        {
            "name": "asn1crypto",
            "specs": [
                [
                    "==",
                    "1.5.1"
                ]
            ]
        },
        {
            "name": "boto3",
            "specs": [
                [
                    "==",
                    "1.35.49"
                ]
            ]
        },
        {
            "name": "botocore",
            "specs": [
                [
                    "==",
                    "1.35.49"
                ]
            ]
        },
        {
            "name": "certifi",
            "specs": [
                [
                    "==",
                    "2025.6.15"
                ]
            ]
        },
        {
            "name": "cffi",
            "specs": [
                [
                    "==",
                    "1.17.1"
                ]
            ]
        },
        {
            "name": "charset-normalizer",
            "specs": [
                [
                    "==",
                    "3.4.2"
                ]
            ]
        },
        {
            "name": "click",
            "specs": [
                [
                    "==",
                    "8.1.7"
                ]
            ]
        },
        {
            "name": "colorama",
            "specs": [
                [
                    "==",
                    "0.4.6"
                ]
            ]
        },
        {
            "name": "cryptography",
            "specs": [
                [
                    "==",
                    "45.0.5"
                ]
            ]
        },
        {
            "name": "duckdb",
            "specs": [
                [
                    "==",
                    "0.9.2"
                ]
            ]
        },
        {
            "name": "exceptiongroup",
            "specs": [
                [
                    "==",
                    "1.2.0"
                ]
            ]
        },
        {
            "name": "filelock",
            "specs": [
                [
                    "==",
                    "3.18.0"
                ]
            ]
        },
        {
            "name": "greenlet",
            "specs": [
                [
                    "==",
                    "3.0.3"
                ]
            ]
        },
        {
            "name": "idna",
            "specs": [
                [
                    "==",
                    "3.10"
                ]
            ]
        },
        {
            "name": "iniconfig",
            "specs": [
                [
                    "==",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "jinja2",
            "specs": [
                [
                    "==",
                    "3.1.3"
                ]
            ]
        },
        {
            "name": "jmespath",
            "specs": [
                [
                    "==",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "markdown-it-py",
            "specs": [
                [
                    "==",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "markupsafe",
            "specs": [
                [
                    "==",
                    "2.1.5"
                ]
            ]
        },
        {
            "name": "mdurl",
            "specs": [
                [
                    "==",
                    "0.1.2"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "2.1.0"
                ]
            ]
        },
        {
            "name": "packaging",
            "specs": [
                [
                    "==",
                    "23.2"
                ]
            ]
        },
        {
            "name": "pandas",
            "specs": [
                [
                    "==",
                    "2.2.2"
                ]
            ]
        },
        {
            "name": "platformdirs",
            "specs": [
                [
                    "==",
                    "4.3.8"
                ]
            ]
        },
        {
            "name": "pluggy",
            "specs": [
                [
                    "==",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "psycopg2",
            "specs": [
                [
                    "==",
                    "2.9.9"
                ]
            ]
        },
        {
            "name": "pycparser",
            "specs": [
                [
                    "==",
                    "2.22"
                ]
            ]
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    "==",
                    "2.5.3"
                ]
            ]
        },
        {
            "name": "pydantic-core",
            "specs": [
                [
                    "==",
                    "2.14.6"
                ]
            ]
        },
        {
            "name": "pygments",
            "specs": [
                [
                    "==",
                    "2.17.2"
                ]
            ]
        },
        {
            "name": "pyjwt",
            "specs": [
                [
                    "==",
                    "2.10.1"
                ]
            ]
        },
        {
            "name": "pyopenssl",
            "specs": [
                [
                    "==",
                    "25.1.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    "==",
                    "7.4.4"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    "==",
                    "2.9.0.post0"
                ]
            ]
        },
        {
            "name": "python-dotenv",
            "specs": [
                [
                    "==",
                    "1.0.1"
                ]
            ]
        },
        {
            "name": "pytz",
            "specs": [
                [
                    "==",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "pyyaml",
            "specs": [
                [
                    "==",
                    "6.0.2"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    "==",
                    "2.32.4"
                ]
            ]
        },
        {
            "name": "rich",
            "specs": [
                [
                    "==",
                    "13.7.0"
                ]
            ]
        },
        {
            "name": "s3transfer",
            "specs": [
                [
                    "==",
                    "0.10.3"
                ]
            ]
        },
        {
            "name": "shellingham",
            "specs": [
                [
                    "==",
                    "1.5.4"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    "==",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "slack-sdk",
            "specs": [
                [
                    "==",
                    "3.34.0"
                ]
            ]
        },
        {
            "name": "snowflake-connector-python",
            "specs": [
                [
                    "==",
                    "3.16.0"
                ]
            ]
        },
        {
            "name": "snowflake-sqlalchemy",
            "specs": [
                [
                    "==",
                    "1.7.5"
                ]
            ]
        },
        {
            "name": "sortedcontainers",
            "specs": [
                [
                    "==",
                    "2.4.0"
                ]
            ]
        },
        {
            "name": "sqlalchemy",
            "specs": [
                [
                    "==",
                    "2.0.41"
                ]
            ]
        },
        {
            "name": "sqlglot",
            "specs": [
                [
                    "==",
                    "20.5.0"
                ]
            ]
        },
        {
            "name": "sqlglotrs",
            "specs": [
                [
                    "==",
                    "0.1.0"
                ]
            ]
        },
        {
            "name": "tomli",
            "specs": [
                [
                    "==",
                    "2.0.1"
                ]
            ]
        },
        {
            "name": "tomlkit",
            "specs": [
                [
                    "==",
                    "0.13.3"
                ]
            ]
        },
        {
            "name": "typer",
            "specs": [
                [
                    "==",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "typing-extensions",
            "specs": [
                [
                    "==",
                    "4.9.0"
                ]
            ]
        },
        {
            "name": "tzdata",
            "specs": [
                [
                    "==",
                    "2024.1"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    "==",
                    "2.2.3"
                ]
            ]
        }
    ],
    "lcname": "weiser-ai"
}
        