data-expectations


- **Name:** data-expectations
- **Version:** 1.7.0
- **Home page:** https://github.com/joocer/data_expectations
- **Summary:** Are your data meeting all your expectations
- **Upload time:** 2023-09-28 18:03:24
- **Maintainer:** Joocer
- **Requirements:** No requirements were recorded.

<img src="icon.png" height="92px" />

## Data Expectations  
_Are your data meeting your expectations?_

----

[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/joocer/data_expectations/blob/main/LICENSE)
[![Regression Suite](https://github.com/joocer/data_expectations/actions/workflows/regression_suite.yaml/badge.svg)](https://github.com/joocer/data_expectations/actions/workflows/regression_suite.yaml)
[![Static Analysis](https://github.com/joocer/data_expectations/actions/workflows/static_analysis.yml/badge.svg)](https://github.com/joocer/data_expectations/actions/workflows/static_analysis.yml)
[![codecov](https://codecov.io/gh/joocer/data_expectations/branch/main/graph/badge.svg?token=XA60LUVH0W)](https://codecov.io/gh/joocer/data_expectations)
 [![Downloads](https://static.pepy.tech/badge/data-expectations)](https://pepy.tech/project/data-expectations)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![PyPI Latest Release](https://img.shields.io/pypi/v/data-expectations.svg)](https://pypi.org/project/data-expectations/)
[![FOSSA Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2Fjoocer%2Fdata_expectations.svg?type=shield)](https://app.fossa.com/projects/git%2Bgithub.com%2Fjoocer%2Fdata_expectations?ref=badge_shield)

Data Expectations is a Python library which takes a declarative approach to asserting qualities of your datasets. Instead of tests like `is_sorted` to determine if a column is ordered, the expectation is `expect_column_values_to_be_increasing`. Most of the time you don't need to know _how_ the data got like that; you are only interested in _what_ the data looks like now.

Data Expectations can be used alongside, or in place of, a schema validator. However, it is intended to validate the data in a dataset, not the structure of a table. Records should be Python dictionaries (or dictionary-like objects) and can be processed one by one or as an entire list of dictionaries.
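
To make the declarative idea concrete, here is a minimal plain-Python sketch of expectations as small reusable checks applied to one dictionary or to a list of dictionaries. It does not use the library: the helper names (`column_exists`, `values_between`, `meets_all`) are hypothetical, and the inclusive range and null handling are assumptions made for illustration only.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Check = Callable[[Record], bool]


def column_exists(column: str) -> Check:
    """Hypothetical helper: declare that `column` must be present."""
    return lambda record: column in record


def values_between(column: str, minimum: float, maximum: float) -> Check:
    """Hypothetical helper: declare that non-null values lie in [minimum, maximum].

    Treating the bounds as inclusive and skipping nulls are assumptions.
    """
    def check(record: Record) -> bool:
        value = record.get(column)
        return value is None or minimum <= value <= maximum
    return check


def meets_all(record: Record, checks: List[Check]) -> bool:
    """Return True only if the record satisfies every declared check."""
    return all(check(record) for check in checks)


# The expectations describe what valid data looks like, not how to produce it.
checks = [column_exists("name"), column_exists("age"), values_between("age", 0, 120)]

single = {"name": "charles", "age": 12}
batch = [single, {"name": "ada", "age": 150}]

single_ok = meets_all(single, checks)                  # one record at a time
batch_results = [meets_all(r, checks) for r in batch]  # an entire list
```

The same list of checks serves both modes, which is the practical benefit of declaring expectations separately from the data they describe.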

[Data Expectations](https://github.com/joocer/data_expectations) was inspired by the great [Great Expectations](https://github.com/great-expectations/great_expectations) library, but we wanted something lighter and easier to quickly set up and run. Data Expectations can do less, but it does it with a fraction of the effort and has zero dependencies. 

## Use Cases

- Use Data Expectations as a step in data processing pipelines, testing that the data conforms to expectations before it is committed to the warehouse.
- Use Data Expectations to simplify validating user-supplied values.
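
As a rough illustration of the pipeline use case, the sketch below gates a batch of records before a commit step by separating records that pass from records that fail. Everything here is a stand-in written for this example: `NotMet` plays the role of a validation error, and `validate` and `gate` are hypothetical functions, not part of the library.

```python
from typing import Any, Callable, Dict, List, Tuple

Record = Dict[str, Any]


class NotMet(Exception):
    """Stand-in for a validation error raised when a record fails a check."""


def validate(record: Record) -> None:
    """Hypothetical validator: raise NotMet if `age` is missing or out of range."""
    if "age" not in record:
        raise NotMet("age column is missing")
    if not 0 <= record["age"] <= 120:
        raise NotMet(f"age {record['age']} is outside 0..120")


def gate(
    records: List[Record], validator: Callable[[Record], None]
) -> Tuple[List[Record], List[Tuple[Record, str]]]:
    """Split records into those safe to commit and those rejected with a reason."""
    accepted: List[Record] = []
    rejected: List[Tuple[Record, str]] = []
    for record in records:
        try:
            validator(record)
        except NotMet as err:
            rejected.append((record, str(err)))
        else:
            accepted.append(record)
    return accepted, rejected


incoming = [{"name": "charles", "age": 12}, {"name": "ada", "age": 150}, {"name": "bob"}]
accepted, rejected = gate(incoming, validate)
# Only `accepted` would go on to the warehouse; `rejected` can be logged or quarantined.
```

Whether a pipeline should drop failing records, quarantine them, or abort the whole batch is a design decision this sketch leaves open.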

## Provided Expectations

- **expect_column_to_exist** (column)
- **expect_column_values_to_not_be_null** (column)
- **expect_column_values_to_be_of_type** (column, expected_type, ignore_nulls:true)
- **expect_column_values_to_be_in_type_list** (column, type_list, ignore_nulls:true)
- **expect_column_values_to_be_more_than** (column, threshold, ignore_nulls:true)
- **expect_column_values_to_be_less_than** (column, threshold, ignore_nulls:true)
- **expect_column_values_to_be_between** (column, maximum, minimum, ignore_nulls:true)
- **expect_column_values_to_be_increasing** (column, ignore_nulls:true)
- **expect_column_values_to_be_decreasing** (column, ignore_nulls:true)
- **expect_column_values_to_be_in_set** (column, symbols, ignore_nulls:true)
- **expect_column_values_to_match_regex** (column, regex, ignore_nulls:true)
- **expect_column_values_to_match_like** (column, like, ignore_nulls:true)
- **expect_column_values_length_to_be** (column, length, ignore_nulls:true)
- **expect_column_values_length_to_be_between**  (column, maximum, minimum, ignore_nulls:true)
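
Two ideas in this list may be unfamiliar: the `ignore_nulls` flag and `like` patterns. The sketch below shows one plausible reading of each in plain Python. It assumes SQL-style wildcards (`%` for any run of characters, `_` for exactly one character) and assumes that a null value passes when `ignore_nulls` is true; the library's actual behaviour may differ, and `like_to_regex` and `matches_like` are hypothetical names.

```python
import re
from typing import Any


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate an SQL-style LIKE pattern into a compiled regular expression.

    Assumption: `%` matches any run of characters and `_` matches exactly one.
    All other characters are escaped so they match literally.
    """
    parts = []
    for char in pattern:
        if char == "%":
            parts.append(".*")
        elif char == "_":
            parts.append(".")
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.DOTALL)


def matches_like(value: Any, pattern: str, ignore_nulls: bool = True) -> bool:
    """Hypothetical check: does `value` match `pattern` in full?

    Assumption: a null (None) passes only when `ignore_nulls` is true.
    """
    if value is None:
        return ignore_nulls
    return like_to_regex(pattern).fullmatch(str(value)) is not None


starts_with_ch = matches_like("charles", "ch%")
one_char_only = matches_like("charles", "ch_")
null_ignored = matches_like(None, "ch%")
null_rejected = matches_like(None, "ch%", ignore_nulls=False)
```

Escaping every non-wildcard character keeps a literal `.` in a pattern from acting as a regex wildcard.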

## Install

~~~bash
pip install data_expectations
~~~

Data Expectations has no external dependencies and can be used ad hoc, in the moment, without complex setup.

## Example Usage

Testing Python dictionaries:

~~~python
import data_expectations as de
from data_expectations import Expectation
from data_expectations import Behaviors

# A single record is a plain dictionary.
TEST_DATA = {"name": "charles", "age": 12}

# Declare what valid data looks like: both columns exist and age is within range.
set_of_expectations = [
    Expectation(Behaviors.EXPECT_COLUMN_TO_EXIST, column="name"),
    Expectation(Behaviors.EXPECT_COLUMN_TO_EXIST, column="age"),
    Expectation(Behaviors.EXPECT_COLUMN_VALUES_TO_BE_BETWEEN, column="age", config={"minimum": 0, "maximum": 120}),
]

expectations = de.Expectations(set_of_expectations)
try:
    # Evaluate the record against every expectation in the set.
    de.evaluate_record(expectations, TEST_DATA)
except de.errors.ExpectationNotMetError:
    print("Data didn't meet expectations")
~~~

Testing individual values:

~~~python
import data_expectations as de
from data_expectations import Expectation
from data_expectations import Behaviors

# A single expectation can be tested against a bare value, without a record.
expectation = Expectation(Behaviors.EXPECT_COLUMN_VALUES_TO_BE_BETWEEN, column="age", config={"minimum": 0, "maximum": 120})

try:
    expectation.test_value(55)
except de.errors.ExpectationNotMetError:
    print("Data didn't meet expectations")
~~~

            
