tensorflow-data-validation


  * Name: tensorflow-data-validation
  * Version: 1.16.1
  * Home page: https://www.tensorflow.org/tfx/data_validation/get_started
  * Summary: A library for exploring and validating machine learning data.
  * Upload time: 2024-10-15 20:36:20
  * Author: Google LLC
  * Requires Python: `<4,>=3.9`
  * License: Apache 2.0
  * Keywords: tensorflow, data, validation, tfx

<!-- See: www.tensorflow.org/tfx/data_validation/ -->

# TensorFlow Data Validation

[![Python](https://img.shields.io/badge/python%7C3.9%7C3.10%7C3.11-blue)](https://github.com/tensorflow/data-validation)
[![PyPI](https://badge.fury.io/py/tensorflow-data-validation.svg)](https://badge.fury.io/py/tensorflow-data-validation)
[![Documentation](https://img.shields.io/badge/api-reference-blue.svg)](https://www.tensorflow.org/tfx/data_validation/api_docs/python/tfdv)

*TensorFlow Data Validation* (TFDV) is a library for exploring and validating
machine learning data. It is designed to be highly scalable
and to work well with TensorFlow and [TensorFlow Extended (TFX)](https://www.tensorflow.org/tfx).

TF Data Validation includes:

*    Scalable calculation of summary statistics of training and test data.
*    Integration with a viewer for data distributions and statistics, as well
     as faceted comparison of pairs of features ([Facets](https://github.com/PAIR-code/facets)).
*    Automated [data-schema](https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/schema.proto)
     generation to describe expectations about data,
     such as required values, ranges, and vocabularies.
*    A schema viewer to help you inspect the schema.
*    Anomaly detection to identify [anomalies](https://github.com/tensorflow/data-validation/blob/master/g3doc/anomalies.md),
     such as missing features,
     out-of-range values, or wrong feature types.
*    An anomalies viewer so that you can see which features have anomalies and
     learn more in order to correct them.
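
The features listed above map onto a small number of TFDV calls. The following is a minimal sketch rather than a reference implementation: the function names `validate_csv` and `anomalous_features` and the CSV paths are hypothetical, and the TFDV import is deferred so the module loads even where the library is not installed.

```python
def validate_csv(train_csv: str, eval_csv: str):
    """Infer a schema from training data and check evaluation data against it."""
    # Imported lazily so this module loads even where TFDV is not installed.
    import tensorflow_data_validation as tfdv

    # Summary statistics for each dataset (computed with Apache Beam).
    train_stats = tfdv.generate_statistics_from_csv(data_location=train_csv)
    eval_stats = tfdv.generate_statistics_from_csv(data_location=eval_csv)

    # Schema inferred from the training statistics.
    schema = tfdv.infer_schema(statistics=train_stats)

    # Anomalies in the evaluation data relative to that schema.
    anomalies = tfdv.validate_statistics(statistics=eval_stats, schema=schema)
    return schema, anomalies


def anomalous_features(anomalies) -> list:
    """Return the sorted names of features that have at least one anomaly."""
    return sorted(anomalies.anomaly_info.keys())
```

In a notebook, the results can be inspected with `tfdv.visualize_statistics`, `tfdv.display_schema`, and `tfdv.display_anomalies`, which correspond to the viewers mentioned in the list.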

For instructions on using TFDV, see the [get started guide](https://github.com/tensorflow/data-validation/blob/master/g3doc/get_started.md)
and try out the [example notebook](https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/data_validation/tfdv_basic.ipynb).
Some of the techniques implemented in TFDV are described in a
[technical paper published in SysML'19](https://mlsys.org/Conferences/2019/doc/2019/167.pdf).

## Installing from PyPI

The recommended way to install TFDV is using the
[PyPI package](https://pypi.org/project/tensorflow-data-validation/):

```bash
pip install tensorflow-data-validation
```
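
After installation, it can be useful to confirm which version was actually installed. A small sketch using only the standard library follows; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(
    distribution: str = "tensorflow-data-validation",
) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return None
```

Calling `installed_version()` returns a version string when TFDV is installed and `None` otherwise.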

### Nightly Packages

TFDV also hosts nightly packages on Google Cloud. To install the latest nightly
package, please use the following command:

```bash
export TFX_DEPENDENCY_SELECTOR=NIGHTLY
pip install --extra-index-url https://pypi-nightly.tensorflow.org/simple tensorflow-data-validation
```

This will install the nightly packages for the major dependencies of TFDV such
as TFX Basic Shared Libraries (TFX-BSL) and TensorFlow Metadata (TFMD).

Sometimes TFDV uses those dependencies' most recent changes, which are not yet
released. Because of this, it is safer to use nightly versions of those
dependent libraries when using nightly TFDV. Export the
`TFX_DEPENDENCY_SELECTOR` environment variable to do so.

NOTE: These nightly packages are unstable and breakages are likely to happen.
Fixes can often take a week or more, depending on the complexity involved.

## Build with Docker

This is the recommended way to build TFDV under Linux, and is continuously
tested at Google.

### 1. Install Docker

Please first install `docker` and `docker-compose` by following the directions:
[docker](https://docs.docker.com/install/);
[docker-compose](https://docs.docker.com/compose/install/).

### 2. Clone the TFDV repository

```shell
git clone https://github.com/tensorflow/data-validation
cd data-validation
```

Note that these instructions build from the latest master branch of TensorFlow
Data Validation. To build a specific branch (such as a release
branch), pass `-b <branchname>` to the `git clone` command.

### 3. Build the pip package

Then, run the following at the project root:

```bash
sudo docker-compose build manylinux2010
sudo docker-compose run -e PYTHON_VERSION=${PYTHON_VERSION} manylinux2010
```

where `PYTHON_VERSION` is one of `{39, 310, 311}`.

A wheel will be produced under `dist/`.
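
The `PYTHON_VERSION` values are the major and minor version numbers joined together. As an illustration, the tag for the running interpreter could be derived as below; the helper names and the `SUPPORTED_TAGS` constant are hypothetical and simply mirror the set listed above.

```python
import sys

# Tags accepted by the Docker build, as listed in this section.
SUPPORTED_TAGS = ("39", "310", "311")


def python_version_tag(version_info=sys.version_info) -> str:
    """Join the major and minor version numbers, e.g. (3, 10) -> "310"."""
    return f"{version_info[0]}{version_info[1]}"


def is_supported(tag: str) -> bool:
    """Report whether a tag is one of the values the build accepts."""
    return tag in SUPPORTED_TAGS
```

The resulting string can then be exported as `PYTHON_VERSION` before running `docker-compose`.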

### 4. Install the pip package

```shell
pip install dist/*.whl
```

## Build from source

### 1. Prerequisites

To compile and use TFDV, you need to set up some prerequisites.

#### Install NumPy

If NumPy is not installed on your system, install it now by following [these
directions](https://www.scipy.org/scipylib/download.html).

#### Install Bazel

If Bazel is not installed on your system, install it now by following [these
directions](https://bazel.build/versions/master/docs/install.html).

### 2. Clone the TFDV repository

```shell
git clone https://github.com/tensorflow/data-validation
cd data-validation
```

Note that these instructions build from the latest master branch of TensorFlow
Data Validation. To build a specific branch (such as a release
branch), pass `-b <branchname>` to the `git clone` command.

### 3. Build the pip package

The TFDV wheel is Python-version dependent. To build a pip package that
works for a specific Python version, use that Python binary to run:

```shell
python setup.py bdist_wheel
```

You can find the generated `.whl` file in the `dist` subdirectory.
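
Because the wheel is specific to a Python version, it can help to read the tags encoded in its filename before installing it. The sketch below relies on the standard wheel naming convention (`name-version-python-abi-platform.whl`, with an optional build tag after the version); the helper names are hypothetical.

```python
from pathlib import Path
from typing import Dict, List


def wheel_tags(filename: str) -> Dict[str, str]:
    """Split a wheel filename into its name, version, and compatibility tags."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = name[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the python, ABI, and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def built_wheels(dist_dir: str = "dist") -> List[str]:
    """List the wheel filenames found in the build output directory."""
    return sorted(path.name for path in Path(dist_dir).glob("*.whl"))
```

For a wheel built with Python 3.10, the `python` field would be expected to read `cp310`.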

### 4. Install the pip package

```shell
pip install dist/*.whl
```

## Supported platforms

TFDV is tested on the following 64-bit operating systems:

  * macOS 12.5 (Monterey) or later.
  * Ubuntu 20.04 or later.
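
As a rough illustration, the minimum versions above can be expressed as a simple comparison. This is only a sketch: the names `MINIMUM_VERSIONS`, `parse_version`, and `is_tested_platform` are hypothetical, and the caller is assumed to supply the operating system name and version string.

```python
from typing import Tuple

# Minimum tested versions, taken from the list above.
MINIMUM_VERSIONS = {"macOS": (12, 5), "Ubuntu": (20, 4)}


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as "20.04" into (20, 4)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def is_tested_platform(os_name: str, os_version: str) -> bool:
    """Report whether the OS meets the minimum tested version."""
    minimum = MINIMUM_VERSIONS.get(os_name)
    if minimum is None:
        return False
    return parse_version(os_version) >= minimum
```

On macOS the version string can be read from `platform.mac_ver()[0]`; on Linux, `platform.freedesktop_os_release()` (Python 3.10 and later) exposes the distribution name and version.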

## Notable Dependencies

TensorFlow is required.

[Apache Beam](https://beam.apache.org/) is required; TFDV uses it to run
computations efficiently in a distributed fashion. By default, Apache Beam runs in local
mode, but it can also run in distributed mode using
[Google Cloud Dataflow](https://cloud.google.com/dataflow/) and other Apache
Beam
[runners](https://beam.apache.org/documentation/runners/capability-matrix/).
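
Switching between local and distributed execution is mostly a matter of which Beam pipeline options are passed to TFDV. The following sketch assumes a CSV input; the helper names `beam_flags` and `generate_stats` and all argument values are hypothetical, and the Beam and TFDV imports are deferred so the flag helper can be used on its own.

```python
from typing import List, Optional


def beam_flags(
    runner: str = "DirectRunner",
    project: Optional[str] = None,
    region: Optional[str] = None,
    temp_location: Optional[str] = None,
) -> List[str]:
    """Build Beam command-line flags; DirectRunner is Beam's local mode."""
    flags = [f"--runner={runner}"]
    optional = (
        ("project", project),
        ("region", region),
        ("temp_location", temp_location),
    )
    for name, value in optional:
        if value is not None:
            flags.append(f"--{name}={value}")
    return flags


def generate_stats(csv_path: str, flags: List[str]):
    """Compute statistics for a CSV file using the given Beam flags."""
    # Imported lazily so beam_flags works without Beam or TFDV installed.
    from apache_beam.options.pipeline_options import PipelineOptions
    import tensorflow_data_validation as tfdv

    return tfdv.generate_statistics_from_csv(
        data_location=csv_path,
        pipeline_options=PipelineOptions(flags=flags),
    )
```

With a distributed runner such as `DataflowRunner`, the input data generally needs to live somewhere the workers can read, for example a Cloud Storage path.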

[Apache Arrow](https://arrow.apache.org/) is also required. TFDV uses Arrow to
represent data internally in order to make use of vectorized NumPy functions.

## Compatible versions

The following table shows the package versions that are
compatible with each other. This is determined by our testing framework, but
other *untested* combinations may also work.

tensorflow-data-validation                                                            | apache-beam[gcp] | pyarrow | tensorflow        | tensorflow-metadata | tensorflow-transform | tfx-bsl
------------------------------------------------------------------------------------- | ---------------- | ------- | ----------------- | ------------------- | -------------------- | -------
[GitHub master](https://github.com/tensorflow/data-validation/blob/master/RELEASE.md) | 2.59.0           | 10.0.1  | nightly (2.x)     | 1.16.0              | n/a                  | 1.16.0
[1.16.0](https://github.com/tensorflow/data-validation/blob/v1.16.0/RELEASE.md)       | 2.59.0           | 10.0.1  | 2.16              | 1.16.0              | n/a                  | 1.16.0
[1.15.1](https://github.com/tensorflow/data-validation/blob/v1.15.1/RELEASE.md)       | 2.47.0           | 10.0.0  | 2.15              | 1.15.0              | n/a                  | 1.15.1
[1.15.0](https://github.com/tensorflow/data-validation/blob/v1.15.0/RELEASE.md)       | 2.47.0           | 10.0.0  | 2.15              | 1.15.0              | n/a                  | 1.15.0
[1.14.0](https://github.com/tensorflow/data-validation/blob/v1.14.0/RELEASE.md)       | 2.47.0           | 10.0.0  | 2.13              | 1.14.0              | n/a                  | 1.14.0
[1.13.0](https://github.com/tensorflow/data-validation/blob/v1.13.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 2.12              | 1.13.1              | n/a                  | 1.13.0
[1.12.0](https://github.com/tensorflow/data-validation/blob/v1.12.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 2.11              | 1.12.0              | n/a                  | 1.12.0
[1.11.0](https://github.com/tensorflow/data-validation/blob/v1.11.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 1.15 / 2.10       | 1.11.0              | n/a                  | 1.11.0
[1.10.0](https://github.com/tensorflow/data-validation/blob/v1.10.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 1.15 / 2.9        | 1.10.0              | n/a                  | 1.10.1
[1.9.0](https://github.com/tensorflow/data-validation/blob/v1.9.0/RELEASE.md)         | 2.38.0           | 5.0.0   | 1.15 / 2.9        | 1.9.0               | n/a                  | 1.9.0
[1.8.0](https://github.com/tensorflow/data-validation/blob/v1.8.0/RELEASE.md)         | 2.38.0           | 5.0.0   | 1.15 / 2.8        | 1.8.0               | n/a                  | 1.8.0
[1.7.0](https://github.com/tensorflow/data-validation/blob/v1.7.0/RELEASE.md)         | 2.36.0           | 5.0.0   | 1.15 / 2.8        | 1.7.0               | n/a                  | 1.7.0
[1.6.0](https://github.com/tensorflow/data-validation/blob/v1.6.0/RELEASE.md)         | 2.35.0           | 5.0.0   | 1.15 / 2.7        | 1.6.0               | n/a                  | 1.6.0
[1.5.0](https://github.com/tensorflow/data-validation/blob/v1.5.0/RELEASE.md)         | 2.34.0           | 5.0.0   | 1.15 / 2.7        | 1.5.0               | n/a                  | 1.5.0
[1.4.0](https://github.com/tensorflow/data-validation/blob/v1.4.0/RELEASE.md)         | 2.32.0           | 4.0.1   | 1.15 / 2.6        | 1.4.0               | n/a                  | 1.4.0
[1.3.0](https://github.com/tensorflow/data-validation/blob/v1.3.0/RELEASE.md)         | 2.32.0           | 2.0.0   | 1.15 / 2.6        | 1.2.0               | n/a                  | 1.3.0
[1.2.0](https://github.com/tensorflow/data-validation/blob/v1.2.0/RELEASE.md)         | 2.31.0           | 2.0.0   | 1.15 / 2.5        | 1.2.0               | n/a                  | 1.2.0
[1.1.1](https://github.com/tensorflow/data-validation/blob/v1.1.1/RELEASE.md)         | 2.29.0           | 2.0.0   | 1.15 / 2.5        | 1.1.0               | n/a                  | 1.1.1
[1.1.0](https://github.com/tensorflow/data-validation/blob/v1.1.0/RELEASE.md)         | 2.29.0           | 2.0.0   | 1.15 / 2.5        | 1.1.0               | n/a                  | 1.1.0
[1.0.0](https://github.com/tensorflow/data-validation/blob/v1.0.0/RELEASE.md)         | 2.29.0           | 2.0.0   | 1.15 / 2.5        | 1.0.0               | n/a                  | 1.0.0
[0.30.0](https://github.com/tensorflow/data-validation/blob/v0.30.0/RELEASE.md)       | 2.28.0           | 2.0.0   | 1.15 / 2.4        | 0.30.0              | n/a                  | 0.30.0
[0.29.0](https://github.com/tensorflow/data-validation/blob/v0.29.0/RELEASE.md)       | 2.28.0           | 2.0.0   | 1.15 / 2.4        | 0.29.0              | n/a                  | 0.29.0
[0.28.0](https://github.com/tensorflow/data-validation/blob/v0.28.0/RELEASE.md)       | 2.28.0           | 2.0.0   | 1.15 / 2.4        | 0.28.0              | n/a                  | 0.28.1
[0.27.0](https://github.com/tensorflow/data-validation/blob/v0.27.0/RELEASE.md)       | 2.27.0           | 2.0.0   | 1.15 / 2.4        | 0.27.0              | n/a                  | 0.27.0
[0.26.1](https://github.com/tensorflow/data-validation/blob/v0.26.1/RELEASE.md)       | 2.28.0           | 0.17.0  | 1.15 / 2.3        | 0.26.0              | 0.26.0               | 0.26.0
[0.26.0](https://github.com/tensorflow/data-validation/blob/v0.26.0/RELEASE.md)       | 2.25.0           | 0.17.0  | 1.15 / 2.3        | 0.26.0              | 0.26.0               | 0.26.0
[0.25.0](https://github.com/tensorflow/data-validation/blob/v0.25.0/RELEASE.md)       | 2.25.0           | 0.17.0  | 1.15 / 2.3        | 0.25.0              | 0.25.0               | 0.25.0
[0.24.1](https://github.com/tensorflow/data-validation/blob/v0.24.1/RELEASE.md)       | 2.24.0           | 0.17.0  | 1.15 / 2.3        | 0.24.0              | 0.24.1               | 0.24.1
[0.24.0](https://github.com/tensorflow/data-validation/blob/v0.24.0/RELEASE.md)       | 2.23.0           | 0.17.0  | 1.15 / 2.3        | 0.24.0              | 0.24.0               | 0.24.0
[0.23.1](https://github.com/tensorflow/data-validation/blob/v0.23.1/RELEASE.md)       | 2.24.0           | 0.17.0  | 1.15 / 2.3        | 0.23.0              | 0.23.0               | 0.23.0
[0.23.0](https://github.com/tensorflow/data-validation/blob/v0.23.0/RELEASE.md)       | 2.23.0           | 0.17.0  | 1.15 / 2.3        | 0.23.0              | 0.23.0               | 0.23.0
[0.22.2](https://github.com/tensorflow/data-validation/blob/v0.22.2/RELEASE.md)       | 2.20.0           | 0.16.0  | 1.15 / 2.2        | 0.22.0              | 0.22.0               | 0.22.1
[0.22.1](https://github.com/tensorflow/data-validation/blob/v0.22.1/RELEASE.md)       | 2.20.0           | 0.16.0  | 1.15 / 2.2        | 0.22.0              | 0.22.0               | 0.22.1
[0.22.0](https://github.com/tensorflow/data-validation/blob/v0.22.0/RELEASE.md)       | 2.20.0           | 0.16.0  | 1.15 / 2.2        | 0.22.0              | 0.22.0               | 0.22.0
[0.21.5](https://github.com/tensorflow/data-validation/blob/v0.21.5/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.1               | 0.21.3
[0.21.4](https://github.com/tensorflow/data-validation/blob/v0.21.4/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.1               | 0.21.3
[0.21.2](https://github.com/tensorflow/data-validation/blob/v0.21.2/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.0               | 0.21.0
[0.21.1](https://github.com/tensorflow/data-validation/blob/v0.21.1/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.0               | 0.21.0
[0.21.0](https://github.com/tensorflow/data-validation/blob/v0.21.0/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.0               | 0.21.0
[0.15.0](https://github.com/tensorflow/data-validation/blob/v0.15.0/RELEASE.md)       | 2.16.0           | 0.14.0  | 1.15 / 2.0        | 0.15.0              | 0.15.0               | 0.15.0
[0.14.1](https://github.com/tensorflow/data-validation/blob/v0.14.1/RELEASE.md)       | 2.14.0           | 0.14.0  | 1.14              | 0.14.0              | 0.14.0               | n/a
[0.14.0](https://github.com/tensorflow/data-validation/blob/v0.14.0/RELEASE.md)       | 2.14.0           | 0.14.0  | 1.14              | 0.14.0              | 0.14.0               | n/a
[0.13.1](https://github.com/tensorflow/data-validation/blob/v0.13.1/RELEASE.md)       | 2.11.0           | n/a     | 1.13              | 0.12.1              | 0.13.0               | n/a
[0.13.0](https://github.com/tensorflow/data-validation/blob/v0.13.0/RELEASE.md)       | 2.11.0           | n/a     | 1.13              | 0.12.1              | 0.13.0               | n/a
[0.12.0](https://github.com/tensorflow/data-validation/blob/v0.12.0/RELEASE.md)       | 2.10.0           | n/a     | 1.12              | 0.12.1              | 0.12.0               | n/a
[0.11.0](https://github.com/tensorflow/data-validation/blob/v0.11.0/RELEASE.md)       | 2.8.0            | n/a     | 1.11              | 0.9.0               | 0.11.0               | n/a
[0.9.0](https://github.com/tensorflow/data-validation/blob/v0.9.0/RELEASE.md)         | 2.6.0            | n/a     | 1.9               | n/a                 | n/a                  | n/a
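
One way to use the table is to turn a row into pinned requirements. The sketch below copies three recent rows by hand; the names `COMPATIBLE` and `pinned_requirements` are hypothetical, and pinning to exactly the tested versions is an assumption about usage, since untested combinations may also work.

```python
from typing import Dict, List

# A few rows copied from the compatibility table above.
COMPATIBLE: Dict[str, Dict[str, str]] = {
    "1.16.0": {
        "apache-beam[gcp]": "2.59.0",
        "pyarrow": "10.0.1",
        "tensorflow": "2.16",
        "tensorflow-metadata": "1.16.0",
        "tfx-bsl": "1.16.0",
    },
    "1.15.1": {
        "apache-beam[gcp]": "2.47.0",
        "pyarrow": "10.0.0",
        "tensorflow": "2.15",
        "tensorflow-metadata": "1.15.0",
        "tfx-bsl": "1.15.1",
    },
    "1.14.0": {
        "apache-beam[gcp]": "2.47.0",
        "pyarrow": "10.0.0",
        "tensorflow": "2.13",
        "tensorflow-metadata": "1.14.0",
        "tfx-bsl": "1.14.0",
    },
}


def pinned_requirements(tfdv_version: str) -> List[str]:
    """Return pip requirement lines for the tested combination."""
    deps = COMPATIBLE[tfdv_version]
    lines = [f"tensorflow-data-validation=={tfdv_version}"]
    for package, version in deps.items():
        # The table lists TensorFlow by minor version only (e.g. 2.16),
        # so allow any patch release within that series.
        suffix = ".*" if version.count(".") == 1 else ""
        lines.append(f"{package}=={version}{suffix}")
    return lines
```

The returned lines can be written to a requirements file and installed with `pip install -r`.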

## Questions

Please direct any questions about working with TF Data Validation to
[Stack Overflow](https://stackoverflow.com) using the
[tensorflow-data-validation](https://stackoverflow.com/questions/tagged/tensorflow-data-validation)
tag.

## Links

  * [TensorFlow Data Validation Getting Started Guide](https://www.tensorflow.org/tfx/data_validation/get_started)
  * [TensorFlow Data Validation Notebook](https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/data_validation/tfdv_basic.ipynb)
  * [TensorFlow Data Validation API Documentation](https://www.tensorflow.org/tfx/data_validation/api_docs/python/tfdv)
  * [TensorFlow Data Validation Blog Post](https://medium.com/tensorflow/introducing-tensorflow-data-validation-data-understanding-validation-and-monitoring-at-scale-d38e3952c2f0)
  * [TensorFlow Data Validation PyPI](https://pypi.org/project/tensorflow-data-validation/)
  * [TensorFlow Data Validation Paper](https://mlsys.org/Conferences/2019/doc/2019/167.pdf)
  * [TensorFlow Data Validation Slides](https://conf.slac.stanford.edu/xldb2018/sites/xldb2018.conf.slac.stanford.edu/files/Tues_09.45_NeoklisPolyzotis_Data%20Analysis%20and%20Validation%20(1).pdf)


            

Raw data

            {
    "_id": null,
    "home_page": "https://www.tensorflow.org/tfx/data_validation/get_started",
    "name": "tensorflow-data-validation",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4,>=3.9",
    "maintainer_email": null,
    "keywords": "tensorflow data validation tfx",
    "author": "Google LLC",
    "author_email": "tensorflow-extended-dev@googlegroups.com",
    "download_url": "https://github.com/tensorflow/data-validation/tags",
    "platform": null,
    "description": "<!-- See: www.tensorflow.org/tfx/data_validation/ -->\n\n# TensorFlow Data Validation\n\n[![Python](https://img.shields.io/badge/python%7C3.9%7C3.10%7C3.11-blue)](https://github.com/tensorflow/data-validation)\n[![PyPI](https://badge.fury.io/py/tensorflow-data-validation.svg)](https://badge.fury.io/py/tensorflow-data-validation)\n[![Documentation](https://img.shields.io/badge/api-reference-blue.svg)](https://www.tensorflow.org/tfx/data_validation/api_docs/python/tfdv)\n\n*TensorFlow Data Validation* (TFDV) is a library for exploring and validating\nmachine learning data. It is designed to be highly scalable\nand to work well with TensorFlow and [TensorFlow Extended (TFX)](https://www.tensorflow.org/tfx).\n\nTF Data Validation includes:\n\n*    Scalable calculation of summary statistics of training and test data.\n*    Integration with a viewer for data distributions and statistics, as well\n     as faceted comparison of pairs of features ([Facets](https://github.com/PAIR-code/facets))\n*    Automated [data-schema](https://github.com/tensorflow/metadata/blob/master/tensorflow_metadata/proto/v0/schema.proto)\n     generation to describe expectations about data\n     like required values, ranges, and vocabularies\n*    A schema viewer to help you inspect the schema.\n*    Anomaly detection to identify [anomalies](https://github.com/tensorflow/data-validation/blob/master/g3doc/anomalies.md),\n     such as missing features,\n     out-of-range values, or wrong feature types, to name a few.\n*    An anomalies viewer so that you can see what features have anomalies and\n     learn more in order to correct them.\n\nFor instructions on using TFDV, see the [get started guide](https://github.com/tensorflow/data-validation/blob/master/g3doc/get_started.md)\nand try out the [example notebook](https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/data_validation/tfdv_basic.ipynb).\nSome of the techniques implemented in TFDV are 
described in a\n[technical paper published in SysML'19](https://mlsys.org/Conferences/2019/doc/2019/167.pdf).\n\n## Installing from PyPI\n\nThe recommended way to install TFDV is using the\n[PyPI package](https://pypi.org/project/tensorflow-data-validation/):\n\n```bash\npip install tensorflow-data-validation\n```\n### Nightly Packages\n\nTFDV also hosts nightly packages on Google Cloud. To install the latest nightly\npackage, please use the following command:\n\n```bash\nexport TFX_DEPENDENCY_SELECTOR=NIGHTLY\npip install --extra-index-url https://pypi-nightly.tensorflow.org/simple tensorflow-data-validation\n```\n\nThis will install the nightly packages for the major dependencies of TFDV such\nas TFX Basic Shared Libraries (TFX-BSL) and TensorFlow Metadata (TFMD).\n\nSometimes TFDV uses those dependencies' most recent changes, which are not yet\nreleased. Because of this, it is safer to use nightly versions of those\ndependent libraries when using nightly TFDV. Export the\n`TFX_DEPENDENCY_SELECTOR` environment variable to do so.\n\nNOTE: These nightly packages are unstable and breakages are likely to happen.\nThe fix could often take a week or more depending on the complexity involved.\n\n## Build with Docker\n\nThis is the recommended way to build TFDV under Linux, and is continuously\ntested at Google.\n\n### 1. Install Docker\n\nPlease first install `docker` and `docker-compose` by following the directions:\n[docker](https://docs.docker.com/install/);\n[docker-compose](https://docs.docker.com/compose/install/).\n\n### 2. Clone the TFDV repository\n\n```shell\ngit clone https://github.com/tensorflow/data-validation\ncd data-validation\n```\n\nNote that these instructions will install the latest master branch of TensorFlow\nData Validation. If you want to install a specific branch (such as a release\nbranch), pass `-b <branchname>` to the `git clone` command.\n\n### 3. 
Build the pip package\n\nThen, run the following at the project root:\n\n```bash\nsudo docker-compose build manylinux2010\nsudo docker-compose run -e PYTHON_VERSION=${PYTHON_VERSION} manylinux2010\n```\nwhere `PYTHON_VERSION` is one of `{39, 310, 311}`.\n\nA wheel will be produced under `dist/`.\n\n### 4. Install the pip package\n\n```shell\npip install dist/*.whl\n```\n\n## Build from source\n\n### 1. Prerequisites\n\nTo compile and use TFDV, you need to set up some prerequisites.\n\n#### Install NumPy\n\nIf NumPy is not installed on your system, install it now by following [these\ndirections](https://www.scipy.org/scipylib/download.html).\n\n#### Install Bazel\n\nIf Bazel is not installed on your system, install it now by following [these\ndirections](https://bazel.build/versions/master/docs/install.html).\n\n### 2. Clone the TFDV repository\n\n```shell\ngit clone https://github.com/tensorflow/data-validation\ncd data-validation\n```\n\nNote that these instructions will install the latest master branch of TensorFlow\nData Validation. If you want to install a specific branch (such as a release\nbranch), pass `-b <branchname>` to the `git clone` command.\n\n### 3. Build the pip package\n\n`TFDV` wheel is Python version dependent -- to build the pip package that\nworks for a specific Python version, use that Python binary to run:\n\n```shell\npython setup.py bdist_wheel\n```\n\nYou can find the generated `.whl` file in the `dist` subdirectory.\n\n### 4. Install the pip package\n\n```shell\npip install dist/*.whl\n```\n\n## Supported platforms\n\nTFDV is tested on the following 64-bit operating systems:\n\n  * macOS 12.5 (Monterey) or later.\n  * Ubuntu 20.04 or later.\n\n## Notable Dependencies\n\nTensorFlow is required.\n\n[Apache Beam](https://beam.apache.org/) is required; it's the way that efficient\ndistributed computation is supported. 
By default, Apache Beam runs in local\nmode but can also run in distributed mode using\n[Google Cloud Dataflow](https://cloud.google.com/dataflow/) and other Apache\nBeam\n[runners](https://beam.apache.org/documentation/runners/capability-matrix/).\n\n[Apache Arrow](https://arrow.apache.org/) is also required. TFDV uses Arrow to\nrepresent data internally in order to make use of vectorized numpy functions.\n\n## Compatible versions\n\nThe following table shows the  package versions that are\ncompatible with each other. This is determined by our testing framework, but\nother *untested* combinations may also work.\n\ntensorflow-data-validation                                                            | apache-beam[gcp] | pyarrow | tensorflow        | tensorflow-metadata | tensorflow-transform | tfx-bsl\n------------------------------------------------------------------------------------- | ---------------- | ------- | ----------------- | ------------------- | -------------------- | -------\n[GitHub master](https://github.com/tensorflow/data-validation/blob/master/RELEASE.md) | 2.59.0           | 10.0.1  | nightly (2.x)     | 1.16.0              | n/a                  | 1.16.0\n[1.16.0](https://github.com/tensorflow/data-validation/blob/v1.16.0/RELEASE.md)       | 2.59.0           | 10.0.1  | 2.16              | 1.16.0              | n/a                  | 1.16.0\n[1.15.1](https://github.com/tensorflow/data-validation/blob/v1.15.1/RELEASE.md)       | 2.47.0           | 10.0.0  | 2.15              | 1.15.0              | n/a                  | 1.15.1\n[1.15.0](https://github.com/tensorflow/data-validation/blob/v1.15.0/RELEASE.md)       | 2.47.0           | 10.0.0  | 2.15              | 1.15.0              | n/a                  | 1.15.0\n[1.14.0](https://github.com/tensorflow/data-validation/blob/v1.14.0/RELEASE.md)       | 2.47.0           | 10.0.0  | 2.13              | 1.14.0              | n/a                  | 
1.14.0\n[1.13.0](https://github.com/tensorflow/data-validation/blob/v1.13.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 2.12              | 1.13.1              | n/a                  | 1.13.0\n[1.12.0](https://github.com/tensorflow/data-validation/blob/v1.12.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 2.11              | 1.12.0              | n/a                  | 1.12.0\n[1.11.0](https://github.com/tensorflow/data-validation/blob/v1.11.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 1.15 / 2.10       | 1.11.0              | n/a                  | 1.11.0\n[1.10.0](https://github.com/tensorflow/data-validation/blob/v1.10.0/RELEASE.md)       | 2.40.0           | 6.0.0   | 1.15 / 2.9        | 1.10.0              | n/a                  | 1.10.1\n[1.9.0](https://github.com/tensorflow/data-validation/blob/v1.9.0/RELEASE.md)         | 2.38.0           | 5.0.0   | 1.15 / 2.9        | 1.9.0               | n/a                  | 1.9.0\n[1.8.0](https://github.com/tensorflow/data-validation/blob/v1.8.0/RELEASE.md)         | 2.38.0           | 5.0.0   | 1.15 / 2.8        | 1.8.0               | n/a                  | 1.8.0\n[1.7.0](https://github.com/tensorflow/data-validation/blob/v1.7.0/RELEASE.md)         | 2.36.0           | 5.0.0   | 1.15 / 2.8        | 1.7.0               | n/a                  | 1.7.0\n[1.6.0](https://github.com/tensorflow/data-validation/blob/v1.6.0/RELEASE.md)         | 2.35.0           | 5.0.0   | 1.15 / 2.7        | 1.6.0               | n/a                  | 1.6.0\n[1.5.0](https://github.com/tensorflow/data-validation/blob/v1.5.0/RELEASE.md)         | 2.34.0           | 5.0.0   | 1.15 / 2.7        | 1.5.0               | n/a                  | 1.5.0\n[1.4.0](https://github.com/tensorflow/data-validation/blob/v1.4.0/RELEASE.md)         | 2.32.0           | 4.0.1   | 1.15 / 2.6        | 1.4.0               | n/a                  | 1.4.0\n[1.3.0](https://github.com/tensorflow/data-validation/blob/v1.3.0/RELEASE.md)         | 2.32.0    
       | 2.0.0   | 1.15 / 2.6        | 1.2.0               | n/a                  | 1.3.0\n[1.2.0](https://github.com/tensorflow/data-validation/blob/v1.2.0/RELEASE.md)         | 2.31.0           | 2.0.0   | 1.15 / 2.5        | 1.2.0               | n/a                  | 1.2.0\n[1.1.1](https://github.com/tensorflow/data-validation/blob/v1.1.1/RELEASE.md)         | 2.29.0           | 2.0.0   | 1.15 / 2.5        | 1.1.0               | n/a                  | 1.1.1\n[1.1.0](https://github.com/tensorflow/data-validation/blob/v1.1.0/RELEASE.md)         | 2.29.0           | 2.0.0   | 1.15 / 2.5        | 1.1.0               | n/a                  | 1.1.0\n[1.0.0](https://github.com/tensorflow/data-validation/blob/v1.0.0/RELEASE.md)         | 2.29.0           | 2.0.0   | 1.15 / 2.5        | 1.0.0               | n/a                  | 1.0.0\n[0.30.0](https://github.com/tensorflow/data-validation/blob/v0.30.0/RELEASE.md)       | 2.28.0           | 2.0.0   | 1.15 / 2.4        | 0.30.0              | n/a                  | 0.30.0\n[0.29.0](https://github.com/tensorflow/data-validation/blob/v0.29.0/RELEASE.md)       | 2.28.0           | 2.0.0   | 1.15 / 2.4        | 0.29.0              | n/a                  | 0.29.0\n[0.28.0](https://github.com/tensorflow/data-validation/blob/v0.28.0/RELEASE.md)       | 2.28.0           | 2.0.0   | 1.15 / 2.4        | 0.28.0              | n/a                  | 0.28.1\n[0.27.0](https://github.com/tensorflow/data-validation/blob/v0.27.0/RELEASE.md)       | 2.27.0           | 2.0.0   | 1.15 / 2.4        | 0.27.0              | n/a                  | 0.27.0\n[0.26.1](https://github.com/tensorflow/data-validation/blob/v0.26.1/RELEASE.md)       | 2.28.0           | 0.17.0  | 1.15 / 2.3        | 0.26.0              | 0.26.0               | 0.26.0\n[0.26.0](https://github.com/tensorflow/data-validation/blob/v0.26.0/RELEASE.md)       | 2.25.0           | 0.17.0  | 1.15 / 2.3        | 0.26.0              | 0.26.0               | 
0.26.0\n[0.25.0](https://github.com/tensorflow/data-validation/blob/v0.25.0/RELEASE.md)       | 2.25.0           | 0.17.0  | 1.15 / 2.3        | 0.25.0              | 0.25.0               | 0.25.0\n[0.24.1](https://github.com/tensorflow/data-validation/blob/v0.24.1/RELEASE.md)       | 2.24.0           | 0.17.0  | 1.15 / 2.3        | 0.24.0              | 0.24.1               | 0.24.1\n[0.24.0](https://github.com/tensorflow/data-validation/blob/v0.24.0/RELEASE.md)       | 2.23.0           | 0.17.0  | 1.15 / 2.3        | 0.24.0              | 0.24.0               | 0.24.0\n[0.23.1](https://github.com/tensorflow/data-validation/blob/v0.23.1/RELEASE.md)       | 2.24.0           | 0.17.0  | 1.15 / 2.3        | 0.23.0              | 0.23.0               | 0.23.0\n[0.23.0](https://github.com/tensorflow/data-validation/blob/v0.23.0/RELEASE.md)       | 2.23.0           | 0.17.0  | 1.15 / 2.3        | 0.23.0              | 0.23.0               | 0.23.0\n[0.22.2](https://github.com/tensorflow/data-validation/blob/v0.22.2/RELEASE.md)       | 2.20.0           | 0.16.0  | 1.15 / 2.2        | 0.22.0              | 0.22.0               | 0.22.1\n[0.22.1](https://github.com/tensorflow/data-validation/blob/v0.22.1/RELEASE.md)       | 2.20.0           | 0.16.0  | 1.15 / 2.2        | 0.22.0              | 0.22.0               | 0.22.1\n[0.22.0](https://github.com/tensorflow/data-validation/blob/v0.22.0/RELEASE.md)       | 2.20.0           | 0.16.0  | 1.15 / 2.2        | 0.22.0              | 0.22.0               | 0.22.0\n[0.21.5](https://github.com/tensorflow/data-validation/blob/v0.21.5/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.1               | 0.21.3\n[0.21.4](https://github.com/tensorflow/data-validation/blob/v0.21.4/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.1               | 0.21.3\n[0.21.2](https://github.com/tensorflow/data-validation/blob/v0.21.2/RELEASE.md)       | 
2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.0               | 0.21.0\n[0.21.1](https://github.com/tensorflow/data-validation/blob/v0.21.1/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.0               | 0.21.0\n[0.21.0](https://github.com/tensorflow/data-validation/blob/v0.21.0/RELEASE.md)       | 2.17.0           | 0.15.0  | 1.15 / 2.1        | 0.21.0              | 0.21.0               | 0.21.0\n[0.15.0](https://github.com/tensorflow/data-validation/blob/v0.15.0/RELEASE.md)       | 2.16.0           | 0.14.0  | 1.15 / 2.0        | 0.15.0              | 0.15.0               | 0.15.0\n[0.14.1](https://github.com/tensorflow/data-validation/blob/v0.14.1/RELEASE.md)       | 2.14.0           | 0.14.0  | 1.14              | 0.14.0              | 0.14.0               | n/a\n[0.14.0](https://github.com/tensorflow/data-validation/blob/v0.14.0/RELEASE.md)       | 2.14.0           | 0.14.0  | 1.14              | 0.14.0              | 0.14.0               | n/a\n[0.13.1](https://github.com/tensorflow/data-validation/blob/v0.13.1/RELEASE.md)       | 2.11.0           | n/a     | 1.13              | 0.12.1              | 0.13.0               | n/a\n[0.13.0](https://github.com/tensorflow/data-validation/blob/v0.13.0/RELEASE.md)       | 2.11.0           | n/a     | 1.13              | 0.12.1              | 0.13.0               | n/a\n[0.12.0](https://github.com/tensorflow/data-validation/blob/v0.12.0/RELEASE.md)       | 2.10.0           | n/a     | 1.12              | 0.12.1              | 0.12.0               | n/a\n[0.11.0](https://github.com/tensorflow/data-validation/blob/v0.11.0/RELEASE.md)       | 2.8.0            | n/a     | 1.11              | 0.9.0               | 0.11.0               | n/a\n[0.9.0](https://github.com/tensorflow/data-validation/blob/v0.9.0/RELEASE.md)         | 2.6.0            | n/a     | 1.9               | n/a                 | n/a                  | n/a\n\n## 
Questions\n\nPlease direct any questions about working with TF Data Validation to\n[Stack Overflow](https://stackoverflow.com) using the\n[tensorflow-data-validation](https://stackoverflow.com/questions/tagged/tensorflow-data-validation)\ntag.\n\n## Links\n\n  * [TensorFlow Data Validation Getting Started Guide](https://www.tensorflow.org/tfx/data_validation/get_started)\n  * [TensorFlow Data Validation Notebook](https://colab.research.google.com/github/tensorflow/tfx/blob/master/docs/tutorials/data_validation/tfdv_basic.ipynb)\n  * [TensorFlow Data Validation API Documentation](https://www.tensorflow.org/tfx/data_validation/api_docs/python/tfdv)\n  * [TensorFlow Data Validation Blog Post](https://medium.com/tensorflow/introducing-tensorflow-data-validation-data-understanding-validation-and-monitoring-at-scale-d38e3952c2f0)\n  * [TensorFlow Data Validation PyPI](https://pypi.org/project/tensorflow-data-validation/)\n  * [TensorFlow Data Validation Paper](https://mlsys.org/Conferences/2019/doc/2019/167.pdf)\n  * [TensorFlow Data Validation Slides](https://conf.slac.stanford.edu/xldb2018/sites/xldb2018.conf.slac.stanford.edu/files/Tues_09.45_NeoklisPolyzotis_Data%20Analysis%20and%20Validation%20(1).pdf)\n\n",
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "A library for exploring and validating machine learning data.",
    "version": "1.16.1",
    "project_urls": {
        "Download": "https://github.com/tensorflow/data-validation/tags",
        "Homepage": "https://www.tensorflow.org/tfx/data_validation/get_started"
    },
    "split_keywords": [
        "tensorflow",
        "data",
        "validation",
        "tfx"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "5532b6efa864162b60346a3dce9959d99cde0f2d9f01c28f6006e9edd9a1cf4e",
                "md5": "f963271651a9f137c9663c63d3d400e1",
                "sha256": "b6e77fb1b9a16780866c07220d2f522774a29e56de83fe518f00bac886a5b185"
            },
            "downloads": -1,
            "filename": "tensorflow_data_validation-1.16.1-cp310-cp310-macosx_12_0_x86_64.whl",
            "has_sig": false,
            "md5_digest": "f963271651a9f137c9663c63d3d400e1",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": "<4,>=3.9",
            "size": 20235587,
            "upload_time": "2024-10-15T20:36:20",
            "upload_time_iso_8601": "2024-10-15T20:36:20.095130Z",
            "url": "https://files.pythonhosted.org/packages/55/32/b6efa864162b60346a3dce9959d99cde0f2d9f01c28f6006e9edd9a1cf4e/tensorflow_data_validation-1.16.1-cp310-cp310-macosx_12_0_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e65ff35035a955b56e78910880d07fb7d18b8f42ae20586ba027ca210bcd2aea",
                "md5": "1c1b1a1a755256578ba55e8d5b308cdc",
                "sha256": "e0500a0ece8d2f0539a35243b765f5f227f5b6bfab1b1e5e0ac8cb5a9e7a47a2"
            },
            "downloads": -1,
            "filename": "tensorflow_data_validation-1.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "1c1b1a1a755256578ba55e8d5b308cdc",
            "packagetype": "bdist_wheel",
            "python_version": "cp310",
            "requires_python": "<4,>=3.9",
            "size": 18960315,
            "upload_time": "2024-10-15T20:35:01",
            "upload_time_iso_8601": "2024-10-15T20:35:01.231074Z",
            "url": "https://files.pythonhosted.org/packages/e6/5f/f35035a955b56e78910880d07fb7d18b8f42ae20586ba027ca210bcd2aea/tensorflow_data_validation-1.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c9d88b193132c8769d31d11e92b058cd6651b5d8cba1b91878665bdb7408260b",
                "md5": "2e3f1db49ccd1034c8dbfb51a6aba5b6",
                "sha256": "b21fa86c61da5cee81b4d602953fea16878de4874eb6035bf7f3221cfeb91559"
            },
            "downloads": -1,
            "filename": "tensorflow_data_validation-1.16.1-cp311-cp311-macosx_12_0_x86_64.whl",
            "has_sig": false,
            "md5_digest": "2e3f1db49ccd1034c8dbfb51a6aba5b6",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": "<4,>=3.9",
            "size": 20236896,
            "upload_time": "2024-10-15T20:20:03",
            "upload_time_iso_8601": "2024-10-15T20:20:03.396830Z",
            "url": "https://files.pythonhosted.org/packages/c9/d8/8b193132c8769d31d11e92b058cd6651b5d8cba1b91878665bdb7408260b/tensorflow_data_validation-1.16.1-cp311-cp311-macosx_12_0_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "da4d4e758f700f1e1b0162ff74c0dca0f0ebd8e366e57088a29ec1f3bdaf0287",
                "md5": "7572efd83c4b1797f720533ac072c534",
                "sha256": "bb2e666e724a418fb45cca17442d33f906e91c086ad4a897aaf4473c94e0f4ee"
            },
            "downloads": -1,
            "filename": "tensorflow_data_validation-1.16.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "7572efd83c4b1797f720533ac072c534",
            "packagetype": "bdist_wheel",
            "python_version": "cp311",
            "requires_python": "<4,>=3.9",
            "size": 18961461,
            "upload_time": "2024-10-15T20:40:56",
            "upload_time_iso_8601": "2024-10-15T20:40:56.170801Z",
            "url": "https://files.pythonhosted.org/packages/da/4d/4e758f700f1e1b0162ff74c0dca0f0ebd8e366e57088a29ec1f3bdaf0287/tensorflow_data_validation-1.16.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "de9dc3c1cd8ccf1281f60452c2add5c09a5a3f64533c91fecf6aa5ad1b93b2fb",
                "md5": "6476aae9415bf7d18db483d257c08aff",
                "sha256": "cc25d5d5bba548425dad033fd8dad875f4eefabaaf611acd2ed57c069198b63d"
            },
            "downloads": -1,
            "filename": "tensorflow_data_validation-1.16.1-cp39-cp39-macosx_12_0_x86_64.whl",
            "has_sig": false,
            "md5_digest": "6476aae9415bf7d18db483d257c08aff",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": "<4,>=3.9",
            "size": 20235876,
            "upload_time": "2024-10-15T20:25:15",
            "upload_time_iso_8601": "2024-10-15T20:25:15.037933Z",
            "url": "https://files.pythonhosted.org/packages/de/9d/c3c1cd8ccf1281f60452c2add5c09a5a3f64533c91fecf6aa5ad1b93b2fb/tensorflow_data_validation-1.16.1-cp39-cp39-macosx_12_0_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "14dadaddb7b74a6b539674fd92be04a877e82bb0b919a09a13ed3aadb5af4f30",
                "md5": "27e06a3e5885fef62c4d086251784ddb",
                "sha256": "6169d46fc316c19b3401d1b3f255e6ae8fc45b4ea5c03bd2805dbed9656975b9"
            },
            "downloads": -1,
            "filename": "tensorflow_data_validation-1.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "27e06a3e5885fef62c4d086251784ddb",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": "<4,>=3.9",
            "size": 18960434,
            "upload_time": "2024-10-15T20:32:09",
            "upload_time_iso_8601": "2024-10-15T20:32:09.624817Z",
            "url": "https://files.pythonhosted.org/packages/14/da/daddb7b74a6b539674fd92be04a877e82bb0b919a09a13ed3aadb5af4f30/tensorflow_data_validation-1.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-15 20:36:20",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tensorflow",
    "github_project": "data-validation",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "tensorflow-data-validation"
}
        