Test-Driven Data Analysis (Python TDDA library)
===============================================
What is it?
-----------
The TDDA Python module provides command-line and Python API support for
the overall process of data analysis, through the following tools:
- **Reference Testing**: extensions to `unittest` and `pytest` for
managing testing of data analysis pipelines, where the results are
typically much larger, and more complex, than single numerical
values.
- **Constraints**: tools (and API) for discovery of constraints from data,
for validation of constraints on new data, and for anomaly detection.
- **Finding Regular Expressions (Rexpy)**: tools (and API) for automatically
inferring regular expressions from text data.
- **Automatic Test Generation (Gentest)**: TDDA can generate tests for
more-or-less any command that can be run from a command line,
whether it be Python code, R code, a shell script, a shell
command, a `Makefile` or a multi-language pipeline involving
compiled code. _"Gentest writes tests, so you don't have to."™_
<img width="100%" src="doc/source/image/tdda-machines-light.png"/>
Documentation
-------------
http://tdda.readthedocs.io
Installation
------------
The simplest way to install all of the TDDA Python modules is using *pip*:

    pip install tdda

The full set of sources, including all examples, is downloadable from
PyPI with:

    pip download --no-binary :all: tdda

The sources are also publicly available from GitHub:

    git clone git@github.com:tdda/tdda.git

Documentation is available at http://tdda.readthedocs.io.

If you clone the GitHub repo, use

    python setup.py install

afterwards to install the command-line tools (`tdda` and `rexpy`).
*Reference Tests*
-----------------
The `tdda.referencetest` library is used to support
the creation of *reference tests*, based on either unittest or pytest.
These are like other tests except:
1. They have special support for comparing strings to files
and files to files.
2. That support includes the ability to provide exclusion patterns
(for things like dates and versions that might be in the output).
3. When a string/file assertion fails, the test prints the command you
need to diff the output.
4. If there were exclusion patterns, the test also writes modified versions
of both the actual and expected output, and prints the diff
command needed to compare those.
5. They have special support for handling CSV files.
6. They support flags (-w and -W) to rewrite the reference (expected)
results once you have confirmed that the new actuals are correct.
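
The string-to-file comparison with exclusion patterns described in the list above can be illustrated with a small standard-library sketch. This is not the `tdda.referencetest` implementation or its API: the function name `check_string_against_file`, the `ignore_patterns` parameter and the `<IGNORED>` placeholder are all hypothetical, chosen only to show the idea of masking volatile text before comparing and printing a diff command on failure.

```python
import os
import re
import tempfile


def check_string_against_file(actual, ref_path, ignore_patterns=()):
    """Compare a string with a reference file (illustrative sketch only).

    Text matched by any regex in ignore_patterns (a hypothetical
    parameter) is masked on both sides before the comparison.
    Returns (ok, message); message is empty when the comparison passes.
    """
    with open(ref_path, encoding="utf-8") as f:
        expected = f.read()

    def mask(text):
        # Replace every excluded match with a fixed placeholder.
        for pattern in ignore_patterns:
            text = re.sub(pattern, "<IGNORED>", text)
        return text.splitlines()

    if mask(actual) == mask(expected):
        return True, ""

    # On failure, save the actual output and suggest a diff command.
    fd, actual_path = tempfile.mkstemp(suffix=".actual.txt")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(actual)
    return False, "Compare with: diff %s %s" % (actual_path, ref_path)
```

With a pattern such as `r"v\d+\.\d+"`, two outputs that differ only in a version string compare as equal, while any other difference produces a failure message containing a `diff` command.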
For more details from a source distribution or checkout, see the `README.md`
file and examples in the `referencetest` subdirectory.
*Constraints*
-------------
The `tdda.constraints` library is used to 'discover' constraints
from a (Pandas) DataFrame, to write them out as JSON, and to verify that
datasets meet the constraints in the constraints file.
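
As a rough illustration of the discover, write as JSON, then verify cycle, the sketch below works on plain lists of dicts using only the standard library. It is not the `tdda.constraints` API and does not reproduce its file format: the function names and the keys `type`, `min`, `max` and `max_nulls` as used here are simplifying assumptions for the example.

```python
import json


def is_number(value):
    """True for ints and floats, but not for booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def discover_constraints(rows):
    """Infer simple per-field constraints from a list of dicts (sketch)."""
    constraints = {}
    for name in sorted({key for row in rows for key in row}):
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        field = {"max_nulls": len(values) - len(present)}
        types = {type(v).__name__ for v in present}
        if len(types) == 1:
            field["type"] = types.pop()
        if present and all(is_number(v) for v in present):
            field["min"] = min(present)
            field["max"] = max(present)
        constraints[name] = field
    return constraints


def verify_constraints(rows, constraints):
    """Return a list of failure messages (empty if every constraint holds)."""
    failures = []
    for name, field in constraints.items():
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        numbers = [v for v in present if is_number(v)]
        nulls = len(values) - len(present)
        if nulls > field["max_nulls"]:
            failures.append("%s: %d nulls > max_nulls %d"
                            % (name, nulls, field["max_nulls"]))
        if "type" in field and any(type(v).__name__ != field["type"]
                                   for v in present):
            failures.append("%s: type is not %s" % (name, field["type"]))
        if "min" in field and any(v < field["min"] for v in numbers):
            failures.append("%s: value below min %s" % (name, field["min"]))
        if "max" in field and any(v > field["max"] for v in numbers):
            failures.append("%s: value above max %s" % (name, field["max"]))
    return failures
```

The discovered dictionary can be saved with `json.dumps` and reloaded with `json.loads` before being passed to `verify_constraints`, which mirrors the idea of keeping constraints in a file and checking new datasets against it.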
For more details from a source distribution or checkout, see the `README.md`
file and examples in the `constraints` subdirectory.
*Finding Regular Expressions*
-----------------------------
The `tdda` repository also includes `rexpy`, a tool for automatically
inferring regular expressions from a single field of data examples.
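
To give a feel for what inferring a regular expression from examples means, here is a deliberately naive standard-library sketch. It is not rexpy's algorithm or API: the function `infer_regex` and its strategy (group each string into runs of digits, ASCII letters or literal characters, then merge run lengths across examples) are assumptions made purely for illustration.

```python
import re
import string
from itertools import groupby


def char_class(ch):
    """Map a character to a coarse regex fragment."""
    if ch in string.digits:
        return r"\d"
    if ch in string.ascii_letters:
        return "[A-Za-z]"
    return re.escape(ch)


def runs(text):
    """Split text into (fragment, run_length) pairs."""
    return [(frag, len(list(group)))
            for frag, group in groupby(text, key=char_class)]


def infer_regex(examples):
    """Infer one anchored regex that matches every example (toy version)."""
    tokenised = [runs(example) for example in examples]
    shapes = {tuple(frag for frag, _ in tokens) for tokens in tokenised}
    if len(shapes) != 1:
        # Examples differ in structure: fall back to a plain alternation.
        escaped = [re.escape(e) for e in sorted(set(examples))]
        return "^(" + "|".join(escaped) + ")$"
    parts = []
    for i, frag in enumerate(shapes.pop()):
        lengths = [tokens[i][1] for tokens in tokenised]
        lo, hi = min(lengths), max(lengths)
        if lo == hi == 1:
            parts.append(frag)
        elif lo == hi:
            parts.append("%s{%d}" % (frag, lo))
        else:
            parts.append("%s{%d,%d}" % (frag, lo, hi))
    return "^" + "".join(parts) + "$"
```

For the examples `AB-1234`, `XY-98` and `QQ-777`, this sketch produces `^[A-Za-z]{2}\-\d{2,4}$`. A real tool has to cope with far messier data, which is why this is only a toy.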
*Resources*
-----------
Resources on these topics include:
* TDDA Blog: http://www.tdda.info
* Quick Reference Guide ("Cheatsheet"): http://www.tdda.info/pdf/tdda-quickref.pdf
* 1-page summary: https://stochasticsolutions.com/pdf/TDDA-One-Pager.pdf
* Full documentation: http://tdda.readthedocs.io
* General Notes on Constraints and Assertions: http://www.tdda.info/constraints-and-assertions
* Notes on using the Pandas constraints library:
http://www.tdda.info/constraint-discovery-and-verification-for-pandas-dataframes
* PyCon UK Talk on TDDA:
- Video: https://www.youtube.com/watch?v=FIw_7aUuY50
- Slides and Rough Transcript: http://www.tdda.info/slides-and-rough-transcript-of-tdda-talk-from-pycon-uk-2016
* <a rel="me" href="https://mathstodon.xyz/@tdda">Mastodon</a>
Version 2.2.5 of the package requires Python 3.8 or later.
Raw data
{
"_id": null,
"home_page": "http://www.stochasticsolutions.com",
"name": "tdda",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "tdda constraint referencetest rexpy",
"author": "Simon Brown",
"author_email": "Nick Radcliffe <njr@stochasticsolutions.com>",
"download_url": "https://files.pythonhosted.org/packages/ae/44/c65820852c47a8ac705d8d41e2869de5823e7840d46293d680e3acb52d93/tdda-2.2.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": null,
"version": "2.2.5",
"project_urls": {
"Download": "https://github.com/tdda/tdda",
"Homepage": "http://www.stochasticsolutions.com"
},
"split_keywords": [
"tdda",
"constraint",
"referencetest",
"rexpy"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5c697f46f33513ea7c4ff39d56be6b7c670a9e234b7dce32b0b4737ef111f654",
"md5": "e1da1d233de869d2789cddbb083d8f35",
"sha256": "2302f027014263be569b1b8a0d63b50488f6a2de577162a641288ba53b6a59f1"
},
"downloads": -1,
"filename": "tdda-2.2.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "e1da1d233de869d2789cddbb083d8f35",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 12663258,
"upload_time": "2024-06-13T06:36:43",
"upload_time_iso_8601": "2024-06-13T06:36:43.209333Z",
"url": "https://files.pythonhosted.org/packages/5c/69/7f46f33513ea7c4ff39d56be6b7c670a9e234b7dce32b0b4737ef111f654/tdda-2.2.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ae44c65820852c47a8ac705d8d41e2869de5823e7840d46293d680e3acb52d93",
"md5": "b5f7e48df200c6cd59acf97bdf44a7e4",
"sha256": "d67efa42d6ce7f7e342c259e2cc453718a7be9a381ffbb848714473f7f5c05e8"
},
"downloads": -1,
"filename": "tdda-2.2.5.tar.gz",
"has_sig": false,
"md5_digest": "b5f7e48df200c6cd59acf97bdf44a7e4",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 12521021,
"upload_time": "2024-06-13T06:36:50",
"upload_time_iso_8601": "2024-06-13T06:36:50.471737Z",
"url": "https://files.pythonhosted.org/packages/ae/44/c65820852c47a8ac705d8d41e2869de5823e7840d46293d680e3acb52d93/tdda-2.2.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-06-13 06:36:50",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "tdda",
"github_project": "tdda",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "tdda"
}