# GENAI EVALUATION
[![PyPI version](https://badge.fury.io/py/genai-evaluation.svg)](https://badge.fury.io/py/genai-evaluation)
[![Documentation](https://img.shields.io/badge/Documentation-%20-blue)](https://rajiviyer.github.io/genai_evaluation/)
GenAI Evaluation is a Python library for evaluating the differences between real and synthetic data.
## Functions
- **multivariate_ecdf**: Computes the joint (multivariate) ECDF, in contrast to the univariate ECDF provided by packages such as statsmodels.
- **ks_statistic**: Calculates the KS statistic between two multivariate ECDFs.
## Authors
- [Dr. Vincent Granville](mailto:vincentg@mltechniques.com) - Research
- [Rajiv Iyer](mailto:raju.rgi@gmail.com) - Development/Maintenance
## Installation
The package can be installed with pip:
```
pip install genai_evaluation
```
## Tests
The tests can be run by cloning the repo and running:
```
pytest tests
```
If the tests fail to run, install the package locally first and then run them again:
```
pip install -e .
```
## Usage
Start by importing the functions:
```Python
from genai_evaluation import multivariate_ecdf, ks_statistic
```
Given two pandas DataFrames (real and synthetic) that contain only numerical columns, pass them to the multivariate_ecdf function, which returns the computed multivariate ECDFs of both:
```Python
query_str, ecdf_real, ecdf_synth = multivariate_ecdf(real_data, synthetic_data, n_nodes = 1000, verbose = True)
```
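To make the idea of a multivariate ECDF concrete, here is a minimal pure-Python sketch. It is not the package's implementation: the helper `joint_ecdf`, the toy data, and the choice of sampling evaluation nodes from the pooled rows are all assumptions made for illustration. The joint ECDF at a node is taken to be the fraction of rows whose every coordinate is less than or equal to the matching node coordinate.

```python
import random


def joint_ecdf(rows, node):
    """Fraction of rows whose every coordinate is <= the matching node coordinate.

    Hypothetical helper for illustration; not part of genai_evaluation.
    """
    hits = sum(all(v <= t for v, t in zip(row, node)) for row in rows)
    return hits / len(rows)


random.seed(0)

# Toy two-column "real" and "synthetic" datasets (illustrative values only).
real = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(500)]
synth = [(random.gauss(0.2, 1), random.gauss(0, 1.1)) for _ in range(500)]

# Shared evaluation nodes, sampled here from the pooled rows (an assumption).
nodes = random.sample(real + synth, 100)

# Both ECDFs are evaluated at the same nodes so they can be compared pointwise.
ecdf_real = [joint_ecdf(real, n) for n in nodes]
ecdf_synth = [joint_ecdf(synth, n) for n in nodes]
```

Evaluating both datasets at a common set of nodes is what makes a pointwise comparison of the two ECDFs possible.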
We then calculate the multivariate KS distance between the two ECDFs:
```Python
ks_stat = ks_statistic(ecdf_real, ecdf_synth)
```
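As a rough illustration of what a KS distance measures, the sketch below takes two ECDFs evaluated at the same points and returns the largest absolute gap between them. The function `ks_distance` and the sample values are hypothetical and are not taken from the package; the library's own ks_statistic may differ in its details.

```python
def ks_distance(ecdf_a, ecdf_b):
    """Largest absolute difference between two ECDFs evaluated at the same points.

    Hypothetical helper for illustration; not part of genai_evaluation.
    """
    if len(ecdf_a) != len(ecdf_b):
        raise ValueError("Both ECDFs must be evaluated at the same number of points")
    return max(abs(a - b) for a, b in zip(ecdf_a, ecdf_b))


# Illustrative ECDF values at four shared evaluation points.
ecdf_real_values = [0.10, 0.35, 0.60, 0.90]
ecdf_synth_values = [0.12, 0.30, 0.70, 0.88]

ks = ks_distance(ecdf_real_values, ecdf_synth_values)
print(round(ks, 2))  # prints 0.1
```

A value near 0 means the two ECDFs almost coincide at the evaluated points, while a value near 1 means they differ sharply somewhere.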
## Motivation
The motivation for this package comes from Dr. Vincent Granville's paper [Generative AI Technology Break-through: Spectacular Performance of New Synthesizer](https://mltechniques.com/2023/08/02/generative-ai-technology-break-through-spectacular-performance-of-new-synthesizer/).
If you have any tips or suggestions, please contact us by email.
# History
## 0.1.0 (2023-09-11)
- First release on PyPI.
## 0.1.1 (2023-09-11)
### Corrected
- Renamed the function compute_ecdf to multivariate_ecdf
## 0.1.2 (2023-09-11)
### Enhanced
- Added a new verbose parameter to the multivariate_ecdf function
## 0.1.3 (2023-09-11)
### Corrected
- Removed unnecessary docstrings from code
## 0.1.4 (2023-09-11)
### Fixed
- Resolved issues with special characters in the column names
## 0.1.5 (2023-09-11)
### Fixed
- Earlier versions treated the underscore as a special character; this is corrected in this version
# Raw data
{
"_id": null,
"home_page": "https://github.com/rajiviyer/genai_evaluation",
"name": "genai-evaluation",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "genai_evaluation",
"author": "Rajiv Iyer",
"author_email": "raju.rgi@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/d5/bb/04c2269abf2fcde09e74ba597a621c9bc26c9c44200ef8899d9947cd4876/genai_evaluation-0.1.5.tar.gz",
"platform": null,
"description": "# GENAI EVALUATION\r\n[![PyPI version](https://badge.fury.io/py/genai-evaluation.svg)](https://badge.fury.io/py/genai-evaluation)\r\n[![Documentation](https://img.shields.io/badge/Documentation-%20-blue)](https://rajiviyer.github.io/genai_evaluation/)\r\n\r\n\r\nGenAI Evaluation is a library which contains methods to evaluate differences in Real and Synthetic Data. \r\n\r\n## Functions\r\n- **multivariate_ecdf**: Computes joint or multivariate ECDF in contrast to the univariate capabilities provided by packages like statsmodels\r\n- **ks_statistic**: Calculates the KS Statistic for two multivariate ECDFs \r\n\r\n## Authors\r\n- [Dr. Vincent Granville](mailto:vincentg@mltechniques.com) - Research\r\n- [Rajiv Iyer](mailto:raju.rgi@gmail.com) - Development/Maintenance\r\n\r\n## Installation\r\nThe package can be installed with\r\n```\r\npip install genai_evaluation\r\n```\r\n\r\n## Tests\r\nThe test can be run by cloning the repo and running:\r\n```\r\npytest tests\r\n```\r\nIn case of any issues running the tests, please run them after installing the package locally:\r\n\r\n```\r\npip install -e .\r\n```\r\n\r\n## Usage\r\n\r\nStart by importing the class\r\n```Python\r\nfrom genai_evaluation import multivariate_ecdf, ks_statistic\r\n```\r\n\r\nAssuming we have two pandas dataframes (Real & Synthetic) and only numerical columns, we pass them to the multivariate_ecdf function which returns the computed multivariate ECDFs of both.\r\n```Python\r\nquery_str, ecdf_real, ecdf_synth = multivariate_ecdf(real_data, synthetic_data, n_nodes = 1000, verbose = True)\r\n```\r\n\r\nWe then calculate the multivariate KS Distance between the ECDFs\r\n```Python\r\nks_stat = ks_statistic(ecdf_real, ecdf_synth)\r\n```\r\n\r\n## Motivation\r\nThe motivation for this package comes from Dr. 
Vincent Granville's paper [Generative AI Technology Break-through: Spectacular Performance of New Synthesizer](https://mltechniques.com/2023/08/02/generative-ai-technology-break-through-spectacular-performance-of-new-synthesizer/)\r\n\r\nIf you have any tips or suggestions, please contact us on email.\r\n\r\n# History\r\n\r\n## 0.1.0 (2023-09-11)\r\n- First release on PyPI.\r\n\r\n## 0.1.1 (2023-09-11)\r\n### Corrected\r\n- Function name from compute_ecdf to multivariate_ecdf\r\n\r\n## 0.1.2 (2023-09-11)\r\n### Enhanced\r\n- Added a new parameter verbose in multivariate ECDF function\r\n\r\n## 0.1.3 (2023-09-11)\r\n### Corrected\r\n- Removed unecessary docstrings from code\r\n\r\n## 0.1.4 (2023-09-11)\r\n### Fixed\r\n- Resolved issues with special characters in the column names\r\n\r\n## 0.1.5 (2023-09-11)\r\n### Fixed\r\n- Earlier version considered underscore as a special character. That is rectified in this version\r\n",
"bugtrack_url": null,
"license": "MIT license",
"summary": "Evaluation of Generative AI Models",
"version": "0.1.5",
"project_urls": {
"Homepage": "https://github.com/rajiviyer/genai_evaluation"
},
"split_keywords": [
"genai_evaluation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "37cc66fc6a057d77ab19120e64f63d99e8841fea635c34d38b60239580d29359",
"md5": "dabd76718f4c9267ae7f335fee2ec62a",
"sha256": "2c3c2f6a53d73e1fbb32a7937b412676726e092d9563a075085d5e03e29d755d"
},
"downloads": -1,
"filename": "genai_evaluation-0.1.5-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "dabd76718f4c9267ae7f335fee2ec62a",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.6",
"size": 6037,
"upload_time": "2023-09-21T12:31:54",
"upload_time_iso_8601": "2023-09-21T12:31:54.784314Z",
"url": "https://files.pythonhosted.org/packages/37/cc/66fc6a057d77ab19120e64f63d99e8841fea635c34d38b60239580d29359/genai_evaluation-0.1.5-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d5bb04c2269abf2fcde09e74ba597a621c9bc26c9c44200ef8899d9947cd4876",
"md5": "756a51b61c3f16475f1d63fb8b3a9e45",
"sha256": "b23f7f2b118a7a5f6f6fbe3791574a44533e23bede66089cc91e4d1ed2f7bcb9"
},
"downloads": -1,
"filename": "genai_evaluation-0.1.5.tar.gz",
"has_sig": false,
"md5_digest": "756a51b61c3f16475f1d63fb8b3a9e45",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 7955,
"upload_time": "2023-09-21T12:31:55",
"upload_time_iso_8601": "2023-09-21T12:31:55.935672Z",
"url": "https://files.pythonhosted.org/packages/d5/bb/04c2269abf2fcde09e74ba597a621c9bc26c9c44200ef8899d9947cd4876/genai_evaluation-0.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-21 12:31:55",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rajiviyer",
"github_project": "genai_evaluation",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "genai-evaluation"
}