# Trust Eval
Trust Eval is a holistic suite of metrics for evaluating the trustworthiness of LLM outputs with inline citations, produced within the retrieval-augmented generation (RAG) framework.
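To make concrete what such metrics measure, here is a toy, self-contained sketch of citation precision and recall over hand-made data. This is a simplified stand-in for illustration only, not Trust Eval's actual scoring code; the function and data names are hypothetical.

```python
def citation_scores(claims):
    """Toy citation metrics for one answer.

    claims: list of (cited_doc_ids, supporting_doc_ids) pairs, one per claim.
    A citation counts as correct when the cited doc actually supports the claim.
    """
    total_cited = sum(len(cited) for cited, _ in claims)
    correct_cited = sum(len(set(cited) & set(support)) for cited, support in claims)
    claims_with_support = sum(1 for cited, support in claims if set(cited) & set(support))
    precision = correct_cited / total_cited if total_cited else 0.0
    recall = claims_with_support / len(claims) if claims else 0.0
    return precision, recall

claims = [
    (["d1"], ["d1"]),        # correctly cited
    (["d2"], ["d3"]),        # cited the wrong doc
    (["d3", "d4"], ["d4"]),  # one of two citations supported
]
p, r = citation_scores(claims)  # p = 2/4, r = 2/3
```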
## Project Structure
```text
trust-eval/
├── trust-eval/
│   ├── __init__.py
│   ├── config.py
│   ├── llm.py
│   ├── response_generator.py
│   ├── evaluator.py
│   ├── metrics.py
│   └── utils.py
├── tests/
│   ├── __init__.py
│   ├── test_response_generator.py
│   └── test_evaluator.py
├── README.md
├── poetry.lock
└── pyproject.toml
```
## Installation
```bash
conda create -n trust-eval python=3.10.13
conda activate trust-eval
poetry install
```
Download the NLTK tokenizer data used during evaluation (this is Python, run inside the environment created above):

```python
import nltk
nltk.download('punkt_tab')
```

Then install vLLM built against CUDA 12.1.
## Example Usage
```python
from config import EvaluationConfig, ResponseGeneratorConfig
from evaluator import Evaluator
from logging_config import logger
from response_generator import ResponseGenerator
# Generate responses
generator_config = ResponseGeneratorConfig.from_yaml(yaml_path="generator_config.yaml")
logger.info(generator_config)
generator = ResponseGenerator(generator_config)
generator.generate_responses()
generator.save_responses()
# Evaluate responses
evaluation_config = EvaluationConfig.from_yaml(yaml_path="eval_config.yaml")
logger.info(evaluation_config)
evaluator = Evaluator(evaluation_config)
evaluator.compute_metrics()
evaluator.save_results()
```
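`ResponseGeneratorConfig.from_yaml` presumably parses a YAML file into a typed config object. The sketch below shows that pattern in a self-contained way; the field names are illustrative assumptions, not the library's actual schema, and a flat `key: value` parser stands in for a real YAML loader.

```python
from dataclasses import dataclass

# Hypothetical config fields -- illustrative only, not trust-eval's real schema.
@dataclass
class GeneratorConfig:
    model: str
    temperature: float
    output_path: str

def parse_flat_yaml(text: str) -> dict:
    """Parse flat 'key: value' lines (a minimal stand-in for a YAML loader)."""
    out = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(":")
        out[key.strip()] = value.strip()
    return out

raw = """
model: my-model
temperature: 0.7
output_path: responses.json
"""
fields = parse_flat_yaml(raw)
config = GeneratorConfig(
    model=fields["model"],
    temperature=float(fields["temperature"]),
    output_path=fields["output_path"],
)
```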