trust_eval


Name: trust_eval
Version: 0.1.0
Homepage: https://github.com/shanghongsim/trust-eval
Summary: Metric to measure RAG responses with inline citations
Upload time: 2025-01-03 04:05:05
Author: Shang Hong Sim
Requires Python: <3.12,>=3.10
License: CC BY-NC 4.0
Keywords: RAG, evaluation, metrics, citation
# Trust Eval

Trust Eval is a holistic metric for evaluating the trustworthiness of LLM outputs with inline citations, produced within the retrieval-augmented generation (RAG) framework.

## Project Structure

```text
trust-eval/
├── trust-eval/
│   ├── __init__.py
│   ├── config.py
│   ├── llm.py
│   ├── response_generator.py
│   ├── evaluator.py
│   ├── metrics.py
│   └── utils.py
├── tests/
│   ├── __init__.py
│   ├── test_response_generator.py
│   └── test_evaluator.py
├── README.md
├── poetry.lock
└── pyproject.toml
```

## Installation

```bash
conda create -n trust-eval python=3.10.13
conda activate trust-eval
poetry install
```
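Version 0.1.0 is also published on PyPI, so installing with pip (inside any Python 3.10–3.11 environment, per the `requires_python` constraint) should work as well:

```bash
pip install trust_eval
```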

Then, in a Python session, download the NLTK tokenizer data used for sentence splitting:

```python
import nltk
nltk.download('punkt_tab')
```

Finally, install vLLM built against CUDA 12.1.
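The README gives no exact command for the vLLM step. Assuming the stock PyPI wheels are acceptable (vLLM's default wheels around this release were built against CUDA 12.1), a plausible install is:

```bash
# Assumption: the default vLLM wheels satisfy the CUDA 12.1 requirement stated above
pip install vllm
```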

## Example usage

```python
from config import EvaluationConfig, ResponseGeneratorConfig
from evaluator import Evaluator
from logging_config import logger
from response_generator import ResponseGenerator

# Generate responses
generator_config = ResponseGeneratorConfig.from_yaml(yaml_path="generator_config.yaml")
logger.info(generator_config)
generator = ResponseGenerator(generator_config)
generator.generate_responses()
generator.save_responses()

# Evaluate responses
evaluation_config = EvaluationConfig.from_yaml(yaml_path="eval_config.yaml")
logger.info(evaluation_config)
evaluator = Evaluator(evaluation_config)
evaluator.compute_metrics()
evaluator.save_results()
```
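The README does not show the YAML schema that `ResponseGeneratorConfig.from_yaml` expects. As a rough sketch of the pattern such a loader typically follows (all field names here are hypothetical, not the library's actual schema):

```python
from dataclasses import dataclass, fields
import yaml  # requires pyyaml

@dataclass
class GeneratorConfigSketch:
    # Hypothetical fields -- the real ResponseGeneratorConfig schema is not documented here.
    model: str = "gpt-4o-mini"
    data_path: str = "data/questions.json"
    max_new_tokens: int = 300

    @classmethod
    def from_yaml(cls, yaml_path: str) -> "GeneratorConfigSketch":
        # Read the YAML file and keep only keys matching declared dataclass fields.
        with open(yaml_path) as f:
            raw = yaml.safe_load(f) or {}
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})
```

Filtering unknown keys keeps a loader like this tolerant of extra entries in the YAML file, while dataclass defaults fill in anything the file omits.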


            
