llm-confidence

Name	llm-confidence
Version	0.2.3
home_page	https://github.com/VATBox/llm-confidence
Summary	A Python package for extracting confidence scores from LLM models outputs, particularly using log probabilities
upload_time	2024-09-15 17:16:31
maintainer	None
docs_url	None
author	Ruth Miller
requires_python	>=3.7
license	Apache 2.0
keywords	logprobs handler confidence json ai llm machine-learning
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            
# llm-confidence

**llm-confidence** is a Python package that extracts and calculates confidence scores from the outputs of large language models (LLMs), using the log probabilities returned with each response. It is particularly useful for structured outputs such as JSON.

## Features

- Extract token-level log probabilities.
- Aggregate probabilities to calculate confidence scores for key-value pairs.
- Handle nested keys to compute confidence scores for related fields.
- Simple API for processing log probabilities from OpenAI GPT models.
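
To make the first feature concrete, here is a minimal sketch of token-level extraction. It is not the package's implementation: the helper name `extract_token_probs` is hypothetical, and the `SimpleNamespace` objects are stand-ins shaped like the entries of `choices[0].logprobs.content` in an OpenAI chat completion (each entry has a `token` and a `logprob` attribute). The structure returned by the package's own `format_logprobs` may differ.

```python
import math
from types import SimpleNamespace


def extract_token_probs(logprob_items):
    """Turn objects with .token and .logprob into dicts with a linear probability."""
    return [
        {
            "token": item.token,
            "logprob": item.logprob,
            # exp() converts a natural-log probability back to a probability in [0, 1].
            "prob": math.exp(item.logprob),
        }
        for item in logprob_items
    ]


# Stand-in data; real values come from the API response.
items = [
    SimpleNamespace(token="Paris", logprob=-0.4),
    SimpleNamespace(token="}", logprob=0.0),
]
rows = extract_token_probs(items)
for row in rows:
    print(row["token"], round(row["prob"], 4))
```

A log probability of `0.0` corresponds to a probability of 1, and more negative values correspond to less likely tokens.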

## Installation

You can install the package using `pip`:

```bash
pip install llm-confidence
```

## Usage

Here’s an example of how to use the `llm-confidence` package to calculate confidence scores based on log probabilities from OpenAI GPT models:

```python
import os
from typing import Dict, List

from openai import OpenAI
from llm_confidence.logprobs_handler import LogprobsHandler

# Initialize the LogprobsHandler
logprobs_handler = LogprobsHandler()

def get_completion(
        # typing.List/Dict keep the annotation valid on Python 3.7 and 3.8
        messages: List[Dict[str, str]],
        model: str = "gpt-4o",
        max_tokens=500,
        temperature=0,
        stop=None,
        seed=42,
        response_format=None,
        logprobs=None,
        top_logprobs=None,
):
    params = {
        "model": model,
        "messages": messages,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": stop,
        "seed": seed,
        "logprobs": logprobs,
        "top_logprobs": top_logprobs,
    }
    if response_format:
        params["response_format"] = response_format

    completion = client.chat.completions.create(**params)
    return completion

# Set up your OpenAI client with your API key
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "<your OpenAI API key if not set as env var>"))

# Define a prompt for completion
response_raw = get_completion(
    [{'role': 'user', 'content': 'Tell me the name of a city and a few streets in this city, and return the response in JSON format.'}],
    logprobs=True,
    response_format={'type': 'json_object'}
)

# Print the output
print(response_raw.choices[0].message.content)

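# Extract the log probabilities from the response.
# `logprobs` always exists on the choice but is None when log probabilities
# were not requested, so test its value instead of using hasattr.
choice_logprobs = response_raw.choices[0].logprobs
response_logprobs = (choice_logprobs.content or []) if choice_logprobs else []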

# Format the logprobs
logprobs_formatted = logprobs_handler.format_logprobs(response_logprobs)

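# Process the log probabilities to get confidence scores.
# The VAT field names below are illustrative and are unlikely to appear in the
# city/streets response above; replace them with keys from your own JSON output.
confidence = logprobs_handler.process_logprobs(
    logprobs_formatted,
    nested_keys_dct={'vat': ['vat_data', 'percent', 'vat_amount', 'exclude_vat_amount']}
)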

# Print the confidence scores
print(confidence)
```

### Example Breakdown

1. **Get Completion**: Sends a prompt to the OpenAI GPT model and retrieves the completion, including log probabilities.
2. **Logprobs Formatting**: Formats the raw log probabilities into a structured format that can be processed.
3. **Confidence Calculation**: Aggregates the probabilities and returns confidence scores for key-value pairs in the model's JSON response.
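
The steps above can be illustrated offline with a simplified sketch of how token log probabilities might be mapped to key-value pairs. This is an assumption-laden illustration, not the package's algorithm: the function `field_confidences` is hypothetical, it only handles a flat JSON object with string or integer values, and it scores each pair as the exponential of the summed log probabilities of every token overlapping the `"key": value` span.

```python
import json
import math


def field_confidences(tokens):
    """tokens: list of (text, logprob) pairs whose texts concatenate to a flat JSON object."""
    text = "".join(tok for tok, _ in tokens)
    data = json.loads(text)

    # Character offset at which each token starts in the full text.
    starts, pos = [], 0
    for tok, _ in tokens:
        starts.append(pos)
        pos += len(tok)

    scores, search_from = {}, 0
    for key, value in data.items():
        key_text, value_text = json.dumps(key), json.dumps(value)
        key_start = text.index(key_text, search_from)
        value_start = text.index(value_text, key_start + len(key_text))
        value_end = value_start + len(value_text)
        # Sum the logprobs of all tokens that overlap the span [key_start, value_end).
        total = sum(
            lp
            for (tok, lp), start in zip(tokens, starts)
            if start < value_end and start + len(tok) > key_start
        )
        scores[key] = round(math.exp(total), 5)
        search_from = value_end
    return scores


# Made-up tokens and log probabilities for: {"city": "Paris", "zip": 75001}
tokens = [
    ('{"', -0.001), ("city", -0.1), ('":', -0.001), (' "', -0.002),
    ("Paris", -0.4), ('",', -0.001), (' "', -0.002), ("zip", -0.05),
    ('":', -0.001), (" ", -0.001), ("750", -0.2), ("01", -0.3), ("}", -0.001),
]
scores = field_confidences(tokens)
print(scores)
```

Because tokens do not always align with JSON punctuation, a token that straddles a span boundary is counted toward that span in this sketch; a production implementation would need a more careful rule.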

### Customization

You can customize the `nested_keys_dct` parameter to aggregate confidence scores for your specific fields. For example:

```python
nested_keys_dct={'address': ['street', 'city', 'state']}
```

This will compute a combined confidence score for all fields related to addresses.
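
One plausible way to combine field scores into a group score is sketched below. It assumes the combined score is the product of the member scores (a joint probability that treats the fields as independent) and that members are matched by exact key name; the package may use a different rule, and `aggregate_nested` is a hypothetical helper, not part of its API.

```python
def aggregate_nested(field_scores, nested_keys):
    """Return a copy of field_scores with one combined score added per group."""
    combined = dict(field_scores)
    for group_name, members in nested_keys.items():
        # Ignore listed members that are absent from the scored fields.
        member_scores = [field_scores[m] for m in members if m in field_scores]
        if member_scores:
            product = 1.0
            for score in member_scores:
                product *= score
            combined[group_name] = round(product, 5)
    return combined


# Made-up per-field confidence scores.
field_scores = {"street": 0.9, "city": 0.8, "state": 0.5, "name": 0.99}
result = aggregate_nested(field_scores, {"address": ["street", "city", "state"]})
print(result["address"])
```

Under this assumption a group score can never exceed its weakest member, so a single uncertain field lowers the confidence of the whole group.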

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## Contributing

We welcome contributions! Here's how you can help:

1. Fork the project.
2. Create a branch for your feature or bug fix.
3. Open a pull request.

Make sure to include tests for any new features or bug fixes.

---

Feel free to use the package and improve upon it. If you encounter any issues, please open an issue in the repository.

Raw data

{
    "_id": null,
    "home_page": "https://github.com/VATBox/llm-confidence",
    "name": "llm-confidence",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "logprobs handler confidence json ai llm machine-learning",
    "author": "Ruth Miller",
    "author_email": "ruth.miller@bluedotcorp.com",
    "download_url": "https://files.pythonhosted.org/packages/3a/5d/f7105689dc26727cec3cb5d33186b2f3808baed884c0b896e7264b03ede0/llm_confidence-0.2.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "A Python package for extracting confidence scores from LLM models outputs, particularly using log probabilities",
    "version": "0.2.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/VATBox/llm-confidence/issues",
        "Homepage": "https://github.com/VATBox/llm-confidence"
    },
    "split_keywords": [
        "logprobs",
        "handler",
        "confidence",
        "json",
        "ai",
        "llm",
        "machine-learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "229ea2b82e643de1ed7a7db616743c9e015033e0be81dad80d3ef7e815af2d4e",
                "md5": "477ea3297e360be25a4a4ddd03627b9f",
                "sha256": "b0f2158639ee7497bdfb53e401cb3d2e515f434f9b15a2af4169aed2dcb4cefd"
            },
            "downloads": -1,
            "filename": "llm_confidence-0.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "477ea3297e360be25a4a4ddd03627b9f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 8651,
            "upload_time": "2024-09-15T17:16:30",
            "upload_time_iso_8601": "2024-09-15T17:16:30.426826Z",
            "url": "https://files.pythonhosted.org/packages/22/9e/a2b82e643de1ed7a7db616743c9e015033e0be81dad80d3ef7e815af2d4e/llm_confidence-0.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3a5df7105689dc26727cec3cb5d33186b2f3808baed884c0b896e7264b03ede0",
                "md5": "eeb7345a8469df9cfd0e1d64948e5f03",
                "sha256": "3828d4d70dd220dc3ea88a2fc53c6e23a2eba9447063534aeb1e333e37981e53"
            },
            "downloads": -1,
            "filename": "llm_confidence-0.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "eeb7345a8469df9cfd0e1d64948e5f03",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 8367,
            "upload_time": "2024-09-15T17:16:31",
            "upload_time_iso_8601": "2024-09-15T17:16:31.852931Z",
            "url": "https://files.pythonhosted.org/packages/3a/5d/f7105689dc26727cec3cb5d33186b2f3808baed884c0b896e7264b03ede0/llm_confidence-0.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-09-15 17:16:31",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "VATBox",
    "github_project": "llm-confidence",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "llm-confidence"
}
