Name | structured-logprobs
Version | 0.1.4
home_page | None
Summary | Logprobs for OpenAI Structured Outputs
upload_time | 2025-01-14 16:17:49
maintainer | None
docs_url | None
author | None
requires_python | <4.0,>=3.10
license | None
keywords | python
VCS |
bugtrack_url |
requirements | No requirements were recorded.
Travis-CI | No Travis.
coveralls test coverage | No coveralls.
![GitHub Tag](https://img.shields.io/github/v/tag/arena-ai/structured-logprobs)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/structured-logprobs)
[![Main Workflow](https://github.com/arena-ai/structured-logprobs/actions/workflows/main.yml/badge.svg)](https://github.com/arena-ai/structured-logprobs/actions/workflows/main.yml)
[![Release Workflow](https://github.com/arena-ai/structured-logprobs/actions/workflows/on-release-main.yml/badge.svg)](https://github.com/arena-ai/structured-logprobs/actions/workflows/on-release-main.yml)
![structured-logprobs](https://github.com/arena-ai/structured-logprobs/blob/main/docs/images/logo.png?raw=true)
This Python library is designed to enhance OpenAI chat completion responses by adding detailed information about token log probabilities.
This library works with OpenAI [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs), a feature that ensures the model always generates responses adhering to your supplied JSON Schema, so you don't need to worry about the model omitting a required key or hallucinating an invalid enum value.
It provides utilities to analyze and incorporate token-level log probabilities into structured outputs, helping developers understand the reliability of structured data extracted from OpenAI models.
## Objective
![structured-logprobs](https://github.com/arena-ai/structured-logprobs/blob/main/docs/images/pitch.png?raw=true)
The primary goal of **structured-logprobs** is to provide insight into the reliability of extracted data. By analyzing token-level log probabilities, the library helps assess how likely each value in an LLM's structured output is.
## Key Features
The module contains a function for mapping characters to token indices (`map_characters_to_token_indices`, sketched after the list below) and two methods for incorporating log probabilities:
1. Adding log probabilities as a separate field in the response (`add_logprobs`).
2. Embedding log probabilities inline within the message content (`add_logprobs_inline`).
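The character-to-token mapping underpins both methods: once each character of the JSON message content is tied to the token that produced it, the log probability of every field value can be looked up. The snippet below is only a rough sketch of that idea, assuming tokens are concatenated in order; it is not the library's actual `map_characters_to_token_indices` implementation, and the token strings are hypothetical.

```python
# Illustrative sketch only: map each character of the concatenated token
# strings to the index of the token it came from.
def sketch_map_characters_to_token_indices(token_strings: list[str]) -> list[int]:
    mapping: list[int] = []
    for token_index, token in enumerate(token_strings):
        # every character of this token points back to the same token index
        mapping.extend([token_index] * len(token))
    return mapping

# Example: the JSON text '{"a": 1}' split into hypothetical tokens
tokens = ['{"', 'a', '":', ' 1', '}']
char_to_token = sketch_map_characters_to_token_indices(tokens)
# char_to_token[i] is the index of the token that produced character i
assert len(char_to_token) == len("".join(tokens))
```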
## Example
To use this library, first create a chat completion response with the OpenAI Python SDK, then enhance the response with log probabilities.
Here is an example of how to do that:
```python
import json

from openai import OpenAI
from openai.types import ResponseFormatJSONSchema
from structured_logprobs import add_logprobs, add_logprobs_inline

# Initialize the OpenAI client
client = OpenAI(api_key="your-api-key")

# Load the JSON schema file
schema_path = "path-to-your-json-schema"
with open(schema_path) as f:
    schema_content = json.load(f)

# Validate the schema content
response_schema = ResponseFormatJSONSchema.model_validate(schema_content)

# Create a chat completion request with logprobs enabled
completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {
            "role": "system",
            "content": (
                "I have three questions. The first question is: What is the capital of France? "
                "The second question is: Which are the two nicest colors? "
                "The third question is: Can you roll a die and tell me which number comes up?"
            ),
        }
    ],
    logprobs=True,
    response_format=response_schema.model_dump(by_alias=True),
)

# Enhance the response with log probabilities
chat_completion = add_logprobs(completion)
chat_completion_inline = add_logprobs_inline(completion)

print(chat_completion.log_probs[0])
# {'capital_of_France': -5.5122365e-07, 'the_two_nicest_colors': [-0.0033997903, -0.011364183612649998], 'die_shows': -0.48048785}

print(chat_completion_inline.choices[0].message.content)
# {"capital_of_France": "Paris", "capital_of_France_logprob": -6.704273e-07, "the_two_nicest_colors": ["blue", "green"], "die_shows": 5.0, "die_shows_logprob": -2.3782086}
```
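The reported values are natural-log probabilities, so `math.exp` converts them back to probabilities between 0 and 1. Here is a small sketch of how the example output above can be read; the variable names are just for illustration:

```python
import math

# Values copied from the example output above.
capital_logprob = -5.5122365e-07   # log probability of the "capital_of_France" value
die_logprob = -0.48048785          # log probability of the "die_shows" value

# exp(logprob) turns a natural-log probability back into a probability.
print(math.exp(capital_logprob))   # ~0.9999995 -> the model was essentially certain
print(math.exp(die_logprob))       # ~0.62      -> noticeably less certain
```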
## Example JSON Schema
The `response_format` field in the request body is an object specifying the format that the model must output. Setting it to `{ "type": "json_schema", "json_schema": {...} }` ensures the model's output will match your supplied [JSON Schema](https://json-schema.org/overview/what-is-jsonschema).
Below is an example of a JSON file that defines the schema used to validate the responses.
```json
{
  "type": "json_schema",
  "json_schema": {
    "name": "answears",
    "description": "Response to questions in JSON format",
    "schema": {
      "type": "object",
      "properties": {
        "capital_of_France": { "type": "string" },
        "the_two_nicest_colors": {
          "type": "array",
          "items": {
            "type": "string",
            "enum": ["red", "blue", "green", "yellow", "purple"]
          }
        },
        "die_shows": { "type": "number" }
      },
      "required": ["capital_of_France", "the_two_nicest_colors", "die_shows"],
      "additionalProperties": false
    },
    "strict": true
  }
}
```
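For completeness, here is one way the file referenced by `schema_path` in the example could be produced. This is just a convenience sketch of writing the schema object above to disk with the standard library; the filename `schema.json` is an assumption.

```python
import json

# The same response_format object as the JSON document above,
# expressed as a Python dict so it can be written to a file.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "answears",
        "description": "Response to questions in JSON format",
        "schema": {
            "type": "object",
            "properties": {
                "capital_of_France": {"type": "string"},
                "the_two_nicest_colors": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "enum": ["red", "blue", "green", "yellow", "purple"],
                    },
                },
                "die_shows": {"type": "number"},
            },
            "required": ["capital_of_France", "the_two_nicest_colors", "die_shows"],
            "additionalProperties": False,
        },
        "strict": True,
    },
}

# Hypothetical file name; pass this path as schema_path in the example above.
with open("schema.json", "w") as f:
    json.dump(response_format, f, indent=2)
```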
Raw data
{
"_id": null,
"home_page": null,
"name": "structured-logprobs",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": "python",
"author": null,
"author_email": "Sarus Technologies <nicolas.grislain@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/f1/94/c847c18d0a94ab844ac523f40dd6f6d8c76ba50a924694be68e5e2f98013/structured_logprobs-0.1.4.tar.gz",
"platform": null,
"description": "![GitHub Tag](https://img.shields.io/github/v/tag/arena-ai/structured-logprobs)\n![PyPI - Python Version](https://img.shields.io/pypi/pyversions/structured-logprobs)\n[![Main Workflow](https://github.com/arena-ai/structured-logprobs/actions/workflows/main.yml/badge.svg)](https://github.com/arena-ai/structured-logprobs/actions/workflows/main.yml)\n[![Release Workflow](https://github.com/arena-ai/structured-logprobs/actions/workflows/on-release-main.yml/badge.svg)](https://github.com/arena-ai/structured-logprobs/actions/workflows/on-release-main.yml)\n\n![structured-logprobs](https://github.com/arena-ai/structured-logprobs/blob/main/docs/images/logo.png?raw=true)\n\nThis Python library is designed to enhance OpenAI chat completion responses by adding detailed information about token log probabilities.\nThis library works with OpenAI [Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs), which is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.\nIt provides utilities to analyze and incorporate token-level log probabilities into structured outputs, helping developers understand the reliability of structured data extracted from OpenAI models.\n\n## Objective\n\n![structured-logprobs](https://github.com/arena-ai/structured-logprobs/blob/main/docs/images/pitch.png?raw=true)\n\nThe primary goal of **structured-logprobs** is to provide insights into the reliability of extracted data. By analyzing token-level log probabilities, the library helps assess how likely each value generated from an LLM's structured outputs is.\n\n## Key Features\n\nThe module contains a function for mapping characters to token indices (`map_characters_to_token_indices`) and two methods for incorporating log probabilities:\n\n1. Adding log probabilities as a separate field in the response (`add_logprobs`).\n2. Embedding log probabilities inline within the message content (`add_logprobs_inline`).\n\n## Example\n\nTo use this library, first create a chat completion response with the OpenAI Python SDK, then enhance the response with log probabilities.\nHere is an example of how to do that:\n\n```python\nfrom openai import OpenAI\nfrom openai.types import ResponseFormatJSONSchema\nfrom structured_logprobs import add_logprobs, add_logprobs_inline\n\n# Initialize the OpenAI client\nclient = OpenAI(api_key=\"your-api-key\")\n\nschema_path = \"path-to-your-json-schema\"\nwith open(schema_path) as f:\n schema_content = json.load(f)\n\n# Validate the schema content\nresponse_schema = ResponseFormatJSONSchema.model_validate(schema_content)\n\n# Create a chat completion request\ncompletion = client.chat.completions.create(\n model=\"gpt-4o-2024-08-06\",\n messages = [\n {\n \"role\": \"system\",\n \"content\": (\n \"I have three questions. The first question is: What is the capital of France? \"\n \"The second question is: Which are the two nicest colors? 
\"\n \"The third question is: Can you roll a die and tell me which number comes up?\"\n ),\n }\n ],\n logprobs=True,\n response_format=response_schema.model_dump(by_alias=True),\n)\n\nchat_completion = add_logprobs(completion)\nchat_completion_inline = add_logprobs_inline(completion)\nprint(chat_completion.log_probs[0])\n{'capital_of_France': -5.5122365e-07, 'the_two_nicest_colors': [-0.0033997903, -0.011364183612649998], 'die_shows': -0.48048785}\nprint(chat_completion_inline.choices[0].message.content)\n{\"capital_of_France\": \"Paris\", \"capital_of_France_logprob\": -6.704273e-07, \"the_two_nicest_colors\": [\"blue\", \"green\"], \"die_shows\": 5.0, \"die_shows_logprob\": -2.3782086}\n```\n\n## Example JSON Schema\n\nThe `response_format` in the request body is an object specifying the format that the model must output. Setting to { \"type\": \"json_schema\", \"json_schema\": {...} } ensures the model will match your supplied [JSON schema](https://json-schema.org/overview/what-is-jsonschema).\n\nBelow is the example of the JSON file that defines the schema used for validating the responses.\n\n```python\n{\n \"type\": \"json_schema\",\n \"json_schema\": {\n \"name\": \"answears\",\n \"description\": \"Response to questions in JSON format\",\n \"schema\": {\n \"type\": \"object\",\n \"properties\": {\n \"capital_of_France\": { \"type\": \"string\" },\n \"the_two_nicest_colors\": {\n \"type\": \"array\",\n \"items\": {\n \"type\": \"string\",\n \"enum\": [\"red\", \"blue\", \"green\", \"yellow\", \"purple\"]\n }\n },\n \"die_shows\": { \"type\": \"number\" }\n },\n \"required\": [\"capital_of_France\", \"the_two_nicest_colors\", \"die_shows\"],\n \"additionalProperties\": false\n },\n \"strict\": true\n }\n}\n```\n",
"bugtrack_url": null,
"license": null,
"summary": "Logprobs for OpenAI Structured Outputs",
"version": "0.1.4",
"project_urls": {
"Documentation": "https://arena-ai.github.io/structured-logprobs/",
"Homepage": "https://arena-ai.github.io/structured-logprobs/",
"Repository": "https://github.com/arena-ai/structured-logprobs"
},
"split_keywords": [
"python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b0a6d11f7d39e8c716e6fd6c51fe91035955c9c5e9720fffa64dc985533a0162",
"md5": "10bf3fb38d40a12a1ed0f44114405026",
"sha256": "0384adfb5df3351d0df5b289f3b1df71c1b51d409d4d1cc02d9d2aff8c609c4d"
},
"downloads": -1,
"filename": "structured_logprobs-0.1.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "10bf3fb38d40a12a1ed0f44114405026",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 10893,
"upload_time": "2025-01-14T16:17:47",
"upload_time_iso_8601": "2025-01-14T16:17:47.347167Z",
"url": "https://files.pythonhosted.org/packages/b0/a6/d11f7d39e8c716e6fd6c51fe91035955c9c5e9720fffa64dc985533a0162/structured_logprobs-0.1.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f194c847c18d0a94ab844ac523f40dd6f6d8c76ba50a924694be68e5e2f98013",
"md5": "157a451cdc33d0525408c61f39c46801",
"sha256": "aea2849271e7bb3593c4a46ffa811e50d30e695999e56c5d9b9b219e78694269"
},
"downloads": -1,
"filename": "structured_logprobs-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "157a451cdc33d0525408c61f39c46801",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 13349,
"upload_time": "2025-01-14T16:17:49",
"upload_time_iso_8601": "2025-01-14T16:17:49.317566Z",
"url": "https://files.pythonhosted.org/packages/f1/94/c847c18d0a94ab844ac523f40dd6f6d8c76ba50a924694be68e5e2f98013/structured_logprobs-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-14 16:17:49",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "arena-ai",
"github_project": "structured-logprobs",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"tox": true,
"lcname": "structured-logprobs"
}