| Field | Value |
| --- | --- |
| Name | llm-messages-token-helper |
| Version | 0.0.3 |
| home_page | None |
| Summary | A helper library for estimating tokens used by messages. |
| upload_time | 2024-04-21 22:05:35 |
| maintainer | None |
| docs_url | None |
| author | Pamela Fox |
| requires_python | >=3.9 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# llm-messages-token-helper
A helper library for estimating the tokens used by messages and for building message lists that fit within the token limit of a model.
Currently designed to work with the OpenAI GPT models (including GPT-4 Turbo with vision).
Uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.
## Installation
Install the package:
```sh
python3 -m pip install llm-messages-token-helper
```
## Usage
The library provides the following functions:
* [`build_messages`](#build_messages)
* [`count_tokens_for_message`](#count_tokens_for_message)
* [`count_tokens_for_image`](#count_tokens_for_image)
* [`get_token_limit`](#get_token_limit)
### `build_messages`
Build a list of messages for a chat conversation, given the system prompt, new user message,
and past messages. The function will truncate the history of past messages if necessary to
stay within the token limit.
Arguments:
* `model` (`str`): The model name to use for token calculation, like gpt-3.5-turbo.
* `system_prompt` (`str`): The initial system prompt message.
* `new_user_message` (`str | List[openai.types.chat.ChatCompletionContentPartParam]`): The new user message to append.
* `past_messages` (`list[dict]`): The list of past messages in the conversation.
* `few_shots` (`list[dict]`): A few-shot list of messages to insert after the system prompt.
* `max_tokens` (`int`): The maximum number of tokens allowed for the conversation.
Returns:
* `list[openai.types.chat.ChatCompletionMessageParam]`
Example:
```python
from llm_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_message="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ],
)
```
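The truncation described above can be pictured with a small, self-contained sketch. This is not the library's implementation: `fit_history` and `rough_token_count` are hypothetical names, the word-based counter is a stand-in for real tokenization, few-shot messages are left out for brevity, and dropping the oldest messages one at a time is an assumption about the strategy.

```python
from typing import Callable

Message = dict[str, str]


def rough_token_count(message: Message) -> int:
    """Hypothetical stand-in counter: 3 tokens of overhead plus one per word."""
    return 3 + len(message["content"].split())


def fit_history(
    system_prompt: str,
    new_user_message: str,
    past_messages: list[Message],
    max_tokens: int,
    count: Callable[[Message], int] = rough_token_count,
) -> list[Message]:
    """Keep the newest past messages that fit in the remaining token budget."""
    system = {"role": "system", "content": system_prompt}
    user = {"role": "user", "content": new_user_message}
    # The system prompt and the new user message are always kept.
    budget = max_tokens - count(system) - count(user)

    kept: list[Message] = []
    # Walk the history from newest to oldest and stop at the first message
    # that no longer fits, so the retained history stays contiguous.
    for message in reversed(past_messages):
        cost = count(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost

    return [system, *reversed(kept), user]


past = [
    {"role": "user", "content": "Write me a poem"},
    {"role": "assistant", "content": "Tuna tuna I love tuna"},
]
trimmed = fit_history("You are a bot.", "That wasn't a good poem.", past, max_tokens=25)
```

With the stand-in counter and a budget of 25, only the most recent past message survives, so `trimmed` holds the system prompt, the assistant reply, and the new user message.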
### `count_tokens_for_message`
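Count the number of tokens in a message.
Arguments:
* `model` (`str`): The model name to use for token calculation, like `gpt-3.5-turbo`.
* `message` (`dict`): The message to count tokens for, with `role` and `content` keys as in the example below.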
Returns:
* `int`: The number of tokens in the message.
Example:
```python
from llm_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
```
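For intuition about where a per-message count comes from, here is a minimal sketch of the accounting described in the OpenAI cookbook for recent chat models: a fixed overhead per message, plus the encoded length of each string field, plus one extra token when a `name` is present. The function `estimate_message_tokens` and the `whitespace_encode` stand-in are hypothetical, and the library's own counting may differ in detail.

```python
from typing import Callable, Sequence

TOKENS_PER_MESSAGE = 3  # overhead per message, per the OpenAI cookbook
TOKENS_PER_NAME = 1  # extra token when the message carries a "name" field


def whitespace_encode(text: str) -> list[str]:
    """Hypothetical stand-in encoder; real counts need a model tokenizer."""
    return text.split()


def estimate_message_tokens(
    message: dict,
    encode: Callable[[str], Sequence] = whitespace_encode,
) -> int:
    """Estimate tokens for one chat message with text-only fields."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        if not isinstance(value, str):
            # Non-text content (for example image parts) is out of scope here.
            continue
        total += len(encode(value))
        if key == "name":
            total += TOKENS_PER_NAME
    return total


estimate = estimate_message_tokens({"role": "user", "content": "Hello, how are you?"})
```

To get realistic numbers, pass a real encoder instead of the stand-in, for example `tiktoken.encoding_for_model("gpt-4").encode`.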
### `count_tokens_for_image`
Count the number of tokens for an image sent to GPT-4-vision, in base64 format.
Arguments:
* `image` (`str`): The base64-encoded image.
Returns:
* `int`: The number of tokens used up for the image.
Example:
```python
from llm_messages_token_helper import count_tokens_for_image
image = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA..."
num_tokens = count_tokens_for_image(image)
```
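The image cost depends on pixel dimensions rather than file size. The sketch below follows OpenAI's published rules for GPT-4 with vision: low detail costs a flat 85 tokens; high detail scales the image to fit within 2048x2048, shrinks it so the shortest side is at most 768 pixels, then charges 170 tokens per 512-pixel tile plus 85. The name `estimate_image_tokens` is hypothetical, it takes dimensions instead of a base64 string, and never upscaling small images is an assumption.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision tokens from image dimensions in pixels."""
    if detail == "low":
        return 85  # flat cost regardless of size

    # Step 1: scale down to fit within a 2048 x 2048 square.
    longest = max(width, height)
    if longest > 2048:
        scale = 2048 / longest
        width, height = int(width * scale), int(height * scale)

    # Step 2: scale down so the shortest side is at most 768 pixels.
    shortest = min(width, height)
    if shortest > 768:
        scale = 768 / shortest
        width, height = int(width * scale), int(height * scale)

    # Step 3: count the 512-pixel tiles needed to cover the image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


square = estimate_image_tokens(1024, 1024)  # 768x768 after scaling, 4 tiles -> 765
tall = estimate_image_tokens(2048, 4096)  # 768x1536 after scaling, 6 tiles -> 1105
```

In practice the width and height would come from decoding the image first, for example with Pillow.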
### `get_token_limit`
Get the token limit for a given GPT model name (OpenAI.com or Azure OpenAI supported).
Arguments:
* `model` (`str`): The model name to use for token calculation, like gpt-3.5-turbo (OpenAI.com) or gpt-35-turbo (Azure).
Returns:
* `int`: The token limit for the model.
Example:
```python
from llm_messages_token_helper import get_token_limit
model = "gpt-4"
max_tokens = get_token_limit(model)
```
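A lookup like this is usually a table keyed by model name, with Azure-style names mapped to their OpenAI.com equivalents. The sketch below is only an illustration: `lookup_token_limit`, `prompt_budget`, and both tables are hypothetical, and the limits shown are commonly cited context sizes that may not match the values the library returns.

```python
# Illustrative values only; check the provider's documentation for current limits.
ILLUSTRATIVE_LIMITS = {
    "gpt-3.5-turbo": 4096,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}

# Azure OpenAI model names drop the dot from "3.5".
AZURE_ALIASES = {"gpt-35-turbo": "gpt-3.5-turbo"}


def lookup_token_limit(model: str) -> int:
    """Return the illustrative token limit for a model, or raise ValueError."""
    canonical = AZURE_ALIASES.get(model, model)
    try:
        return ILLUSTRATIVE_LIMITS[canonical]
    except KeyError:
        raise ValueError(f"Unknown model: {model!r}") from None


def prompt_budget(model: str, response_tokens: int) -> int:
    """Tokens left for the prompt after reserving room for the reply."""
    return lookup_token_limit(model) - response_tokens


budget = prompt_budget("gpt-4", response_tokens=1024)
```

Reserving part of the limit for the response is a common pattern when choosing the `max_tokens` value to pass to `build_messages`.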
Raw data
{
"_id": null,
"home_page": null,
"name": "llm-messages-token-helper",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": null,
"author": "Pamela Fox",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/f8/dd/9d63ccc35ea8ce28ec9031e3b56f286b50f76e8e40a6afd743ff9159b0aa/llm_messages_token_helper-0.0.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "A helper library for estimating tokens used by messages.",
"version": "0.0.3",
"project_urls": {
"Home": "https://github.com/pamelafox/llm-messages-token-helper"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5599c5103404ea4edf9ea1a5faa5d30d5aa7692987f56cd6bc4c1087acb630c2",
"md5": "86dae66c6be03e792fe63656614aec81",
"sha256": "05ac74db4f79767db9d0611368e6639d37320aa3eafa00137aee405fc234e6e1"
},
"downloads": -1,
"filename": "llm_messages_token_helper-0.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "86dae66c6be03e792fe63656614aec81",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 7567,
"upload_time": "2024-04-21T22:05:33",
"upload_time_iso_8601": "2024-04-21T22:05:33.421183Z",
"url": "https://files.pythonhosted.org/packages/55/99/c5103404ea4edf9ea1a5faa5d30d5aa7692987f56cd6bc4c1087acb630c2/llm_messages_token_helper-0.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "f8dd9d63ccc35ea8ce28ec9031e3b56f286b50f76e8e40a6afd743ff9159b0aa",
"md5": "5de5317604cc541f815b3cc01c9d4ce8",
"sha256": "94a1fb2ae28c2c4c2a38569d4a6f4a921e0cea0838a33bfc645e05a7d6021b42"
},
"downloads": -1,
"filename": "llm_messages_token_helper-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "5de5317604cc541f815b3cc01c9d4ce8",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 288068,
"upload_time": "2024-04-21T22:05:35",
"upload_time_iso_8601": "2024-04-21T22:05:35.188274Z",
"url": "https://files.pythonhosted.org/packages/f8/dd/9d63ccc35ea8ce28ec9031e3b56f286b50f76e8e40a6afd743ff9159b0aa/llm_messages_token_helper-0.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-21 22:05:35",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pamelafox",
"github_project": "llm-messages-token-helper",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "llm-messages-token-helper"
}