llm-messages-token-helper

Name	llm-messages-token-helper
Version	0.0.3
home_page	None
Summary	A helper library for estimating tokens used by messages.
upload_time	2024-04-21 22:05:35
maintainer	None
docs_url	None
author	Pamela Fox
requires_python	>=3.9
license	None
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# llm-messages-token-helper

A helper library for estimating the tokens used by messages and for building message lists that fit within the token limit of a model.
It is currently designed to work with the OpenAI GPT models (including GPT-4 Turbo with Vision).
It uses the tiktoken library for tokenizing text and the Pillow library for image-related calculations.

## Installation

Install the package:

```sh
python3 -m pip install llm-messages-token-helper
```

## Usage

The library provides the following functions:

* [`build_messages`](#build_messages)
* [`count_tokens_for_message`](#count_tokens_for_message)
* [`count_tokens_for_image`](#count_tokens_for_image)
* [`get_token_limit`](#get_token_limit)

### `build_messages`

Build a list of messages for a chat conversation, given the system prompt, new user message,
and past messages. The function will truncate the history of past messages if necessary to
stay within the token limit.

Arguments:

* `model` (`str`): The model name to use for token calculation, such as `gpt-3.5-turbo`.
* `system_prompt` (`str`): The initial system prompt message.
* `new_user_message` (`str | list[openai.types.chat.ChatCompletionContentPartParam]`): The new user message to append.
* `past_messages` (`list[dict]`): The list of past messages in the conversation.
* `few_shots` (`list[dict]`): A few-shot list of messages to insert after the system prompt.
* `max_tokens` (`int`): The maximum number of tokens allowed for the conversation.

Returns:

* `list[openai.types.chat.ChatCompletionMessageParam]`

Example:

```python
from llm_messages_token_helper import build_messages

messages = build_messages(
    model="gpt-35-turbo",
    system_prompt="You are a bot.",
    new_user_message="That wasn't a good poem.",
    past_messages=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna I love tuna",
        },
    ],
    few_shots=[
        {
            "role": "user",
            "content": "Write me a poem",
        },
        {
            "role": "assistant",
            "content": "Tuna tuna is the best",
        },
    ]
)
```
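The truncation behavior described above can be pictured with a minimal sketch. This is not the library's implementation: `fit_history` and `word_count` are hypothetical names, the one-token-per-word counter stands in for a real tokenizer, and the sketch assumes the system prompt and the new user message are always kept while the oldest history is dropped first.

```python
from typing import Callable


def fit_history(
    system_prompt: str,
    new_user_message: str,
    past_messages: list[dict],
    max_tokens: int,
    count_tokens: Callable[[dict], int],
) -> list[dict]:
    """Keep the newest past messages that fit in the token budget (sketch)."""
    system = {"role": "system", "content": system_prompt}
    newest = {"role": "user", "content": new_user_message}

    # Assumption: the system prompt and the new user message are always kept.
    used = count_tokens(system) + count_tokens(newest)

    kept: list[dict] = []
    # Walk the history from newest to oldest and stop at the first
    # message that would push the total over the budget.
    for message in reversed(past_messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost

    # `kept` is newest-first, so reverse it back to chronological order.
    return [system, *reversed(kept), newest]


def word_count(message: dict) -> int:
    """Hypothetical stand-in for a tokenizer: one token per word."""
    return len(message["content"].split())


history = [
    {"role": "user", "content": "Write me a poem"},
    {"role": "assistant", "content": "Tuna tuna I love tuna"},
]
trimmed = fit_history(
    "You are a bot.", "That wasn't a good poem.", history, 14, word_count
)
```

With a budget of 14 "tokens", only the most recent past message fits, so the oldest one is dropped.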

### `count_tokens_for_message`

Count the number of tokens in a message.

Arguments:

* `model` (`str`): The model name to use for token calculation, such as `gpt-3.5-turbo`.
* `message` (`dict`): The message to count tokens for, with `role` and `content` keys.

Returns:

* `int`: The number of tokens in the message.

Example:

```python
from llm_messages_token_helper import count_tokens_for_message

message = {
    "role": "user",
    "content": "Hello, how are you?",
}
model = "gpt-4"
num_tokens = count_tokens_for_message(model, message)
```
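For intuition about where the count comes from, here is a rough sketch of the per-message recipe that OpenAI has published for recent chat models: a fixed overhead per message plus the encoded length of each text field. It is not this library's code. `estimate_message_tokens` and `fake_encode` are hypothetical names, the overhead constants are assumptions that may differ by model, and the encoder is passed in so the sketch runs without tiktoken.

```python
from typing import Callable

# Assumed overheads, taken from OpenAI's published counting recipe for
# recent chat models; they may not hold for every model.
TOKENS_PER_MESSAGE = 3
TOKENS_PER_NAME = 1


def estimate_message_tokens(message: dict, encode: Callable[[str], list]) -> int:
    """Estimate the tokens for one text-only chat message (sketch)."""
    total = TOKENS_PER_MESSAGE
    for key, value in message.items():
        if not isinstance(value, str):
            # Non-text content (for example image parts) is out of scope here.
            continue
        total += len(encode(value))
        if key == "name":
            total += TOKENS_PER_NAME
    return total


def fake_encode(text: str) -> list:
    """Hypothetical stand-in encoder: one token per whitespace-separated word."""
    return text.split()


message = {"role": "user", "content": "Hello, how are you?"}
estimate = estimate_message_tokens(message, fake_encode)
```

With tiktoken installed, `tiktoken.encoding_for_model("gpt-4").encode` could be passed in place of `fake_encode` to get real token counts.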

### `count_tokens_for_image`

Count the number of tokens for a base64-encoded image sent to GPT-4-vision.

Arguments:

* `image` (`str`): The image as a base64-encoded data URI (for example, `data:image/png;base64,...`).

Returns:

* `int`: The number of tokens used up for the image.

Example:

```python
from llm_messages_token_helper import count_tokens_for_image

image = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEA..."
num_tokens = count_tokens_for_image(image)
```
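The arithmetic behind an image estimate can be sketched from OpenAI's published pricing rules for GPT-4 with vision: a flat 85 tokens for low detail, and for high detail 170 tokens per 512 px tile plus 85, after the image is scaled down. This sketch is an assumption about the calculation, not the library's code. `estimate_image_tokens` is a hypothetical name, and it takes pixel dimensions directly instead of decoding a base64 image with Pillow.

```python
import math


def estimate_image_tokens(width: int, height: int, detail: str = "high") -> int:
    """Estimate vision tokens from pixel dimensions (sketch)."""
    if detail == "low":
        # Low-detail images cost a flat amount regardless of size.
        return 85

    # Step 1: shrink to fit within a 2048 x 2048 square, keeping aspect ratio.
    longest = max(width, height)
    if longest > 2048:
        scale = 2048 / longest
        width, height = int(width * scale), int(height * scale)

    # Step 2: shrink so the shortest side is at most 768 px.
    # Assumption: smaller images are not scaled up.
    shortest = min(width, height)
    if shortest > 768:
        scale = 768 / shortest
        width, height = int(width * scale), int(height * scale)

    # Step 3: count the 512 px tiles needed to cover the scaled image.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


square = estimate_image_tokens(1024, 1024)  # scaled to 768 x 768, 4 tiles
tall = estimate_image_tokens(2048, 4096)    # scaled to 768 x 1536, 6 tiles
```

Under these rules the 1024 x 1024 image costs 765 tokens and the 2048 x 4096 image costs 1105 tokens.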

### `get_token_limit`

Get the token limit for a given GPT model name. Both OpenAI.com and Azure OpenAI model names are supported.

Arguments:

* `model` (`str`): The model name to use for token calculation, such as `gpt-3.5-turbo` (OpenAI.com) or `gpt-35-turbo` (Azure).

Returns:

* `int`: The token limit for the model.

Example:

```python
from llm_messages_token_helper import get_token_limit

model = "gpt-4"
max_tokens = get_token_limit(model)
```
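One way to picture a lookup that accepts both naming styles is a small table keyed by the OpenAI.com name, with Azure-style names normalized first. This is only a sketch under stated assumptions: `lookup_token_limit` and `ILLUSTRATIVE_LIMITS` are hypothetical, the values are the context sizes of the original releases of these models and may be out of date, and the library's own table and error handling may differ.

```python
# Illustrative context-window sizes for the original model releases.
# Hypothetical table: check current model documentation before relying on it.
ILLUSTRATIVE_LIMITS = {
    "gpt-3.5-turbo": 4096,
    "gpt-4": 8192,
    "gpt-4-32k": 32768,
}


def lookup_token_limit(model: str) -> int:
    """Return the token limit for an OpenAI.com or Azure-style name (sketch)."""
    # Azure deployments drop the dot: "gpt-35-turbo" vs "gpt-3.5-turbo".
    canonical = model.replace("gpt-35-", "gpt-3.5-")
    if canonical not in ILLUSTRATIVE_LIMITS:
        raise ValueError(f"Unknown model name: {model}")
    return ILLUSTRATIVE_LIMITS[canonical]


openai_limit = lookup_token_limit("gpt-3.5-turbo")
azure_limit = lookup_token_limit("gpt-35-turbo")
```

Both calls resolve to the same entry, so callers do not need to know which service a model name came from.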

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "llm-messages-token-helper",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "Pamela Fox",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/f8/dd/9d63ccc35ea8ce28ec9031e3b56f286b50f76e8e40a6afd743ff9159b0aa/llm_messages_token_helper-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "A helper library for estimating tokens used by messages.",
    "version": "0.0.3",
    "project_urls": {
        "Home": "https://github.com/pamelafox/llm-messages-token-helper"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "5599c5103404ea4edf9ea1a5faa5d30d5aa7692987f56cd6bc4c1087acb630c2",
                "md5": "86dae66c6be03e792fe63656614aec81",
                "sha256": "05ac74db4f79767db9d0611368e6639d37320aa3eafa00137aee405fc234e6e1"
            },
            "downloads": -1,
            "filename": "llm_messages_token_helper-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "86dae66c6be03e792fe63656614aec81",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 7567,
            "upload_time": "2024-04-21T22:05:33",
            "upload_time_iso_8601": "2024-04-21T22:05:33.421183Z",
            "url": "https://files.pythonhosted.org/packages/55/99/c5103404ea4edf9ea1a5faa5d30d5aa7692987f56cd6bc4c1087acb630c2/llm_messages_token_helper-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f8dd9d63ccc35ea8ce28ec9031e3b56f286b50f76e8e40a6afd743ff9159b0aa",
                "md5": "5de5317604cc541f815b3cc01c9d4ce8",
                "sha256": "94a1fb2ae28c2c4c2a38569d4a6f4a921e0cea0838a33bfc645e05a7d6021b42"
            },
            "downloads": -1,
            "filename": "llm_messages_token_helper-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "5de5317604cc541f815b3cc01c9d4ce8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 288068,
            "upload_time": "2024-04-21T22:05:35",
            "upload_time_iso_8601": "2024-04-21T22:05:35.188274Z",
            "url": "https://files.pythonhosted.org/packages/f8/dd/9d63ccc35ea8ce28ec9031e3b56f286b50f76e8e40a6afd743ff9159b0aa/llm_messages_token_helper-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-21 22:05:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pamelafox",
    "github_project": "llm-messages-token-helper",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "llm-messages-token-helper"
}
