llm-helpers

Name: llm-helpers
Version: 0.1.1
Summary: A helper package to work with LLMs
Home page: https://github.com/sonnylaskar/llm_helpers
Author: Sonny Laskar
Requires Python: >=3.9
Keywords: llm
Upload time: 2024-03-06 06:41:18
# llm_helpers Package

## Overview

The `llm_helpers` package provides utilities for interacting with large language models (LLMs), including generating categories from text via recursive calls to services such as Azure OpenAI. The package simplifies sending requests to and interpreting responses from these models for large inputs that may not fit in the target model's context window.

## Installation

To install `llm_helpers`, download the package and install it using pip:

```
pip install llm_helpers
```

Or, if the package is hosted in a repository:

```
pip install git+https://github.com/sonnylaskar/llm_helpers.git
```

## Usage

To use the `llm_helpers` package in your project, simply import it and call the available functions. The primary function, `generate_categories`, generates category tags for a given text input using a specified large language model.
Currently, only OpenAI and Azure OpenAI are supported.

### Example

#### Recursive Category Generation from Text

When generating categories from a large corpus of text, we often encounter the challenge that the text exceeds the target model's context limit. An effective strategy is to split the text into chunks the model can process: first, chunk the text to fit within the model's context and generate categories for each chunk; then combine those categories and run category generation again on the combined result. A single iteration may not suffice if the combined output still exceeds the context limit, in which case further chunking and category generation are needed. The `generate_categories` function performs this process recursively, enabling streamlined category generation from text of arbitrary length.
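The chunk-and-merge loop described above can be sketched as follows. This is an illustrative sketch, not the package's actual internals; `chunk_text`, `reduce_recursively`, and the character-based splitting are assumptions for demonstration:

```python
def chunk_text(text, max_chars):
    """Split text into pieces no longer than max_chars."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def reduce_recursively(text, summarize, max_chars):
    """Repeatedly summarize chunks and merge the results until the
    whole text fits in one chunk, then summarize once more.

    Assumes summarize() returns output shorter than its input,
    otherwise the loop would not converge."""
    while len(text) > max_chars:
        pieces = [summarize(chunk) for chunk in chunk_text(text, max_chars)]
        text = "\n".join(pieces)
    return summarize(text)
```

In `generate_categories`, the role of `summarize` is played by an LLM call that returns category names for a chunk; the merge step feeds the combined category lists back through the same call.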

**Parameters:**
- **txt:** The input text for category generation.
- **llm:** The language model provider to use: `'azure'` or `'openai'`.
- **endpoint:** The Azure endpoint (required when using `'azure'`).
- **key:** The authentication key for the language model API.
- **api_version:** The API version of the chosen model.
- **deployment_name:** The Azure deployment name (required when using `'azure'`; see the example below).

**Optional Parameters (with defaults):**
- **max_tokens=200:** The maximum number of tokens to generate.
- **temperature=0.0:** Controls the randomness in the output generation.
- **frequency_penalty=0.0:** Adjusts the likelihood of repeating information.
- **presence_penalty=0.0:** Influences the introduction of new concepts.
- **max_token_size=4092:** The maximum token capacity of the target language model; the input text is chunked to fit within this limit.
- **system_prompt="Generate the top categories into which the below text can be grouped, just print the categories, do not add any examples, put them to Others category if they don't fit in any category:":** Customizable prompt that guides the model in generating relevant categories.

**Note:** 
- The `system_prompt` serves as a guideline for the model to ensure the categories generated align with the specified criteria.
- Adjust the `max_token_size` according to the maximum token capacity of the target language model to optimize the chunking process.  
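One way to reason about `max_token_size` is with a rough characters-per-token heuristic. The sketch below is illustrative only, assuming roughly 4 characters per token for English text (an approximation, not an exact tokenizer; the package may chunk differently):

```python
CHARS_PER_TOKEN = 4  # rough heuristic for English text, not exact

def estimate_tokens(text):
    """Crude token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def split_for_model(text, max_token_size=4092):
    """Split text on paragraph boundaries into chunks whose estimated
    token count stays under max_token_size.

    Limitation: a single paragraph longer than the budget is kept whole."""
    budget = max_token_size * CHARS_PER_TOKEN  # budget in characters
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) + 2 > budget:
            chunks.append(current)
            current = para
        else:
            current = current + "\n\n" + para if current else para
    if current:
        chunks.append(current)
    return chunks
```

For production use, an actual tokenizer for the target model gives far more reliable counts than a character heuristic.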

The following script demonstrates how to use the `llm_helpers` package to generate categories from text stored in a file named `sample_text.txt`:

```python
import llm_helpers

# Read the entire contents of the file into a string
with open('sample_text.txt', 'r') as file:
    txt = file.read()

# Replace the <...> placeholders below with your actual values
categories = llm_helpers.generate_categories(
    txt,
    llm='azure',
    endpoint="<azure_endpoint>",
    key="<azure_key>",
    api_version="<api_version>",
    deployment_name="<deployment_name>",
    max_tokens=200,
    temperature=0.0,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    max_token_size=4092,
    system_prompt=(
        "Generate the top categories into which the below text can be "
        "grouped, just print the categories, do not add any examples, "
        "put them in the Others category if they don't fit in any category: "
    ),
)
print(categories)
```

Replace the placeholders (`<>`) with your actual Azure endpoint, key, API version, and deployment name to run the script.
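Rather than hard-coding secrets in the script, you can read them from environment variables. The variable names below are illustrative choices, not something `llm_helpers` requires, and the fallback `api_version` is only an example default:

```python
import os

def load_azure_config():
    """Read Azure OpenAI settings from the environment instead of
    embedding them in source code. Variable names are illustrative."""
    return {
        "endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
        "key": os.environ["AZURE_OPENAI_KEY"],
        # Fall back to an example version if none is set
        "api_version": os.environ.get("AZURE_OPENAI_API_VERSION", "<api_version>"),
    }
```

The returned dictionary can then be unpacked into the call, e.g. `llm_helpers.generate_categories(txt, llm='azure', **load_azure_config(), deployment_name="<deployment_name>")`.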

## License

This project is licensed under the Apache License, Version 2.0. For more details, see the [LICENSE](LICENSE) file in the root directory of this project. 

## Contributing

We welcome contributions to the `llm_helpers` package! If you'd like to contribute, please follow these steps:

1. Fork the repository on GitHub.
2. Make your changes in your forked repository.
3. Submit a Pull Request back to the main repository.

We encourage you to discuss any substantial changes through a GitHub issue before you start working on your contribution. This allows us to provide feedback, suggest any necessary adjustments, and help you determine the best approach.

            
