prompt-optimizer

Name	prompt-optimizer
Version	0.2.1
home_page
Summary
upload_time	2023-05-21 01:30:33
maintainer
docs_url	None
author	Vaibhav Kumar
requires_python	>=3.8.1,<4.0
license
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.

<div align="center">

  ## PromptOptimizer
  
  <img width="200" src="evaluations/artifacts/logo.png" alt="kevin inspired logo" />

  Minimize LLM token complexity to save API costs and model computations.

</div>
<div align="center">

[![lint](https://github.com/vaibkumr/prompt-optimizer/actions/workflows/lint.yml/badge.svg)](https://github.com/vaibkumr/prompt-optimizer/actions/workflows/lint.yml) 
[![test](https://github.com/vaibkumr/prompt-optimizer/actions/workflows/test.yml/badge.svg)](https://github.com/vaibkumr/prompt-optimizer/actions/workflows/test.yml) 
[![linkcheck](https://github.com/vaibkumr/prompt-optimizer/actions/workflows/linkcheck.yml/badge.svg)](https://github.com/vaibkumr/prompt-optimizer/actions/workflows/linkcheck.yml) 
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

[Docs](https://promptoptimizer.readthedocs.io/en/latest/)

</div>


# Features
- **Plug and Play Optimizers:** Minimize token complexity using optimization methods that need no access to weights, logits, or the decoding algorithm. Directly applicable to virtually all NLU systems.
- **Protected Tags:** Special protected tags mark important sections of the prompt that should not be removed or modified.
- **Sequential Optimization:** Chain different optimizers together sequentially.
- **Optimization Metrics:** Number of tokens reduced and semantic similarity before and after optimization.
- **LangChain and JSON Support:** Supports LangChain-style prompt chains and OpenAI request JSON objects.
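
To make sequential optimization and the token-reduction metric concrete, here is a minimal, library-independent sketch. The two optimizers, the `chain` helper, and `tokens_reduced` are hypothetical names made up for illustration; they are not the package's API, and whitespace splitting is only a crude stand-in for a real tokenizer.

```python
import string
from typing import Callable, List

# Assumption: an "optimizer" is modeled as any function from prompt to prompt.
Optimizer = Callable[[str], str]


def strip_punctuation(prompt: str) -> str:
    """Hypothetical optimizer: drop ASCII punctuation characters."""
    return prompt.translate(str.maketrans("", "", string.punctuation))


def collapse_whitespace(prompt: str) -> str:
    """Hypothetical optimizer: squeeze runs of whitespace into single spaces."""
    return " ".join(prompt.split())


def chain(optimizers: List[Optimizer]) -> Optimizer:
    """Return an optimizer that applies the given ones left to right."""
    def run(prompt: str) -> str:
        for optimize in optimizers:
            prompt = optimize(prompt)
        return prompt
    return run


def tokens_reduced(before: str, after: str) -> float:
    """Fraction of whitespace-separated tokens removed by optimization."""
    n_before = len(before.split())
    return 0.0 if n_before == 0 else 1 - len(after.split()) / n_before


pipeline = chain([strip_punctuation, collapse_whitespace])
original = "Hello ,   world !  How are you ?"
optimized = pipeline(original)
print(optimized)
print(tokens_reduced(original, optimized))
```

Order matters in a chain like this: each optimizer sees the output of the previous one, so the metric should be computed against the original prompt rather than an intermediate result.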
 
# Why?
- **Minimize Token Complexity:** Token complexity is the number of prompt tokens required to achieve a given task. Reducing it lowers API costs linearly and lowers the computational cost of typical transformer models quadratically.
- **Save Money:** For large businesses, a 10% reduction in token count can save 100k USD per 1M USD spent.
- **Extend Limitations:** Some models have small context lengths; prompt optimizers can help them process documents that would otherwise exceed the context window.

| Prompt | # Tokens | Correct Response? |  
| ------------------------------------------------------- | ---------- | ------------------- |  
| Who is the president of the United States of America? | 11 | ✅ |  
| Who president US | 3  (-72%) | ✅ |
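
The linear and quadratic savings claimed above can be checked with a few lines of arithmetic. This is a back-of-the-envelope sketch, assuming the bill is dominated by per-token prompt pricing and that self-attention cost grows with the square of the sequence length; the function names and the spend figure are illustrative only.

```python
def api_cost_ratio(token_reduction: float) -> float:
    """Remaining API cost as a fraction of the original (linear in tokens)."""
    return 1.0 - token_reduction


def attention_cost_ratio(token_reduction: float) -> float:
    """Remaining self-attention cost, assumed quadratic in sequence length."""
    return (1.0 - token_reduction) ** 2


reduction = 0.10            # 10% fewer prompt tokens
spend_usd = 1_000_000       # illustrative spend before optimization
saved_usd = spend_usd * (1.0 - api_cost_ratio(reduction))

print(f"API cost saved: {saved_usd:,.0f} USD")
print(f"Attention cost remaining: {attention_cost_ratio(reduction):.2f}")

# The table row above: 11 tokens reduced to 3.
row_reduction = (11 - 3) / 11
print(f"Row reduction: {row_reduction:.3f}")
```

Under these assumptions a 10% token reduction leaves 90% of the API cost but only about 81% of the attention cost, which is where the "quadratic" benefit comes from.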

# Installation
### Quick Installation
```pip install prompt-optimizer```

### Install from source
```bash
git clone https://github.com/vaibkumr/prompt-optimizer.git
cd prompt-optimizer
pip install -e .
```

# Disclaimer
There is a compression vs. performance tradeoff: more compression comes at the cost of model performance. The tradeoff can be greatly mitigated by choosing the right optimizer for a given task. There is no single optimizer for all cases. There is no Adam here.


# Getting started

```python
from prompt_optimizer.poptim import EntropyOptim

prompt = """The Belle Tout Lighthouse is a decommissioned lighthouse and British landmark located at Beachy Head, East Sussex, close to the town of Eastbourne."""

# p sets the ratio of tokens to remove (0.0 removes none, 1.0 removes all).
p_optimizer = EntropyOptim(verbose=True, p=0.1)
optimized_prompt = p_optimizer(prompt)
print(optimized_prompt)
```
# Evaluations
The following results are for the [logiqa](https://github.com/openai/evals/blob/main/evals/registry/evals/logiqa.yaml) OpenAI evals task, evaluated on only the first 100 samples. Optimizer performance on this task should not be generalized to other tasks; more thorough testing and domain knowledge are needed to choose the best optimizer.

| Name | Tokens Reduced (fraction) | LogiQA Accuracy (fraction) | USD Saved Per $100 |
| --- | --- | --- | --- |
| Default | 0.0 | 0.32 | 0.0 |
| Entropy_Optim_p_0.05 | 0.06 | 0.3 | 6.35 |
| Entropy_Optim_p_0.1 | 0.11 | 0.28 | 11.19 |
| Entropy_Optim_p_0.25 | 0.26 | 0.22 | 26.47 |
| Entropy_Optim_p_0.5 | 0.5 | 0.08 | 49.65 |
| SynonymReplace_Optim_p_1.0 | 0.01 | 0.33 | 1.06 |
| Lemmatizer_Optim | 0.01 | 0.33 | 1.01 |
| NameReplace_Optim | 0.01 | 0.34 | 1.13 |
| Punctuation_Optim | 0.13 | 0.35 | 12.81 |
| Autocorrect_Optim | 0.01 | 0.3 | 1.14 |
| Pulp_Optim_p_0.05 | 0.05 | 0.31 | 5.49 |
| Pulp_Optim_p_0.1 | 0.1 | 0.25 | 9.52 |

# Cost-Performance Tradeoff
The reduction in cost often comes with a loss in LLM performance. Almost every optimizer has hyperparameters that control this tradeoff.

For example, in `EntropyOptim` the hyperparameter `p`, a floating point number between 0 and 1, controls the ratio of tokens to remove. `p=1.0` corresponds to removing all tokens, while `p=0.0` corresponds to removing none.
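
The role of `p` can be illustrated with a toy function that drops roughly a fraction `p` of the tokens, least informative first. This is only a sketch of the idea of a removal ratio: the function name is hypothetical, and the score used here (negative log of a token's frequency within the prompt itself) is an assumption chosen for simplicity, not how `EntropyOptim` scores tokens.

```python
import math
from collections import Counter
from typing import List


def remove_lowest_information(tokens: List[str], p: float) -> List[str]:
    """Drop about a fraction p of the tokens, lowest toy score first."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    total = len(tokens)
    counts = Counter(token.lower() for token in tokens)
    # Toy score: tokens that repeat often in the prompt carry less information.
    scores = [-math.log(counts[token.lower()] / total) for token in tokens]
    n_remove = int(p * total)
    # Indices of the n_remove lowest-scoring tokens; ties are broken by position.
    ranked = sorted(range(total), key=lambda i: (scores[i], i))
    to_remove = set(ranked[:n_remove])
    return [token for i, token in enumerate(tokens) if i not in to_remove]


tokens = "the cat sat on the mat near the door".split()
print(remove_lowest_information(tokens, 0.0))   # nothing removed
print(remove_lowest_information(tokens, 0.34))  # about a third removed
print(remove_lowest_information(tokens, 1.0))   # everything removed
```

The two extremes match the description above: `p=0.0` returns the prompt unchanged and `p=1.0` returns nothing, with intermediate values trading prompt length against retained information.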

The following chart shows the tradeoff for different values of `p`, evaluated on the OpenAI evals [logiqa](https://github.com/openai/evals/blob/main/evals/registry/evals/logiqa.yaml) task using the first 100 samples.

<div align="center">
  <img src="evaluations/artifacts/tradeoff.png" alt="tradeoff" />
</div>

# Contributing
There are several ways to contribute. Please see [CONTRIBUTING.md](.github/CONTRIBUTING.md) for contribution guidelines and possible future directions.

# Social
Contact us on Twitter: [Vaibhav Kumar](https://twitter.com/vaibhavk1o1) and [Vaibhav Kumar](https://twitter.com/vaibhavk97).

# Inspiration
<div align="center">
  <img src="evaluations/artifacts/kevin.gif" alt="Image" />
</div>

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "prompt-optimizer",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8.1,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Vaibhav Kumar",
    "author_email": "34630911+TimeTraveller-San@users.noreply.github.com",
    "download_url": "https://files.pythonhosted.org/packages/82/1c/8a1feae81004c0e9db4476405040e139eb675fb44b6ba9764b0569f39179/prompt_optimizer-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "",
    "version": "0.2.1",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "939cb3fc3eaa9fd440e2e2975f57329b64d182b4242099f5c0ab8c982355ad23",
                "md5": "7072ef2cc4cb6f8d07f6c5c783596599",
                "sha256": "061dd5e22b29238bb1d07c4880a3656cc5293f4ba7794f1993a641ba7e357635"
            },
            "downloads": -1,
            "filename": "prompt_optimizer-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7072ef2cc4cb6f8d07f6c5c783596599",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8.1,<4.0",
            "size": 26650,
            "upload_time": "2023-05-21T01:30:32",
            "upload_time_iso_8601": "2023-05-21T01:30:32.091463Z",
            "url": "https://files.pythonhosted.org/packages/93/9c/b3fc3eaa9fd440e2e2975f57329b64d182b4242099f5c0ab8c982355ad23/prompt_optimizer-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "821c8a1feae81004c0e9db4476405040e139eb675fb44b6ba9764b0569f39179",
                "md5": "85015a4ac9478405b2f541942b16e1d6",
                "sha256": "26a86a9ba90420dc4d404495722de0ce49a9b59c13925aaaa934041b459426c1"
            },
            "downloads": -1,
            "filename": "prompt_optimizer-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "85015a4ac9478405b2f541942b16e1d6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8.1,<4.0",
            "size": 19621,
            "upload_time": "2023-05-21T01:30:33",
            "upload_time_iso_8601": "2023-05-21T01:30:33.887025Z",
            "url": "https://files.pythonhosted.org/packages/82/1c/8a1feae81004c0e9db4476405040e139eb675fb44b6ba9764b0569f39179/prompt_optimizer-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-21 01:30:33",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "prompt-optimizer"
}
