classy-fire

Name	classy-fire
Version	0.2.1
home_page
Summary	Classy-fire is a multiclass text classification approach that leverages OpenAI LLM APIs through clever parameter tuning and prompting.
upload_time	2023-11-15 15:17:12
maintainer
docs_url	None
author
requires_python	>=3.7
license	MIT License Classy-Fire Copyright (c) Microsoft Corporation. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE
keywords	llm classification machine-learning
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# 🤵🔥 Classy-Fire 🔥🤵
Classy-fire is a pretrained multiclass text classification approach that leverages Azure OpenAI's LLM APIs through clever parameter tuning and prompting.

## Why?
* Tired of having to beg your LLM to pick from a set of options / actions?
* Tired of working hard on cleaning and parsing its responses to trigger a flow?
* Struggling to strip unhelpful prefixes (such as "Sure! " or "I am just a language model!")?
* Having to wait on retries in cases of unexpected outputs?
* Getting random responses on the same query?
* Need a "quick and dirty" text classifier? Don't have enough training data?


# Start here

## Installation
```
pip install classy-fire
```
## Usage example

```python
from classy_fire import LLMClassifier, MCMCClassifier

classifier = LLMClassifier(["Banana", "Watermelon", "Apple", "Grape"])

result = classifier("Has an elongated shape")
print(result)
# Output: ('Banana', 0)

mcmc_classifier = MCMCClassifier(["Banana", "Watermelon", "Apple", "Grape"])
result = mcmc_classifier("Has an oval or spherical shape")
print(result)
# Output: [('Banana', 0, 0.0), ('Watermelon', 1, 0.02), ('Apple', 2, 0.81), ('Grape', 3, 0.17)]
```
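The MCMCClassifier output can be post-processed with plain Python. The sketch below assumes each tuple is laid out as (label, index, probability), which is inferred from the example output above rather than from documented behaviour; `top_class` and its `min_probability` threshold are hypothetical helpers, not part of classy-fire.

```python
def top_class(distribution, min_probability=0.5):
    """Return (label, probability) for the most probable class,
    or None when even the best class falls below min_probability."""
    label, _, probability = max(distribution, key=lambda row: row[2])
    if probability < min_probability:
        return None
    return (label, probability)


# Sample distribution copied from the usage example above.
sample = [("Banana", 0, 0.0), ("Watermelon", 1, 0.02), ("Apple", 2, 0.81), ("Grape", 3, 0.17)]
best = top_class(sample)
print(best)
```

Returning None for low-confidence inputs lets a calling flow fall back to a default action instead of acting on a weak guess.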

## Prerequisites
Make sure the OPENAI_API_BASE, OPENAI_API_VERSION, OPENAI_API_TYPE and OPENAI_API_KEY environment variables are populated beforehand, and that you have a deployment of gpt-3.5-turbo named gpt-35-turbo-0301 (or pass the deployment_name and model_name parameters to the LLMClassifier constructor).
One way to accomplish this:
1. Create a file called ".env" in the same directory as your script, containing:
```
OPENAI_API_BASE=https://<endpoint>/
OPENAI_API_VERSION=2023-05-15  # for example
OPENAI_API_TYPE=azure  # or openai
OPENAI_API_KEY=<private key guid>
```
2. Install python-dotenv.
3. Add the following to the beginning of your script:
```python
from dotenv import load_dotenv

load_dotenv()
```
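Before constructing a classifier, it can help to confirm that the four variables listed above are actually set. The following is a minimal sketch using only the standard library; `REQUIRED_SETTINGS` and `missing_settings` are hypothetical names introduced here and are not provided by classy-fire.

```python
import os

# The four variable names come from the prerequisites above.
REQUIRED_SETTINGS = (
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "OPENAI_API_TYPE",
    "OPENAI_API_KEY",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or empty in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


missing = missing_settings()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
```

Passing a dictionary instead of relying on `os.environ` makes the check easy to exercise in tests.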

# Continue here

## LLMClassifier optional parameters
LLMClassifier can be initialized with added parameters that can help instruct and ground it to the classification task at hand.
* task_description: a string giving additional context on the classification options and on the inputs overall.
* few_shot_examples: a string containing sample inputs together with their expected outputs.
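As a rough sketch of how these two parameters might be supplied: the keyword names come from the list above, but the layout of the few-shot string is an assumption (the README only says it is "a string"), and `format_few_shot_examples` and `build_classifier` are hypothetical helpers introduced for illustration.

```python
def format_few_shot_examples(pairs):
    """Join (input text, expected label) pairs into one string, one example per line.
    The "Input: ... -> Output: ..." layout is an assumption, not a documented format."""
    return "\n".join(f"Input: {text} -> Output: {label}" for text, label in pairs)


task_description = "Each input describes a fruit; pick the fruit it most likely refers to."
few_shot_examples = format_few_shot_examples([
    ("Small, round and grows in bunches", "Grape"),
    ("Large with a green rind and red flesh", "Watermelon"),
])


def build_classifier():
    # Imported lazily so the formatting helper works without the package or credentials.
    from classy_fire import LLMClassifier

    return LLMClassifier(
        ["Banana", "Watermelon", "Apple", "Grape"],
        task_description=task_description,
        few_shot_examples=few_shot_examples,
    )
```

Calling `build_classifier()` requires the environment variables described in the prerequisites.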

## The premise behind classy-fire
In Classy-fire, we instruct the LLM to provide the most likely classification for an input string to a set of predetermined classes (also strings).
Formally, given a string instance $x_i$ and a set of $k$ classes provided as strings, $C=(C_1, ..., C_k)$, classy-fire determines

$\arg\max_j \Pr[x_i \in C_j \mid C, \Theta]$

where $\Theta$ denotes the parameters (knowledge of the world) of the language model.

* Classy-fire does this efficiently by mapping class strings to single tokens and providing a strong prior probability for these tokens. We instruct the model to generate a single token response, which allows for optimized inference runtime.
* Classy-fire does this deterministically and with less sensitivity to confabulation (hallucination) by setting the model temperature to 0, thereby guaranteeing the returned response is the argmax of the model posterior probability.
* **New!** MCMCClassifier can now estimate the posterior distribution over classes using a Markov Chain Monte Carlo approach!
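
The ideas in the bullets above can be illustrated with a toy model that makes no API calls. This is not the library's implementation: the logits are made up, the "strong prior" is modelled here as an additive logit bias (an assumption), and the posterior estimate uses plain repeated sampling rather than whatever Markov chain scheme MCMCClassifier actually employs.

```python
import math
import random
from collections import Counter


def biased_distribution(logits, class_tokens, bias=100.0):
    """Softmax over next-token logits after adding a large bias to the class tokens."""
    shifted = {tok: val + (bias if tok in class_tokens else 0.0) for tok, val in logits.items()}
    peak = max(shifted.values())  # subtract the peak for numerical stability
    weights = {tok: math.exp(val - peak) for tok, val in shifted.items()}
    total = sum(weights.values())
    return {tok: weight / total for tok, weight in weights.items()}


def greedy_pick(probs):
    """Temperature 0 behaviour: always return the most probable token."""
    return max(probs, key=probs.get)


def sampled_estimate(probs, n_samples, seed=0):
    """Estimate the distribution from repeated single-token samples."""
    rng = random.Random(seed)
    tokens = list(probs)
    draws = rng.choices(tokens, weights=[probs[t] for t in tokens], k=n_samples)
    counts = Counter(draws)
    return {tok: counts[tok] / n_samples for tok in tokens}


# Hypothetical logits: without the bias, the chatty prefix "Sure" would win.
logits = {"Sure": 5.0, "0": 1.0, "1": 0.2, "2": 2.5, "3": 1.8}
class_tokens = {"0", "1", "2", "3"}  # one single token per class index

probs = biased_distribution(logits, class_tokens)
label = greedy_pick(probs)
estimate = sampled_estimate(probs, n_samples=1000)
print(label, estimate)
```

With the bias applied, the non-class token receives negligible probability, so a single generated token is enough to read off the class.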

## Quality of results
We ran a preliminary experiment to classify a sample of 100 tweets from the [tweet_eval dataset](https://huggingface.co/datasets/tweet_eval/viewer/emotion/train).
The results [appear to beat the SOTA](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=tweet_eval&only_verified=0&task=-any-&config=emotion&split=test&metric=f1).
             
|              | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| anger        | 0.97      | 0.81   | 0.88     | 42      |
| joy          | 0.74      | 0.92   | 0.82     | 25      |
| optimism     | 1.00      | 0.25   | 0.40     | 8       |
| sadness      | 0.75      | 1.00   | 0.86     | 21      |
|              |           |        |          |         |
| accuracy     |           |        | 0.83     | 96      |
| macro avg    | 0.87      | 0.74   | 0.74     | 96      |
| weighted avg | 0.87      | 0.83   | 0.82     | 96      |

See evaluate.ipynb for the details behind this experiment.
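
For readers unfamiliar with the summary rows, the sketch below recomputes the macro and weighted averages from the per-class rows of the table above. The function names are illustrative; because the inputs are already rounded to two decimals, a recomputed average can differ from the reported one in the last digit.

```python
# Per-class rows copied from the table: (precision, recall, f1-score, support).
rows = {
    "anger": (0.97, 0.81, 0.88, 42),
    "joy": (0.74, 0.92, 0.82, 25),
    "optimism": (1.00, 0.25, 0.40, 8),
    "sadness": (0.75, 1.00, 0.86, 21),
}


def macro_average(rows, column):
    """Unweighted mean of one metric column across classes."""
    return sum(row[column] for row in rows.values()) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column weighted by each class's support."""
    total_support = sum(row[3] for row in rows.values())
    return sum(row[column] * row[3] for row in rows.values()) / total_support


F1 = 2  # index of the f1-score column
print(round(macro_average(rows, F1), 2), round(weighted_average(rows, F1), 2))
```

The gap between the two f1 averages reflects the small optimism class, whose low recall counts fully in the macro average but only 8/96 in the weighted one.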

* We encourage the community to benchmark and explore this method against larger or more standardized datasets.
* We have not evaluated alternative prompting strategies or the impact of adding few-shot examples, so the achievable quality may differ from the figures above.


# Other stuff

## Contributing

This project welcomes contributions and suggestions.  Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft 
trademarks or logos is subject to and must follow 
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos is subject to those third parties' policies.
 
<!-- BEGIN MICROSOFT SECURITY.MD V0.0.5 BLOCK -->

## Security

Microsoft takes the security of our software products and services seriously, which includes all source code repositories managed through our GitHub organizations, which include [Microsoft](https://github.com/Microsoft), [Azure](https://github.com/Azure), [DotNet](https://github.com/dotnet), [AspNet](https://github.com/aspnet), [Xamarin](https://github.com/xamarin), and [our GitHub organizations](https://opensource.microsoft.com/).

If you believe you have found a security vulnerability in any Microsoft-owned repository that meets [Microsoft's definition of a security vulnerability](https://docs.microsoft.com/en-us/previous-versions/tn-archive/cc751383(v=technet.10)), please report it to us as described below.

## Reporting Security Issues

**Please do not report security vulnerabilities through public GitHub issues.**

Instead, please report them to the Microsoft Security Response Center (MSRC) at [https://msrc.microsoft.com/create-report](https://msrc.microsoft.com/create-report).

If you prefer to submit without logging in, send email to [secure@microsoft.com](mailto:secure@microsoft.com).  If possible, encrypt your message with our PGP key; please download it from the [Microsoft Security Response Center PGP Key page](https://www.microsoft.com/en-us/msrc/pgp-key-msrc).

You should receive a response within 24 hours. If for some reason you do not, please follow up via email to ensure we received your original message. Additional information can be found at [microsoft.com/msrc](https://www.microsoft.com/msrc).

Please include the requested information listed below (as much as you can provide) to help us better understand the nature and scope of the possible issue:

  * Type of issue (e.g. buffer overflow, SQL injection, cross-site scripting, etc.)
  * Full paths of source file(s) related to the manifestation of the issue
  * The location of the affected source code (tag/branch/commit or direct URL)
  * Any special configuration required to reproduce the issue
  * Step-by-step instructions to reproduce the issue
  * Proof-of-concept or exploit code (if possible)
  * Impact of the issue, including how an attacker might exploit the issue

This information will help us triage your report more quickly.

If you are reporting for a bug bounty, more complete reports can contribute to a higher bounty award. Please visit our [Microsoft Bug Bounty Program](https://microsoft.com/msrc/bounty) page for more details about our active programs.

## Preferred Languages

We prefer all communications to be in English.

## Policy

Microsoft follows the principle of [Coordinated Vulnerability Disclosure](https://www.microsoft.com/en-us/msrc/cvd).

<!-- END MICROSOFT SECURITY.MD BLOCK -->

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "classy-fire",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "llm,classification,machine-learning",
    "author": "",
    "author_email": "Shay Ben-Elazar <shbenela@microsoft.com>",
    "download_url": "https://files.pythonhosted.org/packages/e3/88/876d0907d65ac23af1ba00bfffb76ef0c82c1c7cd557e0dc790eae4efbb0/classy-fire-0.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License  Classy-Fire Copyright (c) Microsoft Corporation.  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE ",
    "summary": "Classy-fire is multiclass text classification approach leveraging OpenAI LLM model APIs optimally using clever parameter tuning and prompting.",
    "version": "0.2.1",
    "project_urls": {
        "Homepage": "https://github.com/microsoft/classy-fire"
    },
    "split_keywords": [
        "llm",
        "classification",
        "machine-learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f15ddbfa73d40a16049f3acaa612acd8a1c51684542801642f73fed3dba73e5a",
                "md5": "57c85b7270afeef129fb44fa598ec1d8",
                "sha256": "d2bd850b9f21a782fa09e2d69ac62ceeb389e1b26095850f00445fac9ea39d07"
            },
            "downloads": -1,
            "filename": "classy_fire-0.2.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "57c85b7270afeef129fb44fa598ec1d8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 13088,
            "upload_time": "2023-11-15T15:17:10",
            "upload_time_iso_8601": "2023-11-15T15:17:10.777126Z",
            "url": "https://files.pythonhosted.org/packages/f1/5d/dbfa73d40a16049f3acaa612acd8a1c51684542801642f73fed3dba73e5a/classy_fire-0.2.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e388876d0907d65ac23af1ba00bfffb76ef0c82c1c7cd557e0dc790eae4efbb0",
                "md5": "aa4f84c1dabbe405b807a1080980922c",
                "sha256": "af9be8abbc3dc3b036ab5c2e4b1a9bfba4ca56b71c4450b1c323a5bc2cdef0ec"
            },
            "downloads": -1,
            "filename": "classy-fire-0.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "aa4f84c1dabbe405b807a1080980922c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 19278,
            "upload_time": "2023-11-15T15:17:12",
            "upload_time_iso_8601": "2023-11-15T15:17:12.157434Z",
            "url": "https://files.pythonhosted.org/packages/e3/88/876d0907d65ac23af1ba00bfffb76ef0c82c1c7cd557e0dc790eae4efbb0/classy-fire-0.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-15 15:17:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "microsoft",
    "github_project": "classy-fire",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "classy-fire"
}