<img src="https://raw.githubusercontent.com/huggingface/setfit/main/assets/setfit.png">
<p align="center">
🤗 <a href="https://huggingface.co/models?library=setfit" target="_blank">Models</a> | 📊 <a href="https://huggingface.co/setfit" target="_blank">Datasets</a> | 📕 <a href="https://huggingface.co/docs/setfit" target="_blank">Documentation</a> | 📖 <a href="https://huggingface.co/blog/setfit" target="_blank">Blog</a> | 📃 <a href="https://arxiv.org/abs/2209.11055" target="_blank">Paper</a>
</p>
# SetFit - Efficient Few-shot Learning with Sentence Transformers
SetFit is an efficient and prompt-free framework for few-shot fine-tuning of [Sentence Transformers](https://sbert.net/). It achieves high accuracy with little labeled data - for instance, with only 8 labeled examples per class on the Customer Reviews sentiment dataset, SetFit is competitive with fine-tuning RoBERTa Large on the full training set of 3k examples 🤯!
Compared to other few-shot learning methods, SetFit has several unique features:
* 🗣 **No prompts or verbalizers:** Current techniques for few-shot fine-tuning require handcrafted prompts or verbalizers to convert examples into a format suitable for the underlying language model. SetFit dispenses with prompts altogether by generating rich embeddings directly from text examples.
* 🏎 **Fast to train:** SetFit doesn't require large-scale models like T0 or GPT-3 to achieve high accuracy. As a result, it is typically an order of magnitude (or more) faster to train and run inference with.
* 🌎 **Multilingual support:** SetFit can be used with any [Sentence Transformer](https://huggingface.co/models?library=sentence-transformers&sort=downloads) on the Hub, which means you can classify text in multiple languages by simply fine-tuning a multilingual checkpoint.
Check out the [SetFit Documentation](https://huggingface.co/docs/setfit) for more information!
## Installation
Download and install `setfit` by running:
```bash
pip install setfit
```
If you want the bleeding-edge version instead, install from source by running:
```bash
pip install git+https://github.com/huggingface/setfit.git
```
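To confirm which version ended up in your environment, a minimal sketch using only the standard library is shown below. The helper name `installed_version` is hypothetical and not part of `setfit`; it simply wraps `importlib.metadata.version`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Hypothetical helper for illustration; not part of the setfit API.
found = installed_version("setfit")
print(f"setfit {found}" if found else "setfit is not installed in this environment")
```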
## Usage
The [quickstart](https://huggingface.co/docs/setfit/quickstart) is a good place to learn about training, saving, loading, and performing inference with SetFit models.
For more examples, check out the [`notebooks`](https://github.com/huggingface/setfit/tree/main/notebooks) directory, the [tutorials](https://huggingface.co/docs/setfit/tutorials/overview), or the [how-to guides](https://huggingface.co/docs/setfit/how_to/overview).
### Training a SetFit model
`setfit` is integrated with the [Hugging Face Hub](https://huggingface.co/) and provides two main classes:
* [`SetFitModel`](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitModel): a wrapper that combines a pretrained body from `sentence_transformers` and a classification head from either [`scikit-learn`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) or [`SetFitHead`](https://huggingface.co/docs/setfit/reference/main#setfit.SetFitHead) (a differentiable head built upon `PyTorch` with similar APIs to `sentence_transformers`).
* [`Trainer`](https://huggingface.co/docs/setfit/reference/trainer#setfit.Trainer): a helper class that wraps the fine-tuning process of SetFit.
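To make the body/head split concrete, here is a toy sketch in plain Python. It is not SetFit's implementation: `bag_of_words_body`, `CentroidHead`, and `ToyBodyHeadModel` are hypothetical names, the "body" is a simple token counter standing in for a Sentence Transformer, and the "head" is a dot-product classifier standing in for logistic regression. It only illustrates the idea of a wrapper that embeds text with a body and classifies the embeddings with a head.

```python
from collections import Counter
from typing import Callable, Dict, List

Vector = Dict[str, float]


def bag_of_words_body(text: str) -> Vector:
    """Toy stand-in for an embedding body: lowercase token counts."""
    return {token: float(count) for token, count in Counter(text.lower().split()).items()}


class CentroidHead:
    """Toy classification head: sums the embeddings per label, scores by dot product."""

    def __init__(self) -> None:
        self.centroids: Dict[str, Vector] = {}

    def fit(self, embeddings: List[Vector], labels: List[str]) -> None:
        for emb, label in zip(embeddings, labels):
            centroid = self.centroids.setdefault(label, {})
            for token, value in emb.items():
                centroid[token] = centroid.get(token, 0.0) + value

    def predict(self, embeddings: List[Vector]) -> List[str]:
        def score(emb: Vector, centroid: Vector) -> float:
            return sum(value * centroid.get(token, 0.0) for token, value in emb.items())

        return [
            max(self.centroids, key=lambda label: score(emb, self.centroids[label]))
            for emb in embeddings
        ]


class ToyBodyHeadModel:
    """Hypothetical wrapper that combines a body and a head, for illustration only."""

    def __init__(self, body: Callable[[str], Vector], head: CentroidHead) -> None:
        self.body = body
        self.head = head

    def fit(self, texts: List[str], labels: List[str]) -> None:
        self.head.fit([self.body(text) for text in texts], labels)

    def predict(self, texts: List[str]) -> List[str]:
        return self.head.predict([self.body(text) for text in texts])


model = ToyBodyHeadModel(bag_of_words_body, CentroidHead())
model.fit(["great fun movie", "boring bad movie"], ["positive", "negative"])
print(model.predict(["a great film", "a bad film"]))
# ['positive', 'negative']
```

Because the wrapper only calls `fit` and `predict` on the head, either component can be swapped without touching the other.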
Here is a simple end-to-end training example using the default classification head from `scikit-learn`:
```python
from datasets import load_dataset
from setfit import SetFitModel, Trainer, TrainingArguments, sample_dataset
# Load a dataset from the Hugging Face Hub
dataset = load_dataset("sst2")
# Simulate the few-shot regime by sampling 8 examples per class
train_dataset = sample_dataset(dataset["train"], label_column="label", num_samples=8)
eval_dataset = dataset["validation"].select(range(100))
test_dataset = dataset["validation"].select(range(100, len(dataset["validation"])))
# Load a SetFit model from Hub
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-mpnet-base-v2",
    labels=["negative", "positive"],
)
args = TrainingArguments(
    batch_size=16,
    num_epochs=4,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    metric="accuracy",
    column_mapping={"sentence": "text", "label": "label"},  # Map dataset columns to text/label expected by trainer
)
# Train and evaluate
trainer.train()
metrics = trainer.evaluate(test_dataset)
print(metrics)
# {'accuracy': 0.8691709844559585}
# Push model to the Hub
trainer.push_to_hub("tomaarsen/setfit-paraphrase-mpnet-base-v2-sst2")
# Download from Hub
model = SetFitModel.from_pretrained("tomaarsen/setfit-paraphrase-mpnet-base-v2-sst2")
# Run inference
preds = model.predict(["i loved the spiderman movie!", "pineapple on pizza is the worst 🤮"])
print(preds)
# ["positive", "negative"]
```
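The two data-preparation steps in the example above, sampling a fixed number of examples per class and mapping dataset columns to `text`/`label`, can be sketched conceptually in plain Python. This is an illustration under stated assumptions, not the code behind `sample_dataset` or `column_mapping`: the helpers `sample_per_class` and `apply_column_mapping` are hypothetical, and the toy records stand in for a real dataset.

```python
import random
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, object]


def sample_per_class(
    records: List[Record],
    label_column: str = "label",
    num_samples: int = 8,
    seed: int = 42,
) -> List[Record]:
    """Pick up to `num_samples` records for each label, then shuffle the result."""
    rng = random.Random(seed)
    by_label: Dict[object, List[Record]] = defaultdict(list)
    for record in records:
        by_label[record[label_column]].append(record)

    sampled: List[Record] = []
    for label in sorted(by_label):
        group = by_label[label]
        sampled.extend(rng.sample(group, min(num_samples, len(group))))
    rng.shuffle(sampled)
    return sampled


def apply_column_mapping(records: List[Record], column_mapping: Dict[str, str]) -> List[Record]:
    """Rename the mapped columns and drop any column that is not in the mapping."""
    return [
        {column_mapping[key]: value for key, value in record.items() if key in column_mapping}
        for record in records
    ]


# Toy data: 40 records with a balanced binary label (an assumption for illustration).
toy = [{"sentence": f"example {i}", "label": i % 2} for i in range(40)]
few_shot = sample_per_class(toy, num_samples=8)
mapped = apply_column_mapping(few_shot, {"sentence": "text", "label": "label"})
print(len(mapped), sorted(mapped[0]))
# 16 ['label', 'text']
```

Fixing the seed keeps the sampled few-shot set reproducible across runs, which matters when comparing results on such small training sets.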
## Reproducing the results from the paper
We provide scripts to reproduce the results for SetFit and various baselines presented in Table 2 of our paper. Check out the setup and training instructions in the [`scripts/`](scripts/) directory.
## Developer installation
To run the code in this project, first create a Python virtual environment using e.g. Conda:
```bash
conda create -n setfit python=3.9 && conda activate setfit
```
Then install the base requirements with:
```bash
pip install -e '.[dev]'
```
This will install mandatory packages for SetFit like `datasets` as well as development packages like `black` and `isort` that we use to ensure consistent code formatting.
### Formatting your code
We use `black` and `isort` to ensure consistent code formatting. After following the installation steps, you can check your code locally by running:
```bash
make style && make quality
```
## Project structure
```
├── LICENSE
├── Makefile <- Makefile with commands like `make style` or `make tests`
├── README.md <- The top-level README for developers using this project.
├── docs <- Documentation source
├── notebooks <- Jupyter notebooks.
├── final_results <- Model predictions from the paper
├── scripts <- Scripts for training and inference
├── setup.cfg <- Configuration file to define package metadata
├── setup.py <- Make this project pip installable with `pip install -e`
├── src <- Source code for SetFit
└── tests <- Unit tests
```
## Related work
* [pmbaumgartner/setfit](https://github.com/pmbaumgartner/setfit) - A scikit-learn API version of SetFit.
* [jxpress/setfit-pytorch-lightning](https://github.com/jxpress/setfit-pytorch-lightning) - A PyTorch Lightning implementation of SetFit.
* [davidberenstein1957/spacy-setfit](https://github.com/davidberenstein1957/spacy-setfit) - An easy and intuitive approach to use SetFit in combination with spaCy.
## Citation
```bibtex
@misc{https://doi.org/10.48550/arxiv.2209.11055,
doi = {10.48550/ARXIV.2209.11055},
url = {https://arxiv.org/abs/2209.11055},
author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
title = {Efficient Few-Shot Learning Without Prompts},
publisher = {arXiv},
year = {2022},
copyright = {Creative Commons Attribution 4.0 International}
}
```