# Fast Sentence Transformers
This repository contains code to run faster feature extractors using tools like quantization, optimization and `ONNX`. Just run your model much faster while using less memory. There is not much to it!
[![Python package](https://github.com/Pandora-Intelligence/fast-sentence-transformers/actions/workflows/python-package.yml/badge.svg?branch=main)](https://github.com/Pandora-Intelligence/fast-sentence-transformers/actions/workflows/python-package.yml)
[![Current Release Version](https://img.shields.io/github/release/pandora-intelligence/fast-sentence-transformers.svg?style=flat-square&logo=github)](https://github.com/pandora-intelligence/fast-sentence-transformers/releases)
[![pypi Version](https://img.shields.io/pypi/v/fast-sentence-transformers.svg?style=flat-square&logo=pypi&logoColor=white)](https://pypi.org/project/fast-sentence-transformers/)
[![PyPi downloads](https://static.pepy.tech/personalized-badge/fast-sentence-transformers?period=total&units=international_system&left_color=grey&right_color=orange&left_text=pip%20downloads)](https://pypi.org/project/fast-sentence-transformers/)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square)](https://github.com/ambv/black)
> [Philipp Schmid](https://www.philschmid.de/optimize-sentence-transformers): "We successfully quantized our vanilla Transformers model with Hugging Face and managed to accelerate our model latency from 25.6ms to 12.3ms or 2.09x while keeping 100% of the accuracy on the stsb dataset.
> But I have to say that this isn't a plug and play process you can transfer to any Transformers model, task or dataset."
## Install
```bash
pip install fast-sentence-transformers
```
Or, for GPU support:
```bash
pip install "fast-sentence-transformers[gpu]"
```
## Quickstart
```python
from fast_sentence_transformers import FastSentenceTransformer as SentenceTransformer
# use any sentence-transformer
encoder = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")
encoder.encode("Hello hello, hey, hello hello")
encoder.encode(["Life is too short to eat bad food!"] * 2)
```
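A common next step is comparing two embeddings. The sketch below is a minimal, dependency-free cosine similarity, assuming `encode` returns one vector of floats per sentence (as `sentence-transformers` does). The helper name `cosine_similarity` and the toy vectors are illustrative and not part of this package.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption: real ones
# would come from encoder.encode(...)).
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
print(round(same, 6), round(orthogonal, 6))
```

With a real encoder, the two arguments would be the vectors returned by `encoder.encode(...)` for the sentences being compared.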
## Benchmark
Non-exact, indicative benchmark of speed and memory usage for a smaller and a larger `sentence-transformers` model:
| model | Type | default | ONNX | ONNX+quantized | ONNX+GPU |
| ------------------------------------- | ------ | ------- | ---- | -------------- | -------- |
| paraphrase-albert-small-v2 | memory | 1x | 1x | 1x | 1x |
| | speed | 1x | 2x | 5x | 20x |
| paraphrase-multilingual-mpnet-base-v2 | memory | 1x | 1x | 4x | 4x |
| | speed | 1x | 2x | 5x | 20x |
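To get indicative numbers like these on your own hardware, a small timing harness is usually enough. The following is a rough sketch, not the script used for the table above: `time_encode` and `speedup` are hypothetical helper names, and the dummy encoder only stands in for a real model so the example runs without downloads.

```python
import statistics
import time
from typing import Callable, Sequence


def time_encode(
    encode: Callable[[Sequence[str]], object],
    sentences: Sequence[str],
    repeats: int = 5,
    warmup: int = 1,
) -> float:
    """Return the median wall-clock seconds of one encode(sentences) call."""
    for _ in range(warmup):
        encode(sentences)  # warm-up calls are not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        encode(sentences)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    # Stand-in encoder (assumption) so the sketch runs without a model.
    def dummy_encode(batch):
        return [[float(len(s))] for s in batch]

    sentences = ["Life is too short to eat bad food!"] * 32
    seconds = time_encode(dummy_encode, sentences)
    print(f"median: {seconds * 1000:.3f} ms per batch")
```

For a real comparison, pass the `encode` method of a plain `sentence-transformers` model as the baseline and the `encode` method of a `FastSentenceTransformer` as the candidate, then feed both medians to `speedup`.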
## Shout-Out
This package heavily leans on https://www.philschmid.de/optimize-sentence-transformers.
Raw data
{
"_id": null,
"home_page": "https://github.com/pandora-intelligence/fast-sentence-transformers",
"name": "fast-sentence-transformers",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.13,>=3.9",
"maintainer_email": null,
"keywords": "sentence-transformerx, ONNX, NLP",
"author": "David Berenstein",
"author_email": "david.m.berenstein@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/6e/a9/7ea7990ebbe9628bb25bca180a957954fd43cf52cf41cc293b4408585529/fast_sentence_transformers-0.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "This repository contains code to run faster sentence-transformers. Simply, faster, sentence-transformers.",
"version": "0.5",
"project_urls": {
"Documentation": "https://github.com/pandora-intelligence/fast-sentence-transformers",
"Homepage": "https://github.com/pandora-intelligence/fast-sentence-transformers",
"Repository": "https://github.com/pandora-intelligence/fast-sentence-transformers"
},
"split_keywords": [
"sentence-transformerx",
" onnx",
" nlp"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "af3a4e6501279845b3623d3318a723bddebf64fc38ce94ac71e1ffe6f2c68c19",
"md5": "214f5a6602064136d67ef9b38ff08a75",
"sha256": "f4accf68b65061c54e071813fb5df45878e73e1d792b6efb3427f148e649baca"
},
"downloads": -1,
"filename": "fast_sentence_transformers-0.5-py3-none-any.whl",
"has_sig": false,
"md5_digest": "214f5a6602064136d67ef9b38ff08a75",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.13,>=3.9",
"size": 5956,
"upload_time": "2024-08-26T21:29:09",
"upload_time_iso_8601": "2024-08-26T21:29:09.942610Z",
"url": "https://files.pythonhosted.org/packages/af/3a/4e6501279845b3623d3318a723bddebf64fc38ce94ac71e1ffe6f2c68c19/fast_sentence_transformers-0.5-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "6ea97ea7990ebbe9628bb25bca180a957954fd43cf52cf41cc293b4408585529",
"md5": "2c4b98f0900e718d51fe688de2d7202c",
"sha256": "d6329ca7240bcb531b112b8d37684b002d4258e2a62fbf450543bf790a102cb4"
},
"downloads": -1,
"filename": "fast_sentence_transformers-0.5.tar.gz",
"has_sig": false,
"md5_digest": "2c4b98f0900e718d51fe688de2d7202c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.13,>=3.9",
"size": 5360,
"upload_time": "2024-08-26T21:29:11",
"upload_time_iso_8601": "2024-08-26T21:29:11.139703Z",
"url": "https://files.pythonhosted.org/packages/6e/a9/7ea7990ebbe9628bb25bca180a957954fd43cf52cf41cc293b4408585529/fast_sentence_transformers-0.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-26 21:29:11",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pandora-intelligence",
"github_project": "fast-sentence-transformers",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "fast-sentence-transformers"
}