embed 0.3.0 (PyPI)

- Summary: A stable, fast and easy-to-use inference library with a focus on a sync-to-async API
- Home page: https://github.com/michaelfeil/infinity
- Author: michaelfeil
- Requires Python: >=3.9, <4
- Keywords: vector, embedding, neural, search, sentence-transformers
- Uploaded: 2024-09-24 06:04:14
# embed
A stable, blazing-fast, and easy-to-use inference library with a focus on a sync-to-async API

[![ci][ci-shield]][ci-url]
[![Downloads][pepa-shield]][pepa-url]

## Installation
```bash
pip install embed
```

## Why embed?

Embed makes it easy to load any embedding, classification, or reranking model from Hugging Face.
It leverages [Infinity](https://github.com/michaelfeil/infinity) as the backend for async computation, batching, and Flash Attention 2.

![CPU Benchmark Diagram](docs/l4_cpu.png)
Benchmarked on an Nvidia L4 instance. Note: the CPU run uses bert-small, the CUDA run uses bert-large. [Methodology](https://michaelfeil.eu/infinity/0.0.51/benchmarking/).
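
The sync-to-async batching idea can be sketched with the standard library alone. The following is a minimal illustration (not Infinity's actual implementation): callers get a `Future` back immediately, while a background thread drains the queue into batches and resolves all pending Futures with one "model call". The `MiniBatcher` class and its toy `model_fn` (which just returns sentence lengths) are hypothetical stand-ins.

```python
import queue
import threading
from concurrent.futures import Future


class MiniBatcher:
    """Toy dynamic batcher: sync submit, async resolution via Futures."""

    def __init__(self, batch_size=4, model_fn=None):
        self._q = queue.Queue()
        self._batch_size = batch_size
        # model_fn stands in for a real model forward pass over a batch.
        self._model_fn = model_fn or (lambda batch: [len(s) for s in batch])
        threading.Thread(target=self._run, daemon=True).start()

    def embed(self, sentence: str) -> Future:
        fut = Future()
        self._q.put((sentence, fut))
        return fut  # caller can do other work, then call fut.result()

    def _run(self):
        while True:
            items = [self._q.get()]  # block for at least one item
            # Drain whatever else is already waiting, up to batch_size.
            while len(items) < self._batch_size:
                try:
                    items.append(self._q.get_nowait())
                except queue.Empty:
                    break
            batch = [sentence for sentence, _ in items]
            results = self._model_fn(batch)  # one call for the whole batch
            for (_, fut), res in zip(items, results):
                fut.set_result(res)


batcher = MiniBatcher()
futures = [batcher.embed(s) for s in ["Paris is in France.", "Berlin is in Germany."]]
print([f.result() for f in futures])  # [19, 21]
```

The caller never sees the batching: it submits one item at a time and blocks only when it asks for the result.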

```python
from embed import BatchedInference
from concurrent.futures import Future

# Run any model
register = BatchedInference(
    model_id=[
        # sentence-embeddings
        "michaelfeil/bge-small-en-v1.5",
        # sentence-embeddings and image-embeddings
        "jinaai/jina-clip-v1",
        # classification models
        "philschmid/tiny-bert-sst2-distilled",
        # rerankers
        "mixedbread-ai/mxbai-rerank-xsmall-v1",
    ],
    # engine: `torch` or `optimum`
    engine="torch",
    # device: `cuda` (Nvidia/AMD) or `cpu`
    device="cpu",
)

sentences = ["Paris is in France.", "Berlin is in Germany.", "An image of two cats."]
images = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
question = "Where is Paris?"

future: "Future" = register.embed(
    sentences=sentences, model_id="michaelfeil/bge-small-en-v1.5"
)
future.result()
register.rerank(
    query=question, docs=sentences, model_id="mixedbread-ai/mxbai-rerank-xsmall-v1"
)
register.classify(model_id="philschmid/tiny-bert-sst2-distilled", sentences=sentences)
register.image_embed(model_id="jinaai/jina-clip-v1", images=images)

# manually stop the register upon termination to free model memory.
register.stop()
```

All functions return `Future(vector_embedding, token_usage)` objects, which let you wait for results and remove batching logic from your code.

```python
>>> embedding_fut = register.embed(sentences=sentences, model_id="michaelfeil/bge-small-en-v1.5")
>>> print(embedding_fut)
<Future at 0x7fa0e97e8a60 state=pending>
>>> import time; time.sleep(1); print(embedding_fut)
<Future at 0x7fa0e97e9c30 state=finished returned tuple>
>>> embedding_fut.result()
([array([-3.35943862e-03, ..., -3.22808176e-02], dtype=float32)], 19)
```
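
Rather than sleeping and re-checking, the standard library can block on several Futures at once. A minimal sketch with stdlib Futures standing in for the ones `register` returns (`slow_square` is a hypothetical stand-in for a model call):

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_square(x):
    time.sleep(0.05 * x)  # simulate inference latency
    return x * x


with ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(slow_square, x) for x in (1, 2, 3)]
    # Block until every Future is resolved -- no sleep-and-poll needed.
    done, pending = wait(futures)
    assert not pending
    results = sorted(f.result() for f in done)

print(results)  # [1, 4, 9]
```

`concurrent.futures.as_completed` works the same way when you want to consume results in completion order instead of all at once.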

# License and Contributions
embed is licensed under the MIT License, and all contributions must adhere to it. Contributions are welcome.


<!-- MARKDOWN LINKS & IMAGES -->
<!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
[contributors-shield]: https://img.shields.io/github/contributors/michaelfeil/embed.svg?style=for-the-badge
[contributors-url]: https://github.com/michaelfeil/embed/graphs/contributors
[forks-shield]: https://img.shields.io/github/forks/michaelfeil/embed.svg?style=for-the-badge
[forks-url]: https://github.com/michaelfeil/embed/network/members
[stars-shield]: https://img.shields.io/github/stars/michaelfeil/embed.svg?style=for-the-badge
[stars-url]: https://github.com/michaelfeil/embed/stargazers
[issues-shield]: https://img.shields.io/github/issues/michaelfeil/embed.svg?style=for-the-badge
[issues-url]: https://github.com/michaelfeil/embed/issues
[license-shield]: https://img.shields.io/github/license/michaelfeil/embed.svg?style=for-the-badge
[license-url]: https://github.com/michaelfeil/embed/blob/master/LICENSE.txt
[pepa-shield]: https://static.pepy.tech/badge/embed
[pepa-url]: https://www.pepy.tech/projects/embed
[ci-shield]: https://github.com/michaelfeil/infinity/actions/workflows/ci.yaml/badge.svg
[ci-url]: https://github.com/michaelfeil/infinity/actions

            
