CLIP-API-service


Name: CLIP-API-service
Version: 0.1.2
Summary: Build AI applications with any CLIP models - embed image and sentences, object recognition, visual reasoning, image classification and reverse image search
Upload time: 2023-10-11 19:29:03
Requires Python: >=3.8
License: Apache-2.0
Keywords: clip bentoml model-inference image-search object-detection visual-reasoning image-classification transformers artificial-intelligence machine-learning
            <div align="center">
    <h1 align="center">CLIP API Service</h1>
    <br>
    <strong>Discover the effortless integration of OpenAI's innovative CLIP model with our streamlined API service. <br></strong>
    <i>Powered by BentoML 🍱</i>
    <br>
</div>
<br>

## 📖 Introduction 📖
[CLIP](https://openai.com/research/clip), or Contrastive Language-Image Pretraining, is an OpenAI model trained to map text and images into a shared embedding space, so that an image and a caption describing it end up close together.

This library provides you with an instant, easy-to-use interface for CLIP, allowing you to harness its capabilities without any setup hassles. BentoML takes care of all the complexity of serving the model!

## 🔧 Installation 🔧
Ensure that you have Python 3.8 or newer and `pip` installed on your system. We highly recommend using a virtual environment to avoid package conflicts.

To install the service, enter the following command:
```bash
pip install clip-api-service
```

Once the installation process is complete, you can start the service by running:
```bash
clip-api-service serve --model-name=ViT-B-32:openai
```
Your service is now running! Interact with it via the Swagger UI at `http://localhost:3000`.
![SwaggerUI](images/swagger-ui.png)

## 🎯 Use cases 🎯
Harness the capabilities of the CLIP API service across a range of applications:

### Encode
1. Text and Image Embedding
    - Use `encode` to transform text or images into meaningful embeddings. This makes it possible to perform tasks such as:
        1. **Neural Search**: Utilize encoded embeddings to power a search engine capable of understanding and indexing images based on their textual descriptions, and vice versa.
        2. **Custom Ranking**: Design a ranking system based on embeddings, providing unique ways to sort and categorize data according to your context.
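
To make the neural search idea concrete, here is a minimal sketch that ranks stored embeddings against a query embedding by cosine similarity. The helper names (`cosine_similarity`, `search`) and the toy 3-dimensional vectors are assumptions for illustration only; in practice the vectors would come from the `encode` endpoint.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_embedding, index, top_k=3):
    """Return the top_k (item_id, score) pairs, best match first."""
    scored = [
        (item_id, cosine_similarity(query_embedding, embedding))
        for item_id, embedding in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real CLIP embeddings (hypothetical data).
index = {
    "dog.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # e.g. the embedding of the text "a dog"

results = search(query, index, top_k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

For a large collection, a vector database or approximate nearest-neighbor index would typically replace the linear scan shown here.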

### Rank
2. Zero-Shot Image Classification
    - Use `rank` to perform image classification without any training. For example:
        1. Given a set of images, classify an image as being "a picture of a dog" or "a picture of a cat".
        2. More complex classifications such as recognizing different breeds of dogs can also be performed, illustrating the versatility of the CLIP API service.

3. Visual Reasoning
    - The `rank` function can also be used to provide reasoning about visual scenarios. For instance:

| Visual Scenario | Query Image | Candidates | Output |
|-----------------|-------|---------------|--------|
| Counting Objects | ![Three Dog](images/three-dog.jpg) | This is a picture of 1 dog<br>This is a picture of 2 dogs<br>This is a picture of 3 dogs | Image matched with "3 dogs" |
| Identifying Colors | ![Blue Car](images/bluecar.jpeg)  | The car is red<br>The car is blue<br>The car is green | Image matched with "blue car" |
| Understanding Motion | ![Parked Car](images/parkedcar.jpeg)  | The car is parked<br>The car is moving<br>The car is turning| Image matched with "parked car" |
| Recognizing Location | ![Suburb Car](images/car-suburb.jpeg)  | The car is in the suburb<br>The car is on the highway<br>The car is in the street| Image matched with "car in the street" |
| Relative Positioning | ![Big Small car](images/big-small-car.jpg) | The big car is on the left, the small car is on the right<br>The small car is on the left, the big car is on the right| Image matched with the provided description |
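
As a sketch of how a scenario from the table could be turned into a request, the snippet below builds a `rank` payload from one query image and a list of candidate statements, then picks the statement with the highest probability. The field names (`queries`, `candidates`, `img_uri`, `text`, `probabilities`) follow the API reference in this README; the helper names, the image URL, and the response values are hypothetical and not real model output.

```python
def build_rank_payload(image_uri, statements):
    """Build a /rank request body: one image query, text candidates."""
    return {
        "queries": [{"img_uri": image_uri}],
        "candidates": [{"text": statement} for statement in statements],
    }


def best_match(statements, response, query_index=0):
    """Return (statement, probability) for the most likely candidate."""
    probabilities = response["probabilities"][query_index]
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return statements[best], probabilities[best]


statements = [
    "This is a picture of 1 dog",
    "This is a picture of 2 dogs",
    "This is a picture of 3 dogs",
]
# Hypothetical image location, used only to show the payload shape.
payload = build_rank_payload("https://example.com/three-dog.jpg", statements)

# Hand-written response in the documented shape (illustrative numbers only).
example_response = {"probabilities": [[0.03, 0.07, 0.90]]}
label, probability = best_match(statements, example_response)
print(label, probability)
```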
 

## 🚀 Deploying to Production 🚀
Move your project to production with [BentoCloud](https://www.bentoml.com/bento-cloud/), a platform for managing and deploying machine learning models.

Start by creating a BentoCloud account. Once you've signed up, log in to your BentoCloud account using the command:

```bash
bentoml cloud login --api-token <your-api-token> --endpoint <bento-cloud-endpoint>
```
> Note: Replace `<your-api-token>` and `<bento-cloud-endpoint>` with your specific API token and the BentoCloud endpoint respectively.

Next, build your BentoML service using the `build` command:

```bash
clip-api-service build --model-name=ViT-B-32:openai
```

Then, push your freshly-built Bento service to BentoCloud using the `push` command:

```bash
bentoml push <name:version>
```

Lastly, deploy this application to BentoCloud with a single `bentoml deployment create` command following the [deployment instructions](https://docs.bentoml.org/en/latest/reference/cli.html#bentoml-deployment-create).

BentoML offers a number of options for deploying and hosting online ML services in production. Learn more at [Deploying a Bento](https://docs.bentoml.org/en/latest/concepts/deploy.html).

## 📚 Reference 📚
### API reference
#### `/encode`
Accepts a list of items, each of which is one of:
* `img_uri` : An image URI, e.g. `https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg`
* `text` : A string
* `img_blob` : Base64 encoded string

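Returns an embedding vector of length 768 with the default `openai/clip-vit-large-patch14` model. Other CLIP variants may use a different embedding size (ViT-B/32, for example, produces 512-dimensional embeddings).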

**Example:**
```bash
curl -X 'POST' \
  'http://localhost:3000/encode' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '[
  {
    "img_uri": "https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg"
  },
  {
    "text": "picture of a dog"
  },
  {
    "img_blob": "iVBORw0KGgoAAAANSUhEUgAAABIAAAAPCAYAAADphp8SAAAABHNCSVQICAgIfAhkiAAAABl0RVh0U29mdHdhcmUAZ25vbWUtc2NyZWVuc2hvdO8Dvz4AAAApdEVYdENyZWF0aW9uIFRpbWUARnJpIDI2IE1heSAyMDIzIDA0OjE2OjIxIEFNCXIaIQAAAnhJREFUOI1Fk82OY0UMhT+7qu6dJPfmZ3p6EhYR3QSGnk0P6QV/4rlYsOIBeKrhIQAJtZCgNQsmEWiQSG5V2SwSeixZtiydYx9blpsJPpv2fPP1V3zx5ee8vHnJbDFlNl/Q9GMQBwLUAM7JcXADN9wN3Im1wOaja7afbfnk4xcsVytS25DaFsyptSDiaIyAnMn8kVTOMc6mI+62d2w2G5bLFZOuBxUkJsyNYoIGRVXfTyMCLmByIseJm+trbl58ynL5nK7v0dTgOGgENyQqsWlxdxxH4FEW7og7mBHvtq+4vHxG381IqQF3qjvBnOrgKBIiOR/fA89gMUPdoRpxe/uKbjym6zton4AZ6oZoRNxQUQiJOhzO3R1xeyTiXItXV1csLi4gRjADOIHdEEBEoGSCCOaOmUGtKI4CQQMIxL7vmIwmcBygAdoWAKuV6o5GwUwQB/HzjkTOufC/xZRaSi6IOjHFU1UVrRkrmTwcERViiohDQFANiAmYYbUgBnE8miAiDDlT/j0gQ0aj4qoklZMkHBsKIShBFJXTJLVWai64GbGUQnrSEori7ljO5Gy4gMZIahMpRA7DABZAwXGsVnIulOGIuxFfv/6RUdtwcfmM2XTKfDEnNYlcCjkP1BJIqYUChuNU3J1aM+Us382Jv97f8/d+x3w+YzbtWa5WLD94ztOnCyZdR0QRM8TkdACr1FqoJeNuuFUAwg/fffu9m7Hf73j4/Q9+/uUn3jw8YG6MRyNijFgtJG0wM8owMBwPlJKppZ6+RiA2TWK9XvPhes27f96x3+04Hg/s/vyLe/2N29ueyWxCzQe8Gvu3b0GUXCu7/Y5ijgblP3zyX4rqQyp1AAAAAElFTkSuQmCC"
  }
]'
```
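
The same request can be sent from Python. The sketch below uses only the standard library; the item keys (`text`, `img_uri`, `img_blob`) come from the list above, while the helper names `encode_items` and `post_json` are hypothetical. The response is returned as parsed JSON without further interpretation, since its exact layout is not specified here.

```python
import base64
import json
import urllib.request


def encode_items(texts=(), image_uris=(), image_bytes=()):
    """Build the list of items expected in the /encode request body."""
    items = [{"text": text} for text in texts]
    items += [{"img_uri": uri} for uri in image_uris]
    # Raw image bytes are sent as a Base64-encoded string.
    items += [
        {"img_blob": base64.b64encode(blob).decode("ascii")}
        for blob in image_bytes
    ]
    return items


def post_json(url, payload, timeout=30):
    """POST a JSON payload and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# b"\x89PNG" is only the first bytes of a PNG file, used as a stand-in.
payload = encode_items(texts=["picture of a dog"], image_bytes=[b"\x89PNG"])
print(json.dumps(payload, indent=2))

# Requires a running service, so it is left commented out:
# embeddings = post_json("http://localhost:3000/encode", payload)
```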

#### `/rank`
Accepts a list of `queries` and a list of `candidates`. As with `/encode`, each query and each candidate is one of:
* `img_uri` : An image URI, e.g. `https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg`
* `text` : A string
* `img_blob` : Base64 encoded string

Returns a list of probabilities and cosine similarities of each candidate with respect to each query.

**Example:**
```bash
curl -X 'POST' \
  'http://localhost:3000/rank' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "queries": [
    {
      "img_uri": "https://hips.hearstapps.com/hmg-prod/images/dog-puppy-on-garden-royalty-free-image-1586966191.jpg"
    }
  ],
  "candidates": [
    {
      "text": "picture of a dog"
    },
    {
      "text": "picture of a cat"
    },
    {
      "text": "picture of a bird"
    },
    {
      "text": "picture of a car"
    },
    {
      "text": "picture of a plane"
    },
    {
      "text": "picture of a boat"
    }
  ]
}'
```
And the response looks like:
```json
{
  "probabilities": [
    [
      0.9958375692367554,
      0.0022114247549325228,
      0.001514736912213266,
      0.00011969256593147293,
      0.00019143625104334205,
      0.0001251235808013007
    ]
  ],
  "cosine_similarities": [
    [
      0.2297772467136383,
      0.16867777705192566,
      0.16489382088184357,
      0.13951312005519867,
      0.14420939981937408,
      0.13995687663555145
    ]
  ]
}
```
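
The two arrays in this response appear to be related: the probabilities match a softmax over the cosine similarities multiplied by 100, which is the logit scale commonly used by CLIP models. This is an inference from the numbers above, not a documented guarantee of the service, and the scale value is an assumption. A short check:

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


# Values copied from the sample response above.
cosine_similarities = [
    0.2297772467136383,
    0.16867777705192566,
    0.16489382088184357,
    0.13951312005519867,
    0.14420939981937408,
    0.13995687663555145,
]
reported = [
    0.9958375692367554,
    0.0022114247549325228,
    0.001514736912213266,
    0.00011969256593147293,
    0.00019143625104334205,
    0.0001251235808013007,
]

LOGIT_SCALE = 100.0  # assumption: the usual CLIP logit scale

recomputed = softmax([LOGIT_SCALE * c for c in cosine_similarities])
max_error = max(abs(r - p) for r, p in zip(recomputed, reported))
print(f"largest difference from reported probabilities: {max_error:.2e}")
```

This also explains why small gaps in cosine similarity (0.23 versus 0.17) turn into a very confident top probability.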

### CLI reference
#### `serve`
Spins up an HTTP server with the model of your choice.

Arguments:
* `--model-name` : Name of the CLIP model. Use `list_models` to see the list of available models. Default: `openai/clip-vit-large-patch14`

#### `build`
Builds a Bento with the model of your choice.

Arguments:
* `--model-name` : Name of the CLIP model. Use `list_models` to see the list of available models. Default: `openai/clip-vit-large-patch14`

#### `list_models`
Lists all available CLIP models.

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "CLIP-API-service",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "CLIP BentoML Model-Inference image-search object-detection visual-reasoning image-classification transformers artificial-intelligence machine-learning",
    "author": "",
    "author_email": "BentoML Authors <contact@bentoml.com>",
    "download_url": "https://files.pythonhosted.org/packages/98/c4/eb797d83bfe08114eda5848db9622a8727aa47bacc1b2c2f01afd658e7b2/clip_api_service-0.1.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Build AI applications with any CLIP models - embed image and sentences, object recognition, visual reasoning, image classification and reverse image search",
    "version": "0.1.2",
    "project_urls": {
        "Bug tracker": "https://github.com/bentoml/CLIP-API-service/issues",
        "Homepage": "https://github.com/bentoml/CLIP-API-service"
    },
    "split_keywords": [
        "clip",
        "bentoml",
        "model-inference",
        "image-search",
        "object-detection",
        "visual-reasoning",
        "image-classification",
        "transformers",
        "artificial-intelligence",
        "machine-learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fe1bcebc425af8d7096892dea48a8983738c578a46299c054f67d5d7fc2d28b0",
                "md5": "59e2b97a6f33f94c62b7c364d6c9a0f5",
                "sha256": "5ca7e29a16586c1557523200586e1d927dabf0a1232009eb344b9e3ee10785dc"
            },
            "downloads": -1,
            "filename": "clip_api_service-0.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "59e2b97a6f33f94c62b7c364d6c9a0f5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 21723,
            "upload_time": "2023-10-11T19:29:01",
            "upload_time_iso_8601": "2023-10-11T19:29:01.021136Z",
            "url": "https://files.pythonhosted.org/packages/fe/1b/cebc425af8d7096892dea48a8983738c578a46299c054f67d5d7fc2d28b0/clip_api_service-0.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "98c4eb797d83bfe08114eda5848db9622a8727aa47bacc1b2c2f01afd658e7b2",
                "md5": "f2fdd066bd5beb343c7a231e3fe0b1bd",
                "sha256": "0341fb2a1a2990a57160bac74ed34dbd8e73a22b388892490ca5cf6be7d11181"
            },
            "downloads": -1,
            "filename": "clip_api_service-0.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "f2fdd066bd5beb343c7a231e3fe0b1bd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 20426,
            "upload_time": "2023-10-11T19:29:03",
            "upload_time_iso_8601": "2023-10-11T19:29:03.682504Z",
            "url": "https://files.pythonhosted.org/packages/98/c4/eb797d83bfe08114eda5848db9622a8727aa47bacc1b2c2f01afd658e7b2/clip_api_service-0.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-10-11 19:29:03",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "bentoml",
    "github_project": "CLIP-API-service",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "clip-api-service"
}
        