<div align="center">
<p>
<a align="center" href="" target="_blank">
<img
width="850"
src="https://media.roboflow.com/open-source/autodistill/autodistill-banner.png"
>
</a>
</p>
</div>
# Autodistill Gemini Module
This repository contains the code supporting the Gemini base model for use with [Autodistill](https://github.com/autodistill/autodistill).
[Gemini](https://deepmind.google/technologies/gemini/), developed by Google, is a multimodal model that can answer questions about images. You can use Gemini with Autodistill for image classification.
You can combine Gemini with other base models to label regions of an image. For example, you can use Grounding DINO to detect a general object (e.g. a vinyl record), then use Gemini to classify each detected region (e.g. say which of five vinyl records the region shows). Read the Autodistill [Combine Models](https://docs.autodistill.com/utilities/combine-models/) guide for more information.
> [!NOTE]
> Using this project will incur billing charges for API calls to the Gemini API.
> Refer to the [Google Cloud pricing](https://cloud.google.com/pricing/) page for more information and to estimate your expected costs. This package makes one API call per image you want to label.
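
Because each labeled image corresponds to one API call, counting the images in a folder gives a rough estimate of how many billable calls a labeling run would make. The snippet below is a minimal sketch using only the Python standard library. The `count_images` helper and the folder name are hypothetical and not part of `autodistill-gemini`, and the helper only looks at files directly inside the folder.

```python
from pathlib import Path


def count_images(folder: str, extension: str = ".jpeg") -> int:
    """Count files directly inside `folder` whose suffix matches `extension`.

    Hypothetical helper for cost estimation; matching is case-insensitive.
    """
    return sum(
        1
        for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == extension.lower()
    )


if __name__ == "__main__":
    folder = "./context_images"  # assumed folder name, adjust to your data
    if Path(folder).is_dir():
        print(f"Estimated API calls: {count_images(folder)}")
```

Multiply the count by the per-request price that applies to your account to get an approximate cost before you start labeling.
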
Read the full [Autodistill documentation](https://autodistill.github.io/autodistill/).
## Installation
To use Gemini with Autodistill, install the following package:
```bash
pip3 install autodistill-gemini
```
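
To confirm that the package is available in the active environment, you could query the installed version. This is a small sketch that relies on `importlib.metadata` from the standard library (Python 3.8 or later); the `installed_version` helper is a hypothetical name introduced here for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("autodistill-gemini"))
```
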
## Quickstart
```python
from autodistill.detection import CaptionOntology
from autodistill_gemini import Gemini

# define an ontology to map class names to our Gemini prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = Gemini(
    ontology=CaptionOntology(
        {
            "person": "person",
            "a forklift": "forklift"
        }
    ),
    gcp_region="us-central1",
    gcp_project="project-name",  # placeholder; use your own Google Cloud project
)

# run inference on an image
result = base_model.predict("image.jpg")

print(result)

# label a folder of images
base_model.label("./context_images", extension=".jpeg")
```
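
To make the `{caption: class}` format concrete, the following sketch models the ontology as a plain Python dictionary. It illustrates the mapping only and is not how `CaptionOntology` is implemented; the `class_for_caption` helper is hypothetical.

```python
# Plain-dict illustration of the {caption: class} format (not the library's code).
ontology = {
    "person": "person",        # caption sent to the model -> label saved
    "a forklift": "forklift",
}

# Captions are the prompts sent to the base model.
prompts = list(ontology.keys())
# Classes are the labels written to the generated annotations.
classes = list(ontology.values())


def class_for_caption(caption: str) -> str:
    """Look up the label that would be saved for a given caption."""
    return ontology[caption]


print(prompts)                          # ['person', 'a forklift']
print(classes)                          # ['person', 'forklift']
print(class_for_caption("a forklift"))  # forklift
```

Keeping the caption descriptive (for example "a forklift") while the class stays short (for example "forklift") lets you tune the prompt without changing the label names in your dataset.
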
## License
This project is licensed under an [MIT license](LICENSE).
## 🏆 Contributing
We love your input! Please see the core Autodistill [contributing guide](https://github.com/autodistill/autodistill/blob/main/CONTRIBUTING.md) to get started. Thank you 🙏 to all our contributors!
Raw data
{
"_id": null,
"home_page": "https://github.com/autodistill/autodistill-gemini",
"name": "autodistill-gemini",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "",
"author": "Roboflow",
"author_email": "support@roboflow.com",
"download_url": "https://files.pythonhosted.org/packages/7d/19/64178760bd2e19641c76b2fc764ee2e05bbc3baa8cf44d46389d0e7d6259/autodistill-gemini-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Model for use with Autodistill",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/autodistill/autodistill-gemini"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "919dec48d70bd928d0473644bdc9829e9c956452558050ff87914b26d73b0a04",
"md5": "40950db8a856b61dde467ccff1a2fee7",
"sha256": "70b9754dfc8d19749a7f49fd4cfbe00a698902428ab0f1e3a2cc4e60acd743c0"
},
"downloads": -1,
"filename": "autodistill_gemini-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "40950db8a856b61dde467ccff1a2fee7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 4506,
"upload_time": "2023-12-18T11:41:09",
"upload_time_iso_8601": "2023-12-18T11:41:09.255925Z",
"url": "https://files.pythonhosted.org/packages/91/9d/ec48d70bd928d0473644bdc9829e9c956452558050ff87914b26d73b0a04/autodistill_gemini-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7d1964178760bd2e19641c76b2fc764ee2e05bbc3baa8cf44d46389d0e7d6259",
"md5": "9423a464a5acfd8b78ac2d6f2ede234d",
"sha256": "d56c4fce0a01718eccc3bd35e8c243aade2e777cf40ed74ff90d0b92b7823b4b"
},
"downloads": -1,
"filename": "autodistill-gemini-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "9423a464a5acfd8b78ac2d6f2ede234d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 4249,
"upload_time": "2023-12-18T11:41:10",
"upload_time_iso_8601": "2023-12-18T11:41:10.475672Z",
"url": "https://files.pythonhosted.org/packages/7d/19/64178760bd2e19641c76b2fc764ee2e05bbc3baa8cf44d46389d0e7d6259/autodistill-gemini-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-18 11:41:10",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "autodistill",
"github_project": "autodistill-gemini",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "autodistill-gemini"
}