| Field | Value |
| ----- | ----- |
| Name | moondream |
| Version | 0.0.6 |
| home_page | None |
| Summary | Python client library for moondream |
| upload_time | 2024-12-10 01:18:56 |
| maintainer | None |
| docs_url | None |
| author | vik |
| requires_python | <4.0,>=3.10 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Moondream Python Client Library
Official Python client library for Moondream, a tiny vision language model that
can analyze images and answer questions about them. This library supports both
local inference and cloud-based API access.
## Features
- **Local Inference**: Run the model directly on your machine using CPU
- **Cloud API**: Access Moondream's hosted service for faster inference
- **Streaming**: Stream responses token by token for real-time output
- **Multiple Model Sizes**: Choose between 0.5B and 2B parameter models
- **Multiple Tasks**: Caption images, answer questions, detect objects, and locate points
## Installation
Install the package from PyPI:
```bash
pip install moondream==0.0.6
```
## Quick Start
### Using Cloud API
To use Moondream's cloud API, you'll first need an API key. Sign up for a free
account at [console.moondream.ai](https://console.moondream.ai) to get your key.
Once you have your key, you can use it to initialize the client as shown below.
```python
import moondream as md
from PIL import Image
# Initialize with API key
model = md.vl(api_key="your-api-key")
# Load an image
image = Image.open("path/to/image.jpg")
# Generate a caption
caption = model.caption(image)["caption"]
print("Caption:", caption)
# Ask a question
answer = model.query(image, "What's in this image?")["answer"]
print("Answer:", answer)
# Stream the response
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
```
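Rather than hardcoding the key, you may prefer to read it from the environment. A minimal sketch, assuming the key has been exported beforehand (the variable name `MOONDREAM_API_KEY` is just an example, not something the library requires):

```python
import os

import moondream as md

# Read the API key from an environment variable instead of embedding it in source.
# MOONDREAM_API_KEY is an arbitrary name chosen for this example.
model = md.vl(api_key=os.environ["MOONDREAM_API_KEY"])
```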
### Using Local Inference
First, download the model weights. We recommend the int8 weights for most applications:
| Model | Precision | Download Size | Memory Usage | Download Link |
| -------------- | --------- | ------------- | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| Moondream 2B | int8 | 1,733 MiB | 2,624 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz?download=true) |
| Moondream 2B | int4 | 1,167 MiB | 2,002 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz?download=true) |
| Moondream 0.5B | int8 | 593 MiB | 996 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-0_5b-int8.mf.gz?download=true) |
| Moondream 0.5B | int4 | 422 MiB | 816 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-0_5b-int4.mf.gz?download=true) |
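The weights in the table are gzip-compressed `.mf` files, so decompress them after downloading. A minimal sketch, assuming `wget` and `gunzip` are available (the URL is the 2B int8 link from the table above):

```bash
# Download the 2B int8 weights and decompress them.
# gunzip removes the .gz suffix, leaving moondream-2b-int8.mf on disk.
wget -O moondream-2b-int8.mf.gz \
  "https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz?download=true"
gunzip moondream-2b-int8.mf.gz
```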
Then use the model locally:
```python
import moondream as md
from PIL import Image
# Initialize with local model path
model = md.vl(model="path/to/moondream-2b-int8.mf")
# Load and encode image
image = Image.open("path/to/image.jpg")
# Since encoding an image is computationally expensive, you can encode it once
# and reuse the encoded version for multiple queries/captions/etc. This avoids
# having to re-encode the same image multiple times.
encoded_image = model.encode_image(image)
# Generate caption
caption = model.caption(encoded_image)["caption"]
print("Caption:", caption)
# Ask questions
answer = model.query(encoded_image, "What's in this image?")["answer"]
print("Answer:", answer)
```
## API Reference
### Constructor
```python
model = md.vl(
    model="path/to/model.bin",  # For local inference
    api_key="your-api-key"      # For cloud API access
)
```
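In practice you would typically pass only one of the two arguments, depending on where inference should run. A small sketch of both modes (the path and key are placeholders):

```python
# Local inference: point at the downloaded, decompressed weights
local_model = md.vl(model="path/to/moondream-2b-int8.mf")

# Cloud API: authenticate with your API key
cloud_model = md.vl(api_key="your-api-key")
```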
### Methods
#### caption(image, length="normal", stream=False, settings=None)
Generate a caption for an image.
```python
result = model.caption(image)
# or with streaming
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="")
```
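The `length` parameter in the signature above controls caption verbosity. Its accepted values are not listed in this README; the sketch below assumes `"short"` is available alongside the default `"normal"`:

```python
# Assumed usage of the length parameter; "short" is not documented above,
# so verify the accepted values for your installed version.
short_caption = model.caption(image, length="short")["caption"]
print(short_caption)
```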
#### query(image, question, stream=False, settings=None)
Ask a question about an image.
```python
result = model.query(image, "What's in this image?")
# or with streaming
for chunk in model.query(image, "What's in this image?", stream=True)["answer"]:
    print(chunk, end="")
```
#### detect(image, object)
Detect and locate specific objects in an image.
```python
result = model.detect(image, "car")
```
#### point(image, object)
Get coordinates of specific objects in an image.
```python
result = model.point(image, "person")
```
### Input Types
- Images can be provided in either of the following forms (see the sketch after this list):
- PIL.Image.Image objects
- Encoded image objects (from model.encode_image())
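As an illustration, either form works with `query` (a minimal sketch reusing the local-inference model from the Quick Start; `photo.jpg` is a placeholder path):

```python
from PIL import Image

pil_image = Image.open("photo.jpg")        # plain PIL image
encoded = model.encode_image(pil_image)    # encoded once, reusable across calls

# Both calls ask the same question; the encoded form avoids re-encoding the image.
answer_a = model.query(pil_image, "What's in this image?")["answer"]
answer_b = model.query(encoded, "What's in this image?")["answer"]
```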
### Response Types
All methods return typed dictionaries (a consumption sketch follows this list):
- CaptionOutput: `{"caption": str | Generator}`
- QueryOutput: `{"answer": str | Generator}`
- DetectOutput: `{"objects": List[Region]}`
- PointOutput: `{"points": List[Point]}`
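The `Region` and `Point` fields are not spelled out in this README, so the sketch below just prints the raw dictionaries; the commented field names (`x_min`/`y_min`/`x_max`/`y_max` and `x`/`y`) are assumptions to verify against your output:

```python
# Inspect detection and pointing results without assuming the exact schema.
detections = model.detect(image, "car")["objects"]
for region in detections:
    print(region)  # likely a dict of bounding-box fields, e.g. x_min, y_min, x_max, y_max

points = model.point(image, "person")["points"]
for point in points:
    print(point)   # likely a dict of coordinates, e.g. x, y
```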
## Performance Notes
- Local inference currently only supports CPU execution
- CUDA (GPU) and MPS (Apple Silicon) support coming soon
- For optimal performance with GPU/MPS, use the PyTorch implementation for now
## Links
- [Website](https://moondream.ai/)
- [Demo](https://moondream.ai/playground)
Raw data
{
"_id": null,
"home_page": null,
"name": "moondream",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.10",
"maintainer_email": null,
"keywords": null,
"author": "vik",
"author_email": "vik@moondream.ai",
"download_url": "https://files.pythonhosted.org/packages/d1/b6/b797b9b31cc83d68dce7b1d6604aec6785cc6bcc34846e474c4fa591c09b/moondream-0.0.6.tar.gz",
"platform": null,
"description": "# Moondream Python Client Library\n\nOfficial Python client library for Moondream, a tiny vision language model that\ncan analyze images and answer questions about them. This library supports both\nlocal inference and cloud-based API access.\n\n## Features\n\n- **Local Inference**: Run the model directly on your machine using CPU\n- **Cloud API**: Access Moondream's hosted service for faster inference\n- **Streaming**: Stream responses token by token for real-time output\n- **Multiple Model Sizes**: Choose between 0.5B and 2B parameter models\n- **Multiple Tasks**: Caption images, answer questions, detect objects, and locate points\n\n## Installation\n\nInstall the package from PyPI:\n\n```bash\npip install moondream==0.0.5\n```\n\n## Quick Start\n\n### Using Cloud API\n\nTo use Moondream's cloud API, you'll first need an API key. Sign up for a free\naccount at [console.moondream.ai](https://console.moondream.ai) to get your key.\nOnce you have your key, you can use it to initialize the client as shown below.\n\n```python\nimport moondream as md\nfrom PIL import Image\n\n# Initialize with API key\nmodel = md.vl(api_key=\"your-api-key\")\n\n# Load an image\nimage = Image.open(\"path/to/image.jpg\")\n\n# Generate a caption\ncaption = model.caption(image)[\"caption\"]\nprint(\"Caption:\", caption)\n\n# Ask a question\nanswer = model.query(image, \"What's in this image?\")[\"answer\"]\nprint(\"Answer:\", answer)\n\n# Stream the response\nfor chunk in model.caption(image, stream=True)[\"caption\"]:\n print(chunk, end=\"\", flush=True)\n```\n\n### Using Local Inference\n\nFirst, download the model weights. We recommend the int8 weights for most applications:\n\n| Model | Precision | Download Size | Memory Usage | Download Link |\n| -------------- | --------- | ------------- | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- |\n| Moondream 2B | int8 | 1,733 MiB | 2,624 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz?download=true) |\n| Moondream 2B | int4 | 1,167 MiB | 2,002 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz?download=true) |\n| Moondream 0.5B | int8 | 593 MiB | 996 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-0_5b-int8.mf.gz?download=true) |\n| Moondream 0.5B | int4 | 422 MiB | 816 MiB | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-0_5b-int4.mf.gz?download=true) |\n\nThen use the model locally:\n\n```python\nimport moondream as md\nfrom PIL import Image\n\n# Initialize with local model path\nmodel = md.vl(model=\"path/to/moondream-2b-int8.bin\")\n\n# Load and encode image\nimage = Image.open(\"path/to/image.jpg\")\n\n# Since encoding an image is computationally expensive, you can encode it once\n# and reuse the encoded version for multiple queries/captions/etc. 
This avoids\n# having to re-encode the same image multiple times.\nencoded_image = model.encode_image(image)\n\n# Generate caption\ncaption = model.caption(encoded_image)[\"caption\"]\nprint(\"Caption:\", caption)\n\n# Ask questions\nanswer = model.query(encoded_image, \"What's in this image?\")[\"answer\"]\nprint(\"Answer:\", answer)\n```\n\n## API Reference\n\n### Constructor\n\n```python\nmodel = md.vl(\n model=\"path/to/model.bin\", # For local inference\n api_key=\"your-api-key\" # For cloud API access\n)\n```\n\n### Methods\n\n#### caption(image, length=\"normal\", stream=False, settings=None)\n\nGenerate a caption for an image.\n\n```python\nresult = model.caption(image)\n# or with streaming\nfor chunk in model.caption(image, stream=True)[\"caption\"]:\n print(chunk, end=\"\")\n```\n\n#### query(image, question, stream=False, settings=None)\n\nAsk a question about an image.\n\n```python\nresult = model.query(image, \"What's in this image?\")\n# or with streaming\nfor chunk in model.query(image, \"What's in this image?\", stream=True)[\"answer\"]:\n print(chunk, end=\"\")\n```\n\n#### detect(image, object)\n\nDetect and locate specific objects in an image.\n\n```python\nresult = model.detect(image, \"car\")\n```\n\n#### point(image, object)\n\nGet coordinates of specific objects in an image.\n\n```python\nresult = model.point(image, \"person\")\n```\n\n### Input Types\n\n- Images can be provided as:\n - PIL.Image.Image objects\n - Encoded image objects (from model.encode_image())\n\n### Response Types\n\nAll methods return typed dictionaries:\n\n- CaptionOutput: `{\"caption\": str | Generator}`\n- QueryOutput: `{\"answer\": str | Generator}`\n- DetectOutput: `{\"objects\": List[Region]}`\n- PointOutput: `{\"points\": List[Point]}`\n\n## Performance Notes\n\n- Local inference currently only supports CPU execution\n- CUDA (GPU) and MPS (Apple Silicon) support coming soon\n- For optimal performance with GPU/MPS, use the PyTorch implementation for now\n\n## Links\n\n- [Website](https://moondream.ai/)\n- [Demo](https://moondream.ai/playground)\n",
"bugtrack_url": null,
"license": null,
"summary": "Python client library for moondream",
"version": "0.0.6",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "dc8a7024d97b886345190f12c15722274a38d493de3866417e5ba2d85e59d8fc",
"md5": "f6466d47b8275652f2446c8981adbe82",
"sha256": "0b0dfa11c51d63f89d7c317307117930b42353a66f7827783e697a41e81501b7"
},
"downloads": -1,
"filename": "moondream-0.0.6-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f6466d47b8275652f2446c8981adbe82",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.10",
"size": 16461,
"upload_time": "2024-12-10T01:18:55",
"upload_time_iso_8601": "2024-12-10T01:18:55.182513Z",
"url": "https://files.pythonhosted.org/packages/dc/8a/7024d97b886345190f12c15722274a38d493de3866417e5ba2d85e59d8fc/moondream-0.0.6-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d1b6b797b9b31cc83d68dce7b1d6604aec6785cc6bcc34846e474c4fa591c09b",
"md5": "20eb0d7412ff29d1f656c43fc45fdf69",
"sha256": "b923767530af9969330d10ac4d078d205668b78482566d09a389722988ea68fe"
},
"downloads": -1,
"filename": "moondream-0.0.6.tar.gz",
"has_sig": false,
"md5_digest": "20eb0d7412ff29d1f656c43fc45fdf69",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.10",
"size": 14764,
"upload_time": "2024-12-10T01:18:56",
"upload_time_iso_8601": "2024-12-10T01:18:56.450607Z",
"url": "https://files.pythonhosted.org/packages/d1/b6/b797b9b31cc83d68dce7b1d6604aec6785cc6bcc34846e474c4fa591c09b/moondream-0.0.6.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-10 01:18:56",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "moondream"
}