moondream

Name: moondream
Version: 0.0.6
Summary: Python client library for moondream
Author: vik
Requires-Python: <4.0,>=3.10
Upload time: 2024-12-10 01:18:56
# Moondream Python Client Library

Official Python client library for Moondream, a tiny vision language model that
can analyze images and answer questions about them. This library supports both
local inference and cloud-based API access.

## Features

- **Local Inference**: Run the model directly on your machine using CPU
- **Cloud API**: Access Moondream's hosted service for faster inference
- **Streaming**: Stream responses token by token for real-time output
- **Multiple Model Sizes**: Choose between 0.5B and 2B parameter models
- **Multiple Tasks**: Caption images, answer questions, detect objects, and locate points

## Installation

Install the package from PyPI:

```bash
pip install moondream==0.0.6
```

## Quick Start

### Using Cloud API

To use Moondream's cloud API, you'll first need an API key. Sign up for a free
account at [console.moondream.ai](https://console.moondream.ai) to get your key.
Once you have your key, you can use it to initialize the client as shown below.

```python
import moondream as md
from PIL import Image

# Initialize with API key
model = md.vl(api_key="your-api-key")

# Load an image
image = Image.open("path/to/image.jpg")

# Generate a caption
caption = model.caption(image)["caption"]
print("Caption:", caption)

# Ask a question
answer = model.query(image, "What's in this image?")["answer"]
print("Answer:", answer)

# Stream the response
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="", flush=True)
```

### Using Local Inference

First, download the model weights. We recommend the int8 weights for most applications:

| Model          | Precision | Download Size | Memory Usage | Download Link                                                                                                                                   |
| -------------- | --------- | ------------- | ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| Moondream 2B   | int8      | 1,733 MiB     | 2,624 MiB    | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int8.mf.gz?download=true)   |
| Moondream 2B   | int4      | 1,167 MiB     | 2,002 MiB    | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-2b-int4.mf.gz?download=true)   |
| Moondream 0.5B | int8      | 593 MiB       | 996 MiB      | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-0_5b-int8.mf.gz?download=true) |
| Moondream 0.5B | int4      | 422 MiB       | 816 MiB      | [Download](https://huggingface.co/vikhyatk/moondream2/resolve/9dddae84d54db4ac56fe37817aeaeb502ed083e2/moondream-0_5b-int4.mf.gz?download=true) |

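The linked files are gzip-compressed. A minimal sketch of unpacking one with the Python standard library (the file names are placeholders for whatever you downloaded):

```python
import gzip
import shutil

def unpack_weights(src: str, dst: str) -> str:
    """Decompress a downloaded .mf.gz weights file so the client can load it."""
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst

# e.g. unpack_weights("moondream-2b-int8.mf.gz", "moondream-2b-int8.mf")
```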
Then decompress the downloaded `.mf.gz` file and use the model locally:

```python
import moondream as md
from PIL import Image

# Initialize with local model path
model = md.vl(model="path/to/moondream-2b-int8.mf")

# Load and encode image
image = Image.open("path/to/image.jpg")

# Encoding an image is computationally expensive; encode it once and reuse
# the encoded image across multiple captions, queries, etc.
encoded_image = model.encode_image(image)

# Generate caption
caption = model.caption(encoded_image)["caption"]
print("Caption:", caption)

# Ask questions
answer = model.query(encoded_image, "What's in this image?")["answer"]
print("Answer:", answer)
```

## API Reference

### Constructor

```python
model = md.vl(
    model="path/to/model.bin",  # For local inference
    api_key="your-api-key"      # For cloud API access
)
```
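A common pattern is to choose the backend at runtime, for example from environment variables. A sketch (the variable names are invented for illustration, and the module is passed in explicitly just to keep the helper self-contained):

```python
import os

def make_model(md_module):
    """Build a cloud-backed client when an API key is present in the
    environment, otherwise fall back to a local weights file."""
    api_key = os.environ.get("MOONDREAM_API_KEY")  # hypothetical variable name
    if api_key:
        return md_module.vl(api_key=api_key)
    return md_module.vl(model=os.environ["MOONDREAM_MODEL_PATH"])

# model = make_model(md)
```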

### Methods

#### caption(image, length="normal", stream=False, settings=None)

Generate a caption for an image.

```python
result = model.caption(image)
# or with streaming
for chunk in model.caption(image, stream=True)["caption"]:
    print(chunk, end="")
```

#### query(image, question, stream=False, settings=None)

Ask a question about an image.

```python
result = model.query(image, "What's in this image?")
# or with streaming
for chunk in model.query(image, "What's in this image?", stream=True)["answer"]:
    print(chunk, end="")
```

#### detect(image, object)

Detect and locate specific objects in an image.

```python
result = model.detect(image, "car")
```
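Assuming each returned region is a dict of normalized coordinates (`x_min`/`y_min`/`x_max`/`y_max` in the 0–1 range; check the output for your version), converting one to pixel coordinates for cropping or drawing might look like:

```python
def region_to_pixels(region: dict, width: int, height: int) -> tuple:
    """Convert a normalized region dict (assumed keys x_min, y_min, x_max,
    y_max in [0, 1]) to integer pixel coordinates for a width x height image."""
    return (
        round(region["x_min"] * width),
        round(region["y_min"] * height),
        round(region["x_max"] * width),
        round(region["y_max"] * height),
    )

# for region in result["objects"]:
#     box = region_to_pixels(region, image.width, image.height)
```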

#### point(image, object)

Get coordinates of specific objects in an image.

```python
result = model.point(image, "person")
```

### Input Types

- Images can be provided as:
  - `PIL.Image.Image` objects
  - Encoded image objects (from `model.encode_image()`)

### Response Types

All methods return typed dictionaries:

- `CaptionOutput`: `{"caption": str | Generator}`
- `QueryOutput`: `{"answer": str | Generator}`
- `DetectOutput`: `{"objects": List[Region]}`
- `PointOutput`: `{"points": List[Point]}`
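Because the caption/answer field is a plain string for normal calls and a generator of chunks when `stream=True`, a small helper can normalize both shapes (a convenience sketch, not part of the library):

```python
from typing import Generator, Union

def as_text(value: Union[str, Generator]) -> str:
    """Collapse a caption/answer field to a plain string, whether or not
    the call was made with stream=True."""
    if isinstance(value, str):
        return value
    return "".join(value)

# caption_text = as_text(model.caption(image, stream=True)["caption"])
```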

## Performance Notes

- Local inference currently only supports CPU execution
- CUDA (GPU) and MPS (Apple Silicon) support coming soon
- For optimal performance with GPU/MPS, use the PyTorch implementation for now

## Links

- [Website](https://moondream.ai/)
- [Demo](https://moondream.ai/playground)

            
