keye-vl-utils

Name	keye-vl-utils
Version	1.5.2
home_page	None
Summary	Keye Vision Language Model Utils - PyTorch
upload_time	2025-08-27 17:32:29
maintainer	None
docs_url	None
author	Kwai Keye Team
requires_python	>=3.8
license	Apache-2.0
keywords	keye-vl large language model pytorch vision language model
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# keye-vl-utils

Keye-VL Utils is a set of helper functions for processing visual inputs (images and videos) and passing them to Keye-VL 1.5.

## Install

```bash
pip install keye-vl-utils==1.5.2
```
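
To confirm which version actually ended up in the environment, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; the helper name `installed_version` is illustrative and is not part of keye-vl-utils.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the installed version string, or None if the package is missing.
    print(installed_version("keye-vl-utils"))
```

Returning `None` instead of raising keeps the check usable in setup scripts that want to decide whether to install or upgrade.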

## Usage

### KeyeVL

```python
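from PIL import Image
from transformers import AutoModel, AutoProcessor
from keye_vl_utils import process_vision_info

# Used by the PIL.Image.Image example below; replace the path with a real image.
pil_image = Image.open("/path/to/your/image.jpg")

# An image or video can be placed anywhere in the message content. Images may be
# given as a local file path, a URL, a base64-encoded string, or a PIL.Image.Image.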
messages = [
    # Image
    ## Local file path
    [{"role": "user", "content": [{"type": "image", "image": "file:///path/to/your/image.jpg"}, {"type": "text", "text": "Describe this image."}]}],
    ## Image URL
    [{"role": "user", "content": [{"type": "image", "image": "http://path/to/your/image.jpg"}, {"type": "text", "text": "Describe this image."}]}],
    ## Base64 encoded image
    [{"role": "user", "content": [{"type": "image", "image": "data:image;base64,/9j/..."}, {"type": "text", "text": "Describe this image."}]}],
    ## PIL.Image.Image
    [{"role": "user", "content": [{"type": "image", "image": pil_image}, {"type": "text", "text": "Describe this image."}]}],
    # Video
    ## Local video path
    [{"role": "user", "content": [{"type": "video", "video": "file:///path/to/video1.mp4"}, {"type": "text", "text": "Describe this video."}]}],
    ## Local video frames
    [{"role": "user", "content": [{"type": "video", "video": ["file:///path/to/extracted_frame1.jpg", "file:///path/to/extracted_frame2.jpg", "file:///path/to/extracted_frame3.jpg"],}, {"type": "text", "text": "Describe this video."},],}],
    ## The model dynamically adjusts the number of video frames and the video height and width. Specify args if required.
    [{"role": "user", "content": [{"type": "video", "video": "file:///path/to/video1.mp4"}, {"type": "text", "text": "Describe this video."}]}],
]

processor = AutoProcessor.from_pretrained("Kwai-Keye/Keye-VL-1.5-8B", trust_remote_code=True)
model = AutoModel.from_pretrained("Kwai-Keye/Keye-VL-1.5-8B", torch_dtype="auto", device_map="auto", trust_remote_code=True)
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# process_vision_info returns the images, the videos, and a dict of extra processor kwargs.
images, videos, processor_args = process_vision_info(messages)
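# padding=True because the batch holds several prompts that may differ in length.
inputs = processor(text=text, images=images, videos=videos, padding=True, return_tensors="pt", **processor_args)
inputs = inputs.to(model.device)  # move the tensors to the model's device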
print(inputs)
generated_ids = model.generate(**inputs)
print(generated_ids)
```
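
The `messages` list above passes images as `file://` URIs and as `data:image;base64,...` strings. The sketch below shows one way to build those values and the surrounding message structure with the standard library only. The helper names `to_file_uri`, `to_data_uri`, and `image_message` are hypothetical and are not provided by `keye_vl_utils`.

```python
import base64
from pathlib import Path
from typing import Any, Dict, List


def to_file_uri(path: str) -> str:
    """Turn a local path into a file:// URI (the file does not have to exist)."""
    return Path(path).resolve().as_uri()


def to_data_uri(image_bytes: bytes) -> str:
    """Encode raw image bytes with the 'data:image;base64,' prefix used above."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return "data:image;base64," + encoded


def image_message(image: Any, prompt: str) -> List[Dict[str, Any]]:
    """Build a single-turn conversation: one image followed by one text prompt."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": prompt},
            ],
        }
    ]


if __name__ == "__main__":
    jpeg_magic = b"\xff\xd8\xff"  # first three bytes of a JPEG file, for illustration
    messages = [
        image_message(to_file_uri("/path/to/your/image.jpg"), "Describe this image."),
        image_message(to_data_uri(jpeg_magic), "Describe this image."),
    ]
    print(messages[1][0]["content"][0]["image"])  # data:image;base64,/9j/
```

In practice `to_data_uri` would receive the full contents of an image file, for example `Path("image.jpg").read_bytes()`.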

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "keye-vl-utils",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "keye-vl, large language model, pytorch, vision language model",
    "author": "Kwai Keye Team",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/a7/f8/bbb8268ae740f70365a9e488048a0f57f58f2c1dae65498b8764f0d529e4/keye_vl_utils-1.5.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Keye Vision Language Model Utils - PyTorch",
    "version": "1.5.2",
    "project_urls": {
        "Homepage": "https://github.com/Kwai-Keye/Keye/tree/main/keye-vl-utils",
        "Issues": "https://github.com/Kwai-Keye/Keye/issues",
        "Repository": "https://github.com/Kwai-Keye/Keye.git"
    },
    "split_keywords": [
        "keye-vl",
        " large language model",
        " pytorch",
        " vision language model"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d1fe89fa9b7ae86254f673f7de959cb240e5c4626562336c6bf891b37d1b1b80",
                "md5": "9c7d70daf9d5e10abe5588ce967b3ef0",
                "sha256": "70b4df74afa13cf2ead113ff2ba4b617a94883dc89d9e0a5d033a7f6c4a0245c"
            },
            "downloads": -1,
            "filename": "keye_vl_utils-1.5.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9c7d70daf9d5e10abe5588ce967b3ef0",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 7004,
            "upload_time": "2025-08-27T17:32:27",
            "upload_time_iso_8601": "2025-08-27T17:32:27.737495Z",
            "url": "https://files.pythonhosted.org/packages/d1/fe/89fa9b7ae86254f673f7de959cb240e5c4626562336c6bf891b37d1b1b80/keye_vl_utils-1.5.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a7f8bbb8268ae740f70365a9e488048a0f57f58f2c1dae65498b8764f0d529e4",
                "md5": "281162b5690c10f461f13a7037e54d33",
                "sha256": "3f7b6b1718d22cb4bfa936604dc569460ee6a6b9fd8f2ca56c5f6e5f901a02a7"
            },
            "downloads": -1,
            "filename": "keye_vl_utils-1.5.2.tar.gz",
            "has_sig": false,
            "md5_digest": "281162b5690c10f461f13a7037e54d33",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 6696,
            "upload_time": "2025-08-27T17:32:29",
            "upload_time_iso_8601": "2025-08-27T17:32:29.034321Z",
            "url": "https://files.pythonhosted.org/packages/a7/f8/bbb8268ae740f70365a9e488048a0f57f58f2c1dae65498b8764f0d529e4/keye_vl_utils-1.5.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-27 17:32:29",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Kwai-Keye",
    "github_project": "Keye",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "keye-vl-utils"
}
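
The `sha256` values under `digests` can be used to check a downloaded file before installing it. The following is a minimal sketch using `hashlib`. The function names `sha256_of` and `sha256_matches`, and the assumption that the wheel sits in the current directory, are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # sha256 of the wheel, copied from the "digests" object above.
    expected = "70b4df74afa13cf2ead113ff2ba4b617a94883dc89d9e0a5d033a7f6c4a0245c"
    wheel = Path("keye_vl_utils-1.5.2-py3-none-any.whl")  # assumed download location
    if wheel.exists():
        print(sha256_matches(str(wheel), expected))
    else:
        print(f"{wheel} not found; download it first")
```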
