# keye-vl-utils
Keye-VL Utils is a set of helper functions for preparing image and video inputs for Keye-VL-1.5 models.
## Install
```bash
pip install keye-vl-utils==1.5.2
```
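To confirm which version ended up in the environment after installing, the standard library's `importlib.metadata` can be queried. This is a minimal sketch; `installed_version` is a hypothetical helper name and not part of keye-vl-utils.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("keye-vl-utils")
print("keye-vl-utils:", found if found else "not installed")
```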
## Usage
### KeyeVL
```python
from PIL import Image
from transformers import AutoModel, AutoProcessor
from keye_vl_utils import process_vision_info

# An image can be given as a local file path, a URL, a base64-encoded string, or a PIL.Image.Image.
# Place it at the position in the content list where it should appear relative to the text.
# Each entry of `messages` below is a separate conversation.
pil_image = Image.open("/path/to/your/image.jpg")  # placeholder path; used in the PIL example below
messages = [
    # Image
    ## Local file path
    [{"role": "user", "content": [{"type": "image", "image": "file:///path/to/your/image.jpg"}, {"type": "text", "text": "Describe this image."}]}],
    ## Image URL
    [{"role": "user", "content": [{"type": "image", "image": "http://path/to/your/image.jpg"}, {"type": "text", "text": "Describe this image."}]}],
    ## Base64 encoded image
    [{"role": "user", "content": [{"type": "image", "image": "data:image;base64,/9j/..."}, {"type": "text", "text": "Describe this image."}]}],
    ## PIL.Image.Image
    [{"role": "user", "content": [{"type": "image", "image": pil_image}, {"type": "text", "text": "Describe this image."}]}],
    # Video
    ## Local video path
    [{"role": "user", "content": [{"type": "video", "video": "file:///path/to/video1.mp4"}, {"type": "text", "text": "Describe this video."}]}],
    ## Local video frames
    [{"role": "user", "content": [{"type": "video", "video": ["file:///path/to/extracted_frame1.jpg", "file:///path/to/extracted_frame2.jpg", "file:///path/to/extracted_frame3.jpg"]}, {"type": "text", "text": "Describe this video."}]}],
    ## The model dynamically adjusts video nframes, height, and width; specify these args only if required.
    [{"role": "user", "content": [{"type": "video", "video": "file:///path/to/video1.mp4"}, {"type": "text", "text": "Describe this video."}]}],
]
processor = AutoProcessor.from_pretrained("Kwai-Keye/Keye-VL-1.5-8B", trust_remote_code=True)
model = AutoModel.from_pretrained("Kwai-Keye/Keye-VL-1.5-8B", torch_dtype="auto", device_map="auto", trust_remote_code=True)
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
# Unpack three values; `**name` is not valid on the left-hand side of an assignment.
images, videos, processor_args = process_vision_info(messages)
inputs = processor(text=text, images=images, videos=videos, return_tensors="pt", **processor_args)
print(inputs)
generated_ids = model.generate(**inputs)
print(generated_ids)
```
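The base64 form used above (`data:image;base64,...`) can be produced from raw image bytes with the standard library alone. The sketch below is an illustration, not part of keye-vl-utils: `to_data_uri` and `image_message` are hypothetical helper names, and the message layout simply mirrors the examples above.

```python
import base64


def to_data_uri(image_bytes: bytes) -> str:
    """Encode raw image bytes as a `data:image;base64,...` string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image;base64,{encoded}"


def image_message(image: str, prompt: str) -> list:
    """Build a single-turn conversation in the layout used above (hypothetical helper)."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": prompt},
            ],
        }
    ]


# Placeholder bytes: the first three bytes of a JPEG file, not a complete image.
jpeg_header = b"\xff\xd8\xff"
conversation = image_message(to_data_uri(jpeg_header), "Describe this image.")
print(conversation[0]["content"][0]["image"])  # data:image;base64,/9j/
```

In practice the bytes would come from reading a real image file in binary mode; the three-byte header here only shows why JPEG payloads start with `/9j/`.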
## Raw data
{
"_id": null,
"home_page": null,
"name": "keye-vl-utils",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "keye-vl, large language model, pytorch, vision language model",
"author": "Kwai Keye Team",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/a7/f8/bbb8268ae740f70365a9e488048a0f57f58f2c1dae65498b8764f0d529e4/keye_vl_utils-1.5.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Keye Vision Language Model Utils - PyTorch",
"version": "1.5.2",
"project_urls": {
"Homepage": "https://github.com/Kwai-Keye/Keye/tree/main/keye-vl-utils",
"Issues": "https://github.com/Kwai-Keye/Keye/issues",
"Repository": "https://github.com/Kwai-Keye/Keye.git"
},
"split_keywords": [
"keye-vl",
" large language model",
" pytorch",
" vision language model"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "d1fe89fa9b7ae86254f673f7de959cb240e5c4626562336c6bf891b37d1b1b80",
"md5": "9c7d70daf9d5e10abe5588ce967b3ef0",
"sha256": "70b4df74afa13cf2ead113ff2ba4b617a94883dc89d9e0a5d033a7f6c4a0245c"
},
"downloads": -1,
"filename": "keye_vl_utils-1.5.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9c7d70daf9d5e10abe5588ce967b3ef0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 7004,
"upload_time": "2025-08-27T17:32:27",
"upload_time_iso_8601": "2025-08-27T17:32:27.737495Z",
"url": "https://files.pythonhosted.org/packages/d1/fe/89fa9b7ae86254f673f7de959cb240e5c4626562336c6bf891b37d1b1b80/keye_vl_utils-1.5.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a7f8bbb8268ae740f70365a9e488048a0f57f58f2c1dae65498b8764f0d529e4",
"md5": "281162b5690c10f461f13a7037e54d33",
"sha256": "3f7b6b1718d22cb4bfa936604dc569460ee6a6b9fd8f2ca56c5f6e5f901a02a7"
},
"downloads": -1,
"filename": "keye_vl_utils-1.5.2.tar.gz",
"has_sig": false,
"md5_digest": "281162b5690c10f461f13a7037e54d33",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 6696,
"upload_time": "2025-08-27T17:32:29",
"upload_time_iso_8601": "2025-08-27T17:32:29.034321Z",
"url": "https://files.pythonhosted.org/packages/a7/f8/bbb8268ae740f70365a9e488048a0f57f58f2c1dae65498b8764f0d529e4/keye_vl_utils-1.5.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-27 17:32:29",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "Kwai-Keye",
"github_project": "Keye",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "keye-vl-utils"
}
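The `sha256` digests listed in the metadata can be used to check a downloaded archive before installing it. The following is a minimal sketch using only `hashlib`; `sha256_of` and `matches_digest` are hypothetical helper names, and the local file name is an assumption about where the sdist was saved.

```python
import hashlib
from pathlib import Path

# sha256 digest of keye_vl_utils-1.5.2.tar.gz, copied from the metadata above.
EXPECTED_SDIST_SHA256 = "3f7b6b1718d22cb4bfa936604dc569460ee6a6b9fd8f2ca56c5f6e5f901a02a7"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path: Path, expected_hex: str) -> bool:
    """Compare a file's sha256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.lower()


# Assumed local download location; the check is skipped if the file is absent.
archive = Path("keye_vl_utils-1.5.2.tar.gz")
if archive.exists():
    print("digest matches:", matches_digest(archive, EXPECTED_SDIST_SHA256))
```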