Name | video-clip-describer
Version | 0.4.0
home_page | None
Summary | Generate text descriptions of video clips
upload_time | 2024-11-14 03:25:32
maintainer | None
docs_url | None
author | Bendik R. Brenne
requires_python | <4.0,>=3.12
license | MIT
keywords |
VCS |
bugtrack_url |
requirements | No requirements were recorded.
Travis-CI | No Travis.
coveralls test coverage | No coveralls.
# video-clip-describer
## Installation
```bash
pip install video-clip-describer
```
## Usage
```python
import asyncio
from video_clip_describer import VisionAgent
agent = VisionAgent(
"~/Videos/test.mp4",
api_base_url="https://my-litellm-proxy.local/v1",
api_key="sk-apikey",
vision_model="claude-3-5-sonnet",
refine_model="gemini-1.5-flash",
stack_grid=True,
stack_grid_size=(3, 3),
resize_video=(1024, 768),
hashing_max_frames=200,
hash_size=8,
debug=True,
debug_dir="./debug",
)
description = asyncio.run(agent.run())
print(description)
```
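The `hash_size` and `hashing_max_frames` parameters suggest that similar frames are deduplicated via perceptual hashing before being sent to the model. As an illustrative sketch only (not the package's actual implementation), a simple difference hash (dHash) over a toy grayscale frame looks like this:

```python
# Illustrative sketch: a difference hash (dHash) over a 2D grayscale
# frame, similar in spirit to what a hash_size parameter controls.
# This is NOT the package's internal code.

def dhash(frame, hash_size=8):
    """Hash a frame given as rows of pixel intensities.

    `frame` must be at least hash_size rows by (hash_size + 1) columns.
    Each bit records whether a pixel is brighter than its right neighbour,
    so visually similar frames produce similar hashes.
    """
    bits = 0
    for y in range(hash_size):
        for x in range(hash_size):
            bits = (bits << 1) | (1 if frame[y][x] > frame[y][x + 1] else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

# A toy 8x9 gradient frame; identical frames are distance 0 apart,
# so one of the two would be dropped as a duplicate.
frame = [[(x * y) % 256 for x in range(9)] for y in range(8)]
assert hamming(dhash(frame), dhash(frame)) == 0
```

Near-duplicate frames end up within a small Hamming distance of each other, which is why lowering the hash size (as `--max-frames` hints at) admits more frames as "similar".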
## CLI
```bash
$ video2text path/to/video.mp4
```
```bash
$ video2text --help
Usage: video2text [OPTIONS] VIDEO_FILE
╭─ Arguments ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ * video_file FILENAME The video file to process. [required] │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --resize <width>x<height> Resize frames before sending to GPT-V. [default: 1024x768] │
│ --stack-grid Put video frames in a grid before sending to GPT-V. │
│ --stack-grid-size <cols>x<rows> Grid size to stack frames in. [default: 3x3] │
│ --context Context to add to prompt. [default: None] │
│ --api-base-url OpenAI API compatible base URL. [env var: OPENAI_BASE_URL] [default: https://api.openai.com/v1] │
│ --api-key OpenAI API key. [env var: OPENAI_API_KEY] │
│ --model LLM model to use (overrides --vision-model and --refine-model). [default: None] │
│ --vision-model LLM model to use for vision. [default: claude-3-5-sonnet] │
│ --refine-model LLM model to use for refinement. [default: gemini-1.5-flash] │
│ --no-compress Don't remove similar frames before sending to GPT-V. │
│ --max-frames Max number of frames to allow before decreasing hashing length. [default: 200] │
│ --debug Enable debugging. │
│ --debug-dir PATH Directory to output debug frames to if --debug is enabled. [default: ./debug] │
│ -v Enable verbose output. Repeat for increased verbosity. │
│ --test Don't send requests to LLM. │
│ --install-completion Install completion for the current shell. │
│ --show-completion Show completion for the current shell, to copy it or customize the installation. │
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
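Several options above take a `<a>x<b>` value (`--resize 1024x768`, `--stack-grid-size 3x3`). A hypothetical helper (not part of the package) showing how such a value can be parsed into a tuple:

```python
# Hypothetical helper, not from the package: parse "<width>x<height>"
# or "<cols>x<rows>" option values like "1024x768" or "3x3".

def parse_pair(value: str) -> tuple[int, int]:
    """Split a size option on the first 'x' and return two ints."""
    a, _, b = value.partition("x")
    if not (a.isdigit() and b.isdigit()):
        raise ValueError(f"expected <a>x<b>, got {value!r}")
    return int(a), int(b)

assert parse_pair("1024x768") == (1024, 768)  # --resize default
assert parse_pair("3x3") == (3, 3)            # --stack-grid-size default
```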
Raw data
{
"_id": null,
"home_page": null,
"name": "video-clip-describer",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.12",
"maintainer_email": null,
"keywords": null,
"author": "Bendik R. Brenne",
"author_email": "bendik@konstant.no",
"download_url": "https://files.pythonhosted.org/packages/f9/56/962e4fadd8ebf3482e538b9947194f6f2422a686f9578c48a0bdf719289b/video_clip_describer-0.4.0.tar.gz",
"platform": null,
"description": "# video-clip-describer\n\n## Installation\n\n```bash\npip install video-clip-describer\n```\n\n## Usage\n\n```python\nimport asyncio\nfrom video_clip_describer import VisionAgent\n\nagent = VisionAgent(\n \"~/Videos/test.mp4\",\n api_base_url=\"https://my-litellm-proxy.local/v1\",\n api_key=\"sk-apikey\",\n vision_model=\"claude-3-5-sonnet\",\n refine_model=\"gemini-1.5-flash\",\n stack_grid=True,\n stack_grid_size=(3, 3),\n resize_video=(1024, 768),\n hashing_max_frames=200,\n hash_size=8,\n debug=True,\n debug_dir=\"./debug\",\n)\n\ndescription = asyncio.run(agent.run())\nprint(description)\n```\n\n## CLI\n\n```bash\n$ video2text path/to/video.mp4\n```\n\n```bash\n$ video2text --help\n\n Usage: video2text [OPTIONS] VIDEO_FILE\n\n\u256d\u2500 Arguments \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 * video_file FILENAME The video file to process. 
[required] \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 --resize <width>x<height> Resize frames before sending to GPT-V. [default: 1024x768] \u2502\n\u2502 --stack-grid Put video frames in a grid before sending to GPT-V. 
\u2502\n\u2502 --stack-grid-size <cols>x<rows> Grid size to stack frames in. [default: 3x3] \u2502\n\u2502 --context Context to add to prompt. [default: None] \u2502\n\u2502 --api-base-url OpenAI API compatible base URL. [env var: OPENAI_BASE_URL] [default: https://api.openai.com/v1] \u2502\n\u2502 --api-key OpenAI API key. [env var: OPENAI_API_KEY] \u2502\n\u2502 --model LLM model to use (overrides --vision-model and --refine-model). [default: None] \u2502\n\u2502 --vision-model LLM model to use for vision. [default: claude-3-5-sonnet] \u2502\n\u2502 --refine-model LLM model to use for refinement. [default: gemini-1.5-flash] \u2502\n\u2502 --no-compress Don't remove similar frames before sending to GPT-V. \u2502\n\u2502 --max-frames Max number of frames to allow before decreasing hashing length. [default: 200] \u2502\n\u2502 --debug Enable debugging. \u2502\n\u2502 --debug-dir PATH Directory to output debug frames to if --debug is enabled. [default: ./debug] \u2502\n\u2502 -v Enable verbose output. Repeat for increased verbosity. \u2502\n\u2502 --test Don't send requests to LLM. \u2502\n\u2502 --install-completion Install completion for the current shell. \u2502\n\u2502 --show-completion Show completion for the current shell, to copy it or customize the installation. \u2502\n\u2502 --help Show this message and exit. 
\u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\n```\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Generate text descriptions of video clips",
"version": "0.4.0",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "90b081bf44e83bdc832b39f7631e43d64995ca7a13cd38dab1f147476be7553e",
"md5": "155e18a9b8ad2ea42db56087453f58ca",
"sha256": "c950194873b89f763f412c3dbb3c54fc185221a9bf36db7fc731744a49b847fe"
},
"downloads": -1,
"filename": "video_clip_describer-0.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "155e18a9b8ad2ea42db56087453f58ca",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.12",
"size": 9145,
"upload_time": "2024-11-14T03:25:31",
"upload_time_iso_8601": "2024-11-14T03:25:31.209271Z",
"url": "https://files.pythonhosted.org/packages/90/b0/81bf44e83bdc832b39f7631e43d64995ca7a13cd38dab1f147476be7553e/video_clip_describer-0.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f956962e4fadd8ebf3482e538b9947194f6f2422a686f9578c48a0bdf719289b",
"md5": "8213629fa43231793bf355285f84ae68",
"sha256": "047e21bf40ac1ad9b13c0600418edf64d706310a731ccf6e21c7589d815b71ad"
},
"downloads": -1,
"filename": "video_clip_describer-0.4.0.tar.gz",
"has_sig": false,
"md5_digest": "8213629fa43231793bf355285f84ae68",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.12",
"size": 8388,
"upload_time": "2024-11-14T03:25:32",
"upload_time_iso_8601": "2024-11-14T03:25:32.808546Z",
"url": "https://files.pythonhosted.org/packages/f9/56/962e4fadd8ebf3482e538b9947194f6f2422a686f9578c48a0bdf719289b/video_clip_describer-0.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-14 03:25:32",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "video-clip-describer"
}