# video-sampler
<div align="center">
[![Python Version](https://img.shields.io/pypi/pyversions/video-sampler.svg)](https://pypi.org/project/video-sampler/)
[![Dependencies Status](https://img.shields.io/badge/dependencies-up%20to%20date-brightgreen.svg)](https://github.com/LemurPwned/video-sampler/pulls?utf8=%E2%9C%93&q=is%3Apr%20author%3Aapp%2Fdependabot)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/LemurPwned/video-sampler/blob/main/.pre-commit-config.yaml)
[![License](https://img.shields.io/github/license/LemurPwned/video-sampler)](https://github.com/LemurPwned/video-sampler/blob/main/LICENSE)
[![Downloads](https://img.shields.io/pypi/dm/video-sampler.svg)](https://img.shields.io/pypi/dm/video-sampler.svg)
Video sampler allows you to efficiently sample video frames and summarise the videos.
Currently, it uses keyframe decoding, frame interval gating and perceptual hashing to reduce duplicated samples.
**Use case:** for sampling videos for later annotations used in machine learning.
</div>
## Table of Contents
- [video-sampler](#video-sampler)
- [Table of Contents](#table-of-contents)
- [Documentation](#documentation)
- [Features](#features)
- [Installation and Usage](#installation-and-usage)
- [Basic usage](#basic-usage)
- [YT-DLP integration plugin](#yt-dlp-integration-plugin)
- [Extra YT-DLP options](#extra-yt-dlp-options)
- [OpenAI summary](#openai-summary)
- [API examples](#api-examples)
- [Advanced usage](#advanced-usage)
- [Gating](#gating)
- [CLIP-based gating comparison](#clip-based-gating-comparison)
- [Blur gating](#blur-gating)
- [Benchmarks](#benchmarks)
- [Benchmark videos](#benchmark-videos)
- [Flit commands](#flit-commands)
- [Build](#build)
- [Install](#install)
- [Publish](#publish)
- [🛡 License](#-license)
- [📃 Citation](#-citation)
## Documentation
Documentation is available at [https://lemurpwned.github.io/video-sampler/](https://lemurpwned.github.io/video-sampler/).
## Features
- [x] Direct sampling methods:
- [x] `hash` - uses perceptual hashing to reduce duplicated samples
- [x] `entropy` - uses entropy to reduce duplicated samples (work in progress)
- [x] `gzip` - uses gzip compressed size to reduce duplicated samples (work in progress)
- [x] `buffer` - uses sliding buffer to reduce duplicated samples
- [x] `grid` - uses grid sampling to reduce duplicated samples
- [x] Gating methods (modifications on top of direct sampling methods):
- [x] `clip` - uses CLIP to filter out frames that do not contain the specified objects
- [x] `blur` - uses blur detection to filter out frames that are too blurry
- [x] Language capture:
- [x] Keyword capture from subtitles
- [x] Integrations
- [x] YTDLP integration -- streams directly from [yt-dlp](https://github.com/yt-dlp/yt-dlp) queries,
playlists or single videos
- [x] OpenAI multimodal models integration for video summaries
## Installation and Usage
```bash
pip install -U video_sampler
```
then you can run
```bash
python3 -m video_sampler --help
```
or simply
```bash
video_sampler --help
```
### Basic usage
```bash
python3 -m video_sampler hash FatCat.mp4 ./dataset-frames/ --hash-size 3 --buffer-size 20
```
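The `hash` command keys on a perceptual hash of each frame, dropping frames whose hashes are too close to recently seen ones. As a rough illustration of the idea (not the package's actual implementation), here is a stdlib-only difference-hash (dHash) sketch; the `dhash`/`is_duplicate` helpers and the list-of-rows grayscale image format are made up for the example:

```python
# Illustrative difference-hash (dHash) sketch of perceptual-hash deduplication.
# The image is a grayscale frame given as a list of rows of 0-255 ints.

def dhash(gray, hash_size=3):
    """Downsample to (hash_size+1) x hash_size and compare horizontal neighbours."""
    h, w = len(gray), len(gray[0])
    bits = []
    for row in range(hash_size):
        for col in range(hash_size):
            # Nearest-neighbour sampling into the reduced grid.
            y = row * h // hash_size
            x1 = col * w // (hash_size + 1)
            x2 = (col + 1) * w // (hash_size + 1)
            bits.append(1 if gray[y][x1] > gray[y][x2] else 0)
    return tuple(bits)

def is_duplicate(gray_a, gray_b, max_distance=0):
    """Two frames count as duplicates if their hashes differ in <= max_distance bits."""
    ha, hb = dhash(gray_a), dhash(gray_b)
    return sum(a != b for a, b in zip(ha, hb)) <= max_distance

frame = [[(x * y) % 256 for x in range(16)] for y in range(16)]
print(is_duplicate(frame, frame))  # identical frames -> True
```

A smaller `--hash-size` coarsens the hash, so more near-duplicates collapse together and fewer frames are kept.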
#### YT-DLP integration plugin
Before using, please consult the ToS of the website you are scraping from -- use responsibly and for research purposes.
To use the YT-DLP integration, you need to install `yt-dlp` first (see [yt-dlp](https://github.com/yt-dlp/yt-dlp)).
Then, you simply add `--ytdlp` to the command, which changes the meaning of the `video_path` argument.
- to search
```bash
video_sampler hash "ytsearch:cute cats" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp
```
- to sample a single video
```bash
video_sampler hash "https://www.youtube.com/watch?v=W86cTIoMv2U" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp
```
- to sample a playlist
```bash
video_sampler hash "https://www.youtube.com/watch?v=GbpP3Sxp-1U&list=PLFezMcAw96RGvTTTbdKrqew9seO2ZGRmk" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp
```
- segment based on the keyword extraction
```bash
video_sampler hash "https://www.youtube.com/watch?v=GbpP3Sxp-1U&list=PLFezMcAw96RGvTTTbdKrqew9seO2ZGRmk" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp --keywords "cat,dog,another keyword,test keyword"
```
The videos are never downloaded to disk, only streamed, so you can sample videos from the internet without fetching them in full first.
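The keyword-based segmentation boils down to matching the comma-separated keyword list against subtitle text. A minimal sketch of the idea, with an assumed `(start, end, text)` cue format and a hypothetical `capture_segments` helper (the package's real subtitle handling may differ):

```python
# Sketch of keyword capture from subtitle cues: keep the time spans of cues
# whose text mentions any of the comma-separated keywords.

def capture_segments(cues, keywords):
    """Return (start, end) spans of cues whose text mentions any keyword."""
    terms = [k.strip().lower() for k in keywords.split(",")]
    return [
        (start, end)
        for start, end, text in cues
        if any(term in text.lower() for term in terms)
    ]

cues = [
    (0.0, 2.5, "A cat sits on the fence"),
    (2.5, 5.0, "The weather is nice"),
    (5.0, 8.0, "A dog runs by"),
]
print(capture_segments(cues, "cat,dog"))  # -> [(0.0, 2.5), (5.0, 8.0)]
```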
##### Extra YT-DLP options
You can pass extra options to yt-dlp by using the `--yt-extra-args` flag. For example:
this will only sample videos uploaded before 2019-01-01:
```bash
... --ytdlp --yt-extra-args '--datebefore 20190101'
```
or this will only sample videos uploaded after 2019-01-01:
```bash
... --ytdlp --yt-extra-args '--dateafter 20190101'
```
or this will skip all shorts:
```bash
... --ytdlp --yt-extra-args '--match-filter "original_url!*=/shorts/ & url!*=/shorts/"'
```
#### OpenAI summary
To use the OpenAI multimodal models integration, you need to install `openai` first (`pip install openai`).
Then, add `--summary-url` and `--summary-interval` to the command.
In the example below, a [llamafile](https://github.com/Mozilla-Ocho/llamafile) LLaVA model summarises the video. If you want to use the hosted OpenAI multimodal models instead, export `OPENAI_API_KEY=your_api_key` first.
To replicate, run the LLaVA model locally and set `--summary-url` to the address of the model. Set `--summary-interval` to the minimal interval in seconds between frames that are to be summarised/described.
```bash
video_sampler hash ./videos/FatCat.mp4 ./output-frames/ --hash-size 3 --buffer-size 20 --summary-url "http://localhost:8080" --summary-interval 50
```
Some of the frames, based on the specified interval, will be summarised by the model, and the results are saved to the `./output-frames/summaries.jsonl` file. Summarisation runs after the sampling and gating stages, so only frames that pass both are eligible for summarisation.
```jsonl
summaries.jsonl
---
{"time": 56.087, "summary": "A cat is walking through a field of tall grass, with its head down and ears back. The cat appears to be looking for something in the grass, possibly a mouse or another small creature. The field is covered in snow, adding a wintry atmosphere to the scene."}
{"time": 110.087, "summary": "A dog is walking in the snow, with its head down, possibly sniffing the ground. The dog is the main focus of the image, and it appears to be a small animal. The snowy landscape is visible in the background, creating a serene and cold atmosphere."}
{"time": 171.127, "summary": "The image features a group of animals, including a dog and a cat, standing on a beach near the ocean. The dog is positioned closer to the left side of the image, while the cat is located more towards the center. The scene is set against a beautiful backdrop of a blue sky and a vibrant green ocean. The animals appear to be enjoying their time on the beach, possibly taking a break from their daily activities."}
```
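For context, OpenAI-compatible servers such as llamafile accept a chat-completions request with the frame attached as a base64 data URL. A hedged sketch of what such a payload can look like (field names follow the public OpenAI chat API; `build_summary_request` is a hypothetical helper, not necessarily the exact request video-sampler sends):

```python
import base64
import json

def build_summary_request(jpeg_bytes: bytes, model: str = "llava") -> str:
    """Build an OpenAI-style /v1/chat/completions body asking for a frame summary."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this video frame briefly."},
                    {
                        "type": "image_url",
                        # Frame is inlined as a data URL rather than a link.
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                    },
                ],
            }
        ],
        "max_tokens": 128,
    }
    return json.dumps(payload)

request_body = build_summary_request(b"\xff\xd8fake-jpeg-bytes")
```

The server's reply text then becomes the `summary` field of a `summaries.jsonl` line like those above.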
#### API examples
See examples in [./scripts/run_benchmarks.py](./scripts/run_benchmarks.py).
### Advanced usage
There are 3 sampling methods available:
- `hash` - uses perceptual hashing to reduce duplicated samples
- `entropy` - uses entropy to reduce duplicated samples (work in progress)
- `gzip` - uses gzip compressed size to reduce duplicated samples (work in progress)
To launch any of them, run the `buffer` command and substitute `method-name` with one of the above:
```bash
video_sampler buffer <method-name> ...other options
```
e.g.
```bash
video_sampler buffer entropy --buffer-size 20 ...
```
where `buffer-size` for `entropy` and `gzip` means the top-k sliding buffer size. The sliding buffer also uses hashing to reduce duplicated samples.
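The idea behind the two scores can be sketched in a few lines: `gzip` uses compressed size as a complexity proxy (busier frames compress worse) and `entropy` uses the Shannon entropy of the byte histogram; a top-k buffer then keeps the highest-scoring frames. This is an illustrative sketch under those assumptions, not the package's code:

```python
import gzip
import math

def gzip_score(frame_bytes: bytes) -> int:
    """Compressed size as a crude complexity score: busier frames compress worse."""
    return len(gzip.compress(frame_bytes))

def entropy_score(frame_bytes: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte."""
    counts = [0] * 256
    for b in frame_bytes:
        counts[b] += 1
    n = len(frame_bytes)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# A constant frame carries minimal information; a varied one scores higher.
flat = bytes(64)
noisy = bytes((i * 37) % 256 for i in range(64))
assert entropy_score(flat) == 0.0
assert gzip_score(noisy) > gzip_score(flat)
```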
#### Gating
Aside from basic sampling rules, you can also apply gating rules to the sampled frames, further reducing the number of frames.
There are 3 gating methods available:
- `pass` - pass all frames
- `clip` - use CLIP to filter out frames that do not contain the specified objects
- `blur` - use blur detection to filter out frames that are too blurry
Here's a quick example of how to use the `clip` gate:
```bash
python3 -m video_sampler clip ./videos ./scratch/clip --pos-samples "a cat" --neg-samples "empty background, a lemur" --hash-size 4
```
#### CLIP-based gating comparison
Here's a brief comparison of the frames sampled with and without CLIP-based gating with the following config:
```python
gate_def = dict(
type="clip",
pos_samples=["a cat"],
neg_samples=[
"an empty background",
"text on screen",
"a forest with no animals",
],
model_name="ViT-B-32",
batch_size=32,
pos_margin=0.2,
neg_margin=0.3,
)
```
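The margins in this config plausibly act as thresholds on CLIP similarity scores. The sketch below assumes that semantics (keep a frame if it is close enough to some positive prompt and far enough from every negative one); this is an interpretation for illustration, not the package's documented behaviour, and the similarities are passed in directly rather than computed from CLIP embeddings:

```python
def passes_gate(pos_sims, neg_sims, pos_margin=0.2, neg_margin=0.3):
    """Assumed margin semantics: keep a frame if its best positive-prompt
    similarity clears pos_margin and no negative-prompt similarity
    reaches neg_margin."""
    return max(pos_sims) >= pos_margin and max(neg_sims) < neg_margin

# Cosine similarities in [-1, 1] against the prompt lists in the config above.
assert passes_gate(pos_sims=[0.31], neg_sims=[0.12, 0.05, 0.18])       # cat-like frame kept
assert not passes_gate(pos_sims=[0.10], neg_sims=[0.35, 0.02, 0.01])   # background frame dropped
```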
Evidently, CLIP-based gating is able to filter out frames that do not contain a cat and, consequently, to reduce the number of frames with a plain background. It also thinks that a lemur is a cat, which is not entirely wrong as fluffy creatures go.
| Pass gate (no gating) | CLIP gate | Grid |
| :-------------------------------------------------------------: | :-------------------------------------------------------------: | :-------------------------------------------------------------: |
| <img width="256" src="./assets/FatCat.mp4_hash_4_pass.gif"> | <img width="256" src="./assets/FatCat.mp4_hash_4_clip.gif"> | <img width="256" src="./assets/FatCat.mp4_grid_4_pass.gif"> |
| <img width="256" src="./assets/SmolCat.mp4_hash_4_pass.gif"> | <img width="256" src="./assets/SmolCat.mp4_hash_4_clip.gif"> | <img width="256" src="./assets/SmolCat.mp4_grid_4_pass.gif"> |
| <img width="256" src="./assets/HighLemurs.mp4_hash_4_pass.gif"> | <img width="256" src="./assets/HighLemurs.mp4_hash_4_clip.gif"> | <img width="256" src="./assets/HighLemurs.mp4_grid_4_pass.gif"> |
The effects of gating in numbers, for this particular set of examples (compare the `produced` and `gated` columns): `produced` is the number of frames sampled without gating (here, after perceptual hashing), while `gated` is the number of frames remaining after gating.
| video | buffer | gate | decoded | produced | gated |
| -------------- | ------ | ---- | ------- | -------- | ----- |
| FatCat.mp4 | grid | pass | 179 | 31 | 31 |
| SmolCat.mp4 | grid | pass | 118 | 24 | 24 |
| HighLemurs.mp4 | grid | pass | 161 | 35 | 35 |
| FatCat.mp4 | hash | pass | 179 | 101 | 101 |
| SmolCat.mp4 | hash | pass | 118 | 61 | 61 |
| HighLemurs.mp4 | hash | pass | 161 | 126 | 126 |
| FatCat.mp4 | hash | clip | 179 | 101 | 73 |
| SmolCat.mp4 | hash | clip | 118 | 61 | 31 |
| HighLemurs.mp4 | hash | clip | 161 | 126 | 66 |
#### Blur gating
Helps a little with blurry videos. Adjust threshold and method (`laplacian` or `fft`) for best results.
Some results from `fft` at `threshold=20`:
| video | buffer | gate | decoded | produced | gated |
| ---------- | ------ | ---- | ------- | -------- | ----- |
| MadLad.mp4 | grid | pass | 120 | 31 | 31 |
| MadLad.mp4 | hash | pass | 120 | 110 | 110 |
| MadLad.mp4 | hash | blur | 120 | 110 | 85 |
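The `laplacian` method is commonly implemented as the variance of the Laplacian response: sharp frames have strong local contrast and hence high variance, while blurry ones score near zero. A stdlib-only sketch on a list-of-rows grayscale image, under that common formulation (illustrative, not the package's code; the `fft` method works differently, in the frequency domain):

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian; low variance suggests a blurry frame."""
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

sharp = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]  # checkerboard
blurry = [[128] * 8 for _ in range(8)]                                    # flat grey
assert laplacian_variance(blurry) == 0.0
assert laplacian_variance(sharp) > laplacian_variance(blurry)
```

A frame whose score falls below the configured threshold would then be gated out.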
## Benchmarks
Configuration for this benchmark:
```python
SamplerConfig(min_frame_interval_sec=1.0, keyframes_only=True, buffer_size=30, hash_size=X, queue_wait=0.1, debug=True)
```
| Video | Total frames | Hash size | Decoded | Saved |
| :-------------------------------------------------------------------: | :----------: | :-------: | :-----: | :---: |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | 2936 | 8 | 118 | 106 |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | - | 4 | - | 61 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | 4462 | 8 | 179 | 163 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | - | 4 | - | 101 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | 4020 | 8 | 161 | 154 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | - | 4 | - | 126 |
---
```python
SamplerConfig(
min_frame_interval_sec=1.0,
keyframes_only=True,
queue_wait=0.1,
debug=False,
print_stats=True,
buffer_config={'type': 'entropy'/'gzip', 'size': 30, 'debug': False, 'hash_size': 8, 'expiry': 50}
)
```
| Video | Total frames | Type | Decoded | Saved |
| :-------------------------------------------------------------------: | :----------: | :-----: | :-----: | :---: |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | 2936 | entropy | 118 | 39 |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | - | gzip | - | 39 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | 4462 | entropy | 179 | 64 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | - | gzip | - | 73 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | 4020 | entropy | 161 | 59 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | - | gzip | - | 63 |
## Benchmark videos
- [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U)
- [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC)
- [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o)
- [MadLad](https://www.youtube.com/watch?v=MWyBgudQqsI)
## Flit commands
#### Build
```bash
flit build
```
#### Install
```bash
flit install
```
#### Publish
Remember to bump the version in `pyproject.toml` before publishing.
```bash
flit publish
```
## 🛡 License
[![License](https://img.shields.io/github/license/LemurPwned/video-sampler)](https://github.com/LemurPwned/video-sampler/blob/main/LICENSE)
This project is licensed under the terms of the `MIT` license. See [LICENSE](https://github.com/LemurPwned/video-sampler/blob/main/LICENSE) for more details.
## 📃 Citation
```bibtex
@misc{video-sampler,
author = {video-sampler},
title = {Video sampler allows you to efficiently sample video frames},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LemurPwned/video-sampler}}
}
```