# video-sampler
<div align="center">
[![Python Version](https://img.shields.io/pypi/pyversions/video-sampler.svg)](https://pypi.org/project/video-sampler/)
[![Dependencies Status](https://img.shields.io/badge/dependencies-up%20to%20date-brightgreen.svg)](https://github.com/LemurPwned/video-sampler/pulls?utf8=%E2%9C%93&q=is%3Apr%20author%3Aapp%2Fdependabot)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/LemurPwned/video-sampler/blob/main/.pre-commit-config.yaml)
[![License](https://img.shields.io/github/license/LemurPwned/video-sampler)](https://github.com/LemurPwned/video-sampler/blob/main/LICENSE)
[![Downloads](https://img.shields.io/pypi/dm/video-sampler.svg)](https://img.shields.io/pypi/dm/video-sampler.svg)
Video sampler allows you to efficiently sample video frames and summarise the videos.
Currently, it uses keyframe decoding, frame interval gating and perceptual hashing to reduce duplicated samples.
**Use case:** for sampling videos for later annotations used in machine learning.
</div>
## Table of Contents
- [video-sampler](#video-sampler)
- [Table of Contents](#table-of-contents)
- [Documentation](#documentation)
- [Features](#features)
- [Installation and Usage](#installation-and-usage)
- [Basic usage](#basic-usage)
- [YT-DLP integration plugin](#yt-dlp-integration-plugin)
- [Extra YT-DLP options](#extra-yt-dlp-options)
- [OpenAI summary](#openai-summary)
- [API examples](#api-examples)
- [Advanced usage](#advanced-usage)
- [Gating](#gating)
- [CLIP-based gating comparison](#clip-based-gating-comparison)
- [Blur gating](#blur-gating)
- [Benchmarks](#benchmarks)
- [Benchmark videos](#benchmark-videos)
- [Flit commands](#flit-commands)
- [Build](#build)
- [Install](#install)
- [Publish](#publish)
- [🛡 License](#-license)
- [📃 Citation](#-citation)
## Documentation
Documentation is available at [https://lemurpwned.github.io/video-sampler/](https://lemurpwned.github.io/video-sampler/).
## Features
- [x] Direct sampling methods:
- [x] `hash` - uses perceptual hashing to reduce duplicated samples
- [x] `entropy` - uses entropy to reduce duplicated samples (work in progress)
- [x] `gzip` - uses gzip compressed size to reduce duplicated samples (work in progress)
- [x] `buffer` - uses sliding buffer to reduce duplicated samples
- [x] `grid` - uses grid sampling to reduce duplicated samples
- [x] Gating methods (modifications on top of direct sampling methods):
- [x] `clip` - uses CLIP to filter out frames that do not contain the specified objects
- [x] `blur` - uses blur detection to filter out frames that are too blurry
- [x] Language capture:
- [x] Keyword capture from subtitles
- [x] Integrations
- [x] YTDLP integration -- streams directly from [yt-dlp](https://github.com/yt-dlp/yt-dlp) queries,
playlists or single videos
- [x] OpenAI multimodal models integration for video summaries
## Installation and Usage
```bash
pip install -U video_sampler
```
then you can run
```bash
python3 -m video_sampler --help
```
or simply
```bash
video_sampler --help
```
### Basic usage
```bash
python3 -m video_sampler hash FatCat.mp4 ./dataset-frames/ --hash-size 3 --buffer-size 20
```
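The `hash` command keys on a perceptual hash of each frame, dropping frames whose hashes are too close to recently seen ones. As a rough illustration of the idea (not the package's actual implementation), here is a stdlib-only difference-hash (dHash) sketch; the `dhash`/`is_duplicate` helpers and the list-of-rows grayscale image format are made up for the example:

```python
# Illustrative difference-hash (dHash) sketch of perceptual-hash deduplication.
# The image is a grayscale frame given as a list of rows of 0-255 ints.

def dhash(gray, hash_size=3):
    """Downsample to (hash_size+1) x hash_size and compare horizontal neighbours."""
    h, w = len(gray), len(gray[0])
    bits = []
    for row in range(hash_size):
        for col in range(hash_size):
            # Nearest-neighbour sampling into the reduced grid.
            y = row * h // hash_size
            x1 = col * w // (hash_size + 1)
            x2 = (col + 1) * w // (hash_size + 1)
            bits.append(1 if gray[y][x1] > gray[y][x2] else 0)
    return tuple(bits)

def is_duplicate(gray_a, gray_b, max_distance=0):
    """Two frames count as duplicates if their hashes differ in <= max_distance bits."""
    ha, hb = dhash(gray_a), dhash(gray_b)
    return sum(a != b for a, b in zip(ha, hb)) <= max_distance

frame = [[(x * y) % 256 for x in range(16)] for y in range(16)]
print(is_duplicate(frame, frame))  # identical frames -> True
```

A smaller `--hash-size` coarsens the hash, so more near-duplicates collapse together and fewer frames are kept.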
#### YT-DLP integration plugin
Before using, please consult the ToS of the website you are scraping from -- use responsibly and for research purposes.
To use the YT-DLP integration, you need to install `yt-dlp` first (see [yt-dlp](https://github.com/yt-dlp/yt-dlp)).
Then, you simply add `--ytdlp` to the command, which changes the meaning of the `video_path` argument.
- to search
```bash
video_sampler hash "ytsearch:cute cats" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp
```
- to sample a single video
```bash
video_sampler hash "https://www.youtube.com/watch?v=W86cTIoMv2U" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp
```
- to sample a playlist
```bash
video_sampler hash "https://www.youtube.com/watch?v=GbpP3Sxp-1U&list=PLFezMcAw96RGvTTTbdKrqew9seO2ZGRmk" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp
```
- segment based on the keyword extraction
```bash
video_sampler hash "https://www.youtube.com/watch?v=GbpP3Sxp-1U&list=PLFezMcAw96RGvTTTbdKrqew9seO2ZGRmk" ./folder-frames/ \
--hash-size 3 --buffer-size 20 --ytdlp --keywords "cat,dog,another keyword,test keyword"
```
The videos are never downloaded to disk, only streamed, so you can sample videos from the internet without fetching them in full first.
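The keyword-based segmentation boils down to matching the comma-separated keyword list against subtitle text. A minimal sketch of the idea, with an assumed `(start, end, text)` cue format and a hypothetical `capture_segments` helper (the package's real subtitle handling may differ):

```python
# Sketch of keyword capture from subtitle cues: keep the time spans of cues
# whose text mentions any of the comma-separated keywords.

def capture_segments(cues, keywords):
    """Return (start, end) spans of cues whose text mentions any keyword."""
    terms = [k.strip().lower() for k in keywords.split(",")]
    return [
        (start, end)
        for start, end, text in cues
        if any(term in text.lower() for term in terms)
    ]

cues = [
    (0.0, 2.5, "A cat sits on the fence"),
    (2.5, 5.0, "The weather is nice"),
    (5.0, 8.0, "A dog runs by"),
]
print(capture_segments(cues, "cat,dog"))  # -> [(0.0, 2.5), (5.0, 8.0)]
```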
##### Extra YT-DLP options
You can pass extra options to yt-dlp by using the `--yt-extra-args` flag. For example:
this will only sample videos uploaded before 2019-01-01:
```bash
... --ytdlp --yt-extra-args '--datebefore 20190101'
```
or this will only sample videos uploaded after 2019-01-01:
```bash
... --ytdlp --yt-extra-args '--dateafter 20190101'
```
or this will skip all shorts:
```bash
... --ytdlp --yt-extra-args '--match-filter "original_url!*=/shorts/ & url!*=/shorts/"'
```
#### OpenAI summary
To use the OpenAI multimodal models integration, you need to install `openai` first (`pip install openai`).
Then, add `--summary-url` and `--summary-interval` to the command.
In the example below, a [llamafile](https://github.com/Mozilla-Ocho/llamafile) LLaVA model summarises the video. If you want to use the hosted OpenAI multimodal models instead, export `OPENAI_API_KEY=your_api_key` first.
To replicate, run the LLaVA model locally and set `--summary-url` to the address of the model. Set `--summary-interval` to the minimal interval in seconds between frames that are to be summarised/described.
```bash
video_sampler hash ./videos/FatCat.mp4 ./output-frames/ --hash-size 3 --buffer-size 20 --summary-url "http://localhost:8080" --summary-interval 50
```
Some of the frames, based on the specified interval, will be summarised by the model, and the results are saved to the `./output-frames/summaries.jsonl` file. Summarisation runs after the sampling and gating stages, so only frames that pass both are eligible for summarisation.
```jsonl
summaries.jsonl
---
{"time": 56.087, "summary": "A cat is walking through a field of tall grass, with its head down and ears back. The cat appears to be looking for something in the grass, possibly a mouse or another small creature. The field is covered in snow, adding a wintry atmosphere to the scene."}
{"time": 110.087, "summary": "A dog is walking in the snow, with its head down, possibly sniffing the ground. The dog is the main focus of the image, and it appears to be a small animal. The snowy landscape is visible in the background, creating a serene and cold atmosphere."}
{"time": 171.127, "summary": "The image features a group of animals, including a dog and a cat, standing on a beach near the ocean. The dog is positioned closer to the left side of the image, while the cat is located more towards the center. The scene is set against a beautiful backdrop of a blue sky and a vibrant green ocean. The animals appear to be enjoying their time on the beach, possibly taking a break from their daily activities."}
```
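For context, OpenAI-compatible servers such as llamafile accept a chat-completions request with the frame attached as a base64 data URL. A hedged sketch of what such a payload can look like (field names follow the public OpenAI chat API; `build_summary_request` is a hypothetical helper, not necessarily the exact request video-sampler sends):

```python
import base64
import json

def build_summary_request(jpeg_bytes: bytes, model: str = "llava") -> str:
    """Build an OpenAI-style /v1/chat/completions body asking for a frame summary."""
    b64 = base64.b64encode(jpeg_bytes).decode("ascii")
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this video frame briefly."},
                    {
                        "type": "image_url",
                        # Frame is inlined as a data URL rather than a link.
                        "image_url": {"url": f"data:image/jpeg;base64,{b64}"},
                    },
                ],
            }
        ],
        "max_tokens": 128,
    }
    return json.dumps(payload)

request_body = build_summary_request(b"\xff\xd8fake-jpeg-bytes")
```

The server's reply text then becomes the `summary` field of a `summaries.jsonl` line like those above.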
#### API examples
See examples in [./scripts/run_benchmarks.py](./scripts/run_benchmarks.py).
### Advanced usage
There are 3 sampling methods available:
- `hash` - uses perceptual hashing to reduce duplicated samples
- `entropy` - uses entropy to reduce duplicated samples (work in progress)
- `gzip` - uses gzip compressed size to reduce duplicated samples (work in progress)
To launch any of them, run the `buffer` command and substitute `method-name` with one of the above:
```bash
video_sampler buffer <method-name> ...other options
```
e.g.
```bash
video_sampler buffer entropy --buffer-size 20 ...
```
where `buffer-size` for `entropy` and `gzip` means the top-k sliding buffer size. The sliding buffer also uses hashing to reduce duplicated samples.
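The idea behind the two scores can be sketched in a few lines: `gzip` uses compressed size as a complexity proxy (busier frames compress worse) and `entropy` uses the Shannon entropy of the byte histogram; a top-k buffer then keeps the highest-scoring frames. This is an illustrative sketch under those assumptions, not the package's code:

```python
import gzip
import math

def gzip_score(frame_bytes: bytes) -> int:
    """Compressed size as a crude complexity score: busier frames compress worse."""
    return len(gzip.compress(frame_bytes))

def entropy_score(frame_bytes: bytes) -> float:
    """Shannon entropy of the byte histogram, in bits per byte."""
    counts = [0] * 256
    for b in frame_bytes:
        counts[b] += 1
    n = len(frame_bytes)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# A constant frame carries minimal information; a varied one scores higher.
flat = bytes(64)
noisy = bytes((i * 37) % 256 for i in range(64))
assert entropy_score(flat) == 0.0
assert gzip_score(noisy) > gzip_score(flat)
```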
#### Gating
Aside from basic sampling rules, you can also apply gating rules to the sampled frames, further reducing the number of frames.
There are 3 gating methods available:
- `pass` - pass all frames
- `clip` - use CLIP to filter out frames that do not contain the specified objects
- `blur` - use blur detection to filter out frames that are too blurry
Here's a quick example of how to use the `clip` gate:
```bash
python3 -m video_sampler clip ./videos ./scratch/clip --pos-samples "a cat" --neg-samples "empty background, a lemur" --hash-size 4
```
#### CLIP-based gating comparison
Here's a brief comparison of the frames sampled with and without CLIP-based gating with the following config:
```python
gate_def = dict(
type="clip",
pos_samples=["a cat"],
neg_samples=[
"an empty background",
"text on screen",
"a forest with no animals",
],
model_name="ViT-B-32",
batch_size=32,
pos_margin=0.2,
neg_margin=0.3,
)
```
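The margins in this config plausibly act as thresholds on CLIP similarity scores. The sketch below assumes that semantics (keep a frame if it is close enough to some positive prompt and far enough from every negative one); this is an interpretation for illustration, not the package's documented behaviour, and the similarities are passed in directly rather than computed from CLIP embeddings:

```python
def passes_gate(pos_sims, neg_sims, pos_margin=0.2, neg_margin=0.3):
    """Assumed margin semantics: keep a frame if its best positive-prompt
    similarity clears pos_margin and no negative-prompt similarity
    reaches neg_margin."""
    return max(pos_sims) >= pos_margin and max(neg_sims) < neg_margin

# Cosine similarities in [-1, 1] against the prompt lists in the config above.
assert passes_gate(pos_sims=[0.31], neg_sims=[0.12, 0.05, 0.18])       # cat-like frame kept
assert not passes_gate(pos_sims=[0.10], neg_sims=[0.35, 0.02, 0.01])   # background frame dropped
```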
Evidently, CLIP-based gating is able to filter out frames that do not contain a cat and, consequently, to reduce the number of frames with a plain background. It also thinks that a lemur is a cat, which is not entirely wrong as fluffy creatures go.
| Pass gate (no gating) | CLIP gate | Grid |
| :-------------------------------------------------------------: | :-------------------------------------------------------------: | :-------------------------------------------------------------: |
| <img width="256" src="./assets/FatCat.mp4_hash_4_pass.gif"> | <img width="256" src="./assets/FatCat.mp4_hash_4_clip.gif"> | <img width="256" src="./assets/FatCat.mp4_grid_4_pass.gif"> |
| <img width="256" src="./assets/SmolCat.mp4_hash_4_pass.gif"> | <img width="256" src="./assets/SmolCat.mp4_hash_4_clip.gif"> | <img width="256" src="./assets/SmolCat.mp4_grid_4_pass.gif"> |
| <img width="256" src="./assets/HighLemurs.mp4_hash_4_pass.gif"> | <img width="256" src="./assets/HighLemurs.mp4_hash_4_clip.gif"> | <img width="256" src="./assets/HighLemurs.mp4_grid_4_pass.gif"> |
The effects of gating in numbers, for this particular set of examples (compare the `produced` and `gated` columns): `produced` is the number of frames sampled without gating (here, after perceptual hashing), while `gated` is the number of frames remaining after gating.
| video | buffer | gate | decoded | produced | gated |
| -------------- | ------ | ---- | ------- | -------- | ----- |
| FatCat.mp4 | grid | pass | 179 | 31 | 31 |
| SmolCat.mp4 | grid | pass | 118 | 24 | 24 |
| HighLemurs.mp4 | grid | pass | 161 | 35 | 35 |
| FatCat.mp4 | hash | pass | 179 | 101 | 101 |
| SmolCat.mp4 | hash | pass | 118 | 61 | 61 |
| HighLemurs.mp4 | hash | pass | 161 | 126 | 126 |
| FatCat.mp4 | hash | clip | 179 | 101 | 73 |
| SmolCat.mp4 | hash | clip | 118 | 61 | 31 |
| HighLemurs.mp4 | hash | clip | 161 | 126 | 66 |
#### Blur gating
Helps a little with blurry videos. Adjust threshold and method (`laplacian` or `fft`) for best results.
Some results from `fft` at `threshold=20`:
| video | buffer | gate | decoded | produced | gated |
| ---------- | ------ | ---- | ------- | -------- | ----- |
| MadLad.mp4 | grid | pass | 120 | 31 | 31 |
| MadLad.mp4 | hash | pass | 120 | 110 | 110 |
| MadLad.mp4 | hash | blur | 120 | 110 | 85 |
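The `laplacian` method is commonly implemented as the variance of the Laplacian response: sharp frames have strong local contrast and hence high variance, while blurry ones score near zero. A stdlib-only sketch on a list-of-rows grayscale image, under that common formulation (illustrative, not the package's code; the `fft` method works differently, in the frequency domain):

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian; low variance suggests a blurry frame."""
    h, w = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)

sharp = [[255 if (x + y) % 2 else 0 for x in range(8)] for y in range(8)]  # checkerboard
blurry = [[128] * 8 for _ in range(8)]                                    # flat grey
assert laplacian_variance(blurry) == 0.0
assert laplacian_variance(sharp) > laplacian_variance(blurry)
```

A frame whose score falls below the configured threshold would then be gated out.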
## Benchmarks
Configuration for this benchmark:
```python
SamplerConfig(min_frame_interval_sec=1.0, keyframes_only=True, buffer_size=30, hash_size=X, queue_wait=0.1, debug=True)
```
| Video | Total frames | Hash size | Decoded | Saved |
| :-------------------------------------------------------------------: | :----------: | :-------: | :-----: | :---: |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | 2936 | 8 | 118 | 106 |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | - | 4 | - | 61 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | 4462 | 8 | 179 | 163 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | - | 4 | - | 101 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | 4020 | 8 | 161 | 154 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | - | 4 | - | 126 |
---
```python
SamplerConfig(
min_frame_interval_sec=1.0,
keyframes_only=True,
queue_wait=0.1,
debug=False,
print_stats=True,
buffer_config={'type': 'entropy'/'gzip', 'size': 30, 'debug': False, 'hash_size': 8, 'expiry': 50}
)
```
| Video | Total frames | Type | Decoded | Saved |
| :-------------------------------------------------------------------: | :----------: | :-----: | :-----: | :---: |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | 2936 | entropy | 118 | 39 |
| [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U) | - | gzip | - | 39 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | 4462 | entropy | 179 | 64 |
| [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC) | - | gzip | - | 73 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | 4020 | entropy | 161 | 59 |
| [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o) | - | gzip | - | 63 |
## Benchmark videos
- [SmolCat](https://www.youtube.com/watch?v=W86cTIoMv2U)
- [Fat Cat](https://www.youtube.com/watch?v=kgrV3_g9rYY&ab_channel=BBC)
- [HighLemurs](https://www.youtube.com/watch?v=yYXoCHLqr4o)
- [MadLad](https://www.youtube.com/watch?v=MWyBgudQqsI)
## Flit commands
#### Build
```bash
flit build
```
#### Install
```bash
flit install
```
#### Publish
Remember to bump the version in `pyproject.toml` before publishing.
```bash
flit publish
```
## 🛡 License
[![License](https://img.shields.io/github/license/LemurPwned/video-sampler)](https://github.com/LemurPwned/video-sampler/blob/main/LICENSE)
This project is licensed under the terms of the `MIT` license. See [LICENSE](https://github.com/LemurPwned/video-sampler/blob/main/LICENSE) for more details.
## 📃 Citation
```bibtex
@misc{video-sampler,
author = {video-sampler},
title = {Video sampler allows you to efficiently sample video frames},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LemurPwned/video-sampler}}
}
```