videogrep

Name	videogrep
Version	2.3.0
home_page	http://antiboredom.github.io/videogrep/
Summary	Videogrep is a command line tool that searches through dialog in video and audio files and makes supercuts based on what it finds. Like grep but for video.
upload_time	2024-04-19 00:28:17
maintainer	None
docs_url	None
author	Sam Lavigne
requires_python	<4.0,>=3.8
license	Anti-Capitalist
keywords	video supercut
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

Videogrep
=========

Videogrep is a command line tool that searches through dialog in video or audio files and makes supercuts based on what it finds. It will recognize `.srt` or `.vtt` subtitle tracks, or transcriptions that can be generated with vosk, pocketsphinx, and other tools.

#### Examples

* [The Meta Experience](https://www.youtube.com/watch?v=nGHbOckpifw)
* [All the instances of the phrase "time" in the movie "In Time"](https://www.youtube.com/watch?v=PQMzOUeprlk)
* [All the one to two second silences in "Total Recall"](https://www.youtube.com/watch?v=qEtEbXVbYJQ)
* [A former press secretary telling us what he can tell us](https://www.youtube.com/watch?v=D7pymdCU5NQ)

#### Tutorial

See my blog for a short [tutorial on videogrep and yt-dlp](https://lav.io/notes/videogrep-tutorial/), and part 2, on [videogrep and natural language processing](https://lav.io/notes/videogrep-and-spacy/).

----

## Installation

Videogrep 2.3.0 requires Python 3.8 or later (the package metadata declares `>=3.8,<4.0`).

To install:

```
pip install videogrep
```

If you want to transcribe video or audio, you also need to install [vosk](https://alphacephei.com/vosk/):

```
pip install vosk
```

Note: the previous version of videogrep supported pocketsphinx for speech-to-text. Vosk seems *much* better, so I've added support for it and will likely phase out support for pocketsphinx.

## Usage

The most basic use:

```
videogrep --input path/to/video.mp4 --search 'search phrase'
```

It works with audio too:
```
videogrep --input path/to/audio.mp3 --search 'search phrase'
```

You can put any regular expression in the search phrase.

**NOTE: videogrep requires a matching subtitle track with each video you want to use. The video/audio file and subtitle file need to have the exact same name, up to the extension.** For example, `my_movie.mp4` and `my_movie.srt` will work, and `my_movie.mp4` and `my_movie_subtitle.srt` will *not* work.

Videogrep will search for matching `srt` and `vtt` subtitles, as well as `json` transcript files that can be generated with the `--transcribe` argument.
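
The naming rule can be pictured with a short sketch. This is not videogrep's own lookup code: the function name `find_transcript` and the order in which the extensions are tried are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional

# Extensions named in this README; the preference order is an assumption.
TRANSCRIPT_EXTENSIONS = (".json", ".vtt", ".srt")


def find_transcript(media_file: str) -> Optional[Path]:
    """Return a transcript that shares the media file's name, or None."""
    media_path = Path(media_file)
    for extension in TRANSCRIPT_EXTENSIONS:
        # with_suffix swaps only the final extension: my_movie.mp4 -> my_movie.srt
        candidate = media_path.with_suffix(extension)
        if candidate.exists():
            return candidate
    return None
```

With this rule, `my_movie_subtitle.srt` is never considered for `my_movie.mp4`, because the part of the name before the extension differs.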

### Options

#### `--input [filename(s)] / -i [filename(s)]`

File or files to use as input. Most video or audio formats should work. If you mix audio and video input files, the resulting output will only be audio.


#### `--output [filename] / -o [filename]`

Name of the file to generate. By default this is `supercut.mp4`. Any standard video or audio extension will also work. (If you're using audio input, or mixed audio and video input, and you keep the default `supercut.mp4` as the output filename, videogrep will automatically change the output to `supercut.mp3`.)

Videogrep will also recognize the following extensions for saving files:
  * `.mpv.edl`: generates an edl file playable by [mpv](https://mpv.io/) (useful for previews)
  * `.m3u`: media playlist
  * `.xml`: Final Cut Pro timeline, compatible with Adobe Premiere and DaVinci Resolve

```
videogrep --input path/to/video --search 'search phrase' --output coolvid.mp4
```
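
The default-name fallback described above can be sketched as follows. This is an illustration only, not videogrep's code: `resolve_output` and the `AUDIO_EXTENSIONS` set are hypothetical, and videogrep may detect audio-only input differently.

```python
from pathlib import Path
from typing import List

# Hypothetical list for illustration; videogrep's own audio detection may differ.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}
DEFAULT_OUTPUT = "supercut.mp4"


def resolve_output(inputs: List[str], output: str = DEFAULT_OUTPUT) -> str:
    """Switch the default output name to .mp3 when any input is audio-only."""
    has_audio_input = any(
        Path(name).suffix.lower() in AUDIO_EXTENSIONS for name in inputs
    )
    if has_audio_input and output == DEFAULT_OUTPUT:
        return str(Path(output).with_suffix(".mp3"))
    # An explicitly chosen output name is left alone.
    return output
```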


#### `--search [query] / -s [query]`

Search term, as a regular expression. You can add as many of these as you want. For example:

```
videogrep --input path/to/video --search 'search phrase' --search 'another search' --search 'a third search' --output coolvid.mp4
```


#### `--search-type [type] / -st [type]`

Type of search you want to perform. There are two options:

* `sentence` (default): Generates clips containing the full sentences that match your search query.
* `fragment`: Generates clips containing the exact word or phrase of your search query.

Both options take regular expressions. You may only use the `fragment` search if your transcript has word-level timestamps, which will be the case for YouTube `.vtt` files, or if you generated a transcript using Videogrep itself.

```
videogrep --input path/to/video --search 'experience' --search-type fragment
```
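
To see why `fragment` needs word-level timestamps, here is a minimal sketch of the two search types over a toy transcript. The field names and both functions are assumptions for illustration, not videogrep's internals, and the fragment version only handles single-word patterns.

```python
import re
from typing import Dict, List

# Toy transcript; the field names are illustrative, not a documented format.
TRANSCRIPT: List[Dict] = [
    {
        "content": "what an experience",
        "start": 1.0,
        "end": 2.5,
        "words": [
            {"word": "what", "start": 1.0, "end": 1.3},
            {"word": "an", "start": 1.3, "end": 1.5},
            {"word": "experience", "start": 1.5, "end": 2.5},
        ],
    },
    # No word-level timestamps here, so only sentence search can use it.
    {"content": "a bad experience overall", "start": 3.0, "end": 4.5},
]


def search_sentences(transcript: List[Dict], pattern: str) -> List[Dict]:
    """Keep the full time span of every sentence that matches the regex."""
    return [
        {"start": s["start"], "end": s["end"], "content": s["content"]}
        for s in transcript
        if re.search(pattern, s["content"])
    ]


def search_fragments(transcript: List[Dict], pattern: str) -> List[Dict]:
    """Keep only the time span of each matching word."""
    return [
        {"start": w["start"], "end": w["end"], "content": w["word"]}
        for s in transcript
        for w in s.get("words", [])
        if re.search(pattern, w["word"])
    ]
```

Searching for `experience` returns both sentences in sentence mode, but only the single timed word from the first sentence in fragment mode.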

#### `--max-clips [num] / -m [num]`

Maximum number of clips to use for the supercut.


#### `--demo / -d`

Show the search results without making the supercut.

#### `--preview / -pr`

Preview the supercut in mpv (requires [mpv to be installed](https://mpv.io/)).

#### `--randomize / -r`

Randomize the order of the clips.


#### `--padding [seconds] / -p [seconds]`

Padding in seconds to add to the start and end of each clip.

#### `--resyncsubs [seconds] / -rs [seconds]`

Time in seconds to shift the subtitles forwards or backwards.
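
As a rough picture of how `--padding` and `--resyncsubs` affect clip times, consider the sketch below. It is not videogrep's implementation: `pad_and_shift` is a hypothetical helper, and the sign convention (a positive resync moves clips later) is an assumption.

```python
from typing import Dict, List


def pad_and_shift(
    clips: List[Dict[str, float]], padding: float = 0.0, resync: float = 0.0
) -> List[Dict[str, float]]:
    """Return new clips widened by `padding` and shifted by `resync` seconds."""
    adjusted = []
    for clip in clips:
        start = clip["start"] - padding + resync
        end = clip["end"] + padding + resync
        # A clip cannot begin before the start of the file.
        adjusted.append({"start": max(0.0, start), "end": max(0.0, end)})
    return adjusted
```

For example, a clip from 10.0 to 12.0 seconds with `padding=0.5` and `resync=-1.0` becomes 8.5 to 11.5 seconds.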

#### `--transcribe / -tr`

Transcribe the video/audio using [vosk](https://alphacephei.com/vosk/). This will generate a `.json` file in the same folder as the video. By default this uses vosk's small English model.

**NOTE:** Because of some compatibility issues, vosk must be installed separately with `pip install vosk`.

```
videogrep -i vid.mp4 --transcribe
```

#### `--model [modelpath] / -mo [modelpath]`

In combination with the `--transcribe` option, allows you to specify the path to a vosk model folder to use. Vosk models are [available here](https://alphacephei.com/vosk/models) in a variety of languages.

```
videogrep -i vid.mp4 --transcribe --model path/to/model/
```

#### `--export-clips / -ec`

Exports clips as individual files rather than as a supercut.

```
videogrep -i vid.mp4 --search 'whatever' --export-clips
```

#### `--export-vtt / -ev`

Exports the transcript of the supercut as a WebVTT file next to the video.

```
videogrep -i vid.mp4 --search 'whatever' --export-vtt
```
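
For reference, a WebVTT file is plain text: a `WEBVTT` header followed by cues with `HH:MM:SS.mmm` timestamps. The sketch below shows one way to write such a file from `(start, end, text)` tuples. The helper names are hypothetical and this is not the code videogrep uses.

```python
from typing import Iterable, Tuple


def vtt_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_vtt(cues: Iterable[Tuple[float, float, str]]) -> str:
    """Build a WebVTT document from (start, end, text) tuples."""
    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{vtt_timestamp(start)} --> {vtt_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

In an exported supercut transcript, the cue times would be relative to the supercut rather than to the source files.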

#### `--ngrams [num] / -n [num]`

Shows common words and phrases from the video or audio file.

```
videogrep -i vid.mp4 --ngrams 1
```
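
An n-gram is a run of `n` consecutive words, so `--ngrams 1` counts single words and `--ngrams 2` counts two-word phrases. A minimal counting sketch, with a hypothetical `most_common_ngrams` helper that is not part of videogrep:

```python
from collections import Counter
from typing import List, Tuple


def most_common_ngrams(words: List[str], n: int = 1, top: int = 10) -> List[Tuple[str, int]]:
    """Count every run of `n` consecutive words and return the most frequent."""
    # Zipping the list against shifted copies of itself yields the n-grams.
    grams = zip(*(words[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams).most_common(top)
```

The most frequent phrases are often good candidates for a `--search` query.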


----


## Use it as a module

```
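from videogrep import videogrep

# Keyword names follow the videogrep 2.x source; verify against your version.
videogrep(files='path/to/your/files', query='search_term',
          search_type='sentence', output='output_file_name.mp4')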
```
The videogrep module accepts the same parameters as the command line script. To see the full usage, check out the source.

### Example Scripts

Also see the examples folder for:
* [silence extraction](https://github.com/antiboredom/videogrep/blob/master/examples/only_silence.py)
* [automatically creating supercuts](https://github.com/antiboredom/videogrep/blob/master/examples/auto_supercut.py)
* [creating supercuts based on youtube searches](https://github.com/antiboredom/videogrep/blob/master/examples/auto_youtube.py)
* [creating supercuts from specific parts of speech](https://github.com/antiboredom/videogrep/blob/master/examples/parts_of_speech.py)
* [creating supercuts from spacy pattern matching](https://github.com/antiboredom/videogrep/blob/master/examples/pattern_matcher.py)

----

## Credits

Videogrep is maintained by [Sam Lavigne](https://lav.io), and built using [MoviePy](https://zulko.github.io/moviepy/) and [Vosk](https://alphacephei.com/vosk/). A big thanks goes out to all those who have [contributed](https://github.com/antiboredom/videogrep/graphs/contributors), particularly to [Charlie Macquarie](https://charliemacquarie.com) for his efforts in getting the project to work with audio-only media.

Videogrep has received financial support from the [Department of Digital Humanities, King’s College London](https://www.kcl.ac.uk/ddh) and from the [Clinic for Open Source Arts](https://clinicopensourcearts.org/).

Raw data

            {
    "_id": null,
    "home_page": "http://antiboredom.github.io/videogrep/",
    "name": "videogrep",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.8",
    "maintainer_email": null,
    "keywords": "video, supercut",
    "author": "Sam Lavigne",
    "author_email": "splavigne@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/6f/b2/314193adada9800b724bbf6c959f64d3e01d912d4701f75160c899a6a605/videogrep-2.3.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Anti-Capitalist",
    "summary": "Videogrep is a command line tool that searches through dialog in video and audio files and makes supercuts based on what it finds. Like grep but for video.",
    "version": "2.3.0",
    "project_urls": {
        "Homepage": "http://antiboredom.github.io/videogrep/",
        "Repository": "https://github.com/antiboredom/videogrep"
    },
    "split_keywords": [
        "video",
        " supercut"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d58fa3ca788f9550cef044051a297dce33d5a0d459cd97c0ae9f410037f38140",
                "md5": "9f9da866f5191192597acf9b9e085950",
                "sha256": "471edd50cb1d0c1eb6a525e2729d4d4a0751eb729c8cb73b794ab2dbd2f10a59"
            },
            "downloads": -1,
            "filename": "videogrep-2.3.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9f9da866f5191192597acf9b9e085950",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.8",
            "size": 41203194,
            "upload_time": "2024-04-19T00:27:15",
            "upload_time_iso_8601": "2024-04-19T00:27:15.638719Z",
            "url": "https://files.pythonhosted.org/packages/d5/8f/a3ca788f9550cef044051a297dce33d5a0d459cd97c0ae9f410037f38140/videogrep-2.3.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6fb2314193adada9800b724bbf6c959f64d3e01d912d4701f75160c899a6a605",
                "md5": "dfbce4d34cdea4aa42b2bb8d8580b165",
                "sha256": "180a4bd2ea8ba5566f59acf94edbbcb03514f745b9818594ba9bfd75bb4c4086"
            },
            "downloads": -1,
            "filename": "videogrep-2.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "dfbce4d34cdea4aa42b2bb8d8580b165",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.8",
            "size": 41139136,
            "upload_time": "2024-04-19T00:28:17",
            "upload_time_iso_8601": "2024-04-19T00:28:17.027929Z",
            "url": "https://files.pythonhosted.org/packages/6f/b2/314193adada9800b724bbf6c959f64d3e01d912d4701f75160c899a6a605/videogrep-2.3.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-19 00:28:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "antiboredom",
    "github_project": "videogrep",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "tox": true,
    "lcname": "videogrep"
}
