srt-equalizer

Name	srt-equalizer
Version	0.1.7
home_page	https://github.com/peterk/srt_equalizer
Summary	Transform subtitle line lengths, splitting into multiple subtitle fragments if necessary.
upload_time	2023-07-02 17:06:22
maintainer
docs_url	None
author	Peter Krantz
requires_python	>=3.8,<4.0
license	MIT
keywords
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

# SRT Equalizer

A Python module to transform subtitle line lengths, splitting into multiple subtitle
fragments if necessary. Useful to adjust automatic speech recognition outputs from e.g. [Whisper](https://github.com/openai/whisper) to a more convenient size.

This library works for all languages where spaces separate words.
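To illustrate why space-separated words matter, here is a minimal sketch of greedy splitting on whitespace. It is not the library's actual code: the function name `split_on_spaces` is hypothetical, and the exact break points the library chooses may differ.

```python
def split_on_spaces(text, max_chars=42):
    """Greedily pack whole words into chunks of at most max_chars characters."""
    chunks = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            # The word fits, or it is the first word of a chunk
            # (a single over-long word is kept whole rather than cut).
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


for chunk in split_on_spaces(
    "so I can discuss with you a complex and difficult issue", max_chars=42
):
    print(chunk)
```

Because the only possible break points are spaces, a text without any spaces comes back as a single chunk, however long it is.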

## Installing

`pip install srt_equalizer`

## Example

If the SRT file contains lines over a certain length like this:

```
1
00:00:00,000 --> 00:00:04,000
Good evening. I appreciate you giving me a few minutes of your time tonight

2
00:00:04,000 --> 00:00:11,000
so I can discuss with you a complex and difficult issue, an issue that is one of the most profound of our time.
```

Using this code to shorten the subtitles to a maximum length of 42 chars:

```python

from srt_equalizer import srt_equalizer

srt_equalizer.equalize_srt_file("test.srt", "shortened.srt", 42)
```

...they are split into multiple fragments and time code is adjusted to the
approximate proportional length of each segment while staying inside the time
slot for the fragment.

```
1
00:00:00,000 --> 00:00:02,132
Good evening. I appreciate you giving me

2
00:00:02,132 --> 00:00:04,000
a few minutes of your time tonight

3
00:00:04,000 --> 00:00:06,458
so I can discuss with you a complex and

4
00:00:06,458 --> 00:00:08,979
difficult issue, an issue that is one of

5
00:00:08,979 --> 00:00:11,000
the most profound of our time.
```
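The proportional timing can be sketched as follows. This is an illustration under assumptions, not the library's implementation: the helper name `proportional_times` is hypothetical, and it weights each fragment purely by its character count, so its results may differ by some milliseconds from the output shown above.

```python
from datetime import timedelta


def proportional_times(start, end, chunks):
    """Assign each text chunk a slice of [start, end] proportional to its length."""
    total_chars = sum(len(chunk) for chunk in chunks)
    if total_chars == 0:
        raise ValueError("chunks must contain some text")
    duration = end - start
    spans = []
    cursor = start
    consumed = 0
    for chunk in chunks:
        consumed += len(chunk)
        # Use the cumulative share so the last fragment ends exactly at `end`.
        chunk_end = start + duration * consumed / total_chars
        spans.append((cursor, chunk_end, chunk))
        cursor = chunk_end
    return spans


spans = proportional_times(
    timedelta(seconds=0),
    timedelta(seconds=4),
    ["Good evening. I appreciate you giving me", "a few minutes of your time tonight"],
)
for begin, finish, text in spans:
    print(begin, "-->", finish, text)
```

Each fragment starts where the previous one ends, and the last fragment ends at the original end time, so the fragments never leave the original time slot.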

## Adjust Whisper subtitle lengths

It is also possible to work with individual subtitle items using the following utility methods:

```python
split_subtitle(sub: srt.Subtitle, target_chars: int=42, start_from_index: int=1) -> list[srt.Subtitle]:

whisper_result_to_srt(segments: list[dict]) -> list[srt.Subtitle]:
```

Here is an example of how to reduce the length of subtitles created by Whisper. It assumes you have an audio file to transcribe called `gwb.wav`.

```python
import whisper
from srt_equalizer import srt_equalizer

# Transcribe the audio file with the "small" Whisper model
model = whisper.load_model("small")
result = model.transcribe("gwb.wav", language="en")
segments = result["segments"]

# Convert the Whisper segments to a list of srt.Subtitle items
subs = srt_equalizer.whisper_result_to_srt(segments)

# Reduce line length in the Whisper result to <= 42 chars
equalized = []
for sub in subs:
    equalized.extend(srt_equalizer.split_subtitle(sub, 42))

for i in equalized:
    print(i.content)
```
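For orientation, each Whisper segment is a dict that includes `start` and `end` times in seconds and a `text` string. The sketch below uses only the standard library to show how such values map onto the SRT text format. It is not how `whisper_result_to_srt` is implemented: the helper names `srt_timestamp` and `segments_to_srt_text` and the sample segments are made up for illustration.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt_text(segments):
    """Render Whisper-style segment dicts as SRT text, numbering from 1."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)


# Hypothetical sample data shaped like Whisper's result["segments"]
sample_segments = [
    {"start": 0.0, "end": 4.0, "text": " Good evening."},
    {"start": 4.0, "end": 11.0, "text": " Thank you for your time."},
]
print(segments_to_srt_text(sample_segments))
```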

## Contributing

This library is built with [Poetry](https://python-poetry.org). Check out this repo and run `poetry install` in the source folder. To run the tests, use `poetry run pytest tests`.

If you want to explore the library, start a `poetry shell`.

