pysub-parser


* Name: pysub-parser
* Version: 1.7.1
* Home page: https://github.com/fedecalendino/pysub-parser
* Summary: Utility to extract the contents of a subtitle file.
* Upload time: 2023-12-07 00:47:04
* Author: Fede Calendino
* Requires Python: >=3.8,<4.0
* License: MIT
* Keywords: parsing, subtitle, srt, ssa, sub
* Requirements: none recorded

## pysub-parser

[![Version](https://img.shields.io/pypi/v/pysub-parser?logo=pypi)](https://pypi.org/project/pysub-parser)
[![Quality Gate Status](https://img.shields.io/sonar/alert_status/fedecalendino_pysub-parser?logo=sonarcloud&server=https://sonarcloud.io)](https://sonarcloud.io/dashboard?id=fedecalendino_pysub-parser)
[![CodeCoverage](https://img.shields.io/sonar/coverage/fedecalendino_pysub-parser?logo=sonarcloud&server=https://sonarcloud.io)](https://sonarcloud.io/dashboard?id=fedecalendino_pysub-parser)

Utility to extract the contents of a subtitle file.

Supported types:

* `ass`: [Advanced SubStation Alpha](https://en.wikipedia.org/wiki/SubStation_Alpha#Advanced_SubStation_Alpha)
* `ssa`: [SubStation Alpha](https://en.wikipedia.org/wiki/SubStation_Alpha)
* `srt`: [SubRip](https://en.wikipedia.org/wiki/SubRip)
* `sub`: [MicroDVD](https://en.wikipedia.org/wiki/MicroDVD)
* `txt`: [Sub Viewer](https://en.wikipedia.org/wiki/SubViewer)

> For more information: http://write.flossmanuals.net/video-subtitling/file-formats

### Usage

The `parse` function accepts the following parameters:

* `path`: location of the subtitle file.
* `subtype`: one of the supported file types; defaults to the file extension.
* `encoding`: encoding of the file, `utf-8` by default.
* `**kwargs`: optional parameters.
  * `fps`: framerate (only used by `sub` files), `23.976` by default.
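
The two defaults above can be pictured with a small standard-library sketch. It is not the library's code: `infer_subtype` and `frame_to_offset` are hypothetical helpers that only illustrate how a type could be derived from the file extension and why MicroDVD (`sub`) files, which store frame numbers rather than timestamps, need a framerate.

```python
from datetime import timedelta
from pathlib import Path

# Types listed in this README; the names below are illustrative only.
SUPPORTED_TYPES = {"ass", "ssa", "srt", "sub", "txt"}
DEFAULT_FPS = 23.976


def infer_subtype(path: str) -> str:
    """Guess the subtitle type from the file extension (hypothetical helper)."""
    extension = Path(path).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported subtitle type: {extension!r}")
    return extension


def frame_to_offset(frame: int, fps: float = DEFAULT_FPS) -> timedelta:
    """Convert a frame number to an offset from the start of the video."""
    return timedelta(seconds=frame / fps)


print(infer_subtype("./files/space-jam.srt"))  # srt
print(frame_to_offset(24000, fps=24.0))        # 0:16:40
```

The same frame number maps to a different time at a different framerate, so passing the wrong `fps` for a `sub` file shifts every subtitle.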

```python
from pysubparser import parser

subtitles = parser.parse('./files/space-jam.srt')

for subtitle in subtitles:
    print(subtitle)
```

Output:
```text
0 > [BALL BOUNCING]
1 > Michael?
2 > What are you doing out here, son? It's after midnight.
3 > MICHAEL: Couldn't sleep, Pops.
```

___

### Subtitle Class

Each line of dialogue is represented by a `Subtitle` object with the following properties:

* `index`: position in the file.
* `start`: timestamp of the start of the dialogue.
* `end`: timestamp of the end of the dialogue.
* `text`: dialogue contents.

```python
for subtitle in subtitles:
    print(f'{subtitle.start} > {subtitle.end}')
    print(subtitle.text)
    print()
```

Output:
```text
00:00:36.328000 > 00:00:38.329000
[BALL BOUNCING]

00:01:03.814000 > 00:01:05.189000
Michael?

00:01:08.402000 > 00:01:11.404000
What are you doing out here, son? It's after midnight.

00:01:11.572000 > 00:01:13.072000
MICHAEL: Couldn't sleep, Pops.
```
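
The printed form `00:00:36.328000` matches how Python renders a `datetime.time`, so the sketch below assumes `start` and `end` are `datetime.time` values; this README does not state the type. Under that assumption, a hypothetical `duration` helper can compute how long a subtitle stays on screen:

```python
from datetime import date, datetime, time, timedelta


def duration(start: time, end: time) -> timedelta:
    """Return how long a subtitle stays on screen (hypothetical helper)."""
    # time objects cannot be subtracted directly, so anchor both on the same date.
    anchor = date.min
    return datetime.combine(anchor, end) - datetime.combine(anchor, start)


# Values copied from the first subtitle in the output above.
start = time(0, 0, 36, 328000)
end = time(0, 0, 38, 329000)

print(duration(start, end).total_seconds())  # 2.001
```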

### Cleaners

Four cleaners are currently provided:

* `ascii` translates every Unicode character to its ASCII equivalent.
* `brackets` removes bracketed text, brackets included (e.g., `[BALL BOUNCING]`).
* `formatting` removes formatting tags like `<i>` and `</i>`.
* `lower_case` converts all text to lower case.

```python
from pysubparser.cleaners import ascii, brackets, formatting, lower_case

subtitles = brackets.clean(
    lower_case.clean(
        subtitles
    )
)

for subtitle in subtitles:
    print(subtitle)
```

Output:
```text
0 > 
1 > michael?
2 > what are you doing out here, son? it's after midnight.
3 > michael: couldn't sleep, pops.
```
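
To make the effect of chaining cleaners concrete, here is a standard-library sketch that works on plain strings. It is not how the library implements its cleaners; `remove_brackets`, `lower_case`, and `apply_all` are hypothetical stand-ins that reproduce the output shown above.

```python
import re
from functools import reduce
from typing import Callable, Iterable

# Matches a bracketed span such as "[BALL BOUNCING]", brackets included.
BRACKETS = re.compile(r"\[[^\]]*\]")


def remove_brackets(text: str) -> str:
    """Drop bracketed spans and trim the whitespace left behind."""
    return BRACKETS.sub("", text).strip()


def lower_case(text: str) -> str:
    """Lower-case the whole line."""
    return text.lower()


def apply_all(text: str, cleaners: Iterable[Callable[[str], str]]) -> str:
    """Apply each cleaner in order, feeding the result into the next one."""
    return reduce(lambda current, cleaner: cleaner(current), cleaners, text)


lines = ["[BALL BOUNCING]", "Michael?", "MICHAEL: Couldn't sleep, Pops."]
for line in lines:
    print(repr(apply_all(line, [lower_case, remove_brackets])))
```

A line that consists only of bracketed text becomes an empty string, which is why the first subtitle in the output above is blank.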

### Writers

Given a list of `Subtitle` objects and a path, the writer outputs those subtitles in `srt` format.
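
For readers unfamiliar with the target format, the sketch below shows the layout of a SubRip (`srt`) file: a counter, a `start --> end` line with millisecond precision, the text, and a blank separator line. It is a standalone illustration, not the library's writer; `srt_timestamp` and `to_srt` are hypothetical names, and the entries are plain tuples rather than `Subtitle` objects.

```python
from datetime import time
from typing import Iterable, Tuple


def srt_timestamp(value: time) -> str:
    """Format a time as HH:MM:SS,mmm, the timestamp layout used by SubRip."""
    return f"{value:%H:%M:%S},{value.microsecond // 1000:03d}"


def to_srt(entries: Iterable[Tuple[time, time, str]]) -> str:
    """Render (start, end, text) tuples as the contents of an srt file."""
    blocks = []
    # SubRip counters conventionally start at 1.
    for number, (start, end, text) in enumerate(entries, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


entries = [
    (time(0, 0, 36, 328000), time(0, 0, 38, 329000), "[BALL BOUNCING]"),
    (time(0, 1, 3, 814000), time(0, 1, 5, 189000), "Michael?"),
]
print(to_srt(entries))
```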

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/fedecalendino/pysub-parser",
    "name": "pysub-parser",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<4.0",
    "maintainer_email": "",
    "keywords": "parsing,subtitle,srt,ssa,sub",
    "author": "Fede Calendino",
    "author_email": "fede@calendino.com",
    "download_url": "https://files.pythonhosted.org/packages/6a/42/80a9cee612de7d5f3d940befd2bcfe149e39c3e43662048b49fdadb607ab/pysub_parser-1.7.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Utility to extract the contents of a subtitle file.",
    "version": "1.7.1",
    "project_urls": {
        "Documentation": "https://github.com/fedecalendino/pysub-parser/blob/main/README.md",
        "Homepage": "https://github.com/fedecalendino/pysub-parser"
    },
    "split_keywords": [
        "parsing",
        "subtitle",
        "srt",
        "ssa",
        "sub"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3b98e49af609f6a654d1beb4293dd583dcdb80e67f300a6c2d345ab02c3f0631",
                "md5": "c86ea6e5a6bf3352f31e912977206517",
                "sha256": "02fd234a49a8ab4e36d98a3ed58801466e73178a11b7eab4e62b347ba92b24a9"
            },
            "downloads": -1,
            "filename": "pysub_parser-1.7.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c86ea6e5a6bf3352f31e912977206517",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<4.0",
            "size": 11097,
            "upload_time": "2023-12-07T00:47:02",
            "upload_time_iso_8601": "2023-12-07T00:47:02.764865Z",
            "url": "https://files.pythonhosted.org/packages/3b/98/e49af609f6a654d1beb4293dd583dcdb80e67f300a6c2d345ab02c3f0631/pysub_parser-1.7.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6a4280a9cee612de7d5f3d940befd2bcfe149e39c3e43662048b49fdadb607ab",
                "md5": "bd1633d4e2a3918fd10312281236a03c",
                "sha256": "9f539d30a1b23c0674047835505816abe5ba661414b63497b13153ab4421eda5"
            },
            "downloads": -1,
            "filename": "pysub_parser-1.7.1.tar.gz",
            "has_sig": false,
            "md5_digest": "bd1633d4e2a3918fd10312281236a03c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<4.0",
            "size": 7450,
            "upload_time": "2023-12-07T00:47:04",
            "upload_time_iso_8601": "2023-12-07T00:47:04.305425Z",
            "url": "https://files.pythonhosted.org/packages/6a/42/80a9cee612de7d5f3d940befd2bcfe149e39c3e43662048b49fdadb607ab/pysub_parser-1.7.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-07 00:47:04",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "fedecalendino",
    "github_project": "pysub-parser",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "pysub-parser"
}
        