<h1 align="center">PyCaptions</h1>
<p align="center">
<a href="https://pypi.org/project/pycaptions"><img alt="PyPI - Version" src="https://img.shields.io/pypi/v/pycaptions.svg?color=blue"></a>
<a href="https://choosealicense.com/licenses/lgpl-3.0/"><img alt="PyPI - License" src="https://img.shields.io/pypi/l/pycaptions.svg"></a>
<a href="https://www.python.org/downloads/"><img alt="PyPI - Python" src="https://img.shields.io/pypi/pyversions/pycaptions.svg?color=blue"></a>
<a href="https://github.com/pypa/wheel"><img alt="PyPI - Wheel" src="https://img.shields.io/pypi/wheel/pycaptions.svg"></a>
<a href="https://pypi.org/project/pycaptions"><img alt="PyPI - Status" src="https://img.shields.io/pypi/status/pycaptions.svg"></a>
<a href="https://pypi.org/project/pycaptions"><img alt="PyPI - Downloads" src="https://static.pepy.tech/personalized-badge/pycaptions?period=total&units=international_system&left_text=downloads&left_color=grey&right_color=blue"></a>
</p>
PyCaptions is a caption reading/writing library.
* * *
**Why [LGPL-3.0](https://choosealicense.com/licenses/lgpl-3.0/)?** This ensures that the library's source code always stays under the same licence and cannot be made closed-source. The conditions of this licence apply only to the library itself and its modifications. If you are making modifications, we recommend contributing them to the project, unless they are drastic and specific to your use case.
* * *
Table of Contents
- [Installation](#installation)
- [Supported Formats](#supported-formats)
- [Plans](#future-plans)
- [Examples](#examples)
- [Changelog](#changelog)
Read the [Wiki](https://github.com/adfreelife/PyCaptions/wiki)
* * *
## Installation
- PIP
```
pip install --upgrade pycaptions
```
- Source
```
git clone https://github.com/adfreelife/PyCaptions.git
cd PyCaptions
python setup.py install
```
## Supported Formats
- [SubRip (SRT)](https://en.wikipedia.org/wiki/SubRip) (reader + writer)
- [MicroDVD (SUB)](https://en.wikipedia.org/wiki/MicroDVD) (reader + writer)
- [Timed Text Markup Language (TTML, DFXP, XML)](https://www.w3.org/TR/ttml/) (reader* + writer*)
- [Web Video Text Tracks Format (VTT)](https://www.w3.org/TR/webvtt/) (reader + writer*)
\*Limited functionality
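
MicroDVD is frame-based, while SubRip and WebVTT are time-based, so converting a `.sub` file needs a frame rate. The sketch below shows only the arithmetic in plain Python. `frame_to_srt_time` is a hypothetical helper written for illustration, not part of the PyCaptions API, and the 25 fps default is an assumption.

```python
from typing import Union


def frame_to_srt_time(frame: int, framerate: Union[float, str] = 25) -> str:
    """Convert a MicroDVD frame number to an SRT-style timestamp (hypothetical helper)."""
    fps = float(framerate)  # accepts 25, 23.976 or "23.976"
    if fps <= 0:
        raise ValueError("framerate must be positive")
    total_ms = round(frame / fps * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


print(frame_to_srt_time(50))          # 00:00:02,000 at the assumed 25 fps
print(frame_to_srt_time(1500, "25"))  # 00:01:00,000
```
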
## Future plans
- add writers to all supported formats
- auto-fit lines into multiple lines or split caption blocks into two parts
- add support for more formats
  - [Synchronized Accessible Media Interchange (SAMI)](https://learn.microsoft.com/en-us/previous-versions/windows/desktop/dnacc/understanding-sami-1.0)
  - [Universal Subtitle Format (USF)](https://en.wikipedia.org/wiki/Universal_Subtitle_Format)
  - [LyRiCs (LRC)](https://en.wikipedia.org/wiki/LRC_(file_format))
  - open an issue with "enhancement" label for more
## Examples
### Generic from file name
```python
from pycaptions import Captions

with Captions("tests/test.en.srt") as captions:
    captions.saveVTT("test")
```
### Generic from file stream
```python
from pycaptions import Captions

with open("tests/test.en.srt", encoding="UTF-8") as f:
    captions = Captions(f)  # or: captions = Captions()
                            #     captions.read(f)
    captions.saveVTT("test")
```
### Generic from string
```python
from pycaptions import Captions

srt = """1
00:00:00,500 --> 00:00:02,000
This is a test file
"""
captions = Captions(srt)  # or: captions = Captions()
                          #     captions.detect(srt)
captions.saveVTT("test")
```
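
For a plain cue like the one above, the visible difference between SubRip and WebVTT text is small: a WebVTT file starts with a `WEBVTT` header line and uses a dot instead of a comma before the milliseconds. A minimal standalone sketch of that transformation, ignoring styling and cue settings; `srt_to_vtt_text` is a hypothetical helper and does not describe how PyCaptions implements `saveVTT`:

```python
import re

# Matches an SRT timing line such as "00:00:00,500 --> 00:00:02,000"
TIMING = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})")


def srt_to_vtt_text(srt: str) -> str:
    """Turn simple SRT text into WebVTT text (hypothetical helper, no styling support)."""
    # Swap the comma before the milliseconds for a dot on every timing line.
    body = TIMING.sub(r"\1.\2 --> \3.\4", srt.strip())
    # WebVTT requires the header line followed by a blank line.
    return "WEBVTT\n\n" + body + "\n"


srt = """1
00:00:00,500 --> 00:00:02,000
This is a test file
"""
print(srt_to_vtt_text(srt))
```
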
### Specific reader
Format-specific classes have the same functions as the generic `Captions` class, except that you choose (and can check) the format yourself:
```python
from pycaptions import SubRip, detectSRT

with open("tests/test.en.srt", encoding="UTF-8") as f:
    if detectSRT(f):  # or SubRip.detect(f)
        captions = SubRip().read(f)
        captions.saveVTT("test")
```
### Multilingual
```python
from pycaptions import Captions

# if the format supports multiple languages
with Captions("tests/test.ttml") as captions:
    # first line will be in English, second one in Spanish
    captions.saveSRT("test", ["en", "es"], lines=1)  # recommended to specify lines=1

# if you have multiple files and want to make a multilingual one
with Captions("tests/test.en.srt") as captions:
    with Captions("tests/test.es.srt") as captions2:
        # only subtitle text and comments (if the format supports them) are added
        captions += captions2
    # first line will be in English, second one in Spanish
    captions.save("test", ["en", "es"], lines=1)  # recommended to specify lines=1
```
### Combine files
```python
from pycaptions import Captions

with Captions("tests/test.en.srt") as captions:
    captions.joinFile("tests/test.en.srt", add_end_time=True)
    captions.save("test")
```
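
This README does not spell out what `add_end_time=True` does. Assuming it shifts the appended captions by the end time of the last existing caption, so that the two files play back to back, the idea can be sketched with plain tuples. `join_cues` and the `(start_ms, end_ms, text)` layout are illustrative only and are not the PyCaptions data model.

```python
from typing import List, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text), an illustrative layout


def join_cues(first: List[Cue], second: List[Cue], add_end_time: bool = True) -> List[Cue]:
    """Append `second` to `first`, optionally offset by the latest end time in `first`."""
    offset = max((end for _, end, _ in first), default=0) if add_end_time else 0
    shifted = [(start + offset, end + offset, text) for start, end, text in second]
    return first + shifted


part1 = [(500, 2000, "This is a test file")]
part2 = [(500, 2000, "This is a test file")]
print(join_cues(part1, part2))
# [(500, 2000, 'This is a test file'), (2500, 4000, 'This is a test file')]
```
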
## Changelog
### v0.7.0
Release date: 2024-02-06
Changes:
- **Added CLI support** (e.g. `pycaptions "path/to/file/file.srt" -f vtt`)
- Added autoformat for all values of `lines`
- Added function `CaptionsFormat.getLanguagesAndFilename`
- Added function `CaptionsFormat.getFilename`
- Added `MicroTime.fromMicrotime`, which creates a MicroTime from a list
- Added `MicroTime.toMicrotime`, which returns a MicroTime as a list
- Added `MicroTime.fromAnyFormat`, which returns a MicroTime from the provided format (case insensitive)
- `MicroTime.fromSUBTime` and `MicroTime.toSUBTime` now support framerate as a string
- `Captions.save` `output_format` is now case insensitive
- Improved MicroDVD style conversion
- Internal restructure for faster development
- An invalid `style` argument now results in `style=None`
- Added `style_options` for changing style globally (defaults: `style="full"`, `lines=-1`). This affects how the style is parsed: e.g. setting `style_options.style=None` and then passing the argument `style="full"` will not convert any style, due to optimizations for faster conversion.
- Hyphens at the end of lines (e.g. "Some-<br/>thing") are removed if `lines` is greater than -1
- `Styling` is now split into `StyleFormat` and `Styling(StyleFormat)`
Fixes:
- Fixed "lxml is not installed" error
- Fixed `Styling.getTTML` converting invalid CSS properties into TTML properties. To-do: add value checks for these properties.
- Fixed `CaptionsFormat.getLanguagesFromFilename` getting languages from directory path (e.g. `\path.to.file\file.en.srt` -> `["to", "en"]`)
- Fixed width and height not being saved to json
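
The `getLanguagesFromFilename` fix above comes down to looking only at the file name, not at the directories in front of it. A standalone sketch of that idea using `pathlib`; `languages_from_filename` is a hypothetical function, and treating every dot-separated part between the base name and the extension as a language code is a simplifying assumption, not the library's actual logic:

```python
from pathlib import PureWindowsPath
from typing import List


def languages_from_filename(path: str) -> List[str]:
    """Return the dot-separated parts between the base name and the extension."""
    # PureWindowsPath accepts both "\\" and "/" as separators, so directory
    # names such as "path.to.file" never reach the split below.
    name = PureWindowsPath(path).name
    parts = name.split(".")
    return parts[1:-1]


print(languages_from_filename(r"\path.to.file\file.en.srt"))  # ['en']
print(languages_from_filename("tests/test.en.srt"))           # ['en']
```
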
Read past changes [here](https://github.com/adfreelife/PyCaptions/blob/main/CHANGELOG.md).
Raw data
{
"_id": null,
"home_page": "https://github.com/adfreelife/PyCaptions",
"name": "pycaptions",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "captions,subtitles",
"author": "Ad Free Life",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/12/ab/ae7701ce74ccf73d64fadd3ef8c5e401b1160dfc4af88f3026bf3231fb21/pycaptions-0.7.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "PyCaptions, parser and converter for captions formats",
"version": "0.7.0",
"project_urls": {
"Homepage": "https://github.com/adfreelife/PyCaptions"
},
"split_keywords": [
"captions",
"subtitles"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "001b31ba2fcd77f61247a2ad6ef2b59c1b7b71be9cf26bd9954e127f8d84cf17",
"md5": "9d3e0cc838232e63da7f6ee286a250d0",
"sha256": "4815c5b8dbee636921df9606f4461786126281d8d6c0601f2e7339cb140a4b4f"
},
"downloads": -1,
"filename": "pycaptions-0.7.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9d3e0cc838232e63da7f6ee286a250d0",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 46239,
"upload_time": "2024-02-06T14:43:37",
"upload_time_iso_8601": "2024-02-06T14:43:37.466744Z",
"url": "https://files.pythonhosted.org/packages/00/1b/31ba2fcd77f61247a2ad6ef2b59c1b7b71be9cf26bd9954e127f8d84cf17/pycaptions-0.7.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "12abae7701ce74ccf73d64fadd3ef8c5e401b1160dfc4af88f3026bf3231fb21",
"md5": "7ea4a9820690ecd14cc7aef409bfaec2",
"sha256": "de85f4a6db588af4a8925f3751adfd080cacfee9590897030a721b706c61bb10"
},
"downloads": -1,
"filename": "pycaptions-0.7.0.tar.gz",
"has_sig": false,
"md5_digest": "7ea4a9820690ecd14cc7aef409bfaec2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 35991,
"upload_time": "2024-02-06T14:43:38",
"upload_time_iso_8601": "2024-02-06T14:43:38.848614Z",
"url": "https://files.pythonhosted.org/packages/12/ab/ae7701ce74ccf73d64fadd3ef8c5e401b1160dfc4af88f3026bf3231fb21/pycaptions-0.7.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-06 14:43:38",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "adfreelife",
"github_project": "PyCaptions",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "beautifulsoup4",
"specs": [
[
">=",
"=4.12.2"
]
]
},
{
"name": "cssutils",
"specs": [
[
">=",
"2.9.0"
]
]
},
{
"name": "budoux",
"specs": [
[
">=",
"0.6.2"
]
]
},
{
"name": "langcodes",
"specs": [
[
">=",
"3.2.0"
]
]
},
{
"name": "charset-normalizer",
"specs": [
[
">=",
"3.0.0"
]
]
},
{
"name": "webcolors",
"specs": []
},
{
"name": "lxml",
"specs": []
}
],
"lcname": "pycaptions"
}