# subtitles2text
## Description
Convert subtitle files (VTT, SRT), PDFs, and any other file format supported by Docling (DOCX, PPTX, XLSX, PNG/JPG/JPEG images, HTML/XHTML web pages) to plain text by stripping all metadata and leaving only the text content. This is especially useful for preparing input for generative AI models such as LLMs and GPTs.
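To illustrate what "leaving only the text content" means for subtitle files, here is a minimal sketch in Python. It is not the package's actual implementation: the function name `subtitles_to_text` and the regular expressions are assumptions made for this example, and it only handles the common SRT and WebVTT cue layout.

```python
import re

# Cue timing lines, e.g. "00:00:01,000 --> 00:00:04,000" (SRT)
# or "00:01.000 --> 00:04.000" (WebVTT, hours optional).
TIMING_RE = re.compile(
    r"^\s*(?:\d+:)?\d{2}:\d{2}[.,]\d{3}\s+-->\s+(?:\d+:)?\d{2}:\d{2}[.,]\d{3}"
)
# Inline markup such as <i>...</i> or <c.yellow>...</c>.
TAG_RE = re.compile(r"<[^>]+>")


def subtitles_to_text(raw: str) -> str:
    """Hypothetical helper: keep only the spoken text of SRT/VTT content."""
    kept = []
    normalized = raw.replace("\r\n", "\n").strip()
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        block_lines = block.split("\n")
        # Everything after the timing line is cue text; blocks without a
        # timing line (WEBVTT header, NOTE blocks) are dropped entirely.
        for i, line in enumerate(block_lines):
            if TIMING_RE.match(line):
                cue_text = block_lines[i + 1:]
                break
        else:
            continue
        for line in cue_text:
            cleaned = TAG_RE.sub("", line).strip()
            if cleaned:
                kept.append(cleaned)
    return "\n".join(kept)
```

Calling `subtitles_to_text` on the contents of an `.srt` or `.vtt` file returns one line of text per subtitle line, with cue numbers, timestamps, and inline tags removed.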
## Installation
```bash
pip install subtitles2text
```
## Usage
```bash
subtitles2text
```
This will launch a Tk GUI where you can select the files you want to convert.
The app supports OCR.
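
For the formats delegated to Docling, the conversion can be pictured roughly as follows. This is a sketch under assumptions, not the app's own code: `docling_to_text` is a hypothetical helper, it requires the `docling` package to be installed separately, and it exports Markdown rather than reproducing whatever output format the GUI writes.

```python
def docling_to_text(path: str) -> str:
    """Hypothetical helper: extract the text of a document using Docling."""
    # Imported lazily so that this module loads even without Docling installed.
    from docling.document_converter import DocumentConverter

    converter = DocumentConverter()
    result = converter.convert(path)  # accepts a local file path or a URL
    # Export the parsed document; layout metadata is left behind.
    return result.document.export_to_markdown()
```

For example, `docling_to_text("slides.pptx")` would return the slide text as a single string, where `slides.pptx` is a placeholder file name.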
## License
MIT License.
## Author
This app was coded using `Roo Code` with `Gemini 2.0 flash thinking exp 01-21` under the architecture specified by Stephen Karl Larroque.
## Raw data
{
"_id": null,
"home_page": null,
"name": "subtitles2text",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "Stephen Karl Larroque <lrq3000@gmail.com>",
"keywords": "subtitles, srt, vtt, docx, pdf, html, conversion, converter, llm, gpt, preprocessing",
"author": null,
"author_email": "Stephen Karl Larroque <lrq3000@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/94/da/70da3b0bfb3e15d89eeb7cc44173ea01e7591014510459885517cfcb2a89/subtitles2text-0.0.3.tar.gz",
"platform": null,
"description": "# subtitles2text\r\n\r\n## Description\r\n\r\nConvert subtitles files (vtt, srt, PDF) and any files supported by Docling (DOCX, PPTX, XLSX, images PNG/JPG/JPEG, web pages HTML/XHTML) from any metadata to only leave the text content. This is especially useful to feed to genAI models such as LLMs and GPTs.\r\n\r\n## Installation\r\n\r\n```bash\r\npip install subtitles2text\r\n```\r\n\r\n## Usage\r\n\r\n```bash\r\nsubtitles2text\r\n```\r\n\r\nThis will launch a Tk GUI where you can select the files you want to convert.\r\n\r\nThe app supports OCR.\r\n\r\n## License\r\n\r\nMIT License.\r\n\r\n## Author\r\n\r\nThis app was coded using `Roo Code` with `Gemini 2.0 flash thinking exp 01-21` under the architecture specified by Stephen Karl Larroque.\r\n",
"bugtrack_url": null,
"license": "MIT License",
"summary": "Subtitles (VTT, SRT, PDF, DOCX, HTML, images, etc) to text convertor, with a GUI, great for preprocessing to feed to LLMs",
"version": "0.0.3",
"project_urls": {
"Documentation": "https://github.com/lrq3000/subtitles2text/blob/master/README.md",
"Download": "https://github.com/lrq3000/subtitles2text/releases",
"Homepage": "https://github.com/lrq3000/subtitles2text",
"Source": "https://github.com/lrq3000/subtitles2text",
"Tracker": "https://github.com/lrq3000/subtitles2text/issues"
},
"split_keywords": [
"subtitles",
" srt",
" vtt",
" docx",
" pdf",
" html",
" conversion",
" converter",
" llm",
" gpt",
" preprocessing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c249f2c8b87d723e71a0c4b3e16d839b923f4eee19ef2076066e38871cd790d0",
"md5": "3b0787e1e0a8a1eda7248b18777b7936",
"sha256": "a5dbb7850d5cd7e043632a95bc7730b1d45ee6ff96f3d724d477710da9634418"
},
"downloads": -1,
"filename": "subtitles2text-0.0.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3b0787e1e0a8a1eda7248b18777b7936",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 5648,
"upload_time": "2025-02-07T19:06:35",
"upload_time_iso_8601": "2025-02-07T19:06:35.822167Z",
"url": "https://files.pythonhosted.org/packages/c2/49/f2c8b87d723e71a0c4b3e16d839b923f4eee19ef2076066e38871cd790d0/subtitles2text-0.0.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "94da70da3b0bfb3e15d89eeb7cc44173ea01e7591014510459885517cfcb2a89",
"md5": "d9615af45a333e9b8d63ed810ad6d8cc",
"sha256": "98aa69c687f9f4b3e94a91a6350172856f1178b87193dd5b42cb953db178879b"
},
"downloads": -1,
"filename": "subtitles2text-0.0.3.tar.gz",
"has_sig": false,
"md5_digest": "d9615af45a333e9b8d63ed810ad6d8cc",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 6553,
"upload_time": "2025-02-07T19:06:37",
"upload_time_iso_8601": "2025-02-07T19:06:37.856592Z",
"url": "https://files.pythonhosted.org/packages/94/da/70da3b0bfb3e15d89eeb7cc44173ea01e7591014510459885517cfcb2a89/subtitles2text-0.0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-07 19:06:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lrq3000",
"github_project": "subtitles2text",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "subtitles2text"
}