subtitles2text


| Field | Value |
| --- | --- |
| Name | subtitles2text |
| Version | 0.0.3 |
| home_page | None |
| Summary | Subtitles (VTT, SRT, PDF, DOCX, HTML, images, etc) to text convertor, with a GUI, great for preprocessing to feed to LLMs |
| upload_time | 2025-02-07 19:06:37 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.7 |
| license | MIT License |
| keywords | subtitles srt vtt docx pdf html conversion converter llm gpt preprocessing |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# subtitles2text

## Description

Convert subtitle files (VTT, SRT, PDF) and any file supported by Docling (DOCX, PPTX, XLSX, PNG/JPG/JPEG images, HTML/XHTML web pages) to plain text by stripping all metadata, so that only the text content remains. This is especially useful for preparing input for generative AI models such as LLMs and GPTs.
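
To illustrate what removing metadata from a subtitle file involves, here is a minimal sketch in plain Python. It is not the package's actual implementation: the function name `subtitles_to_text` and the regular expressions are assumptions made for this example, and it only covers simple SRT and WebVTT input.

```python
import re

# Cue timing lines, e.g. "00:00:01,000 --> 00:00:04,000" (SRT)
# or "00:01.000 --> 00:04.000 align:start" (WebVTT).
TIMING_RE = re.compile(
    r"^\s*(?:\d+:)?\d{1,2}:\d{2}[.,]\d{3}\s*-->\s*(?:\d+:)?\d{1,2}:\d{2}[.,]\d{3}"
)
# Inline markup such as <i>...</i> or <c.yellow>...</c>.
TAG_RE = re.compile(r"<[^>]+>")


def subtitles_to_text(raw: str) -> str:
    """Hypothetical helper: return only the spoken text of an SRT/VTT string."""
    kept = []
    normalized = raw.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        block_lines = block.split("\n")
        # Locate the timing line; everything before it is a cue number or id.
        timing_index = next(
            (i for i, line in enumerate(block_lines) if TIMING_RE.match(line)),
            None,
        )
        if timing_index is None:
            # Blocks without a timing line (WEBVTT header, NOTE, STYLE) are skipped.
            continue
        for line in block_lines[timing_index + 1:]:
            text = TAG_RE.sub("", line).strip()
            if text:
                kept.append(text)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = (
        "1\n00:00:01,000 --> 00:00:04,000\nHello, <i>world</i>!\n\n"
        "2\n00:00:05,000 --> 00:00:08,000\nThis is a test.\n"
    )
    print(subtitles_to_text(sample))
```

The sketch drops cue numbers, timing lines, and inline tags, keeping one line of output per line of subtitle text. Formats such as PDF or DOCX need a real document parser and are outside its scope.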

## Installation

```bash
pip install subtitles2text
```

## Usage

```bash
subtitles2text
```

This will launch a Tk GUI where you can select the files you want to convert.

The app supports OCR.

## License

MIT License.

## Author

This app was coded using `Roo Code` with `Gemini 2.0 flash thinking exp 01-21` under the architecture specified by Stephen Karl Larroque.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "subtitles2text",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "Stephen Karl Larroque <lrq3000@gmail.com>",
    "keywords": "subtitles, srt, vtt, docx, pdf, html, conversion, converter, llm, gpt, preprocessing",
    "author": null,
    "author_email": "Stephen Karl Larroque <lrq3000@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/94/da/70da3b0bfb3e15d89eeb7cc44173ea01e7591014510459885517cfcb2a89/subtitles2text-0.0.3.tar.gz",
    "platform": null,
    "description": "# subtitles2text\r\n\r\n## Description\r\n\r\nConvert subtitles files (vtt, srt, PDF) and any files supported by Docling (DOCX, PPTX, XLSX, images PNG/JPG/JPEG, web pages HTML/XHTML) from any metadata to only leave the text content. This is especially useful to feed to genAI models such as LLMs and GPTs.\r\n\r\n## Installation\r\n\r\n```bash\r\npip install subtitles2text\r\n```\r\n\r\n## Usage\r\n\r\n```bash\r\nsubtitles2text\r\n```\r\n\r\nThis will launch a Tk GUI where you can select the files you want to convert.\r\n\r\nThe app supports OCR.\r\n\r\n## License\r\n\r\nMIT License.\r\n\r\n## Author\r\n\r\nThis app was coded using `Roo Code` with `Gemini 2.0 flash thinking exp 01-21` under the architecture specified by Stephen Karl Larroque.\r\n",
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "Subtitles (VTT, SRT, PDF, DOCX, HTML, images, etc) to text convertor, with a GUI, great for preprocessing to feed to LLMs",
    "version": "0.0.3",
    "project_urls": {
        "Documentation": "https://github.com/lrq3000/subtitles2text/blob/master/README.md",
        "Download": "https://github.com/lrq3000/subtitles2text/releases",
        "Homepage": "https://github.com/lrq3000/subtitles2text",
        "Source": "https://github.com/lrq3000/subtitles2text",
        "Tracker": "https://github.com/lrq3000/subtitles2text/issues"
    },
    "split_keywords": [
        "subtitles",
        " srt",
        " vtt",
        " docx",
        " pdf",
        " html",
        " conversion",
        " converter",
        " llm",
        " gpt",
        " preprocessing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c249f2c8b87d723e71a0c4b3e16d839b923f4eee19ef2076066e38871cd790d0",
                "md5": "3b0787e1e0a8a1eda7248b18777b7936",
                "sha256": "a5dbb7850d5cd7e043632a95bc7730b1d45ee6ff96f3d724d477710da9634418"
            },
            "downloads": -1,
            "filename": "subtitles2text-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3b0787e1e0a8a1eda7248b18777b7936",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 5648,
            "upload_time": "2025-02-07T19:06:35",
            "upload_time_iso_8601": "2025-02-07T19:06:35.822167Z",
            "url": "https://files.pythonhosted.org/packages/c2/49/f2c8b87d723e71a0c4b3e16d839b923f4eee19ef2076066e38871cd790d0/subtitles2text-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "94da70da3b0bfb3e15d89eeb7cc44173ea01e7591014510459885517cfcb2a89",
                "md5": "d9615af45a333e9b8d63ed810ad6d8cc",
                "sha256": "98aa69c687f9f4b3e94a91a6350172856f1178b87193dd5b42cb953db178879b"
            },
            "downloads": -1,
            "filename": "subtitles2text-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "d9615af45a333e9b8d63ed810ad6d8cc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 6553,
            "upload_time": "2025-02-07T19:06:37",
            "upload_time_iso_8601": "2025-02-07T19:06:37.856592Z",
            "url": "https://files.pythonhosted.org/packages/94/da/70da3b0bfb3e15d89eeb7cc44173ea01e7591014510459885517cfcb2a89/subtitles2text-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-07 19:06:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "lrq3000",
    "github_project": "subtitles2text",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "subtitles2text"
}
        