subsetzer


Namesubsetzer JSON
Version 0.1.3 PyPI version JSON
download
home_pageNone
SummaryTranslate SRT/VTT/TSV via a local LLM (Ollama-compatible). CLI-first; stdlib-only core.
upload_time2025-11-04 23:16:38
maintainerNone
docs_urlNone
authorNone
requires_python>=3.9
licenseNone
keywords ollama srt subtitles translation vtt
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # Subsetzer

Subsetzer is a stdlib-only subtitle translation toolkit that talks to an
Ollama-compatible LLM running on your LAN. Feed it `.srt`, `.vtt`, or `.tsv`
files, choose the target language, and it handles parsing, chunking, and
translation. The CLI stays faithful to the former `setzer` tool, but now ships
under a new name ready for PyPI.

## Features

- Translate SRT, VTT, and TSV subtitle files via Ollama-compatible APIs.
- Keep bracketed tags (like `[MUSIC]`) intact with `--no-translate-bracketed`.
- Chunk large transcripts with configurable `--max-chars` and `--cues-per-request`.
- Output SRT, VTT (with NOTE block including model + timestamp), or TSV files.
- Run with `pipx`, inside a virtual environment, or directly from source.

## Installation

```bash
pipx install subsetzer
# or
pip install subsetzer
```

Ensure you have an accessible Ollama server (default `http://127.0.0.1:11434`)
with a compatible model pulled (default `gemma3:12b`).

## CLI Usage

```bash
subsetzer --in input.srt --out ./out --target "German"
```

Useful flags (all mirrored from the previous CLI):

- `--outfmt {auto,srt,vtt,tsv}` to override the output format.
- `--outfile TEMPLATE` to customise the output path (placeholders: `{basename}`,
  `{dst}`, `{fmt}`, `{src}`, `{ts}`, `{model}`). The default template now adds
  the model slug so parallel runs stay separate.
- `--cues-per-request` / `--batch-per-chunk` to batch cues per LLM call.
- `--llm-mode {auto,chat,generate}` to force the Ollama API flavour.
- `--stream/--no-stream`, `--timeout`, `--no-llm`, `--debug` behave as before.

Run `subsetzer --help` for the full flag list.

## Environment Variables

Subsetzer understands both the new `SUBSETZER_*` variables and the legacy
`HOMEDOC_*` names. The new aliases take precedence when both are set.

| Variable | Description | Default |
| --- | --- | --- |
| `SUBSETZER_LLM_SERVER` / `HOMEDOC_LLM_SERVER` | Ollama server URL | `http://127.0.0.1:11434` |
| `SUBSETZER_LLM_MODEL` / `HOMEDOC_LLM_MODEL` | Model tag | `gemma3:12b` |
| `SUBSETZER_LLM_MODE` / `HOMEDOC_LLM_MODE` | `auto`, `chat`, or `generate` | `auto` |
| `SUBSETZER_STREAM` / `HOMEDOC_STREAM` | Enable streaming | `True` |
| `SUBSETZER_HTTP_TIMEOUT` / `HOMEDOC_HTTP_TIMEOUT` | Timeout in seconds | `60` |
| `SUBSETZER_CUES_PER_REQUEST` / `HOMEDOC_CUES_PER_REQUEST` | Batch size | `1` |
| `SUBSETZER_TZ` / `HOMEDOC_TZ` | Timezone for timestamped folders | local time |

## Development

Clone the repository and install in editable mode:

```bash
pip install -e packages/subsetzer
pytest packages/subsetzer/tests
```

Subsetzer ships with type hints (`py.typed`) and avoids third-party dependencies.

## License

GPL-3.0-or-later — see `LICENSE` for full details.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "subsetzer",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "ollama, srt, subtitles, translation, vtt",
    "author": null,
    "author_email": "Martin Fellner <martin@martinfellner.at>",
    "download_url": "https://files.pythonhosted.org/packages/b6/6f/abf4a06947a0fa128da6d84ee8acc29f839214342c9751c65c1bff732d19/subsetzer-0.1.3.tar.gz",
    "platform": null,
    "description": "# Subsetzer\n\nSubsetzer is a stdlib-only subtitle translation toolkit that talks to an\nOllama-compatible LLM running on your LAN. Feed it `.srt`, `.vtt`, or `.tsv`\nfiles, choose the target language, and it handles parsing, chunking, and\ntranslation. The CLI stays faithful to the former `setzer` tool, but now ships\nunder a new name ready for PyPI.\n\n## Features\n\n- Translate SRT, VTT, and TSV subtitle files via Ollama-compatible APIs.\n- Keep bracketed tags (like `[MUSIC]`) intact with `--no-translate-bracketed`.\n- Chunk large transcripts with configurable `--max-chars` and `--cues-per-request`.\n- Output SRT, VTT (with NOTE block including model + timestamp), or TSV files.\n- Run with `pipx`, inside a virtual environment, or directly from source.\n\n## Installation\n\n```bash\npipx install subsetzer\n# or\npip install subsetzer\n```\n\nEnsure you have an accessible Ollama server (default `http://127.0.0.1:11434`)\nwith a compatible model pulled (default `gemma3:12b`).\n\n## CLI Usage\n\n```bash\nsubsetzer --in input.srt --out ./out --target \"German\"\n```\n\nUseful flags (all mirrored from the previous CLI):\n\n- `--outfmt {auto,srt,vtt,tsv}` to override the output format.\n- `--outfile TEMPLATE` to customise the output path (placeholders: `{basename}`,\n  `{dst}`, `{fmt}`, `{src}`, `{ts}`, `{model}`). The default template now adds\n  the model slug so parallel runs stay separate.\n- `--cues-per-request` / `--batch-per-chunk` to batch cues per LLM call.\n- `--llm-mode {auto,chat,generate}` to force the Ollama API flavour.\n- `--stream/--no-stream`, `--timeout`, `--no-llm`, `--debug` behave as before.\n\nRun `subsetzer --help` for the full flag list.\n\n## Environment Variables\n\nSubsetzer understands both the new `SUBSETZER_*` variables and the legacy\n`HOMEDOC_*` names. The new aliases take precedence when both are set.\n\n| Variable | Description | Default |\n| --- | --- | --- |\n| `SUBSETZER_LLM_SERVER` / `HOMEDOC_LLM_SERVER` | Ollama server URL | `http://127.0.0.1:11434` |\n| `SUBSETZER_LLM_MODEL` / `HOMEDOC_LLM_MODEL` | Model tag | `gemma3:12b` |\n| `SUBSETZER_LLM_MODE` / `HOMEDOC_LLM_MODE` | `auto`, `chat`, or `generate` | `auto` |\n| `SUBSETZER_STREAM` / `HOMEDOC_STREAM` | Enable streaming | `True` |\n| `SUBSETZER_HTTP_TIMEOUT` / `HOMEDOC_HTTP_TIMEOUT` | Timeout in seconds | `60` |\n| `SUBSETZER_CUES_PER_REQUEST` / `HOMEDOC_CUES_PER_REQUEST` | Batch size | `1` |\n| `SUBSETZER_TZ` / `HOMEDOC_TZ` | Timezone for timestamped folders | local time |\n\n## Development\n\nClone the repository and install in editable mode:\n\n```bash\npip install -e packages/subsetzer\npytest packages/subsetzer/tests\n```\n\nSubsetzer ships with type hints (`py.typed`) and avoids third-party dependencies.\n\n## License\n\nGPL-3.0-or-later \u2014 see `LICENSE` for full details.\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Translate SRT/VTT/TSV via a local LLM (Ollama-compatible). CLI-first; stdlib-only core.",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://github.com/githabideri/subsetzer",
        "Issues": "https://github.com/githabideri/subsetzer/issues"
    },
    "split_keywords": [
        "ollama",
        " srt",
        " subtitles",
        " translation",
        " vtt"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a3c5f342eff033c5e84492db7e0ada34a1149e36b3c8fe99892646fe5a7977ba",
                "md5": "20f63aa67945b341f8b8174995352be0",
                "sha256": "f63ab94236bb45c1f2fd91db2188dd205a7c1de63b5ca46c40135e61cc0327f7"
            },
            "downloads": -1,
            "filename": "subsetzer-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "20f63aa67945b341f8b8174995352be0",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 31266,
            "upload_time": "2025-11-04T23:16:37",
            "upload_time_iso_8601": "2025-11-04T23:16:37.864924Z",
            "url": "https://files.pythonhosted.org/packages/a3/c5/f342eff033c5e84492db7e0ada34a1149e36b3c8fe99892646fe5a7977ba/subsetzer-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b66fabf4a06947a0fa128da6d84ee8acc29f839214342c9751c65c1bff732d19",
                "md5": "472b6b3c0a018273b48ebab2928c1652",
                "sha256": "74e8aece747ff7b08c88288e549012400df44f0db2af1a90af153329e16b9efd"
            },
            "downloads": -1,
            "filename": "subsetzer-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "472b6b3c0a018273b48ebab2928c1652",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 26319,
            "upload_time": "2025-11-04T23:16:38",
            "upload_time_iso_8601": "2025-11-04T23:16:38.980178Z",
            "url": "https://files.pythonhosted.org/packages/b6/6f/abf4a06947a0fa128da6d84ee8acc29f839214342c9751c65c1bff732d19/subsetzer-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-11-04 23:16:38",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "githabideri",
    "github_project": "subsetzer",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "subsetzer"
}
        
Elapsed time: 3.20000s