interpres


Name: interpres
Version: 0.4.0b6
Home page: https://github.com/wasertech/Translator
Summary: Translate from one language to another.
Upload time: 2025-09-10 14:28:10
Maintainer: None
Docs URL: None
Author: Danny Waser
Requires Python: <4,>=3.8
License: LICENSE
Keywords: None
Requirements: No requirements were recorded.
Travis-CI: No Travis.
Coveralls test coverage: No coveralls.
# Interpres — Translator

Translate text and files between languages using Hugging Face translation models (default: Meta NLLB).

Interpres (Translator) is a lightweight CLI and Python package for fast, batch-friendly translation workflows. It supports single-sentence translation, directory and file translation, and robust PO (gettext) file handling for localization workflows.

## Key features
- CLI + Python API
- Default model: [`facebook/nllb-200-distilled-600M`](https://huggingface.co/facebook/nllb-200-distilled-600M) (configurable)
- Fast batch translation with configurable batch size, epochs, and parallelism
- Full support for PO files: preserves metadata, comments, and structure; translates only untranslated entries by default; force retranslate option
- Language list compatible with NLLB (200+ languages)

## Quick install

pip:
```zsh
pip install interpres
```

Install from source:
```zsh
pip install git+https://github.com/wasertech/Translator.git
```

Install a specific release:
```zsh
pip install interpres==0.3.1b4
pip install git+https://github.com/wasertech/Translator.git@v0.3.1b4
```

## CLI overview

Run:
```zsh
translate [FROM] [TO] [SENTENCES...]
```

### Basic examples
```zsh
# Single sentence
translate en fr "This is a test."

# Interactive shell
translate
# or get help
translate --help

# Translate a directory and save output
translate --directory ./texts --save translations.txt eng_Latn fra_Latn
```

### Common options
- -m, --model_id MODEL_ID : Hugging Face model ID to use
- -d, --directory DIRECTORY : Translate files in a directory
- --po : Translate PO files (gettext)
- --force : Force retranslation (including already translated entries)
- -b, --batch_size : Batch size for model inference
- -e, --nepoch : Number of epoch splits used to pipeline batches (tweak to avoid OOM)
- -n, --nproc : Number of CPU workers for preprocessing/filtering
- -L, --language_list : Show supported languages
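
These flags can be combined in a single invocation. The example below is only a sketch: the paths, file names, and values are placeholders, and the model shown is simply the documented default.

```zsh
# Translate every file under ./texts with the default model, a batch size of 16,
# and 4 preprocessing workers, saving the combined output to texts_fr.txt
translate -m facebook/nllb-200-distilled-600M -b 16 -n 4 \
    --directory ./texts --save texts_fr.txt eng_Latn fra_Latn
```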

### PO-file translation (high level)
- Finds .po files recursively and validates Language metadata
- By default translates empty msgstr entries only
- --force reprocesses every entry
- Preserves comments, headers, and file formatting — ideal for Poedit/Django workflows
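
For a localization tree, a typical shell workflow might look like the following sketch (the `./locale` path is a placeholder; the flag combination is assumed from the options listed above):

```zsh
# Fill in only the empty msgstr entries of every .po file under ./locale
translate --po --directory ./locale eng_Latn fra_Latn

# Retranslate every entry, including ones that already have a translation
translate --po --force --directory ./locale eng_Latn fra_Latn
```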

## Python API (simple)
```python
from translator import Translator

t = Translator("eng_Latn", "fra_Latn")
out = t.translate("This is a simple sentence.")
print(out)
```

## PO-file example (Python)
```python
from translator import Translator, utils

translator = Translator("eng_Latn", "spa_Latn")
po = utils.read_po_file("messages.po")

if utils.should_translate_po_file(po, "eng_Latn"):
    entries = utils.extract_untranslated_from_po(po)
    translations = translator.translate(entries)
    mapping = dict(zip(entries, translations))
    utils.update_po_with_translations(po, mapping, force=False)
    utils.save_po_file(po, "messages.po")
```

## Language support
The available languages depend on the model you use, but with `NLLB` you get more than 200 of the most widely used ones.

```zsh
# translate -L
❯ translate --language_list
Language list:
    ...
```

From `python`:
```python
>>> import translator
>>> len(translator.LANGS)
202
>>> translator.LANGS
['ace_Arab', '...', 'zul_Latn']
>>> from translator.language import get_nllb_lang, get_sys_lang_format
>>> nllb_lang = get_nllb_lang("en")
>>> nllb_lang
'eng_Latn'
>>> get_sys_lang_format()
'fra_Latn'
```

Check out [`LANGS`](translator/language.py) to see the full list of supported languages.

## Custom models
- Use any [Hugging Face translation model](https://huggingface.co/models?pipeline_tag=translation&sort=downloads) compatible with the Transformers pipeline:
  `translate --model_id "HUGGINGFACE/MODEL_ID" ...`
- Prefer models that match your language pair and domain. Models trained or fine-tuned specifically for a given language pair (e.g., en→fr) often produce noticeably better results than general multilingual models.
- Domain/context-specific models are best: if you're translating a website, a model trained on website or localization data (or fine-tuned on your site's content) will usually yield more accurate, consistent, and context-aware translations than the default general-purpose model.
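
For instance, to try a larger NLLB checkpoint from the Hub (a sketch; `facebook/nllb-200-1.3B` is used here only as an illustrative model ID):

```zsh
# Same CLI, different model: pass any compatible Hugging Face model ID via -m / --model_id
translate --model_id "facebook/nllb-200-1.3B" eng_Latn fra_Latn "This is a test."
```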

## Performance tips
- Set nepoch (-e) and batch_size (-b) to fit your device memory. A larger batch_size increases throughput but uses more memory.
- Set -n to match your CPU thread count to speed up preprocessing.
- Use custom models: choosing a language-pair-specific or domain-specific model (or fine-tuning one on your data) often improves translation quality and consistency, especially for specialized content such as legal texts, technical docs, or websites.
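
As a rough starting point on memory-constrained hardware (illustrative values, not project recommendations):

```zsh
# Smaller batches and more epoch splits reduce peak memory; raise -b again once it fits
translate -b 8 -e 8 -n 4 --directory ./texts --save out.txt eng_Latn fra_Latn
```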

## License
Mozilla Public License 2.0 — see [LICENSE](LICENSE)

When you use this tool to translate a sentence, the licence of the original sentence still applies to the translation unless specified otherwise.

For example, if you translate a sentence released under [Creative Commons CC0](https://creativecommons.org/share-your-work/public-domain/cc0/), the translation is also under Creative Commons CC0.

The same applies to any other licence.

## Contribute & sponsor
- [Share and talk](https://github.com/wasertech/Translator/discussions/categories/show-and-tell) about translation models you like to use and tell us why.
- [Open issues](https://github.com/wasertech/Translator/issues) or [PRs for features, bugfixes, or performance improvements](https://github.com/wasertech/Translator/pulls).
- [Sponsor this project](https://github.com/sponsors/wasertech)

Thanks for building with Interpres — translate confidently, scale thoughtfully.

            
