interpres


Name: interpres
Version: 0.3.4b4
home_page: https://github.com/wasertech/Translator
Summary: Translate from one language to another.
upload_time: 2023-07-22 15:14:40
maintainer:
docs_url: None
author: Danny Waser
requires_python: >=3.8,<3.12
license: LICENSE
keywords:
VCS:
bugtrack_url:
requirements: No requirements were recorded.
Travis-CI: No Travis.
coveralls test coverage: No coveralls.

# Interpres (Translator)

> Latin Noun
>
> **interpres** *m* or *f* (genitive interpretis); third declension
>
> 1. *An agent between two parties*; *broker*, *negotiator*, *factor*. 
>
>>> Synonyms: *cōciō*, *arillātor*
>
>
> 2. *A translator*, *interpreter*, *expounder*, *expositor*, *explainer*; *dragoman*. 
>
>>> Synonyms: *coniector*, *commentātor*, *interpretātor*, *trānslātor*


*`Translate`* *`from` one language* *`to` another*, *any `sentence` you would like*.

```zsh
# Translate [FROM] [TO] [SENTENCES]
❯ translate fr "Traduisez quelle que soit la phrase que vous voulez."
Translate whatever sentence you want.
```

Uses Meta's NLLB model [`facebook/nllb-200-distilled-600M`](https://huggingface.co/facebook/nllb-200-distilled-600M) by default. You can change it with the `--model_id` flag.


## Installation

Use `pip` to install Translator.

```zsh
❯ pip install interpres
```

Or from source.
```zsh
❯  pip install git+https://github.com/wasertech/Translator.git
```

You can also use a specific version.
```zsh
❯  pip install interpres==0.3.1b4
❯  pip install git+https://github.com/wasertech/Translator.git@v0.3.1b4
```

Locate Translator.
```zsh
❯ which translate
```

## Usage

Using `translate` from your favorite shell.

```zsh
❯ translate help
usage: translate [-h] [-v] [-d DIRECTORY] [-S SAVE] [-l MAX_LENGTH] [-m MODEL_ID] [-p PIPELINE] [-b BATCH_SIZE] [-n NPROC] [-e NEPOCH] [-L]
                 [_from] [_to] [sentences ...]

Translate [FROM one language] [TO another], [any SENTENCE you would like].

positional arguments:
  _from                 Source language to translate from.
  _to                   Target language to translate towards.
  sentences             Sentences to translate.

options:
  -h, --help            show this help message and exit
  -v, --version         shows the current version of translator
  -d DIRECTORY, --directory DIRECTORY
                        Path to directory to translate in batch instead of unique sentence.
  -S SAVE, --save SAVE  Path to text file to save translations.
  -l MAX_LENGTH, --max_length MAX_LENGTH
                        Max length of output.
  -m MODEL_ID, --model_id MODEL_ID
                        HuggingFace model ID to use.
  -p PIPELINE, --pipeline PIPELINE
                        Pipeline task to use.
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Number of sentences to batch for translation.
  -n NPROC, --nproc NPROC
                        Number of process to spawn for filtering untraslated sentences.
  -e NEPOCH, --nepoch NEPOCH
                        Number of epoch(s) to translate batched sentences.
  -L, --language_list   Show list of languages.
```

You can `translate` `from` one language `to` another, any `sentence` you would like.

Greet Translator.
```
❯ translate
ℹ Welcome!
ℹ I am Translator version: 0.3.1b5
ℹ At your service.
? What would you like to translate? Manually typed sentences
ℹ Translating from: Manually typed sentences
? What language to translate from? en
ℹ Translating from eng_Latn.
? What language to translate to? fr
ℹ Translating to fra_Latn.
ℹ Preparing to translate...
Type [Ctrl] + [C] to exit.
          
What would you like to translate?
? Translate: This is a prompt-like translation shell!
C'est une coquille de traduction rapide !

What would you like to translate?
? Translate: You can quickly and effortlessly translate anything from here!
Vous pouvez traduire n'importe quoi rapidement et sans effort.

What would you like to translate?
? Translate: I hope you like my work and are considering becoming a sponsor...
J'espère que vous aimez mon travail et que vous envisagez devenir sponsor...

What would you like to translate?
? Translate:

Cancelled by user
```

Get Translator version.
```zsh
❯ translate version
```

Translate from English to French.
```
❯ translate eng_Latn fra_Latn "This is French."
C'est français.

❯ LANG="fr_CH.UTF-8" translate en "This is also French."
C'est aussi français.
```

Translate from English to Spanish.
```zsh
❯ translate eng_Latn spa_Latn "This is Spanish."
Esto es español.

❯ translate en es "This is also Spanish."
Esto también es español.
```

You can also easily `translate` files from a `--directory` and `--save` to a file.

```zsh
❯ translate --directory . --save en2fr.txt eng_Latn fra_Latn -n 24 -e 1000 -b 64
```

Define:
  - `--nepoch (-e)` as small as possible but as big as necessary.
    
    Translator uses this number `e` of epochs to determine 
    how often it updates, 
    based on the number of sentences 
    given for translation at once.

    If this number is too small, you will face Out-Of-Memory (OOM) errors.
    If it is too big, you will get poor efficiency.

    Keep it between 1 and the total number of sentences to translate.

    For maximum efficiency, keep it as low as you can while still being able 
    to fit `epoch_split` sentences 
    into the `device`'s memory.

  - `--batch_size (-b)` as big as possible but as small as necessary.

    Translator uses this value every time it needs to batch sentences to work on them.

    It mostly determines how many of the `epoch_split` sentences are batched together and translated in one go.

    Keep it as high as possible (<`epoch_split`) but as low as your `device` memory requires (>=1).

    For GPUs, powers of `2` are best for memory optimization 
    (`2`, `4`, `8`, `16`, `32`, `64`, `128`, `256`, `512`, etc.).

  - `--nproc (-n)` to equal the number of virtual threads on your CPU for maximum performance.

    Translator uses this value every time multiple sentences need to be processed by the CPU.

    Keeping it at its highest possible value 
    guarantees maximum performance.
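
The interplay between `--nepoch` and `--batch_size` can be pictured with a small sketch. It assumes that `epoch_split` is the total number of sentences divided by `nepoch`, rounded up, and that each epoch chunk is then cut into batches of `batch_size`. The helper names `epoch_splits` and `batches` are hypothetical and this is not Translator's actual implementation.

```python
from math import ceil
from typing import Iterator, List, Sequence


def epoch_splits(sentences: Sequence[str], nepoch: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of about len(sentences) / nepoch sentences.

    Assumption: this mirrors what the README calls `epoch_split`.
    """
    if not sentences:
        return
    # Keep nepoch between 1 and the total number of sentences.
    nepoch = max(1, min(nepoch, len(sentences)))
    epoch_split = ceil(len(sentences) / nepoch)
    for start in range(0, len(sentences), epoch_split):
        yield list(sentences[start:start + epoch_split])


def batches(chunk: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield batches of at most `batch_size` sentences from one epoch chunk."""
    batch_size = max(1, batch_size)
    for start in range(0, len(chunk), batch_size):
        yield list(chunk[start:start + batch_size])


sentences = [f"sentence {i}" for i in range(10)]
# Batch sizes per epoch for nepoch=3 and batch_size=2.
plan = [[len(b) for b in batches(chunk, 2)] for chunk in epoch_splits(sentences, 3)]
print(plan)
```

Under these assumptions, a smaller `nepoch` means larger chunks held in memory at once, which matches the OOM warning above, while a larger `batch_size` means fewer calls per chunk.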

With a good processor and a single fast and large GPU, 
you can translate, on average, just shy of 100 sentences per second.

On my Threadripper 2920X's 24 threads, 
using my RTX 3060's 12 GB of memory, 
I can peak at ~97 translations/second, averaging a bit lower at 83.
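
To turn these figures into a rough time budget, here is a back-of-the-envelope sketch. It assumes throughput stays constant over the whole run, which is an idealization; `estimate_duration` is a hypothetical helper, not part of the package.

```python
from datetime import timedelta


def estimate_duration(n_sentences: int, sentences_per_second: float) -> timedelta:
    """Estimate the wall-clock time needed at a constant translation rate."""
    if sentences_per_second <= 0:
        raise ValueError("sentences_per_second must be positive")
    return timedelta(seconds=round(n_sentences / sentences_per_second))


# One million sentences at the average and peak rates quoted above.
average = estimate_duration(1_000_000, 83)
peak = estimate_duration(1_000_000, 97)
print(f"average rate: {average}, peak rate: {peak}")
```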

I have not yet tested on my two RTX Titans, but if you want to distribute the computation, you'll have to do it manually for now.
It's on my to-do list, but I won't be offended if you send me a pull request to implement it.

Using `Translator` with `python`.

```python
from translator import Translator

translator = Translator("eng_Latn", "fra_Latn")

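# `translate` takes a single sentence...
english_sentence = "This is just a simple phrase."
french_sentence = translator.translate(english_sentence)

# ...or a list of sentences, which are translated in batch.
english_sentences = [
    "Those are multiple sentences.",
    "If you have lots of them, load them directly from file.",
    "To efficiently batch translate them.",
]
french_sentences = translator.translate(english_sentences)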

print(f"{english_sentence=}")
print(f"{french_sentence=}")
```
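
For larger inputs, loading the sentences from a file could look like the sketch below. It assumes one sentence per line and that `Translator.translate` accepts a list of strings and returns a list, as the example above suggests. `load_sentences` and `translate_file` are hypothetical helpers, and the demo uses a stand-in function instead of a real model so it runs without downloading anything.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def load_sentences(path) -> List[str]:
    """Read one sentence per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


def translate_file(path, translate: Callable[[List[str]], List[str]]) -> List[str]:
    """Load sentences from `path` and hand them to `translate` as one list."""
    return translate(load_sentences(path))


# With the package installed, the real call would presumably look like:
#   from translator import Translator
#   translate_file("sentences.txt", Translator("eng_Latn", "fra_Latn").translate)

# Demo with a stand-in "translator" that only upper-cases its input.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "sentences.txt"
    source.write_text("Hello.\n\n  How are you?  \n", encoding="utf-8")
    demo = translate_file(source, lambda batch: [s.upper() for s in batch])

print(demo)
```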

## Languages

Depending on the model used, you might get fewer choices, 
but with `NLLB` you get more than 200 of the most popular languages.

```zsh
# translate -L
❯ translate --language_list
Language list:
    ...
```

From `python`:
```python
>>> import translator
>>> len(translator.LANGS)
202
>>> translator.LANGS
['ace_Arab', '...', 'zul_Latn']
>>> from translator.language import get_nllb_lang, get_sys_lang_format
>>> nllb_lang = get_nllb_lang("en")
>>> nllb_lang
'eng_Latn'
>>> get_sys_lang_format()
'fra_Latn'
```

Check out [`LANGS`](translator/language.py) to see the full list of supported languages.
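
As a rough illustration of how short codes and locale strings relate to NLLB codes, here is a sketch that covers only the three languages used in this README. The mapping table and the `to_nllb` helper are hypothetical and much smaller than the real `LANGS` list; the package's own `get_nllb_lang` may work differently.

```python
from typing import Optional

# Illustrative subset: only the codes that appear in the examples above.
ISO_TO_NLLB = {"en": "eng_Latn", "fr": "fra_Latn", "es": "spa_Latn"}


def to_nllb(code: str) -> Optional[str]:
    """Map 'en', 'fr_CH.UTF-8' or 'eng_Latn' to an NLLB code, or None if unknown."""
    if code in ISO_TO_NLLB.values():
        return code  # Already an NLLB code.
    # Drop the encoding ('.UTF-8') and the region ('_CH'), keep the language part.
    short = code.split(".")[0].split("_")[0].lower()
    return ISO_TO_NLLB.get(short)


print(to_nllb("en"), to_nllb("fr_CH.UTF-8"), to_nllb("spa_Latn"), to_nllb("xx"))
```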

## Using a custom model

Check out the [HuggingFace Zoo of Translation Models](https://huggingface.co/models?pipeline_tag=translation&sort=downloads).

Or [train your own model](https://huggingface.co/autotrain) for the `translation` or `translation_xx_to_yy` pipeline.

## License

This project is distributed under [Mozilla Public License 2.0](LICENSE).

When you use this tool to translate a sentence, 
the license of the original sentence still applies unless specified otherwise.

This means that 
if you translate a sentence 
under [Creative Commons CC0](https://creativecommons.org/share-your-work/public-domain/cc0/), 
the translation is also under Creative Commons CC0.

The same goes for any other license.

## Contribution

I love stars ⭐ 
but also chocolate 🍫 
so don't hesitate 
to [sponsor this project](https://github.com/sponsors/wasertech)!

Otherwise, 
if you like the project 
and want to see it grow 
and get more convenience features, 
like a dedicated service/client 
to speed up multiple translations, 
don't hesitate to share your ideas 
by opening a ticket 
or even proposing a pull request.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/wasertech/Translator",
    "name": "interpres",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8,<3.12",
    "maintainer_email": "",
    "keywords": "",
    "author": "Danny Waser",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/c8/12/2121d45975163578af39a61fca5e4893892ff451be6e12abb6e31230d205/interpres-0.3.4b4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "LICENSE",
    "summary": "Translate from one language to another.",
    "version": "0.3.4b4",
    "project_urls": {
        "Code": "https://github.com/wasertech/Translator",
        "Documentation": "https://github.com/wasertech/Translator/blob/main/README.md",
        "Homepage": "https://github.com/wasertech/Translator",
        "Issue tracker": "https://github.com/wasertech/Translator/issues"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e517fc2647c5fb22cb86c639840d2268c6ff3c4255bfddf7fe9016324598e974",
                "md5": "97d6de44fabfd065334ddd3541a5ae83",
                "sha256": "5ab78aeb72ba60f03097ba085855b27ae30dd19effbec32c5238336a6fbb632e"
            },
            "downloads": -1,
            "filename": "interpres-0.3.4b4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "97d6de44fabfd065334ddd3541a5ae83",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8,<3.12",
            "size": 20629,
            "upload_time": "2023-07-22T15:14:39",
            "upload_time_iso_8601": "2023-07-22T15:14:39.001547Z",
            "url": "https://files.pythonhosted.org/packages/e5/17/fc2647c5fb22cb86c639840d2268c6ff3c4255bfddf7fe9016324598e974/interpres-0.3.4b4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c8122121d45975163578af39a61fca5e4893892ff451be6e12abb6e31230d205",
                "md5": "7c6c8f22b28f66fc08312240810d1db9",
                "sha256": "5a0e3f66c54174289d18450697de8db0e3b3fc16264f23f4d7731231e127d94b"
            },
            "downloads": -1,
            "filename": "interpres-0.3.4b4.tar.gz",
            "has_sig": false,
            "md5_digest": "7c6c8f22b28f66fc08312240810d1db9",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8,<3.12",
            "size": 20188,
            "upload_time": "2023-07-22T15:14:40",
            "upload_time_iso_8601": "2023-07-22T15:14:40.788447Z",
            "url": "https://files.pythonhosted.org/packages/c8/12/2121d45975163578af39a61fca5e4893892ff451be6e12abb6e31230d205/interpres-0.3.4b4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-22 15:14:40",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "wasertech",
    "github_project": "Translator",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "interpres"
}
        