<h1 align="center">NoLanguageLeftWaiting</h1>
<p align="center">
<img src="demo.gif" width="730">
</p>
<p align="center">
<img src="architecture_NLLW.png" width="730">
</p>
Converts the [NoLanguageLeftBehind](https://arxiv.org/abs/2207.04672) translation model into a SimulMT (Simultaneous Machine Translation) model, optimized for live/streaming use cases.
> Offline models such as NLLB suffer from spurious EOS-token and punctuation insertion, inconsistent prefix handling, and computational overhead that grows rapidly as input length increases. This implementation aims to resolve these issues.
- [LocalAgreement policy](https://www.isca-archive.org/interspeech_2020/liu20s_interspeech.pdf)
- Uses the [HuggingFace transformers](https://huggingface.co/docs/transformers/model_doc/auto#transformers.AutoModelForSeq2SeqLM) implementation only.
- Built for [WhisperLiveKit](https://github.com/QuentinFuxa/WhisperLiveKit)
- 200 languages. See [supported_languages.md](supported_languages.md) for the full list.
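
The [LocalAgreement policy](https://www.isca-archive.org/interspeech_2020/liu20s_interspeech.pdf) commits only the tokens on which two consecutive model hypotheses agree, so already-emitted text never changes. A minimal sketch of the idea — the class and method names here are illustrative, not nllw's actual API:

```python
def longest_common_prefix(a, b):
    """Length of the shared prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

class LocalAgreement:
    """Commit only tokens on which two consecutive hypotheses agree."""

    def __init__(self):
        self.previous = []   # hypothesis from the last update
        self.committed = []  # stable, never-retracted output

    def update(self, hypothesis):
        # Tokens agreeing with the previous hypothesis become stable;
        # everything after the agreement point stays in the buffer.
        agreed = longest_common_prefix(self.previous, hypothesis)
        self.committed.extend(hypothesis[len(self.committed):agreed])
        self.previous = hypothesis
        return self.committed, hypothesis[agreed:]
```

For example, after seeing the hypotheses `["the", "cat"]` and then `["the", "cat", "sat"]`, the policy commits `["the", "cat"]` and keeps `["sat"]` in the unstable buffer.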
## Installation
```bash
pip install nllw
```
> The textual frontend is not installed by default.
## Quick Start
1. Demo interface:
```bash
python textual_interface.py
```
2. Use it as a package:
```python
import nllw

model = nllw.load_model(
    src_langs=["fra_Latn"],
    nllb_backend="transformers",
    nllb_size="600M"
)
translator = nllw.OnlineTranslation(
    model,
    input_languages=["fra_Latn"],
    output_languages=["eng_Latn"]
)

# `tokens` is the next chunk of source-language tokens from your
# streaming pipeline (e.g. a live ASR frontend).
translator.insert_tokens(tokens)
validated, buffer = translator.process()

print(f"Stable: {validated[0].text}")
print(f"Buffer: {buffer.text}")
```
## Input vs. Output Length
Output length is successfully maintained, even though the stable prefix tends to take some time to grow.
<p align="center">
<img src="french_to_english.png" width="730">
</p>