bleuscore


| Field | Value |
| --- | --- |
| Name | bleuscore |
| Version | 0.1.1 |
| Summary | A fast(not yet :) bleu score calculator |
| Upload time | 2024-04-26 07:54:55 |
| Author | Mathew Shen <datahonor@gmail.com> |
| Requires Python | >=3.8 |
| License | MIT |
| Keywords | nlp, tokenizer, bleu, deeplearning |
| Requirements | No requirements were recorded. |

# bleuscore

[![codecov](https://codecov.io/gh/shenxiangzhuang/bleuscore/graph/badge.svg?token=ckgU5oGbxf)](https://codecov.io/gh/shenxiangzhuang/bleuscore)
[![MIT licensed](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)
[![Crates.io](https://img.shields.io/crates/v/bleuscore)](https://crates.io/crates/bleuscore)
[![PyPI - Version](https://img.shields.io/pypi/v/bleuscore)](https://pypi.org/project/bleuscore/)
![docs.rs](https://img.shields.io/docsrs/bleuscore)


[`bleuscore`](https://github.com/shenxiangzhuang/bleuscore)
is a fast (not yet :) BLEU score calculator written in Rust.

## Installation
The Python package is published on [PyPI](https://pypi.org/project/bleuscore/),
so it can be installed directly in several ways:

- `pip`
    ```bash
    pip install bleuscore
    ```

- `poetry`
    ```bash
    poetry add bleuscore
    ```

- `uv`
    ```bash
    uv pip install bleuscore
    ```

## Quick Start
The usage is exactly the same as [Hugging Face evaluate](https://huggingface.co/spaces/evaluate-metric/bleu):

```diff
- import evaluate
+ import bleuscore

predictions = ["hello there general kenobi", "foo bar foobar"]
references = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"]
]

- bleu = evaluate.load("bleu")
- results = bleu.compute(predictions=predictions, references=references)
+ results = bleuscore.compute(predictions=predictions, references=references)

print(results)
# {'bleu': 1.0, 'precisions': [1.0, 1.0, 1.0, 1.0], 'brevity_penalty': 1.0, 
# 'length_ratio': 1.1666666666666667, 'translation_length': 7, 'reference_length': 6}

```
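To make the fields in the result easier to read, here is a minimal pure-Python sketch of corpus-level BLEU. It is not the bleuscore implementation: it assumes plain whitespace tokenization, no smoothing, clipping against the maximum n-gram count over all references, and the shortest reference for the length statistics. The helper names `ngrams` and `corpus_bleu` are made up for illustration.

```python
import math
from collections import Counter

predictions = ["hello there general kenobi", "foo bar foobar"]
references = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"],
]


def ngrams(tokens, max_order=4):
    """Count every n-gram of order 1..max_order in a token list."""
    counts = Counter()
    for order in range(1, max_order + 1):
        for i in range(len(tokens) - order + 1):
            counts[tuple(tokens[i:i + order])] += 1
    return counts


def corpus_bleu(predictions, references, max_order=4):
    """Illustrative corpus BLEU (assumption: whitespace tokens, no smoothing)."""
    matches = [0] * max_order
    possible = [0] * max_order
    translation_length = 0
    reference_length = 0
    for pred, refs in zip(predictions, references):
        pred_tokens = pred.split()
        ref_tokens = [r.split() for r in refs]
        translation_length += len(pred_tokens)
        # Assumption: use the shortest reference for the length statistics.
        reference_length += min(len(r) for r in ref_tokens)
        # Clip each predicted n-gram by its maximum count in any reference.
        merged = Counter()
        for r in ref_tokens:
            merged |= ngrams(r, max_order)
        pred_counts = ngrams(pred_tokens, max_order)
        for gram, count in (pred_counts & merged).items():
            matches[len(gram) - 1] += count
        for gram, count in pred_counts.items():
            possible[len(gram) - 1] += count
    precisions = [m / p if p > 0 else 0.0 for m, p in zip(matches, possible)]
    if min(precisions) > 0:
        geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_order)
    else:
        geo_mean = 0.0
    ratio = translation_length / reference_length
    # Brevity penalty only applies when the output is not longer than the reference.
    brevity_penalty = 1.0 if ratio > 1.0 else math.exp(1.0 - 1.0 / ratio)
    return {
        "bleu": geo_mean * brevity_penalty,
        "precisions": precisions,
        "brevity_penalty": brevity_penalty,
        "length_ratio": ratio,
        "translation_length": translation_length,
        "reference_length": reference_length,
    }


result = corpus_bleu(predictions, references)
print(result)
```

On the demo data this sketch yields the same values as the output shown in the comment above. On other inputs it may differ from bleuscore, since the tokenization here is only an assumption.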

## Benchmark
This simple benchmark uses the demo data shown in the quick start.
The benchmark source code is in [benchmark/simple](./benchmark/simple).

- Benchmark1: bleuscore
- Benchmark2: the Hugging Face evaluate BLEU algorithm, run as a **local** copy
- Benchmark3: sacrebleu
  - Note that sacrebleu gives a different result on the simple demo data; all the others give the same result
- Benchmark4: the Hugging Face evaluate BLEU algorithm, run through the **evaluate** package


`N` enlarges the predictions/references by simply duplicating the demo data shown above `N` times.

We can see that as `N` increases, bleuscore performs better.
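The duplication by `N` described above can be sketched as follows. This is not the code in `benchmark/simple`; the helpers `scale` and `time_scorer` and the stand-in `dummy_scorer` are hypothetical names used only so the sketch runs without extra packages. To time a real implementation, pass `bleuscore.compute` (or another scorer that accepts `predictions` and `references` keyword arguments) instead of `dummy_scorer`.

```python
import timeit

predictions = ["hello there general kenobi", "foo bar foobar"]
references = [
    ["hello there general kenobi", "hello there !"],
    ["foo bar foobar"],
]


def scale(predictions, references, n):
    """Repeat the demo corpus n times, mirroring the benchmark's N."""
    return predictions * n, references * n


def time_scorer(scorer, n, repeat=3):
    """Return the best wall-clock time (seconds) of `repeat` single runs."""
    preds, refs = scale(predictions, references, n)
    timer = timeit.Timer(lambda: scorer(predictions=preds, references=refs))
    return min(timer.repeat(repeat=repeat, number=1))


def dummy_scorer(predictions, references):
    # Hypothetical stand-in scorer: just counts whitespace tokens.
    return sum(len(p.split()) for p in predictions)


for n in (1, 100, 10_000):
    print(f"N={n}: {time_scorer(dummy_scorer, n):.6f}s")
```

Taking the minimum over a few repeats reduces noise from other processes; the absolute numbers depend on the machine and are not comparable to the plots below.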

### N=1

<div style="text-align: center;">
    <img width="80%" src="asset/benchmark/n_1.png">
</div>

### N=100
From here on we only test bleuscore and the evaluate **local** implementation,
because the other two methods are too slow to test quickly.

<div style="text-align: center;">
    <img width="80%" src="asset/benchmark/n_100.png">
</div>

### N=10,000

<div style="text-align: center;">
    <img width="80%" src="asset/benchmark/n_10000.png">
</div>

### N=100,000

<div style="text-align: center;">
    <img width="80%" src="asset/benchmark/n_100000.png">
</div>




            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "bleuscore",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "NLP, Tokenizer, BLEU, DeepLearning",
    "author": "Mathew Shen <datahonor@gmail.com>",
    "author_email": "Mathew Shen <datahonor@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/a0/7c/2fee15d42ce80013b881a1db1a5327a8c96231e83586fde9281f8517b2da/bleuscore-0.1.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A fast(not yet :) bleu score calculator",
    "version": "0.1.1",
    "project_urls": {
        "Homepage": "https://github.com/shenxiangzhuang/bleuscore",
        "Source": "https://github.com/shenxiangzhuang/bleuscore"
    },
    "split_keywords": [
        "nlp",
        " tokenizer",
        " bleu",
        " deeplearning"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9c56923fb6956586de3d7c0c908e30a2117e2be6f178b9f9ef0f560b21c1210b",
                "md5": "fa29c6f74321495a400cb7b0c1011b8b",
                "sha256": "795fc4db196f8b43eace9aa11959e10b46462797aeae8ec8dccfa186323633a2"
            },
            "downloads": -1,
            "filename": "bleuscore-0.1.1-cp38-abi3-macosx_10_12_x86_64.whl",
            "has_sig": false,
            "md5_digest": "fa29c6f74321495a400cb7b0c1011b8b",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.8",
            "size": 871150,
            "upload_time": "2024-04-26T07:54:53",
            "upload_time_iso_8601": "2024-04-26T07:54:53.471404Z",
            "url": "https://files.pythonhosted.org/packages/9c/56/923fb6956586de3d7c0c908e30a2117e2be6f178b9f9ef0f560b21c1210b/bleuscore-0.1.1-cp38-abi3-macosx_10_12_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3859ff60c24017e43b2f7c57e8c716a14cb2b43307fdef879dd36e46ca3944c4",
                "md5": "a1fe2de89aaab1392465a0f0698c7587",
                "sha256": "e5045144dc0d466935ca9a469a09cd73d210a762c93134f8ca08841e9bb0796f"
            },
            "downloads": -1,
            "filename": "bleuscore-0.1.1-cp38-abi3-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "a1fe2de89aaab1392465a0f0698c7587",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.8",
            "size": 826494,
            "upload_time": "2024-04-26T07:54:51",
            "upload_time_iso_8601": "2024-04-26T07:54:51.309608Z",
            "url": "https://files.pythonhosted.org/packages/38/59/ff60c24017e43b2f7c57e8c716a14cb2b43307fdef879dd36e46ca3944c4/bleuscore-0.1.1-cp38-abi3-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "2f32313529bd316ed433872f8802547f8d2569f37c4a0ca8bd811068c92b8ace",
                "md5": "7c20e28eae00fc14dd754f36a417020a",
                "sha256": "bef1ecc651d29161dc99e4b027f1f459f4c6d69973ea6407135b8573e93a9b52"
            },
            "downloads": -1,
            "filename": "bleuscore-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "has_sig": false,
            "md5_digest": "7c20e28eae00fc14dd754f36a417020a",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.8",
            "size": 1739206,
            "upload_time": "2024-04-26T07:54:49",
            "upload_time_iso_8601": "2024-04-26T07:54:49.165817Z",
            "url": "https://files.pythonhosted.org/packages/2f/32/313529bd316ed433872f8802547f8d2569f37c4a0ca8bd811068c92b8ace/bleuscore-0.1.1-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "0f22a3bdb0c5fad58956e7f5cfbdfddd840da7de5a54d2f118f4612d4058e4d0",
                "md5": "c8f3c3bb53dd1abed70ca5eb4dd079ae",
                "sha256": "8ee9f5ffae60326e96a113a212ab79aaf53caf957e6ea836a72fde3d2d820bd2"
            },
            "downloads": -1,
            "filename": "bleuscore-0.1.1-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl",
            "has_sig": false,
            "md5_digest": "c8f3c3bb53dd1abed70ca5eb4dd079ae",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.8",
            "size": 1731801,
            "upload_time": "2024-04-26T07:54:47",
            "upload_time_iso_8601": "2024-04-26T07:54:47.476676Z",
            "url": "https://files.pythonhosted.org/packages/0f/22/a3bdb0c5fad58956e7f5cfbdfddd840da7de5a54d2f118f4612d4058e4d0/bleuscore-0.1.1-cp38-abi3-manylinux_2_5_i686.manylinux1_i686.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "fc067bc59016fca380e8dd711733d3b99d6a14f5ae9a5948c102887fa0a736eb",
                "md5": "790e745c53957ac2d4d0fd91826ab711",
                "sha256": "a73e9f4b939db2c6795f56aff03dd3c1f47116111aba29349d06d1895e1b2451"
            },
            "downloads": -1,
            "filename": "bleuscore-0.1.1-cp38-abi3-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "790e745c53957ac2d4d0fd91826ab711",
            "packagetype": "bdist_wheel",
            "python_version": "cp38",
            "requires_python": ">=3.8",
            "size": 714046,
            "upload_time": "2024-04-26T07:54:58",
            "upload_time_iso_8601": "2024-04-26T07:54:58.058038Z",
            "url": "https://files.pythonhosted.org/packages/fc/06/7bc59016fca380e8dd711733d3b99d6a14f5ae9a5948c102887fa0a736eb/bleuscore-0.1.1-cp38-abi3-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a07c2fee15d42ce80013b881a1db1a5327a8c96231e83586fde9281f8517b2da",
                "md5": "acf6084dac8399d47698f9ec2b5e0e1c",
                "sha256": "95f15f44929e104bc66f7223c92080a00bad9b6bf129bd96f076630059204cc2"
            },
            "downloads": -1,
            "filename": "bleuscore-0.1.1.tar.gz",
            "has_sig": false,
            "md5_digest": "acf6084dac8399d47698f9ec2b5e0e1c",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 995235,
            "upload_time": "2024-04-26T07:54:55",
            "upload_time_iso_8601": "2024-04-26T07:54:55.370654Z",
            "url": "https://files.pythonhosted.org/packages/a0/7c/2fee15d42ce80013b881a1db1a5327a8c96231e83586fde9281f8517b2da/bleuscore-0.1.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-26 07:54:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "shenxiangzhuang",
    "github_project": "bleuscore",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "bleuscore"
}
        