lighteval

Name: lighteval
Version: 0.6.2
Summary: A lightweight and configurable evaluation package
Upload time: 2024-10-23 14:11:49
Requires Python: >=3.10
License: MIT License
Keywords: evaluation, nlp, llm
Homepage: https://github.com/huggingface/lighteval
<p align="center">
  <br/>
    <img alt="lighteval library logo" src="./assets/lighteval-doc.svg" width="376" height="59" style="max-width: 100%;">
  <br/>
</p>


<p align="center">
    <i>Your go-to toolkit for lightning-fast, flexible LLM evaluation, from Hugging Face's Leaderboard and Evals Team.</i>
</p>

<div align="center">

[![Tests](https://github.com/huggingface/lighteval/actions/workflows/tests.yaml/badge.svg?branch=main)](https://github.com/huggingface/lighteval/actions/workflows/tests.yaml?query=branch%3Amain)
[![Quality](https://github.com/huggingface/lighteval/actions/workflows/quality.yaml/badge.svg?branch=main)](https://github.com/huggingface/lighteval/actions/workflows/quality.yaml?query=branch%3Amain)
[![Python versions](https://img.shields.io/pypi/pyversions/lighteval)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/huggingface/lighteval/blob/main/LICENSE)
[![Version](https://img.shields.io/pypi/v/lighteval)](https://pypi.org/project/lighteval/)

</div>

---

**Documentation**: <a href="https://github.com/huggingface/lighteval/wiki" target="_blank">Lighteval's Wiki</a>

---

### Unlock the Power of LLM Evaluation with Lighteval 🚀

**Lighteval** is your all-in-one toolkit for evaluating LLMs across multiple
backends, whether that's
[transformers](https://github.com/huggingface/transformers),
[tgi](https://github.com/huggingface/text-generation-inference),
[vllm](https://github.com/vllm-project/vllm), or
[nanotron](https://github.com/huggingface/nanotron).
Dive deep into your model’s performance by saving and exploring detailed,
sample-by-sample results to debug and see how your models stack up.

Customization at your fingertips: browse all our existing [tasks](https://github.com/huggingface/lighteval/wiki/Available-Tasks) and [metrics](https://github.com/huggingface/lighteval/wiki/Metric-List), or effortlessly [create your own](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task) tailored to your needs.

Seamlessly experiment, benchmark, and store your results on the Hugging Face
Hub, S3, or locally.


## 🔑 Key Features

- **Speed**: [Use vLLM as backend for fast evals](https://github.com/huggingface/lighteval/wiki/Use-VLLM-as-backend).
- **Completeness**: [Use the accelerate backend to launch any model hosted on Hugging Face](https://github.com/huggingface/lighteval/wiki/Quicktour#accelerate).
- **Seamless Storage**: [Save results in S3 or Hugging Face Datasets](https://github.com/huggingface/lighteval/wiki/Saving-and-reading-results) (see the sketch after this list).
- **Python API**: [Simple integration with the Python API](https://github.com/huggingface/lighteval/wiki/Using-the-Python-API).
- **Custom Tasks**: [Easily add custom tasks](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task).
- **Versatility**: Tons of [metrics](https://github.com/huggingface/lighteval/wiki/Metric-List) and [tasks](https://github.com/huggingface/lighteval/wiki/Available-Tasks) ready to go.
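
As a quick illustration of the storage options, here is a variant of the Quickstart command shown further below; passing an S3 URI straight to `--output_dir` is an assumption based on the Saving-and-reading-results wiki page, not a verified behavior of this release:

```shell
# Sketch: write results to an S3 bucket instead of a local directory.
# Assumes output_dir accepts an s3:// URI (per the wiki page above)
# and that AWS credentials are available in the environment.
lighteval accelerate \
    --model_args "pretrained=gpt2" \
    --tasks "leaderboard|truthfulqa:mc|0|0" \
    --override_batch_size 1 \
    --output_dir="s3://my-eval-bucket/lighteval-results/"
```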


## ⚡️ Installation

```bash
pip install "lighteval[accelerate]"
```

Lighteval supports many optional extras at install time; see [here](https://github.com/huggingface/lighteval/wiki/Installation) for the complete list.
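
For example, several backends can be pulled in at once. The `vllm` and `tgi` extra names in this sketch are assumptions drawn from the backends mentioned above; verify the exact spellings on the installation page:

```bash
# Sketch: install lighteval with several optional backends in one go.
# "accelerate" is documented above; "vllm" and "tgi" are assumed extra
# names, so check them against the installation wiki page.
pip install "lighteval[accelerate,vllm,tgi]"
```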

If you want to push results to the Hugging Face Hub, log in with your Hugging
Face access token:

```shell
huggingface-cli login
```
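
In non-interactive environments (CI jobs, batch nodes), you can instead expose the token through the `HF_TOKEN` environment variable, which `huggingface_hub` reads automatically. A minimal sketch:

```shell
# Sketch: provide the token without an interactive login.
# HF_TOKEN is picked up by huggingface_hub; replace the placeholder
# with a token created at https://huggingface.co/settings/tokens.
export HF_TOKEN=<your-access-token>
```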

## 🚀 Quickstart

Lighteval offers two main entry points for model evaluation:


* `lighteval accelerate`: evaluate models on CPU or one or more GPUs using [🤗
  Accelerate](https://github.com/huggingface/accelerate).
* `lighteval nanotron`: evaluate models in distributed settings using [⚡️
  Nanotron](https://github.com/huggingface/nanotron).

Here’s a quick command to evaluate using the Accelerate backend:

```shell
lighteval accelerate \
    --model_args "pretrained=gpt2" \
    --tasks "leaderboard|truthfulqa:mc|0|0" \
    --override_batch_size 1 \
    --output_dir="./evals/"
```
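
The `--tasks` value follows lighteval's `{suite}|{task}|{num_few_shot}|{truncate_few_shots}` pattern: the suite, the task name, the number of few-shot examples, and a `0`/`1` flag telling lighteval whether it may reduce the few-shot count when the prompt would overflow the context window. Multiple tasks can be comma-separated in a single run; the task names below are illustrative, so check the [task list](https://github.com/huggingface/lighteval/wiki/Available-Tasks) for exact identifiers:

```shell
# Sketch: run zero-shot TruthfulQA (MC) and 5-shot GSM8K in one pass.
# The trailing "1" on gsm8k allows lighteval to shrink the few-shot
# count if the prompt would not fit in the model's context window.
lighteval accelerate \
    --model_args "pretrained=gpt2" \
    --tasks "leaderboard|truthfulqa:mc|0|0,leaderboard|gsm8k|5|1" \
    --override_batch_size 1 \
    --output_dir="./evals/"
```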

## 🙏 Acknowledgements

Lighteval started as an extension of the fantastic [Eleuther AI
Harness](https://github.com/EleutherAI/lm-evaluation-harness) (which powers the
[Open LLM
Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard))
and draws inspiration from the amazing
[HELM](https://crfm.stanford.edu/helm/latest/) framework.

While evolving Lighteval into its own standalone tool, we are grateful to the
Harness and HELM teams for their pioneering work on LLM evaluations.

## 🌟 Contributions Welcome 💙💚💛💜🧡

Got ideas? Found a bug? Want to add a
[task](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task) or
[metric](https://github.com/huggingface/lighteval/wiki/Adding-a-New-Metric)?
Contributions are warmly
welcomed!

## 📜 Citation

```bibtex
@misc{lighteval,
  author = {Fourrier, Clémentine and Habib, Nathan and Wolf, Thomas and Tunstall, Lewis},
  title = {LightEval: A lightweight framework for LLM evaluation},
  year = {2023},
  version = {0.5.0},
  url = {https://github.com/huggingface/lighteval}
}
```

            
