lighteval


Name: lighteval
Version: 0.7.0
Summary: A lightweight and configurable evaluation package
Upload time: 2025-01-03 15:44:54
Requires Python: >=3.10
License: MIT License
Keywords: evaluation, nlp, llm
<p align="center">
  <br/>
    <img alt="lighteval library logo" src="./assets/lighteval-doc.svg" width="376" height="59" style="max-width: 100%;">
  <br/>
</p>


<p align="center">
    <i>Your go-to toolkit for lightning-fast, flexible LLM evaluation, from Hugging Face's Leaderboard and Evals Team.</i>
</p>

<div align="center">

[![Tests](https://github.com/huggingface/lighteval/actions/workflows/tests.yaml/badge.svg?branch=main)](https://github.com/huggingface/lighteval/actions/workflows/tests.yaml?query=branch%3Amain)
[![Quality](https://github.com/huggingface/lighteval/actions/workflows/quality.yaml/badge.svg?branch=main)](https://github.com/huggingface/lighteval/actions/workflows/quality.yaml?query=branch%3Amain)
[![Python versions](https://img.shields.io/pypi/pyversions/lighteval)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/huggingface/lighteval/blob/main/LICENSE)
[![Version](https://img.shields.io/pypi/v/lighteval)](https://pypi.org/project/lighteval/)

</div>

---

**Documentation**: <a href="https://huggingface.co/docs/lighteval/index" target="_blank">Lighteval's Doc</a>

---

### Unlock the Power of LLM Evaluation with Lighteval 🚀

**Lighteval** is your all-in-one toolkit for evaluating LLMs across multiple
backends—whether it's
[transformers](https://github.com/huggingface/transformers),
[tgi](https://github.com/huggingface/text-generation-inference),
[vllm](https://github.com/vllm-project/vllm), or
[nanotron](https://github.com/huggingface/nanotron)—with
ease. Dive deep into your model’s performance by saving and exploring detailed,
sample-by-sample results to debug and see how your models stack up.

Customization at your fingertips: browse all our existing [tasks](https://github.com/huggingface/lighteval/wiki/Available-Tasks) and [metrics](https://github.com/huggingface/lighteval/wiki/Metric-List), or effortlessly [create your own](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task) tailored to your needs.

Seamlessly experiment, benchmark, and store your results on the Hugging Face
Hub, S3, or locally.


## 🔑 Key Features

- **Speed**: [Use vllm as backend for fast evals](https://github.com/huggingface/lighteval/wiki/Use-VLLM-as-backend).
- **Completeness**: [Use the accelerate backend to launch any models hosted on Hugging Face](https://github.com/huggingface/lighteval/wiki/Quicktour#accelerate).
- **Seamless Storage**: [Save results in S3 or Hugging Face Datasets](https://github.com/huggingface/lighteval/wiki/Saving-and-reading-results).
- **Python API**: [Simple integration with the Python API](https://github.com/huggingface/lighteval/wiki/Using-the-Python-API).
- **Custom Tasks**: [Easily add custom tasks](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task).
- **Versatility**: Tons of [metrics](https://github.com/huggingface/lighteval/wiki/Metric-List) and [tasks](https://github.com/huggingface/lighteval/wiki/Available-Tasks) ready to go.


## ⚡️ Installation

```bash
pip install lighteval
```

Lighteval supports many optional extras at install time; see [here](https://github.com/huggingface/lighteval/wiki/Installation) for the complete list.
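For instance, a backend's optional dependencies can be pulled in through the matching extra. The extra names below are assumptions based on the installation docs; check the linked list for your version:

```bash
# Install lighteval together with a backend's optional dependencies
# (extra names assumed from the installation docs; adjust as needed)
pip install "lighteval[accelerate]"
pip install "lighteval[vllm]"
```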

If you want to push results to the Hugging Face Hub, log in with your access
token:

```shell
huggingface-cli login
```
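Alternatively, in non-interactive environments you can export the standard Hub token variable instead (a `huggingface_hub` convention, not lighteval-specific):

```shell
# Non-interactive alternative to `huggingface-cli login`
export HF_TOKEN=<your-access-token>
```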

## 🚀 Quickstart

Lighteval offers multiple entry points for model evaluation:

- `lighteval accelerate`: evaluate models on CPU or one or more GPUs using [🤗
  Accelerate](https://github.com/huggingface/accelerate)
- `lighteval nanotron`: evaluate models in distributed settings using [⚡️
  Nanotron](https://github.com/huggingface/nanotron)
- `lighteval vllm`: evaluate models on one or more GPUs using [🚀
  vLLM](https://github.com/vllm-project/vllm)
- `lighteval endpoint`
    - `inference-endpoint`: evaluate models on one or more GPUs using [🔗
  Inference Endpoints](https://huggingface.co/inference-endpoints/dedicated)
    - `tgi`: evaluate models served by a [🔗 Text Generation Inference](https://huggingface.co/docs/text-generation-inference/en/index) server
    - `openai`: evaluate models through the [🔗 OpenAI API](https://platform.openai.com/)

Here’s a quick command to evaluate using the Accelerate backend:

```shell
lighteval accelerate \
    "pretrained=gpt2" \
    "leaderboard|truthfulqa:mc|0|0"
```
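The task argument follows the pattern `suite|task|num_few_shot|truncate_few_shots`, where the final field is `0` or `1` and controls whether lighteval may reduce the number of few-shot examples when the prompt would exceed the model's context size. The other backends follow the same invocation pattern; as a sketch (assuming the `vllm` extra is installed), here is the same task on the vLLM backend:

```shell
lighteval vllm \
    "pretrained=gpt2" \
    "leaderboard|truthfulqa:mc|0|0"
```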

## 🙏 Acknowledgements

Lighteval started as an extension of the fantastic [Eleuther AI
Harness](https://github.com/EleutherAI/lm-evaluation-harness) (which powers the
[Open LLM
Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard))
and draws inspiration from the amazing
[HELM](https://crfm.stanford.edu/helm/latest/) framework.

While evolving Lighteval into its own standalone tool, we are grateful to the
Harness and HELM teams for their pioneering work on LLM evaluations.

## 🌟 Contributions Welcome 💙💚💛💜🧡

Got ideas? Found a bug? Want to add a
[task](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task) or
[metric](https://github.com/huggingface/lighteval/wiki/Adding-a-New-Metric)?
Contributions are warmly welcomed!

If you're adding a new feature, please open an issue first.

If you open a PR, don't forget to run the style checks!

```bash
pip install -e ".[dev]"  # quotes keep the extras spec safe from shell globbing
pre-commit install
pre-commit run --all-files
```
## 📜 Citation

```bibtex
@misc{lighteval,
  author = {Fourrier, Clémentine and Habib, Nathan and Wolf, Thomas and Tunstall, Lewis},
  title = {LightEval: A lightweight framework for LLM evaluation},
  year = {2023},
  version = {0.5.0},
  url = {https://github.com/huggingface/lighteval}
}
```