| Field | Value |
|---|---|
| Name | lighteval |
| Version | 0.6.2 |
| Summary | A lightweight and configurable evaluation package |
| Upload time | 2024-10-23 14:11:49 |
| Home page | None |
| Author | None |
| Maintainer | None |
| Docs URL | None |
| Requires Python | >=3.10 |
| License | MIT License |
| Keywords | evaluation, nlp, llm |
| Requirements | No requirements were recorded. |
<p align="center">
<br/>
<img alt="lighteval library logo" src="./assets/lighteval-doc.svg" width="376" height="59" style="max-width: 100%;">
<br/>
</p>
<p align="center">
<i>Your go-to toolkit for lightning-fast, flexible LLM evaluation, from Hugging Face's Leaderboard and Evals Team.</i>
</p>
<div align="center">
[![Tests](https://github.com/huggingface/lighteval/actions/workflows/tests.yaml/badge.svg?branch=main)](https://github.com/huggingface/lighteval/actions/workflows/tests.yaml?query=branch%3Amain)
[![Quality](https://github.com/huggingface/lighteval/actions/workflows/quality.yaml/badge.svg?branch=main)](https://github.com/huggingface/lighteval/actions/workflows/quality.yaml?query=branch%3Amain)
[![Python versions](https://img.shields.io/pypi/pyversions/lighteval)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](https://github.com/huggingface/lighteval/blob/main/LICENSE)
[![Version](https://img.shields.io/pypi/v/lighteval)](https://pypi.org/project/lighteval/)
</div>
---
**Documentation**: <a href="https://github.com/huggingface/lighteval/wiki" target="_blank">Lighteval's Wiki</a>
---
### Unlock the Power of LLM Evaluation with Lighteval 🚀
**Lighteval** is your all-in-one toolkit for evaluating LLMs across multiple
backends—whether it's
[transformers](https://github.com/huggingface/transformers),
[tgi](https://github.com/huggingface/text-generation-inference),
[vllm](https://github.com/vllm-project/vllm), or
[nanotron](https://github.com/huggingface/nanotron)—with
ease. Dive deep into your model’s performance by saving and exploring detailed,
sample-by-sample results to debug and see how your models stack up.
Customization is at your fingertips: browse all our existing [tasks](https://github.com/huggingface/lighteval/wiki/Available-Tasks) and [metrics](https://github.com/huggingface/lighteval/wiki/Metric-List), or effortlessly [create your own](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task), tailored to your needs.
Seamlessly experiment, benchmark, and store your results on the Hugging Face
Hub, S3, or locally.
## 🔑 Key Features
- **Speed**: [Use vLLM as a backend for fast evals](https://github.com/huggingface/lighteval/wiki/Use-VLLM-as-backend).
- **Completeness**: [Use the accelerate backend to launch any model hosted on Hugging Face](https://github.com/huggingface/lighteval/wiki/Quicktour#accelerate).
- **Seamless Storage**: [Save results in S3 or Hugging Face Datasets](https://github.com/huggingface/lighteval/wiki/Saving-and-reading-results).
- **Python API**: [Simple integration with the Python API](https://github.com/huggingface/lighteval/wiki/Using-the-Python-API).
- **Custom Tasks**: [Easily add custom tasks](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task).
- **Versatility**: Tons of [metrics](https://github.com/huggingface/lighteval/wiki/Metric-List) and [tasks](https://github.com/huggingface/lighteval/wiki/Available-Tasks) ready to go.
## ⚡️ Installation
```bash
pip install lighteval[accelerate]
```
Lighteval supports many optional extras at install time; see [here](https://github.com/huggingface/lighteval/wiki/Installation) for the complete list.
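Extras can be combined in a single install. A minimal sketch (the `vllm` extra name is an assumption here; check the linked list for the exact names):

```bash
# Combine extras in one install; the quotes keep zsh from globbing the brackets.
# The "vllm" extra name is an assumption -- see the installation page for the real list.
pip install "lighteval[accelerate,vllm]"
```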
If you want to push results to the Hugging Face Hub, log in with your access token:
```shell
huggingface-cli login
```
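If an interactive login is impractical (for example in CI), the standard Hugging Face environment variable works as well, since `huggingface_hub` reads it automatically:

```shell
# Non-interactive alternative: huggingface_hub picks up HF_TOKEN from the environment.
export HF_TOKEN=<your_access_token>
```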
## 🚀 Quickstart
Lighteval offers two main entry points for model evaluation:
* `lighteval accelerate`: evaluate models on CPU or one or more GPUs using [🤗
Accelerate](https://github.com/huggingface/accelerate).
* `lighteval nanotron`: evaluate models in distributed settings using [⚡️
Nanotron](https://github.com/huggingface/nanotron).
Here’s a quick command to evaluate using the Accelerate backend:
```shell
lighteval accelerate \
--model_args "pretrained=gpt2" \
--tasks "leaderboard|truthfulqa:mc|0|0" \
--override_batch_size 1 \
--output_dir="./evals/"
```
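The `--tasks` string follows the pattern `suite|task|num_fewshot|truncation_flag`, where the final 0/1 tells Lighteval whether it may automatically reduce the number of few-shot examples when the prompt gets too long. To spread the same run across several GPUs, the command can be wrapped in an Accelerate launch; a sketch, assuming the module-style invocation described in the Quicktour:

```shell
# Sketch: data-parallel evaluation on 8 GPUs via Accelerate
# (module-style invocation assumed from the Quicktour page).
accelerate launch --multi_gpu --num_processes=8 -m \
    lighteval accelerate \
    --model_args "pretrained=gpt2" \
    --tasks "leaderboard|truthfulqa:mc|0|0" \
    --override_batch_size 1 \
    --output_dir="./evals/"
```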
## 🙏 Acknowledgements
Lighteval started as an extension of the fantastic [Eleuther AI
Harness](https://github.com/EleutherAI/lm-evaluation-harness) (which powers the
[Open LLM
Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard))
and draws inspiration from the amazing
[HELM](https://crfm.stanford.edu/helm/latest/) framework.
While evolving Lighteval into its own standalone tool, we are grateful to the
Harness and HELM teams for their pioneering work on LLM evaluations.
## 🌟 Contributions Welcome 💙💚💛💜🧡
Got ideas? Found a bug? Want to add a
[task](https://github.com/huggingface/lighteval/wiki/Adding-a-Custom-Task) or
[metric](https://github.com/huggingface/lighteval/wiki/Adding-a-New-Metric)?
Contributions are warmly
welcomed!
## 📜 Citation
```bibtex
@misc{lighteval,
  author = {Fourrier, Clémentine and Habib, Nathan and Wolf, Thomas and Tunstall, Lewis},
  title = {LightEval: A lightweight framework for LLM evaluation},
  year = {2023},
  version = {0.5.0},
  url = {https://github.com/huggingface/lighteval}
}
```