| Field | Value |
| --- | --- |
| Name | fincompass |
| Version | 0.1.2 |
| home_page | |
| Summary | A comprehensive toolkit for large model evaluation |
| upload_time | 2023-08-21 10:09:58 |
| maintainer | |
| docs_url | None |
| author | |
| requires_python | >=3.8.0 |
| license | |
| keywords | ai, nlp, in-context learning |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
<div align="center">
<img src="docs/zh_cn/_static/image/logo.svg" width="500px"/>
<br />
<br />
[🌐Website](https://opencompass.org.cn/) |
[📘Documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html) |
[🛠️Installation](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html) |
[🤔Reporting Issues](https://github.com/InternLM/opencompass/issues/new/choose)
[English](/README.md) | 简体中文
</div>
<p align="center">
👋 Join us on <a href="https://discord.gg/xa29JuW87d" target="_blank">Discord</a> and <a href="https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce" target="_blank">WeChat</a>
</p>
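Welcome to OpenCompass!
Just as a compass guides us on a journey, we hope OpenCompass can help you navigate the fog surrounding the evaluation of large language models. OpenCompass provides a rich set of algorithms and features, and we hope it helps the community evaluate NLP model performance fairly, comprehensively, and more conveniently.
## Introduction
OpenCompass is a one-stop platform for large model evaluation. Its main features are:
- **Open-source and reproducible**: provides a fair, open, and reproducible evaluation scheme for large models.
- **Comprehensive capability dimensions**: five capability dimensions, covering 50+ datasets and about 300,000 questions, to assess model capabilities thoroughly.
- **Extensive model support**: 20+ HuggingFace and API models are already supported.
- **Efficient distributed evaluation**: a single command handles task partitioning and distributed evaluation, so a full evaluation of a model at the 100-billion-parameter scale can finish within hours.
- **Diverse evaluation paradigms**: supports zero-shot, few-shot, and chain-of-thought evaluation, combined with standard or dialogue-style prompt templates, to elicit the best performance from each model.
- **Flexible extension**: want to add a new model or dataset, customize a more advanced task partitioning strategy, or even plug in a new cluster management system? Everything in OpenCompass is designed to be easily extended.
## Leaderboard
We will progressively publish detailed performance leaderboards for open-source and API models; see the [OpenCompass Leaderboard](https://opencompass.org.cn/rank). To have your model evaluated, send the model repository URL or a standard API interface to `opencompass@pjlab.org.cn`.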
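The zero-shot, few-shot, and chain-of-thought paradigms mentioned above differ mainly in how the prompt is assembled. The following is a minimal sketch under stated assumptions: `build_prompt` and its `Q:`/`A:` layout are hypothetical illustrations and are not part of the OpenCompass API.

```python
from typing import Sequence, Tuple


def build_prompt(question: str,
                 examples: Sequence[Tuple[str, str]] = (),
                 chain_of_thought: bool = False) -> str:
    """Assemble a plain-text prompt (hypothetical helper, not an OpenCompass API)."""
    parts = []
    # Few-shot: prepend solved examples. Zero-shot: `examples` stays empty.
    for ex_question, ex_answer in examples:
        parts.append(f"Q: {ex_question}\nA: {ex_answer}")
    # Chain-of-thought: nudge the model to reason before giving its answer.
    answer_prefix = "A: Let's think step by step." if chain_of_thought else "A:"
    parts.append(f"Q: {question}\n{answer_prefix}")
    return "\n\n".join(parts)


zero_shot = build_prompt("What is 2 + 3?")
few_shot = build_prompt("What is 2 + 3?", examples=[("What is 1 + 1?", "2")])
cot = build_prompt("What is 2 + 3?", chain_of_thought=True)
```

The same question yields three different prompts, so the paradigm can be varied without touching the dataset itself.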
## Dataset Support
<table align="center">
<tbody>
<tr align="center" valign="bottom">
<td>
<b>Language</b>
</td>
<td>
<b>Knowledge</b>
</td>
<td>
<b>Reasoning</b>
</td>
<td>
<b>Examination</b>
</td>
<td>
<b>Understanding</b>
</td>
</tr>
<tr valign="top">
<td>
<details open>
<summary><b>Word Definition</b></summary>
- WiC
- SummEdits
</details>
<details open>
<summary><b>Idioms</b></summary>
- CHID
</details>
<details open>
<summary><b>Semantic Similarity</b></summary>
- AFQMC
- BUSTM
</details>
<details open>
<summary><b>Coreference Resolution</b></summary>
- CLUEWSC
- WSC
- WinoGrande
</details>
<details open>
<summary><b>Translation</b></summary>
- Flores
</details>
</td>
<td>
<details open>
<summary><b>Knowledge Question Answering</b></summary>
- BoolQ
- CommonSenseQA
- NaturalQuestion
- TriviaQA
</details>
<details open>
<summary><b>Multilingual Question Answering</b></summary>
- TyDi-QA
</details>
</td>
<td>
<details open>
<summary><b>Textual Entailment</b></summary>
- CMNLI
- OCNLI
- OCNLI_FC
- AX-b
- AX-g
- CB
- RTE
</details>
<details open>
<summary><b>Commonsense Reasoning</b></summary>
- StoryCloze
- StoryCloze-CN (coming soon)
- COPA
- ReCoRD
- HellaSwag
- PIQA
- SIQA
</details>
<details open>
<summary><b>Mathematical Reasoning</b></summary>
- MATH
- GSM8K
</details>
<details open>
<summary><b>Theorem Application</b></summary>
- TheoremQA
</details>
<details open>
<summary><b>Code</b></summary>
- HumanEval
- MBPP
</details>
<details open>
<summary><b>Comprehensive Reasoning</b></summary>
- BBH
</details>
</td>
<td>
<details open>
<summary><b>Junior High, High School, University, and Professional Examinations</b></summary>
- GAOKAO-2023
- CEval
- AGIEval
- MMLU
- GAOKAO-Bench
- MMLU-CN (coming soon)
- ARC
</details>
</td>
<td>
<details open>
<summary><b>Reading Comprehension</b></summary>
- C3
- CMRC
- DRCD
- MultiRC
- RACE
</details>
<details open>
<summary><b>Content Summary</b></summary>
- CSL
- LCSTS
- XSum
</details>
<details open>
<summary><b>Content Analysis</b></summary>
- EPRSTMT
- LAMBADA
- TNEWS
</details>
</td>
</tr>
</tbody>
</table>
## Model Support
<table align="center">
<tbody>
<tr align="center" valign="bottom">
<td>
<b>Open-source Models</b>
</td>
<td>
<b>API Models</b>
</td>
<!-- <td>
<b>Custom Models</b>
</td> -->
</tr>
<tr valign="top">
<td>
- LLaMA
- Vicuna
- Alpaca
- Baichuan
- WizardLM
- ChatGLM-6B
- ChatGLM2-6B
- MPT
- Falcon
- TigerBot
- MOSS
- ...
</td>
<td>
- OpenAI
- Claude (coming soon)
- PaLM (coming soon)
- ...
</td>
<!-- <td>
- GLM
- ...
</td> -->
</tr>
</tbody>
</table>
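## Installation
The quick installation steps are shown below. Some third-party features may need additional steps to work properly; for details, see the [installation guide](https://opencompass.readthedocs.io/zh_cn/latest/get_started.html).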
```bash
# Create and activate a conda environment with PyTorch
conda create --name opencompass python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y
conda activate opencompass
# Clone the repository and install it in editable mode
git clone https://github.com/InternLM/opencompass opencompass
cd opencompass
pip install -e .
# Download the datasets into data/
wget https://github.com/InternLM/opencompass/releases/download/0.1.0/OpenCompassData.zip
unzip OpenCompassData.zip
```
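After running the installation steps, a quick sanity check can confirm the environment is in the expected state. This is a minimal sketch, not an official tool: `check_environment` is a hypothetical helper, the import name `opencompass` is assumed from the repository name, and `data` is assumed to be the directory created by unzipping the dataset archive in the repository root.

```python
import importlib.util
import sys
from pathlib import Path


def check_environment(data_dir: str = "data", min_python=(3, 8)) -> dict:
    """Report whether the basic prerequisites look satisfied (hypothetical helper)."""
    return {
        # The package metadata declares requires_python >= 3.8.0.
        "python_ok": sys.version_info >= min_python,
        # Assumed import name; find_spec returns None if the package is absent.
        "package_found": importlib.util.find_spec("opencompass") is not None,
        # Assumed dataset location after unzipping the archive.
        "data_dir_found": Path(data_dir).is_dir(),
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Run it from the repository root so that the relative `data` path resolves as intended.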
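## Evaluation
Please read the [Quick Start](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html#id2) to learn how to run an evaluation task.
## Acknowledgements
Some of the code in this project is adapted from [OpenICL](https://github.com/Shark-NLP/OpenICL).
## Citation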
```bibtex
@misc{2023opencompass,
title={OpenCompass: A Universal Evaluation Platform for Foundation Models},
author={OpenCompass Contributors},
howpublished = {\url{https://github.com/InternLM/OpenCompass}},
year={2023}
}
```
Raw data
{
"_id": null,
"home_page": "",
"name": "fincompass",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.8.0",
"maintainer_email": "",
"keywords": "AI,NLP,in-context learning",
"author": "",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/0f/a6/08fc5be89436584dbd0e22438c661d7d127f93992d975b8fa010b8e199fb/fincompass-0.1.2.tar.gz",
"platform": null,
"description": "<div align=\"center\">\n <img src=\"docs/zh_cn/_static/image/logo.svg\" width=\"500px\"/>\n <br />\n <br />\n\n[](https://opencompass.readthedocs.io/zh_CN)\n[](https://github.com/InternLM/opencompass/blob/main/LICENSE)\n\n<!-- [](https://pypi.org/project/opencompass/) -->\n\n[\ud83c\udf10Website](https://opencompass.org.cn/) |\n[\ud83d\udcd8Documentation](https://opencompass.readthedocs.io/zh_CN/latest/index.html) |\n[\ud83d\udee0\ufe0fInstallation](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html) |\n[\ud83e\udd14Reporting Issues](https://github.com/InternLM/opencompass/issues/new/choose)\n\n[English](/README.md) | \u7b80\u4f53\u4e2d\u6587\n\n</div>\n\n<p align=\"center\">\n \ud83d\udc4b \u52a0\u5165\u6211\u4eec\u7684 <a href=\"https://discord.gg/xa29JuW87d\" target=\"_blank\">Discord</a> \u548c <a href=\"https://github.com/InternLM/InternLM/assets/25839884/a6aad896-7232-4220-ac84-9e070c2633ce\" target=\"_blank\">\u5fae\u4fe1\u793e\u533a</a>\n</p>\n\n\u6b22\u8fce\u6765\u5230OpenCompass\uff01\n\n\u5c31\u50cf\u6307\u5357\u9488\u5728\u6211\u4eec\u7684\u65c5\u7a0b\u4e2d\u4e3a\u6211\u4eec\u5bfc\u822a\u4e00\u6837\uff0c\u6211\u4eec\u5e0c\u671bOpenCompass\u80fd\u591f\u5e2e\u52a9\u4f60\u7a7f\u8d8a\u8bc4\u4f30\u5927\u578b\u8bed\u8a00\u6a21\u578b\u7684\u91cd\u91cd\u8ff7\u96fe\u3002OpenCompass\u63d0\u4f9b\u4e30\u5bcc\u7684\u7b97\u6cd5\u548c\u529f\u80fd\u652f\u6301\uff0c\u671f\u5f85OpenCompass\u80fd\u591f\u5e2e\u52a9\u793e\u533a\u66f4\u4fbf\u6377\u5730\u5bf9NLP\u6a21\u578b\u7684\u6027\u80fd\u8fdb\u884c\u516c\u5e73\u5168\u9762\u7684\u8bc4\u4f30\u3002\n\n## \u4ecb\u7ecd\n\nOpenCompass \u662f\u9762\u5411\u5927\u6a21\u578b\u8bc4\u6d4b\u7684\u4e00\u7ad9\u5f0f\u5e73\u53f0\u3002\u5176\u4e3b\u8981\u7279\u70b9\u5982\u4e0b\uff1a\n\n- **\u5f00\u6e90\u53ef\u590d\u73b0**\uff1a\u63d0\u4f9b\u516c\u5e73\u3001\u516c\u5f00\u3001\u53ef\u590d\u73b0\u7684\u5927\u6a21\u578b\u8bc4\u6d4b\u65b9\u6848\n\n- 
**\u5168\u9762\u7684\u80fd\u529b\u7ef4\u5ea6**\uff1a\u4e94\u5927\u7ef4\u5ea6\u8bbe\u8ba1\uff0c\u63d0\u4f9b 50+ \u4e2a\u6570\u636e\u96c6\u7ea6 30 \u4e07\u9898\u7684\u7684\u6a21\u578b\u8bc4\u6d4b\u65b9\u6848\uff0c\u5168\u9762\u8bc4\u4f30\u6a21\u578b\u80fd\u529b\n\n- **\u4e30\u5bcc\u7684\u6a21\u578b\u652f\u6301**\uff1a\u5df2\u652f\u6301 20+ HuggingFace \u53ca API \u6a21\u578b\n\n- **\u5206\u5e03\u5f0f\u9ad8\u6548\u8bc4\u6d4b**\uff1a\u4e00\u884c\u547d\u4ee4\u5b9e\u73b0\u4efb\u52a1\u5206\u5272\u548c\u5206\u5e03\u5f0f\u8bc4\u6d4b\uff0c\u6570\u5c0f\u65f6\u5373\u53ef\u5b8c\u6210\u5343\u4ebf\u6a21\u578b\u5168\u91cf\u8bc4\u6d4b\n\n- **\u591a\u6837\u5316\u8bc4\u6d4b\u8303\u5f0f**\uff1a\u652f\u6301\u96f6\u6837\u672c\u3001\u5c0f\u6837\u672c\u53ca\u601d\u7ef4\u94fe\u8bc4\u6d4b\uff0c\u7ed3\u5408\u6807\u51c6\u578b\u6216\u5bf9\u8bdd\u578b\u63d0\u793a\u8bcd\u6a21\u677f\uff0c\u8f7b\u677e\u6fc0\u53d1\u5404\u79cd\u6a21\u578b\u6700\u5927\u6027\u80fd\n\n- **\u7075\u6d3b\u5316\u62d3\u5c55**\uff1a\u60f3\u589e\u52a0\u65b0\u6a21\u578b\u6216\u6570\u636e\u96c6\uff1f\u60f3\u8981\u81ea\u5b9a\u4e49\u66f4\u9ad8\u7ea7\u7684\u4efb\u52a1\u5206\u5272\u7b56\u7565\uff0c\u751a\u81f3\u63a5\u5165\u65b0\u7684\u96c6\u7fa4\u7ba1\u7406\u7cfb\u7edf\uff1fOpenCompass \u7684\u4e00\u5207\u5747\u53ef\u8f7b\u677e\u6269\u5c55\uff01\n\n## \u6027\u80fd\u699c\u5355\n\n\u6211\u4eec\u5c06\u9646\u7eed\u63d0\u4f9b\u5f00\u6e90\u6a21\u578b\u548cAPI\u6a21\u578b\u7684\u5177\u4f53\u6027\u80fd\u699c\u5355\uff0c\u8bf7\u89c1 [OpenCompass Leaderbaord](https://opencompass.org.cn/rank) \u3002\u5982\u9700\u52a0\u5165\u8bc4\u6d4b\uff0c\u8bf7\u63d0\u4f9b\u6a21\u578b\u4ed3\u5e93\u5730\u5740\u6216\u6807\u51c6\u7684 API \u63a5\u53e3\u81f3\u90ae\u7bb1 `opencompass@pjlab.org.cn`.\n\n[](https://opencompass.org.cn/rank)\n\n## \u6570\u636e\u96c6\u652f\u6301\n\n<table align=\"center\">\n <tbody>\n <tr align=\"center\" valign=\"bottom\">\n <td>\n <b>\u8bed\u8a00</b>\n </td>\n <td>\n <b>\u77e5\u8bc6</b>\n </td>\n <td>\n <b>\u63a8\u7406</b>\n </td>\n 
<td>\n <b>\u5b66\u79d1</b>\n </td>\n <td>\n <b>\u7406\u89e3</b>\n </td>\n </tr>\n <tr valign=\"top\">\n <td>\n<details open>\n<summary><b>\u5b57\u8bcd\u91ca\u4e49</b></summary>\n\n- WiC\n- SummEdits\n\n</details>\n\n<details open>\n<summary><b>\u6210\u8bed\u4e60\u8bed</b></summary>\n\n- CHID\n\n</details>\n\n<details open>\n<summary><b>\u8bed\u4e49\u76f8\u4f3c\u5ea6</b></summary>\n\n- AFQMC\n- BUSTM\n\n</details>\n\n<details open>\n<summary><b>\u6307\u4ee3\u6d88\u89e3</b></summary>\n\n- CLUEWSC\n- WSC\n- WinoGrande\n\n</details>\n\n<details open>\n<summary><b>\u7ffb\u8bd1</b></summary>\n\n- Flores\n\n</details>\n </td>\n <td>\n<details open>\n<summary><b>\u77e5\u8bc6\u95ee\u7b54</b></summary>\n\n- BoolQ\n- CommonSenseQA\n- NaturalQuestion\n- TrivialQA\n\n</details>\n\n<details open>\n<summary><b>\u591a\u8bed\u79cd\u95ee\u7b54</b></summary>\n\n- TyDi-QA\n\n</details>\n </td>\n <td>\n<details open>\n<summary><b>\u6587\u672c\u8574\u542b</b></summary>\n\n- CMNLI\n- OCNLI\n- OCNLI_FC\n- AX-b\n- AX-g\n- CB\n- RTE\n\n</details>\n\n<details open>\n<summary><b>\u5e38\u8bc6\u63a8\u7406</b></summary>\n\n- StoryCloze\n- StoryCloze-CN\uff08\u5373\u5c06\u4e0a\u7ebf\uff09\n- COPA\n- ReCoRD\n- HellaSwag\n- PIQA\n- SIQA\n\n</details>\n\n<details open>\n<summary><b>\u6570\u5b66\u63a8\u7406</b></summary>\n\n- MATH\n- GSM8K\n\n</details>\n\n<details open>\n<summary><b>\u5b9a\u7406\u5e94\u7528</b></summary>\n\n- TheoremQA\n\n</details>\n\n<details open>\n<summary><b>\u4ee3\u7801</b></summary>\n\n- HumanEval\n- MBPP\n\n</details>\n\n<details open>\n<summary><b>\u7efc\u5408\u63a8\u7406</b></summary>\n\n- BBH\n\n</details>\n </td>\n <td>\n<details open>\n<summary><b>\u521d\u4e2d/\u9ad8\u4e2d/\u5927\u5b66/\u804c\u4e1a\u8003\u8bd5</b></summary>\n\n- GAOKAO-2023\n- CEval\n- AGIEval\n- MMLU\n- GAOKAO-Bench\n- MMLU-CN (\u5373\u5c06\u4e0a\u7ebf)\n- ARC\n\n</details>\n </td>\n <td>\n<details open>\n<summary><b>\u9605\u8bfb\u7406\u89e3</b></summary>\n\n- C3\n- CMRC\n- DRCD\n- MultiRC\n- 
RACE\n\n</details>\n\n<details open>\n<summary><b>\u5185\u5bb9\u603b\u7ed3</b></summary>\n\n- CSL\n- LCSTS\n- XSum\n\n</details>\n\n<details open>\n<summary><b>\u5185\u5bb9\u5206\u6790</b></summary>\n\n- EPRSTMT\n- LAMBADA\n- TNEWS\n\n</details>\n </td>\n </tr>\n</td>\n </tr>\n </tbody>\n</table>\n\n## \u6a21\u578b\u652f\u6301\n\n<table align=\"center\">\n <tbody>\n <tr align=\"center\" valign=\"bottom\">\n <td>\n <b>\u5f00\u6e90\u6a21\u578b</b>\n </td>\n <td>\n <b>API \u6a21\u578b</b>\n </td>\n <!-- <td>\n <b>\u81ea\u5b9a\u4e49\u6a21\u578b</b>\n </td> -->\n </tr>\n <tr valign=\"top\">\n <td>\n\n- LLaMA\n- Vicuna\n- Alpaca\n- Baichuan\n- WizardLM\n- ChatGLM-6B\n- ChatGLM2-6B\n- MPT\n- Falcon\n- TigerBot\n- MOSS\n- \u2026\u2026\n\n</td>\n<td>\n\n- OpenAI\n- Claude (\u5373\u5c06\u63a8\u51fa)\n- PaLM (\u5373\u5c06\u63a8\u51fa)\n- \u2026\u2026\n\n</td>\n<!-- <td>\n\n- GLM\n- \u2026\u2026\n\n</td> -->\n</tr>\n </tbody>\n</table>\n\n## \u5b89\u88c5\n\n\u4e0b\u9762\u5c55\u793a\u4e86\u5feb\u901f\u5b89\u88c5\u7684\u6b65\u9aa4\u3002\u6709\u90e8\u5206\u7b2c\u4e09\u65b9\u529f\u80fd\u53ef\u80fd\u9700\u8981\u989d\u5916\u6b65\u9aa4\u624d\u80fd\u6b63\u5e38\u8fd0\u884c\uff0c\u8be6\u7ec6\u6b65\u9aa4\u8bf7\u53c2\u8003[\u5b89\u88c5\u6307\u5357](https://opencompass.readthedocs.io/zh_cn/latest/get_started.html)\u3002\n\n```Python\nconda create --name opencompass python=3.10 pytorch torchvision pytorch-cuda -c nvidia -c pytorch -y\nconda activate opencompass\ngit clone https://github.com/InternLM/opencompass opencompass\ncd opencompass\npip install -e .\n# \u4e0b\u8f7d\u6570\u636e\u96c6\u5230 data/ \u5904\nwget https://github.com/InternLM/opencompass/releases/download/0.1.0/OpenCompassData.zip\nunzip OpenCompassData.zip\n```\n\n## \u8bc4\u6d4b\n\n\u8bf7\u9605\u8bfb[\u5feb\u901f\u4e0a\u624b](https://opencompass.readthedocs.io/zh_CN/latest/get_started.html#id2)\u4e86\u89e3\u5982\u4f55\u8fd0\u884c\u4e00\u4e2a\u8bc4\u6d4b\u4efb\u52a1\u3002\n\n## 
\u81f4\u8c22\n\n\u8be5\u9879\u76ee\u90e8\u5206\u7684\u4ee3\u7801\u5f15\u7528\u5e76\u4fee\u6539\u81ea [OpenICL](https://github.com/Shark-NLP/OpenICL)\u3002\n\n## \u5f15\u7528\n\n```bibtex\n@misc{2023opencompass,\n title={OpenCompass: A Universal Evaluation Platform for Foundation Models},\n author={OpenCompass Contributors},\n howpublished = {\\url{https://github.com/InternLM/OpenCompass}},\n year={2023}\n}\n```\n",
"bugtrack_url": null,
"license": "",
"summary": "A comprehensive toolkit for large model evaluation",
"version": "0.1.2",
"project_urls": null,
"split_keywords": [
"ai",
"nlp",
"in-context learning"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f2a152802114535b458fe7a119ea52083043dcd1c19b102d0666e1d72b780b6d",
"md5": "46e717dbda5d2d30c64c57d5f4e27c22",
"sha256": "4578ad2b4c16e582cda73ec41e3632c919c3496da1fde1e073b16fff24a57122"
},
"downloads": -1,
"filename": "fincompass-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "46e717dbda5d2d30c64c57d5f4e27c22",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8.0",
"size": 208354,
"upload_time": "2023-08-21T10:09:56",
"upload_time_iso_8601": "2023-08-21T10:09:56.469956Z",
"url": "https://files.pythonhosted.org/packages/f2/a1/52802114535b458fe7a119ea52083043dcd1c19b102d0666e1d72b780b6d/fincompass-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0fa608fc5be89436584dbd0e22438c661d7d127f93992d975b8fa010b8e199fb",
"md5": "ab59f64f4d4f28ce041e027fde484c5b",
"sha256": "d0f3091248c21f4df5235134ad887e97bc3e7cd1e9611e5b5eb0137280788f5f"
},
"downloads": -1,
"filename": "fincompass-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "ab59f64f4d4f28ce041e027fde484c5b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8.0",
"size": 115549,
"upload_time": "2023-08-21T10:09:58",
"upload_time_iso_8601": "2023-08-21T10:09:58.835854Z",
"url": "https://files.pythonhosted.org/packages/0f/a6/08fc5be89436584dbd0e22438c661d7d127f93992d975b8fa010b8e199fb/fincompass-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-08-21 10:09:58",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "fincompass"
}
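The `digests` entries in the raw data above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library; the helper names `sha256_of_file` and `matches_recorded_digest` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_digest(path: str, expected_sha256: str) -> bool:
    """Compare a local file against a digest copied from the metadata."""
    return sha256_of_file(path) == expected_sha256.lower()


# sha256 recorded in the raw data for fincompass-0.1.2.tar.gz
EXPECTED_SDIST_SHA256 = (
    "d0f3091248c21f4df5235134ad887e97bc3e7cd1e9611e5b5eb0137280788f5f"
)
```

For example, after downloading the sdist to the current directory, `matches_recorded_digest("fincompass-0.1.2.tar.gz", EXPECTED_SDIST_SHA256)` should return `True` for an intact file.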