# Welcome to MindSpore Transformers (MindFormers)
[![LICENSE](https://img.shields.io/github/license/mindspore-lab/mindformers.svg?style=flat-square)](https://github.com/mindspore-lab/mindformers/blob/master/LICENSE)
[![Downloads](https://static.pepy.tech/badge/mindformers)](https://pepy.tech/project/mindformers)
[![PyPI](https://badge.fury.io/py/mindformers.svg)](https://badge.fury.io/py/mindformers)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/mindformers.svg)](https://pypi.org/project/mindformers)
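## 1. Introduction
MindSpore Transformers is a full-pipeline development suite for large models, covering training, fine-tuning, evaluation, inference, and deployment. It provides mainstream Transformer-based pretrained models and SOTA downstream task applications, together with a rich set of parallelism features, so that users can train large models and pursue new research with less effort.
Built on MindSpore's native parallelism and a component-based design, MindSpore Transformers has the following features:
- Switch seamlessly from a single card to large-scale cluster training with one line of code;
- Flexible, easy-to-use custom parallel configuration;
- Automatic topology awareness that efficiently combines data-parallel and model-parallel strategies;
- One-command launch of single-card or multi-card training, fine-tuning, evaluation, and inference for any task;
- Component-based configuration of any module, such as the optimizer, learning strategy, and network assembly;
- High-level, easy-to-use interfaces such as Trainer, pipeline, and AutoClass;
- Automatic download and loading of preset SOTA weights;
- Seamless migration and deployment to AI computing centers.

If you have any suggestions for MindSpore Transformers, please contact us through an issue and we will respond promptly.
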
- 📝 **[MindFormers Tutorial Documentation](https://mindformers.readthedocs.io/zh_CN/latest)**
- 📝 [Large Model Capability Overview](https://mindformers.readthedocs.io/zh-cn/latest/docs/model_support_list.html#llm)
- 📝 [MindPet Guide](docs/feature_cards/Pet_Tuners.md)
- 📝 [AICC Guide](docs/readthedocs/source_zh_cn/docs/practice/AICC.md)
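### <a id="支持模型"></a>Supported Models
MindFormers supports [LoRA fine-tuning](docs/feature_cards/Pet_Tuners.md) and [LoRA weight merging](docs/feature_cards/Transform_Lorackpt.md) for most models. See each model's documentation for how to launch a LoRA fine-tuning task.
The models currently supported by MindFormers are listed below: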
<table>
<thead>
<tr>
<th> Model </th>
<th> Parameters </th>
<th> Sequence Length </th>
<th> Pretraining </th>
<th> Fine-tuning </th>
<th> Inference </th>
<th> <a href="docs/feature_cards/Pet_Tuners.md"> LoRA </a> </th>
<th> Chat </th>
<th> Evaluation </th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="3"> <a href="docs/model_cards/llama2.md"> LLaMA2 </a> </td>
<td> 7B </td>
<td> 4K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/llama2/run_llama2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
<tr>
<td> 13B </td>
<td> 4K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/llama2/run_llama2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
<tr>
<td> 70B </td>
<td> 4K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/llama2/run_llama2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="2"> <a href="research/llama3/llama3.md"> LLaMA3 </a> </td>
<td> 8B </td>
<td> 8K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/llama3/run_llama3_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
<tr>
<td> 70B </td>
<td> 8K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/llama3/run_llama3_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="2"> <a href="research/baichuan2/baichuan2.md"> Baichuan2 </a> </td>
<td> 7B </td>
<td> 4K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/baichuan2/run_baichuan2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
<tr>
<td> 13B </td>
<td> 4K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/baichuan2/run_baichuan2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="docs/model_cards/glm2.md"> GLM2 </a> </td>
<td> 6B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/glm2/run_glm2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL / Rouge </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="docs/model_cards/glm3.md"> GLM3 </a> </td>
<td> 6B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/glm3/run_glm3_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="docs/model_cards/glm3.md"> GLM3-32K </a> </td>
<td> 6B </td>
<td> 32K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/glm32k/run_glm32k_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="2"> <a href="research/qwen/qwen.md"> Qwen </a> </td>
<td> 7B </td>
<td> 8K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/qwen/qwen.md"> docs </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> C-Eval </td>
</tr>
<tr>
<td> 14B </td>
<td> 8K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/qwen/qwen.md"> docs </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> C-Eval </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="3"> <a href="research/qwen1_5/qwen1_5.md"> Qwen1.5 </a> </td>
<td> 7B </td>
<td> 32K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/qwen1_5/qwen1_5.md"> docs </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
<tr>
<td> 14B </td>
<td> 32K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/qwen1_5/qwen1_5.md"> docs </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
<tr>
<td> 72B </td>
<td> 32K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/qwen1_5/qwen1_5.md"> docs </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="research/qwenvl/qwenvl.md"> QwenVL </a> </td>
<td> 9.6B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/qwenvl/run_qwenvl_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="2"> <a href="research/internlm/internlm.md"> InternLM </a> </td>
<td> 7B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/internlm/run_internlm_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
<tr>
<td> 20B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/internlm/run_internlm_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="2"> <a href="research/internlm2/internlm2.md"> InternLM2 </a> </td>
<td> 7B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/internlm2/run_internlm2_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
<tr>
<td> 20B </td>
<td> 4K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> - </td>
<td> <a href="scripts/examples/internlm2/run_internlm2_predict.sh"> generate </a> </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="2"> <a href="research/yi/yi.md"> Yi </a> </td>
<td> 6B </td>
<td> 2K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/yi/run_yi_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
<tr>
<td> 34B </td>
<td> 4K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/yi/run_yi_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="research/mixtral/mixtral.md"> Mixtral </a> </td>
<td> 8x7B </td>
<td> 32K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/mixtral/mixtral.md"> docs </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="research/deepseek/deepseek.md"> DeepSeek Coder </a> </td>
<td> 33B </td>
<td> 4K </td>
<td style="text-align: center"> - </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> <a href="research/deepseek/deepseek.md"> docs </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> - </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="docs/model_cards/codellama.md"> CodeLlama </a> </td>
<td> 34B </td>
<td> 4K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/codellama/run_codellama_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> HumanEval </td>
</tr>
</tbody>
<tbody>
<tr>
<td rowspan="1"> <a href="docs/model_cards/gpt2.md"> GPT2 </a> </td>
<td> 13B </td>
<td> 2K </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td> <a href="scripts/examples/gpt2/run_gpt2_predict.sh"> generate </a> </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> ✓ </td>
<td style="text-align: center"> PPL </td>
</tr>
</tbody>
</table>
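## 2. Installation
### Version Compatibility
The currently supported hardware is the [Atlas 800T A2](https://www.hiascend.com/hardware/ai-server?tag=900A2) training server.
The recommended Python version for this suite is 3.9.

| MindFormers | MindPet | MindSpore | CANN | Driver/Firmware | Image Link | Notes |
|:-----------:|:-------:|:-----------:|:----:|:----:|:----:|-------------|
| dev | 1.0.4 | 2.3 (not yet released) | Not yet released | Not yet released | / | Development branch (unstable) |

**MindFormers currently supports only the software combination listed above.** CANN and the driver/firmware must match the machine in use; identify the machine model and choose the version for the corresponding architecture.
### Building and Installing from Source
MindFormers currently supports installation from source. Run the following commands to install it.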
```shell
git clone -b dev https://gitee.com/mindspore/mindformers.git
cd mindformers
bash build.sh
```
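
Before building, it can help to confirm that the active interpreter matches the recommended Python 3.9. The following is a minimal sketch and not part of MindFormers: the helper name `is_recommended_python` is hypothetical, and it only compares the major.minor part of the version string.

```shell
# Hypothetical helper (not shipped with MindFormers): succeeds when the given
# version string belongs to the recommended 3.9 series.
is_recommended_python() {
    case "$1" in
        3.9|3.9.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the interpreter on PATH; fall back to "unknown" if python3 is absent.
PY_VERSION="$(python3 -c 'import platform; print(platform.python_version())' 2>/dev/null || echo unknown)"

if is_recommended_python "$PY_VERSION"; then
    echo "Python $PY_VERSION matches the recommended 3.9 series."
else
    echo "Python $PY_VERSION detected; this README recommends 3.9."
fi
```

The check is advisory only: it prints a message and does not stop the build.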
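## 3. Usage Guide
MindFormers supports pretraining, fine-tuning, inference, and evaluation. Click a model name in [Supported Models](#支持模型) to open its documentation for these tasks. The distributed launch methods are described below with examples.
MindFormers recommends launching training, inference, and other tasks in distributed mode. The `scripts/msrun_launcher.sh` script is currently the main launch method; for details of the `msrun` feature, see [msrun Launching](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.0rc2/parallel/msrun_launcher.html).
The main input parameters of the script are:
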
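| **Parameter** | **Required (single node)** | **Required (multi-node)** | **Default** | **Description** |
|------------------|:----------:|:----------:|:----------------:|------------------|
| WORKER_NUM | ✓ | ✓ | 8 | Total number of compute cards used across all nodes |
| LOCAL_WORKER | - | ✓ | 8 | Number of compute cards used on the current node |
| MASTER_ADDR | - | ✓ | 127.0.0.1 | IP address of the master node for the distributed launch |
| MASTER_PORT | - | ✓ | 8118 | Port bound for the distributed launch |
| NODE_RANK | - | ✓ | 0 | Rank id of the current node |
| LOG_DIR | - | ✓ | output/msrun_log | Log output path; created recursively if it does not exist |
| JOIN | - | ✓ | False | Whether to wait for all distributed processes to exit |
| CLUSTER_TIME_OUT | - | ✓ | 600 | Timeout for the distributed launch, in seconds |
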
> Note: To launch on specific devices (`device_id`), set the environment variable `ASCEND_RT_VISIBLE_DEVICES`. For example, to use cards 2 and 3, run `export ASCEND_RT_VISIBLE_DEVICES=2,3`.

### Single Node, Multiple Cards
```shell
# 1. Quick launch on a single node with multiple cards; uses 8 cards by default
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config {CONFIG_PATH} \
--run_mode {train/finetune/eval/predict}"
# 2. Quick launch on a single node with multiple cards; only the number of cards needs to be set
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config {CONFIG_PATH} \
--run_mode {train/finetune/eval/predict}" WORKER_NUM
# 3. Custom launch on a single node with multiple cards
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config {CONFIG_PATH} \
--run_mode {train/finetune/eval/predict}" \
WORKER_NUM MASTER_PORT LOG_DIR JOIN CLUSTER_TIME_OUT
```
- Example usage
```shell
# Quick launch on a single node with multiple cards; uses 8 cards by default
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config path/to/xxx.yaml \
--run_mode finetune"
# Quick launch on a single node with multiple cards, with the number of cards set explicitly
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config path/to/xxx.yaml \
--run_mode finetune" 8
# Custom launch on a single node with multiple cards
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config path/to/xxx.yaml \
--run_mode finetune" \
8 8118 output/msrun_log False 300
```
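
When only some cards on a machine should be used, the `ASCEND_RT_VISIBLE_DEVICES` setting can be combined with the `WORKER_NUM` argument. The sketch below derives the worker count from the device list so that the two stay consistent. It rests on assumptions: `count_devices` is a hypothetical helper, `path/to/xxx.yaml` is a placeholder, the idea that `WORKER_NUM` should equal the number of visible cards is an inference from the parameter description rather than something this README states, and the launch command is only printed, not executed.

```shell
# Hypothetical helper: count the entries in a comma-separated device list.
count_devices() {
    # Put each entry on its own line, then count the non-empty lines.
    printf '%s\n' "$1" | tr ',' '\n' | grep -c .
}

# Use cards 2 and 3 only.
export ASCEND_RT_VISIBLE_DEVICES=2,3
WORKER_NUM="$(count_devices "$ASCEND_RT_VISIBLE_DEVICES")"

# Dry run: print the launch command instead of running it.
echo "bash scripts/msrun_launcher.sh \"run_mindformer.py --config path/to/xxx.yaml --run_mode finetune\" $WORKER_NUM"
```

Remove the `echo` wrapper to actually launch once the printed command looks right.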
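### Multiple Nodes, Multiple Cards
For multi-node, multi-card distributed training, run the script on each node separately and set `MASTER_ADDR` to the IP address of the master node. All nodes use the same `MASTER_ADDR`; only `NODE_RANK` differs between nodes.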
```shell
# Custom launch on multiple nodes with multiple cards
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config {CONFIG_PATH} \
--run_mode {train/finetune/eval/predict}" \
WORKER_NUM LOCAL_WORKER MASTER_ADDR MASTER_PORT NODE_RANK LOG_DIR JOIN CLUSTER_TIME_OUT
```
- Example usage
```shell
# Node 0 (IP 192.168.1.1) is the master node; 8 cards in total, 4 per node
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config {CONFIG_PATH} \
--run_mode {train/finetune/eval/predict}" \
8 4 192.168.1.1 8118 0 output/msrun_log False 300
# Node 1 (IP 192.168.1.2); the launch commands for node 0 and node 1 differ only in NODE_RANK
bash scripts/msrun_launcher.sh "run_mindformer.py \
--config {CONFIG_PATH} \
--run_mode {train/finetune/eval/predict}" \
8 4 192.168.1.1 8118 1 output/msrun_log False 300
```
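
Because the commands for different nodes differ only in `NODE_RANK`, they can be generated in a loop. The following sketch prints one command per node for a two-node, eight-card setup with 192.168.1.1 as the master node. `print_node_cmd` is a hypothetical helper, the config path and run mode are placeholders, and nothing is launched.

```shell
# Hypothetical helper: print the launcher command for one node.
# Arguments: WORKER_NUM LOCAL_WORKER MASTER_ADDR MASTER_PORT NODE_RANK
print_node_cmd() {
    printf 'bash scripts/msrun_launcher.sh "run_mindformer.py --config %s --run_mode %s" %s %s %s %s %s %s %s %s\n' \
        "path/to/xxx.yaml" "finetune" \
        "$1" "$2" "$3" "$4" "$5" "output/msrun_log" "False" "300"
}

WORKER_NUM=8      # total cards across all nodes
LOCAL_WORKER=4    # cards used on each node
NODE_COUNT=$((WORKER_NUM / LOCAL_WORKER))

# Emit one command per node; only NODE_RANK changes between them.
rank=0
while [ "$rank" -lt "$NODE_COUNT" ]; do
    print_node_cmd "$WORKER_NUM" "$LOCAL_WORKER" 192.168.1.1 8118 "$rank"
    rank=$((rank + 1))
done
```

Each printed line is meant to be run on the node whose rank it carries; the sketch assumes every node uses the same number of cards.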
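### Single-Card Launch
MindFormers provides the `run_mindformer.py` script for single-card launches. Based on the model configuration file, it runs single-card training, fine-tuning, evaluation, and inference for supported models.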
```shell
# Arguments passed to run_mindformer.py override the corresponding parameters in the model configuration file
python run_mindformer.py --config {CONFIG_PATH} --run_mode {train/finetune/eval/predict}
```
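
Since `--run_mode` takes one of four values, a thin wrapper can reject typos before the Python process starts. This is only a sketch: `is_valid_run_mode` is a hypothetical helper and not part of `run_mindformer.py`, and the config path is a placeholder.

```shell
# Hypothetical helper: succeed only for the run modes listed in this README.
is_valid_run_mode() {
    case "$1" in
        train|finetune|eval|predict) return 0 ;;
        *) return 1 ;;
    esac
}

RUN_MODE="finetune"
CONFIG_PATH="path/to/xxx.yaml"   # placeholder; point this at a real config

if is_valid_run_mode "$RUN_MODE"; then
    # Dry run: print the command that would be executed.
    echo "python run_mindformer.py --config $CONFIG_PATH --run_mode $RUN_MODE"
else
    echo "Unsupported run_mode: $RUN_MODE" >&2
fi
```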
## 4. Contributing
Contributions from the community are welcome. See the MindSpore contribution guidelines in the [Contributor Wiki](https://gitee.com/mindspore/mindspore/blob/master/CONTRIBUTING_CN.md).
## 5. License
[Apache 2.0 License](LICENSE)
Raw data
{
"_id": null,
"home_page": "https://www.mindspore.cn",
"name": "mindformers",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "mindformers",
"author": "The MindSpore Authors",
"author_email": "contact@mindspore.cn",
"download_url": "https://gitee.com/mindspore/mindformers/tags",
"platform": "linux",
"description": "# \u6b22\u8fce\u6765\u5230MindSpore Transformers\uff08MindFormers\uff09\n\n[![LICENSE](https://img.shields.io/github/license/mindspore-lab/mindformers.svg?style=flat-square)](https://github.com/mindspore-lab/mindformers/blob/master/LICENSE)\n[![Downloads](https://static.pepy.tech/badge/mindformers)](https://pepy.tech/project/mindformers)\n[![PyPI](https://badge.fury.io/py/mindformers.svg)](https://badge.fury.io/py/mindformers)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/mindformers.svg)](https://pypi.org/project/mindformers)\n\n## \u4e00\u3001\u4ecb\u7ecd\n\nMindSpore Transformers\u5957\u4ef6\u7684\u76ee\u6807\u662f\u6784\u5efa\u4e00\u4e2a\u5927\u6a21\u578b\u8bad\u7ec3\u3001\u5fae\u8c03\u3001\u8bc4\u4f30\u3001\u63a8\u7406\u3001\u90e8\u7f72\u7684\u5168\u6d41\u7a0b\u5f00\u53d1\u5957\u4ef6\uff0c\u63d0\u4f9b\u4e1a\u5185\u4e3b\u6d41\u7684Transformer\u7c7b\u9884\u8bad\u7ec3\u6a21\u578b\u548cSOTA\u4e0b\u6e38\u4efb\u52a1\u5e94\u7528\uff0c\u6db5\u76d6\u4e30\u5bcc\u7684\u5e76\u884c\u7279\u6027\u3002\u671f\u671b\u5e2e\u52a9\u7528\u6237\u8f7b\u677e\u7684\u5b9e\u73b0\u5927\u6a21\u578b\u8bad\u7ec3\u548c\u521b\u65b0\u7814\u53d1\u3002\n\nMindSpore Transformers\u5957\u4ef6\u57fa\u4e8eMindSpore\u5185\u7f6e\u7684\u5e76\u884c\u6280\u672f\u548c\u7ec4\u4ef6\u5316\u8bbe\u8ba1\uff0c\u5177\u5907\u5982\u4e0b\u7279\u70b9\uff1a\n\n- \u4e00\u884c\u4ee3\u7801\u5b9e\u73b0\u4ece\u5355\u5361\u5230\u5927\u89c4\u6a21\u96c6\u7fa4\u8bad\u7ec3\u7684\u65e0\u7f1d\u5207\u6362\uff1b\n- \u63d0\u4f9b\u7075\u6d3b\u6613\u7528\u7684\u4e2a\u6027\u5316\u5e76\u884c\u914d\u7f6e\uff1b\n- \u80fd\u591f\u81ea\u52a8\u8fdb\u884c\u62d3\u6251\u611f\u77e5\uff0c\u9ad8\u6548\u5730\u878d\u5408\u6570\u636e\u5e76\u884c\u548c\u6a21\u578b\u5e76\u884c\u7b56\u7565\uff1b\n- \u4e00\u952e\u542f\u52a8\u4efb\u610f\u4efb\u52a1\u7684\u5355\u5361/\u591a\u5361\u8bad\u7ec3\u3001\u5fae\u8c03\u3001\u8bc4\u4f30\u3001\u63a8\u7406\u6d41\u7a0b\uff1b\n- 
\u652f\u6301\u7528\u6237\u8fdb\u884c\u7ec4\u4ef6\u5316\u914d\u7f6e\u4efb\u610f\u6a21\u5757\uff0c\u5982\u4f18\u5316\u5668\u3001\u5b66\u4e60\u7b56\u7565\u3001\u7f51\u7edc\u7ec4\u88c5\u7b49\uff1b\n- \u63d0\u4f9bTrainer\u3001pipeline\u3001AutoClass\u7b49\u9ad8\u9636\u6613\u7528\u6027\u63a5\u53e3\uff1b\n- \u63d0\u4f9b\u9884\u7f6eSOTA\u6743\u91cd\u81ea\u52a8\u4e0b\u8f7d\u53ca\u52a0\u8f7d\u529f\u80fd\uff1b\n- \u652f\u6301\u4eba\u5de5\u667a\u80fd\u8ba1\u7b97\u4e2d\u5fc3\u65e0\u7f1d\u8fc1\u79fb\u90e8\u7f72\uff1b\n\n\u5982\u679c\u60a8\u5bf9MindSpore Transformers\u6709\u4efb\u4f55\u5efa\u8bae\uff0c\u8bf7\u901a\u8fc7issue\u4e0e\u6211\u4eec\u8054\u7cfb\uff0c\u6211\u4eec\u5c06\u53ca\u65f6\u5904\u7406\u3002\n\n- \ud83d\udcdd **[MindFormers\u6559\u7a0b\u6587\u6863](https://mindformers.readthedocs.io/zh_CN/latest)**\n- \ud83d\udcdd [\u5927\u6a21\u578b\u80fd\u529b\u8868\u4e00\u89c8](https://mindformers.readthedocs.io/zh-cn/latest/docs/model_support_list.html#llm)\n- \ud83d\udcdd [MindPet\u6307\u5bfc\u6559\u7a0b](docs/feature_cards/Pet_Tuners.md)\n- \ud83d\udcdd [AICC\u6307\u5bfc\u6559\u7a0b](docs/readthedocs/source_zh_cn/docs/practice/AICC.md)\n\n### \u652f\u6301\u6a21\u578b\n\nMindFormers\u5df2\u652f\u6301\u5927\u90e8\u5206\u6a21\u578b\u7684[LoRA\u5fae\u8c03](docs/feature_cards/Pet_Tuners.md)\u4ee5\u53ca[LoRA\u6743\u91cd\u5408\u5e76](docs/feature_cards/Transform_Lorackpt.md)\u529f\u80fd\uff0c\u5177\u4f53\u53ef\u53c2\u8003\u5404\u6a21\u578b\u6587\u6863\u542f\u52a8\u6a21\u578b\u7684LoRA\u5fae\u8c03\u4efb\u52a1\u3002\n\n\u5f53\u524dMindFormers\u652f\u6301\u7684\u6a21\u578b\u5217\u8868\u5982\u4e0b\uff1a\n\n<table>\n <thead>\n <tr>\n <th> \u6a21\u578b </th>\n <th> \u53c2\u6570 </th>\n <th> \u5e8f\u5217 </th>\n <th> \u9884\u8bad\u7ec3 </th>\n <th> \u5fae\u8c03 </th>\n <th> \u63a8\u7406 </th>\n <th> <a href=\"docs/feature_cards/Pet_Tuners.md\"> LoRA </a> </th>\n <th> \u5bf9\u8bdd </th>\n <th> \u8bc4\u4f30 </th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <td rowspan=\"3\"> <a 
href=\"docs/model_cards/llama2.md\"> LLaMA2 </a> </td>\n <td> 7B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/llama2/run_llama2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n <tr>\n <td> 13B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/llama2/run_llama2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n <tr>\n <td> 70B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/llama2/run_llama2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"2\"> <a href=\"research/llama3/llama3.md\"> LLaMA3 </a> </td>\n <td> 8B </td>\n <td> 8K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/llama3/run_llama3_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n <tr>\n <td> 70B </td>\n <td> 8K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/llama3/run_llama3_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"2\"> <a 
href=\"research/baichuan2/baichuan2.md\"> Baichuan2 </a> </td>\n <td> 7B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/baichuan2/run_baichuan2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n <tr>\n <td> 13B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/baichuan2/run_baichuan2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"docs/model_cards/glm2.md\"> GLM2 </a> </td>\n <td> 6B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/glm2/run_glm2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL / Rouge </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"docs/model_cards/glm3.md\"> GLM3 </a> </td>\n <td> 6B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/glm3/run_glm3_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"docs/model_cards/glm3.md\"> GLM3-32K </a> </td>\n <td> 6B </td>\n <td> 32K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/glm32k/run_glm32k_predict.sh\"> generate </a> </td>\n <td 
style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"2\"> <a href=\"research/qwen/qwen.md\"> Qwen </a> </td>\n <td> 7B </td>\n <td> 8K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/qwen/qwen.md\"> docs </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> C-Eval </td>\n </tr>\n <tr>\n <td> 14B </td>\n <td> 8K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/qwen/qwen.md\"> docs </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> C-Eval </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"3\"> <a href=\"research/qwen1_5/qwen1_5.md\"> Qwen1.5 </a> </td>\n <td> 7B </td>\n <td> 32K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/qwen1_5/qwen1_5.md\"> docs </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n <tr>\n <td> 14B </td>\n <td> 32K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/qwen1_5/qwen1_5.md\"> docs </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n <tr>\n <td> 72B </td>\n <td> 32K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/qwen1_5/qwen1_5.md\"> docs </a> </td>\n <td 
style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"research/qwenvl/qwenvl.md\"> QwenVL </a> </td>\n <td> 9.6B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/qwenvl/run_qwenvl_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"2\"> <a href=\"research/internlm/internlm.md\"> InternLM </a> </td>\n <td> 7B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/internlm/run_internlm_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n <tr>\n <td> 20B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/internlm/run_internlm_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"2\"> <a href=\"research/internlm2/internlm2.md\"> InternLM2 </a> </td>\n <td> 7B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/internlm2/run_internlm2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n <tr>\n <td> 20B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> 
- </td>\n <td> <a href=\"scripts/examples/internlm2/run_internlm2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"2\"> <a href=\"research/yi/yi.md\"> Yi </a> </td>\n <td> 6B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/yi/run_yi_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n <tr>\n <td> 34B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/yi/run_yi_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"research/mixtral/mixtral.md\"> Mixtral </a> </td>\n <td> 8x7B </td>\n <td> 32K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/mixtral/mixtral.md\"> docs </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"research/deepseek/deepseek.md\"> DeepSeek Coder </a> </td>\n <td> 33B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> - </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> <a href=\"research/deepseek/deepseek.md\"> docs </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> - </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td 
rowspan=\"1\"> <a href=\"docs/model_cards/codellama.md\"> CodeLlama </a> </td>\n <td> 34B </td>\n <td> 4K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/codellama/run_codellama_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> HumanEval </td>\n </tr>\n </tbody>\n <tbody>\n <tr>\n <td rowspan=\"1\"> <a href=\"docs/model_cards/gpt2.md\"> GPT2 </a> </td>\n <td> 13B </td>\n <td> 2K </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td> <a href=\"scripts/examples/gpt2/run_gpt2_predict.sh\"> generate </a> </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> ✓ </td>\n <td style=\"text-align: center\"> PPL </td>\n </tr>\n </tbody>\n</table>\n\n## \u4e8c\u3001\u5b89\u88c5\n\n### \u7248\u672c\u5339\u914d\u5173\u7cfb\n\n\u5f53\u524d\u652f\u6301\u7684\u786c\u4ef6\u4e3a[Atlas 800T A2](https://www.hiascend.com/hardware/ai-server?tag=900A2)\u8bad\u7ec3\u670d\u52a1\u5668\u3002\n\n\u5f53\u524d\u5957\u4ef6\u5efa\u8bae\u4f7f\u7528\u7684Python\u7248\u672c\u4e3a3.9\u3002\n\n| MindFormers | MindPet | MindSpore | CANN | \u9a71\u52a8\u56fa\u4ef6 | \u955c\u50cf\u94fe\u63a5 | \u5907\u6ce8 |\n|:-----------:|:-------:|:-----------:|:----:|:----:|:----:|-------------|\n| dev | 1.0.4 | 2.3\u7248\u672c(\u5c1a\u672a\u53d1\u5e03) | \u5c1a\u672a\u53d1\u5e03 | \u5c1a\u672a\u53d1\u5e03 | / | \u5f00\u53d1\u5206\u652f(\u975e\u7a33\u5b9a\u7248\u672c) |\n\n**\u5f53\u524dMindFormers\u4ec5\u652f\u6301\u5982\u4e0a\u7684\u8f6f\u4ef6\u914d\u5957\u5173\u7cfb**\u3002\u5176\u4e2dCANN\u548c\u56fa\u4ef6\u9a71\u52a8\u7684\u5b89\u88c5\u9700\u4e0e\u4f7f\u7528\u7684\u673a\u5668\u5339\u914d\uff0c\u8bf7\u6ce8\u610f\u8bc6\u522b\u673a\u5668\u578b\u53f7\uff0c\u9009\u62e9\u5bf9\u5e94\u67b6\u6784\u7684\u7248\u672c\u3002\n\n### 
\u6e90\u7801\u7f16\u8bd1\u5b89\u88c5\n\nMindFormers\u76ee\u524d\u652f\u6301\u6e90\u7801\u7f16\u8bd1\u5b89\u88c5\uff0c\u7528\u6237\u53ef\u4ee5\u6267\u884c\u5982\u4e0b\u547d\u4ee4\u8fdb\u884c\u5b89\u88c5\u3002\n\n```shell\ngit clone -b dev https://gitee.com/mindspore/mindformers.git\ncd mindformers\nbash build.sh\n```\n\n## \u4e09\u3001\u4f7f\u7528\u6307\u5357\n\nMindFormers\u652f\u6301\u6a21\u578b\u542f\u52a8\u9884\u8bad\u7ec3\u3001\u5fae\u8c03\u3001\u63a8\u7406\u3001\u8bc4\u6d4b\u7b49\u529f\u80fd\uff0c\u53ef\u70b9\u51fb[\u652f\u6301\u6a21\u578b](#\u652f\u6301\u6a21\u578b)\u4e2d\u6a21\u578b\u540d\u79f0\u67e5\u770b\u6587\u6863\u5b8c\u6210\u4e0a\u8ff0\u4efb\u52a1\uff0c\u4ee5\u4e0b\u4e3a\u6a21\u578b\u5206\u5e03\u5f0f\u542f\u52a8\u65b9\u5f0f\u7684\u8bf4\u660e\u4e0e\u793a\u4f8b\u3002\n\nMindFormers\u63a8\u8350\u4f7f\u7528\u5206\u5e03\u5f0f\u65b9\u5f0f\u62c9\u8d77\u6a21\u578b\u8bad\u7ec3\u3001\u63a8\u7406\u7b49\u529f\u80fd\uff0c\u76ee\u524d\u63d0\u4f9b`scripts/msrun_launcher.sh`\u5206\u5e03\u5f0f\u542f\u52a8\u811a\u672c\u4f5c\u4e3a\u6a21\u578b\u7684\u4e3b\u8981\u542f\u52a8\u65b9\u5f0f\uff0c`msrun`\u7279\u6027\u8bf4\u660e\u53ef\u4ee5\u53c2\u8003[msrun\u542f\u52a8](https://www.mindspore.cn/tutorials/experts/zh-CN/r2.3.0rc2/parallel/msrun_launcher.html)\u3002\n\u8be5\u811a\u672c\u4e3b\u8981\u8f93\u5165\u53c2\u6570\u8bf4\u660e\u5982\u4e0b\uff1a\n\n | **\u53c2\u6570** | **\u5355\u673a\u662f\u5426\u5fc5\u9009** | **\u591a\u673a\u662f\u5426\u5fc5\u9009** | **\u9ed8\u8ba4\u503c** | **\u8bf4\u660e** |\n |------------------|:----------:|:----------:|:----------------:|------------------|\n | WORKER_NUM | ✓ | ✓ | 8 | \u6240\u6709\u8282\u70b9\u4e2d\u4f7f\u7528\u8ba1\u7b97\u5361\u7684\u603b\u6570 |\n | LOCAL_WORKER | - | ✓ | 8 | \u5f53\u524d\u8282\u70b9\u4e2d\u4f7f\u7528\u8ba1\u7b97\u5361\u7684\u6570\u91cf |\n | MASTER_ADDR | - | ✓ | 127.0.0.1 | \u6307\u5b9a\u5206\u5e03\u5f0f\u542f\u52a8\u4e3b\u8282\u70b9\u7684ip |\n | MASTER_PORT | - | ✓ | 8118 | 
Port that the distributed launch binds to |\n | NODE_RANK | - | ✓ | 0 | Rank ID of the current node |\n | LOG_DIR | - | ✓ | output/msrun_log | Log output path; created recursively if it does not exist |\n | JOIN | - | ✓ | False | Whether to wait for all distributed processes to exit |\n | CLUSTER_TIME_OUT | - | ✓ | 600 | Wait time for the distributed launch, in seconds |\n\n> Note: To launch on specific devices by `device_id`, set the environment variable `ASCEND_RT_VISIBLE_DEVICES`. For example, to use cards 2 and 3, run `export ASCEND_RT_VISIBLE_DEVICES=2,3`.\n\n### Single Node, Multiple Cards\n\n```shell\n# 1. Quick launch on a single node with multiple cards; uses 8 cards by default\nbash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config {CONFIG_PATH} \\\n --run_mode {train/finetune/eval/predict}\"\n\n# 2. Quick launch on a single node with multiple cards; only the number of cards needs to be set\nbash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config {CONFIG_PATH} \\\n --run_mode {train/finetune/eval/predict}\" WORKER_NUM\n\n# 3. 
Custom launch on a single node with multiple cards\nbash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config {CONFIG_PATH} \\\n --run_mode {train/finetune/eval/predict}\" \\\n WORKER_NUM MASTER_PORT LOG_DIR JOIN CLUSTER_TIME_OUT\n ```\n\n- Usage example\n\n ```shell\n # Quick launch on a single node with multiple cards; uses 8 cards by default\n bash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config path/to/xxx.yaml \\\n --run_mode finetune\"\n\n # Quick launch on a single node with multiple cards, with the card count set explicitly\n bash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config path/to/xxx.yaml \\\n --run_mode finetune\" 8\n\n # Custom launch on a single node with multiple cards\n bash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config path/to/xxx.yaml \\\n --run_mode finetune\" \\\n 8 8118 output/msrun_log False 300\n ```\n\n### Multiple Nodes, Multiple Cards\n\nFor multi-node, multi-card distributed training, run the script separately on each node and set the MASTER_ADDR parameter to the IP address of the master node.\nAll nodes use the same MASTER_ADDR value; only the NODE_RANK parameter differs between nodes.\n\n ```shell\n # Custom launch on multiple nodes with multiple cards\n bash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config {CONFIG_PATH} \\\n --run_mode {train/finetune/eval/predict}\" \\\n WORKER_NUM LOCAL_WORKER MASTER_ADDR MASTER_PORT NODE_RANK LOG_DIR JOIN CLUSTER_TIME_OUT\n ```\n\n- Usage example\n\n ```shell\n # Node 0 (IP 192.168.1.1) acts as the master node; 8 cards in total, 4 per node\n bash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config {CONFIG_PATH} \\\n --run_mode 
{train/finetune/eval/predict}\" \\\n 8 4 192.168.1.1 8118 0 output/msrun_log False 300\n\n # Node 1 (IP 192.168.1.2); the launch commands for node 0 and node 1 differ only in NODE_RANK\n bash scripts/msrun_launcher.sh \"run_mindformer.py \\\n --config {CONFIG_PATH} \\\n --run_mode {train/finetune/eval/predict}\" \\\n 8 4 192.168.1.1 8118 1 output/msrun_log False 300\n ```\n\n### Single-Card Launch\n\nMindFormers provides the `run_mindformer.py` script for single-card launches. Driven by a model configuration file, it runs single-card training, fine-tuning, evaluation, and inference for the supported models.\n\n```shell\n# Command-line arguments passed to run_mindformer.py override the parameters in the model configuration file\npython run_mindformer.py --config {CONFIG_PATH} --run_mode {train/finetune/eval/predict}\n```\n\n## 4. Contributing\n\nCommunity contributions are welcome. See the MindSpore contribution guidelines in the [Contributor Wiki](https://gitee.com/mindspore/mindspore/blob/master/CONTRIBUTING_CN.md).\n\n## 5. License\n\n[Apache 2.0 License](LICENSE)\n",
"bugtrack_url": null,
"license": "Apache 2.0",
"summary": "mindformers platform: linux, cpu: x86_64",
"version": "1.2.0",
"project_urls": {
"Download": "https://gitee.com/mindspore/mindformers/tags",
"Homepage": "https://www.mindspore.cn",
"Issue Tracker": "https://gitee.com/mindspore/mindformers/issues",
"Sources": "https://gitee.com/mindspore/mindformers"
},
"split_keywords": [
"mindformers"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7ac60b94e4a646c35ae7435b2b8cb514a55d20ca8982100cbecd77dcad9f5a5f",
"md5": "36c76f894b938fb17e30b691d810a746",
"sha256": "03e6094248324c1e5d9616783f8a6fa6e7e319c83f246dcdd402889663860e02"
},
"downloads": -1,
"filename": "mindformers-1.2.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "36c76f894b938fb17e30b691d810a746",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 1473506,
"upload_time": "2024-07-27T02:03:52",
"upload_time_iso_8601": "2024-07-27T02:03:52.178613Z",
"url": "https://files.pythonhosted.org/packages/7a/c6/0b94e4a646c35ae7435b2b8cb514a55d20ca8982100cbecd77dcad9f5a5f/mindformers-1.2.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-07-27 02:03:52",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "mindformers"
}