For a full introduction, see [xspike: GPU task queuing, one-click batch experiment launching, ...](https://zhuanlan.zhihu.com/p/685132608/preview?comment=0&catalog=0)
Install xspike with the following command:
```
pip install xspike
```
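xspike currently supports the following features:
- [x] GPU task queuing (based on Redis)
- [x] Batch launching of experiment scripts
- [x] DingTalk notifications
- [x] Comet.ml management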
## Installing, Starting, and Maintaining Redis
Run the following on the command line:
```
xspike
```
Follow the prompts and run options 4, 5, and 6 in order to set up the Redis environment.
## Starting the GPU Task Queue
```
import xspike as x
queuer = x.GPUQueuer()
queuer.start()
# Your code is here
# ......
queuer.close()
```
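We recommend inserting the code above at the very beginning of your existing code, and calling queuer.close() once the experiment finishes to release the current task.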
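If an experiment crashes partway through, the trailing queuer.close() call is never reached. One way to guard against that is a small context manager, sketched below. The helper name `queued` and the `FakeQueuer` stand-in are hypothetical and not part of xspike; the sketch only assumes an object with the start() and close() methods shown above.

```python
from contextlib import contextmanager


@contextmanager
def queued(queuer):
    """Start a queuer on entry and always close it on exit.

    `queuer` is any object exposing start() and close(), the two methods
    used on x.GPUQueuer in the snippet above. The name `queued` is
    hypothetical and is not provided by xspike.
    """
    queuer.start()
    try:
        yield queuer
    finally:
        # Runs even if the experiment raises, so close() is never skipped.
        queuer.close()


class FakeQueuer:
    """Stand-in used only to demonstrate the helper without Redis or GPUs."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def close(self):
        self.events.append("close")


def demo():
    """Run a failing 'experiment' and return the recorded lifecycle events."""
    fake = FakeQueuer()
    try:
        with queued(fake):
            raise RuntimeError("experiment crashed")
    except RuntimeError:
        pass
    return fake.events
```

With a real queue, the same pattern would presumably read `with queued(x.GPUQueuer()):` followed by the experiment code.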
Initialization parameters of GPUQueuer():
```
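visible_cuda (str, optional): IDs of the visible GPUs; separate multiple IDs with commas, e.g. "0,1,2,3". Defaults to "-1", meaning all GPUs are visible.
n_gpus (int, optional): Number of GPUs required. Defaults to 1.
memo (str, optional): A note describing the task. Defaults to "no memo".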
```
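To illustrate how the visible_cuda and n_gpus values relate, here is a minimal sketch that parses the string format described above and rejects a request for more GPUs than are listed. Both helpers are hypothetical and not part of xspike; whether xspike performs a similar check internally is not stated in this README.

```python
def parse_visible_cuda(visible_cuda="-1"):
    """Turn a visible_cuda string into a list of GPU indices.

    Returns None for "-1", which the parameter description above says means
    all GPUs are visible. Hypothetical helper, not part of xspike.
    """
    if visible_cuda.strip() == "-1":
        return None
    return [int(part) for part in visible_cuda.split(",") if part.strip()]


def check_request(visible_cuda="-1", n_gpus=1):
    """Raise ValueError if more GPUs are requested than are listed as visible.

    On success, returns the two values as a dict keyed by the documented
    parameter names.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    visible = parse_visible_cuda(visible_cuda)
    if visible is not None and n_gpus > len(visible):
        raise ValueError(
            f"requested {n_gpus} GPUs but only {len(visible)} are visible"
        )
    return {"visible_cuda": visible_cuda, "n_gpus": n_gpus}
```

The returned dict uses the documented keyword names, so it could be unpacked into the constructor, e.g. `x.GPUQueuer(**check_request("0,1", n_gpus=2), memo="baseline run")`.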
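## Experiment Plans
Plug and play, with no impact on your existing code.
Simply create a folder named "exp_plans" in the root directory of your code and place in it all the scripts that each experiment plan needs to run, named "xxx.sh". See the Zhihu blog post for examples.
To cancel an experiment, just comment it out. During batch launching, experiments are started one per minute to avoid GPU contention.
To launch, simply type the following in the terminal: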
```
xx
```
This automatically scans the current directory for all plan files and launches the ones you select.
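The scanning step can be pictured roughly as follows. This is only a sketch under assumptions, not xspike's actual implementation: it assumes each plan is a shell script with one experiment command per line and that a leading "#" marks a cancelled experiment. The function names are hypothetical.

```python
from pathlib import Path


def find_plans(root="."):
    """Return the .sh plan files under <root>/exp_plans, sorted by name."""
    plan_dir = Path(root) / "exp_plans"
    if not plan_dir.is_dir():
        return []
    return sorted(
        p for p in plan_dir.iterdir() if p.is_file() and p.suffix == ".sh"
    )


def active_commands(plan_text):
    """Return the lines of a plan that are neither blank nor commented out.

    Assumption: one experiment command per line, with "#" marking a
    cancelled experiment. The real plan format may differ.
    """
    commands = []
    for line in plan_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            commands.append(stripped)
    return commands
```

A launcher built on this would then start the returned commands one at a time, waiting about a minute between them as described above.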
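## DingTalk Notifications
xspike integrates the notification feature of DingtalkChatbot. Set the DINGDING_ACCESS_TOKEN and DINGDING_SECRET environment variables, or pass the values directly as arguments in the function call, then send a notification with the code below: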
```
x.notice("Hello World!")
```
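Before sending a notification, it can help to confirm that both credentials are actually available. The sketch below resolves them from explicit arguments first and the two environment variables second. The function name and the precedence order are assumptions for illustration; the README does not say how x.notice resolves them internally.

```python
import os


def resolve_dingtalk_credentials(access_token=None, secret=None, env=None):
    """Pick DingTalk credentials from arguments first, then the environment.

    Hypothetical helper, not part of xspike. `env` defaults to os.environ
    and exists so the lookup can be exercised with a plain dict.
    """
    env = os.environ if env is None else env
    token = access_token or env.get("DINGDING_ACCESS_TOKEN")
    sec = secret or env.get("DINGDING_SECRET")
    missing = [
        name
        for name, value in (
            ("DINGDING_ACCESS_TOKEN", token),
            ("DINGDING_SECRET", sec),
        )
        if not value
    ]
    if missing:
        # Fail early with the names of the settings that still need a value.
        raise RuntimeError("missing DingTalk settings: " + ", ".join(missing))
    return token, sec
```

Calling this at the top of a training script surfaces a missing token right away rather than at the point where the first notification is sent.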
## Comet.ml Experiment Management
Currently, only creating a Comet environment and uploading folders are implemented.
```
comet_client = CometClient(project_name="Default", api_key="xxx", exp_name="MoE Baseline")
```
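Upload a folder, for example the code for the current experiment, to make results easier to reproduce.
All ".py" and ".yml" files in the folder are uploaded, including those in subfolders.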
```
comet_client.log_directory("./")
```
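To preview which files such an upload would pick up, the selection rule described above (all ".py" and ".yml" files, including subfolders) can be approximated with pathlib. This is a sketch of the rule as stated here, not the code xspike runs; the function name is hypothetical.

```python
from pathlib import Path


def collect_uploadable(root, suffixes=(".py", ".yml")):
    """Recursively list files under root whose extension is in suffixes.

    Paths are returned relative to root, using forward slashes, in sorted
    order. Hypothetical helper, not part of xspike.
    """
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes
    )
```

Running `collect_uploadable("./")` before the upload gives a quick check that large or unrelated files are not sitting in the tree under a matching extension.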
## Dependencies
xspike is very lightweight and has few dependencies, which reduces the chance of conflicts with your existing environment.
```
nvitop
redis
rich
psutil
jsonlines
setproctitle
dingtalkchatbot
comet_ml
```
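## Future Plans
- [ ] Integrate Comet.ml experiment logging so that it can be used across major frameworks with minimal code, and provide convenient logging templates (Callback)
- [ ] Integrate commonly used experiment utility classes and methods, adding as few dependencies as possible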
Raw data
{
"_id": null,
"home_page": "https://github.com/deng1fan/xspike.git",
"name": "xspike",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "gpu, queuer, redis",
"author": "deng1fan",
"author_email": "dengyifan@iie.ac.cn",
"download_url": "https://files.pythonhosted.org/packages/ad/18/ce790a537f8e5b3e6857d6bf8a38d94edba245fd52dbee1d1fe116c8667c/xspike-0.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "\u5de5\u5177\u5305\uff0c\u5305\u542b Redis \u5b89\u88c5\u3001\u57fa\u4e8e Redis \u7684GPU \u4efb\u52a1\u961f\u5217\u7ba1\u7406\u7b49\u529f\u80fd",
"version": "0.3",
"project_urls": {
"Homepage": "https://github.com/deng1fan/xspike.git"
},
"split_keywords": [
"gpu",
" queuer",
" redis"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "ad18ce790a537f8e5b3e6857d6bf8a38d94edba245fd52dbee1d1fe116c8667c",
"md5": "6338d42b4a16f0fb0fa39effb8d13a2c",
"sha256": "f7aad45fe6854ab79f8b72d860d4e5f791461aecc20ec32142f888f6b11eaca5"
},
"downloads": -1,
"filename": "xspike-0.3.tar.gz",
"has_sig": false,
"md5_digest": "6338d42b4a16f0fb0fa39effb8d13a2c",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 22422,
"upload_time": "2024-10-19T03:51:40",
"upload_time_iso_8601": "2024-10-19T03:51:40.885808Z",
"url": "https://files.pythonhosted.org/packages/ad/18/ce790a537f8e5b3e6857d6bf8a38d94edba245fd52dbee1d1fe116c8667c/xspike-0.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-19 03:51:40",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "deng1fan",
"github_project": "xspike",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "xspike"
}