Name | kstprocess |
Version | 0.3.12 |
home_page | None |
Summary | dataprocess for dialog robot |
upload_time | 2025-08-01 08:07:32 |
maintainer | None |
docs_url | None |
author | kevin |
requires_python | >=3.6 |
license | None |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
## Commands for packaging and uploading to PyPI
```bash
python3 setup.py sdist bdist_wheel
twine upload dist/*
# Authenticate with a PyPI API token (username: __token__); keep the token out of version control.
```
Then delete the `build`, `dist`, and `kstprocess.egg-info` directories.
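The cleanup step can also be scripted. The following is a minimal sketch, assuming the three directories sit in the project root next to `setup.py`. The helper name `clean_build_artifacts` is hypothetical and not part of kstprocess.

```python
import shutil
from pathlib import Path

# Directory names taken from the cleanup note above.
ARTIFACT_DIRS = ("build", "dist", "kstprocess.egg-info")


def clean_build_artifacts(project_root="."):
    """Delete build artifacts under project_root and return the names removed."""
    removed = []
    for name in ARTIFACT_DIRS:
        path = Path(project_root) / name
        if path.is_dir():
            shutil.rmtree(path)  # removes the directory and everything in it
            removed.append(name)
    return removed
```

Calling `clean_build_artifacts()` from the project root after a successful upload removes whichever of the three directories exist and ignores the rest.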
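## Step 1: Process the raw data
Export the conversation records from 快商通 (main admin console > robot > conversation records). The exported file has a name like `2025年07月07日10时15分49秒-对话记录导出.xlsx`.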
```python
process_data(
    input_file_path='./接入数据-对话流.xlsx',
    output_file_path='./版本_主题_对话流.xlsx'
)
```
## Step 2: Plot the comparison chart
```python
plot_comparison(
    file1_path='./版本_主题_候选话术库_kicp_gpt.xlsx',
    file2_path='./版本_主题_kicp-GPT.xlsx',
    prefix1='候选话术库_kicp_gpt',
    prefix2='kicp-GPT',
    save_path='./留联率对比.jpg',
    min_valid_conversations=5,
)
```
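The source does not show how `plot_comparison` uses `min_valid_conversations`. One plausible reading is that topics with fewer valid conversations than the threshold are dropped before the contact-retention rate (留联率) of the two versions is compared. The sketch below works under that assumption only. The function name, the input layout, and the rate definition (conversations that left contact details divided by all valid conversations) are hypothetical.

```python
def contact_rates(counts, min_valid_conversations=5):
    """Map topic -> contact rate, skipping topics with too few conversations.

    counts maps topic -> (valid_conversations, conversations_with_contact).
    """
    rates = {}
    for topic, (valid, with_contact) in counts.items():
        if valid < min_valid_conversations:
            continue  # too little data for a meaningful rate
        rates[topic] = with_contact / valid
    return rates


# Made-up numbers, for illustration only.
version_a = {"fever": (20, 5), "cough": (3, 2)}
print(contact_rates(version_a))
```

With a threshold like this, a topic with only a handful of conversations cannot dominate the chart with a noisy rate.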
## Places where one change affects everything
```python
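# In kstprocess/preferenceDataConstruct/OnlinePreferenceConstructPipeline.py (convert_my_format_to_center / convert):
_, search_text, guide_text = history[0]["content"].replace("你是儿科高级咨询师\n你的核心目标是获取访客联系方式。 ", "").split("\n")
# If the prompt prefix changes, the string "你是儿科高级咨询师\n你的核心目标是获取访客联系方式。 " above must be changed to match.
# (The prefix reads: "You are a senior pediatric consultant. Your core goal is to obtain the visitor's contact information.")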
```
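Because the unpacking above silently depends on the exact prompt text, one option is to keep the prefix in a single constant and fail loudly when it no longer matches. The following is a minimal sketch, not the package's actual code. `PROMPT_PREFIX` and `split_first_message` are hypothetical names, and the three-line layout after the prefix is inferred from the three-way unpacking in the original line. Unlike the original `replace` call, this sketch assumes the prefix sits at the very start of the message.

```python
# Assumed to be the exact system-prompt prefix; keep it in sync with the prompt template.
PROMPT_PREFIX = "你是儿科高级咨询师\n你的核心目标是获取访客联系方式。 "


def split_first_message(content, prefix=PROMPT_PREFIX):
    """Return (search_text, guide_text) parsed from the first history message."""
    if not content.startswith(prefix):
        raise ValueError("prompt prefix changed; update PROMPT_PREFIX")
    parts = content[len(prefix):].split("\n")
    if len(parts) != 3:
        raise ValueError(f"expected 3 lines after the prefix, got {len(parts)}")
    _, search_text, guide_text = parts
    return search_text, guide_text
```

A call such as `split_first_message(history[0]["content"])` then raises a clear error as soon as the prompt template drifts, instead of failing with an unpacking error or yielding shifted fields.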
## DPO data construction
Raw data
{
"_id": null,
"home_page": null,
"name": "kstprocess",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": null,
"keywords": null,
"author": "kevin",
"author_email": "kevin.yuzhenjie@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/8e/f4/009e86eb867cba8a2c098e3ad337c18b19caf1147cc6da070b138e663e8e/kstprocess-0.3.12.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "dataprocess for dialog robot",
"version": "0.3.12",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "48d7f5cd7284915ace0815077d4c450b07cfda2ac9090b99ccb84c450f99e0c0",
"md5": "916addc4a00e9bd39419e0ed2a25d291",
"sha256": "a08555f2326148e34944d0c58926d5e61b8914417b078019ce242e5400ee8dfb"
},
"downloads": -1,
"filename": "kstprocess-0.3.12-py3-none-any.whl",
"has_sig": false,
"md5_digest": "916addc4a00e9bd39419e0ed2a25d291",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.6",
"size": 104893,
"upload_time": "2025-08-01T08:07:31",
"upload_time_iso_8601": "2025-08-01T08:07:31.415094Z",
"url": "https://files.pythonhosted.org/packages/48/d7/f5cd7284915ace0815077d4c450b07cfda2ac9090b99ccb84c450f99e0c0/kstprocess-0.3.12-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8ef4009e86eb867cba8a2c098e3ad337c18b19caf1147cc6da070b138e663e8e",
"md5": "2cf7d4a37db93ab2bcd6b7a734957c1f",
"sha256": "d09a8e94fdd0e2400aff7558cc3c2285caf565196b5cb27dfedfc2041fd592f5"
},
"downloads": -1,
"filename": "kstprocess-0.3.12.tar.gz",
"has_sig": false,
"md5_digest": "2cf7d4a37db93ab2bcd6b7a734957c1f",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 85754,
"upload_time": "2025-08-01T08:07:32",
"upload_time_iso_8601": "2025-08-01T08:07:32.518943Z",
"url": "https://files.pythonhosted.org/packages/8e/f4/009e86eb867cba8a2c098e3ad337c18b19caf1147cc6da070b138e663e8e/kstprocess-0.3.12.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-01 08:07:32",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "kstprocess"
}