## Automated Scorecard Building
## Exchange of Ideas
| WeChat | WeChat Official Account |
| :---: | :----: |
| <img src="https://github.com/ZhengRyan/autotreemodel/blob/master/images/%E5%B9%B2%E9%A5%AD%E4%BA%BA.png" alt="RyanZheng.png" width="50%" border=0/> | <img src="https://github.com/ZhengRyan/autotreemodel/blob/master/images/%E9%AD%94%E9%83%BD%E6%95%B0%E6%8D%AE%E5%B9%B2%E9%A5%AD%E4%BA%BA.png" alt="魔都数据干饭人.png" width="50%" border=0/> |
| 干饭人 | 魔都数据干饭人 |
> Repository: https://github.com/ZhengRyan/autobmt
>
> WeChat official account article: https://mp.weixin.qq.com/s/u8Nsp5M93WIGL2M0tU4U_g
>
> PyPI package: https://pypi.org/project/autobmt/
>
> Experiment data: https://pan.baidu.com/s/1BRIHH9Wcwy2EZaO5xSgH9w?pwd=tdq5 (extraction code: tdq5)
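## 1. Environment Setup
A dedicated virtual environment is not required, because the dependencies are all commonly used Python packages. If you do want one, see section 5, "Installing Dependencies".
### Installing `autobmt`
Install with pip: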
```bash
pip install autobmt # to install
pip install -U autobmt # to upgrade
```
Install from source:
```bash
python setup.py install
```
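To confirm that the installation succeeded, you can query the installed version from Python. The helper below is a minimal sketch that uses only the standard library (with a `pkg_resources` fallback for Python 3.6/3.7); the function name `installed_version` is ours for illustration and is not part of `autobmt`.

```python
def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        # importlib.metadata is in the standard library from Python 3.8 onward.
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:
        # Fallback for Python 3.6/3.7; pkg_resources ships with setuptools.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(package).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints the version string if autobmt is installed, otherwise None.
print(installed_version("autobmt"))
```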
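## 2. Tutorials
1. Build a scorecard automatically with one line of code: see the example in `autobmt/examples/autobmt_lr_tutorial_code.py`.
2. Build a scorecard step by step: see `autobmt/examples/tutorial_code.ipynb`, which breaks the automated process down into detailed steps with examples.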
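## 3. Training, Automatic Variable Selection, Automatic Monotonic Optimal Binning, Automatic Model Building, and Automatic Scorecard Building
1. Step 1: EDA (exploratory data analysis) of the full dataset
2. Step 2: Coarse feature screening
3. Step 3: Apply optimal binning to the variables that pass coarse screening
4. Step 4: Apply the WOE transformation to the optimally binned variables
5. Step 5: Run stepwise selection on the WOE-transformed variables
6. Step 6: Build the model with logistic regression
7. Step 7: Build the scorecard
8. Step 8: Persist the model, the bin split points, the WOE values, and the scorecard structure
9. Step 9: Persist the intermediate modeling results to Excel for later review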
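The WOE transformation (Step 4) and the scorecard conversion (Step 7) can be illustrated with a small standard-library sketch. This is not the `autobmt` implementation: the function names, the smoothing constant `eps`, the sign convention (WOE as the log of bad share over good share), and the scaling parameters `base_score`, `base_odds`, and `pdo` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def woe_iv(bins, targets, eps=0.5):
    """Return ({bin_label: woe}, iv) for one already-binned variable.

    bins:    list with one bin label per sample.
    targets: list with 1 for a bad sample and 0 for a good sample.
    Assumed convention: woe = ln(bad share / good share).
    eps is added to every count so that an empty cell never yields log(0).
    """
    bad = Counter(b for b, y in zip(bins, targets) if y == 1)
    good = Counter(b for b, y in zip(bins, targets) if y == 0)
    labels = sorted(set(bins))
    bad_total = sum(bad[b] + eps for b in labels)
    good_total = sum(good[b] + eps for b in labels)

    woe, iv = {}, 0.0
    for b in labels:
        bad_share = (bad[b] + eps) / bad_total
        good_share = (good[b] + eps) / good_total
        woe[b] = math.log(bad_share / good_share)
        iv += (bad_share - good_share) * woe[b]
    return woe, iv


def score_from_log_odds(log_odds, base_score=600.0, base_odds=1 / 50, pdo=20.0):
    """Map the model's log-odds of being bad to a score.

    Assumed scaling: the score equals base_score at odds == base_odds and
    rises by pdo points every time the bad-to-good odds halve.
    """
    factor = pdo / math.log(2)
    offset = base_score + factor * math.log(base_odds)
    return offset - factor * log_odds


# Toy data: bin "a" is mostly good, bin "b" is mostly bad.
woe, iv = woe_iv(["a"] * 4 + ["b"] * 4, [1, 0, 0, 0, 1, 1, 1, 0])
print(woe, iv)
```

In a logistic regression fitted on WOE values, the log-odds passed to `score_from_log_odds` would be the intercept plus the sum of each coefficient times the WOE of the bin a sample falls into.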
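## 4. Saved Modeling Output Files
1. `all_data_eda.xlsx`: EDA results for the full dataset.
2. `build_model_log_var_jpg` folder: binning plots for the variables in the final model. They are also recorded in the last sheet of `build_model_log.xlsx`.
3. `build_model_log.xlsx`: a log of the entire model-building process, kept to make later review easier.
4. `fb.pkl`, `woetf.pkl`, `lrmodel.pkl`, `in_model_var.pkl`: the binning file, the WOE transformation file, the model file, and the file listing the variables in the model, respectively.
5. `scorecard.pkl`, `scorecard.csv`, `scorecard.json`: the scorecard in pkl, CSV, and JSON formats. It is also recorded in the "scorecard_structure" sheet of `build_model_log.xlsx`.
6. `var_bin_woe_format.csv`, `var_bin_woe_format.json`, `var_bin_woe.csv`, `var_bin_woe.json`, `var_split_point_format.csv`, `var_split_point_format.json`, `var_split_point.csv`, `var_split_point.json`: the binning and WOE transformation files in CSV and JSON formats.
7. `lr_auc_ks_psi.csv`: the model's AUC, KS, and PSI.
8. `lr_pred_to_report_data.csv`: the data used to build the modeling report.
9. `lr_test_input.csv`: after the model goes live, feed this data into the deployed model and check whether the results match `lr_pred_to_report_data.csv`, to verify that the deployment is correct.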
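The deployment check described for `lr_test_input.csv` amounts to comparing two columns of scores row by row. The sketch below shows one way to do that with the standard library only. The helper names, the tolerance, the `score` column name, and the `online_scores.csv` file are assumptions for illustration; the actual column layout of the files written by `autobmt` is not documented here, so adjust these to match your output.

```python
import csv
import math


def read_column(path, column):
    """Read one numeric column from a CSV file into a list of floats."""
    with open(path, newline="", encoding="utf-8") as f:
        return [float(row[column]) for row in csv.DictReader(f)]


def mismatches(expected, actual, tol=1e-6):
    """Return the row indices where two score lists differ by more than tol."""
    if len(expected) != len(actual):
        raise ValueError(
            "row counts differ: %d vs %d" % (len(expected), len(actual))
        )
    return [
        i
        for i, (e, a) in enumerate(zip(expected, actual))
        if not math.isclose(e, a, rel_tol=0.0, abs_tol=tol)
    ]


# Hypothetical usage: "score" is an assumed column name and
# online_scores.csv is an assumed export from the deployed service.
# offline = read_column("lr_pred_to_report_data.csv", "score")
# online = read_column("online_scores.csv", "score")
# assert not mismatches(offline, online)
```

An absolute tolerance is used instead of exact equality because the online service may serialize floats with fewer digits than the offline run.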
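## 5. Installing Dependencies
Creating a virtual environment first is recommended but optional. Its purpose is to avoid dependency conflicts with other projects; without one, simply run `pip install` in the base Python environment.
#### Create the virtual environment
conda create -y --force -n autobmt python=3.7.2
#### Activate the virtual environment
conda activate autobmt
### Dependency installation, option 1: run the following command to install the required packages
pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/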
Raw data
{
"_id": null,
"home_page": "https://github.com/ZhengRyan/autobmt",
"name": "autobmt",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "autobmt",
"author": "RyanZheng",
"author_email": "zhengruiping000@163.com",
"download_url": "https://files.pythonhosted.org/packages/13/86/f1a8e29d7264930966bc2747eda21ec96f1a0a6a923723965bde8da3c28f/autobmt-0.1.9.tar.gz",
"platform": null,
"description": "##\u81ea\u52a8\u6784\u5efa\u8bc4\u5206\u5361\n\n## \u601d\u60f3\u78b0\u649e\n\n| \u5fae\u4fe1 | \u5fae\u4fe1\u516c\u4f17\u53f7 |\n| :---: | :----: |\n| <img src=\"https://github.com/ZhengRyan/autotreemodel/blob/master/images/%E5%B9%B2%E9%A5%AD%E4%BA%BA.png\" alt=\"RyanZheng.png\" width=\"50%\" border=0/> | <img src=\"https://github.com/ZhengRyan/autotreemodel/blob/master/images/%E9%AD%94%E9%83%BD%E6%95%B0%E6%8D%AE%E5%B9%B2%E9%A5%AD%E4%BA%BA.png\" alt=\"\u9b54\u90fd\u6570\u636e\u5e72\u996d\u4eba.png\" width=\"50%\" border=0/> |\n| \u5e72\u996d\u4eba | \u9b54\u90fd\u6570\u636e\u5e72\u996d\u4eba |\n\n\n> \u4ed3\u5e93\u5730\u5740\uff1ahttps://github.com/ZhengRyan/autobmt\n> \n> \u5fae\u4fe1\u516c\u4f17\u53f7\u6587\u7ae0\uff1ahttps://mp.weixin.qq.com/s/u8Nsp5M93WIGL2M0tU4U_g\n> \n> pipy\u5305\uff1ahttps://pypi.org/project/autobmt/\n> \n> \u5b9e\u9a8c\u6570\u636e\uff1a\u94fe\u63a5: https://pan.baidu.com/s/1BRIHH9Wcwy2EZaO5xSgH9w?pwd=tdq5 \u63d0\u53d6\u7801: tdq5\n\n## \u4e00\u3001\u73af\u5883\u51c6\u5907\n\u53ef\u4ee5\u4e0d\u7528\u5355\u72ec\u521b\u5efa\u865a\u62df\u73af\u5883\uff0c\u90fd\u662f\u65e5\u5e38\u5e38\u7528\u7684python\u4f9d\u8d56\u5305\u3002\u9700\u8981\u521b\u5efa\u865a\u62df\u73af\u5883\uff0c\u8bf7\u53c2\u8003\"\u4e94\u3001\u4f9d\u8d56\u5305\u5b89\u88c5\"\n\n### `autobmt` \u5b89\u88c5\npip install\uff08pip\u5b89\u88c5\uff09\n\n```bash\npip install autobmt # to install\npip install -U autobmt # to upgrade\n```\n\nSource code install\uff08\u6e90\u7801\u5b89\u88c5\uff09\n\n```bash\npython setup.py install\n```\n\n## 
\u4e8c\u3001\u4f7f\u7528\u6559\u7a0b\n1\u30011\u884c\u4ee3\u7801\u81ea\u52a8\u6784\u5efa\u8bc4\u5206\u5361\uff1a\u8bf7\u67e5\u770bautobmt/examples/autobmt_lr_tutorial_code.py\u3002\u91cc\u9762\u6709\u4f8b\u5b50\n\n2\u30011\u6b651\u6b65\u62c6\u89e3\u81ea\u52a8\u6784\u5efa\u8bc4\u5206\u5361\u7684\u6b65\u9aa4\uff1a\u8bf7\u67e5\u770bautobmt/examples/tutorial_code.ipynb\u3002\u91cc\u9762\u6709\u8be6\u7ec6\u6b65\u9aa4\u62c6\u89e3\u4f8b\u5b50\n\n## \u4e09\u3001\u8bad\u7ec3\u3001\u81ea\u52a8\u9009\u53d8\u91cf\u3001\u81ea\u52a8\u5355\u8c03\u6700\u4f18\u5206\u7bb1\u3001\u81ea\u52a8\u6784\u5efa\u6a21\u578b\u3001\u81ea\u52a8\u6784\u5efa\u8bc4\u5206\u5361\n1\u3001Step 1: EDA\uff0c\u6574\u4f53\u6570\u636e\u63a2\u7d22\u6027\u6570\u636e\u5206\u6790\n\n2\u3001Step 2: \u7279\u5f81\u7c97\u7b5b\u9009\n\n3\u3001Step 3: \u5bf9\u7c97\u7b5b\u9009\u540e\u7684\u53d8\u91cf\u8c03\u7528\u6700\u4f18\u5206\u7bb1\n\n4\u3001Step 4: \u5bf9\u6700\u4f18\u5206\u7bb1\u540e\u7684\u53d8\u91cf\u8fdb\u884cwoe\u8f6c\u6362\n\n5\u3001Step 5: \u5bf9woe\u8f6c\u6362\u540e\u7684\u53d8\u91cf\u8fdb\u884cstepwise\n\n6\u3001Step 6: \u7528\u903b\u8f91\u56de\u5f52\u6784\u5efa\u6a21\u578b\n\n7\u3001Step 7: \u6784\u5efa\u8bc4\u5206\u5361\n\n8\u3001Step 8: \u6301\u4e45\u5316\u6a21\u578b\uff0c\u5206\u7bb1\u70b9\uff0cwoe\u503c\uff0c\u8bc4\u5206\u5361\u7ed3\u6784\n\n9\u3001Step 9: \u6301\u4e45\u5316\u5efa\u6a21\u4e2d\u95f4\u7ed3\u679c\u5230excel\uff0c\u65b9\u4fbf\u590d\u76d8\n\n## 
\u56db\u3001\u4fdd\u5b58\u7684\u5efa\u6a21\u7ed3\u679c\u76f8\u5173\u6587\u4ef6\u8bf4\u660e\n1\u3001all_data_eda.xlsx\uff1a\u6574\u4f53\u6570\u636e\u7684EDA\u60c5\u51b5\n\n2\u3001build_model_log_var_jpg\u6587\u4ef6\u5939\uff0c\u6700\u7ec8\u5165\u6a21\u53d8\u91cf\u7684\u5206\u7bb1\u753b\u56fe\uff0c\u5728\"build_model_log.xlsx\"\u6700\u540e1\u4e2asheet\u4e5f\u6709\u8bb0\u5f55\n\n3\u3001build_model_log.xlsx\uff1a\u6784\u5efa\u6574\u4e2a\u6a21\u578b\u7684\u8fc7\u7a0b\u65e5\u5fd7\uff0c\u8bb0\u5f55\u6709\u5229\u590d\u76d8\n\n4\u3001fb.pkl\u3001woetf.pkl\u3001lrmodel.pkl\u3001in_model_var.pkl\uff1afb.pkl\u5206\u7bb1\u6587\u4ef6\uff0cwoetf.pkl\u8f6cwoe\u6587\u4ef6\uff0clrmodel.pkl\u6a21\u578b\u6587\u4ef6\uff0c\u5165\u6a21\u53d8\u91cf\u6587\u4ef6\n\n5\u3001scorecard.pkl\u3001scorecard.csv\u3001scorecard.json\uff1a\u8bc4\u5206\u5361\u7684pkl\u3001csv\u3001json\u683c\u5f0f\u3002\u5728\"build_model_log.xlsx\"\u7684\"scorecard_structure\"sheet\u4e5f\u6709\u8bb0\u5f55\n\n6\u3001var_bin_woe_format.csv\u3001var_bin_woe_format.json\u3001var_bin_woe.csv\u3001var_bin_woe.json\u3001var_split_point_format.csv\u3001var_split_point_format.json\u3001var_split_point.csv\u3001var_split_point.json\uff1a\u5206\u7bb1\u6587\u4ef6\u548c\u8f6cwoe\u6587\u4ef6\u7684csv\u3001json\u683c\u5f0f\n\n7\u3001lr_auc_ks_psi.csv\uff1a\u6a21\u578b\u7684auc\u3001ks\u3001psi\n\n8\u3001lr_pred_to_report_data.csv\uff1a\u6784\u5efa\u5efa\u6a21\u62a5\u544a\u7684\u6570\u636e\n\n9\u3001lr_test_input.csv\uff1a\u7528\u4e8e\u6a21\u578b\u4e0a\u7ebf\u540e\uff0c\u5c06\u6b21\u6570\u636e\u5582\u5165\u6a21\u578b\uff0c\u5bf9\u6bd4\u548clr_pred_to_report_data.csv\u7ed3\u679c\u662f\u5426\u4e00\u81f4\u3002\u9a8c\u8bc1\u6a21\u578b\u4e0a\u7ebf\u7684\u6b63\u786e\u6027\n\n## 
\u4e94\u3001\u4f9d\u8d56\u5305\u5b89\u88c5\uff08\u5efa\u8bae\u5148\u521b\u5efa\u865a\u62df\u73af\u5883\uff0c\u4e0d\u521b\u5efa\u865a\u62df\u73af\u5883\u4e5f\u884c\uff0c\u521b\u5efa\u865a\u62df\u73af\u5883\u662f\u4e3a\u4e86\u4e0d\u548c\u5176\u5b83\u9879\u76ee\u6709\u4f9d\u8d56\u5305\u7684\u51b2\u7a81\uff0c\u4e0d\u521b\u5efa\u865a\u62df\u73af\u5883\u7684\u8bdd\u5728\u57fa\u7840python\u73af\u5883\u6267\u884cpip install\u5373\u53ef\uff09\n####\u521b\u5efa\u865a\u62df\u73af\u5883\nconda create -y --force -n autobmt python=3.7.2\n####\u6fc0\u6d3b\u865a\u62df\u73af\u5883\nconda activate autobmt\n\n### \u4f9d\u8d56\u5305\u5b89\u88c5\u65b9\u5f0f\u4e00\uff0c\u6267\u884c\u5982\u4e0b\u547d\u4ee4\u5b89\u88c5\u4f9d\u8d56\u7684\u5305\npip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple/\n\n\n\n\n\n",
"bugtrack_url": null,
"license": "MIT license",
"summary": "a modeling tool that automatically builds scorecards and tree models.",
"version": "0.1.9",
"project_urls": {
"Homepage": "https://github.com/ZhengRyan/autobmt"
},
"split_keywords": [
"autobmt"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "fef13a05cf72eb2d84beb64ef8c369fe53aef4d28f799f39286a744ca9403344",
"md5": "7e2d8a8d098bbed8384e29bbd963f658",
"sha256": "8dbb6d866b75b85c20657f2d7734937fd32261f4618e06ac42411ac9ec343dc5"
},
"downloads": -1,
"filename": "autobmt-0.1.9-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "7e2d8a8d098bbed8384e29bbd963f658",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": ">=3.6",
"size": 82294,
"upload_time": "2023-12-10T13:44:57",
"upload_time_iso_8601": "2023-12-10T13:44:57.911011Z",
"url": "https://files.pythonhosted.org/packages/fe/f1/3a05cf72eb2d84beb64ef8c369fe53aef4d28f799f39286a744ca9403344/autobmt-0.1.9-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "1386f1a8e29d7264930966bc2747eda21ec96f1a0a6a923723965bde8da3c28f",
"md5": "ae29f6cac91ea8ce7940f281615878de",
"sha256": "a2e5a410a5fb2c9e6604c4c0c4c210098e11200be2c91303a84761d7364b2fab"
},
"downloads": -1,
"filename": "autobmt-0.1.9.tar.gz",
"has_sig": false,
"md5_digest": "ae29f6cac91ea8ce7940f281615878de",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.6",
"size": 2697921,
"upload_time": "2023-12-10T13:45:01",
"upload_time_iso_8601": "2023-12-10T13:45:01.577216Z",
"url": "https://files.pythonhosted.org/packages/13/86/f1a8e29d7264930966bc2747eda21ec96f1a0a6a923723965bde8da3c28f/autobmt-0.1.9.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-10 13:45:01",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ZhengRyan",
"github_project": "autobmt",
"travis_ci": true,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pandas",
"specs": []
},
{
"name": "scikit-learn",
"specs": [
[
">=",
"0.21"
]
]
},
{
"name": "statsmodels",
"specs": [
[
">=",
"0.11.1"
]
]
},
{
"name": "XlsxWriter",
"specs": [
[
">=",
"1.3.7"
]
]
},
{
"name": "matplotlib",
"specs": [
[
">=",
"3.1.2"
]
]
},
{
"name": "openpyxl",
"specs": [
[
">=",
"3.0.7"
]
]
},
{
"name": "bayesian-optimization",
"specs": [
[
"==",
"1.1.0"
]
]
},
{
"name": "shap",
"specs": [
[
">=",
"0.40.0"
]
]
},
{
"name": "joblib",
"specs": [
[
">=",
"0.12"
]
]
},
{
"name": "xgboost",
"specs": [
[
">=",
"1.2.0"
],
[
"<=",
"1.5.0"
]
]
},
{
"name": "lightgbm",
"specs": [
[
">=",
"3.1.0"
]
]
},
{
"name": "seaborn",
"specs": [
[
">=",
"0.10.0"
]
]
}
],
"lcname": "autobmt"
}