# Atom: Feature Store Tool

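## Introduction
+ Atom is a feature management tool built on two basic concepts: data and operators. Data is the base data used to train and construct features. An operator is a procedure that produces new features from a fixed set of one or more datasets; it can be a simple direct-computation function, a complex algorithmic model, or a combination of the two. What sets Atom apart is that it links each operator to the data it is derived from, manages operators in one place, and exposes them as a service directly, so every operator can compute features online in real time for the main algorithmic model and help improve its accuracy.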
## Installation
+ Atom is written in Python and, thanks to Python's healthy ecosystem, can be installed with the usual Python package managers.
```bash
$ pip install atom-0.1-xxxxxxxxxxxx.whl
```
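## Quick Start
+ First, use the `atomctl` command-line tool to set the workspace and initialize it. Then start the metadata service and the Atom main service separately (neither service supports running in the background yet).
+ An example `atomctl` session: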
```bash
$ atomctl set --workspace 'D:\Workspace\JiYuan\Atom\Demo\test'
$ atomctl init
$ atomctl metadata-service
$ atomctl start-service
```
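Because neither service can run in the background, each one occupies its own terminal. One possible workaround, sketched below, is to drive the same four commands from Python with the standard `subprocess` module. The helper names `start_atom` and `stop_atom` are hypothetical, and the sketch assumes `atomctl` is on `PATH` and that the main service can be started immediately after the metadata service; neither point is confirmed by this guide.

```python
import subprocess

WORKSPACE = r"D:\Workspace\JiYuan\Atom\Demo\test"  # example path from the guide

# One-off setup commands, run to completion in order.
SETUP_COMMANDS = [
    ["atomctl", "set", "--workspace", WORKSPACE],
    ["atomctl", "init"],
]
# Long-running services; each one blocks the process it runs in.
SERVICE_COMMANDS = [
    ["atomctl", "metadata-service"],
    ["atomctl", "start-service"],
]


def start_atom():
    """Run the setup commands, then start both services as child processes."""
    for cmd in SETUP_COMMANDS:
        subprocess.run(cmd, check=True)
    return [subprocess.Popen(cmd) for cmd in SERVICE_COMMANDS]


def stop_atom(processes):
    """Terminate the service processes returned by start_atom()."""
    for proc in processes:
        proc.terminate()
        proc.wait()

# Usage (requires atomctl to be installed):
# services = start_atom()
# ... work with Atom ...
# stop_atom(services)
```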
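+ Next, use a Python script to work with Atom's data and operators. The main operations are the three basic ones (registering, querying, and removing data and operators) plus loading operators.
+ An example Atom main script: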
```python
import pandas as pd

from atom.scheduler import *

### Load the Atom scheduler
atom = AtomScheduler(mode='delay')

### register-data test
df = pd.read_csv(r"D:\Workspace\JiYuan\WindPowerForecast\LSTM\demo\merge_data_GDTYUAN_ec.csv")
atom.data_register(tag='test',
                   belong='first',
                   object_name='merge_data_GDTYUAN_ec_1',
                   data_object=df,
                   remarks='this is a test data!')

### register-operator test
### Immediate mode, decorator style
# @atom.operator_register(tag='test',
#                         belong='first',
#                         object_name='test_function_a',
#                         remarks='this is a test operator!')
def test_function(a, b):
    c = a + b * 2 + 1
    return c
# tmp_a = test_function(1,2)
### Immediate mode, function style
# tmp_func = atom.operator_register(tag='test',
#                                   belong='first',
#                                   object_name='test_function_b',
#                                   remarks='this is a test operator!')(test_function)
# tmp_b = tmp_func(3,4)
### Delayed mode
atom.operator_register(tag='test',
                       belong='first',
                       object_name='test_function_a',  ## cc
                       operator_object=test_function,
                       remarks='this is a test operator!')

# ### data-remove test
# atom.data_remove(tag='test',object_name='merge_data_GDTYUAN_ec_00')

### operator-remove test
# atom.operator_remove(tag='test',object_name='test_function_cc')

### data-query test
data_view_df = atom.data_query(tag='test')
print(data_view_df)

### operator-query test
operator_view_df = atom.operator_query(tag='test')
print(operator_view_df)

### data-modify test
atom.data_modify()

### operator-modify test
atom.operator_modify()

### data-load test
data_load_df = atom.data_load(tag='test',object_name='merge_data_GDTYUAN_ec_1')
print(data_load_df)

### operator-load test
operator_load_a = atom.operator_load(tag='test',object_name='test_function_a')
print(operator_load_a(10,20))
# print(test_function(**{'a':10,'b':20})) ### passing arguments as a dict
```
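+ Finally, use the operator's online computation service. Once an operator is registered with Atom, it automatically gains online computation service capability.
+ The form data format is as follows: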
```python
### The form data is shown in Python for illustration only
post_form = {
    "tag": "test",
    "object_name": "test_function_a",
    "data_json": {"a": 78, "b": 9}
}
```
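As a rough sketch of how such a form might be sent, the snippet below builds an HTTP POST request with the standard library. The endpoint `http://127.0.0.1:8000/operator` is a placeholder, because this guide does not give the service address or route. Sending `data_json` as a JSON string inside a URL-encoded form is likewise an assumption, and `build_operator_request` is a hypothetical helper; adjust all three to match the running service.

```python
import json
from urllib import parse, request

# Hypothetical endpoint: the real host, port and route are not documented here.
SERVICE_URL = "http://127.0.0.1:8000/operator"


def build_operator_request(url, tag, object_name, data):
    """Build (but do not send) a POST request carrying the operator form."""
    form = {
        "tag": tag,
        "object_name": object_name,
        # Assumption: the nested arguments travel as a JSON-encoded string field.
        "data_json": json.dumps(data),
    }
    body = parse.urlencode(form).encode("utf-8")
    return request.Request(url, data=body, method="POST")


req = build_operator_request(SERVICE_URL, "test", "test_function_a", {"a": 78, "b": 9})
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# To actually send it once both services are running:
# with request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

If the keys of `data_json` map onto the operator's parameters (an assumption), the `test_function` operator from the script would compute 78 + 9 * 2 + 1 = 97 for this form.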
## Design
+ WEBUI

+ DATAUI

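+ Uses the factory pattern to decouple the third-party tools used for computation, storage, and communication.
+ Builds on two basic concepts, data and operators, which broadens the scope of a feature engineering tool: an operator can be a direct computation or a complex algorithmic model, covering both feature mining and the computation and management of key metrics.
+ An operator gains a long-lived computation service from a single registration.
+ Operators are versioned against the different data they are based on, which makes the versions more meaningful in practice.
+ Technology list
    + Factory pattern
    + MinIO
    + Bootstrap5
    + SQLite3
    + RabbitMQ
    + FastAPI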
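
The factory-pattern point in the list above can be illustrated with a minimal sketch. This is not Atom's actual code: `Storage`, `InMemoryStorage`, and `StorageFactory` are hypothetical names, and the in-memory backend merely stands in for a real one such as a MinIO-backed class.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    """Interface that calling code depends on (hypothetical)."""

    @abstractmethod
    def put(self, key, value): ...

    @abstractmethod
    def get(self, key): ...


class InMemoryStorage(Storage):
    """Stand-in backend; a MinIO-backed class would implement the same methods."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items[key]


class StorageFactory:
    """Maps a backend name to a class so callers never import a backend directly."""

    _backends = {}

    @classmethod
    def register(cls, name, backend_cls):
        cls._backends[name] = backend_cls

    @classmethod
    def create(cls, name, **kwargs):
        if name not in cls._backends:
            raise ValueError(f"unknown storage backend: {name!r}")
        return cls._backends[name](**kwargs)


StorageFactory.register("memory", InMemoryStorage)

store = StorageFactory.create("memory")
store.put(("test", "test_function_a"), "serialized operator")
print(store.get(("test", "test_function_a")))  # serialized operator
```

Swapping the storage tool then means registering another class under a new name; code that only calls `StorageFactory.create` and the `Storage` methods stays unchanged.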
Raw data
{
"_id": null,
"home_page": "https://github.com/redblue0216/Atom",
"name": "shihua-atom",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.9.12",
"maintainer_email": "",
"keywords": "",
"author": "shihua",
"author_email": "15021408795@163.com",
"download_url": "https://files.pythonhosted.org/packages/24/56/9798f6bc1479b13bde1e87386efe232fc281b27db05902285d16a27a236a/shihua-atom-0.1.1.tar.gz",
"platform": null,
"description": "# Atom-\u7279\u5f81\u5e93\u5de5\u5177\n\n   \n\n\n\n## \u4ecb\u7ecd\n+ Atom\u662f\u4e00\u79cd\u7279\u5f81\u7ba1\u7406\u5de5\u5177\uff0c\u4ee5\u6570\u636e\u548c\u7b97\u5b50\u4f5c\u4e3a\u57fa\u672c\u6982\u5ff5\uff0c\u6570\u636e\u4e3a\u57fa\u7840\u6570\u636e\u7528\u4e8e\u8bad\u7ec3\u7279\u5f81\u548c\u6784\u5efa\u7279\u5f81\uff1b\u7b97\u5b50\u4e3a\u57fa\u4e8e\u56fa\u5b9a\u4e00\u4e2a\u6216\u591a\u4e2a\u6570\u636e\u96c6\u8fdb\u884c\u65b0\u7279\u5f81\u751f\u4ea7\u7684\u6d41\u7a0b\uff0c\u53ef\u4ee5\u662f\u4e00\u4e2a\u7b80\u5355\u76f4\u63a5\u8ba1\u7b97\u51fd\u6570\uff0c\u4e5f\u53ef\u4ee5\u662f\u4e00\u4e2a\u590d\u6742\u7684\u7b97\u6cd5\u6a21\u578b\uff0c\u8fd8\u53ef\u4ee5\u662f\u7b97\u6cd5\u6a21\u578b\u548c\u76f4\u63a5\u8ba1\u7b97\u76f8\u7ed3\u5408\u7684\u7ec4\u5408\u4f53\u3002atom\u7684\u7279\u8272\u662f\u5bf9\u7531\u6570\u636e\u884d\u751f\u7684\u7b97\u5b50\u8fdb\u884c\u4e86\u6570\u636e\u5173\u8054\u3001\u7edf\u4e00\u7ba1\u7406\uff0c\u5e76\u76f4\u63a5\u63d0\u4f9b\u4e86\u670d\u52a1\u529f\u80fd\uff0c\u4f7f\u5f97\u6bcf\u4e2a\u7b97\u5b50\u53ef\u4ee5\u76f4\u63a5\u5b9e\u73b0\u5728\u7ebf\u5b9e\u65f6\u8ba1\u7b97\u7279\u5f81\uff0c\u4e3a\u4e3b\u4f53\u7b97\u6cd5\u6a21\u578b\u670d\u52a1\uff0c\u63d0\u9ad8\u6a21\u578b\u7cbe\u5ea6\u3002\n\n\n## \u5b89\u88c5\n+ Atom\u91c7\u7528Python\u5f00\u53d1\uff0c\u5f97\u76ca\u4e8ePython\u826f\u597d\u7684\u793e\u533a\u73af\u5883\uff0c\u5b89\u88c5\u652f\u6301Pythonic\u98ce\u683c\u7684\u5404\u79cd\u7ba1\u7406\u5668\u3002\n```bash\n$ pip install atom-0.1-xxxxxxxxxxxx.whl\n```\n\n\n\n## \u5feb\u901f\u6307\u5357\n+ \u9996\u5148\u4f7f\u7528atomctl\u547d\u4ee4\u884c\u5de5\u5177\u8fdb\u884c\u5de5\u4f5c\u7a7a\u95f4\u8bbe\u7f6e\u548c\u521d\u59cb\u5316\u64cd\u4f5c\u3002\u7136\u540e\u5206\u522b\u542f\u52a8\u5143\u6570\u636e\u670d\u52a1\u548catom\u4e3b\u670d\u52a1(\u4e24\u4e2a\u670d\u52a1\u672a\u652f\u6301\u540e\u53f0\u5f00\u542f)\u3002\n\n+ \u4ee5\u4e0b\u662fatomctl\u547d\u4ee4\u884c\u793a\u4f8b\uff1a\n\n```bash\n\t$ atomctl set --workspace 
'D:\\Workspace\\JiYuan\\Atom\\Demo\\test'\n\n\t$ atomctl init\n\n\t$ atomctl metadata-service \n\n\t$ atomctl start-service \n````\n\n+ \u7136\u540e\u5c31\u662f\u4f7f\u7528python\u811a\u672c\u8fdb\u884catom\u7684\u6570\u636e\u548c\u7b97\u5b50\u64cd\u4f5c\u3002\u4e3b\u8981\u5305\u62ec\u6570\u636e\u548c\u7b97\u5b50\u7684\u6ce8\u518c\u3001\u67e5\u8be2\u3001\u5220\u9664\u4e09\u4e2a\u57fa\u672c\u64cd\u4f5c\u4ee5\u53ca\u7b97\u5b50\u7684\u52a0\u8f7d\u64cd\u4f5c\n\n\n+ \u4ee5\u4e0b\u662fatom\u4e3b\u7a0b\u811a\u672c\u4ee3\u7801\u793a\u4f8b\uff1a\n\n```python\n\n\tfrom atom.scheduler import *\n\n\t### \u52a0\u8f7dAtom\u8c03\u5ea6\u5668\n\tatom = AtomScheduler(mode='delay')\n\n\n\t### register-data\u6d4b\u8bd5\n\tdf = pd.read_csv(r\"D:\\Workspace\\JiYuan\\WindPowerForecast\\LSTM\\demo\\merge_data_GDTYUAN_ec.csv\")\n\tatom.data_register(tag='test',\n\t belong='first',\n\t object_name='merge_data_GDTYUAN_ec_1',\n\t data_object=df,\n\t remarks='this is a test data!')\n\n\t### register-operator\u6d4b\u8bd5\n\t### \u5373\u65f6\u6a21\u5f0f----\u88c5\u9970\u5668\u65b9\u5f0f\n\t# @atom.operator_register(tag='test',\n\t# belong='first',\n\t# object_name='test_function_a',\n\t# remarks='this is a test operator!')\n\tdef test_function(a,b):\n\t c = a + b * 2 + 1\n\t return c\n\t# tmp_a = test_function(1,2)\n\t### \u53ca\u65f6\u6a21\u5f0f----\u51fd\u6570\u65b9\u5f0f\n\t# tmp_func = atom.operator_register(tag='test',\n\t# belong='first',\n\t# object_name='test_function_b',\n\t# remarks='this is a test operator!')(test_function)\n\t# tmp_b = tmp_func(3,4) \n\t### \u5ef6\u65f6\u6a21\u5f0f\n\tatom.operator_register(tag='test',\n\t belong='first',\n\t object_name='test_function_a', ## cc\n\t operator_object=test_function,\n\t remarks='this is a test operator!')\n\n\n\t# ### data-remove\u6d4b\u8bd5\n\t# atom.data_remove(tag='test',object_name='merge_data_GDTYUAN_ec_00')\n\n\n\t### operator-remove\u6d4b\u8bd5\n\t# atom.operator_remove(tag='test',object_name='test_function_cc')\n\t \n\t \n\t### 
data-query\u6d4b\u8bd5\n\tdata_view_df = atom.data_query(tag='test')\n\tprint(data_view_df)\n\n\n\t### operator-query\u6d4b\u8bd5\n\toperator_view_df = atom.operator_query(tag='test')\n\tprint(operator_view_df)\n\n\n\t### data-modify\u6d4b\u8bd5\n\tatom.data_modify()\n\n\n\t### operator-modify\u6d4b\u8bd5\n\tatom.operator_modify()\n\n\n\t### data-load\u6d4b\u8bd5\n\tdata_load_df = atom.data_load(tag='test',object_name='merge_data_GDTYUAN_ec_1')\n\tprint(data_load_df)\n\n\n\t### operator-load\u6d4b\u8bd5\n\toperator_load_a = atom.operator_load(tag='test',object_name='test_function_a')\n\tprint(operator_load_a(10,20))\n\t# print(test_function(**{'a':10,'b':20})) ### \u5b57\u5178\u53c2\u6570\u4f20\u9012\n```\n\n+ \u6700\u540e\u662f\u7b97\u5b50\u5728\u7ebf\u8ba1\u7b97\u670d\u52a1\u7684\u4f7f\u7528\u3002\u5f53\u4e00\u4e2a\u7b97\u5b50\u6ce8\u518c\u5230atom\u540e\uff0c\u4ed6\u5c31\u81ea\u52a8\u83b7\u5f97\u4e86\u5728\u7ebf\u8ba1\u7b97\u670d\u52a1\u7684\u529f\u80fd\u3002\n\n+ \u8868\u5355\u6570\u636e\u683c\u5f0f\u5982\u4e0b\uff1a\n\n```python\n\t### \u8be5\u8868\u5355\u6570\u636e\u4ec5\u4ee5python\u4e3a\u4f8b\u5c55\u793a\n\tpost_form = {\n \"tag\": \"test\",\n \"object_name\": \"test_function_a\",\n\t\"data_json\": {\"a\":78,\"b\":9}\n\t}\n```\n\n## \u8bbe\u8ba1\n+ WEBUI\n\n+ DATAUI\n\n+ \u4f7f\u7528\u5de5\u5382\u6a21\u5f0f\uff0c\u89e3\u8026\u8ba1\u7b97\u3001\u5b58\u50a8\u548c\u901a\u4fe1\u6240\u4f7f\u7528\u7684\u7b2c\u4e09\u65b9\u5de5\u5177\n+ \u8bbe\u8ba1\u6570\u636e\u548c\u7b97\u5b50\u4e24\u4e2a\u57fa\u672c\u6982\u5ff5\uff0c\u6269\u5c55\u4e86\u7279\u5f81\u5de5\u7a0b\u5de5\u5177\u7684\u9002\u7528\u8303\u56f4\uff0c\u4e00\u4e2a\u7b97\u5b50\u4e0d\u4ec5\u53ef\u4ee5\u76f4\u63a5\u8ba1\u7b97\uff0c\u8fd8\u53ef\u4ee5\u662f\u590d\u6742\u7b97\u6cd5\u6a21\u578b\uff0c\u8986\u76d6\u7279\u5f81\u5de5\u7a0b\u7684\u7279\u5f81\u6316\u6398\u548c\u5173\u952e\u6307\u6807\u8ba1\u7b97\u7ba1\u7406\u3002\n+ 
\u7b97\u5b50\u4e00\u6b21\u6ce8\u518c\u5373\u62e5\u6709\u957f\u6548\u8ba1\u7b97\u670d\u52a1\u529f\u80fd\u3002\n+ \u7b97\u5b50\u57fa\u4e8e\u4e0d\u540c\u6570\u636e\u5b9e\u73b0\u4e86\u7248\u672c\u7ba1\u7406\uff0c\u66f4\u5177\u6709\u5b9e\u9645\u610f\u4e49\n+ \u6280\u672f\u5217\u8868\n\t+ \u5de5\u5382\u6a21\u5f0f\n\t+ MinIO\n\t+ Bootstrap5\n\t+ SQLite3\n\t+ RabbitMQ\n\t+ FastAPI\n\n\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Atom is a feature engineering tool.",
"version": "0.1.1",
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "66b90cb728bd3567d07547994041efd2397b1f99223b9977576684d2c1a12613",
"md5": "624604addd0c1c4f2812653e8fda847f",
"sha256": "d2670d160d70075a178f2a3b15abbe60b8f0576b678721d5f472ff7386a3bee5"
},
"downloads": -1,
"filename": "shihua_atom-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "624604addd0c1c4f2812653e8fda847f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9.12",
"size": 9340817,
"upload_time": "2023-03-14T10:35:26",
"upload_time_iso_8601": "2023-03-14T10:35:26.998327Z",
"url": "https://files.pythonhosted.org/packages/66/b9/0cb728bd3567d07547994041efd2397b1f99223b9977576684d2c1a12613/shihua_atom-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "24569798f6bc1479b13bde1e87386efe232fc281b27db05902285d16a27a236a",
"md5": "e7d1847543c752a71e77c2710bbec2f1",
"sha256": "047393db21f4b34f24767f623280a67ea2fb5644992d413ccaa015802ebf5cc1"
},
"downloads": -1,
"filename": "shihua-atom-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "e7d1847543c752a71e77c2710bbec2f1",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9.12",
"size": 9246765,
"upload_time": "2023-03-14T10:35:32",
"upload_time_iso_8601": "2023-03-14T10:35:32.247287Z",
"url": "https://files.pythonhosted.org/packages/24/56/9798f6bc1479b13bde1e87386efe232fc281b27db05902285d16a27a236a/shihua-atom-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-14 10:35:32",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "redblue0216",
"github_project": "Atom",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "shihua-atom"
}