super-analysis


Namesuper-analysis JSON
Version 0.1.9 PyPI version JSON
download
home_pagehttps://github.com/yourusername/super-analysis
SummaryA package for super analysis with ByzerLLM
upload_time2024-11-13 13:18:54
maintainerNone
docs_urlNone
authorallwefantasy
requires_python>=3.6
licenseNone
keywords
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            
# 🚀 Super Analysis 部署指南

本指南将帮助你部署 Super Analysis 系统,包括安装必要组件、配置服务和启动系统。

---

## 📦 安装 Super Analysis


```bash
pip install -U auto-coder
pip install super_analysis-xxxx-py3-none-any.whl
```

注意替换下 xxxx 为版本号。

---

## 🤖 部署 Deepseek 模型代理

在启动其他服务之前,我们需要先部署 Deepseek 模型:

```bash
byzerllm deploy --pretrained_model_type saas/openai \
--cpus_per_worker 0.001 \
--gpus_per_worker 0 \
--worker_concurrency 1000 \
--num_workers 1 \
--infer_params saas.base_url="https://api.deepseek.com/v1" saas.api_key=${MODEL_DEEPSEEK_TOKEN} saas.model=deepseek-chat \
--model deepseek_chat
```

注意:确保已设置环境变量 `MODEL_DEEPSEEK_TOKEN`。

---

## 🛠️ 部署 Byzer-SQL

参考 [安装与配置 Byzer-SQL 文档](./4.3.1%20安装与配置%20Byzer-SQL.pdf) 完成部署。
根据 [Byzer-SQL 和大模型整合文档](./4.2.1.3%20Byzer-SQL%20和大模型的整合.pdf) 中安装插件,然后注册 `deepseek_chat` 函数。

> 启动时需要在安装有 super-analysis 的 conda 环境中启动。

当 byzer-sql 部署完成后,注册账号为 `hello`,然后在 byzer-sql 控制台中执行:

```sql
!byzerllm setup single;

run command as LLM.`` where 
action="infer"
and reconnect="true"
and pretrainedModelType="saas/openai"
and udfName="deepseek_chat";
```

---

## 示例数据

下载电影数据集: https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset/download?datasetVersionNumber=7

---

> 下面的指令都是在命令行里操作哈

## 📊 数据预处理和服务启动

1. 抽取电影数据集schema:

```bash
super-analysis.convert --data_dir /Users/allwefantasy/data/movice --doc_dir /Users/allwefantasy/data/movice/schemas/
```

你还可以添加 --include-rows-num 5 让系统在生成 schema 文档时同时提供一些示例数据。方便大模型更好的对这个表进行认知。


2. 启动 schema 文档知识库:

```bash
auto-coder.rag serve \
--model deepseek_chat --index_filter_workers 100 \
--tokenizer_path /Users/allwefantasy/Downloads/tokenizer.json \
--doc_dir /Users/allwefantasy/data/movice/schemas/ \
--port 8001
```

3. 下载 Byzer-SQL 文档并启动文档知识库:

```bash
git clone https://github.com/allwefantasy/llm_friendly_packages

auto-coder.rag serve \
--model deepseek_chat --index_filter_workers 100 \
--tokenizer_path /Users/allwefantasy/Downloads/tokenizer.json \
--doc_dir  /Users/allwefantasy/projects/llm_friendly_packages/github.com/allwefantasy \
--port 8002
```

4. 启动兼容 OpenAI Server 的分析服务:

```bash
super-analysis.serve --served-model-name deepseek_chat --port 8000 \
--schema-rag-base-url http://127.0.0.1:8001/v1 \
--context-rag-base-url http://127.0.0.1:8002/v1 \
--byzer-sql-url http://127.0.0.1:9003/run/script
```

你可以通过 `--sql-func-llm-model` 函数单独为 SQL 函数指定模型(比如配置一个速度极快的模型)。注意,同样的,你需要在 Byzer-SQL 中注册这个函数。

---

现在,Super Analysis 系统已经完全部署并启动。你可以开始使用 OpenAI SDK 进行测试和接口调用。具体测试和接口使用方法请参考 [openai_local_api.ipynb](./openai_local_api.ipynb)。

🎉 恭喜!你已经成功部署了 Super Analysis 系统。如有任何问题,请随时查阅文档或联系支持团队。

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/yourusername/super-analysis",
    "name": "super-analysis",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": null,
    "keywords": null,
    "author": "allwefantasy",
    "author_email": "allwefantasy@gmail.com",
    "download_url": null,
    "platform": null,
    "description": "\n# \ud83d\ude80 Super Analysis \u90e8\u7f72\u6307\u5357\n\n\u672c\u6307\u5357\u5c06\u5e2e\u52a9\u4f60\u90e8\u7f72 Super Analysis \u7cfb\u7edf\uff0c\u5305\u62ec\u5b89\u88c5\u5fc5\u8981\u7ec4\u4ef6\u3001\u914d\u7f6e\u670d\u52a1\u548c\u542f\u52a8\u7cfb\u7edf\u3002\n\n---\n\n## \ud83d\udce6 \u5b89\u88c5 Super Analysis\n\n\n```bash\npip install -U auto-coder\npip install super_analysis-xxxx-py3-none-any.whl\n```\n\n\u6ce8\u610f\u66ff\u6362\u4e0b xxxx \u4e3a\u7248\u672c\u53f7\u3002\n\n---\n\n## \ud83e\udd16 \u90e8\u7f72 Deepseek \u6a21\u578b\u4ee3\u7406\n\n\u5728\u542f\u52a8\u5176\u4ed6\u670d\u52a1\u4e4b\u524d\uff0c\u6211\u4eec\u9700\u8981\u5148\u90e8\u7f72 Deepseek \u6a21\u578b\uff1a\n\n```bash\nbyzerllm deploy --pretrained_model_type saas/openai \\\n--cpus_per_worker 0.001 \\\n--gpus_per_worker 0 \\\n--worker_concurrency 1000 \\\n--num_workers 1 \\\n--infer_params saas.base_url=\"https://api.deepseek.com/v1\" saas.api_key=${MODEL_DEEPSEEK_TOKEN} saas.model=deepseek-chat \\\n--model deepseek_chat\n```\n\n\u6ce8\u610f\uff1a\u786e\u4fdd\u5df2\u8bbe\u7f6e\u73af\u5883\u53d8\u91cf `MODEL_DEEPSEEK_TOKEN`\u3002\n\n---\n\n## \ud83d\udee0\ufe0f \u90e8\u7f72 Byzer-SQL\n\n\u53c2\u8003 [\u5b89\u88c5\u4e0e\u914d\u7f6e Byzer-SQL \u6587\u6863](./4.3.1%20\u5b89\u88c5\u4e0e\u914d\u7f6e%20Byzer-SQL.pdf) \u5b8c\u6210\u90e8\u7f72\u3002\n\u6839\u636e [Byzer-SQL \u548c\u5927\u6a21\u578b\u6574\u5408\u6587\u6863](./4.2.1.3%20Byzer-SQL%20\u548c\u5927\u6a21\u578b\u7684\u6574\u5408.pdf) \u4e2d\u5b89\u88c5\u63d2\u4ef6\uff0c\u7136\u540e\u6ce8\u518c `deepseek_chat` \u51fd\u6570\u3002\n\n> \u542f\u52a8\u65f6\u9700\u8981\u5728\u5b89\u88c5\u6709 super-analysis \u7684 conda \u73af\u5883\u4e2d\u542f\u52a8\u3002\n\n\u5f53 byzer-sql \u90e8\u7f72\u5b8c\u6210\u540e\uff0c\u6ce8\u518c\u8d26\u53f7\u4e3a `hello`\uff0c\u7136\u540e\u5728 byzer-sql \u63a7\u5236\u53f0\u4e2d\u6267\u884c\uff1a\n\n```sql\n!byzerllm setup single;\n\nrun command as LLM.`` where \naction=\"infer\"\nand reconnect=\"true\"\nand pretrainedModelType=\"saas/openai\"\nand udfName=\"deepseek_chat\";\n```\n\n---\n\n## \u793a\u4f8b\u6570\u636e\n\n\u4e0b\u8f7d\u7535\u5f71\u6570\u636e\u96c6\uff1a https://www.kaggle.com/datasets/rounakbanik/the-movies-dataset/download?datasetVersionNumber=7\n\n---\n\n> \u4e0b\u9762\u7684\u6307\u4ee4\u90fd\u662f\u5728\u547d\u4ee4\u884c\u91cc\u64cd\u4f5c\u54c8\n\n## \ud83d\udcca \u6570\u636e\u9884\u5904\u7406\u548c\u670d\u52a1\u542f\u52a8\n\n1. \u62bd\u53d6\u7535\u5f71\u6570\u636e\u96c6schema\uff1a\n\n```bash\nsuper-analysis.convert --data_dir /Users/allwefantasy/data/movice --doc_dir /Users/allwefantasy/data/movice/schemas/\n```\n\n\u4f60\u8fd8\u53ef\u4ee5\u6dfb\u52a0 --include-rows-num 5 \u8ba9\u7cfb\u7edf\u5728\u751f\u6210 schema \u6587\u6863\u65f6\u540c\u65f6\u63d0\u4f9b\u4e00\u4e9b\u793a\u4f8b\u6570\u636e\u3002\u65b9\u4fbf\u5927\u6a21\u578b\u66f4\u597d\u7684\u5bf9\u8fd9\u4e2a\u8868\u8fdb\u884c\u8ba4\u77e5\u3002\n\n\n2. \u542f\u52a8 schema \u6587\u6863\u77e5\u8bc6\u5e93\uff1a\n\n```bash\nauto-coder.rag serve \\\n--model deepseek_chat --index_filter_workers 100 \\\n--tokenizer_path /Users/allwefantasy/Downloads/tokenizer.json \\\n--doc_dir /Users/allwefantasy/data/movice/schemas/ \\\n--port 8001\n```\n\n3. \u4e0b\u8f7d Byzer-SQL \u6587\u6863\u5e76\u542f\u52a8\u6587\u6863\u77e5\u8bc6\u5e93\uff1a\n\n```bash\ngit clone https://github.com/allwefantasy/llm_friendly_packages\n\nauto-coder.rag serve \\\n--model deepseek_chat --index_filter_workers 100 \\\n--tokenizer_path /Users/allwefantasy/Downloads/tokenizer.json \\\n--doc_dir  /Users/allwefantasy/projects/llm_friendly_packages/github.com/allwefantasy \\\n--port 8002\n```\n\n4. \u542f\u52a8\u517c\u5bb9 OpenAI Server \u7684\u5206\u6790\u670d\u52a1\uff1a\n\n```bash\nsuper-analysis.serve --served-model-name deepseek_chat --port 8000 \\\n--schema-rag-base-url http://127.0.0.1:8001/v1 \\\n--context-rag-base-url http://127.0.0.1:8002/v1 \\\n--byzer-sql-url http://127.0.0.1:9003/run/script\n```\n\n\u4f60\u53ef\u4ee5\u901a\u8fc7 `--sql-func-llm-model` \u51fd\u6570\u5355\u72ec\u4e3a SQL \u51fd\u6570\u6307\u5b9a\u6a21\u578b(\u6bd4\u5982\u914d\u7f6e\u4e00\u4e2a\u901f\u5ea6\u6781\u5feb\u7684\u6a21\u578b)\u3002\u6ce8\u610f\uff0c\u540c\u6837\u7684\uff0c\u4f60\u9700\u8981\u5728 Byzer-SQL \u4e2d\u6ce8\u518c\u8fd9\u4e2a\u51fd\u6570\u3002\n\n---\n\n\u73b0\u5728\uff0cSuper Analysis \u7cfb\u7edf\u5df2\u7ecf\u5b8c\u5168\u90e8\u7f72\u5e76\u542f\u52a8\u3002\u4f60\u53ef\u4ee5\u5f00\u59cb\u4f7f\u7528 OpenAI SDK \u8fdb\u884c\u6d4b\u8bd5\u548c\u63a5\u53e3\u8c03\u7528\u3002\u5177\u4f53\u6d4b\u8bd5\u548c\u63a5\u53e3\u4f7f\u7528\u65b9\u6cd5\u8bf7\u53c2\u8003 [openai_local_api.ipynb](./openai_local_api.ipynb)\u3002\n\n\ud83c\udf89 \u606d\u559c\uff01\u4f60\u5df2\u7ecf\u6210\u529f\u90e8\u7f72\u4e86 Super Analysis \u7cfb\u7edf\u3002\u5982\u6709\u4efb\u4f55\u95ee\u9898\uff0c\u8bf7\u968f\u65f6\u67e5\u9605\u6587\u6863\u6216\u8054\u7cfb\u652f\u6301\u56e2\u961f\u3002\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "A package for super analysis with ByzerLLM",
    "version": "0.1.9",
    "project_urls": {
        "Homepage": "https://github.com/yourusername/super-analysis"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f1f3ab1acbff8c76a2da55a92e9825939378261d29f08af573765e5e0acbc244",
                "md5": "372775eff58c191d553242d7e54ee64c",
                "sha256": "ee194ee3820a5b64c25686f9e4aaeaf4682974883a7481897ab5ac1ae7a601dd"
            },
            "downloads": -1,
            "filename": "super_analysis-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "372775eff58c191d553242d7e54ee64c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 15365,
            "upload_time": "2024-11-13T13:18:54",
            "upload_time_iso_8601": "2024-11-13T13:18:54.162064Z",
            "url": "https://files.pythonhosted.org/packages/f1/f3/ab1acbff8c76a2da55a92e9825939378261d29f08af573765e5e0acbc244/super_analysis-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-13 13:18:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "yourusername",
    "github_project": "super-analysis",
    "github_not_found": true,
    "lcname": "super-analysis"
}
        
Elapsed time: 0.85392s