FourthDimension


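| Field | Value |
| --- | --- |
| Name | FourthDimension |
| Version | 1.2.1 |
| Home page | https://gitee.com/hustai/FourthDimension |
| Summary | FourthDimension (第四维度) was developed jointly by the Artificial Intelligence and Embedded Lab at Huazhong University of Science and Technology and Yantu Technology (言图科技). It is a retrieval-augmented generation (RAG) system built on large language models that provides a private-domain knowledge base, document question answering, and other services. FourthDimension also offers a convenient local deployment method, so users can set up their own application platform in a local environment. |
| Upload time | 2023-12-12 06:55:11 |
| Author | yantu-tech |
| License | MIT |
| Keywords | python, yantu |
| Requirements | No requirements were recorded. |
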
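# Introduction
FourthDimension (第四维度) was developed jointly by the Artificial Intelligence and Embedded Lab at Huazhong University of Science and Technology and Yantu Technology (言图科技). It is a retrieval-augmented generation (RAG) system built on large language models that provides a private-domain knowledge base, document question answering, and other services. FourthDimension also offers a convenient local deployment method, so users can set up their own application platform in a local environment.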

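### Features
Supports both online invocation and local deployment. Different Embedding models and answer generation models can be installed as options. Supports building a private-domain knowledge base and answering questions against it.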

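### Main Services
* Private-domain knowledge base
* Knowledge base question answering
* Vector storage and retrieval
* Retrieval-augmented generation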

# Usage

### Prerequisites
- Anaconda3  

> Before use, check whether Anaconda is installed. If it is not, follow the tutorial below to install it.  
> [Detailed Anaconda installation guide](https://blog.csdn.net/weixin_43858830/article/details/134310118?csdn_share_tail=%7B%22type%22%3A%22blog%22%2C%22rType%22%3A%22article%22%2C%22rId%22%3A%22134310118%22%2C%22source%22%3A%22weixin_43858830%22%7D)

### Quick Start

1. Clone the project's Gitee repository
```
mkdir FourthDimension
cd FourthDimension
git clone https://gitee.com/hustai/FourthDimension ./
```
2. Create a Conda virtual environment
> Python version requirement: >= 3.8.1, < 4.0
```
conda create -n FourthDimension python==3.8.1
conda activate FourthDimension
```
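
Once the environment is active, the version requirement from step 2 can be confirmed from inside Python. This is a minimal sketch, not part of FourthDimension: the helper name `is_supported` and the two constants are hypothetical, and the bounds are simply the ones stated above.

```python
import sys

MIN_VERSION = (3, 8, 1)      # inclusive lower bound stated in this README
MAX_VERSION = (4, 0, 0)      # exclusive upper bound stated in this README


def is_supported(version_info=sys.version_info):
    """Return True if the given version is >= 3.8.1 and < 4.0."""
    current = tuple(version_info[:3])
    return MIN_VERSION <= current < MAX_VERSION


if __name__ == "__main__":
    running = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported() else "NOT supported"
    print("Python " + running + " is " + status + " by this README's requirement")
```

Running the script inside the `FourthDimension` environment reports whether the active interpreter falls inside the stated range.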
3. Install FourthDimension  
3.1 Install the prerequisite dependencies
```
pip install -r dependency.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
```
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;3.2 Install FourthDimension
```
sh FourthDimension.sh install
```

4. Start FourthDimension
> To restart the service, replace the `start` argument with `restart`.
```
sh FourthDimension.sh start
```

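5. FourthDimension example code  
> Run the example program, which imports documents into the private-domain knowledge base and answers questions against it (retrieval-augmented generation).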
```text
python example/demo.py
```

6. Using FourthDimension  

> config.json is the FourthDimension configuration file. When using FourthDimension, place config.json in the same directory as your script.

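&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.1 Import documents into the private-domain knowledge base
```python
import FourthDimension

# Pass a document path or a folder path; currently supported document types include doc, docx, and others
result = FourthDimension.upload('./data/example/')
print(result)
```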
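&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.2 Question answering over the private-domain knowledge base (retrieval-augmented generation)
```python
import FourthDimension

# Ask the question "什么是活期存款" ("What is a demand deposit?")
answer = FourthDimension.query('什么是活期存款')
print(answer)
```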
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.3 Clear the private-domain knowledge base
```python
import FourthDimension

result = FourthDimension.clean()
print(result)
```
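
The three calls in 6.1 to 6.3 can be chained into a single script. The sketch below is an illustration rather than code from the project: `run_workflow` and its `clean_after` flag are hypothetical names, and because this README does not document the return types of `upload`, `query`, and `clean`, their results are passed through unchanged. The module is passed in as an argument so the helper can be tried with a stand-in object when no service is running.

```python
def run_workflow(fd, docs_path, question, clean_after=False):
    """Upload documents, ask one question, and optionally clear the knowledge base.

    `fd` is the imported FourthDimension module, or any object exposing
    upload(path), query(question) and clean() as used in this README.
    """
    upload_result = fd.upload(docs_path)
    answer = fd.query(question)
    # Clearing is opt-in because it removes everything that was imported.
    clean_result = fd.clean() if clean_after else None
    return {"upload": upload_result, "answer": answer, "clean": clean_result}


# Usage (assumes the service is started and config.json sits beside the script):
#   import FourthDimension
#   print(run_workflow(FourthDimension, "./data/example/", "什么是活期存款"))
```

Keeping `clean_after` off by default means a run never empties the knowledge base unless asked to.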
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;6.4 Example config.json configuration file

```text
{
  "word_storage": "default",
  "embedding_storage": "faiss",
  "search_select": "default",
  "embedding_model": "bge-large-zh-v1.5",
  "answer_generation_model": "gpt-3.5-turbo-16k",
  "openai": {
    "api_key": "",
    "url": "https://api.openai.com/v1"
  },
  "para_config": {
    "chunk_size": 500,
    "overlap": 20
  },
  "recall_config": {
    "top_k": 10
  }
}
```
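
A config.json shaped like the sample above can be sanity-checked before the service is started. The sketch below is only an illustration under stated assumptions: `load_config` and `check_config` are hypothetical helpers, the key names are copied from the sample, and the rules (non-empty API key, positive integers) are guesses at sensible values, not validation performed by FourthDimension itself.

```python
import json
from pathlib import Path

# Top-level keys copied from the sample config.json in this README.
EXPECTED_KEYS = (
    "word_storage", "embedding_storage", "search_select",
    "embedding_model", "answer_generation_model",
    "openai", "para_config", "recall_config",
)


def load_config(path="config.json"):
    """Read a config.json file and return its contents as a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def check_config(config):
    """Return a list of problems found; an empty list means none were found."""
    problems = ["missing key: " + key for key in EXPECTED_KEYS if key not in config]
    if not config.get("openai", {}).get("api_key"):
        problems.append("openai.api_key is empty")
    para = config.get("para_config", {})
    chunk_size, overlap = para.get("chunk_size"), para.get("overlap")
    if not isinstance(chunk_size, int) or chunk_size <= 0:
        problems.append("para_config.chunk_size should be a positive integer")
    if not isinstance(overlap, int) or overlap < 0:
        problems.append("para_config.overlap should be a non-negative integer")
    top_k = config.get("recall_config", {}).get("top_k")
    if not isinstance(top_k, int) or top_k <= 0:
        problems.append("recall_config.top_k should be a positive integer")
    return problems
```

Calling `check_config(load_config())` from the script's directory lists anything that looks off; the sample as shown would be flagged only for its empty `api_key`.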
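> The parameters in config.json are described below
```text
word_storage: how document text is stored
embedding_storage: how document vectors are stored
search_select: retrieval method
embedding_model: Embedding model
answer_generation_model: answer generation model
openai.api_key: your API key
openai.url: defaults to the official OpenAI endpoint; change it as needed
para_config.chunk_size: length of each chunk when splitting documents
para_config.overlap: overlap between chunks when splitting documents
recall_config.top_k: number of recalled results used for answer generation
```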
```
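
To make `chunk_size`, `overlap`, and `top_k` more concrete, here is a toy sketch of overlapping chunking and top-k recall. It is not FourthDimension's implementation: it assumes both chunking values are counted in characters (the README does not state the unit), it uses plain-Python cosine similarity instead of faiss, and the function names `split_with_overlap`, `cosine`, and `top_k_indices` are hypothetical.

```python
import math


def split_with_overlap(text, chunk_size=500, overlap=20):
    """Split text into chunks of chunk_size characters that share `overlap` characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap          # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break                        # the last chunk already reaches the end
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_indices(query_vec, chunk_vecs, k=10):
    """Return the indices of the k chunk vectors most similar to the query vector."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda index: cosine(query_vec, chunk_vecs[index]),
        reverse=True,
    )
    return ranked[:k]
```

With the sample values, each 500-character chunk would repeat the last 20 characters of the previous one, and the 10 highest-scoring chunks would be handed to the answer generation model.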


# Community Forum


# Related Reading

- <a href="https://hustai.gitee.io/zh/posts/rag/RetrieveTextGeneration.html" target="_blank">Text Generation Based on Retrieval Augmentation</a>

- <a href="https://hustai.gitee.io/zh/posts/rag/LLMretrieval.html" target="_blank">How to Optimize an External Knowledge Base with Large Models</a>

<a href="https://hustai.gitee.io/zh/" target="_blank">More related knowledge sharing: website link</a>



            

Raw data

            {
    "_id": null,
    "home_page": "https://gitee.com/hustai/FourthDimension",
    "name": "FourthDimension",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "python,yantu",
    "author": "yantu-tech",
    "author_email": "wu466687121@qq.com",
    "download_url": "https://files.pythonhosted.org/packages/3f/7b/ab2decd355fab992c834b2883ca43576d029de5817ac478b1fe1efd7198e/FourthDimension-1.2.1b1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "FourthDimension\uff08\u7b2c\u56db\u7ef4\u5ea6\uff09\u7531\u534e\u4e2d\u79d1\u6280\u5927\u5b66\u4eba\u5de5\u667a\u80fd\u4e0e\u5d4c\u5165\u5f0f\u5b9e\u9a8c\u5ba4\u8054\u5408\u8a00\u56fe\u79d1\u6280\u7814\u53d1\uff0c\u662f\u4e00\u6b3e\u57fa\u4e8e\u5927\u8bed\u8a00\u6a21\u578b\u7684\u667a\u80fd\u68c0\u7d22\u589e\u5f3a\u751f\u6210\uff08RAG\uff09\u7cfb\u7edf\uff0c\u63d0\u4f9b\u79c1\u57df\u77e5\u8bc6\u5e93\u3001\u6587\u6863\u95ee\u7b54\u7b49\u591a\u79cd\u670d\u52a1\u3002\u6b64\u5916\uff0cFourthDimension\u63d0\u4f9b\u4fbf\u6377\u7684\u672c\u5730\u90e8\u7f72\u65b9\u6cd5\uff0c\u65b9\u4fbf\u7528\u6237\u5728\u672c\u5730\u73af\u5883\u4e2d\u642d\u5efa\u5c5e\u4e8e\u81ea\u5df1\u7684\u5e94\u7528\u5e73\u53f0\u3002",
    "version": "1.2.1",
    "project_urls": {
        "Homepage": "https://gitee.com/hustai/FourthDimension"
    },
    "split_keywords": [
        "python",
        "yantu"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "3f7bab2decd355fab992c834b2883ca43576d029de5817ac478b1fe1efd7198e",
                "md5": "44d3ad763e83419c0c2fdb28a80f5ce6",
                "sha256": "c40984d708721f1091d40e8da56ba0bf41d2722f2463a45824849ed2d096f340"
            },
            "downloads": -1,
            "filename": "FourthDimension-1.2.1b1.tar.gz",
            "has_sig": false,
            "md5_digest": "44d3ad763e83419c0c2fdb28a80f5ce6",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 58841879,
            "upload_time": "2023-12-12T06:55:11",
            "upload_time_iso_8601": "2023-12-12T06:55:11.875382Z",
            "url": "https://files.pythonhosted.org/packages/3f/7b/ab2decd355fab992c834b2883ca43576d029de5817ac478b1fe1efd7198e/FourthDimension-1.2.1b1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-12 06:55:11",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "fourthdimension"
}
        