fourth-dimension

Name: fourth-dimension
Version: 1.2.1
Home page: https://gitee.com/hustai/Fourth-Dimension
Summary: FourthDimension (第四维度) is an intelligent knowledge question-answering system built on large language models, developed jointly by the Artificial Intelligence and Embedded Lab of Huazhong University of Science and Technology and Yantu Technology (言图科技). It provides private-domain knowledge bases, document question answering, and other services. FourthDimension also offers a convenient local deployment method, so users can set up their own application platform in a local environment.
Upload time: 2023-11-10 02:37:37
Author: yantu-tech
Requires Python: >=3.8
License: MIT License
Keywords: python, yantu
Requirements: No requirements were recorded.
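# Introduction
FourthDimension (第四维度) is an intelligent retrieval-augmented generation system built on large language models, developed jointly by the Artificial Intelligence and Embedded Lab of Huazhong University of Science and Technology and Yantu Technology (言图科技). It provides private-domain knowledge bases, document question answering, and other services. FourthDimension also offers a convenient local deployment method, so users can set up their own application platform in a local environment.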

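### Features
Use the API online or deploy a dedicated knowledge base locally. Both the embedding model and the answer-generation model can be swapped, so the system can be tailored and tuned to your own needs.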

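### Main functions
* Simple and efficient: configure the API and start asking questions
* Highly customizable, data-secure private-domain knowledge base
* Document question answering over the private-domain knowledge base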

# Usage

### Prerequisites
- Anaconda3  

> Before use, check whether Anaconda is installed. If it is not, see the following tutorial:  
> [Detailed Anaconda installation guide](https://blog.csdn.net/weixin_43858830/article/details/134310118?csdn_share_tail=%7B%22type%22%3A%22blog%22%2C%22rType%22%3A%22article%22%2C%22rId%22%3A%22134310118%22%2C%22source%22%3A%22weixin_43858830%22%7D)

### Quick start

1. Clone the project repository
```
git clone https://gitee.com/hustai/FourthDimension
```
```
cd FourthDimension
```
2. Create the virtual environment
```
conda create -n FourthDimension python==3.8
```

```
conda activate FourthDimension
```
3. Install the prerequisite dependencies
```
pip install -r requirements.txt
```
Install the core package
```
pip install fourth-dimension
```

4. Download and start the required services (Elasticsearch)
```
sh es_install.sh
```

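5. Use the package  
Before use, place the `config.json` configuration file in the same directory as your script.

Sample code  
* Store and retrieve
```python
import fourth_dimension

# Store the document, then answer the question from it.
answer = fourth_dimension.query_storage('your question', 'path/to/document')
print(answer)
```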
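* Retrieve only  
This method currently supports only the Elasticsearch retrieval mode.
```python
import fourth_dimension

# Answer the question from documents that were stored earlier.
answer = fourth_dimension.query('your question')
print(answer)
```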
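
The usage note above asks for `config.json` to sit next to the script. A minimal sketch of a pre-flight check for that, using only the standard library; the helper name `require_config` is hypothetical and is not part of `fourth_dimension`.

```python
from pathlib import Path


def require_config(script_file, name="config.json"):
    """Return the path of `name` next to `script_file`, or raise if it is missing.

    Hypothetical helper for illustration; not part of fourth_dimension.
    """
    config_path = Path(script_file).resolve().parent / name
    if not config_path.is_file():
        raise FileNotFoundError(
            f"{name} not found next to {script_file}: expected {config_path}"
        )
    return config_path
```

Calling `require_config(__file__)` at the top of a script, before `fourth_dimension.query(...)`, turns a misplaced configuration file into an explicit error message.
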
### Configuration file

Storage/retrieval: `Elasticsearch + Faiss`

Large language model: `gpt-3.5-turbo-16k`

Both of these can be changed in the configuration file.

Configuration file:

```json
{
  // Storage backend for document text
  "word_storage": "elasticsearch",

  // Storage backend for document vectors
  "embedding_storage": "faiss",

  // Retrieval mode; three options are currently available:
  // 1. elasticsearch
  // 2. faiss
  // 3. elasticsearch+faiss
  "search_select": "elasticsearch",

  // Embedding model
  "embedding_model": "bge-large-zh-v1.5",

  // Answer-generation model
  "answer_generation_model": "gpt-3.5-turbo-16k",

  // OpenAI settings
  "openai": {
    // Put your API key here
    "api_key": "",
    // Defaults to the official OpenAI endpoint; change as needed
    "url": "https://api.openai.com/v1"
  },

  // Document splitting settings
  "para_config": {
    // Length of each document chunk
    "chunk_size": 500,
    // Overlap between adjacent chunks
    "overlap": 20
  },

  // Recall settings
  "recall_config": {
    // Number of recalled results used for answer generation
    "top_k": 10
  },

  // Elasticsearch settings
  "elasticsearch_setting": {
    // Default index name; change as needed
    "index_name": "index",
    // Defaults to localhost; change as needed
    "host": "localhost",
    "port": 9200,
    // Fill in the username and password if authentication is enabled
    "username": "",
    "password": "",
    // Elasticsearch analyzer
    "analyzer": "standard"
  },

  // Faiss settings
  "faiss_setting": {
    // Index type
    "retrieval_way": "IndexFlatL2"
  }
}
```
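
Python's standard `json` module rejects `//` comments, and the README does not say how FourthDimension itself reads the annotated file. If you want to inspect or validate the file from your own script, a minimal sketch could look like the following. It assumes every comment occupies a whole line, as in the listing above; `load_config` and `ALLOWED_SEARCH` are hypothetical names, not part of the package.

```python
import json

# The three retrieval modes listed in the configuration comments.
ALLOWED_SEARCH = {"elasticsearch", "faiss", "elasticsearch+faiss"}


def load_config(text):
    """Parse config text after dropping lines that hold only a // comment.

    Only whole-line comments are removed, so the "//" inside a URL value
    such as "https://api.openai.com/v1" is left untouched.
    """
    kept = [
        line for line in text.splitlines()
        if not line.lstrip().startswith("//")
    ]
    config = json.loads("\n".join(kept))
    choice = config.get("search_select")
    if choice not in ALLOWED_SEARCH:
        raise ValueError(f"unsupported search_select: {choice!r}")
    return config
```

A typical call would be `load_config(open("config.json", encoding="utf-8").read())`.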

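To make `chunk_size` and `overlap` in `para_config` concrete, here is a sketch of fixed-size splitting with overlap. It assumes both values are counted in characters and that chunks are plain sliding windows; the README does not state the unit or the boundary rules FourthDimension actually uses, and `split_with_overlap` is a hypothetical name.

```python
def split_with_overlap(text, chunk_size=500, overlap=20):
    """Split text into windows of chunk_size characters.

    Each window starts (chunk_size - overlap) characters after the previous
    one, so adjacent chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks
```

With the default values, each chunk begins 480 characters after the previous one. A larger overlap reduces the chance that a sentence is cut across a chunk boundary, at the cost of storing more duplicated text.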


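`IndexFlatL2` is FAISS's exact, brute-force index: a query vector is compared against every stored vector by squared Euclidean distance. The pure-Python sketch below illustrates that idea together with `top_k`. It is not the FAISS API, and how FourthDimension combines these results with Elasticsearch hits is not described in this README; `flat_l2_search` is a hypothetical name.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_l2_search(query, vectors, top_k=10):
    """Score every stored vector and return the top_k nearest.

    Returns (distance, index) pairs sorted by increasing squared L2 distance.
    """
    scored = ((squared_l2(query, v), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(top_k, scored)
```

Because every vector is scored, results are exact, but the cost of a query grows linearly with the number of stored chunks.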


# Community
WeChat group QR code

# Related reading
- [Retrieval-augmented text generation](https://hustai.gitee.io/zh/posts/rag/RetrieveTextGeneration.html)

- [Optimizing an external knowledge base with large models](https://hustai.gitee.io/zh/posts/rag/LLMretrieval.html)

[More related articles on the project website](https://hustai.tech/zh/)



            

Raw data

            {
    "_id": null,
    "home_page": "https://gitee.com/hustai/Fourth-Dimension",
    "name": "fourth-dimension",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "python,yantu",
    "author": "yantu-tech",
    "author_email": "GuoHuai Wu <wu466687121@qq.com>",
    "download_url": "https://files.pythonhosted.org/packages/86/38/0473a3e7ed24ce6400eb0be74152792b18ababd50a5b1e4278407c44b898/fourth_dimension-1.2.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "FourthDimension\uff08\u7b2c\u56db\u7ef4\u5ea6\uff09\u7531\u534e\u4e2d\u79d1\u6280\u5927\u5b66\u4eba\u5de5\u667a\u80fd\u4e0e\u5d4c\u5165\u5f0f\u5b9e\u9a8c\u5ba4\u8054\u5408\u8a00\u56fe\u79d1\u6280\u7814\u53d1\uff0c\u662f\u4e00\u6b3e\u57fa\u4e8e\u5927\u8bed\u8a00\u6a21\u578b\u7684\u667a\u80fd\u77e5\u8bc6\u95ee\u7b54\u7cfb\u7edf\uff0c\u63d0\u4f9b\u79c1\u57df\u77e5\u8bc6\u5e93\u3001\u6587\u6863\u95ee\u7b54\u7b49\u591a\u79cd\u670d\u52a1\u3002\u6b64\u5916\uff0cFourthDimension\u63d0\u4f9b\u4fbf\u6377\u7684\u672c\u5730\u90e8\u7f72\u65b9\u6cd5\uff0c\u65b9\u4fbf\u7528\u6237\u5728\u672c\u5730\u73af\u5883\u4e2d\u642d\u5efa\u5c5e\u4e8e\u81ea\u5df1\u7684\u5e94\u7528\u5e73\u53f0\u3002",
    "version": "1.2.1",
    "project_urls": {
        "Homepage": "https://gitee.com/hustai/Fourth-Dimension"
    },
    "split_keywords": [
        "python",
        "yantu"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "86380473a3e7ed24ce6400eb0be74152792b18ababd50a5b1e4278407c44b898",
                "md5": "4a2695faa397af9e84766c4e6f65d83d",
                "sha256": "adef16a0ad2901d4ea636db5d32fadc03912d9f3bfa82d4eb2d79173bfb56c92"
            },
            "downloads": -1,
            "filename": "fourth_dimension-1.2.1.tar.gz",
            "has_sig": false,
            "md5_digest": "4a2695faa397af9e84766c4e6f65d83d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 19999,
            "upload_time": "2023-11-10T02:37:37",
            "upload_time_iso_8601": "2023-11-10T02:37:37.769310Z",
            "url": "https://files.pythonhosted.org/packages/86/38/0473a3e7ed24ce6400eb0be74152792b18ababd50a5b1e4278407c44b898/fourth_dimension-1.2.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-11-10 02:37:37",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "fourth-dimension"
}
        