whiskerrag

Name: whiskerrag
Version: 0.2.10
Summary: A utility package for RAG operations
Upload time: 2025-07-22 12:28:34
Author: petercat.ai
Requires Python: <4.0,>=3.9
# WhiskerRAG

[![MIT License](https://img.shields.io/badge/License-MIT-green.svg)](https://choosealicense.com/licenses/mit/)
[![Python Version](https://img.shields.io/pypi/pyversions/whiskerrag)](https://pypi.org/project/whiskerrag/)
[![PyPI version](https://badge.fury.io/py/whiskerrag.svg)](https://badge.fury.io/py/whiskerrag)
[![codecov](https://codecov.io/gh/petercat-ai/whiskerrag_toolkit/branch/main/graph/badge.svg)](https://codecov.io/gh/petercat-ai/whiskerrag_toolkit)

WhiskerRAG is a RAG (Retrieval-Augmented Generation) toolkit developed for the PeterCat and Whisker projects. It provides a complete set of RAG-related type definitions and method implementations.

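## Features

- Domain-modeling types for general-purpose RAG, including Task, Knowledge, Chunk, Tenant, and Space (a knowledge-base space).
- Interface descriptions for Whisker RAG plugins.
- Resource managers for GitHub repositories and S3.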

## Installation

Install with pip:

```bash
pip install whiskerrag
```
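To confirm that the install succeeded, the standard library's `importlib.metadata` can report the installed distribution version. This is a minimal sketch; the helper name `installed_version` is made up for illustration and is not part of whiskerrag.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints a version string after `pip install whiskerrag`, otherwise None.
    print(installed_version("whiskerrag"))
```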

## Quick Start

whiskerrag contains three submodules, `whiskerrag_utils`, `whiskerrag_client`, and `whiskerrag_types`, each with its own purpose:

### whiskerrag_utils

Common methods for building a RAG system:

```python
from whiskerrag_utils import loader, embedding, retriever
```
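The three names above correspond to the usual load, embed, and retrieve stages of a RAG pipeline. The sketch below illustrates those stages in plain Python only; it does not use or reproduce the whiskerrag_utils API. Every function name is hypothetical, and a bag-of-words count stands in for a real embedding model.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def load_chunks(text: str, max_words: int = 8) -> List[str]:
    """Split raw text into fixed-size word windows (a stand-in for a loader)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(chunk: str) -> Dict[str, int]:
    """Bag-of-words term counts (a stand-in for a real embedding model)."""
    return Counter(chunk.lower().split())


def cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: List[str], top_k: int = 1) -> List[Tuple[float, str]]:
    """Score every chunk against the query and return the best `top_k`."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


if __name__ == "__main__":
    chunks = load_chunks("cats purr when happy dogs bark at strangers", max_words=4)
    print(retrieve("why do dogs bark", chunks))
```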

### whiskerrag_client

Exposes the RAG system's services as a Python SDK.

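```python
import asyncio

from whiskerrag_client import APIClient
# Assumed import path: the original snippet does not show where the request types come from.
from whiskerrag_types.model import RetrievalByKnowledgeRequest, RetrievalBySpaceRequest


async def main() -> None:
    api_client = APIClient(
        base_url="https://api.example.com",
        token="your_token_here",
    )

    # Retrieve chunks from a single knowledge item.
    knowledge_chunks = await api_client.retrieval.retrieve_knowledge_content(
        RetrievalByKnowledgeRequest(knowledge_id="your knowledge uuid here")
    )

    # Retrieve chunks from a whole space.
    space_chunks = await api_client.retrieval.retrieve_space_content(
        RetrievalBySpaceRequest(space_id="your space id here")
    )

    # List chunks with paging and a status filter.
    chunk_list = await api_client.chunk.get_chunk_list(
        page=1,
        size=10,
        filters={"status": "active"},
    )

    # List tasks, then fetch the details of one task.
    task_list = await api_client.task.get_task_list(page=1, size=10)
    task_detail = await api_client.task.get_task_detail("task_id_here")


# `await` is only valid inside a coroutine, so run everything through asyncio.
asyncio.run(main())
```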
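The list endpoints above take `page` and `size`, so reading a full result set means walking the pages. The following sketch shows one way to do that with an async generator. It assumes a page fetcher that returns a plain list and treats a page shorter than `size` as the last one; the real client's response shape is not shown in this README, so `fake_fetch` is a hypothetical stand-in rather than the actual `get_chunk_list`.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, List


async def iter_pages(
    fetch_page: Callable[[int, int], Awaitable[List[Any]]],
    size: int = 10,
) -> AsyncIterator[Any]:
    """Yield items page by page until a page comes back shorter than `size`."""
    page = 1
    while True:
        items = await fetch_page(page, size)
        for item in items:
            yield item
        if len(items) < size:
            break
        page += 1


async def fake_fetch(page: int, size: int) -> List[int]:
    """Hypothetical stand-in for a paged endpoint; serves 25 integers."""
    data = list(range(25))
    start = (page - 1) * size
    return data[start:start + size]


async def collect() -> List[int]:
    """Gather every item from every page into one list."""
    return [item async for item in iter_pages(fake_fetch, size=10)]


if __name__ == "__main__":
    print(len(asyncio.run(collect())))
```

With a real client, `fetch_page` would be a small wrapper that calls the endpoint and returns the list of items from its response.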

### whiskerrag_types

Type hints and interfaces that assist development:

```python
from whiskerrag_types.interface import DBPluginInterface, TaskEngineInterface
from whiskerrag_types.model import Knowledge, Task, Tenant, PageParams, PageResponse
```
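An interface module like this typically defines abstract contracts that a plugin implements, plus shared model types such as paged results. The sketch below shows that pattern with the standard library only. `Page`, `KnowledgeStore`, and `InMemoryStore` are hypothetical names invented for illustration; they are not the actual definitions of `DBPluginInterface` or `PageResponse`.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class Page(Generic[T]):
    """Hypothetical paged result; not the library's PageResponse."""
    items: List[T]
    total: int
    page: int
    page_size: int


class KnowledgeStore(ABC):
    """Hypothetical storage contract, standing in for a DB plugin interface."""

    @abstractmethod
    def save(self, knowledge_id: str, content: str) -> None: ...

    @abstractmethod
    def list_ids(self, page: int, page_size: int) -> Page[str]: ...


class InMemoryStore(KnowledgeStore):
    """Dictionary-backed implementation, useful for tests."""

    def __init__(self) -> None:
        self._rows: Dict[str, str] = {}

    def save(self, knowledge_id: str, content: str) -> None:
        self._rows[knowledge_id] = content

    def list_ids(self, page: int, page_size: int) -> Page[str]:
        ids = sorted(self._rows)
        start = (page - 1) * page_size
        return Page(ids[start:start + page_size], len(ids), page, page_size)
```

Code written against the abstract class can swap the in-memory store for a database-backed one without changes, which is the main benefit of a plugin interface.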

## Developer Guide

### Environment Setup

1. Clone the project

```bash
git clone https://github.com/petercat-ai/whiskerrag_toolkit.git
cd whiskerrag_toolkit
```

2. Create and activate a virtual environment

```bash
# Show the current Poetry configuration
poetry config --list

# Create virtual environments inside the project directory
poetry config virtualenvs.create true
poetry config virtualenvs.in-project true

poetry env use python3.10

# Activate the virtual environment
source .venv/bin/activate
```

3. Install dependencies

```bash
# Install project dependencies
poetry install
# Install the pre-commit hooks
pre-commit install
```

4. Run the tests

```bash
# Run all tests
poetry run pytest
# Run a specific test file
poetry run pytest tests/test_loader.py
```

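5. Common Poetry commands

```bash
# Install dependencies
poetry install

# Add a new dependency
poetry add package_name

# Add a new dev dependency (Poetry 1.2+ prefers `poetry add --group dev package_name`)
poetry add --dev package_name

# Update dependencies
poetry update

# Show environment information
poetry env info

# Show installed packages
poetry show
```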
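After activating the environment in step 2, a short script can confirm that the interpreter is actually running inside a virtual environment and satisfies the package's `>=3.9,<4.0` constraint. This is a minimal sketch using only `sys`; the function names are illustrative.

```python
import sys
from typing import Tuple


def in_virtualenv() -> bool:
    """True when running inside a venv (sys.prefix differs from sys.base_prefix)."""
    return sys.prefix != sys.base_prefix


def meets_requirement(version: Tuple[int, int]) -> bool:
    """Check a (major, minor) pair against the `>=3.9,<4.0` constraint."""
    return (3, 9) <= version < (4, 0)


if __name__ == "__main__":
    print("virtualenv:", in_virtualenv())
    print("python ok:", meets_requirement(sys.version_info[:2]))
```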

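### Development Workflow

1. Create a new branch.
2. Develop the feature and add unit tests to keep code quality high. Unit test coverage must not fall below 80%.
3. Commit your code and open a Pull Request.
4. Wait for code review and revise based on the feedback.
5. Merge the Pull Request.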
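The 80% coverage requirement in step 2 can be checked locally before opening a Pull Request. The sketch below assumes coverage is measured with coverage.py and exported in its Cobertura-style XML format, where the root `<coverage>` element carries a `line-rate` attribute between 0 and 1. The README does not say which coverage tool the project uses, so treat the workflow and the function names as assumptions.

```python
import xml.etree.ElementTree as ET


def line_coverage_percent(xml_text: str) -> float:
    """Read the overall line coverage (in percent) from a Cobertura-style report."""
    root = ET.fromstring(xml_text)
    return float(root.attrib["line-rate"]) * 100


def meets_threshold(xml_text: str, threshold: float = 80.0) -> bool:
    """True when line coverage is at or above `threshold` percent."""
    return line_coverage_percent(xml_text) >= threshold


if __name__ == "__main__":
    # Assumes a report file named coverage.xml in the working directory.
    with open("coverage.xml", encoding="utf-8") as fh:
        report = fh.read()
    print("coverage ok:", meets_threshold(report))
```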

## Project Structure

```
whiskerRAG-toolkit/
├── src/
│   ├── whiskerrag_utils/
│   ├── whiskerrag_types/
│   └── whiskerrag_client/
└── pyproject.toml
```

## Contributing

1. Fork this repository
2. Create a feature branch (`make branch name=feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

## Contact

Project maintainer: [@petercat-ai](https://github.com/petercat-ai)

Project link: [https://github.com/petercat-ai/whiskerrag_toolkit](https://github.com/petercat-ai/whiskerrag_toolkit)

            
