# Data Analysis MCP (Python)
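A data analysis server built on the Model Context Protocol (MCP), written in Python, with support for the **SSE (Server-Sent Events)** transport.
## Features
- 📊 Statistical analysis (mean, median, standard deviation, and more)
- 📈 Data visualization (chart generation)
- 🔍 Data exploration (data summaries, missing values, and more)
- 📉 Trend analysis
- 📋 Support for CSV, Excel, JSON, and other formats
- 🌐 Remote access over HTTP/SSE
- 🚀 RESTful API
## Tech stack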
- Python 3.8+
- FastAPI - modern web framework
- SSE-Starlette - Server-Sent Events support
- Uvicorn - ASGI server
- pandas - data analysis
- numpy - numerical computing
- matplotlib - data visualization
- seaborn - statistical charts
## Quick start
### Install dependencies
```bash
pip install -r requirements.txt
```
### Run the server
```bash
python main.py
```
The server starts at `http://localhost:8000`.
### View the API docs
Once the server is running, open:
- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
## API endpoints
### 1. Root endpoint
```
GET http://localhost:8000/
```
Returns server information and the list of available endpoints.
### 2. SSE connection endpoint
```
GET http://localhost:8000/sse
```
Opens a Server-Sent Events connection and receives messages pushed by the server.
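A client reading this stream has to split it into events. The sketch below follows the generic Server-Sent Events framing (fields such as `event:` and `data:`, with a blank line ending each event); the function name `parse_sse` is hypothetical, and which event names this server actually emits is not stated here.

```python
from typing import Dict, Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[Dict[str, str]]:
    """Group raw SSE lines into events; a blank line terminates an event."""
    event: Dict[str, str] = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # Blank line: dispatch the event collected so far, if any.
            if event:
                yield event
                event = {}
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive ping
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data" and "data" in event:
            event["data"] += "\n" + value  # repeated data lines are joined
        else:
            event[field] = value
    # An event without a terminating blank line is incomplete and is dropped.
```

Fed with lines from a streaming HTTP client (for example `response.iter_lines()` on an httpx streaming response to `/sse`), it yields one dict per event.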
### 3. Message endpoint
```
POST http://localhost:8000/messages
Content-Type: application/json
```
Accepts MCP JSON-RPC requests.
#### Example requests
**Initialize**
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {}
}
```
**List tools**
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/list",
  "params": {}
}
```
**Call a tool**
```json
{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "tools/call",
  "params": {
    "name": "load_data",
    "arguments": {
      "filepath": "data.csv",
      "dataset_name": "my_data"
    }
  }
}
```
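The three payloads above share one envelope, so a client can build them with a small helper. This is a minimal sketch; `make_request` and `tool_call` are hypothetical names, not part of the package.

```python
import json
from typing import Any, Dict, Optional


def make_request(request_id: int, method: str,
                 params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope like the examples above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params or {},
    }


def tool_call(request_id: int, name: str,
              arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a tool name and its arguments in a tools/call request."""
    return make_request(request_id, "tools/call",
                        {"name": name, "arguments": arguments})


if __name__ == "__main__":
    request = tool_call(3, "load_data",
                        {"filepath": "data.csv", "dataset_name": "my_data"})
    print(json.dumps(request, indent=2))
```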
## MCP tools
### 1. load-data
Loads a data file.
- Supports CSV, Excel, and JSON formats
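One plausible way to support several formats is to pick a reader from the file extension. The sketch below assumes that approach; the mapping and the `detect_format` name are illustrative, and how the server really decides is not documented here.

```python
from pathlib import Path

# Hypothetical mapping from file extension to a format label.
FORMATS = {
    ".csv": "csv",
    ".xlsx": "excel",
    ".xls": "excel",
    ".json": "json",
}


def detect_format(filepath: str) -> str:
    """Return the format label for a path, or raise for unknown extensions."""
    suffix = Path(filepath).suffix.lower()
    if suffix not in FORMATS:
        raise ValueError(f"Unsupported file type: {suffix or '(none)'}")
    return FORMATS[suffix]
```

With pandas, the three labels would correspond to `pandas.read_csv`, `pandas.read_excel`, and `pandas.read_json`.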
### 2. describe-data
Returns summary statistics for a dataset:
- Row and column counts
- Data types
- Missing-value counts
- Basic statistics
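To make the four items above concrete, here is a dependency-free sketch that computes them for a list of row dicts. It only illustrates the quantities; the server is built on pandas, where the counterparts would be `DataFrame.shape`, `DataFrame.dtypes`, `DataFrame.isna().sum()`, and `DataFrame.describe()`. The `summarize` name and the output layout are assumptions.

```python
import statistics
from typing import Any, Dict, List


def summarize(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Summarize rows: counts, value types, missing values, numeric stats."""
    columns = list(rows[0].keys()) if rows else []
    summary: Dict[str, Any] = {
        "rows": len(rows), "columns": len(columns),
        "types": {}, "missing": {}, "stats": {},
    }
    for col in columns:
        # Treat None and empty strings as missing values.
        present = [r.get(col) for r in rows if r.get(col) not in (None, "")]
        summary["types"][col] = sorted({type(v).__name__ for v in present})
        summary["missing"][col] = len(rows) - len(present)
        numeric = all(isinstance(v, (int, float)) and not isinstance(v, bool)
                      for v in present)
        if present and numeric:
            summary["stats"][col] = {
                "mean": statistics.fmean(present),
                "median": statistics.median(present),
                "min": min(present),
                "max": max(present),
            }
    return summary
```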
### 3. analyze-column
Analyzes a single column:
- Number of unique values
- Frequency distribution
- Numeric statistics
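A per-column analysis of this kind can be sketched with the standard library. The `analyze_column` name and the shape of the result are hypothetical; the tool's real output format is not shown in this README.

```python
import statistics
from collections import Counter
from typing import Any, Dict, List


def analyze_column(values: List[Any], top: int = 3) -> Dict[str, Any]:
    """Count unique values, list the most frequent, add stats if numeric."""
    present = [v for v in values if v is not None]
    counts = Counter(present)
    result: Dict[str, Any] = {
        "unique": len(counts),
        "most_common": counts.most_common(top),
    }
    numeric = all(isinstance(v, (int, float)) and not isinstance(v, bool)
                  for v in present)
    if len(present) >= 2 and numeric:
        result["mean"] = statistics.fmean(present)
        result["stdev"] = statistics.stdev(present)  # sample standard deviation
    return result
```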
### 4. correlation-analysis
Correlation analysis:
- Computes correlation coefficients between variables
- Produces a correlation matrix
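The sketch below computes a Pearson correlation matrix by hand, assuming Pearson is the coefficient in use (it is the default of pandas `DataFrame.corr()`, but the README does not say which one the tool applies). Function names are illustrative.

```python
import math
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # undefined for a constant column
    return cov / (spread_x * spread_y)


def correlation_matrix(
    columns: Dict[str, Sequence[float]]
) -> Dict[str, Dict[str, float]]:
    """Pairwise Pearson correlations between all named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}
```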
### 5. list-datasets
Lists the loaded datasets:
- Shows all datasets
- Shows basic information about each dataset
## Usage examples
### Testing with curl
**1. List available tools**
```bash
curl -X POST http://localhost:8000/messages \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/list"
  }'
```
**2. Load data**
```bash
curl -X POST http://localhost:8000/messages \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
      "name": "load_data",
      "arguments": {
        "filepath": "data.csv",
        "dataset_name": "sales"
      }
    }
  }'
```
**3. Describe the data**
```bash
curl -X POST http://localhost:8000/messages \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc": "2.0",
    "id": 3,
    "method": "tools/call",
    "params": {
      "name": "describe_data",
      "arguments": {
        "dataset_name": "sales"
      }
    }
  }'
```
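The same requests can be sent from Python with only the standard library. This sketch assumes the server returns the JSON-RPC response in the body of the POST reply; if the reply is delivered over the SSE stream instead, `send` would need adapting. `build_post` and `send` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any, Dict

MESSAGES_URL = "http://localhost:8000/messages"


def build_post(payload: Dict[str, Any],
               url: str = MESSAGES_URL) -> urllib.request.Request:
    """Prepare a POST with a JSON body, mirroring the curl commands above."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(payload: Dict[str, Any], url: str = MESSAGES_URL) -> Any:
    """Send the request and decode the reply as JSON (assumed format)."""
    with urllib.request.urlopen(build_post(payload, url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the server to be running locally.
    print(send({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}))
```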
### Configuring Claude Desktop
Add the following to the Claude Desktop configuration file:
```json
{
  "mcpServers": {
    "data-analysis": {
      "url": "http://localhost:8000/sse",
      "transport": "sse"
    }
  }
}
```
## Development
### Start the development server
```bash
python main.py
```
### Run the tests
```bash
pytest tests/
```
## License
MIT