word-to-txt-mcp

- Name: word-to-txt-mcp
- Version: 0.1.0
- Home page: https://github.com/yourusername/word-to-txt-mcp
- Summary: FastMCP server for converting Word documents to text and analyzing them
- Upload time: 2025-08-29 07:45:20
- Author: Your Name
- Requires Python: >=3.8
- License: MIT
- Keywords: mcp, fastmcp, word, document, text, analysis, docx, conversion, server, api

# Word to Text MCP Server

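A FastMCP-based server package for converting Word documents to text and analyzing them. It provides Word document processing, text extraction, and content analysis.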

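## Features

- 🔄 **Word document conversion**: converts `.docx` and `.doc` files to plain text
- 📊 **Document analysis**: document statistics, keyword extraction, and structure analysis
- 🚀 **MCP protocol**: built on the FastMCP framework, with support for several transport protocols
- 🛠️ **Easy integration**: runs as a standalone server or embeds in other applications
- 📋 **Table support**: extracts table content from Word documents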

## Installation

### Install from PyPI

```bash
pip install word-to-txt-mcp
```

### Install from source

```bash
git clone https://github.com/yourusername/word-to-txt-mcp.git
cd word-to-txt-mcp
pip install -e .
```

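## Quick Start

### Command-line usage

Start the MCP server:

```bash
# Start with the default configuration (SSE transport, port 7264)
word-to-txt-mcp

# Specify the port and transport
word-to-txt-mcp --port 8080 --transport sse

# Use standard input/output (stdio) mode
word-to-txt-mcp --transport stdio
```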

### Programmatic usage

```python
from word_to_txt_mcp import convert_word_to_text, create_mcp_server

# Convert a Word document directly
text_content = convert_word_to_text("document.docx")
print(text_content)

# Create and run the MCP server
mcp = create_mcp_server("My Document Server")
mcp.run(transport="sse", host="0.0.0.0", port=7264)
```

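## API Reference

### Core functions

#### `convert_word_to_text(word_file_path)`

Converts a Word document to text.

**Parameters:**
- `word_file_path` (str): path to the Word document

**Returns:**
- `str`: the extracted text

**Raises:**
- `FileNotFoundError`: if the Word file does not exist
- `Exception`: if an error occurs during conversion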
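
To illustrate what converting a `.docx` file to text involves, here is a minimal standard-library sketch. It is not the package's implementation (the dependency list suggests the package builds on python-docx); the function name `docx_to_text_sketch` is hypothetical, and the sketch assumes a well-formed `.docx` archive whose body lives in `word/document.xml`.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text_sketch(path):
    """Hypothetical stand-in: return one line of text per non-empty paragraph."""
    # A .docx file is a ZIP archive; opening a missing path raises FileNotFoundError.
    with zipfile.ZipFile(path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    lines = []
    # Table cells contain <w:p> elements too, so table text is picked up here.
    for paragraph in root.iter(W_NS + "p"):
        # Join the text runs (<w:t>) that make up this paragraph.
        text = "".join(node.text or "" for node in paragraph.iter(W_NS + "t"))
        if text:
            lines.append(text)
    return "\n".join(lines)
```

Like the documented function, this sketch raises `FileNotFoundError` for a missing file. It does not handle the legacy `.doc` format, which is a different binary container.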

#### `process_word_document(file_path)`

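Processes a Word document: converts it to text and runs a basic analysis.

**Parameters:**
- `file_path` (str): path to the Word document

**Returns:**
- `str`: the document content together with the basic analysis results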

#### `analyze_document_content(text_content, analysis_type="summary")`

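Analyzes document content.

**Parameters:**
- `text_content` (str): the text to analyze
- `analysis_type` (str): the type of analysis, one of:
  - `"summary"`: document summary
  - `"keywords"`: keyword extraction
  - `"structure"`: document structure analysis

**Returns:**
- `str`: the analysis result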
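
The README does not say how each analysis type is computed. As a rough illustration of dispatching on `analysis_type`, the sketch below uses deliberately naive heuristics; the function name `analyze_text_sketch`, the word-length filter, and the output strings are all assumptions, not the package's behavior.

```python
import re
from collections import Counter


def analyze_text_sketch(text_content, analysis_type="summary"):
    """Hypothetical stand-in for the three documented analysis types."""
    words = re.findall(r"\w+", text_content.lower())

    if analysis_type == "summary":
        # Basic size statistics only; no real summarization is attempted.
        return f"{len(text_content)} characters, {len(words)} words"

    if analysis_type == "keywords":
        # Naive keyword extraction: most frequent words longer than 3 characters.
        common = Counter(w for w in words if len(w) > 3).most_common(5)
        return ", ".join(f"{word} ({count})" for word, count in common)

    if analysis_type == "structure":
        # Treat blank-line-separated blocks as paragraphs.
        paragraphs = [p for p in text_content.split("\n\n") if p.strip()]
        return f"{len(paragraphs)} paragraphs"

    raise ValueError(f"unknown analysis_type: {analysis_type!r}")
```

Raising `ValueError` for an unknown type is a design choice of this sketch; the README does not document how the real function handles unsupported values.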

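### MCP tools

When running as an MCP server, the package exposes the following tools:

1. **process_word_document**: processes a Word document and converts it to text
2. **analyze_document_content**: analyzes document content
3. **echo_tool**: echoes text back (for testing)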
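
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `process_word_document`. It assumes the tool takes the same `file_path` argument as the Python function of the same name, which the README does not confirm; `build_tool_call` is a hypothetical helper.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request body for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Assumed argument name: file_path, mirroring process_word_document(file_path).
request = build_tool_call("process_word_document", {"file_path": "example.docx"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles session setup and transport, so this payload is normally produced for you rather than written by hand.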

### MCP resources

- `document://help`: returns help information
- `document://status/{file_path}`: checks the status of a document

### MCP prompts

- `analyze_document`: generates a document analysis prompt

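## Configuration

### Command-line arguments

- `--transport`: transport protocol (`stdio`, `sse`, or `streamable-http`; default: `sse`)
- `--host`: server host address (default: 0.0.0.0)
- `--port`: server port (default: 7264)
- `--name`: server name
- `--version`: show version information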
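
For readers embedding the server in their own launcher, the flags above could be mirrored with `argparse` roughly as follows. This is a sketch, not the package's actual CLI code: `build_parser_sketch` and the default for `--name` are assumptions, and the remaining defaults are taken from this README.

```python
import argparse


def build_parser_sketch():
    """Hypothetical parser mirroring the documented command-line flags."""
    parser = argparse.ArgumentParser(prog="word-to-txt-mcp")
    parser.add_argument(
        "--transport",
        choices=["stdio", "sse", "streamable-http"],
        default="sse",  # the Quick Start describes SSE as the default
    )
    parser.add_argument("--host", default="0.0.0.0")
    parser.add_argument("--port", type=int, default=7264)
    # Assumed default; the README does not state the default server name.
    parser.add_argument("--name", default="Word to Text MCP Server")
    parser.add_argument("--version", action="version", version="%(prog)s 0.1.0")
    return parser


args = build_parser_sketch().parse_args(["--port", "8080", "--transport", "stdio"])
print(args.transport, args.host, args.port)
```

Restricting `--transport` with `choices` makes the parser reject unsupported values before any server is started.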

## Usage examples

### Basic document conversion

```python
from word_to_txt_mcp import convert_word_to_text

# Convert a Word document
try:
    text = convert_word_to_text("example.docx")
    print("Document content:")
    print(text)
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Conversion failed: {e}")
```

### Document analysis

```python
from word_to_txt_mcp import (
    analyze_document_content,
    convert_word_to_text,
    process_word_document,
)

# Process the document and get the analysis result
result = process_word_document("example.docx")
print(result)

# Run a keyword analysis on the extracted text
text = convert_word_to_text("example.docx")
keywords = analyze_document_content(text, "keywords")
print(keywords)
```

### Running as an MCP server

```python
from word_to_txt_mcp import create_mcp_server

# Create the server
mcp = create_mcp_server("Document Analysis Server")

# Start the server
mcp.run(transport="sse", host="localhost", port=8080)
```

## Supported file formats

- `.docx` - Microsoft Word 2007 and later
- `.doc` - Microsoft Word 97-2003 (requires additional configuration)
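
Callers may want to reject unsupported files before attempting a conversion. The sketch below checks only the file extension against the two formats listed above; `is_supported_word_file` is a hypothetical helper, not part of the package, and it does not inspect file contents.

```python
from pathlib import Path

# Extensions listed under "Supported file formats"
SUPPORTED_SUFFIXES = {".docx", ".doc"}


def is_supported_word_file(file_path):
    """Return True if the path has a supported Word extension (case-insensitive)."""
    return Path(file_path).suffix.lower() in SUPPORTED_SUFFIXES


for name in ["report.DOCX", "memo.doc", "notes.txt"]:
    print(name, is_supported_word_file(name))
```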

## Requirements

- Python >= 3.8
- fastmcp >= 0.1.0
- python-docx >= 0.8.11

## Development

### Install development dependencies

```bash
pip install -e ".[dev]"
```

### Run tests

```bash
pytest
```

### Format code

```bash
black word_to_txt_mcp/
```

### Type checking

```bash
mypy word_to_txt_mcp/
```

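## License

MIT License - see the [LICENSE](LICENSE) file for details.

## Contributing

Issues and pull requests are welcome!

## Changelog

### v0.1.0

- Initial release
- Word document to text conversion
- Basic document analysis
- MCP protocol support
- Command-line tool

## Contact

- Author: Your Name
- Email: your.email@example.com
- Project home page: https://github.com/yourusername/word-to-txt-mcp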

            
