pdf-tools-mcp

Name: pdf-tools-mcp
Version: 0.1.3
Summary: A FastMCP-based PDF reading and manipulation tool server
Author: Junlong Li <lockonlvange@gmail.com>
Upload time: 2025-07-18 03:32:46
Requires Python: >=3.12
License: MIT
Keywords: fastmcp, mcp, pdf, pdf-manipulation, text-extraction
Repository: https://github.com/lockon-n/pdf-tools-mcp

# PDF Tools MCP Server

## English

A FastMCP-based MCP server for reading and manipulating PDF files. It extracts text from specified page ranges, reports basic file information, merges PDFs, and extracts selected pages.

### Features

- 📄 Read content from specified page ranges of PDF files
- 🔢 Support for start and end page parameters (inclusive range)
- 🛡️ Automatic handling of invalid page numbers (negative numbers, out of range, etc.)
- 📊 Get basic information about PDF files
- 🔗 Merge multiple PDF files
- ✂️ Extract specific pages from PDFs

### Installation

#### Install from PyPI

```bash
uv add pdf-tools-mcp
```

If `uv add` fails with dependency conflicts, install the server as a standalone tool instead:

```bash
uv tool install pdf-tools-mcp
```

#### Install from source

```bash
git clone https://github.com/lockon-n/pdf-tools-mcp.git
cd pdf-tools-mcp
uv sync
```

### Usage

#### Usage with Claude Desktop

Add the server to your Claude Desktop configuration file: `~/.config/claude/claude_desktop_config.json` (Linux), `%APPDATA%\Claude\claude_desktop_config.json` (Windows), or `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS):

**Development/Unpublished Servers Configuration**

```json
{
  "mcpServers": {
    "pdf-tools-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "<path/to/the/repo>/pdf-tools-mcp",
        "run",
        "pdf-tools-mcp",
        "--workspace_path",
        "</your/workspace/directory>"
      ]
    }
  }
}
```

**Published Servers Configuration**

```json
{
  "mcpServers": {
    "pdf-tools-mcp": {
      "command": "uvx",
      "args": [
        "pdf-tools-mcp",
        "--workspace_path",
        "</your/workspace/directory>"
      ]
    }
  }
}
```
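
If the configuration file already lists other servers, the entry has to be merged in rather than pasted over them. The sketch below shows one way to do that with the standard `json` module; the helper name `add_pdf_tools_server`, the `other-server` entry, and the workspace path are illustrative assumptions, not part of this package.

```python
import json


def add_pdf_tools_server(config: dict, workspace_path: str) -> dict:
    """Return a copy of `config` with the pdf-tools-mcp entry added or replaced."""
    # Copy the existing server map so the caller's dict is left untouched
    servers = dict(config.get("mcpServers", {}))
    servers["pdf-tools-mcp"] = {
        "command": "uvx",
        "args": ["pdf-tools-mcp", "--workspace_path", workspace_path],
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing configuration with one unrelated server
existing = {"mcpServers": {"other-server": {"command": "npx", "args": ["other"]}}}
updated = add_pdf_tools_server(existing, "/home/me/pdfs")
print(json.dumps(updated, indent=2))
```

The printed JSON can be written back to `claude_desktop_config.json`; the unrelated server entry is preserved.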

**Note**: For security reasons, this tool can only access files inside the workspace directory given by `--workspace_path`. Files outside that directory are not accessible.

If the server does not work or does not appear in the Claude Desktop UI, clear the uv cache with `uv cache clean`.
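
To illustrate the workspace restriction noted above, here is a minimal sketch of how a path check of this kind is commonly written with `pathlib`. It is an assumption about the general technique, not the server's actual code; `resolve_in_workspace` and the example paths are hypothetical.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, user_path: str) -> Path:
    """Resolve `user_path` against `workspace` and refuse anything outside it."""
    root = Path(workspace).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check;
    # an absolute user_path replaces root entirely and is then rejected below
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{user_path!r} is outside the workspace")
    return candidate


print(resolve_in_workspace("/tmp/workspace", "reports/q1.pdf"))
```

Resolving before comparing is the important step: a plain string prefix check would accept a path such as `reports/../../etc/passwd`.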

#### As a command line tool

```bash
# Basic usage
pdf-tools-mcp

# Specify workspace directory
pdf-tools-mcp --workspace_path /path/to/workspace
```

#### As a Python package

```python
import asyncio

from pdf_tools_mcp import read_pdf_pages, get_pdf_info, merge_pdfs, extract_pdf_pages


async def main() -> None:
    # Read pages 1-5 (inclusive)
    result = await read_pdf_pages("document.pdf", 1, 5)

    # Get PDF info
    info = await get_pdf_info("document.pdf")

    # Merge PDF files
    result = await merge_pdfs(["file1.pdf", "file2.pdf"], "merged.pdf")

    # Extract specific pages (1-based)
    result = await extract_pdf_pages("source.pdf", [1, 3, 5], "extracted.pdf")


# The tool functions are awaited, so run them inside an event loop
asyncio.run(main())
```

### Main Tool Functions

#### 1. read_pdf_pages
Reads the text content of an inclusive page range from a PDF file.

**Parameters:**
- `pdf_file_path` (str): Path to the PDF file
- `start_page` (int, default 1): First page to read (1-based, inclusive)
- `end_page` (int, default 1): Last page to read (1-based, inclusive)

**Example:**
```python
# These calls must run inside an async function (for example via asyncio.run)
# Read pages 1-5
result = await read_pdf_pages("document.pdf", 1, 5)

# Read page 10
result = await read_pdf_pages("document.pdf", 10, 10)
```

#### 2. get_pdf_info
Get basic information about a PDF file

**Parameters:**
- `pdf_file_path` (str): PDF file path

**Returns:**
- Total page count
- Title
- Author
- Creator
- Creation date
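
Whether `get_pdf_info` returns the creation date already formatted or as the raw metadata string is not stated here. Raw PDF metadata usually stores dates as `D:YYYYMMDDHHmmSS` followed by an optional timezone suffix, so the sketch below shows one way to turn such a string into a `datetime`. The helper `parse_pdf_date` is hypothetical; it assumes the full 14-digit timestamp is present and ignores the timezone suffix.

```python
from datetime import datetime


def parse_pdf_date(raw: str) -> datetime:
    """Parse the leading D:YYYYMMDDHHmmSS part of a PDF date string."""
    # Drop the "D:" prefix, then keep only the 14 timestamp digits
    digits = raw.removeprefix("D:")[:14]
    return datetime.strptime(digits, "%Y%m%d%H%M%S")


print(parse_pdf_date("D:20250718033246+00'00'"))  # 2025-07-18 03:32:46
```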

#### 3. merge_pdfs
Merge multiple PDF files

**Parameters:**
- `pdf_paths` (List[str]): List of PDF file paths to merge
- `output_path` (str): Output file path for the merged PDF

#### 4. extract_pdf_pages
Extract specific pages from a PDF

**Parameters:**
- `source_path` (str): Source PDF file path
- `page_numbers` (List[int]): List of page numbers to extract (1-based)
- `output_path` (str): Output file path
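
Because `page_numbers` is 1-based while most PDF libraries index pages from 0, a conversion step is needed somewhere. The sketch below shows that conversion in isolation; `to_zero_based` is a hypothetical helper, and raising on out-of-range pages is an assumption, since this README does not say how `extract_pdf_pages` treats them.

```python
def to_zero_based(page_numbers: list[int], total_pages: int) -> list[int]:
    """Convert 1-based page numbers to 0-based indices, rejecting out-of-range values."""
    indices = []
    for number in page_numbers:
        if not 1 <= number <= total_pages:
            raise ValueError(f"page {number} is outside 1..{total_pages}")
        indices.append(number - 1)
    return indices


print(to_zero_based([1, 3, 5], total_pages=10))  # [0, 2, 4]
```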

### Error Handling

The tool automatically handles the following situations:
- Negative page numbers: automatically adjusted to page 1
- Page numbers exceeding total PDF pages: automatically adjusted to the last page
- Start page greater than end page: automatically swapped
- File not found: returns an appropriate error message
- Insufficient permissions: returns an appropriate error message
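
The three page-number rules above can be sketched as a small pure function. This is an illustration of the described behavior rather than the server's own code: `normalize_page_range` is a hypothetical name, the PDF is assumed to have at least one page, and treating page 0 like a negative number is an assumption.

```python
def normalize_page_range(start: int, end: int, total_pages: int) -> tuple[int, int]:
    """Clamp both pages into 1..total_pages and swap them if they are reversed."""

    def clamp(page: int) -> int:
        # Below 1 becomes page 1; beyond the end becomes the last page
        return max(1, min(page, total_pages))

    start, end = clamp(start), clamp(end)
    if start > end:
        start, end = end, start
    return start, end


print(normalize_page_range(-3, 5, 10))   # (1, 5)
print(normalize_page_range(8, 99, 10))   # (8, 10)
print(normalize_page_range(7, 2, 10))    # (2, 7)
```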

### Usage Examples

```python
import asyncio

from pdf_tools_mcp import read_pdf_pages, get_pdf_info, merge_pdfs, extract_pdf_pages


async def main() -> None:
    # Get PDF info
    info = await get_pdf_info("sample.pdf")
    print(info)

    # Read first 3 pages
    content = await read_pdf_pages("sample.pdf", 1, 3)
    print(content)

    # Read last page (assuming PDF has 10 pages)
    content = await read_pdf_pages("sample.pdf", 10, 10)
    print(content)

    # Merge multiple PDFs
    result = await merge_pdfs(["part1.pdf", "part2.pdf", "part3.pdf"], "complete.pdf")
    print(result)

    # Extract specific pages
    result = await extract_pdf_pages("source.pdf", [1, 3, 5, 7], "selected.pdf")
    print(result)


asyncio.run(main())
```

### Notes

- Page ranges are inclusive: both the start page and the end page are read
- Pages without text content are skipped
- Results report the PDF's total page count and the page range actually extracted
- PDF documents in various languages are supported
- Read no more than 50 pages per call to avoid performance issues
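
For documents longer than the recommended 50 pages per call, the range can be split into batches on the caller's side. The following is a minimal sketch; `page_batches` is a hypothetical helper, not something this package provides.

```python
from collections.abc import Iterator


def page_batches(first: int, last: int, size: int = 50) -> Iterator[tuple[int, int]]:
    """Yield inclusive (start, end) ranges covering first..last, at most `size` pages each."""
    start = first
    while start <= last:
        end = min(start + size - 1, last)
        yield start, end
        start = end + 1


print(list(page_batches(1, 120)))  # [(1, 50), (51, 100), (101, 120)]
```

Each `(start, end)` pair can then be passed to `read_pdf_pages(path, start, end)` in turn.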

### Development

#### Build

```bash
uv build
```

#### Publish to PyPI

```bash
uv publish
```

#### Local Development

```bash
# Install development dependencies
uv sync

# Run tests
uv run python -m pytest

# Run server
uv run python -m pdf_tools_mcp.server
```

## License

MIT License

## Contributing

Issues and Pull Requests are welcome!

## Changelog

### 0.1.3
- Add regex search functionality for PDF content
- Add paginated search results with session management
- Add search navigation (next/prev/go to page)
- Add PDF content caching for improved performance
- Add search session cleanup and memory management

### 0.1.2
- Initial release
- Support for PDF text extraction
- Support for PDF info retrieval
- Support for PDF merging
- Support for page extraction

            
