# PDF Tools MCP Server
[English](#english) | [中文](#中文)
---
## 中文
一个基于 FastMCP 的 PDF 读取和操作工具服务器,支持从 PDF 文件的指定页面范围提取文本内容。
### 功能特性
- 📄 读取 PDF 文件指定页面范围的内容
- 🔢 支持起始和结束页面参数(包含范围)
- 🛡️ 自动处理无效页码(负数、超出范围等)
- 📊 获取 PDF 文件的基本信息
- 🔗 合并多个 PDF 文件
- ✂️ 提取 PDF 的特定页面
### 安装
#### 从 PyPI 安装
```bash
uv add pdf-tools-mcp
```
如果 `uv add` 遇到依赖冲突,建议使用:
```bash
uv tool install pdf-tools-mcp
```
#### 从源码安装
```bash
git clone https://github.com/lockon-n/pdf-tools-mcp.git
cd pdf-tools-mcp
uv sync
```
### 使用方法
#### 与 Claude Desktop 集成
添加到你的 `claude_desktop_config.json`。macOS 上该文件位于 `~/Library/Application Support/Claude/claude_desktop_config.json`,Windows 上位于 `%APPDATA%\Claude\claude_desktop_config.json`,Linux 上使用 `~/.config/claude/claude_desktop_config.json`:
**开发/未发布版本配置**
```json
{
"mcpServers": {
"pdf-tools-mcp": {
"command": "uv",
"args": [
"--directory",
"<path/to/the/repo>/pdf-tools-mcp",
"run",
"pdf-tools-mcp",
"--workspace_path",
"</your/workspace/directory>"
]
}
}
}
```
**已发布版本配置**
```json
{
"mcpServers": {
"pdf-tools-mcp": {
"command": "uvx",
"args": [
"pdf-tools-mcp",
"--workspace_path",
"</your/workspace/directory>"
]
}
}
}
```
**注意**: 出于安全考虑,此工具只能访问指定工作目录(`--workspace_path`)内的文件,无法访问工作目录之外的文件。
如果配置后无法正常工作或在UI中无法显示,请通过 `uv cache clean` 清除缓存。
#### 作为命令行工具
```bash
# 基本使用
pdf-tools-mcp
# 指定工作目录
pdf-tools-mcp --workspace_path /path/to/workspace
```
#### 作为 Python 包
```python
import asyncio

from pdf_tools_mcp import read_pdf_pages, get_pdf_info, merge_pdfs, extract_pdf_pages


async def main():
    # 读取 PDF 页面
    result = await read_pdf_pages("document.pdf", 1, 5)
    # 获取 PDF 信息
    info = await get_pdf_info("document.pdf")
    # 合并 PDF 文件
    result = await merge_pdfs(["file1.pdf", "file2.pdf"], "merged.pdf")
    # 提取特定页面
    result = await extract_pdf_pages("source.pdf", [1, 3, 5], "extracted.pdf")


# 这些函数是协程,需要在事件循环中运行
asyncio.run(main())
```
### 主要工具函数
#### 1. read_pdf_pages
读取 PDF 文件指定页面范围的内容
**参数:**
- `pdf_file_path` (str): PDF 文件路径
- `start_page` (int, 默认 1): 起始页码
- `end_page` (int, 默认 1): 结束页码
**示例:**
```python
# 读取第 1-5 页
result = await read_pdf_pages("document.pdf", 1, 5)
# 读取第 10 页
result = await read_pdf_pages("document.pdf", 10, 10)
```
#### 2. get_pdf_info
获取 PDF 文件的基本信息
**参数:**
- `pdf_file_path` (str): PDF 文件路径
**返回信息:**
- 总页数
- 标题
- 作者
- 创建者
- 创建日期
#### 3. merge_pdfs
合并多个 PDF 文件
**参数:**
- `pdf_paths` (List[str]): 要合并的 PDF 文件路径列表
- `output_path` (str): 合并后的输出文件路径
#### 4. extract_pdf_pages
从 PDF 中提取特定页面
**参数:**
- `source_path` (str): 源 PDF 文件路径
- `page_numbers` (List[int]): 要提取的页码列表(从 1 开始)
- `output_path` (str): 输出文件路径
### 错误处理
工具自动处理以下情况:
- 负数页码:自动调整为第 1 页
- 超出 PDF 总页数的页码:自动调整为最后一页
- 起始页大于结束页:自动交换
- 文件未找到:返回相应错误信息
- 权限不足:返回相应错误信息
### 使用示例
```python
import asyncio

from pdf_tools_mcp import read_pdf_pages, get_pdf_info, merge_pdfs, extract_pdf_pages


async def main():
    # 获取 PDF 信息
    info = await get_pdf_info("sample.pdf")
    print(info)
    # 读取前 3 页
    content = await read_pdf_pages("sample.pdf", 1, 3)
    print(content)
    # 读取最后一页(假设 PDF 有 10 页)
    content = await read_pdf_pages("sample.pdf", 10, 10)
    print(content)
    # 合并多个 PDF
    result = await merge_pdfs(["part1.pdf", "part2.pdf", "part3.pdf"], "complete.pdf")
    print(result)
    # 提取特定页面
    result = await extract_pdf_pages("source.pdf", [1, 3, 5, 7], "selected.pdf")
    print(result)


asyncio.run(main())
```
### 注意事项
- 页面范围使用包含区间,即起始页和结束页都包含在内
- 如果指定页面没有文本内容,将被跳过
- 返回结果会显示 PDF 总页数和实际提取的页面范围
- 支持各种语言的 PDF 文档
- 建议一次读取的页面数不超过 50 页,以避免性能问题
### 开发
#### 构建
```bash
uv build
```
#### 发布到 PyPI
```bash
uv publish
```
#### 本地开发
```bash
# 安装开发依赖
uv sync
# 运行测试
uv run python -m pytest
# 运行服务器
uv run python -m pdf_tools_mcp.server
```
---
## English
A FastMCP-based server for reading and manipulating PDF files. It extracts text from specified page ranges, reports basic document information, merges PDFs, and extracts individual pages.
### Features
- 📄 Read content from specified page ranges of PDF files
- 🔢 Support for start and end page parameters (inclusive range)
- 🛡️ Automatic handling of invalid page numbers (negative numbers, out of range, etc.)
- 📊 Get basic information about PDF files
- 🔗 Merge multiple PDF files
- ✂️ Extract specific pages from PDFs
### Installation
#### Install from PyPI
```bash
uv add pdf-tools-mcp
```
If `uv add` runs into dependency conflicts, install the package as a standalone tool instead:
```bash
uv tool install pdf-tools-mcp
```
#### Install from source
```bash
git clone https://github.com/lockon-n/pdf-tools-mcp.git
cd pdf-tools-mcp
uv sync
```
### Usage
#### Usage with Claude Desktop
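Add the server to your `claude_desktop_config.json`. On macOS the file is at `~/Library/Application Support/Claude/claude_desktop_config.json`, on Windows at `%APPDATA%\Claude\claude_desktop_config.json`, and on Linux use `~/.config/claude/claude_desktop_config.json`: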
**Development/Unpublished Servers Configuration**
```json
{
"mcpServers": {
"pdf-tools-mcp": {
"command": "uv",
"args": [
"--directory",
"<path/to/the/repo>/pdf-tools-mcp",
"run",
"pdf-tools-mcp",
"--workspace_path",
"</your/workspace/directory>"
]
}
}
}
```
**Published Servers Configuration**
```json
{
"mcpServers": {
"pdf-tools-mcp": {
"command": "uvx",
"args": [
"pdf-tools-mcp",
"--workspace_path",
"</your/workspace/directory>"
]
}
}
}
```
**Note**: For security reasons, the server can only access files inside the directory given by `--workspace_path`. It cannot access anything outside that directory.
If the server does not work or does not appear in the Claude Desktop UI, clear the uv cache with `uv cache clean`.
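The workspace restriction can be pictured with a short sketch. This is not the package's actual code: the helper name `resolve_in_workspace` and the choice to raise `PermissionError` are assumptions made for illustration, showing one common way to confine file access to a directory with `pathlib`.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, user_path: str) -> Path:
    """Resolve user_path against the workspace and reject paths outside it.

    Hypothetical helper; the real server may implement this differently.
    """
    root = Path(workspace).resolve()
    # Relative paths are joined onto the workspace root. Resolving the result
    # collapses ".." segments and symlinks before the containment check.
    candidate = (root / user_path).resolve()
    if not candidate.is_relative_to(root):
        raise PermissionError(f"{user_path!r} is outside the workspace {root}")
    return candidate
```

Resolving before comparing is the important step: a plain string-prefix check would let a path such as `../secret.pdf` slip through.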
#### As a command line tool
```bash
# Basic usage
pdf-tools-mcp
# Specify workspace directory
pdf-tools-mcp --workspace_path /path/to/workspace
```
#### As a Python package
```python
import asyncio

from pdf_tools_mcp import read_pdf_pages, get_pdf_info, merge_pdfs, extract_pdf_pages


async def main():
    # Read pages 1-5 (inclusive)
    result = await read_pdf_pages("document.pdf", 1, 5)
    # Get PDF info
    info = await get_pdf_info("document.pdf")
    # Merge PDF files
    result = await merge_pdfs(["file1.pdf", "file2.pdf"], "merged.pdf")
    # Extract specific pages (1-based)
    result = await extract_pdf_pages("source.pdf", [1, 3, 5], "extracted.pdf")


# The functions are coroutines, so they must run inside an event loop
asyncio.run(main())
```
### Main Tool Functions
#### 1. read_pdf_pages
Read content from specified page ranges of a PDF file
**Parameters:**
- `pdf_file_path` (str): PDF file path
- `start_page` (int, default 1): Starting page number (1-based, inclusive)
- `end_page` (int, default 1): Ending page number (1-based, inclusive)
**Example:**
```python
# Read pages 1-5
result = await read_pdf_pages("document.pdf", 1, 5)
# Read page 10
result = await read_pdf_pages("document.pdf", 10, 10)
```
#### 2. get_pdf_info
Get basic information about a PDF file
**Parameters:**
- `pdf_file_path` (str): PDF file path
**Returns:**
- Total page count
- Title
- Author
- Creator
- Creation date
#### 3. merge_pdfs
Merge multiple PDF files
**Parameters:**
- `pdf_paths` (List[str]): List of PDF file paths to merge
- `output_path` (str): Output file path for the merged PDF
#### 4. extract_pdf_pages
Extract specific pages from a PDF
**Parameters:**
- `source_path` (str): Source PDF file path
- `page_numbers` (List[int]): List of page numbers to extract (1-based)
- `output_path` (str): Output file path
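Because `page_numbers` is 1-based while most PDF libraries index pages from zero, a conversion step is usually needed somewhere. The sketch below illustrates that conversion only. The helper name `to_zero_based_indices` is hypothetical, and raising on out-of-range numbers is an assumption; the tool itself may skip or clamp such values.

```python
def to_zero_based_indices(page_numbers: list[int], total_pages: int) -> list[int]:
    """Convert 1-based page numbers to 0-based indices, preserving order."""
    indices = []
    for number in page_numbers:
        # Assumption: an out-of-range page number is treated as an error here.
        if not 1 <= number <= total_pages:
            raise ValueError(f"page {number} is outside 1..{total_pages}")
        indices.append(number - 1)
    return indices


print(to_zero_based_indices([1, 3, 5], total_pages=10))  # [0, 2, 4]
```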
### Error Handling
The tool automatically handles the following situations:
- Negative page numbers: automatically adjusted to page 1
- Page numbers exceeding total PDF pages: automatically adjusted to the last page
- Start page greater than end page: automatically swapped
- File not found: returns appropriate error message
- Insufficient permissions: returns appropriate error message
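The page-number rules above can be summarised as a small normalisation function. This is a minimal sketch rather than the package's implementation: the name `normalize_page_range` is hypothetical, and treating page 0 like a negative number and clamping before swapping are assumptions.

```python
def normalize_page_range(start_page: int, end_page: int, total_pages: int) -> tuple[int, int]:
    """Clamp a requested page range to [1, total_pages] and order it."""
    if total_pages < 1:
        raise ValueError("the PDF has no pages")
    # Values below 1 become page 1; values past the end become the last page.
    start = min(max(start_page, 1), total_pages)
    end = min(max(end_page, 1), total_pages)
    # A reversed range is swapped rather than rejected.
    if start > end:
        start, end = end, start
    return start, end


print(normalize_page_range(-3, 5, 10))   # (1, 5)
print(normalize_page_range(8, 99, 10))   # (8, 10)
print(normalize_page_range(7, 2, 10))    # (2, 7)
```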
### Usage Examples
```python
import asyncio

from pdf_tools_mcp import read_pdf_pages, get_pdf_info, merge_pdfs, extract_pdf_pages


async def main():
    # Get PDF info
    info = await get_pdf_info("sample.pdf")
    print(info)
    # Read first 3 pages
    content = await read_pdf_pages("sample.pdf", 1, 3)
    print(content)
    # Read last page (assuming PDF has 10 pages)
    content = await read_pdf_pages("sample.pdf", 10, 10)
    print(content)
    # Merge multiple PDFs
    result = await merge_pdfs(["part1.pdf", "part2.pdf", "part3.pdf"], "complete.pdf")
    print(result)
    # Extract specific pages
    result = await extract_pdf_pages("source.pdf", [1, 3, 5, 7], "selected.pdf")
    print(result)


asyncio.run(main())
```
### Notes
- Page ranges are inclusive: both the start page and the end page are read
- Pages without text content are skipped
- The result reports the PDF's total page count and the page range actually extracted
- PDF documents in various languages are supported
- Read no more than 50 pages per call to avoid performance issues
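To stay within the 50-page recommendation on a long document, the full range can be split into inclusive batches first. The generator below is a sketch under that assumption; `page_batches` is a hypothetical helper, not part of the package.

```python
from collections.abc import Iterator


def page_batches(total_pages: int, batch_size: int = 50) -> Iterator[tuple[int, int]]:
    """Yield inclusive (start_page, end_page) pairs covering 1..total_pages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(1, total_pages + 1, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(start + batch_size - 1, total_pages)


print(list(page_batches(120)))  # [(1, 50), (51, 100), (101, 120)]
```

Each pair can then be passed as `start_page` and `end_page` to `read_pdf_pages`, with the total page count taken from `get_pdf_info`.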
### Development
#### Build
```bash
uv build
```
#### Publish to PyPI
```bash
uv publish
```
#### Local Development
```bash
# Install development dependencies
uv sync
# Run tests
uv run python -m pytest
# Run server
uv run python -m pdf_tools_mcp.server
```
## License
MIT License
## Contributing
Issues and Pull Requests are welcome!
## Changelog
### 0.1.3
- Add regex search functionality for PDF content
- Add paginated search results with session management
- Add search navigation (next/prev/go to page)
- Add PDF content caching for improved performance
- Add search session cleanup and memory management
### 0.1.2
- Initial release
- Support for PDF text extraction
- Support for PDF info retrieval
- Support for PDF merging
- Support for page extraction
Raw data
{
"_id": null,
"home_page": null,
"name": "pdf-tools-mcp",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.12",
"maintainer_email": "Junlong Li <lockonlvange@gmail.com>",
"keywords": "fastmcp, mcp, pdf, pdf-manipulation, text-extraction",
"author": null,
"author_email": "Junlong Li <lockonlvange@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/ea/e7/86cd0f48690d5c459dec189b43c81139d862f59a8fcd1d6288b6a00da93e/pdf_tools_mcp-0.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A FastMCP-based PDF reading and manipulation tool server",
"version": "0.1.3",
"project_urls": {
"Documentation": "https://github.com/lockon-n/pdf-tools-mcp#readme",
"Homepage": "https://github.com/lockon-n/pdf-tools-mcp",
"Issues": "https://github.com/lockon-n/pdf-tools-mcp/issues",
"Repository": "https://github.com/lockon-n/pdf-tools-mcp"
},
"split_keywords": [
"fastmcp",
" mcp",
" pdf",
" pdf-manipulation",
" text-extraction"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "5a6a3a6ce048a5bb868eb84388bbcb2811ba396f39c11a3cac79dc6a8b607cff",
"md5": "49afc318f63931b2bdd74a020c71203e",
"sha256": "b24b394aba38bcd8fae5be653f4148266e4c6022b9eafcc4fbb3da41fb548756"
},
"downloads": -1,
"filename": "pdf_tools_mcp-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "49afc318f63931b2bdd74a020c71203e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.12",
"size": 11874,
"upload_time": "2025-07-18T03:32:45",
"upload_time_iso_8601": "2025-07-18T03:32:45.073140Z",
"url": "https://files.pythonhosted.org/packages/5a/6a/3a6ce048a5bb868eb84388bbcb2811ba396f39c11a3cac79dc6a8b607cff/pdf_tools_mcp-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "eae786cd0f48690d5c459dec189b43c81139d862f59a8fcd1d6288b6a00da93e",
"md5": "7261660393bab369edb903360b260b2e",
"sha256": "76d2189c97831013e1bafc4ee8d8db8dddf20b2dbc61620a76718313f8970005"
},
"downloads": -1,
"filename": "pdf_tools_mcp-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "7261660393bab369edb903360b260b2e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.12",
"size": 10692,
"upload_time": "2025-07-18T03:32:46",
"upload_time_iso_8601": "2025-07-18T03:32:46.478108Z",
"url": "https://files.pythonhosted.org/packages/ea/e7/86cd0f48690d5c459dec189b43c81139d862f59a8fcd1d6288b6a00da93e/pdf_tools_mcp-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-18 03:32:46",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "lockon-n",
"github_project": "pdf-tools-mcp#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "pdf-tools-mcp"
}