| Field | Value |
| --- | --- |
| Name | mcp-web-tools |
| Version | 0.9.0 |
| home_page | None |
| Summary | A powerful MCP server to equip LLMs with web access, search, and content extraction capabilities |
| upload_time | 2025-11-03 20:08:53 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.12 |
| license | MIT |
| keywords | extraction, llm, mcp, pdf, search, web |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# MCP Web Tools
This package provides a powerful MCP server to equip LLMs with web access, going beyond naive methods of searching, fetching and extracting content.
## Introduction
I created this package out of frustration that most MCP servers enabling web access for LLMs didn't perform as well as I had hoped. The shortcomings I wanted to fix include:
- [x] Good search results without requiring an API key
- [x] Sophisticated fetching for more complex JavaScript sites
- [x] Extracting content in nicely formatted Markdown
- [x] Support for extracting content from PDFs
- [x] Support for loading and displaying images
- [x] Capture rendered webpage screenshots for visual context
- [x] Usage options for advanced cases like loading raw HTML
## Installation
### Claude Desktop
```json
```
### Claude Code
```bash
claude mcp add web-tools uvx mcp-web-tools
```
Or to also set the Brave Search API key:
```bash
claude mcp add web-tools uvx mcp-web-tools -e BRAVE_SEARCH_API_KEY=<key>
```
Provide a [Perplexity Search API](https://www.perplexity.ai/api-platform) key to prioritize their fresh, citation-rich index:
```bash
claude mcp add web-tools uvx mcp-web-tools -e PERPLEXITY_API_KEY=<key>
```
You can set both environment variables so that search falls back from Perplexity to Brave seamlessly.
## Internals
The package is written in Python using powerful libraries and services under the hood to improve results.
### Searching
We use the [Perplexity Search API](https://www.perplexity.ai/hub/blog/introducing-the-perplexity-search-api) when a `PERPLEXITY_API_KEY` is configured. It delivers ranked snippets with citations from Perplexity's continuously refreshed index. If no Perplexity key is available, we fall back to the [Brave Search API](https://brave.com/search/api) (via `BRAVE_SEARCH_API_KEY`), then a lightweight Google workaround, and finally DuckDuckGo. While we recommend adding at least one API key, the chained fallbacks continue working for most workloads.
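The fallback order described above can be sketched roughly as follows. This is not the package's actual code: `provider_order`, `search`, and the provider callables are hypothetical names, and treating any provider error or empty result as a reason to try the next provider is an assumption.

```python
from collections.abc import Callable, Mapping

SearchFn = Callable[[str], list[dict]]


def provider_order(env: Mapping[str, str]) -> list[str]:
    """Return provider names in priority order for the given environment."""
    order = []
    if env.get("PERPLEXITY_API_KEY"):
        order.append("perplexity")
    if env.get("BRAVE_SEARCH_API_KEY"):
        order.append("brave")
    # The keyless fallbacks are always available.
    order += ["google", "duckduckgo"]
    return order


def search(query: str, providers: Mapping[str, SearchFn], env: Mapping[str, str]) -> list[dict]:
    """Try each provider in turn and return the first non-empty result list."""
    for name in provider_order(env):
        try:
            results = providers[name](query)
        except Exception:
            # Assumption: any provider error triggers the next fallback.
            continue
        if results:
            return results
    return []


if __name__ == "__main__":
    def failing(query: str) -> list[dict]:
        raise RuntimeError("quota exceeded")

    # Stub providers stand in for the real search backends.
    stubs = {
        "perplexity": failing,
        "brave": lambda q: [],
        "google": lambda q: [{"title": "hit", "url": "https://example.com"}],
        "duckduckgo": lambda q: [],
    }
    print(search("mcp", stubs, {"PERPLEXITY_API_KEY": "x"}))
```

Keeping the ordering in its own function makes the priority rules easy to check without touching the network.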
### Fetching
Web content is fetched with [Zendriver](https://github.com/stephanlensky/zendriver), a fork of [nodriver](https://github.com/ultrafunkamsterdam/nodriver/) built for stealthy, high-performance web scraping. It should stay undetected by most anti-bot solutions and can fetch content even from complex JavaScript-based sites.
### Extracting
For web extraction, we use [Trafilatura](https://trafilatura.readthedocs.io/en/latest/index.html), which consistently outperforms alternatives at extracting content from HTML pages. For PDFs, we use [PyMuPDF4LLM](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/), which similarly extracts content in a format that is easy for LLMs to read, with advanced layout support.
### Screenshots
Rendered page previews are powered by [Zendriver](https://github.com/stephanlensky/zendriver). The `view_website` tool navigates to a URL in a headless Chromium session and returns the resulting page as a PNG screenshot. By default only the current viewport is captured, but callers can request a full-page image by setting the `full_page` argument to `true`.
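The viewport versus full-page behaviour could look roughly like this when driving Zendriver directly. This is a sketch, not the `view_website` implementation: `capture_screenshot` and its parameters are hypothetical, and the calls to `zd.start`, `Browser.get`, `Tab.save_screenshot`, and `Browser.stop` reflect my understanding of Zendriver's API and should be verified against the installed version.

```python
import asyncio


async def capture_screenshot(url: str, path: str = "page.png", full_page: bool = False) -> str:
    """Save a PNG of `url`; only the viewport is captured unless `full_page` is True."""
    import zendriver as zd  # third-party, imported lazily

    browser = await zd.start(headless=True)
    try:
        tab = await browser.get(url)
        # Assumption: save_screenshot returns the path of the written file.
        return await tab.save_screenshot(path, format="png", full_page=full_page)
    finally:
        # Always shut the browser down, even if navigation fails.
        await browser.stop()


if __name__ == "__main__":
    print(asyncio.run(capture_screenshot("https://example.com", full_page=True)))
```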
## Contributing
While it's impossible to support all pages and layouts, we strive to make this package better over time. For unsupported sites, problems, or feature requests, please open an issue.
## CI, Releases, and Publishing
This repo includes a GitHub Actions workflow that:
- Runs tests via `uv` on PRs and pushes to `main`.
- On push to `main`, if `project.version` in `pyproject.toml` changed, it:
  - Builds distributions with `uv build`.
  - Creates a GitHub Release tagged `v<version>` with autogenerated notes.
  - Publishes the package to PyPI using `uv publish`.
- Merge a PR that bumps `project.version` in `pyproject.toml` to trigger a release.
Rollback:
- If a release was created erroneously, delete the GitHub Release and tag `v<version>`.
- Yank the version on PyPI if needed.
Raw data
{
"_id": null,
"home_page": null,
"name": "mcp-web-tools",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.12",
"maintainer_email": null,
"keywords": "extraction, llm, mcp, pdf, search, web",
"author": null,
"author_email": "Paul-Louis Pr\u00f6ve <mail@plpp.de>",
"download_url": "https://files.pythonhosted.org/packages/c8/1c/c6f328f84e8b3d85d164e0f2ebeb9bee7b2b4aab92bbb3dd40c842392fd3/mcp_web_tools-0.9.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A powerful MCP server to equip LLMs with web access, search, and content extraction capabilities",
"version": "0.9.0",
"project_urls": {
"Homepage": "https://github.com/pietz/mcp-web-tools",
"Issues": "https://github.com/pietz/mcp-web-tools/issues"
},
"split_keywords": [
"extraction",
" llm",
" mcp",
" pdf",
" search",
" web"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "7b02b8542414b8d3602b7a9becf9b6ae6bb301fbd3dace512fa8a9ce953b4f1b",
"md5": "228f98562687b54aacde91e5aeb8bb40",
"sha256": "4c5513e57e2004a8fd54a0a62f9122d5517a5555994183a810bd90442643f837"
},
"downloads": -1,
"filename": "mcp_web_tools-0.9.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "228f98562687b54aacde91e5aeb8bb40",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.12",
"size": 10021,
"upload_time": "2025-11-03T20:08:52",
"upload_time_iso_8601": "2025-11-03T20:08:52.384881Z",
"url": "https://files.pythonhosted.org/packages/7b/02/b8542414b8d3602b7a9becf9b6ae6bb301fbd3dace512fa8a9ce953b4f1b/mcp_web_tools-0.9.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "c81cc6f328f84e8b3d85d164e0f2ebeb9bee7b2b4aab92bbb3dd40c842392fd3",
"md5": "8586adc508739ed83f933cd65cdf038b",
"sha256": "24a1edf9aa9793564f5f1c34ba2c5fd5ccd5b89b15d1e54aa4ab93d668233314"
},
"downloads": -1,
"filename": "mcp_web_tools-0.9.0.tar.gz",
"has_sig": false,
"md5_digest": "8586adc508739ed83f933cd65cdf038b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.12",
"size": 71641,
"upload_time": "2025-11-03T20:08:53",
"upload_time_iso_8601": "2025-11-03T20:08:53.479225Z",
"url": "https://files.pythonhosted.org/packages/c8/1c/c6f328f84e8b3d85d164e0f2ebeb9bee7b2b4aab92bbb3dd40c842392fd3/mcp_web_tools-0.9.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-11-03 20:08:53",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "pietz",
"github_project": "mcp-web-tools",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "mcp-web-tools"
}