# Search MCP Server
[License: MIT](https://opensource.org/licenses/MIT)
[Python 3.10+](https://www.python.org/downloads/)
A Model Context Protocol (MCP) server that provides AI-enhanced Baidu search with intelligent reranking and web content extraction.
## ✨ Features
- 🔍 **Baidu Search Integration**: Fast and reliable search results from Baidu
- 🤖 **AI-Powered Reranking**: Uses multiple AI agents (Qwen) to intelligently rerank search results by relevance
- 📄 **Web Content Extraction**: Extract clean, readable text from web pages with pagination support
- 🎯 **Batch Processing**: Extract content from multiple URLs simultaneously
- 🌐 **MCP Standard**: Fully compliant with the Model Context Protocol for seamless integration
## 🚀 Quick Start
### Prerequisites
- Python 3.10 or higher
- [uv](https://github.com/astral-sh/uv) (recommended) or pip
- DashScope API key (for AI search features)
### Installation
#### Using uv (Recommended)
```bash
# Clone the repository
git clone https://github.com/Vist233/Google-Search-Tool.git
cd search-mcp
# Install with uv
uv pip install -e .
```
#### Using pip
```bash
pip install -e .
```
### Environment Setup
Create a `.env` file or set environment variables for AI features:
```bash
export DASHSCOPE_API_KEY="your-api-key-here"
```
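To sanity-check that the key is visible to the server process before starting it, a few lines of Python suffice. This is a generic sketch, not part of the package; `require_api_key` is a hypothetical helper name:

```python
import os

def require_api_key(name: str = "DASHSCOPE_API_KEY") -> str:
    """Return the named API key, or raise a clear error if it is missing."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"{name} is not set; AI search features will fail.")
    return key
```

Calling `require_api_key()` at startup fails fast with an actionable message instead of a later, harder-to-trace error inside the AI search path.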
## 📖 Usage
### As an MCP Server
Add to your MCP client configuration (e.g., Claude Desktop):
**For macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
**For Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "aiwebsearcher": {
      "command": "uvx",
      "args": ["aiwebsearcher"]
    }
  }
}
```
**Note**: The API key is read from the `DASHSCOPE_API_KEY` environment variable. Set it before starting the server:
```bash
# macOS/Linux
export DASHSCOPE_API_KEY="your-api-key-here"
# Windows (PowerShell)
$env:DASHSCOPE_API_KEY="your-api-key-here"
```
### Standalone Testing
```bash
# Install the package
pip install aiwebsearcher
# Set API key
export DASHSCOPE_API_KEY="your-key"
# Run the server
aiwebsearcher
```
## 🛠️ Available Tools
### 1. `search_baidu`
Execute basic Baidu search and return structured results.
**Parameters:**
- `query` (str): Search keyword
- `max_results` (int, optional): Maximum results to return (default: 5)
- `language` (str, optional): Search language (default: "zh")
**Returns:** JSON string with title, url, and abstract for each result.
**Example:**
```python
{
    "query": "人工智能发展现状",  # "current state of AI development"
    "max_results": 5
}
```
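Downstream code can parse the returned JSON string directly. The sketch below uses a hypothetical payload matching the documented shape (title, url, abstract); real payloads come from the `search_baidu` tool:

```python
import json

# Hypothetical payload illustrating the documented result shape.
raw = json.dumps([
    {"title": "Example page", "url": "https://example.com", "abstract": "A summary."},
])

results = json.loads(raw)
links = [(r["title"], r["url"]) for r in results]
```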
### 2. `AI_search_baidu`
AI-enhanced search with intelligent reranking and content extraction. Takes roughly 3x longer than `search_baidu` but returns higher-quality, ranked results with full page content.
**Parameters:**
- `query` (str): Search keyword
- `max_results` (int, optional): Initial results to fetch (default: 5, recommended 5+)
- `language` (str, optional): Search language (default: "zh")
**Returns:** JSON string with rank, title, url, and Content (full page text) for each result.
**Example:**
```python
{
    "query": "AI发展趋势 2025",  # "AI development trends 2025"
    "max_results": 12
}
```
### 3. `extractTextFromUrl`
Extract clean, readable text from a single webpage.
**Parameters:**
- `url` (str): Target webpage URL
- `follow_pagination` (bool, optional): Follow rel="next" links (default: true)
- `pagination_limit` (int, optional): Max pagination depth (default: 3)
- `timeout` (float, optional): HTTP timeout in seconds (default: 10.0)
- `user_agent` (str, optional): Custom User-Agent header
- `regular_expressions` (list[str], optional): Regex patterns to filter text
**Returns:** Extracted text content as string.
### 4. `extractTextFromUrls`
Extract text from multiple webpages in batch.
**Parameters:** Same as `extractTextFromUrl`, plus:
- `urls` (list[str]): List of target URLs
**Returns:** Combined text from all URLs, separated by double newlines.
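The documented joining behavior can be mirrored with a tiny helper (an illustrative sketch, not the tool's actual code; `combine_texts` is a hypothetical name):

```python
def combine_texts(texts: list[str]) -> str:
    # Join per-URL extractions with a blank line between them,
    # skipping empty results.
    return "\n\n".join(t.strip() for t in texts if t and t.strip())

combined = combine_texts(["Page one text.", "", "Page two text."])
```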
## 🏗️ Project Structure
```
search-mcp/
├── searcher/
│   └── src/
│       ├── server.py              # MCP server entry point
│       ├── FetchPage/
│       │   └── fetchWeb.py        # Web content extraction
│       ├── WebSearch/
│       │   ├── baiduSearchTool.py # Baidu search implementation
│       │   └── SearchAgent.py     # AI agent definitions (legacy)
│       └── useAI2Search/
│           └── SearchAgent.py     # AI-powered search orchestration
├── tests/                         # Test files
├── pyproject.toml                 # Project configuration
├── requirements.txt               # Dependencies
└── README.md                      # This file
```
## 🔧 Development
### Install Development Dependencies
```bash
uv pip install -e ".[dev]"
```
### Run Tests
```bash
pytest
```
### Code Formatting
```bash
# Format with black
black searcher/
# Lint with ruff
ruff check searcher/
```
## 📝 Configuration
### MCP Client Configuration Examples
**Minimal configuration:**
```json
{
  "mcpServers": {
    "search": {
      "command": "python",
      "args": ["server.py"],
      "cwd": "/path/to/search-mcp/searcher/src"
    }
  }
}
```
**With uv for dependency isolation:**
```json
{
  "mcpServers": {
    "search": {
      "command": "uv",
      "args": ["--directory", "/path/to/search-mcp/searcher/src", "run", "python", "server.py"]
    }
  }
}
```
## 🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- Built with [FastMCP](https://github.com/jlowin/fastmcp)
- AI models powered by [Agno](https://github.com/agno-agi/agno) and DashScope
- Search powered by [baidusearch](https://github.com/liuxingwt/baidusearch)
- Content extraction using [trafilatura](https://github.com/adbar/trafilatura)
## 📮 Contact
- GitHub: [@Vist233](https://github.com/Vist233)
- Repository: [Google-Search-Tool](https://github.com/Vist233/Google-Search-Tool)
## ⚠️ Disclaimer
This tool is for educational and research purposes. Please respect website terms of service and rate limits when scraping content.