multi-search-api

- Name: multi-search-api
- Version: 0.1.0
- Summary: Intelligent multi-provider search API with automatic fallback and caching
- Upload time: 2025-10-26 08:40:24
- Requires Python: >=3.10
- License: MIT
- Keywords: search, api, serper, brave, searxng, web-search, multi-provider
# Multi-Search-API

**Intelligent multi-provider search API with automatic fallback and caching**

[![PyPI version](https://badge.fury.io/py/multi-search-api.svg)](https://badge.fury.io/py/multi-search-api)
[![Python Support](https://img.shields.io/pypi/pyversions/multi-search-api.svg)](https://pypi.org/project/multi-search-api/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Features

- 🔄 **Automatic Fallback**: Seamlessly switches between multiple search providers
- 💾 **Smart Caching**: 1-day result caching to reduce API calls
- 🚦 **Rate Limit Handling**: Automatic detection and provider rotation on HTTP 402/429
- 🔌 **Multiple Providers**: Support for Serper, SearXNG, Brave, and Google scraping
- 🎯 **Zero Configuration**: Works out of the box with sensible defaults
- 📊 **Provider Management**: Track status, cache stats, and rate limits

## Supported Search Providers

| Provider | Type | Quality | Rate Limits | API Key Required |
|----------|------|---------|-------------|------------------|
| **Serper** | API | ⭐⭐⭐⭐⭐ Excellent | 2,500 free/month | Yes |
| **SearXNG** | Meta-search | ⭐⭐⭐⭐ Good | Unlimited | No |
| **Brave** | API | ⭐⭐⭐⭐⭐ Excellent | 1 req/sec free | Yes |
| **Google Scraper** | Scraping | ⭐⭐⭐ Fair | Use sparingly | No |

## Installation

```bash
pip install multi-search-api
```

## Quick Start

### Basic Usage

```python
from multi_search_api import SmartSearchTool

# Initialize (uses environment variables for API keys)
search = SmartSearchTool()

# Perform a search
result = search.search("Python programming tutorials")

print(f"Provider used: {result['provider']}")
print(f"Results found: {len(result['results'])}")

for item in result['results'][:3]:
    print(f"\n{item['title']}")
    print(f"{item['snippet']}")
    print(f"{item['link']}")
```

### With API Keys

```python
from multi_search_api import SmartSearchTool

# Initialize with explicit API keys
search = SmartSearchTool(
    serper_api_key="your-serper-key",
    brave_api_key="your-brave-key"
)

result = search.search("AI news 2025", num_results=10)
```

### Environment Variables

Create a `.env` file:

```env
SERPER_API_KEY=your_serper_api_key_here
BRAVE_API_KEY=your_brave_api_key_here
```

The tool will automatically load these keys.

## Advanced Usage

### Recent Content Search

```python
import asyncio
from multi_search_api import SmartSearchTool

async def search_recent():
    search = SmartSearchTool()

    # Search for content from last 14 days
    results = await search.search_recent_content(
        query="AI breakthroughs",
        max_results=10,
        days_back=14,
        language="en"
    )

    return results

results = asyncio.run(search_recent())
```

### Cache Management

```python
search = SmartSearchTool()

# Get cache statistics
stats = search.get_status()
print(f"Cache entries: {stats['cache']['total_entries']}")

# Clear expired cache entries
search.clear_cache()

# Disable caching
search.disable_cache()

# Re-enable caching
search.enable_cache()
```

### Rate Limit Management

```python
search = SmartSearchTool()

# Check provider status
status = search.get_status()
print(f"Active providers: {status['providers']}")
print(f"Rate limited: {status['rate_limited_providers']}")

# Reset rate limit tracking (e.g., new day)
search.reset_rate_limits()
```

### CrewAI Integration

```python
from crewai import Agent, Task
from multi_search_api import SmartSearchTool

search_tool = SmartSearchTool()

researcher = Agent(
    role='Research Analyst',
    goal='Find relevant information on the web',
    tools=[search_tool],
    verbose=True
)

task = Task(
    description="Research the latest AI developments",
    agent=researcher
)
```

## How It Works

### Provider Priority

1. **Serper** - Best quality results, 2,500 free searches/month
2. **SearXNG** - Free unlimited searches, variable quality
3. **Brave** - Excellent quality, 1 req/sec limit on free tier
4. **Google Scraper** - Last resort fallback

### Automatic Fallback

When a provider fails or hits rate limits (HTTP 402/429), the tool automatically:

1. Detects the failure
2. Marks the provider as rate-limited for the session
3. Tries the next available provider
4. Caches successful results to minimize future API calls
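The steps above amount to a priority loop over providers. Here is an illustrative sketch (the `Provider` stub and `search_with_fallback` helper are assumptions for the example, not the library's internals):

```python
from dataclasses import dataclass
from typing import Callable

# HTTP status codes treated as rate-limit signals
RATE_LIMIT_CODES = {402, 429}


@dataclass
class Provider:
    name: str
    fetch: Callable  # fetch(query) -> (http_status, results)


def search_with_fallback(providers, query, rate_limited=None):
    """Try providers in priority order, skipping any marked rate-limited."""
    rate_limited = rate_limited if rate_limited is not None else set()
    for provider in providers:
        if provider.name in rate_limited:
            continue
        status, results = provider.fetch(query)
        if status in RATE_LIMIT_CODES:
            rate_limited.add(provider.name)  # skip for the rest of the session
            continue
        return {"provider": provider.name, "results": results}
    raise RuntimeError("All providers failed or are rate-limited")
```

A rate-limited provider stays in the `rate_limited` set until `reset_rate_limits()` clears it, so subsequent searches go straight to the next provider in line.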

### Caching Strategy

- Results are cached for 24 hours
- Cache keys based on: query, num_results, language
- Automatic cleanup of expired entries
- Optional cache disable for real-time needs
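That strategy can be illustrated with a small TTL cache keyed on (query, num_results, language). This is a sketch of the idea, not the package's actual `SearchResultCache` implementation:

```python
import hashlib
import time

TTL_SECONDS = 24 * 60 * 60  # results are kept for 24 hours


def cache_key(query: str, num_results: int, language: str) -> str:
    """Derive a stable key from the parameters that define a search."""
    raw = f"{query}|{num_results}|{language}"
    return hashlib.sha256(raw.encode()).hexdigest()


class TTLCache:
    def __init__(self):
        self._store = {}  # key -> (expiry_timestamp, value)

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(key)
        if entry is None or entry[0] < now:
            self._store.pop(key, None)  # drop expired entries on access
            return None
        return entry[1]

    def set(self, key, value, now=None):
        now = time.time() if now is None else now
        self._store[key] = (now + TTL_SECONDS, value)
```

Because the key covers `num_results` and `language`, changing either parameter produces a fresh lookup rather than a stale hit.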

## API Reference

### SmartSearchTool

```python
SmartSearchTool(
    ollama_api_key: str | None = None,
    serper_api_key: str | None = None,
    brave_api_key: str | None = None,
    searxng_instance: str | None = None,
    enable_cache: bool = True
)
```

#### Methods

- `search(query: str, **kwargs) -> dict`: Perform a search
- `search_recent_content(query: str, max_results: int, days_back: int, language: str) -> list`: Search recent content
- `get_status() -> dict`: Get provider and cache status
- `clear_cache()`: Clear expired cache entries
- `reset_rate_limits()`: Reset rate limit tracking
- `disable_cache()`: Disable caching
- `enable_cache()`: Enable caching
- `run(query: str) -> str`: CrewAI-compatible search method

### Search Result Format

```python
{
    "query": "search query",
    "provider": "SerperProvider",
    "cache_hit": False,
    "timestamp": "2025-10-26T10:30:00",
    "results": [
        {
            "title": "Result Title",
            "snippet": "Result description or snippet",
            "link": "https://example.com",
            "source": "serper"
        },
        # ... more results
    ]
}
```

## Getting API Keys

### Serper (Recommended)

1. Visit [serper.dev](https://serper.dev)
2. Sign up for a free account
3. Get 2,500 free searches per month
4. Copy your API key

### Brave Search

1. Visit [brave.com/search/api](https://brave.com/search/api/)
2. Sign up for API access
3. Free tier: 1 request/second
4. Copy your subscription token

### SearXNG (No Key Needed)

SearXNG is automatically configured with public instances. No setup required!

## Configuration

### Custom Cache Directory

```python
from multi_search_api.cache import SearchResultCache

cache = SearchResultCache(cache_file="custom/path/cache.json")
```

### Custom SearXNG Instance

```python
search = SmartSearchTool(searxng_instance="https://your-searxng.com")
```

## Development

```bash
# Clone repository
git clone https://github.com/Joopsnijder/multi-search-api.git
cd multi-search-api

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run with coverage
pytest --cov=multi_search_api --cov-report=html

# Format code
ruff format .

# Lint code
ruff check .
```

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Author

**Joop Snijder**

## Changelog

### 0.1.0 (2025-10-26)

- Initial release
- Support for Serper, SearXNG, Brave, and Google scraping
- Automatic fallback and rate limit handling
- 24-hour result caching
- CrewAI integration support

            
