openzim-mcp

- Name: openzim-mcp
- Version: 0.3.3
- Summary: OpenZIM MCP - ZIM MCP Server that enables AI models to access and search ZIM format knowledge bases offline
- Upload time: 2025-09-15 17:44:14
- Requires Python: >=3.12
- Keywords: zim, openzim, mcp, model-context-protocol, ai, llm, knowledge-base, offline, wikipedia, search, libzim

# OpenZIM MCP Server

<!-- Build and Quality Badges -->
[![CI](https://github.com/cameronrye/openzim-mcp/workflows/CI/badge.svg)](https://github.com/cameronrye/openzim-mcp/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/cameronrye/openzim-mcp/branch/main/graph/badge.svg)](https://codecov.io/gh/cameronrye/openzim-mcp)
[![CodeQL](https://github.com/cameronrye/openzim-mcp/workflows/CodeQL/badge.svg)](https://github.com/cameronrye/openzim-mcp/actions/workflows/codeql.yml)
[![Security Rating](https://sonarcloud.io/api/project_badges/measure?project=cameronrye_openzim-mcp&metric=security_rating)](https://sonarcloud.io/summary/new_code?id=cameronrye_openzim-mcp)

<!-- Package and Version Badges -->
[![PyPI version](https://badge.fury.io/py/openzim-mcp.svg)](https://badge.fury.io/py/openzim-mcp)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/openzim-mcp)](https://pypi.org/project/openzim-mcp/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/openzim-mcp)](https://pypi.org/project/openzim-mcp/)
[![GitHub release (latest by date)](https://img.shields.io/github/v/release/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/releases)

<!-- Code Quality and Standards -->
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
[![Type checked: mypy](https://img.shields.io/badge/type%20checked-mypy-blue)](https://mypy-lang.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

<!-- Community and Contribution -->
[![GitHub issues](https://img.shields.io/github/issues/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/issues)
[![GitHub pull requests](https://img.shields.io/github/issues-pr/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/pulls)
[![GitHub contributors](https://img.shields.io/github/contributors/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/graphs/contributors)
[![GitHub stars](https://img.shields.io/github/stars/cameronrye/openzim-mcp?style=social)](https://github.com/cameronrye/openzim-mcp/stargazers)

## 🧠 Built for LLM Intelligence

**OpenZIM MCP transforms static ZIM archives into dynamic knowledge engines for Large Language Models.** Unlike basic file readers, this tool provides the *intelligent, structured access* that LLMs need to effectively navigate and understand vast knowledge repositories.

🚀 **Why LLMs Love OpenZIM MCP:**
- **Smart Navigation**: Browse by namespace (articles, metadata, media) instead of blind searching
- **Context-Aware Discovery**: Get article structure, relationships, and metadata for deeper understanding
- **Intelligent Search**: Advanced filtering, auto-complete suggestions, and relevance-ranked results
- **Performance Optimized**: Cached operations and pagination prevent timeouts on massive archives
- **Relationship Mapping**: Extract internal/external links to understand content connections

Whether you're building a research assistant, knowledge chatbot, or content analysis system, OpenZIM MCP gives your LLM the structured access patterns it needs to unlock the full potential of offline knowledge archives. No more fumbling through raw text dumps! 🎯

**OpenZIM MCP** is a modern, secure, and high-performance MCP (Model Context Protocol) server that enables AI models to access and search [ZIM format](https://en.wikipedia.org/wiki/ZIM_(file_format)) knowledge bases offline.

[ZIM](https://en.wikipedia.org/wiki/ZIM_(file_format)) (Zeno IMproved) is an open file format developed by the [openZIM project](https://openzim.org/), designed specifically for offline storage and access to website content. The format supports high compression rates using Zstandard compression (default since 2021) and enables fast full-text searching, making it ideal for storing entire Wikipedia content and other large reference materials in relatively compact files. The openZIM project is sponsored by Wikimedia CH and supported by the Wikimedia Foundation, ensuring the format's continued development and adoption for offline knowledge access, especially in environments without reliable internet connectivity.

## ✨ Features

- 🔒 **Security First**: Comprehensive input validation and path traversal protection
- ⚡ **High Performance**: Intelligent caching and optimized ZIM file operations
- 🧠 **Smart Retrieval**: Automatic fallback from direct access to search-based retrieval for reliable entry access
- 🧪 **Well Tested**: 90%+ test coverage with comprehensive test suite
- 🏗️ **Modern Architecture**: Modular design with dependency injection
- 📝 **Type Safe**: Full type annotations throughout the codebase
- 🔧 **Configurable**: Flexible configuration with validation
- 📊 **Observable**: Structured logging and health monitoring

## 🚀 Quick Start

### Installation

```bash
# Clone the repository
git clone <repository-url>
cd openzim-mcp

# Install dependencies
uv sync

# Install development dependencies (optional)
uv sync --dev
```

### Prepare ZIM Files

Download ZIM files (e.g., Wikipedia or Wiktionary) from the [Kiwix Library](https://browse.library.kiwix.org/) and place them in a directory:

```bash
mkdir ~/zim-files
# Download ZIM files to ~/zim-files/
```

### Running the Server

```bash
# Run the server as a Python module
python -m openzim_mcp /path/to/zim/files

# Or using uv
uv run python -m openzim_mcp /path/to/zim/files

# Or using make
make run ZIM_DIR=/path/to/zim/files
```

### MCP Configuration

Add to your MCP client configuration:

```json
{
  "openzim-mcp": {
    "command": "uv",
    "args": [
      "--directory",
      "/path/to/openzim-mcp",
      "run",
      "python",
      "-m",
      "openzim_mcp",
      "/path/to/zim/files"
    ]
  }
}
```

## 🛠️ Development

### Running Tests

```bash
# Run all tests
make test

# Run tests with coverage
make test-cov

# Run specific test file
uv run pytest tests/test_security.py -v

# Run tests with ZIM test data (comprehensive testing)
make test-with-zim-data

# Run integration tests only
make test-integration

# Run tests that require ZIM test data
make test-requires-zim-data
```

### ZIM Test Data Integration

OpenZIM MCP integrates with the official [zim-testing-suite](https://github.com/openzim/zim-testing-suite) for comprehensive testing with real ZIM files:

```bash
# Download essential test files (basic testing)
make download-test-data

# Download all test files (comprehensive testing)
make download-test-data-all

# List available test files
make list-test-data

# Clean downloaded test data
make clean-test-data
```

The test data includes:
- **Basic files**: Small ZIM files for essential testing
- **Real content**: Actual Wikipedia/Wikibooks content for integration testing
- **Invalid files**: Malformed ZIM files for error handling testing
- **Special cases**: Embedded content, split files, and edge cases

Test files are automatically organized by category and priority level.

### Code Quality

```bash
# Format code
make format

# Run linting
make lint

# Type checking
make type-check

# Run all checks
make check
```

### Project Structure

```text
openzim-mcp/
├── openzim_mcp/                 # Main package
│   ├── __init__.py              # Package initialization
│   ├── __main__.py              # Module entry point
│   ├── main.py                  # Main entry point
│   ├── server.py                # MCP server implementation
│   ├── config.py                # Configuration management
│   ├── security.py              # Security and validation
│   ├── cache.py                 # Caching functionality
│   ├── content_processor.py     # Content processing
│   ├── zim_operations.py        # ZIM file operations
│   ├── exceptions.py            # Custom exceptions
│   └── constants.py             # Application constants
├── tests/                       # Test suite
├── pyproject.toml               # Project configuration
├── Makefile                     # Development commands
└── README.md                    # This file
```

---

## 📚 API Reference

### Available Tools

### list_zim_files - List all ZIM files in allowed directories

No parameters required.

### search_zim_file - Search within ZIM file content

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `query` (string): Search query term

**Optional parameters:**

- `limit` (integer, default: 10): Maximum number of results to return
- `offset` (integer, default: 0): Starting offset for results (for pagination)
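
The `limit` and `offset` parameters can be combined to page through results. The following is a minimal sketch that only builds the tool-call payloads in the shape used by the examples in this README; `search_pages` is a hypothetical helper, not part of OpenZIM MCP, and it assumes `offset` counts results rather than pages.

```python
from typing import Iterator


def search_pages(zim_file_path: str, query: str, page_size: int = 10,
                 max_pages: int = 3) -> Iterator[dict]:
    """Yield one search_zim_file tool call per page of results."""
    for page in range(max_pages):
        yield {
            "name": "search_zim_file",
            "arguments": {
                "zim_file_path": zim_file_path,
                "query": query,
                "limit": page_size,
                # Assumption: offset is a result index, so page N starts at N * page_size.
                "offset": page * page_size,
            },
        }
```

Each yielded dictionary can be serialized with `json.dumps` and sent through whatever MCP client is in use.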

### get_zim_entry - Get detailed content of a specific entry in a ZIM file

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `entry_path` (string): Entry path, e.g., 'A/Some_Article'

**Optional parameters:**

- `max_content_length` (integer, default: 100000, minimum: 1000): Maximum length of returned content

**Smart Retrieval Features:**

- **Automatic Fallback**: If direct path access fails, automatically searches for the entry and uses the exact path found
- **Path Mapping Cache**: Caches successful path mappings for improved performance on repeated access
- **Enhanced Error Guidance**: Provides clear guidance when entries cannot be found, suggesting alternative approaches
- **Transparent Operation**: Works seamlessly regardless of path encoding differences (spaces vs underscores, URL encoding, etc.)

### get_zim_metadata - Get ZIM file metadata from M namespace entries

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file

**Returns:**
JSON string containing ZIM metadata including entry counts, archive information, and metadata entries like title, description, language, creator, etc.

### get_main_page - Get the main page entry from W namespace

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file

**Returns:**
Main page content or information about the main page entry.

### list_namespaces - List available namespaces and their entry counts

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file

**Returns:**
JSON string containing namespace information with entry counts, descriptions, and sample entries for each namespace (C, M, W, X, etc.).

### browse_namespace - Browse entries in a specific namespace with pagination

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `namespace` (string): Namespace to browse (C, M, W, X, A, I, etc.)

**Optional parameters:**

- `limit` (integer, default: 50, range: 1-200): Maximum number of entries to return
- `offset` (integer, default: 0): Starting offset for pagination

**Returns:**
JSON string containing namespace entries with titles, content previews, and pagination information.

### search_with_filters - Search within ZIM file content with advanced filters

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `query` (string): Search query term

**Optional parameters:**

- `namespace` (string): Optional namespace filter (C, M, W, X, etc.)
- `content_type` (string): Optional content type filter (text/html, text/plain, etc.)
- `limit` (integer, default: 10, range: 1-100): Maximum number of results to return
- `offset` (integer, default: 0): Starting offset for pagination

**Returns:**
Filtered search results with namespace and content type information.

### get_search_suggestions - Get search suggestions and auto-complete

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `partial_query` (string): Partial search query (minimum 2 characters)

**Optional parameters:**

- `limit` (integer, default: 10, range: 1-50): Maximum number of suggestions to return

**Returns:**
JSON string containing search suggestions based on article titles and content.
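
A client can check the documented constraints before calling the tool. This is a minimal sketch of such a check; `validate_suggestion_args` is a hypothetical helper, and stripping surrounding whitespace before measuring the query length is an assumption rather than documented server behavior.

```python
def validate_suggestion_args(partial_query: str, limit: int = 10) -> dict:
    """Return arguments for get_search_suggestions or raise ValueError."""
    query = partial_query.strip()  # assumption: whitespace does not count toward the minimum
    if len(query) < 2:
        raise ValueError("partial_query must be at least 2 characters")
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    return {"partial_query": query, "limit": limit}
```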

### get_article_structure - Extract article structure and metadata

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `entry_path` (string): Entry path, e.g., 'C/Some_Article'

**Returns:**
JSON string containing article structure including headings, sections, metadata, and word count.

### extract_article_links - Extract internal and external links from an article

**Required parameters:**

- `zim_file_path` (string): Path to the ZIM file
- `entry_path` (string): Entry path, e.g., 'C/Some_Article'

**Returns:**
JSON string containing categorized links (internal, external, media) with titles and metadata.

---

## Examples

### Listing ZIM files

```json
{
  "name": "list_zim_files"
}
```

Response:

```plain
Found 1 ZIM files in 1 directories:

[
  {
    "name": "wikipedia_en_100_2025-08.zim",
    "path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "directory": "C:\\zim",
    "size": "310.77 MB",
    "modified": "2025-09-11T10:20:50.148427"
  }
]
```
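
The response above is a plain-text summary line followed by a JSON array. A client that wants structured data could split the two, as in this minimal sketch; `parse_listing` is a hypothetical helper, and it assumes the summary line never contains a `[` character.

```python
import json


def parse_listing(response_text: str) -> list[dict]:
    """Extract the JSON array that follows the summary line."""
    start = response_text.index("[")  # assumption: first '[' begins the JSON array
    return json.loads(response_text[start:])
```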

### Searching ZIM files

```json
{
  "name": "search_zim_file",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "query": "biology",
    "limit": 3
  }
}
```

Response:

```plain
Found 51 matches for "biology", showing 1-3:

## 1. Taxonomy (biology)
Path: Taxonomy_(biology)
Snippet: #  Taxonomy (biology) Part of a series on
---
Evolutionary biology
Darwin's finches by John Gould

  * Index
  * Introduction
  * [Main](Evolution "Evolution")
  * Outline

## 2. Protein
Path: Protein
Snippet: #  Protein A representation of the 3D structure of the protein myoglobin showing turquoise α-helices. This protein was the first to have its structure solved by X-ray crystallography. Toward the right-center among the coils, a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).

## 3. Ant
Path: Ant
Snippet: #  Ant Ants
Temporal range: Late Aptian – Present
---
Fire ants
[Scientific classification](Taxonomy_\(biology\) "Taxonomy \(biology\)")
Kingdom:  | [Animalia](Animal "Animal")
Phylum:  | [Arthropoda](Arthropod "Arthropod")
Class:  | [Insecta](Insect "Insect")
Order:  | Hymenoptera
Infraorder:  | Aculeata
Superfamily:  |
Latreille, 1809[1]
Family:  |
Latreille, 1809
```

### Getting ZIM entries

```json
{
  "name": "get_zim_entry",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "entry_path": "Protein",
    "max_content_length": 1500
  }
}
```

Response:

```plain
# Protein

Path: Protein
Type: text/html
## Content

#  Protein

A representation of the 3D structure of the protein myoglobin showing turquoise α-helices. This protein was the first to have its structure solved by X-ray crystallography. Toward the right-center among the coils, a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).

**Proteins** are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, responding to stimuli, providing structure to cells and organisms, and transporting molecules from one location to another. Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes, and which usually results in protein folding into a specific 3D structure that determines its activity.

A linear chain of amino acid residues is called a polypeptide. A protein contains at least one long polypeptide. Short polypeptides, containing less than 20–30 residues, are rarely considered to be proteins and are commonly called peptides.

... [Content truncated, total of 56,202 characters, only showing first 1,500 characters] ...
```

### Smart Retrieval in Action

**Example: Automatic path resolution**

```json
{
  "name": "get_zim_entry",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "entry_path": "A/Test Article"
  }
}
```

Response (showing smart retrieval working):

```plain
# Test Article

Requested Path: A/Test Article
Actual Path: A/Test_Article
Type: text/html

## Content

# Test Article

This article demonstrates the smart retrieval system automatically handling
path encoding differences. The system tried "A/Test Article" directly,
then automatically searched and found "A/Test_Article".

... [Content continues] ...
```

### get_server_health - Get server health and statistics

No parameters required.

**Returns:**

- Server status and performance metrics
- Cache statistics
- Configuration information
- Instance tracking information
- Conflict detection results

**Example Response:**

```json
{
  "status": "healthy",
  "server_name": "openzim-mcp",
  "allowed_directories": 1,
  "cache": {
    "enabled": true,
    "size": 1,
    "max_size": 100,
    "ttl_seconds": 3600
  },
  "instance_tracking": {
    "active_instances": 1,
    "conflicts_detected": 0
  }
}
```

### get_server_configuration - Get detailed server configuration

No parameters required.

**Returns:**
Comprehensive server configuration including diagnostics, validation results, and conflict detection.

**Example Response:**

```json
{
  "configuration": {
    "server_name": "openzim-mcp",
    "allowed_directories": ["/path/to/zim/files"],
    "cache_enabled": true,
    "config_hash": "abc123...",
    "server_pid": 12345
  },
  "diagnostics": {
    "validation_status": "healthy",
    "conflicts_detected": [],
    "warnings": [],
    "recommendations": []
  }
}
```

### diagnose_server_state - Comprehensive server diagnostics

No parameters required.

**Returns:**
Detailed diagnostic information including instance conflicts, configuration validation, file accessibility checks, and actionable recommendations.

**Example Response:**

```json
{
  "status": "healthy",
  "server_info": {
    "pid": 12345,
    "server_name": "openzim-mcp",
    "config_hash": "abc123..."
  },
  "conflicts": [],
  "issues": [],
  "recommendations": ["Server appears to be running normally"],
  "environment_checks": {
    "directories_accessible": true,
    "cache_functional": true
  }
}
```

### resolve_server_conflicts - Identify and resolve server conflicts

No parameters required.

**Returns:**
Results of conflict resolution including cleanup actions and recommendations.

**Example Response:**

```json
{
  "status": "success",
  "cleanup_results": {
    "stale_instances_removed": 2
  },
  "conflicts_found": [],
  "actions_taken": ["Removed 2 stale instance files"],
  "recommendations": ["No active conflicts detected"]
}
```

### Additional Search Examples

**Computer-related search:**

```json
{
  "name": "search_zim_file",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "query": "computer",
    "limit": 2
  }
}
```

Response:
```plain
Found 39 matches for "computer", showing 1-2:

## 1. Video game
Path: Video_game
Snippet: #  Video game First-generation _Pong_ console at the Computerspielemuseum Berlin
---
Platforms

## 2. Protein
Path: Protein
Snippet: #  Protein A representation of the 3D structure of the protein myoglobin showing turquoise α-helices. This protein was the first to have its structure solved by X-ray crystallography. Toward the right-center among the coils, a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).
```

**Getting detailed content:**

```json
{
  "name": "get_zim_entry",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "entry_path": "Evolution",
    "max_content_length": 1500
  }
}
```

Response:
```plain
# Evolution

Path: Evolution
Type: text/html
## Content

#  Evolution

Part of the Biology series on
---
****
Mechanisms and processes

  * Adaptation
  * Genetic drift
  * Gene flow
  * History of life
  * Maladaptation
  * Mutation
  * Natural selection
  * Neutral theory
  * Population genetics
  * Speciation

... [Content truncated, total of 110,237 characters, only showing first 1,500 characters] ...
```

### 🎯 Advanced Knowledge Retrieval Examples

**Getting ZIM metadata:**

```json
{
  "name": "get_zim_metadata",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim"
  }
}
```

Response:
```json
{
  "entry_count": 100000,
  "all_entry_count": 120000,
  "article_count": 80000,
  "media_count": 20000,
  "metadata_entries": {
    "Title": "Wikipedia (English)",
    "Description": "Wikipedia articles in English",
    "Language": "eng",
    "Creator": "Kiwix",
    "Date": "2025-08-15"
  }
}
```

**Browsing a namespace:**

```json
{
  "name": "browse_namespace",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "namespace": "C",
    "limit": 5,
    "offset": 0
  }
}
```

Response:
```json
{
  "namespace": "C",
  "total_in_namespace": 80000,
  "offset": 0,
  "limit": 5,
  "returned_count": 5,
  "has_more": true,
  "entries": [
    {
      "path": "C/Biology",
      "title": "Biology",
      "content_type": "text/html",
      "preview": "Biology is the scientific study of life..."
    }
  ]
}
```
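
The `has_more` flag in the response can drive a paging loop. The sketch below assumes a hypothetical `call_tool(name, arguments)` callable that sends the request through your MCP client and returns the parsed JSON response; it is not an OpenZIM MCP function.

```python
from typing import Callable, Iterator


def iter_namespace(call_tool: Callable[[str, dict], dict], zim_file_path: str,
                   namespace: str, page_size: int = 50) -> Iterator[dict]:
    """Yield every entry in a namespace, one browse_namespace page at a time."""
    offset = 0
    while True:
        page = call_tool("browse_namespace", {
            "zim_file_path": zim_file_path,
            "namespace": namespace,
            "limit": page_size,
            "offset": offset,
        })
        entries = page.get("entries", [])
        yield from entries
        # Stop when the server reports no further pages or returns an empty page.
        if not page.get("has_more") or not entries:
            break
        offset += len(entries)
```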

**Filtered search:**

```json
{
  "name": "search_with_filters",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "query": "evolution",
    "namespace": "C",
    "content_type": "text/html",
    "limit": 3
  }
}
```

**Getting article structure:**

```json
{
  "name": "get_article_structure",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "entry_path": "C/Evolution"
  }
}
```

Response:
```json
{
  "title": "Evolution",
  "path": "C/Evolution",
  "content_type": "text/html",
  "headings": [
    {"level": 1, "text": "Evolution", "id": "evolution"},
    {"level": 2, "text": "History", "id": "history"},
    {"level": 2, "text": "Mechanisms", "id": "mechanisms"}
  ],
  "sections": [
    {
      "title": "Evolution",
      "level": 1,
      "content_preview": "Evolution is the change in heritable traits...",
      "word_count": 150
    }
  ],
  "word_count": 5000
}
```

**Getting search suggestions:**

```json
{
  "name": "get_search_suggestions",
  "arguments": {
    "zim_file_path": "C:\\zim\\wikipedia_en_100_2025-08.zim",
    "partial_query": "bio",
    "limit": 5
  }
}
```

Response:
```json
{
  "partial_query": "bio",
  "suggestions": [
    {"text": "Biology", "path": "C/Biology", "type": "title_start_match"},
    {"text": "Biochemistry", "path": "C/Biochemistry", "type": "title_start_match"},
    {"text": "Biodiversity", "path": "C/Biodiversity", "type": "title_start_match"}
  ],
  "count": 3
}
```

### 🔧 Server Management and Diagnostics Examples

**Getting server health:**

```json
{
  "name": "get_server_health"
}
```

Response:
```json
{
  "status": "healthy",
  "server_name": "openzim-mcp",
  "uptime_info": {
    "process_id": 12345,
    "started_at": "2025-09-14T10:30:00"
  },
  "cache_performance": {
    "enabled": true,
    "size": 15,
    "max_size": 100,
    "hit_rate": 0.85
  },
  "instance_tracking": {
    "active_instances": 1,
    "conflicts_detected": 0
  }
}
```

**Diagnosing server state:**

```json
{
  "name": "diagnose_server_state"
}
```

Response:
```json
{
  "status": "healthy",
  "server_info": {
    "pid": 12345,
    "server_name": "openzim-mcp",
    "config_hash": "abc123def456..."
  },
  "conflicts": [],
  "issues": [],
  "recommendations": ["Server appears to be running normally. No issues detected."],
  "environment_checks": {
    "directories_accessible": true,
    "cache_functional": true,
    "zim_files_found": 5
  }
}
```

**Resolving server conflicts:**

```json
{
  "name": "resolve_server_conflicts"
}
```

Response:
```json
{
  "status": "success",
  "cleanup_results": {
    "stale_instances_removed": 2,
    "files_cleaned": ["/home/user/.openzim_mcp_instances/server_99999.json"]
  },
  "conflicts_found": [],
  "actions_taken": ["Removed 2 stale instance files"],
  "recommendations": ["No active conflicts detected after cleanup"]
}
```

---

## 🎯 ZIM Entry Retrieval Best Practices

### Smart Retrieval System

OpenZIM MCP implements an intelligent entry retrieval system that automatically handles path encoding inconsistencies common in ZIM files:

**How It Works:**
1. **Direct Access First**: Attempts to retrieve the entry using the provided path exactly as given
2. **Automatic Fallback**: If direct access fails, automatically searches for the entry using various search terms
3. **Path Mapping Cache**: Caches successful path mappings to improve performance for repeated access
4. **Enhanced Error Guidance**: Provides clear guidance when entries cannot be found
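
The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `EntryResolver`, `read_entry`, and `search_paths` are hypothetical names, with `read_entry` returning `None` for a missing path and `search_paths` returning candidate paths for a search term.

```python
from typing import Callable, Optional


class EntryResolver:
    """Direct access first, search fallback second, with a path mapping cache."""

    def __init__(self, read_entry: Callable[[str], Optional[str]],
                 search_paths: Callable[[str], list[str]]) -> None:
        self._read_entry = read_entry      # hypothetical: content, or None if missing
        self._search_paths = search_paths  # hypothetical: candidate paths for a term
        self._path_cache: dict[str, str] = {}

    def get(self, requested_path: str) -> tuple[str, str]:
        """Return (actual_path, content) or raise LookupError with guidance."""
        # Step 3: reuse a previously resolved mapping when it still works.
        cached = self._path_cache.get(requested_path)
        if cached is not None:
            content = self._read_entry(cached)
            if content is not None:
                return cached, content
        # Step 1: try the path exactly as given.
        content = self._read_entry(requested_path)
        if content is not None:
            self._path_cache[requested_path] = requested_path
            return requested_path, content
        # Step 2: fall back to a search on the title part of the path.
        term = requested_path.split("/", 1)[-1]
        for candidate in self._search_paths(term):
            content = self._read_entry(candidate)
            if content is not None:
                self._path_cache[requested_path] = candidate
                return candidate, content
        # Step 4: fail with guidance on what to try next.
        raise LookupError(
            f"Entry not found: '{requested_path}'. "
            "Try search_zim_file() or browse_namespace()."
        )
```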

**Benefits for LLM Users:**
- **Transparent Operation**: No need to understand ZIM path encoding complexities
- **Single Tool Call**: Eliminates the need for manual search-first methodology
- **Reliable Results**: Consistent success across different path formats (spaces vs underscores, URL encoding, etc.)
- **Performance Optimized**: Cached mappings improve repeated access speed

**Example Scenarios Handled Automatically:**
- `A/Test Article` → `A/Test_Article` (space to underscore conversion)
- `C/Café` → `C/Caf%C3%A9` (URL encoding differences)
- `A/Some-Page` → `A/Some_Page` (hyphen to underscore conversion)
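
For intuition, the three scenarios can be reproduced by generating spelling variants of a path. The server is described as resolving mismatches through search, so this is only an illustration of the kinds of differences involved; `candidate_paths` is a hypothetical helper.

```python
from urllib.parse import quote, unquote


def candidate_paths(entry_path: str) -> list[str]:
    """Return the original path followed by common variants, without duplicates."""
    namespace, sep, title = entry_path.partition("/")
    if sep:
        prefix = namespace + "/"
    else:  # no namespace prefix in the path
        prefix, title = "", entry_path
    variants = [
        title,
        title.replace(" ", "_"),   # space to underscore
        title.replace("-", "_"),   # hyphen to underscore
        quote(title, safe="()"),   # percent-encode non-ASCII characters and spaces
        unquote(title),            # decode an already percent-encoded title
    ]
    result: list[str] = []
    for variant in variants:
        path = prefix + variant
        if path not in result:
            result.append(path)
    return result
```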

### Usage Recommendations

**For Direct Entry Access:**
```json
{
  "name": "get_zim_entry",
  "arguments": {
    "zim_file_path": "/path/to/file.zim",
    "entry_path": "A/Article_Name"
  }
}
```

**When Entry Not Found:**
The system will automatically provide guidance:
```plain
Entry not found: 'A/Article_Name'.
The entry path may not exist in this ZIM file.
Try using search_zim_file() to find available entries,
or browse_namespace() to explore the file structure.
```

---

## ⚠️ Important Notes and Limitations

### Content Length Requirements
- The `max_content_length` parameter for `get_zim_entry` must be at least 1000 characters
- Content longer than the specified limit will be truncated with a note showing the total character count
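
The two rules above amount to a minimum check followed by a slice. This sketch mirrors the truncation note shown in the example responses; `truncate_content` is a hypothetical name, and the server's actual implementation may differ.

```python
MIN_CONTENT_LENGTH = 1000  # documented minimum for max_content_length


def truncate_content(content: str, max_content_length: int = 100_000) -> str:
    """Return content unchanged, or cut it and append a note with the total length."""
    if max_content_length < MIN_CONTENT_LENGTH:
        raise ValueError(f"max_content_length must be at least {MIN_CONTENT_LENGTH}")
    if len(content) <= max_content_length:
        return content
    note = (
        f"\n\n... [Content truncated, total of {len(content):,} characters, "
        f"only showing first {max_content_length:,} characters] ..."
    )
    return content[:max_content_length] + note
```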

### Search Behavior
- Search results may include articles that contain the search terms in various contexts
- Results are ranked by relevance but may not always be directly related to the primary meaning of the search term
- Search snippets provide a preview of the content but may not show the exact location where the search term appears

### File Format Support
- Currently supports ZIM files (Zeno IMproved format)
- Tested with Wikipedia ZIM files (e.g., `wikipedia_en_100_2025-08.zim`)
- File paths must be properly escaped in JSON (use `\\` for Windows paths)

---

## 🔄 Multi-Server Instance Management

OpenZIM MCP includes advanced multi-server instance tracking and conflict detection to ensure reliable operation when multiple server instances are running.

### Instance Tracking Features

- **Automatic Instance Registration**: Each server instance is automatically registered with a unique process ID and configuration hash
- **Conflict Detection**: Detects when multiple servers with different configurations are accessing the same directories
- **Stale Instance Cleanup**: Automatically identifies and cleans up orphaned instance files from terminated processes
- **Configuration Validation**: Ensures all server instances use compatible configurations

### Conflict Types

1. **Configuration Mismatch**: Multiple servers with different settings accessing the same directories
2. **Multiple Instances**: Multiple servers running simultaneously (may cause confusion)
3. **Stale Instances**: Orphaned instance files from terminated processes
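
One way to picture the three conflict types is a classifier over registered instances. The sketch below is an assumption-heavy illustration: the `Instance` record, the SHA-256 configuration hash, and the `is_alive` callback are all hypothetical and are not taken from the project's code.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Instance:
    pid: int
    config_hash: str
    directories: frozenset[str]


def config_hash(config: dict) -> str:
    """Hash a configuration dict independently of key order."""
    canonical = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def detect_conflicts(current: Instance, others: list[Instance],
                     is_alive: Callable[[int], bool]) -> list[dict]:
    """Classify every other registered instance into one of the three conflict types."""
    conflicts = []
    for other in others:
        if other.pid == current.pid:
            continue
        if not is_alive(other.pid):
            kind = "stale_instance"          # orphaned record of a dead process
        elif (other.config_hash != current.config_hash
              and other.directories & current.directories):
            kind = "configuration_mismatch"  # different settings, shared directories
        else:
            kind = "multiple_instances"      # another live server is running
        conflicts.append({"type": kind, "pid": other.pid})
    return conflicts
```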

### Automatic Conflict Warnings

OpenZIM MCP automatically includes conflict warnings in search results and file listings when issues are detected:

```plain
🔍 **Server Conflict Detected**
⚠️ Configuration mismatch with server PID 12345. Search results may be inconsistent.
💡 Use 'resolve_server_conflicts()' to fix these issues.
```

### Best Practices

- Use `diagnose_server_state()` regularly to check for conflicts
- Run `resolve_server_conflicts()` to clean up stale instances
- Ensure all server instances use the same configuration when accessing shared directories
- Monitor server health with `get_server_health()` for instance tracking information

---

## 🔧 Configuration

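OpenZIM MCP supports configuration through environment variables with the `OPENZIM_MCP_` prefix. Nested settings use a double underscore (`__`) between the section and the field name, as in `OPENZIM_MCP_CACHE__ENABLED`: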

```bash
# Cache configuration
export OPENZIM_MCP_CACHE__ENABLED=true
export OPENZIM_MCP_CACHE__MAX_SIZE=200
export OPENZIM_MCP_CACHE__TTL_SECONDS=7200

# Content configuration
export OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH=200000
export OPENZIM_MCP_CONTENT__SNIPPET_LENGTH=2000
export OPENZIM_MCP_CONTENT__DEFAULT_SEARCH_LIMIT=20

# Logging configuration
export OPENZIM_MCP_LOGGING__LEVEL=DEBUG
export OPENZIM_MCP_LOGGING__FORMAT="%(asctime)s - %(name)s - %(levelname)s - %(message)s"

# Server configuration
export OPENZIM_MCP_SERVER_NAME=my_openzim_mcp_server
```

### Configuration Options

| Setting | Default | Description |
|---------|---------|-------------|
| `OPENZIM_MCP_CACHE__ENABLED` | `true` | Enable/disable caching |
| `OPENZIM_MCP_CACHE__MAX_SIZE` | `100` | Maximum cache entries |
| `OPENZIM_MCP_CACHE__TTL_SECONDS` | `3600` | Cache TTL in seconds |
| `OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH` | `100000` | Max content length |
| `OPENZIM_MCP_CONTENT__SNIPPET_LENGTH` | `1000` | Max snippet length |
| `OPENZIM_MCP_CONTENT__DEFAULT_SEARCH_LIMIT` | `10` | Default search result limit |
| `OPENZIM_MCP_LOGGING__LEVEL` | `INFO` | Logging level |
| `OPENZIM_MCP_LOGGING__FORMAT` | `%(asctime)s - %(name)s - %(levelname)s - %(message)s` | Log message format |
| `OPENZIM_MCP_SERVER_NAME` | `openzim-mcp` | Server instance name |
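
To show how the prefix and the double-underscore separator map onto nested settings, here is a minimal standard-library sketch. `load_settings` is a hypothetical helper; the project's own configuration loader is not shown in this README and probably also converts and validates values, which this sketch leaves as strings.

```python
import os
from typing import Mapping

PREFIX = "OPENZIM_MCP_"
DELIMITER = "__"


def load_settings(environ: Mapping[str, str] = os.environ) -> dict:
    """Collect OPENZIM_MCP_* variables into a nested dict keyed by lowercase names."""
    settings: dict = {}
    for key, value in environ.items():
        if not key.startswith(PREFIX):
            continue
        parts = key[len(PREFIX):].lower().split(DELIMITER)
        node = settings
        for part in parts[:-1]:
            node = node.setdefault(part, {})  # descend into (or create) the section
        node[parts[-1]] = value               # values are kept as raw strings
    return settings
```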

---

## 🔒 Security Features

- **Path Traversal Protection**: Secure path validation prevents access outside allowed directories
- **Input Sanitization**: All user inputs are validated and sanitized
- **Resource Management**: Proper cleanup of ZIM archive resources
- **Error Handling**: Sanitized error messages prevent information disclosure
- **Type Safety**: Full type annotations help catch type-related bugs before they become vulnerabilities
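
Path traversal protection usually comes down to resolving the requested path and checking that it stays inside an allowed directory. The following sketch uses only `pathlib`; `validate_zim_path` and the `.zim` suffix check are illustrative assumptions, not the project's actual validation code.

```python
from pathlib import Path


def validate_zim_path(requested: str, allowed_dirs: list[str]) -> Path:
    """Return the resolved path if it is a .zim file inside an allowed directory."""
    resolved = Path(requested).resolve()  # collapses '..' segments and follows symlinks
    if resolved.suffix.lower() != ".zim":
        raise ValueError("not a ZIM file")
    for allowed in allowed_dirs:
        if resolved.is_relative_to(Path(allowed).resolve()):
            return resolved
    # The message deliberately omits the path to avoid leaking filesystem details.
    raise PermissionError("path is outside the allowed directories")
```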

---

## 🚀 Performance Features

- **Intelligent Caching**: LRU cache with TTL for frequently accessed content
- **Resource Pooling**: Efficient ZIM archive management
- **Optimized Content Processing**: Fast HTML to text conversion
- **Lazy Loading**: Components initialized only when needed
- **Memory Management**: Proper cleanup and resource management
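
An LRU cache with TTL can be sketched in a few lines with `collections.OrderedDict`. The defaults below reuse the documented `max_size` of 100 and TTL of 3600 seconds; the `TTLCache` class and its injectable `clock` parameter are illustrative assumptions, not the contents of `cache.py`.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class TTLCache:
    """Least-recently-used cache whose entries also expire after ttl_seconds."""

    def __init__(self, max_size: int = 100, ttl_seconds: float = 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._items: OrderedDict = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: str) -> Optional[Any]:
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._items[key]      # expired: drop and report a miss
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self.ttl_seconds, value)
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # evict the least recently used entry
```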

---

## 🧪 Testing

The project includes comprehensive testing with 90%+ coverage using both mock data and real ZIM files:

### Test Categories

- **Unit Tests**: Individual component testing with mocks
- **Integration Tests**: End-to-end functionality testing with real ZIM files
- **Security Tests**: Path traversal and input validation testing
- **Performance Tests**: Cache and resource management testing
- **Format Compatibility**: Testing with various ZIM file formats and versions
- **Error Handling**: Testing with invalid and malformed ZIM files

### Test Infrastructure

OpenZIM MCP uses a hybrid testing approach:

1. **Mock-based tests**: Fast unit tests using mocked libzim components
2. **Real ZIM file tests**: Integration tests using official zim-testing-suite files
3. **Automatic test data management**: Download and organize test files as needed

### Test Data Sources

- **Built-in test data**: Basic test files included in the repository
- **zim-testing-suite integration**: Official test files from the OpenZIM project
- **Environment variable support**: `ZIM_TEST_DATA_DIR` for custom test data locations

```bash
# Run tests with coverage report
make test-cov

# View coverage report (macOS; use `xdg-open` on Linux)
open htmlcov/index.html

# Run comprehensive tests with real ZIM files
make test-with-zim-data
```

### Test Markers

Tests are organized with pytest markers:

- `@pytest.mark.requires_zim_data`: Tests requiring ZIM test data files
- `@pytest.mark.integration`: Integration tests
- `@pytest.mark.slow`: Long-running tests

---

## 📈 Monitoring

OpenZIM MCP provides built-in monitoring capabilities:

- **Health Checks**: Server health and status monitoring
- **Cache Metrics**: Cache hit rates and performance statistics
- **Structured Logging**: JSON-formatted logs for easy parsing
- **Error Tracking**: Comprehensive error logging and tracking
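For readers who want JSON-formatted log lines like those described above, the standard `logging` module can produce them with a custom formatter. The `JsonFormatter` class below is a hypothetical sketch, not the project's own logging setup; note that the default `OPENZIM_MCP_LOGGING__FORMAT` value is a plain-text format string.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (illustrative)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "time": self.formatTime(record),
            "name": record.name,
            "level": record.levelname,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Include the traceback text so errors stay searchable.
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def attach_json_handler(logger: logging.Logger) -> logging.Handler:
    """Attach a stream handler that emits JSON lines and return it."""
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return handler
```

The keys mirror the fields in the default format string (`asctime`, `name`, `levelname`, `message`), so switching between plain-text and JSON output would not lose information.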

---

## 🔄 Versioning

This project uses [Semantic Versioning](https://semver.org/) with automated version management through [release-please](https://github.com/googleapis/release-please).

### Automated Releases

Version bumps and releases are automated based on [Conventional Commits](https://www.conventionalcommits.org/):

- **`feat:`** - New features (minor version bump)
- **`fix:`** - Bug fixes (patch version bump)
- **`feat!:`** or **`BREAKING CHANGE:`** - Breaking changes (major version bump)
- **`perf:`** - Performance improvements (patch version bump)
- **`docs:`**, **`style:`**, **`refactor:`**, **`test:`**, **`chore:`** - No version bump

### Release Process

1. **Automatic**: Push commits with conventional commit messages to `main`
2. **Release PR**: release-please creates a PR with version updates and changelog
3. **Release**: Merge the release PR to automatically create a new release
4. **Manual**: Use workflow dispatch for emergency releases

### Commit Message Format

```
<type>[optional scope]: <description>

[optional body]

[optional footer(s)]
```

**Examples:**
```text
feat: add search suggestions endpoint
fix: resolve path traversal vulnerability
feat!: change API response format
docs: update installation instructions
```

---

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests (`make check`)
5. **Use conventional commit messages** (`git commit -m 'feat: add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

### Development Guidelines

- Follow PEP 8 style guidelines
- Add type hints to all functions
- Write tests for new functionality
- Update documentation as needed
- **Use conventional commit messages** for automatic versioning
- Ensure all tests pass before submitting

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- [Kiwix](https://www.kiwix.org/) for the ZIM format and libzim library
- [MCP](https://modelcontextprotocol.io/) for the Model Context Protocol
- The open-source community for the excellent libraries used in this project

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "openzim-mcp",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.12",
    "maintainer_email": "Cameron Rye <c@meron.io>",
    "keywords": "zim, openzim, mcp, model-context-protocol, ai, llm, knowledge-base, offline, wikipedia, search, libzim",
    "author": null,
    "author_email": "Cameron Rye <c@meron.io>",
    "download_url": "https://files.pythonhosted.org/packages/28/3a/4a9126eebaa2b2b358bac3add16f65584e1fb9021800abca19660400a423/openzim_mcp-0.3.3.tar.gz",
    "platform": null,
    "description": "# OpenZIM MCP Server\n\n<!-- Build and Quality Badges -->\n[![CI](https://github.com/cameronrye/openzim-mcp/workflows/CI/badge.svg)](https://github.com/cameronrye/openzim-mcp/actions/workflows/test.yml)\n[![codecov](https://codecov.io/gh/cameronrye/openzim-mcp/branch/main/graph/badge.svg)](https://codecov.io/gh/cameronrye/openzim-mcp)\n[![CodeQL](https://github.com/cameronrye/openzim-mcp/workflows/CodeQL/badge.svg)](https://github.com/cameronrye/openzim-mcp/actions/workflows/codeql.yml)\n[![Security Rating](https://sonarcloud.io/api/project_badges/measure?project=cameronrye_openzim-mcp&metric=security_rating)](https://sonarcloud.io/summary/new_code?id=cameronrye_openzim-mcp)\n\n<!-- Package and Version Badges -->\n[![PyPI version](https://badge.fury.io/py/openzim-mcp.svg)](https://badge.fury.io/py/openzim-mcp)\n[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/openzim-mcp)](https://pypi.org/project/openzim-mcp/)\n[![PyPI - Downloads](https://img.shields.io/pypi/dm/openzim-mcp)](https://pypi.org/project/openzim-mcp/)\n[![GitHub release (latest by date)](https://img.shields.io/github/v/release/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/releases)\n\n<!-- Code Quality and Standards -->\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)\n[![Type checked: mypy](https://img.shields.io/badge/type%20checked-mypy-blue)](https://mypy-lang.org/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n<!-- Community and Contribution -->\n[![GitHub issues](https://img.shields.io/github/issues/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/issues)\n[![GitHub pull 
requests](https://img.shields.io/github/issues-pr/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/pulls)\n[![GitHub contributors](https://img.shields.io/github/contributors/cameronrye/openzim-mcp)](https://github.com/cameronrye/openzim-mcp/graphs/contributors)\n[![GitHub stars](https://img.shields.io/github/stars/cameronrye/openzim-mcp?style=social)](https://github.com/cameronrye/openzim-mcp/stargazers)\n\n## \ud83e\udde0 Built for LLM Intelligence\n\n**OpenZIM MCP transforms static ZIM archives into dynamic knowledge engines for Large Language Models.** Unlike basic file readers, this tool provides *intelligent, structured access* that LLMs need to effectively navigate and understand vast knowledge repositories.\n\n\ud83d\ude80 **Why LLMs Love OpenZIM MCP:**\n- **Smart Navigation**: Browse by namespace (articles, metadata, media) instead of blind searching\n- **Context-Aware Discovery**: Get article structure, relationships, and metadata for deeper understanding\n- **Intelligent Search**: Advanced filtering, auto-complete suggestions, and relevance-ranked results\n- **Performance Optimized**: Cached operations and pagination prevent timeouts on massive archives\n- **Relationship Mapping**: Extract internal/external links to understand content connections\n\nWhether you're building a research assistant, knowledge chatbot, or content analysis system, OpenZIM MCP gives your LLM the structured access patterns it needs to unlock the full potential of offline knowledge archives. No more fumbling through raw text dumps! 
\ud83c\udfaf\n\n**OpenZIM MCP** is a modern, secure, and high-performance MCP (Model Context Protocol) server that enables AI models to access and search [ZIM format](https://en.wikipedia.org/wiki/ZIM_(file_format)) knowledge bases offline.\n\n[ZIM](https://en.wikipedia.org/wiki/ZIM_(file_format)) (Zeno IMproved) is an open file format developed by the [openZIM project](https://openzim.org/), designed specifically for offline storage and access to website content. The format supports high compression rates using Zstandard compression (default since 2021) and enables fast full-text searching, making it ideal for storing entire Wikipedia content and other large reference materials in relatively compact files. The openZIM project is sponsored by Wikimedia CH and supported by the Wikimedia Foundation, ensuring the format's continued development and adoption for offline knowledge access, especially in environments without reliable internet connectivity.\n\n## \u2728 Features\n\n- \ud83d\udd12 **Security First**: Comprehensive input validation and path traversal protection\n- \u26a1 **High Performance**: Intelligent caching and optimized ZIM file operations\n- \ud83e\udde0 **Smart Retrieval**: Automatic fallback from direct access to search-based retrieval for reliable entry access\n- \ud83e\uddea **Well Tested**: 90%+ test coverage with comprehensive test suite\n- \ud83c\udfd7\ufe0f **Modern Architecture**: Modular design with dependency injection\n- \ud83d\udcdd **Type Safe**: Full type annotations throughout the codebase\n- \ud83d\udd27 **Configurable**: Flexible configuration with validation\n- \ud83d\udcca **Observable**: Structured logging and health monitoring\n\n## \ud83d\ude80 Quick Start\n\n### Installation\n\n```bash\n# Clone the repository\ngit clone <repository-url>\ncd openzim-mcp\n\n# Install dependencies\nuv sync\n\n# Install development dependencies (optional)\nuv sync --dev\n```\n\n### Prepare ZIM Files\n\nDownload ZIM files (e.g., Wikipedia, 
Wiktionary, etc.) from the [Kiwix Library](https://browse.library.kiwix.org/) and place them in a directory:\n\n```bash\nmkdir ~/zim-files\n# Download ZIM files to ~/zim-files/\n```\n\n### Running the Server\n\n```bash\n# Run with the modular architecture\npython -m openzim_mcp /path/to/zim/files\n\n# Or using uv\nuv run python -m openzim_mcp /path/to/zim/files\n\n# Or using make\nmake run ZIM_DIR=/path/to/zim/files\n```\n\n### MCP Configuration\n\nAdd to your MCP client configuration:\n\n```json\n{\n  \"openzim-mcp\": {\n    \"command\": \"uv\",\n    \"args\": [\n      \"--directory\",\n      \"/path/to/openzim-mcp\",\n      \"run\",\n      \"python\",\n      \"-m\",\n      \"openzim_mcp\",\n      \"/path/to/zim/files\"\n    ]\n  }\n}\n```\n\n## \ud83d\udee0\ufe0f Development\n\n### Running Tests\n\n```bash\n# Run all tests\nmake test\n\n# Run tests with coverage\nmake test-cov\n\n# Run specific test file\nuv run pytest tests/test_security.py -v\n\n# Run tests with ZIM test data (comprehensive testing)\nmake test-with-zim-data\n\n# Run integration tests only\nmake test-integration\n\n# Run tests that require ZIM test data\nmake test-requires-zim-data\n```\n\n### ZIM Test Data Integration\n\nOpenZIM MCP integrates with the official [zim-testing-suite](https://github.com/openzim/zim-testing-suite) for comprehensive testing with real ZIM files:\n\n```bash\n# Download essential test files (basic testing)\nmake download-test-data\n\n# Download all test files (comprehensive testing)\nmake download-test-data-all\n\n# List available test files\nmake list-test-data\n\n# Clean downloaded test data\nmake clean-test-data\n```\n\nThe test data includes:\n- **Basic files**: Small ZIM files for essential testing\n- **Real content**: Actual Wikipedia/Wikibooks content for integration testing\n- **Invalid files**: Malformed ZIM files for error handling testing\n- **Special cases**: Embedded content, split files, and edge cases\n\nTest files are automatically organized by category 
and priority level.\n\n### Code Quality\n\n```bash\n# Format code\nmake format\n\n# Run linting\nmake lint\n\n# Type checking\nmake type-check\n\n# Run all checks\nmake check\n```\n\n### Project Structure\n\n```text\nopenzim-mcp/\n\u251c\u2500\u2500 openzim_mcp/             # Main package\n\u2502   \u251c\u2500\u2500 __init__.py        # Package initialization\n\u2502   \u251c\u2500\u2500 __main__.py        # Module entry point\n\u2502   \u251c\u2500\u2500 main.py            # Main entry point\n\u2502   \u251c\u2500\u2500 server.py          # MCP server implementation\n\u2502   \u251c\u2500\u2500 config.py          # Configuration management\n\u2502   \u251c\u2500\u2500 security.py        # Security and validation\n\u2502   \u251c\u2500\u2500 cache.py           # Caching functionality\n\u2502   \u251c\u2500\u2500 content_processor.py # Content processing\n\u2502   \u251c\u2500\u2500 zim_operations.py  # ZIM file operations\n\u2502   \u251c\u2500\u2500 exceptions.py      # Custom exceptions\n\u2502   \u2514\u2500\u2500 constants.py       # Application constants\n\u251c\u2500\u2500 tests/                 # Test suite\n\u251c\u2500\u2500 pyproject.toml        # Project configuration\n\u251c\u2500\u2500 Makefile              # Development commands\n\u2514\u2500\u2500 README.md             # This file\n```\n\n---\n\n## \ud83d\udcda API Reference\n\n### Available Tools\n\n### list_zim_files - List all ZIM files in allowed directories\n\nNo parameters required.\n\n### search_zim_file - Search within ZIM file content\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- `query` (string): Search query term\n\n**Optional parameters:**\n\n- `limit` (integer, default: 10): Maximum number of results to return\n- `offset` (integer, default: 0): Starting offset for results (for pagination)\n\n### get_zim_entry - Get detailed content of a specific entry in a ZIM file\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- 
`entry_path` (string): Entry path, e.g., 'A/Some_Article'\n\n**Optional parameters:**\n\n- `max_content_length` (integer, default: 100000, minimum: 1000): Maximum length of returned content\n\n**Smart Retrieval Features:**\n\n- **Automatic Fallback**: If direct path access fails, automatically searches for the entry and uses the exact path found\n- **Path Mapping Cache**: Caches successful path mappings for improved performance on repeated access\n- **Enhanced Error Guidance**: Provides clear guidance when entries cannot be found, suggesting alternative approaches\n- **Transparent Operation**: Works seamlessly regardless of path encoding differences (spaces vs underscores, URL encoding, etc.)\n\n### get_zim_metadata - Get ZIM file metadata from M namespace entries\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n\n**Returns:**\nJSON string containing ZIM metadata including entry counts, archive information, and metadata entries like title, description, language, creator, etc.\n\n### get_main_page - Get the main page entry from W namespace\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n\n**Returns:**\nMain page content or information about the main page entry.\n\n### list_namespaces - List available namespaces and their entry counts\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n\n**Returns:**\nJSON string containing namespace information with entry counts, descriptions, and sample entries for each namespace (C, M, W, X, etc.).\n\n### browse_namespace - Browse entries in a specific namespace with pagination\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- `namespace` (string): Namespace to browse (C, M, W, X, A, I, etc.)\n\n**Optional parameters:**\n\n- `limit` (integer, default: 50, range: 1-200): Maximum number of entries to return\n- `offset` (integer, default: 0): Starting offset for pagination\n\n**Returns:**\nJSON string 
containing namespace entries with titles, content previews, and pagination information.\n\n### search_with_filters - Search within ZIM file content with advanced filters\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- `query` (string): Search query term\n\n**Optional parameters:**\n\n- `namespace` (string): Optional namespace filter (C, M, W, X, etc.)\n- `content_type` (string): Optional content type filter (text/html, text/plain, etc.)\n- `limit` (integer, default: 10, range: 1-100): Maximum number of results to return\n- `offset` (integer, default: 0): Starting offset for pagination\n\n**Returns:**\nFiltered search results with namespace and content type information.\n\n### get_search_suggestions - Get search suggestions and auto-complete\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- `partial_query` (string): Partial search query (minimum 2 characters)\n\n**Optional parameters:**\n\n- `limit` (integer, default: 10, range: 1-50): Maximum number of suggestions to return\n\n**Returns:**\nJSON string containing search suggestions based on article titles and content.\n\n### get_article_structure - Extract article structure and metadata\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- `entry_path` (string): Entry path, e.g., 'C/Some_Article'\n\n**Returns:**\nJSON string containing article structure including headings, sections, metadata, and word count.\n\n### extract_article_links - Extract internal and external links from an article\n\n**Required parameters:**\n\n- `zim_file_path` (string): Path to the ZIM file\n- `entry_path` (string): Entry path, e.g., 'C/Some_Article'\n\n**Returns:**\nJSON string containing categorized links (internal, external, media) with titles and metadata.\n\n---\n\n## Examples\n\n### Listing ZIM files\n\n```json\n{\n  \"name\": \"list_zim_files\"\n}\n```\n\nResponse:\n\n```plain\nFound 1 ZIM files in 1 directories:\n\n[\n  {\n    
\"name\": \"wikipedia_en_100_2025-08.zim\",\n    \"path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"directory\": \"C:\\\\zim\",\n    \"size\": \"310.77 MB\",\n    \"modified\": \"2025-09-11T10:20:50.148427\"\n  }\n]\n```\n\n### Searching ZIM files\n\n```json\n{\n  \"name\": \"search_zim_file\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"query\": \"biology\",\n    \"limit\": 3\n  }\n}\n```\n\nResponse:\n\n```plain\nFound 51 matches for \"biology\", showing 1-3:\n\n## 1. Taxonomy (biology)\nPath: Taxonomy_(biology)\nSnippet: #  Taxonomy (biology) Part of a series on\n---\nEvolutionary biology\nDarwin's finches by John Gould\n\n  * Index\n  * Introduction\n  * [Main](Evolution \"Evolution\")\n  * Outline\n\n## 2. Protein\nPath: Protein\nSnippet: #  Protein A representation of the 3D structure of the protein myoglobin showing turquoise \u03b1-helices. This protein was the first to have its structure solved by X-ray crystallography. Toward the right-center among the coils, a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).\n\n## 3. Ant\nPath: Ant\nSnippet: #  Ant Ants\nTemporal range: Late Aptian \u2013 Present\n---\nFire ants\n[Scientific classification](Taxonomy_\\(biology\\) \"Taxonomy \\(biology\\)\")\nKingdom:  | [Animalia](Animal \"Animal\")\nPhylum:  | [Arthropoda](Arthropod \"Arthropod\")\nClass:  | [Insecta](Insect \"Insect\")\nOrder:  | Hymenoptera\nInfraorder:  | Aculeata\nSuperfamily:  |\nLatreille, 1809[1]\nFamily:  |\nLatreille, 1809\n```\n\n### Getting ZIM entries\n\n```json\n{\n  \"name\": \"get_zim_entry\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"entry_path\": \"Protein\"\n  }\n}\n```\n\nResponse:\n\n```plain\n# Protein\n\nPath: Protein\nType: text/html\n## Content\n\n#  Protein\n\nA representation of the 3D structure of the protein myoglobin showing turquoise \u03b1-helices. 
This protein was the first to have its structure solved by X-ray crystallography. Toward the right-center among the coils, a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).\n\n**Proteins** are large biomolecules and macromolecules that comprise one or more long chains of amino acid residues. Proteins perform a vast array of functions within organisms, including catalysing metabolic reactions, DNA replication, responding to stimuli, providing structure to cells and organisms, and transporting molecules from one location to another. Proteins differ from one another primarily in their sequence of amino acids, which is dictated by the nucleotide sequence of their genes, and which usually results in protein folding into a specific 3D structure that determines its activity.\n\nA linear chain of amino acid residues is called a polypeptide. A protein contains at least one long polypeptide. Short polypeptides, containing less than 20\u201330 residues, are rarely considered to be proteins and are commonly called peptides.\n\n... [Content truncated, total of 56,202 characters, only showing first 1,500 characters] ...\n```\n\n### Smart Retrieval in Action\n\n**Example: Automatic path resolution**\n\n```json\n{\n  \"name\": \"get_zim_entry\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"entry_path\": \"A/Test Article\"\n  }\n}\n```\n\nResponse (showing smart retrieval working):\n\n```plain\n# Test Article\n\nRequested Path: A/Test Article\nActual Path: A/Test_Article\nType: text/html\n\n## Content\n\n# Test Article\n\nThis article demonstrates the smart retrieval system automatically handling\npath encoding differences. The system tried \"A/Test Article\" directly,\nthen automatically searched and found \"A/Test_Article\".\n\n... 
[Content continues] ...\n```\n\n### get_server_health - Get server health and statistics\n\nNo parameters required.\n\n**Returns:**\n\n- Server status and performance metrics\n- Cache statistics\n- Configuration information\n- Instance tracking information\n- Conflict detection results\n\n**Example Response:**\n\n```json\n{\n  \"status\": \"healthy\",\n  \"server_name\": \"openzim-mcp\",\n  \"allowed_directories\": 1,\n  \"cache\": {\n    \"enabled\": true,\n    \"size\": 1,\n    \"max_size\": 100,\n    \"ttl_seconds\": 3600\n  },\n  \"instance_tracking\": {\n    \"active_instances\": 1,\n    \"conflicts_detected\": 0\n  }\n}\n```\n\n### get_server_configuration - Get detailed server configuration\n\nNo parameters required.\n\n**Returns:**\nComprehensive server configuration including diagnostics, validation results, and conflict detection.\n\n**Example Response:**\n\n```json\n{\n  \"configuration\": {\n    \"server_name\": \"openzim-mcp\",\n    \"allowed_directories\": [\"/path/to/zim/files\"],\n    \"cache_enabled\": true,\n    \"config_hash\": \"abc123...\",\n    \"server_pid\": 12345\n  },\n  \"diagnostics\": {\n    \"validation_status\": \"healthy\",\n    \"conflicts_detected\": [],\n    \"warnings\": [],\n    \"recommendations\": []\n  }\n}\n```\n\n### diagnose_server_state - Comprehensive server diagnostics\n\nNo parameters required.\n\n**Returns:**\nDetailed diagnostic information including instance conflicts, configuration validation, file accessibility checks, and actionable recommendations.\n\n**Example Response:**\n\n```json\n{\n  \"status\": \"healthy\",\n  \"server_info\": {\n    \"pid\": 12345,\n    \"server_name\": \"openzim-mcp\",\n    \"config_hash\": \"abc123...\"\n  },\n  \"conflicts\": [],\n  \"issues\": [],\n  \"recommendations\": [\"Server appears to be running normally\"],\n  \"environment_checks\": {\n    \"directories_accessible\": true,\n    \"cache_functional\": true\n  }\n}\n```\n\n### resolve_server_conflicts - Identify and resolve 
server conflicts\n\nNo parameters required.\n\n**Returns:**\nResults of conflict resolution including cleanup actions and recommendations.\n\n**Example Response:**\n\n```json\n{\n  \"status\": \"success\",\n  \"cleanup_results\": {\n    \"stale_instances_removed\": 2\n  },\n  \"conflicts_found\": [],\n  \"actions_taken\": [\"Removed 2 stale instance files\"],\n  \"recommendations\": [\"No active conflicts detected\"]\n}\n```\n\n### Additional Search Examples\n\n**Computer-related search:**\n\n```json\n{\n  \"name\": \"search_zim_file\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"query\": \"computer\",\n    \"limit\": 2\n  }\n}\n```\n\nResponse:\n```plain\nFound 39 matches for \"computer\", showing 1-2:\n\n## 1. Video game\nPath: Video_game\nSnippet: #  Video game First-generation _Pong_ console at the Computerspielemuseum Berlin\n---\nPlatforms\n\n## 2. Protein\nPath: Protein\nSnippet: #  Protein A representation of the 3D structure of the protein myoglobin showing turquoise \u03b1-helices. This protein was the first to have its structure solved by X-ray crystallography. Toward the right-center among the coils, a prosthetic group called a heme group (shown in gray) with a bound oxygen molecule (red).\n```\n\n**Getting detailed content:**\n\n```json\n{\n  \"name\": \"get_zim_entry\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"entry_path\": \"Evolution\",\n    \"max_content_length\": 1500\n  }\n}\n```\n\nResponse:\n```plain\n# Evolution\n\nPath: Evolution\nType: text/html\n## Content\n\n#  Evolution\n\nPart of the Biology series on\n---\n****\nMechanisms and processes\n\n  * Adaptation\n  * Genetic drift\n  * Gene flow\n  * History of life\n  * Maladaptation\n  * Mutation\n  * Natural selection\n  * Neutral theory\n  * Population genetics\n  * Speciation\n\n... 
[Content truncated, total of 110,237 characters, only showing first 1,500 characters] ...\n```\n\n### \ud83c\udfaf Advanced Knowledge Retrieval Examples\n\n**Getting ZIM metadata:**\n\n```json\n{\n  \"name\": \"get_zim_metadata\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\"\n  }\n}\n```\n\nResponse:\n```json\n{\n  \"entry_count\": 100000,\n  \"all_entry_count\": 120000,\n  \"article_count\": 80000,\n  \"media_count\": 20000,\n  \"metadata_entries\": {\n    \"Title\": \"Wikipedia (English)\",\n    \"Description\": \"Wikipedia articles in English\",\n    \"Language\": \"eng\",\n    \"Creator\": \"Kiwix\",\n    \"Date\": \"2025-08-15\"\n  }\n}\n```\n\n**Browsing a namespace:**\n\n```json\n{\n  \"name\": \"browse_namespace\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"namespace\": \"C\",\n    \"limit\": 5,\n    \"offset\": 0\n  }\n}\n```\n\nResponse:\n```json\n{\n  \"namespace\": \"C\",\n  \"total_in_namespace\": 80000,\n  \"offset\": 0,\n  \"limit\": 5,\n  \"returned_count\": 5,\n  \"has_more\": true,\n  \"entries\": [\n    {\n      \"path\": \"C/Biology\",\n      \"title\": \"Biology\",\n      \"content_type\": \"text/html\",\n      \"preview\": \"Biology is the scientific study of life...\"\n    }\n  ]\n}\n```\n\n**Filtered search:**\n\n```json\n{\n  \"name\": \"search_with_filters\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"query\": \"evolution\",\n    \"namespace\": \"C\",\n    \"content_type\": \"text/html\",\n    \"limit\": 3\n  }\n}\n```\n\n**Getting article structure:**\n\n```json\n{\n  \"name\": \"get_article_structure\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"entry_path\": \"C/Evolution\"\n  }\n}\n```\n\nResponse:\n```json\n{\n  \"title\": \"Evolution\",\n  \"path\": \"C/Evolution\",\n  \"content_type\": \"text/html\",\n  \"headings\": [\n    
{\"level\": 1, \"text\": \"Evolution\", \"id\": \"evolution\"},\n    {\"level\": 2, \"text\": \"History\", \"id\": \"history\"},\n    {\"level\": 2, \"text\": \"Mechanisms\", \"id\": \"mechanisms\"}\n  ],\n  \"sections\": [\n    {\n      \"title\": \"Evolution\",\n      \"level\": 1,\n      \"content_preview\": \"Evolution is the change in heritable traits...\",\n      \"word_count\": 150\n    }\n  ],\n  \"word_count\": 5000\n}\n```\n\n**Getting search suggestions:**\n\n```json\n{\n  \"name\": \"get_search_suggestions\",\n  \"arguments\": {\n    \"zim_file_path\": \"C:\\\\zim\\\\wikipedia_en_100_2025-08.zim\",\n    \"partial_query\": \"bio\",\n    \"limit\": 5\n  }\n}\n```\n\nResponse:\n```json\n{\n  \"partial_query\": \"bio\",\n  \"suggestions\": [\n    {\"text\": \"Biology\", \"path\": \"C/Biology\", \"type\": \"title_start_match\"},\n    {\"text\": \"Biochemistry\", \"path\": \"C/Biochemistry\", \"type\": \"title_start_match\"},\n    {\"text\": \"Biodiversity\", \"path\": \"C/Biodiversity\", \"type\": \"title_start_match\"}\n  ],\n  \"count\": 3\n}\n```\n\n### \ud83d\udd27 Server Management and Diagnostics Examples\n\n**Getting server health:**\n\n```json\n{\n  \"name\": \"get_server_health\"\n}\n```\n\nResponse:\n```json\n{\n  \"status\": \"healthy\",\n  \"server_name\": \"openzim-mcp\",\n  \"uptime_info\": {\n    \"process_id\": 12345,\n    \"started_at\": \"2025-09-14T10:30:00\"\n  },\n  \"cache_performance\": {\n    \"enabled\": true,\n    \"size\": 15,\n    \"max_size\": 100,\n    \"hit_rate\": 0.85\n  },\n  \"instance_tracking\": {\n    \"active_instances\": 1,\n    \"conflicts_detected\": 0\n  }\n}\n```\n\n**Diagnosing server state:**\n\n```json\n{\n  \"name\": \"diagnose_server_state\"\n}\n```\n\nResponse:\n```json\n{\n  \"status\": \"healthy\",\n  \"server_info\": {\n    \"pid\": 12345,\n    \"server_name\": \"openzim-mcp\",\n    \"config_hash\": \"abc123def456...\"\n  },\n  \"conflicts\": [],\n  \"issues\": [],\n  \"recommendations\": [\"Server appears 
to be running normally. No issues detected.\"],\n  \"environment_checks\": {\n    \"directories_accessible\": true,\n    \"cache_functional\": true,\n    \"zim_files_found\": 5\n  }\n}\n```\n\n**Resolving server conflicts:**\n\n```json\n{\n  \"name\": \"resolve_server_conflicts\"\n}\n```\n\nResponse:\n```json\n{\n  \"status\": \"success\",\n  \"cleanup_results\": {\n    \"stale_instances_removed\": 2,\n    \"files_cleaned\": [\"/home/user/.openzim_mcp_instances/server_99999.json\"]\n  },\n  \"conflicts_found\": [],\n  \"actions_taken\": [\"Removed 2 stale instance files\"],\n  \"recommendations\": [\"No active conflicts detected after cleanup\"]\n}\n```\n\n---\n\n## \ud83c\udfaf ZIM Entry Retrieval Best Practices\n\n### Smart Retrieval System\n\nOpenZIM MCP implements an intelligent entry retrieval system that automatically handles path encoding inconsistencies common in ZIM files:\n\n**How It Works:**\n1. **Direct Access First**: Attempts to retrieve the entry using the provided path exactly as given\n2. **Automatic Fallback**: If direct access fails, automatically searches for the entry using various search terms\n3. **Path Mapping Cache**: Caches successful path mappings to improve performance for repeated access\n4. 
**Enhanced Error Guidance**: Provides clear guidance when entries cannot be found\n\n**Benefits for LLM Users:**\n- **Transparent Operation**: No need to understand ZIM path encoding complexities\n- **Single Tool Call**: Eliminates the need for manual search-first methodology\n- **Reliable Results**: Consistent success across different path formats (spaces vs underscores, URL encoding, etc.)\n- **Performance Optimized**: Cached mappings improve repeated access speed\n\n**Example Scenarios Handled Automatically:**\n- `A/Test Article` \u2192 `A/Test_Article` (space to underscore conversion)\n- `C/Caf\u00e9` \u2192 `C/Caf%C3%A9` (URL encoding differences)\n- `A/Some-Page` \u2192 `A/Some_Page` (hyphen to underscore conversion)\n\n### Usage Recommendations\n\n**For Direct Entry Access:**\n```json\n{\n  \"name\": \"get_zim_entry\",\n  \"arguments\": {\n    \"zim_file_path\": \"/path/to/file.zim\",\n    \"entry_path\": \"A/Article_Name\"\n  }\n}\n```\n\n**When Entry Not Found:**\nThe system will automatically provide guidance:\n```\nEntry not found: 'A/Article_Name'.\nThe entry path may not exist in this ZIM file.\nTry using search_zim_file() to find available entries,\nor browse_namespace() to explore the file structure.\n```\n\n---\n\n## \u26a0\ufe0f Important Notes and Limitations\n\n### Content Length Requirements\n- The `max_content_length` parameter for `get_zim_entry` must be at least 1000 characters\n- Content longer than the specified limit will be truncated with a note showing the total character count\n\n### Search Behavior\n- Search results may include articles that contain the search terms in various contexts\n- Results are ranked by relevance but may not always be directly related to the primary meaning of the search term\n- Search snippets provide a preview of the content but may not show the exact location where the search term appears\n\n### File Format Support\n- Currently supports ZIM files (Zeno IMproved format)\n- Tested with Wikipedia ZIM files 
(e.g., `wikipedia_en_100_2025-08.zim`)\n- File paths must be properly escaped in JSON (use `\\\\` for Windows paths)\n\n---\n\n## \ud83d\udd04 Multi-Server Instance Management\n\nOpenZIM MCP includes advanced multi-server instance tracking and conflict detection to ensure reliable operation when multiple server instances are running.\n\n### Instance Tracking Features\n\n- **Automatic Instance Registration**: Each server instance is automatically registered with a unique process ID and configuration hash\n- **Conflict Detection**: Detects when multiple servers with different configurations are accessing the same directories\n- **Stale Instance Cleanup**: Automatically identifies and cleans up orphaned instance files from terminated processes\n- **Configuration Validation**: Ensures all server instances use compatible configurations\n\n### Conflict Types\n\n1. **Configuration Mismatch**: Multiple servers with different settings accessing the same directories\n2. **Multiple Instances**: Multiple servers running simultaneously (may cause confusion)\n3. **Stale Instances**: Orphaned instance files from terminated processes\n\n### Automatic Conflict Warnings\n\nOpenZIM MCP automatically includes conflict warnings in search results and file listings when issues are detected:\n\n```plain\n\ud83d\udd0d **Server Conflict Detected**\n\u26a0\ufe0f Configuration mismatch with server PID 12345. 
Search results may be inconsistent.\n\ud83d\udca1 Use 'resolve_server_conflicts()' to fix these issues.\n```\n\n### Best Practices\n\n- Use `diagnose_server_state()` regularly to check for conflicts\n- Run `resolve_server_conflicts()` to clean up stale instances\n- Ensure all server instances use the same configuration when accessing shared directories\n- Monitor server health with `get_server_health()` for instance tracking information\n\n---\n\n## \ud83d\udd27 Configuration\n\nOpenZIM MCP supports configuration through environment variables with the `OPENZIM_MCP_` prefix:\n\n```bash\n# Cache configuration\nexport OPENZIM_MCP_CACHE__ENABLED=true\nexport OPENZIM_MCP_CACHE__MAX_SIZE=200\nexport OPENZIM_MCP_CACHE__TTL_SECONDS=7200\n\n# Content configuration\nexport OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH=200000\nexport OPENZIM_MCP_CONTENT__SNIPPET_LENGTH=2000\nexport OPENZIM_MCP_CONTENT__DEFAULT_SEARCH_LIMIT=20\n\n# Logging configuration\nexport OPENZIM_MCP_LOGGING__LEVEL=DEBUG\nexport OPENZIM_MCP_LOGGING__FORMAT=\"%(asctime)s - %(name)s - %(levelname)s - %(message)s\"\n\n# Server configuration\nexport OPENZIM_MCP_SERVER_NAME=my_openzim_mcp_server\n```\n\n### Configuration Options\n\n| Setting | Default | Description |\n|---------|---------|-------------|\n| `OPENZIM_MCP_CACHE__ENABLED` | `true` | Enable/disable caching |\n| `OPENZIM_MCP_CACHE__MAX_SIZE` | `100` | Maximum cache entries |\n| `OPENZIM_MCP_CACHE__TTL_SECONDS` | `3600` | Cache TTL in seconds |\n| `OPENZIM_MCP_CONTENT__MAX_CONTENT_LENGTH` | `100000` | Max content length |\n| `OPENZIM_MCP_CONTENT__SNIPPET_LENGTH` | `1000` | Max snippet length |\n| `OPENZIM_MCP_CONTENT__DEFAULT_SEARCH_LIMIT` | `10` | Default search result limit |\n| `OPENZIM_MCP_LOGGING__LEVEL` | `INFO` | Logging level |\n| `OPENZIM_MCP_LOGGING__FORMAT` | `%(asctime)s - %(name)s - %(levelname)s - %(message)s` | Log message format |\n| `OPENZIM_MCP_SERVER_NAME` | `openzim-mcp` | Server instance name |\n\n---\n\n## \ud83d\udd12 Security 
Features\n\n- **Path Traversal Protection**: Secure path validation prevents access outside allowed directories\n- **Input Sanitization**: All user inputs are validated and sanitized\n- **Resource Management**: Proper cleanup of ZIM archive resources\n- **Error Handling**: Sanitized error messages prevent information disclosure\n- **Type Safety**: Full type annotations prevent type-related vulnerabilities\n\n---\n\n## \ud83d\ude80 Performance Features\n\n- **Intelligent Caching**: LRU cache with TTL for frequently accessed content\n- **Resource Pooling**: Efficient ZIM archive management\n- **Optimized Content Processing**: Fast HTML to text conversion\n- **Lazy Loading**: Components initialized only when needed\n- **Memory Management**: Proper cleanup and resource management\n\n---\n\n## \ud83e\uddea Testing\n\nThe project includes comprehensive testing with 90%+ coverage using both mock data and real ZIM files:\n\n### Test Categories\n\n- **Unit Tests**: Individual component testing with mocks\n- **Integration Tests**: End-to-end functionality testing with real ZIM files\n- **Security Tests**: Path traversal and input validation testing\n- **Performance Tests**: Cache and resource management testing\n- **Format Compatibility**: Testing with various ZIM file formats and versions\n- **Error Handling**: Testing with invalid and malformed ZIM files\n\n### Test Infrastructure\n\nOpenZIM MCP uses a hybrid testing approach:\n\n1. **Mock-based tests**: Fast unit tests using mocked libzim components\n2. **Real ZIM file tests**: Integration tests using official zim-testing-suite files\n3. 
**Automatic test data management**: Download and organize test files as needed\n\n### Test Data Sources\n\n- **Built-in test data**: Basic test files included in the repository\n- **zim-testing-suite integration**: Official test files from the OpenZIM project\n- **Environment variable support**: `ZIM_TEST_DATA_DIR` for custom test data locations\n\n```bash\n# Run tests with coverage report\nmake test-cov\n\n# View coverage report\nopen htmlcov/index.html\n\n# Run comprehensive tests with real ZIM files\nmake test-with-zim-data\n```\n\n### Test Markers\n\nTests are organized with pytest markers:\n\n- `@pytest.mark.requires_zim_data`: Tests requiring ZIM test data files\n- `@pytest.mark.integration`: Integration tests\n- `@pytest.mark.slow`: Long-running tests\n\n---\n\n## \ud83d\udcc8 Monitoring\n\nOpenZIM MCP provides built-in monitoring capabilities:\n\n- **Health Checks**: Server health and status monitoring\n- **Cache Metrics**: Cache hit rates and performance statistics\n- **Structured Logging**: JSON-formatted logs for easy parsing\n- **Error Tracking**: Comprehensive error logging and tracking\n\n---\n\n## \ud83d\udd04 Versioning\n\nThis project uses [Semantic Versioning](https://semver.org/) with automated version management through [release-please](https://github.com/googleapis/release-please).\n\n### Automated Releases\n\nVersion bumps and releases are automated based on [Conventional Commits](https://www.conventionalcommits.org/):\n\n- **`feat:`** - New features (minor version bump)\n- **`fix:`** - Bug fixes (patch version bump)\n- **`feat!:`** or **`BREAKING CHANGE:`** - Breaking changes (major version bump)\n- **`perf:`** - Performance improvements (patch version bump)\n- **`docs:`**, **`style:`**, **`refactor:`**, **`test:`**, **`chore:`** - No version bump\n\n### Release Process\n\n1. **Automatic**: Push commits with conventional commit messages to `main`\n2. **Release PR**: release-please creates a PR with version updates and changelog\n3. 
**Release**: Merge the release PR to automatically create a new release\n4. **Manual**: Use workflow dispatch for emergency releases\n\n### Commit Message Format\n\n```\n<type>[optional scope]: <description>\n\n[optional body]\n\n[optional footer(s)]\n```\n\n**Examples:**\n```bash\nfeat: add search suggestions endpoint\nfix: resolve path traversal vulnerability\nfeat!: change API response format\ndocs: update installation instructions\n```\n\n---\n\n## \ud83e\udd1d Contributing\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Make your changes\n4. Run tests (`make check`)\n5. **Use conventional commit messages** (`git commit -m 'feat: add amazing feature'`)\n6. Push to the branch (`git push origin feature/amazing-feature`)\n7. Open a Pull Request\n\n### Development Guidelines\n\n- Follow PEP 8 style guidelines\n- Add type hints to all functions\n- Write tests for new functionality\n- Update documentation as needed\n- **Use conventional commit messages** for automatic versioning\n- Ensure all tests pass before submitting\n\n---\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n---\n\n## \ud83d\ude4f Acknowledgments\n\n- [Kiwix](https://www.kiwix.org/) for the ZIM format and libzim library\n- [MCP](https://modelcontextprotocol.io/) for the Model Context Protocol\n- The open-source community for the excellent libraries used in this project\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "OpenZIM MCP - ZIM MCP Server that enables AI models to access and search ZIM format knowledge bases offline",
    "version": "0.3.3",
    "project_urls": {
        "Changelog": "https://github.com/cameronrye/openzim-mcp/blob/main/CHANGELOG.md",
        "Contributing": "https://github.com/cameronrye/openzim-mcp/blob/main/CONTRIBUTING.md",
        "Documentation": "https://github.com/cameronrye/openzim-mcp#readme",
        "Homepage": "https://github.com/cameronrye/openzim-mcp",
        "Issues": "https://github.com/cameronrye/openzim-mcp/issues",
        "Repository": "https://github.com/cameronrye/openzim-mcp.git",
        "Security Policy": "https://github.com/cameronrye/openzim-mcp/blob/main/SECURITY.md"
    },
    "split_keywords": [
        "zim",
        " openzim",
        " mcp",
        " model-context-protocol",
        " ai",
        " llm",
        " knowledge-base",
        " offline",
        " wikipedia",
        " search",
        " libzim"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c4585a77c169a9d3720e2bb57ac439d417f3e3c48c7c3ca6fc44b56cdf61056d",
                "md5": "2f2947a178f1d63bb0b590cf4d2fcb9d",
                "sha256": "2567f5175d66fd6e877e00d5b48ec1bccd75f620828d1109da858a1427a21a85"
            },
            "downloads": -1,
            "filename": "openzim_mcp-0.3.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2f2947a178f1d63bb0b590cf4d2fcb9d",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12",
            "size": 50659,
            "upload_time": "2025-09-15T17:44:13",
            "upload_time_iso_8601": "2025-09-15T17:44:13.045831Z",
            "url": "https://files.pythonhosted.org/packages/c4/58/5a77c169a9d3720e2bb57ac439d417f3e3c48c7c3ca6fc44b56cdf61056d/openzim_mcp-0.3.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "283a4a9126eebaa2b2b358bac3add16f65584e1fb9021800abca19660400a423",
                "md5": "73b95b50f4048651353a9a193176f3f2",
                "sha256": "e77585e5850997fcea5e2106e5bfa52e97bfaae02515343fb3f827bd2febfb80"
            },
            "downloads": -1,
            "filename": "openzim_mcp-0.3.3.tar.gz",
            "has_sig": false,
            "md5_digest": "73b95b50f4048651353a9a193176f3f2",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.12",
            "size": 99874,
            "upload_time": "2025-09-15T17:44:14",
            "upload_time_iso_8601": "2025-09-15T17:44:14.855813Z",
            "url": "https://files.pythonhosted.org/packages/28/3a/4a9126eebaa2b2b358bac3add16f65584e1fb9021800abca19660400a423/openzim_mcp-0.3.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-09-15 17:44:14",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "cameronrye",
    "github_project": "openzim-mcp",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "openzim-mcp"
}
        