confluence-scraper-mcp


- **Name**: confluence-scraper-mcp
- **Version**: 0.1.3
- **Summary**: A Model Context Protocol (MCP) server for Confluence RAG with ChromaDB vector search
- **Upload time**: 2025-07-19 16:41:12
- **Requires Python**: >=3.9
- **License**: MIT
- **Keywords**: ai, chromadb, confluence, llm, mcp, rag, vector-search
- **Requirements**: fastapi, uvicorn, chromadb, sentence-transformers, atlassian-python-api, pydantic, pydantic-settings, python-dotenv, loguru, httpx, beautifulsoup4, requests, anyio, starlette, typing-extensions, numpy, scikit-learn, pytest, pytest-asyncio, pytest-cov, pytest-mock, black, isort, mypy

# Confluence RAG Data Pipeline with MCP Protocol

A Model Context Protocol (MCP) server that retrieves relevant context from Confluence pages using retrieval-augmented generation (RAG).

[![PyPI version](https://badge.fury.io/py/confluence-scraper-mcp.svg)](https://badge.fury.io/py/confluence-scraper-mcp)
[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## 🚀 Quick Start

```bash
# Install from PyPI
pip install confluence-scraper-mcp

# Set environment variables
export CONFLUENCE_BASE_URL="https://your-domain.atlassian.net"
export CONFLUENCE_TOKEN="your-api-token"
export CONFLUENCE_SPACE_KEY="your-space-key"

# Run as MCP server
confluence-scraper-mcp

# Or run as web server
confluence-scraper-mcp --web
```

## Features

- 🔍 **Semantic Search**: Uses ChromaDB for vector-based document retrieval
- 🔗 **MCP Integration**: Full Model Context Protocol implementation
- 📚 **Confluence Native**: Direct integration with Confluence API
- 🏷️ **Smart Filtering**: Filter by spaces, labels, and metadata
- 📎 **Rich Content**: Handles attachments and comments
- 🌐 **Dual Mode**: Run as MCP server or REST API
- 📦 **Easy Install**: Available on PyPI
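
To make the "vector-based document retrieval" idea concrete, here is a minimal pure-Python sketch of ranking documents by cosine similarity between embedding vectors. It is illustrative only: ChromaDB performs this kind of lookup internally with its own index, and the function names and toy vectors below are assumptions, not part of this package.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Score every document against the query and sort best-first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy 3-dimensional "embeddings" (hypothetical values); real embedding
    # models produce vectors with hundreds of dimensions.
    docs = {"auth-guide": [0.9, 0.1, 0.0], "release-notes": [0.0, 0.2, 0.9]}
    for doc_id, score in rank_documents([1.0, 0.0, 0.0], docs):
        print(f"{doc_id}: {score:.3f}")
```

A real deployment would delegate both the embedding step and the nearest-neighbour lookup to the embedding model and the vector store rather than looping over every document.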

## Requirements

- Python 3.9 or higher
- Confluence API access token
- ChromaDB for vector storage

## Installation

1. **Install from PyPI (Recommended):**
   ```bash
   pip install confluence-scraper-mcp
   ```

2. **Install uv if you haven't already (needed only for the development setup below):**
   ```bash
   curl -LsSf https://astral.sh/uv/install.sh | sh
   ```

3. **Clone and Setup Project (Development):**
   ```bash
   git clone <repository-url>
   cd confluence-scraper-mcp
   # Create virtual environment
   uv venv .venv
   # Activate virtual environment
   source .venv/bin/activate
   # Install dependencies
   uv pip install -r requirements.txt
   ```

4. **Configure Environment:**
   - Create a `.env` file in the project root:
   ```bash
   touch .env
   ```
   - Add the following configuration (adjust values as needed):
   ```bash
   # Required settings (CONFLUENCE_SPACE_KEY appears to be optional)
   CONFLUENCE_BASE_URL=https://your-domain.atlassian.net
   CONFLUENCE_TOKEN=your-api-token
   CONFLUENCE_SPACE_KEY=optional-space-key
   
   # Optional settings (with defaults)
   INITIAL_CRAWL=false
   CHROMA_PERSIST_DIR=./data/chroma
   EMBEDDING_MODEL="all-MiniLM-L6-v2"
   MAX_PAGES=1000
   INCLUDE_ATTACHMENTS=true
   INCLUDE_COMMENTS=true
   ```
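
The variables above could be read and validated along the following lines. This is a standard-library sketch, not the package's own settings loader (which may well use `pydantic-settings`); the `Settings` class and `load_settings` function are hypothetical names, and the defaults simply mirror the values shown in the `.env` example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings used in .env files."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    confluence_base_url: str
    confluence_token: str
    confluence_space_key: Optional[str]
    initial_crawl: bool
    chroma_persist_dir: str
    embedding_model: str
    max_pages: int
    include_attachments: bool
    include_comments: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from a mapping, failing fast on missing required keys."""
    required = ("CONFLUENCE_BASE_URL", "CONFLUENCE_TOKEN")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError(f"missing required settings: {', '.join(missing)}")
    return Settings(
        confluence_base_url=env["CONFLUENCE_BASE_URL"].rstrip("/"),
        confluence_token=env["CONFLUENCE_TOKEN"],
        confluence_space_key=env.get("CONFLUENCE_SPACE_KEY") or None,
        initial_crawl=_as_bool(env.get("INITIAL_CRAWL", "false")),
        chroma_persist_dir=env.get("CHROMA_PERSIST_DIR", "./data/chroma"),
        embedding_model=env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        max_pages=int(env.get("MAX_PAGES", "1000")),
        include_attachments=_as_bool(env.get("INCLUDE_ATTACHMENTS", "true")),
        include_comments=_as_bool(env.get("INCLUDE_COMMENTS", "true")),
    )
```

Failing at startup on a missing base URL or token tends to be easier to diagnose than an authentication error in the middle of a crawl.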

## Usage

### Command Line Interface (After PyPI Installation)

```bash
# Run as MCP server (stdio mode) - default
confluence-scraper-mcp

# Run as web server
confluence-scraper-mcp --web
```

### Development Mode

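1. **Using `uv run` (Recommended):**
   ```bash
   # Development mode with auto-reload.
   # `uv run` executes inside the project environment; `uvx` runs a tool in an
   # isolated environment that does not contain the project's dependencies.
   uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
   
   # Run tests
   uv run pytest
   
   # Code formatting and checks
   uv run black .
   uv run isort .
   uv run mypy .
   ```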

2. **Alternative: Using Virtual Environment:**
   ```bash
   # Activate virtual environment
   source .venv/bin/activate
   
   # Then run commands as usual
   uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
   ```

3. **Initial Setup:**
   ```bash
   # Start initial crawl of Confluence pages
   curl -X POST http://localhost:8000/crawl
   
   # Verify server health
   curl http://localhost:8000/health
   ```

4. **Use the MCP API:**
   ```bash
   # Get context for an LLM query
   curl -X POST http://localhost:8000/mcp/context \
     -H "Content-Type: application/json" \
     -d '{
       "messages": [{"role": "user", "content": "Tell me about project X"}],
       "query": "project X documentation",
       "max_context_length": 1000
     }'
   
   # The response will include relevant context from your Confluence pages
   ```

5. **Monitor and Maintain:**
   ```bash
   # View logs
   tail -f logs/app.log
   
   # Re-crawl Confluence (e.g., after updates)
   curl -X POST http://localhost:8000/crawl
   ```
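
The `curl` call to `/mcp/context` in step 4 can also be issued from Python. The sketch below uses only the standard library and mirrors the request body shown above; `build_context_request` and `fetch_context` are hypothetical helper names, and the response is assumed to be JSON because its schema is not documented here.

```python
import json
import urllib.request
from typing import Any


def build_context_request(
    base_url: str,
    question: str,
    query: str,
    max_context_length: int = 1000,
) -> urllib.request.Request:
    """Build the POST request for /mcp/context without sending it."""
    payload = {
        "messages": [{"role": "user", "content": question}],
        "query": query,
        "max_context_length": max_context_length,
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/mcp/context",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_context(request: urllib.request.Request, timeout: float = 30.0) -> Any:
    """Send the request and parse the body, assuming the server returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example (requires the web server to be running on localhost:8000):
#   req = build_context_request(
#       "http://localhost:8000",
#       "Tell me about project X",
#       "project X documentation",
#   )
#   print(fetch_context(req))
```

Separating request construction from sending keeps the payload easy to inspect and to unit-test without a running server.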

## API Endpoints

- `GET /health`: Health check endpoint
- `POST /crawl`: Trigger Confluence crawl
- `POST /mcp/context`: Get relevant context for a query

## MCP (Model Context Protocol) Configuration

This server implements the Model Context Protocol (MCP) for seamless integration with AI assistants and LLM clients. 

### Quick MCP Setup

1. **Install the package:**
   ```bash
   pip install confluence-scraper-mcp
   ```

2. **Copy the MCP configuration:**
   ```bash
   # Copy the example configuration
   cp examples/mcp-client-config.json ~/.config/your-mcp-client/
   ```

3. **Update environment variables in the config:**
   ```json
   {
     "mcpServers": {
       "confluence-scraper-mcp": {
         "command": "confluence-scraper-mcp",
         "args": [],
         "env": {
           "CONFLUENCE_BASE_URL": "https://your-domain.atlassian.net",
           "CONFLUENCE_TOKEN": "your-api-token",
           "CONFLUENCE_SPACE_KEY": "your-space-key"
         }
       }
     }
   }
   ```
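
Rather than pasting the API token into the JSON by hand, the same structure can be generated from environment variables. This is a sketch under the assumption that your MCP client reads the `mcpServers` layout shown above; `build_client_config` is a hypothetical helper, not something shipped with the package.

```python
import json
import os
from typing import Any, Dict, Mapping

# Variables forwarded to the server process, as listed in the config above.
ENV_KEYS = ("CONFLUENCE_BASE_URL", "CONFLUENCE_TOKEN", "CONFLUENCE_SPACE_KEY")


def build_client_config(
    env: Mapping[str, str], server_name: str = "confluence-scraper-mcp"
) -> Dict[str, Any]:
    """Return an mcpServers config that forwards only the variables that are set."""
    server_env = {key: env[key] for key in ENV_KEYS if env.get(key)}
    return {
        "mcpServers": {
            server_name: {
                "command": "confluence-scraper-mcp",
                "args": [],
                "env": server_env,
            }
        }
    }


if __name__ == "__main__":
    # Print the config so it can be redirected into the client's config file.
    print(json.dumps(build_client_config(os.environ), indent=2))
```

The generated file still contains the token in plain text, so it should be kept out of version control.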

### MCP Tools Available

The server provides three MCP tools:

- **`confluence_search`**: Search Confluence pages using semantic search
- **`confluence_get_page`**: Retrieve specific page content by ID or title
- **`confluence_crawl`**: Trigger crawling and indexing of content

### Example MCP Tool Usage

```json
{
  "method": "tools/call",
  "params": {
    "name": "confluence_search",
    "arguments": {
      "query": "API authentication methods",
      "space_key": "DEV",
      "max_results": 3,
      "include_attachments": true
    }
  }
}
```
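
The example above shows only the `method` and `params` fields. MCP messages are JSON-RPC 2.0, so on the wire a request also carries `jsonrpc` and `id`, and the stdio transport sends one JSON message per line. The sketch below builds such a message; the helper names are hypothetical, and the argument names are taken from the example rather than from a published tool schema.

```python
import itertools
import json
from typing import Any, Dict, Optional

# Monotonically increasing ids so responses can be matched to requests.
_request_ids = itertools.count(1)


def make_tool_call(
    name: str, arguments: Dict[str, Any], request_id: Optional[int] = None
) -> Dict[str, Any]:
    """Wrap a tool invocation in a JSON-RPC 2.0 request envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids) if request_id is None else request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def encode_stdio_message(message: Dict[str, Any]) -> str:
    """Serialize to a single newline-terminated line for the stdio transport."""
    return json.dumps(message, separators=(",", ":")) + "\n"


if __name__ == "__main__":
    call = make_tool_call(
        "confluence_search",
        {
            "query": "API authentication methods",
            "space_key": "DEV",
            "max_results": 3,
            "include_attachments": True,
        },
    )
    print(encode_stdio_message(call), end="")
```

A complete client would first perform the MCP `initialize` handshake before sending `tools/call`; in practice an MCP client library or host application handles both steps.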

### MCP Configuration Files

The package includes example configuration files:

- **`examples/mcp.json`**: Complete MCP server specification
- **`examples/mcp-client-config.json`**: Simple client configuration

See the [MCP specification](https://spec.modelcontextprotocol.io/) for more details on the protocol.

## 🤖 GitHub Copilot Integration

### Quick Setup for Copilot

1. **Install the package:**
   ```bash
   pip install confluence-scraper-mcp
   ```

2. **Configure VS Code Settings:**
   Open VS Code settings (`Cmd+,` on macOS, `Ctrl+,` on Windows/Linux) and add to your `settings.json`:
   ```json
   {
     "github.copilot.chat.mcpServers": {
       "confluence-rag": {
         "command": "confluence-scraper-mcp",
         "args": [],
         "env": {
           "CONFLUENCE_BASE_URL": "https://your-domain.atlassian.net",
           "CONFLUENCE_TOKEN": "your-api-token",
           "CONFLUENCE_SPACE_KEY": "your-space-key"
         }
       }
     }
   }
   ```

3. **Initial Setup:**
   ```bash
   # Start server and crawl content
   confluence-scraper-mcp --web &
   curl -X POST http://localhost:8000/crawl
   ```

4. **Test with Copilot:**
   Open Copilot Chat and ask: *"How do we handle authentication in our system?"*
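
In step 3 the server is started in the background and the crawl is triggered immediately, so the `curl` call can run before the server is listening. One way to avoid that race is to poll `/health` first. The following is a standard-library sketch; the function names are hypothetical and port 8000 is assumed from the examples in this README.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET /health answers with a 2xx status."""
    try:
        url = f"{base_url.rstrip('/')}/health"
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or HTTP error


def wait_until(
    check: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


def trigger_crawl(base_url: str, timeout: float = 30.0) -> int:
    """POST /crawl with an empty body and return the HTTP status code."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/crawl", data=b"", method="POST"
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example (requires `confluence-scraper-mcp --web` to be starting up):
#   base = "http://localhost:8000"
#   if wait_until(lambda: is_healthy(base)):
#       print("crawl status:", trigger_crawl(base))
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays.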

### Detailed Setup Guide

For complete setup instructions, see: **[📖 Copilot Setup Guide](docs/COPILOT_SETUP.md)**

## Using with Code Assistants

This MCP server specializes in Confluence documentation and uses RAG (Retrieval Augmented Generation) with ChromaDB:

**Key Features:**
- 🔗 **Confluence Integration**: Direct API integration with page, attachment, and comment handling
- 🔍 **Semantic Search**: ChromaDB vector search for meaning-based retrieval
- 🏷️ **Smart Filtering**: Filter by space keys, labels, content types
- 📊 **Metadata Preservation**: Maintains Confluence structure and relationships

Example configuration file with two endpoints:

```json
{
  "endpoints": [
    {
      "name": "API Documentation",
      "url": "http://localhost:8000/mcp/context",
      "options": {
        "max_context_length": 2000,
        "filter": {
          "space_key": "API",
          "labels": ["technical-docs", "api-reference"],
          "include_comments": true,
          "include_attachments": false,
          "semantic_ranking": {
            "weight": 0.7,
            "model": "all-MiniLM-L6-v2"
          }
        }
      },
      "authentication": {
        "type": "none"
      }
    },
    {
      "name": "Architecture Docs",
      "url": "http://localhost:8000/mcp/context",
      "options": {
        "max_context_length": 3000,
        "filter": {
          "space_key": "ARCH",
          "labels": ["architecture", "design"],
          "include_comments": false,
          "include_attachments": true,
          "semantic_ranking": {
            "weight": 0.8,
            "model": "all-MiniLM-L6-v2"
          }
        }
      },
      "authentication": {
        "type": "none"
      }
    }
  ],
  "default_endpoint": "API Documentation"
}
```

- Add the path to this file in VS Code settings under "Copilot Chat: MCP Configuration File"
- See `examples/mcp.json` for a full example with multiple endpoints and filtering options

3. **Usage with Copilot:**
   - In VS Code, open Copilot Chat (Cmd+I)
   - Your queries will now include relevant context from your Confluence pages
   - Example: "How do I implement feature X?" will include context from related Confluence documentation
   - You can also use `/doc` command in Copilot Chat to explicitly search documentation

4. **Tips for Better Results:**
   - Keep Confluence pages well-organized and up-to-date
   - Use descriptive titles and labels in Confluence
   - Re-crawl after significant documentation updates:
     ```bash
     curl -X POST http://localhost:8000/crawl
     ```
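
A multi-endpoint configuration like the one in this section is easy to break with a typo, for example a `default_endpoint` that matches no entry. The sketch below checks a few such invariants before the file is handed to a client. The rules it enforces (unique names, HTTP URLs, a ranking weight between 0 and 1) are assumptions inferred from the example, not a published schema, and `validate_endpoint_config` is a hypothetical helper.

```python
import json
from typing import Any, Dict, List


def validate_endpoint_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    endpoints = config.get("endpoints")
    if not isinstance(endpoints, list) or not endpoints:
        return ["'endpoints' must be a non-empty list"]

    problems: List[str] = []
    names: List[str] = []
    for index, endpoint in enumerate(endpoints):
        if not isinstance(endpoint, dict):
            problems.append(f"endpoint #{index} must be an object")
            continue
        name = endpoint.get("name")
        if not name:
            problems.append(f"endpoint #{index} has no 'name'")
        elif name in names:
            problems.append(f"duplicate endpoint name: {name!r}")
        else:
            names.append(name)
        url = str(endpoint.get("url", ""))
        if not url.startswith(("http://", "https://")):
            problems.append(f"endpoint #{index} has an invalid 'url': {url!r}")
        # Assumed rule: the semantic ranking weight is a fraction in [0, 1].
        ranking = endpoint.get("options", {}).get("filter", {}).get("semantic_ranking", {})
        weight = ranking.get("weight")
        if weight is not None and not 0.0 <= weight <= 1.0:
            problems.append(f"endpoint #{index} has weight {weight} outside [0, 1]")

    default = config.get("default_endpoint")
    if default is not None and default not in names:
        problems.append(f"'default_endpoint' {default!r} matches no endpoint name")
    return problems


if __name__ == "__main__":
    sample = json.loads(
        '{"endpoints": [{"name": "API Documentation",'
        ' "url": "http://localhost:8000/mcp/context"}],'
        ' "default_endpoint": "API Docs"}'
    )
    for problem in validate_endpoint_config(sample):
        print("config problem:", problem)
```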

## Development

1. **Install Development Dependencies:**
   ```bash
   uv pip install -r requirements.txt
   ```

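2. **Using `uv run` for Development:**
   `uv run` executes a command inside the project's virtual environment without explicitly activating it. (`uvx`, by contrast, runs a tool in an isolated, temporary environment that does not contain the project's dependencies, so it is not suitable for starting the server or running the tests.)
   ```bash
   # Run the FastAPI server
   uv run uvicorn app.main:app --reload
   
   # Run tests
   uv run pytest
   
   # Code formatting and type checking
   uv run black .
   uv run isort .
   uv run mypy .
   ```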

3. **Environment Configuration:**
   The project uses environment variables for configuration. Copy `.env.example` to `.env` and update the values:
   ```bash
   CONFLUENCE_BASE_URL=https://your-domain.atlassian.net
   CONFLUENCE_TOKEN=your-api-token
   CONFLUENCE_SPACE_KEY=your-space-key
   CHROMA_PERSIST_DIR=data/chroma
   CHROMA_COLLECTION_NAME=confluence_docs
   EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
   CHUNK_SIZE=512
   CHUNK_OVERLAP=50
   TOP_K=3
   SIMILARITY_THRESHOLD=0.7
   ```
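
To show how `CHUNK_SIZE`, `CHUNK_OVERLAP`, `TOP_K`, and `SIMILARITY_THRESHOLD` typically interact in a RAG pipeline, here is a minimal sketch. It assumes chunk sizes are measured in characters and that a higher score means a closer match; the package may count tokens or use a distance metric instead, and both function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def select_top_k(
    scored: Sequence[Tuple[str, float]], top_k: int = 3, threshold: float = 0.7
) -> List[Tuple[str, float]]:
    """Keep results scoring at least `threshold`, best first, at most top_k."""
    kept = [item for item in scored if item[1] >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:top_k]


if __name__ == "__main__":
    print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
    print(select_top_k([("page-1", 0.91), ("page-2", 0.42), ("page-3", 0.78)]))
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.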

## Contributing

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes:
   - Use `uv run black .` and `uv run isort .` to format code
   - Use `uv run mypy .` for type checking
   - Add tests for new features
   - Update documentation as needed
4. Run tests (`uv run pytest`)
5. Commit your changes (`git commit -m 'Add some amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

## License

MIT License. See [LICENSE](LICENSE) for more information.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "confluence-scraper-mcp",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "ai, chromadb, confluence, llm, mcp, rag, vector-search",
    "author": null,
    "author_email": "Akhil Thomas <akhilthomas236@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/f0/d6/007b6235ebd4860f897761524a430981fb52c4afb6d62c49b9e15113fa1e/confluence_scraper_mcp-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A Model Context Protocol (MCP) server for Confluence RAG with ChromaDB vector search",
    "version": "0.1.3",
    "project_urls": {
        "Documentation": "https://github.com/akhilthomas236/confluence-scraper-mcp#readme",
        "Homepage": "https://github.com/akhilthomas236/confluence-scraper-mcp",
        "Issues": "https://github.com/akhilthomas236/confluence-scraper-mcp/issues",
        "Repository": "https://github.com/akhilthomas236/confluence-scraper-mcp"
    },
    "split_keywords": [
        "ai",
        " chromadb",
        " confluence",
        " llm",
        " mcp",
        " rag",
        " vector-search"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "4f9d59279da34b459382ce49dc7c560e366b819735421a36d4564810cda2ed3a",
                "md5": "8c9ad20a6be9a5b037ae675980aabc46",
                "sha256": "2a49ea71bb38ea0244b1bfb8e4c9d6940ac76929a840c2d01e29b2265eed8208"
            },
            "downloads": -1,
            "filename": "confluence_scraper_mcp-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "8c9ad20a6be9a5b037ae675980aabc46",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 16588,
            "upload_time": "2025-07-19T16:41:11",
            "upload_time_iso_8601": "2025-07-19T16:41:11.305427Z",
            "url": "https://files.pythonhosted.org/packages/4f/9d/59279da34b459382ce49dc7c560e366b819735421a36d4564810cda2ed3a/confluence_scraper_mcp-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f0d6007b6235ebd4860f897761524a430981fb52c4afb6d62c49b9e15113fa1e",
                "md5": "0e92415607500eac67688b5e136e6c7a",
                "sha256": "008852596dff892cce788369517f09f6cd18991da0773a26f502e5a42df6448a"
            },
            "downloads": -1,
            "filename": "confluence_scraper_mcp-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "0e92415607500eac67688b5e136e6c7a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 27125,
            "upload_time": "2025-07-19T16:41:12",
            "upload_time_iso_8601": "2025-07-19T16:41:12.834470Z",
            "url": "https://files.pythonhosted.org/packages/f0/d6/007b6235ebd4860f897761524a430981fb52c4afb6d62c49b9e15113fa1e/confluence_scraper_mcp-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-19 16:41:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "akhilthomas236",
    "github_project": "confluence-scraper-mcp#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "fastapi",
            "specs": [
                [
                    ">=",
                    "0.110.0"
                ]
            ]
        },
        {
            "name": "uvicorn",
            "specs": [
                [
                    ">=",
                    "0.27.0"
                ]
            ]
        },
        {
            "name": "chromadb",
            "specs": [
                [
                    ">=",
                    "0.4.0"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "sentence-transformers",
            "specs": [
                [
                    ">=",
                    "4.0.0"
                ],
                [
                    "<",
                    "6.0.0"
                ]
            ]
        },
        {
            "name": "atlassian-python-api",
            "specs": [
                [
                    ">=",
                    "4.0.0"
                ]
            ]
        },
        {
            "name": "pydantic",
            "specs": [
                [
                    ">=",
                    "2.6.0"
                ],
                [
                    "<",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "pydantic-settings",
            "specs": [
                [
                    "<",
                    "3.0.0"
                ],
                [
                    ">=",
                    "2.2.0"
                ]
            ]
        },
        {
            "name": "python-dotenv",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "loguru",
            "specs": [
                [
                    ">=",
                    "0.7.0"
                ]
            ]
        },
        {
            "name": "httpx",
            "specs": [
                [
                    ">=",
                    "0.28.0"
                ]
            ]
        },
        {
            "name": "beautifulsoup4",
            "specs": [
                [
                    ">=",
                    "4.13.0"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    ">=",
                    "2.32.0"
                ]
            ]
        },
        {
            "name": "anyio",
            "specs": [
                [
                    ">=",
                    "4.0.0"
                ]
            ]
        },
        {
            "name": "starlette",
            "specs": [
                [
                    ">=",
                    "0.36.3"
                ]
            ]
        },
        {
            "name": "typing-extensions",
            "specs": [
                [
                    ">=",
                    "4.8.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    ">=",
                    "1.21.0"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "scikit-learn",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": [
                [
                    ">=",
                    "8.4.0"
                ]
            ]
        },
        {
            "name": "pytest-asyncio",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "pytest-cov",
            "specs": [
                [
                    ">=",
                    "6.2.1"
                ]
            ]
        },
        {
            "name": "pytest-mock",
            "specs": [
                [
                    ">=",
                    "3.14.1"
                ]
            ]
        },
        {
            "name": "black",
            "specs": [
                [
                    ">=",
                    "24.2.0"
                ]
            ]
        },
        {
            "name": "isort",
            "specs": [
                [
                    ">=",
                    "5.13.0"
                ]
            ]
        },
        {
            "name": "mypy",
            "specs": [
                [
                    ">=",
                    "1.8.0"
                ]
            ]
        }
    ],
    "lcname": "confluence-scraper-mcp"
}
        