# Source Cooperative MCP Server
**Discover and access 800TB+ of geospatial data through AI agents.**
An MCP (Model Context Protocol) server for [Source Cooperative](https://source.coop), a collaborative repository with datasets from Maxar, Harvard, ESA, USGS, and 90+ other organizations.
---
## 🏗️ Architecture Overview
```mermaid
graph TB
subgraph "AI Clients"
A1[Claude Desktop]
A2[Claude Code]
A3[Cursor]
A4[Cline]
A5[Zed]
A6[Continue.dev]
end
subgraph "MCP Server"
MCP[Source Cooperative MCP<br/>FastMCP + obstore]
end
subgraph "6 Available Tools"
T1[list_accounts<br/>94+ orgs]
T2[list_products<br/>hybrid S3+API]
T3[get_product_details<br/>+ README]
T4[list_product_files<br/>tree mode]
T5[get_file_metadata<br/>no download]
T6[search<br/>hybrid fuzzy]
end
subgraph "Data Sources"
S1[HTTP API<br/>source.coop/api]
S2[S3 Direct<br/>opendata.source.coop]
end
A1 -->|JSON-RPC| MCP
A2 -->|JSON-RPC| MCP
A3 -->|JSON-RPC| MCP
A4 -->|JSON-RPC| MCP
A5 -->|JSON-RPC| MCP
A6 -->|JSON-RPC| MCP
MCP --> T1
MCP --> T2
MCP --> T3
MCP --> T4
MCP --> T5
MCP --> T6
T1 --> S2
T2 --> S1
T2 --> S2
T3 --> S1
T3 --> S2
T4 --> S2
T5 --> S2
T6 --> S1
T6 --> S2
style MCP fill:#4CAF50,stroke:#2E7D32,stroke-width:3px,color:#fff
style S1 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff
style S2 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff
```
**Key Features:**
- ✅ **Token Optimized** - 72% reduction for large datasets
- ✅ **Smart Partitions** - Auto-detects Hive-style patterns
- ✅ **Fuzzy Search** - Handles typos and partial matches
- ✅ **No Auth** - All 800TB+ is public
---
## 🚀 Quick Start
### Install
```bash
uvx source-coop-mcp
```
### Configure Your AI Client
#### **Claude Desktop / Claude Code / Cursor / Cline**
Add to config file:
- **Claude Desktop**: `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS)
- **Claude Code**: VS Code `settings.json`
- **Cursor**: Cursor settings
- **Cline**: Cline MCP settings
```json
{
"mcpServers": {
"source-coop": {
"command": "uvx",
"args": ["source-coop-mcp"]
}
}
}
```
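If the config file already contains other servers, the entry has to be merged rather than pasted over them. A minimal sketch of that merge, using only the standard library; the helper name `add_server` is hypothetical and not part of this package:

```python
import json
from pathlib import Path


def add_server(config_path: Path, name: str = "source-coop") -> dict:
    """Merge the source-coop entry into an MCP client config, keeping other keys."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # setdefault keeps any servers that are already configured.
    config.setdefault("mcpServers", {})[name] = {
        "command": "uvx",
        "args": ["source-coop-mcp"],
    }
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

On macOS with Claude Desktop this could be called as `add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")`. Back up the file first, since the sketch rewrites it in place.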
#### **Zed**
Add to Zed settings:
```json
{
"context_servers": {
"source-coop": {
"command": "uvx",
"args": ["source-coop-mcp"]
}
}
}
```
#### **Continue.dev**
Add to Continue config (`~/.continue/config.json`):
```json
{
"experimental": {
"modelContextProtocolServers": [
{
"transport": {
"type": "stdio",
"command": "uvx",
"args": ["source-coop-mcp"]
}
}
]
}
}
```
**Restart your AI client and start exploring!**
---
## 🛠️ Available Tools
| Tool | Purpose | Performance |
|------|---------|-------------|
| `list_accounts()` | Find all 94+ organizations | ~850ms |
| `list_products()` | **Hybrid:** S3 mode (default) for ALL datasets + file counts | ~240ms |
| `list_products(include_unpublished=False)` | API mode for published datasets with rich metadata | ~500ms |
| `get_product_details()` | Get metadata + README automatically | ~650ms |
| `list_product_files()` | List files with S3/HTTP paths | ~240ms |
| `list_product_files(show_tree=True)` | Tree view (72% token savings) | ~980ms |
| `get_file_metadata()` | Get file info without downloading | ~230ms |
| `search(query)` | **Hybrid:** Search accounts + products (published + unpublished), top 5 results | ~5-10s |
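
AI clients invoke these tools through MCP's JSON-RPC `tools/call` method. The sketch below shows roughly what such a request looks like on the wire; the helper `tool_call` is hypothetical, and the argument names (`query`, `show_tree`) are taken from the table above rather than from the server's published schema.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC ids only need to be unique per connection


def tool_call(name: str, **arguments) -> str:
    """Serialize an MCP `tools/call` request for the named tool."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


print(tool_call("list_accounts"))
print(tool_call("search", query="climate"))
```

In normal use the AI client builds and sends these messages for you; the sketch is only meant to make the protocol layer in the architecture diagram concrete.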
---
## 💡 What You Can Do
### Discover Data
```
"List all organizations in Source Cooperative"
→ Returns 94+ organizations: maxar, planet, harvard, etc.
"Find all datasets for harvard-lil"
→ Discovers published + unpublished products
"Search for climate datasets"
→ Smart fuzzy search handles typos and partial matches
```
### Access Files
```
"List files in harvard-lil/gov-data"
→ Returns S3 paths and HTTP URLs ready for analysis
"Show me the file tree with partition detection"
→ Smart visualization: year={2020,2021,...+5 more}/ [partitioned]
"Get file metadata without downloading"
→ Size, last modified, ETag
```
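The partition summary in the tree output can be approximated as follows. This is an illustrative sketch of collapsing Hive-style `key=value` directories, not the server's actual implementation; the function name `summarize_dirs` and the `max_values` parameter are assumptions.

```python
import re
from collections import defaultdict

# Hive-style partition directory: key=value, e.g. "year=2020"
PARTITION = re.compile(r"^(?P<key>[^=/]+)=(?P<value>[^/]+)$")


def summarize_dirs(names: list[str], max_values: int = 2) -> list[str]:
    """Collapse sibling key=value directories into one summary line per key."""
    groups: dict[str, list[str]] = defaultdict(list)
    plain: list[str] = []
    for name in names:
        match = PARTITION.match(name)
        if match:
            groups[match["key"]].append(match["value"])
        else:
            plain.append(f"{name}/")

    lines = []
    for key, values in groups.items():
        if len(values) == 1:
            # A single key=value directory is not worth collapsing.
            lines.append(f"{key}={values[0]}/")
            continue
        values = sorted(values)
        shown = ",".join(values[:max_values])
        extra = len(values) - max_values
        suffix = f",...+{extra} more" if extra > 0 else ""
        lines.append(f"{key}={{{shown}{suffix}}}/ [partitioned]")
    return lines + plain
```

Seven directories `year=2020` through `year=2026` collapse into the single line `year={2020,2021,...+5 more}/ [partitioned]`, which is where the token savings on partitioned datasets come from.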
### Smart Search
```
"Search for climte" (typo)
→ Finds "climate" datasets (fuzzy matching)
"Search for geo" (partial)
→ Finds "geospatial", "geocoding", etc.
```
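One way to get this behavior is to combine substring matching with an edit-similarity score. The sketch below uses `difflib` from the standard library; the server's real matching algorithm is not documented here, so treat the function name, the scoring, and the `cutoff` value as assumptions.

```python
from difflib import SequenceMatcher


def fuzzy_search(query: str, names: list[str], cutoff: float = 0.6, limit: int = 5) -> list[str]:
    """Rank names by similarity to the query; substring hits always score 1.0."""
    query = query.lower()
    scored = []
    for name in names:
        candidate = name.lower()
        if query in candidate:
            score = 1.0  # partial match, e.g. "geo" in "geospatial"
        else:
            score = SequenceMatcher(None, query, candidate).ratio()
        if score >= cutoff:
            scored.append((score, name))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


names = ["climate", "geospatial", "geocoding", "imagery"]
print(fuzzy_search("climte", names))
print(fuzzy_search("geo", names))
```

The `limit=5` default mirrors the "top 5 results" behavior listed for `search(query)`.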
---
## ⚡ Features
| Feature | Description |
|---------|-------------|
| **Complete Discovery** | Finds unpublished products the official API doesn't show |
| **No Authentication** | All 800TB+ data is public |
| **Fast Performance** | Rust-backed S3 client (`obstore`), reported as 9x faster than boto3 |
| **Token Optimized** | Tree mode: 72% token reduction for large datasets |
| **Smart Partitions** | Auto-detects patterns: `year={2020,2021,...}` |
| **Fuzzy Search** | Handles typos and partial matches |
| **README Integration** | Documentation automatically included |
| **800TB+ Data** | 94+ organizations, geospatial datasets |
---
## 📋 Example Workflow
```
1. "List all organizations"
→ Get 94+ account names
2. "Show me all datasets from maxar"
→ Discover published + unpublished products
3. "Search for climate data"
→ Smart fuzzy search finds relevant datasets
4. "Get details for harvard-lil/gov-data"
→ Full metadata + README content
5. "List files in this dataset with tree view"
→ Token-optimized tree with partition detection
```
---
## 🎯 Why This Server?
### Problem
Source Cooperative has 800TB+ of valuable data, but:
- Official API only shows **published** products
- No auto-discovery of organizations
- Requires knowing what you're looking for
### Solution
This MCP server provides:
- ✅ Complete auto-discovery (published + unpublished)
- ✅ Smart search with fuzzy matching
- ✅ Direct S3 access for all files
- ✅ Token-optimized outputs (72% reduction)
- ✅ Smart partition detection (up to 88% total savings on heavily partitioned datasets)
- ✅ README documentation included automatically
- ✅ No authentication required
---
## 📊 Performance
Most operations complete in **under 1 second**; `search()` is the exception at roughly 5-10 seconds:
```
list_accounts(): ~850ms (94+ organizations)
list_products(): ~240ms (S3 mode - ALL datasets + file counts)
list_products(include_unpublished=False): ~500ms (API mode - published with metadata)
list_product_files(): ~240ms (simple list)
list_product_files(show_tree=True): ~980ms (72% token savings)
get_file_metadata(): ~230ms (HEAD only)
search(query): ~5-10s (hybrid search - 1 recursive S3 scan, top 5 enriched)
```
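The "HEAD only" note for `get_file_metadata()` means the file body is never transferred. The server itself uses `obstore`; the sketch below shows the same idea over plain HTTP with `urllib`, as a stand-in technique. The function names are hypothetical, and the URL is assumed to be one of the HTTP URLs returned by `list_product_files()`.

```python
import urllib.request


def head_request(url: str) -> urllib.request.Request:
    """Build a HEAD request: the server returns headers only, never the body."""
    return urllib.request.Request(url, method="HEAD")


def file_metadata(url: str, timeout: float = 10.0) -> dict:
    """Fetch size, last-modified time, and ETag for a public file URL."""
    with urllib.request.urlopen(head_request(url), timeout=timeout) as response:
        headers = response.headers
        size = headers.get("Content-Length")
        return {
            "size": int(size) if size is not None else None,
            "last_modified": headers.get("Last-Modified"),
            "etag": headers.get("ETag"),
        }
```

Because only headers travel over the network, the cost is roughly one round trip regardless of file size.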
### Token Optimization Impact
| Dataset Size | Without Tree | With Tree | Saved |
|--------------|--------------|-----------|-------|
| 10 files | 1,500 tokens | 415 tokens | 72.3% |
| 100 files | 15,000 tokens | 4,150 tokens | 72.3% |
| 1,000 files | 150,000 tokens | 41,500 tokens | 72.3% |
With partition detection (1,000 partitions): **88% total savings!**
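
The 72.3% figure in the table follows directly from its two token columns. A short check, assuming the counts listed above:

```python
def savings(without_tree: int, with_tree: int) -> float:
    """Percentage of tokens saved by the tree view."""
    return 100 * (1 - with_tree / without_tree)


rows = [(10, 1_500, 415), (100, 15_000, 4_150), (1_000, 150_000, 41_500)]
for files, before, after in rows:
    print(f"{files:>5} files: {savings(before, after):.1f}% saved")
```

Each row prints 72.3%, since both columns scale linearly with file count in these figures.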
---
## 🔧 Requirements
- **Python**: 3.11 or higher
- **Package Manager**: `uv` (provides the `uvx` command used above)
- **Operating Systems**: macOS, Linux, Windows
---
## 🤝 Development
See [DEVELOPMENT.md](DEVELOPMENT.md) for:
- Architecture details
- Testing instructions
- Contributing guidelines
- Performance benchmarks
- Token optimization details
---
## 📝 Support
- **Issues**: [GitHub Issues](https://github.com/yharby/source-coop-mcp/issues)
---
## 📄 License
MIT License - see [LICENSE](LICENSE) for details.