# LocalData MCP Server
[License: MIT](https://opensource.org/licenses/MIT)
[Python 3.8+](https://www.python.org/downloads/)
[FastMCP](https://github.com/jlowin/fastmcp)
**A powerful, secure MCP server for local databases and structured text files with advanced security features and large dataset handling.**
## ✨ Features
### 🗄️ **Multi-Database Support**
- **SQL Databases**: PostgreSQL, MySQL, SQLite
- **Document Databases**: MongoDB
- **Structured Files**: CSV, JSON, YAML, TOML
### 🔒 **Advanced Security**
- **Path Security**: Restricts file access to current working directory only
- **SQL Injection Prevention**: Parameterized queries and safe table identifiers
- **Connection Limits**: Maximum 10 concurrent database connections
- **Input Validation**: Comprehensive validation and sanitization
### 📊 **Large Dataset Handling**
- **Query Buffering**: Automatic buffering for results with 100+ rows
- **Large File Support**: 100MB+ files automatically use temporary SQLite storage
- **Chunk Retrieval**: Paginated access to large result sets
- **Auto-Cleanup**: 10-minute expiry with file modification detection
### 🛠️ **Developer Experience**
- **Comprehensive Tools**: 14 database operation tools
- **Error Handling**: Detailed, actionable error messages
- **Thread Safety**: Concurrent operation support
- **Backward Compatible**: All existing APIs preserved
## 🚀 Quick Start
### Installation
```bash
# Using pip
pip install localdata-mcp
# Using uv (recommended)
uv tool install localdata-mcp
# Development installation
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
pip install -e .
```
### Configuration
Add to your MCP client configuration:
```json
{
"mcpServers": {
"localdata": {
"command": "localdata-mcp",
"env": {}
}
}
}
```
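For Claude Desktop, this block typically goes in `claude_desktop_config.json`; other MCP clients use their own configuration files, but the `command` entry is the same `localdata-mcp` executable installed above.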
### Usage Examples
#### Connect to Databases
```python
# PostgreSQL
connect_database("analytics", "postgresql", "postgresql://user:pass@localhost/db")
# SQLite
connect_database("local", "sqlite", "./data.sqlite")
# CSV Files
connect_database("csvdata", "csv", "./data.csv")
# JSON Files
connect_database("config", "json", "./config.json")
```
#### Query Data
```python
# Execute queries with automatic result formatting
execute_query("analytics", "SELECT * FROM users LIMIT 50")
# Large result sets use buffering automatically
execute_query_json("analytics", "SELECT * FROM large_table")
```
#### Handle Large Results
```python
# Get chunked results for large datasets
# args: buffer ID (the ID shown is illustrative), starting row, chunk size
get_query_chunk("analytics_1640995200_a1b2", 101, "100")
# Check buffer status
get_buffered_query_info("analytics_1640995200_a1b2")
# Manual cleanup
clear_query_buffer("analytics_1640995200_a1b2")
```
## 🔧 Available Tools
| Tool | Description | Use Case |
|------|-------------|----------|
| `connect_database` | Connect to databases/files | Initial setup |
| `disconnect_database` | Close connections | Cleanup |
| `list_databases` | Show active connections | Status check |
| `execute_query` | Run SQL (markdown output) | Small results |
| `execute_query_json` | Run SQL (JSON output) | Large results |
| `describe_database` | Show schema/structure | Exploration |
| `describe_table` | Show table details | Analysis |
| `get_table_sample` | Preview table data | Quick look |
| `get_table_sample_json` | Preview (JSON format) | Development |
| `find_table` | Locate tables by name | Navigation |
| `read_text_file` | Read structured files | File access |
| `get_query_chunk` | Paginated result access | Large data |
| `get_buffered_query_info` | Buffer status info | Monitoring |
| `clear_query_buffer` | Manual buffer cleanup | Management |
## 📋 Supported Data Sources
### SQL Databases
- **PostgreSQL**: Full support with connection pooling
- **MySQL**: Complete MySQL/MariaDB compatibility
- **SQLite**: Local file and in-memory databases
### Document Databases
- **MongoDB**: Collection queries and aggregation
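MongoDB sources connect through the same `connect_database` tool as the SQL examples above; the `"mongodb"` type string and URI below are illustrative, not a documented example:

```python
# MongoDB (connection string is a placeholder)
connect_database("docs", "mongodb", "mongodb://localhost:27017/mydb")
```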
### Structured Files
- **CSV**: Large file automatic SQLite conversion
- **JSON**: Nested structure flattening
- **YAML**: Configuration file support
- **TOML**: Settings and config files
## 🛡️ Security Features
### Path Security
```python
# ✅ Allowed - current directory and subdirectories
"./data/users.csv"
"data/config.json"
"subdir/file.yaml"
# ❌ Blocked - parent directory access
"../etc/passwd"
"../../sensitive.db"
"/etc/hosts"
```
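The check behind this is conceptually simple: resolve the requested path (following `..` components and symlinks), then require the result to stay inside the working directory. A minimal sketch of the technique, not the server's actual code:

```python
import os

def is_allowed_path(user_path):
    """Illustrative path check: resolve '..' and symlinks, then require
    the result to remain inside the current working directory."""
    base = os.path.realpath(os.getcwd())
    target = os.path.realpath(user_path)
    # commonpath raises ValueError on Windows when drives differ,
    # which is also a rejection case.
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        return False

is_allowed_path("./data/users.csv")  # True  (inside the working directory)
is_allowed_path("../etc/passwd")     # False (escapes the sandbox)
```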
### SQL Injection Prevention
```python
# ✅ Safe - parameterized queries
describe_table("mydb", "users") # Validates table name
# ❌ Blocked - malicious input
describe_table("mydb", "users; DROP TABLE users; --")
```
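Identifier validation of this kind typically combines an allowlist pattern for table names with parameterized queries for values. A hedged sketch of the idea (the helper name is hypothetical):

```python
import re

_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def safe_table_name(name):
    """Reject anything that is not a plain SQL identifier; values are
    bound separately via query parameters, never string interpolation."""
    if not _IDENTIFIER_RE.match(name):
        raise ValueError(f"Invalid table name: {name!r}")
    return name

safe_table_name("users")                        # ok
safe_table_name("users; DROP TABLE users; --")  # raises ValueError
```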
### Resource Limits
- **Connection Limit**: Maximum 10 concurrent connections
- **File Size Threshold**: 100MB triggers temporary storage
- **Query Buffering**: Automatic for 100+ row results
- **Auto-Cleanup**: Buffers expire after 10 minutes
## 📊 Performance & Scalability
### Large File Handling
- Files over 100MB automatically use temporary SQLite storage
- Memory-efficient streaming for large datasets (see the sketch below)
- Automatic cleanup of temporary files
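The streaming conversion works roughly like the following sketch, which reads the CSV in fixed-size chunks with pandas and appends each chunk to a temporary SQLite table (the table name and chunk size are assumptions, not the server's internals):

```python
import sqlite3
import pandas as pd

def csv_to_temp_sqlite(csv_path, db_path, chunk_rows=50_000):
    """Stream a large CSV into SQLite without loading it all into memory."""
    conn = sqlite3.connect(db_path)
    try:
        for chunk in pd.read_csv(csv_path, chunksize=chunk_rows):
            # Each chunk is a DataFrame of at most chunk_rows rows.
            chunk.to_sql("data", conn, if_exists="append", index=False)
    finally:
        conn.close()
```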
### Query Optimization
- Results with 100+ rows automatically use buffering system
- Chunk-based retrieval for large datasets
- File modification detection for cache invalidation (sketched below)
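Invalidation can be as simple as comparing a stored timestamp against the source file's mtime, as in this illustrative sketch (the 10-minute TTL matches the documented expiry; the function is hypothetical):

```python
import os
import time

BUFFER_TTL_SECONDS = 600  # documented 10-minute expiry

def buffer_is_stale(created_at, source_path, source_mtime):
    """A buffer is dropped once it outlives the TTL or the underlying
    file has been modified since the query ran."""
    if time.time() - created_at > BUFFER_TTL_SECONDS:
        return True
    return os.path.getmtime(source_path) != source_mtime
```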
### Concurrency
- Thread-safe connection management (see the sketch after this list)
- Concurrent query execution support
- Resource pooling and limits
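A bounded, lock-guarded registry is a standard way to get both the thread safety and the 10-connection cap described above. A minimal sketch under those assumptions (class and method names are hypothetical):

```python
import threading

MAX_CONNECTIONS = 10  # documented connection cap

class ConnectionRegistry:
    """Illustrative thread-safe connection table with a hard limit."""

    def __init__(self):
        self._lock = threading.Lock()
        self._connections = {}

    def add(self, name, conn):
        with self._lock:
            if len(self._connections) >= MAX_CONNECTIONS:
                raise RuntimeError("Connection limit (10) reached")
            self._connections[name] = conn

    def remove(self, name):
        with self._lock:
            return self._connections.pop(name, None)
```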
## 🧪 Testing & Quality
**✅ 100% Test Coverage**
- 100+ comprehensive test cases
- Security vulnerability testing
- Performance benchmarking
- Edge case validation
**🔒 Security Validated**
- Path traversal prevention
- SQL injection protection
- Resource exhaustion testing
- Malicious input handling
**⚡ Performance Tested**
- Large file processing
- Concurrent connection handling
- Memory usage optimization
- Query response times
## 🔄 API Compatibility
All existing MCP tool signatures remain **100% backward compatible**. New functionality is additive only:
- ✅ All original tools work unchanged
- ✅ Enhanced responses with additional metadata
- ✅ New buffering tools for large datasets
- ✅ Improved error messages and validation
## 📖 Examples
### Basic Database Operations
```python
# Connect to SQLite
connect_database("sales", "sqlite", "./sales.db")
# Explore structure
describe_database("sales")
describe_table("sales", "orders")
# Query data
execute_query("sales", "SELECT product, SUM(amount) FROM orders GROUP BY product")
```
### Large Dataset Processing
```python
# Connect to large CSV
connect_database("bigdata", "csv", "./million_records.csv")
# Query returns buffer info for large results
result = execute_query_json("bigdata", "SELECT * FROM data WHERE category = 'A'")
# Access results in chunks using the buffer ID returned in the result
# metadata (the ID below is illustrative)
chunk = get_query_chunk("bigdata_1640995200_a1b2", 1, "1000")
```
### Multi-Database Analysis
```python
# Connect multiple sources
connect_database("postgres", "postgresql", "postgresql://localhost/prod")
connect_database("config", "yaml", "./config.yaml")
connect_database("logs", "json", "./logs.json")
# Query across sources (in application logic)
user_data = execute_query("postgres", "SELECT * FROM users")
config = read_text_file("./config.yaml", "yaml")
```
## 🚧 Roadmap
- [ ] **Enhanced File Formats**: Excel, Parquet support
- [ ] **Caching Layer**: Configurable query result caching
- [ ] **Connection Pooling**: Advanced connection management
- [ ] **Streaming APIs**: Real-time data processing
- [ ] **Monitoring Tools**: Connection and performance metrics
## 🤝 Contributing
Contributions welcome! Please read our [Contributing Guidelines](CONTRIBUTING.md) for details.
### Development Setup
```bash
git clone https://github.com/ChrisGVE/localdata-mcp.git
cd localdata-mcp
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
pytest
```
## 📄 License
MIT License - see the [LICENSE](LICENSE) file for details.
## 🔗 Links
- **GitHub**: [localdata-mcp](https://github.com/ChrisGVE/localdata-mcp)
- **PyPI**: [localdata-mcp](https://pypi.org/project/localdata-mcp/)
- **MCP Protocol**: [Model Context Protocol](https://modelcontextprotocol.io/)
- **FastMCP**: [FastMCP Framework](https://github.com/jlowin/fastmcp)
---
**Made with ❤️ for the MCP Community**