# mcp-bigquery
<div align="center">
<img src="docs/assets/images/logo.png" alt="MCP BigQuery Logo" width="200">
**Safe BigQuery exploration through Model Context Protocol**
[](LICENSE)
[](https://pypi.org/project/mcp-bigquery/)
[](https://pypi.org/project/mcp-bigquery/)
[](https://pypi.org/project/mcp-bigquery/)
[**Documentation**](https://caron14.github.io/mcp-bigquery/) |
[**Quick Start**](#-quick-start-4-minutes) |
[**Tools**](#-available-tools) |
[**Examples**](#-real-world-examples)
</div>
---
## π What is this?
**mcp-bigquery** is an MCP (Model Context Protocol) server that enables AI assistants like Claude to **safely** interact with Google BigQuery.
### π― Key Features
```mermaid
graph LR
A[AI Assistant] -->|MCP Protocol| B[mcp-bigquery]
B -->|Dry-run Only| C[BigQuery API]
B -.->|β Never Executes| D[Actual Query Execution]
```
- **π‘οΈ 100% Safe**: All operations are dry-run only (never executes queries)
- **π° Cost Transparency**: See costs before running any query
- **π Complete Analysis**: Analyze SQL structure, dependencies, and performance
- **π Schema Explorer**: Browse datasets, tables, and columns with ease
### β‘ Why use mcp-bigquery?
| Problem | Solution with mcp-bigquery |
|---------|---------------------------|
| πΈ Accidentally running expensive queries | Check costs before execution |
| π Wasting time on SQL errors | Detect syntax errors before running |
| πΊοΈ Unknown table structures | Easily explore schemas |
| β οΈ AI executing dangerous operations | Everything is read-only and safe |
## π Quick Start (4 minutes)
### Step 1: Install (1 minute)
```bash
pip install mcp-bigquery
```
### Step 2: Authenticate with Google Cloud (2 minutes)
```bash
# For personal accounts
gcloud auth application-default login
# For service accounts
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json
```
### Step 3: Configure Claude Desktop (1 minute)
Open your Claude Desktop config:
- **Mac**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
Add this configuration:
```json
{
"mcpServers": {
"mcp-bigquery": {
"command": "mcp-bigquery",
"env": {
"BQ_PROJECT": "your-gcp-project-id" // β Replace with your project ID
}
}
}
}
```
### Step 4: Test It!
Restart Claude Desktop and try these questions:
```
"What datasets are available in my BigQuery project?"
"Can you estimate the cost of: SELECT * FROM dataset.table"
"Show me the schema for the users table"
```
## π οΈ Available Tools
### π SQL Validation & Analysis
| Tool | Purpose | When to Use |
|------|---------|-------------|
| **bq_validate_sql** | Check SQL syntax | Before running any query |
| **bq_dry_run_sql** | Get cost estimates & metadata | π° To check costs |
| **bq_analyze_query_structure** | Analyze query complexity | To improve performance |
| **bq_extract_dependencies** | Extract table dependencies | To understand data lineage |
| **bq_validate_query_syntax** | Detailed error analysis | To debug SQL errors |
### π Schema Discovery
| Tool | Purpose | When to Use |
|------|---------|-------------|
| **bq_list_datasets** | List all datasets | To explore your project |
| **bq_list_tables** | List tables with partitioning info | To browse a dataset |
| **bq_describe_table** | Get detailed table schema | To understand columns |
| **bq_get_table_info** | Complete table metadata | To get statistics |
| **bq_query_info_schema** | Query INFORMATION_SCHEMA | For advanced metadata queries |
### β‘ Performance Optimization
| Tool | Purpose | When to Use |
|------|---------|-------------|
| **bq_analyze_query_performance** | Analyze performance | To optimize queries |
## π‘ Real-World Examples
### Example 1: Check Costs Before Running
```python
# Before running an expensive query...
query = "SELECT * FROM `bigquery-public-data.github_repos.commits`"
# First, check the cost
result = bq_dry_run_sql(sql=query)
print(f"Estimated cost: ${result['usdEstimate']}")
print(f"Data processed: {result['totalBytesProcessed'] / 1e9:.2f} GB")
# Output:
# Estimated cost: $12.50
# Data processed: 2500.00 GB
```
### Example 2: Understand Table Structure
```python
# Check table schema
result = bq_describe_table(
dataset_id="your_dataset",
table_id="users"
)
# Output:
# βββ user_id (INTEGER, REQUIRED)
# βββ email (STRING, NULLABLE)
# βββ created_at (TIMESTAMP, REQUIRED)
# βββ profile (RECORD, REPEATED)
# βββ name (STRING)
# βββ age (INTEGER)
```
### Example 3: Get Optimization Suggestions
```python
# Analyze a slow query
query = """
SELECT *
FROM large_table
WHERE date > '2024-01-01'
"""
result = bq_analyze_query_performance(sql=query)
# Output:
# Performance Score: 45/100 (Needs Improvement)
#
# Optimization Suggestions:
# 1. Avoid SELECT * - specify only needed columns
# 2. Use partition filter on date field
# 3. Consider adding LIMIT clause
```
### Example 4: Track Data Dependencies
```python
# Understand query dependencies
query = """
WITH user_stats AS (
SELECT user_id, COUNT(*) as order_count
FROM orders
GROUP BY user_id
)
SELECT u.name, s.order_count
FROM users u
JOIN user_stats s ON u.id = s.user_id
"""
result = bq_extract_dependencies(sql=query)
# Output:
# Tables: ['orders', 'users']
# Columns: ['user_id', 'name', 'id']
# Dependency Graph:
# orders β user_stats β final_result
# users β final_result
```
## π¨ How It Works
```
Your Code β β Claude/AI Assistant
β
MCP Protocol
β
mcp-bigquery
β
BigQuery API (Dry-run)
β
BigQuery
(Never executes actual queries)
```
## βοΈ Configuration
### Environment Variables
```bash
export BQ_PROJECT="my-project" # GCP Project ID (required)
export BQ_LOCATION="asia-northeast1" # Region (optional)
export SAFE_PRICE_PER_TIB="5.0" # Price per TiB (default: $5)
export DEBUG="true" # Enable debug logging
```
### Full Claude Desktop Configuration
```json
{
"mcpServers": {
"mcp-bigquery": {
"command": "mcp-bigquery",
"env": {
"BQ_PROJECT": "my-production-project",
"BQ_LOCATION": "asia-northeast1",
"SAFE_PRICE_PER_TIB": "6.0",
"DEBUG": "false"
}
}
}
}
```
## π§ Troubleshooting
### Common Issues & Solutions
#### β Authentication Error
```
Error: Could not automatically determine credentials
```
**Solution:**
```bash
gcloud auth application-default login
```
#### β Permission Error
```
Error: User does not have bigquery.tables.get permission
```
**Solution:** Grant BigQuery Data Viewer role
```bash
gcloud projects add-iam-policy-binding YOUR_PROJECT \
--member="user:your-email@example.com" \
--role="roles/bigquery.dataViewer"
```
#### β Project Not Set
```
Error: Project ID is required
```
**Solution:** Set `BQ_PROJECT` in your configuration
### Debug Mode
If issues persist, enable debug mode:
```json
{
"env": {
"DEBUG": "true",
"BQ_PROJECT": "your-project"
}
}
```
## π Learn More
### Getting Started
- [Installation Guide](https://caron14.github.io/mcp-bigquery/getting-started/)
- [Tool Documentation](https://caron14.github.io/mcp-bigquery/api/)
- [Example Workflows](https://caron14.github.io/mcp-bigquery/examples/)
### For Developers
- [Local Development](https://caron14.github.io/mcp-bigquery/development/)
- [Running Tests](https://caron14.github.io/mcp-bigquery/testing/)
- [Contributing Guide](CONTRIBUTING.md)
## π¦ Project Status
| Version | Release Date | Key Features |
|---------|--------------|--------------|
| v0.4.1 | 2025-01-22 | Better error handling, debug logging |
| v0.4.0 | 2025-01-22 | Added 6 schema discovery tools |
| v0.3.0 | 2025-01-17 | SQL analysis engine |
| v0.2.0 | 2025-01-16 | Basic validation & dry-run |
## π€ Contributing
Pull requests are welcome! See our [Contributing Guide](CONTRIBUTING.md).
```bash
# Setup development environment
git clone https://github.com/caron14/mcp-bigquery.git
cd mcp-bigquery
pip install -e ".[dev]"
# Run tests
pytest tests/
```
## π License
MIT License - see [LICENSE](LICENSE) for details.
## π Acknowledgments
- Google BigQuery team for the excellent API
- Anthropic for the MCP protocol
- All contributors and users
---
<div align="center">
**Built for safe BigQuery exploration** π‘οΈ
[Report Bug](https://github.com/caron14/mcp-bigquery/issues) Β·
[Request Feature](https://github.com/caron14/mcp-bigquery/issues) Β·
[Discussions](https://github.com/caron14/mcp-bigquery/discussions)
</div>
Raw data
{
"_id": null,
"home_page": null,
"name": "mcp-bigquery",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "caron14 <caron14@users.noreply.github.com>",
"keywords": "bigquery, claude, dry-run, gcp, mcp, sql, validation",
"author": null,
"author_email": "caron14 <caron14@users.noreply.github.com>",
"download_url": "https://files.pythonhosted.org/packages/9c/5d/21dd286128a2f0ff3941d0e39c18196e11ad51f07709bac5d8b78adafdbf/mcp_bigquery-0.4.1.tar.gz",
"platform": null,
"description": "# mcp-bigquery\n\n<div align=\"center\">\n\n<img src=\"docs/assets/images/logo.png\" alt=\"MCP BigQuery Logo\" width=\"200\">\n\n**Safe BigQuery exploration through Model Context Protocol**\n\n[](LICENSE)\n[](https://pypi.org/project/mcp-bigquery/)\n[](https://pypi.org/project/mcp-bigquery/)\n[](https://pypi.org/project/mcp-bigquery/)\n\n[**Documentation**](https://caron14.github.io/mcp-bigquery/) | \n[**Quick Start**](#-quick-start-4-minutes) | \n[**Tools**](#-available-tools) | \n[**Examples**](#-real-world-examples)\n\n</div>\n\n---\n\n## \ud83d\udccc What is this?\n\n**mcp-bigquery** is an MCP (Model Context Protocol) server that enables AI assistants like Claude to **safely** interact with Google BigQuery.\n\n### \ud83c\udfaf Key Features\n\n```mermaid\ngraph LR\n A[AI Assistant] -->|MCP Protocol| B[mcp-bigquery]\n B -->|Dry-run Only| C[BigQuery API]\n B -.->|\u274c Never Executes| D[Actual Query Execution]\n```\n\n- **\ud83d\udee1\ufe0f 100% Safe**: All operations are dry-run only (never executes queries)\n- **\ud83d\udcb0 Cost Transparency**: See costs before running any query\n- **\ud83d\udd0d Complete Analysis**: Analyze SQL structure, dependencies, and performance\n- **\ud83d\udcca Schema Explorer**: Browse datasets, tables, and columns with ease\n\n### \u26a1 Why use mcp-bigquery?\n\n| Problem | Solution with mcp-bigquery |\n|---------|---------------------------|\n| \ud83d\udcb8 Accidentally running expensive queries | Check costs before execution |\n| \ud83d\udc1b Wasting time on SQL errors | Detect syntax errors before running |\n| \ud83d\uddfa\ufe0f Unknown table structures | Easily explore schemas |\n| \u26a0\ufe0f AI executing dangerous operations | Everything is read-only and safe |\n\n## \ud83d\ude80 Quick Start (4 minutes)\n\n### Step 1: Install (1 minute)\n\n```bash\npip install mcp-bigquery\n```\n\n### Step 2: Authenticate with Google Cloud (2 minutes)\n\n```bash\n# For personal accounts\ngcloud auth application-default login\n\n# For service accounts\nexport GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json\n```\n\n### Step 3: Configure Claude Desktop (1 minute)\n\nOpen your Claude Desktop config:\n- **Mac**: `~/Library/Application Support/Claude/claude_desktop_config.json`\n- **Windows**: `%APPDATA%\\Claude\\claude_desktop_config.json`\n\nAdd this configuration:\n\n```json\n{\n \"mcpServers\": {\n \"mcp-bigquery\": {\n \"command\": \"mcp-bigquery\",\n \"env\": {\n \"BQ_PROJECT\": \"your-gcp-project-id\" // \u2190 Replace with your project ID\n }\n }\n }\n}\n```\n\n### Step 4: Test It!\n\nRestart Claude Desktop and try these questions:\n\n```\n\"What datasets are available in my BigQuery project?\"\n\"Can you estimate the cost of: SELECT * FROM dataset.table\"\n\"Show me the schema for the users table\"\n```\n\n## \ud83d\udee0\ufe0f Available Tools\n\n### \ud83d\udcdd SQL Validation & Analysis\n\n| Tool | Purpose | When to Use |\n|------|---------|-------------|\n| **bq_validate_sql** | Check SQL syntax | Before running any query |\n| **bq_dry_run_sql** | Get cost estimates & metadata | \ud83d\udcb0 To check costs |\n| **bq_analyze_query_structure** | Analyze query complexity | To improve performance |\n| **bq_extract_dependencies** | Extract table dependencies | To understand data lineage |\n| **bq_validate_query_syntax** | Detailed error analysis | To debug SQL errors |\n\n### \ud83d\udd0d Schema Discovery\n\n| Tool | Purpose | When to Use |\n|------|---------|-------------|\n| **bq_list_datasets** | List all datasets | To explore your project |\n| **bq_list_tables** | List tables with partitioning info | To browse a dataset |\n| **bq_describe_table** | Get detailed table schema | To understand columns |\n| **bq_get_table_info** | Complete table metadata | To get statistics |\n| **bq_query_info_schema** | Query INFORMATION_SCHEMA | For advanced metadata queries |\n\n### \u26a1 Performance Optimization\n\n| Tool | Purpose | When to Use |\n|------|---------|-------------|\n| **bq_analyze_query_performance** | Analyze performance | To optimize queries |\n\n## \ud83d\udca1 Real-World Examples\n\n### Example 1: Check Costs Before Running\n\n```python\n# Before running an expensive query...\nquery = \"SELECT * FROM `bigquery-public-data.github_repos.commits`\"\n\n# First, check the cost\nresult = bq_dry_run_sql(sql=query)\nprint(f\"Estimated cost: ${result['usdEstimate']}\")\nprint(f\"Data processed: {result['totalBytesProcessed'] / 1e9:.2f} GB\")\n\n# Output:\n# Estimated cost: $12.50\n# Data processed: 2500.00 GB\n```\n\n### Example 2: Understand Table Structure\n\n```python\n# Check table schema\nresult = bq_describe_table(\n dataset_id=\"your_dataset\",\n table_id=\"users\"\n)\n\n# Output:\n# \u251c\u2500\u2500 user_id (INTEGER, REQUIRED)\n# \u251c\u2500\u2500 email (STRING, NULLABLE)\n# \u251c\u2500\u2500 created_at (TIMESTAMP, REQUIRED)\n# \u2514\u2500\u2500 profile (RECORD, REPEATED)\n# \u251c\u2500\u2500 name (STRING)\n# \u2514\u2500\u2500 age (INTEGER)\n```\n\n### Example 3: Get Optimization Suggestions\n\n```python\n# Analyze a slow query\nquery = \"\"\"\nSELECT * \nFROM large_table \nWHERE date > '2024-01-01'\n\"\"\"\n\nresult = bq_analyze_query_performance(sql=query)\n\n# Output:\n# Performance Score: 45/100 (Needs Improvement)\n# \n# Optimization Suggestions:\n# 1. Avoid SELECT * - specify only needed columns\n# 2. Use partition filter on date field\n# 3. Consider adding LIMIT clause\n```\n\n### Example 4: Track Data Dependencies\n\n```python\n# Understand query dependencies\nquery = \"\"\"\nWITH user_stats AS (\n SELECT user_id, COUNT(*) as order_count\n FROM orders\n GROUP BY user_id\n)\nSELECT u.name, s.order_count\nFROM users u\nJOIN user_stats s ON u.id = s.user_id\n\"\"\"\n\nresult = bq_extract_dependencies(sql=query)\n\n# Output:\n# Tables: ['orders', 'users']\n# Columns: ['user_id', 'name', 'id']\n# Dependency Graph:\n# orders \u2192 user_stats \u2192 final_result\n# users \u2192 final_result\n```\n\n## \ud83c\udfa8 How It Works\n\n```\nYour Code \u2190 \u2192 Claude/AI Assistant\n \u2193\n MCP Protocol\n \u2193\n mcp-bigquery\n \u2193\n BigQuery API (Dry-run)\n \u2193\n BigQuery\n (Never executes actual queries)\n```\n\n## \u2699\ufe0f Configuration\n\n### Environment Variables\n\n```bash\nexport BQ_PROJECT=\"my-project\" # GCP Project ID (required)\nexport BQ_LOCATION=\"asia-northeast1\" # Region (optional)\nexport SAFE_PRICE_PER_TIB=\"5.0\" # Price per TiB (default: $5)\nexport DEBUG=\"true\" # Enable debug logging\n```\n\n### Full Claude Desktop Configuration\n\n```json\n{\n \"mcpServers\": {\n \"mcp-bigquery\": {\n \"command\": \"mcp-bigquery\",\n \"env\": {\n \"BQ_PROJECT\": \"my-production-project\",\n \"BQ_LOCATION\": \"asia-northeast1\",\n \"SAFE_PRICE_PER_TIB\": \"6.0\",\n \"DEBUG\": \"false\"\n }\n }\n }\n}\n```\n\n## \ud83d\udd27 Troubleshooting\n\n### Common Issues & Solutions\n\n#### \u274c Authentication Error\n```\nError: Could not automatically determine credentials\n```\n**Solution:**\n```bash\ngcloud auth application-default login\n```\n\n#### \u274c Permission Error\n```\nError: User does not have bigquery.tables.get permission\n```\n**Solution:** Grant BigQuery Data Viewer role\n```bash\ngcloud projects add-iam-policy-binding YOUR_PROJECT \\\n --member=\"user:your-email@example.com\" \\\n --role=\"roles/bigquery.dataViewer\"\n```\n\n#### \u274c Project Not Set\n```\nError: Project ID is required\n```\n**Solution:** Set `BQ_PROJECT` in your configuration\n\n### Debug Mode\n\nIf issues persist, enable debug mode:\n\n```json\n{\n \"env\": {\n \"DEBUG\": \"true\",\n \"BQ_PROJECT\": \"your-project\"\n }\n}\n```\n\n## \ud83d\udcda Learn More\n\n### Getting Started\n- [Installation Guide](https://caron14.github.io/mcp-bigquery/getting-started/)\n- [Tool Documentation](https://caron14.github.io/mcp-bigquery/api/)\n- [Example Workflows](https://caron14.github.io/mcp-bigquery/examples/)\n\n### For Developers\n- [Local Development](https://caron14.github.io/mcp-bigquery/development/)\n- [Running Tests](https://caron14.github.io/mcp-bigquery/testing/)\n- [Contributing Guide](CONTRIBUTING.md)\n\n## \ud83d\udea6 Project Status\n\n| Version | Release Date | Key Features |\n|---------|--------------|--------------|\n| v0.4.1 | 2025-01-22 | Better error handling, debug logging |\n| v0.4.0 | 2025-01-22 | Added 6 schema discovery tools |\n| v0.3.0 | 2025-01-17 | SQL analysis engine |\n| v0.2.0 | 2025-01-16 | Basic validation & dry-run |\n\n\n## \ud83e\udd1d Contributing\n\nPull requests are welcome! See our [Contributing Guide](CONTRIBUTING.md).\n\n```bash\n# Setup development environment\ngit clone https://github.com/caron14/mcp-bigquery.git\ncd mcp-bigquery\npip install -e \".[dev]\"\n\n# Run tests\npytest tests/\n```\n\n## \ud83d\udcc4 License\n\nMIT License - see [LICENSE](LICENSE) for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- Google BigQuery team for the excellent API\n- Anthropic for the MCP protocol\n- All contributors and users\n\n---\n\n<div align=\"center\">\n\n**Built for safe BigQuery exploration** \ud83d\udee1\ufe0f\n\n[Report Bug](https://github.com/caron14/mcp-bigquery/issues) \u00b7 \n[Request Feature](https://github.com/caron14/mcp-bigquery/issues) \u00b7 \n[Discussions](https://github.com/caron14/mcp-bigquery/discussions)\n\n</div>",
"bugtrack_url": null,
"license": "MIT",
"summary": "Minimal MCP server for BigQuery SQL validation and dry-run analysis",
"version": "0.4.1",
"project_urls": {
"Documentation": "https://github.com/caron14/mcp-bigquery#readme",
"Homepage": "https://github.com/caron14/mcp-bigquery",
"Issues": "https://github.com/caron14/mcp-bigquery/issues",
"Repository": "https://github.com/caron14/mcp-bigquery.git"
},
"split_keywords": [
"bigquery",
" claude",
" dry-run",
" gcp",
" mcp",
" sql",
" validation"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "2b8b7fa5d9e4f33a5946642aadbd84087d0eb502bc5a1e3d25379485347ab5dd",
"md5": "1561229039c06d9b4b5f54924623b587",
"sha256": "908d7f080c4b1628157b4640617b5fdbbce5bd3b889f8fe2719b720ea5aa7e60"
},
"downloads": -1,
"filename": "mcp_bigquery-0.4.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "1561229039c06d9b4b5f54924623b587",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 43564,
"upload_time": "2025-08-22T15:47:19",
"upload_time_iso_8601": "2025-08-22T15:47:19.380892Z",
"url": "https://files.pythonhosted.org/packages/2b/8b/7fa5d9e4f33a5946642aadbd84087d0eb502bc5a1e3d25379485347ab5dd/mcp_bigquery-0.4.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "9c5d21dd286128a2f0ff3941d0e39c18196e11ad51f07709bac5d8b78adafdbf",
"md5": "dbf61d84712e05a7c5857763be196c1e",
"sha256": "d74020bb003102e6bbb487cf9d3cb1ad2116af7c6940a2e85691c2a5f52f8ad0"
},
"downloads": -1,
"filename": "mcp_bigquery-0.4.1.tar.gz",
"has_sig": false,
"md5_digest": "dbf61d84712e05a7c5857763be196c1e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 37323,
"upload_time": "2025-08-22T15:47:20",
"upload_time_iso_8601": "2025-08-22T15:47:20.992086Z",
"url": "https://files.pythonhosted.org/packages/9c/5d/21dd286128a2f0ff3941d0e39c18196e11ad51f07709bac5d8b78adafdbf/mcp_bigquery-0.4.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-22 15:47:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "caron14",
"github_project": "mcp-bigquery#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "mcp-bigquery"
}