# SQL Extension for Jupyter
A powerful Jupyter extension designed to work seamlessly with the **SyneHQ.com** data platform, providing secure access to your connected data sources without the need to manage database credentials.
[![Release](https://github.com/SyneHQ/jupyter.sql-extension/actions/workflows/release.yml/badge.svg)](https://github.com/SyneHQ/jupyter.sql-extension/actions/workflows/release.yml)
## About SyneHQ Integration
This extension is specifically built for the [SyneHQ.com](https://synehq.com) data platform, which provides:
- **Zero-Credential Data Access**: Connect to your data sources without exposing database credentials
- **Unified Data Platform**: Access all your connected data sources through a single, secure interface
- **Enterprise-Grade Security**: Built-in authentication, authorization, and audit logging
- **Multi-Platform Support**: Works with your favorite data analysis platforms including Jupyter, R, and more
## Key Features
### 🔐 Secure Connection Management
- **Credential-Free Access**: Retrieve database connections securely through SyneHQ's internal services
- **Enterprise Authentication**: Built-in support for SSO, OAuth, and enterprise identity providers
- **Connection Pooling**: Efficient connection management with automatic retry and failover
### 🛡️ Security & Validation
- **SQL Injection Prevention**: Advanced input validation and query sanitization
- **Query Safety Checks**: Automatic detection of potentially harmful operations
- **Audit Logging**: Complete query execution tracking for compliance and monitoring
### 📊 Rich Output Formatting
- **Pandas DataFrames**: Native support for DataFrame output with automatic type inference
- **Interactive Tables**: HTML tables with sorting, filtering, and pagination
- **JSON Export**: Structured data output for API integrations
- **Custom Visualization**: Support for charts and graphs integration
### 🔄 Advanced Query Features
- **Variable Assignment**: Assign query results to Python variables using intuitive syntax
- **Python Variable Substitution**: Use Python variables, expressions, and function calls directly in SQL queries
- **Type-Safe Formatting**: Automatic type detection and SQL-safe formatting for all Python data types
- **Expression Evaluation**: Evaluate complex Python expressions safely within SQL queries
- **Async Execution**: Non-blocking query execution for better performance
- **Query Caching**: Intelligent caching to reduce redundant database calls
### 📈 Performance & Monitoring
- **Execution Metrics**: Detailed performance tracking and query optimization insights
- **Connection Health**: Real-time monitoring of database connection status
- **Error Recovery**: Automatic retry mechanisms with exponential backoff
## Installation
### Prerequisites
- Python 3.8 or higher
- Jupyter Notebook or JupyterLab
- Access to SyneHQ.com data platform
### Install via pip
```bash
pip install syne-sql-extension
```
### Install from source
```bash
git clone https://github.com/synehq/jupyter-sql-extension.git
cd jupyter-sql-extension
pip install -e .
```
### Load the extension in Jupyter
```python
%load_ext syne_sql_extension
```
## Quick Start
### 1. Set up Authentication
```python
# Option 1: Set global variable (recommended for Jupyter notebooks)
SYNE_OAUTH_KEY = 'your_api_key_here'
# Option 2: Use environment variable
# export SYNE_OAUTH_KEY='your_api_key_here'
# Option 3: Provide via command line (most explicit)
%%sql my_database --api-key your_api_key_here
SELECT * FROM users LIMIT 10
```
### 2. Connect to SyneHQ
```python
%%sql my_database
SELECT * FROM users LIMIT 10
```
### 3. Use with variables
```python
# Assign results to a variable
%%sql analytics_db --output users_df
SELECT user_id, name, email, created_at
FROM users
WHERE created_at >= '2024-01-01'
```
### 4. Parameterized queries
```python
user_limit = 100
department = 'engineering'
%%sql hr_db
SELECT * FROM employees
WHERE department = {department}
LIMIT {user_limit}
```
### 5. Different output formats
```python
# DataFrame output (default)
%%sql sales_db --format dataframe
SELECT product, SUM(revenue) as total_revenue
FROM sales
GROUP BY product
# HTML table
%%sql sales_db --format html
SELECT * FROM products WHERE price > 100
# JSON output
%%sql api_db --format json
SELECT config FROM settings WHERE active = true
```
## Authentication
The extension supports multiple ways to provide your SyneHQ API key for authentication. The API key is resolved in the following order of preference:
### 1. Command Line (Most Explicit)
```python
%%sql my_db --api-key your_api_key_here
SELECT * FROM users LIMIT 10
```
### 2. Global Variable (Recommended for Jupyter)
```python
# Set once at the beginning of your notebook
SYNE_OAUTH_KEY = 'your_api_key_here'
# Then use without specifying the key
%%sql my_db
SELECT * FROM users LIMIT 10
```
### 3. Environment Variable
```bash
# Set in your shell environment
export SYNE_OAUTH_KEY='your_api_key_here'
# Or in your Jupyter environment
import os
os.environ['SYNE_OAUTH_KEY'] = 'your_api_key_here'
```
### Security Best Practices
- **Never hardcode API keys** in your notebooks or commit them to version control
- **Use environment variables** for production deployments
- **Use global variables** for interactive development in Jupyter
- **Rotate API keys regularly** for enhanced security
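One way to follow these practices in an interactive notebook (a suggestion, not a requirement of the extension) is to prompt for the key at runtime instead of hardcoding it, using the standard library's `getpass`:
```python
from getpass import getpass
# Prompt for the key so it never appears in the notebook source or version control.
SYNE_OAUTH_KEY = getpass("SyneHQ API key: ")
```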
### Getting Your API Key
1. Log in to your [SyneHQ account](https://synehq.com)
2. Navigate to Teams > choose your team > API Keys
3. Generate a new API key with appropriate permissions
4. Copy the key and use one of the authentication methods above
## Usage Examples
### Basic Queries
```python
# Simple select
%%sql main_db
SELECT COUNT(*) as total_users FROM users
# Join multiple tables
%%sql warehouse
SELECT
u.name,
p.product_name,
o.order_date,
o.amount
FROM users u
JOIN orders o ON u.id = o.user_id
JOIN products p ON o.product_id = p.id
WHERE o.order_date >= '2024-01-01'
```
### Data Analysis Workflow
```python
# Load data into DataFrame
%%sql analytics
sales_data >> SELECT
DATE(order_date) as date,
product_category,
SUM(amount) as daily_revenue,
COUNT(*) as order_count
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE(order_date), product_category
ORDER BY date DESC
# Analyze the data
print(f"Total revenue: ${sales_data['daily_revenue'].sum():,.2f}")
print(f"Average daily orders: {sales_data['order_count'].mean():.1f}")
# Create visualization
sales_data.groupby('product_category')['daily_revenue'].sum().plot(kind='bar')
```
### Advanced Features
```python
# Using Python variables in queries with enhanced syntax
start_date = '2024-01-01'
end_date = '2024-12-31'
min_revenue = 1000
user_ids = [1, 2, 3, 4, 5]
# Simple variable substitution
%%sql finance
SELECT
customer_id,
SUM(amount) as total_spent
FROM transactions
WHERE transaction_date BETWEEN {start_date} AND {end_date}
GROUP BY customer_id
HAVING SUM(amount) >= {min_revenue}
ORDER BY total_spent DESC
# List variables with automatic formatting
%%sql analytics
SELECT * FROM users WHERE id IN {user_ids}
# Type-specific formatting
%%sql analytics
SELECT * FROM users WHERE id IN {user_ids:list}
# Expression evaluation
%%sql finance
SELECT * FROM products WHERE price = {min_revenue * 1.5}
# Complex expressions with functions
from datetime import datetime, timedelta
%%sql analytics
SELECT * FROM users WHERE created_at >= {datetime.now() - timedelta(days=30)}
```
## Python Variable Support
The extension provides comprehensive Python variable substitution in SQL queries with multiple syntax options and safety features.
### Syntax Options
#### 1. Simple Variable Substitution
```python
user_id = 123
user_name = "John Doe"
%%sql my_connection -k my_key
SELECT * FROM users WHERE id = {user_id}
SELECT * FROM users WHERE name = {user_name}
```
#### 2. Type-Specific Formatting
```python
user_ids = [1, 2, 3, 4, 5]
price = 99.99
created_date = datetime(2024, 1, 1)
%%sql my_connection -k my_key
SELECT * FROM users WHERE id IN {user_ids:list}
SELECT * FROM products WHERE price = {price:number}
SELECT * FROM users WHERE created_at >= {created_date:date}
```
#### 3. Expression Evaluation
```python
base_price = 100
discount_rate = 0.1
tax_rate = 0.08
%%sql my_connection -k my_key
SELECT * FROM products WHERE final_price = {base_price * (1 - discount_rate) * (1 + tax_rate)}
```
#### 4. Function Calls and Complex Expressions
```python
from datetime import datetime, timedelta
%%sql my_connection -k my_key
SELECT * FROM users WHERE created_at >= {datetime.now() - timedelta(days=30)}
SELECT * FROM products WHERE rounded_price = {round(99.99 * 1.15, 2)}
```
### Supported Data Types
- **Strings**: Automatically quoted and escaped
- **Numbers**: Used as-is without quotes
- **Lists/Tuples**: Formatted as SQL IN clauses
- **Booleans**: Converted to strings
- **None**: Converted to SQL NULL
- **Datetime objects**: Formatted as ISO strings
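As an illustrative sketch only, the rules above could be implemented roughly as follows; `to_sql_literal` is a hypothetical helper, not part of the package API, and the extension's actual formatter may differ in details (for example, how booleans are rendered):
```python
from datetime import datetime
def to_sql_literal(value):
    """Hypothetical helper: render a Python value as a SQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        # The docs say booleans are converted to strings; TRUE/FALSE is an assumption here.
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, (list, tuple)):
        # Lists/tuples become parenthesized value lists, suitable for IN clauses.
        return "(" + ", ".join(to_sql_literal(v) for v in value) + ")"
    if isinstance(value, datetime):
        return "'" + value.isoformat() + "'"
    # Strings: escape embedded single quotes and wrap in quotes.
    return "'" + str(value).replace("'", "''") + "'"
print(to_sql_literal([1, 2, 3]))   # (1, 2, 3)
print(to_sql_literal("O'Brien"))   # 'O''Brien'
print(to_sql_literal(None))        # NULL
```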
### Security Features
- **Safe Expression Evaluation**: Only safe built-in functions are allowed
- **Pattern Blocking**: Dangerous patterns like `import`, `exec`, `eval` are blocked
- **Function Blacklisting**: Dangerous functions like `os`, `sys`, `subprocess` are blocked
- **Sandboxed Environment**: Expressions run in a restricted environment
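A minimal sketch of the sandboxing idea described above, assuming a whitelist of builtins plus a regex pattern check before evaluation; `safe_eval`, `SAFE_BUILTINS`, and `BLOCKED_PATTERNS` are illustrative names, not the extension's internals:
```python
import re
# Illustrative whitelist and blocklist; the real extension's lists will differ.
SAFE_BUILTINS = {"abs": abs, "len": len, "max": max, "min": min, "round": round, "sum": sum}
BLOCKED_PATTERNS = re.compile(r"(\bimport\b|\bexec\b|\beval\b|\bos\b|\bsys\b|\bsubprocess\b|__)")
def safe_eval(expression, variables):
    """Evaluate an expression with restricted builtins and pattern blocking."""
    if BLOCKED_PATTERNS.search(expression):
        raise ValueError(f"Blocked pattern in expression: {expression!r}")
    # Only the whitelisted builtins are visible; user variables supply the rest.
    return eval(expression, {"__builtins__": SAFE_BUILTINS}, dict(variables))
# Mirrors the {base_price * (1 - discount_rate)} style used earlier.
print(safe_eval("base_price * (1 - discount_rate)", {"base_price": 100, "discount_rate": 0.1}))
```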
### Complete Example
```python
# Set up variables
start_date = datetime(2024, 1, 1)
end_date = datetime(2024, 12, 31)
min_age = 18
max_age = 65
active_statuses = ['active', 'premium']
excluded_users = [999, 1000, 1001]
# Complex query with multiple variable types
%%sql analytics_db -k my_key
SELECT
u.id,
u.name,
u.email,
u.age,
u.status,
COUNT(o.id) as order_count,
SUM(o.total) as total_spent
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at BETWEEN {start_date:date} AND {end_date:date}
AND u.age BETWEEN {min_age} AND {max_age}
AND u.status IN {active_statuses:list}
AND u.id NOT IN {excluded_users:list}
GROUP BY u.id, u.name, u.email, u.age, u.status
HAVING COUNT(o.id) > 0
ORDER BY total_spent DESC
LIMIT 100
```
For detailed documentation on Python variable support, see [PYTHON_VARIABLE_SUPPORT.md](PYTHON_VARIABLE_SUPPORT.md).
## Connection Management
### Available Connections
```python
# List available connections
%%sql --list-connections -k {key}
```
### Test Connection
```python
# Test if connection is working
%%sql {connection_id}
SELECT 1
```
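The `--test` flag listed under Magic Command Options (see the API Reference below) can also validate a connection without executing a query:
```python
# Validate the connection without running a query
%%sql {connection_id} --test
```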
## Error Handling
The extension provides comprehensive error handling with user-friendly messages:
```python
%%sql invalid_db
SELECT * FROM nonexistent_table
```
Common error scenarios:
- **Connection Errors**: Invalid connection ID, network issues, authentication failures
- **Query Errors**: SQL syntax errors, table not found, permission denied
- **Validation Errors**: SQL injection attempts, unsafe operations
- **Timeout Errors**: Long-running queries, connection timeouts
## Security Features
### SQL Injection Prevention
```python
# ❌ This will be blocked
user_input = "'; DROP TABLE users; --"
%%sql db
SELECT * FROM users WHERE name = '{user_input}'
# ✅ Use parameter binding instead
user_input = "John Doe"
%%sql db
SELECT * FROM users WHERE name = {user_input}
```
### Query Validation
The extension automatically validates queries for:
- Potentially dangerous operations (DROP, DELETE, etc.)
- SQL injection patterns
- Syntax errors
- Resource usage limits
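Conceptually, these checks resemble the simplified sketch below (assumed logic; `validate_query` and the pattern lists are illustrative, not the extension's actual validator):
```python
import re
from typing import List
# Simplified, assumed rules; the real validator is more thorough.
DANGEROUS_STATEMENTS = ("DROP", "DELETE", "TRUNCATE", "ALTER")
INJECTION_PATTERNS = (r";\s*--", r"\bUNION\b.*\bSELECT\b", r"\bOR\b\s+1\s*=\s*1")
def validate_query(sql: str) -> List[str]:
    """Return warnings for potentially unsafe SQL."""
    upper = sql.upper()
    warnings = []
    for stmt in DANGEROUS_STATEMENTS:
        if re.search(rf"\b{stmt}\b", upper):
            warnings.append(f"Potentially dangerous operation: {stmt}")
    for pattern in INJECTION_PATTERNS:
        if re.search(pattern, upper):
            warnings.append(f"Possible injection pattern: {pattern}")
    return warnings
print(validate_query("SELECT * FROM users WHERE name = '' OR 1=1 --"))
```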
## Performance Optimization
### Query Caching
```python
# Enable caching for repeated queries
%%sql db --cache
SELECT expensive_aggregation() FROM large_table
```
### Async Execution
```python
# Run multiple queries concurrently.
# Note: execute_query is a placeholder for however you dispatch queries
# programmatically; it is not part of the extension's documented API.
import asyncio
async def run_queries():
    tasks = []
    for db in ['db1', 'db2', 'db3']:
        task = execute_query(f"%%sql {db}\nSELECT COUNT(*) FROM table")
        tasks.append(task)
    results = await asyncio.gather(*tasks)
    return results
```
## Troubleshooting
### Common Issues
**Extension not loading:**
```python
# Check if extension is properly installed
%load_ext syne_sql_extension
```
**Connection failures:**
- Verify your SyneHQ API credentials
- Check network connectivity to SyneHQ services
- Ensure your workspace has access to the requested data sources
**Query errors:**
- Validate SQL syntax
- Check table and column names
- Verify permissions for the data source
### Debug Mode
```python
# Enable debug logging
%%sql db --debug
SELECT * FROM users
```
### Getting Help
- Check the [SyneHQ Documentation](https://docs.synehq.com)
- Visit our [GitHub Issues](https://github.com/synehq/jupyter-sql-extension/issues)
- Contact support at [support@synehq.com](mailto:support@synehq.com)
## API Reference
### Magic Command Options
| Option | Description | Default |
|--------|-------------|---------|
| `--connection-id` | SyneHQ connection identifier | Required |
| `--api-key` | SyneHQ API key for authentication | Auto-detected |
| `--output` | Variable name for query results | None |
| `--format` | Output format (dataframe, html, json) | dataframe |
| `--timeout` | Query timeout in seconds | 30 |
| `--cache` | Enable query caching | false |
| `--debug` | Enable debug logging | false |
| `--test` | Test connection without executing query | false |
### Configuration Options
| Setting | Description | Default |
|---------|-------------|---------|
| `api_url` | SyneHQ API endpoint | https://api.synehq.com |
| `timeout` | Default query timeout | 30 |
| `retry_attempts` | Number of retry attempts | 3 |
| `cache_enabled` | Enable query caching | true |
| `cache_ttl` | Cache time-to-live (seconds) | 300 |
| `output_format` | Default output format | dataframe |
## Contributing
We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.
### Development Setup
```bash
git clone https://github.com/synehq/jupyter-sql-extension.git
cd jupyter-sql-extension
pip install -e ".[dev]"
pre-commit install
```
### Running Tests
```bash
pytest tests/
```
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Support
- **Documentation**: [docs.synehq.com](https://docs.synehq.com)
- **Issues**: [GitHub Issues](https://github.com/synehq/jupyter.sql-extension/issues)
- **Email**: [support@synehq.com](mailto:support@synehq.com)
- **SyneHQ Platform**: [synehq.com](https://synehq.com)
---
**Made with ❤️ by the SyneHQ team**