| Field | Value |
|-------|-------|
| Name | dbxsql |
| Version | 1.0.1 |
| Summary | A comprehensive Databricks SQL handler with Pydantic models, OAuth authentication, and connection management |
| Upload time | 2025-08-08 11:41:17 |
| Requires Python | >=3.10 |
| License | MIT |
| Keywords | database, databricks, oauth, pydantic, sql |
# Databricks SQL Handler
A comprehensive Python package for interacting with Databricks SQL warehouses using Pydantic models, OAuth authentication, and robust connection management.
## Features
- **OAuth Authentication**: Secure client credentials flow authentication
- **Pydantic Models**: Type-safe data models for query results
- **Connection Management**: Robust connection handling with automatic retry logic
- **Query Execution**: Execute single or multiple queries with structured results
- **Built-in Models**: Pre-defined models for common Databricks operations
- **Extensible**: Easy to extend with custom data models
- **CLI Interface**: Interactive command-line interface for testing and development
## Installation
```bash
pip install dbxsql
```
## Quick Start
### Basic Usage
```python
from dbxsql import QueryHandler, DatabricksSettings, NexsysRecord

# Configure settings (or use environment variables)
settings = DatabricksSettings(
    client_id="your_client_id",
    client_secret="your_client_secret",
    server_hostname="your_databricks_hostname",
    http_path="/sql/1.0/warehouses/your_warehouse_id",
)

# Use as context manager
with QueryHandler(settings) as handler:
    # Execute a simple query
    result = handler.execute_query("SELECT * FROM my_table LIMIT 10")

    # Execute with Pydantic model parsing
    result = handler.execute_query(
        "SELECT * FROM nexsys_table LIMIT 10",
        NexsysRecord,
    )

    # Access structured data
    for record in result.data:
        print(f"ID: {record.id}, Name: {record.name}")
```
### Environment Variables
Create a `.env` file:
```env
DATABRICKS_CLIENT_ID=your_client_id
DATABRICKS_CLIENT_SECRET=your_client_secret
DATABRICKS_SERVER_HOSTNAME=your_hostname.databricks.com
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/warehouse_id
DATABRICKS_LOG_LEVEL=INFO
```
### CLI Usage
```bash
# Interactive mode
dbxsql --interactive
# Run example queries
dbxsql --examples
# Execute a single query
dbxsql --query "SELECT current_timestamp()"
```
## Custom Models
Create your own Pydantic models:
```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class MyCustomModel(BaseModel):
    id: int
    name: str
    created_at: datetime
    amount: Optional[float] = None

# Register the model
from dbxsql import register_model
register_model("my_custom", MyCustomModel)

# Use it in queries
result = handler.execute_query("SELECT * FROM my_table", MyCustomModel)
```
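The parsing step itself can be exercised without a warehouse connection; here Pydantic (v2 API) validates a hand-written row dict, with the sample values invented for illustration:

```python
from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class MyCustomModel(BaseModel):
    id: int
    name: str
    created_at: datetime
    amount: Optional[float] = None

# A raw row as it might come back from the warehouse (illustrative values).
row = {"id": "7", "name": "widget", "created_at": "2025-08-08T11:41:17", "amount": None}

record = MyCustomModel.model_validate(row)
# "7" is coerced to int 7, and the ISO string to a datetime object.
```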
## Configuration
All configuration can be done via environment variables with the `DATABRICKS_` prefix or programmatically:
| Setting | Environment Variable | Default | Description |
|---------|---------------------|---------|-------------|
| client_id | DATABRICKS_CLIENT_ID | Required | OAuth client ID |
| client_secret | DATABRICKS_CLIENT_SECRET | Required | OAuth client secret |
| server_hostname | DATABRICKS_SERVER_HOSTNAME | Required | Databricks hostname |
| http_path | DATABRICKS_HTTP_PATH | Required | SQL warehouse HTTP path |
| log_level | DATABRICKS_LOG_LEVEL | INFO | Logging level |
| max_retries | DATABRICKS_MAX_RETRIES | 3 | Query retry attempts |
| query_timeout | DATABRICKS_QUERY_TIMEOUT | 300 | Query timeout (seconds) |
## API Reference
### QueryHandler
Main class for executing queries and managing connections.
#### Methods
- `execute_query(query, model_class=None)`: Execute single query
- `execute_multiple_queries(queries, model_classes=None)`: Execute multiple queries
- `execute_query_with_retry(query, model_class=None)`: Execute with automatic retry
- `list_files(path)`: List files in Databricks path
- `show_tables(database=None)`: Show tables in database
- `test_connection()`: Test database connectivity
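The exact policy behind `execute_query_with_retry` is not documented here; as context, a generic retry-with-exponential-backoff helper looks like this (a sketch, not dbxsql's actual implementation):

```python
import time

def run_with_retry(fn, max_retries: int = 3, base_delay: float = 0.01):
    """Call fn(); on failure, retry up to max_retries total attempts with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # retries exhausted: re-raise the last error
            time.sleep(base_delay * (2 ** attempt))

# A function that fails twice before succeeding, to exercise the helper.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = run_with_retry(flaky)  # succeeds on the third attempt
```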
### Built-in Models
- `NexsysRecord`: For NEXSYS system data
- `SalesRecord`: For sales transaction data
- `FileInfo`: For file listing results
- `TableInfo`: For table information
- `GenericRecord`: For unknown data structures
## Error Handling
The package provides specific exceptions:
```python
from dbxsql import (
    AuthenticationError,
    ConnectionError,
    QueryExecutionError,
    SyntaxError,
    TimeoutError,
    DataParsingError,
)

try:
    result = handler.execute_query("SELECT * FROM table")
except AuthenticationError:
    print("Authentication failed")
except QueryExecutionError as e:
    print(f"Query failed: {e}")
```
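Note that three of these names (`ConnectionError`, `SyntaxError`, `TimeoutError`) shadow Python built-ins; a quick stdlib check confirms the collision, and aliasing the dbxsql imports (e.g. `from dbxsql import ConnectionError as DbxConnectionError`) avoids hiding the built-ins:

```python
import builtins

# These dbxsql exception names collide with Python built-ins of the same name;
# importing them unaliased hides the built-ins in the current module.
shadowed = [name for name in ("ConnectionError", "SyntaxError", "TimeoutError")
            if hasattr(builtins, name)]
```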
## Troubleshooting
### CERTIFICATE_VERIFY_FAILED
Depending on the operating system, the following error message may be returned:
```
dbxsql.exceptions.ConnectionError: Failed to connect to Databricks: Error during request to server: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1007)
```
This error can be fixed by setting the **SSL_CERT_FILE** environment variable. This can be done in a terminal:
```bash
export SSL_CERT_FILE=/path/to/databricks.pem
```
or directly in the code:
```python
import os

import certifi

from dbxsql import QueryHandler
from dbxsql.settings import settings
from dbxsql.main import ApplicationRunner

os.environ["SSL_CERT_FILE"] = certifi.where()

with QueryHandler(settings) as handler:
    app_runner = ApplicationRunner(handler)
    app_runner.run_example_queries()
```
## License
MIT License
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Raw data
{
"_id": null,
"home_page": null,
"name": "dbxsql",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": "Andras Gyacsok <atti.dyachok@gmail.com>",
"keywords": "database, databricks, oauth, pydantic, sql",
"author": null,
"author_email": "Andras Gyacsok <atti.dyachok@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/f3/55/06e604c224b9664ea069c116bc9afde055b27be6640912e5ac44aa4a36b5/dbxsql-1.0.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "A comprehensive Databricks SQL handler with Pydantic models, OAuth authentication, and connection management",
"version": "1.0.1",
"project_urls": {
"Documentation": "https://github.com/kavodsky/dbxsql#readme",
"Homepage": "https://github.com/kavodsky/dbxsql",
"Issues": "https://github.com/kavodsky/dbxsql/issues",
"Repository": "https://github.com/kavodsky/dbxsql.git"
},
"split_keywords": [
"database",
" databricks",
" oauth",
" pydantic",
" sql"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9ab6daae9304cbc66cbde558575726355885bd65b86492aa7bbe6131f60508c4",
"md5": "087eeb45aa330f9ad848ac0c9588caaa",
"sha256": "711a3dbd6cc102fa8f6c5e1cd0a6ea0315d4cd3b2b927e3586e9119796065ef5"
},
"downloads": -1,
"filename": "dbxsql-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "087eeb45aa330f9ad848ac0c9588caaa",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 19768,
"upload_time": "2025-08-08T11:41:16",
"upload_time_iso_8601": "2025-08-08T11:41:16.850341Z",
"url": "https://files.pythonhosted.org/packages/9a/b6/daae9304cbc66cbde558575726355885bd65b86492aa7bbe6131f60508c4/dbxsql-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "f35506e604c224b9664ea069c116bc9afde055b27be6640912e5ac44aa4a36b5",
"md5": "0650c90cbc55bddb37ae2f84a5ea0892",
"sha256": "e726f80b6fb9cc3e4d9e3457d6631a9ea327bd862d6b7d5bfe1a08fc563ce881"
},
"downloads": -1,
"filename": "dbxsql-1.0.1.tar.gz",
"has_sig": false,
"md5_digest": "0650c90cbc55bddb37ae2f84a5ea0892",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 15449,
"upload_time": "2025-08-08T11:41:17",
"upload_time_iso_8601": "2025-08-08T11:41:17.862798Z",
"url": "https://files.pythonhosted.org/packages/f3/55/06e604c224b9664ea069c116bc9afde055b27be6640912e5ac44aa4a36b5/dbxsql-1.0.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-08 11:41:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "kavodsky",
"github_project": "dbxsql#readme",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "dbxsql"
}