dbxsql


Namedbxsql JSON
Version 1.0.1 PyPI version JSON
download
home_pageNone
SummaryA comprehensive Databricks SQL handler with Pydantic models, OAuth authentication, and connection management
upload_time2025-08-08 11:41:17
maintainerNone
docs_urlNone
authorNone
requires_python>=3.10
licenseMIT
keywords database databricks oauth pydantic sql
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # Databricks SQL Handler

A comprehensive Python package for interacting with Databricks SQL warehouses using Pydantic models, OAuth authentication, and robust connection management.

## Features

- **OAuth Authentication**: Secure client credentials flow authentication
- **Pydantic Models**: Type-safe data models for query results
- **Connection Management**: Robust connection handling with automatic retry logic
- **Query Execution**: Execute single or multiple queries with structured results
- **Built-in Models**: Pre-defined models for common Databricks operations
- **Extensible**: Easy to extend with custom data models
- **CLI Interface**: Interactive command-line interface for testing and development

## Installation

```bash
pip install dbxsql
```

## Quick Start

### Basic Usage

```python
from dbxsql import QueryHandler, DatabricksSettings, NexsysRecord

# Configure settings (or use environment variables)
settings = DatabricksSettings(
    client_id="your_client_id",
    client_secret="your_client_secret", 
    server_hostname="your_databricks_hostname",
    http_path="/sql/1.0/warehouses/your_warehouse_id"
)

# Use as context manager
with QueryHandler(settings) as handler:
    # Execute a simple query
    result = handler.execute_query("SELECT * FROM my_table LIMIT 10")
    
    # Execute with Pydantic model parsing
    result = handler.execute_query(
        "SELECT * FROM nexsys_table LIMIT 10", 
        NexsysRecord
    )
    
    # Access structured data
    for record in result.data:
        print(f"ID: {record.id}, Name: {record.name}")
```

### Environment Variables

Create a `.env` file:

```env
DATABRICKS_CLIENT_ID=your_client_id
DATABRICKS_CLIENT_SECRET=your_client_secret
DATABRICKS_SERVER_HOSTNAME=your_hostname.databricks.com
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/warehouse_id
DATABRICKS_LOG_LEVEL=INFO
```

### CLI Usage

```bash
# Interactive mode
dbxsql --interactive

# Run example queries
dbxsql --examples

# Execute a single query
dbxsql --query "SELECT current_timestamp()"
```

## Custom Models

Create your own Pydantic models:

```python
from pydantic import BaseModel
from datetime import datetime
from typing import Optional

class MyCustomModel(BaseModel):
    id: int
    name: str
    created_at: datetime
    amount: Optional[float] = None

# Register the model
from dbxsql import register_model
register_model("my_custom", MyCustomModel)

# Use it in queries
result = handler.execute_query("SELECT * FROM my_table", MyCustomModel)
```

## Configuration

All configuration can be done via environment variables with the `DATABRICKS_` prefix or programmatically:

| Setting | Environment Variable | Default | Description |
|---------|---------------------|---------|-------------|
| client_id | DATABRICKS_CLIENT_ID | Required | OAuth client ID |
| client_secret | DATABRICKS_CLIENT_SECRET | Required | OAuth client secret |
| server_hostname | DATABRICKS_SERVER_HOSTNAME | Required | Databricks hostname |
| http_path | DATABRICKS_HTTP_PATH | Required | SQL warehouse HTTP path |
| log_level | DATABRICKS_LOG_LEVEL | INFO | Logging level |
| max_retries | DATABRICKS_MAX_RETRIES | 3 | Query retry attempts |
| query_timeout | DATABRICKS_QUERY_TIMEOUT | 300 | Query timeout (seconds) |

## API Reference

### QueryHandler

Main class for executing queries and managing connections.

#### Methods

- `execute_query(query, model_class=None)`: Execute single query
- `execute_multiple_queries(queries, model_classes=None)`: Execute multiple queries  
- `execute_query_with_retry(query, model_class=None)`: Execute with automatic retry
- `list_files(path)`: List files in Databricks path
- `show_tables(database=None)`: Show tables in database
- `test_connection()`: Test database connectivity

### Built-in Models

- `NexsysRecord`: For NEXSYS system data
- `SalesRecord`: For sales transaction data  
- `FileInfo`: For file listing results
- `TableInfo`: For table information
- `GenericRecord`: For unknown data structures

## Error Handling

The package provides specific exceptions:

```python
from dbxsql import (
    AuthenticationError,
    ConnectionError, 
    QueryExecutionError,
    SyntaxError,
    TimeoutError,
    DataParsingError
)

try:
    result = handler.execute_query("SELECT * FROM table")
except AuthenticationError:
    print("Authentication failed")
except QueryExecutionError as e:
    print(f"Query failed: {e}")
```

## Troubleshooting

### CERTIFICATE_VERIFY_FAILED

Sometimes, depending on an operational system the next error message could be returned:

```
dbxsql.exceptions.ConnectionError: Failed to connect to Databricks: Error during request to server: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1007)
```
This error can be fixed by setting an environment variable **SSL_CERT_FILE**. This can be done in a terminal:
```bash
  export SSL_CERT_FILE=/path/to/databricks.pem
```
or directly in the code:
```python
import os
import certifi

from dbxsql import QueryHandler
from dbxsql.settings import settings
from dbxsql.main import ApplicationRunner

os.environ["SSL_CERT_FILE"]  = certifi.where()


with QueryHandler(settings) as handler:
    app_runner = ApplicationRunner(handler)
    app_runner.run_example_queries()
```

## License

MIT License

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "dbxsql",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.10",
    "maintainer_email": "Andras Gyacsok <atti.dyachok@gmail.com>",
    "keywords": "database, databricks, oauth, pydantic, sql",
    "author": null,
    "author_email": "Andras Gyacsok <atti.dyachok@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/f3/55/06e604c224b9664ea069c116bc9afde055b27be6640912e5ac44aa4a36b5/dbxsql-1.0.1.tar.gz",
    "platform": null,
    "description": "# Databricks SQL Handler\n\nA comprehensive Python package for interacting with Databricks SQL warehouses using Pydantic models, OAuth authentication, and robust connection management.\n\n## Features\n\n- **OAuth Authentication**: Secure client credentials flow authentication\n- **Pydantic Models**: Type-safe data models for query results\n- **Connection Management**: Robust connection handling with automatic retry logic\n- **Query Execution**: Execute single or multiple queries with structured results\n- **Built-in Models**: Pre-defined models for common Databricks operations\n- **Extensible**: Easy to extend with custom data models\n- **CLI Interface**: Interactive command-line interface for testing and development\n\n## Installation\n\n```bash\npip install dbxsql\n```\n\n## Quick Start\n\n### Basic Usage\n\n```python\nfrom dbxsql import QueryHandler, DatabricksSettings, NexsysRecord\n\n# Configure settings (or use environment variables)\nsettings = DatabricksSettings(\n    client_id=\"your_client_id\",\n    client_secret=\"your_client_secret\", \n    server_hostname=\"your_databricks_hostname\",\n    http_path=\"/sql/1.0/warehouses/your_warehouse_id\"\n)\n\n# Use as context manager\nwith QueryHandler(settings) as handler:\n    # Execute a simple query\n    result = handler.execute_query(\"SELECT * FROM my_table LIMIT 10\")\n    \n    # Execute with Pydantic model parsing\n    result = handler.execute_query(\n        \"SELECT * FROM nexsys_table LIMIT 10\", \n        NexsysRecord\n    )\n    \n    # Access structured data\n    for record in result.data:\n        print(f\"ID: {record.id}, Name: {record.name}\")\n```\n\n### Environment Variables\n\nCreate a `.env` file:\n\n```env\nDATABRICKS_CLIENT_ID=your_client_id\nDATABRICKS_CLIENT_SECRET=your_client_secret\nDATABRICKS_SERVER_HOSTNAME=your_hostname.databricks.com\nDATABRICKS_HTTP_PATH=/sql/1.0/warehouses/warehouse_id\nDATABRICKS_LOG_LEVEL=INFO\n```\n\n### CLI Usage\n\n```bash\n# Interactive mode\ndbxsql --interactive\n\n# Run example queries\ndbxsql --examples\n\n# Execute a single query\ndbxsql --query \"SELECT current_timestamp()\"\n```\n\n## Custom Models\n\nCreate your own Pydantic models:\n\n```python\nfrom pydantic import BaseModel\nfrom datetime import datetime\nfrom typing import Optional\n\nclass MyCustomModel(BaseModel):\n    id: int\n    name: str\n    created_at: datetime\n    amount: Optional[float] = None\n\n# Register the model\nfrom dbxsql import register_model\nregister_model(\"my_custom\", MyCustomModel)\n\n# Use it in queries\nresult = handler.execute_query(\"SELECT * FROM my_table\", MyCustomModel)\n```\n\n## Configuration\n\nAll configuration can be done via environment variables with the `DATABRICKS_` prefix or programmatically:\n\n| Setting | Environment Variable | Default | Description |\n|---------|---------------------|---------|-------------|\n| client_id | DATABRICKS_CLIENT_ID | Required | OAuth client ID |\n| client_secret | DATABRICKS_CLIENT_SECRET | Required | OAuth client secret |\n| server_hostname | DATABRICKS_SERVER_HOSTNAME | Required | Databricks hostname |\n| http_path | DATABRICKS_HTTP_PATH | Required | SQL warehouse HTTP path |\n| log_level | DATABRICKS_LOG_LEVEL | INFO | Logging level |\n| max_retries | DATABRICKS_MAX_RETRIES | 3 | Query retry attempts |\n| query_timeout | DATABRICKS_QUERY_TIMEOUT | 300 | Query timeout (seconds) |\n\n## API Reference\n\n### QueryHandler\n\nMain class for executing queries and managing connections.\n\n#### Methods\n\n- `execute_query(query, model_class=None)`: Execute single query\n- `execute_multiple_queries(queries, model_classes=None)`: Execute multiple queries  \n- `execute_query_with_retry(query, model_class=None)`: Execute with automatic retry\n- `list_files(path)`: List files in Databricks path\n- `show_tables(database=None)`: Show tables in database\n- `test_connection()`: Test database connectivity\n\n### Built-in Models\n\n- `NexsysRecord`: For NEXSYS system data\n- `SalesRecord`: For sales transaction data  \n- `FileInfo`: For file listing results\n- `TableInfo`: For table information\n- `GenericRecord`: For unknown data structures\n\n## Error Handling\n\nThe package provides specific exceptions:\n\n```python\nfrom dbxsql import (\n    AuthenticationError,\n    ConnectionError, \n    QueryExecutionError,\n    SyntaxError,\n    TimeoutError,\n    DataParsingError\n)\n\ntry:\n    result = handler.execute_query(\"SELECT * FROM table\")\nexcept AuthenticationError:\n    print(\"Authentication failed\")\nexcept QueryExecutionError as e:\n    print(f\"Query failed: {e}\")\n```\n\n## Troubleshooting\n\n### CERTIFICATE_VERIFY_FAILED\n\nSometimes, depending on an operational system the next error message could be returned:\n\n```\ndbxsql.exceptions.ConnectionError: Failed to connect to Databricks: Error during request to server: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1007)\n```\nThis error can be fixed by setting an environment variable **SSL_CERT_FILE**. This can be done in a terminal:\n```bash\n  export SSL_CERT_FILE=/path/to/databricks.pem\n```\nor directly in the code:\n```python\nimport os\nimport certifi\n\nfrom dbxsql import QueryHandler\nfrom dbxsql.settings import settings\nfrom dbxsql.main import ApplicationRunner\n\nos.environ[\"SSL_CERT_FILE\"]  = certifi.where()\n\n\nwith QueryHandler(settings) as handler:\n    app_runner = ApplicationRunner(handler)\n    app_runner.run_example_queries()\n```\n\n## License\n\nMIT License\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A comprehensive Databricks SQL handler with Pydantic models, OAuth authentication, and connection management",
    "version": "1.0.1",
    "project_urls": {
        "Documentation": "https://github.com/kavodsky/dbxsql#readme",
        "Homepage": "https://github.com/kavodsky/dbxsql",
        "Issues": "https://github.com/kavodsky/dbxsql/issues",
        "Repository": "https://github.com/kavodsky/dbxsql.git"
    },
    "split_keywords": [
        "database",
        " databricks",
        " oauth",
        " pydantic",
        " sql"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9ab6daae9304cbc66cbde558575726355885bd65b86492aa7bbe6131f60508c4",
                "md5": "087eeb45aa330f9ad848ac0c9588caaa",
                "sha256": "711a3dbd6cc102fa8f6c5e1cd0a6ea0315d4cd3b2b927e3586e9119796065ef5"
            },
            "downloads": -1,
            "filename": "dbxsql-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "087eeb45aa330f9ad848ac0c9588caaa",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10",
            "size": 19768,
            "upload_time": "2025-08-08T11:41:16",
            "upload_time_iso_8601": "2025-08-08T11:41:16.850341Z",
            "url": "https://files.pythonhosted.org/packages/9a/b6/daae9304cbc66cbde558575726355885bd65b86492aa7bbe6131f60508c4/dbxsql-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "f35506e604c224b9664ea069c116bc9afde055b27be6640912e5ac44aa4a36b5",
                "md5": "0650c90cbc55bddb37ae2f84a5ea0892",
                "sha256": "e726f80b6fb9cc3e4d9e3457d6631a9ea327bd862d6b7d5bfe1a08fc563ce881"
            },
            "downloads": -1,
            "filename": "dbxsql-1.0.1.tar.gz",
            "has_sig": false,
            "md5_digest": "0650c90cbc55bddb37ae2f84a5ea0892",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10",
            "size": 15449,
            "upload_time": "2025-08-08T11:41:17",
            "upload_time_iso_8601": "2025-08-08T11:41:17.862798Z",
            "url": "https://files.pythonhosted.org/packages/f3/55/06e604c224b9664ea069c116bc9afde055b27be6640912e5ac44aa4a36b5/dbxsql-1.0.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-08 11:41:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kavodsky",
    "github_project": "dbxsql#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "dbxsql"
}
        
Elapsed time: 1.15499s