# lfcdemolib
**Lakeflow Connect Demo Library**
A comprehensive Python library for building and managing Databricks Lakeflow Connect (LFC) demonstrations with support for multiple cloud providers and database types.
## Features
- **Simplified Demo Initialization**: One-line setup for Databricks notebooks with `DemoInstance`
- **Multi-Database Support**: SQL Server, MySQL, PostgreSQL, Oracle
- **Cloud Provider Support**: Azure, Oracle Cloud Infrastructure (OCI)
- **Change Data Capture (CDC)**: Built-in CDC/CT (Change Tracking) implementations
- **Schema Evolution**: Automatic schema evolution and migration handling
- **Connection Management**: Secure credential storage and retrieval
- **DML Operations**: Simplified data manipulation with automatic scheduling
- **REST API Integration**: Databricks workspace API wrapper
- **Test Framework**: Comprehensive testing utilities for database operations
## Installation
```bash
pip install lfcdemolib
```
**All database drivers are included** as core dependencies:
- pymysql (MySQL)
- psycopg2-binary (PostgreSQL)
- pymssql (SQL Server)
- oracledb (Oracle)
### Optional Dependencies
For development or documentation tooling:
```bash
# Development tools (pytest, black, flake8, mypy, isort)
pip install "lfcdemolib[dev]"

# Documentation tools (sphinx)
pip install "lfcdemolib[docs]"
```
## Quick Start
### Databricks Notebook
```python
import lfcdemolib

# Configuration
config_dict = {
    "source_connection_name": "lfcddemo-azure-mysql-both",
    "cdc_qbc": "cdc",
    "database": {
        "cloud": "azure",
        "type": "mysql"
    }
}

# One-line initialization
d = lfcdemolib.DemoInstance(config_dict, dbutils, spark)

# Create pipeline
d.create_pipeline(pipeline_spec)

# Execute DML operations
d.dml.execute_delete_update_insert()

# Get recent data
df = d.dml.get_recent_data()
display(df)
```
### Tuple Unpacking (Advanced)
```python
# Get all components
d, config, dbxs, dmls, dbx_key, dml_key, scheduler = lfcdemolib.DemoInstance(
    config_dict,
    dbutils,
    spark
)

# Use individual components
config.source_connection_name
dmls[dml_key].execute_delete_update_insert()
scheduler.get_jobs()
```
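Tuple unpacking like this only works if the facade object is iterable. A minimal sketch of the pattern, assuming the facade yields itself and its components in a fixed order (all names below are illustrative, not the library's actual internals):

```python
# Illustrative only: one way a facade can support both attribute access
# and tuple unpacking. Not the library's actual implementation.
class DemoFacade:
    def __init__(self, config, dbxs, dmls, dbx_key, dml_key, scheduler):
        self.config = config
        self.dbxs = dbxs
        self.dmls = dmls
        self.dbx_key = dbx_key
        self.dml_key = dml_key
        self.scheduler = scheduler

    def __iter__(self):
        # Yield self first so `d, config, ... = DemoFacade(...)` works
        # while `d = DemoFacade(...)` alone still returns the object.
        yield from (self, self.config, self.dbxs, self.dmls,
                    self.dbx_key, self.dml_key, self.scheduler)
```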
## Core Components
### DemoInstance
Simplified facade for demo initialization with automatic caching and scheduler management.
```python
d = lfcdemolib.DemoInstance(config_dict, dbutils, spark)
```
**Features:**
- Singleton scheduler management
- Automatic instance caching
- Simplified one-line initialization
- Delegates to DbxRest for Databricks operations
### LfcScheduler
Background task scheduler using APScheduler.
```python
scheduler = lfcdemolib.LfcScheduler()
scheduler.add_job(my_function, 'interval', seconds=60)
```
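Assuming `LfcScheduler` exposes APScheduler's job-management API (`get_jobs` is already used elsewhere in this README; `remove_job` is the standard APScheduler counterpart), inspecting and cancelling jobs looks like this:

```python
# Assumes LfcScheduler delegates to APScheduler's BackgroundScheduler.
job = scheduler.add_job(my_function, 'interval', seconds=60, id='my_function')

for j in scheduler.get_jobs():       # list scheduled jobs
    print(j.id, j.next_run_time)

scheduler.remove_job('my_function')  # cancel the job by id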
### DbxRest
Databricks REST API client with connection and secret management.
```python
dbx = lfcdemolib.DbxRest(dbutils=dbutils, config=config, lfc_scheduler=scheduler)
dbx.create_pipeline(spec)
```
### SimpleDML
Simplified DML operations with automatic scheduling.
```python
dml = lfcdemolib.SimpleDML(secrets_json, config=config, lfc_scheduler=scheduler)
dml.execute_delete_update_insert()
df = dml.get_recent_data()
```
### Pydantic Models
Type-safe configuration and credential management.
```python
from lfcdemolib import LfcNotebookConfig, LfcCredential

# Validate configuration
config = LfcNotebookConfig(config_dict)

# Validate credentials
credential = LfcCredential(secrets_json)
```
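If these are standard Pydantic models, invalid input raises `pydantic.ValidationError`, which is worth catching in notebooks; a minimal sketch (keyword construction via `**` is an assumption here, not confirmed library API):

```python
from pydantic import ValidationError

from lfcdemolib import LfcNotebookConfig

try:
    config = LfcNotebookConfig(**config_dict)  # assumes keyword construction
except ValidationError as e:
    # Each error lists the offending field and the constraint it violated
    print(e)
```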
## Database Support
### Supported Databases
- **SQL Server**: CDC and Change Tracking (CT) support
- **MySQL**: Full replication support
- **PostgreSQL**: Logical replication support
- **Oracle**: 19c and later
### Supported Cloud Providers
- **Azure**: SQL Database, Azure Database for MySQL/PostgreSQL
- **OCI**: Oracle Cloud Infrastructure databases
## Configuration
### LfcNotebookConfig
```python
config_dict = {
    "source_connection_name": "lfcddemo-azure-mysql-both",  # Required
    "cdc_qbc": "cdc",          # Required: "cdc" or "qbc"
    "target_catalog": "main",  # Optional: defaults to "main"
    "source_schema": None,     # Optional: auto-detect
    "database": {              # Required if "source_connection_name" is blank
        "cloud": "azure",      # "azure" or "oci"
        "type": "mysql"        # "mysql", "postgresql", "sqlserver", "oracle"
    }
}
```
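For contrast, an illustrative variant that leaves `source_connection_name` blank and relies on the `database` block instead, per the comments above (the specific values are assumptions, not tested settings):

```python
# Illustrative: blank connection name, so the "database" block
# drives source selection.
qbc_config = {
    "source_connection_name": "",
    "cdc_qbc": "qbc",
    "target_catalog": "main",
    "source_schema": None,
    "database": {
        "cloud": "oci",
        "type": "oracle"
    }
}
```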
### LfcCredential (V2 Format)
```python
credential = {
    "host_fqdn": "myserver.database.windows.net",
    "port": 3306,
    "catalog": "mydb",
    "schema": "dbo",
    "username": "user",
    "password": "pass",
    "db_type": "mysql",
    "cloud": {
        "provider": "azure",
        "region": "eastus"
    },
    "dba": {
        "username": "admin",
        "password": "adminpass"
    }
}
```
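Since SQLAlchemy >= 1.4 is a requirement (see Requirements below), a credential in this shape maps naturally onto a connection URL. A hedged sketch; the db_type-to-dialect mapping is an assumption based on the bundled drivers, not necessarily what the library does internally:

```python
from sqlalchemy.engine import URL

# Assumed mapping built from the drivers shipped as core dependencies
# (the oracle+oracledb dialect requires a recent SQLAlchemy).
DRIVERS = {
    "mysql": "mysql+pymysql",
    "postgresql": "postgresql+psycopg2",
    "sqlserver": "mssql+pymssql",
    "oracle": "oracle+oracledb",
}

url = URL.create(
    drivername=DRIVERS[credential["db_type"]],
    username=credential["username"],
    password=credential["password"],
    host=credential["host_fqdn"],
    port=credential["port"],
    database=credential["catalog"],
)
```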
## Advanced Features
### Automatic Scheduling
```python
# DML operations run automatically
d = lfcdemolib.DemoInstance(config_dict, dbutils, spark)
# Auto-scheduled DML operations every 10 seconds
```
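A rough sketch of what that auto-scheduling amounts to, assuming the interval job simply wraps the DML helper (the job id below is hypothetical):

```python
# Roughly what DemoInstance sets up for you (illustrative, not the
# library's internals); the 10-second interval matches the comment above.
d.scheduler.add_job(
    d.dml.execute_delete_update_insert,
    'interval',
    seconds=10,
    id='auto_dml',  # hypothetical job id
)
```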
### Custom Scheduler Jobs
```python
def my_task():
    print("Running custom task")

d.scheduler.add_job(my_task, 'interval', seconds=30, id='my_task')
```
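Re-running a notebook cell registers the job again; assuming `add_job` forwards APScheduler keyword arguments, `replace_existing=True` avoids a conflicting-id error:

```python
# replace_existing is standard APScheduler behavior; this assumes
# LfcScheduler forwards the keyword argument unchanged.
d.scheduler.add_job(my_task, 'interval', seconds=30,
                    id='my_task', replace_existing=True)
```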
### Connection Management
```python
from lfcdemolib import LfcConn

# Manage Databricks connections
lfc_conn = LfcConn(workspace_client=workspace_client)
connection = lfc_conn.get_connection(connection_name)
```
### Secret Management
```python
from lfcdemolib import LfcSecrets

# Manage Databricks secrets
lfc_secrets = LfcSecrets(workspace_client=workspace_client)
secret = lfc_secrets.get_secret(scope='lfcddemo', key='mysql_password')
```
### Local Credential Storage
```python
from lfcdemolib import SimpleLocalCred

# Save credentials locally
cred_manager = SimpleLocalCred()
cred_manager.save_credentials(db_details, db_type='mysql', cloud='azure')

# Load credentials
credential = cred_manager.get_credential(
    host='myserver.database.windows.net',
    db_type='mysql'
)
```
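The CLI examples below read `~/.lfcddemo/credentials.json`, so a JSON file under that directory is a reasonable mental model for what `SimpleLocalCred` persists. A hypothetical sketch of the lookup pattern, not the library's actual file layout:

```python
import json
from pathlib import Path
from typing import Optional

# Hypothetical stand-in for SimpleLocalCred's storage: a JSON list of
# credential dicts keyed by host and db_type under ~/.lfcddemo/.
CRED_FILE = Path.home() / ".lfcddemo" / "credentials.json"

def load_credential(host: str, db_type: str) -> Optional[dict]:
    if not CRED_FILE.exists():
        return None
    creds = json.loads(CRED_FILE.read_text())
    return next(
        (c for c in creds
         if c.get("host_fqdn") == host and c.get("db_type") == db_type),
        None,
    )
```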
## Testing
### SimpleTest
Comprehensive database test suite.
```python
from lfcdemolib import SimpleTest

tester = SimpleTest(workspace_client, config)
results = tester.run_comprehensive_tests()
```
## Command-Line Tools
### Deploy Credentials
```bash
cd lfc/db/bin
python deploy_credentials_to_workspaces.py \
    --credential-file ~/.lfcddemo/credentials.json \
    --target-workspace prod
```
### Convert Secrets
```bash
python convert_secret_to_credential.py \
    --scope-name lfcddemo \
    --secret-name mysql-connection \
    --source azure
```
## Examples
### Multi-Database Demo
```python
import lfcdemolib

# MySQL
mysql_d = lfcdemolib.DemoInstance(mysql_config, dbutils, spark)
mysql_d.create_pipeline(mysql_spec)

# PostgreSQL
pg_d = lfcdemolib.DemoInstance(pg_config, dbutils, spark)
pg_d.create_pipeline(pg_spec)

# SQL Server
sqlserver_d = lfcdemolib.DemoInstance(sqlserver_config, dbutils, spark)
sqlserver_d.create_pipeline(sqlserver_spec)

# All share the same scheduler
print(mysql_d.scheduler is pg_d.scheduler)  # True
```
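A minimal sketch of how a shared-scheduler singleton can work at module level (illustrative; the library's actual mechanism may differ):

```python
# Illustrative module-level singleton: every instance reuses one
# scheduler, which is why the identity check above prints True.
_scheduler = None

def get_scheduler():
    global _scheduler
    if _scheduler is None:
        _scheduler = lfcdemolib.LfcScheduler()
    return _scheduler
```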
### Monitoring
```python
# Check active jobs
for job in d.scheduler.get_jobs():
    print(f"{job.id}: {job.next_run_time}")

# Check cleanup queue
for item in d.cleanup_queue.queue:
    print(item)
```
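Iterating `.queue` only peeks at the underlying deque; if `cleanup_queue` is a standard `queue.Queue` (an assumption based on the `.queue` attribute above), draining it consumes the items:

```python
import queue

# Assumes d.cleanup_queue is a standard queue.Queue.
while True:
    try:
        item = d.cleanup_queue.get_nowait()  # removes the item
    except queue.Empty:
        break
    print("cleaning up:", item)
```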
## Requirements
- Python >= 3.8
- Databricks Runtime 13.0+
- SQLAlchemy >= 1.4.0
- Pydantic >= 1.8.0 (both v1 and v2 supported via the built-in compatibility layer)
- APScheduler >= 3.9.0
## License
This project is licensed under the Databricks Labs License; see the [LICENSE](LICENSE) file for details.
## Contributing
This is a Databricks Labs project. Contributions are welcome! Please ensure:
- Code follows PEP 8 style guidelines
- All tests pass
- Documentation is updated
- Pydantic v1/v2 compatibility is maintained
## Support
For issues, questions, or contributions, please contact the Databricks Labs team.
## Changelog
### Version 0.0.6 (Current)
- Fixed `AttributeError: 'LfcNotebookConfig' object has no attribute 'get'` in `SimpleConn.py` for Pydantic v2
- Added `_get_config_value()` helper method for safe config access from both Pydantic models and dicts
- Corrected README.md changelog (was incorrectly showing "Version 1.0.0", now shows accurate release history)
- Improved compatibility with both Pydantic v1 and v2 models
### Version 0.0.5
- Fixed `AttributeError` with Pydantic v2 `LfcNotebookConfig` in `SimpleConn.py`
- Added `_get_config_value()` helper method for safe config access
- Improved compatibility with both Pydantic v1 and v2 models
### Version 0.0.4
- Added Pydantic v1/v2 compatibility layer (`_pydantic_compat.py`)
- Now works with both Pydantic v1.10+ and v2.x
- Resolves dependency conflicts with langchain, databricks-agents, etc.
- Updated `LfcCredentialModel` and `LfcNotebookConfig` to use compatibility layer
### Version 0.0.3
- Fixed VERSION file not included in MANIFEST.in (build error fix)
- Added VERSION to package manifest for proper sdist builds
- Fixed cleanup queue display format in notebooks
### Version 0.0.2
- Fixed pydantic version requirement for Databricks compatibility
- Added typing_extensions compatibility
- All database drivers included as core dependencies
- Updated description to "Lakeflow Connect Demo Library"
### Version 0.0.1
- Initial release
- DemoInstance facade for simplified initialization
- Support for MySQL, PostgreSQL, SQL Server, Oracle
- Azure and OCI cloud provider support
- Pydantic v1-based validation
- APScheduler integration
- Comprehensive test framework
---
**Databricks Labs** | [Documentation](#) | [Examples](#) | [API Reference](#)