dql-core

Name	dql-core
Version	0.5.2
home_page	None
Summary	Framework-agnostic validation engine for Data Quality Language (DQL)
upload_time	2025-10-10 18:28:21
maintainer	None
docs_url	None
author	None
requires_python	>=3.8
license	MIT
keywords	dql validation data-quality framework-agnostic
requirements	No requirements were recorded.

# dql-core

[![CI](https://github.com/dql-project/dql-core/actions/workflows/ci.yml/badge.svg)](https://github.com/dql-project/dql-core/actions)
[![PyPI](https://img.shields.io/pypi/v/dql-core)](https://pypi.org/project/dql-core/)
[![Python](https://img.shields.io/pypi/pyversions/dql-core)](https://pypi.org/project/dql-core/)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

Framework-agnostic validation engine for Data Quality Language (DQL).

**[Documentation](https://yourusername.github.io/dql-core/)** | **[PyPI](https://pypi.org/project/dql-core/)** | **[GitHub](https://github.com/dql-project/dql-core)**

`dql-core` provides an abstract validation, cleaner, and executor framework that can be implemented on top of any Python stack (Django, Flask, FastAPI, SQLAlchemy, Pandas, etc.). It handles the business logic of DQL validation without being tied to any specific data access layer.

## Installation

```bash
pip install dql-core
```

## Quick Start

### Creating a Custom Executor

```python
from dql_core import ValidationExecutor
from dql_parser import parse_dql

class MyExecutor(ValidationExecutor):
    """Custom executor for your framework."""

    def get_records(self, model_name: str):
        # Return records from your data source.
        # `my_database` is a placeholder for your own data access layer.
        return my_database.query(model_name).all()

    def filter_records(self, records, condition):
        # Apply filtering logic
        return [r for r in records if self.evaluate_condition(r, condition)]

    def count_records(self, records):
        return len(list(records))

    def get_field_value(self, record, field_name: str):
        # Get field value from your record type
        return getattr(record, field_name)

# Parse DQL and execute validation
dql_text = """
from Customer
expect column("email") to_not_be_null
expect column("age") to_be_between(18, 100)
"""

ast = parse_dql(dql_text)
executor = MyExecutor()
result = executor.execute(ast)

print(f"Validation passed: {result.overall_passed}")
print(f"Expectations: {result.total_expectations}")
print(f"Failed: {result.failed_expectations}")
```

## Features

### Abstract Validation Framework

- **6 Built-in Validators**: `to_be_null`, `to_not_be_null`, `to_match_pattern`, `to_be_between`, `to_be_in`, `to_be_unique`
- **Custom Validators**: Extend `Validator` base class
- **Validator Registry**: Register validators for operators
- **Framework-Agnostic**: Works with any data source (Django ORM, SQLAlchemy, Pandas, raw dicts)
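
To make the registry idea concrete, here is a minimal standalone sketch of how operator names can map to validator classes. It does not import `dql-core`: `SketchValidator`, `register`, `REGISTRY`, and the `to_be_positive` operator are hypothetical names chosen for illustration, and the real `Validator` base class may expose a different interface.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Type

# Hypothetical registry: operator name -> validator class.
REGISTRY: Dict[str, Type["SketchValidator"]] = {}


class SketchValidator(ABC):
    """Illustrative base class; not the dql-core `Validator`."""

    @abstractmethod
    def is_valid(self, value: Any) -> bool:
        """Return True when `value` satisfies the expectation."""


def register(operator: str) -> Callable[[Type[SketchValidator]], Type[SketchValidator]]:
    """Class decorator that files a validator under an operator name."""
    def decorator(cls: Type[SketchValidator]) -> Type[SketchValidator]:
        REGISTRY[operator] = cls
        return cls
    return decorator


@register("to_not_be_null")
class NotNull(SketchValidator):
    def is_valid(self, value: Any) -> bool:
        return value is not None


@register("to_be_positive")  # a custom operator, invented for this sketch
class Positive(SketchValidator):
    def is_valid(self, value: Any) -> bool:
        return value is not None and value > 0


def check(operator: str, value: Any) -> bool:
    """Look up the validator registered for `operator` and apply it."""
    return REGISTRY[operator]().is_valid(value)
```

Keeping the lookup in one dictionary means a new operator only needs a new class and a decorator line; the code that dispatches on operator names stays unchanged.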

### Abstract Executor

- **Template Method Pattern**: Implement 4 abstract methods (`get_records`, `filter_records`, `count_records`, `get_field_value`) and inherit the full validation logic
- **Multi-Model Support**: Validate multiple models in one DQL file
- **Rich Results**: Detailed validation results with failure info
- **Severity Levels**: Support for critical, warning, info
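
The template method idea can be sketched without `dql-core` at all. In the example below, `SketchExecutor`, `DictExecutor`, and `count_nulls` are hypothetical names, and the shared logic is deliberately tiny; the real `ValidationExecutor` does considerably more. The hook names mirror the four methods from the Quick Start.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Iterable, List, Tuple


class SketchExecutor(ABC):
    """Illustrative template-method base class; not dql-core's ValidationExecutor."""

    @abstractmethod
    def get_records(self, model_name: str) -> Iterable[Any]: ...

    @abstractmethod
    def filter_records(self, records: Iterable[Any], condition: Callable[[Any], bool]) -> Iterable[Any]: ...

    @abstractmethod
    def count_records(self, records: Iterable[Any]) -> int: ...

    @abstractmethod
    def get_field_value(self, record: Any, field_name: str) -> Any: ...

    def count_nulls(self, model_name: str, field_name: str) -> Tuple[int, int]:
        """Template method: shared logic that only calls the four hooks above."""
        records = list(self.get_records(model_name))
        nulls = self.filter_records(
            records, lambda r: self.get_field_value(r, field_name) is None
        )
        return self.count_records(nulls), self.count_records(records)


class DictExecutor(SketchExecutor):
    """Backs the hooks with an in-memory mapping of model name to rows."""

    def __init__(self, tables: Dict[str, List[dict]]):
        self.tables = tables

    def get_records(self, model_name):
        return self.tables[model_name]

    def filter_records(self, records, condition):
        # In this sketch a condition is simply a predicate function.
        return [r for r in records if condition(r)]

    def count_records(self, records):
        return len(list(records))

    def get_field_value(self, record, field_name):
        return record.get(field_name)


executor = DictExecutor({"Customer": [{"email": "a@example.com"}, {"email": None}]})
failed, total = executor.count_nulls("Customer", "email")
print(f"{failed} of {total} records have a null email")
```

Swapping `DictExecutor` for a class backed by an ORM or a DataFrame would leave `count_nulls` untouched, which is the point of the pattern.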

### Cleaner Framework

Cleaners automatically remediate data quality issues when expectations fail.

- **8 Built-in Cleaners**: String normalization, phone/date formatting, NULL handling
- **Custom Cleaners**: Build your own with `@cleaner` decorator
- **Cleaner Chains**: Execute multiple cleaners sequentially
- **Transaction Safety**: Automatic rollback on failure
- **Audit Logging**: Track all modifications
- **Dry-Run Mode**: Preview changes before applying

**Quick Example:**
```python
from dql_core import (
    CleanerChain,
    DictTransactionManager,
    SafeCleanerExecutor,
    normalize_email,
)

# Single cleaner (the address is a placeholder; surrounding whitespace is stripped)
record = {'email': '  user@example.com  '}
cleaner = normalize_email('email')
result = cleaner(record, {})
print(record['email'])  # 'user@example.com'

# Cleaner chain
chain = (CleanerChain()
    .add('trim_whitespace', 'email')
    .add('lowercase', 'email'))
result = chain.execute(record, {})

# Transaction safety
# NOTE: `cleaners` is not defined in this snippet; supply the cleaners you want to run.
manager = DictTransactionManager()
executor = SafeCleanerExecutor(manager)
result = executor.execute_cleaners(cleaners, record, {})
# Automatic rollback if any cleaner fails
```
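
The rollback and dry-run behaviour listed above can be illustrated with plain dictionaries. This is a standalone sketch under stated assumptions, not how `SafeCleanerExecutor` or `DictTransactionManager` are implemented: `run_cleaners`, `trim_email`, `lowercase_email`, and the one-argument cleaner signature are all hypothetical.

```python
import copy
from typing import Callable, Dict, List, Tuple

Cleaner = Callable[[dict], None]  # hypothetical signature: mutates the record in place


def trim_email(record: dict) -> None:
    record["email"] = record["email"].strip()


def lowercase_email(record: dict) -> None:
    record["email"] = record["email"].lower()


def run_cleaners(
    record: dict, cleaners: List[Cleaner], dry_run: bool = False
) -> Dict[str, Tuple[object, object]]:
    """Apply cleaners in order; restore the record if one fails or on a dry run.

    Returns a {field: (before, after)} audit of the fields that changed.
    """
    snapshot = copy.deepcopy(record)
    try:
        for cleaner in cleaners:
            cleaner(record)
    except Exception:
        record.clear()
        record.update(snapshot)  # roll back every partial change
        raise
    changes = {
        key: (snapshot.get(key), value)
        for key, value in record.items()
        if snapshot.get(key) != value
    }
    if dry_run:
        record.clear()
        record.update(snapshot)  # report the changes without keeping them
    return changes


record = {"email": "  User@Example.COM  "}
preview = run_cleaners(record, [trim_email, lowercase_email], dry_run=True)
print(preview)   # the record itself is still untouched at this point
applied = run_cleaners(record, [trim_email, lowercase_email])
print(record)
```

Taking a deep copy up front is the simplest way to get all-or-nothing semantics on a dict; a database-backed manager would presumably rely on real transactions instead.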

**Documentation:**
- **[Tutorial](docs/tutorial.md)** - 5-minute quickstart
- **[Cleaner Catalog](docs/cleaner-catalog.md)** - All 8 built-in cleaners
- **[Custom Cleaners Guide](docs/custom-cleaners-guide.md)** - Build your own
- **[Best Practices](docs/cleaner-best-practices.md)** - Performance and security
- **[Troubleshooting](docs/troubleshooting.md)** - Common issues
- **[Examples](examples/)** - Runnable code samples

### External API Adapters

- **Adapter Pattern**: Create adapters for external APIs
- **Rate Limiting**: Built-in rate limiter
- **Retry Logic**: Exponential backoff retry utilities
- **Factory Pattern**: APIAdapterFactory for creating adapters
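
For readers unfamiliar with exponential backoff, the following standalone sketch shows the general technique. It is not the retry utility shipped with `dql-core`; `retry_with_backoff` and its parameters are hypothetical, and the `sleep` argument exists only so the delays can be observed without waiting.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, doubling the wait after each failure (0.5s, 1s, 2s, ...)."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")


calls = {"n": 0}
delays: List[float] = []


def flaky() -> str:
    """Fails twice, then succeeds, to simulate a temporarily unavailable API."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Record the delays instead of actually sleeping.
result = retry_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

A production version would usually catch only transient error types and add random jitter to the delay so that many clients do not retry in lockstep.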

## Built-in Validators

### to_be_null / to_not_be_null
```python
expect column("optional_field") to_be_null
expect column("required_field") to_not_be_null
```

### to_match_pattern
```python
expect column("email") to_match_pattern("^[\\w\\.-]+@[\\w\\.-]+\\.\\w+$")
expect column("phone") to_match_pattern("^\\d{3}-\\d{3}-\\d{4}$")
```

### to_be_between
```python
expect column("age") to_be_between(18, 120)
expect column("price") to_be_between(0.0, 9999.99)
```

### to_be_in
```python
expect column("status") to_be_in("active", "inactive", "pending")
expect column("category") to_be_in("A", "B", "C")
```

### to_be_unique
```python
expect column("email") to_be_unique
expect column("username") to_be_unique
```
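
As a rough guide to what these expectations check, the sketch below expresses each one as a plain-Python predicate over a list of dicts. It is an approximation written for this README, not the library's implementation: inclusive bounds for `to_be_between`, treating null as a pattern failure, and ignoring nulls in the uniqueness check are all assumptions, and `failures` is a hypothetical helper.

```python
import re
from collections import Counter

rows = [
    {"email": "ann@example.com", "age": 34, "status": "active"},
    {"email": "bob@example.com", "age": 17, "status": "archived"},
    {"email": "ann@example.com", "age": 52, "status": "pending"},
    {"email": None, "age": 41, "status": "inactive"},
]


def failures(rows, field, predicate):
    """Return the indexes of rows whose `field` value fails `predicate`."""
    return [i for i, row in enumerate(rows) if not predicate(row[field])]


# to_not_be_null
null_emails = failures(rows, "email", lambda v: v is not None)

# to_match_pattern (a null value is treated as a failure in this sketch)
pattern = re.compile(r"^[\w.-]+@[\w.-]+\.\w+$")
bad_format = failures(rows, "email", lambda v: v is not None and bool(pattern.search(v)))

# to_be_between(18, 120), assuming inclusive bounds
out_of_range = failures(rows, "age", lambda v: 18 <= v <= 120)

# to_be_in("active", "inactive", "pending")
bad_status = failures(rows, "status", lambda v: v in {"active", "inactive", "pending"})

# to_be_unique: a row fails if its value appears more than once (nulls ignored here)
counts = Counter(row["email"] for row in rows if row["email"] is not None)
duplicates = [i for i, row in enumerate(rows) if counts.get(row["email"], 0) > 1]

print(null_emails, bad_format, out_of_range, bad_status, duplicates)
```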

## Development

```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black dql_core tests
```

## License

MIT License - see [LICENSE](LICENSE) file for details.

## Documentation

**Full documentation:** https://yourusername.github.io/dql-core/

- [Concepts](https://yourusername.github.io/dql-core/concepts/) - Core architecture
- [Validator Guide](https://yourusername.github.io/dql-core/validator-guide/) - Create custom validators
- [Cleaner Guide](https://yourusername.github.io/dql-core/cleaner-guide/) - Write cleaner functions
- [Executor Guide](https://yourusername.github.io/dql-core/executor-guide/) - Implement framework executors
- [API Reference](https://yourusername.github.io/dql-core/api-reference/) - Complete API

## Related Packages

- **[dql-parser](https://github.com/dql-project/dql-parser)** - Pure Python DQL parser ([docs](https://yourusername.github.io/dql-parser/))
- **[django-dqm](https://github.com/dql-project/django-dqm)** - Django integration ([docs](https://yourusername.github.io/django-dqm/))

## Package Selection

Not sure which package to use? See the **[Package Selection Guide](https://yourusername.github.io/django-dqm/package-selection/)**.

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "dql-core",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "dql, validation, data-quality, framework-agnostic",
    "author": null,
    "author_email": "\"stratege.ai Team\" <info@stragete.ai>",
    "download_url": "https://files.pythonhosted.org/packages/a2/6c/5e6abf7d2b27a631c121ca61a56e14304476ea37430e92c5b9788a83301b/dql_core-0.5.2.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Framework-agnostic validation engine for Data Quality Language (DQL)",
    "version": "0.5.2",
    "project_urls": {
        "Changelog": "https://github.com/stratege.ai/dql-core/blob/main/CHANGELOG.md",
        "Documentation": "https://dql-core.readthedocs.io",
        "Homepage": "https://stratege.ai/tools/dql-core",
        "Issues": "https://github.com/stratege.ai/dql-core/issues",
        "Repository": "https://github.com/stratege.ai/dql-core"
    },
    "split_keywords": [
        "dql",
        " validation",
        " data-quality",
        " framework-agnostic"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "197495839b74c573235e0038e489acccea2988e49d03bbab2df46ac9cf412a67",
                "md5": "2b8107bf174bbf46e1e6120739325c3f",
                "sha256": "94e13b818bf08bb22247d7f69fd6df86df3a6ac3e159f847cc2c752fccedc01b"
            },
            "downloads": -1,
            "filename": "dql_core-0.5.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2b8107bf174bbf46e1e6120739325c3f",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 48127,
            "upload_time": "2025-10-10T18:28:20",
            "upload_time_iso_8601": "2025-10-10T18:28:20.859905Z",
            "url": "https://files.pythonhosted.org/packages/19/74/95839b74c573235e0038e489acccea2988e49d03bbab2df46ac9cf412a67/dql_core-0.5.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "a26c5e6abf7d2b27a631c121ca61a56e14304476ea37430e92c5b9788a83301b",
                "md5": "d1c868a9e1b6f5fa65cfd59d07c86798",
                "sha256": "f75dae6f28cee59ca86c61eeccf354bd7ccc67810b4c7b917ae1f024eadf981e"
            },
            "downloads": -1,
            "filename": "dql_core-0.5.2.tar.gz",
            "has_sig": false,
            "md5_digest": "d1c868a9e1b6f5fa65cfd59d07c86798",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 106278,
            "upload_time": "2025-10-10T18:28:21",
            "upload_time_iso_8601": "2025-10-10T18:28:21.869593Z",
            "url": "https://files.pythonhosted.org/packages/a2/6c/5e6abf7d2b27a631c121ca61a56e14304476ea37430e92c5b9788a83301b/dql_core-0.5.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-10 18:28:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "stratege.ai",
    "github_project": "dql-core",
    "github_not_found": true,
    "lcname": "dql-core"
}
