query-farm-sql-manipulation


Namequery-farm-sql-manipulation JSON
Version 0.1.7 PyPI version JSON
download
home_pageNone
SummaryA Python library for intelligent SQL predicate manipulation using SQLGlot. This library provides tools to safely remove specific predicates from `SQL WHERE` clauses and filter SQL statements based on column availability.
upload_time2025-07-10 22:31:37
maintainerNone
docs_urlNone
authorNone
requires_python>=3.11
licenseNone
keywords predicate parsing predicates sql sql parsing
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            # [Query.Farm](https://query.farm) SQL Manipulation

A Python library for intelligent SQL predicate manipulation using [SQLGlot](https://sqlglot.org)

### Column Filtering with Complex Expressions

```python
import sqlglot
from query_farm_sql_manipulation import transforms

sql = '''
SELECT * FROM users
WHERE age > 18
  AND (status = 'active' OR role = 'admin')
  AND department IN ('engineering', 'sales')
'''

# Parse the statement first
statement = sqlglot.parse_one(sql, dialect="duckdb")

# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}

result = transforms.filter_column_references(
    statement=statement,
    selector=lambda col: col.name in allowed_columns,
)

# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
print(result.sql())
```

## Features

- **Predicate Removal**: Safely remove specific predicates from complex `SQL WHERE` clauses while preserving logical structure
- **Column Filtering**: Filter SQL statements to only include predicates referencing allowed columns
- **Intelligent Logic Handling**: Properly handles `AND/OR` logic, nested expressions, `CASE` statements, and parentheses
- **SQLGlot Integration**: Built on top of [SQLGlot](https://sqlglot.com/sqlglot.html) for robust SQL parsing and manipulation
- **Multiple Dialect Support**: Works with various SQL dialects (default: DuckDB)

## Installation

```bash
pip install query-farm-sql-manipulation
```

## Requirements

- Python >= 3.12
- SQLGlot >= 26.33.0

## Quick Start

### Basic Predicate Removal

```python
import sqlglot
from query_farm_sql_manipulation import transforms

# Parse a SQL statement
sql = 'SELECT * FROM data WHERE x = 1 AND y = 2'
statement = sqlglot.parse_one(sql, dialect="duckdb")

# Find the predicate you want to remove
predicates = list(statement.find_all(sqlglot.expressions.Predicate))
target_predicate = predicates[0]  # x = 1

# Remove the predicate
transforms.remove_expression_part(target_predicate)

# Result: SELECT * FROM data WHERE y = 2
print(statement.sql())
```

### Column-Name Based Filtering

```python
import sqlglot
from query_farm_sql_manipulation import transforms

# Parse SQL statement first
sql = 'SELECT * FROM data WHERE color = "red" AND size > 10 AND type = "car"'
statement = sqlglot.parse_one(sql, dialect="duckdb")

# Filter to only include predicates with allowed columns
allowed_columns = {"color", "type"}

filtered = transforms.filter_column_references(
    statement=statement,
    selector=lambda col: col.name in allowed_columns,
)

# Result: SELECT * FROM data WHERE color = "red" AND type = "car"
print(filtered.sql())
```

## API Reference

### `remove_expression_part(child: sqlglot.Expression) -> None`

Removes the specified SQLGlot expression from its parent, respecting logical structure.

**Parameters:**
- `child`: The SQLGlot expression to remove

**Raises:**
- `ValueError`: If the expression cannot be safely removed

**Supported Parent Types:**
- `AND`/`OR` expressions: Replaces parent with the remaining operand
- `WHERE` clauses: Removes the entire WHERE clause if it becomes empty
- `Parentheses`: Recursively removes the parent
- `NOT` expressions: Removes the entire NOT expression
- `CASE` statements: Removes conditional branches

### `filter_column_references(*, statement: sqlglot.Expression, selector: Callable[[sqlglot.expressions.Column], bool]) -> sqlglot.Expression`

Filters a SQL statement to remove predicates containing columns that don't match the selector criteria.

**Parameters:**
- `statement`: The SQLGlot expression to filter
- `selector`: A callable that takes a Column and returns True if it should be preserved, False if it should be removed

**Returns:**
- Filtered SQLGlot expression with non-matching columns removed

**Raises:**
- `ValueError`: If a column can't be cleanly removed due to interactions with allowed columns

### `where_clause_contents(statement: sqlglot.expressions.Expression) -> sqlglot.expressions.Expression | None`

Extracts the contents of the WHERE clause from a SQLGlot expression.

**Parameters:**
- `statement`: The SQLGlot expression to extract from

**Returns:**
- The contents of the WHERE clause, or None if no WHERE clause exists

### `filter_predicates_with_right_side_column_references(statement: sqlglot.expressions.Expression) -> sqlglot.Expression`

Filters out predicates that have column references on the right side of comparisons.

**Parameters:**
- `statement`: The SQLGlot expression to filter

**Returns:**
- Filtered SQLGlot expression with right-side column reference predicates removed

## Examples

### Complex Logic Handling

The library intelligently handles complex logical expressions:

```python
# Original: (x = 1 AND y = 2) OR z = 3
# Remove y = 2: x = 1 OR z = 3

# Original: NOT (x = 1 AND y = 2)
# Remove x = 1: NOT y = 2 (which becomes y <> 2)

# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END
# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END
```

### Column Filtering with Complex Expressions

```python
sql = '''
SELECT * FROM users
WHERE age > 18
  AND (status = 'active' OR role = 'admin')
  AND department IN ('engineering', 'sales')
'''

# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}

result = transforms.filter_column_references_statement(
    sql=sql,
    selector=lambda col: col.name in allowed_columns,
)

# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
```

### Error Handling

The library will raise `ValueError` when predicates cannot be safely removed:

```python
import sqlglot
from query_farm_sql_manipulation import transforms

# This will raise ValueError because x = 1 is part of a larger expression
sql = "SELECT * FROM data WHERE result = (x = 1)"
statement = sqlglot.parse_one(sql, dialect="duckdb")

# Cannot remove x = 1 because it's used as a value, not a predicate
# This would raise ValueError if attempted
```

## Supported SQL Constructs

- **Logical Operators**: `AND`, `OR`, `NOT`
- **Comparison Operators**: `=`, `<>`, `<`, `>`, `<=`, `>=`, `LIKE`, `IN`, `IS NULL`, etc.
- **Complex Expressions**: `CASE` statements, subqueries, function calls
- **Nested Logic**: Parentheses and nested boolean expressions
- **Multiple Dialects**: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot

## Testing

Run the test suite:

```bash
pytest src/query_farm_sql_manipulation/test_transforms.py
```

The test suite includes comprehensive examples of:
- Basic predicate removal scenarios
- Complex logical expression handling
- Error cases and edge conditions
- Column filtering with various SQL constructs

## Contributing

This project uses:
- **Rye** for dependency management
- **pytest** for testing
- **mypy** for type checking
- **ruff** for linting


## Author

This Python module was created by [Query.Farm](https://query.farm).

# License

MIT Licensed.


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "query-farm-sql-manipulation",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "predicate parsing, predicates, sql, sql parsing",
    "author": null,
    "author_email": "Rusty Conover <rusty@query.farm>",
    "download_url": "https://files.pythonhosted.org/packages/91/5f/d0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025/query_farm_sql_manipulation-0.1.7.tar.gz",
    "platform": null,
    "description": "# [Query.Farm](https://query.farm) SQL Manipulation\n\nA Python library for intelligent SQL predicate manipulation using [SQLGlot](https://sqlglot.org)\n\n### Column Filtering with Complex Expressions\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\nsql = '''\nSELECT * FROM users\nWHERE age > 18\n  AND (status = 'active' OR role = 'admin')\n  AND department IN ('engineering', 'sales')\n'''\n\n# Parse the statement first\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Only keep predicates involving 'age' and 'role'\nallowed_columns = {'age', 'role'}\n\nresult = transforms.filter_column_references(\n    statement=statement,\n    selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'\nprint(result.sql())\n```\n\n## Features\n\n- **Predicate Removal**: Safely remove specific predicates from complex `SQL WHERE` clauses while preserving logical structure\n- **Column Filtering**: Filter SQL statements to only include predicates referencing allowed columns\n- **Intelligent Logic Handling**: Properly handles `AND/OR` logic, nested expressions, `CASE` statements, and parentheses\n- **SQLGlot Integration**: Built on top of [SQLGlot](https://sqlglot.com/sqlglot.html) for robust SQL parsing and manipulation\n- **Multiple Dialect Support**: Works with various SQL dialects (default: DuckDB)\n\n## Installation\n\n```bash\npip install query-farm-sql-manipulation\n```\n\n## Requirements\n\n- Python >= 3.12\n- SQLGlot >= 26.33.0\n\n## Quick Start\n\n### Basic Predicate Removal\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# Parse a SQL statement\nsql = 'SELECT * FROM data WHERE x = 1 AND y = 2'\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Find the predicate you want to remove\npredicates = list(statement.find_all(sqlglot.expressions.Predicate))\ntarget_predicate = predicates[0]  # x = 1\n\n# Remove the predicate\ntransforms.remove_expression_part(target_predicate)\n\n# Result: SELECT * FROM data WHERE y = 2\nprint(statement.sql())\n```\n\n### Column-Name Based Filtering\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# Parse SQL statement first\nsql = 'SELECT * FROM data WHERE color = \"red\" AND size > 10 AND type = \"car\"'\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Filter to only include predicates with allowed columns\nallowed_columns = {\"color\", \"type\"}\n\nfiltered = transforms.filter_column_references(\n    statement=statement,\n    selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM data WHERE color = \"red\" AND type = \"car\"\nprint(filtered.sql())\n```\n\n## API Reference\n\n### `remove_expression_part(child: sqlglot.Expression) -> None`\n\nRemoves the specified SQLGlot expression from its parent, respecting logical structure.\n\n**Parameters:**\n- `child`: The SQLGlot expression to remove\n\n**Raises:**\n- `ValueError`: If the expression cannot be safely removed\n\n**Supported Parent Types:**\n- `AND`/`OR` expressions: Replaces parent with the remaining operand\n- `WHERE` clauses: Removes the entire WHERE clause if it becomes empty\n- `Parentheses`: Recursively removes the parent\n- `NOT` expressions: Removes the entire NOT expression\n- `CASE` statements: Removes conditional branches\n\n### `filter_column_references(*, statement: sqlglot.Expression, selector: Callable[[sqlglot.expressions.Column], bool]) -> sqlglot.Expression`\n\nFilters a SQL statement to remove predicates containing columns that don't match the selector criteria.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to filter\n- `selector`: A callable that takes a Column and returns True if it should be preserved, False if it should be removed\n\n**Returns:**\n- Filtered SQLGlot expression with non-matching columns removed\n\n**Raises:**\n- `ValueError`: If a column can't be cleanly removed due to interactions with allowed columns\n\n### `where_clause_contents(statement: sqlglot.expressions.Expression) -> sqlglot.expressions.Expression | None`\n\nExtracts the contents of the WHERE clause from a SQLGlot expression.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to extract from\n\n**Returns:**\n- The contents of the WHERE clause, or None if no WHERE clause exists\n\n### `filter_predicates_with_right_side_column_references(statement: sqlglot.expressions.Expression) -> sqlglot.Expression`\n\nFilters out predicates that have column references on the right side of comparisons.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to filter\n\n**Returns:**\n- Filtered SQLGlot expression with right-side column reference predicates removed\n\n## Examples\n\n### Complex Logic Handling\n\nThe library intelligently handles complex logical expressions:\n\n```python\n# Original: (x = 1 AND y = 2) OR z = 3\n# Remove y = 2: x = 1 OR z = 3\n\n# Original: NOT (x = 1 AND y = 2)\n# Remove x = 1: NOT y = 2 (which becomes y <> 2)\n\n# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END\n# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END\n```\n\n### Column Filtering with Complex Expressions\n\n```python\nsql = '''\nSELECT * FROM users\nWHERE age > 18\n  AND (status = 'active' OR role = 'admin')\n  AND department IN ('engineering', 'sales')\n'''\n\n# Only keep predicates involving 'age' and 'role'\nallowed_columns = {'age', 'role'}\n\nresult = transforms.filter_column_references_statement(\n    sql=sql,\n    selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'\n```\n\n### Error Handling\n\nThe library will raise `ValueError` when predicates cannot be safely removed:\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# This will raise ValueError because x = 1 is part of a larger expression\nsql = \"SELECT * FROM data WHERE result = (x = 1)\"\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Cannot remove x = 1 because it's used as a value, not a predicate\n# This would raise ValueError if attempted\n```\n\n## Supported SQL Constructs\n\n- **Logical Operators**: `AND`, `OR`, `NOT`\n- **Comparison Operators**: `=`, `<>`, `<`, `>`, `<=`, `>=`, `LIKE`, `IN`, `IS NULL`, etc.\n- **Complex Expressions**: `CASE` statements, subqueries, function calls\n- **Nested Logic**: Parentheses and nested boolean expressions\n- **Multiple Dialects**: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot\n\n## Testing\n\nRun the test suite:\n\n```bash\npytest src/query_farm_sql_manipulation/test_transforms.py\n```\n\nThe test suite includes comprehensive examples of:\n- Basic predicate removal scenarios\n- Complex logical expression handling\n- Error cases and edge conditions\n- Column filtering with various SQL constructs\n\n## Contributing\n\nThis project uses:\n- **Rye** for dependency management\n- **pytest** for testing\n- **mypy** for type checking\n- **ruff** for linting\n\n\n## Author\n\nThis Python module was created by [Query.Farm](https://query.farm).\n\n# License\n\nMIT Licensed.\n\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "A Python library for intelligent SQL predicate manipulation using SQLGlot. This library provides tools to safely remove specific predicates from `SQL WHERE` clauses and filter SQL statements based on column availability.",
    "version": "0.1.7",
    "project_urls": {
        "Issues": "https://github.com/query-farm/python-sql-manipulation/issues",
        "Repository": "https://github.com/query-farm/python-sql-manipulation.git"
    },
    "split_keywords": [
        "predicate parsing",
        " predicates",
        " sql",
        " sql parsing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0b98a09a5d90ea27cb7b84537cb90fdf9ae2628f78ab0b965e0c5207840656a9",
                "md5": "a6a0babef23a4be3b602099108ae6ea1",
                "sha256": "b1c7046ff506ff118fa4cc86524a778976507e6d853ee146a6d784097e5620fa"
            },
            "downloads": -1,
            "filename": "query_farm_sql_manipulation-0.1.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a6a0babef23a4be3b602099108ae6ea1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 9722,
            "upload_time": "2025-07-10T22:31:35",
            "upload_time_iso_8601": "2025-07-10T22:31:35.378870Z",
            "url": "https://files.pythonhosted.org/packages/0b/98/a09a5d90ea27cb7b84537cb90fdf9ae2628f78ab0b965e0c5207840656a9/query_farm_sql_manipulation-0.1.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "915fd0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025",
                "md5": "bccbaba254efde48cfcc30dcc7d72276",
                "sha256": "6a31857fed4e507d94591b941bcd0b8aa397a46883d165b57f5e7577255bd1d3"
            },
            "downloads": -1,
            "filename": "query_farm_sql_manipulation-0.1.7.tar.gz",
            "has_sig": false,
            "md5_digest": "bccbaba254efde48cfcc30dcc7d72276",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 9032,
            "upload_time": "2025-07-10T22:31:37",
            "upload_time_iso_8601": "2025-07-10T22:31:37.203277Z",
            "url": "https://files.pythonhosted.org/packages/91/5f/d0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025/query_farm_sql_manipulation-0.1.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-10 22:31:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "query-farm",
    "github_project": "python-sql-manipulation",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "query-farm-sql-manipulation"
}
        
Elapsed time: 0.78821s