Name | query-farm-sql-manipulation JSON |
Version |
0.1.7
JSON |
| download |
home_page | None |
Summary | A Python library for intelligent SQL predicate manipulation using SQLGlot. This library provides tools to safely remove specific predicates from `SQL WHERE` clauses and filter SQL statements based on column availability. |
upload_time | 2025-07-10 22:31:37 |
maintainer | None |
docs_url | None |
author | None |
requires_python | >=3.11 |
license | None |
keywords |
predicate parsing
predicates
sql
sql parsing
|
VCS |
 |
bugtrack_url |
|
requirements |
No requirements were recorded.
|
Travis-CI |
No Travis.
|
coveralls test coverage |
No coveralls.
|
# [Query.Farm](https://query.farm) SQL Manipulation
A Python library for intelligent SQL predicate manipulation using [SQLGlot](https://sqlglot.org)
### Column Filtering with Complex Expressions
```python
import sqlglot
from query_farm_sql_manipulation import transforms
sql = '''
SELECT * FROM users
WHERE age > 18
AND (status = 'active' OR role = 'admin')
AND department IN ('engineering', 'sales')
'''
# Parse the statement first
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}
result = transforms.filter_column_references(
statement=statement,
selector=lambda col: col.name in allowed_columns,
)
# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
print(result.sql())
```
## Features
- **Predicate Removal**: Safely remove specific predicates from complex `SQL WHERE` clauses while preserving logical structure
- **Column Filtering**: Filter SQL statements to only include predicates referencing allowed columns
- **Intelligent Logic Handling**: Properly handles `AND/OR` logic, nested expressions, `CASE` statements, and parentheses
- **SQLGlot Integration**: Built on top of [SQLGlot](https://sqlglot.com/sqlglot.html) for robust SQL parsing and manipulation
- **Multiple Dialect Support**: Works with various SQL dialects (default: DuckDB)
## Installation
```bash
pip install query-farm-sql-manipulation
```
## Requirements
- Python >= 3.12
- SQLGlot >= 26.33.0
## Quick Start
### Basic Predicate Removal
```python
import sqlglot
from query_farm_sql_manipulation import transforms
# Parse a SQL statement
sql = 'SELECT * FROM data WHERE x = 1 AND y = 2'
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Find the predicate you want to remove
predicates = list(statement.find_all(sqlglot.expressions.Predicate))
target_predicate = predicates[0] # x = 1
# Remove the predicate
transforms.remove_expression_part(target_predicate)
# Result: SELECT * FROM data WHERE y = 2
print(statement.sql())
```
### Column-Name Based Filtering
```python
import sqlglot
from query_farm_sql_manipulation import transforms
# Parse SQL statement first
sql = 'SELECT * FROM data WHERE color = "red" AND size > 10 AND type = "car"'
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Filter to only include predicates with allowed columns
allowed_columns = {"color", "type"}
filtered = transforms.filter_column_references(
statement=statement,
selector=lambda col: col.name in allowed_columns,
)
# Result: SELECT * FROM data WHERE color = "red" AND type = "car"
print(filtered.sql())
```
## API Reference
### `remove_expression_part(child: sqlglot.Expression) -> None`
Removes the specified SQLGlot expression from its parent, respecting logical structure.
**Parameters:**
- `child`: The SQLGlot expression to remove
**Raises:**
- `ValueError`: If the expression cannot be safely removed
**Supported Parent Types:**
- `AND`/`OR` expressions: Replaces parent with the remaining operand
- `WHERE` clauses: Removes the entire WHERE clause if it becomes empty
- `Parentheses`: Recursively removes the parent
- `NOT` expressions: Removes the entire NOT expression
- `CASE` statements: Removes conditional branches
### `filter_column_references(*, statement: sqlglot.Expression, selector: Callable[[sqlglot.expressions.Column], bool]) -> sqlglot.Expression`
Filters a SQL statement to remove predicates containing columns that don't match the selector criteria.
**Parameters:**
- `statement`: The SQLGlot expression to filter
- `selector`: A callable that takes a Column and returns True if it should be preserved, False if it should be removed
**Returns:**
- Filtered SQLGlot expression with non-matching columns removed
**Raises:**
- `ValueError`: If a column can't be cleanly removed due to interactions with allowed columns
### `where_clause_contents(statement: sqlglot.expressions.Expression) -> sqlglot.expressions.Expression | None`
Extracts the contents of the WHERE clause from a SQLGlot expression.
**Parameters:**
- `statement`: The SQLGlot expression to extract from
**Returns:**
- The contents of the WHERE clause, or None if no WHERE clause exists
### `filter_predicates_with_right_side_column_references(statement: sqlglot.expressions.Expression) -> sqlglot.Expression`
Filters out predicates that have column references on the right side of comparisons.
**Parameters:**
- `statement`: The SQLGlot expression to filter
**Returns:**
- Filtered SQLGlot expression with right-side column reference predicates removed
## Examples
### Complex Logic Handling
The library intelligently handles complex logical expressions:
```python
# Original: (x = 1 AND y = 2) OR z = 3
# Remove y = 2: x = 1 OR z = 3
# Original: NOT (x = 1 AND y = 2)
# Remove x = 1: NOT y = 2 (which becomes y <> 2)
# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END
# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END
```
### Column Filtering with Complex Expressions
```python
sql = '''
SELECT * FROM users
WHERE age > 18
AND (status = 'active' OR role = 'admin')
AND department IN ('engineering', 'sales')
'''
# Only keep predicates involving 'age' and 'role'
allowed_columns = {'age', 'role'}
result = transforms.filter_column_references_statement(
sql=sql,
selector=lambda col: col.name in allowed_columns,
)
# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'
```
### Error Handling
The library will raise `ValueError` when predicates cannot be safely removed:
```python
import sqlglot
from query_farm_sql_manipulation import transforms
# This will raise ValueError because x = 1 is part of a larger expression
sql = "SELECT * FROM data WHERE result = (x = 1)"
statement = sqlglot.parse_one(sql, dialect="duckdb")
# Cannot remove x = 1 because it's used as a value, not a predicate
# This would raise ValueError if attempted
```
## Supported SQL Constructs
- **Logical Operators**: `AND`, `OR`, `NOT`
- **Comparison Operators**: `=`, `<>`, `<`, `>`, `<=`, `>=`, `LIKE`, `IN`, `IS NULL`, etc.
- **Complex Expressions**: `CASE` statements, subqueries, function calls
- **Nested Logic**: Parentheses and nested boolean expressions
- **Multiple Dialects**: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot
## Testing
Run the test suite:
```bash
pytest src/query_farm_sql_manipulation/test_transforms.py
```
The test suite includes comprehensive examples of:
- Basic predicate removal scenarios
- Complex logical expression handling
- Error cases and edge conditions
- Column filtering with various SQL constructs
## Contributing
This project uses:
- **Rye** for dependency management
- **pytest** for testing
- **mypy** for type checking
- **ruff** for linting
## Author
This Python module was created by [Query.Farm](https://query.farm).
# License
MIT Licensed.
Raw data
{
"_id": null,
"home_page": null,
"name": "query-farm-sql-manipulation",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "predicate parsing, predicates, sql, sql parsing",
"author": null,
"author_email": "Rusty Conover <rusty@query.farm>",
"download_url": "https://files.pythonhosted.org/packages/91/5f/d0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025/query_farm_sql_manipulation-0.1.7.tar.gz",
"platform": null,
"description": "# [Query.Farm](https://query.farm) SQL Manipulation\n\nA Python library for intelligent SQL predicate manipulation using [SQLGlot](https://sqlglot.org)\n\n### Column Filtering with Complex Expressions\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\nsql = '''\nSELECT * FROM users\nWHERE age > 18\n AND (status = 'active' OR role = 'admin')\n AND department IN ('engineering', 'sales')\n'''\n\n# Parse the statement first\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Only keep predicates involving 'age' and 'role'\nallowed_columns = {'age', 'role'}\n\nresult = transforms.filter_column_references(\n statement=statement,\n selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'\nprint(result.sql())\n```\n\n## Features\n\n- **Predicate Removal**: Safely remove specific predicates from complex `SQL WHERE` clauses while preserving logical structure\n- **Column Filtering**: Filter SQL statements to only include predicates referencing allowed columns\n- **Intelligent Logic Handling**: Properly handles `AND/OR` logic, nested expressions, `CASE` statements, and parentheses\n- **SQLGlot Integration**: Built on top of [SQLGlot](https://sqlglot.com/sqlglot.html) for robust SQL parsing and manipulation\n- **Multiple Dialect Support**: Works with various SQL dialects (default: DuckDB)\n\n## Installation\n\n```bash\npip install query-farm-sql-manipulation\n```\n\n## Requirements\n\n- Python >= 3.12\n- SQLGlot >= 26.33.0\n\n## Quick Start\n\n### Basic Predicate Removal\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# Parse a SQL statement\nsql = 'SELECT * FROM data WHERE x = 1 AND y = 2'\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Find the predicate you want to remove\npredicates = list(statement.find_all(sqlglot.expressions.Predicate))\ntarget_predicate = predicates[0] # x = 1\n\n# Remove the predicate\ntransforms.remove_expression_part(target_predicate)\n\n# Result: SELECT * FROM data WHERE y = 2\nprint(statement.sql())\n```\n\n### Column-Name Based Filtering\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# Parse SQL statement first\nsql = 'SELECT * FROM data WHERE color = \"red\" AND size > 10 AND type = \"car\"'\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Filter to only include predicates with allowed columns\nallowed_columns = {\"color\", \"type\"}\n\nfiltered = transforms.filter_column_references(\n statement=statement,\n selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM data WHERE color = \"red\" AND type = \"car\"\nprint(filtered.sql())\n```\n\n## API Reference\n\n### `remove_expression_part(child: sqlglot.Expression) -> None`\n\nRemoves the specified SQLGlot expression from its parent, respecting logical structure.\n\n**Parameters:**\n- `child`: The SQLGlot expression to remove\n\n**Raises:**\n- `ValueError`: If the expression cannot be safely removed\n\n**Supported Parent Types:**\n- `AND`/`OR` expressions: Replaces parent with the remaining operand\n- `WHERE` clauses: Removes the entire WHERE clause if it becomes empty\n- `Parentheses`: Recursively removes the parent\n- `NOT` expressions: Removes the entire NOT expression\n- `CASE` statements: Removes conditional branches\n\n### `filter_column_references(*, statement: sqlglot.Expression, selector: Callable[[sqlglot.expressions.Column], bool]) -> sqlglot.Expression`\n\nFilters a SQL statement to remove predicates containing columns that don't match the selector criteria.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to filter\n- `selector`: A callable that takes a Column and returns True if it should be preserved, False if it should be removed\n\n**Returns:**\n- Filtered SQLGlot expression with non-matching columns removed\n\n**Raises:**\n- `ValueError`: If a column can't be cleanly removed due to interactions with allowed columns\n\n### `where_clause_contents(statement: sqlglot.expressions.Expression) -> sqlglot.expressions.Expression | None`\n\nExtracts the contents of the WHERE clause from a SQLGlot expression.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to extract from\n\n**Returns:**\n- The contents of the WHERE clause, or None if no WHERE clause exists\n\n### `filter_predicates_with_right_side_column_references(statement: sqlglot.expressions.Expression) -> sqlglot.Expression`\n\nFilters out predicates that have column references on the right side of comparisons.\n\n**Parameters:**\n- `statement`: The SQLGlot expression to filter\n\n**Returns:**\n- Filtered SQLGlot expression with right-side column reference predicates removed\n\n## Examples\n\n### Complex Logic Handling\n\nThe library intelligently handles complex logical expressions:\n\n```python\n# Original: (x = 1 AND y = 2) OR z = 3\n# Remove y = 2: x = 1 OR z = 3\n\n# Original: NOT (x = 1 AND y = 2)\n# Remove x = 1: NOT y = 2 (which becomes y <> 2)\n\n# Original: CASE WHEN x = 1 THEN 'yes' WHEN x = 2 THEN 'maybe' ELSE 'no' END\n# Remove x = 1: CASE WHEN x = 2 THEN 'maybe' ELSE 'no' END\n```\n\n### Column Filtering with Complex Expressions\n\n```python\nsql = '''\nSELECT * FROM users\nWHERE age > 18\n AND (status = 'active' OR role = 'admin')\n AND department IN ('engineering', 'sales')\n'''\n\n# Only keep predicates involving 'age' and 'role'\nallowed_columns = {'age', 'role'}\n\nresult = transforms.filter_column_references_statement(\n sql=sql,\n selector=lambda col: col.name in allowed_columns,\n)\n\n# Result: SELECT * FROM users WHERE age > 18 AND role = 'admin'\n```\n\n### Error Handling\n\nThe library will raise `ValueError` when predicates cannot be safely removed:\n\n```python\nimport sqlglot\nfrom query_farm_sql_manipulation import transforms\n\n# This will raise ValueError because x = 1 is part of a larger expression\nsql = \"SELECT * FROM data WHERE result = (x = 1)\"\nstatement = sqlglot.parse_one(sql, dialect=\"duckdb\")\n\n# Cannot remove x = 1 because it's used as a value, not a predicate\n# This would raise ValueError if attempted\n```\n\n## Supported SQL Constructs\n\n- **Logical Operators**: `AND`, `OR`, `NOT`\n- **Comparison Operators**: `=`, `<>`, `<`, `>`, `<=`, `>=`, `LIKE`, `IN`, `IS NULL`, etc.\n- **Complex Expressions**: `CASE` statements, subqueries, function calls\n- **Nested Logic**: Parentheses and nested boolean expressions\n- **Multiple Dialects**: DuckDB, PostgreSQL, MySQL, SQLite, and more via SQLGlot\n\n## Testing\n\nRun the test suite:\n\n```bash\npytest src/query_farm_sql_manipulation/test_transforms.py\n```\n\nThe test suite includes comprehensive examples of:\n- Basic predicate removal scenarios\n- Complex logical expression handling\n- Error cases and edge conditions\n- Column filtering with various SQL constructs\n\n## Contributing\n\nThis project uses:\n- **Rye** for dependency management\n- **pytest** for testing\n- **mypy** for type checking\n- **ruff** for linting\n\n\n## Author\n\nThis Python module was created by [Query.Farm](https://query.farm).\n\n# License\n\nMIT Licensed.\n\n",
"bugtrack_url": null,
"license": null,
"summary": "A Python library for intelligent SQL predicate manipulation using SQLGlot. This library provides tools to safely remove specific predicates from `SQL WHERE` clauses and filter SQL statements based on column availability.",
"version": "0.1.7",
"project_urls": {
"Issues": "https://github.com/query-farm/python-sql-manipulation/issues",
"Repository": "https://github.com/query-farm/python-sql-manipulation.git"
},
"split_keywords": [
"predicate parsing",
" predicates",
" sql",
" sql parsing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0b98a09a5d90ea27cb7b84537cb90fdf9ae2628f78ab0b965e0c5207840656a9",
"md5": "a6a0babef23a4be3b602099108ae6ea1",
"sha256": "b1c7046ff506ff118fa4cc86524a778976507e6d853ee146a6d784097e5620fa"
},
"downloads": -1,
"filename": "query_farm_sql_manipulation-0.1.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "a6a0babef23a4be3b602099108ae6ea1",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 9722,
"upload_time": "2025-07-10T22:31:35",
"upload_time_iso_8601": "2025-07-10T22:31:35.378870Z",
"url": "https://files.pythonhosted.org/packages/0b/98/a09a5d90ea27cb7b84537cb90fdf9ae2628f78ab0b965e0c5207840656a9/query_farm_sql_manipulation-0.1.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "915fd0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025",
"md5": "bccbaba254efde48cfcc30dcc7d72276",
"sha256": "6a31857fed4e507d94591b941bcd0b8aa397a46883d165b57f5e7577255bd1d3"
},
"downloads": -1,
"filename": "query_farm_sql_manipulation-0.1.7.tar.gz",
"has_sig": false,
"md5_digest": "bccbaba254efde48cfcc30dcc7d72276",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 9032,
"upload_time": "2025-07-10T22:31:37",
"upload_time_iso_8601": "2025-07-10T22:31:37.203277Z",
"url": "https://files.pythonhosted.org/packages/91/5f/d0536c2af7b6fa5ada660ff6a63e6d0c0673e63de0be021bda6f31fea025/query_farm_sql_manipulation-0.1.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-10 22:31:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "query-farm",
"github_project": "python-sql-manipulation",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "query-farm-sql-manipulation"
}