# CrateDB SQL Parser for Python
[![License](https://img.shields.io/github/license/crate/cratedb-sqlparse.svg)](https://github.com/crate/cratedb-sqlparse/blob/main/LICENSE)
[![PyPI version](https://img.shields.io/pypi/v/cratedb-sqlparse.svg)](https://pypi.org/project/cratedb-sqlparse/)
[![PyPI downloads](https://pepy.tech/badge/cratedb-sqlparse/month)](https://pepy.tech/project/cratedb-sqlparse/)
[![Tests](https://github.com/crate/cratedb-sqlparse/actions/workflows/python.yml/badge.svg)](https://github.com/crate/cratedb-sqlparse/actions/workflows/python.yml)
[![Test coverage](https://img.shields.io/codecov/c/gh/crate/cratedb-sqlparse.svg)](https://codecov.io/gh/crate/cratedb-sqlparse/)
[![Python versions](https://img.shields.io/pypi/pyversions/cratedb-sqlparse.svg)](https://pypi.org/project/cratedb-sqlparse/)
[![Status](https://img.shields.io/pypi/status/cratedb-sqlparse.svg)](https://pypi.org/project/cratedb-sqlparse/)
This package provides utilities to validate and split SQL statements specifically designed for
CrateDB.
It is built upon CrateDB's antlr4 grammar, ensuring accurate parsing tailored to CrateDB's SQL
dialect.
It draws inspiration from `sqlparse`.
## Installation.
```shell
pip install cratedb-sqlparse
```
## Usage.
### Simple example
```python
from cratedb_sqlparse import sqlparse
query = """
SELECT * FROM SYS.SHARDS;
INSERT INTO doc.tbl VALUES (1);
"""
statements = sqlparse(query)
print(len(statements))
# 2
select_query = statements[0]
print(select_query.query)
# 'SELECT * FROM SYS.SHARDS'
print(select_query.type)
# 'SELECT'
print(select_query.tree)
# (statement (query (queryNoWith (queryTerm (querySpec SELECT (selectItem *) FROM (relation (aliasedRelation (relationPrimary (table (qname (ident (unquotedIdent SYS)) . (ident (unquotedIdent (nonReserved SHARDS)))))))))))))
```
### Exceptions and errors.
By default exceptions are stored in `statement.exception`
```python
from cratedb_sqlparse import sqlparse
query = """
SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;
INSERT INTO doc.tbl VALUES (1, 23, 4);
"""
statements = sqlparse(query)
stmt = statements[0]
if stmt.exception:
print(stmt.exception.error_message)
# InputMismatchException[line 2:31 mismatched input 'HERE' expecting {<EOF>, ';'}]
print(stmt.exception.original_query_with_error_marked)
# SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;
# ^^^^
#
# INSERT INTO doc.tbl VALUES (1, 23, 4);
print(stmt.exception.offending_token.text)
# HERE
```
In some situations, you might want sqlparse to raise an exception.
You can set `raise_exception` to `True`
```python
from cratedb_sqlparse import sqlparse
sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', raise_exception=True)
# cratedb_sqlparse.parser.ParsingException: NoViableAltException[line 1:37 no viable alternative at input 'SELECT COUNT(*) FROM doc.tbl f WHERE .']
```
Catch the exception:
```python
from cratedb_sqlparse import sqlparse, ParsingException
try:
t = sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', raise_exception=True)[0]
except ParsingException:
print('Catched!')
```
Note:
It will only raise the first exception it finds, even if you pass in several statements.
### Query metadata.
Query metadata can be read with `statement.metadata`
```python
from cratedb_sqlparse import sqlparse
stmt = sqlparse("SELECT A, B FROM doc.tbl12")
print(stmt.metadata)
# Metadata(tables=[Table(schema='doc', name='tbl12')], parameterized_properties={}, with_properties={})
```
#### Query properties.
Properties defined within a `WITH` statement, `statement.metadata.with_properties`.
```python
from cratedb_sqlparse import sqlparse
stmt = sqlparse("""
CREATE TABLE doc.tbl12 (A TEXT) WITH (
"allocation.max_retries" = 5,
"blocks.metadata" = false
);
""")[0]
print(stmt.metadata)
# Metadata(tables=[Table(schema='doc', name='tbl12')], with_properties={'allocation.max_retries': '5', 'blocks.metadata': 'false'})
```
#### Table name
```python
print(stmt.metadata.tables)
# [Table(schema='doc', name='tbl12')]
table = stmt.metadata.tables[0]
print(table.schema, table.name, table.fqn, sep='\n')
# doc
# tbl12
# '"doc"."tbl12"'
```
#### Parameterized properties.
Parameterized properties are properties without a real defined value, marked with a dollar string, `metadata.parameterized_properties`
```python
from cratedb_sqlparse import sqlparse
stmt = sqlparse("""
CREATE TABLE doc.tbl12 (A TEXT) WITH (
"allocation.max_retries" = 5,
"blocks.metadata" = $1
);
""")[0]
print(stmt.metadata)
# Metadata(tables=[Table(schema='doc', name='tbl12')], parameterized_properties={'blocks.metadata': '$1'}, with_properties={'allocation.max_retries': '5', 'blocks.metadata': '$1'})
```
In this case, `blocks.metadata` will be in `with_properties` and `parameterized_properties` as well.
For values to be picked up they need to start with a dollar `'$'` and be preceded by integers, e.g. `'$1'`, `'$123'` -
`'$123abc'` would not be valid.
## Development
### Set up environment
```shell
git clone https://github.com/crate/cratedb-sqlparse
cd cratedb-sqlparse/cratedb_sqlparse_py
python3 -m venv .venv
source .venv/bin/activate
pip install --editable='.[develop,generate,release,test]'
```
Everytime you open a shell again you would need to run `source .venv/bin/activate`
to use `poe` commands.
### Run lint and tests with coverage.
```shell
poe check
```
### Run only tests
```shell
poe test
```
### Run a specific test.
```shell
poe test -k test_sqlparse_collects_exceptions_2
```
### Run linter
```shell
poe lint
```
Raw data
{
"_id": null,
"home_page": null,
"name": "cratedb-sqlparse",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "antlr4, antlr4 grammar, cratedb, grammar, sql, sql grammar, sql parser, sql parsing",
"author": null,
"author_email": "Ivan Sanchez Valencia <ivan.sanchezvalencia@crate.io>",
"download_url": "https://files.pythonhosted.org/packages/e3/01/9f209fb868d67c2c15cde908e41f0399306a670e937ae656cbd357f6e7f9/cratedb_sqlparse-0.0.10.tar.gz",
"platform": null,
"description": "# CrateDB SQL Parser for Python\n\n[![License](https://img.shields.io/github/license/crate/cratedb-sqlparse.svg)](https://github.com/crate/cratedb-sqlparse/blob/main/LICENSE)\n[![PyPI version](https://img.shields.io/pypi/v/cratedb-sqlparse.svg)](https://pypi.org/project/cratedb-sqlparse/)\n[![PyPI downloads](https://pepy.tech/badge/cratedb-sqlparse/month)](https://pepy.tech/project/cratedb-sqlparse/)\n\n[![Tests](https://github.com/crate/cratedb-sqlparse/actions/workflows/python.yml/badge.svg)](https://github.com/crate/cratedb-sqlparse/actions/workflows/python.yml)\n[![Test coverage](https://img.shields.io/codecov/c/gh/crate/cratedb-sqlparse.svg)](https://codecov.io/gh/crate/cratedb-sqlparse/)\n[![Python versions](https://img.shields.io/pypi/pyversions/cratedb-sqlparse.svg)](https://pypi.org/project/cratedb-sqlparse/)\n\n[![Status](https://img.shields.io/pypi/status/cratedb-sqlparse.svg)](https://pypi.org/project/cratedb-sqlparse/)\n\nThis package provides utilities to validate and split SQL statements specifically designed for\nCrateDB.\n\nIt is built upon CrateDB's antlr4 grammar, ensuring accurate parsing tailored to CrateDB's SQL\ndialect.\n\nIt draws inspiration from `sqlparse`.\n\n## Installation.\n\n```shell\npip install cratedb-sqlparse\n```\n\n## Usage.\n\n### Simple example\n\n```python\nfrom cratedb_sqlparse import sqlparse\n\nquery = \"\"\"\n SELECT * FROM SYS.SHARDS;\n INSERT INTO doc.tbl VALUES (1);\n\"\"\"\nstatements = sqlparse(query)\n\nprint(len(statements))\n# 2\n\nselect_query = statements[0]\n\nprint(select_query.query)\n# 'SELECT * FROM SYS.SHARDS'\n\nprint(select_query.type)\n# 'SELECT'\n\nprint(select_query.tree)\n# (statement (query (queryNoWith (queryTerm (querySpec SELECT (selectItem *) FROM (relation (aliasedRelation (relationPrimary (table (qname (ident (unquotedIdent SYS)) . (ident (unquotedIdent (nonReserved SHARDS)))))))))))))\n```\n\n### Exceptions and errors.\n\nBy default exceptions are stored in `statement.exception`\n\n```python\nfrom cratedb_sqlparse import sqlparse\n\nquery = \"\"\"\nSELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;\n\nINSERT INTO doc.tbl VALUES (1, 23, 4);\n\"\"\"\nstatements = sqlparse(query)\nstmt = statements[0]\n\nif stmt.exception:\n print(stmt.exception.error_message)\n # InputMismatchException[line 2:31 mismatched input 'HERE' expecting {<EOF>, ';'}]\n\n print(stmt.exception.original_query_with_error_marked)\n # SELECT COUNT(*) FROM doc.tbl f HERE f.id = 1;\n # ^^^^\n # \n # INSERT INTO doc.tbl VALUES (1, 23, 4);\n\n print(stmt.exception.offending_token.text)\n # HERE\n```\n\nIn some situations, you might want sqlparse to raise an exception.\n\nYou can set `raise_exception` to `True`\n\n```python\nfrom cratedb_sqlparse import sqlparse\n\nsqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', raise_exception=True)\n\n# cratedb_sqlparse.parser.ParsingException: NoViableAltException[line 1:37 no viable alternative at input 'SELECT COUNT(*) FROM doc.tbl f WHERE .']\n```\n\nCatch the exception:\n\n```python\nfrom cratedb_sqlparse import sqlparse, ParsingException\n\ntry:\n t = sqlparse('SELECT COUNT(*) FROM doc.tbl f WHERE .id = 1;', raise_exception=True)[0]\nexcept ParsingException:\n print('Catched!')\n```\n\nNote:\n\nIt will only raise the first exception it finds, even if you pass in several statements.\n\n### Query metadata.\n\nQuery metadata can be read with `statement.metadata`\n\n```python\nfrom cratedb_sqlparse import sqlparse\n\nstmt = sqlparse(\"SELECT A, B FROM doc.tbl12\")\n\nprint(stmt.metadata)\n# Metadata(tables=[Table(schema='doc', name='tbl12')], parameterized_properties={}, with_properties={})\n```\n\n#### Query properties.\n\nProperties defined within a `WITH` statement, `statement.metadata.with_properties`.\n\n\n```python\nfrom cratedb_sqlparse import sqlparse\n\nstmt = sqlparse(\"\"\"\n CREATE TABLE doc.tbl12 (A TEXT) WITH (\n \"allocation.max_retries\" = 5,\n \"blocks.metadata\" = false\n );\n\"\"\")[0]\n\nprint(stmt.metadata)\n# Metadata(tables=[Table(schema='doc', name='tbl12')], with_properties={'allocation.max_retries': '5', 'blocks.metadata': 'false'})\n```\n\n#### Table name\n```python\nprint(stmt.metadata.tables)\n# [Table(schema='doc', name='tbl12')]\n\ntable = stmt.metadata.tables[0]\nprint(table.schema, table.name, table.fqn, sep='\\n')\n# doc\n# tbl12\n# '\"doc\".\"tbl12\"'\n```\n\n\n\n#### Parameterized properties.\n\nParameterized properties are properties without a real defined value, marked with a dollar string, `metadata.parameterized_properties`\n\n```python\nfrom cratedb_sqlparse import sqlparse\n\nstmt = sqlparse(\"\"\"\n CREATE TABLE doc.tbl12 (A TEXT) WITH (\n \"allocation.max_retries\" = 5,\n \"blocks.metadata\" = $1\n);\n\"\"\")[0]\n\nprint(stmt.metadata)\n# Metadata(tables=[Table(schema='doc', name='tbl12')], parameterized_properties={'blocks.metadata': '$1'}, with_properties={'allocation.max_retries': '5', 'blocks.metadata': '$1'})\n```\n\nIn this case, `blocks.metadata` will be in `with_properties` and `parameterized_properties` as well.\n\nFor values to be picked up they need to start with a dollar `'$'` and be preceded by integers, e.g. `'$1'`, `'$123'` -\n`'$123abc'` would not be valid.\n\n\n## Development\n\n### Set up environment\n\n```shell\ngit clone https://github.com/crate/cratedb-sqlparse\n\ncd cratedb-sqlparse/cratedb_sqlparse_py\n\npython3 -m venv .venv\n\nsource .venv/bin/activate\n\npip install --editable='.[develop,generate,release,test]'\n```\n\nEverytime you open a shell again you would need to run `source .venv/bin/activate`\nto use `poe` commands.\n\n### Run lint and tests with coverage.\n\n```shell\npoe check\n```\n\n### Run only tests\n\n```shell\npoe test\n```\n\n### Run a specific test.\n\n```shell\npoe test -k test_sqlparse_collects_exceptions_2\n```\n\n### Run linter\n\n```shell\npoe lint\n```\n",
"bugtrack_url": null,
"license": "Apache License 2.0",
"summary": "Parsing utilities to validate and split SQL statements for CrateDB.",
"version": "0.0.10",
"project_urls": {
"changelog": "https://github.com/crate/cratedb-sqlparse/blob/main/CHANGES.md",
"documentation": "https://github.com/crate/cratedb-sqlparse",
"homepage": "https://github.com/crate/cratedb-sqlparse",
"repository": "https://github.com/crate/cratedb-sqlparse"
},
"split_keywords": [
"antlr4",
" antlr4 grammar",
" cratedb",
" grammar",
" sql",
" sql grammar",
" sql parser",
" sql parsing"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "d5d7cb8739c55c10bf44ad6cb430eb89863d668cb51b1a0c6db49cd2cc72fb71",
"md5": "62f1d25b8c297fca55eecd266caea8e6",
"sha256": "602b133a22ca3506e19b2487683069d4573d11ba656a6303b1d00da63f7b9947"
},
"downloads": -1,
"filename": "cratedb_sqlparse-0.0.10-py3-none-any.whl",
"has_sig": false,
"md5_digest": "62f1d25b8c297fca55eecd266caea8e6",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 142499,
"upload_time": "2024-10-30T16:31:29",
"upload_time_iso_8601": "2024-10-30T16:31:29.876326Z",
"url": "https://files.pythonhosted.org/packages/d5/d7/cb8739c55c10bf44ad6cb430eb89863d668cb51b1a0c6db49cd2cc72fb71/cratedb_sqlparse-0.0.10-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e3019f209fb868d67c2c15cde908e41f0399306a670e937ae656cbd357f6e7f9",
"md5": "8e0f1a9e2d6191cdc23206c338ee09c3",
"sha256": "417f7d17661bf6f4f2e20fcd9c87195c9bcdfa7650eae9cd69aa470bfd6fad82"
},
"downloads": -1,
"filename": "cratedb_sqlparse-0.0.10.tar.gz",
"has_sig": false,
"md5_digest": "8e0f1a9e2d6191cdc23206c338ee09c3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 141001,
"upload_time": "2024-10-30T16:31:31",
"upload_time_iso_8601": "2024-10-30T16:31:31.344088Z",
"url": "https://files.pythonhosted.org/packages/e3/01/9f209fb868d67c2c15cde908e41f0399306a670e937ae656cbd357f6e7f9/cratedb_sqlparse-0.0.10.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-30 16:31:31",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "crate",
"github_project": "cratedb-sqlparse",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "antlr4-tools",
"specs": [
[
"<",
"0.3"
]
]
},
{
"name": "poethepoet",
"specs": [
[
"<",
"0.30"
]
]
},
{
"name": "requests",
"specs": [
[
"<",
"3"
]
]
}
],
"lcname": "cratedb-sqlparse"
}