# PostgreSQL SQL Parser
[Python 3.8+](https://www.python.org/downloads/)
[MIT License](https://opensource.org/licenses/MIT)
[PyPI](https://pypi.org/project/python-pgsql-parser/)
## Overview
SQLParser is a Python library designed to parse and analyze SQL scripts, focusing on Data Definition Language (DDL) statements and SELECT queries. It processes SQL statements to extract structured information about tables, columns, constraints, and query elements, providing a robust foundation for database schema analysis and query processing.
## Features
- **DDL Parsing**: Supports parsing of `CREATE TABLE`, `ALTER TABLE`, and `CREATE INDEX` statements to extract table structures, columns, primary keys, foreign keys, and constraints.
- **SELECT Query Parsing**: Handles `SELECT` statements, including those with Common Table Expressions (CTEs), to identify query structure and column expressions.
- **Tokenization**: Utilizes an `AdvancedSQLLexer` to tokenize SQL scripts accurately.
- **Structured Output**: Represents parsed SQL components as structured objects (`Table`, `Column`, `PrimaryKey`, `ForeignKey`, `Constraint`, etc.).
- **Extensible Architecture**: Built with a modular design using a base parser class (`ParserBase`) for easy extension and maintenance.
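To make the tokenization idea concrete, here is a minimal, illustrative lexer sketch. It is *not* the library's `AdvancedSQLLexer` — the token names (`KEYWORD`, `IDENT`, `OPEN_PAREN`, etc.) are assumptions chosen for the example — but it shows the kind of `(token_type, value)` stream a DDL parser consumes.

```python
import re

# Illustrative tokenizer sketch (NOT the library's AdvancedSQLLexer).
# Order matters: KEYWORD must precede IDENT so reserved words win.
TOKEN_SPEC = [
    ("KEYWORD", r"\b(?:CREATE|TABLE|PRIMARY|KEY|NOT|NULL|VARCHAR|INT)\b"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"\d+"),
    ("OPEN_PAREN", r"\("),
    ("CLOSE_PAREN", r"\)"),
    ("COMMA", r","),
    ("SEMICOLON", r";"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC), re.IGNORECASE)

def tokenize(sql: str):
    """Yield (token_type, value) pairs, skipping whitespace."""
    for m in MASTER.finditer(sql):
        if m.lastgroup != "SKIP":
            yield (m.lastgroup, m.group())

tokens = list(tokenize("CREATE TABLE t (id INT PRIMARY KEY);"))
```

A real SQL lexer also has to handle quoted identifiers, string literals, comments, and operators; this sketch only covers the subset needed to tokenize a simple `CREATE TABLE`.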
## Source Installation
To use the SQLParser library, ensure you have Python 3.8+ installed. Clone the repository and install dependencies:
```bash
git clone https://github.com/devsunny/python-pgsql-parser.git
cd python-pgsql-parser
pip install -r requirements.txt
```
## Pip installation
```bash
pip install python-pgsql-parser
```
## Usage
The `AdvancedSQLParser` class is the main entry point for parsing SQL scripts. Below is an example of how to use it:
```python
from sqlparser import AdvancedSQLParser
# Example SQL script
sql_script = """
CREATE TABLE users (
    id INT PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100),
    FOREIGN KEY (email) REFERENCES accounts(email)
);
SELECT username, email FROM users;
"""
# Initialize parser and parse the script
parser = AdvancedSQLParser(sql_script)
# Access parsed tables
tables = parser.get_tables()
for table_name, table in tables.items():
    print(f"Table: {table.name}")
    for col_name, column in table.columns.items():
        print(f"  Column: {col_name}, Type: {column.col_type}, Nullable: {column.nullable}")
    if table.primary_key:
        print(f"  Primary Key: {table.primary_key.columns}")
    for fk in table.foreign_keys:
        print(f"  Foreign Key: {fk.columns} -> {fk.ref_table}({fk.ref_columns})")
```
### Output
Running the example prints structured information about the `users` table: its columns, its primary key, and its foreign key. The `SELECT` statement is also parsed, and its query components are available through the parser.
## Project Structure
- **sqlparser/models.py**: Defines data models (`Token`, `TokenType`, `Table`, `Column`, `PrimaryKey`, `ForeignKey`, `Constraint`, etc.) for representing SQL components.
- **sqlparser/sql_lexer.py**: Implements `AdvancedSQLLexer` for tokenizing SQL scripts.
- **sqlparser/parser_base.py**: Provides the base `ParserBase` class with utility methods for token consumption and parsing.
- **sqlparser/advanced_statement_analyzer.py**: Analyzes SQL statements and handles nested subqueries.
- **sqlparser/column_def_parser.py**: Parses column definitions, including data types, constraints, and foreign key references.
- **sqlparser/advanced_ddl_statement_parser.py**: Parses DDL statements (`CREATE TABLE`, `ALTER TABLE`, `CREATE INDEX`).
- **sqlparser/sql_query_parser.py**: Parses `SELECT` queries, including CTEs and column expressions.
- **sqlparser/advanced_sql_parser.py**: Main parser class that orchestrates the parsing process and maintains the table registry.
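The model classes in `sqlparser/models.py` can be pictured as plain dataclasses. The sketch below mirrors the field names used in the usage example (`col_type`, `nullable`, `primary_key.columns`, `ref_table`, `ref_columns`); it is an assumed shape for illustration, not the library's `models.py` verbatim.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Illustrative sketch of the model shapes implied by the usage example;
# field names follow the README, but this is NOT the library's models.py.

@dataclass
class Column:
    name: str
    col_type: str
    nullable: bool = True

@dataclass
class PrimaryKey:
    columns: List[str] = field(default_factory=list)

@dataclass
class ForeignKey:
    columns: List[str] = field(default_factory=list)
    ref_table: str = ""
    ref_columns: List[str] = field(default_factory=list)

@dataclass
class Table:
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)
    primary_key: Optional[PrimaryKey] = None
    foreign_keys: List[ForeignKey] = field(default_factory=list)

# Building the `users` table from the usage example by hand:
users = Table(name="users")
users.columns["id"] = Column("id", "INT", nullable=False)
users.primary_key = PrimaryKey(columns=["id"])
```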
## Code Example
Below is a snippet from `column_def_parser.py` that demonstrates how column definitions are parsed:
```python
from typing import List
from .models import Token, TokenType, Column, PrimaryKey, ForeignKey, Constraint
from .parser_base import ParserBase


class ColumnDefParser(ParserBase):
    def __init__(self, col_def: List[Token]):
        super().__init__(col_def)
        self.col_name = None
        self.col_type = None
        self.nullable = True
        self.default_value = None
        self.is_primary = False
        self.fk_ref = None
        self.max_char_length = None
        self.precision = 0
        self.scale = 0
        self.primary_key = None
        self.foreign_key = None
        self.constraint = None
        self.foreign_key_ref = None
        self._parse_column_def()

    def _parse_column_def(self):
        while self.current_pos < self.num_of_tokens:
            tok = self._consume_one()
            if self.col_name is None:
                self.col_name = tok.value
            elif self.col_type is None:
                self.col_type = tok.value.upper()
            elif tok.token_type == TokenType.OPEN_PAREN:
                self._parse_precision_scale()
                self._consume_one()  # consume close_paren
            elif self._is_keyword(tok, "NOT") and self._is_keyword(self._peek(), "NULL"):
                self.nullable = False
                self._consume_one()  # consume NULL
            # ... (additional parsing logic)
```
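The parse loop above can be boiled down to a self-contained sketch. The function below mimics the same strategy (first token is the name, second is the type, then look for `NOT NULL`) using simple `(token_type, value)` tuples as a stand-in for the library's `Token`/`ParserBase` machinery; it is illustrative only.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (token_type, value) -- a stand-in for the library's Token

def parse_column_def(tokens: List[Token]):
    """Standalone sketch of the loop above: extract name, type, nullability."""
    name = col_type = None
    nullable = True
    i = 0
    while i < len(tokens):
        _, value = tokens[i]
        if name is None:
            name = value                 # first token: column name
        elif col_type is None:
            col_type = value.upper()     # second token: data type
        elif value.upper() == "NOT" and i + 1 < len(tokens) \
                and tokens[i + 1][1].upper() == "NULL":
            nullable = False
            i += 1                       # consume NULL
        i += 1
    return name, col_type, nullable

result = parse_column_def([("IDENT", "username"), ("KEYWORD", "varchar"),
                           ("KEYWORD", "NOT"), ("KEYWORD", "NULL")])
```

The real parser handles far more (precision/scale in parentheses, `DEFAULT`, inline `PRIMARY KEY`, `REFERENCES`), but the token-consumption pattern is the same.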
## Dependencies
- Python 3.8+
- `typing` (standard library)
- `collections.OrderedDict` (standard library)
## Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/your-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin feature/your-feature`)
5. Open a pull request
## License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## Support
For issues and feature requests, please [open an issue](https://github.com/devsunny/python-pgsql-parser/issues).