python-pgsql-parser


Name: python-pgsql-parser
Version: 0.2.0
Home page: https://github.com/yourusername/python-pgsql-parser
Summary: Advanced PostgreSQL SQL parser for extracting database schema metadata
Upload time: 2025-08-06 01:50:29
Maintainer: None
Docs URL: None
Author: Sunny Liu
Requires Python: >=3.7
License: MIT
Keywords: sql, parser, postgresql, ddl, metadata, schema, analysis
# PostgreSQL SQL Parser

[![Python Version](https://img.shields.io/badge/python-3.7%2B-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![PyPI Version](https://img.shields.io/pypi/v/python-pgsql-parser.svg)](https://pypi.org/project/python-pgsql-parser/)

# SQLParser Project

## Overview
SQLParser is a Python library designed to parse and analyze SQL scripts, focusing on Data Definition Language (DDL) statements and SELECT queries. It processes SQL statements to extract structured information about tables, columns, constraints, and query elements, providing a robust foundation for database schema analysis and query processing.

## Features
- **DDL Parsing**: Supports parsing of `CREATE TABLE`, `ALTER TABLE`, and `CREATE INDEX` statements to extract table structures, columns, primary keys, foreign keys, and constraints (a short sketch follows this list).
- **SELECT Query Parsing**: Handles `SELECT` statements, including those with Common Table Expressions (CTEs), to identify query structure and column expressions.
- **Tokenization**: Utilizes an `AdvancedSQLLexer` to tokenize SQL scripts accurately.
- **Structured Output**: Represents parsed SQL components as structured objects (`Table`, `Column`, `PrimaryKey`, `ForeignKey`, `Constraint`, etc.).
- **Extensible Architecture**: Built with a modular design using a base parser class (`ParserBase`) for easy extension and maintenance.
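
As a preview of the DDL handling described above, the following sketch feeds the parser a script in which a foreign key is attached through `ALTER TABLE`. It reuses only the `get_tables()` accessor and attributes shown in the Usage section below; the assumption that `ALTER TABLE` constraints are merged into the same `Table` objects reflects the feature list, not verified behavior.

```python
from sqlparser import AdvancedSQLParser

# A table created first, with a foreign key attached later via ALTER TABLE.
ddl_script = """
CREATE TABLE orders (
    id INT PRIMARY KEY,
    user_id INT NOT NULL,
    total NUMERIC(10, 2)
);
ALTER TABLE orders
    ADD CONSTRAINT fk_orders_user FOREIGN KEY (user_id) REFERENCES users(id);
"""

parser = AdvancedSQLParser(ddl_script)

# Assumption: the constraint added by ALTER TABLE ends up on the same Table
# object that get_tables() returns for the CREATE TABLE statement.
for name, table in parser.get_tables().items():
    print(f"Table: {table.name}, columns: {list(table.columns)}")
    for fk in table.foreign_keys:
        print(f"  FK: {fk.columns} -> {fk.ref_table}({fk.ref_columns})")
```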

## Source Installation
To use the SQLParser library, ensure you have Python 3.7+ installed. Clone the repository and install dependencies:

```bash
git clone https://github.com/devsunny/python-pgsql-parser.git
cd python-pgsql-parser
pip install -r requirements.txt
```

## Pip Installation
```bash
pip install python-pgsql-parser
```

## Usage
The `AdvancedSQLParser` class is the main entry point for parsing SQL scripts. Below is an example of how to use it:

```python
from sqlparser import AdvancedSQLParser

# Example SQL script
sql_script = """
CREATE TABLE users (
    id INT PRIMARY KEY,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100),
    FOREIGN KEY (email) REFERENCES accounts(email)
);
SELECT username, email FROM users;
"""

# Initialize parser and parse the script
parser = AdvancedSQLParser(sql_script)

# Access parsed tables
tables = parser.get_tables()
for table_name, table in tables.items():
    print(f"Table: {table.name}")
    for col_name, column in table.columns.items():
        print(f"  Column: {col_name}, Type: {column.col_type}, Nullable: {column.nullable}")
    if table.primary_key:
        print(f"  Primary Key: {table.primary_key.columns}")
    for fk in table.foreign_keys:
        print(f"  Foreign Key: {fk.columns} -> {fk.ref_table}({fk.ref_columns})")
```

### Output
For the SQL script above, the parser exposes structured information about the `users` table: its columns, primary key, and foreign key. The trailing `SELECT` statement is parsed as well, although the loop in the Usage example only prints the table metadata.
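
The print statements would produce output along the following lines. This is an illustrative rendering derived from the code above, not captured from a run; in particular, whether the `PRIMARY KEY` column is also reported as non-nullable and whether type names retain their length specifier depend on parser internals not shown here.

```text
Table: users
  Column: id, Type: INT, Nullable: True
  Column: username, Type: VARCHAR, Nullable: False
  Column: email, Type: VARCHAR, Nullable: True
  Primary Key: ['id']
  Foreign Key: ['email'] -> accounts(['email'])
```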

## Project Structure
- **sqlparser/models.py**: Defines data models (`Token`, `TokenType`, `Table`, `Column`, `PrimaryKey`, `ForeignKey`, `Constraint`, etc.) for representing SQL components; an illustrative sketch of these shapes follows the list.
- **sqlparser/sql_lexer.py**: Implements `AdvancedSQLLexer` for tokenizing SQL scripts.
- **sqlparser/parser_base.py**: Provides the base `ParserBase` class with utility methods for token consumption and parsing.
- **sqlparser/advanced_statement_analyzer.py**: Analyzes SQL statements and handles nested subqueries.
- **sqlparser/column_def_parser.py**: Parses column definitions, including data types, constraints, and foreign key references.
- **sqlparser/advanced_ddl_statement_parser.py**: Parses DDL statements (`CREATE TABLE`, `ALTER TABLE`, `CREATE INDEX`).
- **sqlparser/sql_query_parser.py**: Parses `SELECT` queries, including CTEs and column expressions.
- **sqlparser/advanced_sql_parser.py**: Main parser class that orchestrates the parsing process and maintains the table registry.
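
The attribute names used in the Usage example hint at model shapes roughly like the following. This is an illustrative sketch only; the actual definitions in `sqlparser/models.py` (including `Token`, `TokenType`, and `Constraint`, omitted here) may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Illustrative shapes only: field names are inferred from the Usage example;
# defaults and exact types are assumptions.

@dataclass
class Column:
    name: str
    col_type: str
    nullable: bool = True

@dataclass
class PrimaryKey:
    columns: List[str] = field(default_factory=list)

@dataclass
class ForeignKey:
    columns: List[str] = field(default_factory=list)
    ref_table: str = ""
    ref_columns: List[str] = field(default_factory=list)

@dataclass
class Table:
    name: str
    columns: Dict[str, Column] = field(default_factory=dict)
    primary_key: Optional[PrimaryKey] = None
    foreign_keys: List[ForeignKey] = field(default_factory=list)
```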

## Code Example
Below is a snippet from `column_def_parser.py` that demonstrates how column definitions are parsed:

```python
from typing import List

from .models import Token, TokenType, Column, PrimaryKey, ForeignKey, Constraint
from .parser_base import ParserBase


class ColumnDefParser(ParserBase):
    def __init__(self, col_def: List[Token]):
        super().__init__(col_def)
        # Attributes populated while walking the column-definition tokens.
        self.col_name = None
        self.col_type = None
        self.nullable = True
        self.default_value = None
        self.is_primary = False
        self.fk_ref = None
        self.max_char_length = None
        self.precision = 0
        self.scale = 0
        self.primary_key = None
        self.foreign_key = None
        self.constraint = None
        self.foreign_key_ref = None
        self._parse_column_def()

    def _parse_column_def(self):
        while self.current_pos < self.num_of_tokens:
            tok = self._consume_one()
            if self.col_name is None:
                # First token is the column name.
                self.col_name = tok.value
            elif self.col_type is None:
                # Second token is the data type, normalized to upper case.
                self.col_type = tok.value.upper()
            elif tok.token_type == TokenType.OPEN_PAREN:
                # Length or (precision, scale) specification, e.g. VARCHAR(50).
                self._parse_precision_scale()
                self._consume_one()  # consume closing paren
            elif self._is_keyword(tok, "NOT") and self._is_keyword(self._peek(), "NULL"):
                self.nullable = False
                self._consume_one()  # consume NULL
            # ... (additional parsing logic)
```

## Dependencies
- Python 3.7+
- `typing` (standard library)
- `collections.OrderedDict` (standard library)


## Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/your-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin feature/your-feature`)
5. Open a pull request

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Support

For issues and feature requests, please [open an issue](https://github.com/devsunny/python-pgsql-parser/issues).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/yourusername/python-pgsql-parser",
    "name": "python-pgsql-parser",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "sql, parser, postgresql, ddl, metadata, schema, analysis",
    "author": "Sunny Liu",
    "author_email": "Sunny Liu <sunnyliu2@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/c7/f0/fbfded9ac2b3dbaab0b0f31314145bf4aaccc8a85caf2a5daaa02dd7d810/python_pgsql_parser-0.2.0.tar.gz",
    "platform": null,
    "description": "# PostgreSQL SQL Parser\n\n[![Python Version](https://img.shields.io/badge/python-3.7%2B-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![PyPI Version](https://img.shields.io/pypi/v/python-pgsql-parser.svg)](https://pypi.org/project/python-pgsql-parser/)\n\n# SQLParser Project\n\n## Overview\nSQLParser is a Python library designed to parse and analyze SQL scripts, focusing on Data Definition Language (DDL) statements and SELECT queries. It processes SQL statements to extract structured information about tables, columns, constraints, and query elements, providing a robust foundation for database schema analysis and query processing.\n\n## Features\n- **DDL Parsing**: Supports parsing of `CREATE TABLE`, `ALTER TABLE`, and `CREATE INDEX` statements to extract table structures, columns, primary keys, foreign keys, and constraints.\n- **SELECT Query Parsing**: Handles `SELECT` statements, including those with Common Table Expressions (CTEs), to identify query structure and column expressions.\n- **Tokenization**: Utilizes an `AdvancedSQLLexer` to tokenize SQL scripts accurately.\n- **Structured Output**: Represents parsed SQL components as structured objects (`Table`, `Column`, `PrimaryKey`, `ForeignKey`, `Constraint`, etc.).\n- **Extensible Architecture**: Built with a modular design using a base parser class (`ParserBase`) for easy extension and maintenance.\n\n## Source Installation\nTo use the SQLParser library, ensure you have Python 3.8+ installed. Clone the repository and install dependencies:\n\n```bash\ngit clone https://github.com/devsunny/python-pgsql-parser.git\ncd sqlparser\npip install -r requirements.txt\n```\n\n## Pip installation\n```bash\npip install python-pgsql-parser\n\n```\n## Usage\nThe `AdvancedSQLParser` class is the main entry point for parsing SQL scripts. Below is an example of how to use it:\n\n```python\nfrom sqlparser import AdvancedSQLParser\n\n# Example SQL script\nsql_script = \"\"\"\nCREATE TABLE users (\n    id INT PRIMARY KEY,\n    username VARCHAR(50) NOT NULL,\n    email VARCHAR(100),\n    FOREIGN KEY (email) REFERENCES accounts(email)\n);\nSELECT username, email FROM users;\n\"\"\"\n\n# Initialize parser and parse the script\nparser = AdvancedSQLParser(sql_script)\n\n# Access parsed tables\ntables = parser.get_tables()\nfor table_name, table in tables.items():\n    print(f\"Table: {table.name}\")\n    for col_name, column in table.columns.items():\n        print(f\"  Column: {col_name}, Type: {column.col_type}, Nullable: {column.nullable}\")\n    if table.primary_key:\n        print(f\"  Primary Key: {table.primary_key.columns}\")\n    for fk in table.foreign_keys:\n        print(f\"  Foreign Key: {fk.columns} -> {fk.ref_table}({fk.ref_columns})\")\n```\n\n### Output\nFor the above SQL script, the parser will output structured information about the `users` table, its columns, primary key, foreign key, and the `SELECT` query components.\n\n## Project Structure\n- **sqlparser/models.py**: Defines data models (`Token`, `TokenType`, `Table`, `Column`, `PrimaryKey`, `ForeignKey`, `Constraint`, etc.) 
for representing SQL components.\n- **sqlparser/sql_lexer.py**: Implements `AdvancedSQLLexer` for tokenizing SQL scripts.\n- **sqlparser/parser_base.py**: Provides the base `ParserBase` class with utility methods for token consumption and parsing.\n- **sqlparser/advanced_statement_analyzer.py**: Analyzes SQL statements and handles nested subqueries.\n- **sqlparser/column_def_parser.py**: Parses column definitions, including data types, constraints, and foreign key references.\n- **sqlparser/advanced_ddl_statement_parser.py**: Parses DDL statements (`CREATE TABLE`, `ALTER TABLE`, `CREATE INDEX`).\n- **sqlparser/sql_query_parser.py**: Parses `SELECT` queries, including CTEs and column expressions.\n- **sqlparser/advanced_sql_parser.py**: Main parser class that orchestrates the parsing process and maintains the table registry.\n\n## Code Example\nBelow is a snippet from `column_def_parser.py` that demonstrates how column definitions are parsed:\n\n<xaiArtifactCode>\nfrom typing import List\nfrom .models import Token, TokenType, Column, PrimaryKey, ForeignKey, Constraint\nfrom .parser_base import ParserBase\n\nclass ColumnDefParser(ParserBase):\n    def __init__(self, col_def: List[Token]):\n        super().__init__(col_def)\n        self.col_name = None\n        self.col_type = None\n        self.nullable = True\n        self.default_value = None\n        self.is_primary = False\n        self.fk_ref = None\n        self.max_char_length = None\n        self.precision = 0\n        self.scale = 0\n        self.primary_key = None\n        self.foreign_key = None\n        self.constraint = None\n        self.foreign_key_ref = None\n        self._parse()\n\n    def _parse_column_def(self):\n        while self.current_pos < self.num_of_tokens:\n            tok = self._consume_one()\n            if self.col_name is None:\n                self.col_name = tok.value\n            elif self.col_type is None:\n                self.col_type = tok.value.upper()\n            elif tok.token_type == TokenType.OPEN_PAREN:\n                self._parse_precision_scale()\n                self._consume_one()  # consume close_paren\n            elif self._is_keyword(tok, \"NOT\") and self._is_keyword(self._peek(), \"NULL\"):\n                self.nullable = False\n                self._consume_one()  # consume null\n            # ... (additional parsing logic)\n</xaiArtifactCode>\n\n## Dependencies\n- Python 3.8+\n- `typing` (standard library)\n- `collections.OrderedDict` (standard library)\n\n\n## Contributing\n\nContributions are welcome! Please follow these steps:\n\n1. Fork the repository\n2. Create your feature branch (`git checkout -b feature/your-feature`)\n3. Commit your changes (`git commit -am 'Add some feature'`)\n4. Push to the branch (`git push origin feature/your-feature`)\n5. Open a pull request\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## Support\n\nFor issues and feature requests, please [open an issue](https://github.com/yourusername/python-pgsql-parser/issues).\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Advanced PostgreSQL SQL parser for extracting database schema metadata",
    "version": "0.2.0",
    "project_urls": {
        "Documentation": "https://github.com/devsunny/python-pgsql-parser/blob/main/docs/index.md",
        "Homepage": "https://github.com/devsunny/python-pgsql-parser",
        "Source": "https://github.com/devsunny/python-pgsql-parser",
        "Tracker": "https://github.com/devsunny/python-pgsql-parser/issues"
    },
    "split_keywords": [
        "sql",
        " parser",
        " postgresql",
        " ddl",
        " metadata",
        " schema",
        " analysis"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "06343d7d7313b5e38124647486a36904cfbf3d4ae9e1ccba131d92cde01249cd",
                "md5": "19ff8b7bf0bf4277978ce5397623be16",
                "sha256": "f494ecf8e168639706c92aafcf976ed09bb0e64c8ebbc8c8ebc2c9c8debdb8a3"
            },
            "downloads": -1,
            "filename": "python_pgsql_parser-0.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "19ff8b7bf0bf4277978ce5397623be16",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 22257,
            "upload_time": "2025-08-06T01:50:28",
            "upload_time_iso_8601": "2025-08-06T01:50:28.374084Z",
            "url": "https://files.pythonhosted.org/packages/06/34/3d7d7313b5e38124647486a36904cfbf3d4ae9e1ccba131d92cde01249cd/python_pgsql_parser-0.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c7f0fbfded9ac2b3dbaab0b0f31314145bf4aaccc8a85caf2a5daaa02dd7d810",
                "md5": "451e592e42f2abd5103a0172e24524a1",
                "sha256": "b8349e13148b3d176688195a90dca94b7cc04ff7d6f72cbe74b29325148b11ff"
            },
            "downloads": -1,
            "filename": "python_pgsql_parser-0.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "451e592e42f2abd5103a0172e24524a1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 27073,
            "upload_time": "2025-08-06T01:50:29",
            "upload_time_iso_8601": "2025-08-06T01:50:29.183187Z",
            "url": "https://files.pythonhosted.org/packages/c7/f0/fbfded9ac2b3dbaab0b0f31314145bf4aaccc8a85caf2a5daaa02dd7d810/python_pgsql_parser-0.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-06 01:50:29",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "yourusername",
    "github_project": "python-pgsql-parser",
    "github_not_found": true,
    "lcname": "python-pgsql-parser"
}
        