# SQL Splitter ๐ฏ
Advanced MySQL SQL Parser with Visualization Component Support
[](https://python.org)
[](LICENSE)
[](https://github.com/alexkwok22/sql-splitter)
## ๐ Features
- **๐จ Field Type Classification**: Automatically categorizes fields as `column`, `aggregation`, `expression`, or `computed`
- **๐ Aggregation Scope Tracking**: Tracks which tables are involved in aggregation functions like `COUNT(*)`
- **๐ Visualization-Ready Output**: Enhanced JSON format perfect for SQL diagram generation
- **๐ ๏ธ Advanced JOIN Detection**: Handles complex nested JOINs and old-style comma-separated syntax
- **๐ท๏ธ Smart Alias Resolution**: Context-aware alias mapping and resolution
- **๐ฌ MySQL Compatibility**: Full MySQL syntax support with normalization
- **๐ Comprehensive Metadata**: Provides detailed parsing information for debugging and visualization
## ๐ฆ Installation
```bash
pip install sql-splitter
```
Or install from source:
```bash
git clone https://github.com/alexkwok22/sql-splitter.git
cd sql-splitter
pip install -e .
```
## ๐ฏ Quick Start
### Basic Usage
```python
from sql_splitter import SQLParserAST
# Initialize parser
parser = SQLParserAST()
# Parse SQL query
sql = """
SELECT
users.name,
COUNT(*) as total_orders,
SUM(orders.amount) as total_revenue
FROM users
JOIN orders ON users.id = orders.user_id
WHERE users.status = 'active'
GROUP BY users.name
"""
result = parser.parse(sql)
print(result)
```
### Enhanced JSON Output
```json
{
"success": true,
"fields": [
{
"table": "users",
"field": "users.name",
"alias": "name",
"fieldType": "column",
"involvedTables": ["users"]
},
{
"table": null,
"field": "COUNT(*)",
"alias": "total_orders",
"fieldType": "aggregation",
"aggregationScope": ["users", "orders"],
"involvedTables": ["users", "orders"]
},
{
"table": "orders",
"field": "SUM(orders.amount)",
"alias": "total_revenue",
"fieldType": "aggregation",
"involvedTables": ["orders"]
}
],
"tables": ["users", "orders"],
"joins": [
{
"type": "JOIN",
"leftTable": "users",
"leftField": "id",
"rightTable": "orders",
"rightField": "user_id",
"condition": "users.id = orders.user_id"
}
],
"whereConditions": ["users.status = 'active'"],
"parser": "sqlsplit",
"metadata": {
"aliasMapping": {},
"aggregationFields": ["total_orders", "total_revenue"],
"computedFields": [],
"unresolved": {
"aliases": [],
"fields": []
}
}
}
```
## ๐จ Visualization Components Support
### Field Type Classification
SQL Splitter automatically classifies fields into four types:
- **`column`**: Simple table columns (`users.name`)
- **`aggregation`**: Aggregate functions (`COUNT(*)`, `SUM(amount)`)
- **`expression`**: Complex expressions (`DATE_FORMAT(created_at, '%Y-%m')`)
- **`computed`**: Conditional logic (`CASE WHEN status = 1 THEN 'active' END`)
### Aggregation Scope Tracking
For visualization components, aggregation functions include `aggregationScope` to show which tables are involved:
```python
# COUNT(*) shows all tables in the query
{
"field": "COUNT(*)",
"fieldType": "aggregation",
"aggregationScope": ["users", "orders", "products"] # All related tables
}
# Specific aggregations show only relevant tables
{
"field": "SUM(orders.amount)",
"fieldType": "aggregation",
"aggregationScope": ["orders"] # Only orders table
}
```
## ๐ ๏ธ Advanced Features
### MySQL Normalization
```python
from sql_splitter import MySQLCompatibleNormalizer
normalizer = MySQLCompatibleNormalizer()
normalized_sql, rules, errors = normalizer.normalize_query(sql)
```
### Old-Style JOIN Conversion
Automatically converts old-style comma-separated JOINs:
```sql
-- Input: Old-style
SELECT * FROM users a, orders b WHERE a.id = b.user_id
-- Output: Modern JOIN
SELECT * FROM users a JOIN orders b ON a.id = b.user_id
```
### Context-Aware Alias Resolution
Handles complex alias scenarios:
```python
# Resolves aliases like 'u' -> 'users', 'o' -> 'orders'
"metadata": {
"aliasMapping": {
"u": "users",
"o": "orders"
}
}
```
## ๐ Documentation
- **[Quick Start Guide](docs/Quick-Start-Guide.md)** - Get started in 5 minutes
- **[API Documentation](docs/SQL-Parser-Usage-Guide.md)** - Complete API reference
- **[Expected Format](docs/expect.md)** - JSON output specification
- **[Examples](examples/)** - Real-world usage examples
## ๐งช Testing
```bash
# Run basic tests
python -m pytest tests/
# Run with coverage
python -m pytest tests/ --cov=sql_splitter --cov-report=html
```
## ๐ Requirements
- Python 3.7+
- No external dependencies (pure Python implementation)
## ๐ค Contributing
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
## ๐ License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## ๐ Acknowledgments
- Built for SQL visualization component developers
- Supports complex MySQL queries and edge cases
- Designed with performance and accuracy in mind
---
**Made with โค๏ธ for the SQL visualization community**
Raw data
{
"_id": null,
"home_page": null,
"name": "sql-splitter",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "sql, parser, mysql, visualization, ast, database, query, analysis",
"author": null,
"author_email": "SQL Splitter Team <contact@sqlsplitter.dev>",
"download_url": "https://files.pythonhosted.org/packages/8a/66/7976de83db4e823bc0a7fd583c896944ff196afbe8993ce3280c25e5005d/sql_splitter-6.1.0.tar.gz",
"platform": null,
"description": "# SQL Splitter \ud83c\udfaf\r\n\r\nAdvanced MySQL SQL Parser with Visualization Component Support\r\n\r\n[](https://python.org)\r\n[](LICENSE)\r\n[](https://github.com/alexkwok22/sql-splitter)\r\n\r\n## \ud83d\ude80 Features\r\n\r\n- **\ud83c\udfa8 Field Type Classification**: Automatically categorizes fields as `column`, `aggregation`, `expression`, or `computed`\r\n- **\ud83d\udd17 Aggregation Scope Tracking**: Tracks which tables are involved in aggregation functions like `COUNT(*)`\r\n- **\ud83d\udcca Visualization-Ready Output**: Enhanced JSON format perfect for SQL diagram generation\r\n- **\ud83d\udee0\ufe0f Advanced JOIN Detection**: Handles complex nested JOINs and old-style comma-separated syntax\r\n- **\ud83c\udff7\ufe0f Smart Alias Resolution**: Context-aware alias mapping and resolution\r\n- **\ud83d\udc2c MySQL Compatibility**: Full MySQL syntax support with normalization\r\n- **\ud83d\udccb Comprehensive Metadata**: Provides detailed parsing information for debugging and visualization\r\n\r\n## \ud83d\udce6 Installation\r\n\r\n```bash\r\npip install sql-splitter\r\n```\r\n\r\nOr install from source:\r\n\r\n```bash\r\ngit clone https://github.com/alexkwok22/sql-splitter.git\r\ncd sql-splitter\r\npip install -e .\r\n```\r\n\r\n## \ud83c\udfaf Quick Start\r\n\r\n### Basic Usage\r\n\r\n```python\r\nfrom sql_splitter import SQLParserAST\r\n\r\n# Initialize parser\r\nparser = SQLParserAST()\r\n\r\n# Parse SQL query\r\nsql = \"\"\"\r\nSELECT \r\n users.name,\r\n COUNT(*) as total_orders,\r\n SUM(orders.amount) as total_revenue\r\nFROM users \r\nJOIN orders ON users.id = orders.user_id \r\nWHERE users.status = 'active'\r\nGROUP BY users.name\r\n\"\"\"\r\n\r\nresult = parser.parse(sql)\r\nprint(result)\r\n```\r\n\r\n### Enhanced JSON Output\r\n\r\n```json\r\n{\r\n \"success\": true,\r\n \"fields\": [\r\n {\r\n \"table\": \"users\",\r\n \"field\": \"users.name\",\r\n \"alias\": \"name\",\r\n \"fieldType\": \"column\",\r\n \"involvedTables\": [\"users\"]\r\n },\r\n {\r\n \"table\": null,\r\n \"field\": \"COUNT(*)\",\r\n \"alias\": \"total_orders\",\r\n \"fieldType\": \"aggregation\",\r\n \"aggregationScope\": [\"users\", \"orders\"],\r\n \"involvedTables\": [\"users\", \"orders\"]\r\n },\r\n {\r\n \"table\": \"orders\",\r\n \"field\": \"SUM(orders.amount)\",\r\n \"alias\": \"total_revenue\", \r\n \"fieldType\": \"aggregation\",\r\n \"involvedTables\": [\"orders\"]\r\n }\r\n ],\r\n \"tables\": [\"users\", \"orders\"],\r\n \"joins\": [\r\n {\r\n \"type\": \"JOIN\",\r\n \"leftTable\": \"users\",\r\n \"leftField\": \"id\",\r\n \"rightTable\": \"orders\", \r\n \"rightField\": \"user_id\",\r\n \"condition\": \"users.id = orders.user_id\"\r\n }\r\n ],\r\n \"whereConditions\": [\"users.status = 'active'\"],\r\n \"parser\": \"sqlsplit\",\r\n \"metadata\": {\r\n \"aliasMapping\": {},\r\n \"aggregationFields\": [\"total_orders\", \"total_revenue\"],\r\n \"computedFields\": [],\r\n \"unresolved\": {\r\n \"aliases\": [],\r\n \"fields\": []\r\n }\r\n }\r\n}\r\n```\r\n\r\n## \ud83c\udfa8 Visualization Components Support\r\n\r\n### Field Type Classification\r\n\r\nSQL Splitter automatically classifies fields into four types:\r\n\r\n- **`column`**: Simple table columns (`users.name`)\r\n- **`aggregation`**: Aggregate functions (`COUNT(*)`, `SUM(amount)`)\r\n- **`expression`**: Complex expressions (`DATE_FORMAT(created_at, '%Y-%m')`)\r\n- **`computed`**: Conditional logic (`CASE WHEN status = 1 THEN 'active' END`)\r\n\r\n### Aggregation Scope Tracking\r\n\r\nFor visualization components, aggregation functions include `aggregationScope` to show which tables are involved:\r\n\r\n```python\r\n# COUNT(*) shows all tables in the query\r\n{\r\n \"field\": \"COUNT(*)\",\r\n \"fieldType\": \"aggregation\",\r\n \"aggregationScope\": [\"users\", \"orders\", \"products\"] # All related tables\r\n}\r\n\r\n# Specific aggregations show only relevant tables\r\n{\r\n \"field\": \"SUM(orders.amount)\",\r\n \"fieldType\": \"aggregation\", \r\n \"aggregationScope\": [\"orders\"] # Only orders table\r\n}\r\n```\r\n\r\n## \ud83d\udee0\ufe0f Advanced Features\r\n\r\n### MySQL Normalization\r\n\r\n```python\r\nfrom sql_splitter import MySQLCompatibleNormalizer\r\n\r\nnormalizer = MySQLCompatibleNormalizer()\r\nnormalized_sql, rules, errors = normalizer.normalize_query(sql)\r\n```\r\n\r\n### Old-Style JOIN Conversion\r\n\r\nAutomatically converts old-style comma-separated JOINs:\r\n\r\n```sql\r\n-- Input: Old-style\r\nSELECT * FROM users a, orders b WHERE a.id = b.user_id\r\n\r\n-- Output: Modern JOIN\r\nSELECT * FROM users a JOIN orders b ON a.id = b.user_id\r\n```\r\n\r\n### Context-Aware Alias Resolution\r\n\r\nHandles complex alias scenarios:\r\n\r\n```python\r\n# Resolves aliases like 'u' -> 'users', 'o' -> 'orders'\r\n\"metadata\": {\r\n \"aliasMapping\": {\r\n \"u\": \"users\",\r\n \"o\": \"orders\"\r\n }\r\n}\r\n```\r\n\r\n## \ud83d\udcda Documentation\r\n\r\n- **[Quick Start Guide](docs/Quick-Start-Guide.md)** - Get started in 5 minutes\r\n- **[API Documentation](docs/SQL-Parser-Usage-Guide.md)** - Complete API reference\r\n- **[Expected Format](docs/expect.md)** - JSON output specification\r\n- **[Examples](examples/)** - Real-world usage examples\r\n\r\n## \ud83e\uddea Testing\r\n\r\n```bash\r\n# Run basic tests\r\npython -m pytest tests/\r\n\r\n# Run with coverage\r\npython -m pytest tests/ --cov=sql_splitter --cov-report=html\r\n```\r\n\r\n## \ud83d\udccb Requirements\r\n\r\n- Python 3.7+\r\n- No external dependencies (pure Python implementation)\r\n\r\n## \ud83e\udd1d Contributing\r\n\r\n1. Fork the repository\r\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\r\n3. Commit your changes (`git commit -m 'Add amazing feature'`)\r\n4. Push to the branch (`git push origin feature/amazing-feature`)\r\n5. Open a Pull Request\r\n\r\n## \ud83d\udcdd License\r\n\r\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\r\n\r\n## \ud83d\ude4f Acknowledgments\r\n\r\n- Built for SQL visualization component developers\r\n- Supports complex MySQL queries and edge cases\r\n- Designed with performance and accuracy in mind\r\n\r\n---\r\n\r\n**Made with \u2764\ufe0f for the SQL visualization community**\r\n",
"bugtrack_url": null,
"license": null,
"summary": "Advanced MySQL SQL Parser with Visualization Component Support",
"version": "6.1.0",
"project_urls": {
"Bug Reports": "https://github.com/alexkwok22/sql-splitter/issues",
"Documentation": "https://github.com/alexkwok22/sql-splitter/blob/main/docs/",
"Homepage": "https://github.com/alexkwok22/sql-splitter",
"Repository": "https://github.com/alexkwok22/sql-splitter"
},
"split_keywords": [
"sql",
" parser",
" mysql",
" visualization",
" ast",
" database",
" query",
" analysis"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b46c24af72d5d25cd6e5cdda8890ad201721df38b302f72150d7aca3617a96a7",
"md5": "79122285eb56de4b9d6324af22d64e0f",
"sha256": "5e6b4737670704078da136ea1a141c825542bd73c5acd60f8d8cd25f06483503"
},
"downloads": -1,
"filename": "sql_splitter-6.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "79122285eb56de4b9d6324af22d64e0f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 49183,
"upload_time": "2025-08-28T04:32:05",
"upload_time_iso_8601": "2025-08-28T04:32:05.041830Z",
"url": "https://files.pythonhosted.org/packages/b4/6c/24af72d5d25cd6e5cdda8890ad201721df38b302f72150d7aca3617a96a7/sql_splitter-6.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8a667976de83db4e823bc0a7fd583c896944ff196afbe8993ce3280c25e5005d",
"md5": "7d2d8fb3b0ccfafb252daeec0a56a187",
"sha256": "28269494aa6307fb9c0f286716f60f88482591035695c28f34095605c489b85e"
},
"downloads": -1,
"filename": "sql_splitter-6.1.0.tar.gz",
"has_sig": false,
"md5_digest": "7d2d8fb3b0ccfafb252daeec0a56a187",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 44965,
"upload_time": "2025-08-28T04:32:06",
"upload_time_iso_8601": "2025-08-28T04:32:06.553941Z",
"url": "https://files.pythonhosted.org/packages/8a/66/7976de83db4e823bc0a7fd583c896944ff196afbe8993ce3280c25e5005d/sql_splitter-6.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-28 04:32:06",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "alexkwok22",
"github_project": "sql-splitter",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "sql-splitter"
}