lombardi

Name	lombardi JSON
Version	0.1.6 JSON
	download
home_page	None
Summary	SQL performance analysis and optimization advisor with SQLMesh integration
upload_time	2025-07-27 22:45:00
maintainer	None
docs_url	None
author	None
requires_python	>=3.11
license	MIT
keywords	data optimization performance sql sqlmesh
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            # Lombardi 🎯

> SQL performance analysis and optimization advisor with SQLMesh integration

[![PyPI version](https://img.shields.io/pypi/v/lombardi.svg)](https://pypi.org/project/lombardi/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Lombardi is a powerful SQL performance analysis tool that helps you identify bottlenecks, detect antipatterns, and optimize your queries across different data warehouses. Built on top of SQLGlot's semantic parsing, it provides actionable insights for better query performance.

## ✨ Features

### 🔍 **Semantic SQL Analysis**
- **Complexity Scoring**: Quantitative assessment (0-100) based on joins, subqueries, nesting depth
- **Antipattern Detection**: Identifies common performance killers like SELECT *, cartesian joins, functions in WHERE clauses
- **Optimization Suggestions**: Actionable recommendations with specific examples

### 🏢 **Warehouse-Specific Rules**
- **Snowflake**: Clustering keys, result caching, VARIANT optimization, time travel suggestions
- **BigQuery**: Partitioning recommendations, slot efficiency, materialized views, array operations
- **Universal**: Cross-platform optimizations that work everywhere

### 🛠️ **Integration Ready**
- **SQLMesh Integration**: Custom audit for performance checks during `sqlmesh plan`
- **CLI Tool**: Rich terminal output with colors, tables, and formatting
- **Python API**: Programmatic access for custom workflows
- **Multiple Output Formats**: Rich terminal, JSON, plain text

## 🚀 Quick Start

### Installation

```bash
pip install lombardi
```

### CLI Usage

```bash
# Analyze a SQL file
lombardi analyze query.sql --dialect snowflake

# Analyze a query string
lombardi analyze --sql "SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id"

# Get just complexity metrics
lombardi complexity --sql "SELECT COUNT(*) FROM large_table WHERE date_col >= '2024-01-01'"

# Export results as JSON
lombardi analyze query.sql --format json > analysis.json

# BigQuery-specific analysis
lombardi analyze --sql "SELECT * FROM dataset.table" --dialect bigquery
```

### Python API

```python
from lombardi import ComplexityAnalyzer, AntipatternDetector, OptimizationSuggester

# Analyze query complexity
analyzer = ComplexityAnalyzer(dialect="snowflake")
metrics = analyzer.analyze("SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id")

print(f"Complexity Score: {metrics.complexity_score}/100")
print(f"Join Count: {metrics.join_count}")
print(f"Subqueries: {metrics.subquery_count}")

# Detect antipatterns
detector = AntipatternDetector()
issues = detector.detect("SELECT * FROM users WHERE UPPER(name) = 'JOHN'")

for issue in issues:
    print(f"❌ {issue.pattern}: {issue.description}")

# Get optimization suggestions
suggester = OptimizationSuggester(dialect="bigquery")
suggestions = suggester.suggest("SELECT * FROM large_table WHERE date_col >= '2024-01-01'")

for suggestion in suggestions:
    print(f"💡 {suggestion.title}: {suggestion.description}")
```

## 📊 Example Output

```bash
$ lombardi analyze --sql "SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id WHERE UPPER(c.name) LIKE '%ACME%'"
```

```
╭──────────────────────────── SQL Analysis Results ────────────────────────────╮
│ Complexity Score: 12.0/100                                                   │
╰──────────────────────────────────────────────────────────────────────────────╯

    Complexity Metrics    
┏━━━━━━━━━━━━━━━━┳━━━━━━━┓
┃ Metric         ┃ Value ┃
┡━━━━━━━━━━━━━━━━╇━━━━━━━┩
│ Join Count     │ 1     │
│ Subquery Count │ 0     │
│ CTE Count      │ 0     │
│ Function Count │ 1     │
│ Nesting Depth  │ 0     │
│ Table Count    │ 2     │
│ Column Count   │ 3     │
└────────────────┴───────┘

                                Detected Issues                                 
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Pattern                     ┃ Severity ┃ Description                         ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Select Star                 │ MEDIUM   │ SELECT * can be inefficient        │
│ Function On Column In Where │ HIGH     │ Functions prevent index usage      │
└─────────────────────────────┴──────────┴─────────────────────────────────────┘

                            Optimization Suggestions                            
┏━━━━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Category    ┃ Priority ┃ Suggestion                                          ┃
┡━━━━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Performance │ HIGH     │ Function On Column In Where: Rewrite to avoid      │
│             │          │ applying functions to indexed columns               │
│ Indexing    │ HIGH     │ JOIN Index Optimization: Ensure indexes exist on   │
│             │          │ JOIN columns: o.customer_id, c.id                  │
│ Performance │ MEDIUM   │ Select Star: Explicitly list required columns      │
└─────────────┴──────────┴─────────────────────────────────────────────────────┘
```

## 🔌 SQLMesh Integration

Integrate Lombardi into your SQLMesh workflow for automated performance checks:

```python
# In your SQLMesh model
MODEL (
  name my_project.optimized_table,
  kind FULL,
  audits [performance_check]
);

SELECT 
  region_code,
  species_code,
  COUNT(*) as observation_count
FROM @DEV.my_project.raw_observations 
WHERE observation_date >= '2024-01-01'
GROUP BY 1, 2;
```

```python
# Configure the audit
from lombardi.integrations.sqlmesh_audit import create_performance_audit

performance_audit = create_performance_audit(
    complexity_threshold=30.0,
    min_severity="medium",
    dialect="snowflake"
)
```

## 🏗️ Architecture

```
lombardi/
├── analyzers/           # Core analysis engines
│   ├── complexity_analyzer.py      # Complexity scoring
│   ├── antipattern_detector.py     # Antipattern detection  
│   └── optimization_suggester.py   # Optimization recommendations
├── rules/               # Warehouse-specific optimizations
│   ├── snowflake_rules.py          # Snowflake optimizations
│   └── bigquery_rules.py           # BigQuery optimizations
├── integrations/        # Third-party integrations
│   └── sqlmesh_audit.py            # SQLMesh audit integration
└── cli.py              # Command-line interface
```

## 🎯 Detected Antipatterns

Lombardi identifies these common SQL performance issues:

| Pattern | Severity | Description |
|---------|----------|-------------|
| **SELECT *** | Medium | Can be inefficient and fragile |
| **Cartesian Joins** | Critical | Missing JOIN conditions |
| **Functions in WHERE** | High | Prevents index usage |
| **Leading Wildcards** | Medium | `LIKE '%pattern'` prevents indexes |
| **Missing WHERE** | Medium | Full table scans |
| **Subquery in SELECT** | Medium | Often better as JOINs |

## 🏢 Warehouse-Specific Optimizations

### Snowflake
- ❄️ **Clustering Keys**: Recommendations for large table partitioning
- 🚀 **Result Caching**: Identify cacheable query patterns  
- 📊 **Warehouse Sizing**: Complexity-based sizing suggestions
- 🕒 **Time Travel**: Optimize historical queries
- 📄 **VARIANT**: Semi-structured data query optimization

### BigQuery
- 📅 **Partitioning**: Date/timestamp partitioning opportunities
- 🗂️ **Clustering**: Multi-column clustering recommendations
- 💰 **Slot Efficiency**: Cost optimization suggestions
- 🔄 **Materialized Views**: Pre-aggregation opportunities
- 📊 **Array Operations**: UNNEST and array function optimization

## 🔧 Configuration

### CLI Options

```bash
lombardi analyze [OPTIONS] [SQL_FILE]

Options:
  --sql, -s TEXT              SQL query string to analyze
  --dialect, -d TEXT          SQL dialect (snowflake, bigquery, etc.)
  --threshold, -t FLOAT       Complexity threshold (0-100) [default: 50.0]
  --severity TEXT             Minimum severity (low, medium, high, critical) [default: medium]
  --format, -f TEXT           Output format (rich, json, plain) [default: rich]
  --suggestions/--no-suggestions  Include optimization suggestions [default: suggestions]
```

### Python API Configuration

```python
# Initialize with specific dialect
analyzer = ComplexityAnalyzer(dialect="snowflake")
detector = AntipatternDetector(dialect="bigquery")

# Configure thresholds and rules
suggester = OptimizationSuggester(dialect="snowflake")
suggestions = suggester.suggest(sql_query)

# Warehouse-specific analysis
from lombardi.rules.snowflake_rules import SnowflakeRules
snowflake_rules = SnowflakeRules()
sf_suggestions = snowflake_rules.analyze(sql_query)
```

## 🧪 Development

### Setup

```bash
git clone https://github.com/Doctacon/lombardi.git
cd lombardi
uv sync
```

### Testing

```bash
# Run tests
uv run pytest tests/ -v

# Test CLI
uv run lombardi analyze --sql "SELECT * FROM test_table"

# Test with coverage
uv run pytest tests/ --cov=lombardi
```

### Building

```bash
# Build package
uv build

# Test locally
pip install dist/lombardi-*.whl
```

## 📈 Performance Impact

Lombardi helps identify optimizations that can provide:

- **Query Speed**: 10x-100x faster execution through proper indexing
- **Cost Reduction**: 50-90% lower warehouse costs via efficient queries  
- **Resource Usage**: Reduced CPU, memory, and I/O through better query patterns
- **Maintainability**: Cleaner, more readable SQL through antipattern detection

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes and add tests
4. Run tests: `uv run pytest`
5. Submit a pull request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- **SQLGlot**: Powers our semantic SQL parsing
- **SQLMesh**: Inspiration for the audit integration pattern
- **Rich**: Beautiful terminal output
- **Typer**: Excellent CLI framework

## 🔗 Links

- **Documentation**: [Coming Soon]
- **PyPI**: https://pypi.org/project/lombardi/
- **Issues**: https://github.com/Doctacon/lombardi/issues
- **Discussions**: https://github.com/Doctacon/lombardi/discussions

---

*Named after [Vince Lombardi](https://en.wikipedia.org/wiki/Vince_Lombardi), who believed "Perfection is not attainable, but if we chase perfection we can catch excellence" - the same philosophy we apply to SQL optimization.*

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "lombardi",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "data, optimization, performance, sql, sqlmesh",
    "author": null,
    "author_email": "Connor Lough <loughondata@protonmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/9d/39/8d19a794d40e79938a1e5b419d9ba5c114cafc47454641c78182215df652/lombardi-0.1.6.tar.gz",
    "platform": null,
    "description": "# Lombardi \ud83c\udfaf\n\n> SQL performance analysis and optimization advisor with SQLMesh integration\n\n[![PyPI version](https://img.shields.io/pypi/v/lombardi.svg)](https://pypi.org/project/lombardi/)\n[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\nLombardi is a powerful SQL performance analysis tool that helps you identify bottlenecks, detect antipatterns, and optimize your queries across different data warehouses. Built on top of SQLGlot's semantic parsing, it provides actionable insights for better query performance.\n\n## \u2728 Features\n\n### \ud83d\udd0d **Semantic SQL Analysis**\n- **Complexity Scoring**: Quantitative assessment (0-100) based on joins, subqueries, nesting depth\n- **Antipattern Detection**: Identifies common performance killers like SELECT *, cartesian joins, functions in WHERE clauses\n- **Optimization Suggestions**: Actionable recommendations with specific examples\n\n### \ud83c\udfe2 **Warehouse-Specific Rules**\n- **Snowflake**: Clustering keys, result caching, VARIANT optimization, time travel suggestions\n- **BigQuery**: Partitioning recommendations, slot efficiency, materialized views, array operations\n- **Universal**: Cross-platform optimizations that work everywhere\n\n### \ud83d\udee0\ufe0f **Integration Ready**\n- **SQLMesh Integration**: Custom audit for performance checks during `sqlmesh plan`\n- **CLI Tool**: Rich terminal output with colors, tables, and formatting\n- **Python API**: Programmatic access for custom workflows\n- **Multiple Output Formats**: Rich terminal, JSON, plain text\n\n## \ud83d\ude80 Quick Start\n\n### Installation\n\n```bash\npip install lombardi\n```\n\n### CLI Usage\n\n```bash\n# Analyze a SQL file\nlombardi analyze query.sql --dialect snowflake\n\n# Analyze a query string\nlombardi analyze --sql \"SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id\"\n\n# Get just complexity metrics\nlombardi complexity --sql \"SELECT COUNT(*) FROM large_table WHERE date_col >= '2024-01-01'\"\n\n# Export results as JSON\nlombardi analyze query.sql --format json > analysis.json\n\n# BigQuery-specific analysis\nlombardi analyze --sql \"SELECT * FROM dataset.table\" --dialect bigquery\n```\n\n### Python API\n\n```python\nfrom lombardi import ComplexityAnalyzer, AntipatternDetector, OptimizationSuggester\n\n# Analyze query complexity\nanalyzer = ComplexityAnalyzer(dialect=\"snowflake\")\nmetrics = analyzer.analyze(\"SELECT * FROM orders JOIN customers ON orders.customer_id = customers.id\")\n\nprint(f\"Complexity Score: {metrics.complexity_score}/100\")\nprint(f\"Join Count: {metrics.join_count}\")\nprint(f\"Subqueries: {metrics.subquery_count}\")\n\n# Detect antipatterns\ndetector = AntipatternDetector()\nissues = detector.detect(\"SELECT * FROM users WHERE UPPER(name) = 'JOHN'\")\n\nfor issue in issues:\n    print(f\"\u274c {issue.pattern}: {issue.description}\")\n\n# Get optimization suggestions\nsuggester = OptimizationSuggester(dialect=\"bigquery\")\nsuggestions = suggester.suggest(\"SELECT * FROM large_table WHERE date_col >= '2024-01-01'\")\n\nfor suggestion in suggestions:\n    print(f\"\ud83d\udca1 {suggestion.title}: {suggestion.description}\")\n```\n\n## \ud83d\udcca Example Output\n\n```bash\n$ lombardi analyze --sql \"SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id WHERE UPPER(c.name) LIKE '%ACME%'\"\n```\n\n```\n\u256d\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500 SQL Analysis Results \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 Complexity Score: 12.0/100                                                   \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n\n    Complexity Metrics    \n\u250f\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2533\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2513\n\u2503 Metric         \u2503 Value \u2503\n\u2521\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2547\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2529\n\u2502 Join Count     \u2502 1     \u2502\n\u2502 Subquery Count \u2502 0     \u2502\n\u2502 CTE Count      \u2502 0     \u2502\n\u2502 Function Count \u2502 1     \u2502\n\u2502 Nesting Depth  \u2502 0     \u2502\n\u2502 Table Count    \u2502 2     \u2502\n\u2502 Column Count   \u2502 3     \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n\n                                Detected Issues                                 \n\u250f\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2533\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2533\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2513\n\u2503 Pattern                     \u2503 Severity \u2503 Description                         \u2503\n\u2521\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2547\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2547\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2529\n\u2502 Select Star                 \u2502 MEDIUM   \u2502 SELECT * can be inefficient        \u2502\n\u2502 Function On Column In Where \u2502 HIGH     \u2502 Functions prevent index usage      \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n\n                            Optimization Suggestions                            \n\u250f\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2533\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2533\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2513\n\u2503 Category    \u2503 Priority \u2503 Suggestion                                          \u2503\n\u2521\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2547\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2547\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2501\u2529\n\u2502 Performance \u2502 HIGH     \u2502 Function On Column In Where: Rewrite to avoid      \u2502\n\u2502             \u2502          \u2502 applying functions to indexed columns               \u2502\n\u2502 Indexing    \u2502 HIGH     \u2502 JOIN Index Optimization: Ensure indexes exist on   \u2502\n\u2502             \u2502          \u2502 JOIN columns: o.customer_id, c.id                  \u2502\n\u2502 Performance \u2502 MEDIUM   \u2502 Select Star: Explicitly list required columns      \u2502\n\u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n```\n\n## \ud83d\udd0c SQLMesh Integration\n\nIntegrate Lombardi into your SQLMesh workflow for automated performance checks:\n\n```python\n# In your SQLMesh model\nMODEL (\n  name my_project.optimized_table,\n  kind FULL,\n  audits [performance_check]\n);\n\nSELECT \n  region_code,\n  species_code,\n  COUNT(*) as observation_count\nFROM @DEV.my_project.raw_observations \nWHERE observation_date >= '2024-01-01'\nGROUP BY 1, 2;\n```\n\n```python\n# Configure the audit\nfrom lombardi.integrations.sqlmesh_audit import create_performance_audit\n\nperformance_audit = create_performance_audit(\n    complexity_threshold=30.0,\n    min_severity=\"medium\",\n    dialect=\"snowflake\"\n)\n```\n\n## \ud83c\udfd7\ufe0f Architecture\n\n```\nlombardi/\n\u251c\u2500\u2500 analyzers/           # Core analysis engines\n\u2502   \u251c\u2500\u2500 complexity_analyzer.py      # Complexity scoring\n\u2502   \u251c\u2500\u2500 antipattern_detector.py     # Antipattern detection  \n\u2502   \u2514\u2500\u2500 optimization_suggester.py   # Optimization recommendations\n\u251c\u2500\u2500 rules/               # Warehouse-specific optimizations\n\u2502   \u251c\u2500\u2500 snowflake_rules.py          # Snowflake optimizations\n\u2502   \u2514\u2500\u2500 bigquery_rules.py           # BigQuery optimizations\n\u251c\u2500\u2500 integrations/        # Third-party integrations\n\u2502   \u2514\u2500\u2500 sqlmesh_audit.py            # SQLMesh audit integration\n\u2514\u2500\u2500 cli.py              # Command-line interface\n```\n\n## \ud83c\udfaf Detected Antipatterns\n\nLombardi identifies these common SQL performance issues:\n\n| Pattern | Severity | Description |\n|---------|----------|-------------|\n| **SELECT *** | Medium | Can be inefficient and fragile |\n| **Cartesian Joins** | Critical | Missing JOIN conditions |\n| **Functions in WHERE** | High | Prevents index usage |\n| **Leading Wildcards** | Medium | `LIKE '%pattern'` prevents indexes |\n| **Missing WHERE** | Medium | Full table scans |\n| **Subquery in SELECT** | Medium | Often better as JOINs |\n\n## \ud83c\udfe2 Warehouse-Specific Optimizations\n\n### Snowflake\n- \u2744\ufe0f **Clustering Keys**: Recommendations for large table partitioning\n- \ud83d\ude80 **Result Caching**: Identify cacheable query patterns  \n- \ud83d\udcca **Warehouse Sizing**: Complexity-based sizing suggestions\n- \ud83d\udd52 **Time Travel**: Optimize historical queries\n- \ud83d\udcc4 **VARIANT**: Semi-structured data query optimization\n\n### BigQuery\n- \ud83d\udcc5 **Partitioning**: Date/timestamp partitioning opportunities\n- \ud83d\uddc2\ufe0f **Clustering**: Multi-column clustering recommendations\n- \ud83d\udcb0 **Slot Efficiency**: Cost optimization suggestions\n- \ud83d\udd04 **Materialized Views**: Pre-aggregation opportunities\n- \ud83d\udcca **Array Operations**: UNNEST and array function optimization\n\n## \ud83d\udd27 Configuration\n\n### CLI Options\n\n```bash\nlombardi analyze [OPTIONS] [SQL_FILE]\n\nOptions:\n  --sql, -s TEXT              SQL query string to analyze\n  --dialect, -d TEXT          SQL dialect (snowflake, bigquery, etc.)\n  --threshold, -t FLOAT       Complexity threshold (0-100) [default: 50.0]\n  --severity TEXT             Minimum severity (low, medium, high, critical) [default: medium]\n  --format, -f TEXT           Output format (rich, json, plain) [default: rich]\n  --suggestions/--no-suggestions  Include optimization suggestions [default: suggestions]\n```\n\n### Python API Configuration\n\n```python\n# Initialize with specific dialect\nanalyzer = ComplexityAnalyzer(dialect=\"snowflake\")\ndetector = AntipatternDetector(dialect=\"bigquery\")\n\n# Configure thresholds and rules\nsuggester = OptimizationSuggester(dialect=\"snowflake\")\nsuggestions = suggester.suggest(sql_query)\n\n# Warehouse-specific analysis\nfrom lombardi.rules.snowflake_rules import SnowflakeRules\nsnowflake_rules = SnowflakeRules()\nsf_suggestions = snowflake_rules.analyze(sql_query)\n```\n\n## \ud83e\uddea Development\n\n### Setup\n\n```bash\ngit clone https://github.com/Doctacon/lombardi.git\ncd lombardi\nuv sync\n```\n\n### Testing\n\n```bash\n# Run tests\nuv run pytest tests/ -v\n\n# Test CLI\nuv run lombardi analyze --sql \"SELECT * FROM test_table\"\n\n# Test with coverage\nuv run pytest tests/ --cov=lombardi\n```\n\n### Building\n\n```bash\n# Build package\nuv build\n\n# Test locally\npip install dist/lombardi-*.whl\n```\n\n## \ud83d\udcc8 Performance Impact\n\nLombardi helps identify optimizations that can provide:\n\n- **Query Speed**: 10x-100x faster execution through proper indexing\n- **Cost Reduction**: 50-90% lower warehouse costs via efficient queries  \n- **Resource Usage**: Reduced CPU, memory, and I/O through better query patterns\n- **Maintainability**: Cleaner, more readable SQL through antipattern detection\n\n## \ud83e\udd1d Contributing\n\nWe welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.\n\n1. Fork the repository\n2. Create a feature branch: `git checkout -b feature/amazing-feature`\n3. Make your changes and add tests\n4. Run tests: `uv run pytest`\n5. Submit a pull request\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- **SQLGlot**: Powers our semantic SQL parsing\n- **SQLMesh**: Inspiration for the audit integration pattern\n- **Rich**: Beautiful terminal output\n- **Typer**: Excellent CLI framework\n\n## \ud83d\udd17 Links\n\n- **Documentation**: [Coming Soon]\n- **PyPI**: https://pypi.org/project/lombardi/\n- **Issues**: https://github.com/Doctacon/lombardi/issues\n- **Discussions**: https://github.com/Doctacon/lombardi/discussions\n\n---\n\n*Named after [Vince Lombardi](https://en.wikipedia.org/wiki/Vince_Lombardi), who believed \"Perfection is not attainable, but if we chase perfection we can catch excellence\" - the same philosophy we apply to SQL optimization.*",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "SQL performance analysis and optimization advisor with SQLMesh integration",
    "version": "0.1.6",
    "project_urls": {
        "Homepage": "https://github.com/Doctacon/lombardi",
        "Issues": "https://github.com/Doctacon/lombardi/issues",
        "Repository": "https://github.com/Doctacon/lombardi"
    },
    "split_keywords": [
        "data",
        " optimization",
        " performance",
        " sql",
        " sqlmesh"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "d5d99854689b17847b3eb07478591ab5599f41d6e9b922a83305dba26e23fcaf",
                "md5": "7f000f9211b73239af9c3e58621372e3",
                "sha256": "22a59b1b494d155d2b63ebb7fa28a6c64c8d4383db9e84b9c31ecbc1b9f13c3f"
            },
            "downloads": -1,
            "filename": "lombardi-0.1.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7f000f9211b73239af9c3e58621372e3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 21254,
            "upload_time": "2025-07-27T22:44:59",
            "upload_time_iso_8601": "2025-07-27T22:44:59.279485Z",
            "url": "https://files.pythonhosted.org/packages/d5/d9/9854689b17847b3eb07478591ab5599f41d6e9b922a83305dba26e23fcaf/lombardi-0.1.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9d398d19a794d40e79938a1e5b419d9ba5c114cafc47454641c78182215df652",
                "md5": "323fb45f16ea8f79a3aca309edb6b979",
                "sha256": "9826ad6f23ee3a5b821f35ed9d8f890805df70161c99848d9e7e3aadc3f64e80"
            },
            "downloads": -1,
            "filename": "lombardi-0.1.6.tar.gz",
            "has_sig": false,
            "md5_digest": "323fb45f16ea8f79a3aca309edb6b979",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 101087,
            "upload_time": "2025-07-27T22:45:00",
            "upload_time_iso_8601": "2025-07-27T22:45:00.674778Z",
            "url": "https://files.pythonhosted.org/packages/9d/39/8d19a794d40e79938a1e5b419d9ba5c114cafc47454641c78182215df652/lombardi-0.1.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-27 22:45:00",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Doctacon",
    "github_project": "lombardi",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "lombardi"
}

None