# thai-lint
The AI Linter - Enterprise-ready linting and governance for AI-generated code across multiple languages.
## Documentation
**New to thailint?** Start here:
- **[Quick Start Guide](docs/quick-start.md)** - Get running in 5 minutes
- **[Configuration Reference](docs/configuration.md)** - Complete config options for all linters
- **[Troubleshooting Guide](docs/troubleshooting.md)** - Common issues and solutions
**Full Documentation:** Browse the **[docs/](docs/)** folder for comprehensive guides covering installation, all linters, configuration patterns, and integration examples.
## Overview
thailint is a modern, enterprise-ready multi-language linter designed specifically for AI-generated code. It focuses on common mistakes and anti-patterns that AI coding assistants frequently introduce—issues that existing linters don't catch or don't handle consistently across languages.
**Why thailint?**
We're not trying to replace the wonderful existing linters like Pylint, ESLint, or Ruff. Instead, thailint fills critical gaps:
- **AI-Specific Patterns**: AI assistants have predictable blind spots (excessive nesting, magic numbers, SRP violations) that traditional linters miss
- **Cross-Language Consistency**: Detects the same anti-patterns across Python, TypeScript, and JavaScript with unified rules
- **No Existing Solutions**: Issues like excessive nesting depth, file placement violations, and cross-project code duplication lack comprehensive multi-language detection
- **Governance Layer**: Enforces project-wide structure and organization patterns that AI can't infer from local context
thailint complements your existing linting stack by catching the patterns AI tools repeatedly miss.
**Complete documentation available in the [docs/](docs/) folder** covering installation, configuration, all linters, and troubleshooting.
## Features
### Core Capabilities
- **File Placement Linting** - Enforce project structure and organization
- **Magic Numbers Linting** - Detect unnamed numeric literals that should be constants
  - Python and TypeScript support with AST analysis
  - Context-aware detection (ignores constants, test files, range() usage)
  - Configurable allowed numbers and thresholds
  - Helpful suggestions for extracting to named constants
- **Nesting Depth Linting** - Detect excessive code nesting with AST analysis
  - Python and TypeScript support with tree-sitter
  - Configurable max depth (default: 4, recommended: 3)
  - Helpful refactoring suggestions (guard clauses, extract method)
- **SRP Linting** - Detect Single Responsibility Principle violations
  - Heuristic-based analysis (method count, LOC, keywords)
  - Language-specific thresholds (Python, TypeScript, JavaScript)
  - Refactoring patterns from real-world examples
- **DRY Linting** - Detect duplicate code across projects
  - Token-based hash detection with SQLite storage
  - Fast duplicate detection (in-memory or disk-backed)
  - Configurable thresholds (lines, tokens, occurrences)
  - Language-specific detection (Python, TypeScript, JavaScript)
  - False positive filtering (keyword args, imports)
- **Pluggable Architecture** - Easy to extend with custom linters
- **Multi-Language Support** - Python, TypeScript, JavaScript, and more
- **Flexible Configuration** - YAML/JSON configs with pattern matching
- **5-Level Ignore System** - Repo, directory, file, method, and line-level ignores
### Deployment Modes
- **CLI Mode** - Full-featured command-line interface
- **Library API** - Python library for programmatic integration
- **Docker Support** - Containerized deployment for CI/CD
### Enterprise Features
- **Performance** - <100ms for single files, <5s for 1000 files
- **Type Safety** - Full type hints and MyPy strict mode
- **Test Coverage** - 90% coverage with 317 tests
- **CI/CD Ready** - Proper exit codes and JSON output
- **Comprehensive Docs** - Complete documentation and examples
## Installation
### From Source
```bash
# Clone repository
git clone https://github.com/be-wise-be-kind/thai-lint.git
cd thai-lint
# Install dependencies
pip install -e ".[dev]"
```
### From PyPI
```bash
pip install thailint
```
### With Docker
```bash
# Pull from Docker Hub
docker pull washad/thailint:latest
# Run CLI
docker run --rm washad/thailint:latest --help
```
## Quick Start
### CLI Mode
```bash
# Check file placement
thailint file-placement .
# Check multiple files
thailint nesting file1.py file2.py file3.py
# Check specific directory
thailint nesting src/
# Check for duplicate code
thailint dry .
# Check for magic numbers
thailint magic-numbers src/
# With config file
thailint dry --config .thailint.yaml src/
# JSON output for CI/CD
thailint dry --format json src/
```
**New to thailint?** See the **[Quick Start Guide](docs/quick-start.md)** for a complete walkthrough including config generation, understanding output, and next steps.
### Library Mode
```python
from src import Linter

# Initialize linter
linter = Linter(config_file='.thailint.yaml')

# Lint directory
violations = linter.lint('src/', rules=['file-placement'])

# Process results
if violations:
    for v in violations:
        print(f"{v.file_path}: {v.message}")
```
### Docker Mode
```bash
# Lint directory (recommended - lints all files inside)
docker run --rm -v $(pwd):/data \
  washad/thailint:latest file-placement /data
# Lint single file
docker run --rm -v $(pwd):/data \
  washad/thailint:latest file-placement /data/src/app.py
# Lint multiple specific files
docker run --rm -v $(pwd):/data \
  washad/thailint:latest nesting /data/src/file1.py /data/src/file2.py
# Check nesting depth in subdirectory
docker run --rm -v $(pwd):/data \
  washad/thailint:latest nesting /data/src
```
### Docker with Sibling Directories
For Docker environments with sibling directories (e.g., separate config and source directories), use `--project-root` or config path inference:
```bash
# Directory structure:
# /workspace/
# ├── root/      # Contains .thailint.yaml and .git
# ├── backend/   # Code to lint
# └── tools/
# Option 1: Explicit project root (recommended)
docker run --rm -v $(pwd):/data \
  washad/thailint:latest \
  --project-root /data/root \
  magic-numbers /data/backend/
# Option 2: Config path inference (automatic)
docker run --rm -v $(pwd):/data \
  washad/thailint:latest \
  --config /data/root/.thailint.yaml \
  magic-numbers /data/backend/
# With ignore patterns resolving from project root
docker run --rm -v $(pwd):/data \
  washad/thailint:latest \
  --project-root /data/root \
  --config /data/root/.thailint.yaml \
  magic-numbers /data/backend/
```
**Priority order:**
1. `--project-root` (highest priority - explicit specification)
2. Inferred from `--config` path directory
3. Auto-detection from file location (fallback)
See **[Docker Usage](#docker-usage)** section below for more examples.
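
The priority order above can be sketched in a few lines of Python. This is an illustration only, not thailint's actual code: `resolve_project_root` is a hypothetical helper, and the fallback step assumes that auto-detection walks up from the linted path looking for `.git` or `.thailint.yaml`, which the text above does not spell out.

```python
from pathlib import Path
from typing import Optional

# Marker names are an assumption based on the directory layout shown above.
ROOT_MARKERS = (".git", ".thailint.yaml")


def resolve_project_root(
    project_root: Optional[str], config: Optional[str], target: str
) -> Path:
    """Pick a project root using the documented priority order (hypothetical helper)."""
    # 1. An explicit --project-root always wins.
    if project_root:
        return Path(project_root)
    # 2. Otherwise infer the root from the directory holding the config file.
    if config:
        return Path(config).parent
    # 3. Fallback: walk up from the linted path until a marker is found.
    start = Path(target)
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in ROOT_MARKERS):
            return candidate
    return start


print(resolve_project_root(None, "/data/root/.thailint.yaml", "/data/backend"))
```

Because the explicit flag short-circuits everything else, passing both `--project-root` and `--config` (as in the last Docker command) keeps ignore patterns anchored to the root you chose rather than to wherever the config file happens to live.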
## Configuration
Create `.thailint.yaml` in your project root:
```yaml
# File placement linter configuration
file-placement:
  enabled: true
  # Global patterns apply to entire project
  global_patterns:
    deny:
      - pattern: "^(?!src/|tests/).*\\.py$"
        message: "Python files must be in src/ or tests/"
  # Directory-specific rules
  directories:
    src:
      allow:
        - ".*\\.py$"
      deny:
        - "test_.*\\.py$"
    tests:
      allow:
        - "test_.*\\.py$"
        - "conftest\\.py$"
  # Files/directories to ignore
  ignore:
    - "__pycache__/"
    - "*.pyc"
    - ".venv/"

# Nesting depth linter configuration
nesting:
  enabled: true
  max_nesting_depth: 4  # Maximum allowed nesting depth
  # Language-specific settings (optional)
  languages:
    python:
      max_depth: 4
    typescript:
      max_depth: 4
    javascript:
      max_depth: 4

# DRY linter configuration
dry:
  enabled: true
  min_duplicate_lines: 4    # Minimum lines to consider duplicate
  min_duplicate_tokens: 30  # Minimum tokens to consider duplicate
  min_occurrences: 2        # Report if appears 2+ times
  # Language-specific thresholds
  python:
    min_occurrences: 3      # Python: require 3+ occurrences
  # Storage settings (SQLite)
  storage_mode: "memory"    # Options: "memory" (default) or "tempfile"
  # Ignore patterns
  ignore:
    - "tests/"
    - "__init__.py"

# Magic numbers linter configuration
magic-numbers:
  enabled: true
  allowed_numbers: [-1, 0, 1, 2, 10, 100, 1000]  # Numbers allowed without constants
  max_small_integer: 10  # Max value allowed in range() or enumerate()
```
**JSON format also supported** (`.thailint.json`):
```json
{
  "file-placement": {
    "enabled": true,
    "directories": {
      "src": {
        "allow": [".*\\.py$"],
        "deny": ["test_.*\\.py$"]
      }
    },
    "ignore": ["__pycache__/", "*.pyc"]
  },
  "nesting": {
    "enabled": true,
    "max_nesting_depth": 4,
    "languages": {
      "python": { "max_depth": 4 },
      "typescript": { "max_depth": 4 }
    }
  },
  "dry": {
    "enabled": true,
    "min_duplicate_lines": 4,
    "min_duplicate_tokens": 30,
    "min_occurrences": 2,
    "python": {
      "min_occurrences": 3
    },
    "storage_mode": "memory",
    "ignore": ["tests/", "__init__.py"]
  },
  "magic-numbers": {
    "enabled": true,
    "allowed_numbers": [-1, 0, 1, 2, 10, 100, 1000],
    "max_small_integer": 10
  }
}
```
See [Configuration Guide](docs/configuration.md) for complete reference.
**Need help with ignores?** See **[How to Ignore Violations](docs/how-to-ignore-violations.md)** for a complete guide to all ignore levels (line, method, class, file, repository).
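
To see what the `global_patterns.deny` regex in the configuration above accepts and rejects, here is a small sketch using Python's `re` module. It assumes patterns are searched against a POSIX-style path relative to the project root; that matching strategy, and the `check_path` helper, are assumptions for illustration rather than thailint's documented behavior. Note that `\\.` inside a double-quoted YAML or JSON string is a single backslash, so the regex itself is `\.`.

```python
import re

# (compiled pattern, message) pairs mirroring the deny rule in the config above.
DENY_RULES = [
    (re.compile(r"^(?!src/|tests/).*\.py$"), "Python files must be in src/ or tests/"),
]


def check_path(relative_path: str) -> list[str]:
    """Return the messages of every deny rule that the path matches (hypothetical helper)."""
    return [message for pattern, message in DENY_RULES if pattern.search(relative_path)]


for path in ("src/app.py", "tests/test_app.py", "scripts/run.py", "README.md"):
    print(path, check_path(path))
```

The negative lookahead `(?!src/|tests/)` is what makes this a "deny everything except" rule: only `.py` files whose path does not start with `src/` or `tests/` match, so `scripts/run.py` is reported while `README.md` is left alone.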
## Nesting Depth Linter
### Overview
The nesting depth linter detects deeply nested code (if/for/while/try statements) that reduces readability and maintainability. It uses AST analysis to accurately calculate nesting depth.
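
As a rough illustration of AST-based depth counting, the sketch below uses Python's standard `ast` module. It is not thailint's implementation: `max_nesting_depth` is a hypothetical helper, it covers only a subset of statement types, it treats `elif` as one level deeper, and it assumes the function body starts at depth 1 (matching the depth comments in the example further down).

```python
import ast

# Statement types that open a new nesting level in this sketch (a subset, by assumption).
NESTING_NODES = (ast.If, ast.For, ast.While, ast.With, ast.Try)


def max_nesting_depth(func: ast.AST) -> int:
    """Return the deepest nesting level inside a function; its body is depth 1."""

    def visit(node: ast.AST, depth: int) -> int:
        deepest = depth
        for child in ast.iter_child_nodes(node):
            # Entering a nesting statement increases the depth for everything inside it.
            child_depth = depth + 1 if isinstance(child, NESTING_NODES) else depth
            deepest = max(deepest, visit(child, child_depth))
        return deepest

    return visit(func, 1)


SAMPLE = """
def find_first_even(rows):
    for row in rows:
        for value in row:
            if value % 2 == 0:
                return value
    return None
"""

function_node = ast.parse(SAMPLE).body[0]
print(max_nesting_depth(function_node))  # prints 4
```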
### Quick Start
```bash
# Check nesting depth in current directory
thailint nesting .
# Use strict limit (max depth 3)
thailint nesting --max-depth 3 src/
# Get JSON output
thailint nesting --format json src/
```
### Configuration
Add to `.thailint.yaml`:
```yaml
nesting:
  enabled: true
  max_nesting_depth: 3  # Default: 4, recommended: 3
```
### Example Violation
**Code with excessive nesting:**
```python
def process_data(items):
    for item in items:            # Depth 2
        if item.is_valid():       # Depth 3
            try:                  # Depth 4 ← VIOLATION (max=3)
                if item.process():
                    return True
            except Exception:
                pass
    return False
```
**Refactored with guard clauses:**
```python
def process_data(items):
    for item in items:            # Depth 2
        if not item.is_valid():
            continue
        try:                      # Depth 3 ✓
            if item.process():
                return True
        except Exception:
            pass
    return False
```
### Refactoring Patterns
Common patterns to reduce nesting:
1. **Guard Clauses (Early Returns)**
   - Replace `if x: do_something()` with `if not x: return`
   - Exit early, reduce nesting
2. **Extract Method**
   - Move nested logic to separate functions
   - Improves readability and testability
3. **Dispatch Pattern**
   - Replace if-elif-else chains with dictionary dispatch
   - More extensible and cleaner
4. **Flatten Error Handling**
   - Combine multiple try-except blocks
   - Use tuple of exception types
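
The dispatch pattern from the list above is the least obvious of the four, so here is a minimal sketch. The formatter functions and the `render` helper are made up for illustration; the point is that a dictionary lookup plus one guard clause replaces an if-elif-else chain.

```python
import json


def to_json(record: dict) -> str:
    return json.dumps(record, sort_keys=True)


def to_csv(record: dict) -> str:
    return ",".join(str(record[key]) for key in sorted(record))


def to_text(record: dict) -> str:
    return " ".join(f"{key}={record[key]}" for key in sorted(record))


# One table entry per format instead of one elif branch per format.
FORMATTERS = {"json": to_json, "csv": to_csv, "text": to_text}


def render(record: dict, fmt: str) -> str:
    formatter = FORMATTERS.get(fmt)
    if formatter is None:  # guard clause: fail early, keep the happy path flat
        raise ValueError(f"unsupported format: {fmt}")
    return formatter(record)


print(render({"id": 7, "name": "ada"}, "csv"))  # prints 7,ada
```

Adding a new format means adding one function and one dictionary entry; `render` itself never grows another branch.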
### Language Support
- **Python**: Full support (if/for/while/with/try/match)
- **TypeScript**: Full support (if/for/while/try/switch)
- **JavaScript**: Supported via TypeScript parser
See [Nesting Linter Guide](docs/nesting-linter.md) for comprehensive documentation and refactoring patterns.
## Single Responsibility Principle (SRP) Linter
### Overview
The SRP linter detects classes that violate the Single Responsibility Principle by having too many methods, too many lines of code, or generic naming patterns. It uses AST analysis with configurable heuristics to identify classes that likely handle multiple responsibilities.
### Quick Start
```bash
# Check SRP violations in current directory
thailint srp .
# Use custom thresholds
thailint srp --max-methods 10 --max-loc 300 src/
# Get JSON output
thailint srp --format json src/
```
### Configuration
Add to `.thailint.yaml`:
```yaml
srp:
  enabled: true
  max_methods: 7  # Maximum methods per class
  max_loc: 200    # Maximum lines of code per class
  # Language-specific thresholds
  python:
    max_methods: 8
    max_loc: 200
  typescript:
    max_methods: 10  # TypeScript more verbose
    max_loc: 250
```
### Detection Heuristics
The SRP linter uses three heuristics to detect violations:
1. **Method Count**: Classes with >7 methods (default) likely have multiple responsibilities
2. **Lines of Code**: Classes with >200 LOC (default) are often doing too much
3. **Responsibility Keywords**: Names containing "Manager", "Handler", "Processor", etc.
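
A simplified version of these three checks can be written with Python's `ast` module. The sketch below is an approximation under stated assumptions, not the linter's real logic: `srp_findings` is a hypothetical helper, the keyword tuple lists only the three names quoted above, and "lines of code" is measured as the physical line span of the class, including blank lines and comments.

```python
import ast

# Defaults quoted in the heuristics above; the keyword list is deliberately partial.
MAX_METHODS = 7
MAX_LOC = 200
RESPONSIBILITY_KEYWORDS = ("Manager", "Handler", "Processor")


def srp_findings(source: str) -> dict[str, list[str]]:
    """Map each class name to the heuristics it trips (hypothetical helper)."""
    findings: dict[str, list[str]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        reasons = []
        # Heuristic 1: count methods defined directly in the class body.
        methods = [
            item for item in node.body
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
        ]
        if len(methods) > MAX_METHODS:
            reasons.append(f"{len(methods)} methods (max {MAX_METHODS})")
        # Heuristic 2: physical lines spanned by the class definition.
        loc = node.end_lineno - node.lineno + 1
        if loc > MAX_LOC:
            reasons.append(f"{loc} lines (max {MAX_LOC})")
        # Heuristic 3: generic names that hint at mixed responsibilities.
        if any(keyword in node.name for keyword in RESPONSIBILITY_KEYWORDS):
            reasons.append("responsibility keyword in name")
        if reasons:
            findings[node.name] = reasons
    return findings


SAMPLE = "class UserManager:\n" + "".join(
    f"    def method_{i}(self): pass\n" for i in range(8)
)
print(srp_findings(SAMPLE))
```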
### Example Violation
**Code with SRP violation:**
```python
class UserManager:  # 8 methods, contains "Manager" keyword
    def create_user(self): pass
    def update_user(self): pass
    def delete_user(self): pass
    def send_email(self): pass       # ← Different responsibility
    def log_action(self): pass       # ← Different responsibility
    def validate_data(self): pass    # ← Different responsibility
    def generate_report(self): pass  # ← Different responsibility
    def export_data(self): pass      # ← Violation at method 8
```
**Refactored following SRP:**
```python
class UserRepository:  # 3 methods ✓
    def create(self, user): pass
    def update(self, user): pass
    def delete(self, user): pass

class EmailService:  # 1 method ✓
    def send(self, user, template): pass

class UserAuditLog:  # 1 method ✓
    def log(self, action, user): pass

class UserValidator:  # 1 method ✓
    def validate(self, data): pass

class ReportGenerator:  # 1 method ✓
    def generate(self, users): pass
```
### Refactoring Patterns
Common patterns to fix SRP violations (discovered during dogfooding):
1. **Extract Class**
   - Split god classes into focused classes
   - Each class handles one responsibility
2. **Split Configuration and Logic**
   - Separate config loading from business logic
   - Create dedicated ConfigLoader classes
3. **Extract Language-Specific Logic**
   - Separate Python/TypeScript analysis
   - Use analyzer classes per language
4. **Utility Module Pattern**
   - Group related helper methods
   - Create focused utility classes
### Language Support
- **Python**: Full support with method counting and LOC analysis
- **TypeScript**: Full support with tree-sitter parsing
- **JavaScript**: Supported via TypeScript parser
### Real-World Example
**Large class refactoring:**
- **Before**: FilePlacementLinter (33 methods, 382 LOC) - single class handling config, patterns, validation
- **After**: Extract Class pattern applied - 5 focused classes (ConfigLoader, PatternValidator, RuleChecker, PathResolver, FilePlacementLinter)
- **Result**: Each class ≤8 methods, ≤150 LOC, single responsibility
See [SRP Linter Guide](docs/srp-linter.md) for comprehensive documentation and refactoring patterns.
## DRY Linter (Don't Repeat Yourself)
### Overview
The DRY linter detects duplicate code blocks across your entire project using token-based hashing with SQLite storage. It identifies identical or near-identical code that violates the Don't Repeat Yourself (DRY) principle, helping maintain code quality at scale.
### Quick Start
```bash
# Check for duplicate code in current directory
thailint dry .
# Use custom thresholds
thailint dry --min-lines 5 src/
# Use tempfile storage for large projects
thailint dry --storage-mode tempfile src/
# Get JSON output
thailint dry --format json src/
```
### Configuration
Add to `.thailint.yaml`:
```yaml
dry:
  enabled: true
  min_duplicate_lines: 4    # Minimum lines to consider duplicate
  min_duplicate_tokens: 30  # Minimum tokens to consider duplicate
  min_occurrences: 2        # Report if appears 2+ times
  # Language-specific thresholds
  python:
    min_occurrences: 3      # Python: require 3+ occurrences
  typescript:
    min_occurrences: 3      # TypeScript: require 3+ occurrences
  # Storage settings
  storage_mode: "memory"    # Options: "memory" (default) or "tempfile"
  # Ignore patterns
  ignore:
    - "tests/"              # Test code often has acceptable duplication
    - "__init__.py"         # Import-only files exempt
  # False positive filters
  filters:
    keyword_argument_filter: true  # Filter function call kwargs
    import_group_filter: true      # Filter import groups
```
### How It Works
**Token-Based Detection:**
1. Parse code into tokens (stripping comments, normalizing whitespace)
2. Create rolling hash windows of N lines
3. Store hashes in SQLite database with file locations
4. Query for hashes appearing 2+ times across project
**SQLite Storage:**
- In-memory mode (default): Stores in RAM for best performance
- Tempfile mode: Stores in temporary disk file for large projects
- Fresh analysis on every run (no persistence between runs)
- Fast duplicate detection using B-tree indexes
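
The four steps above can be approximated in a short, self-contained sketch using only the standard library. Several simplifications are assumptions made for illustration: lines are normalized as text rather than tokenized, comments are stripped naively at the first `#` (which would break on `#` inside a string), every window is hashed with SHA-256 rather than a true rolling hash, and the table layout and the `find_duplicates` helper are invented here rather than taken from thailint.

```python
import hashlib
import sqlite3

WINDOW = 4  # lines per window, mirroring min_duplicate_lines above


def normalize(source: str) -> list[tuple[int, str]]:
    """Return (line_number, text) pairs with comments and blank lines removed."""
    kept = []
    for number, line in enumerate(source.splitlines(), start=1):
        text = " ".join(line.split("#", 1)[0].split())  # naive comment strip
        if text:
            kept.append((number, text))
    return kept


def find_duplicates(files: dict[str, str], window: int = WINDOW, min_occurrences: int = 2):
    """Group identical windows of normalized lines across files."""
    db = sqlite3.connect(":memory:")  # in-memory mode; nothing persists between runs
    db.execute("CREATE TABLE blocks (digest TEXT, path TEXT, first_line INT, last_line INT)")
    db.execute("CREATE INDEX idx_blocks_digest ON blocks (digest)")
    for path, source in files.items():
        lines = normalize(source)
        for i in range(len(lines) - window + 1):
            chunk = lines[i:i + window]
            text = "\n".join(part for _, part in chunk)
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            db.execute(
                "INSERT INTO blocks VALUES (?, ?, ?, ?)",
                (digest, path, chunk[0][0], chunk[-1][0]),
            )
    rows = db.execute(
        "SELECT digest, path, first_line, last_line FROM blocks WHERE digest IN "
        "(SELECT digest FROM blocks GROUP BY digest HAVING COUNT(*) >= ?) "
        "ORDER BY digest, path, first_line",
        (min_occurrences,),
    ).fetchall()
    db.close()
    groups: dict[str, list[tuple[str, int, int]]] = {}
    for digest, path, first_line, last_line in rows:
        groups.setdefault(digest, []).append((path, first_line, last_line))
    return list(groups.values())


BODY = (
    "def check(data):\n"
    "    if not data:\n"
    "        return False\n"
    "    if not data.get('email'):\n"
    "        return False\n"
    "    return True\n"
)
FILES = {"a.py": BODY, "b.py": "# copied helper\n" + BODY}

for group in find_duplicates(FILES):
    print(group)
```

Because this sketch hashes exact normalized text, it would not match the `validate_user` / `validate_admin` pair shown below, where the parameter names differ; catching those requires the token-level normalization that the linter describes.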
### Example Violation
**Code with duplication:**
```python
# src/auth.py
def validate_user(user_data):
    if not user_data:
        return False
    if not user_data.get('email'):
        return False
    if not user_data.get('password'):
        return False
    return True

# src/admin.py
def validate_admin(admin_data):
    if not admin_data:
        return False
    if not admin_data.get('email'):
        return False
    if not admin_data.get('password'):
        return False
    return True
```
**Violation message:**
```
src/auth.py:3 - Duplicate code detected (4 lines, 2 occurrences)
Locations:
- src/auth.py:3-6
- src/admin.py:3-6
Consider extracting to shared function
```
**Refactored (DRY):**
```python
# src/validators.py
def validate_credentials(data):
    if not data:
        return False
    if not data.get('email'):
        return False
    if not data.get('password'):
        return False
    return True

# src/auth.py & src/admin.py
from src.validators import validate_credentials

def validate_user(user_data):
    return validate_credentials(user_data)

def validate_admin(admin_data):
    return validate_credentials(admin_data)
```
### Performance
| Operation | Performance | Storage Mode |
|-----------|-------------|--------------|
| Scan (1000 files) | 1-3s | Memory (default) |
| Large project (5000+ files) | Use tempfile mode | Tempfile |
**Note**: Every run analyzes files fresh - no persistence between runs ensures accurate results
### Language Support
- **Python**: Full support with AST-based tokenization
- **TypeScript**: Full support with tree-sitter parsing
- **JavaScript**: Supported via TypeScript parser
### False Positive Filtering
Built-in filters automatically exclude common non-duplication patterns:
- **keyword_argument_filter**: Excludes function calls with keyword arguments
- **import_group_filter**: Excludes import statement groups
### Refactoring Patterns
1. **Extract Function**: Move repeated logic to shared function
2. **Extract Base Class**: Create base class for similar implementations
3. **Extract Utility Module**: Move helper functions to shared utilities
4. **Template Method**: Use function parameters for variations
See [DRY Linter Guide](docs/dry-linter.md) for comprehensive documentation, storage modes, and refactoring patterns.
## Magic Numbers Linter
### Overview
The magic numbers linter detects unnamed numeric literals (magic numbers) that should be extracted to named constants. It uses AST analysis to identify numeric literals that lack meaningful context.
### What are Magic Numbers?
**Magic numbers** are unnamed numeric literals in code without explanation:
```python
# Bad - Magic numbers
timeout = 3600 # What is 3600?
max_retries = 5 # Why 5?
# Good - Named constants
TIMEOUT_SECONDS = 3600
MAX_RETRY_ATTEMPTS = 5
```
### Quick Start
```bash
# Check for magic numbers in current directory
thailint magic-numbers .
# Check specific directory
thailint magic-numbers src/
# Get JSON output
thailint magic-numbers --format json src/
```
### Configuration
Add to `.thailint.yaml`:
```yaml
magic-numbers:
  enabled: true
  allowed_numbers: [-1, 0, 1, 2, 10, 100, 1000]
  max_small_integer: 10  # Max for range() to be acceptable
```
### Example Violation
**Code with magic numbers:**
```python
def calculate_timeout():
    return 3600  # Magic number - what is 3600?

def process_items(items):
    for i in range(100):  # Magic number - why 100?
        items[i] *= 1.5   # Magic number - what is 1.5?
```
**Violation messages:**
```
src/example.py:2 - Magic number 3600 should be a named constant
src/example.py:5 - Magic number 100 should be a named constant
src/example.py:6 - Magic number 1.5 should be a named constant
```
**Refactored code:**
```python
TIMEOUT_SECONDS = 3600
MAX_ITEMS = 100
PRICE_MULTIPLIER = 1.5

def calculate_timeout():
    return TIMEOUT_SECONDS

def process_items(items):
    for i in range(MAX_ITEMS):
        items[i] *= PRICE_MULTIPLIER
```
### Acceptable Contexts
The linter **does not** flag numbers in these contexts:
| Context | Example | Why Acceptable |
|---------|---------|----------------|
| Constants | `MAX_SIZE = 100` | UPPERCASE name provides context |
| Small `range()` | `range(5)` | Small loop bounds are clear |
| Test files | `test_*.py` | Test data can be literal |
| Allowed numbers | `-1, 0, 1, 2, 10` | Common values are self-explanatory |
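
The context rules in the table can be sketched with Python's `ast` module. This is an illustrative approximation, not the linter's code: `find_magic_numbers` is a hypothetical helper, the allowed set and `range()` limit are copied from the example configuration, the test-file exemption is not modeled, and a negative literal such as `-5` is seen as the constant `5` because Python parses the sign as a separate unary operator.

```python
import ast

# Values copied from the example configuration above.
ALLOWED_NUMBERS = {-1, 0, 1, 2, 10, 100, 1000}
MAX_SMALL_INTEGER = 10


def find_magic_numbers(source: str) -> list[tuple[int, float]]:
    """Return (line, value) for numeric literals outside the acceptable contexts."""
    tree = ast.parse(source)
    exempt: set[int] = set()  # ids of Constant nodes sitting in an acceptable context
    for node in ast.walk(tree):
        # UPPERCASE_NAME = <expression>: the name itself documents the number.
        if isinstance(node, ast.Assign) and all(
            isinstance(target, ast.Name) and target.id.isupper() for target in node.targets
        ):
            exempt.update(id(inner) for inner in ast.walk(node.value))
        # range(<small int>): small loop bounds are treated as self-explanatory.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "range"
        ):
            for arg in node.args:
                if (
                    isinstance(arg, ast.Constant)
                    and type(arg.value) is int
                    and arg.value <= MAX_SMALL_INTEGER
                ):
                    exempt.add(id(arg))
    found = []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Constant) or id(node) in exempt:
            continue
        value = node.value
        # Skip strings, None, and booleans (bool is a subclass of int).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        if value not in ALLOWED_NUMBERS:
            found.append((node.lineno, value))
    return sorted(found)


SAMPLE = """\
TIMEOUT_SECONDS = 3600
def wait(retries):
    for attempt in range(5):
        delay = attempt * 2.5
    return retries * 42
"""
print(find_magic_numbers(SAMPLE))  # prints [(4, 2.5), (5, 42)]
```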
### Refactoring Patterns
**Pattern 1: Extract to Module Constants**
```python
# Before
def connect():
    timeout = 30
    retries = 3

# After
DEFAULT_TIMEOUT_SECONDS = 30
DEFAULT_MAX_RETRIES = 3

def connect():
    timeout = DEFAULT_TIMEOUT_SECONDS
    retries = DEFAULT_MAX_RETRIES
```
**Pattern 2: Extract with Units in Name**
```python
# Before
delay = 3600 # Is this seconds? Minutes?
# After
TASK_DELAY_SECONDS = 3600 # Clear unit
delay = TASK_DELAY_SECONDS
```
**Pattern 3: Use Standard Library**
```python
# Before
if status == 200:
    return "success"

# After
from http import HTTPStatus

if status == HTTPStatus.OK:
    return "success"
```
### Language Support
- **Python**: Full support (int, float, scientific notation)
- **TypeScript**: Full support (int, float, scientific notation)
- **JavaScript**: Supported via TypeScript parser
### Ignoring Violations
```python
# Line-level ignore
timeout = 3600  # thailint: ignore[magic-numbers] - Industry standard

# Method-level ignore
def get_ports():  # thailint: ignore[magic-numbers] - Standard ports
    return {80: "HTTP", 443: "HTTPS"}

# File-level ignore
# thailint: ignore-file[magic-numbers]
```
See **[How to Ignore Violations](docs/how-to-ignore-violations.md)** and **[Magic Numbers Linter Guide](docs/magic-numbers-linter.md)** for complete documentation.
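
For readers curious how such comments could be recognized, the sketch below parses the two directive shapes shown above with a regular expression. It is a guess at the syntax based only on those examples; `parse_directive` and the pattern are hypothetical and may not match how thailint actually reads ignore comments.

```python
import re
from typing import Optional

# Matches "# thailint: ignore[rule]" and "# thailint: ignore-file[rule]".
DIRECTIVE = re.compile(r"#\s*thailint:\s*(ignore(?:-file)?)\[([^\]]+)\]")


def parse_directive(line: str) -> Optional[tuple[str, str]]:
    """Return (directive kind, rule name) if the line carries an ignore comment."""
    match = DIRECTIVE.search(line)
    if match is None:
        return None
    kind, rule = match.groups()
    return kind, rule.strip()


print(parse_directive("timeout = 3600  # thailint: ignore[magic-numbers] - Industry standard"))
print(parse_directive("# thailint: ignore-file[magic-numbers]"))
print(parse_directive("timeout = 3600  # an ordinary comment"))
```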
## Pre-commit Hooks
Automate code quality checks before every commit and push with pre-commit hooks.
### Quick Setup
```bash
# 1. Install pre-commit framework
pip install pre-commit
# 2. Install git hooks
pre-commit install
pre-commit install --hook-type pre-push
# 3. Test it works
pre-commit run --all-files
```
### What You Get
**On every commit:**
- Prevents commits to main/master branch
- Auto-fixes formatting issues
- Runs thailint on changed files (fast, uses pass_filenames: true)
**On every push:**
- Full linting on entire codebase
- Runs complete test suite
### Example Configuration
```yaml
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      # Prevent commits to protected branches
      # (folded scalar: the ": " inside the echo message is not valid in a plain YAML value)
      - id: no-commit-to-main
        name: Prevent commits to main branch
        entry: >-
          bash -c 'branch=$(git rev-parse --abbrev-ref HEAD);
          if [ "$branch" = "main" ]; then echo "ERROR: Use a feature branch!"; exit 1; fi'
        language: system
        pass_filenames: false
        always_run: true
      # Auto-format code
      - id: format
        name: Auto-fix formatting
        entry: just format
        language: system
        pass_filenames: false
      # Run thailint on changed files (passes filenames directly)
      - id: thailint-changed
        name: Lint changed files
        entry: thailint nesting
        language: system
        files: \.(py|ts|tsx|js|jsx)$
        pass_filenames: true
```
See **[Pre-commit Hooks Guide](docs/pre-commit-hooks.md)** for complete documentation, troubleshooting, and advanced configuration.
## Common Use Cases
### CI/CD Integration
```yaml
# GitHub Actions example
name: Lint
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install thailint
        run: pip install thailint
      - name: Run file placement linter
        run: thailint file-placement .
      - name: Run nesting linter
        run: thailint nesting src/ --config .thailint.yaml
```
### Editor Integration
```python
# VS Code extension example
from src import Linter
linter = Linter(config_file='.thailint.yaml')
violations = linter.lint(file_path)
```
### Test Suite
```python
# pytest integration
import pytest
from src import Linter

def test_no_violations():
    linter = Linter()
    violations = linter.lint('src/')
    assert len(violations) == 0
```
## Development
### Setup Development Environment
```bash
# Install dependencies and activate virtualenv
just init
# Or manually:
poetry install
source $(poetry env info --path)/bin/activate
```
### Running Tests
```bash
# Run all tests (parallel mode - fast)
just test
# Run with coverage (serial mode)
just test-coverage
# Run specific test
poetry run pytest tests/test_cli.py::test_hello_command -v
```
### Code Quality
```bash
# Fast linting (Ruff only - use during development)
just lint
# Comprehensive linting (Ruff + Pylint + Flake8 + MyPy)
just lint-all
# Security scanning
just lint-security
# Complexity analysis (Radon + Xenon + Nesting)
just lint-complexity
# SOLID principles (SRP)
just lint-solid
# DRY principles (duplicate code detection)
just lint-dry
# ALL quality checks (runs everything)
just lint-full
# Auto-fix formatting issues
just format
```
### Dogfooding (Lint Our Own Code)
```bash
# Lint file placement
just lint-placement
# Check nesting depth
just lint-nesting
# Check for magic numbers
poetry run thai-lint magic-numbers src/
```
### Building and Publishing
```bash
# Build Python package
poetry build
# Build Docker image locally
docker build -t washad/thailint:latest .
# Publish to PyPI and Docker Hub (runs tests + linting + version bump)
just publish
```
### Quick Development Workflows
```bash
# Make changes, then run quality checks
just lint-full
# Share changes for collaboration (skips hooks)
just share "WIP: feature description"
# Clean up cache and artifacts
just clean
```
See `just --list` or `just help` for all available commands.
## Docker Usage
### Basic Docker Commands
```bash
# Pull published image
docker pull washad/thailint:latest
# Run CLI help
docker run --rm washad/thailint:latest --help
# Lint entire directory (recommended)
docker run --rm -v $(pwd):/data washad/thailint:latest file-placement /data
# Lint single file
docker run --rm -v $(pwd):/data washad/thailint:latest file-placement /data/src/app.py
# Lint multiple specific files
docker run --rm -v $(pwd):/data washad/thailint:latest nesting /data/src/file1.py /data/src/file2.py
# Lint specific subdirectory
docker run --rm -v $(pwd):/data washad/thailint:latest nesting /data/src
# With custom config
docker run --rm -v $(pwd):/data \
  washad/thailint:latest nesting --config /data/.thailint.yaml /data
# JSON output for CI/CD
docker run --rm -v $(pwd):/data \
  washad/thailint:latest file-placement --format json /data
```
### Docker with Sibling Directories (Advanced)
For complex Docker setups with sibling directories, use `--project-root` for explicit control:
```bash
# Scenario: Monorepo with separate config and code directories
# Directory structure:
# /workspace/
# ├── config/        # Contains .thailint.yaml
# ├── backend/app/   # Python backend code
# ├── frontend/      # TypeScript frontend
# └── tools/         # Build tools
# Explicit project root (recommended for Docker)
docker run --rm -v /path/to/workspace:/workspace \
  washad/thailint:latest \
  --project-root /workspace/config \
  magic-numbers /workspace/backend/
# Config path inference (automatic - no --project-root needed)
docker run --rm -v /path/to/workspace:/workspace \
  washad/thailint:latest \
  --config /workspace/config/.thailint.yaml \
  magic-numbers /workspace/backend/
# Lint multiple sibling directories with shared config
docker run --rm -v /path/to/workspace:/workspace \
  washad/thailint:latest \
  --project-root /workspace/config \
  nesting /workspace/backend/ /workspace/frontend/
```
**When to use `--project-root` in Docker:**
- **Sibling directory structures** - When config/code aren't nested
- **Monorepos** - Multiple projects sharing one config
- **CI/CD** - Explicit paths prevent auto-detection issues
- **Ignore patterns** - Ensures patterns resolve from correct base directory
## Documentation
### Comprehensive Guides
- **[Getting Started](docs/getting-started.md)** - Installation, first lint, basic config
- **[Configuration Reference](docs/configuration.md)** - Complete config options (YAML/JSON)
- **[How to Ignore Violations](docs/how-to-ignore-violations.md)** - Complete guide to all ignore levels
- **[API Reference](docs/api-reference.md)** - Library API documentation
- **[CLI Reference](docs/cli-reference.md)** - All CLI commands and options
- **[Deployment Modes](docs/deployment-modes.md)** - CLI, Library, and Docker usage
- **[File Placement Linter](docs/file-placement-linter.md)** - Detailed linter guide
- **[Magic Numbers Linter](docs/magic-numbers-linter.md)** - Magic numbers detection guide
- **[Nesting Depth Linter](docs/nesting-linter.md)** - Nesting depth analysis guide
- **[SRP Linter](docs/srp-linter.md)** - Single Responsibility Principle guide
- **[DRY Linter](docs/dry-linter.md)** - Duplicate code detection guide
- **[Pre-commit Hooks](docs/pre-commit-hooks.md)** - Automated quality checks
- **[Publishing Guide](docs/releasing.md)** - Release and publishing workflow
- **[Publishing Checklist](docs/publishing-checklist.md)** - Post-publication validation
### Examples
See [`examples/`](examples/) directory for working code:
- **[basic_usage.py](examples/basic_usage.py)** - Simple library API usage
- **[advanced_usage.py](examples/advanced_usage.py)** - Advanced patterns and workflows
- **[ci_integration.py](examples/ci_integration.py)** - CI/CD integration example
## Project Structure
```
thai-lint/
├── src/                      # Application source code
│   ├── api.py                # High-level Library API
│   ├── cli.py                # CLI commands
│   ├── core/                 # Core abstractions
│   │   ├── base.py           # Base linter interfaces
│   │   ├── registry.py       # Rule registry
│   │   └── types.py          # Core types (Violation, Severity)
│   ├── linters/              # Linter implementations
│   │   └── file_placement/   # File placement linter
│   ├── linter_config/        # Configuration system
│   │   ├── loader.py         # Config loader (YAML/JSON)
│   │   └── ignore.py         # Ignore directives
│   └── orchestrator/         # Multi-language orchestrator
│       ├── core.py           # Main orchestrator
│       └── language_detector.py
├── tests/                    # Test suite (221 tests, 87% coverage)
│   ├── unit/                 # Unit tests
│   ├── integration/          # Integration tests
│   └── conftest.py           # Pytest fixtures
├── docs/                     # Documentation
│   ├── getting-started.md
│   ├── configuration.md
│   ├── api-reference.md
│   ├── cli-reference.md
│   ├── deployment-modes.md
│   └── file-placement-linter.md
├── examples/                 # Working examples
│   ├── basic_usage.py
│   ├── advanced_usage.py
│   └── ci_integration.py
├── .ai/                      # AI agent documentation
├── Dockerfile                # Multi-stage Docker build
├── docker-compose.yml        # Docker orchestration
└── pyproject.toml            # Project configuration
```
## Contributing
Contributions are welcome! Please follow these steps:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests and linting
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request
### Development Guidelines
- Write tests for new features
- Follow existing code style (enforced by Ruff)
- Add type hints to all functions
- Update documentation for user-facing changes
- Run `pytest` and `ruff check` before committing
## Performance
thailint is designed for speed and efficiency:
| Operation | Performance | Target |
|-----------|-------------|--------|
| Single file lint | ~20ms | <100ms |
| 100 files | ~300ms | <1s |
| 1000 files | ~900ms | <5s |
| Config loading | ~10ms | <100ms |
*Performance benchmarks were run on standard hardware; your results may vary.*
## Exit Codes
thailint uses standard exit codes for CI/CD integration:
- **0** - Success (no violations)
- **1** - Violations found
- **2** - Error occurred (invalid config, file not found, etc.)
```bash
thailint file-placement .
if [ $? -eq 0 ]; then
  echo "Linting passed"
else
  echo "Linting failed"
fi
```
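
The shell snippet above treats every non-zero status as a failure. When a script needs to tell "violations found" apart from "the linter itself errored", the three documented codes can be mapped explicitly. The sketch below does this from Python with `subprocess`; `describe_exit_code` and `run_linter` are hypothetical helper names, and the call assumes the `thailint` executable is on `PATH`.

```python
import subprocess

# Meanings taken from the exit code list above.
EXIT_MEANINGS = {
    0: "success (no violations)",
    1: "violations found",
    2: "error (invalid config, file not found, etc.)",
}


def describe_exit_code(code: int) -> str:
    """Translate a thailint exit status into a human-readable label."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_linter(args: list[str]) -> int:
    """Run the thailint CLI with the given arguments and return its exit status."""
    completed = subprocess.run(["thailint", *args], check=False)
    return completed.returncode


# Example usage (requires thailint to be installed):
#   status = run_linter(["file-placement", "."])
#   print(describe_exit_code(status))
print(describe_exit_code(1))  # prints violations found
```

A CI job could, for example, fail the build on code 1 but raise a louder alert on code 2, since the latter means the check never actually ran.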
## Architecture
See [`.ai/docs/`](.ai/docs/) for detailed architecture documentation and [`.ai/howtos/`](.ai/howtos/) for development guides.
## License
MIT License - see LICENSE file for details.
## Support
- **Issues**: https://github.com/be-wise-be-kind/thai-lint/issues
- **Documentation**: `.ai/docs/` and `.ai/howtos/`
## Acknowledgments
Built with:
- [Click](https://click.palletsprojects.com/) - CLI framework
- [pytest](https://pytest.org/) - Testing framework
- [Ruff](https://docs.astral.sh/ruff/) - Linting and formatting
- [Docker](https://www.docker.com/) - Containerization
## Changelog
See [CHANGELOG.md](CHANGELOG.md) for version history.
Raw data
{
"_id": null,
"home_page": "https://github.com/be-wise-be-kind/thai-lint",
"name": "thailint",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.11",
"maintainer_email": null,
"keywords": "linter, ai, code-quality, static-analysis, file-placement, governance, multi-language, cli, docker, python",
"author": "Steve Jackson",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/c0/d7/f3160e95accf4166c95eccdc143ad1145fcd97e13fb6949074349bdc2ffe/thailint-0.4.4.tar.gz",
"platform": null,
"description": "# thai-lint\n\n[](https://opensource.org/licenses/MIT)\n[](https://www.python.org/downloads/)\n[](tests/)\n[](htmlcov/)\n\nThe AI Linter - Enterprise-ready linting and governance for AI-generated code across multiple languages.\n\n## Documentation\n\n**New to thailint?** Start here:\n- **[Quick Start Guide](docs/quick-start.md)** - Get running in 5 minutes\n- **[Configuration Reference](docs/configuration.md)** - Complete config options for all linters\n- **[Troubleshooting Guide](docs/troubleshooting.md)** - Common issues and solutions\n\n**Full Documentation:** Browse the **[docs/](docs/)** folder for comprehensive guides covering installation, all linters, configuration patterns, and integration examples.\n\n## Overview\n\nthailint is a modern, enterprise-ready multi-language linter designed specifically for AI-generated code. It focuses on common mistakes and anti-patterns that AI coding assistants frequently introduce\u2014issues that existing linters don't catch or don't handle consistently across languages.\n\n**Why thailint?**\n\nWe're not trying to replace the wonderful existing linters like Pylint, ESLint, or Ruff. 
Instead, thailint fills critical gaps:\n\n- **AI-Specific Patterns**: AI assistants have predictable blind spots (excessive nesting, magic numbers, SRP violations) that traditional linters miss\n- **Cross-Language Consistency**: Detects the same anti-patterns across Python, TypeScript, and JavaScript with unified rules\n- **No Existing Solutions**: Issues like excessive nesting depth, file placement violations, and cross-project code duplication lack comprehensive multi-language detection\n- **Governance Layer**: Enforces project-wide structure and organization patterns that AI can't infer from local context\n\nthailint complements your existing linting stack by catching the patterns AI tools repeatedly miss.\n\n**Complete documentation available in the [docs/](docs/) folder** covering installation, configuration, all linters, and troubleshooting.\n\n## Features\n\n### Core Capabilities\n- **File Placement Linting** - Enforce project structure and organization\n- **Magic Numbers Linting** - Detect unnamed numeric literals that should be constants\n - Python and TypeScript support with AST analysis\n - Context-aware detection (ignores constants, test files, range() usage)\n - Configurable allowed numbers and thresholds\n - Helpful suggestions for extracting to named constants\n- **Nesting Depth Linting** - Detect excessive code nesting with AST analysis\n - Python and TypeScript support with tree-sitter\n - Configurable max depth (default: 4, recommended: 3)\n - Helpful refactoring suggestions (guard clauses, extract method)\n- **SRP Linting** - Detect Single Responsibility Principle violations\n - Heuristic-based analysis (method count, LOC, keywords)\n - Language-specific thresholds (Python, TypeScript, JavaScript)\n - Refactoring patterns from real-world examples\n- **DRY Linting** - Detect duplicate code across projects\n - Token-based hash detection with SQLite storage\n - Fast duplicate detection (in-memory or disk-backed)\n - Configurable thresholds (lines, 
tokens, occurrences)\n - Language-specific detection (Python, TypeScript, JavaScript)\n - False positive filtering (keyword args, imports)\n- **Pluggable Architecture** - Easy to extend with custom linters\n- **Multi-Language Support** - Python, TypeScript, JavaScript, and more\n- **Flexible Configuration** - YAML/JSON configs with pattern matching\n- **5-Level Ignore System** - Repo, directory, file, method, and line-level ignores\n\n### Deployment Modes\n- **CLI Mode** - Full-featured command-line interface\n- **Library API** - Python library for programmatic integration\n- **Docker Support** - Containerized deployment for CI/CD\n\n### Enterprise Features\n- **Performance** - <100ms for single files, <5s for 1000 files\n- **Type Safety** - Full type hints and MyPy strict mode\n- **Test Coverage** - 90% coverage with 317 tests\n- **CI/CD Ready** - Proper exit codes and JSON output\n- **Comprehensive Docs** - Complete documentation and examples\n\n## Installation\n\n### From Source\n\n```bash\n# Clone repository\ngit clone https://github.com/be-wise-be-kind/thai-lint.git\ncd thai-lint\n\n# Install dependencies\npip install -e \".[dev]\"\n```\n\n### From PyPI (once published)\n\n```bash\npip install thai-lint\n```\n\n### With Docker\n\n```bash\n# Pull from Docker Hub\ndocker pull washad/thailint:latest\n\n# Run CLI\ndocker run --rm washad/thailint:latest --help\n```\n\n## Quick Start\n\n### CLI Mode\n\n```bash\n# Check file placement\nthailint file-placement .\n\n# Check multiple files\nthailint nesting file1.py file2.py file3.py\n\n# Check specific directory\nthailint nesting src/\n\n# Check for duplicate code\nthailint dry .\n\n# Check for magic numbers\nthailint magic-numbers src/\n\n# With config file\nthailint dry --config .thailint.yaml src/\n\n# JSON output for CI/CD\nthailint dry --format json src/\n```\n\n**New to thailint?** See the **[Quick Start Guide](docs/quick-start.md)** for a complete walkthrough including config generation, understanding output, 
and next steps.\n\n### Library Mode\n\n```python\nfrom src import Linter\n\n# Initialize linter\nlinter = Linter(config_file='.thailint.yaml')\n\n# Lint directory\nviolations = linter.lint('src/', rules=['file-placement'])\n\n# Process results\nif violations:\n for v in violations:\n print(f\"{v.file_path}: {v.message}\")\n```\n\n### Docker Mode\n\n```bash\n# Lint directory (recommended - lints all files inside)\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest file-placement /data\n\n# Lint single file\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest file-placement /data/src/app.py\n\n# Lint multiple specific files\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest nesting /data/src/file1.py /data/src/file2.py\n\n# Check nesting depth in subdirectory\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest nesting /data/src\n```\n\n### Docker with Sibling Directories\n\nFor Docker environments with sibling directories (e.g., separate config and source directories), use `--project-root` or config path inference:\n\n```bash\n# Directory structure:\n# /workspace/\n# \u251c\u2500\u2500 root/ # Contains .thailint.yaml and .git\n# \u251c\u2500\u2500 backend/ # Code to lint\n# \u2514\u2500\u2500 tools/\n\n# Option 1: Explicit project root (recommended)\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest \\\n --project-root /data/root \\\n magic-numbers /data/backend/\n\n# Option 2: Config path inference (automatic)\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest \\\n --config /data/root/.thailint.yaml \\\n magic-numbers /data/backend/\n\n# With ignore patterns resolving from project root\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest \\\n --project-root /data/root \\\n --config /data/root/.thailint.yaml \\\n magic-numbers /data/backend/\n```\n\n**Priority order:**\n1. `--project-root` (highest priority - explicit specification)\n2. Inferred from `--config` path directory\n3. 
Auto-detection from file location (fallback)\n\nSee **[Docker Usage](#docker-usage)** section below for more examples.\n\n## Configuration\n\nCreate `.thailint.yaml` in your project root:\n\n```yaml\n# File placement linter configuration\nfile-placement:\n enabled: true\n\n # Global patterns apply to entire project\n global_patterns:\n deny:\n - pattern: \"^(?!src/|tests/).*\\\\.py$\"\n message: \"Python files must be in src/ or tests/\"\n\n # Directory-specific rules\n directories:\n src:\n allow:\n - \".*\\\\.py$\"\n deny:\n - \"test_.*\\\\.py$\"\n\n tests:\n allow:\n - \"test_.*\\\\.py$\"\n - \"conftest\\\\.py$\"\n\n # Files/directories to ignore\n ignore:\n - \"__pycache__/\"\n - \"*.pyc\"\n - \".venv/\"\n\n# Nesting depth linter configuration\nnesting:\n enabled: true\n max_nesting_depth: 4 # Maximum allowed nesting depth\n\n # Language-specific settings (optional)\n languages:\n python:\n max_depth: 4\n typescript:\n max_depth: 4\n javascript:\n max_depth: 4\n\n# DRY linter configuration\ndry:\n enabled: true\n min_duplicate_lines: 4 # Minimum lines to consider duplicate\n min_duplicate_tokens: 30 # Minimum tokens to consider duplicate\n min_occurrences: 2 # Report if appears 2+ times\n\n # Language-specific thresholds\n python:\n min_occurrences: 3 # Python: require 3+ occurrences\n\n # Storage settings (SQLite)\n storage_mode: \"memory\" # Options: \"memory\" (default) or \"tempfile\"\n\n # Ignore patterns\n ignore:\n - \"tests/\"\n - \"__init__.py\"\n\n# Magic numbers linter configuration\nmagic-numbers:\n enabled: true\n allowed_numbers: [-1, 0, 1, 2, 10, 100, 1000] # Numbers allowed without constants\n max_small_integer: 10 # Max value allowed in range() or enumerate()\n```\n\n**JSON format also supported** (`.thailint.json`):\n\n```json\n{\n \"file-placement\": {\n \"enabled\": true,\n \"directories\": {\n \"src\": {\n \"allow\": [\".*\\\\.py$\"],\n \"deny\": [\"test_.*\\\\.py$\"]\n }\n },\n \"ignore\": [\"__pycache__/\", \"*.pyc\"]\n },\n \"nesting\": 
{\n \"enabled\": true,\n \"max_nesting_depth\": 4,\n \"languages\": {\n \"python\": { \"max_depth\": 4 },\n \"typescript\": { \"max_depth\": 4 }\n }\n },\n \"dry\": {\n \"enabled\": true,\n \"min_duplicate_lines\": 4,\n \"min_duplicate_tokens\": 30,\n \"min_occurrences\": 2,\n \"python\": {\n \"min_occurrences\": 3\n },\n \"storage_mode\": \"memory\",\n \"ignore\": [\"tests/\", \"__init__.py\"]\n },\n \"magic-numbers\": {\n \"enabled\": true,\n \"allowed_numbers\": [-1, 0, 1, 2, 10, 100, 1000],\n \"max_small_integer\": 10\n }\n}\n```\n\nSee [Configuration Guide](docs/configuration.md) for complete reference.\n\n**Need help with ignores?** See **[How to Ignore Violations](docs/how-to-ignore-violations.md)** for complete guide to all ignore levels (line, method, class, file, repository).\n\n## Nesting Depth Linter\n\n### Overview\n\nThe nesting depth linter detects deeply nested code (if/for/while/try statements) that reduces readability and maintainability. It uses AST analysis to accurately calculate nesting depth.\n\n### Quick Start\n\n```bash\n# Check nesting depth in current directory\nthailint nesting .\n\n# Use strict limit (max depth 3)\nthailint nesting --max-depth 3 src/\n\n# Get JSON output\nthailint nesting --format json src/\n```\n\n### Configuration\n\nAdd to `.thailint.yaml`:\n\n```yaml\nnesting:\n enabled: true\n max_nesting_depth: 3 # Default: 4, recommended: 3\n```\n\n### Example Violation\n\n**Code with excessive nesting:**\n```python\ndef process_data(items):\n for item in items: # Depth 2\n if item.is_valid(): # Depth 3\n try: # Depth 4 \u2190 VIOLATION (max=3)\n if item.process():\n return True\n except Exception:\n pass\n return False\n```\n\n**Refactored with guard clauses:**\n```python\ndef process_data(items):\n for item in items: # Depth 2\n if not item.is_valid():\n continue\n try: # Depth 3 \u2713\n if item.process():\n return True\n except Exception:\n pass\n return False\n```\n\n### Refactoring Patterns\n\nCommon patterns to reduce 
nesting:\n\n1. **Guard Clauses (Early Returns)**\n - Replace `if x: do_something()` with `if not x: return`\n - Exit early, reduce nesting\n\n2. **Extract Method**\n - Move nested logic to separate functions\n - Improves readability and testability\n\n3. **Dispatch Pattern**\n - Replace if-elif-else chains with dictionary dispatch\n - More extensible and cleaner\n\n4. **Flatten Error Handling**\n - Combine multiple try-except blocks\n - Use tuple of exception types\n\n### Language Support\n\n- **Python**: Full support (if/for/while/with/try/match)\n- **TypeScript**: Full support (if/for/while/try/switch)\n- **JavaScript**: Supported via TypeScript parser\n\nSee [Nesting Linter Guide](docs/nesting-linter.md) for comprehensive documentation and refactoring patterns.\n\n## Single Responsibility Principle (SRP) Linter\n\n### Overview\n\nThe SRP linter detects classes that violate the Single Responsibility Principle by having too many methods, too many lines of code, or generic naming patterns. It uses AST analysis with configurable heuristics to identify classes that likely handle multiple responsibilities.\n\n### Quick Start\n\n```bash\n# Check SRP violations in current directory\nthailint srp .\n\n# Use custom thresholds\nthailint srp --max-methods 10 --max-loc 300 src/\n\n# Get JSON output\nthailint srp --format json src/\n```\n\n### Configuration\n\nAdd to `.thailint.yaml`:\n\n```yaml\nsrp:\n enabled: true\n max_methods: 7 # Maximum methods per class\n max_loc: 200 # Maximum lines of code per class\n\n # Language-specific thresholds\n python:\n max_methods: 8\n max_loc: 200\n\n typescript:\n max_methods: 10 # TypeScript more verbose\n max_loc: 250\n```\n\n### Detection Heuristics\n\nThe SRP linter uses three heuristics to detect violations:\n\n1. **Method Count**: Classes with >7 methods (default) likely have multiple responsibilities\n2. **Lines of Code**: Classes with >200 LOC (default) are often doing too much\n3. 
**Responsibility Keywords**: Names containing \"Manager\", \"Handler\", \"Processor\", etc.\n\n### Example Violation\n\n**Code with SRP violation:**\n```python\nclass UserManager: # 8 methods, contains \"Manager\" keyword\n def create_user(self): pass\n def update_user(self): pass\n def delete_user(self): pass\n def send_email(self): pass # \u2190 Different responsibility\n def log_action(self): pass # \u2190 Different responsibility\n def validate_data(self): pass # \u2190 Different responsibility\n def generate_report(self): pass # \u2190 Different responsibility\n def export_data(self): pass # \u2190 Violation at method 8\n```\n\n**Refactored following SRP:**\n```python\nclass UserRepository: # 3 methods \u2713\n def create(self, user): pass\n def update(self, user): pass\n def delete(self, user): pass\n\nclass EmailService: # 1 method \u2713\n def send(self, user, template): pass\n\nclass UserAuditLog: # 1 method \u2713\n def log(self, action, user): pass\n\nclass UserValidator: # 1 method \u2713\n def validate(self, data): pass\n\nclass ReportGenerator: # 1 method \u2713\n def generate(self, users): pass\n```\n\n### Refactoring Patterns\n\nCommon patterns to fix SRP violations (discovered during dogfooding):\n\n1. **Extract Class**\n - Split god classes into focused classes\n - Each class handles one responsibility\n\n2. **Split Configuration and Logic**\n - Separate config loading from business logic\n - Create dedicated ConfigLoader classes\n\n3. **Extract Language-Specific Logic**\n - Separate Python/TypeScript analysis\n - Use analyzer classes per language\n\n4. 
**Utility Module Pattern**\n - Group related helper methods\n - Create focused utility classes\n\n### Language Support\n\n- **Python**: Full support with method counting and LOC analysis\n- **TypeScript**: Full support with tree-sitter parsing\n- **JavaScript**: Supported via TypeScript parser\n\n### Real-World Example\n\n**Large class refactoring:**\n- **Before**: FilePlacementLinter (33 methods, 382 LOC) - single class handling config, patterns, validation\n- **After**: Extract Class pattern applied - 5 focused classes (ConfigLoader, PatternValidator, RuleChecker, PathResolver, FilePlacementLinter)\n- **Result**: Each class \u22648 methods, \u2264150 LOC, single responsibility\n\nSee [SRP Linter Guide](docs/srp-linter.md) for comprehensive documentation and refactoring patterns.\n\n## DRY Linter (Don't Repeat Yourself)\n\n### Overview\n\nThe DRY linter detects duplicate code blocks across your entire project using token-based hashing with SQLite storage. It identifies identical or near-identical code that violates the Don't Repeat Yourself (DRY) principle, helping maintain code quality at scale.\n\n### Quick Start\n\n```bash\n# Check for duplicate code in current directory\nthailint dry .\n\n# Use custom thresholds\nthailint dry --min-lines 5 src/\n\n# Use tempfile storage for large projects\nthailint dry --storage-mode tempfile src/\n\n# Get JSON output\nthailint dry --format json src/\n```\n\n### Configuration\n\nAdd to `.thailint.yaml`:\n\n```yaml\ndry:\n enabled: true\n min_duplicate_lines: 4 # Minimum lines to consider duplicate\n min_duplicate_tokens: 30 # Minimum tokens to consider duplicate\n min_occurrences: 2 # Report if appears 2+ times\n\n # Language-specific thresholds\n python:\n min_occurrences: 3 # Python: require 3+ occurrences\n typescript:\n min_occurrences: 3 # TypeScript: require 3+ occurrences\n\n # Storage settings\n storage_mode: \"memory\" # Options: \"memory\" (default) or \"tempfile\"\n\n # Ignore patterns\n ignore:\n - \"tests/\" # 
Test code often has acceptable duplication\n - \"__init__.py\" # Import-only files exempt\n\n # False positive filters\n filters:\n keyword_argument_filter: true # Filter function call kwargs\n import_group_filter: true # Filter import groups\n```\n\n### How It Works\n\n**Token-Based Detection:**\n1. Parse code into tokens (stripping comments, normalizing whitespace)\n2. Create rolling hash windows of N lines\n3. Store hashes in SQLite database with file locations\n4. Query for hashes appearing 2+ times across project\n\n**SQLite Storage:**\n- In-memory mode (default): Stores in RAM for best performance\n- Tempfile mode: Stores in temporary disk file for large projects\n- Fresh analysis on every run (no persistence between runs)\n- Fast duplicate detection using B-tree indexes\n\n### Example Violation\n\n**Code with duplication:**\n```python\n# src/auth.py\ndef validate_user(user_data):\n if not user_data:\n return False\n if not user_data.get('email'):\n return False\n if not user_data.get('password'):\n return False\n return True\n\n# src/admin.py\ndef validate_admin(admin_data):\n if not admin_data:\n return False\n if not admin_data.get('email'):\n return False\n if not admin_data.get('password'):\n return False\n return True\n```\n\n**Violation message:**\n```\nsrc/auth.py:3 - Duplicate code detected (4 lines, 2 occurrences)\n Locations:\n - src/auth.py:3-6\n - src/admin.py:3-6\n Consider extracting to shared function\n```\n\n**Refactored (DRY):**\n```python\n# src/validators.py\ndef validate_credentials(data):\n if not data:\n return False\n if not data.get('email'):\n return False\n if not data.get('password'):\n return False\n return True\n\n# src/auth.py & src/admin.py\nfrom src.validators import validate_credentials\n\ndef validate_user(user_data):\n return validate_credentials(user_data)\n\ndef validate_admin(admin_data):\n return validate_credentials(admin_data)\n```\n\n### Performance\n\n| Operation | Performance | Storage Mode 
|\n|-----------|-------------|--------------|\n| Scan (1000 files) | 1-3s | Memory (default) |\n| Large project (5000+ files) | Use tempfile mode | Tempfile |\n\n**Note**: Every run analyzes files fresh - no persistence between runs ensures accurate results\n\n### Language Support\n\n- **Python**: Full support with AST-based tokenization\n- **TypeScript**: Full support with tree-sitter parsing\n- **JavaScript**: Supported via TypeScript parser\n\n### False Positive Filtering\n\nBuilt-in filters automatically exclude common non-duplication patterns:\n- **keyword_argument_filter**: Excludes function calls with keyword arguments\n- **import_group_filter**: Excludes import statement groups\n\n### Refactoring Patterns\n\n1. **Extract Function**: Move repeated logic to shared function\n2. **Extract Base Class**: Create base class for similar implementations\n3. **Extract Utility Module**: Move helper functions to shared utilities\n4. **Template Method**: Use function parameters for variations\n\nSee [DRY Linter Guide](docs/dry-linter.md) for comprehensive documentation, storage modes, and refactoring patterns.\n\n## Magic Numbers Linter\n\n### Overview\n\nThe magic numbers linter detects unnamed numeric literals (magic numbers) that should be extracted to named constants. 
It uses AST analysis to identify numeric literals that lack meaningful context.\n\n### What are Magic Numbers?\n\n**Magic numbers** are unnamed numeric literals in code without explanation:\n\n```python\n# Bad - Magic numbers\ntimeout = 3600 # What is 3600?\nmax_retries = 5 # Why 5?\n\n# Good - Named constants\nTIMEOUT_SECONDS = 3600\nMAX_RETRY_ATTEMPTS = 5\n```\n\n### Quick Start\n\n```bash\n# Check for magic numbers in current directory\nthailint magic-numbers .\n\n# Check specific directory\nthailint magic-numbers src/\n\n# Get JSON output\nthailint magic-numbers --format json src/\n```\n\n### Configuration\n\nAdd to `.thailint.yaml`:\n\n```yaml\nmagic-numbers:\n enabled: true\n allowed_numbers: [-1, 0, 1, 2, 10, 100, 1000]\n max_small_integer: 10 # Max for range() to be acceptable\n```\n\n### Example Violation\n\n**Code with magic numbers:**\n```python\ndef calculate_timeout():\n return 3600 # Magic number - what is 3600?\n\ndef process_items(items):\n for i in range(100): # Magic number - why 100?\n items[i] *= 1.5 # Magic number - what is 1.5?\n```\n\n**Violation messages:**\n```\nsrc/example.py:2 - Magic number 3600 should be a named constant\nsrc/example.py:5 - Magic number 100 should be a named constant\nsrc/example.py:6 - Magic number 1.5 should be a named constant\n```\n\n**Refactored code:**\n```python\nTIMEOUT_SECONDS = 3600\nMAX_ITEMS = 100\nPRICE_MULTIPLIER = 1.5\n\ndef calculate_timeout():\n return TIMEOUT_SECONDS\n\ndef process_items(items):\n for i in range(MAX_ITEMS):\n items[i] *= PRICE_MULTIPLIER\n```\n\n### Acceptable Contexts\n\nThe linter **does not** flag numbers in these contexts:\n\n| Context | Example | Why Acceptable |\n|---------|---------|----------------|\n| Constants | `MAX_SIZE = 100` | UPPERCASE name provides context |\n| Small `range()` | `range(5)` | Small loop bounds are clear |\n| Test files | `test_*.py` | Test data can be literal |\n| Allowed numbers | `-1, 0, 1, 2, 10` | Common values are self-explanatory |\n\n### 
Refactoring Patterns\n\n**Pattern 1: Extract to Module Constants**\n```python\n# Before\ndef connect():\n timeout = 30\n retries = 3\n\n# After\nDEFAULT_TIMEOUT_SECONDS = 30\nDEFAULT_MAX_RETRIES = 3\n\ndef connect():\n timeout = DEFAULT_TIMEOUT_SECONDS\n retries = DEFAULT_MAX_RETRIES\n```\n\n**Pattern 2: Extract with Units in Name**\n```python\n# Before\ndelay = 3600 # Is this seconds? Minutes?\n\n# After\nTASK_DELAY_SECONDS = 3600 # Clear unit\n\ndelay = TASK_DELAY_SECONDS\n```\n\n**Pattern 3: Use Standard Library**\n```python\n# Before\nif status == 200:\n return \"success\"\n\n# After\nfrom http import HTTPStatus\n\nif status == HTTPStatus.OK:\n return \"success\"\n```\n\n### Language Support\n\n- **Python**: Full support (int, float, scientific notation)\n- **TypeScript**: Full support (int, float, scientific notation)\n- **JavaScript**: Supported via TypeScript parser\n\n### Ignoring Violations\n\n```python\n# Line-level ignore\ntimeout = 3600 # thailint: ignore[magic-numbers] - Industry standard\n\n# Method-level ignore\ndef get_ports(): # thailint: ignore[magic-numbers] - Standard ports\n return {80: \"HTTP\", 443: \"HTTPS\"}\n\n# File-level ignore\n# thailint: ignore-file[magic-numbers]\n```\n\nSee **[How to Ignore Violations](docs/how-to-ignore-violations.md)** and **[Magic Numbers Linter Guide](docs/magic-numbers-linter.md)** for complete documentation.\n\n## Pre-commit Hooks\n\nAutomate code quality checks before every commit and push with pre-commit hooks.\n\n### Quick Setup\n\n```bash\n# 1. Install pre-commit framework\npip install pre-commit\n\n# 2. Install git hooks\npre-commit install\npre-commit install --hook-type pre-push\n\n# 3. 
Test it works\npre-commit run --all-files\n```\n\n### What You Get\n\n**On every commit:**\n- Prevents commits to main/master branch\n- Auto-fixes formatting issues\n- Runs thailint on changed files (fast, uses pass_filenames: true)\n\n**On every push:**\n- Full linting on entire codebase\n- Runs complete test suite\n\n### Example Configuration\n\n```yaml\n# .pre-commit-config.yaml\nrepos:\n - repo: local\n hooks:\n # Prevent commits to protected branches\n - id: no-commit-to-main\n name: Prevent commits to main branch\n entry: bash -c 'branch=$(git rev-parse --abbrev-ref HEAD); if [ \"$branch\" = \"main\" ]; then echo \"ERROR: Use a feature branch!\"; exit 1; fi'\n language: system\n pass_filenames: false\n always_run: true\n\n # Auto-format code\n - id: format\n name: Auto-fix formatting\n entry: just format\n language: system\n pass_filenames: false\n\n # Run thailint on changed files (passes filenames directly)\n - id: thailint-changed\n name: Lint changed files\n entry: thailint nesting\n language: system\n files: \\.(py|ts|tsx|js|jsx)$\n pass_filenames: true\n```\n\nSee **[Pre-commit Hooks Guide](docs/pre-commit-hooks.md)** for complete documentation, troubleshooting, and advanced configuration.\n\n## Common Use Cases\n\n### CI/CD Integration\n\n```yaml\n# GitHub Actions example\nname: Lint\n\non: [push, pull_request]\n\njobs:\n lint:\n runs-on: ubuntu-latest\n steps:\n - uses: actions/checkout@v3\n - name: Install thailint\n run: pip install thailint\n - name: Run file placement linter\n run: thailint file-placement .\n - name: Run nesting linter\n run: thailint nesting src/ --config .thailint.yaml\n```\n\n### Editor Integration\n\n```python\n# VS Code extension example\nfrom src import Linter\n\nlinter = Linter(config_file='.thailint.yaml')\nviolations = linter.lint(file_path)\n```\n\n### Test Suite\n\n```python\n# pytest integration\nimport pytest\nfrom src import Linter\n\ndef test_no_violations():\n linter = Linter()\n violations = linter.lint('src/')\n 
assert len(violations) == 0\n```\n\n## Development\n\n### Setup Development Environment\n\n```bash\n# Install dependencies and activate virtualenv\njust init\n\n# Or manually:\npoetry install\nsource $(poetry env info --path)/bin/activate\n```\n\n### Running Tests\n\n```bash\n# Run all tests (parallel mode - fast)\njust test\n\n# Run with coverage (serial mode)\njust test-coverage\n\n# Run specific test\npoetry run pytest tests/test_cli.py::test_hello_command -v\n```\n\n### Code Quality\n\n```bash\n# Fast linting (Ruff only - use during development)\njust lint\n\n# Comprehensive linting (Ruff + Pylint + Flake8 + MyPy)\njust lint-all\n\n# Security scanning\njust lint-security\n\n# Complexity analysis (Radon + Xenon + Nesting)\njust lint-complexity\n\n# SOLID principles (SRP)\njust lint-solid\n\n# DRY principles (duplicate code detection)\njust lint-dry\n\n# ALL quality checks (runs everything)\njust lint-full\n\n# Auto-fix formatting issues\njust format\n```\n\n### Dogfooding (Lint Our Own Code)\n\n```bash\n# Lint file placement\njust lint-placement\n\n# Check nesting depth\njust lint-nesting\n\n# Check for magic numbers\npoetry run thai-lint magic-numbers src/\n```\n\n### Building and Publishing\n\n```bash\n# Build Python package\npoetry build\n\n# Build Docker image locally\ndocker build -t washad/thailint:latest .\n\n# Publish to PyPI and Docker Hub (runs tests + linting + version bump)\njust publish\n```\n\n### Quick Development Workflows\n\n```bash\n# Make changes, then run quality checks\njust lint-full\n\n# Share changes for collaboration (skips hooks)\njust share \"WIP: feature description\"\n\n# Clean up cache and artifacts\njust clean\n```\n\nSee `just --list` or `just help` for all available commands.\n\n## Docker Usage\n\n### Basic Docker Commands\n\n```bash\n# Pull published image\ndocker pull washad/thailint:latest\n\n# Run CLI help\ndocker run --rm washad/thailint:latest --help\n\n# Lint entire directory (recommended)\ndocker run --rm -v $(pwd):/data 
washad/thailint:latest file-placement /data\n\n# Lint single file\ndocker run --rm -v $(pwd):/data washad/thailint:latest file-placement /data/src/app.py\n\n# Lint multiple specific files\ndocker run --rm -v $(pwd):/data washad/thailint:latest nesting /data/src/file1.py /data/src/file2.py\n\n# Lint specific subdirectory\ndocker run --rm -v $(pwd):/data washad/thailint:latest nesting /data/src\n\n# With custom config\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest nesting --config /data/.thailint.yaml /data\n\n# JSON output for CI/CD\ndocker run --rm -v $(pwd):/data \\\n washad/thailint:latest file-placement --format json /data\n```\n\n### Docker with Sibling Directories (Advanced)\n\nFor complex Docker setups with sibling directories, use `--project-root` for explicit control:\n\n```bash\n# Scenario: Monorepo with separate config and code directories\n# Directory structure:\n# /workspace/\n# \u251c\u2500\u2500 config/ # Contains .thailint.yaml\n# \u251c\u2500\u2500 backend/app/ # Python backend code\n# \u251c\u2500\u2500 frontend/ # TypeScript frontend\n# \u2514\u2500\u2500 tools/ # Build tools\n\n# Explicit project root (recommended for Docker)\ndocker run --rm -v /path/to/workspace:/workspace \\\n washad/thailint:latest \\\n --project-root /workspace/config \\\n magic-numbers /workspace/backend/\n\n# Config path inference (automatic - no --project-root needed)\ndocker run --rm -v /path/to/workspace:/workspace \\\n washad/thailint:latest \\\n --config /workspace/config/.thailint.yaml \\\n magic-numbers /workspace/backend/\n\n# Lint multiple sibling directories with shared config\ndocker run --rm -v /path/to/workspace:/workspace \\\n washad/thailint:latest \\\n --project-root /workspace/config \\\n nesting /workspace/backend/ /workspace/frontend/\n```\n\n**When to use `--project-root` in Docker:**\n- **Sibling directory structures** - When config/code aren't nested\n- **Monorepos** - Multiple projects sharing one config\n- **CI/CD** - Explicit paths 
prevent auto-detection issues\n- **Ignore patterns** - Ensures patterns resolve from correct base directory\n\n## Documentation\n\n### Comprehensive Guides\n\n- **[Getting Started](docs/getting-started.md)** - Installation, first lint, basic config\n- **[Configuration Reference](docs/configuration.md)** - Complete config options (YAML/JSON)\n- **[How to Ignore Violations](docs/how-to-ignore-violations.md)** - Complete guide to all ignore levels\n- **[API Reference](docs/api-reference.md)** - Library API documentation\n- **[CLI Reference](docs/cli-reference.md)** - All CLI commands and options\n- **[Deployment Modes](docs/deployment-modes.md)** - CLI, Library, and Docker usage\n- **[File Placement Linter](docs/file-placement-linter.md)** - Detailed linter guide\n- **[Magic Numbers Linter](docs/magic-numbers-linter.md)** - Magic numbers detection guide\n- **[Nesting Depth Linter](docs/nesting-linter.md)** - Nesting depth analysis guide\n- **[SRP Linter](docs/srp-linter.md)** - Single Responsibility Principle guide\n- **[DRY Linter](docs/dry-linter.md)** - Duplicate code detection guide\n- **[Pre-commit Hooks](docs/pre-commit-hooks.md)** - Automated quality checks\n- **[Publishing Guide](docs/releasing.md)** - Release and publishing workflow\n- **[Publishing Checklist](docs/publishing-checklist.md)** - Post-publication validation\n\n### Examples\n\nSee [`examples/`](examples/) directory for working code:\n\n- **[basic_usage.py](examples/basic_usage.py)** - Simple library API usage\n- **[advanced_usage.py](examples/advanced_usage.py)** - Advanced patterns and workflows\n- **[ci_integration.py](examples/ci_integration.py)** - CI/CD integration example\n\n## Project Structure\n\n```\nthai-lint/\n\u251c\u2500\u2500 src/ # Application source code\n\u2502 \u251c\u2500\u2500 api.py # High-level Library API\n\u2502 \u251c\u2500\u2500 cli.py # CLI commands\n\u2502 \u251c\u2500\u2500 core/ # Core abstractions\n\u2502 \u2502 \u251c\u2500\u2500 base.py # Base linter 
interfaces\n\u2502 \u2502 \u251c\u2500\u2500 registry.py # Rule registry\n\u2502 \u2502 \u2514\u2500\u2500 types.py # Core types (Violation, Severity)\n\u2502 \u251c\u2500\u2500 linters/ # Linter implementations\n\u2502 \u2502 \u2514\u2500\u2500 file_placement/ # File placement linter\n\u2502 \u251c\u2500\u2500 linter_config/ # Configuration system\n\u2502 \u2502 \u251c\u2500\u2500 loader.py # Config loader (YAML/JSON)\n\u2502 \u2502 \u2514\u2500\u2500 ignore.py # Ignore directives\n\u2502 \u2514\u2500\u2500 orchestrator/ # Multi-language orchestrator\n\u2502 \u251c\u2500\u2500 core.py # Main orchestrator\n\u2502 \u2514\u2500\u2500 language_detector.py\n\u251c\u2500\u2500 tests/ # Test suite (221 tests, 87% coverage)\n\u2502 \u251c\u2500\u2500 unit/ # Unit tests\n\u2502 \u251c\u2500\u2500 integration/ # Integration tests\n\u2502 \u2514\u2500\u2500 conftest.py # Pytest fixtures\n\u251c\u2500\u2500 docs/ # Documentation\n\u2502 \u251c\u2500\u2500 getting-started.md\n\u2502 \u251c\u2500\u2500 configuration.md\n\u2502 \u251c\u2500\u2500 api-reference.md\n\u2502 \u251c\u2500\u2500 cli-reference.md\n\u2502 \u251c\u2500\u2500 deployment-modes.md\n\u2502 \u2514\u2500\u2500 file-placement-linter.md\n\u251c\u2500\u2500 examples/ # Working examples\n\u2502 \u251c\u2500\u2500 basic_usage.py\n\u2502 \u251c\u2500\u2500 advanced_usage.py\n\u2502 \u2514\u2500\u2500 ci_integration.py\n\u251c\u2500\u2500 .ai/ # AI agent documentation\n\u251c\u2500\u2500 Dockerfile # Multi-stage Docker build\n\u251c\u2500\u2500 docker-compose.yml # Docker orchestration\n\u2514\u2500\u2500 pyproject.toml # Project configuration\n```\n\n## Contributing\n\nContributions are welcome! Please follow these steps:\n\n1. Fork the repository\n2. Create a feature branch (`git checkout -b feature/amazing-feature`)\n3. Make your changes\n4. Run tests and linting\n5. Commit your changes (`git commit -m 'Add amazing feature'`)\n6. Push to the branch (`git push origin feature/amazing-feature`)\n7. 
Open a Pull Request\n\n### Development Guidelines\n\n- Write tests for new features\n- Follow existing code style (enforced by Ruff)\n- Add type hints to all functions\n- Update documentation for user-facing changes\n- Run `pytest` and `ruff check` before committing\n\n## Performance\n\nthailint is designed for speed and efficiency:\n\n| Operation | Performance | Target |\n|-----------|-------------|--------|\n| Single file lint | ~20ms | <100ms |\n| 100 files | ~300ms | <1s |\n| 1000 files | ~900ms | <5s |\n| Config loading | ~10ms | <100ms |\n\n*Performance benchmarks were run on standard hardware; your results may vary.*\n\n## Exit Codes\n\nthailint uses standard exit codes for CI/CD integration:\n\n- **0** - Success (no violations)\n- **1** - Violations found\n- **2** - Error occurred (invalid config, file not found, etc.)\n\n```bash\nthailint file-placement .\nif [ $? -eq 0 ]; then\n echo \"Linting passed\"\nelse\n echo \"Linting failed\"\nfi\n```\n\n## Architecture\n\nSee [`.ai/docs/`](.ai/docs/) for detailed architecture documentation and [`.ai/howtos/`](.ai/howtos/) for development guides.\n\n## License\n\nMIT License - see LICENSE file for details.\n\n## Support\n\n- **Issues**: https://github.com/be-wise-be-kind/thai-lint/issues\n- **Documentation**: `.ai/docs/` and `.ai/howtos/`\n\n## Acknowledgments\n\nBuilt with:\n- [Click](https://click.palletsprojects.com/) - CLI framework\n- [pytest](https://pytest.org/) - Testing framework\n- [Ruff](https://docs.astral.sh/ruff/) - Linting and formatting\n- [Docker](https://www.docker.com/) - Containerization\n\n## Changelog\n\nSee [CHANGELOG.md](CHANGELOG.md) for version history.\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "The AI Linter - Enterprise-grade linting and governance for AI-generated code across multiple languages",
"version": "0.4.4",
"project_urls": {
"Documentation": "https://github.com/be-wise-be-kind/thai-lint#readme",
"Homepage": "https://github.com/be-wise-be-kind/thai-lint",
"Repository": "https://github.com/be-wise-be-kind/thai-lint"
},
"split_keywords": [
"linter",
" ai",
" code-quality",
" static-analysis",
" file-placement",
" governance",
" multi-language",
" cli",
" docker",
" python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5982c1e5258762cc22c7b19a8979c2b8b9b04d7bfda00e51aac9c277c98e1b05",
"md5": "8c3f345021a82bea87111fe33e5c898c",
"sha256": "7707c49907d636bad1bbcfb8596c41684f93ed2b47711662de4aca4adac43465"
},
"downloads": -1,
"filename": "thailint-0.4.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8c3f345021a82bea87111fe33e5c898c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.11",
"size": 142851,
"upload_time": "2025-10-14T22:33:49",
"upload_time_iso_8601": "2025-10-14T22:33:49.345629Z",
"url": "https://files.pythonhosted.org/packages/59/82/c1e5258762cc22c7b19a8979c2b8b9b04d7bfda00e51aac9c277c98e1b05/thailint-0.4.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "c0d7f3160e95accf4166c95eccdc143ad1145fcd97e13fb6949074349bdc2ffe",
"md5": "f57d78c56b70bd815ade6e0f6e28b890",
"sha256": "3316e956672bb8ea03a7e8afa429fda4f25823011c212a830d22836b2740c1e9"
},
"downloads": -1,
"filename": "thailint-0.4.4.tar.gz",
"has_sig": false,
"md5_digest": "f57d78c56b70bd815ade6e0f6e28b890",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.11",
"size": 119626,
"upload_time": "2025-10-14T22:33:50",
"upload_time_iso_8601": "2025-10-14T22:33:50.573280Z",
"url": "https://files.pythonhosted.org/packages/c0/d7/f3160e95accf4166c95eccdc143ad1145fcd97e13fb6949074349bdc2ffe/thailint-0.4.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-14 22:33:50",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "be-wise-be-kind",
"github_project": "thai-lint",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "thailint"
}