# Datatrack - Version Control for Database Schemas
A high-performance CLI tool that brings Git-like version control to database schemas: snapshot a schema, diff snapshots, lint naming, and export the results. It selects a processing strategy automatically based on schema size. Built for Data Engineers, Analytics Engineers, and Platform Teams.
## Key Features
- **High Performance**: 70-75% faster schema introspection on large schemas (200+ tables) compared with standard processing
- **Intelligent Processing**: Auto-selects optimal strategy based on schema size
- **Multi-Database Support**: PostgreSQL, MySQL, SQLite, SQL Server
- **Schema Comparison**: Generate detailed diffs between versions
- **Quality Linting**: Enforce naming conventions and best practices
- **Multiple Export Formats**: JSON, YAML, Markdown, HTML
## Performance Improvements
| Schema Size | Processing Method | Performance Gain |
|---------------|----------------------|------------------|
| 1-49 tables | Standard | Baseline |
| 50-199 tables | Parallel (4 workers) | 65-70% faster |
| 200+ tables | Parallel + Batched | 70-75% faster |
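The thresholds in the table can be read as a simple selection rule. The sketch below is illustrative only: `Strategy` and `choose_strategy` are hypothetical names, not part of Datatrack's API, and the only values taken from this README are the table-count boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strategy:
    name: str
    parallel: bool
    batched: bool


def choose_strategy(table_count: int) -> Strategy:
    """Map a table count to a processing strategy using the thresholds in the table."""
    if table_count < 0:
        raise ValueError("table_count must be non-negative")
    if table_count >= 200:
        return Strategy("parallel+batched", parallel=True, batched=True)
    if table_count >= 50:
        return Strategy("parallel", parallel=True, batched=False)
    return Strategy("standard", parallel=False, batched=False)


if __name__ == "__main__":
    for n in (12, 50, 199, 200):
        print(n, choose_strategy(n).name)
```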
## Installation
```bash
pip install datatrack-core
```
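To install from source in editable mode instead, run the following from the root of a local clone of the repository:
```bash
pip install -e .
```
This method is ideal if you want to contribute to or modify the tool.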
## Helpful Commands
Datatrack comes with built-in help and guidance for every command. Use this to quickly learn syntax and options:
```bash
datatrack --help
# or, equivalently:
datatrack -h
```
## How to Use
### 1. Initialize Tracking
```bash
datatrack init
```
Creates `.datatrack/`, `.databases/`, and optional initial files.
### 2. Connect to a Database
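Save your database connection for future use. The argument is a SQLAlchemy-style connection URL of the form `dialect+driver://user:password@host:port/database`:
#### MySQL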
```bash
datatrack connect mysql+pymysql://root:<password>@localhost:3306/<database-name>
```
#### PostgreSQL
```bash
datatrack connect postgresql+psycopg2://postgres:<password>@localhost:5432/<database-name>
```
#### SQLite
```bash
datatrack connect sqlite:///.databases/<database-name>
```
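The connection strings above all follow the same `dialect+driver://user:password@host:port/database` layout. As a rough illustration of how the parts fit together, the sketch below splits such a URL with the Python standard library. `describe_url` is a hypothetical helper written for this example; it is not a Datatrack function, and Datatrack itself may parse URLs differently.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a connection URL of the form dialect+driver://user:password@host:port/database."""
    parts = urlsplit(url)
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,
        # Drop only the single leading slash that separates the host from the path
        "database": parts.path[1:] if parts.path.startswith("/") else parts.path,
    }


if __name__ == "__main__":
    print(describe_url("mysql+pymysql://root:secret@localhost:3306/shop"))
    print(describe_url("sqlite:///.databases/app.db"))
```

If the password contains characters such as `@` or `/`, it generally needs to be percent-encoded (for example with `urllib.parse.quote_plus`) before being placed in the URL.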
### 3. Take a Schema Snapshot
```bash
# Standard snapshot
datatrack snapshot
# High-performance snapshot with parallel processing
datatrack snapshot --parallel
# Custom performance configuration
datatrack snapshot --parallel --max-workers 8 --batch-size 50
# For large schemas (200+ tables) - automatically optimized
datatrack snapshot # Auto-enables parallel + batched processing
```
Saves the current schema to `.databases/exports/<db_name>/snapshots/`.
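To give a feel for what `--max-workers` and `--batch-size` control, here is a minimal sketch of batched, thread-pooled introspection. It is an assumption about the general technique, not Datatrack's actual implementation: `batches`, `snapshot_tables`, and the `describe` callback are hypothetical names, and the callback stands in for a real per-table introspection query.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterator, List


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def snapshot_tables(
    tables: List[str],
    describe: Callable[[str], List[str]],
    max_workers: int = 4,
    batch_size: int = 50,
) -> Dict[str, List[str]]:
    """Describe every table, one batch at a time, using a thread pool within each batch."""
    snapshot: Dict[str, List[str]] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batches(tables, batch_size):
            # pool.map preserves input order, so zip pairs each name with its result
            for name, columns in zip(batch, pool.map(describe, batch)):
                snapshot[name] = columns
    return snapshot


if __name__ == "__main__":
    # Stand-in for a real per-table introspection query
    def fake_describe(name: str) -> List[str]:
        return ["id", f"{name}_name"]

    result = snapshot_tables([f"t{i}" for i in range(120)], fake_describe, max_workers=8)
    print(len(result), result["t7"])
```

Batching bounds how many tables are in flight at once, which keeps memory use and open connections predictable on very large schemas.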
### 4. Lint the Schema
```bash
datatrack lint
```
Detects issues in naming and structure.
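As an example of the kind of naming check a linter can apply, the sketch below flags table and column names that are not snake_case. The specific rule and the `lint_names` helper are assumptions made for illustration; this README does not list the rules that `datatrack lint` actually enforces.

```python
import re
from typing import Dict, List

# Lowercase words separated by single underscores, starting with a letter
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def lint_names(schema: Dict[str, List[str]]) -> List[str]:
    """Return one warning per table or column name that is not snake_case."""
    warnings = []
    for table, columns in schema.items():
        if not SNAKE_CASE.match(table):
            warnings.append(f"table '{table}' is not snake_case")
        for column in columns:
            if not SNAKE_CASE.match(column):
                warnings.append(f"column '{table}.{column}' is not snake_case")
    return warnings


if __name__ == "__main__":
    schema = {"user_accounts": ["id", "createdAt"], "OrderItems": ["order_id"]}
    for warning in lint_names(schema):
        print(warning)
```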
### 5. Verify Schema Rules
```bash
datatrack verify
```
Validates schema against `schema_rules.yaml`.
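The format of `schema_rules.yaml` is not described in this README, so the following is only a sketch of what rule-based verification can look like. The `RULES` dictionary stands in for the parsed YAML file, and every key in it (`max_name_length`, `reserved_words`, `required_columns`) is hypothetical, as is the `verify` function.

```python
from typing import Dict, List

# Hypothetical rule set; the real schema_rules.yaml keys may differ
RULES = {
    "max_name_length": 30,
    "reserved_words": ["select", "table", "user"],
    "required_columns": ["id"],
}


def verify(schema: Dict[str, List[str]], rules: dict) -> List[str]:
    """Return a list of rule violations; an empty list means the schema passes."""
    reserved = {word.lower() for word in rules.get("reserved_words", [])}
    max_len = rules.get("max_name_length")
    violations = []
    for table, columns in schema.items():
        for name in [table, *columns]:
            if name.lower() in reserved:
                violations.append(f"{table}: '{name}' is a reserved word")
            if max_len is not None and len(name) > max_len:
                violations.append(f"{table}: '{name}' exceeds {max_len} characters")
        for required in rules.get("required_columns", []):
            if required not in columns:
                violations.append(f"{table}: missing required column '{required}'")
    return violations


if __name__ == "__main__":
    print(verify({"orders": ["id", "total"], "user": ["name"]}, RULES))
```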
### 6. View Schema Differences
```bash
datatrack diff
```
Shows table and column changes between the latest two snapshots.
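Conceptually, a table-and-column diff between two snapshots reduces to set differences. The sketch below assumes a simplified snapshot shape (a mapping from table name to a list of column names) and a hypothetical `diff_snapshots` function; the real snapshot files and diff output may carry more detail, such as column types.

```python
from typing import Dict, List

Snapshot = Dict[str, List[str]]


def diff_snapshots(old: Snapshot, new: Snapshot) -> dict:
    """Compare two snapshots and report added/removed tables and columns."""
    changes = {
        "added_tables": sorted(set(new) - set(old)),
        "removed_tables": sorted(set(old) - set(new)),
        "changed_tables": {},
    }
    # Only tables present in both snapshots can have column-level changes
    for table in sorted(set(old) & set(new)):
        added = sorted(set(new[table]) - set(old[table]))
        removed = sorted(set(old[table]) - set(new[table]))
        if added or removed:
            changes["changed_tables"][table] = {
                "added_columns": added,
                "removed_columns": removed,
            }
    return changes


if __name__ == "__main__":
    previous = {"users": ["id", "name"], "legacy": ["id"]}
    latest = {"users": ["id", "email"], "orders": ["id"]}
    print(diff_snapshots(previous, latest))
```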
### 7. Export Snapshots or Diffs
Export the latest snapshot as YAML (default):
```bash
datatrack export
```
Explicitly export the snapshot as YAML:
```bash
datatrack export --type snapshot --format yaml
```
Export the latest diff as JSON:
```bash
datatrack export --type diff --format json
```
Output is saved in `.databases/exports/<db_name>/`.
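For readers who want to post-process exports in their own scripts, here is a small sketch of writing a diff as JSON under a per-database directory. Only the `.databases/exports/<db_name>/` location comes from this README; the `export_json` helper and the `<kind>.json` file name are assumptions for illustration, and Datatrack's own file names may differ.

```python
import json
from pathlib import Path


def export_json(
    data: dict,
    db_name: str,
    kind: str,
    root: Path = Path(".databases/exports"),
) -> Path:
    """Write `data` to <root>/<db_name>/<kind>.json and return the file path."""
    target_dir = root / db_name
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{kind}.json"
    # sort_keys keeps the output stable, which makes exported files easy to compare
    target.write_text(json.dumps(data, indent=2, sort_keys=True), encoding="utf-8")
    return target


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        path = export_json({"added_tables": ["orders"]}, "shop", "diff", root=Path(tmp))
        print(path.read_text(encoding="utf-8"))
```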
### 8. View Snapshot History
```bash
datatrack history
```
Displays all snapshot timestamps and table counts.
### 9. Run the Full Pipeline
```bash
datatrack pipeline run
```
Runs `lint`, `snapshot`, `verify`, `diff`, and `export` together.
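In a CI job you may prefer to invoke the steps one at a time so that the failing step is obvious in the logs. The sketch below runs the same five subcommands in the order listed above and stops at the first failure. It is a hypothetical wrapper, not how `datatrack pipeline run` is implemented, and it assumes each subcommand signals failure through a non-zero exit code, which this README does not confirm.

```python
import subprocess
from typing import Callable, List, Sequence

# Step order as listed in this section
STEPS = ["lint", "snapshot", "verify", "diff", "export"]


def run_steps(
    steps: Sequence[str] = STEPS,
    runner: Callable[[List[str]], int] = lambda cmd: subprocess.run(cmd).returncode,
) -> List[str]:
    """Run `datatrack <step>` for each step, stopping at the first non-zero exit code."""
    completed: List[str] = []
    for step in steps:
        if runner(["datatrack", step]) != 0:
            raise RuntimeError(f"step '{step}' failed after {completed}")
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Dry run with a stand-in runner so the example works without a database
    print(run_steps(runner=lambda cmd: print("would run:", " ".join(cmd)) or 0))
```

The `runner` parameter exists so the sequencing logic can be exercised without a configured database; by default it shells out to the real CLI.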
For advanced use cases and integration into CI/CD, visit:
**https://github.com/nrnavaneet/datatrack**
Raw data
{
"_id": null,
"home_page": null,
"name": "datatrack-core",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "database, schema, version-control, migration, diff, sql, postgresql, mysql, sqlite",
"author": null,
"author_email": "N R Navaneet <navaneetnr@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/c7/eb/8e2b33d684119ea699bf66aa367395bcf8dcbf4732ab5ac748a26eabd587/datatrack_core-1.1.3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "High-Performance Version Control for Database Schemas with Intelligent Processing",
"version": "1.1.3",
"project_urls": {
"Bug Tracker": "https://github.com/nrnavaneet/datatrack/issues",
"Changelog": "https://github.com/nrnavaneet/datatrack/releases",
"Contributing": "https://github.com/nrnavaneet/datatrack/blob/main/docs/contribute/CONTRIBUTING.md",
"Development": "https://github.com/nrnavaneet/datatrack/blob/main/docs/DEVELOPMENT.md",
"Documentation": "https://github.com/nrnavaneet/datatrack/blob/main/README.md",
"Homepage": "https://github.com/nrnavaneet/datatrack",
"Installation": "https://github.com/nrnavaneet/datatrack/blob/main/docs/INSTALLATION.md",
"Repository": "https://github.com/nrnavaneet/datatrack",
"Usage Guide": "https://github.com/nrnavaneet/datatrack/blob/main/docs/USAGE.md"
},
"split_keywords": [
"database",
" schema",
" version-control",
" migration",
" diff",
" sql",
" postgresql",
" mysql",
" sqlite"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "23b41fdfcdeeb355ef4cdc52a535b660b64a405cdd63452dcd3733efd391bf5b",
"md5": "21b1b1d3d62ca7257eb097522d25f0ff",
"sha256": "9674435ff9d8dec2718d1c4530318abb98d7db14a7ba4fdede738dc1c5c46d33"
},
"downloads": -1,
"filename": "datatrack_core-1.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "21b1b1d3d62ca7257eb097522d25f0ff",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 35455,
"upload_time": "2025-07-30T08:18:18",
"upload_time_iso_8601": "2025-07-30T08:18:18.850788Z",
"url": "https://files.pythonhosted.org/packages/23/b4/1fdfcdeeb355ef4cdc52a535b660b64a405cdd63452dcd3733efd391bf5b/datatrack_core-1.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "c7eb8e2b33d684119ea699bf66aa367395bcf8dcbf4732ab5ac748a26eabd587",
"md5": "61140b67bd30697d31c9ab71b181596e",
"sha256": "09eca4c6aad40590a341fd0453b9e19e6eec4649254d9a3043429a2de49aae9d"
},
"downloads": -1,
"filename": "datatrack_core-1.1.3.tar.gz",
"has_sig": false,
"md5_digest": "61140b67bd30697d31c9ab71b181596e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 46495,
"upload_time": "2025-07-30T08:18:20",
"upload_time_iso_8601": "2025-07-30T08:18:20.592358Z",
"url": "https://files.pythonhosted.org/packages/c7/eb/8e2b33d684119ea699bf66aa367395bcf8dcbf4732ab5ac748a26eabd587/datatrack_core-1.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-30 08:18:20",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "nrnavaneet",
"github_project": "datatrack",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "typer",
"specs": [
[
">=",
"0.9.0"
]
]
},
{
"name": "PyYAML",
"specs": [
[
">=",
"6.0"
]
]
},
{
"name": "sqlalchemy",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "psycopg2-binary",
"specs": [
[
">=",
"2.9.0"
]
]
},
{
"name": "pymysql",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "pytest",
"specs": []
},
{
"name": "isort",
"specs": []
}
],
"lcname": "datatrack-core"
}