datatrack-core

Name	datatrack-core
Version	1.1.3
home_page	None
Summary	High-Performance Version Control for Database Schemas with Intelligent Processing
upload_time	2025-07-30 08:18:20
maintainer	None
docs_url	None
author	None
requires_python	>=3.7
license	MIT
keywords	database schema version-control migration diff sql postgresql mysql sqlite
requirements	typer PyYAML sqlalchemy psycopg2-binary pymysql pytest isort

# Datatrack - Version Control for Database Schemas

A high-performance CLI tool that brings Git-like version control to your database schemas with intelligent processing optimizations. Built for Data Engineers, Analytics Engineers, and Platform Teams.

## Key Features

- **High Performance**: 70-75% faster schema introspection on large schemas (200+ tables) than standard processing
- **Intelligent Processing**: Auto-selects optimal strategy based on schema size
- **Multi-Database Support**: PostgreSQL, MySQL, SQLite, SQL Server
- **Schema Comparison**: Generate detailed diffs between versions
- **Quality Linting**: Enforce naming conventions and best practices
- **Multiple Export Formats**: JSON, YAML, Markdown, HTML

## Performance Improvements

| Schema Size | Processing Method | Performance Gain |
|---------------|----------------------|------------------|
| 1-49 tables | Standard | Baseline |
| 50-199 tables | Parallel (4 workers) | 65-70% faster |
| 200+ tables | Parallel + Batched | 70-75% faster |
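
The thresholds in the table can be read as a simple selection rule. The following is a minimal sketch of that rule, not Datatrack's actual implementation: the function name `choose_strategy` and the returned labels are hypothetical, and only the 50-table and 200-table boundaries come from the table above.

```python
def choose_strategy(table_count: int) -> str:
    """Map a table count to a processing method, following the table above.

    Hypothetical helper for illustration: the thresholds (50 and 200 tables)
    come from the table; the function and its return labels do not.
    """
    if table_count < 1:
        raise ValueError("table_count must be at least 1")
    if table_count < 50:
        return "standard"
    if table_count < 200:
        return "parallel"
    return "parallel+batched"


if __name__ == "__main__":
    for count in (10, 120, 450):
        print(count, choose_strategy(count))
```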

## Installation

```bash
pip install datatrack-core
```

To install from a local checkout of the source in editable mode:

```bash
pip install -e .
```

This method is ideal if you want to contribute to or modify the tool.

## Helpful Commands

Datatrack comes with built-in help and guidance for every command. Use this to quickly learn syntax and options:
```bash
datatrack --help
# or
datatrack -h
```

## How to Use

### 1. Initialize Tracking

```bash
datatrack init
```

Creates `.datatrack/`, `.databases/`, and optional initial files.

### 2. Connect to a Database

Save your DB connection for future use:

#### MySQL

```bash
datatrack connect mysql+pymysql://root:<password>@localhost:3306/<database-name>
```

#### PostgreSQL

```bash
datatrack connect postgresql+psycopg2://postgres:<password>@localhost:5432/<database-name>
```

#### SQLite

```bash
datatrack connect sqlite:///.databases/<database-name>
```
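
The connection strings above embed the database password, so it may be worth masking it before a URL ends up in logs or CI output. The sketch below uses only the Python standard library; `redact_password` is a hypothetical helper and is not part of Datatrack.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_password(url: str) -> str:
    """Return the connection URL with any password replaced by '***'.

    Hypothetical helper for illustration; not part of the datatrack CLI.
    """
    parts = urlsplit(url)
    # Split "user:password@host:port" on the last '@' so that passwords
    # containing '@' are still handled.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    if not at or ":" not in userinfo:
        return url  # no password present (e.g. SQLite file URLs)
    user = userinfo.split(":", 1)[0]
    return urlunsplit(parts._replace(netloc=f"{user}:***@{hostport}"))


if __name__ == "__main__":
    print(redact_password("mysql+pymysql://root:secret@localhost:3306/shop"))
    print(redact_password("sqlite:///.databases/app.db"))
```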

### 3. Take a Schema Snapshot

```bash
# Standard snapshot
datatrack snapshot

# High-performance snapshot with parallel processing
datatrack snapshot --parallel

# Custom performance configuration
datatrack snapshot --parallel --max-workers 8 --batch-size 50

# For large schemas (200+ tables) - automatically optimized
datatrack snapshot # Auto-enables parallel + batched processing
```

Saves the current schema to `.databases/exports/<db_name>/snapshots/`.
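
To give a sense of what a schema snapshot captures, here is a minimal sketch that introspects a SQLite database with the standard-library `sqlite3` module. It is an illustration only: the `snapshot_schema` function and the dictionary shape it returns are assumptions, and Datatrack's own snapshot format and introspection code may differ.

```python
import sqlite3


def snapshot_schema(conn: sqlite3.Connection) -> dict:
    """Return {table_name: [{"name": ..., "type": ...}, ...]} for user tables.

    Hypothetical helper; the returned shape is an assumption of this sketch.
    """
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    snapshot = {}
    for (table,) in tables:
        # Quote the identifier; PRAGMA statements cannot take bound parameters.
        quoted = table.replace('"', '""')
        # Each row is (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        snapshot[table] = [{"name": col[1], "type": col[2]} for col in columns]
    return snapshot


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
    print(snapshot_schema(conn))
```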

### 4. Lint the Schema

```bash
datatrack lint
```

Detects issues in naming and structure.
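
As an illustration of the kind of naming check a schema linter performs, the sketch below flags table and column names that are not `snake_case`. The rule, the `lint_names` function, and the input shape (table name mapped to a list of column names) are all assumptions made for this example; the rules Datatrack actually applies are not listed here.

```python
import re

# Lowercase words separated by single underscores, e.g. "order_items".
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def lint_names(snapshot: dict) -> list:
    """Return warnings for table and column names that are not snake_case.

    `snapshot` maps table names to lists of column names (an assumed shape).
    """
    warnings = []
    for table, columns in snapshot.items():
        if not SNAKE_CASE.match(table):
            warnings.append(f"table '{table}' is not snake_case")
        for column in columns:
            if not SNAKE_CASE.match(column):
                warnings.append(f"column '{table}.{column}' is not snake_case")
    return warnings


if __name__ == "__main__":
    example = {"users": ["id", "createdAt"], "OrderItems": ["order_id"]}
    for warning in lint_names(example):
        print(warning)
```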

### 5. Verify Schema Rules

```bash
datatrack verify
```

Validates schema against `schema_rules.yaml`.

### 6. View Schema Differences

```bash
datatrack diff
```

Shows table and column changes between the latest two snapshots.
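
Conceptually, a diff between two snapshots reduces to set comparisons over table and column names. The following sketch shows that idea under stated assumptions: snapshots are modeled as plain dictionaries mapping a table name to its column names, and `diff_snapshots` is a hypothetical function, not Datatrack's diff engine or output format.

```python
def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two {table: [column, ...]} mappings (an assumed snapshot shape)."""
    result = {
        "added_tables": sorted(new.keys() - old.keys()),
        "removed_tables": sorted(old.keys() - new.keys()),
        "changed_tables": {},
    }
    # For tables present in both snapshots, compare their column sets.
    for table in sorted(old.keys() & new.keys()):
        added = sorted(set(new[table]) - set(old[table]))
        removed = sorted(set(old[table]) - set(new[table]))
        if added or removed:
            result["changed_tables"][table] = {
                "added_columns": added,
                "removed_columns": removed,
            }
    return result


if __name__ == "__main__":
    old = {"users": ["id", "name"], "legacy": ["id"]}
    new = {"users": ["id", "email"], "orders": ["id"]}
    print(diff_snapshots(old, new))
```

A sketch like this ignores column types, constraints, and renames, which a real schema diff would also need to consider.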

### 7. Export Snapshots or Diffs

Export the latest snapshot as YAML (the default):
```bash
datatrack export
```

Explicitly export the latest snapshot as YAML:
```bash
datatrack export --type snapshot --format yaml
```

Export the latest diff as JSON:
```bash
datatrack export --type diff --format json
```

Output is saved in `.databases/exports/<db_name>/`.

### 8. View Snapshot History

```bash
datatrack history
```

Displays all snapshot timestamps and table counts.

### 9. Run the Full Pipeline

```bash
datatrack pipeline run
```

Runs `lint`, `snapshot`, `verify`, `diff`, and `export` together.
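
If you prefer to drive the individual steps yourself, for example from a CI script, the same sequence can be approximated by calling each subcommand in turn. The sketch below is an assumption-laden illustration: `run_steps` is a hypothetical wrapper, and stopping at the first failing step is a choice made here, not documented behavior of `datatrack pipeline run`.

```python
import subprocess

# Step order as listed above; each entry is a subcommand shown in this README.
STEPS = ["lint", "snapshot", "verify", "diff", "export"]


def run_steps(steps=STEPS, runner=subprocess.call) -> list:
    """Run `datatrack <step>` for each step, stopping at the first failure.

    Returns the steps that completed with exit code 0. The `runner` argument
    exists so the function can be exercised without the CLI installed.
    """
    completed = []
    for step in steps:
        exit_code = runner(["datatrack", step])
        if exit_code != 0:
            break  # fail fast: an assumption of this sketch
        completed.append(step)
    return completed
```

Calling `run_steps()` with the default runner requires the `datatrack` executable to be on your `PATH`.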

For advanced use cases and integration into CI/CD, visit:

**https://github.com/nrnavaneet/datatrack**

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "datatrack-core",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "database, schema, version-control, migration, diff, sql, postgresql, mysql, sqlite",
    "author": null,
    "author_email": "N R Navaneet <navaneetnr@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/c7/eb/8e2b33d684119ea699bf66aa367395bcf8dcbf4732ab5ac748a26eabd587/datatrack_core-1.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "High-Performance Version Control for Database Schemas with Intelligent Processing",
    "version": "1.1.3",
    "project_urls": {
        "Bug Tracker": "https://github.com/nrnavaneet/datatrack/issues",
        "Changelog": "https://github.com/nrnavaneet/datatrack/releases",
        "Contributing": "https://github.com/nrnavaneet/datatrack/blob/main/docs/contribute/CONTRIBUTING.md",
        "Development": "https://github.com/nrnavaneet/datatrack/blob/main/docs/DEVELOPMENT.md",
        "Documentation": "https://github.com/nrnavaneet/datatrack/blob/main/README.md",
        "Homepage": "https://github.com/nrnavaneet/datatrack",
        "Installation": "https://github.com/nrnavaneet/datatrack/blob/main/docs/INSTALLATION.md",
        "Repository": "https://github.com/nrnavaneet/datatrack",
        "Usage Guide": "https://github.com/nrnavaneet/datatrack/blob/main/docs/USAGE.md"
    },
    "split_keywords": [
        "database",
        " schema",
        " version-control",
        " migration",
        " diff",
        " sql",
        " postgresql",
        " mysql",
        " sqlite"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "23b41fdfcdeeb355ef4cdc52a535b660b64a405cdd63452dcd3733efd391bf5b",
                "md5": "21b1b1d3d62ca7257eb097522d25f0ff",
                "sha256": "9674435ff9d8dec2718d1c4530318abb98d7db14a7ba4fdede738dc1c5c46d33"
            },
            "downloads": -1,
            "filename": "datatrack_core-1.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "21b1b1d3d62ca7257eb097522d25f0ff",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 35455,
            "upload_time": "2025-07-30T08:18:18",
            "upload_time_iso_8601": "2025-07-30T08:18:18.850788Z",
            "url": "https://files.pythonhosted.org/packages/23/b4/1fdfcdeeb355ef4cdc52a535b660b64a405cdd63452dcd3733efd391bf5b/datatrack_core-1.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c7eb8e2b33d684119ea699bf66aa367395bcf8dcbf4732ab5ac748a26eabd587",
                "md5": "61140b67bd30697d31c9ab71b181596e",
                "sha256": "09eca4c6aad40590a341fd0453b9e19e6eec4649254d9a3043429a2de49aae9d"
            },
            "downloads": -1,
            "filename": "datatrack_core-1.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "61140b67bd30697d31c9ab71b181596e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 46495,
            "upload_time": "2025-07-30T08:18:20",
            "upload_time_iso_8601": "2025-07-30T08:18:20.592358Z",
            "url": "https://files.pythonhosted.org/packages/c7/eb/8e2b33d684119ea699bf66aa367395bcf8dcbf4732ab5ac748a26eabd587/datatrack_core-1.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-30 08:18:20",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nrnavaneet",
    "github_project": "datatrack",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "typer",
            "specs": [
                [
                    ">=",
                    "0.9.0"
                ]
            ]
        },
        {
            "name": "PyYAML",
            "specs": [
                [
                    ">=",
                    "6.0"
                ]
            ]
        },
        {
            "name": "sqlalchemy",
            "specs": [
                [
                    ">=",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "psycopg2-binary",
            "specs": [
                [
                    ">=",
                    "2.9.0"
                ]
            ]
        },
        {
            "name": "pymysql",
            "specs": [
                [
                    ">=",
                    "1.0.0"
                ]
            ]
        },
        {
            "name": "pytest",
            "specs": []
        },
        {
            "name": "isort",
            "specs": []
        }
    ],
    "lcname": "datatrack-core"
}
