# bq-schema-sync

- **Name**: bq-schema-sync
- **Version**: 0.2.1
- **Home page**: https://github.com/victorelexpe/bq-schema-sync
- **Summary**: A tool to synchronize BigQuery table schemas with local definitions
- **Upload time**: 2024-07-21 14:43:52
- **Author**: Victor Hasim Elexpe Ahamri
- **Requires Python**: >=3.8
- **License**: MIT
- **Keywords**: bigquery, schema, sync, gcp, google-cloud
- **Requirements**: google-cloud-bigquery, pyyaml

`bq-schema-sync` is a Python package designed to help synchronize Google BigQuery table schemas with local schema definitions. It provides tools for comparing schemas, applying changes, validating schema definitions, generating migration scripts, managing schema versions, and enforcing schema validation rules.

## Features

- **Schema Comparison**: Identify differences between local schema definitions and BigQuery table schemas.
- **Schema Synchronization**: Apply changes to BigQuery table schemas based on local definitions.
- **Validation**: Ensure that local schema definitions adhere to BigQuery constraints and best practices.
- **Migration Script Generation**: Generate SQL scripts to manually apply schema changes.
- **Schema Versioning**: Track schema changes over time, save versions, list versions, and apply specific versions.
- **Dry Run Mode**: Preview changes without applying them to the BigQuery table.
- **Schema Validation Rules**: Enforce custom validation rules to ensure schema definitions meet specific criteria.

## Installation

You can install `bq-schema-sync` using pip:

    pip install bq-schema-sync

## Usage

### Command-Line Interface (CLI)

#### Initialize Configuration

Generate a template configuration file:

    bq-schema-sync init

#### Compare Schemas

Compare the local schema with the BigQuery table schema:

    bq-schema-sync compare --config config.yaml --dry-run
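
To illustrate what a schema comparison involves, here is a minimal sketch in plain Python. It is not the package's implementation: the function name `diff_fields` and the shape of the returned dictionary are assumptions made for this example. It matches fields by name and reports additions, removals, and attribute changes.

```python
from typing import Any, Dict, List

Field = Dict[str, Any]


def diff_fields(local: List[Field], remote: List[Field]) -> Dict[str, List[Any]]:
    """Describe how `local` differs from `remote`, matching fields by name."""
    local_by_name = {f["name"]: f for f in local}
    remote_by_name = {f["name"]: f for f in remote}

    # Fields present on only one side.
    added = [name for name in local_by_name if name not in remote_by_name]
    removed = [name for name in remote_by_name if name not in local_by_name]

    # Fields present on both sides whose attributes differ.
    changed = []
    for name, local_field in local_by_name.items():
        remote_field = remote_by_name.get(name)
        if remote_field is None:
            continue
        for key in ("type", "mode", "description"):
            if local_field.get(key) != remote_field.get(key):
                changed.append((name, key, remote_field.get(key), local_field.get(key)))

    return {"added": added, "removed": removed, "changed": changed}


local = [
    {"name": "id", "type": "STRING", "mode": "REQUIRED"},
    {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE"},
]
remote = [
    {"name": "id", "type": "STRING", "mode": "NULLABLE"},
    {"name": "legacy_flag", "type": "BOOLEAN", "mode": "NULLABLE"},
]
result = diff_fields(local, remote)
print(result)
```

Each entry in `changed` is a tuple of field name, attribute, remote value, and local value, so the `id` field above is reported as a `mode` change from `NULLABLE` to `REQUIRED`.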

#### Apply Changes

Sync the BigQuery table schema with the local schema. The `--dry-run` flag shown here previews the changes without applying them:

    bq-schema-sync apply --config config.yaml --dry-run
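
The idea behind a dry run can be sketched in a few lines of plain Python. This is an illustration only, not the package's code: `apply_plan` and its parameters are hypothetical names. The executor is passed in as a callable, so the same plan can be previewed or executed.

```python
from typing import Callable, List


def apply_plan(statements: List[str], execute: Callable[[str], None],
               dry_run: bool = True) -> List[str]:
    """Run each statement through `execute`, or only report it when dry_run is set."""
    log = []
    for sql in statements:
        if dry_run:
            log.append(f"[dry-run] would run: {sql}")
        else:
            execute(sql)
            log.append(f"ran: {sql}")
    return log


# A list's append method stands in for a real executor here. With
# google-cloud-bigquery it could be: lambda sql: client.query(sql).result()
executed: List[str] = []
plan = ["ALTER TABLE `my_dataset.my_table` ADD COLUMN created_at TIMESTAMP"]

preview = apply_plan(plan, executed.append, dry_run=True)
executed_after_preview = list(executed)  # nothing has been executed yet

applied = apply_plan(plan, executed.append, dry_run=False)
print(preview[0])
print(applied[0])
```

Keeping the executor injectable also makes this kind of logic easy to test without touching a real table.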

#### Generate Migration Script

Generate a SQL script for manual schema migration:

    bq-schema-sync generate-script --config config.yaml --output migration.sql
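
As a rough idea of what such a script could contain, the sketch below builds BigQuery `ALTER TABLE ... ADD COLUMN` statements for new fields. The helper `add_column_statements` is hypothetical and the package's actual output may look different. The sketch ignores `mode`: BigQuery does not allow adding a `REQUIRED` column to an existing table, so added columns are nullable here, and `REPEATED` fields (which would need an `ARRAY<...>` type) are out of scope.

```python
from typing import Dict, List


def add_column_statements(table: str, new_fields: List[Dict[str, str]]) -> List[str]:
    """Build one ADD COLUMN statement per new field (nullable columns only)."""
    statements = []
    for field in new_fields:
        sql = f"ALTER TABLE `{table}` ADD COLUMN IF NOT EXISTS {field['name']} {field['type']}"
        description = field.get("description")
        if description:
            # Escape backslashes and double quotes for the SQL string literal.
            escaped = description.replace("\\", "\\\\").replace('"', '\\"')
            sql += f' OPTIONS(description="{escaped}")'
        statements.append(sql + ";")
    return statements


script = "\n".join(
    add_column_statements(
        "my_dataset.my_table",
        [{"name": "created_at", "type": "TIMESTAMP",
          "description": "Record creation timestamp"}],
    )
)
print(script)
```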

#### Validate Schema

Validate the local schema against BigQuery constraints and custom validation rules:

    bq-schema-sync validate --config config.yaml
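
The kinds of checks involved can be sketched as follows. This is not the package's rule set: `validate_fields`, the conservative name pattern, and the list of known types are assumptions for illustration, and the "every field needs a description" rule is a made-up example of a custom rule rather than a BigQuery constraint.

```python
import re
from typing import Any, Dict, List

# Conservative subset of BigQuery naming rules: start with a letter or
# underscore, then letters, digits, or underscores, up to 300 characters.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,299}")
ALLOWED_MODES = {"NULLABLE", "REQUIRED", "REPEATED"}
KNOWN_TYPES = {
    "STRING", "BYTES", "INTEGER", "INT64", "FLOAT", "FLOAT64", "NUMERIC",
    "BIGNUMERIC", "BOOLEAN", "BOOL", "TIMESTAMP", "DATE", "TIME", "DATETIME",
    "GEOGRAPHY", "JSON", "RECORD", "STRUCT",
}


def validate_fields(fields: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    seen = set()
    for index, field in enumerate(fields):
        name = str(field.get("name") or "")
        label = name or f"<field {index}>"
        if not NAME_PATTERN.fullmatch(name):
            errors.append(f"{label}: invalid column name")
        # BigQuery column names are case-insensitive, so compare in lower case.
        if name.lower() in seen:
            errors.append(f"{label}: duplicate column name")
        seen.add(name.lower())
        if str(field.get("type") or "").upper() not in KNOWN_TYPES:
            errors.append(f"{label}: unknown type {field.get('type')!r}")
        if str(field.get("mode") or "NULLABLE").upper() not in ALLOWED_MODES:
            errors.append(f"{label}: unknown mode {field.get('mode')!r}")
        # Hypothetical custom rule, not a BigQuery requirement.
        if not field.get("description"):
            errors.append(f"{label}: missing description")
    return errors


fields = [
    {"name": "id", "type": "STRING", "mode": "REQUIRED", "description": "Unique identifier"},
    {"name": "2fast", "type": "TEXT", "mode": "NULLABLE"},
]
errors = validate_fields(fields)
for message in errors:
    print(message)
```

Collecting all problems in a list, instead of raising on the first one, lets a single run report everything that needs fixing.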

#### Save Schema Version

Save the current schema version with a description:

    bq-schema-sync save-version --config config.yaml --description "Added new field 'email'"

#### List Schema Versions

List all saved schema versions:

    bq-schema-sync list-versions --config config.yaml

#### Apply Schema Version

Apply a specific schema version:

    bq-schema-sync apply-version --config config.yaml --version 1
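
One simple way to picture schema versioning is an append-only history file. The sketch below is an assumption about how this could work, not a description of how the package stores versions: the JSON file layout and the helper names `save_version`, `load_versions`, and `get_version` are invented for this example. The `version`, `timestamp`, and `description` keys mirror those printed in the Python API example in this README.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Any, Dict, List


def load_versions(path: Path) -> List[Dict[str, Any]]:
    """Read the saved history, or return an empty one if the file is missing."""
    return json.loads(path.read_text()) if path.exists() else []


def save_version(path: Path, schema: Dict[str, Any], description: str) -> int:
    """Append a new numbered snapshot of `schema` and return its version number."""
    versions = load_versions(path)
    number = len(versions) + 1
    versions.append({
        "version": number,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "description": description,
        "schema": schema,
    })
    path.write_text(json.dumps(versions, indent=2))
    return number


def get_version(path: Path, number: int) -> Dict[str, Any]:
    """Return the schema saved under `number`, ready to be applied."""
    for entry in load_versions(path):
        if entry["version"] == number:
            return entry["schema"]
    raise KeyError(f"no saved schema version {number}")


history = Path(tempfile.mkdtemp()) / "schema_versions.json"
schema_v1 = {"fields": [{"name": "id", "type": "STRING"}]}
schema_v2 = {"fields": [{"name": "id", "type": "STRING"},
                        {"name": "email", "type": "STRING"}]}
v1 = save_version(history, schema_v1, "Initial schema definition")
v2 = save_version(history, schema_v2, "Added new field 'email'")

for entry in load_versions(history):
    print(entry["version"], entry["description"])
```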

### Python API

You can also use `bq-schema-sync` as a Python module:

    from bq_schema_sync import SchemaSync
    from google.cloud import bigquery

    # Initialize SchemaSync with configuration details
    client = bigquery.Client(project='my-gcp-project')
    schema_sync = SchemaSync(
        project_id='my-gcp-project',
        dataset_id='my_dataset',
        table_id='my_table',
        schema={
            'fields': [
                {'name': 'id', 'type': 'STRING', 'mode': 'REQUIRED',
                 'description': 'Unique identifier'},
            ],
        },
        client=client
    )

    # Compare schemas
    differences = schema_sync.compare_schemas()
    print("Schema Differences:", differences)

    # Apply changes to sync the BigQuery table schema with the local schema
    schema_sync.apply_changes()

    # Generate migration script
    schema_sync.generate_migration_script('migration_scripts/update_my_table_schema.sql')

    # Validate the local schema
    schema_sync.validate_schema()

    # Save the current schema version
    schema_sync.save_version("Initial schema definition")

    # List all schema versions
    versions = schema_sync.list_versions()
    for version in versions:
        print(f"Version: {version['version']}, Timestamp: {version['timestamp']}, Description: {version['description']}")

    # Apply a specific schema version
    schema_sync.apply_version(1)

## Configuration File

The configuration file is written in YAML (for example `config.yaml`, the name used in the CLI commands above). It identifies the target table and defines its schema:

    project_id: your-gcp-project-id
    dataset_id: your-dataset-id
    table_id: your-table-id
    schema:
      fields:
        - name: id
          type: STRING
          mode: REQUIRED
          description: "Unique identifier"
        - name: created_at
          type: TIMESTAMP
          mode: NULLABLE
          description: "Record creation timestamp"
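
Before handing a configuration to any tool, it can help to check that it has the shape shown above. The following sketch does this for an already-parsed configuration. `check_config` is a hypothetical helper, not part of the package, and treating only `name` and `type` as mandatory per field is an assumption based on BigQuery defaulting `mode` to `NULLABLE`.

```python
from typing import Any, Dict, List

REQUIRED_KEYS = ("project_id", "dataset_id", "table_id", "schema")


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of structural problems; an empty list means the shape is fine."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]

    schema = config.get("schema")
    fields = schema.get("fields") if isinstance(schema, dict) else None
    if "schema" in config and not isinstance(fields, list):
        problems.append("schema.fields must be a list")

    for index, field in enumerate(fields if isinstance(fields, list) else []):
        for key in ("name", "type"):
            if not isinstance(field, dict) or key not in field:
                problems.append(f"schema.fields[{index}] is missing '{key}'")
    return problems


# In practice the dict would come from PyYAML, for example:
#     with open("config.yaml") as fh:
#         config = yaml.safe_load(fh)
config = {
    "project_id": "your-gcp-project-id",
    "dataset_id": "your-dataset-id",
    "table_id": "your-table-id",
    "schema": {
        "fields": [
            {"name": "id", "type": "STRING", "mode": "REQUIRED",
             "description": "Unique identifier"},
            {"name": "created_at", "type": "TIMESTAMP", "mode": "NULLABLE",
             "description": "Record creation timestamp"},
        ]
    },
}
print(check_config(config))
```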

## Testing

You can run tests using `unittest`:

    python -m unittest discover tests

## Contributing

Contributions are welcome! Please feel free to submit a pull request or open an issue.

## Author

This project is developed and maintained by Victor Hasim Elexpe Ahamri. You can follow me on Twitter [@victorelexpe](https://twitter.com/victorelexpe) and visit my website [elexpe.dev](https://elexpe.dev).

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

            
