reflekt


Namereflekt JSON
Version 0.6.0 PyPI version JSON
download
home_pagehttps://github.com/GClunies/Reflekt
SummaryA CLI tool to define event schemas, lint them to enforce conventions, publish to a schema registry, and build corresponding data artifacts (e.g., dbt sources/models/docs).
upload_time2024-03-05 04:24:45
maintainer
docs_urlNone
authorGregory Clunies
requires_python>=3.9,<3.12
licenseApache-2.0
keywords events jsonschema analytics business-intelligence data-modeling dbt snowflake redshift segment avo
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage
            <!--
SPDX-FileCopyrightText: 2022 Gregory Clunies <greg@reflekt-ci.com>

SPDX-License-Identifier: Apache-2.0
-->

# Reflekt
> /rəˈflek(t)/ - _to embody or represent in a faithful or appropriate way_

![PyPI](https://img.shields.io/pypi/v/reflekt?style=for-the-badge)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/reflekt?style=for-the-badge)
![GitHub](https://img.shields.io/github/license/gclunies/reflekt?style=for-the-badge)

Reflekt helps teams define, govern, and model event data for warehouse-first product analytics. It integrates with [Customer Data Platforms](#customer-data-platform-cdp), [Schema Registries](#schema-registry), [Data Warehouses](#data-warehouse), and [dbt](#dbt). Events are defined using [JSON schema](https://json-schema.org/), version controlled, and updated by pull requests (PRs), enabling:
  - Branches for parallel development and testing.
  - Reviews and discussion amongst teams and stakeholders.
  - CI/CD suites to:
    - `reflekt lint` schemas against naming and metadata rules.
    - `reflekt push` schemas to deploy them to a [schema registry](#schema-registry) for event data validation.

Stop writing SQL & YAML. `reflekt build` a [dbt](#dbt) package to model, document, and test events in the warehouse.

![reflekt build](docs/reflekt_build.gif)

To learn more about `reflekt`, checkout:
- [Getting Started](#getting-started)
- [CLI Command Reference](#cli-commands)
- [Integrations](#integrations)
- [reflekt-jaffle-shop](https://github.com/GClunies/reflekt-jaffle-shop/) and this [demo](https://www.loom.com/share/75b60cfc2b3549edafde4eedcb3c9631?sid=fb610521-c651-40f9-9de5-8f07a2534302)

## Getting Started
### Install
Reflekt is available on [PyPI](https://pypi.org/project/reflekt/). Install it with `pip` or package manager, preferably in a virtual environment:
```bash
❯ source /path/to/venv/bin/activate  # Activate virtual environment
❯ pip install reflekt                # Install Reflekt
❯ reflekt --version                  # Confirm installation
Reflekt CLI Version: 0.6.0
```
<br>

### Create a Reflekt Project
To create a new Reflekt project, make a directory, initialize a Git repo, and run `reflekt init`.
```bash
❯ mkdir ~/Repos/my-reflekt-project  # Create a new directory for the project
❯ cd ~/Repos/my-reflekt-project     # Navigate to the project directory
❯ git init                          # Initialize a new Git repo (REQUIRED)
❯ reflekt init                      # Initialize a new Reflekt project in the current directory

# Follow the prompts to configure the project
```

You now have a Reflekt project with the structure:
```bash
my-reflekt-project
├── .logs/                # CLI command logs
├── .reflekt_cache/       # Local cache used by reflekt CLI
├── artifacts/            # Data artifacts created by `reflekt build`
├── schemas/              # Event schema definitions
├── .gitignore
├── README.md
└── reflekt_project.yml   # Project configuration
```
<br>

### Configure a Reflekt Project
Reflekt uses 3 files to define and configure a Reflekt project.

#### `reflekt_project.yml`
Defines project settings, event and metadata conventions, data artifact generation, and optional registry config (Avo only).

<details>
<summary>Example: <code>reflekt_project.yml</code> (click to expand)</summary>
<br>

```yaml
# Example reflekt_project.yml
# GENERAL CONFIG ----------------------------------------------------------------------
version: 1.0

name: reflekt_demo              # Project name
vendor: com.company_name        # Default vendor for schemas in reflekt project
default_profile: dev_reflekt    # Default profile to use from reflekt_profiles.yml
profiles_path: ~/.reflekt/reflekt_profiles.yml  # Path to reflekt_profiles.yml

# SCHEMAS CONFIG ----------------------------------------------------------------------
schemas:                        # Define schema conventions
  conventions:
    event:
      casing: title             # title | snake | camel | pascal | any
      numbers: false            # Allow numbers in event names
      reserved: []              # Reserved event names
    property:
      casing: snake             # title | snake | camel | pascal | any
      numbers: false            # Allow numbers in property names
      reserved: []              # Reserved property names
    data_types: [               # Allowed data types
        string, integer, number, boolean, object, array, any, 'null'
    ]

# REGISTRY CONFIG ---------------------------------------------------------------------
registry:                       # Additional config for schema registry if needed
  avo:                          # Avo specific config
    branches:                   # Provide ID for Avo branches for `reflekt pull` to work
      staging: AbC12dEfG        # Safe to version control (See Avo docs to find branch ID: https://bit.ly/avo-docs-branch-id)
      main: main                # 'main' always refers to the main branch

# ARTIFACTS CONFIG -----------------------------------------------------------------------
artifacts:                      # Configure how data artifacts are built
  dbt:                          # dbt package config
    sources:
      prefix: __src_            # Source files will start with this prefix
    models:
      prefix: stg_              # Model files will start with this prefix
    docs:
      prefix: _stg_             # Docs files will start with this prefix
      in_folder: false          # Docs files in separate folder?
      tests:                    # dbt tests to add based on column name (can be empty dict {})
        id: [unique, not_null]
```
</details>

#### `reflekt_profiles.yml`
This file defines connections to schema registries and data warehouse connections.

<details>
<summary>Example: <code>reflekt_profiles.yml</code> (click to expand)</summary>
<br>

```yaml
# Example reflekt_profiles.yml
version: 1.0

dev_reflekt:                                         # Profile name (multiple allowed)
  registry:                                          # Schema registry connection details (multiple allowed)
    - type: segment
      api_token: segment_api_token                   # https://docs.segmentapis.com/tag/Getting-Started#section/Get-an-API-token

    - type: avo
      workspace_id: avo_workspace_id                 # https://www.avo.app/docs/public-api/export-tracking-plan#endpoint
      service_account_name: avo_service_account_name # https://www.avo.app/docs/public-api/authentication#creating-service-accounts
      service_account_secret: avo_service_account_secret

  source:                          # Data warehouse connection details (multiple allowed)
    - id: snowflake                # ID must be unique per profile
      type: snowflake              # Specify details where raw event data is stored
      account: abc12345
      database: raw
      warehouse: transforming
      role: transformer
      user: reflekt_user           # Create reflekt_user with access to raw data (permissions: USAGE, SELECT)
      password: reflekt_user_password

    - id: redshift                 # ID must be unique per profile
      type: redshift               # Specify details where raw event data is stored
      host: example-redshift-cluster-1.abc123.us-west-1.redshift.amazonaws.com
      database: analytics
      port: 5439
      user: reflekt_user           # Create reflekt_user with access to raw data (permissions: USAGE, SELECT)
      password: reflekt_user_password

    - id: bigquery                 # ID must be unique per profile
      type: bigquery               # Specify details where raw event data is stored
      project: raw-data
      dataset: jaffle_shop_segment
      keyfile_json:                # Create a reflekt-user service account with access to BigQuery project where raw event data lands (permissions: BigQuery Data Viewer, BigQuery Job User). Include JSON keyfile fields below.
        type: "service_account"
        project_id: "foo-bar-123456"
        private_key_id: "abc123def456ghi789"
        private_key: "-----BEGIN PRIVATE KEY-----\nmy-very-long-private-keyF\n\n-----END PRIVATE KEY-----\n"
        client_email: "reflekt-user@foo-bar-123456.iam.gserviceaccount.com"
        client_id: "123456789101112131415161718"
        auth_uri: "https://accounts.google.com/o/oauth2/auth"
        token_uri: "https://oauth2.googleapis.com/token"
        auth_provider_x509_cert_url: "https://www.googleapis.com/oauth2/v1/certs"
        client_x509_cert_url: "https://www.googleapis.com/robot/v1/metadata/x509/reflekt-user%40foo-bar-123456.iam.gserviceaccount.com"
```
</details>

#### `schemas/.reflekt/meta/1-0.json`
A meta-schema used by `reflekt lint` to ensure all events in `schemas/` follow the Reflekt format. Can also be used to define gloablly required metadata for all event schemas.

<details>
<summary>Example: <code>schemas/.reflekt/meta/1-0.json</code> (click to expand)</summary>
<br>

```json
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": ".reflekt/meta/1-0.json",
    "description": "Meta-schema for all Reflekt events",
    "self": {
        "vendor": "reflekt",
        "name": "meta",
        "format": "jsonschema",
        "version": "1-0"
    },
    "type": "object",
    "allOf": [
        {
            "$ref": "http://json-schema.org/draft-07/schema#"
        },
        {
            "properties": {
                "self": {
                    "type": "object",
                    "properties": {
                        "vendor": {
                            "type": "string",
                            "description": "The company, application, team, or system that authored the schema (e.g., com.company, com.company.android, com.company.marketing)"
                        },
                        "name": {
                            "type": "string",
                            "description": "The schema name. Describes what the schema is meant to capture (e.g., pageViewed, clickedLink)"
                        },
                        "format": {
                            "type": "string",
                            "description": "The format of the schema",
                            "const": "jsonschema"
                        },
                        "version": {
                            "type": "string",
                            "description": "The schema version, in MODEL-ADDITION format (e.g., 1-0, 1-1, 2-3, etc.)",
                            "pattern": "^[1-9][0-9]*-(0|[1-9][0-9]*)$"
                        },
                        "metadata": {  // EXAMPLE: Defining required metadata (code_owner, product_owner)
                            "type": "object",
                            "description": "Required metadata for all event schemas",
                            "properties": {
                                "code_owner": {"type": "string"},
                                "product_owner": {"type": "string"}
                            },
                            "required": ["code_owner", "product_owner"],
                            "additionalProperties": false
                        },
                    },
                    "required": ["vendor", "name", "format", "version"],
                    "additionalProperties": false
                },
                "properties": {},
                "tests": {},
            },
            "required": ["self", "metadata", "properties"]
        }
    ]
}

```
</details>
<br>

### Defining Event Schemas
Events in a Reflekt project are defined using the [JSON schema](https://json-schema.org/) specification and are stored in the `schemas/` directory of the project. Click to expand the `Order Completed` example below.

<details>
<summary>Example: <code>my-reflekt-project/schemas/jaffle_shop/Order_Completed/1-0.json</code> (click to expand)</summary>
<br>

```json
{
    "$schema": "http://json-schema.org/draft-07/schema#",
    "$id": "jaffle_shop/Order_Completed/1-0.json",  // Unique ID for schema (relative to `schemas/` dir)
    "description": "User completed an order.",      // Event description (REQUIRED)
    "self": {
        "vendor": "com.thejaffleshop",                         // Company, application, or system that authored the schema
        "name": "Order Completed",                             // Name of the event
        "format": "jsonschema",                                // Format of the schema
        "version": "1-0",                                      // Version of the schema
        "metadata": {                                          // Metadata for the event
            "code_owner": "@the-jaffle-shop/frontend-guild",
            "product_owner": "pmanager@thejaffleshop.com",
        }
    },
    "type": "object",
    "properties": {                                            // Event properties (REQUIRED, but can be empty)
        "coupon": {
            "description": "Coupon code used for the order.",  // Property description (REQUIRED)
            "type": [
                "string",
                "null"                                         // Allow null values
            ]
        },
        "currency": {
            "description": "Currency for the order.",
            "type": "string"
        },
        "discount": {
            "description": "Total discount for the order.",
            "type": "number"
        },
        "order_id": {
            "description": "Unique identifier for the order.",
            "type": "string"
        },
        "products": {
            "description": "List of products in the cart.",
            "type": "array",                                    // Array type
            "items": {                                          // Items in the array
                "type": "object",
                "properties": {                                 // Properties of the items
                    "category": {
                        "description": "Category of the product.",
                        "type": "string"
                    },
                    "name": {
                        "description": "Name of the product.",
                        "type": "string"
                    },
                    "price": {
                        "description": "Price of the product.",
                        "type": "number"
                    },
                    "product_id": {
                        "description": "Unique identifier for the product.",
                        "type": "string"
                    },
                    "quantity": {
                        "description": "Quantity of the product in the cart.",
                        "type": "integer"
                    },
                    "sku": {
                        "description": "Stock keeping unit for the product.",
                        "type": "string"
                    }
                },
                "required": [                                  // Required properties for the items
                    "product_id",
                    "sku",
                    "category",
                    "name",
                    "price",
                    "quantity"
                ],
                "additionalProperties": false,                // Are additional item properties allowed?
            }
        },
        "revenue": {
            "description": "Total revenue for the order.",
            "type": "number"
        },
        "session_id": {
            "description": "Unique identifier for the session.",
            "type": "string"
        },
        "shipping": {
            "description": "Shipping cost for the order.",
            "type": "number"
        },
        "subtotal": {
            "description": "Subtotal for the order (revenue - discount).",
            "type": "number"
        },
        "tax": {
            "description": "Tax for the order.",
            "type": "number"
        },
        "total": {
            "description": "Total cost for the order (revenue - discount + shipping + tax = subtotal + shipping + tax).",
            "type": "number"
        }
    },
    "required": [                  // Required properties (can be empty)
        "session_id",
        "order_id",
        "revenue",
        "coupon",
        "discount",
        "subtotal",
        "shipping",
        "tax",
        "total",
        "currency",
        "products"
    ],
    "additionalProperties": false  // Are additional properties allowed?
}
```
</details>


#### Schema `$id` and `version`
Schemas in a Reflekt project are identified and `--select`ed by their `$id`, which is their path relative to the `schemas/` directory. For example:
| File Path to Schema                                                   | Schema `$id`                       |
|-----------------------------------------------------------------------|------------------------------------|
| `~/repos/my-reflekt-project/schemas/jaffle_shop/Cart_Viewed/1-0.json` | `jaffle_shop/Cart_Viewed/1-0.json` |
| `~/repos/my-reflekt-project/schemas/jaffle_shop/Cart_Viewed/2-1.json` | `jaffle_shop/Cart_Viewed/2-1.json` |

Each schema has a `version` (e.g., `1-0`, `2-1`), used to indicate changes to data collection requirements. New event schemas start at `1-0` and follow a `MAJOR-MINOR` version spec, as shown in the table below.
| Type  | Description | Example | Use Case |
|-------|---------------------------------------------------|---------|----------|
| MAJOR | Breaking change incompatible with previous data. | `1-0`, `2-0`<br>(ends in `-0) | - Add/remove/rename a *required* property<br> - Change a property from *optional to required*<br> - Change a property's type |
| MINOR | Non-breaking change compatible with previous data. | `1-1`, `2-3` | - Add/remove/rename an *optional* property<br> - Change a property from *required to optional* |

> [!NOTE]
> For `MINOR` schema versions (non-breaking changes), you can either:
> - Update the existing schema and increment the MINOR version number.
> - Create a new `.json` file with the updated schema and increment the MINOR version number.
>
> For `MAJOR` schema versions (breaking changes), you MUST:
> - **Create a new `.json` file** with the updated schema. This way, an application/product/feature can begin using the new schema while others continue to use the old schema (migrating later).

<br>

### Linting Event Schemas
Schemas can be linted to test if they follow the naming conventions in your [`reflekt_project.yml`] and metadata conventions in `schemas/.reflekt/meta/1-0.json`.

```bash
❯ reflekt lint --select schemas/jaffle_shop
[18:31:12] INFO     Running with reflekt=0.5.0
[18:31:12] INFO     Searching for JSON schemas in: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop
[18:31:12] INFO     Found 9 schema(s) to lint
[18:31:12] INFO     1 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Order_Completed/1-0.json
[18:31:19] INFO     2 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Identify/1-0.json
[18:31:20] INFO     3 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Clicked/1-0.json
[18:31:24] INFO     4 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Cart_Viewed/1-0.json
[18:31:26] INFO     5 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Removed/1-0.json
[18:31:32] INFO     6 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Added/1-0.json
[18:31:37] INFO     7 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Viewed/1-0.json
[18:31:40] INFO     8 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Completed/1-0.json
[18:31:44] INFO     9 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Page_Viewed/1-0.json
[18:31:51] INFO     Completed successfully
```
<br>

### Sending Event Schemas to a Schema Registries
In order to validate events as they flow from **Application -> Registry -> Customer Data Platform (CDP) -> Data Warehouse**, we need to send a copy of our event schemas to a schema registry (see [supported registries](#schema-registry)). This is done with the `reflekt push` command.

```bash
❯ reflekt push --registry segment --select schemas/jaffle_shop
[18:41:05] INFO     Running with reflekt=0.5.0
[18:41:06] INFO     Searching for JSON schemas in: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop
[18:41:06] INFO     Found 9 schemas to push
[18:41:06] INFO     1 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Order_Completed/1-0.json
[18:41:06] INFO     2 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Identify/1-0.json
[18:41:06] INFO     3 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Clicked/1-0.json
[18:41:06] INFO     4 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Cart_Viewed/1-0.json
[18:41:06] INFO     5 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Removed/1-0.json
[18:41:06] INFO     6 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Added/1-0.json
[18:41:06] INFO     7 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Viewed/1-0.json
[18:41:06] INFO     8 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Completed/1-0.json
[18:41:06] INFO     9 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Page_Viewed/1-0.json
[18:41:08] INFO     Completed successfully
```
<br>

### Building `dbt` Packages to Model Event Data
Modeling event data in `dbt` is a lot of work. Everyone wants staging models that are clean, documented, and tested. But who wants to write and maintain SQL and YAML for hundreds of events?

You don't have to choose. Put `reflekt build` to work for you - staging models, documentation, even tests - all in a dbt package ready for you to use in your dbt project.

```bash
❯ reflekt build --artifact dbt --select schemas/jaffle_shop --source snowflake.raw.jaffle_shop_segment --sdk segment
[18:56:25] INFO     Running with reflekt=0.5.0
[18:56:26] INFO     Searching for JSON schemas in: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop
[18:56:26] INFO     Found 9 schemas to build
[18:56:27] INFO     Building dbt package:
                        name: jaffle_shop
                        dir: /Users/gclunies/Repos/reflekt/artifacts/dbt/jaffle_shop
                        --select: jaffle_shop
                        --sdk_arg: segment
                        --source: snowflake.raw.jaffle_shop_segment
[18:56:27] INFO     Building dbt source 'jaffle_shop_segment'
[18:56:27] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Order_Completed/1-0.json
[18:56:28] INFO     Building dbt table 'order_completed' in source 'jaffle_shop_segment'
[18:56:28] INFO     Building staging model 'stg_jaffle_shop_segment__order_completed.sql'
[18:56:28] INFO     Building dbt documentation '_stg_jaffle_shop_segment__order_completed.yml'
[18:56:28] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Identify/1-0.json
[18:56:29] INFO     Building dbt table 'identifies' in source 'jaffle_shop_segment'
[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__identifies.sql'
[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__identifies.yml'
[18:56:29] INFO     Building dbt artifacts for schema: Segment 'users' table
[18:56:29] INFO     Building dbt table 'users' in source 'jaffle_shop_segment'
[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__users.sql'
[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__users.yml'
[18:56:29] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Clicked/1-0.json
[18:56:29] INFO     Building dbt table 'product_clicked' in source 'jaffle_shop_segment'
[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__product_clicked.sql'
[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__product_clicked.yml'
[18:56:29] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Cart_Viewed/1-0.json
[18:56:29] INFO     Building dbt table 'cart_viewed' in source 'jaffle_shop_segment'
[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__cart_viewed.sql'
[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__cart_viewed.yml'
[18:56:29] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Removed/1-0.json
[18:56:30] INFO     Building dbt table 'product_removed' in source 'jaffle_shop_segment'
[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__product_removed.sql'
[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__product_removed.yml'
[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Added/1-0.json
[18:56:30] INFO     Building dbt table 'product_added' in source 'jaffle_shop_segment'
[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__product_added.sql'
[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__product_added.yml'
[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Viewed/1-0.json
[18:56:30] INFO     Building dbt table 'checkout_step_viewed' in source 'jaffle_shop_segment'
[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__checkout_step_viewed.sql'
[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__checkout_step_viewed.yml'
[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Completed/1-0.json
[18:56:30] INFO     Building dbt table 'checkout_step_completed' in source 'jaffle_shop_segment'
[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__checkout_step_completed.sql'
[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__checkout_step_completed.yml'
[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Page_Viewed/1-0.json
[18:56:30] INFO     Building dbt table 'pages' in source 'jaffle_shop_segment'
[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__pages.sql'
[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__pages.yml'
[18:56:30] INFO     Building dbt artifacts for schema: Segment 'tracks' table
[18:56:31] INFO     Building dbt table 'tracks' in source 'jaffle_shop_segment'
[18:56:31] INFO     Building staging model 'stg_jaffle_shop_segment__tracks.sql'
[18:56:31] INFO     Building dbt documentation '_stg_jaffle_shop_segment__tracks.yml'
[18:56:31] INFO     Copying dbt package from temporary path /Users/gclunies/Repos/reflekt/.reflekt_cache/artifacts/dbt/jaffle_shop to /Users/gclunies/Repos/reflekt/artifacts/dbt/jaffle_shop
[18:56:31] INFO     Successfully built dbt package
```
---
<br>

## CLI Commands
A description of commands can be seen by running `reflekt --help`. The help page for each CLI command is shown below.

### `reflekt init`
```bash
❯ reflekt init --help
[11:17:16] INFO     Running with reflekt=0.6.0

 Usage: reflekt init [OPTIONS]

 Initialize a Reflekt project.

╭─ Options ──────────────────────────────────────────────────────╮
│ --dir                        TEXT  [default: .]                │
│ --verbose    --no-verbose          [default: no-verbose]       │
│ --help                             Show this message and exit. │
╰────────────────────────────────────────────────────────────────╯
```

### `reflekt debug`
```bash
❯ reflekt debug --help
[11:18:07] INFO     Running with reflekt=0.6.0

 Usage: reflekt debug [OPTIONS]

 Check Reflekt project configuration.

╭─ Options ───────────────────────────────────╮
│ --help          Show this message and exit. │
╰─────────────────────────────────────────────╯
```

### `reflekt lint`
```bash
❯ reflekt lint --help
[11:20:29] INFO     Running with reflekt=0.6.0

 Usage: reflekt lint [OPTIONS]

 Lint schema(s) to test for naming and metadata conventions.

╭─ Options ───────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --select   -s      TEXT  Schema(s) to lint. Starting with 'schemas/' is optional. [default: None] [required] │
│    --verbose  -v            Verbose logging.                                                                        │
│    --help                   Show this message and exit.                                                             │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

### `reflekt push`
```bash
❯ reflekt push --help
[11:22:37] INFO     Running with reflekt=0.6.0

 Usage: reflekt push [OPTIONS]

 Push schema(s) to a schema registry.

╭─ Options ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --registry  -r      [avo|segment]  Schema registry to push to. [default: None] [required]                                          │
│ *  --select    -s      TEXT           Schema(s) to push to schema registry. Starting with 'schemas/' is optional. [default: None] │
│    --delete    -D                     Delete schema(s) from schema registry. Prompts for confirmation                                 │
│    --force     -F                     Force command to run without confirmation.                                                      │
│    --profile   -p      TEXT           Profile in reflekt_profiles.yml to use for schema registry connection.                          │
│    --verbose   -v                     Verbose logging.                                                                                │
│    --help                             Show this message and exit.                                                                     │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

### `reflekt pull`
```bash
❯ reflekt pull --help
[11:25:18] INFO     Running with reflekt=0.6.0

 Usage: reflekt pull [OPTIONS]

 Pull schema(s) from a schema registry.

╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --registry  -r      [avo|segment]  Schema registry to pull from. [default: None] [required]                                                                         │
│ *  --select    -s      TEXT           Schema(s) to pull from schema registry. If registry uses tracking plans, starting with the plan name. [default: None] [required] │
│    --profile   -p      TEXT           Profile in reflekt_profiles.yml to use for schema registry connection.                                                           │
│    --verbose   -v                     Verbose logging.                                                                                                                 │
│    --help                             Show this message and exit.                                                                                                      │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

### `reflekt build`
```bash
❯ reflekt build --help
[11:31:36] INFO     Running with reflekt=0.6.0

 Usage: reflekt build [OPTIONS]

 Build data artifacts based on schemas.

╭─ Options ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ *  --artifact  -a      [dbt]      Type of data artifact to build. [default: None] [required]                                                                                                         │
│ *  --select    -s      TEXT       Schema(s) to build data artifacts for. Starting with 'schemas/' is optional. [default: None] [required]                                                            │
│ *  --sdk               [segment]  The SDK used to collect the event data. [default: None] [required]                                                                                                 │
│ *  --source            TEXT       The <source_id>.<database>.<schema> storing raw event data. <source_id> must be a data warehouse source defined in reflekt_profiles.yml [default: None] [required] │
│    --profile   -p      TEXT       Profile in reflekt_profiles.yml to look for the data source specified by the --source option. Defaults to default_profile in reflekt_project.yml                   │
│    --verbose   -v                 Verbose logging.                                                                                                                                                   │
│    --help                         Show this message and exit.                                                                                                                                        │
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```

### `reflekt report`
```bash
❯ reflekt report --help
[08:20:09] INFO     Running with reflekt=0.6.0

 Usage: reflekt report [OPTIONS]

 Generate Markdown report(s) for schema(s).

╭─ Options ───────────────────────────────────────────────────────────────────────────────────────╮
│ *  --select   -s      TEXT  Schema(s) to generate Markdown report(s) for. Starting with         │
│                             'schemas/' is optional.                                             │
│                             [default: None]                                                     │
│                             [required]                                                          │
│    --to-file  -f            Write report(s) to file instead of terminal.                        │
│    --verbose  -v            Verbose logging.                                                    │
│    --help                   Show this message and exit.                                         │
╰─────────────────────────────────────────────────────────────────────────────────────────────────╯
```

---
<br>

## Integrations

### Customer Data Platform (CDP)
Reflekt understands how Customer Data Platforms (CDPs) collect event data and load them into data warehouses, allowing it to:
  - Parse the schemas in a Reflekt project.
  - Find matching tables for events and columns for properties in the data warehouse.
  - Build a `dbt` package with sources, staging models, and documentation for event data.

| CDP | Supported |
|-----|-----------|
| [Segment](https://segment.com/) | ✅ |
| [Rudderstack](https://www.rudderstack.com/) | 🚧 Coming Soon 🚧 |
| [Amplitude](https://amplitude.com/) | 🚧 Coming Soon 🚧  |


### Schema Registry
Schema registries store and serve schemas. When a schema is registered in a regsitry, it can be used to validate events as they flow through your data collection infrastructure. Reflekt works with schema registries from CDPs, SaaS vendors, and open-source projects, letting teams to decide between managed and self-hosted solutions.

| Registry |  Open Source  | Schema Versions | Recommended Workflow |
|----------|:-------------:|-----------------|----------------------|
| [Segment Protocols](https://segment.com/docs/protocols/) | ❌ | `MAJOR` only | Manage schemas in Reflekt.<br> `reflekt push` to Protocols for event validation.<br> `reflekt build --artifact dbt` to build dbt package. |
| [Avo](https://www.avo.app/) | ❌ | `MAJOR` only | Manage schemas in Avo.<br> `reflekt pull` to get schemas.<br>  `reflekt build --artifact dbt` to build dbt package. |
| [reflekt-registry](https://github.com/GClunies/reflekt-registry)<br> 🚧 Coming Soon 🚧 | ✅ | `MAJOR` & `MINOR` |  Manage schemas in Reflekt.<br> `reflekt push` to reflekt-registry.<br> `reflekt build --artifact dbt` to build dbt package. |

### Data Warehouse
In order to build dbt packages, Reflekt needs to connect to a cloud data warehouse where raw event data is stored.

| Data Warehouse | Supported |
|----------------|-----------|
| [Snowflake](https://www.snowflake.com/) | ✅ |
| [Redshift](https://aws.amazon.com/redshift/) | ✅ |
| [BigQuery](https://cloud.google.com/bigquery) | ✅ |

Reflekt **NEVER** copies, moves, or modifies events in the data warehouse. It ONLY reads table and column names for templating.

### dbt
[dbt](https://www.getdbt.com/) enables anyone that knows SQL to transform data in a cloud data warehouse. When modeling in dbt, it is [best practice](https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview) to:
- Define sources pointing to the raw data.
- Write staging models that [rename, recast, or usefully reconsider](https://discourse.getdbt.com/t/how-we-used-to-structure-our-dbt-projects/355#data-transformation-101-1) columns into a consistent format.
- Document and test the staging models.

But as the number of events intsrumented in your product grow or change as the product evolves, mainting this best practice can be **burdensome and boring.** This is where [`reflekt build`](#reflekt-build) steps in.

## Contribute
- Source Code: [github.com/GClunies/reflekt](https://github.com/GClunies/reflekt)
- Issue Tracker: [github.com/GClunies/reflekt/issues](https://github.com/GClunies/reflekt/issues)
- Pull Requests: [github.com/GClunies/reflekt/pulls](https://github.com/GClunies/reflekt/pulls)

## License
This project is [licensed](LICENSE) under the Apache-2.0 License.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/GClunies/Reflekt",
    "name": "reflekt",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9,<3.12",
    "maintainer_email": "",
    "keywords": "events,jsonschema,analytics,business-intelligence,data-modeling,dbt,snowflake,redshift,segment,avo",
    "author": "Gregory Clunies",
    "author_email": "greg@reflekt-ci.com",
    "download_url": "https://files.pythonhosted.org/packages/a3/23/79a73b1b0c48b775ce5f5c0f433807384624d53858bce4c2eedd2381e263/reflekt-0.6.0.tar.gz",
    "platform": null,
    "description": "<!--\nSPDX-FileCopyrightText: 2022 Gregory Clunies <greg@reflekt-ci.com>\n\nSPDX-License-Identifier: Apache-2.0\n-->\n\n# Reflekt\n> /r\u0259\u02c8flek(t)/ - _to embody or represent in a faithful or appropriate way_\n\n![PyPI](https://img.shields.io/pypi/v/reflekt?style=for-the-badge)\n![PyPI - Python Version](https://img.shields.io/pypi/pyversions/reflekt?style=for-the-badge)\n![GitHub](https://img.shields.io/github/license/gclunies/reflekt?style=for-the-badge)\n\nReflekt helps teams define, govern, and model event data for warehouse-first product analytics. It integrates with [Customer Data Platforms](#customer-data-platform-cdp), [Schema Registries](#schema-registry), [Data Warehouses](#data-warehouse), and [dbt](#dbt). Events are defined using [JSON schema](https://json-schema.org/), version controlled, and updated by pull requests (PRs), enabling:\n  - Branches for parallel development and testing.\n  - Reviews and discussion amongst teams and stakeholders.\n  - CI/CD suites to:\n    - `reflekt lint` schemas against naming and metadata rules.\n    - `reflekt push` schemas to deploy them to a [schema registry](#schema-registry) for event data validation.\n\nStop writing SQL & YAML. `reflekt build` a [dbt](#dbt) package to model, document, and test events in the warehouse.\n\n![reflekt build](docs/reflekt_build.gif)\n\nTo learn more about `reflekt`, checkout:\n- [Getting Started](#getting-started)\n- [CLI Command Reference](#cli-commands)\n- [Integrations](#integrations)\n- [reflekt-jaffle-shop](https://github.com/GClunies/reflekt-jaffle-shop/) and this [demo](https://www.loom.com/share/75b60cfc2b3549edafde4eedcb3c9631?sid=fb610521-c651-40f9-9de5-8f07a2534302)\n\n## Getting Started\n### Install\nReflekt is available on [PyPI](https://pypi.org/project/reflekt/). Install it with `pip` or package manager, preferably in a virtual environment:\n```bash\n\u276f source /path/to/venv/bin/activate  # Activate virtual environment\n\u276f pip install reflekt                # Install Reflekt\n\u276f reflekt --version                  # Confirm installation\nReflekt CLI Version: 0.6.0\n```\n<br>\n\n### Create a Reflekt Project\nTo create a new Reflekt project, make a directory, initialize a Git repo, and run `reflekt init`.\n```bash\n\u276f mkdir ~/Repos/my-reflekt-project  # Create a new directory for the project\n\u276f cd ~/Repos/my-reflekt-project     # Navigate to the project directory\n\u276f git init                          # Initialize a new Git repo (REQUIRED)\n\u276f reflekt init                      # Initialize a new Reflekt project in the current directory\n\n# Follow the prompts to configure the project\n```\n\nYou now have a Reflekt project with the structure:\n```bash\nmy-reflekt-project\n\u251c\u2500\u2500 .logs/                # CLI command logs\n\u251c\u2500\u2500 .reflekt_cache/       # Local cache used by reflekt CLI\n\u251c\u2500\u2500 artifacts/            # Data artifacts created by `reflekt build`\n\u251c\u2500\u2500 schemas/              # Event schema definitions\n\u251c\u2500\u2500 .gitignore\n\u251c\u2500\u2500 README.md\n\u2514\u2500\u2500 reflekt_project.yml   # Project configuration\n```\n<br>\n\n### Configure a Reflekt Project\nReflekt uses 3 files to define and configure a Reflekt project.\n\n#### `reflekt_project.yml`\nDefines project settings, event and metadata conventions, data artifact generation, and optional registry config (Avo only).\n\n<details>\n<summary>Example: <code>reflekt_project.yml</code> (click to expand)</summary>\n<br>\n\n```yaml\n# Example reflekt_project.yml\n# GENERAL CONFIG ----------------------------------------------------------------------\nversion: 1.0\n\nname: reflekt_demo              # Project name\nvendor: com.company_name        # Default vendor for schemas in reflekt project\ndefault_profile: dev_reflekt    # Default profile to use from reflekt_profiles.yml\nprofiles_path: ~/.reflekt/reflekt_profiles.yml  # Path to reflekt_profiles.yml\n\n# SCHEMAS CONFIG ----------------------------------------------------------------------\nschemas:                        # Define schema conventions\n  conventions:\n    event:\n      casing: title             # title | snake | camel | pascal | any\n      numbers: false            # Allow numbers in event names\n      reserved: []              # Reserved event names\n    property:\n      casing: snake             # title | snake | camel | pascal | any\n      numbers: false            # Allow numbers in property names\n      reserved: []              # Reserved property names\n    data_types: [               # Allowed data types\n        string, integer, number, boolean, object, array, any, 'null'\n    ]\n\n# REGISTRY CONFIG ---------------------------------------------------------------------\nregistry:                       # Additional config for schema registry if needed\n  avo:                          # Avo specific config\n    branches:                   # Provide ID for Avo branches for `reflekt pull` to work\n      staging: AbC12dEfG        # Safe to version control (See Avo docs to find branch ID: https://bit.ly/avo-docs-branch-id)\n      main: main                # 'main' always refers to the main branch\n\n# ARTIFACTS CONFIG -----------------------------------------------------------------------\nartifacts:                      # Configure how data artifacts are built\n  dbt:                          # dbt package config\n    sources:\n      prefix: __src_            # Source files will start with this prefix\n    models:\n      prefix: stg_              # Model files will start with this prefix\n    docs:\n      prefix: _stg_             # Docs files will start with this prefix\n      in_folder: false          # Docs files in separate folder?\n      tests:                    # dbt tests to add based on column name (can be empty dict {})\n        id: [unique, not_null]\n```\n</details>\n\n#### `reflekt_profiles.yml`\nThis file defines connections to schema registries and data warehouse connections.\n\n<details>\n<summary>Example: <code>reflekt_profiles.yml</code> (click to expand)</summary>\n<br>\n\n```yaml\n# Example reflekt_profiles.yml\nversion: 1.0\n\ndev_reflekt:                                         # Profile name (multiple allowed)\n  registry:                                          # Schema registry connection details (multiple allowed)\n    - type: segment\n      api_token: segment_api_token                   # https://docs.segmentapis.com/tag/Getting-Started#section/Get-an-API-token\n\n    - type: avo\n      workspace_id: avo_workspace_id                 # https://www.avo.app/docs/public-api/export-tracking-plan#endpoint\n      service_account_name: avo_service_account_name # https://www.avo.app/docs/public-api/authentication#creating-service-accounts\n      service_account_secret: avo_service_account_secret\n\n  source:                          # Data warehouse connection details (multiple allowed)\n    - id: snowflake                # ID must be unique per profile\n      type: snowflake              # Specify details where raw event data is stored\n      account: abc12345\n      database: raw\n      warehouse: transforming\n      role: transformer\n      user: reflekt_user           # Create reflekt_user with access to raw data (permissions: USAGE, SELECT)\n      password: reflekt_user_password\n\n    - id: redshift                 # ID must be unique per profile\n      type: redshift               # Specify details where raw event data is stored\n      host: example-redshift-cluster-1.abc123.us-west-1.redshift.amazonaws.com\n      database: analytics\n      port: 5439\n      user: reflekt_user           # Create reflekt_user with access to raw data (permissions: USAGE, SELECT)\n      password: reflekt_user_password\n\n    - id: bigquery                 # ID must be unique per profile\n      type: bigquery               # Specify details where raw event data is stored\n      project: raw-data\n      dataset: jaffle_shop_segment\n      keyfile_json:                # Create a reflekt-user service account with access to BigQuery project where raw event data lands (permissions: BigQuery Data Viewer, BigQuery Job User). Include JSON keyfile fields below.\n        type: \"service_account\"\n        project_id: \"foo-bar-123456\"\n        private_key_id: \"abc123def456ghi789\"\n        private_key: \"-----BEGIN PRIVATE KEY-----\\nmy-very-long-private-keyF\\n\\n-----END PRIVATE KEY-----\\n\"\n        client_email: \"reflekt-user@foo-bar-123456.iam.gserviceaccount.com\"\n        client_id: \"123456789101112131415161718\"\n        auth_uri: \"https://accounts.google.com/o/oauth2/auth\"\n        token_uri: \"https://oauth2.googleapis.com/token\"\n        auth_provider_x509_cert_url: \"https://www.googleapis.com/oauth2/v1/certs\"\n        client_x509_cert_url: \"https://www.googleapis.com/robot/v1/metadata/x509/reflekt-user%40foo-bar-123456.iam.gserviceaccount.com\"\n```\n</details>\n\n#### `schemas/.reflekt/meta/1-0.json`\nA meta-schema used by `reflekt lint` to ensure all events in `schemas/` follow the Reflekt format. Can also be used to define gloablly required metadata for all event schemas.\n\n<details>\n<summary>Example: <code>schemas/.reflekt/meta/1-0.json</code> (click to expand)</summary>\n<br>\n\n```json\n{\n    \"$schema\": \"http://json-schema.org/draft-07/schema#\",\n    \"$id\": \".reflekt/meta/1-0.json\",\n    \"description\": \"Meta-schema for all Reflekt events\",\n    \"self\": {\n        \"vendor\": \"reflekt\",\n        \"name\": \"meta\",\n        \"format\": \"jsonschema\",\n        \"version\": \"1-0\"\n    },\n    \"type\": \"object\",\n    \"allOf\": [\n        {\n            \"$ref\": \"http://json-schema.org/draft-07/schema#\"\n        },\n        {\n            \"properties\": {\n                \"self\": {\n                    \"type\": \"object\",\n                    \"properties\": {\n                        \"vendor\": {\n                            \"type\": \"string\",\n                            \"description\": \"The company, application, team, or system that authored the schema (e.g., com.company, com.company.android, com.company.marketing)\"\n                        },\n                        \"name\": {\n                            \"type\": \"string\",\n                            \"description\": \"The schema name. Describes what the schema is meant to capture (e.g., pageViewed, clickedLink)\"\n                        },\n                        \"format\": {\n                            \"type\": \"string\",\n                            \"description\": \"The format of the schema\",\n                            \"const\": \"jsonschema\"\n                        },\n                        \"version\": {\n                            \"type\": \"string\",\n                            \"description\": \"The schema version, in MODEL-ADDITION format (e.g., 1-0, 1-1, 2-3, etc.)\",\n                            \"pattern\": \"^[1-9][0-9]*-(0|[1-9][0-9]*)$\"\n                        },\n                        \"metadata\": {  // EXAMPLE: Defining required metadata (code_owner, product_owner)\n                            \"type\": \"object\",\n                            \"description\": \"Required metadata for all event schemas\",\n                            \"properties\": {\n                                \"code_owner\": {\"type\": \"string\"},\n                                \"product_owner\": {\"type\": \"string\"}\n                            },\n                            \"required\": [\"code_owner\", \"product_owner\"],\n                            \"additionalProperties\": false\n                        },\n                    },\n                    \"required\": [\"vendor\", \"name\", \"format\", \"version\"],\n                    \"additionalProperties\": false\n                },\n                \"properties\": {},\n                \"tests\": {},\n            },\n            \"required\": [\"self\", \"metadata\", \"properties\"]\n        }\n    ]\n}\n\n```\n</details>\n<br>\n\n### Defining Event Schemas\nEvents in a Reflekt project are defined using the [JSON schema](https://json-schema.org/) specification and are stored in the `schemas/` directory of the project. Click to expand the `Order Completed` example below.\n\n<details>\n<summary>Example: <code>my-reflekt-project/schemas/jaffle_shop/Order_Completed/1-0.json</code> (click to expand)</summary>\n<br>\n\n```json\n{\n    \"$schema\": \"http://json-schema.org/draft-07/schema#\",\n    \"$id\": \"jaffle_shop/Order_Completed/1-0.json\",  // Unique ID for schema (relative to `schemas/` dir)\n    \"description\": \"User completed an order.\",      // Event description (REQUIRED)\n    \"self\": {\n        \"vendor\": \"com.thejaffleshop\",                         // Company, application, or system that authored the schema\n        \"name\": \"Order Completed\",                             // Name of the event\n        \"format\": \"jsonschema\",                                // Format of the schema\n        \"version\": \"1-0\",                                      // Version of the schema\n        \"metadata\": {                                          // Metadata for the event\n            \"code_owner\": \"@the-jaffle-shop/frontend-guild\",\n            \"product_owner\": \"pmanager@thejaffleshop.com\",\n        }\n    },\n    \"type\": \"object\",\n    \"properties\": {                                            // Event properties (REQUIRED, but can be empty)\n        \"coupon\": {\n            \"description\": \"Coupon code used for the order.\",  // Property description (REQUIRED)\n            \"type\": [\n                \"string\",\n                \"null\"                                         // Allow null values\n            ]\n        },\n        \"currency\": {\n            \"description\": \"Currency for the order.\",\n            \"type\": \"string\"\n        },\n        \"discount\": {\n            \"description\": \"Total discount for the order.\",\n            \"type\": \"number\"\n        },\n        \"order_id\": {\n            \"description\": \"Unique identifier for the order.\",\n            \"type\": \"string\"\n        },\n        \"products\": {\n            \"description\": \"List of products in the cart.\",\n            \"type\": \"array\",                                    // Array type\n            \"items\": {                                          // Items in the array\n                \"type\": \"object\",\n                \"properties\": {                                 // Properties of the items\n                    \"category\": {\n                        \"description\": \"Category of the product.\",\n                        \"type\": \"string\"\n                    },\n                    \"name\": {\n                        \"description\": \"Name of the product.\",\n                        \"type\": \"string\"\n                    },\n                    \"price\": {\n                        \"description\": \"Price of the product.\",\n                        \"type\": \"number\"\n                    },\n                    \"product_id\": {\n                        \"description\": \"Unique identifier for the product.\",\n                        \"type\": \"string\"\n                    },\n                    \"quantity\": {\n                        \"description\": \"Quantity of the product in the cart.\",\n                        \"type\": \"integer\"\n                    },\n                    \"sku\": {\n                        \"description\": \"Stock keeping unit for the product.\",\n                        \"type\": \"string\"\n                    }\n                },\n                \"required\": [                                  // Required properties for the items\n                    \"product_id\",\n                    \"sku\",\n                    \"category\",\n                    \"name\",\n                    \"price\",\n                    \"quantity\"\n                ],\n                \"additionalProperties\": false,                // Are additional item properties allowed?\n            }\n        },\n        \"revenue\": {\n            \"description\": \"Total revenue for the order.\",\n            \"type\": \"number\"\n        },\n        \"session_id\": {\n            \"description\": \"Unique identifier for the session.\",\n            \"type\": \"string\"\n        },\n        \"shipping\": {\n            \"description\": \"Shipping cost for the order.\",\n            \"type\": \"number\"\n        },\n        \"subtotal\": {\n            \"description\": \"Subtotal for the order (revenue - discount).\",\n            \"type\": \"number\"\n        },\n        \"tax\": {\n            \"description\": \"Tax for the order.\",\n            \"type\": \"number\"\n        },\n        \"total\": {\n            \"description\": \"Total cost for the order (revenue - discount + shipping + tax = subtotal + shipping + tax).\",\n            \"type\": \"number\"\n        }\n    },\n    \"required\": [                  // Required properties (can be empty)\n        \"session_id\",\n        \"order_id\",\n        \"revenue\",\n        \"coupon\",\n        \"discount\",\n        \"subtotal\",\n        \"shipping\",\n        \"tax\",\n        \"total\",\n        \"currency\",\n        \"products\"\n    ],\n    \"additionalProperties\": false  // Are additional properties allowed?\n}\n```\n</details>\n\n\n#### Schema `$id` and `version`\nSchemas in a Reflekt project are identified and `--select`ed by their `$id`, which is their path relative to the `schemas/` directory. For example:\n| File Path to Schema                                                   | Schema `$id`                       |\n|-----------------------------------------------------------------------|------------------------------------|\n| `~/repos/my-reflekt-project/schemas/jaffle_shop/Cart_Viewed/1-0.json` | `jaffle_shop/Cart_Viewed/1-0.json` |\n| `~/repos/my-reflekt-project/schemas/jaffle_shop/Cart_Viewed/2-1.json` | `jaffle_shop/Cart_Viewed/2-1.json` |\n\nEach schema has a `version` (e.g., `1-0`, `2-1`), used to indicate changes to data collection requirements. New event schemas start at `1-0` and follow a `MAJOR-MINOR` version spec, as shown in the table below.\n| Type  | Description | Example | Use Case |\n|-------|---------------------------------------------------|---------|----------|\n| MAJOR | Breaking change incompatible with previous data. | `1-0`, `2-0`<br>(ends in `-0) | - Add/remove/rename a *required* property<br> - Change a property from *optional to required*<br> - Change a property's type |\n| MINOR | Non-breaking change compatible with previous data. | `1-1`, `2-3` | - Add/remove/rename an *optional* property<br> - Change a property from *required to optional* |\n\n> [!NOTE]\n> For `MINOR` schema versions (non-breaking changes), you can either:\n> - Update the existing schema and increment the MINOR version number.\n> - Create a new `.json` file with the updated schema and increment the MINOR version number.\n>\n> For `MAJOR` schema versions (breaking changes), you MUST:\n> - **Create a new `.json` file** with the updated schema. This way, an application/product/feature can begin using the new schema while others continue to use the old schema (migrating later).\n\n<br>\n\n### Linting Event Schemas\nSchemas can be linted to test if they follow the naming conventions in your [`reflekt_project.yml`] and metadata conventions in `schemas/.reflekt/meta/1-0.json`.\n\n```bash\n\u276f reflekt lint --select schemas/jaffle_shop\n[18:31:12] INFO     Running with reflekt=0.5.0\n[18:31:12] INFO     Searching for JSON schemas in: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop\n[18:31:12] INFO     Found 9 schema(s) to lint\n[18:31:12] INFO     1 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Order_Completed/1-0.json\n[18:31:19] INFO     2 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Identify/1-0.json\n[18:31:20] INFO     3 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Clicked/1-0.json\n[18:31:24] INFO     4 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Cart_Viewed/1-0.json\n[18:31:26] INFO     5 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Removed/1-0.json\n[18:31:32] INFO     6 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Added/1-0.json\n[18:31:37] INFO     7 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Viewed/1-0.json\n[18:31:40] INFO     8 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Completed/1-0.json\n[18:31:44] INFO     9 of 9 Linting /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Page_Viewed/1-0.json\n[18:31:51] INFO     Completed successfully\n```\n<br>\n\n### Sending Event Schemas to a Schema Registries\nIn order to validate events as they flow from **Application -> Registry -> Customer Data Platform (CDP) -> Data Warehouse**, we need to send a copy of our event schemas to a schema registry (see [supported registries](#schema-registry)). This is done with the `reflekt push` command.\n\n```bash\n\u276f reflekt push --registry segment --select schemas/jaffle_shop\n[18:41:05] INFO     Running with reflekt=0.5.0\n[18:41:06] INFO     Searching for JSON schemas in: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop\n[18:41:06] INFO     Found 9 schemas to push\n[18:41:06] INFO     1 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Order_Completed/1-0.json\n[18:41:06] INFO     2 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Identify/1-0.json\n[18:41:06] INFO     3 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Clicked/1-0.json\n[18:41:06] INFO     4 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Cart_Viewed/1-0.json\n[18:41:06] INFO     5 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Removed/1-0.json\n[18:41:06] INFO     6 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Added/1-0.json\n[18:41:06] INFO     7 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Viewed/1-0.json\n[18:41:06] INFO     8 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Completed/1-0.json\n[18:41:06] INFO     9 of 9 Pushing /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Page_Viewed/1-0.json\n[18:41:08] INFO     Completed successfully\n```\n<br>\n\n### Building `dbt` Packages to Model Event Data\nModeling event data in `dbt` is a lot of work. Everyone wants staging models that are clean, documented, and tested. But who wants to write and maintain SQL and YAML for hundreds of events?\n\nYou don't have to choose. Put `reflekt build` to work for you - staging models, documentation, even tests - all in a dbt package ready for you to use in your dbt project.\n\n```bash\n\u276f reflekt build --artifact dbt --select schemas/jaffle_shop --source snowflake.raw.jaffle_shop_segment --sdk segment\n[18:56:25] INFO     Running with reflekt=0.5.0\n[18:56:26] INFO     Searching for JSON schemas in: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop\n[18:56:26] INFO     Found 9 schemas to build\n[18:56:27] INFO     Building dbt package:\n                        name: jaffle_shop\n                        dir: /Users/gclunies/Repos/reflekt/artifacts/dbt/jaffle_shop\n                        --select: jaffle_shop\n                        --sdk_arg: segment\n                        --source: snowflake.raw.jaffle_shop_segment\n[18:56:27] INFO     Building dbt source 'jaffle_shop_segment'\n[18:56:27] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Order_Completed/1-0.json\n[18:56:28] INFO     Building dbt table 'order_completed' in source 'jaffle_shop_segment'\n[18:56:28] INFO     Building staging model 'stg_jaffle_shop_segment__order_completed.sql'\n[18:56:28] INFO     Building dbt documentation '_stg_jaffle_shop_segment__order_completed.yml'\n[18:56:28] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Identify/1-0.json\n[18:56:29] INFO     Building dbt table 'identifies' in source 'jaffle_shop_segment'\n[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__identifies.sql'\n[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__identifies.yml'\n[18:56:29] INFO     Building dbt artifacts for schema: Segment 'users' table\n[18:56:29] INFO     Building dbt table 'users' in source 'jaffle_shop_segment'\n[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__users.sql'\n[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__users.yml'\n[18:56:29] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Clicked/1-0.json\n[18:56:29] INFO     Building dbt table 'product_clicked' in source 'jaffle_shop_segment'\n[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__product_clicked.sql'\n[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__product_clicked.yml'\n[18:56:29] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Cart_Viewed/1-0.json\n[18:56:29] INFO     Building dbt table 'cart_viewed' in source 'jaffle_shop_segment'\n[18:56:29] INFO     Building staging model 'stg_jaffle_shop_segment__cart_viewed.sql'\n[18:56:29] INFO     Building dbt documentation '_stg_jaffle_shop_segment__cart_viewed.yml'\n[18:56:29] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Removed/1-0.json\n[18:56:30] INFO     Building dbt table 'product_removed' in source 'jaffle_shop_segment'\n[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__product_removed.sql'\n[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__product_removed.yml'\n[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Product_Added/1-0.json\n[18:56:30] INFO     Building dbt table 'product_added' in source 'jaffle_shop_segment'\n[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__product_added.sql'\n[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__product_added.yml'\n[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Viewed/1-0.json\n[18:56:30] INFO     Building dbt table 'checkout_step_viewed' in source 'jaffle_shop_segment'\n[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__checkout_step_viewed.sql'\n[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__checkout_step_viewed.yml'\n[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Checkout_Step_Completed/1-0.json\n[18:56:30] INFO     Building dbt table 'checkout_step_completed' in source 'jaffle_shop_segment'\n[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__checkout_step_completed.sql'\n[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__checkout_step_completed.yml'\n[18:56:30] INFO     Building dbt artifacts for schema: /Users/gclunies/Repos/reflekt/schemas/jaffle_shop/Page_Viewed/1-0.json\n[18:56:30] INFO     Building dbt table 'pages' in source 'jaffle_shop_segment'\n[18:56:30] INFO     Building staging model 'stg_jaffle_shop_segment__pages.sql'\n[18:56:30] INFO     Building dbt documentation '_stg_jaffle_shop_segment__pages.yml'\n[18:56:30] INFO     Building dbt artifacts for schema: Segment 'tracks' table\n[18:56:31] INFO     Building dbt table 'tracks' in source 'jaffle_shop_segment'\n[18:56:31] INFO     Building staging model 'stg_jaffle_shop_segment__tracks.sql'\n[18:56:31] INFO     Building dbt documentation '_stg_jaffle_shop_segment__tracks.yml'\n[18:56:31] INFO     Copying dbt package from temporary path /Users/gclunies/Repos/reflekt/.reflekt_cache/artifacts/dbt/jaffle_shop to /Users/gclunies/Repos/reflekt/artifacts/dbt/jaffle_shop\n[18:56:31] INFO     Successfully built dbt package\n```\n---\n<br>\n\n## CLI Commands\nA description of commands can be seen by running `reflekt --help`. The help page for each CLI command is shown below.\n\n### `reflekt init`\n```bash\n\u276f reflekt init --help\n[11:17:16] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt init [OPTIONS]\n\n Initialize a Reflekt project.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 --dir                        TEXT  [default: .]                \u2502\n\u2502 --verbose    --no-verbose          [default: no-verbose]       \u2502\n\u2502 --help                             Show this message and exit. \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n### `reflekt debug`\n```bash\n\u276f reflekt debug --help\n[11:18:07] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt debug [OPTIONS]\n\n Check Reflekt project configuration.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 --help          Show this message and exit. \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n### `reflekt lint`\n```bash\n\u276f reflekt lint --help\n[11:20:29] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt lint [OPTIONS]\n\n Lint schema(s) to test for naming and metadata conventions.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 *  --select   -s      TEXT  Schema(s) to lint. Starting with 'schemas/' is optional. [default: None] [required] \u2502\n\u2502    --verbose  -v            Verbose logging.                                                                        \u2502\n\u2502    --help                   Show this message and exit.                                                             \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n### `reflekt push`\n```bash\n\u276f reflekt push --help\n[11:22:37] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt push [OPTIONS]\n\n Push schema(s) to a schema registry.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 *  --registry  -r      [avo|segment]  Schema registry to push to. [default: None] [required]                                          \u2502\n\u2502 *  --select    -s      TEXT           Schema(s) to push to schema registry. Starting with 'schemas/' is optional. [default: None] \u2502\n\u2502    --delete    -D                     Delete schema(s) from schema registry. Prompts for confirmation                                 \u2502\n\u2502    --force     -F                     Force command to run without confirmation.                                                      \u2502\n\u2502    --profile   -p      TEXT           Profile in reflekt_profiles.yml to use for schema registry connection.                          \u2502\n\u2502    --verbose   -v                     Verbose logging.                                                                                \u2502\n\u2502    --help                             Show this message and exit.                                                                     \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n### `reflekt pull`\n```bash\n\u276f reflekt pull --help\n[11:25:18] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt pull [OPTIONS]\n\n Pull schema(s) from a schema registry.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 *  --registry  -r      [avo|segment]  Schema registry to pull from. [default: None] [required]                                                                         \u2502\n\u2502 *  --select    -s      TEXT           Schema(s) to pull from schema registry. If registry uses tracking plans, starting with the plan name. [default: None] [required] \u2502\n\u2502    --profile   -p      TEXT           Profile in reflekt_profiles.yml to use for schema registry connection.                                                           \u2502\n\u2502    --verbose   -v                     Verbose logging.                                                                                                                 \u2502\n\u2502    --help                             Show this message and exit.                                                                                                      \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n### `reflekt build`\n```bash\n\u276f reflekt build --help\n[11:31:36] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt build [OPTIONS]\n\n Build data artifacts based on schemas.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 *  --artifact  -a      [dbt]      Type of data artifact to build. [default: None] [required]                                                                                                         \u2502\n\u2502 *  --select    -s      TEXT       Schema(s) to build data artifacts for. Starting with 'schemas/' is optional. [default: None] [required]                                                            \u2502\n\u2502 *  --sdk               [segment]  The SDK used to collect the event data. [default: None] [required]                                                                                                 \u2502\n\u2502 *  --source            TEXT       The <source_id>.<database>.<schema> storing raw event data. <source_id> must be a data warehouse source defined in reflekt_profiles.yml [default: None] [required] \u2502\n\u2502    --profile   -p      TEXT       Profile in reflekt_profiles.yml to look for the data source specified by the --source option. Defaults to default_profile in reflekt_project.yml                   \u2502\n\u2502    --verbose   -v                 Verbose logging.                                                                                                                                                   \u2502\n\u2502    --help                         Show this message and exit.                                                                                                                                        \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n### `reflekt report`\n```bash\n\u276f reflekt report --help\n[08:20:09] INFO     Running with reflekt=0.6.0\n\n Usage: reflekt report [OPTIONS]\n\n Generate Markdown report(s) for schema(s).\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 *  --select   -s      TEXT  Schema(s) to generate Markdown report(s) for. Starting with         \u2502\n\u2502                             'schemas/' is optional.                                             \u2502\n\u2502                             [default: None]                                                     \u2502\n\u2502                             [required]                                                          \u2502\n\u2502    --to-file  -f            Write report(s) to file instead of terminal.                        \u2502\n\u2502    --verbose  -v            Verbose logging.                                                    \u2502\n\u2502    --help                   Show this message and exit.                                         \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n---\n<br>\n\n## Integrations\n\n### Customer Data Platform (CDP)\nReflekt understands how Customer Data Platforms (CDPs) collect event data and load them into data warehouses, allowing it to:\n  - Parse the schemas in a Reflekt project.\n  - Find matching tables for events and columns for properties in the data warehouse.\n  - Build a `dbt` package with sources, staging models, and documentation for event data.\n\n| CDP | Supported |\n|-----|-----------|\n| [Segment](https://segment.com/) | \u2705 |\n| [Rudderstack](https://www.rudderstack.com/) | \ud83d\udea7 Coming Soon \ud83d\udea7 |\n| [Amplitude](https://amplitude.com/) | \ud83d\udea7 Coming Soon \ud83d\udea7  |\n\n\n### Schema Registry\nSchema registries store and serve schemas. When a schema is registered in a regsitry, it can be used to validate events as they flow through your data collection infrastructure. Reflekt works with schema registries from CDPs, SaaS vendors, and open-source projects, letting teams to decide between managed and self-hosted solutions.\n\n| Registry |  Open Source  | Schema Versions | Recommended Workflow |\n|----------|:-------------:|-----------------|----------------------|\n| [Segment Protocols](https://segment.com/docs/protocols/) | \u274c | `MAJOR` only | Manage schemas in Reflekt.<br> `reflekt push` to Protocols for event validation.<br> `reflekt build --artifact dbt` to build dbt package. |\n| [Avo](https://www.avo.app/) | \u274c | `MAJOR` only | Manage schemas in Avo.<br> `reflekt pull` to get schemas.<br>  `reflekt build --artifact dbt` to build dbt package. |\n| [reflekt-registry](https://github.com/GClunies/reflekt-registry)<br> \ud83d\udea7 Coming Soon \ud83d\udea7 | \u2705 | `MAJOR` & `MINOR` |  Manage schemas in Reflekt.<br> `reflekt push` to reflekt-registry.<br> `reflekt build --artifact dbt` to build dbt package. |\n\n### Data Warehouse\nIn order to build dbt packages, Reflekt needs to connect to a cloud data warehouse where raw event data is stored.\n\n| Data Warehouse | Supported |\n|----------------|-----------|\n| [Snowflake](https://www.snowflake.com/) | \u2705 |\n| [Redshift](https://aws.amazon.com/redshift/) | \u2705 |\n| [BigQuery](https://cloud.google.com/bigquery) | \u2705 |\n\nReflekt **NEVER** copies, moves, or modifies events in the data warehouse. It ONLY reads table and column names for templating.\n\n### dbt\n[dbt](https://www.getdbt.com/) enables anyone that knows SQL to transform data in a cloud data warehouse. When modeling in dbt, it is [best practice](https://docs.getdbt.com/guides/best-practices/how-we-structure/1-guide-overview) to:\n- Define sources pointing to the raw data.\n- Write staging models that [rename, recast, or usefully reconsider](https://discourse.getdbt.com/t/how-we-used-to-structure-our-dbt-projects/355#data-transformation-101-1) columns into a consistent format.\n- Document and test the staging models.\n\nBut as the number of events intsrumented in your product grow or change as the product evolves, mainting this best practice can be **burdensome and boring.** This is where [`reflekt build`](#reflekt-build) steps in.\n\n## Contribute\n- Source Code: [github.com/GClunies/reflekt](https://github.com/GClunies/reflekt)\n- Issue Tracker: [github.com/GClunies/reflekt/issues](https://github.com/GClunies/reflekt/issues)\n- Pull Requests: [github.com/GClunies/reflekt/pulls](https://github.com/GClunies/reflekt/pulls)\n\n## License\nThis project is [licensed](LICENSE) under the Apache-2.0 License.\n",
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "A CLI tool to define event schemas, lint them to enforce conventions, publish to a schema registry, and build corresponding data artifacts (e.g., dbt sources/models/docs).",
    "version": "0.6.0",
    "project_urls": {
        "Homepage": "https://github.com/GClunies/Reflekt",
        "Repository": "https://github.com/GClunies/Reflekt"
    },
    "split_keywords": [
        "events",
        "jsonschema",
        "analytics",
        "business-intelligence",
        "data-modeling",
        "dbt",
        "snowflake",
        "redshift",
        "segment",
        "avo"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "067bc24e507321929d90550263c3802fe3e98aede60cee3eba6a66aa7cb52a2f",
                "md5": "dc0e171a41874f836446f3613f9623a4",
                "sha256": "8b885289bc8f806e6b956d3c39127483d544b5820942cf6f034f59667208f783"
            },
            "downloads": -1,
            "filename": "reflekt-0.6.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dc0e171a41874f836446f3613f9623a4",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9,<3.12",
            "size": 70158,
            "upload_time": "2024-03-05T04:24:43",
            "upload_time_iso_8601": "2024-03-05T04:24:43.322329Z",
            "url": "https://files.pythonhosted.org/packages/06/7b/c24e507321929d90550263c3802fe3e98aede60cee3eba6a66aa7cb52a2f/reflekt-0.6.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a32379a73b1b0c48b775ce5f5c0f433807384624d53858bce4c2eedd2381e263",
                "md5": "f721430db56408cc36eaf3f8eda4725b",
                "sha256": "cbaf618e9f5f6f27c86a31c7f357a18b4da5c1185ddad8d7013054ed7d4b9919"
            },
            "downloads": -1,
            "filename": "reflekt-0.6.0.tar.gz",
            "has_sig": false,
            "md5_digest": "f721430db56408cc36eaf3f8eda4725b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9,<3.12",
            "size": 60903,
            "upload_time": "2024-03-05T04:24:45",
            "upload_time_iso_8601": "2024-03-05T04:24:45.514233Z",
            "url": "https://files.pythonhosted.org/packages/a3/23/79a73b1b0c48b775ce5f5c0f433807384624d53858bce4c2eedd2381e263/reflekt-0.6.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-05 04:24:45",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "GClunies",
    "github_project": "Reflekt",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "lcname": "reflekt"
}
        
Elapsed time: 0.39599s