kedro-datasentinel

- **Version**: 0.0.1b2
- **Summary**: A Kedro plugin to integrate Data Sentinel in Kedro projects.
- **Upload time**: 2025-09-01 20:02:46
- **Author**: Sumz SAS
- **Requires Python**: >=3.10
- **License**: Apache Software License (Apache 2.0)
- **Keywords**: data quality, data engineering, kedro, kedro-plugin, data-sentinel
- **Homepage**: https://github.com/SumzCol/kedro-datasentinel
- **Bug tracker**: https://github.com/SumzCol/kedro-datasentinel/issues

# Kedro-DataSentinel

[![Python version](https://img.shields.io/badge/python-3.10%20%7C%203.11%20%7C%203.12%20%7C%203.13-blue.svg)](https://pypi.org/project/kedro-datasentinel/)
[![PyPI version](https://badge.fury.io/py/kedro-datasentinel.svg)](https://pypi.org/project/kedro-datasentinel/)
[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/SumzCol/kedro-datasentinel/blob/main/LICENSE)
[![Powered by Kedro](https://img.shields.io/badge/powered_by-kedro-ffc900?logo=kedro)](https://kedro.org)

`kedro-datasentinel` is a [Kedro plugin](https://kedro.readthedocs.io/en/stable/extend_kedro/plugins.html) that integrates Data Sentinel capabilities into [Kedro](https://kedro.readthedocs.io/en/stable/index.html) projects. It follows Kedro principles to make data quality checks and validation as production-ready as possible. Its core features are:

- **Data Validation**: `kedro-datasentinel` helps enforce data quality in machine learning and data engineering pipelines. With minimal configuration, you can validate your datasets either online (during pipeline execution) or offline (after execution).

- **Audit Logging**: Track and monitor your pipeline executions with detailed audit logs. This feature provides visibility into your data processing workflows, making it easier to debug issues and ensure compliance.

- **Notification System**: Get alerted when data quality issues arise. Configure notifications to be sent through various channels when validation checks fail.

## How do I install kedro-datasentinel?

You can install `kedro-datasentinel` with pip:

```bash
pip install kedro-datasentinel
```

To install the latest development version directly from GitHub:

```bash
pip install --upgrade git+https://github.com/SumzCol/kedro-datasentinel.git
```

We recommend creating a virtual environment with an environment manager such as `conda`, and reading the [Kedro installation guide](https://docs.kedro.org/en/stable/get_started/minimal_kedro_project.html#step-1-install-kedro).
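After installing, it can help to confirm that the active environment meets the package's Python requirement (`>=3.10`) and that the distribution is visible to the interpreter. The sketch below uses only the standard library; the function name `environment_report` is an illustrative choice, not part of the plugin.

```python
import sys
from importlib import metadata

# Minimum interpreter version, taken from the package's requires_python (>=3.10).
MIN_PYTHON = (3, 10)


def environment_report(package: str = "kedro-datasentinel") -> dict:
    """Report whether Python is recent enough and which package version is installed."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed = None  # the distribution is not installed in this environment
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "package": package,
        "installed_version": installed,
    }


if __name__ == "__main__":
    print(environment_report())
```

Running the script inside the virtual environment prints a small dictionary; an `installed_version` of `None` means the install did not land in that environment.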

## Getting started

To use `kedro-datasentinel` in your Kedro project:

1. Install the package as described above
2. Create a `datasentinel.yml` configuration file in your project's `conf` directory
3. Configure your datasets with validation rules in your catalog
4. Run your Kedro pipeline as usual
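
Step 2 above expects a `datasentinel.yml` file under `conf`. Kedro projects usually split `conf` into environment folders such as `base` and `local`, so a quick way to confirm the file is where you think it is might look like the sketch below. The helper name `find_datasentinel_config` is hypothetical, and the search is recursive because the exact subfolder the plugin expects is not stated here.

```python
from pathlib import Path


def find_datasentinel_config(project_root: Path) -> list:
    """Return every datasentinel.yml found under <project_root>/conf, sorted by path."""
    conf_dir = Path(project_root) / "conf"
    if not conf_dir.is_dir():
        return []  # not a Kedro project root, or conf has not been created yet
    return sorted(conf_dir.rglob("datasentinel.yml"))


if __name__ == "__main__":
    for path in find_datasentinel_config(Path.cwd()):
        print(path)
```

An empty result from the project root suggests the configuration file has not been created yet.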

## Features

### Data Validation

`kedro-datasentinel` provides a flexible framework for validating your data:

- **Online Validation**: Validate data during pipeline execution
- **Offline Validation**: Validate data after pipeline execution with the command `datasentinel validate -d <dataset_name>`
- **Custom Checks**: Create your own validation checks
- **Integration with Data Sentinel**: Leverage all the capabilities of Data Sentinel
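
To make the idea of a validation check concrete, here is a minimal, library-independent sketch of what a "not null" check computes. It does not use the Data Sentinel or `kedro-datasentinel` API: `CheckResult`, `check_not_null`, and `offline_validate_command` are hypothetical names introduced for illustration. The last helper only assembles the offline command quoted above; how that command is invoked inside a project is not specified here.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Mapping


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one validation check (hypothetical structure)."""

    check_name: str
    passed: bool
    failed_rows: int


def check_not_null(rows: Iterable[Mapping[str, Any]], column: str) -> CheckResult:
    """Fail if any row has a missing or None value in `column`."""
    failed = sum(1 for row in rows if row.get(column) is None)
    return CheckResult(
        check_name=f"not_null:{column}",
        passed=failed == 0,
        failed_rows=failed,
    )


def offline_validate_command(dataset_name: str) -> list:
    """Build the argument list for the offline validation command quoted above."""
    return ["datasentinel", "validate", "-d", dataset_name]
```

An online check would run this kind of function while the pipeline loads or saves a dataset; an offline check would run it later against the stored dataset.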

### Audit Logging

Track the execution of your Kedro pipelines with detailed audit logs:

- **Node Execution**: Log when nodes start, complete, or fail
- **Input/Output Tracking**: Record which datasets were used as inputs and outputs
- **Error Logging**: Capture exceptions and error messages
- **Multiple Storage Options**: Store audit logs in databases, files, or custom stores
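
As a rough picture of what an audit entry could contain, the sketch below models one record per node event and appends it to a JSON Lines file. This is an assumption-laden illustration, not the plugin's storage format: the `AuditRecord` fields and the `append_audit_record` helper are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass
class AuditRecord:
    """One audit entry for a node event (hypothetical field names)."""

    node_name: str
    status: str  # "started", "completed", or "failed"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    error: Optional[str] = None
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def append_audit_record(path: Path, record: AuditRecord) -> None:
    """Append the record as one JSON object per line (file-based store)."""
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(asdict(record)) + "\n")
```

A database or custom store would replace `append_audit_record` while keeping the same record shape.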

### Notification System

Get alerted when data quality issues arise:

- **Email Notifications**: Send emails when validation checks fail
- **Custom Notifiers**: Create your own notification channels
- **Event-Based Triggers**: Configure which events trigger notifications
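
The combination of event-based triggers and email notifications can be sketched with the standard library alone. The example below only builds the message and does not send it. The event names in `NOTIFY_ON`, the function `build_notification`, and the example addresses are assumptions for illustration; the plugin's real notifier configuration may look different.

```python
from email.message import EmailMessage
from typing import Optional

# Hypothetical event names; the plugin's real event identifiers may differ.
NOTIFY_ON = frozenset({"validation_failed"})


def build_notification(
    event: str, dataset: str, sender: str, recipient: str
) -> Optional[EmailMessage]:
    """Return an email for events that should trigger one, otherwise None."""
    if event not in NOTIFY_ON:
        return None  # this event is not configured to notify anyone
    message = EmailMessage()
    message["Subject"] = f"[data quality] {event}: {dataset}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(f"Event '{event}' was raised for dataset '{dataset}'.")
    return message
```

A custom notifier for another channel would follow the same pattern: filter on the event first, then format and deliver the message.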

## Release and roadmap

The [release history](https://github.com/SumzCol/kedro-datasentinel/blob/master/CHANGELOG.md) records the changes made to the package in each release.

## Disclaimer

This package is still in active development. We use [SemVer](https://semver.org/) principles to version our releases.

## Can I contribute?

We'd be happy to receive help maintaining and improving the package. Any PR will be considered, from a typo fix in the docs to a new core feature. Please check the [contributing guidelines](https://github.com/SumzCol/kedro-datasentinel/blob/master/CONTRIBUTING.md).

## Main contributors

The following people actively maintain and enhance this package and discuss its design:

- [Sumz SAS Team](https://github.com/SumzCol)

            
