# Kedro-DataSentinel
`kedro-datasentinel` is a [Kedro plugin](https://kedro.readthedocs.io/en/stable/extend_kedro/plugins.html) that integrates Data Sentinel capabilities into [Kedro](https://kedro.readthedocs.io/en/stable/index.html) projects. It enforces Kedro principles to make data quality and validation as production-ready as possible. Its core functionalities are:
- **Data Validation**: `kedro-datasentinel` enhances data quality for machine learning and data engineering pipelines. With minimal configuration, you can validate your datasets either online (during pipeline execution) or offline (after execution).
- **Audit Logging**: Track and monitor your pipeline executions with detailed audit logs. This feature provides visibility into your data processing workflows, making it easier to debug issues and ensure compliance.
- **Notification System**: Get alerted when data quality issues arise. Configure notifications to be sent through various channels when validation checks fail.
## How do I install kedro-datasentinel?
You can install `kedro-datasentinel` with pip:
```bash
pip install kedro-datasentinel
```
To install the latest development version directly from GitHub:
```bash
pip install --upgrade git+https://github.com/SumzCol/kedro-datasentinel.git
```
We recommend creating a virtual environment with a package manager (such as `conda`) and reading the [Kedro installation guide](https://docs.kedro.org/en/stable/get_started/minimal_kedro_project.html#step-1-install-kedro) before installing.
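After installing, a quick sanity check can confirm that the interpreter and the package are in place. The sketch below uses only the Python standard library. The helper names `installed_version` and `python_is_supported` are illustrative and not part of the plugin, and the `(3, 10)` minimum is taken from the `requires_python` value in the package's published metadata.

```python
from __future__ import annotations

import sys
from importlib import metadata


def installed_version(package: str) -> str | None:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(minimum: tuple[int, int] = (3, 10)) -> bool:
    """Compare the running interpreter against a minimum (major, minor) version."""
    return sys.version_info[:2] >= minimum


if __name__ == "__main__":
    print("Python supported:", python_is_supported())
    print("kedro-datasentinel:", installed_version("kedro-datasentinel") or "not installed")
```

Run it inside the virtual environment you created for the project so that it inspects the right interpreter.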
## Getting started
To use `kedro-datasentinel` in your Kedro project:
1. Install the package as described above
2. Create a `datasentinel.yml` configuration file in your project's `conf` directory
3. Configure your datasets with validation rules in your catalog
4. Run your Kedro pipeline as usual
## Features
### Data Validation
`kedro-datasentinel` provides a flexible framework for validating your data:
- **Online Validation**: Validate data during pipeline execution
- **Offline Validation**: Validate data after pipeline execution with the command `datasentinel validate -d <dataset_name>`
- **Custom Checks**: Create your own validation checks
- **Integration with Data Sentinel**: Leverage all the capabilities of Data Sentinel
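To make the idea of a validation check concrete, here is a minimal, library-independent sketch. It does not use the plugin's or Data Sentinel's actual API: `CheckResult`, `make_check`, and the sample rows are all hypothetical names chosen for illustration. It only shows the general shape of a check, a rule applied to every row that reports whether the dataset passed.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class CheckResult:
    """Outcome of one check over a dataset (hypothetical structure)."""
    name: str
    passed: bool
    failed_rows: int


def make_check(name: str, predicate: Callable[[Row], bool]) -> Callable[[list[Row]], CheckResult]:
    """Build a check that counts the rows for which `predicate` is false."""
    def run(rows: list[Row]) -> CheckResult:
        failed = sum(1 for row in rows if not predicate(row))
        return CheckResult(name=name, passed=failed == 0, failed_rows=failed)
    return run


# Two illustrative rules: ids must be present, amounts must be positive.
not_null_id = make_check("id_not_null", lambda row: row.get("id") is not None)
positive_amount = make_check("amount_positive", lambda row: row.get("amount", 0) > 0)

rows = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.0},
]

results = [check(rows) for check in (not_null_id, positive_amount)]
for result in results:
    print(result)
```

In the plugin itself, checks are attached to datasets through configuration rather than called by hand; refer to the project documentation for the real syntax.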
### Audit Logging
Track the execution of your Kedro pipelines with detailed audit logs:
- **Node Execution**: Log when nodes start, complete, or fail
- **Input/Output Tracking**: Record which datasets were used as inputs and outputs
- **Error Logging**: Capture exceptions and error messages
- **Multiple Storage Options**: Store audit logs in databases, files, or custom stores
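The sketch below illustrates what an audit record for a node run might capture: the node name, its input and output dataset names, the final status, and any error. It is a conceptual example only. `AuditRecord`, `InMemoryAuditStore`, and `run_audited` are hypothetical names, not the plugin's classes, and the in-memory list stands in for a database or file store.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable


@dataclass
class AuditRecord:
    """One entry per node run (hypothetical structure)."""
    node: str
    inputs: list[str]
    outputs: list[str]
    status: str
    started_at: datetime
    finished_at: datetime
    error: str | None = None


@dataclass
class InMemoryAuditStore:
    """Stand-in for a database or file-backed store."""
    records: list[AuditRecord] = field(default_factory=list)

    def append(self, record: AuditRecord) -> None:
        self.records.append(record)


def run_audited(
    store: InMemoryAuditStore,
    node: str,
    func: Callable[[], Any],
    inputs: list[str],
    outputs: list[str],
) -> Any:
    """Run `func`, record success or failure, and re-raise any exception."""
    started = datetime.now(timezone.utc)
    try:
        result = func()
    except Exception as exc:
        finished = datetime.now(timezone.utc)
        store.append(AuditRecord(node, inputs, outputs, "failed", started, finished, repr(exc)))
        raise
    finished = datetime.now(timezone.utc)
    store.append(AuditRecord(node, inputs, outputs, "completed", started, finished))
    return result


store = InMemoryAuditStore()
run_audited(store, "clean_orders", lambda: "ok", inputs=["raw_orders"], outputs=["clean_orders"])
try:
    run_audited(store, "train_model", lambda: 1 / 0, inputs=["clean_orders"], outputs=["model"])
except ZeroDivisionError:
    pass  # the failure is already recorded in the store

for record in store.records:
    print(record.node, record.status, record.error)
```

Re-raising after recording keeps the pipeline's normal failure behaviour while still leaving a trace of what went wrong.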
### Notification System
Get alerted when data quality issues arise:
- **Email Notifications**: Send emails when validation checks fail
- **Custom Notifiers**: Create your own notification channels
- **Event-Based Triggers**: Configure which events trigger notifications
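Event-based triggers can be pictured as a small publish/subscribe router: notifiers subscribe to named events and are called only when one of those events is emitted. The following is a minimal sketch under that assumption. `NotificationRouter`, the event names, and `fake_email_notifier` are hypothetical and do not reflect the plugin's real interfaces; the fake notifier appends to a list instead of sending mail.

```python
from collections import defaultdict
from typing import Callable

# A notifier receives the event name and a human-readable message.
Notifier = Callable[[str, str], None]


class NotificationRouter:
    """Dispatch messages to the notifiers subscribed to each event."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Notifier]] = defaultdict(list)

    def subscribe(self, event: str, notifier: Notifier) -> None:
        self._subscribers[event].append(notifier)

    def emit(self, event: str, message: str) -> int:
        """Call every notifier for `event` and return how many were called."""
        notifiers = self._subscribers.get(event, [])
        for notify in notifiers:
            notify(event, message)
        return len(notifiers)


sent_emails: list[str] = []


def fake_email_notifier(event: str, message: str) -> None:
    """Stand-in for an email channel: collects messages instead of sending them."""
    sent_emails.append(f"[{event}] {message}")


router = NotificationRouter()
router.subscribe("validation_failed", fake_email_notifier)

delivered = router.emit("validation_failed", "orders: 2 rows with null id")
ignored = router.emit("validation_passed", "customers: all checks passed")
print(delivered, ignored, sent_emails)
```

Only the failure event reaches the notifier here, which mirrors the idea of alerting on failed checks while staying quiet on successful ones.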
## Release and roadmap
The [release history](https://github.com/SumzCol/kedro-datasentinel/blob/master/CHANGELOG.md) records the changes made to the package in each release.
## Disclaimer
This package is still in active development. We use [SemVer](https://semver.org/) principles to version our releases.
## Can I contribute?
We'd be happy to receive help maintaining and improving the package. Any PR will be considered, from typo fixes in the docs to new core features. Please check the [contributing guidelines](https://github.com/SumzCol/kedro-datasentinel/blob/master/CONTRIBUTING.md).
## Main contributors
The following people actively maintain and enhance this package and discuss its design:
- [Sumz SAS Team](https://github.com/SumzCol)
## Package metadata
- **Name**: kedro-datasentinel
- **Version**: 0.0.1b2 (uploaded 2025-09-01)
- **Summary**: A Kedro plugin to integrate Data Sentinel in Kedro projects.
- **License**: Apache Software License (Apache 2.0)
- **Requires Python**: >=3.10
- **Author**: Sumz SAS
- **Keywords**: data quality, data engineering, kedro, kedro-plugin, data-sentinel
- **Homepage**: https://github.com/SumzCol/kedro-datasentinel
- **Bug Tracker**: https://github.com/SumzCol/kedro-datasentinel/issues