# DagSonar
Deep visibility into your Airflow task changes through AST parsing and tracking
## What is DagSonar?
DagSonar is a monitoring tool that tracks changes to your Airflow DAG tasks by parsing each DAG file into an AST (Abstract Syntax Tree). It detects modifications to task definitions, external variables, shell scripts, and function calls, so that changes to your DAGs do not go unnoticed.
## Key Features
- **AST-Based Detection**: Tracks changes by parsing the Abstract Syntax Tree of your DAG files
- **Task Reference Tracking**: Monitors task definitions, external variables, and function calls
- **Shell Script Integration**: Tracks associated shell scripts referenced in BashOperator tasks
- **Change History**: Maintains a JSON-based history of all task modifications
- **Task Hash Generation**: Computes a hash of each task's state, so a change shows up as a different hash
- **Support for Multiple DAGs**: Tracks tasks across multiple DAG configurations
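To make the AST-based detection and hashing ideas above concrete, here is a minimal sketch using only the standard library. It is not DagSonar's internal implementation: the `task_hash` helper, the choice of SHA-256, and the sample sources are assumptions made for illustration.

```python
import ast
import hashlib


def task_hash(source: str, func_name: str) -> str:
    """Return a SHA-256 hex digest of the AST of the named function.

    Hypothetical helper for illustration; not part of the dagsonar API.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            # ast.dump() leaves out comments, whitespace and (by default)
            # line/column attributes, so only structural changes matter.
            dumped = ast.dump(node)
            return hashlib.sha256(dumped.encode("utf-8")).hexdigest()
    raise ValueError(f"function {func_name!r} not found")


original = "def extract():\n    return fetch(limit=10)\n"
reformatted = "def extract():\n    # pull rows\n    return fetch( limit = 10 )\n"
modified = "def extract():\n    return fetch(limit=20)\n"

# Formatting and comments do not change the hash; a changed argument does.
same_after_reformat = task_hash(original, "extract") == task_hash(reformatted, "extract")
same_after_edit = task_hash(original, "extract") == task_hash(modified, "extract")
print(same_after_reformat, same_after_edit)
```

Hashing the AST dump rather than the raw text is what lets a tracker of this kind ignore cosmetic edits while still catching changes in logic.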
## Installation
```bash
pip install dagsonar
```
## Basic Usage
```python
from pathlib import Path
from dagsonar import TaskTracker, DagConfig
# Initialize the tracker
tracker = TaskTracker(history_file=Path("task_history.json"))
# Configure DAGs to track
dag_configs = {
    "example_dag": DagConfig(
        path=Path("/path/to/dag.py"),
        tasks=["task1", "task2"]  # Optional: specify tasks to track
    )
}
# Track tasks and get references
references = tracker.track_tasks(dag_configs)
# Check for changes
changes, references = tracker.check_for_changes(references)
# Save the new state
tracker.save_history(references)
```
## Features in Detail
### Task Reference Tracking
DagSonar tracks several aspects of your tasks:
- Task content and structure through AST
- External variable references
- Called functions
- Shell scripts referenced in bash tasks
- Task-specific hashes for change detection
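As a rough illustration of how external variable references and called functions can be read off a task function's AST, consider the sketch below. The `collect_references` helper is hypothetical and deliberately simplified (it only handles positional parameters and plain-name calls); DagSonar's own analysis may differ.

```python
import ast
import builtins


def collect_references(func_source: str):
    """Return (external_variables, called_functions) for one function.

    Hypothetical helper for illustration; not part of the dagsonar API.
    """
    func = ast.parse(func_source).body[0]
    if not isinstance(func, ast.FunctionDef):
        raise TypeError("expected a single function definition")

    # Names bound inside the function: parameters and assignment targets.
    local_names = {arg.arg for arg in func.args.args}
    loaded = set()
    called = set()
    for node in ast.walk(func):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                local_names.add(node.id)
            else:
                loaded.add(node.id)
        # Only plain-name calls such as f(x); attribute calls are skipped.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            called.add(node.func.id)

    # External variables: names that are read but are neither local,
    # called as functions, nor Python builtins.
    external = loaded - local_names - called - set(dir(builtins))
    return sorted(external), sorted(called)


source = '''
def load(table):
    rows = read_rows(table, BATCH_SIZE)
    return len(rows)
'''

external_variables, called_functions = collect_references(source)
print(external_variables)  # ['BATCH_SIZE']
print(called_functions)    # ['len', 'read_rows']
```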
### Supported Task Types
DagSonar currently supports tracking of:
- Function-based task definitions
- BashOperator task instances
- Referenced shell scripts
- External variable dependencies
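The sketch below shows one way BashOperator instances and the shell scripts they reference could be located without importing Airflow, by inspecting the DAG file's AST. The `bash_task_scripts` helper, the sample DAG source, and the `.sh` regular expression are all assumptions for illustration, not DagSonar's actual logic.

```python
import ast
import re

# Illustrative DAG fragment; it is parsed, never executed.
DAG_SOURCE = '''
run_etl = BashOperator(
    task_id="run_etl",
    bash_command="bash /opt/scripts/etl.sh --date {{ ds }}",
)
cleanup = BashOperator(task_id="cleanup", bash_command="rm -rf /tmp/etl")
'''

# Assumed heuristic: a shell script is any path-like token ending in ".sh".
SCRIPT_PATTERN = re.compile(r"[\w./-]+\.sh\b")


def bash_task_scripts(source: str) -> dict:
    """Map each BashOperator task_id to the .sh paths in its bash_command."""
    result = {}
    for node in ast.walk(ast.parse(source)):
        is_bash_operator = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "BashOperator"
        )
        if not is_bash_operator:
            continue
        # Keep only keyword arguments whose values are string literals.
        kwargs = {
            kw.arg: kw.value.value
            for kw in node.keywords
            if isinstance(kw.value, ast.Constant) and isinstance(kw.value.value, str)
        }
        if "task_id" in kwargs:
            command = kwargs.get("bash_command", "")
            result[kwargs["task_id"]] = SCRIPT_PATTERN.findall(command)
    return result


scripts = bash_task_scripts(DAG_SOURCE)
print(scripts)
```

Once the script paths are known, the script files themselves can be hashed alongside the task definition, so that editing a script counts as a change to the task that runs it.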
## Configuration
### DagConfig
```python
from dagsonar import DagConfig
from pathlib import Path
config = DagConfig(
    path=Path("/path/to/dag.py"),  # Path to DAG file
    tasks=["task1", "task2"]  # Optional: List of specific tasks to track
)
```
### Task History
Task history is stored in JSON format with the following structure:
```json
[
  {
    "dag_id": "example_dag",
    "reference": {
      "dag_id": "example_dag",
      "task_history": [
        {
          "task_id": "task1",
          "content": "<ast_content>",
          "hash": "<computed_hash>",
          "external_variables": [],
          "called_functions": [],
          "shell_scripts": []
        }
      ]
    }
  }
]
```
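Because the history is plain JSON, it can also be inspected outside DagSonar. The following sketch compares two snapshots in the structure shown above and lists the tasks whose hash differs. The helper names and sample hashes are made up for illustration, and the sketch assumes each snapshot follows the documented shape exactly.

```python
import json


def make_snapshot(dag_id: str, hashes: dict) -> list:
    """Build a history list in the documented shape (illustrative data only)."""
    tasks = [
        {
            "task_id": task_id,
            "content": "<ast_content>",
            "hash": task_hash,
            "external_variables": [],
            "called_functions": [],
            "shell_scripts": [],
        }
        for task_id, task_hash in hashes.items()
    ]
    return [{"dag_id": dag_id, "reference": {"dag_id": dag_id, "task_history": tasks}}]


def hashes_by_task(history: list) -> dict:
    """Flatten a history list into {(dag_id, task_id): hash}."""
    return {
        (entry["dag_id"], task["task_id"]): task["hash"]
        for entry in history
        for task in entry["reference"]["task_history"]
    }


def changed_tasks(old_history: list, new_history: list) -> list:
    """Return (dag_id, task_id) pairs that are new or whose hash changed."""
    old = hashes_by_task(old_history)
    new = hashes_by_task(new_history)
    return sorted(key for key, value in new.items() if old.get(key) != value)


# Round-trip through JSON text to mimic reading snapshots from disk.
old_history = json.loads(json.dumps(make_snapshot("example_dag", {"task1": "aaa", "task2": "bbb"})))
new_history = json.loads(json.dumps(make_snapshot("example_dag", {"task1": "ccc", "task2": "bbb"})))

changed = changed_tasks(old_history, new_history)
print(changed)  # [('example_dag', 'task1')]
```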
## Contributing
We welcome contributions! Please check out our [Contributing Guide](CONTRIBUTING.md) to get started.
### Development Setup
1. Clone the repository:
```bash
git clone https://github.com/pesnik/dagsonar.git
cd dagsonar
```
2. Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate # Linux/Mac
# or
.\venv\Scripts\activate # Windows
```
3. Install development dependencies:
```bash
pip install -e ".[dev]"
```
### Running Tests
```bash
pytest tests/
```
## License
This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.
## Acknowledgments
- Apache Airflow community
- All contributors and users providing valuable feedback
---
Built for the Airflow community