dagster-kafka


Namedagster-kafka JSON
Version 1.1.2 PyPI version JSON
download
home_pagehttps://github.com/kingsley-123/dagster-kafka-integration
SummaryComplete Kafka integration for Dagster with enterprise DLQ tooling
upload_time2025-07-31 01:07:27
maintainerNone
docs_urlNone
authorKingsley Okonkwo
requires_python>=3.9
licenseNone
keywords dagster kafka apache-kafka data-engineering streaming dlq dead-letter-queue avro protobuf schema-registry enterprise production monitoring alerting
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI No Travis.
coveralls test coverage No coveralls.
            
# Dagster Kafka Integration

The most comprehensively validated Kafka integration for Dagster with enterprise-grade features supporting all three major serialization formats and production security.

## Enterprise Validation Completed

**Version 1.1.2** - Most validated Kafka integration package ever created:

**11-Phase Comprehensive Validation** - Unprecedented testing methodology  
**Exceptional Performance** - 1,199 messages/second peak throughput proven  
**Security Hardened** - Complete credential validation + network security  
**Stress Tested** - 100% success rate (305/305 operations over 8+ minutes)  
**Enterprise Ready** - Complete DLQ tooling suite with 5 CLI tools  
**Zero Critical Issues** - Across all validation phases  

## Complete Enterprise Solution

- **JSON Support**: Native JSON message consumption from Kafka topics
- **Avro Support**: Full Avro message support with Schema Registry integration  
- **Protobuf Support**: Complete Protocol Buffers integration with schema management
- **Dead Letter Queue (DLQ)**: Enterprise-grade error handling with circuit breaker patterns
- **Enterprise Security**: Complete SASL/SSL authentication and encryption support
- **Schema Evolution**: Comprehensive validation with breaking change detection across all formats
- **Production Monitoring**: Real-time alerting with Slack/Email integration
- **High Performance**: Advanced caching, batching, and connection pooling
- **Error Recovery**: Multiple recovery strategies for production resilience

## Installation

```bash
pip install dagster-kafka
```

## Enterprise DLQ Tooling Suite

Complete operational tooling available immediately after installation:

```bash
# Analyze failed messages with comprehensive error pattern analysis
dlq-inspector --topic user-events --max-messages 20

# Replay messages with filtering and safety controls  
dlq-replayer --source-topic orders_dlq --target-topic orders --dry-run

# Monitor DLQ health across multiple topics
dlq-monitor --topics user-events_dlq,orders_dlq --output-format json

# Set up automated alerting
dlq-alerts --topic critical-events_dlq --max-messages 500

# Operations dashboard for DLQ health monitoring
dlq-dashboard --topics user-events_dlq,orders_dlq
```

## Quick Start

```python
from dagster import asset, Definitions
from dagster_kafka import KafkaResource, KafkaIOManager, DLQStrategy

@asset
def api_events():
    '''Consume JSON messages from Kafka topic with DLQ support.'''
    pass

defs = Definitions(
    assets=[api_events],
    resources={
        "kafka": KafkaResource(bootstrap_servers="localhost:9092"),
        "io_manager": KafkaIOManager(
            kafka_resource=KafkaResource(bootstrap_servers="localhost:9092"),
            consumer_group_id="my-dagster-pipeline",
            enable_dlq=True,
            dlq_strategy=DLQStrategy.RETRY_THEN_DLQ,
            dlq_max_retries=3
        )
    }
)
```

## Validation Results Summary

| Phase | Test Type | Result | Key Metrics |
|-------|-----------|--------|-------------|
| **Phase 5** | Performance Testing | **PASS** | 1,199 msgs/sec peak throughput |
| **Phase 7** | Integration Testing | **PASS** | End-to-end message flow validated |
| **Phase 9** | Compatibility Testing | **PASS** | Python 3.12 + Dagster 1.11.3 |
| **Phase 10** | Security Audit | **PASS** | Credential + network security |
| **Phase 11** | Stress Testing | **EXCEPTIONAL** | 100% success rate, 305 operations |

## Enterprise Security

### Security Protocols Supported
- **SASL_SSL**: Combined authentication and encryption (recommended for production)
- **SSL**: Certificate-based encryption
- **SASL_PLAINTEXT**: Username/password authentication  
- **PLAINTEXT**: For local development and testing

### SASL Authentication Mechanisms
- **SCRAM-SHA-256**: Secure challenge-response authentication
- **SCRAM-SHA-512**: Enhanced secure authentication
- **PLAIN**: Simple username/password authentication
- **GSSAPI**: Kerberos authentication for enterprise environments
- **OAUTHBEARER**: OAuth-based authentication

## Why Choose This Integration

### Complete Solution
- Only integration supporting all 3 major formats (JSON, Avro, Protobuf)
- Enterprise-grade security with SASL/SSL support
- Production-ready with comprehensive monitoring
- Advanced error handling with Dead Letter Queue support
- Complete DLQ Tooling Suite for enterprise operations

### Enterprise Ready
- 11-phase comprehensive validation covering all scenarios
- Real-world deployment patterns and examples
- Performance optimization tools and monitoring
- Enterprise security for production Kafka clusters
- Bulletproof error handling with circuit breaker patterns

### Unprecedented Validation
- Most validated package in the Dagster ecosystem
- Performance proven: 1,199 msgs/sec peak throughput
- Stability proven: 100% success rate under stress
- Security proven: Complete credential and network validation
- Enterprise proven: Exceptional rating across all dimensions

## Repository

GitHub: https://github.com/kingsley-123/dagster-kafka-integration

## License

Apache 2.0

---

The most comprehensively validated Kafka integration for Dagster - Version 1.1.2 Enterprise Validation Release with Security Hardening.

Built by Kingsley Okonkwo - Solving real data engineering problems with comprehensive open source solutions.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/kingsley-123/dagster-kafka-integration",
    "name": "dagster-kafka",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "dagster, kafka, apache-kafka, data-engineering, streaming, dlq, dead-letter-queue, avro, protobuf, schema-registry, enterprise, production, monitoring, alerting",
    "author": "Kingsley Okonkwo",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/6e/25/f25361e43ea7c79b2635fc1e93a2eee5795accf81fc26d82895e65e46a0a/dagster_kafka-1.1.2.tar.gz",
    "platform": null,
    "description": "\r\n# Dagster Kafka Integration\r\n\r\nThe most comprehensively validated Kafka integration for Dagster with enterprise-grade features supporting all three major serialization formats and production security.\r\n\r\n## Enterprise Validation Completed\r\n\r\n**Version 1.1.2** - Most validated Kafka integration package ever created:\r\n\r\n**11-Phase Comprehensive Validation** - Unprecedented testing methodology  \r\n**Exceptional Performance** - 1,199 messages/second peak throughput proven  \r\n**Security Hardened** - Complete credential validation + network security  \r\n**Stress Tested** - 100% success rate (305/305 operations over 8+ minutes)  \r\n**Enterprise Ready** - Complete DLQ tooling suite with 5 CLI tools  \r\n**Zero Critical Issues** - Across all validation phases  \r\n\r\n## Complete Enterprise Solution\r\n\r\n- **JSON Support**: Native JSON message consumption from Kafka topics\r\n- **Avro Support**: Full Avro message support with Schema Registry integration  \r\n- **Protobuf Support**: Complete Protocol Buffers integration with schema management\r\n- **Dead Letter Queue (DLQ)**: Enterprise-grade error handling with circuit breaker patterns\r\n- **Enterprise Security**: Complete SASL/SSL authentication and encryption support\r\n- **Schema Evolution**: Comprehensive validation with breaking change detection across all formats\r\n- **Production Monitoring**: Real-time alerting with Slack/Email integration\r\n- **High Performance**: Advanced caching, batching, and connection pooling\r\n- **Error Recovery**: Multiple recovery strategies for production resilience\r\n\r\n## Installation\r\n\r\n```bash\r\npip install dagster-kafka\r\n```\r\n\r\n## Enterprise DLQ Tooling Suite\r\n\r\nComplete operational tooling available immediately after installation:\r\n\r\n```bash\r\n# Analyze failed messages with comprehensive error pattern analysis\r\ndlq-inspector --topic user-events --max-messages 20\r\n\r\n# Replay messages with filtering and safety controls  \r\ndlq-replayer --source-topic orders_dlq --target-topic orders --dry-run\r\n\r\n# Monitor DLQ health across multiple topics\r\ndlq-monitor --topics user-events_dlq,orders_dlq --output-format json\r\n\r\n# Set up automated alerting\r\ndlq-alerts --topic critical-events_dlq --max-messages 500\r\n\r\n# Operations dashboard for DLQ health monitoring\r\ndlq-dashboard --topics user-events_dlq,orders_dlq\r\n```\r\n\r\n## Quick Start\r\n\r\n```python\r\nfrom dagster import asset, Definitions\r\nfrom dagster_kafka import KafkaResource, KafkaIOManager, DLQStrategy\r\n\r\n@asset\r\ndef api_events():\r\n    '''Consume JSON messages from Kafka topic with DLQ support.'''\r\n    pass\r\n\r\ndefs = Definitions(\r\n    assets=[api_events],\r\n    resources={\r\n        \"kafka\": KafkaResource(bootstrap_servers=\"localhost:9092\"),\r\n        \"io_manager\": KafkaIOManager(\r\n            kafka_resource=KafkaResource(bootstrap_servers=\"localhost:9092\"),\r\n            consumer_group_id=\"my-dagster-pipeline\",\r\n            enable_dlq=True,\r\n            dlq_strategy=DLQStrategy.RETRY_THEN_DLQ,\r\n            dlq_max_retries=3\r\n        )\r\n    }\r\n)\r\n```\r\n\r\n## Validation Results Summary\r\n\r\n| Phase | Test Type | Result | Key Metrics |\r\n|-------|-----------|--------|-------------|\r\n| **Phase 5** | Performance Testing | **PASS** | 1,199 msgs/sec peak throughput |\r\n| **Phase 7** | Integration Testing | **PASS** | End-to-end message flow validated |\r\n| **Phase 9** | Compatibility Testing | **PASS** | Python 3.12 + Dagster 1.11.3 |\r\n| **Phase 10** | Security Audit | **PASS** | Credential + network security |\r\n| **Phase 11** | Stress Testing | **EXCEPTIONAL** | 100% success rate, 305 operations |\r\n\r\n## Enterprise Security\r\n\r\n### Security Protocols Supported\r\n- **SASL_SSL**: Combined authentication and encryption (recommended for production)\r\n- **SSL**: Certificate-based encryption\r\n- **SASL_PLAINTEXT**: Username/password authentication  \r\n- **PLAINTEXT**: For local development and testing\r\n\r\n### SASL Authentication Mechanisms\r\n- **SCRAM-SHA-256**: Secure challenge-response authentication\r\n- **SCRAM-SHA-512**: Enhanced secure authentication\r\n- **PLAIN**: Simple username/password authentication\r\n- **GSSAPI**: Kerberos authentication for enterprise environments\r\n- **OAUTHBEARER**: OAuth-based authentication\r\n\r\n## Why Choose This Integration\r\n\r\n### Complete Solution\r\n- Only integration supporting all 3 major formats (JSON, Avro, Protobuf)\r\n- Enterprise-grade security with SASL/SSL support\r\n- Production-ready with comprehensive monitoring\r\n- Advanced error handling with Dead Letter Queue support\r\n- Complete DLQ Tooling Suite for enterprise operations\r\n\r\n### Enterprise Ready\r\n- 11-phase comprehensive validation covering all scenarios\r\n- Real-world deployment patterns and examples\r\n- Performance optimization tools and monitoring\r\n- Enterprise security for production Kafka clusters\r\n- Bulletproof error handling with circuit breaker patterns\r\n\r\n### Unprecedented Validation\r\n- Most validated package in the Dagster ecosystem\r\n- Performance proven: 1,199 msgs/sec peak throughput\r\n- Stability proven: 100% success rate under stress\r\n- Security proven: Complete credential and network validation\r\n- Enterprise proven: Exceptional rating across all dimensions\r\n\r\n## Repository\r\n\r\nGitHub: https://github.com/kingsley-123/dagster-kafka-integration\r\n\r\n## License\r\n\r\nApache 2.0\r\n\r\n---\r\n\r\nThe most comprehensively validated Kafka integration for Dagster - Version 1.1.2 Enterprise Validation Release with Security Hardening.\r\n\r\nBuilt by Kingsley Okonkwo - Solving real data engineering problems with comprehensive open source solutions.\r\n",
    "bugtrack_url": null,
    "license": null,
    "summary": "Complete Kafka integration for Dagster with enterprise DLQ tooling",
    "version": "1.1.2",
    "project_urls": {
        "Documentation": "https://github.com/kingsley-123/dagster-kafka-integration/blob/main/README.md",
        "Homepage": "https://github.com/kingsley-123/dagster-kafka-integration",
        "Issues": "https://github.com/kingsley-123/dagster-kafka-integration/issues",
        "Source": "https://github.com/kingsley-123/dagster-kafka-integration",
        "Validation Report": "https://github.com/kingsley-123/dagster-kafka-integration/tree/main/validation"
    },
    "split_keywords": [
        "dagster",
        " kafka",
        " apache-kafka",
        " data-engineering",
        " streaming",
        " dlq",
        " dead-letter-queue",
        " avro",
        " protobuf",
        " schema-registry",
        " enterprise",
        " production",
        " monitoring",
        " alerting"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "bc5d8be42d74c3bc09837a2b8e5d8b3ba74ba8d8e5ea64e880e615422cc5b42e",
                "md5": "6e7d0afdcac67bb6e0a7eda8bba5abc5",
                "sha256": "4dcb758495c768eab2ea3ca5ac7c0e85de883e0a7c19489f3bb73d5c3dcda59b"
            },
            "downloads": -1,
            "filename": "dagster_kafka-1.1.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6e7d0afdcac67bb6e0a7eda8bba5abc5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 58275,
            "upload_time": "2025-07-31T01:07:25",
            "upload_time_iso_8601": "2025-07-31T01:07:25.688551Z",
            "url": "https://files.pythonhosted.org/packages/bc/5d/8be42d74c3bc09837a2b8e5d8b3ba74ba8d8e5ea64e880e615422cc5b42e/dagster_kafka-1.1.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6e25f25361e43ea7c79b2635fc1e93a2eee5795accf81fc26d82895e65e46a0a",
                "md5": "f2adb3022abb6c3bc25e0625ab2c39a8",
                "sha256": "ee656b60a2867a087f814b87733f2f12b0d2b0c98cee0a944c5731f89d515e02"
            },
            "downloads": -1,
            "filename": "dagster_kafka-1.1.2.tar.gz",
            "has_sig": false,
            "md5_digest": "f2adb3022abb6c3bc25e0625ab2c39a8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 69327,
            "upload_time": "2025-07-31T01:07:27",
            "upload_time_iso_8601": "2025-07-31T01:07:27.058602Z",
            "url": "https://files.pythonhosted.org/packages/6e/25/f25361e43ea7c79b2635fc1e93a2eee5795accf81fc26d82895e65e46a0a/dagster_kafka-1.1.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-31 01:07:27",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "kingsley-123",
    "github_project": "dagster-kafka-integration",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "dagster-kafka"
}
        
Elapsed time: 1.69282s