zephflow

- Name: zephflow
- Version: 0.2.3
- Summary: Python SDK for ZephFlow data processing pipelines
- Upload time: 2025-07-08 03:35:35
- Author: Fleak Tech Inc.
- Requires Python: <4.0.0,>=3.8.1
- License: Apache-2.0
- Keywords: data-processing, streaming, etl, pipeline, workflow
- Requirements: none recorded

# ZephFlow Python SDK

[![PyPI version](https://img.shields.io/pypi/v/zephflow.svg)](https://pypi.org/project/zephflow/)
[![Python Versions](https://img.shields.io/pypi/pyversions/zephflow.svg)](https://pypi.org/project/zephflow/)
[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

Python SDK for building and running ZephFlow data processing pipelines. ZephFlow provides a powerful, intuitive API for stream processing, data transformation, and event-driven architectures.

## Features

- **Simple, fluent API** for building data processing pipelines
- **Powerful filtering** using JSONPath expressions
- **Data transformation** with the eval expression language
- **Flow composition** - merge and combine multiple flows
- **Error handling** with assertions and error tracking
- **Multiple sink options** for outputting processed data
- **Java-based engine** for high-performance processing

## Documentation

For comprehensive documentation, tutorials, and API reference, visit: [https://docs.fleak.ai/zephflow](https://docs.fleak.ai/zephflow)

## Prerequisites

- Python 3.8.1 or higher (below 4.0)
- Java 17 or higher (required for the processing engine)
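
The prerequisites above can be checked before installing. The following is a minimal sketch, not part of the SDK: `parse_java_major` and `check_prerequisites` are hypothetical helper names, and the sketch assumes that a `java` executable on `PATH` is the one the processing engine would use.

```python
import re
import shutil
import subprocess
import sys
from typing import List, Optional


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example "1.8.0_292").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_prerequisites() -> List[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems = []
    if sys.version_info < (3, 8, 1):
        problems.append("Python 3.8.1 or higher is required")
    java = shutil.which("java")
    if java is None:
        problems.append("no `java` executable found on PATH")
    else:
        # Most JDKs print the version banner to stderr, so inspect both streams.
        proc = subprocess.run([java, "-version"], capture_output=True, text=True)
        major = parse_java_major(proc.stderr + proc.stdout)
        if major is None or major < 17:
            problems.append(f"Java 17 or higher is required (found: {major})")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Prerequisite problem:", problem)
```

An empty list from `check_prerequisites()` suggests the environment meets both requirements.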

## Installation

Install ZephFlow using pip:

```bash
pip install zephflow
```

## Quick Start

Here's a simple example to get you started with ZephFlow:

```python
import zephflow

# Create a flow that filters and transforms events
flow = (
    zephflow.ZephFlow.start_flow()
    .filter("$.value > 10")  # Keep only events with value > 10
    .eval("""
        dict(
            id=$.id,
            doubled_value=$.value * 2,
            category=case(
                $.value < 20 => 'medium',
                _ => 'high'
            )
        )
    """)
    .stdout_sink("JSON_OBJECT")  # Output to console
)

# Process some events
events = [
    {"id": 1, "value": 5},   # Will be filtered out
    {"id": 2, "value": 15},  # Will be processed
    {"id": 3, "value": 25}   # Will be processed
]

result = flow.process(events)
print(f"Processed {result.getOutputEvents().size()} events")
```
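
To sanity-check what the flow above should emit, the same filter and transformation can be written in plain Python. This is only a reference sketch based on reading the flow definition; it does not call ZephFlow, and `categorize` and `reference_pipeline` are hypothetical names introduced for illustration.

```python
def categorize(value):
    # Mirrors the case(...) expression: values below 20 are 'medium', the rest 'high'.
    return "medium" if value < 20 else "high"


def reference_pipeline(events):
    # Step 1: same condition as .filter("$.value > 10")
    kept = [event for event in events if event["value"] > 10]
    # Step 2: same fields as the dict(...) built in .eval(...)
    return [
        {
            "id": event["id"],
            "doubled_value": event["value"] * 2,
            "category": categorize(event["value"]),
        }
        for event in kept
    ]


events = [
    {"id": 1, "value": 5},
    {"id": 2, "value": 15},
    {"id": 3, "value": 25},
]

expected = reference_pipeline(events)
print(expected)
# [{'id': 2, 'doubled_value': 30, 'category': 'medium'},
#  {'id': 3, 'doubled_value': 50, 'category': 'high'}]
```

Comparing this list with the events printed by the flow's sink is a quick way to confirm the expressions behave as intended.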

If you already have a workflow file:

```python
import zephflow

zephflow.ZephFlow.execute_dag("my_dag.yaml")
```

## Troubleshooting

### macOS SSL Certificate Issue

If you're on macOS and encounter an error like:

```
<urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)>
```

this indicates that Python cannot verify SSL certificates due to missing system root certificates.

#### Solution

Run the certificate installation script that comes with your Python installation:

```bash
/Applications/Python\ 3.x/Install\ Certificates.command
```

Replace `3.x` with your installed version (e.g., `3.10`). This installs the necessary certificates so Python can verify HTTPS downloads.
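
To see whether the interpreter has a usable certificate bundle, before or after running the script, the standard library's `ssl` module can report where it looks for CA certificates. This is a small diagnostic sketch using only the standard library; it does not involve ZephFlow, and the variable name `has_default_ca` is introduced here for illustration.

```python
import ssl

# Where this Python build looks for CA certificates by default.
paths = ssl.get_default_verify_paths()
print("Expected CA file location:", paths.openssl_cafile)
print("Resolved CA file:", paths.cafile)
print("Resolved CA directory:", paths.capath)

# `cafile` and `capath` are None when the file or directory does not exist.
has_default_ca = paths.cafile is not None or paths.capath is not None
if not has_default_ca:
    print("No default CA bundle found; HTTPS certificate verification is likely to fail.")
```

If both resolved values are `None`, the `CERTIFICATE_VERIFY_FAILED` error above is the likely result for any HTTPS download.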

## Core Concepts

### Filtering

Use JSONPath expressions to filter events:

```python
flow = (
    zephflow.ZephFlow.start_flow()
    .filter("$.priority == 'high' && $.value >= 100")
)
```

### Transformation

Transform data using the eval expression language:

```python
flow = (
    zephflow.ZephFlow.start_flow()
    .eval("""
        dict(
            timestamp=now(),
            original_id=$.id,
            processed_value=$.value * 1.1,
            status='processed'
        )
    """)
)
```

### Merging Flows

Combine multiple flows for complex processing logic:

```python
high_priority = zephflow.ZephFlow.start_flow().filter("$.priority == 'high'")
large_value = zephflow.ZephFlow.start_flow().filter("$.value >= 1000")

merged = zephflow.ZephFlow.merge(high_priority, large_value)
```

### Error Handling

Add assertions to validate data and handle errors:

```python
flow = (
    zephflow.ZephFlow.start_flow()
    .assertion("$.required_field != null")
    .assertion("$.value >= 0")
    .eval("dict(id=$.id, validated_value=$.value)")
)

# `events` is a list of dicts, as in the Quick Start example
result = flow.process(events, include_error_by_step=True)
if result.getErrorByStep().size() > 0:
    print("Some events failed validation")
```

## Examples

For a more detailed example, see the [Quick Start Example](https://github.com/fleaktech/zephflow-python-sdk/blob/main/examples/quickstart.py), which covers basic filtering and transformation.

## Environment Variables

- `ZEPHFLOW_MAIN_JAR` - Path to a custom ZephFlow JAR file (optional)
- `ZEPHFLOW_JAR_DIR` - Directory for storing downloaded JAR files (optional)
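
These variables can also be set from Python. The sketch below is an assumption-laden example: the directory and JAR file names are hypothetical, and it assumes the SDK reads the variables when it is imported or first used, so they are set before the import.

```python
import os
import tempfile
from pathlib import Path

# Hypothetical cache directory for downloaded JAR files.
jar_dir = Path(tempfile.gettempdir()) / "zephflow-jars"
jar_dir.mkdir(parents=True, exist_ok=True)
os.environ["ZEPHFLOW_JAR_DIR"] = str(jar_dir)

# Optional: point at a custom JAR only if it actually exists (hypothetical file name).
custom_jar = jar_dir / "zephflow-custom.jar"
if custom_jar.is_file():
    os.environ["ZEPHFLOW_MAIN_JAR"] = str(custom_jar)

# Import the SDK only after the environment is prepared (assumed ordering).
# import zephflow
```

Setting the variables in the shell before starting Python has the same effect and avoids any dependence on import order.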


## Support

- **Documentation**: [https://docs.fleak.ai/zephflow](https://docs.fleak.ai/zephflow)
- **Discussions**: [Slack](https://join.slack.com/t/fleak-hq/shared_invite/zt-361k9cnhf-9~mmjpOH1IbZfRxeXplfKA)

## License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.

## About Fleak

ZephFlow is developed and maintained by [Fleak Tech Inc.](https://fleak.ai), building the future of data processing and streaming analytics.

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "zephflow",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0.0,>=3.8.1",
    "maintainer_email": null,
    "keywords": "data-processing, streaming, etl, pipeline, workflow",
    "author": "Fleak Tech Inc.",
    "author_email": "contact@fleak.ai",
    "download_url": "https://files.pythonhosted.org/packages/82/1d/832daaa3819b51878f4aae2d39f60f716170d5229492cccc795202854c70/zephflow-0.2.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache-2.0",
    "summary": "Python SDK for ZephFlow data processing pipelines",
    "version": "0.2.3",
    "project_urls": {
        "Documentation": "https://docs.fleak.ai/zephflow",
        "Homepage": "https://github.com/fleaktech/zephflow-python-sdk",
        "Repository": "https://github.com/fleaktech/zephflow-python-sdk"
    },
    "split_keywords": [
        "data-processing",
        " streaming",
        " etl",
        " pipeline",
        " workflow"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "dd5cf38f4bc2c33ec9403145f08d17336ad0e67157951b60499474cc34f51eda",
                "md5": "1b8643ff0d7f6df07073b90d9018bf1e",
                "sha256": "5cd6892d78877421d1aed0c557a5eb16988bf6e581b8841d70864b622ccb7054"
            },
            "downloads": -1,
            "filename": "zephflow-0.2.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "1b8643ff0d7f6df07073b90d9018bf1e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0.0,>=3.8.1",
            "size": 17372,
            "upload_time": "2025-07-08T03:35:33",
            "upload_time_iso_8601": "2025-07-08T03:35:33.918971Z",
            "url": "https://files.pythonhosted.org/packages/dd/5c/f38f4bc2c33ec9403145f08d17336ad0e67157951b60499474cc34f51eda/zephflow-0.2.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "821d832daaa3819b51878f4aae2d39f60f716170d5229492cccc795202854c70",
                "md5": "fc90da231328c0567a82a781aab323e4",
                "sha256": "7dd58d67af801527f2d8a8af79457cae4151da8d6c025c98f4620b0490a31187"
            },
            "downloads": -1,
            "filename": "zephflow-0.2.3.tar.gz",
            "has_sig": false,
            "md5_digest": "fc90da231328c0567a82a781aab323e4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0.0,>=3.8.1",
            "size": 16813,
            "upload_time": "2025-07-08T03:35:35",
            "upload_time_iso_8601": "2025-07-08T03:35:35.130481Z",
            "url": "https://files.pythonhosted.org/packages/82/1d/832daaa3819b51878f4aae2d39f60f716170d5229492cccc795202854c70/zephflow-0.2.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-08 03:35:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "fleaktech",
    "github_project": "zephflow-python-sdk",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "zephflow"
}
        