taskflow-pipeline

Name: taskflow-pipeline
Version: 0.1.1
Summary: A Python library for orchestrating RPA, data processing, and AI task pipelines from YAML/JSON configurations
Upload time: 2025-11-03 13:38:57
Requires Python: >=3.8
License: MIT
Keywords: automation, rpa, pipeline, workflow, orchestration, task-automation, yaml-config, data-processing, ai-tasks, robotic-process-automation
# πŸš€ TaskFlow Pipeline

<div align="center">

[![PyPI version](https://badge.fury.io/py/taskflow-pipeline.svg)](https://badge.fury.io/py/taskflow-pipeline)
[![Python Versions](https://img.shields.io/pypi/pyversions/taskflow-pipeline.svg)](https://pypi.org/project/taskflow-pipeline/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Downloads](https://pepy.tech/badge/taskflow-pipeline)](https://pepy.tech/project/taskflow-pipeline)
[![Tests](https://github.com/berkterekli/taskflow-pipeline/workflows/Tests/badge.svg)](https://github.com/berkterekli/taskflow-pipeline/actions)

**A powerful Python library for orchestrating complex automation workflows with simple YAML/JSON configuration files**

[Installation](#-installation) β€’ [Quick Start](#-quick-start) β€’ [Features](#-features) β€’ [Documentation](#-documentation) β€’ [Examples](#-examples)

</div>

---

## 🎯 What is TaskFlow?

TaskFlow is a declarative automation framework that lets you build complex workflows by writing simple configuration files. Perfect for:

- **πŸ€– RPA (Robotic Process Automation)**: Automate repetitive tasks, UI interactions, and document processing
- **πŸ“Š Data Processing**: ETL pipelines, data cleaning, transformation, and validation
- **🧠 AI/ML Workflows**: Text generation, classification, sentiment analysis, and more
- **πŸ”„ Business Process Automation**: Invoice processing, report generation, email automation

**Why TaskFlow?**

βœ… **No-code workflow definition** - Define complex pipelines in YAML/JSON  
βœ… **Type-safe and tested** - Built with type hints and comprehensive test coverage  
βœ… **Extensible architecture** - Easily add custom tasks and integrations  
βœ… **Production-ready** - Detailed logging, error handling, and validation  
βœ… **Framework-agnostic** - Works with Selenium, Playwright, Pandas, OpenAI, and more

---

## πŸ“¦ Installation

### Basic Installation

```bash
pip install taskflow-pipeline
```

### Development Installation

```bash
git clone https://github.com/berkterekli/taskflow-pipeline.git
cd taskflow-pipeline
pip install -e ".[dev]"
```

### Requirements

- Python 3.8+
- PyYAML 6.0+

---

## πŸš€ Quick Start

### 1. Create Your First Pipeline

Create a file named `my_workflow.yaml`:

```yaml
tasks:
  # Step 1: Simulate a button click
  - action: "rpa.click"
    params:
      target: "Submit Button"

  # Step 2: Extract data from a PDF
  - action: "rpa.extract_table_from_pdf"
    params:
      file_path: "invoices/invoice_001.pdf"

  # Step 3: Clean the extracted data
  - action: "data.clean_data"
    params:
      data: "extracted_table.csv"

  # Step 4: Generate a summary with AI
  - action: "ai.generate_text"
    params:
      prompt: "Summarize the invoice data"
      max_tokens: 200
```

### 2. Run Your Pipeline

```python
from taskflow import TaskFlow

# Initialize the pipeline
pipeline = TaskFlow("my_workflow.yaml")

# Execute all tasks
pipeline.run()
```

### 3. See the Results

```
2025-11-03 16:12:08 - INFO - TaskFlow initialized with config: my_workflow.yaml
2025-11-03 16:12:08 - INFO - Starting pipeline execution...
2025-11-03 16:12:08 - INFO - Task 1/4: Executing 'rpa.click'
[RPA] Simulating click on: Submit Button
2025-11-03 16:12:08 - INFO - Task 1: Completed successfully
2025-11-03 16:12:08 - INFO - Task 2/4: Executing 'rpa.extract_table_from_pdf'
[RPA] Extracting table from PDF: invoices/invoice_001.pdf
2025-11-03 16:12:08 - INFO - Task 2: Completed successfully
...
2025-11-03 16:12:08 - INFO - Pipeline execution completed successfully!
```

---

## ✨ Features

### 🎭 Configuration-Based Workflows

Define your entire automation pipeline in a simple, readable format:

```yaml
tasks:
  - action: "rpa.click"
    params:
      target: "Login Button"
  
  - action: "data.transform_data"
    params:
      input_path: "raw_data.csv"
      output_path: "processed_data.csv"
      operations: ["dedupe", "normalize", "validate"]
```

### 🧩 Modular Task System

TaskFlow comes with three built-in task categories:

#### **πŸ€– RPA Tasks** - Robotic Process Automation

```python
# Available RPA Tasks:
rpa.click                    # Click UI elements
rpa.type_text               # Type into input fields
rpa.take_screenshot         # Capture screenshots
rpa.extract_table_from_pdf  # Extract tables from PDFs
```

**Example Use Case: Invoice Processing**

```yaml
tasks:
  - action: "rpa.extract_table_from_pdf"
    params:
      file_path: "invoice.pdf"
  
  - action: "data.validate_data"
    params:
      data: "extracted_invoice.csv"
      schema:
        invoice_number: "string"
        amount: "float"
        date: "date"
```

#### **πŸ“Š Data Tasks** - Data Processing & ETL

```python
# Available Data Tasks:
data.clean_data        # Clean and preprocess data
data.transform_data    # Apply transformations
data.merge_datasets    # Merge multiple datasets
data.validate_data     # Validate against schemas
```

**Example Use Case: Data Pipeline**

```yaml
tasks:
  - action: "data.merge_datasets"
    params:
      datasets: 
        - "sales_q1.csv"
        - "sales_q2.csv"
        - "sales_q3.csv"
      output_path: "annual_sales.csv"
  
  - action: "data.clean_data"
    params:
      data: "annual_sales.csv"
  
  - action: "data.transform_data"
    params:
      input_path: "annual_sales.csv"
      output_path: "sales_report.csv"
      operations: ["aggregate", "pivot", "format"]
```

#### **🧠 AI Tasks** - Artificial Intelligence

```python
# Available AI Tasks:
ai.generate_text      # Generate text with LLMs
ai.classify_text      # Classify text into categories
ai.analyze_sentiment  # Sentiment analysis
ai.extract_entities   # Named entity recognition
```

**Example Use Case: Content Analysis**

```yaml
tasks:
  - action: "ai.analyze_sentiment"
    params:
      text: "Customer feedback text here..."
  
  - action: "ai.extract_entities"
    params:
      text: "Extract companies, people, and locations from this text."
  
  - action: "ai.generate_text"
    params:
      prompt: "Write a summary of the customer feedback"
      max_tokens: 150
```

### πŸ”Œ Easy Extensibility

Add your own custom tasks in minutes:

```python
from taskflow import TaskFlow

# Define a custom function
def send_email(to: str, subject: str, body: str) -> None:
    """Send an email notification."""
    print(f"Sending email to {to}: {subject}")
    # Your email logic here...

# Register it as a custom action
pipeline = TaskFlow("workflow.yaml")
pipeline.add_custom_action("email.send", send_email)
pipeline.run()
```

Then use it in your YAML:

```yaml
tasks:
  - action: "email.send"
    params:
      to: "team@company.com"
      subject: "Pipeline Completed"
      body: "The data processing pipeline has finished successfully."
```

### πŸ›‘οΈ Type-Safe & Production Ready

- **Type Hints**: Full type annotations for IDE support and type checking
- **Comprehensive Logging**: Detailed logs for every step
- **Error Handling**: Clear error messages with actionable information
- **Validation**: Automatic validation of configuration files and parameters
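The configuration validation described above can be sketched roughly as follows. This is a minimal illustration of the idea, not TaskFlow's actual implementation; the function name, checks, and error messages are assumptions:

```python
from typing import Any, Dict

def validate_config(config: Dict[str, Any]) -> None:
    """Minimal sketch of config validation: ensure a well-formed task list."""
    if "tasks" not in config:
        raise ValueError("Configuration must contain a top-level 'tasks' key")
    if not isinstance(config["tasks"], list):
        raise ValueError("'tasks' must be a list of task definitions")
    for i, task in enumerate(config["tasks"], start=1):
        if not isinstance(task, dict) or "action" not in task:
            raise ValueError(f"Task {i} is missing the required 'action' field")

# A valid config passes silently; a malformed one raises a clear ValueError.
validate_config({"tasks": [{"action": "rpa.click", "params": {"target": "Button"}}]})
```

Failing fast with a message that names the offending task keeps debugging cheap: the error points at the exact entry to fix instead of surfacing later, mid-pipeline.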

### 🎨 Both YAML and JSON Support

Use whichever format you prefer:

**YAML** (Recommended for readability):
```yaml
tasks:
  - action: "rpa.click"
    params:
      target: "Button"
```

**JSON** (Better for programmatic generation):
```json
{
  "tasks": [
    {
      "action": "rpa.click",
      "params": {"target": "Button"}
    }
  ]
}
```

---

## πŸ“š Complete Task Reference

### πŸ€– RPA Tasks

| Action | Description | Parameters | Example |
|--------|-------------|------------|---------|
| `rpa.click` | Click UI element | `target` (str) | Click login button |
| `rpa.type_text` | Type into input | `target` (str), `text` (str) | Fill form fields |
| `rpa.take_screenshot` | Capture screen | `output_path` (str) | Save evidence |
| `rpa.extract_table_from_pdf` | Extract PDF table | `file_path` (str) | Parse invoices |

### πŸ“Š Data Tasks

| Action | Description | Parameters | Example |
|--------|-------------|------------|---------|
| `data.clean_data` | Clean data | `data` (str/list) | Remove duplicates |
| `data.transform_data` | Transform data | `input_path`, `output_path`, `operations` | ETL pipeline |
| `data.merge_datasets` | Merge datasets | `datasets` (list), `output_path` | Combine data sources |
| `data.validate_data` | Validate schema | `data` (str), `schema` (dict) | Ensure data quality |

### 🧠 AI Tasks

| Action | Description | Parameters | Example |
|--------|-------------|------------|---------|
| `ai.generate_text` | Generate text | `prompt` (str), `max_tokens` (int) | Create summaries |
| `ai.classify_text` | Classify text | `text` (str), `categories` (list) | Categorize content |
| `ai.analyze_sentiment` | Analyze sentiment | `text` (str) | Measure satisfaction |
| `ai.extract_entities` | Extract entities | `text` (str) | Find names, places |

---

## πŸ’‘ Real-World Examples

### Example 1: Invoice Processing Automation

```yaml
# invoice_workflow.yaml
tasks:
  # Step 1: Extract data from invoice PDF
  - action: "rpa.extract_table_from_pdf"
    params:
      file_path: "invoices/invoice_2024_001.pdf"
  
  # Step 2: Validate the extracted data
  - action: "data.validate_data"
    params:
      data: "extracted_invoice.csv"
      schema:
        invoice_id: "string"
        vendor: "string"
        amount: "float"
        date: "date"
  
  # Step 3: Clean and format the data
  - action: "data.clean_data"
    params:
      data: "extracted_invoice.csv"
  
  # Step 4: Generate a summary email
  - action: "ai.generate_text"
    params:
      prompt: "Create a professional email summary of this invoice"
      max_tokens: 200
```

Run it:
```python
from taskflow import TaskFlow

pipeline = TaskFlow("invoice_workflow.yaml")
pipeline.run()
```

### Example 2: Customer Feedback Analysis

```yaml
# feedback_analysis.yaml
tasks:
  # Step 1: Analyze sentiment of customer reviews
  - action: "ai.analyze_sentiment"
    params:
      text: "The product quality is excellent but shipping was slow."
  
  # Step 2: Extract key entities (products, issues)
  - action: "ai.extract_entities"
    params:
      text: "The product quality is excellent but shipping was slow."
  
  # Step 3: Classify feedback type
  - action: "ai.classify_text"
    params:
      text: "The product quality is excellent but shipping was slow."
      categories: ["product_quality", "shipping", "customer_service", "pricing"]
  
  # Step 4: Generate action items
  - action: "ai.generate_text"
    params:
      prompt: "Based on the feedback, suggest 3 action items for improvement"
      max_tokens: 150
```

### Example 3: Data ETL Pipeline

```yaml
# etl_pipeline.yaml
tasks:
  # Step 1: Merge quarterly sales data
  - action: "data.merge_datasets"
    params:
      datasets:
        - "data/q1_sales.csv"
        - "data/q2_sales.csv"
        - "data/q3_sales.csv"
        - "data/q4_sales.csv"
      output_path: "data/annual_sales.csv"
  
  # Step 2: Clean the merged data
  - action: "data.clean_data"
    params:
      data: "data/annual_sales.csv"
  
  # Step 3: Transform and aggregate
  - action: "data.transform_data"
    params:
      input_path: "data/annual_sales.csv"
      output_path: "data/sales_report.csv"
      operations: ["dedupe", "aggregate", "sort"]
  
  # Step 4: Validate final output
  - action: "data.validate_data"
    params:
      data: "data/sales_report.csv"
      schema:
        product_id: "string"
        total_sales: "float"
        region: "string"
```

### Example 4: Custom Task Integration

```python
# custom_workflow.py
from taskflow import TaskFlow
import requests

# Define custom tasks
def send_slack_notification(webhook_url: str, message: str) -> None:
    """Send a Slack notification."""
    requests.post(webhook_url, json={"text": message})
    print(f"Sent Slack message: {message}")

def query_database(connection_string: str, query: str) -> list:
    """Query a database and return results."""
    print(f"Executing query: {query}")
    # Your database logic here...
    return [{"id": 1, "name": "Result"}]

# Set up pipeline with custom actions
pipeline = TaskFlow("custom_workflow.yaml")
pipeline.add_custom_action("slack.send", send_slack_notification)
pipeline.add_custom_action("db.query", query_database)

# Run the pipeline
pipeline.run()
```

**custom_workflow.yaml:**
```yaml
tasks:
  - action: "db.query"
    params:
      connection_string: "postgresql://localhost/mydb"
      query: "SELECT * FROM users WHERE active = true"
  
  - action: "data.clean_data"
    params:
      data: "query_results.csv"
  
  - action: "slack.send"
    params:
      webhook_url: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
      message: "Daily user report generated successfully!"
```

---

## πŸ—οΈ Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                   TaskFlow                       β”‚
β”‚                  (Orchestrator)                  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
             β”‚                        β”‚
             β–Ό                        β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚  YAML Parser   β”‚       β”‚  JSON Parser   β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜       β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”˜
             β”‚                        β”‚
             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                         β–Ό
             β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
             β”‚    Action Mapper      β”‚
             β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                         β”‚
        β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
        β–Ό                β–Ό                β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   RPA Tasks   β”‚ β”‚ Data Tasks  β”‚ β”‚   AI Tasks    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€ β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ β€’ click       β”‚ β”‚ β€’ clean     β”‚ β”‚ β€’ generate    β”‚
β”‚ β€’ type        β”‚ β”‚ β€’ transform β”‚ β”‚ β€’ classify    β”‚
β”‚ β€’ screenshot  β”‚ β”‚ β€’ merge     β”‚ β”‚ β€’ sentiment   β”‚
β”‚ β€’ extract     β”‚ β”‚ β€’ validate  β”‚ β”‚ β€’ entities    β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
         β”‚                β”‚                β”‚
         β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                          β–Ό
                   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                   β”‚   Logging    β”‚
                   β”‚ Error Handle β”‚
                   β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```
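One way to read the diagram: the orchestrator resolves each `action` string to a registered Python callable via the Action Mapper, then invokes it with the task's `params` as keyword arguments. The sketch below illustrates that dispatch pattern only; it is not TaskFlow's internal code, and the registry and decorator names are hypothetical:

```python
from typing import Any, Callable, Dict

# Hypothetical registry mapping action names to callables.
ACTIONS: Dict[str, Callable[..., Any]] = {}

def register(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Decorator that adds a function to the action registry under `name`."""
    def wrap(fn: Callable[..., Any]) -> Callable[..., Any]:
        ACTIONS[name] = fn
        return fn
    return wrap

@register("rpa.click")
def click(target: str) -> str:
    return f"clicked {target}"

def run_task(task: Dict[str, Any]) -> Any:
    """Look up the task's action and call it with the task's params."""
    action = ACTIONS[task["action"]]
    return action(**task.get("params", {}))

result = run_task({"action": "rpa.click", "params": {"target": "Submit Button"}})
```

Keeping dispatch in a single registry is what makes `add_custom_action` (shown earlier) cheap: registering a new action is one dictionary insert, and the parsers never need to know which actions exist.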

---

## πŸ”§ Advanced Usage

### Error Handling

```python
from taskflow import TaskFlow

try:
    pipeline = TaskFlow("workflow.yaml")
    pipeline.run()
except FileNotFoundError:
    print("❌ Configuration file not found")
except ValueError as e:
    print(f"❌ Invalid configuration: {e}")
except Exception as e:
    print(f"❌ Pipeline failed: {e}")
```

### Custom Logging

```python
import logging
from taskflow import TaskFlow

# Configure logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('pipeline.log'),
        logging.StreamHandler()
    ]
)

pipeline = TaskFlow("workflow.yaml")
pipeline.run()
```

### Dynamic Task Generation

```python
from taskflow import TaskFlow
import yaml

# Generate tasks programmatically
tasks = {
    "tasks": [
        {
            "action": "rpa.click",
            "params": {"target": f"Button {i}"}
        }
        for i in range(5)
    ]
}

# Save to file
with open("dynamic_workflow.yaml", "w") as f:
    yaml.dump(tasks, f)

# Run the pipeline
pipeline = TaskFlow("dynamic_workflow.yaml")
pipeline.run()
```

---

## πŸ“– Documentation

### Project Structure

```
taskflow/
β”œβ”€β”€ __init__.py              # Package initialization
β”œβ”€β”€ core.py                  # Main TaskFlow engine
β”œβ”€β”€ parser.py                # YAML/JSON configuration parser
└── tasks/
    β”œβ”€β”€ __init__.py
    β”œβ”€β”€ rpa_tasks.py         # RPA automation functions
    β”œβ”€β”€ data_tasks.py        # Data processing functions
    └── ai_tasks.py          # AI/ML task functions
```

### Development

```bash
# Clone the repository
git clone https://github.com/berkterekli/taskflow-pipeline.git
cd taskflow-pipeline

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest -v

# Run tests with coverage
pytest --cov=taskflow --cov-report=html

# Format code
black taskflow tests examples

# Type checking
mypy taskflow

# Lint code
flake8 taskflow
```

### Running Tests

```bash
# Run all tests
pytest

# Run specific test file
pytest tests/test_pipeline.py

# Run with verbose output
pytest -v

# Run with coverage report
pytest --cov=taskflow --cov-report=term-missing
```

---

## 🀝 Contributing

We welcome contributions! Here's how you can help:

1. **Fork the repository**
2. **Create a feature branch**: `git checkout -b feature/amazing-feature`
3. **Make your changes**
4. **Run tests**: `pytest`
5. **Format code**: `black taskflow tests`
6. **Commit changes**: `git commit -m 'Add amazing feature'`
7. **Push to branch**: `git push origin feature/amazing-feature`
8. **Open a Pull Request**

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

### Adding New Tasks

To add a new task module:

1. Create `taskflow/tasks/your_tasks.py`
2. Implement functions with type hints and docstrings
3. Register actions in `taskflow/core.py`
4. Add tests in `tests/test_your_tasks.py`
5. Update documentation
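As a concrete illustration of steps 1 and 2, a new task module might look like the sketch below. The module name `file_tasks` and both functions are hypothetical examples, not part of the current package:

```python
"""Hypothetical taskflow/tasks/file_tasks.py: simple file utilities."""

def count_lines(file_path: str) -> int:
    """Count the number of lines in a text file."""
    with open(file_path, encoding="utf-8") as f:
        return sum(1 for _ in f)

def to_upper(text: str) -> str:
    """Return the input text upper-cased."""
    return text.upper()
```

Until such a module is registered in `taskflow/core.py`, the same functions can be attached to a single pipeline with `pipeline.add_custom_action("file.count_lines", count_lines)`, as shown in the Easy Extensibility section.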

---

## πŸ“„ License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## πŸ™ Acknowledgments

- Built with ❀️ by [Berk Terekli](https://github.com/berkterekli)
- Inspired by the need for simple, maintainable automation workflows
- Thanks to all [contributors](https://github.com/berkterekli/taskflow-pipeline/graphs/contributors)

---

## πŸ“ž Support & Community

- πŸ“– **Documentation**: [GitHub Wiki](https://github.com/berkterekli/taskflow-pipeline/wiki)
- πŸ› **Bug Reports**: [Issue Tracker](https://github.com/berkterekli/taskflow-pipeline/issues)
- πŸ’¬ **Discussions**: [GitHub Discussions](https://github.com/berkterekli/taskflow-pipeline/discussions)
- πŸ“§ **Email**: berk.terekli@example.com
- 🐦 **Twitter**: [@berkterekli](https://twitter.com/berkterekli)

---

## ⭐ Star History

If you find TaskFlow useful, please consider giving it a star on GitHub! It helps others discover the project.

[![Star History Chart](https://api.star-history.com/svg?repos=berkterekli/taskflow-pipeline&type=Date)](https://star-history.com/#berkterekli/taskflow-pipeline&Date)

---

## πŸ—ΊοΈ Roadmap

- [ ] **Web UI Dashboard** - Visual pipeline editor and monitor
- [ ] **Parallel Execution** - Run tasks in parallel for better performance
- [ ] **Conditional Logic** - If/else conditions in workflows
- [ ] **Loop Support** - Iterate over datasets
- [ ] **Error Retry** - Automatic retry with exponential backoff
- [ ] **Notifications** - Email, Slack, Teams integrations
- [ ] **Scheduling** - Cron-like scheduling support
- [ ] **Docker Support** - Pre-built Docker images
- [ ] **Cloud Integrations** - AWS, Azure, GCP task modules
- [ ] **Database Tasks** - Built-in database operations
- [ ] **API Tasks** - REST API integration tasks

---

## πŸ“Š Stats

![PyPI - Downloads](https://img.shields.io/pypi/dm/taskflow-pipeline)
![GitHub stars](https://img.shields.io/github/stars/berkterekli/taskflow-pipeline?style=social)
![GitHub forks](https://img.shields.io/github/forks/berkterekli/taskflow-pipeline?style=social)
![GitHub watchers](https://img.shields.io/github/watchers/berkterekli/taskflow-pipeline?style=social)

---

<div align="center">

**Made with ❀️ by [Berk Terekli](https://github.com/berkterekli)**

[⬆ Back to Top](#-taskflow-pipeline)

</div>

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "taskflow-pipeline",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "Berk Terekli <berk.terekli@example.com>",
    "keywords": "automation, rpa, pipeline, workflow, orchestration, task-automation, yaml-config, data-processing, ai-tasks, robotic-process-automation",
    "author": null,
    "author_email": "Berk Terekli <berk.terekli@example.com>",
    "download_url": "https://files.pythonhosted.org/packages/3e/03/5354ffee482c66457adee95f603a530847200fde869b3b230eaf4f719cd8/taskflow_pipeline-0.1.1.tar.gz",
    "platform": null,
    "description": "# \ud83d\ude80 TaskFlow Pipeline# TaskFlow\n\n\n\n<div align=\"center\">[![PyPI version](https://badge.fury.io/py/taskflow-pipeline.svg)](https://badge.fury.io/py/taskflow-pipeline)\n\n[![Python Versions](https://img.shields.io/pypi/pyversions/taskflow-pipeline.svg)](https://pypi.org/project/taskflow-pipeline/)\n\n[![PyPI version](https://badge.fury.io/py/taskflow-pipeline.svg)](https://badge.fury.io/py/taskflow-pipeline)[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n[![Python Versions](https://img.shields.io/pypi/pyversions/taskflow-pipeline.svg)](https://pypi.org/project/taskflow-pipeline/)[![Tests](https://github.com/berkterekli/taskflow-pipeline/workflows/Tests/badge.svg)](https://github.com/berkterekli/taskflow-pipeline/actions)\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\n[![Downloads](https://pepy.tech/badge/taskflow-pipeline)](https://pepy.tech/project/taskflow-pipeline)A Python library for orchestrating automated pipelines including RPA (desktop/web automation), data processing, and AI tasks using YAML or JSON configuration files.\n\n[![Tests](https://github.com/berkterekli/taskflow-pipeline/workflows/Tests/badge.svg)](https://github.com/berkterekli/taskflow-pipeline/actions)\n\n## Features\n\n**A powerful Python library for orchestrating complex automation workflows with simple YAML/JSON configuration files**\n\n- **Configuration-based execution**: Define your task pipelines in YAML or JSON\n\n[Installation](#-installation) \u2022- **Modular task system**: Supports RPA, data processing, and AI tasks\n\n[Quick Start](#-quick-start) \u2022- **Easy extensibility**: Add custom task modules with simple function definitions\n\n[Features](#-features) \u2022- **Type-safe**: Built with type hints for better IDE support\n\n[Documentation](#-documentation) \u2022- **Error handling**: Clear error messages for 
debugging pipelines\n\n[Examples](#-examples)\n\n## Installation\n\n</div>\n\n```bash\n\n---pip install taskflow-pipeline\n\n```\n\n## \ud83c\udfaf What is TaskFlow?\n\nFor development:\n\nTaskFlow is a declarative automation framework that lets you build complex workflows by writing simple configuration files. Perfect for:\n\n```bash\n\n- **\ud83e\udd16 RPA (Robotic Process Automation)**: Automate repetitive tasks, UI interactions, and document processinggit clone https://github.com/berkterekli/taskflow-pipeline.git\n\n- **\ud83d\udcca Data Processing**: ETL pipelines, data cleaning, transformation, and validationcd taskflow-pipeline\n\n- **\ud83e\udde0 AI/ML Workflows**: Text generation, classification, sentiment analysis, and morepip install -e \".[dev]\"\n\n- **\ud83d\udd04 Business Process Automation**: Invoice processing, report generation, email automation```\n\n\n\n**Why TaskFlow?**## Quick Start\n\n\n\n\u2705 **No-code workflow definition** - Define complex pipelines in YAML/JSON  1. Create a `tasks.yaml` file:\n\n\u2705 **Type-safe and tested** - Built with type hints and comprehensive test coverage  \n\n\u2705 **Extensible architecture** - Easily add custom tasks and integrations  ```yaml\n\n\u2705 **Production-ready** - Detailed logging, error handling, and validation  tasks:\n\n\u2705 **Framework-agnostic** - Works with Selenium, Playwright, Pandas, OpenAI, and more  - action: \"rpa.click\"\n\n    params:\n\n---      target: \"Submit Button\"\n\n  \n\n## \ud83d\udce6 Installation  - action: \"data.clean_data\"\n\n    params:\n\n### Basic Installation      data: \"sample_data.csv\"\n\n```\n\n```bash\n\npip install taskflow-pipeline2. 
Run your pipeline:\n\n```\n\n```python\n\n### Development Installationfrom taskflow import TaskFlow\n\n\n\n```bash# Initialize and run the pipeline\n\ngit clone https://github.com/berkterekli/taskflow-pipeline.gitpipeline = TaskFlow(\"tasks.yaml\")\n\ncd taskflow-pipelinepipeline.run()\n\npip install -e \".[dev]\"```\n\n```\n\n## Task Types\n\n### Requirements\n\n### RPA Tasks\n\n- Python 3.8+- `rpa.click`: Simulate clicking on UI elements\n\n- PyYAML 6.0+- `rpa.extract_table_from_pdf`: Extract tables from PDF files\n\n\n\n---### Data Tasks\n\n- `data.clean_data`: Clean and preprocess data\n\n## \ud83d\ude80 Quick Start\n\n### AI Tasks\n\n### 1. Create Your First Pipeline- `ai.generate_text`: Generate text using AI models\n\n\n\nCreate a file named `my_workflow.yaml`:## Project Structure\n\n\n\n```yaml```\n\ntasks:taskflow/\n\n  # Step 1: Simulate a button click\u251c\u2500\u2500 __init__.py\n\n  - action: \"rpa.click\"\u251c\u2500\u2500 core.py          # Main TaskFlow engine\n\n    params:\u251c\u2500\u2500 parser.py        # YAML/JSON parser\n\n      target: \"Submit Button\"\u2514\u2500\u2500 tasks/\n\n      \u251c\u2500\u2500 __init__.py\n\n  # Step 2: Extract data from a PDF    \u251c\u2500\u2500 rpa_tasks.py      # RPA automation tasks\n\n  - action: \"rpa.extract_table_from_pdf\"    \u251c\u2500\u2500 data_tasks.py     # Data processing tasks\n\n    params:    \u2514\u2500\u2500 ai_tasks.py       # AI-related tasks\n\n      file_path: \"invoices/invoice_001.pdf\"```\n\n  \n\n  # Step 3: Clean the extracted data## Development\n\n  - action: \"data.clean_data\"\n\n    params:Run tests:\n\n      data: \"extracted_table.csv\"\n\n  ```bash\n\n  # Step 4: Generate a summary with AIpytest\n\n  - action: \"ai.generate_text\"```\n\n    params:\n\n      prompt: \"Summarize the invoice data\"Format code:\n\n      max_tokens: 200\n\n``````bash\n\nblack taskflow tests\n\n### 2. 
Run Your Pipeline```\n\n\n\n```pythonType checking:\n\nfrom taskflow import TaskFlow\n\n```bash\n\n# Initialize the pipelinemypy taskflow\n\npipeline = TaskFlow(\"my_workflow.yaml\")```\n\n\n\n# Execute all tasks## License\n\npipeline.run()\n\n```MIT License - see [LICENSE](LICENSE) file for details.\n\n\n\n### 3. See the Results## Contributing\n\n\n\n```Contributions are welcome! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.\n\n2025-11-03 16:12:08 - INFO - TaskFlow initialized with config: my_workflow.yaml\n\n2025-11-03 16:12:08 - INFO - Starting pipeline execution...## Publishing\n\n2025-11-03 16:12:08 - INFO - Task 1/4: Executing 'rpa.click'\n\n[RPA] Simulating click on: Submit ButtonSee [PUBLISHING_GUIDE.md](PUBLISHING_GUIDE.md) for detailed instructions on publishing to PyPI.\n\n2025-11-03 16:12:08 - INFO - Task 1: Completed successfully\n\n2025-11-03 16:12:08 - INFO - Task 2/4: Executing 'rpa.extract_table_from_pdf'## Changelog\n\n[RPA] Extracting table from PDF: invoices/invoice_001.pdf\n\n2025-11-03 16:12:08 - INFO - Task 2: Completed successfullySee [CHANGELOG.md](CHANGELOG.md) for version history and changes.\n\n...\n\n2025-11-03 16:12:08 - INFO - Pipeline execution completed successfully!## Author\n\n```\n\n**Berk Terekli**\n\n---\n\n## Support\n\n## \u2728 Features\n\n- \ud83d\udcd6 [Documentation](https://github.com/berkterekli/taskflow-pipeline#readme)\n\n### \ud83c\udfad Configuration-Based Workflows- \ud83d\udc1b [Issue Tracker](https://github.com/berkterekli/taskflow-pipeline/issues)\n\n- \ud83d\udcac [Discussions](https://github.com/berkterekli/taskflow-pipeline/discussions)\n\nDefine your entire automation pipeline in a simple, readable format:\n\n```yaml\ntasks:\n  - action: \"rpa.click\"\n    params:\n      target: \"Login Button\"\n  \n  - action: \"data.transform_data\"\n    params:\n      input_path: \"raw_data.csv\"\n      output_path: \"processed_data.csv\"\n      operations: [\"dedupe\", \"normalize\", 
### 🧩 Modular Task System

TaskFlow comes with three built-in task categories:

#### **🤖 RPA Tasks** - Robotic Process Automation

```python
# Available RPA Tasks:
rpa.click                    # Click UI elements
rpa.type_text                # Type into input fields
rpa.take_screenshot          # Capture screenshots
rpa.extract_table_from_pdf   # Extract tables from PDFs
```

**Example Use Case: Invoice Processing**

```yaml
tasks:
  - action: "rpa.extract_table_from_pdf"
    params:
      file_path: "invoice.pdf"

  - action: "data.validate_data"
    params:
      data: "extracted_invoice.csv"
      schema:
        invoice_number: "string"
        amount: "float"
        date: "date"
```

#### **📊 Data Tasks** - Data Processing & ETL

```python
# Available Data Tasks:
data.clean_data        # Clean and preprocess data
data.transform_data    # Apply transformations
data.merge_datasets    # Merge multiple datasets
data.validate_data     # Validate against schemas
```

**Example Use Case: Data Pipeline**

```yaml
tasks:
  - action: "data.merge_datasets"
    params:
      datasets:
        - "sales_q1.csv"
        - "sales_q2.csv"
        - "sales_q3.csv"
      output_path: "annual_sales.csv"

  - action: "data.clean_data"
    params:
      data: "annual_sales.csv"

  - action: "data.transform_data"
    params:
      input_path: "annual_sales.csv"
      output_path: "sales_report.csv"
      operations: ["aggregate", "pivot", "format"]
```

#### **🧠 AI Tasks** - Artificial Intelligence

```python
# Available AI Tasks:
ai.generate_text      # Generate text with LLMs
ai.classify_text      # Classify text into categories
ai.analyze_sentiment  # Sentiment analysis
ai.extract_entities   # Named entity recognition
```

**Example Use Case: Content Analysis**
```yaml
tasks:
  - action: "ai.analyze_sentiment"
    params:
      text: "Customer feedback text here..."

  - action: "ai.extract_entities"
    params:
      text: "Extract companies, people, and locations from this text."

  - action: "ai.generate_text"
    params:
      prompt: "Write a summary of the customer feedback"
      max_tokens: 150
```

### 🔌 Easy Extensibility

Add your own custom tasks in minutes:

```python
from taskflow import TaskFlow

# Define a custom function
def send_email(to: str, subject: str, body: str) -> None:
    """Send an email notification."""
    print(f"Sending email to {to}: {subject}")
    # Your email logic here...

# Register it as a custom action
pipeline = TaskFlow("workflow.yaml")
pipeline.add_custom_action("email.send", send_email)
pipeline.run()
```

Then use it in your YAML:

```yaml
tasks:
  - action: "email.send"
    params:
      to: "team@company.com"
      subject: "Pipeline Completed"
      body: "The data processing pipeline has finished successfully."
```

### 🛡️ Type-Safe & Production Ready

- **Type Hints**: Full type annotations for IDE support and type checking
- **Comprehensive Logging**: Detailed logs for every step
- **Error Handling**: Clear error messages with actionable information
- **Validation**: Automatic validation of configuration files and parameters

### 🎨 Both YAML and JSON Support

Use whichever format you prefer:

**YAML** (Recommended for readability):
```yaml
tasks:
  - action: "rpa.click"
    params:
      target: "Button"
```

**JSON** (Better for programmatic generation):
```json
{
  "tasks": [
    {
      "action": "rpa.click",
      "params": {"target": "Button"}
    }
  ]
}
```
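Under the hood, registering a custom action amounts to mapping a dotted name like `email.send` to a callable and invoking it with the task's `params` as keyword arguments. The following is a hedged sketch of that pattern, not TaskFlow's actual internals — `ActionRegistry` is a hypothetical class introduced purely for illustration:

```python
from typing import Callable, Dict

class ActionRegistry:
    """Map dotted action names like 'email.send' to callables (illustrative)."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable] = {}

    def register(self, name: str, func: Callable) -> None:
        """Register a callable under a dotted action name."""
        if name in self._actions:
            raise ValueError(f"action '{name}' is already registered")
        self._actions[name] = func

    def dispatch(self, name: str, params: dict):
        """Look up the action and call it with params as keyword arguments."""
        if name not in self._actions:
            raise KeyError(f"unknown action '{name}'")
        return self._actions[name](**params)

registry = ActionRegistry()
registry.register("email.send", lambda to, subject, body: f"sent to {to}")
result = registry.dispatch(
    "email.send",
    {"to": "team@company.com", "subject": "Pipeline Completed", "body": "Done."},
)
print(result)  # sent to team@company.com
```

Because `params` is splatted into the function call, a custom task's signature doubles as its parameter schema: a missing or misspelled key in the YAML surfaces as a `TypeError` naming the offending argument.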
## 📚 Complete Task Reference

### 🤖 RPA Tasks

| Action | Description | Parameters | Example |
|--------|-------------|------------|---------|
| `rpa.click` | Click UI element | `target` (str) | Click login button |
| `rpa.type_text` | Type into input | `target` (str), `text` (str) | Fill form fields |
| `rpa.take_screenshot` | Capture screen | `output_path` (str) | Save evidence |
| `rpa.extract_table_from_pdf` | Extract PDF table | `file_path` (str) | Parse invoices |

### 📊 Data Tasks

| Action | Description | Parameters | Example |
|--------|-------------|------------|---------|
| `data.clean_data` | Clean data | `data` (str/list) | Remove duplicates |
| `data.transform_data` | Transform data | `input_path`, `output_path`, `operations` | ETL pipeline |
| `data.merge_datasets` | Merge datasets | `datasets` (list), `output_path` | Combine data sources |
| `data.validate_data` | Validate schema | `data` (str), `schema` (dict) | Ensure data quality |

### 🧠 AI Tasks

| Action | Description | Parameters | Example |
|--------|-------------|------------|---------|
| `ai.generate_text` | Generate text | `prompt` (str), `max_tokens` (int) | Create summaries |
| `ai.classify_text` | Classify text | `text` (str), `categories` (list) | Categorize content |
| `ai.analyze_sentiment` | Analyze sentiment | `text` (str) | Measure satisfaction |
| `ai.extract_entities` | Extract entities | `text` (str) | Find names, places |

---
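The `schema` parameter accepted by `data.validate_data` pairs field names with type names such as `"string"` and `"float"`. How TaskFlow checks those types internally isn't documented here, but the idea can be sketched in a few lines — `validate_row` and the `CHECKS` table below are hypothetical stand-ins, not library code:

```python
# Hypothetical type checks keyed by the schema's type names (partial: "date" omitted).
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "float": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
}

def validate_row(row: dict, schema: dict) -> list:
    """Return a list of error messages for fields that fail the schema."""
    errors = []
    for field, type_name in schema.items():
        if field not in row:
            errors.append(f"missing field '{field}'")
        elif not CHECKS[type_name](row[field]):
            errors.append(f"field '{field}' is not a {type_name}")
    return errors

schema = {"invoice_number": "string", "amount": "float"}
print(validate_row({"invoice_number": "INV-001", "amount": 129.5}, schema))  # []
print(validate_row({"amount": "oops"}, schema))
```

Collecting all errors instead of raising on the first one gives the log a complete picture of a bad row, which matters when validation sits in the middle of a long pipeline.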
## 💡 Real-World Examples

### Example 1: Invoice Processing Automation

```yaml
# invoice_workflow.yaml
tasks:
  # Step 1: Extract data from invoice PDF
  - action: "rpa.extract_table_from_pdf"
    params:
      file_path: "invoices/invoice_2024_001.pdf"

  # Step 2: Validate the extracted data
  - action: "data.validate_data"
    params:
      data: "extracted_invoice.csv"
      schema:
        invoice_id: "string"
        vendor: "string"
        amount: "float"
        date: "date"

  # Step 3: Clean and format the data
  - action: "data.clean_data"
    params:
      data: "extracted_invoice.csv"

  # Step 4: Generate a summary email
  - action: "ai.generate_text"
    params:
      prompt: "Create a professional email summary of this invoice"
      max_tokens: 200
```

Run it:

```python
from taskflow import TaskFlow

pipeline = TaskFlow("invoice_workflow.yaml")
pipeline.run()
```

### Example 2: Customer Feedback Analysis

```yaml
# feedback_analysis.yaml
tasks:
  # Step 1: Analyze sentiment of customer reviews
  - action: "ai.analyze_sentiment"
    params:
      text: "The product quality is excellent but shipping was slow."

  # Step 2: Extract key entities (products, issues)
  - action: "ai.extract_entities"
    params:
      text: "The product quality is excellent but shipping was slow."

  # Step 3: Classify feedback type
  - action: "ai.classify_text"
    params:
      text: "The product quality is excellent but shipping was slow."
      categories: ["product_quality", "shipping", "customer_service", "pricing"]

  # Step 4: Generate action items
  - action: "ai.generate_text"
    params:
      prompt: "Based on the feedback, suggest 3 action items for improvement"
      max_tokens: 150
```
### Example 3: Data ETL Pipeline

```yaml
# etl_pipeline.yaml
tasks:
  # Step 1: Merge quarterly sales data
  - action: "data.merge_datasets"
    params:
      datasets:
        - "data/q1_sales.csv"
        - "data/q2_sales.csv"
        - "data/q3_sales.csv"
        - "data/q4_sales.csv"
      output_path: "data/annual_sales.csv"

  # Step 2: Clean the merged data
  - action: "data.clean_data"
    params:
      data: "data/annual_sales.csv"

  # Step 3: Transform and aggregate
  - action: "data.transform_data"
    params:
      input_path: "data/annual_sales.csv"
      output_path: "data/sales_report.csv"
      operations: ["dedupe", "aggregate", "sort"]

  # Step 4: Validate final output
  - action: "data.validate_data"
    params:
      data: "data/sales_report.csv"
      schema:
        product_id: "string"
        total_sales: "float"
        region: "string"
```

### Example 4: Custom Task Integration

```python
# custom_workflow.py
from taskflow import TaskFlow
import requests

# Define custom tasks
def send_slack_notification(webhook_url: str, message: str) -> None:
    """Send a Slack notification."""
    requests.post(webhook_url, json={"text": message})
    print(f"Sent Slack message: {message}")

def query_database(connection_string: str, query: str) -> list:
    """Query a database and return results."""
    print(f"Executing query: {query}")
    # Your database logic here...
    return [{"id": 1, "name": "Result"}]

# Set up pipeline with custom actions
pipeline = TaskFlow("custom_workflow.yaml")
pipeline.add_custom_action("slack.send", send_slack_notification)
pipeline.add_custom_action("db.query", query_database)

# Run the pipeline
pipeline.run()
```

**custom_workflow.yaml:**

```yaml
tasks:
  - action: "db.query"
    params:
      connection_string: "postgresql://localhost/mydb"
      query: "SELECT * FROM users WHERE active = true"

  - action: "data.clean_data"
    params:
      data: "query_results.csv"

  - action: "slack.send"
    params:
      webhook_url: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
      message: "Daily user report generated successfully!"
```

---

## 🏗️ Architecture

```
┌──────────────────────────────────────┐
│              TaskFlow                │
│            (Orchestrator)            │
└──────────┬────────────────┬──────────┘
           │                │
           ▼                ▼
  ┌────────────────┐  ┌────────────────┐
  │  YAML Parser   │  │  JSON Parser   │
  └────────┬───────┘  └────────┬───────┘
           │                   │
           └─────────┬─────────┘
                     ▼
           ┌──────────────────┐
           │  Action Mapper   │
           └─────────┬────────┘
                     │
     ┌───────────────┼───────────────┐
     ▼               ▼               ▼
┌──────────┐   ┌──────────┐    ┌──────────┐
│ RPA Tasks│   │Data Tasks│    │ AI Tasks │
├──────────┤   ├──────────┤    ├──────────┤
│ click    │   │ clean    │    │ generate │
│ type     │   │ transform│    │ classify │
│screenshot│   │ merge    │    │ sentiment│
│ extract  │   │ validate │    │ entities │
└────┬─────┘   └────┬─────┘    └────┬─────┘
     │              │               │
     └──────────────┼───────────────┘
                    ▼
           ┌──────────────────┐
           │     Logging      │
           │  Error Handling  │
           └──────────────────┘
```

---

## 🔧 Advanced Usage

### Error Handling

```python
from taskflow import TaskFlow

try:
    pipeline = TaskFlow("workflow.yaml")
    pipeline.run()
except FileNotFoundError:
    print("❌ Configuration file not found")
except ValueError as e:
    print(f"❌ Invalid configuration: {e}")
except Exception as e:
    print(f"❌ Pipeline failed: {e}")
```

### Custom Logging

```python
import logging
from taskflow import TaskFlow

# Configure logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('pipeline.log'),
        logging.StreamHandler()
    ]
)

pipeline = TaskFlow("workflow.yaml")
pipeline.run()
```

### Dynamic Task Generation

```python
from taskflow import TaskFlow
import yaml

# Generate tasks programmatically
tasks = {
    "tasks": [
        {
            "action": "rpa.click",
            "params": {"target": f"Button {i}"}
        }
        for i in range(5)
    ]
}

# Save to file
with open("dynamic_workflow.yaml", "w") as f:
    yaml.dump(tasks, f)

# Run the pipeline
pipeline = TaskFlow("dynamic_workflow.yaml")
pipeline.run()
```

---

## 📖 Documentation

### Project Structure

```
taskflow/
├── __init__.py              # Package initialization
├── core.py                  # Main TaskFlow engine
├── parser.py                # YAML/JSON configuration parser
└── tasks/
    ├── __init__.py
    ├── rpa_tasks.py         # RPA automation functions
    ├── data_tasks.py        # Data processing functions
    └── ai_tasks.py          # AI/ML task functions
```
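The error-handling pattern above stops at the first failure, and automatic retry is still on the roadmap. Until then, a user-level retry wrapper with exponential backoff can make flaky RPA steps more robust. This is a sketch of the general pattern, not a TaskFlow feature; `run_with_retry` is a hypothetical helper you could pass `pipeline.run` to:

```python
import time

def run_with_retry(run, attempts: int = 3, base_delay: float = 0.1):
    """Call `run()` and retry failures with exponential backoff (0.1s, 0.2s, ...)."""
    for attempt in range(1, attempts + 1):
        try:
            return run()
        except Exception as exc:
            if attempt == attempts:
                raise  # out of attempts: re-raise the last error
            delay = base_delay * 2 ** (attempt - 1)
            print(f"attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            time.sleep(delay)

# Example: a flaky step that succeeds on the third try.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(run_with_retry(flaky))  # ok
```

For a real pipeline this would look like `run_with_retry(pipeline.run)`; note it retries the whole pipeline, so the approach only suits workflows whose tasks are safe to re-execute.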
### Development

```bash
# Clone the repository
git clone https://github.com/berkterekli/taskflow-pipeline.git
cd taskflow-pipeline

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest -v

# Run tests with coverage
pytest --cov=taskflow --cov-report=html

# Format code
black taskflow tests examples

# Type checking
mypy taskflow

# Lint code
flake8 taskflow
```

### Running Tests

```bash
# Run all tests
pytest

# Run specific test file
pytest tests/test_pipeline.py

# Run with verbose output
pytest -v

# Run with coverage report
pytest --cov=taskflow --cov-report=term-missing
```

---

## 🤝 Contributing

We welcome contributions! Here's how you can help:

1. **Fork the repository**
2. **Create a feature branch**: `git checkout -b feature/amazing-feature`
3. **Make your changes**
4. **Run tests**: `pytest`
5. **Format code**: `black taskflow tests`
6. **Commit changes**: `git commit -m 'Add amazing feature'`
7. **Push to branch**: `git push origin feature/amazing-feature`
8. **Open a Pull Request**

See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines.

### Adding New Tasks

To add a new task module:

1. Create `taskflow/tasks/your_tasks.py`
2. Implement functions with type hints and docstrings
3. Register actions in `taskflow/core.py`
4. Add tests in `tests/test_your_tasks.py`
5. Update documentation

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- Built with ❤️ by [Berk Terekli](https://github.com/berkterekli)
- Inspired by the need for simple, maintainable automation workflows
- Thanks to all [contributors](https://github.com/berkterekli/taskflow-pipeline/graphs/contributors)

---

## 📞 Support & Community

- 📖 **Documentation**: [GitHub Wiki](https://github.com/berkterekli/taskflow-pipeline/wiki)
- 🐛 **Bug Reports**: [Issue Tracker](https://github.com/berkterekli/taskflow-pipeline/issues)
- 💬 **Discussions**: [GitHub Discussions](https://github.com/berkterekli/taskflow-pipeline/discussions)
- 📧 **Email**: berk.terekli@example.com
- 🐦 **Twitter**: [@berkterekli](https://twitter.com/berkterekli)

---

## ⭐ Star History
If you find TaskFlow useful, please consider giving it a star on GitHub! It helps others discover the project.

[![Star History Chart](https://api.star-history.com/svg?repos=berkterekli/taskflow-pipeline&type=Date)](https://star-history.com/#berkterekli/taskflow-pipeline&Date)

---

## 🗺️ Roadmap

- [ ] **Web UI Dashboard** - Visual pipeline editor and monitor
- [ ] **Parallel Execution** - Run tasks in parallel for better performance
- [ ] **Conditional Logic** - If/else conditions in workflows
- [ ] **Loop Support** - Iterate over datasets
- [ ] **Error Retry** - Automatic retry with exponential backoff
- [ ] **Notifications** - Email, Slack, Teams integrations
- [ ] **Scheduling** - Cron-like scheduling support
- [ ] **Docker Support** - Pre-built Docker images
- [ ] **Cloud Integrations** - AWS, Azure, GCP task modules
- [ ] **Database Tasks** - Built-in database operations
- [ ] **API Tasks** - REST API integration tasks

---

## 📊 Stats

![PyPI - Downloads](https://img.shields.io/pypi/dm/taskflow-pipeline)
![GitHub stars](https://img.shields.io/github/stars/berkterekli/taskflow-pipeline?style=social)
![GitHub forks](https://img.shields.io/github/forks/berkterekli/taskflow-pipeline?style=social)
![GitHub watchers](https://img.shields.io/github/watchers/berkterekli/taskflow-pipeline?style=social)

---

<div align="center">

**Made with ❤️ by [Berk Terekli](https://github.com/berkterekli)**

[⬆ Back to Top](#-taskflow-pipeline)

</div>