Flowfile


- **Name**: Flowfile
- **Version**: 0.3.6
- **Summary**: Project combining flowfile core (backend) and flowfile_worker (compute offloader) and flowfile_frame (api)
- **Upload time**: 2025-07-31 10:31:10
- **Author**: Edward van Eechoud
- **Requires Python**: `<3.13,>=3.10`

<h1 align="center">
  <img src="https://raw.githubusercontent.com/Edwardvaneechoud/Flowfile/main/.github/images/logo.png" alt="Flowfile Logo" width="100">
  <br>
  Flowfile
</h1>

<p align="center">
  <b>Main Repository</b>: <a href="https://github.com/Edwardvaneechoud/Flowfile">Edwardvaneechoud/Flowfile</a><br>
  <b>Documentation</b>: 
  <a href="https://edwardvaneechoud.github.io/Flowfile/">Website</a> - 
  <a href="https://github.com/Edwardvaneechoud/Flowfile/blob/main/flowfile_core/README.md">Core</a> - 
  <a href="https://github.com/Edwardvaneechoud/Flowfile/blob/main/flowfile_worker/README.md">Worker</a> - 
  <a href="https://github.com/Edwardvaneechoud/Flowfile/blob/main/flowfile_frontend/README.md">Frontend</a> - 
  <a href="https://dev.to/edwardvaneechoud/building-flowfile-architecting-a-visual-etl-tool-with-polars-576c">Technical Architecture</a>
</p>

<p>
Flowfile is a visual ETL tool and Python library suite that combines drag-and-drop workflow building with the speed of Polars dataframes. You can build data pipelines visually, transform data with the built-in nodes, or define data flows programmatically in Python and then analyze the results. Visual flows can be exported as standalone Python/Polars code for production deployment.
</p>

## 🚀 Getting Started

### Installation

Install Flowfile directly from PyPI:

```bash
pip install Flowfile
```
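
The package metadata for version 0.3.6 declares `requires_python` as `<3.13,>=3.10`, so `pip` is not expected to select this release on other interpreters. A minimal sketch for checking an interpreter up front; the helper name `is_supported_python` is hypothetical and not part of Flowfile:

```python
import sys

# Bounds taken from the package metadata: requires_python "<3.13,>=3.10".
MIN_VERSION = (3, 10)  # inclusive lower bound
MAX_VERSION = (3, 13)  # exclusive upper bound


def is_supported_python(version=None):
    """Return True if (major, minor) falls inside the declared range.

    Hypothetical helper for illustration; not part of the Flowfile API.
    """
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION


if __name__ == "__main__":
    status = "supported" if is_supported_python() else "not supported"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is {status}")
```

The bounds apply to this release only; later versions may declare a different range.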

### Quick Start: Web UI

The easiest way to get started is by launching the web-based UI:

```bash
# Start the Flowfile web UI with integrated services
flowfile run ui
```

This will:
- Start the combined core and worker services
- Launch a web interface in your browser
- Provide access to the full visual ETL capabilities

**Options:**
```bash
# Customize host
flowfile run ui --host 0.0.0.0

# Start without opening a browser
flowfile run ui --no-browser
```

You can also start the web UI programmatically:

```python
import flowfile

# Start with default settings
flowfile.start_web_ui()

# Or customize
flowfile.start_web_ui(open_browser=False)
```

### Using the FlowFrame API

Flowfile provides a Polars-like API for defining data pipelines programmatically:

```python
import flowfile as ff
from flowfile import col, open_graph_in_editor

# Create a data pipeline
df = ff.from_dict({
    "id": [1, 2, 3, 4, 5],
    "category": ["A", "B", "A", "C", "B"],
    "value": [100, 200, 150, 300, 250]
})

# Process the data
result = df.filter(col("value") > 150).with_columns([
    (col("value") * 2).alias("double_value")
])

# Open the graph in the web UI (starts the server if needed)
open_graph_in_editor(result.flow_graph)
```
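
To make the expected result concrete, here is a plain-Python mirror of the same two steps on the sample data. It only illustrates the semantics of `filter` and `with_columns`, assuming they behave like their Polars counterparts; it does not use or reproduce Flowfile's implementation.

```python
# Sample rows, matching the dictionary passed to ff.from_dict above.
rows = [
    {"id": 1, "category": "A", "value": 100},
    {"id": 2, "category": "B", "value": 200},
    {"id": 3, "category": "A", "value": 150},
    {"id": 4, "category": "C", "value": 300},
    {"id": 5, "category": "B", "value": 250},
]

# filter(col("value") > 150): keep rows whose value is strictly greater than 150.
filtered = [row for row in rows if row["value"] > 150]

# with_columns(... .alias("double_value")): add a derived column, keep the others.
result = [{**row, "double_value": row["value"] * 2} for row in filtered]

for row in result:
    print(row)
```

Because the comparison is strict, the row with `value` equal to 150 is dropped along with the row at 100.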

## 📦 Package Components

The `Flowfile` PyPI package includes:

- **Core Service (`flowfile_core`)**: The main ETL engine using Polars
- **Worker Service (`flowfile_worker`)**: Handles computation-intensive tasks
- **Web UI**: Browser-based visual ETL interface
- **FlowFrame API (`flowfile_frame`)**: Polars-like API for Python coding

## ✨ Key Features

### Visual ETL with Web UI

- **No Separate Installation Required**: The web UI ships with the pip package and launches directly from it
- **Drag-and-Drop Interface**: Build data pipelines visually
- **Integrated Services**: Combined core and worker services
- **Browser-Based**: Access from any device on your network when the UI is bound to a network-reachable host (for example `--host 0.0.0.0`)
- **Code Generation**: Export visual flows as Python/Polars scripts

### FlowFrame API

- **Familiar Syntax**: Polars-like API makes it easy to learn
- **ETL Graph Generation**: Automatically builds visual workflows
- **Lazy Evaluation**: Operations are not executed until needed
- **Interoperability**: Move between code and visual interfaces
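
The combination of graph generation and lazy evaluation can be pictured with a toy model: each method call records a node, and nothing runs until a final collect step replays the recorded nodes. The sketch below is a standard-library illustration of that idea only. The class `LazyPipeline` and its methods are hypothetical and are not Flowfile's implementation or API.

```python
class LazyPipeline:
    """Toy lazy pipeline: each call records a node instead of running it."""

    def __init__(self, rows, nodes=None):
        self._rows = rows
        self.nodes = list(nodes or [])  # recorded graph: (name, function) pairs

    def _add(self, name, func):
        # Return a new pipeline so earlier stages stay reusable.
        return LazyPipeline(self._rows, self.nodes + [(name, func)])

    def filter(self, predicate):
        return self._add("filter", lambda rows: [r for r in rows if predicate(r)])

    def with_column(self, name, func):
        return self._add(
            f"with_column:{name}",
            lambda rows: [{**r, name: func(r)} for r in rows],
        )

    def collect(self):
        # Execution happens only here, by replaying the recorded nodes in order.
        rows = self._rows
        for _, func in self.nodes:
            rows = func(rows)
        return rows


data = [{"id": 1, "value": 100}, {"id": 2, "value": 200}, {"id": 3, "value": 300}]
pipeline = (
    LazyPipeline(data)
    .filter(lambda r: r["value"] > 150)
    .with_column("double_value", lambda r: r["value"] * 2)
)

print([name for name, _ in pipeline.nodes])  # the graph exists before any work is done
print(pipeline.collect())
```

Keeping the recorded nodes separate from execution is what lets a tool render the same pipeline as a visual graph before any data is processed.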

### Data Operations

- **Data Cleaning & Transformation**: Complex joins, filtering, etc.
- **High Performance**: Built on Polars for efficient processing
- **Data Integration**: Handle various file formats
- **ETL Pipeline Building**: Create reusable workflows

## 🔄 Common FlowFrame Operations

```python

import flowfile as ff
from flowfile import col, when, lit

# Create data in memory (file-based readers are shown commented out below)
df = ff.from_dict({
    "id": [1, 2, 3, 4, 5],
    "category": ["A", "B", "A", "C", "B"],
    "value": [100, 200, 150, 300, 250]
})
# df_parquet = ff.read_parquet("data.parquet")
# df_csv = ff.read_csv("data.csv")

other_df = ff.from_dict({
    "product_id": [1, 2, 3, 4, 6],
    "product_name": ["WidgetA", "WidgetB", "WidgetC", "WidgetD", "WidgetE"],
    "supplier": ["SupplierX", "SupplierY", "SupplierX", "SupplierZ", "SupplierY"]
}, flow_graph=df.flow_graph  # Assign the data to the same graph
)

# Filter
filtered = df.filter(col("value") > 150)

# Transform
result = df.select(
    col("id"),
    (col("value") * 2).alias("double_value")
)

# Conditional logic
with_status = df.with_columns([
    when(col("value") > 200).then(lit("High")).otherwise(lit("Low")).alias("status")
])

# Group and aggregate
by_category = df.group_by("category").agg([
    col("value").sum().alias("total"),
    col("value").mean().alias("average")
])

# Join data
joined = df.join(other_df, left_on="id", right_on="product_id")

joined.flow_graph.flow_settings.execution_location = "auto"
joined.flow_graph.flow_settings.execution_mode = "Development"
ff.open_graph_in_editor(joined.flow_graph)  # opens the graph in the UI!

```
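
For reference, the values the group-by and join steps should produce on this sample data can be worked out in plain Python. This is a sketch of the expected semantics only, under two assumptions that the README does not state: that `agg` behaves like its Polars counterpart, and that `join` defaults to an inner join as it does in Polars.

```python
from collections import defaultdict

rows = [
    {"id": 1, "category": "A", "value": 100},
    {"id": 2, "category": "B", "value": 200},
    {"id": 3, "category": "A", "value": 150},
    {"id": 4, "category": "C", "value": 300},
    {"id": 5, "category": "B", "value": 250},
]
products = [
    {"product_id": 1, "product_name": "WidgetA", "supplier": "SupplierX"},
    {"product_id": 2, "product_name": "WidgetB", "supplier": "SupplierY"},
    {"product_id": 3, "product_name": "WidgetC", "supplier": "SupplierX"},
    {"product_id": 4, "product_name": "WidgetD", "supplier": "SupplierZ"},
    {"product_id": 6, "product_name": "WidgetE", "supplier": "SupplierY"},
]

# group_by("category").agg(sum, mean): collect values per category, then reduce.
values_by_category = defaultdict(list)
for row in rows:
    values_by_category[row["category"]].append(row["value"])

by_category = {
    category: {"total": sum(values), "average": sum(values) / len(values)}
    for category, values in values_by_category.items()
}

# join(left_on="id", right_on="product_id"), assumed to be an inner join:
# only ids present on both sides survive.
products_by_id = {p["product_id"]: p for p in products}
joined = [
    {
        **row,
        "product_name": products_by_id[row["id"]]["product_name"],
        "supplier": products_by_id[row["id"]]["supplier"],
    }
    for row in rows
    if row["id"] in products_by_id
]

print(by_category)
print([row["id"] for row in joined])
```

Under the inner-join assumption, `id` 5 has no matching product and `product_id` 6 has no matching row, so both are absent from the joined result.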

## 📝 Code Generation

Export your visual flows as standalone Python/Polars code for production use:

![Code Generation](https://raw.githubusercontent.com/Edwardvaneechoud/Flowfile/refs/heads/main/.github/images/generated_code.png)

Simply click the "Generate code" button in the visual editor to:
- Generate clean, readable Python/Polars code
- Export flows without Flowfile dependencies
- Deploy workflows in any Python environment
- Share ETL logic with team members
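
The README does not show how the generator works internally. As a rough mental model only, exporting a flow amounts to walking its steps and emitting one line of Polars code per step. The function `generate_polars_script` and the step format below are hypothetical and are not Flowfile's generator; the emitted calls (`pl.scan_csv`, `filter`, `with_columns`, `collect`) are standard Polars.

```python
def generate_polars_script(source_name, steps):
    """Render a list of (method, expression) steps as Polars source code.

    Toy illustration only; Flowfile's real generator is not shown in this README.
    """
    lines = [
        "import polars as pl",
        "",
        f"df = pl.scan_csv({source_name!r})",
    ]
    for method, expression in steps:
        lines.append(f"df = df.{method}({expression})")
    lines.append("result = df.collect()")
    return "\n".join(lines)


script = generate_polars_script(
    "data.csv",
    [
        ("filter", 'pl.col("value") > 150'),
        ("with_columns", '(pl.col("value") * 2).alias("double_value")'),
    ],
)
print(script)
```

The generated text depends only on Polars, which matches the stated goal of exporting flows without Flowfile dependencies.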

## 🧰 Command-Line Interface

```bash
# Show help and version info
flowfile

# Start the web UI
flowfile run ui [options]

# Run individual services
flowfile run core --host 0.0.0.0 --port 63578
flowfile run worker --host 0.0.0.0 --port 63579
```
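
When the core and worker run as separate processes, it can be useful to confirm that both are listening before sending work. A minimal sketch using only the standard library; the ports are the ones from the example above, and the helper `is_port_open` is hypothetical rather than part of the Flowfile CLI or API:

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Ports taken from the CLI example above; adjust if you passed different values.
SERVICES = {"core": 63578, "worker": 63579}

if __name__ == "__main__":
    for name, port in SERVICES.items():
        state = "reachable" if is_port_open("127.0.0.1", port) else "not reachable"
        print(f"{name} service on port {port}: {state}")
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that the service behind it is healthy.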

## 📚 Resources

- **[Main Repository](https://github.com/Edwardvaneechoud/Flowfile)**: Latest code and examples
- **[Documentation](https://edwardvaneechoud.github.io/Flowfile/)**: Comprehensive guides
- **[Technical Architecture](https://dev.to/edwardvaneechoud/building-flowfile-architecting-a-visual-etl-tool-with-polars-576c)**: Design overview

## 🖥️ Full Application Options

For the complete visual ETL experience, you have additional options:

- **Desktop Application**: Download from the [main repository](https://github.com/Edwardvaneechoud/Flowfile#-getting-started)
- **Docker Setup**: Run with Docker Compose
- **Manual Setup**: For development environments

## 📋 Development Roadmap

See the [main repository](https://github.com/Edwardvaneechoud/Flowfile#-todo) for the latest development roadmap and TODO list.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "Flowfile",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.13,>=3.10",
    "maintainer_email": null,
    "keywords": null,
    "author": "Edward van Eechoud",
    "author_email": "evaneechoud@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/12/12/b8275713459d90203c17d22af04cd49e7e90a608f9bbd18c024bca5622c8/flowfile-0.3.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Project combining flowfile core (backend) and flowfile_worker (compute offloader) and flowfile_frame (api)",
    "version": "0.3.6",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "dca428b7a7ad018d259ec38d1eedc640421c73f1c4ed707990447c59b84611fa",
                "md5": "2e0c18a3015a4c2c1b38cc92ff6a5473",
                "sha256": "77fc264448dfdfc0b83fc11b9078dca027d75c95748f6b9d3688c051918a3253"
            },
            "downloads": -1,
            "filename": "flowfile-0.3.6-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2e0c18a3015a4c2c1b38cc92ff6a5473",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.13,>=3.10",
            "size": 4886658,
            "upload_time": "2025-07-31T10:31:08",
            "upload_time_iso_8601": "2025-07-31T10:31:08.664583Z",
            "url": "https://files.pythonhosted.org/packages/dc/a4/28b7a7ad018d259ec38d1eedc640421c73f1c4ed707990447c59b84611fa/flowfile-0.3.6-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1212b8275713459d90203c17d22af04cd49e7e90a608f9bbd18c024bca5622c8",
                "md5": "64183aaee1b24146ba7092848ed5170a",
                "sha256": "e9edaab4ed6c2a2817803a45efaf632a422a6b718fa3a43e7b35ecd612617197"
            },
            "downloads": -1,
            "filename": "flowfile-0.3.6.tar.gz",
            "has_sig": false,
            "md5_digest": "64183aaee1b24146ba7092848ed5170a",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.13,>=3.10",
            "size": 4748348,
            "upload_time": "2025-07-31T10:31:10",
            "upload_time_iso_8601": "2025-07-31T10:31:10.777405Z",
            "url": "https://files.pythonhosted.org/packages/12/12/b8275713459d90203c17d22af04cd49e7e90a608f9bbd18c024bca5622c8/flowfile-0.3.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-31 10:31:10",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "flowfile"
}
        