<p align="center">
<a href="https://graphbook.ai">
<img src="docs/_static/graphbook.png" alt="Logo" width=256>
</a>
<h1 align="center">Graphbook</h1>
<p align="center">
<a href="https://github.com/graphbookai/graphbook/blob/main/LICENSE">
<img alt="GitHub License" src="https://img.shields.io/github/license/graphbookai/graphbook">
</a>
<a href="https://github.com/graphbookai/graphbook/actions/workflows/pypi.yml">
<img alt="GitHub Actions Workflow Status" src="https://img.shields.io/github/actions/workflow/status/graphbookai/graphbook/pypi.yml">
</a>
<a href="https://hub.docker.com/r/rsamf/graphbook">
<img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/rsamf/graphbook">
</a>
<a href="https://www.pepy.tech/projects/graphbook">
<img alt="PyPI Downloads" src="https://static.pepy.tech/badge/graphbook">
</a>
<a href="https://pypi.org/project/graphbook/">
<img alt="PyPI - Version" src="https://img.shields.io/pypi/v/graphbook">
</a>
</p>
<div align="center">
<a href="https://discord.gg/XukMUDmjnt">
<img alt="Join Discord" src="https://img.shields.io/badge/Join%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white">
</a>
</div>
<p align="center">
<a href="https://discord.gg/XukMUDmjnt">
<img alt="Discord" src="https://img.shields.io/discord/1199855707567177860">
</a>
</p>
<p align="center">
The ML workflow framework
<br>
<a href="https://github.com/graphbookai/graphbook/issues/new?template=bug_report.md&labels=bug">Report bug</a>
·
<a href="https://github.com/graphbookai/graphbook/issues/new?template=feature_request.md&labels=enhancement">Request feature</a>
</p>
<p align="center">
<a href="#overview">Overview</a> •
<a href="#status">Status</a> •
<a href="#getting-started">Getting Started</a> •
<a href="#examples">Examples</a> •
<a href="#collaboration">Collaboration</a>
</p>
</p>
## Overview
Graphbook is a framework for building efficient, visual DAG-structured ML workflows composed of nodes written in Python. Graphbook provides common ML processing features such as multiprocessing I/O and automatic batching for PyTorch tensors, and it features a web-based UI to assemble, monitor, and execute data processing workflows. It can be used to prepare training data for custom ML models, experiment with custom-trained or off-the-shelf models, and build ML-based ETL applications. Custom nodes are written in Python, and Graphbook acts as the framework: it calls lifecycle methods on those nodes as the workflow runs.
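
To make the "framework calls lifecycle methods on your nodes" idea concrete, here is a minimal sketch in plain Python. It does not use Graphbook's actual API: the `Node` base class, the `on_start`/`on_item`/`on_end` method names, and the `run` function are all hypothetical, and the graph is assumed to be a simple chain where each node has at most one upstream node.

```python
from graphlib import TopologicalSorter


class Node:
    """Hypothetical base class (NOT Graphbook's API): records lifecycle calls."""

    def __init__(self):
        self.calls = []

    def on_start(self):
        self.calls.append("on_start")

    def on_item(self, item):
        self.calls.append("on_item")
        return item

    def on_end(self):
        self.calls.append("on_end")


class AddOne(Node):
    def on_item(self, item):
        super().on_item(item)
        return item + 1


class Double(Node):
    def on_item(self, item):
        super().on_item(item)
        return item * 2


def run(nodes, edges, items):
    """Drive the nodes in dependency order.

    nodes: dict of name -> Node
    edges: dict of name -> set of upstream names (at most one per node here)
    items: inputs fed to nodes that have no upstream node
    Returns the outputs of the last node in topological order.
    """
    order = list(TopologicalSorter(edges).static_order())
    for name in order:
        nodes[name].on_start()
    results = []
    for item in items:
        outputs = {}
        for name in order:
            upstream = list(edges.get(name, ()))
            value = outputs[upstream[0]] if upstream else item
            outputs[name] = nodes[name].on_item(value)
        results.append(outputs[order[-1]])
    for name in order:
        nodes[name].on_end()
    return results


nodes = {"add_one": AddOne(), "double": Double()}
edges = {"add_one": set(), "double": {"add_one"}}
results = run(nodes, edges, [1, 2])
print(results)
```

The user code only defines what happens per item; the order of calls and the wiring between nodes are decided by the runner, which is the inversion of control the paragraph above describes.
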
<p align="center">
<a href="https://graphbook.ai">
<img src="https://media.githubusercontent.com/media/rsamf/public/main/docs/overview/huggingface-pipeline-demo.gif" alt="Huggingface Pipeline Demo" width="512">
</a>
<div align="center">Build, run, monitor!</div>
</p>
## Status
Graphbook is in a very early stage of development, so expect minor bugs and rapid design changes through the coming releases. If you would like to [report a bug](https://github.com/graphbookai/graphbook/issues/new?template=bug_report.md&labels=bug) or [request a feature](https://github.com/graphbookai/graphbook/issues/new?template=feature_request.md&labels=enhancement), please feel free to do so. We aim to make Graphbook serve our users in the best way possible.
### Current Features
- Graph-based visual editor to experiment with and create complex ML workflows
- Caches outputs and only re-executes the parts of the workflow that change between executions
- UI monitoring components for logs and outputs per node
- Custom buildable nodes with Python via OOP and functional patterns
- Automatic batching for PyTorch tensors
- Multiprocessing I/O to and from disk and network
- Customizable multiprocessing functions
- Ability to execute entire graphs, or individual subgraphs/nodes
- Ability to execute singular batches of data
- Ability to pause graph execution
- Basic nodes for filtering, loading, and saving outputs
- Node grouping and subflows
- Autosaving and shareable serialized workflow files
- Registers node code changes without needing a restart
- Monitorable CPU and GPU resource usage
- Human-in-the-loop prompting for interactivity and manual control during DAG execution
- (BETA) Third Party Plugins *
\* We plan to add documentation for the community to build plugins, but for now, examples can be seen at
[example_plugin](example_plugin) and
[graphbook-huggingface](https://github.com/graphbookai/graphbook-huggingface).
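
The output caching listed under Current Features can be pictured with a small, self-contained sketch. This is an illustration of the general technique, not Graphbook's implementation: the `cache_key` and `execute` functions and the per-node version strings are hypothetical. Each node's cache key is derived from its own version plus the keys of its upstream nodes, so changing one node invalidates that node and everything downstream of it, while unchanged upstream nodes are served from the cache.

```python
import hashlib
from graphlib import TopologicalSorter


def cache_key(name, version, upstream_keys):
    """Hash a node's identity together with the keys of everything upstream."""
    payload = "|".join([name, version, *upstream_keys])
    return hashlib.sha256(payload.encode()).hexdigest()


def execute(funcs, versions, edges, cache):
    """Run nodes in dependency order, reusing cached outputs where possible.

    funcs:    dict of name -> callable taking the upstream outputs as arguments
    versions: dict of name -> string that changes whenever the node's code changes
    edges:    dict of name -> set of upstream names
    cache:    dict of cache key -> output, shared across executions
    Returns (outputs, names of the nodes that were actually executed).
    """
    keys, outputs, executed = {}, {}, []
    for name in TopologicalSorter(edges).static_order():
        upstream = sorted(edges.get(name, ()))
        keys[name] = cache_key(name, versions[name], [keys[u] for u in upstream])
        if keys[name] not in cache:
            cache[keys[name]] = funcs[name](*[outputs[u] for u in upstream])
            executed.append(name)
        outputs[name] = cache[keys[name]]
    return outputs, executed


funcs = {
    "load": lambda: [1, 2, 3],
    "scale": lambda xs: [x * 10 for x in xs],
    "total": lambda xs: sum(xs),
}
versions = {"load": "v1", "scale": "v1", "total": "v1"}
edges = {"load": set(), "scale": {"load"}, "total": {"scale"}}
cache = {}

_, first = execute(funcs, versions, edges, cache)   # everything runs
_, second = execute(funcs, versions, edges, cache)  # nothing runs

# Simulate editing the "scale" node: only it and its downstream node re-run.
funcs["scale"] = lambda xs: [x * 100 for x in xs]
versions["scale"] = "v2"
outputs, third = execute(funcs, versions, edges, cache)
print(first, second, third, outputs["total"])
```
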
### Planned Features
- A `graphbook run` command to execute workflows from the CLI
- All-code workflows, so users never have to leave their IDE
- Remote subgraphs for scaling workflows on other Graphbook services
- And many optimizations for large data processing workloads
### Supported OS
The following operating systems are supported in order of most to least recommended:
- Linux
- macOS
- Windows (not recommended) *
\* There may be issues with running Graphbook on Windows. With limited resources, we can only focus testing and development on Linux.
## Getting Started
### Install from PyPI
1. `pip install graphbook`
1. `graphbook`
1. Visit http://localhost:8005
### Install with Docker
1. Pull and run the image:
```bash
docker run --rm -p 8005:8005 -v $PWD/workflows:/app/workflows rsamf/graphbook:latest
```
1. Visit http://localhost:8005
### Recommended Plugins
* [Huggingface](https://github.com/graphbookai/graphbook-huggingface)
Visit the [docs](https://docs.graphbook.ai) to learn more about creating custom nodes and workflows with Graphbook.
## Examples
We continually post examples of workflows and custom nodes in our [examples repo](https://github.com/graphbookai/graphbook-examples).
## Collaboration
Graphbook is in active development and very much welcomes contributors. This is a guide on how to run Graphbook in development mode. If you are simply using Graphbook, view the [Getting Started](#getting-started) section.
### Run Graphbook in Development Mode
You can use any other virtual environment solution, but we highly advise using [Poetry](https://python-poetry.org/docs/), since our dependencies are specified in Poetry's format.
1. Clone the repo and `cd graphbook`
1. `poetry install --with dev`
1. `poetry shell`
1. `python graphbook/main.py`
1. `cd web`
1. `npm install`
1. `npm run dev`
Raw data
{
"_id": null,
"home_page": "https://graphbook.ai",
"name": "graphbook",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": "ml, workflow, framework, pytorch, data science, machine learning, ai",
"author": "Richard Franklin",
"author_email": "rsamf@graphbook.ai",
"download_url": "https://files.pythonhosted.org/packages/b4/09/3446652101696c1b220bb3923fdc5d419649a3c7eb2d56c5c3e150f7ca48/graphbook-0.9.1.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "An extensible ML workflow framework built for data scientists and ML engineers.",
"version": "0.9.1",
"project_urls": {
"Documentation": "https://docs.graphbook.ai",
"Homepage": "https://graphbook.ai",
"Repository": "https://github.com/graphbookai/graphbook"
},
"split_keywords": [
"ml",
" workflow",
" framework",
" pytorch",
" data science",
" machine learning",
" ai"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "b199dc3c1f5a721ef2ff984b2e1a560c0ad2207fceea7f5b7af39a437438957c",
"md5": "7b4cd8acf9942fafcafc1faaf1ccd56c",
"sha256": "1b00385a2f59d73deb2f68ec1c19150fb056024c76030231d9726f70bb31932f"
},
"downloads": -1,
"filename": "graphbook-0.9.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7b4cd8acf9942fafcafc1faaf1ccd56c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 1122637,
"upload_time": "2024-11-15T22:45:26",
"upload_time_iso_8601": "2024-11-15T22:45:26.582779Z",
"url": "https://files.pythonhosted.org/packages/b1/99/dc3c1f5a721ef2ff984b2e1a560c0ad2207fceea7f5b7af39a437438957c/graphbook-0.9.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "b4093446652101696c1b220bb3923fdc5d419649a3c7eb2d56c5c3e150f7ca48",
"md5": "8b18d6d1fdf22a8eef962af6a949f250",
"sha256": "a0e292e79c84987760fc82ee65d53050291cfeeae85fa133ef89dc2391237cb2"
},
"downloads": -1,
"filename": "graphbook-0.9.1.tar.gz",
"has_sig": false,
"md5_digest": "8b18d6d1fdf22a8eef962af6a949f250",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 1114863,
"upload_time": "2024-11-15T22:45:28",
"upload_time_iso_8601": "2024-11-15T22:45:28.483554Z",
"url": "https://files.pythonhosted.org/packages/b4/09/3446652101696c1b220bb3923fdc5d419649a3c7eb2d56c5c3e150f7ca48/graphbook-0.9.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-15 22:45:28",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "graphbookai",
"github_project": "graphbook",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "graphbook"
}