graphbook

* Name: graphbook
* Version: 0.11.1
* Home page: https://graphbook.ai
* Summary: The AI-driven data pipeline and workflow framework for data scientists and machine learning engineers.
* Upload time: 2025-02-07 23:54:31
* Author: Richard Franklin
* Requires Python: `<4.0,>=3.9`
* License: MIT
* Keywords: ml, workflow, framework, pytorch, data science, machine learning, ai

<p align="center">
  <a href="https://graphbook.ai">
    <img src="docs/_static/graphbook.png" alt="Logo" width=256>
  </a>

  <h1 align="center">Graphbook</h1>

  <p align="center">
    <a href="https://github.com/graphbookai/graphbook/blob/main/LICENSE">
      <img alt="GitHub License" src="https://img.shields.io/github/license/graphbookai/graphbook">
    </a>
    <a href="https://github.com/graphbookai/graphbook/actions/workflows/pypi.yml">
      <img alt="GitHub Actions Workflow Status" src="https://img.shields.io/github/actions/workflow/status/graphbookai/graphbook/pypi.yml">
    </a>
    <a href="https://hub.docker.com/r/rsamf/graphbook">
      <img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/rsamf/graphbook">
    </a>
    <a href="https://www.pepy.tech/projects/graphbook">
      <img alt="PyPI Downloads" src="https://static.pepy.tech/badge/graphbook">
    </a>
    <a href="https://pypi.org/project/graphbook/">
      <img alt="PyPI - Version" src="https://img.shields.io/pypi/v/graphbook">
    </a>
  </p>
  <div align="center">
    <a href="https://discord.gg/XukMUDmjnt">
      <img alt="Join Discord" src="https://img.shields.io/badge/Join%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white">
    </a>
  </div>
  <p align="center">
    <a href="https://discord.gg/XukMUDmjnt">
      <img alt="Discord" src="https://img.shields.io/discord/1199855707567177860">
    </a>
  </p>

  <p align="center">
    The Framework for AI-driven Data Pipelines
    <br>
    <a href="https://github.com/graphbookai/graphbook/issues/new?template=bug_report.md&labels=bug">Report bug</a>
    ·
    <a href="https://github.com/graphbookai/graphbook/issues/new?template=feature_request.md&labels=enhancement">Request feature</a>
  </p>

  <p align="center">
    <a href="#overview">Overview</a> •
    <a href="#status">Status</a> •
    <a href="#getting-started">Getting Started</a> •
    <a href="#examples">Examples</a> •
    <a href="#collaboration">Collaboration</a>
  </p>
</p>

## Overview
Graphbook is a framework for building efficient, interactive, DAG-structured AI data pipelines and workflows composed of nodes written in Python. It provides common ML processing features, such as multiprocessing I/O and automatic batching for PyTorch tensors, along with a web-based UI for assembling, monitoring, and executing data processing workflows. You can use it to prepare training data for custom ML models, experiment with custom-trained or off-the-shelf models, and build ML-based ETL applications. Custom nodes are written in Python, and Graphbook, acting as a framework, calls lifecycle methods on those nodes.

Try out the [demo](https://huggingface.co/spaces/rsamf/rmbg-graphbook)!
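
To make the idea of a DAG of nodes with framework-called lifecycle methods concrete, here is a minimal, self-contained sketch in plain Python. It does not use Graphbook's API: the `Node`, `Source`, and `Map` classes, the `on_start`/`process`/`on_end` method names, and the `run_dag` helper are all hypothetical names chosen for illustration. See the docs for the real node interface.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class Node:
    """Hypothetical base class for illustration; not Graphbook's actual API."""

    def __init__(self, name):
        self.name = name
        self.events = []  # records which lifecycle methods were called

    def on_start(self):
        self.events.append("start")

    def process(self, inputs):
        raise NotImplementedError

    def on_end(self):
        self.events.append("end")


class Source(Node):
    """Emits a fixed list of items and ignores its inputs."""

    def __init__(self, name, items):
        super().__init__(name)
        self.items = items

    def process(self, inputs):
        return list(self.items)


class Map(Node):
    """Applies a function to every item produced by its upstream nodes."""

    def __init__(self, name, fn):
        super().__init__(name)
        self.fn = fn

    def process(self, inputs):
        return [self.fn(x) for upstream in inputs for x in upstream]


def run_dag(nodes, edges):
    """Run nodes in dependency order.

    nodes: dict mapping node name -> Node
    edges: dict mapping node name -> list of upstream node names
    """
    graph = {name: edges.get(name, []) for name in nodes}
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        nodes[name].on_start()
    outputs = {}
    for name in order:
        inputs = [outputs[up] for up in graph[name]]
        outputs[name] = nodes[name].process(inputs)
    for name in order:
        nodes[name].on_end()
    return outputs


nodes = {
    "load": Source("load", [1, 2, 3]),
    "double": Map("double", lambda x: x * 2),
    "inc": Map("inc", lambda x: x + 1),
}
edges = {"double": ["load"], "inc": ["double"]}
outputs = run_dag(nodes, edges)
```

The point of the sketch is the inversion of control: the node author only fills in methods, and the runner decides when to call them and in what order.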

<p align="center">
  <a href="https://graphbook.ai">
    <img src="https://media.githubusercontent.com/media/rsamf/public/main/docs/overview/huggingface-pipeline-demo.gif" alt="Huggingface Pipeline Demo" width="512">
  </a>
  <div align="center">Build, run, monitor!</div>
</p>

### Applications
* Clean and curate custom large-scale datasets
* [Demo ML apps](https://huggingface.co/spaces/rsamf/rmbg-graphbook) on Hugging Face Spaces
* Build and deliver customizable no-code or hybrid low-code ML apps and services
* Quickly experiment with different ML models and adjust hyperparameters
* Maximize GPU utilization, parallelize IO, and scale across clusters
* Wrap your Ray DAGs with a frontend for end users
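
Batching is one common way a pipeline keeps an accelerator busy: items are grouped so that a model processes several at once. The sketch below shows the grouping step only, in plain Python and without PyTorch. The `batched` helper is a hypothetical name, not a Graphbook function, and Graphbook's own batching may work differently.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield consecutive lists of up to batch_size items from any iterable."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


batches = list(batched(range(5), 2))
```

The last batch can be smaller than `batch_size`, so downstream code should not assume a fixed batch length.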

## Status
Graphbook is in a very early stage of development, so expect minor bugs and rapid design changes through the coming releases. If you would like to [report a bug](https://github.com/graphbookai/graphbook/issues/new?template=bug_report.md&labels=bug) or [request a feature](https://github.com/graphbookai/graphbook/issues/new?template=feature_request.md&labels=enhancement), please feel free to do so. We aim to make Graphbook serve our users in the best way possible.

### Current Features
* Graph-based visual editor for experimenting with and creating complex ML workflows
* Caches outputs and only re-executes the parts of the workflow that change between executions
* UI monitoring components for logs and outputs per node
* Custom nodes buildable in Python using OOP or functional patterns
* Automatic batching for PyTorch tensors
* Multiprocessing I/O to and from disk and network
* Customizable multiprocessing functions
* Ability to execute entire graphs, or individual subgraphs/nodes
* Ability to execute singular batches of data
* Ability to pause graph execution
* Basic nodes for filtering, loading, and saving outputs
* Node grouping and subflows
* Autosaving and shareable serialized workflow files
* Registers node code changes without needing a restart
* Monitorable system CPU and GPU resource usage
* Monitorable worker queue sizes for optimal worker scaling
* Human-in-the-loop prompting for interactivity and manual control during DAG execution
* Can switch to threaded processing per client session for demoing apps to multiple simultaneous users
* (BETA) **Now with Ray!** Build all-code workflows and scale pipelines on remote machines
* (BETA) Third Party Plugins *

\* We plan to add documentation for the community to build plugins, but for now, examples can be seen at
[example_plugin](example_plugin) and
[graphbook-huggingface](https://github.com/graphbookai/graphbook-huggingface).
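
The output-caching feature listed above can be pictured with a small sketch. This is not Graphbook's implementation: `CachedPipeline` and its fingerprinting scheme are assumptions made for illustration, the pipeline is a simple linear chain rather than a full DAG, and the fingerprint covers only step names, parameters, and input data (not changes to the step's code).

```python
import hashlib
import json


def _digest(obj):
    """Stable hash of any JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


class CachedPipeline:
    """Hypothetical linear pipeline that skips steps whose inputs are unchanged."""

    def __init__(self):
        self._cache = {}    # step name -> (fingerprint, output)
        self.executed = []  # names of the steps actually run in the last call

    def run(self, steps, data):
        """steps: ordered list of (name, function, params) tuples."""
        self.executed = []
        upstream = _digest(data)
        value = data
        for name, fn, params in steps:
            # A step's fingerprint depends on everything upstream of it,
            # so a change early in the chain invalidates all later steps.
            fingerprint = _digest([upstream, name, params])
            cached = self._cache.get(name)
            if cached is not None and cached[0] == fingerprint:
                value = cached[1]
            else:
                value = fn(value, **params)
                self._cache[name] = (fingerprint, value)
                self.executed.append(name)
            upstream = fingerprint
        return value


def scale(xs, factor):
    return [x * factor for x in xs]


def offset(xs, amount):
    return [x + amount for x in xs]


pipeline = CachedPipeline()
steps = [("scale", scale, {"factor": 2}), ("offset", offset, {"amount": 1})]

first = pipeline.run(steps, [1, 2, 3])
first_executed = list(pipeline.executed)   # both steps run

pipeline.run(steps, [1, 2, 3])
second_executed = list(pipeline.executed)  # nothing changed, nothing runs

changed = [("scale", scale, {"factor": 2}), ("offset", offset, {"amount": 10})]
third = pipeline.run(changed, [1, 2, 3])
third_executed = list(pipeline.executed)   # only the modified step runs
```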

### Supported OS
The following operating systems are supported in order of most to least recommended:
- Linux
- Mac
- Windows (not recommended) *

\* There may be issues with running Graphbook on Windows. With limited resources, we can only focus testing and development on Linux.

## Getting Started
### Install from PyPI
1. `pip install graphbook`
1. `graphbook`
1. Visit http://localhost:8005
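
If the install step fails, a quick environment check can help. The package metadata declares `requires_python` as `<4.0,>=3.9`. The sketch below checks the running interpreter against that range and reports the installed version, if any, using only the standard library. The helper names `python_supported` and `installed_version` are hypothetical and not part of Graphbook.

```python
import sys
from importlib import metadata


def python_supported(version_info=sys.version_info):
    """Return True if the interpreter satisfies >=3.9,<4.0."""
    return (3, 9) <= tuple(version_info[:2]) < (4, 0)


def installed_version(package="graphbook"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python supported:", python_supported())
    print("graphbook version:", installed_version())
```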

### Install with Docker
1. Pull and run the image
    ```bash
    docker run --rm -p 8005:8005 -v $PWD/workflows:/app/workflows rsamf/graphbook:latest
    ```
1. Visit http://localhost:8005

### Recommended Plugins
* [Graphbook Hugging Face](https://github.com/graphbookai/graphbook-huggingface)

Visit the [docs](https://docs.graphbook.ai) to learn more about creating custom nodes and workflows with Graphbook.

## Examples
We continually post examples of workflows and custom nodes in our [examples repo](https://github.com/graphbookai/graphbook-examples).

## Collaboration
Graphbook is in active development and very much welcomes contributors. This is a guide on how to run Graphbook in development mode. If you are simply using Graphbook, view the [Getting Started](#getting-started) section.

### Run Graphbook in Development Mode
You can use any other virtual environment solution, but we highly advise using [Poetry](https://python-poetry.org/docs/), since our dependencies are specified in Poetry's format.
1. Clone the repo and `cd graphbook`
1. `poetry install --with dev`
1. `poetry shell`
1. `python -m graphbook.main`
1. `cd web`
1. `deno install`
1. `deno run dev`
1. In your browser, navigate to http://localhost:5173 and, in the settings, change your **Graph Server Host** to `localhost:8005`.

            
