graphbook


Name: graphbook
Version: 0.9.1
Home page: https://graphbook.ai
Summary: An extensible ML workflow framework built for data scientists and ML engineers.
Upload time: 2024-11-15 22:45:28
Author: Richard Franklin
Requires Python: <4.0,>=3.9
License: MIT
Keywords: ml, workflow, framework, pytorch, data science, machine learning, ai
            <p align="center">
  <a href="https://graphbook.ai">
    <img src="docs/_static/graphbook.png" alt="Logo" width=256>
  </a>

  <h1 align="center">Graphbook</h1>

  <p align="center">
    <a href="https://github.com/graphbookai/graphbook/blob/main/LICENSE">
      <img alt="GitHub License" src="https://img.shields.io/github/license/graphbookai/graphbook">
    </a>
    <a href="https://github.com/graphbookai/graphbook/actions/workflows/pypi.yml">
      <img alt="GitHub Actions Workflow Status" src="https://img.shields.io/github/actions/workflow/status/graphbookai/graphbook/pypi.yml">
    </a>
    <a href="https://hub.docker.com/r/rsamf/graphbook">
      <img alt="Docker Pulls" src="https://img.shields.io/docker/pulls/rsamf/graphbook">
    </a>
    <a href="https://www.pepy.tech/projects/graphbook">
      <img alt="PyPI Downloads" src="https://static.pepy.tech/badge/graphbook">
    </a>
    <a href="https://pypi.org/project/graphbook/">
      <img alt="PyPI - Version" src="https://img.shields.io/pypi/v/graphbook">
    </a>
  </p>
  <div align="center">
    <a href="https://discord.gg/XukMUDmjnt">
      <img alt="Join Discord" src="https://img.shields.io/badge/Join%20our%20Discord-5865F2?style=for-the-badge&logo=discord&logoColor=white">
    </a>
  </div>
  <p align="center">
    <a href="https://discord.gg/XukMUDmjnt">
      <img alt="Discord" src="https://img.shields.io/discord/1199855707567177860">
    </a>
  </p>

  <p align="center">
    The ML workflow framework
    <br>
    <a href="https://github.com/graphbookai/graphbook/issues/new?template=bug_report.md&labels=bug">Report bug</a>
    ·
    <a href="https://github.com/graphbookai/graphbook/issues/new?template=feature_request.md&labels=enhancement">Request feature</a>
  </p>

  <p align="center">
    <a href="#overview">Overview</a> •
    <a href="#status">Status</a> •
    <a href="#getting-started">Getting Started</a> •
    <a href="#examples">Examples</a> •
    <a href="#collaboration">Collaboration</a>
  </p>
</p>

## Overview
Graphbook is a framework for building efficient, visual, DAG-structured ML workflows composed of nodes written in Python. It provides common ML processing features, such as multiprocessing I/O and automatic batching for PyTorch tensors, along with a web-based UI to assemble, monitor, and execute data processing workflows. It can be used to prepare training data for custom ML models, experiment with custom-trained or off-the-shelf models, and build ML-based ETL applications. Custom nodes are written in Python, and Graphbook, acting as a framework, calls lifecycle methods on those nodes.
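
To make the "framework calls lifecycle methods on nodes" idea concrete, here is a minimal, self-contained sketch of that pattern. Every name in it (`Node`, `on_start`, `on_item`, `on_end`, `run`) is hypothetical and chosen for illustration only; it is not Graphbook's actual API, which is described in the docs.

```python
from collections import deque


class Node:
    """Hypothetical base class for illustration; NOT Graphbook's real API."""

    def __init__(self, name):
        self.name = name
        self.downstream = []

    def connect(self, other):
        """Add an edge from this node to `other` and return `other` for chaining."""
        self.downstream.append(other)
        return other

    # Lifecycle hooks: the runner below calls these, user code only overrides them.
    def on_start(self):
        pass

    def on_item(self, item):
        return item

    def on_end(self):
        pass


class Scale(Node):
    """Multiplies every item by a fixed factor."""

    def __init__(self, name, factor):
        super().__init__(name)
        self.factor = factor

    def on_item(self, item):
        return item * self.factor


class Collect(Node):
    """Stores every item it receives."""

    def on_start(self):
        self.items = []

    def on_item(self, item):
        self.items.append(item)
        return item


def run(source, items):
    """Push each item through the DAG rooted at `source`, calling the hooks."""
    # Discover every reachable node once, breadth-first.
    order, seen, queue = [], set(), deque([source])
    while queue:
        node = queue.popleft()
        if id(node) in seen:
            continue
        seen.add(id(node))
        order.append(node)
        queue.extend(node.downstream)

    for node in order:
        node.on_start()
    for item in items:
        pending = deque([(source, item)])
        while pending:
            node, value = pending.popleft()
            output = node.on_item(value)
            # Forward the node's output to each of its downstream nodes.
            for nxt in node.downstream:
                pending.append((nxt, output))
    for node in order:
        node.on_end()


double = Scale("double", 2)
sink = Collect("sink")
double.connect(sink)
run(double, [1, 2, 3])
print(sink.items)  # [2, 4, 6]
```

The user-defined classes never call each other directly; the runner owns the control flow, which is the inversion of control the paragraph above refers to.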

<p align="center">
  <a href="https://graphbook.ai">
    <img src="https://media.githubusercontent.com/media/rsamf/public/main/docs/overview/huggingface-pipeline-demo.gif" alt="Huggingface Pipeline Demo" width="512">
  </a>
  <div align="center">Build, run, monitor!</div>
</p>
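
As a rough illustration of what automatic batching means in the overview, the sketch below groups a stream of items into fixed-size batches. It uses plain Python lists so that it runs without PyTorch, and the helper name `batched` is an assumption made for this example rather than anything Graphbook exposes.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield consecutive lists of up to `batch_size` items from `items`."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


print(list(batched(range(5), 2)))  # [[0, 1], [2, 3], [4]]
```

With tensors, each batch would typically be stacked into a single tensor before being passed to a model; that step is omitted here.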

## Status
Graphbook is in a very early stage of development, so expect minor bugs and rapid design changes through the coming releases. If you would like to [report a bug](https://github.com/graphbookai/graphbook/issues/new?template=bug_report.md&labels=bug) or [request a feature](https://github.com/graphbookai/graphbook/issues/new?template=feature_request.md&labels=enhancement), please feel free to do so. We aim to make Graphbook serve our users in the best way possible.

### Current Features
- Graph-based visual editor to experiment and create complex ML workflows
- Caches outputs and only re-executes parts of the workflow that change between executions
- UI monitoring components for logs and outputs per node
- Custom buildable nodes with Python via OOP and functional patterns
- Automatic batching for PyTorch tensors
- Multiprocessing I/O to and from disk and network
- Customizable multiprocessing functions
- Ability to execute entire graphs, or individual subgraphs/nodes
- Ability to execute singular batches of data
- Ability to pause graph execution
- Basic nodes for filtering, loading, and saving outputs
- Node grouping and subflows
- Autosaving and shareable serialized workflow files
- Registers node code changes without needing a restart
- Monitorable CPU and GPU resource usage
- Human-in-the-loop prompting for interactivity and manual control during DAG execution
- (BETA) Third Party Plugins *

\* We plan to add documentation for building community plugins. For now, examples can be seen at
[example_plugin](example_plugin) and
[graphbook-huggingface](https://github.com/graphbookai/graphbook-huggingface).
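
The caching feature listed above (re-executing only the parts of a workflow that change) can be pictured with the following sketch. It is a simplified model under stated assumptions, not Graphbook's implementation: the pipeline is linear rather than a general DAG, a step counts as "changed" only when its name, its parameters, or anything upstream changes, and the class name `CachedPipeline` is hypothetical.

```python
import hashlib
import json


class CachedPipeline:
    """Linear pipeline that re-runs a step only if it or an upstream step changed."""

    def __init__(self):
        self._cache = {}    # step name -> (cache key, output)
        self.executed = []  # names of the steps actually run by the last call

    def run(self, steps, source):
        """Run `steps`, a list of (name, func, params) tuples, on `source`."""
        self.executed = []
        value = source
        upstream_key = hashlib.sha256(repr(source).encode()).hexdigest()
        for name, func, params in steps:
            # The key covers this step's parameters and everything upstream,
            # so a change anywhere earlier invalidates all later steps.
            payload = json.dumps([upstream_key, name, params], sort_keys=True)
            key = hashlib.sha256(payload.encode()).hexdigest()
            cached = self._cache.get(name)
            if cached is not None and cached[0] == key:
                value = cached[1]
            else:
                value = func(value, **params)
                self._cache[name] = (key, value)
                self.executed.append(name)
            upstream_key = key
        return value


def scale(xs, factor):
    return [x * factor for x in xs]


def shift(xs, offset):
    return [x + offset for x in xs]


pipeline = CachedPipeline()
steps = [("scale", scale, {"factor": 2}), ("shift", shift, {"offset": 1})]
pipeline.run(steps, [1, 2, 3])
print(pipeline.executed)  # ['scale', 'shift']

# Changing only the second step leaves the first step's cached output in use.
steps[1] = ("shift", shift, {"offset": 5})
print(pipeline.run(steps, [1, 2, 3]))  # [7, 9, 11]
print(pipeline.executed)  # ['shift']
```

Note that this sketch does not detect edits to a step's source code; a real system would need to include that in the cache key as well.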

### Planned Features
- A `graphbook run` command to execute workflows from the CLI
- All-code workflows, so users never have to leave their IDE
- Remote subgraphs for scaling workflows on other Graphbook services
- And many optimizations for large data processing workloads

### Supported OS
The following operating systems are supported in order of most to least recommended:
- Linux
- Mac
- Windows (not recommended) *

\* There may be issues with running Graphbook on Windows. With limited resources, we can only focus testing and development on Linux.

## Getting Started
### Install from PyPI
1. `pip install graphbook`
1. `graphbook`
1. Visit http://localhost:8005
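
If the page does not load, a quick check of whether anything is listening on the port can help narrow the problem down. The sketch below assumes the default address from the steps above (`localhost`, port `8005`); the helper name `is_port_open` is made up for this example, and it only confirms that a TCP connection is accepted, not that the listener is Graphbook.

```python
import socket


def is_port_open(host="localhost", port=8005, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("listening" if is_port_open() else "nothing listening on port 8005")
```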

### Install with Docker
1. Pull and run the image
    ```bash
    docker run --rm -p 8005:8005 -v $PWD/workflows:/app/workflows rsamf/graphbook:latest
    ```
1. Visit http://localhost:8005

### Recommended Plugins
* [Huggingface](https://github.com/graphbookai/graphbook-huggingface)

Visit the [docs](https://docs.graphbook.ai) to learn more about creating custom nodes and workflows with Graphbook.

## Examples
We continually post examples of workflows and custom nodes in our [examples repo](https://github.com/graphbookai/graphbook-examples).

## Collaboration
Graphbook is in active development and very much welcomes contributors. This is a guide on how to run Graphbook in development mode. If you are simply using Graphbook, view the [Getting Started](#getting-started) section.

### Run Graphbook in Development Mode
You can use any other virtual environment solution, but it is highly advised to use [poetry](https://python-poetry.org/docs/) since our dependencies are specified in poetry's format.
1. Clone the repo and `cd graphbook`
1. `poetry install --with dev`
1. `poetry shell`
1. `python graphbook/main.py`
1. `cd web`
1. `npm install`
1. `npm run dev`

            

Raw data

            {
    "_id": null,
    "home_page": "https://graphbook.ai",
    "name": "graphbook",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": "ml, workflow, framework, pytorch, data science, machine learning, ai",
    "author": "Richard Franklin",
    "author_email": "rsamf@graphbook.ai",
    "download_url": "https://files.pythonhosted.org/packages/b4/09/3446652101696c1b220bb3923fdc5d419649a3c7eb2d56c5c3e150f7ca48/graphbook-0.9.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "An extensible ML workflow framework built for data scientists and ML engineers.",
    "version": "0.9.1",
    "project_urls": {
        "Documentation": "https://docs.graphbook.ai",
        "Homepage": "https://graphbook.ai",
        "Repository": "https://github.com/graphbookai/graphbook"
    },
    "split_keywords": [
        "ml",
        " workflow",
        " framework",
        " pytorch",
        " data science",
        " machine learning",
        " ai"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b199dc3c1f5a721ef2ff984b2e1a560c0ad2207fceea7f5b7af39a437438957c",
                "md5": "7b4cd8acf9942fafcafc1faaf1ccd56c",
                "sha256": "1b00385a2f59d73deb2f68ec1c19150fb056024c76030231d9726f70bb31932f"
            },
            "downloads": -1,
            "filename": "graphbook-0.9.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7b4cd8acf9942fafcafc1faaf1ccd56c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 1122637,
            "upload_time": "2024-11-15T22:45:26",
            "upload_time_iso_8601": "2024-11-15T22:45:26.582779Z",
            "url": "https://files.pythonhosted.org/packages/b1/99/dc3c1f5a721ef2ff984b2e1a560c0ad2207fceea7f5b7af39a437438957c/graphbook-0.9.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "b4093446652101696c1b220bb3923fdc5d419649a3c7eb2d56c5c3e150f7ca48",
                "md5": "8b18d6d1fdf22a8eef962af6a949f250",
                "sha256": "a0e292e79c84987760fc82ee65d53050291cfeeae85fa133ef89dc2391237cb2"
            },
            "downloads": -1,
            "filename": "graphbook-0.9.1.tar.gz",
            "has_sig": false,
            "md5_digest": "8b18d6d1fdf22a8eef962af6a949f250",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 1114863,
            "upload_time": "2024-11-15T22:45:28",
            "upload_time_iso_8601": "2024-11-15T22:45:28.483554Z",
            "url": "https://files.pythonhosted.org/packages/b4/09/3446652101696c1b220bb3923fdc5d419649a3c7eb2d56c5c3e150f7ca48/graphbook-0.9.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-15 22:45:28",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "graphbookai",
    "github_project": "graphbook",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "graphbook"
}
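
The `sha256` values under `digests` can be used to check a downloaded file. The sketch below is one way to do that with the standard library; the function name `sha256_matches` is chosen for this example, and the file path is whatever location the archive was saved to.

```python
import hashlib


def sha256_matches(path, expected_hex, chunk_size=1 << 20):
    """Return True if the SHA-256 of the file at `path` equals `expected_hex`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()
```

For the sdist listed above, the call would be `sha256_matches("graphbook-0.9.1.tar.gz", "a0e292e79c84987760fc82ee65d53050291cfeeae85fa133ef89dc2391237cb2")`, assuming the file is in the current directory.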
        