fast_dynamic_batcher

Name: fast_dynamic_batcher
Version: 0.2.0
Home page: https://github.com/JeffWigger/FastDynamicBatcher
Summary: FastDynamicBatcher is a library for batching inputs across requests to accelerate machine learning workloads.
Upload time: 2024-12-06 21:00:21
Author: Jeffrey Wigger
Requires Python: >=3.9
License: MIT
Keywords: machine-learning, batching
.. image:: https://github.com/jeffwigger/FastDynamicBatcher/actions/workflows/test_pip.yaml/badge.svg
   :target: https://github.com/JeffWigger/FastDynamicBatcher/actions
   :alt: Workflow Status

Fast Dynamic Batcher
====================

Bundling several ML model inputs into a larger batch is the simplest way to achieve
significant inference speed-ups in ML workloads. The **Fast Dynamic Batcher** library has
been built to make it easy to use such dynamic batches in Python web frameworks like FastAPI. With our
dynamic batcher, you can combine the inputs of several requests into a
single batch, which can then be run more efficiently on GPUs. In our testing, we achieved up to 2.5x more throughput with it.

Example Usage
-------------

To use dynamic batching in FastAPI, you first have to
create a subclass of the ``InferenceModel`` class. Initialize your ML
model in its ``__init__`` method and use it in its ``infer`` method:

.. code-block:: python

   from typing import Any
   from fast_dynamic_batcher.inference_template import InferenceModel


   class DemoModel(InferenceModel):
      def __init__(self):
          super().__init__()
          # Initialize your ML model here

      def infer(self, inputs: list[Any]) -> list[Any]:
          # Run your inputs as a batch for your model
          ml_output = ... # Your inference outputs
          return ml_output
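
For illustration, a filled-in subclass might look like the following sketch. The
PyTorch model and the eight-float input format are assumptions made for this
example, not part of the library:

.. code-block:: python

   from typing import Any

   import torch

   from fast_dynamic_batcher.inference_template import InferenceModel


   class ToyModel(InferenceModel):
      def __init__(self):
          super().__init__()
          # Load or build the model once; it is reused for every batch.
          self.model = torch.nn.Linear(8, 2)
          self.model.eval()

      def infer(self, inputs: list[Any]) -> list[Any]:
          # Stack the per-request inputs (lists of 8 floats) into one batch tensor.
          batch = torch.tensor(inputs, dtype=torch.float32)
          with torch.no_grad():
              logits = self.model(batch)
          # Return one output per input, in the same order as the inputs.
          return logits.argmax(dim=1).tolist()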

Subsequently, use your ``InferenceModel`` subclass to initialize our
``DynBatcher``:

.. code-block:: python

   from contextlib import asynccontextmanager

   from anyio import CapacityLimiter
   from anyio.lowlevel import RunVar
   from fastapi import FastAPI

   from fast_dynamic_batcher.dyn_batcher import DynBatcher


   @asynccontextmanager
   async def lifespan(app: FastAPI):
      RunVar("_default_thread_limiter").set(CapacityLimiter(16))
      global dyn_batcher
      dyn_batcher = DynBatcher(DemoModel, max_batch_size=8, max_delay=0.1)
      yield
      dyn_batcher.stop()

   app = FastAPI(lifespan=lifespan)

   @app.post("/predict/")
   async def predict(
      input_model: YourInputPydanticModel
   ):
      return await dyn_batcher.process_batched(input_model)

The ``DynBatcher`` can be instantiated in the FastAPI lifespan as a global
variable. It can be further customized with the ``max_batch_size``
and ``max_delay`` arguments. Subsequently, use it in your
FastAPI endpoints by registering your inputs via its
``process_batched`` method.
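
The endpoint above leaves ``YourInputPydanticModel`` as a placeholder. A minimal
sketch of such a model, with a hypothetical ``features`` field matching the toy
example earlier, could be:

.. code-block:: python

   from pydantic import BaseModel


   class YourInputPydanticModel(BaseModel):
      # Hypothetical payload: eight floats per request, batched by the DynBatcher.
      features: list[float]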

Our dynamic batching algorithm will then wait either until the number of
queued inputs equals ``max_batch_size``, or until ``max_delay`` seconds have
passed. In the latter case, a batch may contain between 1 and
``max_batch_size`` inputs. Once either condition is met, the batch is
processed by calling the ``infer`` method of your ``InferenceModel``
instance.
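
Conceptually, this policy behaves like the sketch below. It is not the
library's actual implementation, only an illustration of the
``max_batch_size``/``max_delay`` trade-off:

.. code-block:: python

   import asyncio


   async def collect_batch(queue: asyncio.Queue, max_batch_size: int, max_delay: float) -> list:
      # Wait for the first input without a deadline, then start the delay timer.
      batch = [await queue.get()]
      deadline = asyncio.get_running_loop().time() + max_delay
      while len(batch) < max_batch_size:
          remaining = deadline - asyncio.get_running_loop().time()
          if remaining <= 0:
              break
          try:
              batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
          except asyncio.TimeoutError:
              break
      return batch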

Installation
------------

The Fast Dynamic Batcher library can be installed with pip:

.. code-block:: bash

   pip install fast_dynamic_batcher
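
To run the FastAPI example above, you will also need FastAPI itself and an ASGI
server; the choice of uvicorn and the module name ``main.py`` below are
assumptions, not requirements of this library:

.. code-block:: bash

   pip install fastapi uvicorn
   # Serve the app defined in main.py (hypothetical module name)
   uvicorn main:app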


Performance Tests
-----------------

We tested the performance of our dynamic batching solution against a baseline without batching, both on a Colab instance with a T4 GPU and on a laptop with an Intel i7-1250U CPU.
The experiments were conducted with this `testing script <https://github.com/JeffWigger/FastDynamicBatcher/blob/main/test/test_dyn_batcher.py>`_. The results are reported in the table below:

.. list-table:: Performance Experiments
   :widths: 40 30 30
   :header-rows: 1

   * - Hardware
     - No Batching
     - Dynamic Batching (batch size 16)
   * - Colab T4 GPU
     - 7.65s
     - 3.07s
   * - CPU Intel i7-1250U
     - 117.10s
     - 88.47s

On GPUs, which benefit greatly from large batch sizes, we achieved a speed-up of almost 2.5x by creating dynamic batches of size 16. On CPUs, the gains are more modest, with a speed-up of roughly 1.3x.
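
To reproduce a rough comparison yourself, a much simpler load test than the
linked script could look like the following sketch; the endpoint, port, and
payload are assumptions based on the example above:

.. code-block:: python

   import asyncio
   import time

   import httpx


   async def run_load_test(n_requests: int = 128) -> float:
      # Fire n_requests concurrent POSTs at the /predict/ endpoint and time them.
      async with httpx.AsyncClient(base_url="http://127.0.0.1:8000") as client:
          start = time.perf_counter()
          await asyncio.gather(
              *(client.post("/predict/", json={"features": [0.0] * 8}) for _ in range(n_requests))
          )
          return time.perf_counter() - start


   if __name__ == "__main__":
      print(f"Elapsed: {asyncio.run(run_load_test()):.2f}s")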

            
