.. image:: https://github.com/cjrh/deadpool/workflows/Python%20application/badge.svg
    :target: https://github.com/cjrh/deadpool/actions

.. image:: https://coveralls.io/repos/github/cjrh/deadpool/badge.svg?branch=main
    :target: https://coveralls.io/github/cjrh/deadpool?branch=main

.. image:: https://img.shields.io/pypi/pyversions/deadpool-executor.svg
    :target: https://pypi.python.org/pypi/deadpool-executor

.. image:: https://img.shields.io/github/tag/cjrh/deadpool.svg
    :target: https://img.shields.io/github/tag/cjrh/deadpool.svg

.. image:: https://img.shields.io/badge/install-pip%20install%20deadpool--executor-ff69b4.svg
    :target: https://img.shields.io/badge/install-pip%20install%20deadpool--executor-ff69b4.svg

.. image:: https://img.shields.io/pypi/v/deadpool-executor.svg
    :target: https://pypi.org/project/deadpool-executor/

.. image:: https://img.shields.io/badge/calver-YYYY.MM.MINOR-22bfda.svg
    :alt: This project uses calendar-based versioning scheme
    :target: http://calver.org/

.. image:: https://pepy.tech/badge/deadpool-executor
    :alt: Downloads
    :target: https://pepy.tech/project/deadpool-executor

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
    :alt: This project uses the "black" style formatter for Python code
    :target: https://github.com/python/black

.. image:: https://api.securityscorecards.dev/projects/github.com/cjrh/deadpool/badge
    :alt: OpenSSF Scorecard
    :target: https://api.securityscorecards.dev/projects/github.com/cjrh/deadpool

deadpool
========

``Deadpool`` is a process pool that is really hard to kill.

``Deadpool`` is an implementation of the ``Executor`` interface
in the ``concurrent.futures`` standard library. ``Deadpool`` is
a process pool executor, quite similar to the stdlib's
`ProcessPoolExecutor`_.

This document assumes that you are familiar with the stdlib
`ProcessPoolExecutor`_. If you are not, it is important
to understand that ``Deadpool`` makes very specific tradeoffs that
can result in quite different behaviour to the stdlib
implementation.

.. contents::
   :local:
   :backlinks: entry

Installation
------------

The Python package name is *deadpool-executor*, so to install it you
must run ``pip install deadpool-executor``. The import name is
*deadpool*, so in your Python code you must write ``import deadpool``
to use it.
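
In other words, the install name and the import name differ:

.. code-block:: bash

    $ pip install deadpool-executor
    $ python -c "import deadpool"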

Why would I want to use this?
-----------------------------

I created ``Deadpool`` because I became frustrated with the
stdlib `ProcessPoolExecutor`_, and various other community
implementations of process pools. In particular, I had a use-case
that required a high server uptime, but also had variable and
unpredictable memory requirements such that certain tasks could
trigger the `OOM killer`_, often resulting in a "broken" process
pool. I also needed task-specific timeouts that could kill a "hung"
task, which the stdlib executor doesn't provide.

You might wonder, isn't it bad to just kill a task like that?
In my use-case, we had extensive logging and monitoring to alert
us if any tasks failed; but it was paramount that our services
continue to operate even when tasks got killed in OOM scenarios,
or specific tasks took too long. This is the primary trade-off
that ``Deadpool`` offers.

I also tried using the `Pebble <https://github.com/noxdafox/pebble>`_
community process pool. This is a cool project, featuring several
of the properties I've been looking for such as timeouts, and
more resilient operation. However, during testing I found several
occurrences of a mysterious `RuntimeError`_ that caused the Pebble
pool to become broken and no longer accept new tasks.

My goal with ``Deadpool`` is to make a process pool executor that
is impossible to break. The tradeoffs are that I care less about:

- being cross-platform
- optimizing per-task latency

What differs from `ProcessPoolExecutor`_?
-----------------------------------------

``Deadpool`` is generally similar to `ProcessPoolExecutor`_ since it executes
tasks in subprocesses, and implements the standard ``Executor`` abstract
interface. However, it differs in the following ways:

- ``Deadpool`` makes a new subprocess for every task submitted to
  the pool (up to the ``max_workers`` limit). It is like having
  ``max_tasks_per_child == 1`` (a new feature in
  Python 3.11, although it was available in `multiprocessing.Pool`_
  since Python 3.2). I have ideas about making this configurable, but
  for now this is much less important than the overall resilience of
  the pool. This also means that ``Deadpool`` doesn't suffer from
  long-lived subprocesses being affected by memory leaks, usually
  created by native extensions.
- ``Deadpool`` defaults to the `forkserver <https://docs.python.org/3.11/library/multiprocessing.html#contexts-and-start-methods>`_ multiprocessing
  context, unlike the stdlib pool which defaults to ``fork`` on
  Linux. It's just a setting though; you can change it in the same way
  as with the stdlib pool.
- ``Deadpool`` does not keep a pool of processes around indefinitely.
  There will only be as many concurrent processes running as there
  is work to be done, up to the limit set by the ``max_workers``
  parameter; but if there are fewer tasks to be executed, there will
  be fewer active subprocesses. When there are no pending or active
  tasks, there will be *no subprocesses present*. They are created
  on demand as necessary and disappear when not required.
- ``Deadpool`` tasks can have timeouts. When a task hits the timeout,
  the underlying subprocess in the pool is killed with ``SIGKILL``.
  The entire process tree of that subprocess is killed.
- ``Deadpool`` tasks can have priorities. The priority is set in the
  ``submit()`` call. See the examples later in this document for further
  discussion on priorities.
- The shutdown parameters ``wait`` and ``cancel_futures`` can behave
  differently from how they work in `ProcessPoolExecutor`_. This is
  discussed in more detail later in this document.
- If a ``Deadpool`` subprocess in the pool is killed by some
  external actor, for example, the OS runs out of memory and the
  `OOM killer`_ kills a pool subprocess that is using too much memory,
  ``Deadpool`` does not care and further operation is unaffected.
  ``Deadpool`` will not, and indeed cannot, raise
  `BrokenProcessPool <https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#concurrent.futures.process.BrokenProcessPool>`_ or
  `BrokenExecutor <https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#concurrent.futures.BrokenExecutor>`_.
- ``Deadpool`` also allows a ``finalizer``, with corresponding
  ``finalargs``, that will be called after a task is executed on
  a subprocess, but before the subprocess terminates. It is
  analogous to the ``initializer`` and ``initargs`` parameters.
  Just like the ``initializer`` callable, the ``finalizer``
  callable is executed inside the subprocess. It is not guaranteed that
  the finalizer will always run. If a process is killed, e.g. due to a
  timeout or any other reason, the finalizer will not run. The finalizer
  could be used for things like flushing pending monitoring messages,
  such as traces and so on. A brief sketch of this appears after this
  list.
- ``Deadpool`` currently only works on Linux. There isn't any specific
  reason it can't work on other platforms.
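
As an illustration of the ``finalizer`` feature mentioned above: the
``initializer``/``initargs`` and ``finalizer``/``finalargs`` names come
from this document, but the exact constructor call below is a minimal
sketch rather than a definitive API reference.

.. code-block:: python

    import deadpool

    def init_worker(service_name):
        # Runs inside the subprocess before the task starts.
        print(f"starting worker for {service_name}")

    def flush_telemetry(service_name):
        # Runs inside the subprocess after the task completes, but
        # before the subprocess exits. Not guaranteed to run: if the
        # process is killed (e.g. by a timeout), this never executes.
        print(f"flushing telemetry for {service_name}")

    def work():
        return 123

    with deadpool.Deadpool(
        initializer=init_worker,
        initargs=("billing",),
        finalizer=flush_telemetry,
        finalargs=("billing",),
    ) as exe:
        assert exe.submit(work).result() == 123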

Show me some code
-----------------

Simple case
^^^^^^^^^^^

The simple case works exactly the same as with `ProcessPoolExecutor`_:

.. code-block:: python

    import deadpool

    def f():
        return 123

    with deadpool.Deadpool() as exe:
        fut = exe.submit(f)
        result = fut.result()

    assert result == 123

It is intended that all the basic behaviour should "just work" in the
same way, and ``Deadpool`` should be a drop-in replacement for
`ProcessPoolExecutor`_; but there are some subtle differences so you
should read all of this document to see if any of those will affect you.

Timeouts
^^^^^^^^

If a timeout is reached on a task, the subprocess running that task will be
killed, as in ``SIGKILL``. ``Deadpool`` doesn't mind, but your own
application should: if you use timeouts it is likely important that your tasks
be `idempotent <https://en.wikipedia.org/wiki/Idempotence>`_, especially if
your application will restart tasks, or restart them after application deployment,
and other similar scenarios.

.. code-block:: python

    import time
    import pytest
    import deadpool

    def f():
        time.sleep(10.0)

    with deadpool.Deadpool() as exe:
        fut = exe.submit(f, deadpool_timeout=1.0)

        with pytest.raises(deadpool.TimeoutError):
            fut.result()

Handling OOM killed situations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    import deadpool

    def f():
        # Allocate an absurd amount of memory, which should get this
        # subprocess killed by the OOM killer.
        x = list(range(10**100))

    with deadpool.Deadpool() as exe:
        fut = exe.submit(f)

        try:
            result = fut.result()
        except deadpool.ProcessError:
            print("Oh no someone killed my task!")


As long as the OOM killer terminates the subprocess (and not the main process),
which is likely because it'll be your subprocess that is using too much
memory, this will not hurt the pool, and it will be able to receive and
process more tasks.

Design Details
--------------

Typical Example - with timeouts
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Here's a typical example of how code using Deadpool might look. The
output of this code should be similar to the following:

.. code-block:: bash

    $ python examples/entrypoint.py
    ...................xxxxxxxxxxx.xxxxxxx.x.xxxxxxx.x
    $

Each ``.`` is a successfully completed task, and each ``x`` is a task
that timed out. Below is the code for this example.

.. code-block:: python

    import random, time
    import deadpool


    def work():
        time.sleep(random.random() * 4.0)
        print(".", end="", flush=True)
        return 1


    def main():
        with deadpool.Deadpool() as exe:
            futs = (exe.submit(work, deadpool_timeout=2.0) for _ in range(50))
            for fut in deadpool.as_completed(futs):
                try:
                    assert fut.result() == 1
                except deadpool.TimeoutError:
                    print("x", end="", flush=True)


    if __name__ == "__main__":
        main()
        print()

- The work function will be busy for a random time period between 0 and
  4 seconds.
- There is a ``deadpool_timeout`` kwarg given to the ``submit`` method.
  This kwarg is special and will be consumed by Deadpool. You cannot
  use this kwarg name for your own task functions.
- When a task completes, it prints out ``.`` internally. But when a task
  raises a ``deadpool.TimeoutError``, an ``x`` will be printed out instead.
- When a task times out, keep in mind that the underlying process that
  is executing that task is killed, literally with the ``SIGKILL`` signal.

Deadpool tasks have priority
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The example below is similar to the previous one for timeouts. In fact
this example retains the timeouts to show how the different features
compose together. In this example we create tasks with different
priorities, and we change the printed character of each task to show
that higher priority items are executed first.

The code example will print something similar to the following:

.. code-block:: bash

    $ python examples/priorities.py
    !!!!!xxxxxxxxxxx!x..!...x.xxxxxxxx.xxxx.x...xxxxxx

You can see how the ``!`` characters, used for the higher-priority
tasks, appear towards the front, indicating that they were executed
sooner. Below is the code.

.. code-block:: python

    import random, time
    import deadpool


    def work(symbol):
        time.sleep(random.random() * 4.0)
        print(symbol, end="", flush=True)
        return 1


    def main():
        with deadpool.Deadpool(max_backlog=100) as exe:
            futs = []
            for _ in range(25):
                fut = exe.submit(work, ".",deadpool_timeout=2.0, deadpool_priority=10)
                futs.append(fut)
                fut = exe.submit(work, "!",deadpool_timeout=2.0, deadpool_priority=0)
                futs.append(fut)

            for fut in deadpool.as_completed(futs):
                try:
                    assert fut.result() == 1
                except deadpool.TimeoutError:
                    print("x", end="", flush=True)


    if __name__ == "__main__":
        main()
        print()

- When the tasks are submitted, they are given a priority. The default
  value for the ``deadpool_priority`` parameter is 0, but here we'll
  write them out explicitly. Half of the tasks will have priority 10 and
  half will have priority 0.
- A lower value for the ``deadpool_priority`` parameter means a **higher**
  priority. The highest allowed priority is 0. Negative priority values
  are not allowed.
- I also specified the ``max_backlog`` parameter when creating the
  Deadpool instance. This is discussed in more detail next, but quickly:
  task priority can only be enforced on what is in the submitted backlog
  of tasks, and the ``max_backlog`` parameter controls the depth of that
  queue. If ``max_backlog`` is too low, then the window of prioritization
  will not include tasks submitted later which might have higher priorities
  than earlier-submitted tasks. The ``submit`` call will in fact block
  if the ``max_backlog`` depth has been reached.

Controlling the backlog of submitted tasks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, the ``max_backlog`` parameter is set to 5. This parameter
sets the size of the "submit queue". The submit queue is the place
where submitted tasks are held before they are executed in background
processes.

If the submit queue is large (a high ``max_backlog``), a large number
of tasks can be added to the system with the ``submit`` method, even
before any tasks have finished executing. Conversely, a low
``max_backlog`` parameter means that the submit queue will fill up
faster. If the submit queue is full, the next call to ``submit`` will
block.

This kind of blocking is fine, and typically desired: the backpressure
from blocking controls the amount of work in flight. A smaller
``max_backlog`` therefore also limits the amount of memory in use
during the execution of all the tasks.
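
Here is a minimal sketch of that backpressure in action (the parameter
values are illustrative):

.. code-block:: python

    import time
    import deadpool

    def work(i):
        time.sleep(1.0)
        return i

    # With one worker and a small submit queue, the first few calls to
    # submit() return immediately; once the queue is full, each further
    # submit() blocks until a slot opens, limiting work in flight.
    with deadpool.Deadpool(max_workers=1, max_backlog=2) as exe:
        for i in range(10):
            exe.submit(work, i, deadpool_timeout=5.0)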

However, if you nevertheless accumulate the returned futures, as the
priorities example above does, that list of futures will contribute to
memory growth. If you have a large amount of work, it is better to set
a *callback* function on each future rather than processing them by
iterating over ``as_completed``.

The example below illustrates this technique for keeping memory
consumption down:

.. code-block:: python

    import random, time
    import deadpool


    def work():
        time.sleep(random.random() * 4.0)
        print(".", end="", flush=True)
        return 1


    def cb(fut):
        try:
            assert fut.result() == 1
        except deadpool.TimeoutError:
            print("x", end="", flush=True)


    def main():
        with deadpool.Deadpool() as exe:
            for _ in range(50):
                exe.submit(work, deadpool_timeout=2.0).add_done_callback(cb)


    if __name__ == "__main__":
        main()
        print()


With this callback-based design, we no longer have an accumulation of futures
in a list. We get the same kind of output as in the "typical example" from
earlier:

.. code-block:: bash

    $ python examples/callbacks.py
    .....xxx.xxxxxxxxx.........x..xxxxx.x....x.xxxxxxx


Speaking of callbacks, the customized ``Future`` class used by Deadpool
lets you set a callback for when the task begins executing on a real
system process. That can be configured like so:

.. code-block:: python

    with deadpool.Deadpool() as exe:
        f = exe.submit(work)

        def cb(fut: deadpool.Future):
            print(f"My task is running on process {fut.pid}")

        f.add_pid_callback(cb)

More about shutdown
^^^^^^^^^^^^^^^^^^^

In the documentation for ProcessPoolExecutor_, the following function
signature is given for the shutdown_ method of the executor interface:

.. code-block:: python

    shutdown(wait=True, *, cancel_futures=False)

I want to honor this, but it presents some difficulties because the
semantics of the ``wait`` and ``cancel_futures`` parameters need to be
somewhat different for Deadpool.

In Deadpool, this is what the combinations of those flags mean:

.. csv-table:: Shutdown flags
   :header: ``wait``, ``cancel_futures``, ``effect``
   :widths: 10, 10, 80
   :align: left

   ``True``, ``True``, "Wait for already-running tasks to complete; the
   ``shutdown()`` call will unblock (return) when they're done. Cancel
   all pending tasks that are in the submit queue, but have not yet started
   running. The ``fut.cancelled()`` method will return ``True`` for such
   cancelled tasks."
   ``True``, ``False``, "Wait for already-running tasks to complete.
   Pending tasks in the
   submit queue that have not yet started running will *not* be cancelled, and
   will all continue to execute. The ``shutdown()`` call will return only
   after all submitted tasks have completed. "
   ``False``, ``True``, "Already-running tasks **will be cancelled** and this
   means the underlying subprocesses executing these tasks will receive
   SIGKILL. Pending tasks on the submit queue that have not yet started
   running will also be cancelled."
   ``False``, ``False``, "This is a strange one. What to do if the caller
   doesn't want to wait, but also doesn't want to cancel things? In this
   case, already-running tasks will be allowed to complete, but pending
   tasks on the submit queue will be cancelled. This is the same outcome
   as ``wait==True`` and ``cancel_futures==True``. An alternative design
   might have been to allow all tasks, both running and pending, to just
   keep going in the background even after the ``shutdown()`` call
   returns. Does anyone have a use-case for this?"
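
For example, calling ``shutdown`` explicitly with ``wait=False`` and
``cancel_futures=True`` might look like the following sketch (the exact
state reported for the killed running task is not specified here, so
the example only checks the pending one):

.. code-block:: python

    import time
    import deadpool

    def work():
        time.sleep(60.0)

    exe = deadpool.Deadpool(max_workers=1)
    running = exe.submit(work)   # starts on the single worker process
    pending = exe.submit(work)   # waits in the submit queue

    # wait=False, cancel_futures=True: the running task's subprocess
    # receives SIGKILL, and the pending task is cancelled before it
    # ever starts running.
    exe.shutdown(wait=False, cancel_futures=True)
    print(pending.cancelled())   # True, per the table above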

If you're using ``Deadpool`` as a context manager, you might be wondering
how exactly to set these parameters in the ``shutdown`` call, since that
call is made for you automatically when the context manager exits.

For this, Deadpool provides additional parameters that can be provided
when creating the instance:

.. code-block:: python

   # This is pseudocode
   import deadpool

   with deadpool.Deadpool(
           shutdown_wait=True,
           shutdown_cancel_futures=True,
   ) as exe:
       fut = exe.submit(...)


.. _shutdown: https://docs.python.org/3/library/concurrent.futures.html?highlight=brokenprocesspool#concurrent.futures.Executor.shutdown
.. _ProcessPoolExecutor: https://docs.python.org/3/library/concurrent.futures.html?highlight=broken%20process%20pool#processpoolexecutor
.. _RuntimeError: https://github.com/noxdafox/pebble/issues/42#issuecomment-551245730
.. _OOM killer: https://en.wikipedia.org/wiki/Out_of_memory#Out_of_memory_management
.. _multiprocessing.Pool: https://docs.python.org/3.11/library/multiprocessing.html#multiprocessing.pool.Pool

            
