requests-futures

Name: requests-futures
Version: 1.0.2
Home page: https://github.com/ross/requests-futures
Summary: Asynchronous Python HTTP for Humans.
Upload time: 2024-11-15 22:14:51
Author: Ross McFarland
Maintainer: None
Docs URL: None
Requires Python: None
License: Apache License v2
Requirements: certifi, charset-normalizer, idna, requests, urllib3

Asynchronous Python HTTP Requests for Humans
============================================

.. image:: https://travis-ci.org/ross/requests-futures.svg?branch=master
        :target: https://travis-ci.org/ross/requests-futures

Small add-on for the python requests_ http library. Makes use of python 3.2's
`concurrent.futures`_ or the backport_ for prior versions of python.

The additional API and changes are minimal and strive to avoid surprises.

The following synchronous code:

.. code-block:: python

    from requests import Session

    session = Session()
    # first request starts and blocks until finished
    response_one = session.get('http://httpbin.org/get')
    # second request starts once first is finished
    response_two = session.get('http://httpbin.org/get?foo=bar')
    # both requests are complete
    print('response one status: {0}'.format(response_one.status_code))
    print(response_one.content)
    print('response two status: {0}'.format(response_two.status_code))
    print(response_two.content)

Can be translated to make use of futures, and thus be asynchronous, by
creating a FuturesSession and capturing the returned Future in place of the
Response. The Response can then be retrieved by calling the result method on
the Future:

.. code-block:: python

    from requests_futures.sessions import FuturesSession

    session = FuturesSession()
    # first request is started in background
    future_one = session.get('http://httpbin.org/get')
    # second request is started immediately
    future_two = session.get('http://httpbin.org/get?foo=bar')
    # wait for the first request to complete, if it hasn't already
    response_one = future_one.result()
    print('response one status: {0}'.format(response_one.status_code))
    print(response_one.content)
    # wait for the second request to complete, if it hasn't already
    response_two = future_two.result()
    print('response two status: {0}'.format(response_two.status_code))
    print(response_two.content)

By default a ThreadPoolExecutor is created with 8 workers. If you would like
to adjust that value or share an executor across multiple sessions, you can
provide one to the FuturesSession constructor.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ThreadPoolExecutor(max_workers=10))
    # ...

As a shortcut, if you only need to increase the number of workers, you can
pass `max_workers` straight to the `FuturesSession` constructor:

.. code-block:: python

    from requests_futures.sessions import FuturesSession
    session = FuturesSession(max_workers=10)

FuturesSession will use an existing session object if one is supplied:

.. code-block:: python

    from requests import session
    from requests_futures.sessions import FuturesSession
    my_session = session()
    future_session = FuturesSession(session=my_session)

That's it. The API of requests.Session is preserved without any modifications
beyond returning a Future rather than a Response. As with all futures,
exceptions are shifted (raised) at the future.result() call, so try/except
blocks should be moved there.
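
Since exceptions only surface when `result()` is called, error handling wraps
that call rather than the call to `session.get` itself. A minimal sketch (the
URL and timeout value are only illustrative):

.. code-block:: python

    from requests.exceptions import RequestException
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()
    # the timeout is passed through to requests as usual
    future = session.get('http://httpbin.org/delay/10', timeout=1)
    try:
        # any connection error or timeout raised while the request ran in the
        # background is re-raised here
        response = future.result()
    except RequestException as exc:
        print('request failed: {0}'.format(exc))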


Tying extra information to the request/response
===============================================

The most common piece of information needed is the URL of the request. This can
be accessed without any extra steps using the `request` property of the
response object.

.. code-block:: python

    from concurrent.futures import as_completed
    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    futures=[session.get(f'http://httpbin.org/get?{i}') for i in range(3)]

    for future in as_completed(futures):
        resp = future.result()
        pprint({
            'url': resp.request.url,
            'content': resp.json(),
        })

There are situations in which you may want to tie additional information to a
request/response. There are a number of ways to go about this; the simplest is
to attach the additional information to the future object itself.

.. code-block:: python

    from concurrent.futures import as_completed
    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    futures=[]
    for i in range(3):
        future = session.get('http://httpbin.org/get')
        future.i = i
        futures.append(future)

    for future in as_completed(futures):
        resp = future.result()
        pprint({
            'i': future.i,
            'content': resp.json(),
        })
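
Another option, using nothing beyond the standard `concurrent.futures`
done-callback API, is to bind the extra information into a callback with
`functools.partial`. A minimal sketch; note the callback runs in the worker
thread that completed the request:

.. code-block:: python

    from functools import partial
    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    def handle(i, future):
        # called with the finished future; result() will not block here
        resp = future.result()
        pprint({
            'i': i,
            'content': resp.json(),
        })

    for i in range(3):
        future = session.get('http://httpbin.org/get')
        future.add_done_callback(partial(handle, i))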

Canceling queued requests (a.k.a. cleaning up after yourself)
==============================================================

If you know that you won't be needing any additional responses from futures that
haven't yet resolved, it's a good idea to cancel those requests. You can do this
by using the session as a context manager:

.. code-block:: python

    from requests_futures.sessions import FuturesSession
    with FuturesSession(max_workers=1) as session:
        future = session.get('https://httpbin.org/get')
        future2 = session.get('https://httpbin.org/delay/10')
        future3 = session.get('https://httpbin.org/delay/10')
        response = future.result()

In this example, the second or third request will be skipped, saving time and
resources that would otherwise be wasted.
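
The same cleanup can also be done explicitly with the standard
`Future.cancel()` method; a cancel only succeeds for a request that is still
queued and has not been picked up by a worker. A minimal sketch:

.. code-block:: python

    from requests_futures.sessions import FuturesSession

    session = FuturesSession(max_workers=1)
    future = session.get('https://httpbin.org/get')
    future2 = session.get('https://httpbin.org/delay/10')
    response = future.result()
    # True if the queued request never started and was dropped, False otherwise
    print(future2.cancel())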

Iterating over a list of requests responses
===========================================

Without preserving the order of the requests:

.. code-block:: python

    from concurrent.futures import as_completed
    from requests_futures.sessions import FuturesSession
    with FuturesSession() as session:
        futures = [session.get('https://httpbin.org/delay/{}'.format(i % 3)) for i in range(10)]
        for future in as_completed(futures):
            resp = future.result()
            print(resp.json()['url'])
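
Preserving the order of the requests is simply a matter of iterating over the
futures list itself rather than using `as_completed`; a minimal sketch:

.. code-block:: python

    from requests_futures.sessions import FuturesSession
    with FuturesSession() as session:
        futures = [session.get('https://httpbin.org/delay/{}'.format(i % 3)) for i in range(10)]
        # results come back in the order the requests were issued; each
        # result() call blocks until that particular request has finished
        for future in futures:
            resp = future.result()
            print(resp.json()['url'])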

Working in the Background
=========================

Additional processing can be done in the background using requests' hooks_
functionality. This can be useful for shifting work out of the foreground;
JSON parsing is a simple example.

.. code-block:: python

    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    session = FuturesSession()

    def response_hook(resp, *args, **kwargs):
        # parse the json storing the result on the response object
        resp.data = resp.json()

    future = session.get('http://httpbin.org/get', hooks={
        'response': response_hook,
    })
    # do some other stuff, send some more requests while this one works
    response = future.result()
    print('response status {0}'.format(response.status_code))
    # data will have been attached to the response object in the background
    pprint(response.data)

Hooks can also be applied to the session.

.. code-block:: python

    from pprint import pprint
    from requests_futures.sessions import FuturesSession

    def response_hook(resp, *args, **kwargs):
        # parse the json storing the result on the response object
        resp.data = resp.json()

    session = FuturesSession()
    session.hooks['response'] = response_hook

    future = session.get('http://httpbin.org/get')
    # do some other stuff, send some more requests while this one works
    response = future.result()
    print('response status {0}'.format(response.status_code))
    # data will have been attached to the response object in the background
    pprint(response.data)

A more advanced example that adds an `elapsed` property to all requests.

.. code-block:: python

    from pprint import pprint
    from requests_futures.sessions import FuturesSession
    from time import time


    class ElapsedFuturesSession(FuturesSession):

        def request(self, method, url, hooks=None, *args, **kwargs):
            start = time()
            if hooks is None:
                hooks = {}

            def timing(r, *args, **kwargs):
                r.elapsed = time() - start

            try:
                if isinstance(hooks['response'], (list, tuple)):
                    # timing needs to run first so we don't time the other
                    # hooks' execution (copy so tuples are handled too)
                    hooks['response'] = [timing] + list(hooks['response'])
                else:
                    hooks['response'] = [timing, hooks['response']]
            except KeyError:
                hooks['response'] = timing

            return super(ElapsedFuturesSession, self) \
                .request(method, url, hooks=hooks, *args, **kwargs)



    session = ElapsedFuturesSession()
    future = session.get('http://httpbin.org/get')
    # do some other stuff, send some more requests while this one works
    response = future.result()
    print('response status {0}'.format(response.status_code))
    print('response elapsed {0}'.format(response.elapsed))

Using ProcessPoolExecutor
=========================

Similarly to `ThreadPoolExecutor`, it is possible to use an instance of
`ProcessPoolExecutor`. As the name suggests, the requests will be executed
concurrently in separate processes rather than threads.

.. code-block:: python

    from concurrent.futures import ProcessPoolExecutor
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10))
    # ... use as before

.. HINT::
    Using the `ProcessPoolExecutor` is useful in cases where memory usage
    per request is very high (large responses) and cycling the interpreter
    is required to release memory back to the OS.

A base requirement of using `ProcessPoolExecutor` is that `Session.request`
and `FuturesSession` both be picklable.

This means that only Python 3.5 and above is fully supported, while Python 3.4
REQUIRES an existing `requests.Session` instance to be passed when
initializing `FuturesSession`. Python 2.X and < 3.4 are currently not
supported.

.. code-block:: python

    # Using python 3.4
    from concurrent.futures import ProcessPoolExecutor
    from requests import Session
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10),
                             session=Session())
    # ... use as before

In case pickling fails, an exception is raised pointing to this documentation.

.. code-block:: python

    # Using python 2.7
    from concurrent.futures import ProcessPoolExecutor
    from requests import Session
    from requests_futures.sessions import FuturesSession

    session = FuturesSession(executor=ProcessPoolExecutor(max_workers=10),
                             session=Session())
    Traceback (most recent call last):
    ...
    RuntimeError: Cannot pickle function. Refer to documentation: https://github.com/ross/requests-futures/#using-processpoolexecutor

.. IMPORTANT::
  * Python >= 3.4 required
  * A session instance is required when using Python < 3.5
  * If subclassing `FuturesSession`, it must be importable (a module-level
    global); see the sketch below
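
For example, a subclass that lives at module level can be pickled and sent to
the worker processes, while one defined inside a function cannot. A minimal
sketch, where the module and class names are only illustrative:

.. code-block:: python

    # my_sessions.py -- defined at module level so it is importable/picklable
    from requests_futures.sessions import FuturesSession

    class LoggingFuturesSession(FuturesSession):
        def request(self, method, url, *args, **kwargs):
            print('requesting {0} {1}'.format(method, url))
            return super(LoggingFuturesSession, self).request(
                method, url, *args, **kwargs)

.. code-block:: python

    # elsewhere, e.g. main.py
    from concurrent.futures import ProcessPoolExecutor
    from requests import Session
    from my_sessions import LoggingFuturesSession

    session = LoggingFuturesSession(
        executor=ProcessPoolExecutor(max_workers=2), session=Session())
    # ... use as before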

Installation
============

.. code-block:: bash

    pip install requests-futures

.. _`requests`: https://github.com/kennethreitz/requests
.. _`concurrent.futures`: http://docs.python.org/dev/library/concurrent.futures.html
.. _backport: https://pypi.python.org/pypi/futures
.. _hooks: http://docs.python-requests.org/en/master/user/advanced/#event-hooks

            
