wsgiprox
========

:Version: 1.5.2
:Author: Ilya Kreymer
:License: Apache 2.0
:Summary: HTTP/S proxy with WebSockets over WSGI
:Home page: https://github.com/webrecorder/wsgiprox
:Upload time: 2019-03-19 21:24:49

.. image:: https://travis-ci.org/webrecorder/wsgiprox.svg?branch=master
    :target: https://travis-ci.org/webrecorder/wsgiprox

``wsgiprox`` is a Python WSGI middleware for adding HTTP and HTTPS proxy support to a WSGI application.

The library accepts HTTP and HTTPS proxy connections, and routes them to a designated prefix.

Usage
~~~~~

For example, given a `WSGI <http://wsgi.readthedocs.io/en/latest/>`_ callable ``application``, the middleware could be defined as follows:

.. code:: python

    from wsgiprox.wsgiprox import WSGIProxMiddleware

    application = WSGIProxMiddleware(application, '/prefix/', 'wsgiprox')


With the above configuration, the middleware adds the prefix ``/prefix/`` to every url, unless the request is to the proxy host ``wsgiprox``. Assuming a WSGI server running on port 8080, the middleware translates HTTP/S proxy connections into non-proxy WSGI requests and passes them to the wrapped application:

*  Proxy Request: ``curl -x "localhost:8080" "http://example.com/path/file.html?A=B"``

   Becomes equivalent to: ``curl "http://localhost:8080/prefix/http://example.com/path/file.html?A=B"``


*  Proxy Request: ``curl -k -x "localhost:8080" "https://example.com/path/file.html?A=B"``

   Becomes equivalent to: ``curl "http://localhost:8080/prefix/https://example.com/path/file.html?A=B"``

*  Proxy Request to proxy host: ``curl -k -x "localhost:8080" "https://wsgiprox/path/file.html?A=B"``

   No prefix is added for ``wsgiprox``, so this becomes equivalent to: ``curl -H "Host: wsgiprox" "http://localhost:8080/path/file.html?A=B"``


All standard WSGI ``environ`` fields are set to the expected values for the translated url.

When a request passes through the wsgiprox middleware, ``environ['wsgiprox.proxy_host']`` is set to the proxy host.
In this example, the WSGI app could check that ``environ.get('wsgiprox.proxy_host') == 'wsgiprox'`` to confirm that it received a proxy request. If the request is to the proxy host itself, it is passed to the WSGI app without prefixing, and ``environ['wsgiprox.proxy_host'] == environ['HTTP_HOST']``.
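
The check described above can be sketched as a small WSGI app. This is only an illustration: the response bodies are made up, and it assumes that ``HTTP_HOST`` differs from the proxy host for ordinary proxied requests.

.. code:: python

    def application(environ, start_response):
        """Minimal WSGI app reporting how the request reached it."""
        proxy_host = environ.get('wsgiprox.proxy_host')

        if proxy_host is None:
            # Key is absent: the request did not pass through wsgiprox
            body = b'direct request'
        elif proxy_host == environ.get('HTTP_HOST'):
            # Request addressed to the proxy host itself (no prefix added)
            body = b'request to the proxy host itself'
        else:
            # Regular proxied request, already rewritten under the prefix
            body = b'proxied request'

        start_response('200 OK', [('Content-Type', 'text/plain'),
                                  ('Content-Length', str(len(body)))])
        return [body]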


Custom Resolvers
================

The provided ``FixedResolver`` simply prepends a fixed prefix to each url. A custom resolver could compute the final url in a different way. The resolver instance is called with the full url and the original WSGI ``environ``. The result is the translated ``REQUEST_URI`` that is passed to the WSGI application.

See `resolvers.py <wsgiprox/resolvers.py>`_ for all available resolvers.

For example, the following Resolver translates the url to a custom prefix based on the remote IP of the original request.

.. code:: python

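    from wsgiprox.wsgiprox import WSGIProxMiddleware

    class IPResolver(object):
        def __call__(self, url, environ):
            # Use the client IP as the prefix, e.g. /127.0.0.1/http://example.com/
            return '/' + environ['REMOTE_ADDR'] + '/' + url

    # Pass the resolver instance in place of a fixed prefix string
    application = WSGIProxMiddleware(application, IPResolver())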
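
As a further sketch of the same calling convention, a resolver could pick the prefix from the target host instead of the client IP. ``HostPrefixResolver``, its arguments and the ``/live/`` prefix are hypothetical names used for illustration; they are not part of ``wsgiprox``.

.. code:: python

    from urllib.parse import urlsplit


    class HostPrefixResolver(object):
        """Hypothetical resolver: choose a prefix based on the target host."""

        def __init__(self, prefixes, default_prefix='/prefix/'):
            self.prefixes = prefixes
            self.default_prefix = default_prefix

        def __call__(self, url, environ):
            # Called with the full url and the original WSGI environ
            host = urlsplit(url).hostname or ''
            return self.prefixes.get(host, self.default_prefix) + url


    resolver = HostPrefixResolver({'example.com': '/live/'})

Such an instance would be passed to the middleware in the same position as ``IPResolver()`` above.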


HTTPS CA
========

To support HTTPS proxying, ``wsgiprox`` creates a custom CA (Certificate Authority), which must be accepted by the client (or the client must skip certificate verification, as with the ``-k`` option in curl).

By default, ``wsgiprox`` looks for the CA .pem file at ``<working dir>/ca/wsgiprox-ca.pem`` and auto-creates this bundle using the `certauth <https://github.com/ikreymer/certauth>`_ library.

The CA name and CA root cert filename can also be specified explicitly via the ``proxy_options`` dict.

By default, the following options are used:

.. code:: python

    WSGIProxMiddleware(..., proxy_options={'ca_name': 'wsgiprox https proxy CA',
                                           'ca_file': './ca/wsgiprox-ca.pem'})

The generated ``wsgiprox-ca.pem`` can be imported directly into most browsers as a trusted certificate authority, allowing the browser to accept HTTPS content proxied through ``wsgiprox``.

Downloading Certs
=================

The CA cert can be downloaded directly from the proxy. This allows for quick installation into a client/browser.

* ``curl -x "localhost:8080" http://wsgiprox/download/pem`` will download in PEM format (for most platforms)
* ``curl -x "localhost:8080" http://wsgiprox/download/p12`` will download in PKCS12 format (for Windows)

The download host is the same as the main proxy host by default, but it can be changed via the ``download_host`` param of the ``WSGIProxMiddleware`` constructor.
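
The same download can be sketched from Python with the standard library. The helper names below are hypothetical, and the proxy address and ``wsgiprox`` host simply mirror the curl examples above; adjust them to the actual deployment.

.. code:: python

    import urllib.request


    def build_proxy_opener(proxy='http://localhost:8080'):
        """Return an opener that sends plain HTTP requests through the proxy."""
        handler = urllib.request.ProxyHandler({'http': proxy})
        return urllib.request.build_opener(handler)


    def download_ca(fmt='pem', proxy='http://localhost:8080', host='wsgiprox'):
        """Fetch the CA cert bytes; fmt is 'pem' or 'p12' as in the urls above."""
        url = 'http://{0}/download/{1}'.format(host, fmt)
        with build_proxy_opener(proxy).open(url, timeout=10) as resp:
            return resp.read()

With the proxy running, the bytes returned by ``download_ca('pem')`` could be written to a file in binary mode and imported into the client.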

Custom Proxy Host Apps
======================

It is also possible to configure a custom WSGI app per proxy host, e.g.:

* ``curl -x "localhost:8080" https://proxy-app-1/path/`` is passed to ``proxy-app-1``
* ``curl -x "localhost:8080" https://proxy-app-2/foo`` is passed to ``proxy-app-2``

This can be done via:

.. code:: python

    from wsgiprox.wsgiprox import WSGIProxMiddleware

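    # ProxyApp1WSGI and ProxyApp2WSGI stand in for your own WSGI apps
    proxy_apps = {"proxy-app-1": ProxyApp1WSGI(),
                  "proxy-app-2": ProxyApp2WSGI(),
                  "proxy-alias": None,  # no app: handled by the wrapped application
                 }

    application = WSGIProxMiddleware(application, proxy_apps=proxy_apps)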

All other requests, or any requests not handled by the proxy app, are passed to the main ``application``.

For ``proxy-alias``, since there is no proxy app, the request is passed directly to the wrapped application.
The ``wsgiprox.proxy_host`` would be set to ``'proxy-alias'`` instead of the default ``'wsgiprox'``, allowing the application to differentiate handling based on the value of ``wsgiprox.proxy_host``.
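
One way the wrapped application might differentiate on ``wsgiprox.proxy_host`` is a small dispatch table. This is a sketch under assumptions: ``text_app``, ``HANDLERS`` and ``FALLBACK`` are hypothetical names, and the response bodies are placeholders.

.. code:: python

    def text_app(message):
        """Build a tiny WSGI app that always answers with ``message``."""
        body = message.encode('utf-8')

        def app(environ, start_response):
            start_response('200 OK', [('Content-Type', 'text/plain'),
                                      ('Content-Length', str(len(body)))])
            return [body]
        return app


    # Hypothetical handlers keyed by the value of wsgiprox.proxy_host
    HANDLERS = {
        'wsgiprox': text_app('default proxy host'),
        'proxy-alias': text_app('alias proxy host'),
    }
    FALLBACK = text_app('not a proxy request')


    def application(environ, start_response):
        """Main app: pick a handler from the proxy host set by the middleware."""
        handler = HANDLERS.get(environ.get('wsgiprox.proxy_host'), FALLBACK)
        return handler(environ, start_response)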

Internally, the ``proxy_apps`` dict is used to configure the cert downloader app and default proxy host:

.. code:: python

    proxy_apps['proxy_host'] = None
    proxy_apps['download_host'] = CertDownloader(self.ca)


Websockets
==========

``wsgiprox`` optionally also supports proxying websockets, both unencrypted ``ws://`` and via TLS ``wss://``. The websockets proxy functionality requires, and has primarily been tested with, the `gevent-websocket <https://github.com/jgelens/gevent-websocket>`_ library, and assumes that the wrapped WSGI application is also using this library for websocket support. Other implementations are not yet supported.

To enable websocket proxying, install with ``pip install wsgiprox[gevent-websocket]``, which will also install ``gevent-websocket``.
To disable websocket proxying even with ``gevent-websocket`` installed, add ``proxy_options={'enable_websockets': False}``.

See the `test suite <test/test_wsgiprox.py>`_ for additional details.


How it Works / A note about WSGI
=================================

``wsgiprox`` supports several different proxying methods:

* HTTP direct proxy, no tunnel
* HTTP CONNECT tunnel for websockets, no SSL
* HTTP CONNECT tunnel with SSL (also supports websockets)

For regular HTTP proxying, wsgiprox simply rewrites a host-qualified request such as ``GET http://example.com/``, and passes it along to the underlying WSGI app.
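
The rewrite can be illustrated with a standalone function that mirrors the curl examples from the Usage section. This is a simplified sketch, not the middleware's actual code; ``rewrite_target`` and its defaults are hypothetical.

.. code:: python

    from urllib.parse import urlsplit


    def rewrite_target(target, prefix='/prefix/', proxy_host='wsgiprox'):
        """Map a host-qualified request target to the path seen by the app."""
        parts = urlsplit(target)
        if parts.hostname == proxy_host:
            # Requests to the proxy host itself are passed through unprefixed
            path = parts.path or '/'
            if parts.query:
                path += '?' + parts.query
            return path
        # Everything else is routed under the prefix, with the full url kept
        return prefix + target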

The other proxy methods involve the HTTP ``CONNECT`` verb and explicitly establishing a tunnel using the underlying socket. For HTTPS/SSL proxying, an SSL socket is established over the tunnel, while HTTP websocket proxy uses the underlying socket directly.

The system thus relies on being able to access the underlying socket for the connection. As the WSGI spec does not provide a way to do this, ``wsgiprox`` is not guaranteed to work under every WSGI server. The ``CONNECT`` verb creates a tunnel, and the tunneled connection is what is passed to the wrapped WSGI application. This is non-standard behavior and may not work on all WSGI servers.

This middleware has been tested primarily with the gevent WSGI server and uWSGI.

There is also support for gunicorn and wsgiref, as they provide a way to access the underlying socket. If the underlying socket cannot be accessed, the ``CONNECT`` verb will fail with a 405.

It may be possible to extend support to additional WSGI servers by extending ``WSGIProxMiddleware.get_raw_socket()`` to be able to find the underlying socket.

Inspiration
~~~~~~~~~~~

This project draws inspiration from a lot of previous efforts.

Much of the functionality is a refactoring and spin-off of the proxy functionality in `pywb <https://github.com/ikreymer/pywb>`_, which is built on top of the standalone CA handling library `certauth <https://github.com/ikreymer/certauth>`_.

certauth was refactored from an earlier implementation in `warcprox <https://github.com/internetarchive/warcprox>`_ (which also inspired this name!).

The certificate download feature was inspired by a similar feature available in `mitmproxy <https://github.com/mitmproxy/mitmproxy>`_.

License
~~~~~~~

``wsgiprox`` is licensed under the Apache 2.0 License and is part of the
Webrecorder project.

See `NOTICE <NOTICE>`__ and `LICENSE <LICENSE>`__ for details.



            

        