proxpi

Name	proxpi
Version	1.2.0
home_page	https://github.com/EpicWink/proxpi
Summary	PyPI caching mirror
upload_time	2024-07-08 02:30:54
maintainer	None
docs_url	None
author	Laurie O
requires_python	~=3.6
license	MIT
keywords	pypi index mirror cache

# proxpi
[![Build status](
https://github.com/EpicWink/proxpi/workflows/test/badge.svg?branch=master)](
https://github.com/EpicWink/proxpi/actions?query=branch%3Amaster+workflow%3Atest)
[![codecov](https://codecov.io/gh/EpicWink/proxpi/branch/master/graph/badge.svg)](
https://codecov.io/gh/EpicWink/proxpi)

PyPI caching mirror

* Host a proxy PyPI mirror server with caching
  * Cache the index (project list and projects' file list)
  * Cache the project files
* Support multiple indices
* Set index cache times-to-live (individually for each index)
* Set files cache max-size on disk
* Manually invalidate index cache

See [Alternatives](#alternatives).

## Usage
### Start server
Run `proxpi` either inside a [Docker](https://www.docker.com/) container, if you want a
known-working environment, or outside one as a Python app, if you want more control over
the environment (the instructions here use the
[Flask](https://flask.palletsprojects.com/en/latest/) development server).

#### Docker
Uses a [Gunicorn](https://gunicorn.org/) WSGI server
```bash
docker run -p 5000:5000 epicwink/proxpi
```

Without arguments, runs with 2 threads. If passing arguments, make sure to bind to an
exported address (or all with `0.0.0.0`) on port 5000 (ie `--bind 0.0.0.0:5000`).

##### Compose
Alternatively, use [Docker Compose](https://docs.docker.com/compose/)
```bash
docker compose up
```

#### Local
##### Install
```bash
pip install proxpi
```

Install `coloredlogs` as well to get coloured logging

##### Run server
```bash
FLASK_APP=proxpi.server flask run
```

See `flask run --help` for more information on address and port binding, and certificate
specification to use HTTPS. Alternatively, bring your own WSGI server.

### Use proxy
Use `pip`'s `--index-url` flag to install packages via the proxy:

```bash
pip install --index-url http://127.0.0.1:5000/index/ simplejson
```
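The same flag can be set from a script, for example when bootstrapping an environment
programmatically. The following is a minimal sketch, assuming the proxy listens on
`http://127.0.0.1:5000` as above; the helper name `pip_install_command` is hypothetical
and not part of `proxpi`.

```python
import sys


def pip_install_command(packages, proxy_base="http://127.0.0.1:5000"):
    """Build a ``pip install`` command that resolves packages via the proxy.

    ``proxy_base`` is an assumption: adjust it to wherever proxpi is served.
    """
    index_url = proxy_base.rstrip("/") + "/index/"
    # Run pip through the current interpreter so the right environment is used
    return [sys.executable, "-m", "pip", "install", "--index-url", index_url, *packages]


if __name__ == "__main__":
    # Only print the command here; pass the list to subprocess.run() to execute it
    print(" ".join(pip_install_command(["simplejson"])))
```

Returning the command as a list keeps it safe to hand to `subprocess.run` without shell
quoting.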

### Cache invalidation
Either head to http://127.0.0.1:5000/ in the browser, or run:
```bash
curl -X DELETE http://127.0.0.1:5000/cache/simplejson
curl -X DELETE http://127.0.0.1:5000/cache/list
```

If you need to invalidate a locally cached file, restart the server: files should never
change in a package index.
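The `curl` calls above can also be issued from Python using only the standard library.
This is a sketch under the assumption that the server runs at `http://127.0.0.1:5000`;
the function names are hypothetical, and treating `list` as the target for the cached
project list is inferred from the second `curl` command rather than documented here.

```python
import urllib.request


def invalidation_request(project=None, base_url="http://127.0.0.1:5000"):
    """Build a DELETE request for one project's index cache, or for ``list``."""
    target = project if project is not None else "list"
    url = f"{base_url.rstrip('/')}/cache/{target}"
    return urllib.request.Request(url, method="DELETE")


def invalidate(project=None, base_url="http://127.0.0.1:5000"):
    """Send the request and return the HTTP status code (needs a running server)."""
    request = invalidation_request(project, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `invalidate("simplejson")` mirrors the first `curl` command, and `invalidate()`
mirrors the second.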

### Environment variables
* `PROXPI_INDEX_URL`: index URL, default: https://pypi.org/simple/
* `PROXPI_INDEX_TTL`: index cache time-to-live in seconds,
   default: 30 minutes. Disable index-cache by setting this to 0
* `PROXPI_EXTRA_INDEX_URLS`: extra index URLs (comma-separated)
* `PROXPI_EXTRA_INDEX_TTLS`: corresponding extra index cache times-to-live in seconds
   (comma-separated), default: 3 minutes, cache disabled when 0
* `PROXPI_CACHE_SIZE`: size of downloaded project files cache (bytes), default 5GB.
  Disable files-cache by setting this to 0
* `PROXPI_CACHE_DIR`: downloaded project files cache directory path, default: a new
  temporary directory
* `PROXPI_BINARY_FILE_MIME_TYPE=1`: force file-response content-type to
  `"application/octet-stream"` instead of letting Flask guess it. This may be needed
  if your package installer (eg Poetry) mishandles responses with declared encoding.
* `PROXPI_DISABLE_INDEX_SSL_VERIFICATION=1`: don't verify any index SSL certificates
* `PROXPI_DOWNLOAD_TIMEOUT`: time (in seconds) before `proxpi` will redirect to the
  proxied index server for file downloads instead of waiting for the download,
  default: 0.9
* `PROXPI_CONNECT_TIMEOUT`: time (in seconds) `proxpi` will wait for a socket to
  connect to the index server before `requests` raises a `ConnectTimeout` error
  to prevent indefinite blocking, default: none, or 3.1 if read-timeout provided
* `PROXPI_READ_TIMEOUT`: time (in seconds) `proxpi` will wait for chunks of data 
  from the index server before `requests` raises a `ReadTimeout` error to prevent
  indefinite blocking, default: none, or 20 if connect-timeout provided
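The interplay between some of these variables is easier to see in code. The sketch below
is illustrative only and is not `proxpi`'s actual implementation: the function names are
hypothetical, and the rule for filling in missing extra-index TTLs is an assumption. It
shows the documented defaults for the extra-index TTLs (3 minutes) and for the two
timeouts (each gets a fallback only when the other is set).

```python
import os

DEFAULT_EXTRA_TTL = 180  # 3 minutes, per the documented default


def parse_extra_indices(env):
    """Pair each extra index URL with its time-to-live in seconds."""
    raw_urls = env.get("PROXPI_EXTRA_INDEX_URLS", "")
    raw_ttls = env.get("PROXPI_EXTRA_INDEX_TTLS", "")
    urls = [u.strip() for u in raw_urls.split(",") if u.strip()]
    ttls = [int(t) for t in raw_ttls.split(",") if t.strip()]
    # Assumption: URLs without a matching TTL fall back to the default
    ttls += [DEFAULT_EXTRA_TTL] * (len(urls) - len(ttls))
    return list(zip(urls, ttls))


def resolve_timeouts(env):
    """Return a ``(connect, read)`` tuple as accepted by ``requests``, or None."""
    connect = env.get("PROXPI_CONNECT_TIMEOUT")
    read = env.get("PROXPI_READ_TIMEOUT")
    if connect is None and read is None:
        return None  # no timeouts configured
    # Setting only one of the two gives the other its documented fallback
    return (
        float(connect) if connect is not None else 3.1,
        float(read) if read is not None else 20.0,
    )


if __name__ == "__main__":
    print(parse_extra_indices(os.environ), resolve_timeouts(os.environ))
```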

### Considerations with CI
`proxpi` was designed with three goals (particularly for continuous integration (CI)):
* to reduce load on PyPI package serving
* to reduce `pip install` times
* to not require modification to the current workflow

Specifically, `proxpi` was designed to run for CI services such as
[Travis](https://travis-ci.org/),
[Jenkins](https://jenkins.io/),
[GitLab CI](https://docs.gitlab.com/ee/ci/),
[Azure Pipelines](https://azure.microsoft.com/en-us/services/devops/pipelines/)
and [GitHub Actions](https://github.com/features/actions).

`proxpi` works by caching index requests (ie which versions, wheel-types, etc are
available for a given project: the index cache) and the project files themselves (in a
local directory: the package cache). This means only repeated, identical requests after
the first are served from the cache, so `proxpi` gives no benefit for a single
`pip install`.
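To make the time-to-live idea concrete, here is a minimal sketch of a TTL cache. It is
not `proxpi`'s implementation: the class name `TTLCache`, the `fetch` callable standing
in for the upstream index request, and the injectable `clock` are all assumptions made
for illustration.

```python
import time


class TTLCache:
    """Keep fetched values for ``ttl`` seconds; a TTL of 0 disables caching."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch  # callable doing the real (slow) upstream request
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]  # cache hit: no upstream request
        value = self._fetch(key)  # cache miss or expired: ask upstream
        if self._ttl > 0:
            self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Forget one key, like the manual cache invalidation described earlier."""
        self._entries.pop(key, None)
```

The first `get` for a key always reaches upstream, which is why a single `pip install`
sees no speed-up; only later requests within the TTL are answered locally.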

#### Cache persistence
As a basic end-user of most of these services, you won't be able to keep a `proxpi`
server running between invocations of your project's CI pipeline: CI invocations are
designed to be independent. This means the best you can do is start the cache for just
the current job.

A more advanced user of these CI services can bring their own runner (personally, my
needs are for running GitLab CI). This means you can run `proxpi` on a fully-controlled
server (eg an [EC2](https://aws.amazon.com/ec2/) instance), and proxy PyPI requests (during
a `pip` command) through the local cache. See the instructions
[below](#gitlab-ci-instructions).

Hopefully, in the future these CI services will all implement their own transparent
caching for PyPI. For example, Azure already has
[Azure Artifacts](https://azure.microsoft.com/en-au/services/devops/artifacts/) which
provides much more functionality than `proxpi`, but won't reduce `pip install` times for
CI services not using Azure.

#### GitLab CI instructions
This implementation leverages `pip`'s configurable index URL and Docker networks.
This is to be run on a server you have console access to.

1. Create a Docker bridge network
   ```shell
   docker network create gitlab-runner-network
   ```

2. Start a GitLab CI Docker runner using
   [their documentation](https://docs.gitlab.com/runner/install/docker.html)

3. Run the `proxpi` Docker container
   ```bash
   docker run \
     --detach \
     --network gitlab-runner-network \
     --volume proxpi-cache:/var/cache/proxpi \
     --env PROXPI_CACHE_DIR=/var/cache/proxpi \
     --name proxpi epicwink/proxpi:latest
   ```
   You don't need to expose a port (the `-p` flag) as we'll be using an internal
   Docker network.

4. Set `pip`'s index URL to the `proxpi` server by setting it in the runner environment.
   Set `runners[0].docker.network_mode` to `gitlab-runner-network`.
   Add `PIP_INDEX_URL=http://proxpi:5000/index/` and `PIP_TRUSTED_HOST=proxpi`
   to `runners.environment` in the GitLab CI runner configuration TOML. For example, you
   may end up with the following configuration:
   ```toml
   [[runners]]
     name = "awesome-ci-01"
     url = "https://gitlab.com/"
     token = "SECRET"
     executor = "docker"
     environment = [
       "DOCKER_TLS_CERTDIR=/certs",
       "PIP_INDEX_URL=http://proxpi:5000/index/",
       "PIP_TRUSTED_HOST=proxpi",
     ]
   
   [runners.docker]
     network_mode = "gitlab-runner-network"
     ...
   ```

This is designed to not require any changes to the GitLab CI project configuration (ie
`gitlab-ci.yml`), unless it already sets the index URL for some reason (if that's the
case, you're probably already using a cache).
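To confirm that a job actually picked up the runner environment, a small check can be
run inside the job. This is a sketch only: the function name `uses_proxpi` is
hypothetical, and the host name `proxpi` assumes the container name used in the steps
above.

```python
import os
from urllib.parse import urlsplit


def uses_proxpi(env, expected_host="proxpi"):
    """Return True if pip's index URL in ``env`` points at the proxpi container."""
    index_url = env.get("PIP_INDEX_URL", "")
    if urlsplit(index_url).hostname != expected_host:
        return False
    # A plain-HTTP index must also be a trusted host, otherwise pip ignores it
    trusted = env.get("PIP_TRUSTED_HOST", "").split()
    return index_url.startswith("https://") or expected_host in trusted


if __name__ == "__main__":
    print("pip is configured for proxpi:", uses_proxpi(os.environ))
```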

Another option is to set up a proxy, but that's more effort than the above method.

## Alternatives
* [simpleindex](https://pypi.org/project/simpleindex/): routes URLs to multiple
  indices (including PyPI), supports local (or S3 with a plugin) directory of packages,
  no caching without custom plugins

* [bandersnatch](https://pypi.org/project/bandersnatch/): mirrors one index (eg PyPI),
  storing packages locally, or on S3 with a plugin. Manual update, no proxy

* [devpi](https://pypi.org/project/devpi/): heavyweight, runs a full index (or multiple)
  in addition to mirroring (in place of proxying), supports proxying (with inheritance),
  supports package upload, server replication and fail-over

* [pypiserver](https://pypi.org/project/pypiserver/): serves local directory of
  packages, proxy to PyPI when not-found, supports package upload, no caching

* [PyPI Cloud](https://pypi.org/project/pypicloud/): serves local or cloud-storage
  directory of packages, with redirecting/cached proxying to indexes, authentication and
  authorisation.

* [`pypiprivate`](https://pypi.org/project/pypiprivate/): serves local (or S3-hosted)
  directory of packages, no proxy to package indices (including PyPI)

* [Pulp](https://pypi.org/project/pulpcore/): generic content repository, can host
  multiple ecosystems' packages.
  [Python package index plugin](https://pypi.org/project/pulp-python/) supports local/S3
  mirrors, package upload, proxying to multiple indices, no caching

* [`pip2pi`](https://pypi.org/project/pip2pi/): manual syncing of specific packages,
  no proxy

* [`nginx_pypi_cache`](https://github.com/hauntsaninja/nginx_pypi_cache): caching proxy
  using [nginx](https://nginx.org/en/), single index

* [Flask-Pypi-Proxy](https://pypi.org/project/Flask-Pypi-Proxy/): unmaintained, no cache
  size limit, no caching index pages

* [`http.server`](https://docs.python.org/3/library/http.server.html): standard-library,
  hosts directory exactly as laid out, no proxy to package indices (eg PyPI)

* [Apache with `mod_rewrite`](
  https://httpd.apache.org/docs/current/mod/mod_rewrite.html): I'm not familiar with
  Apache, but it likely has the capability to proxy and cache (with eg `mod_cache_disk`)

* [Gemfury](https://fury.co/l/pypi-server): hosted, managed. Private index is not free,
  documentation doesn't say anything about proxying

Raw data

{
    "_id": null,
    "home_page": "https://github.com/EpicWink/proxpi",
    "name": "proxpi",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "~=3.6",
    "maintainer_email": null,
    "keywords": "pypi, index, mirror, cache",
    "author": "Laurie O",
    "author_email": "laurie_opperman@hotmail.com",
    "download_url": "https://files.pythonhosted.org/packages/a0/8b/8aed4eaf60c0de00c3cb40a8ff3ffe16e8fe1b90e4a64dc1355a12a3978e/proxpi-1.2.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "PyPI caching mirror",
    "version": "1.2.0",
    "project_urls": {
        "Homepage": "https://github.com/EpicWink/proxpi"
    },
    "split_keywords": [
        "pypi",
        " index",
        " mirror",
        " cache"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e5aa81986e8b2d3d95b53017c5425e9112b1e7c8b903d4ae54b0871806ebb79f",
                "md5": "a7dc36868aed8a769c72acd0553c55ac",
                "sha256": "a5f0e2034e0cce946d268d7c1b586b02494279c872a5e0347beed634179c6f5d"
            },
            "downloads": -1,
            "filename": "proxpi-1.2.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "a7dc36868aed8a769c72acd0553c55ac",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "~=3.6",
            "size": 17691,
            "upload_time": "2024-07-08T02:30:53",
            "upload_time_iso_8601": "2024-07-08T02:30:53.062749Z",
            "url": "https://files.pythonhosted.org/packages/e5/aa/81986e8b2d3d95b53017c5425e9112b1e7c8b903d4ae54b0871806ebb79f/proxpi-1.2.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "a08b8aed4eaf60c0de00c3cb40a8ff3ffe16e8fe1b90e4a64dc1355a12a3978e",
                "md5": "a43901da86a4f788b2ae5426ced3195e",
                "sha256": "ed9e0d74126b40af5c7786434cb4987e3b30e183678a8f61bd4a6d6a6533ae36"
            },
            "downloads": -1,
            "filename": "proxpi-1.2.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a43901da86a4f788b2ae5426ced3195e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "~=3.6",
            "size": 49687,
            "upload_time": "2024-07-08T02:30:54",
            "upload_time_iso_8601": "2024-07-08T02:30:54.238496Z",
            "url": "https://files.pythonhosted.org/packages/a0/8b/8aed4eaf60c0de00c3cb40a8ff3ffe16e8fe1b90e4a64dc1355a12a3978e/proxpi-1.2.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-07-08 02:30:54",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "EpicWink",
    "github_project": "proxpi",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "proxpi"
}
