docker-image-cleaner


| Field           | Value                                                                                                                                                                                                                                  |
| --------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Name            | docker-image-cleaner                                                                                                                                                                                                                   |
| Version         | 1.0.0b3                                                                                                                                                                                                                                |
| Home page       | https://github.com/jupyterhub/docker-image-cleaner                                                                                                                                                                                     |
| Summary         | Cleanup old docker images to free up disk space and inodes                                                                                                                                                                             |
| Upload time     | 2023-01-02 11:04:17                                                                                                                                                                                                                    |
| Author          | Project Jupyter Contributors                                                                                                                                                                                                           |
| Requires Python | >=3.8                                                                                                                                                                                                                                  |
| License         | BSD                                                                                                                                                                                                                                    |
| Requirements    | cachetools, certifi, charset-normalizer, docker, google-auth, idna, kubernetes, oauthlib, packaging, pyasn1, pyasn1-modules, pyparsing, python-dateutil, pyyaml, requests, requests-oauthlib, rsa, six, urllib3, websocket-client |

# Docker Image Cleaner

[![GitHub Workflow Status](https://img.shields.io/github/workflow/status/jupyterhub/docker-image-cleaner/Publish?logo=github)](https://github.com/jupyterhub/docker-image-cleaner/actions)
[![Latest PyPI version](https://img.shields.io/pypi/v/docker-image-cleaner?logo=pypi)](https://pypi.python.org/pypi/docker-image-cleaner)
[![Latest quay.io image tags](https://img.shields.io/github/v/tag/jupyterhub/docker-image-cleaner?include_prereleases&label=quay.io)](https://quay.io/repository/jupyterhub/docker-image-cleaner?tab=tags)

A Python package (`docker-image-cleaner`) and associated Docker image
(`quay.io/jupyterhub/docker-image-cleaner`) to clean up old docker images when a
disk is running low on inodes or space.

The script was initially developed to help BinderHub installations clean
up space on nodes, which can otherwise run out of space and stop being able to
build new docker images.

## Why?

Container images are one of the biggest consumers of disk space
and inodes on kubernetes nodes. Kubernetes tries to make sure there is enough
disk space on each node by [garbage
collecting](https://kubernetes.io/docs/concepts/architecture/garbage-collection/#containers-images)
unused container images and containers. Tuning this is important
for [binderhub](https://github.com/jupyterhub/binderhub/) installations,
as many images are built and used only a couple of times. However, on
most managed kubernetes installations (like GKE, EKS, etc.), we cannot
tune these parameters!

This script approximates the relevant parts of the kubernetes container image
garbage collection in a configurable way.

## Requirements

1. Only kubernetes nodes using the `docker` runtime are supported.
   `containerd` or `cri-o` container backends are not supported.
2. The script expects to run in a kubernetes `DaemonSet`, with `/var/lib/docker`
   from the node mounted inside the container. This lets the script figure
   out how much disk space docker container images are actually using.
3. The `DaemonSet` should have a `ServiceAccount` attached that has permissions
   to talk to the kubernetes API and cordon / uncordon nodes. This makes sure
   new pods are not scheduled on to the node while image cleaning is happening,
   as it can take a while.
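
To illustrate why the `ServiceAccount` needs permission to modify nodes: cordoning amounts to setting `spec.unschedulable` on the node object. The following is a minimal sketch, not the package's actual code. The helper name `set_cordon` and the `RecordingApi` stand-in are hypothetical, and `api` is assumed to be a `kubernetes.client.CoreV1Api` instance (or anything exposing a compatible `patch_node(name, body)` method).

```python
def set_cordon(api, node_name, cordoned):
    """Mark a node unschedulable (cordon) or schedulable again (uncordon).

    `api` is assumed to expose `patch_node(name, body)`, as
    `kubernetes.client.CoreV1Api` does.
    """
    body = {"spec": {"unschedulable": bool(cordoned)}}
    return api.patch_node(node_name, body)


class RecordingApi:
    """Hypothetical stand-in for CoreV1Api so the sketch runs without a cluster."""

    def __init__(self):
        self.calls = []

    def patch_node(self, name, body):
        # Record the request instead of sending it to a real API server.
        self.calls.append((name, body))
        return body


api = RecordingApi()
set_cordon(api, "node-1", True)   # cordon before cleaning
set_cordon(api, "node-1", False)  # uncordon afterwards
print(api.calls)
```

Inside a pod, a real client could presumably be obtained by calling `kubernetes.config.load_incluster_config()` and then constructing `kubernetes.client.CoreV1Api()`; the node name would come from `DOCKER_IMAGE_CLEANER_NODE_NAME`.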

## How does it work?

1. Compute how much space the `/var/lib/docker` directory (specified by the
   `DOCKER_IMAGE_CLEANER_PATH_TO_CHECK` environment variable) is taking up.
2. If the disk space used is greater than the garbage collection trigger threshold
   (specified by `DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH`), garbage collection is triggered.
   If not, the script just waits another 5 minutes (set by `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS`).
3. If garbage collection is triggered, the kubernetes node is first cordoned
   to prevent any new pods from being scheduled on it for the duration of the
   garbage collection.
4. Stopped containers are removed via `docker container prune`.
5. Dangling images are removed via `docker image prune`.
6. If no dangling images are found to prune, _all_ unused images are pruned (`docker image prune -a`).
7. After the garbage collection is done, the kubernetes node is uncordoned.
8. The script then waits another 5 minutes (set by `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS`) and repeats
   the whole process.
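
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the script's real implementation: `used_percent`, `gc_pass`, `run_once`, and the `Fake*` classes are hypothetical names, relative usage is assumed to be measured with `os.statvfs` (taking the larger of block and inode usage), and `docker_client` is assumed to behave like a `docker.DockerClient`, whose `containers.prune()` and `images.prune(filters=...)` methods correspond to the CLI commands in steps 4 to 6.

```python
import os


def used_percent(path):
    """Return the larger of block and inode usage (in percent) for `path`."""
    st = os.statvfs(path)
    blocks = 100.0 * (1 - st.f_bavail / st.f_blocks) if st.f_blocks else 0.0
    inodes = 100.0 * (1 - st.f_favail / st.f_files) if st.f_files else 0.0
    return max(blocks, inodes)


def gc_pass(docker_client, cordon, uncordon):
    """Steps 3-7: cordon, prune containers, prune images, uncordon."""
    cordon()
    try:
        docker_client.containers.prune()
        result = docker_client.images.prune(filters={"dangling": True})
        if not result.get("ImagesDeleted"):
            # Nothing dangling was removed: prune all unused images instead.
            docker_client.images.prune(filters={"dangling": False})
    finally:
        # Always uncordon, even if pruning raised an error.
        uncordon()


def run_once(path, threshold_high, docker_client, cordon, uncordon,
             get_used=used_percent):
    """Steps 1-2: measure usage and trigger a GC pass only above the threshold."""
    if get_used(path) <= threshold_high:
        return False
    gc_pass(docker_client, cordon, uncordon)
    return True


class FakePruner:
    """Hypothetical test double that logs prune calls and deletes nothing."""

    def __init__(self, log, name):
        self.log, self.name = log, name

    def prune(self, filters=None):
        self.log.append((self.name, filters))
        return {"ImagesDeleted": None, "SpaceReclaimed": 0}


class FakeDocker:
    """Hypothetical stand-in for docker.DockerClient."""

    def __init__(self, log):
        self.containers = FakePruner(log, "containers.prune")
        self.images = FakePruner(log, "images.prune")


log = []
triggered = run_once(
    "/var/lib/docker", 80, FakeDocker(log),
    cordon=lambda: log.append("cordon"),
    uncordon=lambda: log.append("uncordon"),
    get_used=lambda path: 93.0,  # pretend the disk is 93% full
)
print(triggered, log)
```

A long-running loop would call `run_once` and then sleep for `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS` before repeating (step 8). Wrapping the pruning in `try`/`finally` is a design choice of this sketch, so that a failed prune does not leave the node cordoned.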

## Configuration options

Currently, configuration is set through environment variables.

| Env variable                            | Description                                                                                                                 | Default           |
| --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | ----------------- |
| `DOCKER_IMAGE_CLEANER_NODE_NAME`        | The k8s node where the docker image cleaner is running, so it can be cordoned via the k8s api                               |                   |
| `DOCKER_IMAGE_CLEANER_PATH_TO_CHECK`    | Path to `/var/lib/docker` directory used by the docker daemon                                                               | `/var/lib/docker` |
| `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS` | Amount of time (in seconds) to wait between checking if GC needs to be triggered                                            | `300`             |
| `DOCKER_IMAGE_CLEANER_DELAY_SECONDS`    | Amount of time (in seconds) to wait between deleting container images, so we don't DOS the docker API                       | `1`               |
| `DOCKER_IMAGE_CLEANER_THRESHOLD_TYPE`   | Determine if GC should be triggered based on relative or absolute disk usage                                                | `relative`        |
| `DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH`   | % or absolute disk space used (based on `DOCKER_IMAGE_CLEANER_THRESHOLD_TYPE`) when we start deleting container images      | `80`              |
| `DOCKER_IMAGE_CLEANER_TIMEOUT_SECONDS`  | Request timeout (in seconds) for docker API requests. Pruning images often takes minutes.                                   | `300`             |
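
As an illustration of how these variables and their defaults fit together, here is a minimal sketch of reading them in Python. The `Settings` dataclass and `load_settings` function are hypothetical names introduced for this example, not part of the package. Treating a missing `DOCKER_IMAGE_CLEANER_NODE_NAME` as `None`, accepting `absolute` as the second threshold type, and rejecting other values are assumptions of the sketch.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "DOCKER_IMAGE_CLEANER_"


@dataclass
class Settings:
    node_name: Optional[str]
    path_to_check: str
    interval_seconds: float
    delay_seconds: float
    threshold_type: str
    threshold_high: float
    timeout_seconds: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, using the documented defaults."""
    env = os.environ if env is None else env

    def get(name, default=None):
        return env.get(PREFIX + name, default)

    threshold_type = get("THRESHOLD_TYPE", "relative")
    if threshold_type not in ("relative", "absolute"):
        # Rejecting unknown values is a choice made for this sketch.
        raise ValueError(f"unknown threshold type: {threshold_type!r}")

    return Settings(
        node_name=get("NODE_NAME"),  # no documented default
        path_to_check=get("PATH_TO_CHECK", "/var/lib/docker"),
        interval_seconds=float(get("INTERVAL_SECONDS", "300")),
        delay_seconds=float(get("DELAY_SECONDS", "1")),
        threshold_type=threshold_type,
        threshold_high=float(get("THRESHOLD_HIGH", "80")),
        timeout_seconds=float(get("TIMEOUT_SECONDS", "300")),
    )


# Override a single value; everything else falls back to its default.
settings = load_settings({"DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH": "90"})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise without touching the real environment.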

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/jupyterhub/docker-image-cleaner",
    "name": "docker-image-cleaner",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "",
    "keywords": "",
    "author": "Project Jupyter Contributors",
    "author_email": "jupyter@googlegroups.com",
    "download_url": "https://files.pythonhosted.org/packages/42/cf/43aeac8e11c41f2719c80d9542c5526b8780b4e5d2eafd1dc8a69a419087/docker-image-cleaner-1.0.0b3.tar.gz",
    "platform": null,
    "description": "# Docker Image Cleaner\n\n[![GitHub Workflow Status](https://img.shields.io/github/workflow/status/jupyterhub/docker-image-cleaner/Publish?logo=github)](https://github.com/jupyterhub/docker-image-cleaner/actions)\n[![Latest PyPI version](https://img.shields.io/pypi/v/docker-image-cleaner?logo=pypi)](https://pypi.python.org/pypi/docker-image-cleaner)\n[![Latest quay.io image tags](https://img.shields.io/github/v/tag/jupyterhub/docker-image-cleaner?include_prereleases&label=quay.io)](https://quay.io/repository/jupyterhub/docker-image-cleaner?tab=tags)\n\nA Python package (`docker-image-cleaner`) and associated Docker image\n(`quay.io/jupyterhub/docker-image-cleaner`) to clean up old docker images when a\ndisk is running low on inodes or space.\n\nThe script has initially been developed to help installations of BinderHub clean\nup space on nodes as it otherwise can run out of space and stop being able to\nbuild now docker images.\n\n## Why?\n\nContainer images are one of the biggest consumers of disk space\nand inodes on kubernetes nodes. Kubernetes tries to make sure there is enough\ndisk space on each node by [garbage\ncollecting](https://kubernetes.io/docs/concepts/architecture/garbage-collection/#containers-images)\nunused container images and containers. Tuning this is important\nfor [binderhub](https://github.com/jupyterhub/binderhub/) installations,\nas many images are built and used only a couple times. However, on\nmost managed kubernetes installations (like GKE, EKS, etc), we can not\ntune these parameters!\n\nThis script approximates the specific parts of the kubernetes container image\ngarbage collection in a configurable way.\n\n## Requirements\n\n1. Only kubernetes nodes using the `docker` runtime are supported.\n   `containerd` or `cri-o` container backends are not supported.\n2. The script expects to run in a kubernetes `DaemonSet`, with `/var/lib/docker`\n   from the node mounted inside the container. 
This lets the script figure\n   out how much disk space docker container images are actually using.\n3. The `DaemonSet` should have a `ServiceAccount` attached that has permissions\n   to talk to the kubernetes API and cordon / uncordon nodes. This makes sure\n   new pods are not scheduled on to the node while image cleaning is happening,\n   as it can take a while.\n\n## How does it work?\n\n1. Compute how much space `/var/lib/docker` directory (specified by the\n   `DOCKER_IMAGE_CLEANER_PATH_TO_CHECK` environment variable) is taking up.\n2. If the disk space used is greater than the garbage collection trigger threshold\n   (specified by `DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH`), garbage collection is triggered.\n   If not, the script just waits another 5 minutes (set by `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS`).\n3. If garbage collection is triggered, the kubernetes node is first cordoned\n   to prevent any new pods from being scheduled on it for the duration of the\n   garbage collection.\n4. Stopped containers are removed via `docker container prune`.\n5. Dangling images are removed via `docker image prune`\n6. If no dangling images are found to prune, _all_ images are pruned (`docker image prune -a`)\n7. After the garbage collection is done, the kubernetes node is also uncordoned.\n8. 
When done, we wait another 5 minutes (set by `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS`), and repeat\n   the whole process.\n\n## Configuration options\n\nCurrently, environment variables are used to set configuration for now.\n\n| Env variable                            | Description                                                                                                                 | Default           |\n| --------------------------------------- | --------------------------------------------------------------------------------------------------------------------------- | ----------------- |\n| `DOCKER_IMAGE_CLEANER_NODE_NAME`        | The k8s node where the docker image cleaner is running, so it can be cordoned via the k8s api                               |                   |\n| `DOCKER_IMAGE_CLEANER_PATH_TO_CHECK`    | Path to `/var/lib/docker` directory used by the docker daemon                                                               | `/var/lib/docker` |\n| `DOCKER_IMAGE_CLEANER_INTERVAL_SECONDS` | Amount of time (in seconds) to wait between checking if GC needs to be triggered                                            | `300`             |\n| `DOCKER_IMAGE_CLEANER_DELAY_SECONDS`    | Amount of time (in seconds) to wait between deleting container images, so we don't DOS the docker API                       | `1`               |\n| `DOCKER_IMAGE_CLEANER_THRESHOLD_TYPE`   | Determine if GC should be triggered based on relative or absolute disk usage                                                | `relative`        |\n| `DOCKER_IMAGE_CLEANER_THRESHOLD_HIGH`   | % or absolute disk space available (based on `DOCKER_IMAGE_CLEANER_THRESHOLD_TYPE`) when we start deleting container images | `80`              |\n| `DOCKER_IMAGE_CLEANER_TIMEOUT_SECONDS`  | Request timeout (in seconds) for docker API requests. Pruning images often takes minutes. Default: 300 (5 minutes)          |\n",
    "bugtrack_url": null,
    "license": "BSD",
    "summary": "Cleanup old docker images to free up disk space and inodes",
    "version": "1.0.0b3",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "3333dbca7f285711dce62ae1404130b5",
                "sha256": "714746d47d3393f7d17c17b3e3d6af6a7cf08325a14873a62b0a33adb77bd260"
            },
            "downloads": -1,
            "filename": "docker_image_cleaner-1.0.0b3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3333dbca7f285711dce62ae1404130b5",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 8036,
            "upload_time": "2023-01-02T11:04:16",
            "upload_time_iso_8601": "2023-01-02T11:04:16.301090Z",
            "url": "https://files.pythonhosted.org/packages/27/62/973bc20a08f364d271066d98c25727e69c0d2ab795f2a174b8c899ee5043/docker_image_cleaner-1.0.0b3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "f663e5194c55e51f2ac4aae268444008",
                "sha256": "987dad5eec22794dc4f719aecdd3c3e10c423254f1bf14d215fdfd4b4ecad8a8"
            },
            "downloads": -1,
            "filename": "docker-image-cleaner-1.0.0b3.tar.gz",
            "has_sig": false,
            "md5_digest": "f663e5194c55e51f2ac4aae268444008",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 7521,
            "upload_time": "2023-01-02T11:04:17",
            "upload_time_iso_8601": "2023-01-02T11:04:17.297821Z",
            "url": "https://files.pythonhosted.org/packages/42/cf/43aeac8e11c41f2719c80d9542c5526b8780b4e5d2eafd1dc8a69a419087/docker-image-cleaner-1.0.0b3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-02 11:04:17",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "jupyterhub",
    "github_project": "docker-image-cleaner",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "cachetools",
            "specs": [
                [
                    "==",
                    "5.2.0"
                ]
            ]
        },
        {
            "name": "certifi",
            "specs": [
                [
                    "==",
                    "2022.9.24"
                ]
            ]
        },
        {
            "name": "charset-normalizer",
            "specs": [
                [
                    "==",
                    "2.1.1"
                ]
            ]
        },
        {
            "name": "docker",
            "specs": [
                [
                    "==",
                    "6.0.1"
                ]
            ]
        },
        {
            "name": "google-auth",
            "specs": [
                [
                    "==",
                    "2.14.0"
                ]
            ]
        },
        {
            "name": "idna",
            "specs": [
                [
                    "==",
                    "3.4"
                ]
            ]
        },
        {
            "name": "kubernetes",
            "specs": [
                [
                    "==",
                    "25.3.0"
                ]
            ]
        },
        {
            "name": "oauthlib",
            "specs": [
                [
                    "==",
                    "3.2.2"
                ]
            ]
        },
        {
            "name": "packaging",
            "specs": [
                [
                    "==",
                    "21.3"
                ]
            ]
        },
        {
            "name": "pyasn1",
            "specs": [
                [
                    "==",
                    "0.4.8"
                ]
            ]
        },
        {
            "name": "pyasn1-modules",
            "specs": [
                [
                    "==",
                    "0.2.8"
                ]
            ]
        },
        {
            "name": "pyparsing",
            "specs": [
                [
                    "==",
                    "3.0.9"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    "==",
                    "2.8.2"
                ]
            ]
        },
        {
            "name": "pyyaml",
            "specs": [
                [
                    "==",
                    "6.0"
                ]
            ]
        },
        {
            "name": "requests",
            "specs": [
                [
                    "==",
                    "2.28.1"
                ]
            ]
        },
        {
            "name": "requests-oauthlib",
            "specs": [
                [
                    "==",
                    "1.3.1"
                ]
            ]
        },
        {
            "name": "rsa",
            "specs": [
                [
                    "==",
                    "4.9"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    "==",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "urllib3",
            "specs": [
                [
                    "==",
                    "1.26.12"
                ]
            ]
        },
        {
            "name": "websocket-client",
            "specs": [
                [
                    "==",
                    "1.4.1"
                ]
            ]
        }
    ],
    "lcname": "docker-image-cleaner"
}
        