zimscraperlib


- Name: zimscraperlib
- Version: 5.1.0
- Summary: Collection of python tools to re-use common code across scrapers
- Upload time: 2025-01-21 08:31:24
- Requires Python: >=3.13,<3.14
- License: GPL-3.0-or-later
- Keywords: offline, openzim, zim
# zimscraperlib

[![Build Status](https://github.com/openzim/python-scraperlib/workflows/CI/badge.svg?query=branch%3Amain)](https://github.com/openzim/python-scraperlib/actions?query=branch%3Amain)
[![CodeFactor](https://www.codefactor.io/repository/github/openzim/python-scraperlib/badge)](https://www.codefactor.io/repository/github/openzim/python-scraperlib)
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0)
[![PyPI version shields.io](https://img.shields.io/pypi/v/zimscraperlib.svg)](https://pypi.org/project/zimscraperlib/)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/zimscraperlib.svg)](https://pypi.org/project/zimscraperlib)
[![codecov](https://codecov.io/gh/openzim/python-scraperlib/branch/master/graph/badge.svg)](https://codecov.io/gh/openzim/python-scraperlib)
[![Read the Docs](https://img.shields.io/readthedocs/python-scraperlib)](https://python-scraperlib.readthedocs.io/)

A collection of Python code to reuse across Python-based scrapers.

# Usage

- This library is meant to be installed via PyPI ([`zimscraperlib`](https://pypi.org/project/zimscraperlib/)).
- Pin it to a version range when you reference it, because the API is subject to frequent changes.
- The API should remain the same only within the same _minor_ version.

Example usage:

```pip
zimscraperlib>=1.1,<1.2
```

See documentation at [Read the Docs](https://python-scraperlib.readthedocs.io/) for details.
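
The minor-version rule above can be checked programmatically. The following is a minimal sketch, not part of the library: `same_minor_series` and `installed_matches` are hypothetical helper names, and the parsing assumes plain numeric release versions such as `1.1.4` (pre-release suffixes in the first two components are not handled).

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def same_minor_series(candidate: str, pinned: str) -> bool:
    """Return True if both versions share the same major and minor numbers."""

    def major_minor(value: str):
        # Pad with "0" so that a bare "5" is read as "5.0".
        parts = (value.split(".") + ["0"])[:2]
        return int(parts[0]), int(parts[1])

    return major_minor(candidate) == major_minor(pinned)


def installed_matches(pinned: str, package: str = "zimscraperlib") -> Optional[bool]:
    """Compare the installed package version with the pinned minor series.

    Returns None when the package is not installed.
    """
    try:
        return same_minor_series(version(package), pinned)
    except PackageNotFoundError:
        return None
```

For example, `same_minor_series("1.1.4", "1.1")` is true, while `same_minor_series("1.2.0", "1.1")` is false, which mirrors the `>=1.1,<1.2` specifier shown earlier.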

# Dependencies

- libmagic
- wget
- libzim (auto-installed, not available on Windows)
- Pillow
- FFmpeg
- gifsicle (>=1.92)
- libcairo (only needed if you use the image manipulation features, where it handles SVG conversion)
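
Before running a scraper, it can help to confirm that the command-line tools from this list are on the `PATH`. The sketch below is an illustration only and is not provided by the library: `find_tools` and `missing_tools` are hypothetical names, and it assumes the executables are called `wget`, `ffmpeg` and `gifsicle`. It does not check shared libraries such as libmagic or libcairo, nor the gifsicle version.

```python
import shutil

# Assumed executable names for the command-line dependencies listed above.
REQUIRED_TOOLS = ("wget", "ffmpeg", "gifsicle")


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names=REQUIRED_TOOLS):
    """Return the names of the tools that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("All command-line dependencies were found.")
```

If anything is reported missing, the platform-specific commands in the following sections install it.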

## macOS

```sh
brew install libmagic wget libtiff libjpeg webp little-cms2 ffmpeg gifsicle
```

## Linux

```sh
sudo apt install libmagic1 wget ffmpeg \
    libtiff5-dev libjpeg8-dev libopenjp2-7-dev zlib1g-dev \
    libfreetype6-dev liblcms2-dev libwebp-dev tcl8.6-dev tk8.6-dev python3-tk \
    libharfbuzz-dev libfribidi-dev libxcb1-dev gifsicle
```

## Alpine

```sh
apk add ffmpeg gifsicle libmagic wget libjpeg
```

# Contribution

This project adheres to openZIM's [Contribution Guidelines](https://github.com/openzim/overview/wiki/Contributing).

This project has implemented openZIM's [Python bootstrap, conventions and policies](https://github.com/openzim/_python-bootstrap/docs/Policy.md) **v1.0.2**.

```shell
pip install hatch
pip install ".[dev]"
pre-commit install
# For tests
invoke coverage
```

# Users

Non-exhaustive list of scrapers using it (check status when updating API):

- [openzim/freecodecamp](https://github.com/openzim/freecodecamp)
- [openzim/gutenberg](https://github.com/openzim/gutenberg)
- [openzim/ifixit](https://github.com/openzim/ifixit)
- [openzim/kolibri](https://github.com/openzim/kolibri)
- [openzim/nautilus](https://github.com/openzim/nautilus)
- [openzim/openedx](https://github.com/openzim/openedx)
- [openzim/sotoki](https://github.com/openzim/sotoki)
- [openzim/ted](https://github.com/openzim/ted)
- [openzim/warc2zim](https://github.com/openzim/warc2zim)
- [openzim/wikihow](https://github.com/openzim/wikihow)
- [openzim/youtube](https://github.com/openzim/youtube)

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "zimscraperlib",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<3.14,>=3.13",
    "maintainer_email": null,
    "keywords": "offline, openzim, zim",
    "author": null,
    "author_email": "openZIM <dev@openzim.org>",
    "download_url": "https://files.pythonhosted.org/packages/0f/1f/e56bd292c6d9fef87c73ef263041d259d8647c556ff2af04aa50b8e19ad7/zimscraperlib-5.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL-3.0-or-later",
    "summary": "Collection of python tools to re-use common code across scrapers",
    "version": "5.1.0",
    "project_urls": {
        "Donate": "https://www.kiwix.org/en/support-us/",
        "Homepage": "https://github.com/openzim/python-scraperlib"
    },
    "split_keywords": [
        "offline",
        " openzim",
        " zim"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0a5e6f4f21f2a1d3c3391179040efe9a445324771b150fe38ca80801718c0cc0",
                "md5": "f87ec7d4d92e3111b84e615e154b4e43",
                "sha256": "bbb7846ca2d3888e30120af0ae1bff1d10ac35deea47da6e6e519a88d2efb8fe"
            },
            "downloads": -1,
            "filename": "zimscraperlib-5.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f87ec7d4d92e3111b84e615e154b4e43",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<3.14,>=3.13",
            "size": 123780,
            "upload_time": "2025-01-21T08:31:21",
            "upload_time_iso_8601": "2025-01-21T08:31:21.569622Z",
            "url": "https://files.pythonhosted.org/packages/0a/5e/6f4f21f2a1d3c3391179040efe9a445324771b150fe38ca80801718c0cc0/zimscraperlib-5.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0f1fe56bd292c6d9fef87c73ef263041d259d8647c556ff2af04aa50b8e19ad7",
                "md5": "412c2de6631944da2875105078336f05",
                "sha256": "10fb8128ae8b42c6c9da3ee919917767c214baf5294ce93073ff31b2669dcadd"
            },
            "downloads": -1,
            "filename": "zimscraperlib-5.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "412c2de6631944da2875105078336f05",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<3.14,>=3.13",
            "size": 6654521,
            "upload_time": "2025-01-21T08:31:24",
            "upload_time_iso_8601": "2025-01-21T08:31:24.347197Z",
            "url": "https://files.pythonhosted.org/packages/0f/1f/e56bd292c6d9fef87c73ef263041d259d8647c556ff2af04aa50b8e19ad7/zimscraperlib-5.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-01-21 08:31:24",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "openzim",
    "github_project": "python-scraperlib",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "zimscraperlib"
}
        