pydebcontents


 - Name: pydebcontents
 - Version: 0.3.1
 - Summary: Utilities working with Debian repository Contents files
 - Upload time: 2023-12-02 05:22:24
 - Requires Python: >=3.7
 - License: BSD-3-Clause
 - Keywords: debian, package, contents, archive
 - Requirements: none recorded

# pydebcontents: Searching Debian Contents files

Package repositories published by Debian (and its derivatives) include several index files that describe the Releases, Packages, Sources, and the file Contents of the packages.
The Debian wiki has a [full description of the repository format](https://wiki.debian.org/DebianRepository/Format).

Access to the data within the Release, Packages, and Sources files is provided by the [python-debian](https://python-debian-team.pages.debian.net/python-debian/html/) module, available within the Debian archive and from PyPI.

This module provides access to the Contents files.

# Requirements

This module requires no Python modules outside of stdlib.

Searching the Contents files is, however, dependent on the external `zgrep` program being on your PATH; `zgrep` is used to transparently search the gzip-compressed `Contents.gz` files.
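
Because searching depends on an external program, it can be useful to check for `zgrep` before starting. The following is a minimal sketch using only the standard library; the helper name `have_zgrep` is hypothetical and is not part of `pydebcontents`.

```python
import shutil


def have_zgrep() -> bool:
    """Return True if a `zgrep` executable can be found on PATH."""
    # shutil.which returns the full path of the program, or None if absent
    return shutil.which("zgrep") is not None


print("zgrep available:", have_zgrep())
```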

The Contents files need to be arranged as they would be found on a Debian mirror:
`dists/{release}/{component}/Contents-{arch}.gz`.
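
As an illustration of that layout, the path of a single Contents file can be assembled as below. This is only a sketch: the function name `contents_path` and the example base directory are assumptions made for illustration, not part of the module.

```python
from pathlib import Path


def contents_path(base: str, release: str, component: str, arch: str) -> Path:
    """Build the mirror-style path of one Contents file under `base`."""
    return Path(base) / "dists" / release / component / f"Contents-{arch}.gz"


# Hypothetical base directory; substitute the root of your own mirror tree
path = contents_path("/srv/mirror/debian", "sid", "main", "amd64")
print(path, "exists" if path.exists() else "is missing")
```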

Users of the `apt-cacher-ng` package might like to use its local file cache for access to the Contents files in the expected format.

# Installation

From PyPI:

    pip install pydebcontents

From git:

    git clone https://salsa.debian.org/debian-irc-team/pydebcontents
    cd pydebcontents
    pip install .

# Usage

The module comes with a simple command-line interface that feels a bit like the standard `apt-file` program.

For example, to find all the README files shipped in packages:

    py-apt-file --base /var/cache/apt-cacher-ng/debrep/ search --mode glob 'usr/share/doc/*/README'

The only verb that `py-apt-file` knows at present is `search`.

```
$ py-apt-file search --help
usage: py-apt-file search [-h] [--release RELEASE] [--arch ARCH] [--component COMP] [--mode {glob,regex,fixed}]
                          [--max MAX]
                          PATTERN

positional arguments:
  PATTERN               glob, regular expression or fixed string

options:
  -h, --help            show this help message and exit
  --release RELEASE     release to search (default: sid)
  --arch ARCH, --architecture ARCH
                        architecture to search (default: amd64)
  --component COMP      archive components to search (default: all of them)
  --mode {glob,regex,fixed}
                        match mode for pattern
  --max MAX             maximum number of packages to return
```

From Python, the module can be used as:

```python
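import pydebcontents

# Arguments: base directory of the mirror tree, release, architecture, components
contents = pydebcontents.ContentsFile("/var/cache/apt-cacher-ng/debrep/", "sid", "amd64", ["contrib"])

# The search term is a regular expression passed as a str
results = contents.search("usr/share/doc/.*/README")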
```

`search` returns a `ContentsDict`, which is a `dict` whose keys are package entries (in the `{section}/{package}` format used in the Contents files) and whose values are lists of matching filenames.
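
To show how such a mapping might be consumed, here is a small sketch that works on a plain `dict` of the same shape. The sample data and the helper `split_entry` are made up for illustration; they are not real search output and not part of the module.

```python
from typing import Dict, List, Tuple

# Illustrative data only: shaped like the mapping described above
results: Dict[str, List[str]] = {
    "doc/example-doc": ["usr/share/doc/example-doc/README"],
    "utils/example-tool": [
        "usr/share/doc/example-tool/README",
        "usr/share/doc/example-tool/examples/README",
    ],
}


def split_entry(entry: str) -> Tuple[str, str]:
    """Split a `{section}/{package}` key into (section, package)."""
    # Split on the last slash so a multi-part section stays intact
    section, _, package = entry.rpartition("/")
    return section, package


for entry, filenames in sorted(results.items()):
    section, package = split_entry(entry)
    print(f"{package} (section {section}): {len(filenames)} match(es)")
```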

The search term that `ContentsFile.search` uses is a `str` representation of a regular expression.
There are convenience functions in `pydebcontents` for handling search patterns, including navigating some of the foibles of `zgrep` and the Contents file format:

 - `glob2re` converts glob syntax to a regular expression
 - `fixed2re` converts a fixed string into a regular expression
 - `re2re` cleans up an existing regular expression
 - `pattern2re` is for programmatic use in selecting one of the above three functions.
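
For readers unfamiliar with these conversions, the standard library offers rough analogues. The sketch below uses `fnmatch.translate` and `re.escape`; it does not call the `pydebcontents` functions, whose signatures are not shown here, and the dispatcher name `to_regex` is hypothetical. Note that `fnmatch.translate` emits syntax intended for Python's `re` module, so its output is not a drop-in pattern for grep-style tools.

```python
import fnmatch
import re


def to_regex(pattern: str, mode: str) -> str:
    """Turn a glob, fixed string, or regex into a Python regex string."""
    if mode == "glob":
        return fnmatch.translate(pattern)  # result matches the whole string
    if mode == "fixed":
        return re.escape(pattern)  # every character matches literally
    if mode == "regex":
        re.compile(pattern)  # raises re.error if the pattern is invalid
        return pattern
    raise ValueError(f"unknown mode: {mode!r}")


path = "usr/share/doc/foo/README"
print(bool(re.match(to_regex("usr/share/doc/*/README", "glob"), path)))
print(bool(re.search(to_regex("doc/foo", "fixed"), path)))
```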


## To-do list / limitations

 - A previous attempt at a Python-only implementation was too slow to be usable for searching the Contents files; this could be revisited.
 - The mirrors are now carrying other compression formats such as `xz` that will not be found or used at present.
 - There is no utility provided to obtain the Contents files and arrange them on disk in a suitable tree.
 - There is no way to point directly at a Contents file on disk that is not in the expected tree layout.
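
To make the first and last items concrete, a Python-only search over a single file given by path could look roughly like the sketch below. It is not the module's implementation and no claim is made about its speed. It assumes each line holds a file path, then whitespace, then a comma-separated list of package entries, with no free-form header at the top of the file. The function name `search_contents` is hypothetical.

```python
import gzip
import re
from collections import defaultdict
from typing import Dict, List


def search_contents(path: str, regex: str) -> Dict[str, List[str]]:
    """Return {package entry: [matching filenames]} for one gzip-compressed Contents file."""
    pattern = re.compile(regex)
    results = defaultdict(list)
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Filenames may contain spaces, so split once from the right
            parts = line.rstrip("\n").rsplit(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            filename, packages = parts
            if pattern.search(filename):
                for package in packages.split(","):
                    results[package].append(filename)
    return dict(results)
```

Calling `search_contents("Contents-amd64.gz", "usr/share/doc/.*/README")` on a local file would return a plain `dict` in the same shape as the search results described earlier.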

            

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "pydebcontents",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": "",
    "keywords": "debian,package,contents,archive",
    "author": "",
    "author_email": "Stuart Prescott <stuart@debian.org>",
    "download_url": "https://files.pythonhosted.org/packages/02/d2/b876e7b68a0ae424842f425a0fb946b855a9933f1f9fc6143b15be12f602/pydebcontents-0.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "BSD-3-Clause",
    "summary": "Utilities working with Debian repository Contents files",
    "version": "0.3.1",
    "project_urls": {
        "homepage": "https://salsa.debian.org/debian-irc-team/pydebcontents/"
    },
    "split_keywords": [
        "debian",
        "package",
        "contents",
        "archive"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fadb9da949ceda90e21b2a8f73c6ef43f52354f0adcfa8ccaf054ddf7439af8a",
                "md5": "dfa0de664ad82cdcd801ba6c32002047",
                "sha256": "66046376c625e08eff59f2b4de52833328a93e6e1d41c3f0964b0d6772de03e6"
            },
            "downloads": -1,
            "filename": "pydebcontents-0.3.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "dfa0de664ad82cdcd801ba6c32002047",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 16099,
            "upload_time": "2023-12-02T05:22:22",
            "upload_time_iso_8601": "2023-12-02T05:22:22.434404Z",
            "url": "https://files.pythonhosted.org/packages/fa/db/9da949ceda90e21b2a8f73c6ef43f52354f0adcfa8ccaf054ddf7439af8a/pydebcontents-0.3.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "02d2b876e7b68a0ae424842f425a0fb946b855a9933f1f9fc6143b15be12f602",
                "md5": "cac241019a0b12e75df1d2bc593057f4",
                "sha256": "d1fe3a8de2eb140d92b33f87d23c45908777004121feaa3a1fe2362449b58c46"
            },
            "downloads": -1,
            "filename": "pydebcontents-0.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "cac241019a0b12e75df1d2bc593057f4",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 12938,
            "upload_time": "2023-12-02T05:22:24",
            "upload_time_iso_8601": "2023-12-02T05:22:24.004707Z",
            "url": "https://files.pythonhosted.org/packages/02/d2/b876e7b68a0ae424842f425a0fb946b855a9933f1f9fc6143b15be12f602/pydebcontents-0.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-02 05:22:24",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "pydebcontents"
}
        