:Name: getdents
:Version: 0.4.0
:Summary: Python binding to the Linux syscall getdents64.
:Uploaded: 2023-09-11 21:35:16
:Author: ZipFile <zipfile.d@protonmail.com>
:Requires Python: >=3.8
:License: BSD-2-Clause
:Keywords: getdents
:Source: https://github.com/ZipFile/python-getdents

===============
Python getdents
===============

Iterate over large directories efficiently with Python.

About
=====

``python-getdents`` is a simple wrapper around the Linux system call ``getdents64`` (see ``man getdents`` for details). `More details <http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html>`_ on the approach.
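
To make the approach more concrete, the sketch below decodes a buffer of ``linux_dirent64`` records, the layout that ``man getdents`` describes for the data the kernel returns. It is not the package's actual C implementation; ``parse_dirents`` and ``make_record`` are hypothetical helpers, and the sample buffer is synthetic rather than read from a real directory.

.. code-block:: python

    import os
    import struct

    # Fixed-size header of struct linux_dirent64:
    # d_ino (u64), d_off (s64), d_reclen (u16), d_type (u8); d_name follows.
    _HEADER = struct.Struct('=QqHB')


    def parse_dirents(buf):
        """Yield (inode, type, name) from a buffer of linux_dirent64 records."""
        pos = 0
        while pos < len(buf):
            inode, _off, reclen, d_type = _HEADER.unpack_from(buf, pos)
            # d_name is NUL-terminated and padded up to d_reclen bytes.
            raw = buf[pos + _HEADER.size:pos + reclen]
            yield inode, d_type, os.fsdecode(raw.split(b'\0', 1)[0])
            pos += reclen


    def make_record(inode, d_type, name):
        """Build one synthetic record, padded to an 8-byte boundary."""
        raw = os.fsencode(name) + b'\0'
        reclen = -(-(_HEADER.size + len(raw)) // 8) * 8
        return (_HEADER.pack(inode, 0, reclen, d_type) + raw).ljust(reclen, b'\0')


    # Two made-up entries: a regular file (type 8) and a directory (type 4).
    buf = make_record(11, 8, 'a.txt') + make_record(12, 4, 'subdir')
    print(list(parse_dirents(buf)))

Because each record carries its own length in ``d_reclen``, a single system call with a large buffer can return many entries at once, which is where the speed-up on huge directories comes from.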

TODO
====

* Verify that the implementation works on platforms other than ``x86_64``.

Install
=======

.. code-block:: sh

    pip install getdents

For development
---------------

.. code-block:: sh

    python3 -m venv env
    . env/bin/activate
    pip install -e '.[test]'

Building Wheels
~~~~~~~~~~~~~~~

.. code-block:: sh

    pip install cibuildwheel
    cibuildwheel --platform linux --output-dir wheelhouse

Run tests
=========

.. code-block:: sh

    ulimit -v 33554432 && pytest tests/

Or

.. code-block:: sh

    ulimit -v 33554432 && ./setup.py test

Usage
=====

.. code-block:: python

    from getdents import getdents

    for inode, type_, name in getdents('/tmp', 32768):
        print(name)
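
Since the package wraps a Linux-only system call, code that must also run elsewhere may want a fallback. The following is a minimal sketch, assuming that falling back to ``os.scandir`` is acceptable; ``iter_names`` is a hypothetical helper, not part of the package.

.. code-block:: python

    import os

    try:
        from getdents import getdents  # Linux-only C extension
    except ImportError:
        getdents = None


    def iter_names(path, buff_size=32768):
        """Yield entry names in ``path``, skipping ``.`` and ``..``."""
        if getdents is not None:
            for _inode, _type, name in getdents(path, buff_size):
                if name not in ('.', '..'):
                    yield name
        else:
            # Fallback for platforms where the extension is unavailable.
            with os.scandir(path) as entries:
                for entry in entries:
                    yield entry.name

Both branches produce names lazily, so the caller never holds the whole directory listing in memory.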

Advanced
--------

.. code-block:: python

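    import os
    from getdents import (
        DT_BLK, DT_CHR, DT_DIR, DT_FIFO, DT_LNK, DT_REG, DT_SOCK, DT_UNKNOWN,
        O_GETDENTS, getdents_raw,
    )

    # Fixed-width labels for each entry type.
    TYPE_LABELS = {
        DT_BLK:     'blockdev',
        DT_CHR:     'chardev ',
        DT_DIR:     'dir     ',
        DT_FIFO:    'pipe    ',
        DT_LNK:     'symlink ',
        DT_REG:     'file    ',
        DT_SOCK:    'socket  ',
        DT_UNKNOWN: 'unknown ',
    }

    fd = os.open('/tmp', O_GETDENTS)

    try:
        for inode, type_, name in getdents_raw(fd, 2**20):
            # 'd' flags entries whose inode is 0 (usually deleted entries).
            print(TYPE_LABELS[type_], 'd' if inode == 0 else ' ', name)
    finally:
        # Close the descriptor even if iteration fails.
        os.close(fd)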
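
According to ``man getdents``, not every file system fills in the entry type, so callers have to cope with ``DT_UNKNOWN``. The sketch below shows one way to do that with ``os.lstat``. ``resolve_type`` is a hypothetical helper, and the numeric constants are the values from Linux's ``<dirent.h>``, assumed here to match the package's ``DT_*`` constants of the same names.

.. code-block:: python

    import os
    import stat

    # d_type values from Linux's <dirent.h> (assumed equal to the package's
    # DT_* constants).
    DT_UNKNOWN, DT_DIR, DT_REG, DT_LNK = 0, 4, 8, 10


    def resolve_type(dir_path, name, d_type):
        """Return d_type, falling back to lstat() when it is DT_UNKNOWN."""
        if d_type != DT_UNKNOWN:
            return d_type
        mode = os.lstat(os.path.join(dir_path, name)).st_mode
        # Same conversion as the C macro IFTODT: file-type bits shifted down.
        return stat.S_IFMT(mode) >> 12

The extra ``lstat`` call costs one system call per entry, so it is only made when the kernel did not report a type.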

CLI
---

Usage
~~~~~

::

    python-getdents [-h] [-b N] [-o NAME] PATH

Options
~~~~~~~

+--------------------------+-------------------------------------------------+
| Option                   | Description                                     |
+==========================+=================================================+
| ``-b N``                 | Buffer size (in bytes) to allocate when         |
|                          | iterating over a directory. Default is 32768,   |
|                          | the same value glibc uses; you will probably    |
+--------------------------+ want to increase it. Try starting with 16777216 |
| ``--buffer-size N``      | (16 MiB). Performance is best when the buffer   |
|                          | size is a multiple of the file system block     |
|                          | size.                                           |
+--------------------------+-------------------------------------------------+
| ``-o NAME``              | Output format:                                  |
|                          |                                                 |
|                          | * ``plain`` (default): Print only names.        |
|                          | * ``csv``: Print comma-separated values in      |
+--------------------------+   order: inode, type, name.                     |
| ``--output-format NAME`` | * ``csv-headers``: Same as ``csv``, but also    |
|                          |   print headers on the first line.              |
|                          | * ``json``: Output a JSON array.                |
|                          | * ``json-stream``: Output each directory entry  |
|                          |   as a single JSON object, one per line.        |
+--------------------------+-------------------------------------------------+
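
One reading of the block-size advice for ``-b`` is to pick a buffer size that is a whole multiple of the file system block size. The sketch below does that with ``os.statvfs``; ``aligned_buffer_size`` is a hypothetical helper, and whether this alignment helps in practice depends on the file system.

.. code-block:: python

    import os


    def aligned_buffer_size(path, requested=16 * 1024 * 1024):
        """Round ``requested`` up to a multiple of the block size of ``path``."""
        block = os.statvfs(path).f_bsize
        # Ceiling division, then scale back up to a whole number of blocks.
        return -(-requested // block) * block

The result can be passed to ``-b`` on the command line or used as the buffer size argument of ``getdents``.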

Exit codes
~~~~~~~~~~

* 3 - Requested buffer is too large.
* 4 - ``PATH`` not found.
* 5 - ``PATH`` is not a directory.
* 6 - Insufficient permissions to read the contents of ``PATH``.
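
A script that drives the CLI can translate these codes into messages. The following is a minimal sketch; ``describe_exit`` and ``list_directory`` are hypothetical helpers, and treating ``0`` as success is the usual convention rather than something stated above.

.. code-block:: python

    import subprocess

    # Exit codes documented above; 0 is the conventional success code.
    EXIT_MESSAGES = {
        0: 'success',
        3: 'requested buffer is too large',
        4: 'PATH not found',
        5: 'PATH is not a directory',
        6: 'not enough permissions to read PATH',
    }


    def describe_exit(code):
        """Translate a python-getdents exit code into a short message."""
        return EXIT_MESSAGES.get(code, f'unexpected exit code {code}')


    def list_directory(path, buff_size=32768):
        """Run the CLI in plain mode and return the printed names."""
        proc = subprocess.run(
            ['python-getdents', '-b', str(buff_size), path],
            capture_output=True, text=True,
        )
        if proc.returncode != 0:
            raise RuntimeError(describe_exit(proc.returncode))
        # Note: names containing newlines would be split here.
        return proc.stdout.splitlines()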

Examples
~~~~~~~~

.. code-block:: sh

    python-getdents /path/to/large/dir
    python -m getdents /path/to/large/dir
    python-getdents /path/to/large/dir -o csv -b 16777216 > dir.csv

            
