:Name: getdents
:Version: 1.0.0
:Summary: Python binding to the Linux syscall getdents64.
:Upload time: 2025-09-02 18:42:33
:Requires Python: >=3.10
:Keywords: getdents

===============
Python getdents
===============
Iterate large directories efficiently with Python.
About
=====
``python-getdents`` is a simple wrapper around the Linux system call ``getdents64`` (see ``man getdents`` for details).
The implementation is based on the solution described in the article `You can list a directory containing 8 million files! But not with ls. <http://be-n.com/spw/you-can-list-a-million-files-in-a-directory-but-not-with-ls.html>`_ by Ben Congleton.
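
For orientation, the buffer filled by ``getdents64`` is a sequence of variable-length ``linux_dirent64`` records, as described in ``man getdents``. The sketch below decodes such a buffer in pure Python. It is not the package's implementation, and ``pack_entry`` and ``parse_dirents`` are hypothetical helper names introduced only for this illustration.

.. code-block:: python

    import struct

    # Fixed-size header of struct linux_dirent64: d_ino (u64), d_off (s64),
    # d_reclen (u16), d_type (u8). The NUL-terminated name follows it.
    HEADER = struct.Struct("=QqHB")


    def pack_entry(inode, type_, name):
        """Build one record for demonstration (hypothetical helper)."""
        reclen = (HEADER.size + len(name) + 1 + 7) & ~7  # pad record to 8 bytes
        body = HEADER.pack(inode, 0, reclen, type_) + name + b"\0"
        return body.ljust(reclen, b"\0")


    def parse_dirents(buf):
        """Yield (inode, type, name) for every record in the buffer."""
        pos = 0
        while pos < len(buf):
            inode, _off, reclen, type_ = HEADER.unpack_from(buf, pos)
            raw = buf[pos + HEADER.size:pos + reclen]
            yield inode, type_, raw.split(b"\0", 1)[0]  # cut at the first NUL
            pos += reclen  # d_reclen gives the distance to the next record


    buf = pack_entry(11, 8, b"file.txt") + pack_entry(12, 4, b"sub")
    print(list(parse_dirents(buf)))

Walking the buffer by ``d_reclen`` rather than by name length is what lets a single syscall return many entries at once.
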
Install
=======
.. code-block:: sh

    pip install getdents

For development
---------------
.. code-block:: sh

    python3 -m venv env
    . env/bin/activate
    pip install -e .[test]

Building Wheels
~~~~~~~~~~~~~~~
.. code-block:: sh

    pip install cibuildwheel
    cibuildwheel --platform linux --output-dir wheelhouse

Run tests
=========
.. code-block:: sh

    ulimit -v 33554432 && py.test tests/

Usage
=====
.. code-block:: python

    from getdents import getdents

    # Each entry is an (inode, type, name) tuple.
    for inode, type_, name in getdents("/tmp"):
        print(name)

Advanced
--------
While ``getdents`` provides a convenient wrapper with ls-like filtering, you can use ``getdents_raw`` for more control:
.. code-block:: python

    import os
    from getdents import DT_LNK, O_GETDENTS, getdents_raw

    fd = os.open("/tmp", O_GETDENTS)

    try:
        # The second argument is the buffer size in bytes (1 MiB here).
        for inode, type_, name in getdents_raw(fd, 2**20):
            if type_ == DT_LNK and inode != 0:
                print("found symlink:", name, "->", os.readlink(name, dir_fd=fd))
    finally:
        # Close the descriptor even if iteration fails.
        os.close(fd)

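
The README does not spell out what "ls-like filtering" means. One plausible reading, stated here purely as an assumption, is that entries with inode 0 and the special names ``.`` and ``..`` are skipped. ``ls_like`` below is a hypothetical helper that works on plain tuples; it is not part of the package.

.. code-block:: python

    def ls_like(entries):
        """Drop entries a plain ``ls`` would not show (assumed rules)."""
        for inode, type_, name in entries:
            if inode == 0:
                continue  # assumed to mark a deleted or unused entry
            if name in (".", "..", b".", b".."):
                continue  # self and parent links, as str or bytes
            yield inode, type_, name


    raw = [(2, 4, "."), (1, 4, ".."), (0, 8, "gone"), (42, 8, "kept.txt")]
    print(list(ls_like(raw)))

With ``getdents_raw`` you would apply whatever rules you need yourself, which is the extra control the text refers to.
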
Batching
~~~~~~~~
If you need more control over the syscalls, you can call the ``getdents_raw`` instance instead of iterating over it.
Each call corresponds to a single ``getdents64`` syscall and returns a list of however many entries fit in the buffer.
A call returns ``None`` when there are no more entries to read.
.. code-block:: python

    it = getdents_raw(fd, 2**20)

    # iter(callable, sentinel) keeps calling ``it`` until it returns None.
    for batch in iter(it, None):
        for inode, type_, name in batch:
            ...

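
The loop above relies on the two-argument form of the built-in ``iter``, which calls a callable repeatedly until it returns the sentinel. The sketch below shows the same pattern without touching the file system. ``FakeBatches`` is a hypothetical stand-in that only mimics the call-returns-a-batch-or-``None`` behaviour described here.

.. code-block:: python

    class FakeBatches:
        """Hypothetical stand-in: each call returns the next batch, then None."""

        def __init__(self, batches):
            self._batches = iter(batches)

        def __call__(self):
            return next(self._batches, None)


    it = FakeBatches([[(1, 8, "a"), (2, 8, "b")], [(3, 4, "c")]])

    names = []
    for batch in iter(it, None):  # stops at the first None
        for _inode, _type, name in batch:
            names.append(name)

    print(names)
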
Free-threading
~~~~~~~~~~~~~~
Doing I/O from multiple threads on a single file descriptor is rarely a wise idea, but you can do it if you need to.
This package supports free-threaded (no-GIL) Python builds.
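
As a rough illustration of that usage pattern, the sketch below has several worker threads pull batches from one shared callable. ``LockedBatches`` and ``drain`` are hypothetical names. The lock protects the stand-in's plain Python iterator; whether the real object needs external locking is not stated in this README.

.. code-block:: python

    import threading


    class LockedBatches:
        """Hypothetical stand-in for a batch callable shared between threads."""

        def __init__(self, batches):
            self._batches = iter(batches)
            self._lock = threading.Lock()

        def __call__(self):
            with self._lock:  # one thread advances the iterator at a time
                return next(self._batches, None)


    def drain(source, out, out_lock):
        """Collect names from batches until the source returns None."""
        for batch in iter(source, None):
            with out_lock:
                out.extend(name for _inode, _type, name in batch)


    source = LockedBatches([[(i, 8, f"f{i}")] for i in range(100)])
    seen, seen_lock = [], threading.Lock()
    workers = [
        threading.Thread(target=drain, args=(source, seen, seen_lock))
        for _ in range(4)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()

    print(len(seen))

The order in which names arrive in ``seen`` depends on thread scheduling, so treat the result as an unordered collection.
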
CLI
---
Usage
~~~~~
::

    python-getdents [-h] [-b N] [-o NAME] PATH

Options
~~~~~~~
+--------------------------+-------------------------------------------------+
| Option                   | Description                                     |
+==========================+=================================================+
| ``-b N``                 | Buffer size (in bytes) to allocate when         |
|                          | iterating over a directory. Default is 32768,   |
|                          | the same value glibc uses; you will probably    |
+--------------------------+ want to increase it. Try starting with 16777216 |
| ``--buffer-size N``      | (16 MiB). Performance is best when the buffer   |
|                          | size is rounded to the file system block size.  |
+--------------------------+-------------------------------------------------+
| ``-o NAME``              | Output format:                                  |
|                          |                                                 |
|                          | * ``plain`` (default): print only names.        |
|                          | * ``csv``: print comma-separated values in      |
+--------------------------+   order: inode, type, name.                     |
| ``--output-format NAME`` | * ``csv-headers``: same as ``csv``, but also    |
|                          |   print headers on the first line.              |
|                          | * ``json``: output a JSON array.                |
|                          | * ``json-stream``: output each directory        |
|                          |   entry as a single JSON object, one per line.  |
+--------------------------+-------------------------------------------------+

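
One way to follow the block-size advice for ``-b`` is to ask the file system for its block size and round the requested buffer up to a whole number of blocks. This is a sketch using only ``os.statvfs``. ``round_to_block`` is a hypothetical helper, and treating "rounded to the block size" as "a multiple of the block size" is an assumption.

.. code-block:: python

    import os


    def round_to_block(requested, path):
        """Round ``requested`` bytes up to a multiple of the block size at ``path``."""
        block = os.statvfs(path).f_bsize  # preferred file system block size
        return -(-requested // block) * block  # ceiling division, then scale back


    print(round_to_block(16 * 1024 * 1024, "/"))

The result could then be passed as the value of ``-b``.
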
Exit codes
~~~~~~~~~~
* 3 - Requested buffer is too large.
* 4 - ``PATH`` not found.
* 5 - ``PATH`` is not a directory.
* 6 - Insufficient permissions to read the contents of ``PATH``.
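
A script that shells out to the CLI could translate these codes into readable errors. The sketch below is one possible wrapper, not something shipped with the package. ``EXIT_CODES``, ``explain`` and ``list_dir`` are hypothetical names, and treating exit code 0 as success is the usual convention rather than something this README states.

.. code-block:: python

    import subprocess

    # Messages for the exit codes documented above.
    EXIT_CODES = {
        3: "requested buffer is too large",
        4: "PATH not found",
        5: "PATH is not a directory",
        6: "not enough permissions to read PATH",
    }


    def explain(returncode):
        """Turn a python-getdents exit code into a short message."""
        if returncode == 0:
            return "ok"  # assumption: 0 means success
        return EXIT_CODES.get(returncode, f"unexpected exit code {returncode}")


    def list_dir(path):
        """Run the CLI in its default plain format and return the names."""
        proc = subprocess.run(
            ["python-getdents", path], capture_output=True, text=True
        )
        if proc.returncode != 0:
            raise RuntimeError(explain(proc.returncode))
        return proc.stdout.splitlines()


    print(explain(5))
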
Examples
~~~~~~~~
.. code-block:: sh

    python-getdents /path/to/large/dir
    python -m getdents /path/to/large/dir
    python-getdents /path/to/large/dir -o csv -b 16777216 > dir.csv

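
The exact output of ``-o csv`` and ``-o json-stream`` (quoting, field names, how the type is rendered) is not shown in this README. The sketch below only illustrates the general shape of the two formats with the standard library. The function names, the ``inode``/``type``/``name`` keys and the numeric type values are assumptions.

.. code-block:: python

    import csv
    import io
    import json

    entries = [(11, 8, "a.txt"), (12, 4, "sub")]


    def to_csv(rows, headers=False):
        """Render rows as CSV in the order inode, type, name."""
        out = io.StringIO()
        writer = csv.writer(out, lineterminator="\n")
        if headers:
            writer.writerow(["inode", "type", "name"])  # assumed header names
        writer.writerows(rows)
        return out.getvalue()


    def to_json_stream(rows):
        """Render one JSON object per line."""
        return "".join(
            json.dumps({"inode": inode, "type": type_, "name": name}) + "\n"
            for inode, type_, name in rows
        )


    print(to_csv(entries, headers=True), end="")
    print(to_json_stream(entries), end="")

A line-per-object stream can be consumed incrementally, which matters for directories with millions of entries, whereas a single JSON array has to be closed before most parsers accept it.
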
Raw data
{
"_id": null,
"home_page": null,
"name": "getdents",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "getdents",
"author": null,
"author_email": "ZipFile <zipfile.d@protonmail.com>",
"download_url": "https://files.pythonhosted.org/packages/10/aa/cbdc87f71e8659f579557beb5d719e82459f70cdac6c089f948bce6cd76a/getdents-1.0.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Python binding to linux syscall getdents64.",
"version": "1.0.0",
"project_urls": {
"Source": "https://github.com/ZipFile/python-getdents"
},
"split_keywords": [
"getdents"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a192a28176f225841e06fd8c27c37951b045df648f89b8f2f04c65be430aef73",
"md5": "eec3877d3b41e7931ae89dd6fb7fe697",
"sha256": "f62d1edd1522fd044439589c4e8c200b94a677d81ae3b86320eff8e3cd8ccb10"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp310-abi3-manylinux_2_28_aarch64.whl",
"has_sig": false,
"md5_digest": "eec3877d3b41e7931ae89dd6fb7fe697",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 16275,
"upload_time": "2025-09-02T18:42:24",
"upload_time_iso_8601": "2025-09-02T18:42:24.282174Z",
"url": "https://files.pythonhosted.org/packages/a1/92/a28176f225841e06fd8c27c37951b045df648f89b8f2f04c65be430aef73/getdents-1.0.0-cp310-abi3-manylinux_2_28_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "8a2c4765c59e3349d60856c77215b32f5fe6b6cb2cc1675639d359fd12424a30",
"md5": "36e823a5462464cd03296d9af38c85e3",
"sha256": "49cd092b360b52a40802ef6fa08e50346ad36dd67f63f05781501f957ca21ab0"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp310-abi3-manylinux_2_28_x86_64.whl",
"has_sig": false,
"md5_digest": "36e823a5462464cd03296d9af38c85e3",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 15238,
"upload_time": "2025-09-02T18:42:25",
"upload_time_iso_8601": "2025-09-02T18:42:25.600711Z",
"url": "https://files.pythonhosted.org/packages/8a/2c/4765c59e3349d60856c77215b32f5fe6b6cb2cc1675639d359fd12424a30/getdents-1.0.0-cp310-abi3-manylinux_2_28_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "3a05a9d854115b73022e681cfa916e24cf70d38eaf19bc57a592097fa99b601f",
"md5": "ef26257f57c1361a350a61c2f53e4913",
"sha256": "8f992fa25380d76f88cb89cd582b1cb5b64e9bb1142cb26776ccf3b40044f7f4"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp310-abi3-musllinux_1_2_aarch64.whl",
"has_sig": false,
"md5_digest": "ef26257f57c1361a350a61c2f53e4913",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 16206,
"upload_time": "2025-09-02T18:42:26",
"upload_time_iso_8601": "2025-09-02T18:42:26.965935Z",
"url": "https://files.pythonhosted.org/packages/3a/05/a9d854115b73022e681cfa916e24cf70d38eaf19bc57a592097fa99b601f/getdents-1.0.0-cp310-abi3-musllinux_1_2_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "a897c31cb9dafdba8edba3983e14fc063cd885c99c0a5d4d0da3c692d43b06c0",
"md5": "a41f6318f5d4be27891e378b301d9226",
"sha256": "cdd21d302592fa4c2b4e983d30b07a0be8b41846ec1413d2ffd2034a287e25ce"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp310-abi3-musllinux_1_2_x86_64.whl",
"has_sig": false,
"md5_digest": "a41f6318f5d4be27891e378b301d9226",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.10",
"size": 15520,
"upload_time": "2025-09-02T18:42:27",
"upload_time_iso_8601": "2025-09-02T18:42:27.945715Z",
"url": "https://files.pythonhosted.org/packages/a8/97/c31cb9dafdba8edba3983e14fc063cd885c99c0a5d4d0da3c692d43b06c0/getdents-1.0.0-cp310-abi3-musllinux_1_2_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "6ee4a4c3172a2e1d17621dd52884d48a5120673379ec478ff8e1c312124770fe",
"md5": "82dda60e563df604bf5c0d3fb0cb98f4",
"sha256": "7c6e461ec4d14e8ea668faab5e68467940dd0d8000f4c6d2f3f91832bddb0769"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp314-cp314t-manylinux_2_28_aarch64.whl",
"has_sig": false,
"md5_digest": "82dda60e563df604bf5c0d3fb0cb98f4",
"packagetype": "bdist_wheel",
"python_version": "cp314",
"requires_python": ">=3.10",
"size": 17234,
"upload_time": "2025-09-02T18:42:28",
"upload_time_iso_8601": "2025-09-02T18:42:28.923701Z",
"url": "https://files.pythonhosted.org/packages/6e/e4/a4c3172a2e1d17621dd52884d48a5120673379ec478ff8e1c312124770fe/getdents-1.0.0-cp314-cp314t-manylinux_2_28_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "5a8a4055f0eeb93b7a9251ff720dc1c4d5352cac7b36a975ffaf072073c105a4",
"md5": "bc8b947e075fba300d11f32d39c1c1b0",
"sha256": "2ce612bdc9cc3690dc568b360bafee319afca151b9abb38fea376e7ddd344085"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp314-cp314t-manylinux_2_28_x86_64.whl",
"has_sig": false,
"md5_digest": "bc8b947e075fba300d11f32d39c1c1b0",
"packagetype": "bdist_wheel",
"python_version": "cp314",
"requires_python": ">=3.10",
"size": 15888,
"upload_time": "2025-09-02T18:42:29",
"upload_time_iso_8601": "2025-09-02T18:42:29.917038Z",
"url": "https://files.pythonhosted.org/packages/5a/8a/4055f0eeb93b7a9251ff720dc1c4d5352cac7b36a975ffaf072073c105a4/getdents-1.0.0-cp314-cp314t-manylinux_2_28_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d361e7304e86899d2b8181ab23035e54b0322f70c019439ece5b83b0cf1888bf",
"md5": "842e3926720a0448ebdd7f525d1d56b6",
"sha256": "381f0081be3bdd249f121e51b13b477c615b516be31d08b8bf6b839ea968b48e"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp314-cp314t-musllinux_1_2_aarch64.whl",
"has_sig": false,
"md5_digest": "842e3926720a0448ebdd7f525d1d56b6",
"packagetype": "bdist_wheel",
"python_version": "cp314",
"requires_python": ">=3.10",
"size": 17037,
"upload_time": "2025-09-02T18:42:31",
"upload_time_iso_8601": "2025-09-02T18:42:31.070727Z",
"url": "https://files.pythonhosted.org/packages/d3/61/e7304e86899d2b8181ab23035e54b0322f70c019439ece5b83b0cf1888bf/getdents-1.0.0-cp314-cp314t-musllinux_1_2_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "b8c26bf256ec5358ab95608f0f1a9671f7e441bfb9045ce046f8396d5be4d609",
"md5": "34163cf17a58dd7f062fca53650655a5",
"sha256": "35238c0e4fa94b266099abd00391ce1716d439cd4b127427ac385bc49fa230cd"
},
"downloads": -1,
"filename": "getdents-1.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl",
"has_sig": false,
"md5_digest": "34163cf17a58dd7f062fca53650655a5",
"packagetype": "bdist_wheel",
"python_version": "cp314",
"requires_python": ">=3.10",
"size": 16186,
"upload_time": "2025-09-02T18:42:32",
"upload_time_iso_8601": "2025-09-02T18:42:32.006247Z",
"url": "https://files.pythonhosted.org/packages/b8/c2/6bf256ec5358ab95608f0f1a9671f7e441bfb9045ce046f8396d5be4d609/getdents-1.0.0-cp314-cp314t-musllinux_1_2_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "10aacbdc87f71e8659f579557beb5d719e82459f70cdac6c089f948bce6cd76a",
"md5": "c1d657d70c3245cde663d587b5b793ae",
"sha256": "80ab2825a09e5b1107fe3d166458d01d4a7cedfe255ee9762d12c68c9f890d24"
},
"downloads": -1,
"filename": "getdents-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "c1d657d70c3245cde663d587b5b793ae",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 13823,
"upload_time": "2025-09-02T18:42:33",
"upload_time_iso_8601": "2025-09-02T18:42:33.027385Z",
"url": "https://files.pythonhosted.org/packages/10/aa/cbdc87f71e8659f579557beb5d719e82459f70cdac6c089f948bce6cd76a/getdents-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-02 18:42:33",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ZipFile",
"github_project": "python-getdents",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "getdents"
}