htmllistparse


Name: htmllistparse
Version: 0.6.1
Home page: https://github.com/gumblex/htmllisting-parser
Summary: Python parser for Apache/nginx-style HTML directory listing.
Upload time: 2023-06-03 11:48:41
Author: Dingyuan Wang
License: MIT
Keywords: apache nginx listing fuse
Requirements: No requirements were recorded.

htmllisting-parser
==================
Python parser for Apache/nginx-style HTML directory listing

.. code-block:: python

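   import htmllistparse

   some_url = "https://example.org/pub/"  # placeholder; use your own listing URL
   cwd, listing = htmllistparse.fetch_listing(some_url, timeout=30)

   # Alternatively, fetch the page yourself, build a BeautifulSoup object
   # from it, and call:
   # cwd, listing = htmllistparse.parse(soup)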

where ``cwd`` is the current directory and ``listing`` is a list of ``FileEntry`` named tuples with the following fields:

* ``name``: File name, ``str``. Has a trailing ``/`` if the entry is a directory.
* ``modified``: Last modification time, ``time.struct_time`` or ``None``. The timezone is not known.
* ``size``: File size, ``int`` or ``None``. May be an estimate when the listing shows a rounded value with a unit prefix such as "K" or "M".
* ``description``: File description, file type, or anything else found in the listing. ``str`` containing HTML, or ``None``.
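
The fields above can be consumed as in the following minimal sketch. It does not import the library: ``FileEntry`` here is a local stand-in with the four documented field names, and ``summarize``, ``format_entry`` and ``sample`` are hypothetical names introduced only for illustration.

```python
import time
from collections import namedtuple

# Stand-in for illustration only; the real FileEntry comes from htmllistparse.
FileEntry = namedtuple("FileEntry", "name modified size description")


def summarize(listing):
    """Split a listing into directory names, file names, and total known size."""
    dirs = [e.name for e in listing if e.name.endswith("/")]
    files = [e for e in listing if not e.name.endswith("/")]
    # size may be None, so only add up the sizes that are known.
    total = sum(e.size for e in files if e.size is not None)
    return dirs, [e.name for e in files], total


def format_entry(entry):
    """Render one entry as a tab-separated line, tolerating None fields."""
    if entry.modified is None:
        when = "unknown"
    else:
        # modified is a time.struct_time without timezone information.
        when = time.strftime("%Y-%m-%d %H:%M", entry.modified)
    size = "-" if entry.size is None else str(entry.size)
    return f"{entry.name}\t{size}\t{when}"


# Hand-written sample data standing in for a parsed listing.
sample = [
    FileEntry("docs/", None, None, None),
    FileEntry(
        "htmllistparse-0.6.1.tar.gz",
        time.strptime("2023-06-03 11:48", "%Y-%m-%d %H:%M"),
        10035,
        None,
    ),
    FileEntry("notes.txt", None, None, None),
]

for entry in sample:
    print(format_entry(entry))
```

Because ``modified`` and ``size`` can both be ``None``, the sketch checks each before formatting or summing.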

Supports:

* Vanilla Apache/nginx/lighttpd/darkhttpd autoindex
* Most ``<pre>``-style indexes
* Many other ``<table>``-style indexes
* ``<ul>``-style indexes
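
For readers unfamiliar with the ``<pre>``-style format, the sketch below shows a hand-written sample in the shape of an nginx autoindex page and pulls the link targets out of it with the standard library. This is not how htmllistparse is implemented; ``SAMPLE``, ``PreLinkCollector`` and ``pre_links`` are hypothetical names, and the sketch ignores the date and size columns entirely.

```python
from html.parser import HTMLParser

# Hand-written sample resembling an nginx autoindex page (illustrative only).
SAMPLE = """<html>
<head><title>Index of /pub/</title></head>
<body>
<h1>Index of /pub/</h1><hr><pre><a href="../">../</a>
<a href="docs/">docs/</a>                         03-Jun-2023 11:48       -
<a href="htmllistparse-0.6.1.tar.gz">htmllistparse-0.6.1.tar.gz</a>    03-Jun-2023 11:48   10035
</pre><hr></body>
</html>
"""


class PreLinkCollector(HTMLParser):
    """Collect href values of <a> tags that appear inside a <pre> block."""

    def __init__(self):
        super().__init__()
        self.in_pre = False
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self.in_pre = True
        elif tag == "a" and self.in_pre:
            href = dict(attrs).get("href")
            # Skip the parent-directory link.
            if href and href != "../":
                self.hrefs.append(href)

    def handle_endtag(self, tag):
        if tag == "pre":
            self.in_pre = False


def pre_links(html):
    """Return the link targets found inside <pre> blocks of the given HTML."""
    collector = PreLinkCollector()
    collector.feed(html)
    return collector.hrefs


print(pre_links(SAMPLE))
```

A real parser also has to recover the modification time and size from the free-form text after each link, which is where the different server formats diverge.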

.. note::
   Please wrap calls to these functions in a general ``try``/``except`` block. They may raise exceptions unexpectedly.

ReHTTPFS
--------

Reinvented HTTP Filesystem.

* Mounts most HTTP file listings with FUSE.
* Gets directory tree and file stats with less overhead.
* Supports Range requests.
* Supports Keep-Alive.
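
To illustrate what Range request support means for a filesystem read, the sketch below turns a read at a given offset and length into an HTTP ``Range`` header. It is not taken from the ReHTTPFS source; ``range_header`` is a hypothetical helper. HTTP byte ranges are inclusive at both ends, hence the ``- 1``.

```python
def range_header(offset, length):
    """Build the Range header for reading `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive, so the last byte is offset + length - 1.
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


print(range_header(0, 4096))
print(range_header(4096, 1))
```

A server that honors the header answers with status 206 and only the requested bytes, so a read in the middle of a large file does not require downloading the whole file.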

::

   usage: rehttpfs.py [-h] [-o OPTIONS] [-t TIMEOUT] [-u USER_AGENT] [-v] [-d]
                      url mountpoint

   Mount HTML directory listings.

   positional arguments:
     url                   URL to mount
     mountpoint            filesystem mount point

   optional arguments:
     -h, --help            show this help message and exit
     -o OPTIONS            comma separated FUSE options
     -t TIMEOUT, --timeout TIMEOUT
                           HTTP request timeout
     -u USER_AGENT, --user-agent USER_AGENT
                           HTTP User-Agent
     -v, --verbose         enable debug logging
     -d, --daemon          run in background
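
The help text above can be approximated with ``argparse`` as in the following sketch, which may be useful for seeing how a command line maps to option values. It is a reconstruction from the help output, not the actual ``rehttpfs.py`` source; in particular, treating the timeout as a ``float`` and the example URL and mount point are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented rehttpfs.py options."""
    parser = argparse.ArgumentParser(
        prog="rehttpfs.py", description="Mount HTML directory listings."
    )
    parser.add_argument("-o", dest="options", metavar="OPTIONS",
                        help="comma separated FUSE options")
    # Assumption: the timeout is a number of seconds parsed as float.
    parser.add_argument("-t", "--timeout", type=float,
                        help="HTTP request timeout")
    parser.add_argument("-u", "--user-agent", help="HTTP User-Agent")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="enable debug logging")
    parser.add_argument("-d", "--daemon", action="store_true",
                        help="run in background")
    parser.add_argument("url", help="URL to mount")
    parser.add_argument("mountpoint", help="filesystem mount point")
    return parser


# Hypothetical invocation: 30 second timeout, debug logging, foreground.
args = build_parser().parse_args(
    ["-t", "30", "-v", "https://example.org/pub/", "/mnt/http"]
)
print(args)
```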


            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/gumblex/htmllisting-parser",
    "name": "htmllistparse",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "apache nginx listing fuse",
    "author": "Dingyuan Wang",
    "author_email": "gumblex@aosc.io",
    "download_url": "https://files.pythonhosted.org/packages/47/4f/6c57a2817e4f20c1ed8dcca24ee036f981ed036f4b36d07a0100303db96a/htmllistparse-0.6.1.tar.gz",
    "platform": "any",
    "description": "htmllisting-parser\n==================\nPython parser for Apache/nginx-style HTML directory listing\n\n.. code-block:: python\n\n   import htmllistparse\n   cwd, listing = htmllistparse.fetch_listing(some_url, timeout=30)\n\n   # or you can get the url and make a BeautifulSoup yourself, then use\n   # cwd, listing = htmllistparse.parse(soup)\n\nwhere ``cwd`` is the current directory, ``listing`` is a list of ``FileEntry`` named tuples:\n\n* ``name``: File name, ``str``. Have a trailing / if it's a directory.\n* ``modified``: Last modification time, ``time.struct_time`` or ``None``. Timezone is not known.\n* ``size``: File size, ``int`` or ``None``. May be estimated from the prefix, such as \"K\", \"M\".\n* ``description``: File description, file type, or any other things found. ``str`` as HTML, or ``None``.\n\nSupports:\n\n* Vanilla Apache/nginx/lighttpd/darkhttpd autoindex\n* Most ``<pre>``-style index\n* Many other ``<table>``-style index\n* ``<ul>``-style\n\n.. note::\n   Please wrap the functions in a general ``try... except`` block. 
It may throw exceptions unexpectedly.\n\nReHTTPFS\n--------\n\nReinvented HTTP Filesystem.\n\n* Mounts most HTTP file listings with FUSE.\n* Gets directory tree and file stats with less overhead.\n* Supports Range requests.\n* Supports Keep-Alive.\n\n::\n\n   usage: rehttpfs.py [-h] [-o OPTIONS] [-t TIMEOUT] [-u USER_AGENT] [-v] [-d]\n                      url mountpoint\n\n   Mount HTML directory listings.\n\n   positional arguments:\n     url                   URL to mount\n     mountpoint            filesystem mount point\n\n   optional arguments:\n     -h, --help            show this help message and exit\n     -o OPTIONS            comma separated FUSE options\n     -t TIMEOUT, --timeout TIMEOUT\n                           HTTP request timeout\n     -u USER_AGENT, --user-agent USER_AGENT\n                           HTTP User-Agent\n     -v, --verbose         enable debug logging\n     -d, --daemon          run in background\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Python parser for Apache/nginx-style HTML directory listing.",
    "version": "0.6.1",
    "project_urls": {
        "Homepage": "https://github.com/gumblex/htmllisting-parser"
    },
    "split_keywords": [
        "apache",
        "nginx",
        "listing",
        "fuse"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "43fbfed3185cb09dd7b7e06207a8b33767f1a3f9765e856394ad84b538a1b6ca",
                "md5": "3c53716e6dd0068488763729014a6c40",
                "sha256": "ed027107de47bf18c7059db156075267947a828d3d72ab02823fbef0f39481a9"
            },
            "downloads": -1,
            "filename": "htmllistparse-0.6.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "3c53716e6dd0068488763729014a6c40",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 9967,
            "upload_time": "2023-06-03T11:48:38",
            "upload_time_iso_8601": "2023-06-03T11:48:38.706210Z",
            "url": "https://files.pythonhosted.org/packages/43/fb/fed3185cb09dd7b7e06207a8b33767f1a3f9765e856394ad84b538a1b6ca/htmllistparse-0.6.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "474f6c57a2817e4f20c1ed8dcca24ee036f981ed036f4b36d07a0100303db96a",
                "md5": "3e8ffe2e64318ad9c3875bf0678bc724",
                "sha256": "6dc8a6bf03c843b9d325843a26a2351a795b573cd92a2c9b8271621019c64082"
            },
            "downloads": -1,
            "filename": "htmllistparse-0.6.1.tar.gz",
            "has_sig": false,
            "md5_digest": "3e8ffe2e64318ad9c3875bf0678bc724",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 10035,
            "upload_time": "2023-06-03T11:48:41",
            "upload_time_iso_8601": "2023-06-03T11:48:41.189203Z",
            "url": "https://files.pythonhosted.org/packages/47/4f/6c57a2817e4f20c1ed8dcca24ee036f981ed036f4b36d07a0100303db96a/htmllistparse-0.6.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-06-03 11:48:41",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "gumblex",
    "github_project": "htmllisting-parser",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "htmllistparse"
}
        