htmllisting-parser
==================

A Python parser for Apache/nginx-style HTML directory listings.

.. code-block:: python

    import htmllistparse
    cwd, listing = htmllistparse.fetch_listing(some_url, timeout=30)

    # or fetch the page and build a BeautifulSoup object yourself, then use
    # cwd, listing = htmllistparse.parse(soup)

Here ``cwd`` is the current directory and ``listing`` is a list of ``FileEntry`` named tuples with these fields:

* ``name``: File name, ``str``. Has a trailing ``/`` if it is a directory.
* ``modified``: Last modification time, ``time.struct_time`` or ``None``. The timezone is not known.
* ``size``: File size, ``int`` or ``None``. May be an estimate when the listing shows a rounded value with a unit prefix such as "K" or "M".
* ``description``: File description, file type, or any other text found. ``str`` containing HTML, or ``None``.
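
Because ``modified`` and ``size`` can both be ``None``, code that consumes a listing has to handle missing values. The following is a minimal sketch of that handling, not part of the library: ``split_entries`` and ``format_entry`` are hypothetical helpers, and the local ``FileEntry`` is only a stand-in with the four documented fields so that the example runs without network access.

```python
import time
from collections import namedtuple

# Stand-in with the four documented fields (an assumption for this sketch);
# in real use the entries come from htmllistparse.fetch_listing().
FileEntry = namedtuple("FileEntry", "name modified size description")


def split_entries(listing):
    """Separate directories (names with a trailing '/') from files."""
    dirs = [e for e in listing if e.name.endswith("/")]
    files = [e for e in listing if not e.name.endswith("/")]
    return dirs, files


def format_entry(entry):
    """Render one entry as text, tolerating None in modified and size."""
    if entry.modified is None:
        when = "unknown"
    else:
        # The timezone is not known, so the time is printed as given.
        when = time.strftime("%Y-%m-%d %H:%M", entry.modified)
    size = "?" if entry.size is None else str(entry.size)
    return f"{entry.name}\t{when}\t{size}"


sample = [
    FileEntry("pub/", time.strptime("2023-06-03 11:48", "%Y-%m-%d %H:%M"), None, None),
    FileEntry("README.txt", None, 2048, None),
]
dirs, files = split_entries(sample)
for entry in dirs + files:
    print(format_entry(entry))
```
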

Supports:

* Vanilla Apache/nginx/lighttpd/darkhttpd autoindex
* Most ``<pre>``-style indexes
* Many other ``<table>``-style indexes
* ``<ul>``-style indexes

.. note::
    Wrap calls to these functions in a general ``try ... except`` block. They may raise exceptions unexpectedly.
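
One way to follow this advice is a small wrapper that catches any exception and returns an empty result. This is only a sketch: ``safe_listing`` is a hypothetical helper, and returning ``(None, [])`` on failure is a choice made for this example, not library behaviour. The fetch function is passed in as an argument so the wrapper can be tried without the library installed.

```python
import logging

log = logging.getLogger(__name__)


def safe_listing(fetch, url, **kwargs):
    """Call fetch(url, **kwargs); return (cwd, listing), or (None, []) on any error."""
    try:
        return fetch(url, **kwargs)
    except Exception:
        # Deliberately broad: network errors and parse errors are treated alike.
        log.exception("could not list %s", url)
        return None, []


# Real use (assumes htmllistparse is installed):
#   import htmllistparse
#   cwd, listing = safe_listing(htmllistparse.fetch_listing, some_url, timeout=30)
```

Callers can then treat an empty ``listing`` as "nothing to show" and consult the log for the cause.
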

ReHTTPFS
--------

Reinvented HTTP Filesystem.

* Mounts most HTTP file listings with FUSE.
* Gets directory tree and file stats with less overhead.
* Supports Range requests.
* Supports Keep-Alive.

::

    usage: rehttpfs.py [-h] [-o OPTIONS] [-t TIMEOUT] [-u USER_AGENT] [-v] [-d]
                       url mountpoint

    Mount HTML directory listings.

    positional arguments:
      url                   URL to mount
      mountpoint            filesystem mount point

    optional arguments:
      -h, --help            show this help message and exit
      -o OPTIONS            comma separated FUSE options
      -t TIMEOUT, --timeout TIMEOUT
                            HTTP request timeout
      -u USER_AGENT, --user-agent USER_AGENT
                            HTTP User-Agent
      -v, --verbose         enable debug logging
      -d, --daemon          run in background

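
For readers unfamiliar with the Range requests mentioned in the feature list, the sketch below shows how a read of ``length`` bytes at ``offset`` maps onto an HTTP ``Range`` header. It illustrates the HTTP mechanism only and is not taken from ``rehttpfs.py``; ``range_request`` and the example URL are hypothetical.

```python
import urllib.request


def range_request(url, offset, length):
    """Build a GET request for `length` bytes starting at byte `offset`."""
    last = offset + length - 1  # byte positions in a Range header are inclusive
    return urllib.request.Request(url, headers={"Range": f"bytes={offset}-{last}"})


# Build (but do not send) a request for 1024 bytes starting at offset 4096.
req = range_request("http://example.com/pub/big.iso", 4096, 1024)
print(req.get_header("Range"))
```

A server that honours the header replies with status 206 and only the requested bytes, so a mounted file can be read at an arbitrary offset without downloading it in full.
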
Raw data
{
"_id": null,
"home_page": "https://github.com/gumblex/htmllisting-parser",
"name": "htmllistparse",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "apache nginx listing fuse",
"author": "Dingyuan Wang",
"author_email": "gumblex@aosc.io",
"download_url": "https://files.pythonhosted.org/packages/47/4f/6c57a2817e4f20c1ed8dcca24ee036f981ed036f4b36d07a0100303db96a/htmllistparse-0.6.1.tar.gz",
"platform": "any",
"description": "htmllisting-parser\n==================\nPython parser for Apache/nginx-style HTML directory listing\n\n.. code-block:: python\n\n import htmllistparse\n cwd, listing = htmllistparse.fetch_listing(some_url, timeout=30)\n\n # or you can get the url and make a BeautifulSoup yourself, then use\n # cwd, listing = htmllistparse.parse(soup)\n\nwhere ``cwd`` is the current directory, ``listing`` is a list of ``FileEntry`` named tuples:\n\n* ``name``: File name, ``str``. Have a trailing / if it's a directory.\n* ``modified``: Last modification time, ``time.struct_time`` or ``None``. Timezone is not known.\n* ``size``: File size, ``int`` or ``None``. May be estimated from the prefix, such as \"K\", \"M\".\n* ``description``: File description, file type, or any other things found. ``str`` as HTML, or ``None``.\n\nSupports:\n\n* Vanilla Apache/nginx/lighttpd/darkhttpd autoindex\n* Most ``<pre>``-style index\n* Many other ``<table>``-style index\n* ``<ul>``-style\n\n.. note::\n Please wrap the functions in a general ``try... except`` block. It may throw exceptions unexpectedly.\n\nReHTTPFS\n--------\n\nReinvented HTTP Filesystem.\n\n* Mounts most HTTP file listings with FUSE.\n* Gets directory tree and file stats with less overhead.\n* Supports Range requests.\n* Supports Keep-Alive.\n\n::\n\n usage: rehttpfs.py [-h] [-o OPTIONS] [-t TIMEOUT] [-u USER_AGENT] [-v] [-d]\n url mountpoint\n\n Mount HTML directory listings.\n\n positional arguments:\n url URL to mount\n mountpoint filesystem mount point\n\n optional arguments:\n -h, --help show this help message and exit\n -o OPTIONS comma separated FUSE options\n -t TIMEOUT, --timeout TIMEOUT\n HTTP request timeout\n -u USER_AGENT, --user-agent USER_AGENT\n HTTP User-Agent\n -v, --verbose enable debug logging\n -d, --daemon run in background\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "Python parser for Apache/nginx-style HTML directory listing.",
"version": "0.6.1",
"project_urls": {
"Homepage": "https://github.com/gumblex/htmllisting-parser"
},
"split_keywords": [
"apache",
"nginx",
"listing",
"fuse"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "43fbfed3185cb09dd7b7e06207a8b33767f1a3f9765e856394ad84b538a1b6ca",
"md5": "3c53716e6dd0068488763729014a6c40",
"sha256": "ed027107de47bf18c7059db156075267947a828d3d72ab02823fbef0f39481a9"
},
"downloads": -1,
"filename": "htmllistparse-0.6.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3c53716e6dd0068488763729014a6c40",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 9967,
"upload_time": "2023-06-03T11:48:38",
"upload_time_iso_8601": "2023-06-03T11:48:38.706210Z",
"url": "https://files.pythonhosted.org/packages/43/fb/fed3185cb09dd7b7e06207a8b33767f1a3f9765e856394ad84b538a1b6ca/htmllistparse-0.6.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "474f6c57a2817e4f20c1ed8dcca24ee036f981ed036f4b36d07a0100303db96a",
"md5": "3e8ffe2e64318ad9c3875bf0678bc724",
"sha256": "6dc8a6bf03c843b9d325843a26a2351a795b573cd92a2c9b8271621019c64082"
},
"downloads": -1,
"filename": "htmllistparse-0.6.1.tar.gz",
"has_sig": false,
"md5_digest": "3e8ffe2e64318ad9c3875bf0678bc724",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10035,
"upload_time": "2023-06-03T11:48:41",
"upload_time_iso_8601": "2023-06-03T11:48:41.189203Z",
"url": "https://files.pythonhosted.org/packages/47/4f/6c57a2817e4f20c1ed8dcca24ee036f981ed036f4b36d07a0100303db96a/htmllistparse-0.6.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-06-03 11:48:41",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "gumblex",
"github_project": "htmllisting-parser",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [],
"lcname": "htmllistparse"
}