Name | fetch-sitemap |
Version | 27 |
home_page | None |
Summary | Fetch a given sitemap and retrieve all URLs in it. |
upload_time | 2024-10-17 19:57:27 |
maintainer | None |
docs_url | None |
author | Martin Mahner |
requires_python | <4.0,>=3.9 |
license | MIT |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# fetch-sitemap
Retrieves all URLs listed in a given sitemap.xml and fetches each of them.
Useful for (load) testing an entire site and spotting error responses.
![Sample Output](https://raw.githubusercontent.com/bartTC/fetch-sitemap/main/example.png)
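
The first half of that job, collecting URLs from a sitemap, can be sketched with the Python standard library alone. This is a minimal illustration and not the package's actual implementation: the helper name `extract_locs` and the sample XML are assumptions made for this example. It returns the `<loc>` entries together with the document type, since a sitemap index points at further sitemaps (which a recursive run would follow) while a URL set lists pages.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_text: str) -> tuple[str, list[str]]:
    """Return (kind, locs), where kind is 'sitemapindex' or 'urlset'.

    Hypothetical helper for illustration; not part of fetch-sitemap's API.
    """
    root = ET.fromstring(xml_text)
    # Strip the namespace so only the bare root tag name remains.
    kind = root.tag.removeprefix(SITEMAP_NS)
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]
    return kind, locs


# Made-up sample document, used only to exercise the helper.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/about/</loc></url>
</urlset>"""

kind, urls = extract_locs(SAMPLE)
print(kind, urls)
```

When `kind` is `sitemapindex`, each returned entry is itself a sitemap URL that would need to be downloaded and parsed the same way.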
## Installation
```bash
$ pip install fetch-sitemap
```
## Usage
```
$ fetch-sitemap --help
Usage: fetch-sitemap [OPTIONS] SITEMAP_URL
Fetch a given sitemap and retrieve all URLs in it.
╭─ Options ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ --basic-auth -a TEXT Basic auth information. Format: 'username:password' │
│ --limit -l INT [>=1] Maximum number of URLs to fetch from the given sitemap.xml. │
│ --recursive/--no-recursive Recursively fetch all sitemap documents from the given sitemap.xml. [default: recursive] │
│ --concurrency-limit -c INT [>=1] Max number of concurrent requests. [default: 5; >=1] │
│ --request-timeout -t INT [>=1] Timeout for fetching a URL in seconds. [default: 30; >=1] │
│ --random -r Append a random string like ?12334232343 to each URL to bypass frontend cache. │
│ --random-length INT [1 to 100] Length of the --random hash. [default: 15; 1 to 100] │
│ --report-path -p FILE Store results in a CSV file. Example: ./report.csv │
│ --output-dir -o DIRECTORY Store all fetched sitemap documents in this folder. Example: /tmp/my.domain.com/ │
│ --slow-threshold FLOAT [>=0.0] Responses slower than this (in seconds) are considered 'slow'. [default: 5.0; >=0.0] │
│ --slow-num INTEGER OR "ALL" How many 'slow' responses to show. [default: 10] │
│ --user-agent TEXT User-Agent string set in the HTTP header. [default: Mozilla/5.0 (compatible; fetch-sitemap/23)] │
│ --version Show the version and exit. │
│ --help Show this message and exit. │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
```
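
To make a few of these options more concrete, here is a rough Python sketch of how `--concurrency-limit`, `--random`/`--random-length`, and `--slow-threshold` could work together. It is an illustration under assumptions, not the tool's real code: `cache_bust`, `fetch_all`, and the stub `fake_fetch` (which simulates a request instead of touching the network) are hypothetical names, and the handling of URLs that already contain a query string is a guess.

```python
import asyncio
import random
import string
import time


def cache_bust(url: str, length: int = 15) -> str:
    """Append a random numeric string, similar in spirit to --random."""
    token = "".join(random.choices(string.digits, k=length))
    # Assumption: use "&" when the URL already carries a query string.
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{token}"


async def fake_fetch(url: str) -> int:
    """Stand-in for a real HTTP request; sleeps briefly and returns 200."""
    await asyncio.sleep(0.01)
    return 200


async def fetch_all(urls, concurrency_limit=5, slow_threshold=5.0):
    """Fetch URLs with at most `concurrency_limit` requests in flight."""
    semaphore = asyncio.Semaphore(concurrency_limit)

    async def fetch_one(url):
        async with semaphore:  # blocks while the limit is reached
            start = time.perf_counter()
            status = await fake_fetch(cache_bust(url))
            elapsed = time.perf_counter() - start
        return {"url": url, "status": status, "slow": elapsed > slow_threshold}

    return await asyncio.gather(*(fetch_one(u) for u in urls))


pages = [f"https://example.com/page/{i}" for i in range(10)]
results = asyncio.run(fetch_all(pages))
print(sum(r["slow"] for r in results), "slow responses out of", len(results))
```

The semaphore is what keeps the number of simultaneous requests bounded; swapping `fake_fetch` for a real HTTP client call would leave the rest of the structure unchanged.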
## 🤺 Local Development
```bash
poetry install
poetry run fetch-sitemap -h
poetry run ./tests.sh
```
Raw data
{
"_id": null,
"home_page": null,
"name": "fetch-sitemap",
"maintainer": null,
"docs_url": null,
"requires_python": "<4.0,>=3.9",
"maintainer_email": null,
"keywords": null,
"author": "Martin Mahner",
"author_email": "martin@mahner.org",
"download_url": "https://files.pythonhosted.org/packages/5f/f7/a438aafe4b8c25943177c300dc6067523d2df642393682b9587ccc8a2d44/fetch_sitemap-27.tar.gz",
"platform": null,
"description": "# fetch-sitemap\n\nRetrieves all URLs of a given sitemap.xml URL and fetches each page one by one. \nUseful for (load) testing the entire site for error responses.\n\n![Sample Output](https://raw.githubusercontent.com/bartTC/fetch-sitemap/main/example.png)\n\n## Installation\n\n```bash \n$ pip install fetch-sitemap\n```\n\n## Usage \n\n```\n$ fetch-sitemap --help\n\n Usage: fetch-sitemap [OPTIONS] SITEMAP_URL\n\n Fetch a given sitemap and retrieve all URLs in it.\n\n\u256d\u2500 Options \u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256e\n\u2502 --basic-auth -a TEXT Basic auth information. Format: 'username:password' \u2502\n\u2502 --limit -l INT [>=1] Maximum number of URLs to fetch from the given sitemap.xml. \u2502\n\u2502 --recursive/--no-recursive Recursively fetch all sitemap documents from the given sitemap.xml. [default: recursive] \u2502\n\u2502 --concurrency-limit -c INT [>=1] Max number of concurrent requests. [default: 5; >=1] \u2502\n\u2502 --request-timeout -t INT [>=1] Timeout for fetching a URL in seconds. 
[default: 30; >=1] \u2502\n\u2502 --random -r Append a random string like ?12334232343 to each URL to bypass frontend cache. \u2502\n\u2502 --random-length INT [1 to 100] Length of the --random hash. [default: 15; 1 to 100] \u2502\n\u2502 --report-path -p FILE Store results in a CSV file. Example: ./report.csv \u2502\n\u2502 --output-dir -o DIRECTORY Store all fetched sitemap documents in this folder. Example: /tmp/my.domain.com/ \u2502\n\u2502 --slow-threshold FLOAT [>=0.0] Responses slower than this (in seconds) are considered 'slow'. [default: 5.0; >=0.0] \u2502\n\u2502 --slow-num INTEGER OR \"ALL\" How many 'slow' responses to show. [default: 10] \u2502\n\u2502 --user-agent TEXT User-Agent string set in the HTTP header. [default: Mozilla/5.0 (compatible; fetch-sitemap/23)] \u2502\n\u2502 --version Show the version and exit. \u2502\n\u2502 --help Show this message and exit. \u2502\n\u2570\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u256f\n```\n\n## \ud83e\udd3a Local Development\n\n```bash\npoetry install\npoetry run fetch-sitemap -h\npoetry run ./tests.sh\n```",
"bugtrack_url": null,
"license": "MIT",
"summary": "Fetch a given sitemap and retrieve all URLs in it.",
"version": "27",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "09a38e87dab6872b12ee477adc5005e30c3c01a5ef6cbe95b6fa0315bacb569e",
"md5": "5bfdba457048eec23c644589d86ce6d2",
"sha256": "4f9f4606303c416a46be20ff82450b87391ece12fef33bd810da1514b1519ad4"
},
"downloads": -1,
"filename": "fetch_sitemap-27-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5bfdba457048eec23c644589d86ce6d2",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<4.0,>=3.9",
"size": 9943,
"upload_time": "2024-10-17T19:57:26",
"upload_time_iso_8601": "2024-10-17T19:57:26.415974Z",
"url": "https://files.pythonhosted.org/packages/09/a3/8e87dab6872b12ee477adc5005e30c3c01a5ef6cbe95b6fa0315bacb569e/fetch_sitemap-27-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5ff7a438aafe4b8c25943177c300dc6067523d2df642393682b9587ccc8a2d44",
"md5": "3acd3e540c3ed5bc239399c9d62a0fd9",
"sha256": "ff888992d0e3eee82075b42f5d441784ad3d20816ebc3f480cd2a0750a504052"
},
"downloads": -1,
"filename": "fetch_sitemap-27.tar.gz",
"has_sig": false,
"md5_digest": "3acd3e540c3ed5bc239399c9d62a0fd9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<4.0,>=3.9",
"size": 9223,
"upload_time": "2024-10-17T19:57:27",
"upload_time_iso_8601": "2024-10-17T19:57:27.397185Z",
"url": "https://files.pythonhosted.org/packages/5f/f7/a438aafe4b8c25943177c300dc6067523d2df642393682b9587ccc8a2d44/fetch_sitemap-27.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-17 19:57:27",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "fetch-sitemap"
}