Name | reachable |
Version | 0.4.0 |
home_page | None |
Summary | Check if a URL is reachable |
upload_time | 2024-10-03 15:09:34 |
maintainer | None |
docs_url | None |
author | Alex Mili |
requires_python | >=3.8 |
license | MIT License Copyright (c) 2016 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
**Reachable** checks if a URL exists and is reachable.
# Features
- Use a `HEAD` request instead of `GET` to save bandwidth
- Follow redirects
- Handle local redirects (a `location` header without a full URL)
- Record all the URLs of the redirection chain
- Check whether the redirected URL matches the TLD of the source URL
- Detect Cloudflare protection
- Avoid basic bot detectors
  - Use a random Chrome user agent
  - Wait between consecutive requests to the same host
  - Include the `Host` header
- Use HTTP/2
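
To make the local-redirect and TLD-match features more concrete, here is a minimal standard-library sketch. It is not Reachable's actual implementation: the helper names `resolve_redirect`, `naive_tld`, and `tld_match` are hypothetical, and the TLD comparison only looks at the last hostname label, so multi-label suffixes such as `co.uk` are not handled.

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect(current_url: str, location: str) -> str:
    """Resolve a `location` header value against the URL that returned it."""
    # urljoin leaves absolute URLs untouched and resolves relative ones
    return urljoin(current_url, location)


def naive_tld(url: str) -> str:
    """Return the last dot-separated label of the hostname (hypothetical helper)."""
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def tld_match(source_url: str, final_url: str) -> bool:
    """Compare the naive TLDs of the source URL and the redirected URL."""
    return naive_tld(source_url) == naive_tld(final_url)


next_url = resolve_redirect("https://example.com/a/b", "/login")
print(next_url)
print(tld_match("https://example.com", next_url))
```

A production check would probably rely on the Public Suffix List rather than the last label alone.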
# Installation
You can install it with pip:
```bash
pip install reachable
```
Or clone this repository and run:
```bash
cd reachable/
pip install -e .
```
# Usage
## Simple URL
```python
from reachable import is_reachable
result = is_reachable("https://google.com")
```
The output will look like this:
```json
{
    "original_url": "https://google.com",
    "final_url": "https://www.google.com/",
    "response": null,
    "status_code": 200,
    "success": true,
    "error_name": null,
    "cloudflare_protection": false,
    "redirect": {
        "chain": ["https://www.google.com/"],
        "final_url": "https://www.google.com/",
        "tld_match": true
    }
}
```
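
As a sketch of how such a result might be consumed, the snippet below builds a one-line summary from it. It assumes the result is a plain `dict` with the keys shown in the output above. The `describe` helper is hypothetical and not part of the library, and the sample is hard-coded so the snippet runs without network access.

```python
# Sample result copied from the documented output (assumed to be a plain dict)
result = {
    "original_url": "https://google.com",
    "final_url": "https://www.google.com/",
    "response": None,
    "status_code": 200,
    "success": True,
    "error_name": None,
    "cloudflare_protection": False,
    "redirect": {
        "chain": ["https://www.google.com/"],
        "final_url": "https://www.google.com/",
        "tld_match": True,
    },
}


def describe(result: dict) -> str:
    """Build a one-line summary from a result (hypothetical helper)."""
    if not result["success"]:
        return f"{result['original_url']}: failed ({result['error_name']})"
    redirect = result.get("redirect")
    # Count the recorded redirect URLs, if any
    hops = len(redirect["chain"]) if redirect else 0
    return f"{result['original_url']}: {result['status_code']} after {hops} redirect(s)"


print(describe(result))  # prints "https://google.com: 200 after 1 redirect(s)"
```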
## Multiple URLs
```python
from reachable import is_reachable
result = is_reachable(["https://google.com", "http://bing.com"])
```
The output will look like this:
```json
[
    {
        "original_url": "https://google.com",
        "final_url": "https://www.google.com/",
        "response": null,
        "status_code": 200,
        "success": true,
        "error_name": null,
        "cloudflare_protection": false,
        "redirect": {
            "chain": ["https://www.google.com/"],
            "final_url": "https://www.google.com/",
            "tld_match": true
        }
    },
    {
        "original_url": "http://bing.com",
        "final_url": "https://www.bing.com/?toWww=1&redig=16A78C94",
        "response": null,
        "status_code": 200,
        "success": true,
        "error_name": null,
        "cloudflare_protection": false,
        "redirect": {
            "chain": ["https://www.bing.com:443/?toWww=1&redig=16A78C94"],
            "final_url": "https://www.bing.com/?toWww=1&redig=16A78C94",
            "tld_match": true
        }
    }
]
```
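
With a list of results, a common next step is to separate reachable URLs from failures. The sketch below assumes each item is a `dict` with the keys shown above. `split_results` is a hypothetical helper, and the second sample entry is invented for illustration: this README does not document which values the library reports when a check fails.

```python
from typing import Dict, List, Tuple


def split_results(results: List[dict]) -> Tuple[List[str], Dict[str, str]]:
    """Split results into reachable final URLs and failures (hypothetical helper)."""
    reachable: List[str] = []
    failed: Dict[str, str] = {}
    for item in results:
        if item["success"]:
            reachable.append(item["final_url"])
        else:
            # Fall back to the status code when no error name is available
            failed[item["original_url"]] = item["error_name"] or str(item["status_code"])
    return reachable, failed


sample = [
    {
        "original_url": "https://google.com",
        "final_url": "https://www.google.com/",
        "status_code": 200,
        "success": True,
        "error_name": None,
    },
    {
        # Invented failure entry; the real failure values are an assumption
        "original_url": "https://example.invalid",
        "final_url": None,
        "status_code": -1,
        "success": False,
        "error_name": "ConnectionError",
    },
]

reachable, failed = split_results(sample)
print(reachable)
print(failed)
```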
## Async
```python
import asyncio
from reachable import is_reachable_async
result = asyncio.run(is_reachable_async("https://google.com"))
```
Or, to check several URLs concurrently:
```python
import asyncio
from reachable import is_reachable_async

urls = ["https://google.com", "https://bing.com"]


async def check_all(urls):
    # Run all checks concurrently; results come back in the same order as urls
    return await asyncio.gather(*[is_reachable_async(url) for url in urls])


# asyncio.run creates an event loop, runs the coroutine, and closes the loop
result = asyncio.run(check_all(urls))
```
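
When checking many URLs at once, it can help to cap how many checks run at the same time. The following is a sketch of one way to do that with `asyncio.Semaphore`. The `check_with_limit` helper and its `limit` parameter are hypothetical, and `fake_check` is a stand-in for `is_reachable_async` so the example runs without network access.

```python
import asyncio
from typing import Awaitable, Callable, List


async def check_with_limit(
    urls: List[str],
    check: Callable[[str], Awaitable[dict]],
    limit: int = 5,
) -> List[dict]:
    """Run check(url) for every URL with at most `limit` checks in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url: str) -> dict:
        # Each check must acquire the semaphore before it starts
        async with semaphore:
            return await check(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_check(url: str) -> dict:
    """Stand-in for is_reachable_async (assumption: one URL in, one dict out)."""
    await asyncio.sleep(0)
    return {"original_url": url, "success": True}


results = asyncio.run(
    check_with_limit(["https://google.com", "https://bing.com"], fake_check, limit=1)
)
print([r["original_url"] for r in results])
```

In real use, `is_reachable_async` would be passed in place of `fake_check`, assuming it accepts a single URL string as in the examples above.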
Raw data
{
"_id": null,
"home_page": null,
"name": "reachable",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Alex Mili",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/46/e8/21e4e5877d60130312156916bab3533d91c3293e4467d8436f332d754a9d/reachable-0.4.0.tar.gz",
"platform": null,
"description": "**Reachable** checks if a URL exists and is reachable.\n\n# Features\n- Use `HEAD`request instead of `GET` to save some bandwidth\n- Follow redirects\n- Handle local redirects (without full URL in `location` header)\n- Record all the URLs of the redirection chain\n- Check if redirected URL match the TLD of source URL\n- Detect Cloudflare protection\n- Avoid basic bot detectors\n - Use randome Chrome user agent\n - Wait between consecutive requests to the same host\n - Include `Host` header\n- Use of HTTP/2\n\n# Installation\nYou can install it with pip :\n```bash\npip install reachable\n```\nOr clone this repository and simply run :\n```bash\ncd reachable/\npip install -e .\n```\n\n# Usage\n\n## Simple URL\n```python\nfrom reachable import is_reachable\nresult = is_reachable(\"https://google.com\")\n```\n\nThe output will look like this:\n```json\n{\n \"original_url\": \"https://google.com\",\n \"final_url\": \"https://www.google.com/\",\n \"response\": null, \n \"status_code\": 200,\n \"success\": true,\n \"error_name\": null,\n \"cloudflare_protection\": false,\n \"redirect\": {\n \"chain\": [\"https://www.google.com/\"],\n \"final_url\": \"https://www.google.com/\",\n \"tld_match\": true\n }\n}\n```\n\n## Multiple URLs\n```python\nfrom reachable import is_reachable\nresult = is_reachable([\"https://google.com\", \"http://bing.com\"])\n```\n\nThe output will look like this:\n```json\n[\n {\n \"original_url\": \"https://google.com\",\n \"final_url\": \"https://www.google.com/\",\n \"response\": null, \n \"status_code\": 200,\n \"success\": true,\n \"error_name\": null,\n \"cloudflare_protection\": false,\n \"redirect\": {\n \"chain\": [\"https://www.google.com/\"],\n \"final_url\": \"https://www.google.com/\",\n \"tld_match\": true\n }\n },\n {\n \"original_url\": \"http://bing.com\",\n \"final_url\": \"https://www.bing.com/?toWww=1&redig=16A78C94\",\n \"response\": null,\n \"status_code\": 200,\n \"success\": true,\n \"error_name\": null,\n 
\"cloudflare_protection\": false,\n \"redirect\": {\n \"chain\": [\"https://www.bing.com:443/?toWww=1&redig=16A78C94\"],\n \"final_url\": \"https://www.bing.com/?toWww=1&redig=16A78C94\",\n \"tld_match\": true\n }\n }\n]\n```\n\n## Async\n```python\nimport asyncio\nfrom reachable import is_reachable_async\n\nresult = asyncio.run(is_reachable_async(\"https://google.com\"))\n```\nor\n```python\nimport asyncio\nfrom reachable import is_reachable_async\n\nurls = [\"https://google.com\", \"https://bing.com\"]\n\ntry:\n loop = asyncio.get_running_loop()\nexcept RuntimeError:\n # No loop already exists so we crete one\n loop = asyncio.new_event_loop()\n asyncio.set_event_loop(loop)\ntry:\n result = loop.run_until_complete(asyncio.gather(*[is_reachable_async(url) for url in urls]))\nfinally:\n loop.close()\n```",
"bugtrack_url": null,
"license": "MIT License Copyright (c) 2016 Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
"summary": "Check if a URL is reachable",
"version": "0.4.0",
"project_urls": {
"Documentation": "https://github.com/AlexMili/Reachable",
"Homepage": "https://github.com/AlexMili/Reachable",
"Issues": "https://github.com/AlexMili/Reachable/issues",
"Repository": "https://github.com/AlexMili/Reachable"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0448954223685625ff786c38d0a7c067f558a53de5f18ef666c80053347103c6",
"md5": "065d21c69a0089c938cf3d014a449454",
"sha256": "3706f351258990cbd47cb82e8247e341fbeee2de13147cb0dd9eb1bb06952202"
},
"downloads": -1,
"filename": "reachable-0.4.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "065d21c69a0089c938cf3d014a449454",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 9169,
"upload_time": "2024-10-03T15:09:33",
"upload_time_iso_8601": "2024-10-03T15:09:33.680664Z",
"url": "https://files.pythonhosted.org/packages/04/48/954223685625ff786c38d0a7c067f558a53de5f18ef666c80053347103c6/reachable-0.4.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "46e821e4e5877d60130312156916bab3533d91c3293e4467d8436f332d754a9d",
"md5": "c659e05950cfc6448d2414a4b55a4aa3",
"sha256": "a210bab228415e36e4a3e8d1f42f267893b8e507537bacb2a3236c872532210d"
},
"downloads": -1,
"filename": "reachable-0.4.0.tar.gz",
"has_sig": false,
"md5_digest": "c659e05950cfc6448d2414a4b55a4aa3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 8540,
"upload_time": "2024-10-03T15:09:34",
"upload_time_iso_8601": "2024-10-03T15:09:34.712096Z",
"url": "https://files.pythonhosted.org/packages/46/e8/21e4e5877d60130312156916bab3533d91c3293e4467d8436f332d754a9d/reachable-0.4.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-03 15:09:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AlexMili",
"github_project": "Reachable",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "reachable"
}