reachable


- Name: reachable
- Version: 0.4.0
- Summary: Check if a URL is reachable
- Upload time: 2024-10-03 15:09:34
- Author: Alex Mili
- Requires Python: >=3.8
- License: MIT License. Copyright (c) 2016. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

**Reachable** checks if a URL exists and is reachable.

# Features
- Use `HEAD` requests instead of `GET` to save bandwidth
- Follow redirects
- Handle local redirects (no full URL in the `Location` header)
- Record every URL in the redirect chain
- Check whether the redirected URL matches the TLD of the source URL
- Detect Cloudflare protection
- Avoid basic bot detectors
    - Use a random Chrome user agent
    - Wait between consecutive requests to the same host
    - Include the `Host` header
- Use HTTP/2
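
To make the "local redirect" and "TLD match" features more concrete, here is a minimal standard-library sketch of the two ideas. It is not taken from Reachable's source: the helper names `resolve_redirect`, `naive_tld` and `tld_match` are hypothetical, and the TLD comparison is deliberately naive (it only looks at the last label of the hostname, so multi-part suffixes such as `co.uk` are not handled).

```python
from urllib.parse import urljoin, urlsplit


def resolve_redirect(current_url: str, location: str) -> str:
    """Turn a possibly relative `Location` value into an absolute URL."""
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against the URL that produced the redirect.
    return urljoin(current_url, location)


def naive_tld(url: str) -> str:
    """Return the last label of the hostname (hypothetical, naive helper)."""
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def tld_match(source_url: str, final_url: str) -> bool:
    """Compare the naive TLDs of the source URL and the final URL."""
    return naive_tld(source_url) == naive_tld(final_url)


print(resolve_redirect("https://example.com/a/b", "/login"))
print(tld_match("https://example.com", "https://www.example.org/"))
```

A production check would more likely rely on the Public Suffix List than on the last hostname label.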

# Installation
You can install it with pip:
```bash
pip install reachable
```
Or clone this repository and run:
```bash
cd reachable/
pip install -e .
```

# Usage

## Simple URL
```python
from reachable import is_reachable
result = is_reachable("https://google.com")
```

The output will look like this:
```json
{
    "original_url": "https://google.com",
    "final_url": "https://www.google.com/",
    "response": null, 
    "status_code": 200,
    "success": true,
    "error_name": null,
    "cloudflare_protection": false,
    "redirect": {
        "chain": ["https://www.google.com/"],
        "final_url": "https://www.google.com/",
        "tld_match": true
    }
}
```
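
As an illustration of how these fields might be consumed, the sketch below builds a one-line summary from a result. It assumes the value returned by `is_reachable` behaves like a dictionary with the keys shown above. `summarize` is a hypothetical helper, not part of the library, and the sample is copied from the output above so the snippet runs without network access.

```python
def summarize(result: dict) -> str:
    """Build a one-line, human-readable summary of a single result."""
    status = "reachable" if result.get("success") else "unreachable"
    line = f"{result['original_url']} is {status} (HTTP {result.get('status_code')})"
    redirect = result.get("redirect")
    if redirect:
        line += f", redirected to {redirect['final_url']}"
        if not redirect.get("tld_match", True):
            line += " (different TLD)"
    if result.get("cloudflare_protection"):
        line += ", behind Cloudflare protection"
    return line


# Sample copied from the documented output (assumed to be a plain dict).
sample = {
    "original_url": "https://google.com",
    "final_url": "https://www.google.com/",
    "response": None,
    "status_code": 200,
    "success": True,
    "error_name": None,
    "cloudflare_protection": False,
    "redirect": {
        "chain": ["https://www.google.com/"],
        "final_url": "https://www.google.com/",
        "tld_match": True,
    },
}
print(summarize(sample))
```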

## Multiple URLs
```python
from reachable import is_reachable
result = is_reachable(["https://google.com", "http://bing.com"])
```

The output will look like this:
```json
[
    {
        "original_url": "https://google.com",
        "final_url": "https://www.google.com/",
        "response": null, 
        "status_code": 200,
        "success": true,
        "error_name": null,
        "cloudflare_protection": false,
        "redirect": {
            "chain": ["https://www.google.com/"],
            "final_url": "https://www.google.com/",
            "tld_match": true
        }
    },
    {
        "original_url": "http://bing.com",
        "final_url": "https://www.bing.com/?toWww=1&redig=16A78C94",
        "response": null,
        "status_code": 200,
        "success": true,
        "error_name": null,
        "cloudflare_protection": false,
        "redirect": {
            "chain": ["https://www.bing.com:443/?toWww=1&redig=16A78C94"],
            "final_url": "https://www.bing.com/?toWww=1&redig=16A78C94",
            "tld_match": true
        }
    }
]
```
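
When checking many URLs, a common next step is to separate the ones that succeeded from the ones that did not. The sketch below assumes the returned list holds dictionaries with at least the `original_url` and `success` keys shown above. `split_by_success` is a hypothetical helper, and the failing entry in the sample is made up for illustration.

```python
from typing import Dict, List, Tuple


def split_by_success(results: List[Dict]) -> Tuple[List[str], List[str]]:
    """Return (reachable, unreachable) lists of original URLs."""
    reachable: List[str] = []
    unreachable: List[str] = []
    for item in results:
        target = reachable if item.get("success") else unreachable
        target.append(item["original_url"])
    return reachable, unreachable


# Made-up sample data: the second entry is a hypothetical failure record.
sample = [
    {"original_url": "https://google.com", "success": True},
    {"original_url": "https://example.invalid", "success": False},
]
ok, failed = split_by_success(sample)
print("reachable:", ok)
print("unreachable:", failed)
```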

## Async
```python
import asyncio
from reachable import is_reachable_async

result = asyncio.run(is_reachable_async("https://google.com"))
```
Or, to check several URLs concurrently:
```python
import asyncio
from reachable import is_reachable_async

urls = ["https://google.com", "https://bing.com"]

async def check_all(urls):
    # Run every check concurrently; results keep the order of `urls`.
    return await asyncio.gather(*(is_reachable_async(url) for url in urls))


# asyncio.run() creates a new event loop and closes it when done.
# If a loop is already running (e.g. in a notebook), use `await check_all(urls)` instead.
result = asyncio.run(check_all(urls))
```
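
For long URL lists, it may be useful to cap how many checks are in flight at once. The sketch below shows one way to do that with `asyncio.Semaphore`. It is an assumption-laden example, not the library's behaviour: `check_bounded` and its `limit` parameter are hypothetical, and the checker is passed in as an argument so the snippet runs without network access (a stand-in coroutine named `fake_check` is used here).

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def check_bounded(
    urls: Iterable[str],
    check: Callable[[str], Awaitable[Any]],
    limit: int = 5,
) -> List[Any]:
    """Run `check` on every URL with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(url: str) -> Any:
        # Wait for a free slot before starting the check.
        async with semaphore:
            return await check(url)

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(url) for url in urls))


async def fake_check(url: str) -> str:
    # Stand-in for a real checker; it only yields control once.
    await asyncio.sleep(0)
    return url.upper()


print(asyncio.run(check_bounded(["a", "b", "c"], fake_check, limit=2)))
```

With the library installed, the same helper could presumably be called as `asyncio.run(check_bounded(urls, is_reachable_async))`.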
            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "reachable",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Alex Mili",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/46/e8/21e4e5877d60130312156916bab3533d91c3293e4467d8436f332d754a9d/reachable-0.4.0.tar.gz",
    "platform": null,
    "description": "**Reachable** checks if a URL exists and is reachable.\n\n# Features\n- Use `HEAD`request instead of `GET` to save some bandwidth\n- Follow redirects\n- Handle local redirects (without full URL in `location` header)\n- Record all the URLs of the redirection chain\n- Check if redirected URL match the TLD of source URL\n- Detect Cloudflare protection\n- Avoid basic bot detectors\n    - Use randome Chrome user agent\n    - Wait between consecutive requests to the same host\n    - Include `Host` header\n- Use of HTTP/2\n\n# Installation\nYou can install it with pip :\n```bash\npip install reachable\n```\nOr clone this repository and simply run :\n```bash\ncd reachable/\npip install -e .\n```\n\n# Usage\n\n## Simple URL\n```python\nfrom reachable import is_reachable\nresult = is_reachable(\"https://google.com\")\n```\n\nThe output will look like this:\n```json\n{\n    \"original_url\": \"https://google.com\",\n    \"final_url\": \"https://www.google.com/\",\n    \"response\": null, \n    \"status_code\": 200,\n    \"success\": true,\n    \"error_name\": null,\n    \"cloudflare_protection\": false,\n    \"redirect\": {\n        \"chain\": [\"https://www.google.com/\"],\n        \"final_url\": \"https://www.google.com/\",\n        \"tld_match\": true\n    }\n}\n```\n\n## Multiple URLs\n```python\nfrom reachable import is_reachable\nresult = is_reachable([\"https://google.com\", \"http://bing.com\"])\n```\n\nThe output will look like this:\n```json\n[\n    {\n        \"original_url\": \"https://google.com\",\n        \"final_url\": \"https://www.google.com/\",\n        \"response\": null, \n        \"status_code\": 200,\n        \"success\": true,\n        \"error_name\": null,\n        \"cloudflare_protection\": false,\n        \"redirect\": {\n            \"chain\": [\"https://www.google.com/\"],\n            \"final_url\": \"https://www.google.com/\",\n            \"tld_match\": true\n        }\n    },\n    {\n        \"original_url\": 
\"http://bing.com\",\n        \"final_url\": \"https://www.bing.com/?toWww=1&redig=16A78C94\",\n        \"response\": null,\n        \"status_code\": 200,\n        \"success\": true,\n        \"error_name\": null,\n        \"cloudflare_protection\": false,\n        \"redirect\": {\n            \"chain\": [\"https://www.bing.com:443/?toWww=1&redig=16A78C94\"],\n            \"final_url\": \"https://www.bing.com/?toWww=1&redig=16A78C94\",\n            \"tld_match\": true\n        }\n    }\n]\n```\n\n## Async\n```python\nimport asyncio\nfrom reachable import is_reachable_async\n\nresult = asyncio.run(is_reachable_async(\"https://google.com\"))\n```\nor\n```python\nimport asyncio\nfrom reachable import is_reachable_async\n\nurls = [\"https://google.com\", \"https://bing.com\"]\n\ntry:\n    loop = asyncio.get_running_loop()\nexcept RuntimeError:\n    # No loop already exists so we crete one\n    loop = asyncio.new_event_loop()\n    asyncio.set_event_loop(loop)\ntry:\n    result = loop.run_until_complete(asyncio.gather(*[is_reachable_async(url) for url in urls]))\nfinally:\n    loop.close()\n```",
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2016  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.",
    "summary": "Check if a URL is reachable",
    "version": "0.4.0",
    "project_urls": {
        "Documentation": "https://github.com/AlexMili/Reachable",
        "Homepage": "https://github.com/AlexMili/Reachable",
        "Issues": "https://github.com/AlexMili/Reachable/issues",
        "Repository": "https://github.com/AlexMili/Reachable"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0448954223685625ff786c38d0a7c067f558a53de5f18ef666c80053347103c6",
                "md5": "065d21c69a0089c938cf3d014a449454",
                "sha256": "3706f351258990cbd47cb82e8247e341fbeee2de13147cb0dd9eb1bb06952202"
            },
            "downloads": -1,
            "filename": "reachable-0.4.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "065d21c69a0089c938cf3d014a449454",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 9169,
            "upload_time": "2024-10-03T15:09:33",
            "upload_time_iso_8601": "2024-10-03T15:09:33.680664Z",
            "url": "https://files.pythonhosted.org/packages/04/48/954223685625ff786c38d0a7c067f558a53de5f18ef666c80053347103c6/reachable-0.4.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "46e821e4e5877d60130312156916bab3533d91c3293e4467d8436f332d754a9d",
                "md5": "c659e05950cfc6448d2414a4b55a4aa3",
                "sha256": "a210bab228415e36e4a3e8d1f42f267893b8e507537bacb2a3236c872532210d"
            },
            "downloads": -1,
            "filename": "reachable-0.4.0.tar.gz",
            "has_sig": false,
            "md5_digest": "c659e05950cfc6448d2414a4b55a4aa3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 8540,
            "upload_time": "2024-10-03T15:09:34",
            "upload_time_iso_8601": "2024-10-03T15:09:34.712096Z",
            "url": "https://files.pythonhosted.org/packages/46/e8/21e4e5877d60130312156916bab3533d91c3293e4467d8436f332d754a9d/reachable-0.4.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-10-03 15:09:34",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "AlexMili",
    "github_project": "Reachable",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "reachable"
}
        