proxy-fetcher


Name: proxy-fetcher
Version: 0.3.1
Home page: https://github.com/ilmir-muslim/proxy-fetcher
Summary: Package for fetching and validating working HTTP/HTTPS proxies from multiple sources
Upload time: 2025-07-15 14:48:46
Maintainer: None
Docs URL: None
Author: Ilmir Gilmiiarov
Requires Python: >=3.11
License: None
Keywords: proxy, scraper, validator, fetcher
Proxy Fetcher (Enhanced)

Python package for fetching and validating HTTP/HTTPS proxies from multiple sources with asynchronous support and custom validation.
Features

    Asynchronous validation for high-performance proxy checking

    Custom URL testing - verify proxies against specific websites

    Advanced validation with custom response validators

    User-Agent customization to avoid blocking

    Multiple proxy sources (Geonode, ProxyScrape, Free-Proxy.CZ)

    Progress tracking with tqdm

    Automatic saving of working proxies

Installation
```bash
pip install proxy-fetcher
```

    Note: Requires Python 3.11 or higher

Quick Start
Basic Usage
```python
from proxy_fetcher import get_proxies

# Get 10 working proxies (default)
proxies = get_proxies()
print(f"Found {len(proxies)} working proxies")
```

Custom URL Testing
```python
from proxy_fetcher import get_proxies

# Test proxies against a specific website
proxies = get_proxies(
    custom_url="https://www.google.com/finance/quote/AED-KZT",
    timeout=8
)
```

Advanced Validation
```python
from proxy_fetcher import get_proxies

def custom_validator(response, content):
    # Accept the proxy only if the page loaded (HTTP 200) and contains the expected marker
    return "AED-KZT" in content and response.status == 200

proxies = get_proxies(
    custom_url="https://www.google.com/finance/quote/AED-KZT",
    custom_validator=custom_validator,
    user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36"
)
```

Performance Tuning
```python
from proxy_fetcher import get_proxies

proxies = get_proxies(
    CONCURRENT_CHECKS=200,  # Parallel checks (default: 100)
    TIMEOUT=3,              # Shorter timeout for faster results
    PROXY_LIMIT=500,        # More proxies to test
    MAX_ATTEMPTS=5          # More attempts to find proxies
)
```

Configuration Options
| Parameter | Default | Description |
| --- | --- | --- |
| MIN_WORKING_PROXIES | 10 | Minimum working proxies to find |
| PROXY_LIMIT | 100 | Max proxies to fetch per attempt |
| TIMEOUT | 5 | Timeout for proxy validation (seconds) |
| MAX_ATTEMPTS | 3 | Max attempts to reach target count |
| CONCURRENT_CHECKS | 100 | Number of concurrent proxy checks (increase for speed) |
| TEST_URLS | Standard list | Default URLs for proxy validation |
| CUSTOM_TEST_URL | None | Specific URL to test proxies against (overrides TEST_URLS) |
| CUSTOM_VALIDATOR | None | Custom function to validate proxy response (see example) |
| USER_AGENT | Modern browser | User-Agent header to use for validation requests |

Advanced Usage
Using the ProxyFetcher Class
```python
from proxy_fetcher import ProxyFetcher

fetcher = ProxyFetcher(
    MIN_WORKING_PROXIES=5,
    CUSTOM_TEST_URL="https://example.com",
    CONCURRENT_CHECKS=250,
    USER_AGENT="My Custom User Agent"
)

if fetcher.fetch_proxies():
    print(f"Working proxies: {fetcher.working_proxies}")
    # Save to custom file
    with open('my_proxies.txt', 'w') as f:
        f.write('\n'.join(fetcher.working_proxies))
```

Custom Validator Examples

For Google Finance:
```python
def google_finance_validator(response, content):
    # Require HTTP 200 and both expected markers in the page body
    return (
        response.status == 200
        and "AED-KZT" in content
        and "currency exchange rate" in content
    )
```

For JSON APIs:
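```python
import json

def json_api_validator(response, content):
    # Reject non-JSON bodies instead of swallowing every exception
    try:
        data = json.loads(content)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    # Require a JSON object with a truthy "success" field and HTTP 200
    return isinstance(data, dict) and bool(data.get("success")) and response.status == 200
```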
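
A validator can be exercised offline before it is handed to the fetcher. The sketch below assumes only what the examples above show: the validator receives a response object exposing `.status` and the body as a string. `SimpleNamespace` is a stand-in for the real response object, and `check_validator` is a hypothetical helper, not part of proxy-fetcher.

```python
import json
from types import SimpleNamespace


def json_api_validator(response, content):
    """Accept only HTTP 200 responses whose JSON object has a truthy "success"."""
    try:
        data = json.loads(content)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(data, dict) and bool(data.get("success")) and response.status == 200


def check_validator(validator, status, body):
    """Hypothetical helper: call a validator with a stub response object."""
    stub = SimpleNamespace(status=status)  # stand-in; only .status is assumed
    return bool(validator(stub, body))


if __name__ == "__main__":
    print(check_validator(json_api_validator, 200, '{"success": true}'))
    print(check_validator(json_api_validator, 200, "<html>blocked</html>"))
```

Checking a validator this way against a saved copy of the target page makes it easier to tell a bad proxy from a validator that is simply too strict.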

Proxy Storage
Automatic Saving

Working proxies are automatically saved to working_proxies.txt after successful validation.
Manual Export
```python
from proxy_fetcher import get_proxies

proxies = get_proxies()
with open('custom_proxies.txt', 'w') as f:
    f.write('\n'.join(proxies))
```

Loading Proxies
```python
with open('working_proxies.txt') as f:
    loaded_proxies = f.read().splitlines()
```
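
Once loaded, a proxy can be used with any HTTP client. The following is a minimal sketch using only the standard library. It assumes each saved line is either `host:port` or a full proxy URL (the file format is not specified above), and `to_proxy_mapping` and `fetch_via_proxy` are hypothetical helper names.

```python
import urllib.request


def to_proxy_mapping(proxy: str) -> dict[str, str]:
    """Turn 'host:port' (or a full proxy URL) into a scheme -> proxy URL mapping."""
    proxy = proxy.strip()
    if "://" not in proxy:
        proxy = f"http://{proxy}"  # assumption: bare entries are plain HTTP proxies
    return {"http": proxy, "https": proxy}


def fetch_via_proxy(url: str, proxy: str, timeout: float = 10.0) -> str:
    """Fetch url through the given proxy and return the decoded body."""
    handler = urllib.request.ProxyHandler(to_proxy_mapping(proxy))
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, `fetch_via_proxy("http://httpbin.org/ip", loaded_proxies[0])` should return a body that reports the proxy's address rather than your own, provided the proxy is still alive.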

## Performance Tips

    Increase concurrency: Set CONCURRENT_CHECKS=200-500 for faster validation

    Reduce timeout: Set TIMEOUT=3-5 seconds for public proxies

    Custom validation: Implement specific checks for target websites

    Use fresh proxies: Public proxies often have short lifespans
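
The effect of a concurrency setting such as CONCURRENT_CHECKS can be pictured as a semaphore that caps how many checks are in flight at once. This is an illustration using asyncio and a simulated check, not the package's actual implementation; `check_all`, `fake_check`, and `limit` are hypothetical names.

```python
import asyncio


async def check_all(proxies, check, limit=100):
    """Run check(proxy) for every proxy, with at most `limit` checks in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(proxy):
        async with semaphore:
            return proxy, await check(proxy)

    results = await asyncio.gather(*(guarded(p) for p in proxies))
    # Keep only the proxies whose check returned a truthy value
    return [proxy for proxy, ok in results if ok]


async def fake_check(proxy):
    """Simulated check: pretend that proxies on even ports respond."""
    await asyncio.sleep(0.01)  # stands in for network latency
    return int(proxy.rsplit(":", 1)[1]) % 2 == 0


if __name__ == "__main__":
    candidates = [f"10.0.0.1:{port}" for port in range(8000, 8010)]
    print(asyncio.run(check_all(candidates, fake_check, limit=3)))
```

A higher limit shortens the total run only until bandwidth or the operating system's socket limits become the bottleneck, which is why very large values tend to give diminishing returns.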

## Troubleshooting

    No proxies found:

        Increase TIMEOUT (10-15 seconds)

        Use simpler TEST_URLS (like http://httpbin.org/ip)

        Increase MAX_ATTEMPTS

    Slow validation:

        Increase CONCURRENT_CHECKS

        Reduce PROXY_LIMIT

    Blocked by target site:

        Rotate User-Agents

        Add delays between requests
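
Rotating User-Agents and spacing out requests can be combined in one small helper. This is a sketch under assumptions: the User-Agent strings are examples only, `request_schedule` is a hypothetical name, and how each value is applied depends on the HTTP client you use for the actual requests.

```python
import itertools
import random

USER_AGENTS = [
    # Example strings only; substitute the agents relevant to your target site
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0",
]


def request_schedule(urls, min_delay=1.0, max_delay=3.0, rng=None):
    """Yield (url, user_agent, delay_seconds) with rotating agents and jittered delays."""
    rng = rng or random.Random()
    agents = itertools.cycle(USER_AGENTS)
    for url in urls:
        yield url, next(agents), rng.uniform(min_delay, max_delay)
```

Before each request, sleep for the yielded delay and send the yielded string as the `User-Agent` header.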

License

MIT
Changelog (v0.3.0)

    Added asynchronous validation with aiohttp

    Implemented custom URL testing

    Added support for response validators

    Enhanced User-Agent customization

    Improved performance with concurrent checks

    Updated documentation with Google Finance examples

For support and issues, visit the GitHub repository: https://github.com/ilmir-muslim/proxy-fetcher/issues

## Compatibility

The package works with Python 3.11 and later. To use it with Python 3.11, install:

```bash
pip install proxy-fetcher==0.3.1
```


Raw data

{
    "_id": null,
    "home_page": "https://github.com/ilmir-muslim/proxy-fetcher",
    "name": "proxy-fetcher",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.11",
    "maintainer_email": null,
    "keywords": "proxy scraper validator fetcher",
    "author": "Ilmir Gilmiiarov",
    "author_email": "ilmir_gf@mail.ru",
    "download_url": "https://files.pythonhosted.org/packages/6e/64/0eda87cf68a89a2a917dae2e5e7b146797a785c89aa05a8139d6c5ba3006/proxy_fetcher-0.3.1.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Package for fetching and validating working HTTP/HTTPS proxies from multiple sources",
    "version": "0.3.1",
    "project_urls": {
        "Bug Reports": "https://github.com/ilmir-muslim/proxy-fetcher/issues",
        "Homepage": "https://github.com/ilmir-muslim/proxy-fetcher",
        "Source": "https://github.com/ilmir-muslim/proxy-fetcher"
    },
    "split_keywords": [
        "proxy",
        "scraper",
        "validator",
        "fetcher"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9f3b4e126cfb021c40a7c4012d8cf81d846490e53919b15e120c7f8da03ccd98",
                "md5": "c48d4bfadd2bf2be62fb0af96436b03c",
                "sha256": "3aa1b017a1d924dd060f69df7df78917b91f681156f31e9e09ce8002c4aa99e3"
            },
            "downloads": -1,
            "filename": "proxy_fetcher-0.3.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "c48d4bfadd2bf2be62fb0af96436b03c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.11",
            "size": 6312,
            "upload_time": "2025-07-15T14:48:45",
            "upload_time_iso_8601": "2025-07-15T14:48:45.400132Z",
            "url": "https://files.pythonhosted.org/packages/9f/3b/4e126cfb021c40a7c4012d8cf81d846490e53919b15e120c7f8da03ccd98/proxy_fetcher-0.3.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "6e640eda87cf68a89a2a917dae2e5e7b146797a785c89aa05a8139d6c5ba3006",
                "md5": "f47c4dd5bb7b486fd9a7c934dd0bbc49",
                "sha256": "2929d0b52e190d68867bb0736f3a104b7ab6d784c0f21fb9892e8913b795803e"
            },
            "downloads": -1,
            "filename": "proxy_fetcher-0.3.1.tar.gz",
            "has_sig": false,
            "md5_digest": "f47c4dd5bb7b486fd9a7c934dd0bbc49",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.11",
            "size": 5882,
            "upload_time": "2025-07-15T14:48:46",
            "upload_time_iso_8601": "2025-07-15T14:48:46.550137Z",
            "url": "https://files.pythonhosted.org/packages/6e/64/0eda87cf68a89a2a917dae2e5e7b146797a785c89aa05a8139d6c5ba3006/proxy_fetcher-0.3.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-07-15 14:48:46",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "ilmir-muslim",
    "github_project": "proxy-fetcher",
    "github_not_found": true,
    "lcname": "proxy-fetcher"
}
        