requests-tor


* Name: requests-tor
* Version: 1.4
* Home page: https://github.com/deedy5/requests_tor
* Summary: Multithreading requests via TOR with automatic TOR new identity
* Upload time: 2022-12-14 17:20:51
* Author: deedy5
* Requires Python: >=3.6
* License: MIT
* Requirements: none recorded

[![Python >= 3.6](https://img.shields.io/badge/python->=3.6-red.svg)](https://www.python.org/downloads/) [![](https://badgen.net/github/release/deedy5/requests_tor)](https://github.com/deedy5/requests_tor/releases) [![](https://badge.fury.io/py/requests-tor.svg)](https://pypi.org/project/requests-tor)
# requests_tor 

`Release history:` [https://pypi.org/project/requests-tor/#history](https://pypi.org/project/requests-tor/#history)

---

Multithreading requests via [TOR](https://www.torproject.org) with automatic TOR new identity.

A wrapper around the [requests](https://docs.python-requests.org) and [stem](https://stem.torproject.org) libraries.
Each request method returns a [requests.Response](https://docs.python-requests.org/en/latest/api/#requests.Response) object.
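
To make the "wrapper" idea concrete, here is a minimal sketch of how plain `requests` is commonly pointed at a Tor SOCKS port. The helper name `tor_proxies` is hypothetical, and this is not necessarily how requests_tor builds its sessions internally. The `socks5h` scheme (DNS resolved by the proxy) is the usual choice when `requests` has SOCKS support installed.

```python
def tor_proxies(port=9150, host="127.0.0.1"):
    """Return a requests-style proxies mapping for one Tor SOCKS port.

    Hypothetical helper for illustration only. 9150 is the Tor Browser
    default named in this README; standalone Tor is shown with 9050.
    """
    proxy = f"socks5h://{host}:{port}"
    return {"http": proxy, "https": proxy}


if __name__ == "__main__":
    # With SOCKS support installed, this mapping could be passed as
    # requests.get(url, proxies=tor_proxies(9050)).
    print(tor_proxies(9050))
```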

Requests masquerade as Tor Browser by sending its default headers:
``` 
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.5",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; rv:102.0) Gecko/20100101 Firefox/102.0",
```
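
As a sketch of how these defaults look in code, the dict below reproduces the headers listed above. The `merge_headers` helper is hypothetical: it only illustrates how caller-supplied headers could take precedence over the defaults, and is not taken from the requests_tor source.

```python
# Default headers copied from the list above.
TOR_BROWSER_HEADERS = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8",
    "Accept-Encoding": "gzip, deflate, br",
    "Accept-Language": "en-US,en;q=0.5",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; rv:102.0) Gecko/20100101 Firefox/102.0",
}


def merge_headers(custom=None):
    """Return a copy of the defaults overlaid with caller-supplied headers.

    Hypothetical helper: keys are matched case-sensitively here, whereas
    HTTP header names themselves are case-insensitive.
    """
    merged = dict(TOR_BROWSER_HEADERS)
    merged.update(custom or {})
    return merged


if __name__ == "__main__":
    print(merge_headers({"User-Agent": "my-agent/1.0"})["User-Agent"])
```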

### Install

```
pip install -U requests_tor
```

### Dependencies
Download and start [Tor Browser](https://www.torproject.org/download/), or install [Tor](https://community.torproject.org/onion-services/setup/install/).

_Notes:_
* In the Tor [torrc file](https://support.torproject.org/tbb/tbb-editing-torrc/), the control port is disabled by default. Uncomment the line `ControlPort 9051` to enable it.
* If you get the error `Authentication failed: unable to read '/run/tor/control.authcookie' ([Errno 13] Permission denied: '/run/tor/control.authcookie')`, add your current user to the Tor group. Run `ps ax o comm,group | grep tor` to find the group name (it appears in the second column, for example `debian-tor`), then run `sudo usermod -a -G debian-tor $USER` to add your user to that group.
* Restart Tor (`/etc/init.d/tor restart`) and log in again.
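
After following the notes above, a quick way to confirm that Tor is listening is to try a TCP connection to the SOCKS and control ports. The sketch below uses only the standard library. The function name `port_is_open` is hypothetical, and the port numbers assume a standalone Tor with `ControlPort 9051` enabled.

```python
import socket


def port_is_open(port, host="127.0.0.1", timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports: 9050 (SOCKS) and 9051 (control) for standalone Tor.
    for name, port in (("SOCKS", 9050), ("control", 9051)):
        state = "open" if port_is_open(port) else "closed"
        print(f"{name} port {port}: {state}")
```

An open control port only shows that Tor is listening; it does not prove that authentication will succeed.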

---
### Simple usage
```python
from requests_tor import RequestsTor

# If you use Tor Browser (default ports 9150/9151)
rt = RequestsTor()
# OR, if you use standalone Tor, replace the line above with:
# rt = RequestsTor(tor_ports=(9050,), tor_cport=9051)

url = 'https://httpbin.org/anything'
r = rt.get(url)
print(r.text)

urls = ['https://foxnews.com', 'https://nbcnews.com', 'https://wsj.com/news/world',
        'https://abcnews.go.com', 'https://cbsnews.com',  'https://nytimes.com',
        'https://usatoday.com','https://reuters.com/world', 'http://bbc.com/news',
        'https://theguardian.com/world', 'https://cnn.com', 'https://apnews.com']
r = rt.get_urls(urls)
print(r[-1].text)
```

---
### Advanced usage
[Edit the torrc file](https://support.torproject.org/tbb/tbb-editing-torrc/):

1. Add [SOCKS ports](https://2019.www.torproject.org/docs/tor-manual.html.en#SocksPort):
```
SocksPort 9000 IsolateDestAddr
SocksPort 9001 IsolateDestAddr
SocksPort 9002 IsolateDestAddr
SocksPort 9003 IsolateDestAddr
SocksPort 9004 IsolateDestAddr
```
2. Add a password for the control port (optional):

Generate a hashed password and add it to the torrc file as [HashedControlPassword](https://2019.www.torproject.org/docs/tor-manual.html.en#HashedControlPassword).
```
HashedControlPassword hashed_password
```
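
When many SOCKS ports are needed, the torrc lines from step 1 and the matching `tor_ports` tuple can be generated together so they stay in sync. This is a small sketch; the helper name `socks_port_lines` is hypothetical and simply reproduces the `SocksPort <port> IsolateDestAddr` pattern shown above.

```python
def socks_port_lines(first_port=9000, count=5, flag="IsolateDestAddr"):
    """Return (ports, torrc_lines) for a run of consecutive SOCKS ports."""
    ports = tuple(range(first_port, first_port + count))
    lines = [f"SocksPort {port} {flag}" for port in ports]
    return ports, lines


if __name__ == "__main__":
    ports, lines = socks_port_lines()
    print("\n".join(lines))  # paste these lines into torrc
    print(ports)             # pass this tuple as tor_ports
```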
---
```python
from requests_tor import RequestsTor

rt = RequestsTor(tor_ports=(9000, 9001, 9002, 9003, 9004), tor_cport=9151, password=None,
                 autochange_id=5, threads=8)
"""
    tor_ports = tuple of Tor SOCKS ports (default is (9150,), the Tor Browser default);
    if more than one port is set, requests are sent through each port in turn;
    tor_cport = Tor control port (default is 9151 for Tor Browser; use 9051 for standalone Tor);
    password = Tor control port password (default is None);
    autochange_id = number of requests sent through one Tor SOCKS port before the Tor identity
    is changed (default=5); set autochange_id = 0 to turn off automatic identity changes;
    threads = number of threads used to download a list of urls (default=8);
    """
    
# check your ip
rt.check_ip()

# new Tor identity. Calling this function includes time.sleep(3)
rt.new_id()

# test automatic TOR new identity
rt.test()

# Requests. TOR new identity is executed after (autochange_id * len(tor_ports)) requests.
# GET request. 
rt.get(url, params=None, **kwargs)

# POST request. 
rt.post(url, data=None, json=None, **kwargs)

# PUT request. 
rt.put(url, data=None, **kwargs)

# PATCH request.
rt.patch(url, data=None, **kwargs)

# DELETE request.
rt.delete(url, **kwargs)

# HEAD request.
rt.head(url, **kwargs)

"""
    url – URL for the new Request object.
    params – dictionary, list of tuples or bytes to send in the query string.
    **kwargs – optional arguments that request takes:
        data – (optional) Dictionary, list of tuples, bytes, or file-like object 
                to send in the body of the request.
        json – (optional) A JSON serializable Python object 
                to send in the body of the Request.
        headers – (optional) Dictionary of HTTP Headers to send with the Request.
        cookies – (optional) Dict or CookieJar object to send with the Request.
        files – (optional) Dictionary of 'name': file-like-objects (or {'name': file-tuple}) 
            for multipart encoding upload. file-tuple can be a 2-tuple ('filename', fileobj), 
            a 3-tuple ('filename', fileobj, 'content_type') or a 4-tuple 
            ('filename', fileobj, 'content_type', custom_headers), where 'content_type' is 
            a string defining the content type of the given file and custom_headers is 
            a dict-like object containing additional headers to add for the file.
        auth – (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
        timeout – (optional) How many seconds to wait for the server to send data before 
                giving up, as a float, or a (connect timeout, read timeout) tuple.
        allow_redirects (bool) – (optional) Boolean. 
            Enable/disable GET/OPTIONS/POST/PUT/PATCH/DELETE/HEAD redirection. Defaults to True.
        proxies – (optional) Dictionary mapping protocol to the URL of the proxy.
        verify – (optional) Either a boolean, in which case it controls whether we verify 
            the server’s TLS certificate, or a string, in which case it must be a path to 
            a CA bundle to use. Defaults to True.
        stream – (optional) if False, the response content will be immediately downloaded.
        cert – (optional) if String, path to ssl client cert file (.pem). 
                If Tuple, ('cert', 'key') pair.
        """
```
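
The comments above say that requests pass through each SOCKS port in turn and that a new identity is requested after `autochange_id * len(tor_ports)` requests. The following stdlib-only simulation sketches that schedule. It is an illustration of the stated rule, not the library's implementation, and the function name `simulate_schedule` and the "epoch" counter are assumptions made for the example.

```python
from itertools import cycle


def simulate_schedule(n_requests, tor_ports, autochange_id):
    """Return a list of (port, identity_epoch) pairs, one per request.

    identity_epoch counts how many identity changes would have happened
    before the request, assuming one change every
    autochange_id * len(tor_ports) requests (0 disables changes).
    """
    per_identity = autochange_id * len(tor_ports)
    ports = cycle(tor_ports)  # round-robin over the SOCKS ports
    schedule = []
    for i in range(n_requests):
        epoch = i // per_identity if per_identity else 0
        schedule.append((next(ports), epoch))
    return schedule


if __name__ == "__main__":
    for port, epoch in simulate_schedule(12, (9000, 9001), 3):
        print(port, epoch)
```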
## Examples
### 1. Get url with unique params and headers in request.
```python
from requests_tor import RequestsTor

rt = RequestsTor(tor_ports=(9000, 9001, 9002, 9003, 9004), autochange_id=5)

url = 'https://httpbin.org/anything'
params = {
    "id": 12345,
    "status": 'passed'
    }
headers = {
    "Origin": "https://www.foxnews.com",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
    }
r = rt.get(url, params=params, headers=headers)
print(r.text)  
```
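
Because `https://httpbin.org/anything` echoes the request back as JSON, the response can be used to confirm that the custom params and headers were actually sent. The checker below is a sketch that works on the parsed JSON (for instance `r.json()`). The name `echo_matches` is hypothetical, and it assumes the echo contains `args` and `headers` objects with query values returned as strings.

```python
def echo_matches(payload, params, headers):
    """Check an httpbin-style echo (parsed JSON dict) against what was sent."""
    args = payload.get("args", {})
    # Query-string values come back as strings, so compare against str(v).
    args_ok = all(args.get(k) == str(v) for k, v in params.items())
    # Header names are case-insensitive, so compare on lower-cased keys.
    echoed = {k.lower(): v for k, v in payload.get("headers", {}).items()}
    headers_ok = all(echoed.get(k.lower()) == v for k, v in headers.items())
    return args_ok and headers_ok


if __name__ == "__main__":
    # Hand-written sample payload for illustration, not a recorded response.
    sample = {
        "args": {"id": "12345", "status": "passed"},
        "headers": {"Origin": "https://www.foxnews.com"},
    }
    print(echo_matches(sample, {"id": 12345, "status": "passed"},
                       {"Origin": "https://www.foxnews.com"}))
```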

### 2. Get list of urls concurrently.
```python
from requests_tor import RequestsTor

rt = RequestsTor(tor_ports=(9000, 9001, 9002, 9003, 9004), autochange_id=5)

# Get a list of urls concurrently. How often the TOR identity changes depends on the number of
# SOCKS ports and on autochange_id: with 5 SOCKS ports and autochange_id=5, the identity changes
# after 5*5=25 urls. This matters because each identity change includes time.sleep(3).
# get_urls(urls) also accepts params, headers and other arguments from the requests library.
urls = ('https://checkip.amazonaws.com' for _ in range(10))
results = rt.get_urls(urls)
for r in results:
    print(r.text) 
```
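
The trade-off described in the comments (more identity changes mean more 3-second pauses) can be estimated before running a job. This is a rough sketch under the assumption that one identity change happens per full batch of `n_ports * autochange_id` urls. The function name is hypothetical, and the exact moment at which the library triggers a change may differ.

```python
def identity_change_overhead(n_urls, n_ports, autochange_id, sleep_seconds=3):
    """Estimate (number_of_identity_changes, total_sleep_seconds)."""
    if autochange_id == 0:
        # autochange_id = 0 turns automatic identity changes off.
        return 0, 0
    per_identity = n_ports * autochange_id
    changes = n_urls // per_identity
    return changes, changes * sleep_seconds


if __name__ == "__main__":
    # 10 urls, 5 ports, autochange_id=5: fewer than 25 urls, so no change.
    print(identity_change_overhead(10, 5, 5))
    # 50 urls, 5 ports, autochange_id=1: a change every 5 urls.
    print(identity_change_overhead(50, 5, 1))
```

Under these assumptions, lowering `autochange_id` gives more distinct exit identities at the cost of more time spent sleeping.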

 
### 3. Get list of urls concurrently with unique ip for each url
```python
from requests_tor import RequestsTor

rt = RequestsTor(tor_ports=(9000, 9001, 9002, 9003, 9004), autochange_id=1)

urls = (f'https://habr.com/ru/post/{x}' for x in range(1, 51))
r = rt.get_urls(urls)
print(r[-1].text)
```
---

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/deedy5/requests_tor",
    "name": "requests-tor",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "",
    "author": "deedy5",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/66/d1/e14635310900e17c13bdf331c805034e7116a6bc91aff12fced4b21d12e2/requests_tor-1.4.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Multithreading requests via TOR with automatic TOR new identity",
    "version": "1.4",
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "md5": "e195a94e05da2c0ff05aa3f8096cff12",
                "sha256": "817274861a57e821b77df6db3ecd4253f5b9a103d7e4e7b408c035f1e5016fa5"
            },
            "downloads": -1,
            "filename": "requests_tor-1.4-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "e195a94e05da2c0ff05aa3f8096cff12",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 7245,
            "upload_time": "2022-12-14T17:20:49",
            "upload_time_iso_8601": "2022-12-14T17:20:49.582097Z",
            "url": "https://files.pythonhosted.org/packages/09/9a/8c91c538db50caa65027774052f91f12cf8ad36d3097d45acc8f270d6574/requests_tor-1.4-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "md5": "3f1ecaf1bafff9af20ce3766b2acc86d",
                "sha256": "f7dcea87fc40105294ef6ea7c3ff665bed161262dac562c169d33cd2e9ee18f3"
            },
            "downloads": -1,
            "filename": "requests_tor-1.4.tar.gz",
            "has_sig": false,
            "md5_digest": "3f1ecaf1bafff9af20ce3766b2acc86d",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 6731,
            "upload_time": "2022-12-14T17:20:51",
            "upload_time_iso_8601": "2022-12-14T17:20:51.163112Z",
            "url": "https://files.pythonhosted.org/packages/66/d1/e14635310900e17c13bdf331c805034e7116a6bc91aff12fced4b21d12e2/requests_tor-1.4.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2022-12-14 17:20:51",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "deedy5",
    "github_project": "requests_tor",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "requests-tor"
}
        