hrfh


- Name: hrfh
- Version: 0.1.15
- Summary: an HTTP Response Fuzzy Hashing package
- Author: Yihang Wang
- Upload time: 2024-06-28 08:43:19
- Requires Python: `<4.0,>=3.7`

## Usage

```bash
pip install hrfh
```

```python
from hrfh.utils.parser import create_http_response_from_bytes
response = create_http_response_from_bytes(b"""HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n""")
print(response)
print(response.masked)
print(response.fuzzy_hash())
```

```python
>>> from hrfh.utils.parser import create_http_response_from_bytes
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data]   Unzipping corpora/words.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
>>> response = create_http_response_from_bytes(b"""HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n""")
>>> print(response)
<HTTPResponse 1.1.1.1:80 200 OK>
>>> print(response.masked)
HTTP/1.0 200 OK
ETag: [MASK]
Server: apache
Server: nginx
>>> print(response.fuzzy_hash())
ba15cc1f9ad3ef632d0ce7798f7fa44718f1e7fcc2c0f94c1a702f647b79923b
```
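The session above shows two visible effects: header lines come out sorted, and the ETag value is replaced by `[MASK]`. The sketch below is a toy illustration of that idea only, not hrfh's actual algorithm. It assumes that a "volatile" value is any run of 16 or more hex characters and that the digest is a plain SHA-256 of the masked text; `toy_mask` and `toy_hash` are hypothetical names, and the digests they produce are not expected to match those returned by `fuzzy_hash()`.

```python
import hashlib
import re

# Assumption: treat any run of 16+ hex characters (e.g. an ETag) as volatile.
HEX_TOKEN = re.compile(r"\b[0-9a-fA-F]{16,}\b")


def toy_mask(raw: bytes) -> str:
    """Return the status line followed by sorted, masked header lines."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    status_line, *header_lines = head.split("\r\n")
    masked = [HEX_TOKEN.sub("[MASK]", line) for line in header_lines]
    return "\n".join([status_line] + sorted(masked))


def toy_hash(raw: bytes) -> str:
    """SHA-256 of the masked text (an assumption, not hrfh's hash)."""
    return hashlib.sha256(toy_mask(raw).encode("utf-8")).hexdigest()


a = b"HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n"
b = b"HTTP/1.0 200 OK\r\nServer: apache\r\nServer: nginx\r\nETag: 0123456789abcdef0123456789abcdef\r\n\r\n"

print(toy_mask(a))
# Same masked text, so the two toy digests are equal.
print(toy_hash(a) == toy_hash(b))
```

Because the ETag value and the header order are normalized away before hashing, two responses that differ only in those details collapse to one digest, which is the property a fuzzy hash of HTTP responses is after.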

## Usage from Source

1. Install requirements

```bash
sudo apt install python3-pip
```

```bash
pip install poetry
poetry install
poetry run python main.py
```

2. Prepare the HTTP response data as JSON files named `data/${cdn}/${ip}.json`

```bash
$ tree data/
data
├── akamai
│   ├── 104.103.147.116.json
│   └── 104.81.222.211.json
├── alibaba-cdn
└── wangsu
```

```bash
cat data/akamai/104.103.147.116.json
```

```json
{
  "ip": "104.103.147.116",
  "timestamp": 1717146116,
  "status_code": 400,
  "status_reason": "Bad Request",
  "headers": {
    "Server": "AkamaiGHost",
    "Mime-Version": "1.0",
    "Content-Type": "text/html",
    "Content-Length": "312",
    "Expires": "Fri, 31 May 2024 09:01:56 GMT",
    "Date": "Fri, 31 May 2024 09:01:56 GMT",
    "Connection": "close"
  },
  "body": "<HTML><HEAD>\n<TITLE>Invalid URL</TITLE>\n</HEAD><BODY>\n<H1>Invalid URL</H1>\nThe requested URL \"&#91;no&#32;URL&#93;\", is invalid.<p>\nReference&#32;&#35;9&#46;8be83217&#46;1717146116&#46;2661874a\n<P>https&#58;&#47;&#47;errors&#46;edgesuite&#46;net&#47;9&#46;8be83217&#46;1717146116&#46;2661874a</P>\n</BODY></HTML>\n"
}
```

3. Run the script to generate the hashes

```bash
poetry run python main.py
```

```
01c7da5c66ffab8b54a <HTTPResponse 45.64.21.148:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 103.151.139.204:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 199.91.74.213:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 156.59.207.6:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 23.90.149.102:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 58.57.102.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 60.188.66.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 117.68.34.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 124.225.184.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 58.42.14.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 101.206.106.41:80 403 Forbidden>
```
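In the output above, responses from different IPs that share a hash appear next to each other. A minimal sketch of grouping such lines by hash is shown below; it parses the printed text only, uses four lines copied from the sample output, and `group_by_hash` is a hypothetical helper that is not part of hrfh.

```python
from collections import defaultdict

# Four lines copied from the sample output above.
output = """\
01c7da5c66ffab8b54a <HTTPResponse 45.64.21.148:80 403 Forbidden>
01c7da5c66ffab8b54a <HTTPResponse 103.151.139.204:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 58.57.102.41:80 403 Forbidden>
100c01467b6bb4c99e7 <HTTPResponse 60.188.66.41:80 403 Forbidden>
"""


def group_by_hash(text: str) -> dict:
    """Map each hash to the list of ip:port endpoints printed with it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        digest, _, rest = line.partition(" ")
        # rest looks like "<HTTPResponse 45.64.21.148:80 403 Forbidden>"
        endpoint = rest.split()[1]
        groups[digest].append(endpoint)
    return dict(groups)


groups = group_by_hash(output)
for digest, endpoints in groups.items():
    print(digest, len(endpoints), endpoints)
```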

## Customize

### Load from another source

1. Implement your own loader that returns a [`HTTPResponse`](hrfh/models/__init__.py) object.
2. Call `HTTPResponse.fuzzy_hash()` to get the hash of the HTTP response.
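One possible loader, sketched under assumptions: since the `HTTPResponse` constructor is not documented here, the example rebuilds raw HTTP bytes from a JSON record in the `data/${cdn}/${ip}.json` layout and leaves object creation to `create_http_response_from_bytes` from the Usage section. `record_to_bytes` and `load_raw_response` are hypothetical helpers, and the `HTTP/1.1` version is an assumption because the record does not store it. The `ip` and `timestamp` fields are not carried over, and whether the parser accepts a body after the headers is not confirmed by this README.

```python
import json
from pathlib import Path


def record_to_bytes(record: dict) -> bytes:
    """Serialize a JSON record (status, headers, body) to raw HTTP bytes."""
    # Assumption: HTTP/1.1; the JSON record does not store the protocol version.
    lines = [f"HTTP/1.1 {record['status_code']} {record['status_reason']}"]
    lines += [f"{name}: {value}" for name, value in record.get("headers", {}).items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("latin-1") + record.get("body", "").encode("utf-8")


def load_raw_response(path) -> bytes:
    """Read one data/${cdn}/${ip}.json file and return raw HTTP bytes."""
    record = json.loads(Path(path).read_text(encoding="utf-8"))
    return record_to_bytes(record)


# With hrfh installed, the bytes can be passed to the parser shown in Usage:
# from hrfh.utils.parser import create_http_response_from_bytes
# raw = load_raw_response("data/akamai/104.103.147.116.json")
# print(create_http_response_from_bytes(raw).fuzzy_hash())

sample = {
    "status_code": 400,
    "status_reason": "Bad Request",
    "headers": {"Server": "AkamaiGHost", "Connection": "close"},
    "body": "<HTML></HTML>\n",
}
raw = record_to_bytes(sample)
print(raw)
```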

## Python 3.7 Support

```python
$ docker run -i -t python:3.7 /bin/bash
root@aa0241a5a2f5:/# python --version
Python 3.7.12
root@aa0241a5a2f5:/# pip --version
pip 24.0 from /usr/local/lib/python3.7/site-packages/pip (python 3.7)
root@aa0241a5a2f5:/# pip install --upgrade -q ipython hrfh==0.1.3
root@aa0241a5a2f5:/# ipython
Python 3.7.12 (default, Dec 21 2021, 11:25:13) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.34.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from hrfh.utils.parser import create_http_response_from_bytes
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data] Downloading package words to /root/nltk_data...
[nltk_data]   Unzipping corpora/words.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.

In [2]: response = create_http_response_from_bytes(b"""HTTP/1.0 200 OK\r\nServer: nginx\r\nServer: apache\r\nETag: ea67ba7f802fb5c6cfa13a6b6d27adc6\r\n\r\n""")

In [3]: response.masked
Out[3]: 'HTTP/1.0 200 OK\nETag: [MASK]\nServer: apache\nServer: nginx'

In [4]: response.fuzzy_hash()
Out[4]: 'ba15cc1f9ad3ef632d0ce7798f7fa44718f1e7fcc2c0f94c1a702f647b79923b'
```

            
