httpdiff

- Name: httpdiff
- Version: 0.1.3
- Summary: HTTPDiff - Finding differences between HTTP responses
- Upload time: 2024-12-14 20:50:21
- Author: William Kristoffersen
- License: MIT
- Keywords: python, httpdiff
- Requirements: No requirements were recorded.

# HTTPDiff

A library written for finding differences between HTTP responses.

- [Disclaimer](https://github.com/WillIWas123/HTTPDiff#disclaimer)
- [About](https://github.com/WillIWas123/HTTPDiff#about)
- [Usecases](https://github.com/WillIWas123/HTTPDiff#usecases)
- [Installation](https://github.com/WillIWas123/HTTPDiff#installation)
- [How it all works](https://github.com/WillIWas123/HTTPDiff#how-it-all-works)
- [Example usage](https://github.com/WillIWas123/HTTPDiff#example-usage)
- [Todo](https://github.com/WillIWas123/HTTPDiff#todo)
- [Tips](https://github.com/WillIWas123/HTTPDiff#tips)

## Disclaimer

- This is considered a beta release and may contain bugs and unintended behavior. Consider yourself warned!

## About

[HTTPDiff](https://github.com/WillIWas123/HTTPDiff) is a library built for finding differences between responses.

A lot of web pentesting tools suck, using regexes or hardcoded values to determine whether something is different. These methods will produce false positives no matter how much you tweak those values. [HTTPDiff](https://github.com/WillIWas123/HTTPDiff) attempts to use a more dynamic way of differentiating responses, with the aim of reducing the false positives produced during a scan.

By sending multiple requests with a known outcome, it is possible to calibrate a baseline of how the application normally behaves. [HTTPDiff](https://github.com/WillIWas123/HTTPDiff) can then be used to find deviations from that default behavior. It analyzes every part of the response: the status code, reason, headers, body, response time, and even errors.


## Usecases

- Want to create a SQL injection scanner? Send a bunch of payloads with random strings for calibration, then send pairs of payloads (e.g. `' or '1'='1` and `' or '1'='2`) and check for differences!

- If you want to brute-force endpoints and directories on a web application, you can start by sending a series of requests to known invalid endpoints. The baseline can now be used to determine if any other endpoints behave in a similar way, or are somehow different. ~~Go to [WebCD](https://github.com/WillIWas123/WebCD) for a good example on how to utilize this library.~~ <- This is a work in progress, be patient.


## Installation


```python3 -m pip install httpdiff```

## How it all works

Here are some details on how the library is built. Feel free to skip this section if you're not interested:


### Here's the process for calibrating:

1. The Baseline object takes a response object as a parameter (among others). Multiple Blobs are created: one each for the headers, reason (status code + message), response time, body, etc.
2. The input bytes are split on multiple characters (`,`, `.`, `;` and whitespace). The resulting list of byte strings is stored as the original lines.
3. A new response is added, and its bytes are split on the same characters.
4. Levenshtein's algorithm (similar in spirit to `git diff`) is used to generate opcodes describing how to transform the original lines into the new lines.
5. Using these opcodes, it is possible to determine the location of each Item with reasonable accuracy and to track replacements, insertions, deletions, etc.
6. Each Item is checked against multiple properties. If all the lines in an Item share a property, that property is stored as a way to analyze lines in the future. A property in this case is a way to compare or measure a line.
7. If no property can be used, the Item is ignored in all future diffing.
8. Repeat from step 3.
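
The splitting and opcode steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not HTTPDiff's actual code: `split_lines` and `opcodes` are hypothetical helpers, and `difflib.SequenceMatcher` stands in for the Levenshtein-based opcodes described in step 4 (it uses a different matching algorithm but produces the same kind of equal/replace/insert/delete opcodes).

```python
import re
from difflib import SequenceMatcher

# Assumed delimiter set, taken from step 2: commas, periods, semicolons, whitespace
SPLIT_PATTERN = re.compile(rb"[,.;\s]+")


def split_lines(data: bytes) -> list[bytes]:
    """Split raw bytes into the 'lines' that get compared, dropping empty pieces."""
    return [piece for piece in SPLIT_PATTERN.split(data) if piece]


def opcodes(original: list[bytes], new: list[bytes]):
    """Return (tag, i1, i2, j1, j2) tuples describing how to turn original into new."""
    return SequenceMatcher(a=original, b=new, autojunk=False).get_opcodes()


original = split_lines(b"Hello alice, your id is 1234.")
new = split_lines(b"Hello bob, your id is 98.")

# Each 'replace' opcode marks a location whose value changed between responses
for tag, i1, i2, j1, j2 in opcodes(original, new):
    print(tag, original[i1:i2], new[j1:j2])
```

Locations reported as `replace` across several calibration responses are the ones that would need a property check, since their content is known to vary.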


### Here's the process for diffing:

1. A new response is passed in, and its bytes are split on the same characters as before.
2. Opcodes are generated in a similar manner as before.
3. Each line is compared against its respective Item to verify that the new line has the same properties as all the previous lines in the Item.
4. If the line does not have one of the stored properties, a Diff is created.
5. (Optional) Find the diffs for two responses that are expected to have different outcomes, then compare those diffs.
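
To make the idea of properties more concrete, here is a minimal sketch of how an Item could learn which properties hold during calibration and then flag a line that breaks them. Everything in it is hypothetical (the `PROPERTIES` table, the `Item` class and its method names); the properties the library really checks may be different.

```python
# Hypothetical properties: each maps a line to a value that should stay constant
PROPERTIES = {
    "identical": lambda line: line,
    "length": len,
    "is_digits": bytes.isdigit,
}


class Item:
    """Lines observed at one location in the response during calibration."""

    def __init__(self):
        self.lines = []

    def add_line(self, line: bytes) -> None:
        self.lines.append(line)

    def stable_properties(self) -> dict:
        """Return the properties on which every calibrated line agrees."""
        stable = {}
        for name, measure in PROPERTIES.items():
            values = {measure(line) for line in self.lines}
            if len(values) == 1:
                stable[name] = values.pop()
        return stable

    def breaks_properties(self, line: bytes) -> bool:
        """True if the new line violates any property learned during calibration."""
        stable = self.stable_properties()
        # With no stable property the Item is effectively ignored (always False)
        return any(PROPERTIES[name](line) != value for name, value in stable.items())


item = Item()
for observed in (b"4821", b"1907", b"5532"):
    item.add_line(observed)

print(item.stable_properties())
print(item.breaks_properties(b"7310"), item.breaks_properties(b"error"))
```

In this sketch the calibrated values are never identical, but they are always four digits, so a new four-digit value passes while `error` would produce a diff.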

## Example usage

Go visit [WebCD](https://github.com/WillIWas123/WebCD) to see an awesome content discovery tool utilizing [HTTPDiff](https://github.com/WillIWas123/HTTPDiff).

Here's a small example script showing how [HTTPDiff](https://github.com/WillIWas123/HTTPDiff) can be used:

```python
import random
import string

import requests
from httpdiff import Response, Baseline, remove_reflection


def calibrate_baseline(baseline):
    for _ in range(10):
        # Use a new random value of random length for every calibration request
        value = "".join(random.choice(string.ascii_letters) for _ in range(random.randint(3, 15)))
        resp = requests.get(f"https://someurl/endpoint?param={value}")
        httpdiff_resp = Response(resp)
        baseline.add_response(httpdiff_resp)


def scan(baseline):
    # Ten 1s and ten 2s make the reflected payload easier to identify in the response
    payload1 = "' or '1111111111'='1111111111"
    resp = requests.get(f"https://someurl/endpoint?param={payload1}")
    httpdiff_resp1 = Response(resp)

    # payload2 is a similar payload that should give a different result if the
    # target is vulnerable; a kind of opposite payload
    payload2 = "' or '1111111111'='2222222222"
    resp = requests.get(f"https://someurl/endpoint?param={payload2}")
    httpdiff_resp2 = Response(resp)

    # Attempt to remove the reflection of the payloads
    remove_reflection(httpdiff_resp1, httpdiff_resp2, payload1, payload2)

    # Use lists instead of the generator output because the diffs are compared
    diffs = list(baseline.is_diff(httpdiff_resp1))
    diffs2 = list(baseline.is_diff(httpdiff_resp2))
    if diffs != diffs2:
        print("Vulnerable to SQL Injection!")


def main():
    baseline = Baseline()
    calibrate_baseline(baseline)
    scan(baseline)


if __name__ == "__main__":
    main()
```
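
A single positive result from a script like this can be a fluke, so it may be worth repeating the check before reporting anything. The helper below is a small sketch of that idea and is not part of HTTPDiff; `confirm` and its `check` argument are hypothetical names, where `check` is any zero-argument callable that re-sends the requests and returns `True` when the two responses differ.

```python
def confirm(check, attempts=3):
    """Return True only if check() reports a positive result on every attempt."""
    for _ in range(attempts):
        if not check():
            # One negative attempt is enough to treat the finding as unreliable
            return False
    return True


# Stand-in for a real check: positive twice, then negative
results = iter([True, True, False])
print(confirm(lambda: next(results)))
```

In a real scanner, `check` would wrap the request and comparison logic so that both payloads are sent again on every attempt.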


## Todo

- Implement more property checks for Items
- Improve method for diffing integer ranges
- Properly handle errors
- Make it easier to "overwrite" functions in order to create custom calibration and diffing methods
- Do a lot more testing with this tool; bugs may still be present
- Multiple TODOs are scattered around the code; these will be addressed some time in the future

## Tips

Some tips for successfully creating your own scanner of some sort:

- Use random values of random length when calibrating a baseline
- Use cachebusters
- Repeat one set of values during calibration (to ensure potential cache hits are included in the baseline)
- Use relatively long arbitrary values (so reflection can be removed with better accuracy)
- Verify the baseline upon a positive result
- Verify the same payload a couple of times upon a positive result to make sure it's not a fluke
- Create an issue if you catch any mistakes in the library
- Tell others about [HTTPDiff](https://github.com/WillIWas123/HTTPDiff)
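
The first two tips can be sketched with the standard library. The helpers below are illustrative only and not part of HTTPDiff: `random_value`, `with_cachebuster` and the `cb` parameter name are assumptions made for this example.

```python
import random
import string
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def random_value(min_len=8, max_len=20):
    """Random letters of random length, long enough to spot as a reflection."""
    length = random.randint(min_len, max_len)
    return "".join(random.choices(string.ascii_letters, k=length))


def with_cachebuster(url, name="cb"):
    """Append a random query parameter so the request is unlikely to hit a cache."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, random_value()))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_cachebuster(f"https://someurl/endpoint?param={random_value()}"))
```

To follow the third tip as well, one fixed URL (without a fresh cachebuster) can be repeated a few times during calibration so that cache hits become part of the baseline.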




            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "httpdiff",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python, httpdiff",
    "author": "William Kristoffersen",
    "author_email": "william.kristof@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/11/c3/50cc27f098a7ce4637d91caf34fc4932bea925d99c24ee0393d8a2cda745/httpdiff-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "HTTPDiff - Finding differences between HTTP responses",
    "version": "0.1.3",
    "project_urls": null,
    "split_keywords": [
        "python",
        " httpdiff"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "8dabe75a81000c2d427040d5ad41614541bd3da6426176f25d59e4f0c7acc0ac",
                "md5": "88911d15b418673c6816a831d30bec57",
                "sha256": "5ff471ce182bed5642f18d509c84549b0c1ffb7ea470805403aea7e6da971912"
            },
            "downloads": -1,
            "filename": "httpdiff-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "88911d15b418673c6816a831d30bec57",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 11428,
            "upload_time": "2024-12-14T20:50:19",
            "upload_time_iso_8601": "2024-12-14T20:50:19.544430Z",
            "url": "https://files.pythonhosted.org/packages/8d/ab/e75a81000c2d427040d5ad41614541bd3da6426176f25d59e4f0c7acc0ac/httpdiff-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "11c350cc27f098a7ce4637d91caf34fc4932bea925d99c24ee0393d8a2cda745",
                "md5": "9cf5dae993ca1f17cca82c3abe5ddafd",
                "sha256": "b3dd3cc6520c73a76bafef6d1289653de7466cf271e8897387a7cdb6df5cdf01"
            },
            "downloads": -1,
            "filename": "httpdiff-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "9cf5dae993ca1f17cca82c3abe5ddafd",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 12894,
            "upload_time": "2024-12-14T20:50:21",
            "upload_time_iso_8601": "2024-12-14T20:50:21.898909Z",
            "url": "https://files.pythonhosted.org/packages/11/c3/50cc27f098a7ce4637d91caf34fc4932bea925d99c24ee0393d8a2cda745/httpdiff-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-14 20:50:21",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "httpdiff"
}
        