subwiz


- Name: subwiz
- Version: 0.1.3
- Summary: A recon tool that uses AI to predict subdomains, then returns those that resolve.
- Upload time: 2024-08-18 18:45:18
- Requires Python: >=3.7
- License: MIT License
- Keywords: machine learning, recon, subdomains, transformers
- Requirements: none recorded
            <pre style="color: lime; background-color: black;">
███████╗██╗   ██╗██████╗     ██╗    ██╗██╗███████╗
██╔════╝██║   ██║██╔══██╗    ██║    ██║██║╚══███╔╝
███████╗██║   ██║██████╔╝    ██║ █╗ ██║██║  ███╔╝ 
╚════██║██║   ██║██╔══██╗    ██║███╗██║██║ ███╔╝  
███████║╚██████╔╝██████╔╝    ╚███╔███╔╝██║███████╗
╚══════╝ ╚═════╝ ╚═════╝      ╚══╝╚══╝ ╚═╝╚══════╝
</pre>

A recon tool that uses AI to predict subdomains, then returns those that resolve.

### Installation

```pip install subwiz```

### Recommended Use

Use [subfinder](https://github.com/projectdiscovery/subfinder) ❤️ to find subdomains from passive sources:

```subfinder -d example.com -o subdomains.txt```

Seed subwiz with these subdomains:

```subwiz -i subdomains.txt```

### Supported Switches

```commandline
usage: cli.py [-h] -i INPUT_FILE [-o OUTPUT_FILE] [-n NUM_PREDICTIONS]
              [--no-resolve] [--force-download] [-t TEMPERATURE]
              [-d {auto,cpu,cuda,mps}] [-q MAX_NEW_TOKENS]
              [--resolution_concurrency RESOLUTION_LIM]

options:
  -h, --help            show this help message and exit
  -i INPUT_FILE, --input-file INPUT_FILE
                        file containing new-line-separated subdomains.
                        (default: None)
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        output file to write new-line separated subdomains to.
                        (default: None)
  -n NUM_PREDICTIONS, --num_predictions NUM_PREDICTIONS
                        number of subdomains to predict. (default: 500)
  --no-resolve          do not resolve the output subdomains. (default: False)
  --force-download      download model and tokenizer files, even if cached.
                        (default: False)
  -t TEMPERATURE, --temperature TEMPERATURE
                        add randomness to the model, recommended ≤ 0.3)
                        (default: 0.0)
  -d {auto,cpu,cuda,mps}, --device {auto,cpu,cuda,mps}
                        hardware to run the transformer model on. (default:
                        auto)
  -q MAX_NEW_TOKENS, --max_new_tokens MAX_NEW_TOKENS
                        maximum length of predicted subdomains in tokens.
                        (default: 10)
  --resolution_concurrency RESOLUTION_LIM
                        number of concurrent resolutions. (default: 128)

```
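The `--resolution_concurrency` switch caps how many DNS lookups run at once. The sketch below illustrates that general idea with an `asyncio.Semaphore`; it is not taken from the subwiz source, and `resolve_all` and its `resolver` parameter are hypothetical names introduced for illustration.

```python
import asyncio
import socket


async def resolve_all(hostnames, limit=128, resolver=None):
    """Return the hostnames that resolve, checking at most `limit` at a time."""
    semaphore = asyncio.Semaphore(limit)

    async def default_resolver(hostname):
        # Ask the system resolver; socket.gaierror means the lookup failed.
        loop = asyncio.get_running_loop()
        try:
            await loop.getaddrinfo(hostname, None)
            return True
        except socket.gaierror:
            return False

    # A custom resolver can be injected, e.g. for testing without a network.
    check = resolver or default_resolver

    async def guarded(hostname):
        # The semaphore blocks here once `limit` lookups are in flight.
        async with semaphore:
            return hostname, await check(hostname)

    # gather() preserves input order, so the output order is stable.
    results = await asyncio.gather(*(guarded(h) for h in hostnames))
    return [hostname for hostname, ok in results if ok]
```

Calling `asyncio.run(resolve_all(names, limit=128))` would return the subset of `names` that the system resolver can look up. A lower limit is gentler on the resolver but makes a large batch take longer.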

### In Python

Use subwiz in Python, with the same inputs as the command line interface.

```python
import subwiz

known_subdomains = ['test1.example.com', 'test2.example.com']
new_subdomains = subwiz.run(input_domains=known_subdomains)
```
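A slightly fuller sketch reads the seed file written by subfinder and keeps only predictions that are not already known. The helpers `load_subdomains`, `only_new`, and `predict_from_file` are hypothetical names, and the code assumes that `subwiz.run` returns an iterable of subdomain strings, which the README does not state explicitly.

```python
from pathlib import Path


def load_subdomains(path):
    """Read newline-separated subdomains, dropping blanks and duplicates."""
    seen = {}
    for line in Path(path).read_text().splitlines():
        name = line.strip().lower()  # DNS names are case-insensitive
        if name:
            seen.setdefault(name, None)  # dict keeps first-seen order
    return list(seen)


def only_new(known, predicted):
    """Keep predictions that are not already in the known list."""
    known_set = set(known)
    return [name for name in predicted if name not in known_set]


def predict_from_file(path):
    # Imported here so the helpers above work without subwiz installed.
    import subwiz

    known = load_subdomains(path)
    # Assumption: subwiz.run returns an iterable of subdomain strings.
    predicted = subwiz.run(input_domains=known)
    return only_new(known, predicted)
```

With subwiz installed, `predict_from_file("subdomains.txt")` would run the model on the seed list from the Recommended Use section.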

---
### Model

Use the `--no-resolve` flag to inspect model outputs without checking if they resolve.

#### Architecture

Subwiz is an ultra-lightweight transformer model based on [nanoGPT](https://github.com/karpathy/nanoGPT/tree/master) ❤️:

- 17.3M parameters.
- Trained on 26M tokens of subdomain lists gathered from passive sources.
- Tokenizer trained on the same lists of subdomains (8192 tokens).

#### Hugging Face
The model is hosted on Hugging Face as [HadrianSecurity/subwiz](https://huggingface.co/HadrianSecurity/subwiz).
It is downloaded the first time you run subwiz and then cached; pass `--force-download` to fetch the model and tokenizer files again.

#### Inference

Typically, generative transformer models (e.g. ChatGPT) predict a single output sequence.
Subwiz predicts the N most likely sequences using a beam search algorithm.

![Diagram of the inference algorithm](https://raw.githubusercontent.com/hadriansecurity/subwiz/main/subwiz_inference.png)

*Beam algorithm to predict the N most likely outputs from a generative transformer model.*
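To make the idea concrete, here is a minimal beam search over a toy next-token distribution. It is a sketch under stated assumptions, not subwiz's implementation: `toy_model`, the `END` marker, and the probabilities are invented for illustration, and `n` and `max_new_tokens` only loosely mirror the `-n` and `-q` switches.

```python
import math

END = "<end>"  # hypothetical end-of-sequence marker


def beam_search(next_token_probs, n, max_new_tokens=10):
    """Return up to n (tokens, log_prob) pairs, most likely first."""
    beams = [((), 0.0)]  # unfinished (tokens, log probability) pairs
    finished = []
    for _ in range(max_new_tokens):
        candidates = []
        for tokens, score in beams:
            for token, prob in next_token_probs(tokens).items():
                if prob <= 0.0:
                    continue
                # Summing log probabilities multiplies the probabilities.
                candidates.append((tokens + (token,), score + math.log(prob)))
        # Keep the n best extensions; completed ones leave the beam.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beams = []
        for tokens, score in candidates[:n]:
            if tokens[-1] == END:
                finished.append((tokens[:-1], score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.sort(key=lambda item: item[1], reverse=True)
    return finished[:n]


def toy_model(tokens):
    # Hypothetical stand-in for a transformer's next-token distribution.
    if not tokens:
        return {"api": 0.5, "dev": 0.3, "mail": 0.2}
    if len(tokens) == 1:
        return {END: 0.6, "-staging": 0.4}
    return {END: 1.0}


for tokens, score in beam_search(toy_model, n=3):
    print("".join(tokens), round(math.exp(score), 2))
```

Unlike greedy decoding, which would commit to a single continuation at each step, the beam keeps several partial sequences alive, so a sequence with a less likely first token can still reach the final list.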

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "subwiz",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "machine learning, recon, subdomains, transformers",
    "author": null,
    "author_email": "Klaas Meinke <klaas@hadrian.io>",
    "download_url": "https://files.pythonhosted.org/packages/7c/c2/9f17b3fca548c18b1f1aa59e78eef37b282dc47755715687bbc3c29789a5/subwiz-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT License",
    "summary": "A recon tool that uses AI to predict subdomains. Then returns those that resolve.",
    "version": "0.1.3",
    "project_urls": {
        "Source": "https://github.com/hadriansecurity/subwiz"
    },
    "split_keywords": [
        "machine learning",
        " recon",
        " subdomains",
        " transformers"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "4385463f3e2247e933f6db44133945b61c64c8b9b14d4dfa5603b3464f4e39a6",
                "md5": "695a674bf7dc469f6216b1cb47f253b8",
                "sha256": "413e6d1649cc01682ca25400dba95b165903554751a924a2c412fa7ee5c5cc05"
            },
            "downloads": -1,
            "filename": "subwiz-0.1.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "695a674bf7dc469f6216b1cb47f253b8",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 16288,
            "upload_time": "2024-08-18T18:45:17",
            "upload_time_iso_8601": "2024-08-18T18:45:17.310306Z",
            "url": "https://files.pythonhosted.org/packages/43/85/463f3e2247e933f6db44133945b61c64c8b9b14d4dfa5603b3464f4e39a6/subwiz-0.1.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7cc29f17b3fca548c18b1f1aa59e78eef37b282dc47755715687bbc3c29789a5",
                "md5": "d162932b9e1734f1c7fc440352e4abb8",
                "sha256": "0600919e584a71c48a6374e4abf816cf5707e9c4997659f6eebb4a163713c67c"
            },
            "downloads": -1,
            "filename": "subwiz-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "d162932b9e1734f1c7fc440352e4abb8",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 323961,
            "upload_time": "2024-08-18T18:45:18",
            "upload_time_iso_8601": "2024-08-18T18:45:18.764174Z",
            "url": "https://files.pythonhosted.org/packages/7c/c2/9f17b3fca548c18b1f1aa59e78eef37b282dc47755715687bbc3c29789a5/subwiz-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-08-18 18:45:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "hadriansecurity",
    "github_project": "subwiz",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "subwiz"
}
        