# uro
Using a URL list for security testing can be painful as there are a lot of URLs that have uninteresting/duplicate content; **uro** aims to solve that.
It doesn't make any HTTP requests to the URLs and removes:
- incremental urls e.g. `/page/1/` and `/page/2/`
- blog posts and similar human written content e.g. `/posts/a-brief-history-of-time`
- urls with same path but parameter value difference e.g. `/page.php?id=1` and `/page.php?id=2`
- images, js, css and other "useless" files
![uro-demo](https://i.ibb.co/x2tWCC5/uro-demo.png)
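
The idea behind two of the rules above (incremental urls and parameter value differences) can be illustrated with a short Python sketch. This is not uro's actual implementation; `url_signature` and `declutter` are hypothetical names, and the rule "a purely numeric path segment is an incremental page" is an assumption made for the example.

```python
import re
from urllib.parse import urlsplit, parse_qsl


def url_signature(url):
    """Hypothetical helper: build a key that ignores parameter values
    and purely numeric path segments."""
    parts = urlsplit(url)
    # Assumption: a purely numeric segment marks an incremental page,
    # so /page/1/ and /page/2/ map to the same path.
    path = re.sub(r"/\d+(?=/|$)", "/<n>", parts.path)
    # Keep only the parameter names, so ?id=1 and ?id=2 collide.
    names = tuple(sorted({name for name, _ in
                          parse_qsl(parts.query, keep_blank_values=True)}))
    return (parts.netloc, path, names)


def declutter(urls):
    """Keep the first URL seen for each signature, preserving input order."""
    seen = set()
    kept = []
    for url in urls:
        key = url_signature(url)
        if key not in seen:
            seen.add(key)
            kept.append(url)
    return kept


urls = [
    "http://example.com/page/1/",
    "http://example.com/page/2/",
    "http://example.com/page.php?id=1",
    "http://example.com/page.php?id=2",
    "http://example.com/page.php?id=1&sort=asc",
]
for url in declutter(urls):
    print(url)
```

In this sketch the last URL survives because it introduces a parameter name (`sort`) that has not been seen on that path before.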
### Installation
The recommended way to install uro is through pip as follows:
```
pip3 install uro --user
```
### Basic Usage
The quickest way to include uro in your workflow is to feed it data through stdin and print the result to your terminal.
```
cat urls.txt | uro
```
### Advanced usage
#### Reading urls from a file (`-i/--input`)
`uro -i input.txt`
#### Writing urls to a file (`-o/--output`)
If the file already exists, uro will not overwrite the contents. Otherwise, it will create a new file.
`uro -i input.txt -o output.txt`
#### Whitelist (`-w/--whitelist`)
uro will ignore all extensions except the ones provided.
`uro -w php asp html`
**Note:** Extensionless pages e.g. /books/1 will still be included. To remove them too, use `--filter hasext`.
#### Blacklist (`-b/--blacklist`)
uro will ignore the given extensions.
`uro -b jpg png js pdf`
**Note:** uro has a list of "useless" extensions which it removes by default; that list will be overridden by whatever extensions you provide through the blacklist option. Extensionless pages e.g. /books/1 will still be included. To remove them too, use `--filter hasext`.
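
The whitelist and blacklist behaviour described in the two notes above can be sketched in Python as follows. This is an illustration under stated assumptions, not uro's code: `extension_of` and `filter_by_extension` are hypothetical helpers, and letting the whitelist win when both lists are given is an assumption.

```python
import os
from urllib.parse import urlsplit


def extension_of(url):
    """Hypothetical helper: lowercase file extension of the URL path, or ''."""
    path = urlsplit(url).path
    return os.path.splitext(path)[1].lstrip(".").lower()


def filter_by_extension(urls, whitelist=None, blacklist=None):
    """Keep URLs by extension; extensionless pages always pass,
    as the notes above describe."""
    kept = []
    for url in urls:
        ext = extension_of(url)
        if not ext:
            kept.append(url)  # e.g. /books/1 is kept either way
        elif whitelist is not None:
            # Assumption: if both lists are given, the whitelist decides.
            if ext in whitelist:
                kept.append(url)
        elif blacklist is None or ext not in blacklist:
            kept.append(url)
    return kept


urls = [
    "http://example.com/index.php",
    "http://example.com/logo.png",
    "http://example.com/books/1",
    "http://example.com/style.css",
]
print(filter_by_extension(urls, whitelist={"php"}))
print(filter_by_extension(urls, blacklist={"png"}))
```

With `blacklist={"png"}`, `style.css` is kept, mirroring the note that a user-supplied blacklist replaces the default list rather than extending it.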
#### Filters (`-f/--filters`)
For granular control, uro supports the following filters:
1. **hasparams:** only output urls that have query parameters e.g. `http://example.com/page.php?id=`
2. **noparams:** only output urls that have no query parameters e.g. `http://example.com/page.php`
3. **hasexts:** only output urls that have extensions e.g. `http://example.com/page.php`
4. **noexts:** only output urls that have no extensions e.g. `http://example.com/page`
5. **keepcontent:** keep human written content e.g. blogs.
6. **keepslash:** don't remove trailing slash from urls e.g. `http://example.com/page/`
7. **vuln:** only output urls with parameters that are known to be vulnerable. [More info.](https://github.com/s0md3v/parth)
Example: `uro --filters hasexts hasparams`
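
To make the first four filters concrete, here is a minimal Python sketch that treats each one as a predicate on a URL. It is not taken from uro's source: the `FILTERS` table and `apply_filters` are hypothetical, and combining several filters with a logical AND is an assumption about how the example above behaves.

```python
import os
from urllib.parse import urlsplit


def has_params(url):
    """True if the URL carries a query string, e.g. ?id="""
    return bool(urlsplit(url).query)


def has_ext(url):
    """True if the URL path ends in a file extension, e.g. .php"""
    return bool(os.path.splitext(urlsplit(url).path)[1])


# Hypothetical mapping from the filter names listed above to predicates.
FILTERS = {
    "hasparams": has_params,
    "noparams": lambda url: not has_params(url),
    "hasexts": has_ext,
    "noexts": lambda url: not has_ext(url),
}


def apply_filters(urls, names):
    """Keep URLs that satisfy every named filter (assumed AND semantics)."""
    checks = [FILTERS[name] for name in names]
    return [url for url in urls if all(check(url) for check in checks)]


urls = [
    "http://example.com/page.php?id=",
    "http://example.com/page.php",
    "http://example.com/page",
    "http://example.com/page?id=1",
]
print(apply_filters(urls, ["hasexts", "hasparams"]))
```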
Raw data
{
"_id": null,
"home_page": "https://github.com/s0md3v/uro",
"name": "uro",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "declutter,crawling,pentesting",
"author": "s0md3v",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/f3/4f/f83964e605618c2d7b7ffc4ee3a0324e8de40c620827be00ddf544cce5ae/uro-1.0.0.tar.gz",
"platform": null,
"description": "# uro\nUsing a URL list for security testing can be painful as there are a lot of URLs that have uninteresting/duplicate content; **uro** aims to solve that.\n\nIt doesn't make any http requests to the URLs and removes:\n- incremental urls e.g. `/page/1/` and `/page/2/`\n- blog posts and similar human written content e.g. `/posts/a-brief-history-of-time`\n- urls with same path but parameter value difference e.g. `/page.php?id=1` and `/page.php?id=2`\n- images, js, css and other \"useless\" files\n\n![uro-demo](https://i.ibb.co/x2tWCC5/uro-demo.png)\n\n#### Installation\nThe recommended way to install uro is through pip as follows:\n```\npip3 install uro --user\n```\n\n### Basic Usage\nThe quickest way to inclue uro in your workflow is to feed it data through stdin and print it to your terminal.\n```\ncat urls.txt | uro\n```\n\n### Advanced usage\n#### Reading urls from a file (-i/--input)\n\n`python3 uro.txt -i input.txt`\n\n#### Writing urls to a file (-o/--output)\nIf the file already exists, uro will not overwrite the contents. Otherwise, it will create a new file.\n\n`python3 uro.txt -i input.txt -o output.txt`\n\n#### Whitelist (`-w/--whitelist`)\nuro will ignore all other extension except the ones provided.\n\n`uro -w php asp html`\n\n**Note:** Extensionless pages e.g. /books/1 will still be included. To remove them too, use `--filter hasext`.\n\n#### Blacklist (`-b/--blacklist`)\nuro will ignore the given extensions.\n\n`uro -b jpg png js pdf`\n\n**Note:** uro has a list of \"useless\" extensions which it removes by default; that list will be overidden by whatever extensions you provide through blacklist option. Extensionless pages e.g. /books/1 will still be included. To remove them too, use `--filter hasext`.\n\n#### Filters (-f/--filters)\nFor granular control, uro supports the following filters:\n\n1. **hasparams:** only output urls that have query parameters e.g. `http://example.com/page.php?id=`\n2. 
**noparams:** only output urls that have no query parameters e.g. `http://example.com/page.php`\n3. **hasexts:** only output urls that have extensions e.g. `http://example.com/page.php`\n4. **noexts:** only output urls that have no extensions e.g. `http://example.com/page`\n5. **keepcontent:** keep human written content e.g. blogs.\n6. **keepslash:** don't remove trailing slash from urls e.g. `http://example.com/page/`\n7. **vuln:** only ouput urls with paramsters that are know to be vulnerable. [More info.](https://github.com/s0md3v/parth)\n\nExample: `uro --filters hasexts hasparams`\n",
"bugtrack_url": null,
"license": "Apache-2.0 License",
"summary": "A python tool to declutter url lists for crawling/pentesting",
"version": "1.0.0",
"split_keywords": [
"declutter",
"crawling",
"pentesting"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "f34ff83964e605618c2d7b7ffc4ee3a0324e8de40c620827be00ddf544cce5ae",
"md5": "afce5ec7c85a960899ef415349694fce",
"sha256": "72a2c1f5bdf825149d0d0ed480f94321601252dc73c96a762ac5a8ded52cfc76"
},
"downloads": -1,
"filename": "uro-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "afce5ec7c85a960899ef415349694fce",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 10025,
"upload_time": "2023-04-04T10:50:47",
"upload_time_iso_8601": "2023-04-04T10:50:47.972971Z",
"url": "https://files.pythonhosted.org/packages/f3/4f/f83964e605618c2d7b7ffc4ee3a0324e8de40c620827be00ddf544cce5ae/uro-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-04-04 10:50:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "s0md3v",
"github_project": "uro",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "uro"
}