urlscan


| Field | Value |
|---|---|
| Name | urlscan |
| Version | 1.0.6 |
| Summary | View/select the URLs in an email message or file |
| Upload time | 2024-11-22 18:25:58 |
| License | GPL-2.0-or-later |
| Keywords | email, mutt, tmux, urlscan, urlview |
| Requirements | urwid |

# Urlscan

[![main](https://github.com/firecat53/urlscan/actions/workflows/main.yml/badge.svg)](https://github.com/firecat53/urlscan/actions/workflows/main.yml)

## Contributors

Scott Hansen \<tech@firecat53.net\> (Author and Maintainer)

Maxime Chatelle \<xakz@rxsoft.eu\> (Debian Maintainer)

Daniel Burrows \<dburrows@debian.org\> (Original Author)

## Purpose and Requirements

Urlscan is a small program that is designed to integrate with the "mutt"
mailreader to allow you to easily launch a Web browser for URLs contained in
email messages. It is a replacement for the "urlview" program.

Requires: Python 3.7+ and the python-urwid library

## Features

Urlscan parses an email message or file and scans it for URLs and email
addresses. It then displays the URLs and their context within the message, and
allows you to choose one or more URLs to send to your Web browser.
Alternatively, it sends a list of all URLs to stdout.

Relative to urlview, urlscan has the following additional features:

- Support for emails in quoted-printable and base64 encodings. No more stripping
  out =40D from URLs by hand!

- The context of each URL is provided along with the URL. For HTML mails, a
  crude parser is used to render the HTML into text. Context view can be toggled
  on/off with `c`.

- URLs are shortened by default to fit on one line. Viewing full URL (for one or
  all) is toggled with `s` or `S`.

- Jump to a URL by typing the number.

- Incremental case-insensitive search with `/`.

- Execute an arbitrary function (for example, copy URL to clipboard) instead of
  opening URL in a browser.

- Use `l` to cycle through whether URLs are opened using the Python webbrowser
  module (default), xdg-open (if installed) or opened by a function passed on
  the command line with `--run` or `--run-safe`.

- Configure colors and keybindings via ~/.config/urlscan/config.json. Generate
  default config file for editing by running `urlscan -g`. Cycle through
  available palettes with `p`. Set display width with `--width`.

- Copy URL to clipboard with `C` or to primary selection with `P`.  Requires
  xsel or xclip.

- Run a command with the selected URL as the argument or pipe the selected
  URL to a command.

- Show complete help menu with `F1`. Hide header on startup with `--nohelp`.

- Use a custom regular expression with `-E` for matching URLs or any
  other pattern. In conjunction with `-r`, this effectively turns urlscan
  into a general-purpose CLI selector-type utility.

- Scan certain email headers for URLs. Currently `Link`, `Archived-At` and
  `List-*` are scanned when `--headers` is passed.

- Queue multiple URLs for opening and open them all at once with `a` and `o`.
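
To make the first bullet concrete, here is a minimal sketch of why transfer
encodings matter when scanning mail for URLs. It uses only Python's standard
`email` module and is not urlscan's actual implementation; the sample message,
the `URL_RE` pattern and the function names are assumptions made up for
illustration.

```python
import email
import re
from email import policy

# Deliberately simple pattern for illustration only.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")

# Hypothetical quoted-printable message: "=3D" encodes "=", and a trailing "="
# marks a soft line break that splits the URL across two lines.
RAW_MESSAGE = (
    "From: sender@example.com\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "Docs: https://example.com/search?q=3Durlscan&pa=\n"
    "ge=3D2\n"
)


def naive_urls(raw_message):
    """Scan the raw text without decoding; encoded URLs come out mangled."""
    return URL_RE.findall(raw_message)


def decoded_urls(raw_message):
    """Decode each text part (quoted-printable or base64) before scanning."""
    msg = email.message_from_string(raw_message, policy=policy.default)
    urls = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            urls.extend(URL_RE.findall(part.get_content()))
    return urls


if __name__ == "__main__":
    print(naive_urls(RAW_MESSAGE))
    print(decoded_urls(RAW_MESSAGE))
```

The naive scan stops at the soft line break and keeps the `=3D` escapes, while
the decoded scan recovers the URL as the sender wrote it.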

## Installation and setup

To install urlscan, install from your distribution repositories, from PyPI, or do
a local development install with `pip install -e`:

    pipx install urlscan

    OR

    pip install --user urlscan

    OR

    cd <path/to/urlscan> && pip install --user -e .

**NOTE**

    The minimum required version of urwid is 1.2.1.

Once urlscan is installed, add the following lines to your .muttrc:

    macro index,pager \cb "<pipe-message> urlscan<Enter>" "call urlscan to extract URLs out of a message"

    macro attach,compose \cb "<pipe-entry> urlscan<Enter>" "call urlscan to extract URLs out of a message"

Once this is done, Control-b while reading mail in mutt will automatically
invoke urlscan on the message.

> Note for Neomutt users: [As of version
> `2023-05-17`](https://github.com/neomutt/neomutt/releases/tag/20230517) true
> color support was implemented. If you are using true color support with Neomutt,
> or are encountering the error `setupterm: could not find terminfo database`,
> then you should also add `TERM=xterm-256color` to your macro in `.muttrc`.
> See more here [#135](https://github.com/firecat53/urlscan/issues/135). For example:
> `macro index,pager  \cb "<pipe-message>  TERM=xterm-256color urlscan<Enter>" "call urlscan to
extract URLs out of a message"`

To choose a particular browser, set the environment variable BROWSER. If BROWSER
is not set, xdg-open (if it's available) will control which browser is used:

    export BROWSER=/usr/bin/epiphany


## Command Line usage

    urlscan OPTIONS <file>

    OPTIONS [-c, --compact]
            [-d, --dedupe]
            [-E, --regex <expression>]
            [-f, --run-safe <expression>]
            [-g, --genconf]
            [-H, --nohelp]
            [    --headers]
            [-n, --no-browser]
            [-p, --pipe]
            [-r, --run <expression>]
            [-R, --reverse]
            [-s, --single]
            [-w, --width]
            [-W, --whitespace-off]

Urlscan can extract URLs and email addresses from emails or any text file.
Calling it with no flags starts the curses browser. Calling it with '-n' just
outputs a list of URLs/email addresses to stdout. The '-c' flag removes the
context from around the URLs in the curses browser, and the '-d' flag removes
duplicate URLs. The '-R' flag reverses the displayed order of URLs and context.
The '-W' flag condenses the display output by suppressing blank lines and
ellipsis lines. Files can also be piped to urlscan using normal shell pipe
mechanisms: `cat <something> | urlscan` or `urlscan < <something>`.
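
Because `-n` writes to stdout and input can arrive on stdin, urlscan can also be
driven from a script. The following is a minimal sketch, assuming urlscan is on
your `PATH` and that the `-n` output is one entry per line; the helper names
`urlscan_command` and `list_urls` are hypothetical and not part of urlscan.

```python
import shutil
import subprocess


def urlscan_command(dedupe=False):
    """Build the argument list for a non-interactive urlscan run."""
    cmd = ["urlscan", "--no-browser"]
    if dedupe:
        cmd.append("--dedupe")
    return cmd


def list_urls(text, dedupe=True):
    """Feed `text` to urlscan on stdin and return the lines it prints."""
    if shutil.which("urlscan") is None:
        raise FileNotFoundError("urlscan is not on PATH")
    result = subprocess.run(
        urlscan_command(dedupe=dedupe),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: each URL or email address is printed on its own line.
    return result.stdout.splitlines()


if __name__ == "__main__":
    for url in list_urls("See https://example.com and https://example.org"):
        print(url)
```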

Instead of opening a web browser, the selected URL can be passed as the argument
to a command using `--run-safe "<command> {}"` or `--run "<command> {}"`. Note
the use of `{}` in the command string to denote the selected URL. Alternatively,
the URL can be piped to the command using `--run-safe <command> --pipe` (or
`--run`). Using --run-safe with --pipe is preferred if the command supports it,
as it is marginally more secure and tolerant of special characters in the URL.
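
The general idea behind the safer variants can be sketched in a few lines of
Python. This illustrates the technique only and is not a description of
urlscan's internals: the command template is split into words before the URL is
substituted, so the URL always stays a single argument, and with piping the URL
never appears on the command line at all. `build_argv`, `pipe_url` and the
`firefox --new-tab` command are assumptions chosen for the example.

```python
import shlex
import subprocess


def build_argv(template, url):
    """Split the command template into words first, then substitute the URL.

    Because substitution happens after splitting, the URL remains a single
    argument whatever characters it contains.
    """
    return [word.replace("{}", url) for word in shlex.split(template)]


def pipe_url(command, url):
    """Send the URL on the command's stdin rather than on its command line."""
    result = subprocess.run(
        shlex.split(command),
        input=url,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    tricky = "https://example.com/?a=1&b=$(id);echo"
    # No shell is involved, so "&", ";" and "$(...)" are passed through as text.
    print(build_argv("firefox --new-tab {}", tricky))
```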

## Theming

Run `urlscan -g` to generate ~/.config/urlscan/config.json with the default
color and black & white palettes. This can be edited or added to, as desired.
The first palette in the list will be the default. Configure the palettes
according to the [Urwid display attributes][1].

Display width can be set with `--width`.

## Keybindings

Run `urlscan -g` to generate ~/.config/urlscan/config.json. All of the keys will
be listed. Entries you do not intend to change can be left in place or deleted.

To unset a binding, set it equal to "".  For example: `"P": ""`

The following actions are supported:

- `add_url` -- add a URL to the queue (default: `a`)
- `all_escape` -- toggle unescape all URLs (default: `u`)
- `all_shorten` -- toggle shorten all URLs (default: `S`)
- `bottom` -- move cursor to last item (default: `G`)
- `clear_screen` -- redraw screen (default: `Ctrl-l`)
- `clipboard` -- copy highlighted URL to clipboard using xsel/xclip (default: `C`)
- `clipboard_pri` -- copy highlighted URL to primary selection using xsel/xclip (default: `P`)
- `context` -- show/hide context (default: `c`)
- `del_url` -- delete URL from the queue (default: `d`)
- `down` -- cursor down (default: `j`)
- `help_menu` -- show/hide help menu (default: `F1`)
- `link_handler` -- cycle link handling (webbrowser, xdg-open, --run-safe or --run) (default: `l`)
- `next` -- jump to next URL (default: `J`)
- `open_queue` -- open all URLs in queue (default: `o`)
- `open_queue_win` -- open all URLs in queue in new window (default: `O`)
- `open_url` -- open selected URL (default: `space` or `enter`)
- `palette` -- cycle through palettes (default: `p`)
- `previous` -- jump to previous URL (default: `K`)
- `quit` -- quit (default: `q` or `Q`)
- `reverse` -- reverse display order (default: `R`)
- `shorten` -- toggle shorten highlighted URL (default: `s`)
- `top` -- move to first list item (default: `g`)
- `up` -- cursor up (default: `k`)
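
If you prefer to script the change rather than edit the file by hand, unsetting
a binding might look like the sketch below. It assumes the generated file keeps
its bindings in a top-level `"keys"` object that maps each key to an action
name; check your own generated config.json before relying on that layout. The
`unset_binding` helper and the sample data are hypothetical.

```python
import json


def unset_binding(config, key):
    """Return a copy of the config with `key` bound to nothing ("")."""
    updated = dict(config)
    # Assumption: bindings live under a top-level "keys" object.
    keys = dict(updated.get("keys", {}))
    keys[key] = ""
    updated["keys"] = keys
    return updated


if __name__ == "__main__":
    # Trimmed, hypothetical stand-in for a generated config.json.
    sample = {"keys": {"C": "clipboard", "P": "clipboard_pri", "q": "quit"}}
    print(json.dumps(unset_binding(sample, "P"), indent=4))
```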

## Known bugs and limitations

- Running urlscan sometimes "messes up" the terminal background. This seems to
  be an urwid bug, but I haven't tracked down just what's going on.

- Extraction of context from HTML messages leaves something to be desired.
  Probably the ideal solution would be to extract context on a word basis rather
  than on a paragraph basis.

- The HTML message handling is a bit kludgy in general.

- multipart/alternative sections are handled by descending into all the
  sub-parts, rather than just picking one, which may lead to URLs and context
  appearing twice. (Bypass this by selecting the '--dedupe' option.)
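
The multipart/alternative limitation is easy to reproduce with Python's standard
`email` package. The sketch below is not urlscan's code; it only shows why
walking every sub-part yields the same URL twice and how an order-preserving
dedupe removes the repeat. The sample message, `URL_RE` and the function names
are assumptions for illustration.

```python
import re
from email.message import EmailMessage

# Deliberately simple pattern for illustration only.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def build_sample():
    """Create a message with plain-text and HTML versions of the same content."""
    msg = EmailMessage()
    msg["Subject"] = "demo"
    msg.set_content("Read https://example.com/a today.")
    msg.add_alternative(
        '<p><a href="https://example.com/a">Read</a> today.</p>', subtype="html"
    )
    return msg


def urls_from_all_parts(msg):
    """Descend into every text sub-part instead of picking just one."""
    urls = []
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            urls.extend(URL_RE.findall(part.get_content()))
    return urls


def dedupe(urls):
    """Drop repeated URLs while keeping first-seen order."""
    return list(dict.fromkeys(urls))


if __name__ == "__main__":
    found = urls_from_all_parts(build_sample())
    print(found)
    print(dedupe(found))
```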

## Build/development

- pyproject.toml is configured for [hatch][2] for building and submitting to PyPI.
- flake.nix is available for a development shell or building/testing the package
  if desired. `nix develop`
- To update TLD list: `wget https://data.iana.org/TLD/tlds-alpha-by-domain.txt`
- A GitHub Action uploads to TestPyPI on each push to `main`. To create a
  GitHub and PyPI release, create a new tag (formatting below) and push tags.

        <tag name on first line>
        
        * Release note 1
        * Release note 2
        * ...

[1]: http://urwid.org/manual/displayattributes.html#display-attributes  "Urwid display attributes"
[2]: https://hatch.pypa.io/latest/  "Hatch"

            

Raw data

{
    "_id": null,
    "home_page": null,
    "name": "urlscan",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "email, mutt, tmux, urlscan, urlview",
    "author": null,
    "author_email": "Scott Hansen <tech@firecat53.net>",
    "download_url": "https://files.pythonhosted.org/packages/f0/9d/dbb1b7b3bb226a8a796b870cf9325cae53edc36acdf619cf4c5eefe94880/urlscan-1.0.6.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL-2.0-or-later",
    "summary": "View/select the URLs in an email message or file",
    "version": "1.0.6",
    "project_urls": {
        "Homepage": "https://github.com/firecat53/urlscan"
    },
    "split_keywords": [
        "email",
        " mutt",
        " tmux",
        " urlscan",
        " urlview"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "7482c87c8c39e348c71a5821128fcbc393a5801b6d797224d530c146025bc997",
                "md5": "9db6a206940cd264a0bdc33bba442213",
                "sha256": "e78811ae97fb9018086cf48db659df7bb168d1f2d4472d87d28b956a2fdf600a"
            },
            "downloads": -1,
            "filename": "urlscan-1.0.6-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "9db6a206940cd264a0bdc33bba442213",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 49595,
            "upload_time": "2024-11-22T18:25:55",
            "upload_time_iso_8601": "2024-11-22T18:25:55.988464Z",
            "url": "https://files.pythonhosted.org/packages/74/82/c87c8c39e348c71a5821128fcbc393a5801b6d797224d530c146025bc997/urlscan-1.0.6-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f09ddbb1b7b3bb226a8a796b870cf9325cae53edc36acdf619cf4c5eefe94880",
                "md5": "dc089c52ae2f4bd2fb393593ecfadf83",
                "sha256": "3bbf8900de23913c29aed27702eaba92a871b2fe95920e72c56a19fff7cb4581"
            },
            "downloads": -1,
            "filename": "urlscan-1.0.6.tar.gz",
            "has_sig": false,
            "md5_digest": "dc089c52ae2f4bd2fb393593ecfadf83",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 35939,
            "upload_time": "2024-11-22T18:25:58",
            "upload_time_iso_8601": "2024-11-22T18:25:58.063333Z",
            "url": "https://files.pythonhosted.org/packages/f0/9d/dbb1b7b3bb226a8a796b870cf9325cae53edc36acdf619cf4c5eefe94880/urlscan-1.0.6.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-11-22 18:25:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "firecat53",
    "github_project": "urlscan",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "urwid",
            "specs": [
                [
                    ">=",
                    "1.2.1"
                ]
            ]
        }
    ],
    "lcname": "urlscan"
}
        