# memento-cli

- Version: 0.0.4 (uploaded to PyPI 2023-12-21 21:05:25)
- Summary: Examine snapshots in web archives such as the Internet Archive's Wayback Machine
- Author: Ed Summers
- Requires Python: >=3.9,<4.0
- License: MIT

[![Build Status](https://github.com/edsu/memento-cli/actions/workflows/test.yml/badge.svg)](https://github.com/edsu/memento-cli/actions/workflows/test.yml)

A command line tool for interacting with web archives that support Memento ([RFC 7089](https://www.rfc-editor.org/rfc/rfc7089)), such as the Internet Archive's Wayback Machine.

For more background on why this tool was created see: https://inkdroid.org/2023/09/14/memento-bisect/

## Usage

### List Snapshots

To list all the available snapshots (or Mementos) of a page, pass the URL of any one of its snapshots to the `list` command:

```bash
$ memento list https://web.archive.org/web/20230407140923/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2017-12-29 05:40:51 https://web.archive.org/web/20171229054051/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-03 20:03:00 https://web.archive.org/web/20180103200300/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-04 06:39:58 https://web.archive.org/web/20180104063958/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-06 16:08:07 https://web.archive.org/web/20180106160807/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-12 06:10:07 https://web.archive.org/web/20180112061007/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-12 17:40:16 https://web.archive.org/web/20180112174016/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-12 18:40:34 https://web.archive.org/web/20180112184034/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-12 19:11:48 https://web.archive.org/web/20180112191148/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-20 19:05:57 https://web.archive.org/web/20180120190557/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
2018-01-20 19:19:20 https://web.archive.org/web/20180120191920/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
...
```
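Each line of the listing pairs a capture time with a snapshot URL, and in the Wayback-style URLs shown here the same time appears as a 14-digit `YYYYMMDDhhmmss` path segment. As a minimal sketch (the `snapshot_datetime` helper is hypothetical and not part of *memento*), the timestamp can be recovered from a URL like this:

```python
import re
from datetime import datetime

# Assumption: the snapshot URL embeds a 14-digit YYYYMMDDhhmmss timestamp as a
# path segment, optionally followed by a two-letter modifier such as "mp_".
TIMESTAMP = re.compile(r"/(\d{14})(?:[a-z]{2}_)?/")


def snapshot_datetime(url: str) -> datetime:
    """Return the capture time embedded in a snapshot URL (hypothetical helper)."""
    match = TIMESTAMP.search(url)
    if match is None:
        raise ValueError(f"no 14-digit timestamp found in {url!r}")
    return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")


url = (
    "https://web.archive.org/web/20171229054051/"
    "https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy"
)
# Prints the capture time followed by the URL, in the same layout as the listing.
print(snapshot_datetime(url), url)
```

This only illustrates the URL convention; archives that lay out their snapshot URLs differently would need a different pattern.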

Since *memento* works with any archive that supports RFC 7089, you can use it to list versions in other web archives as well:

```bash
$ memento list https://www.webarchive.org.uk/wayback/archive/20130501020401/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition/
2013-05-01 02:03:57 https://www.webarchive.org.uk/wayback/archive/20130501020357mp_/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition
2013-05-01 02:04:01 https://www.webarchive.org.uk/wayback/archive/20130501020401mp_/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition/
2013-07-29 12:58:03 https://www.webarchive.org.uk/wayback/archive/20130729125803mp_/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition
2013-07-29 12:58:06 https://www.webarchive.org.uk/wayback/archive/20130729125806mp_/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition/
2021-01-22 06:38:21 https://www.webarchive.org.uk/wayback/archive/20210122063821mp_/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition/
2022-03-14 16:36:16 https://www.webarchive.org.uk/wayback/archive/20220314163616mp_/http://www.vam.ac.uk/content/exhibitions/david-bowie-is/david-bowie-is-inside-the-exhibition/
```
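Archives that support RFC 7089 describe the versions of a page in a TimeMap, which is commonly serialized in link format: one `<URI>` per entry followed by parameters such as `rel` and `datetime`. The sketch below shows one way such a document could be turned into a date-sorted listing. It is not necessarily how *memento* does it; the `parse_timemap` helper and the sample data (with `example` hosts) are made up for illustration, and it assumes every parameter value is quoted.

```python
import re
from email.utils import parsedate_to_datetime

# One link-format entry looks like: <URI>; rel="..."; datetime="..."
LINK = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+="[^"]*")*)')
PARAM = re.compile(r'([\w-]+)="([^"]*)"')


def parse_timemap(text):
    """Return (datetime, uri) pairs for every memento entry, oldest first."""
    mementos = []
    for uri, raw_params in LINK.findall(text):
        params = dict(PARAM.findall(raw_params))
        # rel can hold several space-separated values, e.g. "first memento".
        rels = params.get("rel", "").split()
        if "memento" in rels and "datetime" in params:
            mementos.append((parsedate_to_datetime(params["datetime"]), uri))
    return sorted(mementos)


# Made-up sample in the link-format style; not output from a real archive.
sample = """
<http://a.example.org>; rel="original",
<http://arxiv.example.net/web/20091027204954/http://a.example.org>; rel="last memento"; datetime="Tue, 27 Oct 2009 20:49:54 GMT",
<http://arxiv.example.net/web/20000620180259/http://a.example.org>; rel="first memento"; datetime="Tue, 20 Jun 2000 18:02:59 GMT"
"""

for when, uri in parse_timemap(sample):
    print(when.strftime("%Y-%m-%d %H:%M:%S"), uri)
```

Filtering on the `memento` relation skips the `original` entry, so only archived versions end up in the listing.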

### Searching for Changes (Bisect)

Let's suppose you know that the [Twitter Hateful Conduct Policy](https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy) used to have language about:

> women, people of color, lesbian, gay, bisexual, transgender, queer, intersex, asexual individuals
 
You can see it in the Internet Archive Wayback Machine [in 2019](https://web.archive.org/web/20190711134608/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy). But you can't see it [on the page in 2023](https://web.archive.org/web/20230621094005/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy). To identify when the change was introduced, you can *bisect* the version history to search for the version where the text went missing, using the two snapshots and the `--text` option. This will perform a binary search between the two versions looking for the text.

```bash
$ memento bisect --missing --text "women, people of color, lesbian, gay" \
  https://web.archive.org/web/20190711134608/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy \
  https://web.archive.org/web/20230621094005/https://help.twitter.com/en/rules-and-policies/hateful-conduct-policy
```

<img src="https://github.com/edsu/memento-cli/raw/main/images/memento.gif">
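The binary search described above can be sketched in a few lines. This is an illustration of the technique rather than the tool's actual code: `bisect_snapshots` and `has_text` are hypothetical names, the snapshot list and page contents are invented, and the sketch assumes the text changes state only once between the two endpoints.

```python
import re


def bisect_snapshots(snapshots, has_text, missing=False):
    """Return the index of the first snapshot where the text changed state.

    With missing=True the change is "text went missing"; otherwise it is
    "text appeared". has_text(snapshot) reports whether the text is present.
    """
    def changed(i):
        present = has_text(snapshots[i])
        return (not present) if missing else present

    lo, hi = 0, len(snapshots) - 1
    if changed(lo) or not changed(hi):
        raise ValueError("the endpoints do not bracket a change")

    # Invariant: changed(lo) is False and changed(hi) is True.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if changed(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Invented data: eight snapshots, with the text present in the first five.
snapshots = [f"snapshot-{i}" for i in range(8)]
pages = {
    name: "women, people of color" if i < 5 else "revised policy"
    for i, name in enumerate(snapshots)
}
fetched = []


def has_text(name):
    fetched.append(name)  # stands in for rendering the page in a browser
    return re.search("people of color", pages[name]) is not None


index = bisect_snapshots(snapshots, has_text, missing=True)
print(snapshots[index], "found after", len(fetched), "page checks")
```

Because each step halves the range, the number of pages that need to be rendered grows with the logarithm of the number of snapshots, which matters when every check involves loading a page in a browser.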

The `--text` value can also be a regular expression. If you provide only one snapshot URL, it is used as the start of the search, and the last snapshot in the archive is used as the end.
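To get a feel for what a regular expression buys you over a literal phrase, here is a small sketch using Python's `re` module. It assumes the pattern is searched for anywhere in the page text with `re.search`-style semantics, which this README does not confirm; the sample strings are invented.

```python
import re

# Tolerates "color"/"colour" and any run of whitespace after the comma.
pattern = re.compile(r"people of colou?r,\s+lesbian")

samples = [
    "women, people of color, lesbian, gay, bisexual, transgender",
    "women, people of colour,\n  lesbian, gay, bisexual, transgender",
    "we prohibit targeting others with repeated slurs",
]

for text in samples:
    found = pattern.search(text) is not None
    print("present" if found else "missing", "->", repr(text[:40]))
```

A looser pattern like this can help when the archived page reflows text or changes spelling between versions.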

The *bisect* command uses a browser behind the scenes (via Selenium) to fully render each page. To find out when some text first appears (rather than goes missing), remove the `--missing` option from the command.

If you would prefer to examine the pages in between manually, leave off the `--text` option and *memento* will show you the browser it is controlling and prompt you to continue.

If you would like to see the browser when using `--text`, use the `--show-browser` option.

            
