notion-duplicates


- Name: notion-duplicates
- Version: 0.6.0
- Summary: Detect duplicated pages in a Notion database and optionally delete them
- Upload time: 2024-05-13 00:41:41
- Author: Jerome Provensal
- Requires Python: <4.0,>=3.9

## Purpose

**Detect the duplicated pages in a Notion database and optionally delete the dupes**

### What's a duplicated page?
It's a page with both the same _title_ and the same _last_edited_time_ as another page.
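
To make the definition concrete, here is a minimal sketch of the idea (not the package's actual code). It assumes each page has already been reduced to a plain dict with hypothetical `id`, `title` and `last_edited_time` keys; the first page seen for a given pair is kept and every later one is reported as a dupe.

```python
from typing import Iterable


def find_duplicates(pages: Iterable[dict]) -> list[dict]:
    """Return every page whose (title, last_edited_time) pair was already seen."""
    seen: set[tuple[str, str]] = set()
    dupes: list[dict] = []
    for page in pages:
        key = (page["title"], page["last_edited_time"])
        if key in seen:
            dupes.append(page)  # a later copy of a page we already saw
        else:
            seen.add(key)       # first occurrence: remember it and keep it
    return dupes


# Hypothetical sample data: "b" repeats "a"; "c" shares the title but not the timestamp.
pages = [
    {"id": "a", "title": "(1) Facebook", "last_edited_time": "2013-07-05T01:34:00.000Z"},
    {"id": "b", "title": "(1) Facebook", "last_edited_time": "2013-07-05T01:34:00.000Z"},
    {"id": "c", "title": "(1) Facebook", "last_edited_time": "2014-01-01T00:00:00.000Z"},
]
duplicate_ids = [page["id"] for page in find_duplicates(pages)]
```

With this sample only page `b` counts as a duplicate, since `c` was edited at a different time.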

### Motivation
I recently decided to move away from Evernote (after being a subscriber since 2008).
My reason? They started to jack up their price to a level that wasn't justifiable to me.

The price of the yearly subscription went from $35 in 2022 to $50 in 2023, and this year they want **$130!**
`</RANT>`

After I imported many pages from Evernote, I ended up with hundreds, if not thousands, of duplicated pages.

This script solved the problem! 

## Install

```sh
pip install notion-duplicates
```

## Prerequisites

You first need to create an *integration* from Notion that will create a *token*:

- Go to https://www.notion.so/my-integrations
- Click on **[ + New Integration ]**
- Specify a name, say **notion_duplicates**
- Click on **Show** under *Internal Integration Secret* and copy the *secret*, which looks like:

  - `secret_WhGbvv7jUxt88WXYZDlhxoiBtgtzGXBqPrVSA00aaBo`
- That's the value to use as NOTION_TOKEN
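
As a rough illustration of where the token ends up, the sketch below builds the HTTP headers that the public Notion API expects. It assumes `NOTION_TOKEN` is an environment variable (this README does not say so explicitly), and `notion_headers` is a hypothetical helper, not part of this package.

```python
import os
from typing import Optional

# A published Notion API version string; adjust it to the one you target.
NOTION_VERSION = "2022-06-28"


def notion_headers(token: Optional[str] = None) -> dict:
    """Build request headers from an explicit token or the NOTION_TOKEN env variable."""
    token = token or os.environ.get("NOTION_TOKEN")  # assumption: env variable
    if not token:
        raise RuntimeError("NOTION_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Notion-Version": NOTION_VERSION,
        "Content-Type": "application/json",
    }


headers = notion_headers("secret_example")  # placeholder value, not a real secret
```

Keep the real secret out of scripts and version control; exporting it in your shell session is usually enough.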

Next, you need to connect the **notion_duplicates** integration with your Notion database:

- Navigate to your Notion database such as: https://www.notion.so/a769a042d8f544ce860ba408d295ab28?v=8603013e8753451cb46496a62e6ac55f
- Click on the **. . .** at the top right of the page
- Select **Connect To**, pick **notion_duplicates** from the list, and confirm

Finally, you need your **database_id**, which can easily be extracted from your database URL:

It's the 32 characters between the last `/` and the `?`. See the example below, where the database_id is `a769a042d8f544ce860ba408d295ab28`:

```commandline
https://www.notion.so/a769a042d8f544ce860ba408d295ab28?v=8603013e8753451cb46496a62e6ac55f
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
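
If you would rather not count characters by hand, the extraction can be sketched in a few lines of Python. `extract_database_id` is a hypothetical helper (it is not shipped with this package); it assumes the id is the 32 hexadecimal characters at the end of the last path segment.

```python
import re
from urllib.parse import urlparse


def extract_database_id(url: str) -> str:
    """Return the 32-character id found at the end of the URL path."""
    path = urlparse(url).path  # the "?v=..." query string is not part of the path
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    match = re.search(r"[0-9a-fA-F]{32}$", last_segment)
    if match is None:
        raise ValueError(f"No 32-character id found in: {url}")
    return match.group(0)


url = "https://www.notion.so/a769a042d8f544ce860ba408d295ab28?v=8603013e8753451cb46496a62e6ac55f"
database_id = extract_database_id(url)
```

For the URL above this yields `a769a042d8f544ce860ba408d295ab28`, the value to pass as `database_id`.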

## Usage

### Help (-h)

```commandline
notion_duplicates -h
usage: notion_duplicates [-h] [-m [MAX_PAGE_COUNT]] [-D] [-M [MAX_DELETE_PAGE_COUNT]] database_id

Detect duplicated pages in a Notion database and optionally delete them

positional arguments:
  database_id           Notion database on which to conduct the duplicate search. See README.md for more details

optional arguments:
  -h, --help            show this help message and exit
  -m [MAX_PAGE_COUNT], --max_page_count [MAX_PAGE_COUNT]
                        Maximum number of pages to scan for duplicated pages (default: None)
  -D, --delete          Do the actual deletion (set in_trash=True) (default: False)
  -M [MAX_DELETE_PAGE_COUNT], --max_delete_page_count [MAX_DELETE_PAGE_COUNT]
                        Maximum number of pages to delete (default: None)
```
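
For readers curious how such a command line could be declared, here is an `argparse` sketch reconstructed purely from the help text above. It is an approximation, not the package's source: the `type=int` conversions in particular are an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Approximate the CLI shown in the help output (reconstruction, not the original)."""
    parser = argparse.ArgumentParser(
        prog="notion_duplicates",
        description="Detect duplicated pages in a Notion database and optionally delete them",
        # This formatter appends "(default: ...)" to each option's help text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument(
        "database_id",
        help="Notion database on which to conduct the duplicate search",
    )
    parser.add_argument(
        "-m", "--max_page_count", nargs="?", type=int, default=None,
        help="Maximum number of pages to scan for duplicated pages",
    )
    parser.add_argument(
        "-D", "--delete", action="store_true",
        help="Do the actual deletion (set in_trash=True)",
    )
    parser.add_argument(
        "-M", "--max_delete_page_count", nargs="?", type=int, default=None,
        help="Maximum number of pages to delete",
    )
    return parser


# Scan at most 500 pages and delete at most 10 of the dupes found.
args = build_parser().parse_args(
    ["-m", "500", "-D", "-M", "10", "a769a042d8f544ce860ba408d295ab28"]
)
```

Since `-m` and `-M` take an optional value, it is safest to always give them a number, so the parser cannot mistake the `database_id` for their value.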

### Example with no duplicates
```commandline
notion_duplicates a769a042d8f544ce860ba408d295ab28
Iterated over 3 pages in the database:a769a042d8f544ce860ba408d295ab28. Found 0 duplicated page(s) and deleted 0 page(s)
Elapased time:0.12 seconds
```

### Example showing duplicates only (no deletion)
```commandline
notion_duplicates 5ae487a972e345b09450c181150a7AAA
Scanned 100 in 0.61 secs or 164 pages/sec
Scanned 200 in 1.52 secs or 131 pages/sec
Scanned 300 in 2.22 secs or 135 pages/sec
Scanned 400 in 3.02 secs or 132 pages/sec
Scanned 500 in 3.63 secs or 138 pages/sec
This page is a dupe -> title:(1) Facebook | last_edited:2013-07-05T01:34:00.000Z | url:https://www.notion.so/1-Facebook-a7df306435694572be8460ac45b75950
This page is a dupe -> title:Patio Lounger RE 11.2in Nicollet : Target | last_edited:2013-07-04T23:09:00.000Z | url:https://www.notion.so/Patio-Lounger-RE-11-2in-Nicollet-Target-706e30effb4345b4b50ee0db3328ebbb
This page is a dupe -> title:ÄPPLARÖ Drop-leaf table - IKEA | last_edited:2013-07-04T23:03:00.000Z | url:https://www.notion.so/PPLAR-Drop-leaf-table-IKEA-9fe474b0f5424c499f3fe78aeb005deb
Reached max page count
Iterated over 521 pages in the database:5ae487a972e345b09450c181150a7AAA. Found 3 duplicated page(s) and deleted 0 page(s)
Elapased time:4.52 seconds
```

### Example deleting duplicates (use -D)
```commandline
notion_duplicates -D 5ae487a972e345b09450c181150a7AAA
Scanned 100 in 0.61 secs or 164 pages/sec
Scanned 200 in 1.52 secs or 131 pages/sec
Scanned 300 in 2.22 secs or 135 pages/sec
Scanned 400 in 3.02 secs or 132 pages/sec
Scanned 500 in 3.63 secs or 138 pages/sec
DELETING dupe page -> title:(1) Facebook | last_edited:2013-07-05T01:34:00.000Z | url:https://www.notion.so/1-Facebook-a7df306435694572be8460ac45b75950
DELETING dupe page -> title:Patio Lounger RE 11.2in Nicollet : Target | last_edited:2013-07-04T23:09:00.000Z | url:https://www.notion.so/Patio-Lounger-RE-11-2in-Nicollet-Target-706e30effb4345b4b50ee0db3328ebbb
DELETING dupe page -> title:ÄPPLARÖ Drop-leaf table - IKEA | last_edited:2013-07-04T23:03:00.000Z | url:https://www.notion.so/PPLAR-Drop-leaf-table-IKEA-9fe474b0f5424c499f3fe78aeb005deb
Iterated over 521 pages in the database:5ae487a972e345b09450c181150a7AAA. Found 3 duplicated page(s) and deleted 3 page(s)
Elapased time:4.77 seconds
```
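
The help text says deletion means setting `in_trash=True`, so pages go to the trash rather than being erased outright. As a sketch of what that could look like against the public Notion API (an assumption about the mechanism, not this package's code), the request below targets the update-page endpoint. It is only built here, never sent, and `build_trash_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_trash_request(page_id: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a request that moves one page to the trash."""
    body = json.dumps({"in_trash": True}).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.notion.com/v1/pages/{page_id}",
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Notion-Version": "2022-06-28",  # assumed API version
            "Content-Type": "application/json",
        },
    )


# Page id taken from the first URL in the example output; the token is a placeholder.
request = build_trash_request("a7df306435694572be8460ac45b75950", "secret_example")
```

Passing the request to `urllib.request.urlopen` would perform the call. A page trashed this way should still be restorable from Notion's trash for as long as Notion keeps it there.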




            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "notion-duplicates",
    "maintainer": null,
    "docs_url": null,
    "requires_python": "<4.0,>=3.9",
    "maintainer_email": null,
    "keywords": null,
    "author": "Jerome Provensal",
    "author_email": "jeromegit@provensal.com",
    "download_url": "https://files.pythonhosted.org/packages/35/23/0fac009a0199344c11dd2188f73e3e03226dbb4dcb187642e514b00b3860/notion_duplicates-0.6.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Detect duplicated pages in a Notion database and optionallly delete them",
    "version": "0.6.0",
    "project_urls": null,
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "1f5e055946e2f33532b90c06616edfb42aa0d538c900fc8779a6f12d3027ff8f",
                "md5": "cc4e03f2df20cddf2977b32c623a186c",
                "sha256": "e83dc031b071c27b93f84b49f19dba1f6413d852936e5139b517be985884715f"
            },
            "downloads": -1,
            "filename": "notion_duplicates-0.6.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cc4e03f2df20cddf2977b32c623a186c",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": "<4.0,>=3.9",
            "size": 4810,
            "upload_time": "2024-05-13T00:41:40",
            "upload_time_iso_8601": "2024-05-13T00:41:40.683396Z",
            "url": "https://files.pythonhosted.org/packages/1f/5e/055946e2f33532b90c06616edfb42aa0d538c900fc8779a6f12d3027ff8f/notion_duplicates-0.6.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "35230fac009a0199344c11dd2188f73e3e03226dbb4dcb187642e514b00b3860",
                "md5": "e732bbf1e714130a7535b4a2e9a7fad0",
                "sha256": "a671797315a0af0161012695aca78d03292706e394124560c82bf9d97ad103d5"
            },
            "downloads": -1,
            "filename": "notion_duplicates-0.6.0.tar.gz",
            "has_sig": false,
            "md5_digest": "e732bbf1e714130a7535b4a2e9a7fad0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": "<4.0,>=3.9",
            "size": 3842,
            "upload_time": "2024-05-13T00:41:41",
            "upload_time_iso_8601": "2024-05-13T00:41:41.977078Z",
            "url": "https://files.pythonhosted.org/packages/35/23/0fac009a0199344c11dd2188f73e3e03226dbb4dcb187642e514b00b3860/notion_duplicates-0.6.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-05-13 00:41:41",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "lcname": "notion-duplicates"
}
        