# Move Unmarker
Very small CLI utility to remove PII watermarks from pdfs downloaded from Move USP/ESALQ, using [PyMuPDF](https://pymupdf.readthedocs.io/en/latest/).
Beware that there is no input sanitization or error checking; you are on your own. This tool will **run**, as opposed to **work**, without fail on any
pdf that has at least one content stream per page, which is basically every pdf in the wild. It only works as intended when the watermark is the 2nd
content stream of every page. On any other pdf it will either do nothing (apart from changing compression options and maybe other idiosyncrasies of PyMuPDF
when it comes to writing a pdf) or, if a page has 2 or more content streams, keep just the first and likely make the file useless,
though it will still open. Most pdf writers concatenate multiple content streams into one, so chances are it won't do anything or will just crash.
This tool will **overwrite without confirmation** any file with the same name as `--output` (default "unmarked.pdf").
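
Before running the tool on an unfamiliar pdf, it can help to check how many content streams each page has. The following is a minimal sketch, not part of this package: `predict_effect` and `count_streams` are hypothetical helper names, and the labels they return are only a rough reading of the behaviour described above. Two streams on every page is necessary for the tool to work, but it does not prove the second stream is the watermark.

```python
from typing import List


def predict_effect(stream_counts: List[int]) -> str:
    """Guess what keeping only the first content stream of each page would do.

    stream_counts[i] is the number of content streams on page i.
    """
    if all(n == 1 for n in stream_counts):
        return "no-op"      # nothing to drop; the file would only be rewritten
    if all(n == 2 for n in stream_counts):
        return "candidate"  # matches the layout the tool expects
    return "risky"          # mixed or unexpected counts; real content may be lost


def count_streams(path: str) -> List[int]:
    """Count content streams per page (requires PyMuPDF)."""
    # Imported lazily so predict_effect can be used without PyMuPDF installed.
    try:
        import pymupdf
    except ImportError:  # older PyMuPDF releases only expose the fitz name
        import fitz as pymupdf

    with pymupdf.open(path) as doc:
        # Page.get_contents() returns the xrefs of the page's content streams.
        return [len(page.get_contents()) for page in doc]
```

Something like `predict_effect(count_streams("watermarked.pdf"))` would then give a hint before any file is written.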
## Installation
1. Make sure Python 3.8 or higher and pip are installed
1. Run `pip install move-unmarker`
## Usage
    usage: unmarker [-h] [-o OUTPUT] [-g GARBAGE] input

    Utility to remove PII watermarks from pdfs downloaded from Move USP/ESALQ.

    positional arguments:
      input                 input filename

    options:
      -h, --help            show this help message and exit
      -o OUTPUT, --output OUTPUT
                            output filename (default: "unmarked.pdf")
      -g GARBAGE, --garbage GARBAGE
                            level of garbage collection (default: 1)
See the [pymupdf.Document.save](https://pymupdf.readthedocs.io/en/latest/document.html#Document.save) method for more details on garbage collection.
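
For readers who want to see how the options above fit together, here is a sketch of an `argparse` parser that would produce an equivalent interface. It is reconstructed from the help text only and is not taken from the package's source; `build_parser` is a hypothetical name, and parsing `--garbage` as an integer is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented command line (a reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="unmarker",
        description="Utility to remove PII watermarks from pdfs "
        "downloaded from Move USP/ESALQ.",
    )
    parser.add_argument("input", help="input filename")
    parser.add_argument(
        "-o", "--output", default="unmarked.pdf",
        help='output filename (default: "unmarked.pdf")',
    )
    # Assumption: the level is an integer passed on to Document.save(garbage=...).
    parser.add_argument(
        "-g", "--garbage", type=int, default=1,
        help="level of garbage collection (default: 1)",
    )
    return parser


args = build_parser().parse_args(["--garbage", "3", "watermarked.pdf"])
print(args.input, args.output, args.garbage)  # watermarked.pdf unmarked.pdf 3
```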
### TLDR
- `unmarker watermarked.pdf`
- `unmarker -o unmarked.pdf watermarked.pdf`
- `unmarker --garbage 3 watermarked.pdf`
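
Since the tool overwrites the output file without asking, a small wrapper can refuse to run when the target already exists. This is a sketch under assumptions: `build_command` and `run_unmarker` are hypothetical helpers, and the wrapper expects the `unmarker` command to be on `PATH`.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(input_pdf: str, output_pdf: str = "unmarked.pdf") -> List[str]:
    """Return the unmarker command line, refusing to clobber an existing file."""
    if Path(output_pdf).exists():
        raise FileExistsError(f"{output_pdf} already exists; pick another --output")
    return ["unmarker", "-o", output_pdf, input_pdf]


def run_unmarker(input_pdf: str, output_pdf: str = "unmarked.pdf") -> None:
    """Run unmarker only if the output file would not be overwritten."""
    # check=True raises CalledProcessError if unmarker exits with an error.
    subprocess.run(build_command(input_pdf, output_pdf), check=True)
```

The existence check and the actual write are separate steps, so this only guards against accidental overwrites, not against another process creating the file in between.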
## Development
1. Check Python's version `python -V`
1. Install Python 3.8 or higher and pip, if they aren't already installed:
    - Windows `winget install Python.Python.3.X` (replace X with the desired minor version)
    - Ubuntu/Debian based distros `apt install python3 python3-pip`
    - Arch based distros `pacman -S python python-pip`
    - Fedora `dnf install python3 python3-pip`
1. [Install poetry](https://python-poetry.org/docs/#installation)
1. Clone this repo
`git clone https://github.com/joaofauvel/move-unmarker.git && cd move-unmarker`
1. Install requirements
`poetry install`
Raw data
{
"_id": null,
"home_page": "https://github.com/joaofauvel/move-unmarker",
"name": "move-unmarker",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "pdf, watermark, manipulation, cli, USP, ESALQ",
"author": "Jo\u00e3o Fauvel",
"author_email": "jmmfauvel@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/78/2c/122b17474467fe4453c618fbe0e6933021c2ab7b06d227176b8689299bfe/move_unmarker-0.1.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "CLI utility to remove PII watermarks from pdfs downloaded from Move USP/ESALQ",
"version": "0.1.4",
"project_urls": {
"Homepage": "https://github.com/joaofauvel/move-unmarker",
"Repository": "https://github.com/joaofauvel/move-unmarker"
},
"split_keywords": [
"pdf",
" watermark",
" manipulation",
" cli",
" usp",
" esalq"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "bafbb627741a36ca9c6c40b173174a050f7cec8e96368becf21409fd268e297a",
"md5": "4e6127a1f50f219de586e0514625640e",
"sha256": "a67bd378ce123ded6aa3a447559ab1d7d1fa9ba1de5d536bd5c7628dde38b418"
},
"downloads": -1,
"filename": "move_unmarker-0.1.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "4e6127a1f50f219de586e0514625640e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 15718,
"upload_time": "2024-05-16T02:51:59",
"upload_time_iso_8601": "2024-05-16T02:51:59.549632Z",
"url": "https://files.pythonhosted.org/packages/ba/fb/b627741a36ca9c6c40b173174a050f7cec8e96368becf21409fd268e297a/move_unmarker-0.1.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "782c122b17474467fe4453c618fbe0e6933021c2ab7b06d227176b8689299bfe",
"md5": "c8b4ff096fc638ffa08ea0f61b38f3e2",
"sha256": "5d26070d13b193f0d6cc24212761d93578302f16e547ce95f9e450cb393dd6af"
},
"downloads": -1,
"filename": "move_unmarker-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "c8b4ff096fc638ffa08ea0f61b38f3e2",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 14986,
"upload_time": "2024-05-16T02:52:00",
"upload_time_iso_8601": "2024-05-16T02:52:00.707462Z",
"url": "https://files.pythonhosted.org/packages/78/2c/122b17474467fe4453c618fbe0e6933021c2ab7b06d227176b8689299bfe/move_unmarker-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-05-16 02:52:00",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "joaofauvel",
"github_project": "move-unmarker",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "move-unmarker"
}