pdfhandleretc

- Name: pdfhandleretc
- Version: 0.1.1
- Summary: Lightweight command-line and Python API toolkit for PDF text extraction, encryption, permissions, and more.
- Upload time: 2025-07-16 04:04:16
- Author: Erik Carr
- Repository: https://github.com/carret1268/pdfhandleretc
- Requires Python: >=3.10
- License: CC0-1.0
- Keywords: pdf, pdfminer, pikepdf, encryption, text-extraction, cli
- Requirements: No requirements were recorded.

[![PyPI](https://img.shields.io/pypi/v/pdfhandleretc.svg)](https://pypi.org/project/pdfhandleretc/)  
[![Documentation Status](https://readthedocs.org/projects/pdfhandleretc/badge/?version=latest)](https://scraperetc.readthedocs.io/en/latest/)  
[![CI](https://github.com/carret1268/PdfHandlerETC/actions/workflows/ci.yml/badge.svg)](https://github.com/carret1268/PdfHandlerETC/actions/workflows/ci.yml)

# PdfHandlerETC

PdfHandlerETC is a lightweight command-line and Python toolkit for handling common PDF tasks including text extraction, encryption, decryption, permissions inspection, word counting, page resizing, and file merging.

This project is released under the [CC0 1.0 Public Domain Dedication](https://creativecommons.org/publicdomain/zero/1.0/).

## Features

- Extract text from PDFs by page or range
- Encrypt and decrypt PDFs with customizable permissions
- Count words across entire documents or selected pages
- Inspect encryption status and permissions
- Resize page dimensions
- Merge two PDFs with optional visual separators (blank page or black bar)
- Detect duplicate PDFs based on text content
- Includes both a Python API and command-line interface (CLI)
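
Duplicate detection based on text content can be pictured as comparing fingerprints of normalized text. The following is a minimal sketch of that idea, not the package's actual implementation: the names `text_fingerprint` and `looks_like_duplicate` are hypothetical, and the input strings stand in for text that has already been extracted from each PDF.

```python
import hashlib
import re


def text_fingerprint(text: str) -> str:
    """Return a SHA-256 digest of the text after normalizing whitespace and case."""
    # Collapse runs of whitespace so layout differences do not affect the digest.
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def looks_like_duplicate(text_a: str, text_b: str) -> bool:
    """Treat two documents as duplicates when their normalized text matches."""
    return text_fingerprint(text_a) == text_fingerprint(text_b)
```

A comparison of this kind ignores fonts, images, and metadata, so two visually different files with the same text would be reported as duplicates.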

## Installation

Install from PyPI:

```
pip install pdfhandleretc
```

## Command-Line Usage

After installation, you can use the `pdfhandler` CLI tool:

```
pdfhandler extract document.pdf --pages "1-3, 5"
pdfhandler wordcount document.pdf --pages "1, 3"
pdfhandler encrypt document.pdf --output secure.pdf
pdfhandler decrypt secure.pdf --in-place
pdfhandler permissions secure.pdf
pdfhandler resize document.pdf 612 792 --output resized.pdf
pdfhandler dupe-check file1.pdf file2.pdf
pdfhandler merge intro.pdf appendix.pdf merged.pdf --add-separator black
```

Use `--help` for details:

```
pdfhandler --help
pdfhandler extract --help
```
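
The `--pages` values above, such as `"1-3, 5"`, combine single pages and inclusive ranges. As an illustration of how such a string could be expanded, here is a small sketch. The function `parse_page_spec` is hypothetical and not part of PdfHandlerETC, and it assumes page numbers are 1-based, which the examples suggest but do not state.

```python
def parse_page_spec(spec: str) -> list[int]:
    """Expand a spec such as "1-3, 5" into sorted, de-duplicated page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            # Inclusive range, e.g. "1-3" -> 1, 2, 3
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    # Assumption: pages are numbered from 1.
    if any(page < 1 for page in pages):
        raise ValueError("page numbers are assumed to start at 1")
    return sorted(pages)


print(parse_page_spec("1-3, 5"))  # [1, 2, 3, 5]
```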

## Python Usage

```python
from pdfhandler import PdfHandler, PdfHandlerUtils

handler = PdfHandler("example.pdf")

# Extract text
text = handler.get_pdf_text("1-2, 4")
print(text)

# Word count
print("Words:", handler.word_count("1-3"))

# Encrypt the file
handler.encrypt(output="example-encrypted.pdf")

# Show permissions
handler.print_permissions()

# Resize pages
handler.resize(width=612, height=792, output_path="resized.pdf")

# Merge with a visual separator (black bar or blank page)
PdfHandlerUtils.merge_pdfs(
    "intro.pdf",
    "appendix.pdf",
    "merged.pdf",
    add_separator="black"  # or "blank"
)
```
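
The resize examples use `612` and `792`, which match US Letter (8.5 x 11 inches) expressed in PDF points at 72 points per inch. Assuming `resize` takes its dimensions in points (the README does not say so explicitly), a small helper can convert other paper sizes. The function names below are hypothetical and not part of the package.

```python
POINTS_PER_INCH = 72.0
MM_PER_INCH = 25.4


def inches_to_points(inches: float) -> float:
    """Convert inches to PDF points (1 point = 1/72 inch)."""
    return inches * POINTS_PER_INCH


def mm_to_points(mm: float) -> float:
    """Convert millimetres to PDF points."""
    return mm / MM_PER_INCH * POINTS_PER_INCH


# US Letter is 8.5 x 11 inches.
print(inches_to_points(8.5), inches_to_points(11))  # 612.0 792.0
# A4 is 210 x 297 mm, rounded to whole points.
print(round(mm_to_points(210)), round(mm_to_points(297)))  # 595 842
```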

## License

This project is licensed under the [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/) public domain dedication. You may use, modify, and distribute it freely without attribution or restriction.

## Dependencies

- pdfminer.six - for text extraction
- pikepdf - for encryption and PDF manipulation
- colorama - for cross-platform terminal colors

            
