pdf-toolchest

Name: pdf-toolchest
Version: 0.1.13
Summary: A Swiss-army knife for everyday PDF workflows – pure-Python, cross-platform, and fully typed.
Upload time: 2025-07-29 17:53:34
Author: Alex
Requires Python: <4,>=3.11

# pdf-tools

*A Swiss-army knife for everyday PDF workflows – pure-Python, cross-platform, and fully typed.*

---

## Features

| Capability | Sub-command / API | Notes |
|------------|------------------|-------|
| **Convert** images (`.png`, `.jpg`, …) or Word (`.docx`) to PDF | `pdf-tools convert …` <br>`pdf_tools.convert.service.convert_file_to_pdf()` | Uses **Pillow + img2pdf** for images and **LibreOffice / unoserver** for Word files. |
| **Merge** multiple PDFs (or whole folders) | `pdf-tools merge …` <br>`pdf_tools.merge.service.merge_pdfs()` | Preserves bookmarks; skips non-PDF inputs with a warning. |
| **Process** = Convert **→** Merge in one go | `pdf-tools process convert-and-merge-pdfs …` | Handy for ad-hoc batches of mixed file types. |
| **Watermark** (text stamp) | `pdf-tools watermark add-text …` <br>`pdf_tools.watermark.service.add_text_watermark()` | PyMuPDF in-place editing; configurable font, colour, opacity, rotation, position. |
| **Async-friendly CLI** | Built on Typer + custom `AsyncTyper` | Callbacks can be `async def` – future-proof for parallel work. |
| **Pydantic v2 models** | `File`, `Files`, `WatermarkOptions`, … | JSON-serialisable contracts for easy automation. |
| **Fully typed** + Ruff + Mypy + pytest + hypothesis |   | CI fails on lint, type, docs, or test issues. |

---

## Installation

```bash
pipx install unoserver --system-site-packages 
pip install pdf-toolchest
```

External dependency: LibreOffice must be installed and on your `$PATH` for Word→PDF conversion.

unoserver also needs access to LibreOffice's Python bindings, so pick one of the following:

1) install unoserver globally using ``pipx install --system-site-packages`` (preferred) or ``sudo -H pip install``,
2) use the bundled Python that ships with LibreOffice, or
3) call `soffice --headless` directly (see "Batch LibreOffice Listener" below).

## CLI Quick Start

```bash
# 1. Convert a single Word file
pdf-tools convert file-to-pdf draft.docx

# 2. Convert every image in a folder → PDFs in ./out
pdf-tools convert folder-to-pdfs assets/ --output-dir out/

# 3. Merge selected PDFs
pdf-tools merge pdf-files a.pdf b.pdf c.pdf -o merged.pdf 

# 4. Merge *all* PDFs in a folder
pdf-tools merge pdfs-in-folder scans/ -o merged.pdf 

# 5. One-liner: convert images + docs → merge
pdf-tools process convert-and-merge-pdfs image1.jpg doc1.docx doc2.docx -o final.pdf 

# 6. Add a diagonal red DRAFT watermark on every page
pdf-tools watermark add-text src.pdf stamped.pdf \
    --text "DRAFT" --color "#FF0000" --font-size 72 --opacity 0.2 --rotation 45
```

### Batch LibreOffice Listener (faster)

```bash
# spin up a listener for the whole session (Linux)
unoserver --interface 127.0.0.1 --port 2002 &
export LIBRE_PORT=2002   # used by convert helpers
pdf-tools process convert-and-merge-pdfs ...
kill %1                  # when done
```

## Using the Python API

```python
from pathlib import Path
from pdf_tools.convert.service import convert_file_to_pdf
from pdf_tools.merge.service import merge_pdfs
from pdf_tools.watermark.models import WatermarkOptions
from pdf_tools.watermark.service import add_text_watermark

# 1. Convert
img_pdf = convert_file_to_pdf(
    input_path=Path("diagram.png"),
    output_dir=Path("out"),
)

# 2. Merge two PDFs
merge_pdfs(
    input_paths=[Path("intro.pdf"), img_pdf.path],
    output_path=Path("bundle.pdf"),
)

# 3. Watermark (first page only)
opts = WatermarkOptions(text="CONFIDENTIAL", font_size=36, all_pages=False)
add_text_watermark(src=Path("bundle.pdf"), dst=Path("bundle_wm.pdf"), opts=opts)
```
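
The Features table notes that merging skips non-PDF inputs with a warning. If the same filtering is wanted up front, for example to log which files will be ignored before calling `merge_pdfs()`, it could be sketched as follows. `keep_pdfs` is a hypothetical helper, and checking the file suffix is an assumption about what "non-PDF" means, not the package's actual rule.

```python
import warnings
from collections.abc import Iterable
from pathlib import Path


def keep_pdfs(paths: Iterable[Path]) -> list[Path]:
    """Return only the paths with a .pdf suffix, warning about the rest."""
    kept: list[Path] = []
    for path in paths:
        if path.suffix.lower() == ".pdf":
            kept.append(path)
        else:
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(f"Skipping non-PDF input: {path}", stacklevel=2)
    return kept
```

The result preserves input order, so it can be passed straight to `merge_pdfs(input_paths=...)`.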

### Spinning up a transient LibreOffice listener

```python
from pathlib import Path

from pdf_tools.convert.unoserver_ctx import unoserver_listener
from pdf_tools.process.service import convert_and_merge_pdfs

with unoserver_listener(port=2002):            # starts & auto-kills unoserver
    convert_and_merge_pdfs(
        input_paths=[Path("doc1.docx"), Path("pic.jpg")],
        output_path=Path("package.pdf"),
    )
```
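
The "starts & auto-kills" behaviour follows a common pattern: launch a child process, yield control, and guarantee cleanup on exit. The sketch below shows that pattern with the standard library only. `transient_process` is a hypothetical name, and this is not the source of `unoserver_listener`, which likely also waits until the listener is ready.

```python
import subprocess
import sys
from collections.abc import Iterator, Sequence
from contextlib import contextmanager


@contextmanager
def transient_process(
    cmd: Sequence[str], stop_timeout: float = 5.0
) -> Iterator[subprocess.Popen[bytes]]:
    """Start a child process and make sure it is stopped when the block exits."""
    proc = subprocess.Popen(cmd)
    try:
        yield proc
    finally:
        # Ask politely first, then force-kill if the child ignores the request.
        proc.terminate()
        try:
            proc.wait(timeout=stop_timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


if __name__ == "__main__":
    # Stand-in child process; for a listener the command would be
    # ["unoserver", "--interface", "127.0.0.1", "--port", "2002"].
    with transient_process([sys.executable, "-c", "import time; time.sleep(60)"]) as proc:
        print("child running:", proc.poll() is None)
    print("child stopped:", proc.poll() is not None)
```

Because cleanup lives in a `finally` clause, the child is stopped even when the body of the `with` block raises.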

## Development setup

```bash
pipx install unoserver --system-site-packages
git clone https://github.com/your-org/pdf-tools.git
cd pdf-tools
poetry install --with dev   # include dev/test dependencies

# Run checks
ruff check .
mypy .
pytest -q
```


            
