page-dewarp


| Field | Value |
| --- | --- |
| Name | page-dewarp |
| Version | 0.1.9 |
| Summary | Page dewarping and thresholding using a cubic sheet model. |
| Upload time | 2025-02-13 20:29:27 |
| Requires Python | >=3.9 |
| License | MIT |
| Keywords | computer vision, cubic spline, dewarping, document scanning, image correction, image processing |
| Requirements | contourpy, cycler, fonttools, kiwisolver, matplotlib, mpmath, numpy, opencv-python, packaging, pillow, pyparsing, python-dateutil, scipy, six, sympy, toml, tomlkit |

# page-dewarp

[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)
[![PyPI](https://img.shields.io/pypi/v/page-dewarp.svg)](https://pypi.org/project/page-dewarp)
[![Supported Python versions](https://img.shields.io/pypi/pyversions/page-dewarp.svg)](https://pypi.org/project/page-dewarp)
[![downloads](https://static.pepy.tech/badge/page-dewarp/month)](https://pepy.tech/project/page-dewarp)
[![License](https://img.shields.io/pypi/l/page-dewarp.svg)](https://pypi.python.org/pypi/page-dewarp)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/lmmx/page-dewarp/master.svg)](https://results.pre-commit.ci/latest/github/lmmx/page-dewarp/master)

Document image dewarping library using a cubic sheet model.

Python 3 library for page dewarping and thresholding,
[available on PyPI](https://pypi.org/project/page_dewarp/).

## Installation

To install from PyPI, run one of the following:

- `pip install page-dewarp`
- `uv pip install page-dewarp` (recommended; requires [uv](https://docs.astral.sh/uv/))

## Dependencies

`page-dewarp` requires Python 3.9+ and depends on NumPy, SciPy, SymPy, Matplotlib, OpenCV, and msgspec.

## Help

See [documentation](https://page-dewarp.vercel.app) for more details.

## Background

This library is a modernised rewrite of the [original (2016) Python 2 script](https://github.com/mzucker/page_dewarp/)
by Matt Zucker, ported to Python 3 because Python 2 reached end of life in January 2020.

## Usage

```
usage: page-dewarp [-h] [-d {0,1,2,3}] [-o {file,screen,both}] [-p]
                   [-vw SCREEN_MAX_W] [-vh SCREEN_MAX_H] [-x PAGE_MARGIN_X]
                   [-y PAGE_MARGIN_Y] [-tw TEXT_MIN_WIDTH]
                   [-th TEXT_MIN_HEIGHT] [-ta TEXT_MIN_ASPECT]
                   [-tk TEXT_MAX_THICKNESS] [-wz ADAPTIVE_WINSZ]
                   [-ri RVEC_IDX] [-ti TVEC_IDX] [-ci CUBIC_IDX]
                   [-sw SPAN_MIN_WIDTH] [-sp SPAN_PX_PER_STEP]
                   [-eo EDGE_MAX_OVERLAP] [-el EDGE_MAX_LENGTH]
                   [-ec EDGE_ANGLE_COST] [-ea EDGE_MAX_ANGLE]
                   [-f FOCAL_LENGTH] [-z OUTPUT_ZOOM] [-dpi OUTPUT_DPI]
                   [-nb NO_BINARY] [-s REMAP_DECIMATE]
                   IMAGE_FILE_OR_FILES [IMAGE_FILE_OR_FILES ...]

positional arguments:
  IMAGE_FILE_OR_FILES   One or more images to process

optional arguments:
  -h, --help            show this help message and exit
  -d {0,1,2,3}, --debug-level {0,1,2,3}
  -o {file,screen,both}, --debug-output {file,screen,both}
  -p, --pdf             Merge dewarped images into a PDF
  -vw SCREEN_MAX_W, --max-screen-width SCREEN_MAX_W
                        Viewing screen max width (for resizing to screen)
  -vh SCREEN_MAX_H, --max-screen-height SCREEN_MAX_H
                        Viewing screen max height (for resizing to screen)
  -x PAGE_MARGIN_X, --x-margin PAGE_MARGIN_X
                        Reduced px to ignore near L/R edge
  -y PAGE_MARGIN_Y, --y-margin PAGE_MARGIN_Y
                        Reduced px to ignore near T/B edge
  -tw TEXT_MIN_WIDTH, --min-text-width TEXT_MIN_WIDTH
                        Min reduced px width of detected text contour
  -th TEXT_MIN_HEIGHT, --min-text-height TEXT_MIN_HEIGHT
                        Min reduced px height of detected text contour
  -ta TEXT_MIN_ASPECT, --min-text-aspect TEXT_MIN_ASPECT
                        Filter out text contours below this w/h ratio
  -tk TEXT_MAX_THICKNESS, --max-text-thickness TEXT_MAX_THICKNESS
                        Max reduced px thickness of detected text contour
  -wz ADAPTIVE_WINSZ, --adaptive-winsz ADAPTIVE_WINSZ
                        Window size for adaptive threshold in reduced px
  -ri RVEC_IDX, --rotation-vec-param-idx RVEC_IDX
                        Index of rvec in params vector (slice: pair of values)
  -ti TVEC_IDX, --translation-vec-param-idx TVEC_IDX
                        Index of tvec in params vector (slice: pair of values)
  -ci CUBIC_IDX, --cubic-slope-param-idx CUBIC_IDX
                        Index of cubic slopes in params vector (slice: pair of
                        values)
  -sw SPAN_MIN_WIDTH, --min-span-width SPAN_MIN_WIDTH
                        Minimum reduced px width for span
  -sp SPAN_PX_PER_STEP, --span-spacing SPAN_PX_PER_STEP
                        Reduced px spacing for sampling along spans
  -eo EDGE_MAX_OVERLAP, --max-edge-overlap EDGE_MAX_OVERLAP
                        Max reduced px horiz. overlap of contours in span
  -el EDGE_MAX_LENGTH, --max-edge-length EDGE_MAX_LENGTH
                        Max reduced px length of edge connecting contours
  -ec EDGE_ANGLE_COST, --edge-angle-cost EDGE_ANGLE_COST
                        Cost of angles in edges (tradeoff vs. length)
  -ea EDGE_MAX_ANGLE, --max-edge-angle EDGE_MAX_ANGLE
                        Maximum change in angle allowed between contours
  -f FOCAL_LENGTH, --focal-length FOCAL_LENGTH
                        Normalized focal length of camera
  -z OUTPUT_ZOOM, --output-zoom OUTPUT_ZOOM
                        How much to zoom output relative to *original* image
  -dpi OUTPUT_DPI, --output-dpi OUTPUT_DPI
                        Just affects stated DPI of PNG, not appearance
  -nb NO_BINARY, --no-binary NO_BINARY
                        Disable output conversion to binary thresholded image
  -s REMAP_DECIMATE, --shrink REMAP_DECIMATE
                        Downscaling factor for remapping image
```

- Note: the `-p`/`--pdf` option appears in the help text, but PDF conversion is not yet implemented.

To try out an example image, run:

```sh
git clone https://github.com/lmmx/page-dewarp
cd page-dewarp
mkdir results && cd results
page-dewarp ../example_input/boston_cooking_a.jpg
```
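
When processing several images from a script, the command line can be assembled programmatically. The sketch below is only an illustration: `build_command` and `run_page_dewarp` are hypothetical helper names, not part of the `page-dewarp` package, and the only flags used are `-d`, `-x`, and `-y` as listed in the usage text above. Any option values you pass are your own choice, not recommended defaults.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(images, debug_level=None, x_margin=None, y_margin=None):
    """Assemble an argv list for the page-dewarp CLI (hypothetical helper)."""
    cmd = ["page-dewarp"]
    if debug_level is not None:
        cmd += ["-d", str(debug_level)]  # -d {0,1,2,3}, --debug-level
    if x_margin is not None:
        cmd += ["-x", str(x_margin)]  # reduced px to ignore near L/R edge
    if y_margin is not None:
        cmd += ["-y", str(y_margin)]  # reduced px to ignore near T/B edge
    # Positional arguments: one or more images to process
    cmd += [str(Path(image)) for image in images]
    return cmd


def run_page_dewarp(images, **options):
    """Run the CLI if it is on PATH; raises CalledProcessError on failure."""
    if shutil.which("page-dewarp") is None:
        raise RuntimeError("page-dewarp is not installed or not on PATH")
    return subprocess.run(build_command(images, **options), check=True)
```

For example, `run_page_dewarp(["../example_input/boston_cooking_a.jpg"], debug_level=1)` would invoke the same command as above with debug level 1.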

## Explanation and extension to Gpufit

For a book lying on a flat surface, the curve of the page can be said to be 'fixed to zero' at its
endpoints, so it can be modelled as a cubic (see
[`derive_cubic.py`](https://github.com/lmmx/page-dewarp/blob/master/derive_cubic.py)).
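
To make the 'fixed to zero' idea concrete, here is a minimal SymPy sketch that derives such a cubic. It assumes the curve is parameterised on the unit interval with f(0) = f(1) = 0 and endpoint slopes `alpha` and `beta`; these symbol names and the parameterisation are assumptions for illustration, and the code is not claimed to match what `derive_cubic.py` does.

```python
import sympy as sp

x, a, b, c, d, alpha, beta = sp.symbols("x a b c d alpha beta")

# General cubic with unknown coefficients a, b, c, d
f = a * x**3 + b * x**2 + c * x + d
df = sp.diff(f, x)

# Constraints (each expression is set equal to zero):
constraints = [
    f.subs(x, 0),           # f(0) = 0: curve pinned at the left endpoint
    f.subs(x, 1),           # f(1) = 0: curve pinned at the right endpoint
    df.subs(x, 0) - alpha,  # f'(0) = alpha: slope at the left endpoint
    df.subs(x, 1) - beta,   # f'(1) = beta: slope at the right endpoint
]

# Solve the linear system for the four coefficients
solution = sp.solve(constraints, [a, b, c, d], dict=True)[0]
cubic = sp.expand(f.subs(solution))
print(sp.collect(cubic, x))
```

Under these assumptions the result is f(x) = (alpha + beta) x^3 - (2 alpha + beta) x^2 + alpha x, so the whole curve is described by just the two slope parameters.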

The "cubic Hermite spline" is one of the models supported by
[Gpufit](https://github.com/gpufit/Gpufit/), a library for Levenberg-Marquardt curve fitting in
CUDA (C++ with a Python API).
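
As a sketch of why the Hermite form is a natural fit: a cubic on [0, 1] whose endpoint values are both zero reduces to a weighted sum of the two Hermite tangent basis functions. The NumPy example below checks this numerically. The slope names `alpha` and `beta`, the unit-interval parameterisation, and the function names are assumptions for illustration; the example does not use or depend on Gpufit.

```python
import numpy as np


def hermite_zero_ends(x, alpha, beta):
    """Cubic Hermite form with endpoint values 0 and endpoint slopes alpha, beta."""
    h10 = x**3 - 2 * x**2 + x  # tangent basis function for the left endpoint
    h11 = x**3 - x**2          # tangent basis function for the right endpoint
    return alpha * h10 + beta * h11


def cubic_poly(x, alpha, beta):
    """The same curve written as a plain polynomial (highest power first)."""
    coeffs = [alpha + beta, -2 * alpha - beta, alpha, 0.0]
    return np.polyval(coeffs, x)


xs = np.linspace(0.0, 1.0, 11)
alpha, beta = 0.3, -0.2  # arbitrary illustrative slopes
hermite_vals = hermite_zero_ends(xs, alpha, beta)
poly_vals = cubic_poly(xs, alpha, beta)
print(np.allclose(hermite_vals, poly_vals))
```

Both forms describe the same curve, so fitting the two slopes is equivalent to fitting a Hermite segment whose endpoint values are held at zero.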

_[Work in progress]_

- See full writeup on [Matt Zucker's blog](https://mzucker.github.io/2016/08/15/page-dewarping.html)
- See [lecture](https://www.cs.cornell.edu/courses/cs4620/2013fa/lectures/16spline-curves.pdf)
  on splines by Steve Marschner for more details and how a spline can be represented in matrix form.
- Brief notes on this project are over on [my website](https://doc.spin.systems/page-dewarp)

## Features

Improvements on the original include:

- [x] Banished Python 2
- [x] Command line interface
  - [x] Alterable config options
- [x] Repackage for pip installation
- [x] Refactor with modules and classes
- [ ] Speed up the optimisation
  - [ ] Multiprocessing on CPU
  - [ ] Optional interface to use Gpufit on GPU (or Deep Declarative Networks?)

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "page-dewarp",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "computer vision, cubic spline, dewarping, document scanning, image correction, image processing",
    "author": null,
    "author_email": "Louis Maddox <louismmx@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/44/3f/7de3bba213c74fcba524090f542c8f7c55e4f6e229a8f5d97f24291b7209/page_dewarp-0.1.9.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Page dewarping and thresholding using a cubic sheet model.",
    "version": "0.1.9",
    "project_urls": {
        "Documentation": "https://page-dewarp.vercel.app/",
        "Homepage": "https://github.com/lmmx/page-dewarp",
        "Repository": "https://github.com/lmmx/page-dewarp.git"
    },
    "split_keywords": [
        "computer vision",
        " cubic spline",
        " dewarping",
        " document scanning",
        " image correction",
        " image processing"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "9bc756f6fd12f7307c50d6e487d966282741ed0550c1710a71a4c75162e88995",
                "md5": "ebb009f2b235a14b794c1344ad7352ca",
                "sha256": "90890e4f92bf5c174b5cf09cbcb8440c2eefa7b2d2d5d35fa15154943a083d6c"
            },
            "downloads": -1,
            "filename": "page_dewarp-0.1.9-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "ebb009f2b235a14b794c1344ad7352ca",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 35684,
            "upload_time": "2025-02-13T20:29:25",
            "upload_time_iso_8601": "2025-02-13T20:29:25.985951Z",
            "url": "https://files.pythonhosted.org/packages/9b/c7/56f6fd12f7307c50d6e487d966282741ed0550c1710a71a4c75162e88995/page_dewarp-0.1.9-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "443f7de3bba213c74fcba524090f542c8f7c55e4f6e229a8f5d97f24291b7209",
                "md5": "65ab77ba41b5a8923666de8a7154b1dc",
                "sha256": "d9a23e6781665b60f5911aca0f10cd12552dcf49a8acab071b7fa78656b94fb0"
            },
            "downloads": -1,
            "filename": "page_dewarp-0.1.9.tar.gz",
            "has_sig": false,
            "md5_digest": "65ab77ba41b5a8923666de8a7154b1dc",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 30498,
            "upload_time": "2025-02-13T20:29:27",
            "upload_time_iso_8601": "2025-02-13T20:29:27.255845Z",
            "url": "https://files.pythonhosted.org/packages/44/3f/7de3bba213c74fcba524090f542c8f7c55e4f6e229a8f5d97f24291b7209/page_dewarp-0.1.9.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-13 20:29:27",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "lmmx",
    "github_project": "page-dewarp",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [
        {
            "name": "contourpy",
            "specs": [
                [
                    "==",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "cycler",
            "specs": [
                [
                    "==",
                    "0.12.1"
                ]
            ]
        },
        {
            "name": "fonttools",
            "specs": [
                [
                    "==",
                    "4.53.1"
                ]
            ]
        },
        {
            "name": "kiwisolver",
            "specs": [
                [
                    "==",
                    "1.4.7"
                ]
            ]
        },
        {
            "name": "matplotlib",
            "specs": [
                [
                    "==",
                    "3.9.2"
                ]
            ]
        },
        {
            "name": "mpmath",
            "specs": [
                [
                    "==",
                    "1.3.0"
                ]
            ]
        },
        {
            "name": "numpy",
            "specs": [
                [
                    "==",
                    "2.1.1"
                ]
            ]
        },
        {
            "name": "opencv-python",
            "specs": [
                [
                    "==",
                    "4.10.0.84"
                ]
            ]
        },
        {
            "name": "packaging",
            "specs": [
                [
                    "==",
                    "24.1"
                ]
            ]
        },
        {
            "name": "pillow",
            "specs": [
                [
                    "==",
                    "10.4.0"
                ]
            ]
        },
        {
            "name": "pyparsing",
            "specs": [
                [
                    "==",
                    "3.1.4"
                ]
            ]
        },
        {
            "name": "python-dateutil",
            "specs": [
                [
                    "==",
                    "2.9.0.post0"
                ]
            ]
        },
        {
            "name": "scipy",
            "specs": [
                [
                    "==",
                    "1.14.1"
                ]
            ]
        },
        {
            "name": "six",
            "specs": [
                [
                    "==",
                    "1.16.0"
                ]
            ]
        },
        {
            "name": "sympy",
            "specs": [
                [
                    "==",
                    "1.13.2"
                ]
            ]
        },
        {
            "name": "toml",
            "specs": [
                [
                    "==",
                    "0.10.2"
                ]
            ]
        },
        {
            "name": "tomlkit",
            "specs": [
                [
                    "==",
                    "0.13.2"
                ]
            ]
        }
    ],
    "lcname": "page-dewarp"
}
        