# Perspectra
Software and corresponding workflow to scan documents and books
with as little hardware as possible.
Check out [github:adius/awesome-scanning]
for an extensive list of alternative solutions.
[github:adius/awesome-scanning]: https://github.com/adius/awesome-scanning
## Installation
```bash
pip install perspectra
```
## Usage
### Command Line Interface
```txt
usage: perspectra [-h] [--debug] {binarize,correct,corners,renumber-pages} ...
options:
-h, --help show this help message and exit
--debug Render debugging view
subcommands:
subcommands to handle files and correct photos
{binarize,correct,corners,renumber-pages}
additional help
binarize Binarize image
correct Perspectively correct and crop photos of documents.
corners Returns the corners of the document in the image
as [top-left, top-right, bottom-right, bottom-left]
renumber-pages Renames the images in a directory according to their page numbers.
The assumed layout is `cover -> odd pages -> even pages reversed`
```
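The layout assumed by `renumber-pages` can be illustrated with a small sketch.
This is not the actual implementation of Perspectra,
just one way to read `cover -> odd pages -> even pages reversed`.
The function name `page_numbers_for_scan_order` is hypothetical,
and numbering the cover as page 0 is an assumption.

```python
def page_numbers_for_scan_order(num_images: int) -> list[int]:
    """Map each image (in the order it was photographed) to a page number.

    Assumed scan order: cover first, then all odd pages ascending,
    then all even pages descending (the book is flipped and paged backwards).
    The cover is numbered 0 here, which is an assumption of this sketch.
    """
    if num_images < 1:
        return []

    pages = num_images - 1          # everything except the cover
    odd_count = (pages + 1) // 2    # pages 1, 3, 5, ...
    even_count = pages // 2         # pages 2, 4, 6, ...

    odd_pages = [2 * i + 1 for i in range(odd_count)]
    even_pages_reversed = [2 * i for i in range(even_count, 0, -1)]

    return [0] + odd_pages + even_pages_reversed


if __name__ == "__main__":
    # 7 photos: cover + 6 pages
    print(page_numbers_for_scan_order(7))  # [0, 1, 3, 5, 6, 4, 2]
```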
## Best Practices for Taking the Photos
Your photos should ideally have the following properties:
- Photos with 10 - 20 Mpx
- Contain 1 document
  - Rectangular
  - Pronounced corners
  - Only black content on white or light-colored paper
  - On dark background
  - Maximum of 30° rotation
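To check the resolution recommendation from the list above before processing a batch,
a sketch like the following could be used.
The helper names are hypothetical and not part of Perspectra;
the width and height would come from whatever image library you already use.

```python
def megapixels(width_px: int, height_px: int) -> float:
    """Return the resolution of an image in megapixels."""
    return width_px * height_px / 1_000_000


def in_recommended_range(
    width_px: int,
    height_px: int,
    low_mpx: float = 10.0,
    high_mpx: float = 20.0,
) -> bool:
    """Check a photo against the 10 - 20 Mpx recommendation."""
    return low_mpx <= megapixels(width_px, height_px) <= high_mpx


if __name__ == "__main__":
    print(megapixels(4000, 3000))            # 12.0
    print(in_recommended_range(4000, 3000))  # True
    print(in_recommended_range(1920, 1080))  # False
```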
### Camera Settings
```yaml
# Rule of thumb is the inverse of your focal length,
# but motion blur is pretty much the worst thing for readable documents.
# Therefore use at most half of that exposure time and never slower than 1/50 s.
shutter: 1/50 - 1/200 s
# The whole document must be sharp even if you photograph it from an angle.
# Therefore use at least f/8.
aperture: f/8 - f/12
# Noise is less bad than motion blur => relatively high ISO
# Should be the last thing you set:
# As high as necessary, as low as possible
iso: 800-6400
```
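The shutter speed rule from the settings above can be written down as a small calculation.
This sketch assumes that "half of it" means half of the exposure time
given by the inverse focal length rule, capped at 1/50 s.
The function name `max_exposure_time` is hypothetical.

```python
from fractions import Fraction


def max_exposure_time(
    focal_length_mm: int,
    slowest: Fraction = Fraction(1, 50),
) -> Fraction:
    """Return the longest exposure time in seconds that follows the rule.

    Rule of thumb: 1 / focal length. For documents, halve that time
    and never go slower than `slowest` (1/50 s by default).
    """
    if focal_length_mm <= 0:
        raise ValueError("focal length must be positive")

    rule_of_thumb = Fraction(1, focal_length_mm)
    return min(rule_of_thumb / 2, slowest)


if __name__ == "__main__":
    for focal_length in (20, 50, 100):
        print(f"{focal_length} mm -> {max_exposure_time(focal_length)} s")
```

For a 50 mm lens this yields 1/100 s, and for a 20 mm lens the 1/50 s cap applies.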
When using `Tv` (Time Value) or `Av` (Aperture Value) mode,
use exposure compensation to set the lightness value below 0.
You really don't want to overexpose your photos, as the bright pages
are the first thing that clips.
On the other hand,
it doesn't matter if you lose background parts because they are too dark.
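One way to spot overexposure is to measure how many pixels sit at the maximum value.
The following is only a sketch on plain grayscale values (0 to 255),
not something Perspectra does itself.
The function name `clipped_fraction` is hypothetical.

```python
from collections.abc import Iterable


def clipped_fraction(gray_values: Iterable[int], white_level: int = 255) -> float:
    """Return the share of pixels that are at or above the white level."""
    values = list(gray_values)
    if not values:
        return 0.0

    clipped = sum(1 for value in values if value >= white_level)
    return clipped / len(values)


if __name__ == "__main__":
    # Two of four pixels are fully white
    print(clipped_fraction([0, 128, 255, 255]))  # 0.5
```

If a noticeable share of the page area is clipped, lower the exposure compensation and take the photo again.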
### Generating the Photos from a Video
A good tool for this purpose is [PySceneDetect].
It's a Python/OpenCV-based scene detection program,
using threshold/content analysis on a given video.
[PySceneDetect]: https://github.com/Breakthrough/PySceneDetect
For easy installation you can use the [docker image].
[docker image]: https://github.com/handflucht/PySceneDetect
Find good values for threshold:
```fish
docker run \
--rm \
--volume (pwd):/video \
handflucht/pyscenedetect \
--input /video/page-turning.mp4 \
--downscale-factor 2 \
--detector content \
--statsfile page-turning-stats.csv
```
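To pick a threshold from the stats file, it can help to list the frames
whose content value exceeds a candidate threshold.
The sketch below is an assumption-heavy example:
the column name `content_val` and a plain CSV with a single header row are assumed,
and the actual layout depends on your PySceneDetect version.
The function name `rows_above_threshold` is hypothetical.

```python
import csv
import io


def rows_above_threshold(
    csv_text: str,
    threshold: float,
    column: str = "content_val",  # assumed column name, check your stats file
) -> list[tuple[int, float]]:
    """Return (row index, value) for every row whose value exceeds the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hits = []
    for row_index, row in enumerate(reader):
        value = float(row[column])
        if value > threshold:
            hits.append((row_index, value))
    return hits


if __name__ == "__main__":
    sample = "frame,content_val\n1,0.5\n2,4.2\n3,1.0\n"
    print(rows_above_threshold(sample, threshold=3.0))  # [(1, 4.2)]
```

Page turns should show up as short spikes, so a good threshold sits just above the values of the quiet frames.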
To launch the image run:
```fish
docker run \
--interactive \
--tty \
--volume=(pwd):/video \
--entrypoint=bash \
handflucht/pyscenedetect
```
Then run in the shell:
```bash
cd /video
scenedetect \
--input page-turning.mp4 \
--downscale-factor 2 \
--detector content \
--threshold 3 \
--min-scene-length 80 \
--save-images
```
TODO: The correct way to do this:
(after https://github.com/Breakthrough/PySceneDetect/issues/45 is implemented)
```fish
docker run \
--rm \
--volume (pwd):/video \
handflucht/pyscenedetect \
--input /video/page-turning.mp4 \
--downscale-factor 2 \
--detector content \
--threshold 3 \
--min-scene-length 80 \
--save-images <TODO: path>
```
Aim for a low threshold and a long minimum scene length.
I.e. turn the page really fast and show it for a long time.
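To get a feeling for how long each page has to stay visible,
the minimum scene length can be converted to seconds.
This sketch assumes that `--min-scene-length` is counted in frames
and that you know the frame rate of your video.
The function name `frames_to_seconds` is hypothetical.

```python
def frames_to_seconds(frame_count: int, frames_per_second: float) -> float:
    """Convert a number of video frames to a duration in seconds."""
    if frames_per_second <= 0:
        raise ValueError("frame rate must be positive")
    return frame_count / frames_per_second


if __name__ == "__main__":
    # With the value 80 used above:
    for fps in (25, 30, 60):
        print(f"{fps} fps -> {frames_to_seconds(80, fps):.2f} s per page")
```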
## TODO
- [ ] Calculate aspect ratio of scanned document
and apply during perspective transformation
- [ ] Make sure the basin for the watershed algorithm is not in a local confinement
- [ ] Add white border to Sobel image or crop it by 1px in order
to correctly handle partly cropped documents
- [ ] Check that there were at least 4 corners detected
- [ ] Dewarp pages
- [ ] Maximize contrast of image before binarizing
- [ ] Make algorithms independent from photo sizes
- [ ] Limit the kind of objects which get deleted when touching the border
(e.g. only elongated objects)
- [ ] Better algorithm for documents with rounded corners (e.g. credit cards)
- [ ] Mention that file format is inferred from file extension
- [ ] Spread range after converting to grayscale
- [ ] Add a border around the image
so that watershed starts to flood the image from all directions
and to improve recognition for clipped documents
- [ ] Implement https://github.com/scikit-image/scikit-image/issues/2212
- [ ] Check out http://ilastik.org
- [ ] Trim images before saving
- [ ] Try out https://github.com/andrewssobral/bgslibrary
- [ ] Try out https://github.com/Image-Py/imagepy
- [ ] Try out https://github.com/WPIRoboticsProjects/GRIP/releases
- [ ] Try out https://mybinder.org
- [ ] Read https://gilberttanner.com/blog/detectron2-train-a-instance-segmentation-model
- [ ] Read https://www.pyimagesearch.com/2018/07/30/opencv-object-tracking/
Raw data
{
"_id": null,
"home_page": null,
"name": "Perspectra",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": "document scanner, perspective transformation",
"author": null,
"author_email": "Adrian Sieber <mail@adriansieber.com>",
"download_url": "https://files.pythonhosted.org/packages/8c/bd/e70d982450ec75c59d3ad216e65896f4ed09c056223dad9db424e4d2f943/perspectra-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "ISC",
"summary": "Automatically extract and perspectively correct documents in images",
"version": "0.1.0",
"project_urls": {
"Homepage": "http://github.com/feramhq/perspectra"
},
"split_keywords": [
"document scanner",
" perspective transformation"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "76f9128840924aae8e0600d3ebc3b037a05267364cabafd2c408195882dfa44f",
"md5": "42edaa27b26dc227505d932a74b4b43a",
"sha256": "44ded53fa6ae94c6e5f8b955c9919f0bd205d74cd3ff3b83dbb4f493f58181cf"
},
"downloads": -1,
"filename": "Perspectra-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "42edaa27b26dc227505d932a74b4b43a",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 16939,
"upload_time": "2024-11-19T10:54:29",
"upload_time_iso_8601": "2024-11-19T10:54:29.519887Z",
"url": "https://files.pythonhosted.org/packages/76/f9/128840924aae8e0600d3ebc3b037a05267364cabafd2c408195882dfa44f/Perspectra-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8cbde70d982450ec75c59d3ad216e65896f4ed09c056223dad9db424e4d2f943",
"md5": "d769375a3ffe9f63e1fbed0460d8a4bb",
"sha256": "11bb816f02ea1e4b059fc68ccb01ec0d50df5bd5c1c4b192d52ca8ee5d50896c"
},
"downloads": -1,
"filename": "perspectra-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "d769375a3ffe9f63e1fbed0460d8a4bb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 16404,
"upload_time": "2024-11-19T10:54:30",
"upload_time_iso_8601": "2024-11-19T10:54:30.775789Z",
"url": "https://files.pythonhosted.org/packages/8c/bd/e70d982450ec75c59d3ad216e65896f4ed09c056223dad9db424e4d2f943/perspectra-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-19 10:54:30",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "feramhq",
"github_project": "perspectra",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "perspectra"
}