# flyfield
A tool that automatically detects white boxes in PDFs and converts them into interactive form fields, aimed at users and developers looking to streamline PDF form workflows.
---
## Overview
**flyfield** is a Python library and command-line tool designed to automate the conversion of white box placeholders within PDFs into fully interactive form fields. This enables users and developers to analyze form layouts, generate fillable fields, fill forms programmatically using CSV data, and capture filled data for further use.
Powered by [PyMuPDF](https://pymupdf.readthedocs.io) and [PyPDFForm](https://pypdfform.readthedocs.io), flyfield provides a modular, extensible solution for reliable PDF form automation.
---
## Key Features
- **White box detection:** Precisely extract white box regions as potential form fields from vector PDFs.
- **Layout analysis:** Group extracted fields by page, line, and block with flexible gap detection.
- **Form field generation:** Automatically produce Python scripts to create interactive PDF form fields aligned with detected boxes.
- **Markup visualization:** Generate annotated PDFs marking detected fields for verification.
- **Form filling and capture:** Fill PDF forms programmatically from CSV data and extract filled data into CSV format.
- **CLI integration:** User-friendly command-line interface to orchestrate workflows from extraction to data capture.
- **Open source and extensible:** Easily customize and extend for specific PDF processing needs.
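
The detection and layout steps listed above can be illustrated with a minimal sketch. This is not flyfield's implementation: the `Box` type, the `is_white`, `group_into_lines`, and `split_into_blocks` helpers, and the tolerance values are all assumptions chosen for illustration. The input mimics the fill colour and bounding rectangle that a vector-drawing extractor (for example PyMuPDF's `Page.get_drawings()`) reports for each path.

```python
from typing import List, NamedTuple, Optional, Tuple


class Box(NamedTuple):
    """Axis-aligned rectangle in PDF points (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def is_white(fill: Optional[Tuple[float, float, float]], tol: float = 1e-3) -> bool:
    """Return True when an RGB fill (components in 0..1) is white within `tol`."""
    return fill is not None and all(abs(c - 1.0) <= tol for c in fill)


def group_into_lines(boxes: List[Box], y_tol: float = 2.0) -> List[List[Box]]:
    """Group boxes whose top edge is within `y_tol` points of the line's first box."""
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: (b.y0, b.x0)):
        if lines and abs(box.y0 - lines[-1][0].y0) <= y_tol:
            lines[-1].append(box)
        else:
            lines.append([box])
    # Order each line left to right.
    return [sorted(line, key=lambda b: b.x0) for line in lines]


def split_into_blocks(line: List[Box], max_gap: float = 3.0) -> List[List[Box]]:
    """Split a left-to-right line wherever the horizontal gap exceeds `max_gap`."""
    blocks: List[List[Box]] = []
    for box in line:
        if blocks and box.x0 - blocks[-1][-1].x1 <= max_gap:
            blocks[-1].append(box)
        else:
            blocks.append([box])
    return blocks


# Made-up drawings as (fill colour, rectangle) pairs; only white fills are kept.
drawings = [
    ((1.0, 1.0, 1.0), Box(10, 100, 20, 112)),
    ((1.0, 1.0, 1.0), Box(21, 100, 31, 112)),
    ((1.0, 1.0, 1.0), Box(60, 101, 70, 113)),
    ((0.5, 0.5, 0.5), Box(10, 150, 20, 162)),  # grey fill, ignored
    ((1.0, 1.0, 1.0), Box(10, 200, 20, 212)),
]
white_boxes = [box for fill, box in drawings if is_white(fill)]
lines = group_into_lines(white_boxes)
blocks_per_line = [split_into_blocks(line) for line in lines]
```

With this sample data, the first line splits into a two-box block and a single-box block, which is the kind of grouping a multi-character field (such as a date or account number) would need. Real documents would likely need tolerances tuned to the form's geometry.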
---
## Installation
For isolated installation, use [pipx](https://pipxproject.github.io/pipx/):
```
pipx install flyfield
```
Verify the installed version:
```
flyfield --version
```
Alternatively, install with `pip`:
```
pip install flyfield
```
---
## Usage
Execute commands against PDF files as needed:
```
flyfield --input-pdf myfile.pdf --markup
```
Common options:
- `--markup` → Generate a marked-up PDF showing detected white boxes.
- `--fields` → Generate and run scripts to add interactive form fields.
- `--fill` → Fill form fields using CSV data.
- `--capture` → Extract filled form data back to CSV.
- `--input-csv` → Provide CSV input for field data, bypassing extraction.
- `--debug` → Enable verbose debug output.
Example workflow:
```
flyfield --input-pdf form_template.pdf --markup --fields
flyfield --input-pdf form_template-fields.pdf --input-csv form_template.csv --fill
flyfield --input-pdf form_template-filled.pdf --capture
```
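
The three-step workflow above can also be scripted. The following is a minimal sketch that only assembles and optionally runs the same command lines; it uses no flyfield Python API. The `build_workflow` and `run_workflow` helpers are hypothetical, and the `-fields.pdf` and `-filled.pdf` output names are an assumption inferred from the example, not a documented guarantee.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List


def build_workflow(template: str, csv_path: str) -> List[List[str]]:
    """Build the argv lists for the markup/fields, fill, and capture steps.

    Assumes flyfield writes '<stem>-fields.pdf' and '<stem>-filled.pdf'
    next to the template, as the example workflow suggests.
    """
    path = Path(template)
    fields_pdf = str(path.parent / f"{path.stem}-fields.pdf")
    filled_pdf = str(path.parent / f"{path.stem}-filled.pdf")
    return [
        ["flyfield", "--input-pdf", template, "--markup", "--fields"],
        ["flyfield", "--input-pdf", fields_pdf, "--input-csv", csv_path, "--fill"],
        ["flyfield", "--input-pdf", filled_pdf, "--capture"],
    ]


def run_workflow(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # check=True stops the workflow at the first failing step.
            subprocess.run(argv, check=True)


commands = build_workflow("form_template.pdf", "form_template.csv")
run_workflow(commands, dry_run=True)
```

Keeping `dry_run=True` by default lets the commands be inspected before anything is executed; pass `dry_run=False` once the printed commands look right and `flyfield` is on the `PATH`.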
---
## For Developers
Clone the repository and install development dependencies:
```
git clone https://github.com/flywire/flyfield.git
cd flyfield
pip install -e ".[dev]"
```
Run the test suite with coverage enabled:
```
tox
```
The project is organized into separate modules, including:
- `extract` for box detection
- `layout` for field grouping and filtering
- `markup_and_fields` for field generation and markup
- `io_utils` for data input/output
- `utils` for utility functions
Get CLI help with:
```
python -m flyfield.cli --help
```
---
## License
Licensed under the **GNU General Public License v3.0 or later** (GPL-3.0-or-later). See [LICENSE](LICENSE) for details.
---
## Contributing
Contributions are welcome! Please open issues to report bugs or request features, and submit pull requests with tests and documentation improvements.
---
## Acknowledgements
- Built using [PyMuPDF](https://pymupdf.readthedocs.io) for PDF handling.
- Uses [PyPDFForm](https://pypdfform.readthedocs.io) for interactive form creation.
- Inspired by the need for robust automation of PDF workflows involving white-box form fields.
---
Raw data
{
"_id": null,
"home_page": null,
"name": "flyfield",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "pdf, form, fields, extraction, pymupdf, pypdfforms",
"author": null,
"author_email": "flywire <flywire0@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/10/e1/480fa18e5241ae2393b81970275b385b301b1a33ef8c55233d281c85dcaf/flyfield-2025.8.29.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "Tools for extracting, processing, and generating interactive fields for PDFs containing white box fields.",
"version": "2025.8.29",
"project_urls": null,
"split_keywords": [
"pdf",
" form",
" fields",
" extraction",
" pymupdf",
" pypdfforms"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9a8e7bb78b35da0c6be041b4ed5dce4c9f835f7b1e7586e29ed3cc38989e88db",
"md5": "857205295fff4627c663c88f732c4b58",
"sha256": "b223a4e386de386c31f96a2038fba2b9859be6c5ba26fb52829295cf36ab4db7"
},
"downloads": -1,
"filename": "flyfield-2025.8.29-py3-none-any.whl",
"has_sig": false,
"md5_digest": "857205295fff4627c663c88f732c4b58",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 21904,
"upload_time": "2025-09-01T13:49:17",
"upload_time_iso_8601": "2025-09-01T13:49:17.446116Z",
"url": "https://files.pythonhosted.org/packages/9a/8e/7bb78b35da0c6be041b4ed5dce4c9f835f7b1e7586e29ed3cc38989e88db/flyfield-2025.8.29-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "10e1480fa18e5241ae2393b81970275b385b301b1a33ef8c55233d281c85dcaf",
"md5": "e6fe8479bb6dd8bbe0065b21ae0ff579",
"sha256": "a6bca34e77a014a78ec929c04674e34073209f859362d842587ed5509165b09a"
},
"downloads": -1,
"filename": "flyfield-2025.8.29.tar.gz",
"has_sig": false,
"md5_digest": "e6fe8479bb6dd8bbe0065b21ae0ff579",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 18828,
"upload_time": "2025-09-01T13:49:18",
"upload_time_iso_8601": "2025-09-01T13:49:18.573520Z",
"url": "https://files.pythonhosted.org/packages/10/e1/480fa18e5241ae2393b81970275b385b301b1a33ef8c55233d281c85dcaf/flyfield-2025.8.29.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-09-01 13:49:18",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "flyfield"
}