# Recursive Segmentation Model
The ideas in this repository are largely based on the original 1995 paper, _Recursive XY cut using bounding boxes of connected components_ (https://ieeexplore.ieee.org/document/602059). It's a lightweight segmentation algorithm with no ML components, so it runs very fast and can also be parallelized (more to come on this front).
**_Disclaimer_**: _This is an unbenchmarked segmentation model. It works decently well for documents at first glance and will be extended to general images in the near future. I also need to find a better name for this package._
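For intuition, the core recursion behind an XY cut can be sketched roughly as follows. This is a minimal illustration and not this package's implementation: the function names `split_by_gaps` and `xy_cut`, the `min_gap` threshold, and the `(x0, y0, x1, y1)` box format are all assumptions made for the example. It takes bounding boxes of connected components, looks for empty gaps along one axis, cuts there, and repeats on each piece until no cut is possible.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1), with x1 and y1 exclusive
Box = Tuple[int, int, int, int]


def split_by_gaps(boxes: List[Box], axis: int, min_gap: int) -> List[List[Box]]:
    """Group boxes separated by an empty gap of at least `min_gap`.

    axis 0 looks for gaps along x (vertical cuts), axis 1 along y (horizontal cuts).
    """
    ordered = sorted(boxes, key=lambda b: b[axis])
    groups = [[ordered[0]]]
    reach = ordered[0][axis + 2]  # furthest extent covered so far on this axis
    for box in ordered[1:]:
        if box[axis] - reach >= min_gap:
            groups.append([box])  # gap is wide enough: start a new group
        else:
            groups[-1].append(box)
        reach = max(reach, box[axis + 2])
    return groups


def xy_cut(boxes: List[Box], min_gap: int = 10) -> List[Box]:
    """Return one enclosing box per segment found by recursive cutting."""
    if not boxes:
        return []
    for axis in (1, 0):  # try horizontal cuts first, then vertical cuts
        groups = split_by_gaps(boxes, axis, min_gap)
        if len(groups) > 1:
            return [seg for group in groups for seg in xy_cut(group, min_gap)]
    # No cut is possible on either axis, so this group is a leaf segment
    return [(
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )]
```

Because each group is processed independently after a cut, the recursive calls do not share state, which is what makes this kind of algorithm easy to run in parallel.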
## Getting Started
This package is published on PyPI (https://pypi.org/project/xy-segmentation/). Get started by running the following command:
```
pip install xy-segmentation
```
Example usage:
```python
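from PIL import Image, ImageDraw

# `segment_pdf_image` comes from this package; see `main.py` for the exact import.

ifile = "examples/images/apple_iphone-13_manual.jpg"
img = Image.open(ifile)

# Draw in "RGBA" mode so the semi-transparent fill blends with the page
draw = ImageDraw.Draw(img, "RGBA")
for crop in segment_pdf_image(img):
    draw.rectangle(
        crop.bounding_box, outline=(255, 0, 0), width=3, fill=(0, 127, 255, 80)
    )

img.show()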
```
## Examples
<p>
<img src="https://github.com/johnathanchiu/recursive-segmentation/blob/main/examples/outputs/apple_output.jpg" alt="Image 1" width="400"/>
<img src="https://github.com/johnathanchiu/recursive-segmentation/blob/main/examples/outputs/dell_output.jpg" alt="Image 2" width="400"/>
</p>
See `main.py` or `ex.ipynb` for examples of how to draw the images.
The examples in the `pdfs` folder under `examples` were taken from [here](https://www.princexml.com/samples/), and those in the `images` folder under `examples` were taken from [here](https://github.com/AIM3-RUC/MPMQA).
## Local Setup
```
pip install -r requirements.txt
```
## Additional Information
This algorithm works particularly well on documents that contain a lot of diagrams and are well spaced. It performs poorly on documents that are purely text-based, but there is usually no need to segment those: just throw them into RAG directly. It could be interesting to detect pages like this and skip the segmentation step for them entirely.
At the moment, I am looking to build out an ML model to determine where to split chunks in the page. The main idea would be to train a seq2seq model that outputs a binary sequence. The input sequence is the slices of the image, and the output is a binary sequence in which a 1 represents a split in the image and a 0 represents no split.
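To make the output format concrete, here is a small sketch of how such a binary sequence could be decoded into page segments. No model exists yet, so everything here is an assumption: the function name `splits_to_segments`, the fixed `slice_height`, and the convention that a 1 at index `i` means "cut just before slice `i`".

```python
from typing import List, Tuple


def splits_to_segments(split_flags: List[int], slice_height: int) -> List[Tuple[int, int]]:
    """Turn a per-slice 0/1 sequence into (top, bottom) pixel ranges.

    Assumed convention: a 1 at index i means a cut just before slice i.
    """
    if not split_flags:
        return []
    segments = []
    start = 0  # index of the first slice in the current segment
    for i, flag in enumerate(split_flags):
        if flag == 1 and i > start:
            segments.append((start * slice_height, i * slice_height))
            start = i
    # Close the final segment at the bottom of the image
    segments.append((start * slice_height, len(split_flags) * slice_height))
    return segments
```

With six slices of 10 pixels each and the sequence `[0, 0, 1, 0, 1, 0]`, this yields three segments covering rows 0 to 20, 20 to 40, and 40 to 60.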
### Limitations
Like any bounding-box segmentation algorithm, the main limitation is the shape of the segments. Edge cases arise when the content of the input image is not laid out on a grid. Take an image that contains "L"-shaped objects: a rectangular bounding box around the "L" also covers the empty corner (and anything that sits in it), so the object cannot be segmented out cleanly. If anyone has any ideas on how to improve this, please feel free to suggest!
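A toy example of the "L" problem, using a made-up 4x4 grid rather than anything from this package (the helper names `bounding_box` and `fill_ratio` are hypothetical): the tightest box around the shape is the full grid, and fewer than half of its pixels belong to the object.

```python
from typing import List, Tuple

# An "L"-shaped object on a 4x4 grid (1 = object pixel, 0 = background)
L_SHAPE = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]


def bounding_box(grid: List[List[int]]) -> Tuple[int, int, int, int]:
    """Tightest (x0, y0, x1, y1) box around all object pixels, x1/y1 exclusive."""
    rows = [r for r, row in enumerate(grid) if any(row)]
    cols = [c for c in range(len(grid[0])) if any(row[c] for row in grid)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def fill_ratio(grid: List[List[int]]) -> float:
    """Fraction of the bounding box that is actually covered by the object."""
    x0, y0, x1, y1 = bounding_box(grid)
    area = (x1 - x0) * (y1 - y0)
    filled = sum(sum(row) for row in grid)
    return filled / area


print(bounding_box(L_SHAPE))  # (0, 0, 4, 4)
print(fill_ratio(L_SHAPE))    # 0.4375
```

Anything placed in the empty upper-right region would fall inside the same box, which is why no single rectangle can isolate the "L".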
## Contributing
Feel free to contribute to this repository through Pull Requests and Issues. Reach out to me if you have any ideas surrounding this that you want to discuss!