Name | pylaia |
Version | 1.1.2 |
home_page | None |
Summary | None |
upload_time | 2024-10-16 15:51:04 |
maintainer | None |
docs_url | None |
author | None |
requires_python | <3.11,>=3.9 |
license | MIT |
keywords | htr, ocr, python |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
<div align="center">
# PyLaia
**PyLaia is a device-agnostic, PyTorch-based deep learning toolkit for handwritten document analysis.**
**It is also a successor to [Laia](https://github.com/jpuigcerver/Laia).**
[![pipeline status](https://gitlab.teklia.com/atr/pylaia/badges/master/pipeline.svg)](https://gitlab.teklia.com/atr/pylaia/-/commits/master)
[![Coverage](https://gitlab.teklia.com/atr/pylaia/badges/master/coverage.svg)](https://gitlab.teklia.com/atr/pylaia/-/commits/master)
[![Code quality](https://img.shields.io/codefactor/grade/github/jpuigcerver/PyLaia?&label=CodeFactor&logo=CodeFactor&labelColor=2782f7)](https://www.codefactor.io/repository/github/jpuigcerver/PyLaia)
[![Python: 3.9 | 3.10](https://img.shields.io/badge/python-3.9%20%7C%203.10-blue)](https://www.python.org/)
[![PyTorch: 1.13.0 | 1.13.1](https://img.shields.io/badge/PyTorch-1.13.0%20%7C%201.13.1-8628d5.svg?&logo=PyTorch&logoColor=white&labelColor=%23ee4c2c)](https://pytorch.org/)
[![pre-commit: enabled](https://img.shields.io/badge/pre--commit-enabled-76877c?&logo=pre-commit&labelColor=1f2d23)](https://github.com/pre-commit/pre-commit)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?)](https://github.com/ambv/black)
[![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)
</div>
Get started by having a look at our [Documentation](https://atr.pages.teklia.com/pylaia)!
## Installation
To install PyLaia from PyPI:
```bash
pip install pylaia
```
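The package metadata declares `requires_python` as `<3.11,>=3.9`, so installation is only expected to work on Python 3.9 or 3.10. As a minimal sketch, the check below compares the running interpreter against those bounds before you run `pip install`. The helper name `is_supported` is hypothetical and not part of PyLaia.

```python
import sys

# Bounds taken from the package metadata: requires_python "<3.11,>=3.9".
MIN_VERSION = (3, 9)   # inclusive lower bound
MAX_VERSION = (3, 11)  # exclusive upper bound


def is_supported(version=None):
    """Return True if (major, minor) lies in the range [3.9, 3.11).

    `version` defaults to the running interpreter; this helper is an
    illustrative assumption, not a PyLaia API.
    """
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    status = "supported" if is_supported() else "not supported"
    print(f"Python {current} is {status} by pylaia 1.1.2")
```

Comparing `(major, minor)` tuples avoids the pitfalls of comparing version strings, where `"3.10"` would sort before `"3.9"`.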
The following Python scripts will be installed on your system:
- [`pylaia-htr-create-model`](laia/scripts/htr/create_model.py): Create a VGG-like model with BLSTMs on top for handwritten text recognition. The script has different options to customize the model. The architecture is based on the paper ["Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?"](https://ieeexplore.ieee.org/document/8269951) (2017) by J. Puigcerver.
- [`pylaia-htr-train-ctc`](laia/scripts/htr/train_ctc.py): Train a model using the CTC algorithm and a set of text-line images and their transcripts.
- [`pylaia-htr-decode-ctc`](laia/scripts/htr/decode_ctc.py): Decode text-line images using a trained model and the CTC algorithm. It can also output the character and word segmentation boundaries of the recognized symbols.
- [`pylaia-htr-netout`](laia/scripts/htr/netout.py): Dump the output of the model for a set of text-line images in order to decode using an external language model.
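To confirm that the four scripts listed above ended up on your `PATH` after installation, a short check along these lines may help. It is only a sketch: the script names come from the list above, while the helper `find_scripts` is a hypothetical name introduced here.

```python
import shutil

# Entry points named in the installation section.
SCRIPTS = [
    "pylaia-htr-create-model",
    "pylaia-htr-train-ctc",
    "pylaia-htr-decode-ctc",
    "pylaia-htr-netout",
]


def find_scripts(names=SCRIPTS):
    """Map each script name to its resolved path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_scripts().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a script is reported as missing, the most likely cause is that the environment you installed into is not the one currently active.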
## Contributing
If you want to contribute a new feature or a fix, please head to [CONTRIBUTING.md](https://gitlab.teklia.com/atr/pylaia/-/blob/master/CONTRIBUTING.md) to learn more, then follow these steps.
1. Fork it ( <https://gitlab.teklia.com/atr/pylaia/-/forks/new> )
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create a new Merge Request ( <https://gitlab.teklia.com/atr/pylaia/-/merge_requests/new> )
### Code of conduct
We are committed to providing a friendly, safe and welcoming environment for all. Please read and
respect the [PyLaia Code of Conduct](https://gitlab.teklia.com/atr/pylaia/-/blob/master/CODE_OF_CONDUCT.md).
## Acknowledgments
Work in this toolkit was financially supported by the [Pattern Recognition and Human Language Technology (PRHLT) Research Center](https://www.prhlt.upv.es/).
## Citation
* Article describing the latest contributions to PyLaia
```bib
@inproceedings{pylaia2024,
author = "Tarride, Solène and Schneider, Yoann and Generali, Marie and Boillet, Melodie and Abadie, Bastien and Kermorvant, Christopher",
title = "Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library",
booktitle = "Submitted at ICDAR",
year = "2024"
}
```
* Original article
```bib
@inproceedings{laia2017,
author={Puigcerver, Joan},
booktitle={2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
title={Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?},
year={2017},
volume={01},
number={},
pages={67-72},
doi={10.1109/ICDAR.2017.20}}
```
* GitLab repository
```bib
@software{pylaia-teklia,
author = {Teklia},
title = {PyLaia},
year = {2022},
url = {https://gitlab.teklia.com/atr/pylaia/},
version = {1.1.0},
note = {commit SHA}
}
```
* GitHub repository
```bib
@misc{puigcerver2018pylaia,
author = {Joan Puigcerver and Carlos Mocholí},
title = {PyLaia},
year = {2018},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/jpuigcerver/PyLaia/}},
commit = {commit SHA}
}
```
## Contact
🆘 Have a question about PyLaia? Please contact us on [support.teklia.com](https://support.teklia.com/c/machine-learning/pylaia/13).
Raw data
{
"_id": null,
"home_page": null,
"name": "pylaia",
"maintainer": null,
"docs_url": null,
"requires_python": "<3.11,>=3.9",
"maintainer_email": "Teklia <contact@teklia.com>",
"keywords": "HTR OCR python",
"author": null,
"author_email": "Joan Puigcerver <joapuipe@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/51/b7/47f07f442c9bd651cef7d36b382cd26856e567e1ab724cd2fe53e5ae0e40/pylaia-1.1.2.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": null,
"version": "1.1.2",
"project_urls": {
"Documentation": "https://atr.pages.teklia.com/pylaia/",
"Downloads": "https://gitlab.teklia.com/atr/pylaia",
"Homepage": "https://atr.pages.teklia.com/pylaia/",
"Source": "https://gitlab.teklia.com/atr/pylaia/",
"Tracker": "https://gitlab.teklia.com/atr/pylaia/issues/"
},
"split_keywords": [
"htr",
"ocr",
"python"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "36d24d2ecc9677eddef508ef4a1671fbcfd1459f904600e8be0edac712d2f7f2",
"md5": "da3912e93189f62c50551e3e6447ce55",
"sha256": "8b894d3a84a8636b58cbd8b5a1e08024b77f3de42fc2a81e772e4710c734399b"
},
"downloads": -1,
"filename": "pylaia-1.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "da3912e93189f62c50551e3e6447ce55",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": "<3.11,>=3.9",
"size": 93416,
"upload_time": "2024-10-16T15:51:02",
"upload_time_iso_8601": "2024-10-16T15:51:02.555856Z",
"url": "https://files.pythonhosted.org/packages/36/d2/4d2ecc9677eddef508ef4a1671fbcfd1459f904600e8be0edac712d2f7f2/pylaia-1.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "51b747f07f442c9bd651cef7d36b382cd26856e567e1ab724cd2fe53e5ae0e40",
"md5": "ac8ff69819deb3f05db095397cd619f3",
"sha256": "873fabd7f382ce7d267cc8af2e0f1848ab6f3382eb2c7e5a0fbfa8c0f576b95a"
},
"downloads": -1,
"filename": "pylaia-1.1.2.tar.gz",
"has_sig": false,
"md5_digest": "ac8ff69819deb3f05db095397cd619f3",
"packagetype": "sdist",
"python_version": "source",
"requires_python": "<3.11,>=3.9",
"size": 66849,
"upload_time": "2024-10-16T15:51:04",
"upload_time_iso_8601": "2024-10-16T15:51:04.163072Z",
"url": "https://files.pythonhosted.org/packages/51/b7/47f07f442c9bd651cef7d36b382cd26856e567e1ab724cd2fe53e5ae0e40/pylaia-1.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-10-16 15:51:04",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "pylaia"
}
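The `digests` entries in the raw data can be used to check a downloaded archive before installing it. The following is a minimal sketch, assuming you have already downloaded one of the two files listed under `urls`. The helper names `sha256_of` and `matches_expected` are hypothetical, and the digests are copied from the metadata above.

```python
import hashlib

# SHA-256 digests copied from the "urls" entries in the raw data.
EXPECTED_SHA256 = {
    "pylaia-1.1.2-py3-none-any.whl": "8b894d3a84a8636b58cbd8b5a1e08024b77f3de42fc2a81e772e4710c734399b",
    "pylaia-1.1.2.tar.gz": "873fabd7f382ce7d267cc8af2e0f1848ab6f3382eb2c7e5a0fbfa8c0f576b95a",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, filename):
    """Compare a local file against the digest recorded for `filename`."""
    return sha256_of(path) == EXPECTED_SHA256[filename]
```

For example, `matches_expected("pylaia-1.1.2.tar.gz", "pylaia-1.1.2.tar.gz")` should return `True` only if the local file is byte-for-byte identical to the published source distribution.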