vnhtr


* Name: vnhtr
* Version: 0.1.8
* Home page: https://github.com/nguyenhoanganh2002/vnhtr
* Summary: Encoder-Decoder base for Vietnamese handwriting recognition
* Upload time: 2024-01-13 13:57:59
* Author: nguyenhoanganh2002
* Requires Python: >=3.6
* Keywords: ocr, vnocr, htr, vnhtr
* Requirements: No requirements were recorded.

# Vietnamese Handwriting Text Recognition (aka vnhtr package)

This project deploys and improves on two foundational models: [TrOCR](https://huggingface.co/docs/transformers/model_doc/trocr) and [VietOCR](https://github.com/pbcquoc/vietocr).

## Proposed Architecture
### VGG Transformer with Rethinking Head
![VGG Transformer with Rethinking Head](https://github.com/nguyenhoanganh2002/vnhtr/assets/79850337/82876cdd-b84a-47da-9339-6362bd0400d1)
### TrOCR with Rethinking Head
![TrOCR with Rethinking Head](https://github.com/nguyenhoanganh2002/vnhtr/assets/79850337/9295c94f-5059-4a03-a3f3-950e0ab92e30)
## Usage
### `vnhtr` package
```bash
pip install vnhtr
```
```python
from PIL import Image
from vnhtr.vnhtr_script.tools import VGGTransformer, TrOCR

# Create each predictor on the first GPU.
vta_predictor = VGGTransformer("cuda:0")
tra_predictor = TrOCR("cuda:0")

# predict() takes a list of PIL images.
images = [Image.open("/content/out_sample_2.jpg")]
vta_predictor.predict(images)
tra_predictor.predict(images)
```
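For more than a handful of images, it can help to feed a predictor in fixed-size batches. The sketch below is one way to do that. `predict_paths`, `batch_size`, and `load` are hypothetical names that are not part of `vnhtr`, and the code assumes `predict` returns one result per input image, in the same order, which is not documented here.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple


def predict_paths(
    predictor: Any,
    paths: Iterable[str],
    batch_size: int = 8,
    load: Optional[Callable[[str], Any]] = None,
) -> List[Tuple[str, Any]]:
    """Run predictor.predict over image paths in batches (hypothetical helper).

    Assumes predict() accepts a list of images and returns one result per
    image in the same order; this is an assumption, not documented behavior.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if load is None:
        # Default loader: open each file with Pillow, as in the snippet above.
        from PIL import Image
        load = Image.open

    path_list = list(paths)
    results: List[Tuple[str, Any]] = []
    for start in range(0, len(path_list), batch_size):
        chunk = path_list[start:start + batch_size]
        outputs = predictor.predict([load(p) for p in chunk])
        # Pair every path with its prediction so results stay traceable.
        results.extend(zip(chunk, outputs))
    return results
```

With the predictors created above, a call might look like `predict_paths(vta_predictor, ["/content/out_sample_2.jpg"])`.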
### Full implementation
```bash
git clone https://github.com/nguyenhoanganh2002/vnhtr
cd ./vnhtr/vnhtr/source
pip install -r requirements.txt
```
* Pretrain/finetune VGG Transformer/TrOCR (pretrain on a large dataset, then finetune on a wild dataset)
```bash
python VGGTransformer/train.py
python VisionEncoderDecoder/train.py
```
* Pretrain VGG Transformer/TrOCR with Rethinking Head (large dataset)
```bash
python VGGTransformer/adapter_trainer.py
python VisionEncoderDecoder/adapter_trainer.py
```
* Finetune VGG Transformer/TrOCR with Rethinking Head (wild dataset)
```bash
python VGGTransformer/finetune.py
python VisionEncoderDecoder/finetune.py
```
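The three stages above can be chained per model. The following is a minimal sketch that only builds and runs the commands already listed. It assumes the stages are meant to run in the order shown and that the script is launched from `vnhtr/vnhtr/source`. `STAGES`, `MODEL_DIRS`, `stage_commands`, and `run_stages` are illustrative names, not part of the repository.

```python
import subprocess
import sys
from typing import List

# Script names taken from the commands above, in the order they are listed.
STAGES = ["train.py", "adapter_trainer.py", "finetune.py"]
MODEL_DIRS = ["VGGTransformer", "VisionEncoderDecoder"]


def stage_commands(model_dir: str) -> List[List[str]]:
    """Return one command per stage for the given model directory."""
    if model_dir not in MODEL_DIRS:
        raise ValueError(f"unknown model directory: {model_dir}")
    return [[sys.executable, f"{model_dir}/{script}"] for script in STAGES]


def run_stages(model_dir: str) -> None:
    """Run every stage in order, stopping at the first failure."""
    for command in stage_commands(model_dir):
        # check=True raises CalledProcessError if a script exits non-zero.
        subprocess.run(command, check=True)
```

Calling `run_stages("VGGTransformer")` would then run the three VGG Transformer scripts one after another. Whether a later stage needs checkpoints from an earlier one depends on the repository's configuration, which this sketch does not inspect.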
* Instantiate the models directly, without running the training or finetuning scripts.
```python
from VGGTransformer.config import config as vggtransformer_cf
from VGGTransformer.models import VGGTransformer, AdapterVGGTransformer
from VisionEncoderDecoder.config import config as trocr_cf
from VisionEncoderDecoder.model import VNTrOCR, AdapterVNTrOCR

vt_base = VGGTransformer(vggtransformer_cf)
vt_adapter = AdapterVGGTransformer(vggtransformer_cf)
tr_base = VNTrOCR(trocr_cf)
tr_adapter = AdapterVNTrOCR(trocr_cf)
```
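To see how much the Rethinking Head adds, one option is to compare parameter counts between the base and adapter variants. This sketch assumes the four classes are `torch.nn.Module` subclasses (not confirmed here), so that `parameters()` yields tensors exposing `numel()` and `requires_grad`. `count_parameters` is a hypothetical helper.

```python
from typing import Any, Dict


def count_parameters(model: Any) -> Dict[str, int]:
    """Count total and trainable parameters (hypothetical helper).

    Relies only on model.parameters(), numel() and requires_grad, which
    torch.nn.Module and torch.Tensor provide.
    """
    total = 0
    trainable = 0
    for param in model.parameters():
        n = param.numel()
        total += n
        if param.requires_grad:
            trainable += n
    return {"total": total, "trainable": trainable}
```

For example, comparing `count_parameters(vt_base)` with `count_parameters(vt_adapter)` would show the size difference between the two VGG Transformer variants, if the assumption above holds.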

For access to the full dataset and pretrained weights, please contact: [anh.nh204511@gmail.com](mailto:anh.nh204511@gmail.com)

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/nguyenhoanganh2002/vnhtr",
    "name": "vnhtr",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.6",
    "maintainer_email": "",
    "keywords": "ocr,vnocr,htr,vnhtr",
    "author": "nguyenhoanganh2002",
    "author_email": "anh.nh204511@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/c9/ee/0e75d3c39b87df9daacee0e04f640f014439663e62cc6f7f515f237bb046/vnhtr-0.1.8.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Encoder-Decoder base for Vietnamese handwriting recognition",
    "version": "0.1.8",
    "project_urls": {
        "Homepage": "https://github.com/nguyenhoanganh2002/vnhtr"
    },
    "split_keywords": [
        "ocr",
        "vnocr",
        "htr",
        "vnhtr"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "65dba81df657c54395cb3b716cb833caf4989dbe32b617955a88e3435068ca17",
                "md5": "6bb979c4e3ccaa67d7b0e190b4345334",
                "sha256": "88dab8c51e4d641a6de8127dd8d3902f30198235ef976b05e6b1bd0c75d28725"
            },
            "downloads": -1,
            "filename": "vnhtr-0.1.8-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "6bb979c4e3ccaa67d7b0e190b4345334",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.6",
            "size": 50592,
            "upload_time": "2024-01-13T13:57:57",
            "upload_time_iso_8601": "2024-01-13T13:57:57.196623Z",
            "url": "https://files.pythonhosted.org/packages/65/db/a81df657c54395cb3b716cb833caf4989dbe32b617955a88e3435068ca17/vnhtr-0.1.8-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c9ee0e75d3c39b87df9daacee0e04f640f014439663e62cc6f7f515f237bb046",
                "md5": "4fd62697b313f397a99ec65c94a234ba",
                "sha256": "39bb0fe41c4ed1d6f2a3bf6e879aaafbc53ba7eaeb75e7d64c6448d860215d19"
            },
            "downloads": -1,
            "filename": "vnhtr-0.1.8.tar.gz",
            "has_sig": false,
            "md5_digest": "4fd62697b313f397a99ec65c94a234ba",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6",
            "size": 32903,
            "upload_time": "2024-01-13T13:57:59",
            "upload_time_iso_8601": "2024-01-13T13:57:59.472876Z",
            "url": "https://files.pythonhosted.org/packages/c9/ee/0e75d3c39b87df9daacee0e04f640f014439663e62cc6f7f515f237bb046/vnhtr-0.1.8.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-01-13 13:57:59",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "nguyenhoanganh2002",
    "github_project": "vnhtr",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "vnhtr"
}
        