imgocr


| Field | Value |
|-------|-------|
| Name | imgocr |
| Version | 0.1.3 |
| Home page | https://github.com/shibing624/imgocr |
| Summary | Image ocr tool, use ppocr onnx model. |
| Upload time | 2024-12-25 14:13:33 |
| Maintainer | None |
| Docs URL | None |
| Author | XuMing |
| Requires Python | >=3.6.0 |
| License | Apache License 2.0 |
| Keywords | ocr, image ocr, text recognition |
| Requirements | loguru, tqdm, shapely, numpy, pillow, pyclipper, requests, onnxruntime, opencv-python-headless |
| Travis CI | None |
| Coveralls test coverage | None |

[**🇨🇳中文**](https://github.com/shibing624/imgocr/blob/main/README.md) | [**🌐English**](https://github.com/shibing624/imgocr/blob/main/README_EN.md) | [**📖Docs**](https://github.com/shibing624/imgocr/wiki)

<div align="center">
  <a href="https://github.com/shibing624/imgocr">
    <img src="https://github.com/shibing624/imgocr/blob/main/docs/imgocr-logo.png" height="150" alt="Logo">
  </a>
</div>

-----------------

# imgocr: Image OCR toolkit
[![PyPI version](https://badge.fury.io/py/imgocr.svg)](https://badge.fury.io/py/imgocr)
[![Downloads](https://static.pepy.tech/badge/imgocr)](https://pepy.tech/project/imgocr)
[![Contributions welcome](https://img.shields.io/badge/contributions-welcome-brightgreen.svg)](CONTRIBUTING.md)
[![License Apache 2.0](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](LICENSE)
[![python_version](https://img.shields.io/badge/Python-3.6%2B-green.svg)](requirements.txt)
[![GitHub issues](https://img.shields.io/github/issues/shibing624/imgocr.svg)](https://github.com/shibing624/imgocr/issues)
[![Wechat Group](https://img.shields.io/badge/wechat-group-green.svg?logo=wechat)](#Contact)


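**imgocr**: a Python 3 package for Chinese/English OCR, built on the PaddleOCR-v4 ONNX model (~14 MB).

**imgocr** runs inference on the PaddleOCR-v4 ONNX model (~14 MB), which gives higher performance: accurate OCR predictions at millisecond-level latency on CPU, reaching open-source SOTA in general-purpose scenarios.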


## Showcase


| Scenario | OCR result |
|----------|------------|
| Bank receipt stub | ![Bank receipt stub](https://github.com/shibing624/imgocr/blob/main/examples/ocr_results/00111002.jpg) |
| Table | ![Table](https://github.com/shibing624/imgocr/blob/main/examples/ocr_results/00015504.jpg) |
| Train ticket | ![Train ticket](https://github.com/shibing624/imgocr/blob/main/examples/ocr_results/00056221.jpg) |
| English paper | ![English paper](https://github.com/shibing624/imgocr/blob/main/examples/ocr_results/eng_paper.png) |

## Benchmark

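The PP-OCRv4 pipeline chains a text detection model and a text recognition model. An input image first passes through the detection model, which returns all detected text boxes. Each text line is then cropped from the original image using its box coordinates and rectified. Finally, all text lines are fed to the recognition model, which produces the text results.

The overall flow is shown in the figure below: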

<img src="https://github.com/shibing624/imgocr/blob/main/docs/ppocrv4_framework.png" width="800" alt="ppocr-v4">
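
The detect, crop, recognize flow can be sketched in plain Python. This is a minimal illustration and not imgocr's or PaddleOCR's actual code: `crop_box`, `run_pipeline`, `detect`, and `recognize` are hypothetical names, the image is a nested list of pixel rows, and the crop uses the axis-aligned bounds of each box instead of the rectification step the real pipeline performs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]
Box = Sequence[Point]      # four [x, y] corners, as in the OCR output format
Image = List[List[int]]    # hypothetical image: rows of grayscale pixels


def crop_box(image: Image, box: Box) -> Image:
    """Crop the axis-aligned rectangle enclosing a four-point box.

    Simplification: the real pipeline also rectifies rotated or skewed text.
    """
    xs = [int(p[0]) for p in box]
    ys = [int(p[1]) for p in box]
    left, right = max(min(xs), 0), max(xs)
    top, bottom = max(min(ys), 0), max(ys)
    return [row[left:right] for row in image[top:bottom]]


def run_pipeline(
    image: Image,
    detect: Callable[[Image], List[Box]],
    recognize: Callable[[Image], Tuple[str, float]],
) -> List[Dict]:
    """Stage 1: detect all boxes. Stage 2: crop each text line and recognize it."""
    results = []
    for box in detect(image):
        text, score = recognize(crop_box(image, box))
        results.append({"box": box, "text": text, "score": score})
    return results


if __name__ == "__main__":
    # Stand-in models so the sketch runs without any OCR dependency.
    blank = [[0] * 8 for _ in range(6)]

    def fake_detect(img: Image) -> List[Box]:
        return [[[1, 1], [5, 1], [5, 3], [1, 3]]]

    def fake_recognize(crop: Image) -> Tuple[str, float]:
        return f"{len(crop[0])}x{len(crop)} crop", 1.0

    print(run_pipeline(blank, fake_detect, fake_recognize))
```

Passing the two models in as callables keeps the orchestration separate from the inference backend, which is the same separation the two-model design implies.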

OCR detection/recognition benchmark:

| Model | Detection mAP (%) | Recognition Acc (%) | GPU inference time (ms) | CPU inference time (ms) | Model size (MB) | Download |
|-------------------------|-----------|-----------|--------------|--------------|-----------|--------|
| PP-OCRv4-mobile (high efficiency, default) | 77.79     | 78.20     | 2.71         | 79.11        | 14        | [mobile-model](https://modelscope.cn/models/lili666/imgocr/summary) |
| PP-OCRv4-server (high accuracy) | 82.69     | 84.04     | 24.92        | 2742.31      | 207       | [server-model](https://modelscope.cn/models/lili666/imgocr/summary) |

> GPU inference time was measured on an NVIDIA Tesla T4 at FP32 precision; CPU inference time was measured on an Intel(R) Xeon(R) Gold 5117 CPU @ 2.00GHz at FP32 precision.

> The OCR evaluation set is a Chinese dataset built by PaddleOCR, covering street scenes, web images, documents, and handwriting. The recognition set contains 11,000 images and the detection set contains 500 images.
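
To make the trade-off between the two models concrete, the sketch below computes the accuracy gained and the extra cost paid when moving from the mobile model to the server model. The figures are copied from the benchmark table; the dictionary layout and the `compare` helper are illustrative only.

```python
# Figures copied from the benchmark table.
BENCHMARK = {
    "mobile": {"det_map": 77.79, "rec_acc": 78.20, "gpu_ms": 2.71, "cpu_ms": 79.11, "size_mb": 14},
    "server": {"det_map": 82.69, "rec_acc": 84.04, "gpu_ms": 24.92, "cpu_ms": 2742.31, "size_mb": 207},
}


def compare(base: dict, other: dict) -> dict:
    """Return accuracy gains (percentage points) and cost ratios of `other` vs `base`."""
    return {
        "det_map_gain": round(other["det_map"] - base["det_map"], 2),
        "rec_acc_gain": round(other["rec_acc"] - base["rec_acc"], 2),
        "gpu_slowdown": round(other["gpu_ms"] / base["gpu_ms"], 1),
        "cpu_slowdown": round(other["cpu_ms"] / base["cpu_ms"], 1),
        "size_ratio": round(other["size_mb"] / base["size_mb"], 1),
    }


print(compare(BENCHMARK["mobile"], BENCHMARK["server"]))
```

On these figures the server model gains about 5 to 6 percentage points of accuracy at roughly 35 times the CPU latency, which is why the mobile model is a sensible default for CPU-only deployments.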



## Demo

HuggingFace Demo: https://huggingface.co/spaces/shibing624/imgocr

![](https://github.com/shibing624/imgocr/blob/main/docs/imgocr_hf.png)

Run [examples/gradio_demo.py](https://github.com/shibing624/imgocr/blob/main/examples/gradio_demo.py) to see the demo:
```shell
python examples/gradio_demo.py
```

## Install

There is no need to install deep learning libraries such as paddlepaddle or paddleocr; installing onnxruntime is enough to use imgocr.

```shell
pip install onnxruntime # pip install onnxruntime-gpu for gpu
pip install imgocr
```

Or install from source (clone first, so that `requirements.txt` is available):

```shell
pip install onnxruntime # pip install onnxruntime-gpu for gpu
git clone https://github.com/shibing624/imgocr.git
cd imgocr
pip install -r requirements.txt
pip install --no-deps .
```
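
After installing, a quick check like the one below can confirm which packages are importable and, when onnxruntime is present, which execution providers it reports. `is_installed` and `report` are hypothetical helpers written for this sketch; `onnxruntime.get_available_providers()` is onnxruntime's own function.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


def report() -> dict:
    """Summarize whether imgocr and onnxruntime are available."""
    status = {name: is_installed(name) for name in ("imgocr", "onnxruntime")}
    if status["onnxruntime"]:
        import onnxruntime  # imported lazily so the check works without it

        # A GPU build typically lists "CUDAExecutionProvider" here.
        status["providers"] = onnxruntime.get_available_providers()
    return status


if __name__ == "__main__":
    print(report())
```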

## Usage

### OCR recognition

example: [examples/ocr_demo.py](https://github.com/shibing624/imgocr/blob/main/examples/ocr_demo.py)

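```python
from imgocr import ImgOcr

# use_gpu=False runs on CPU; is_efficiency_mode=True selects the bundled mobile model
m = ImgOcr(use_gpu=False, is_efficiency_mode=True)
result = m.ocr("data/11.jpg")
print("result:", result)
# Each item is a dict with 'box', 'text', and 'score' keys
for i in result:
    print(i['text'])
```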

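> `is_efficiency_mode`: whether to use the high-efficiency model. It defaults to `True`, which selects the high-efficiency model (mobile, 14 MB): faster, but slightly less accurate. This model ships with the package in the `imgocr/models` folder. For higher accuracy, set it to `False` to use the high-accuracy model (server, 207 MB), which the code downloads automatically into the `imgocr/models` folder.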

output:
```shell
result: [{'box': [[28.0, 37.0], [302.0, 39.0], [302.0, 72.0], [27.0, 70.0]], 'text': '纯臻营养护发素', 'score': 0.9978395700454712}, {'box': [[26.0, 83.0], [173.0, 83.0], [173.0, 104.0], [26.0, 104.0]], 'text': '产品信息/参数', 'score': 0.9898329377174377}, {'box': [[27.0, 112.0], [331.0, 112.0], [331.0, 135.0], [27.0, 135.0]], 'text': '(45元/每公斤,100公斤起订)', 'score': 0.9659210443496704}, {'box': [[25.0, 143.0], [281.0, 143.0], [281.0, 165.0], [25.0, 165.0]], 'text': '每瓶22元,1000瓶起订)', 'score': 0.9928666353225708}, {'box': [[26.0, 179.0], [300.0, 179.0], [300.0, 195.0], [26.0, 195.0]], 'text': '【品牌】:代加工方式/OEMODM', 'score': 0.9843945503234863}, {'box': [[26.0, 210.0], [234.0, 210.0], [234.0, 227.0], [26.0, 227.0]], 'text': '【品名】:纯臻营养护发素', 'score': 0.9963161945343018}, {'box': [[25.0, 239.0], [241.0, 239.0], [241.0, 259.0], [25.0, 259.0]], 'text': '【产品编号】:YM-X-3011', 'score': 0.9848018884658813}, {'box': [[413.0, 232.0], [430.0, 232.0], [430.0, 306.0], [413.0, 306.0]], 'text': 'ODMOEM', 'score': 0.9908049702644348}, {'box': [[24.0, 271.0], [180.0, 271.0], [180.0, 290.0], [24.0, 290.0]], 'text': '【净含量】:220ml', 'score': 0.9892324209213257}, {'box': [[26.0, 303.0], [251.0, 303.0], [251.0, 319.0], [26.0, 319.0]], 'text': '【适用人群】:适合所有肤质', 'score': 0.9909228682518005}, {'box': [[26.0, 335.0], [344.0, 335.0], [344.0, 352.0], [26.0, 352.0]], 'text': '【主要成分】:鲸蜡硬脂醇、燕麦β-葡聚', 'score': 0.9828647971153259}, {'box': [[26.0, 364.0], [281.0, 364.0], [281.0, 384.0], [26.0, 384.0]], 'text': '糖、椰油酰胺丙基甜菜碱、泛醌', 'score': 0.9505177140235901}, {'box': [[368.0, 368.0], [477.0, 368.0], [477.0, 389.0], [368.0, 389.0]], 'text': '(成品包材)', 'score': 0.992072343826294}, {'box': [[26.0, 397.0], [360.0, 397.0], [360.0, 414.0], [26.0, 414.0]], 'text': '【主要功能】:可紧致头发磷层,从而达到', 'score': 0.9904329180717468}, {'box': [[28.0, 429.0], [370.0, 429.0], [370.0, 445.0], [28.0, 445.0]], 'text': '即时持久改善头发光泽的效果,给干燥的头', 'score': 0.9874186515808105}, {'box': [[27.0, 458.0], [137.0, 458.0], [137.0, 479.0], [27.0, 479.0]], 'text': '发足够的滋养', 'score': 0.9987384676933289}]
纯臻营养护发素
产品信息/参数
(45元/每公斤,100公斤起订)
每瓶22元,1000瓶起订)
【品牌】:代加工方式/OEMODM
【品名】:纯臻营养护发素
【产品编号】:YM-X-3011
ODMOEM
【净含量】:220ml
【适用人群】:适合所有肤质
【主要成分】:鲸蜡硬脂醇、燕麦β-葡聚
糖、椰油酰胺丙基甜菜碱、泛醌
(成品包材)
【主要功能】:可紧致头发磷层,从而达到
即时持久改善头发光泽的效果,给干燥的头
发足够的滋养
```
![](https://github.com/shibing624/imgocr/blob/main/examples/ocr_results/11.jpg)
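
Because each result is a plain dict with `box`, `text`, and `score` keys, post-processing needs no extra dependencies. The sketch below drops low-confidence lines and orders the rest top to bottom, then left to right. `filter_and_sort` and the 0.98 threshold are illustrative choices, not part of imgocr.

```python
from typing import Dict, List


def filter_and_sort(result: List[Dict], min_score: float = 0.5) -> List[Dict]:
    """Keep items scoring at least `min_score`, ordered by first corner (y, then x)."""
    kept = [item for item in result if item["score"] >= min_score]
    # box[0] is the first corner; in the sample output it is the top-left [x, y] point.
    return sorted(kept, key=lambda item: (item["box"][0][1], item["box"][0][0]))


# Three entries copied from the sample output, deliberately out of order.
sample = [
    {'box': [[26.0, 83.0], [173.0, 83.0], [173.0, 104.0], [26.0, 104.0]], 'text': '产品信息/参数', 'score': 0.9898329377174377},
    {'box': [[28.0, 37.0], [302.0, 39.0], [302.0, 72.0], [27.0, 70.0]], 'text': '纯臻营养护发素', 'score': 0.9978395700454712},
    {'box': [[27.0, 112.0], [331.0, 112.0], [331.0, 135.0], [27.0, 135.0]], 'text': '(45元/每公斤,100公斤起订)', 'score': 0.9659210443496704},
]

for item in filter_and_sort(sample, min_score=0.98):
    print(item["text"])
```

With the 0.98 threshold the third entry (score about 0.966) is dropped and the remaining two print in reading order.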

### Command-line mode (CLI)

The CLI supports batch OCR over a directory of images.

code: [cli.py](https://github.com/shibing624/imgocr/blob/main/imgocr/cli.py)

```
> imgocr -h                                    
usage: cli.py [-h] --image_dir IMAGE_DIR [--output_dir OUTPUT_DIR]
              [--chunk_size CHUNK_SIZE] [--use_gpu USE_GPU]

imgocr cli

options:
  -h, --help            show this help message and exit
  --image_dir IMAGE_DIR
                        input image dir path, required
  --output_dir OUTPUT_DIR
                        output ocr result dir path, default outputs
  --chunk_size CHUNK_SIZE
                        chunk size, default 10
  --use_gpu USE_GPU     use gpu, default False
```

run:

```shell
pip install imgocr -U
imgocr --image_dir data
```

> `--image_dir` (required): the input image directory.
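
For batch jobs driven from Python rather than the CLI, a similar loop can be written by hand. The sketch below is an illustration under stated assumptions and not the code in `cli.py`: `list_images`, `iter_chunks`, `ocr_directory`, and the suffix list are hypothetical, and the OCR call is passed in as a function so the loop itself has no dependencies.

```python
from pathlib import Path
from typing import Callable, Dict, Iterator, List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image types


def list_images(image_dir: str) -> List[Path]:
    """Return image files directly inside `image_dir`, sorted by name."""
    return sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def iter_chunks(items: list, chunk_size: int = 10) -> Iterator[list]:
    """Yield consecutive slices of at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def ocr_directory(
    image_dir: str,
    ocr: Callable[[str], list],
    chunk_size: int = 10,
) -> Dict[str, list]:
    """Run `ocr` on every image in `image_dir`, one chunk at a time."""
    results = {}
    for chunk in iter_chunks(list_images(image_dir), chunk_size):
        for path in chunk:
            results[path.name] = ocr(str(path))
    return results


# Possible usage with the API shown earlier (requires imgocr to be installed):
# from imgocr import ImgOcr
# results = ocr_directory("data", ImgOcr(use_gpu=False, is_efficiency_mode=True).ocr)
```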

## Contact

- Issues (suggestions): [![GitHub issues](https://img.shields.io/github/issues/shibing624/imgocr.svg)](https://github.com/shibing624/imgocr/issues)
- Email: xuming (xuming624@qq.com)
- WeChat: add me (*WeChat ID: xuming624, with the note: name-company-NLP*) to join the NLP discussion group.

<img src="https://github.com/shibing624/imgocr/blob/main/docs/wechat.jpeg" width="200" />


## Citation

If you use imgocr in your research, please cite it in the following format:

APA:
```latex
Xu, M. imgocr: Image OCR toolkit (Version 0.0.1) [Computer software]. https://github.com/shibing624/imgocr
```

BibTeX:
```latex
@misc{imgocr,
  author = {Ming Xu},
  title = {imgocr: Image OCR toolkit},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/shibing624/imgocr}},
}
```

## License


imgocr is licensed under [The Apache License 2.0](LICENSE) and is free for commercial use. Please include a link to imgocr and the license in your product description.


## Contribute
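The project code is still rough. If you improve it, you are welcome to contribute your changes back to this project. Before submitting, please note the following two points:

 - Add corresponding unit tests in `tests`
 - Run all unit tests with `python -m pytest -v` and make sure they all pass

You can then submit a PR.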

## References
- [RapidOCR](https://github.com/RapidAI/RapidOCR)  
- [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)  
- [Paddle2ONNX](https://github.com/PaddlePaddle/Paddle2ONNX)
- [ppocr-onnx](https://github.com/triwinds/ppocr-onnx)

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/shibing624/imgocr",
    "name": "imgocr",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.6.0",
    "maintainer_email": null,
    "keywords": "ocr, image ocr, text recognition",
    "author": "XuMing",
    "author_email": "xuming624@qq.com",
    "download_url": "https://files.pythonhosted.org/packages/0f/36/6dfc6a5b0a488883d966491f13df31610dd5425738fbaf8c2ffd0ad0997a/imgocr-0.1.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache License 2.0",
    "summary": "Image ocr tool, use ppocr onnx model.",
    "version": "0.1.3",
    "project_urls": {
        "Homepage": "https://github.com/shibing624/imgocr"
    },
    "split_keywords": [
        "ocr",
        " image ocr",
        " text recognition"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0f366dfc6a5b0a488883d966491f13df31610dd5425738fbaf8c2ffd0ad0997a",
                "md5": "d9924690527a90e989ba20d136b2fa92",
                "sha256": "ecf1a1aaa7391886c2a63002bde37ac4fdb43d0626283b83cc0dbb4e8aea597a"
            },
            "downloads": -1,
            "filename": "imgocr-0.1.3.tar.gz",
            "has_sig": false,
            "md5_digest": "d9924690527a90e989ba20d136b2fa92",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.6.0",
            "size": 20869664,
            "upload_time": "2024-12-25T14:13:33",
            "upload_time_iso_8601": "2024-12-25T14:13:33.259314Z",
            "url": "https://files.pythonhosted.org/packages/0f/36/6dfc6a5b0a488883d966491f13df31610dd5425738fbaf8c2ffd0ad0997a/imgocr-0.1.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-12-25 14:13:33",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "shibing624",
    "github_project": "imgocr",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "loguru",
            "specs": []
        },
        {
            "name": "tqdm",
            "specs": []
        },
        {
            "name": "shapely",
            "specs": []
        },
        {
            "name": "numpy",
            "specs": []
        },
        {
            "name": "pillow",
            "specs": []
        },
        {
            "name": "pyclipper",
            "specs": []
        },
        {
            "name": "requests",
            "specs": []
        },
        {
            "name": "onnxruntime",
            "specs": []
        },
        {
            "name": "opencv-python-headless",
            "specs": [
                [
                    "~=",
                    "4.10.0.84"
                ]
            ]
        }
    ],
    "lcname": "imgocr"
}
        