<p align="center">
<a href="https://github.com/docling-project/docling-eval">
<img loading="lazy" alt="Docling" src="docs/assets/docling-eval-pic.png" width="40%"/>
</a>
</p>
# Docling-eval
[arXiv](https://arxiv.org/abs/2408.09869) · [PyPI](https://pypi.org/project/docling-eval/) · [uv](https://github.com/astral-sh/uv) · [Code style: black](https://github.com/psf/black) · [isort](https://pycqa.github.io/isort/) · [Pydantic](https://pydantic.dev) · [pre-commit](https://github.com/pre-commit/pre-commit) · [License: MIT](https://opensource.org/licenses/MIT)
Evaluate [Docling](https://github.com/docling-project/docling) on various datasets.
## Features
Evaluate Docling on various datasets via the CLI:
```shell
terminal %> docling-eval --help
Usage: docling_eval [OPTIONS] COMMAND [ARGS]...
Docling Evaluation CLI for benchmarking document processing tasks.
╭─ Options ────────────────────────────────────────────────────────────────────────────╮
│ --help Show this message and exit. │
╰──────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ───────────────────────────────────────────────────────────────────────────╮
│ create Create both ground truth and evaluation datasets in one step. │
│ create-eval Create evaluation dataset from existing ground truth. │
│ create-gt Create ground truth dataset only. │
│ evaluate Evaluate predictions against ground truth. │
│ visualize Visualize evaluation results. │
╰──────────────────────────────────────────────────────────────────────────────────────╯
```
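A typical workflow chains the commands above: build the datasets, evaluate, then visualize. The sketch below is illustrative only — the option names (`--benchmark`, `--output-dir`) and the benchmark identifier are assumptions; check `docling-eval <command> --help` for the actual flags.

```shell
# Illustrative workflow; option names are assumptions -- verify with
# `docling-eval <command> --help` before running.

# 1. Build ground-truth and prediction datasets in one step
docling-eval create --benchmark DPBench --output-dir ./dpbench

# 2. Evaluate predictions against the ground truth
docling-eval evaluate --benchmark DPBench --output-dir ./dpbench

# 3. Render visual reports of the evaluation results
docling-eval visualize --benchmark DPBench --output-dir ./dpbench
```

If you only need to regenerate predictions against an existing ground truth, `create-gt` and `create-eval` split the first step in two.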
## Benchmarks
- General
- [DP-Bench benchmarks](docs/DP-Bench_benchmarks.md): Text, layout, reading order and table structure evaluation on the DP-Bench dataset.
- [OmniDocBench benchmarks](docs/OmniDocBench_benchmarks.md): Text, layout, reading order and table structure evaluation on the OmniDocBench dataset.
- Layout
- [DocLayNetV1 Benchmarks](docs/DocLayNetv1_benchmarks.md): Text and layout evaluation on the DocLayNet v1.2 dataset.
- Table-Structure
- [FinTabnet Benchmarks](docs/FinTabNet_benchmarks.md): Table structure evaluation on the FinTabNet dataset.
- [PubTabNet benchmarks](docs/PubTabNet_benchmarks.md): Table structure evaluation on the PubTabNet dataset.
- [Pub1M benchmarks](docs/P1M_benchmarks.md): Table structure evaluation on the Pub1M dataset.
Benchmarks on our roadmap:
- [OmniOCR](https://github.com/getomni-ai/ocr-benchmark)
- Hyperscalers
- [CoMix](https://github.com/emanuelevivoli/CoMix/tree/main/docs/datasets)
- [DocVQA](https://huggingface.co/datasets/lmms-lab/DocVQA)
- [rd-tablebench](https://huggingface.co/datasets/reducto/rd-tablebench)
- [BigDocs-Bench](https://huggingface.co/datasets/ServiceNow/BigDocs-Bench)
## Contributing
Please read [Contributing to Docling](https://github.com/docling-project/docling/blob/main/CONTRIBUTING.md) for details.
## License
The Docling codebase is released under the MIT license.
For individual model usage, please refer to the model licenses found in the original packages.
## IBM ❤️ Open Source AI
Docling-eval has been brought to you by IBM.