<h3 align="center">ALTO Tools</h3>
<p align="center">
<a href="https://www.python.org/">:snake:</a> tools for performing various operations on <a href="http://www.loc.gov/standards/alto/">ALTO</a> XML files
</p>
<p align="center">
<!--<a href="https://pypi.org/project/alto-tools/"><img src="https://img.shields.io/pypi/v/alto-tools.svg" title="PyPI Version"></a>-->
<a href="https://github.com/cneud/alto-tools/actions/workflows/tests.yml"><img src="https://github.com/cneud/alto-tools/actions/workflows/tests.yml/badge.svg" title="GitHub Actions Tests Status"></a>
<a href="https://opensource.org/license/apache-2-0/"><img src="https://img.shields.io/github/license/cneud/alto-tools" title="Apache Software License 2.0"></a>
</p>

---

## Installation

Clone the repository, change into its directory, and run:
```bash
pip install .
```
## Usage
```bash
alto-tools <INPUT> [OPTION]
```
`INPUT` should be the path to an ALTO file or to a directory containing ALTO files.

Output is written to `stdout`.

| OPTION | Description |
|--------|:------------|
| `-t` `--text` | Extract UTF-8 encoded text content |
| `-c` `--confidence` | Extract mean OCR word confidence score |
| `-i` `--illustrations` | Extract bounding box coordinates of `<Illustration>` elements |
| `-g` `--graphics` | Extract bounding box coordinates of `<GraphicalElement>` elements |
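
To make the options above more concrete, here is a minimal Python sketch of what each kind of extraction could look like when done by hand with the standard library. It is not the implementation used by `alto-tools`, and its output format is not meant to match the tool's. The sample document and the helper names (`extract_text`, `mean_confidence`, `bounding_boxes`) are hypothetical. The sketch assumes that word text is stored in the `CONTENT` attribute of `<String>` elements, word confidence in their `WC` attribute, and bounding boxes in the `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes, as in the ALTO schema.

```python
import xml.etree.ElementTree as ET
from statistics import mean

# Hypothetical, heavily reduced ALTO document used only for illustration.
SAMPLE_ALTO = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v4#">
  <Layout>
    <Page>
      <PrintSpace>
        <TextBlock>
          <TextLine>
            <String CONTENT="Hello" WC="1.0"/>
            <SP/>
            <String CONTENT="world" WC="0.5"/>
          </TextLine>
        </TextBlock>
        <Illustration ID="i1" HPOS="10" VPOS="20" WIDTH="300" HEIGHT="400"/>
      </PrintSpace>
    </Page>
  </Layout>
</alto>"""

BOX_ATTRIBUTES = ("HPOS", "VPOS", "WIDTH", "HEIGHT")


def local_name(element):
    """Return the tag name without its namespace, so any ALTO version matches."""
    return element.tag.rsplit("}", 1)[-1]


def iter_named(root, name):
    """Yield all descendants of root (and root itself) whose local tag name equals name."""
    return (el for el in root.iter() if local_name(el) == name)


def extract_text(root):
    """Join the CONTENT of all <String> elements, one output line per <TextLine>."""
    lines = []
    for text_line in iter_named(root, "TextLine"):
        words = [s.get("CONTENT", "") for s in iter_named(text_line, "String")]
        lines.append(" ".join(words))
    return "\n".join(lines)


def mean_confidence(root):
    """Average the WC attribute over all <String> elements that carry one."""
    scores = [
        float(s.get("WC"))
        for s in iter_named(root, "String")
        if s.get("WC") is not None
    ]
    return mean(scores) if scores else None


def bounding_boxes(root, name):
    """Collect (HPOS, VPOS, WIDTH, HEIGHT) for every element with the given name."""
    boxes = []
    for el in iter_named(root, name):
        values = [el.get(attr) for attr in BOX_ATTRIBUTES]
        if all(v is not None for v in values):  # skip elements without a full box
            boxes.append(tuple(float(v) for v in values))
    return boxes


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_ALTO)
    print(extract_text(root))
    print(mean_confidence(root))
    print(bounding_boxes(root, "Illustration"))
    print(bounding_boxes(root, "GraphicalElement"))
```

Matching on the local tag name instead of a fixed namespace keeps the sketch independent of the ALTO schema version a file declares.
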

## Raw data

{
"_id": null,
"home_page": "",
"name": "alto-tools",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "",
"keywords": "optical character recognition,digital libraries",
"author": "Clemens Neudecker",
"author_email": "",
"download_url": "https://files.pythonhosted.org/packages/ac/70/135a76f2d514242093f32972ecfcee8fa5fe14618f6d69302ef68d717363/alto-tools-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "",
"summary": "Perform various operations on ALTO xml files",
"version": "0.1.0",
"project_urls": {
"Homepage": "https://github.com/cneud/alto-tools",
"Repository": "https://github.com/cneud/alto-tools.git"
},
"split_keywords": [
"optical character recognition",
"digital libraries"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "50a8397269efadf94ece214951c2678b06afd290fad091f5cf48c0cff5c92c13",
"md5": "3a460e80393f33bc46b47cb3e3cd3f90",
"sha256": "9015c74d0bd089da52ae6d8d69903c2b562061ef21803b1f39cec1587a468691"
},
"downloads": -1,
"filename": "alto_tools-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "3a460e80393f33bc46b47cb3e3cd3f90",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 9496,
"upload_time": "2023-09-25T11:36:19",
"upload_time_iso_8601": "2023-09-25T11:36:19.296025Z",
"url": "https://files.pythonhosted.org/packages/50/a8/397269efadf94ece214951c2678b06afd290fad091f5cf48c0cff5c92c13/alto_tools-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ac70135a76f2d514242093f32972ecfcee8fa5fe14618f6d69302ef68d717363",
"md5": "3979b7f66432b28401610be1037f86cd",
"sha256": "f95c288388015835c38e2d86afb31491be9adae9b60ea2fc0f3927ffaa8cfef7"
},
"downloads": -1,
"filename": "alto-tools-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "3979b7f66432b28401610be1037f86cd",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 9471,
"upload_time": "2023-09-25T11:36:21",
"upload_time_iso_8601": "2023-09-25T11:36:21.340722Z",
"url": "https://files.pythonhosted.org/packages/ac/70/135a76f2d514242093f32972ecfcee8fa5fe14618f6d69302ef68d717363/alto-tools-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-09-25 11:36:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "cneud",
"github_project": "alto-tools",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "alto-tools"
}