amazon-textract-response-parser


Name: amazon-textract-response-parser
Version: 1.0.1
Home page: https://github.com/aws-samples/amazon-textract-response-parser
Summary: Easily parse JSON returned by Amazon Textract.
Upload time: 2023-08-30 12:08:59
Author: Amazon Rekognition Textract Demoes
Requires Python: >=3.8
License: Apache License Version 2.0
Keywords: amazon-textract-response-parser, trp, aws, amazon, textract, ocr, response, parser
# Textract Response Parser

You can use the Textract Response Parser library to easily parse the JSON returned by Amazon Textract. The library parses the JSON and provides programming-language-specific constructs to work with different parts of the document. [textractor](https://github.com/aws-samples/amazon-textract-textractor) is an example of a PoC batch processing tool that takes advantage of the Textract Response Parser library and generates output in multiple formats.

## Installation

```
python -m pip install amazon-textract-response-parser
```

## Pipeline and Serializer/Deserializer

### Serializer/Deserializer

Based on the [marshmallow](https://marshmallow.readthedocs.io/en/stable/) framework, the serializer/deserializer allows for creating an object representation of the Textract JSON response.

#### Deserialize Textract JSON
```python
# j holds the Textract JSON dict
from trp.trp2 import TDocument, TDocumentSchema
t_doc = TDocumentSchema().load(j)
```

#### Serialize Textract object to JSON
```python
from trp.trp2 import TDocument, TDocumentSchema
t_doc = TDocumentSchema().dump(t_doc)
```
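
A minimal round-trip sketch, assuming the Textract response was saved to a file beforehand (the file names are illustrative):

```python
import json
from trp.trp2 import TDocument, TDocumentSchema

# hypothetical file containing a saved Textract response
with open("textract_response.json") as f:
    j = json.load(f)

t_doc: TDocument = TDocumentSchema().load(j)    # deserialize into objects
round_tripped = TDocumentSchema().dump(t_doc)   # serialize back to a plain dict

with open("textract_response_roundtrip.json", "w") as f:
    json.dump(round_tripped, f)
```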

#### Deserialize Textract AnalyzeId JSON
```python
# j holds the Textract JSON as a string
import json
from trp.trp2_analyzeid import TAnalyzeIdDocument, TAnalyzeIdDocumentSchema
t_doc = TAnalyzeIdDocumentSchema().load(json.loads(j))
```
#### Serialize Textract AnalyzeId object to JSON
```python
from trp.trp2_analyzeid import TAnalyzeIdDocument, TAnalyzeIdDocumentSchema
t_doc = TAnalyzeIdDocumentSchema().dump(t_doc)
```
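
One way to obtain the AnalyzeId JSON in the first place is the boto3 Textract client; a minimal sketch (the boto3 call and the sample file name are illustrative and not part of this library):

```python
import boto3
from trp.trp2_analyzeid import TAnalyzeIdDocument, TAnalyzeIdDocumentSchema

textract = boto3.client("textract")
# hypothetical sample image of an identity document
with open("drivers_license.jpg", "rb") as f:
    response = textract.analyze_id(DocumentPages=[{"Bytes": f.read()}])

# the boto3 response is already a dict, so no json.loads is needed here
t_doc: TAnalyzeIdDocument = TAnalyzeIdDocumentSchema().load(response)
```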


### Pipeline 

We added some commonly requested features as easily consumable components that modify the Textract JSON schema and ideally don't require big changes to existing workflows.

#### Order blocks (WORDS, LINES, TABLE, KEY_VALUE_SET) by geometry y-axis

By default, Textract does not return the identified elements in any particular order in the JSON response.

The sample implementation ```order_blocks_by_geo``` shows how a function built on the serializer/deserializer can reorder the elements while maintaining the schema, so no change is necessary to integrate with existing processing.

```bash
# the sample code below makes use of the amazon-textract-caller
python -m pip install amazon-textract-caller
```

```python
from textractcaller.t_call import call_textract, Textract_Features
from trp.trp2 import TDocument, TDocumentSchema
from trp.t_pipeline import order_blocks_by_geo
import trp
import json

j = call_textract(input_document="path_to_some_document (PDF, JPEG, PNG)", features=[Textract_Features.FORMS, Textract_Features.TABLES])
# the t_doc is not ordered yet
t_doc = TDocumentSchema().load(j)
# the ordered_doc has elements ordered by y-coordinate (top to bottom of page)
ordered_doc = order_blocks_by_geo(t_doc)
# send to trp for further processing logic
trp_doc = trp.Document(TDocumentSchema().dump(ordered_doc))
```
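
As a quick check, the reordered document can be iterated with the regular trp API (continuing the snippet above); the lines now come back top to bottom:

```python
# lines print in reading order (top to bottom of each page)
for page in trp_doc.pages:
    for line in page.lines:
        print(line.text)
```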

#### Page orientation in degrees

Amazon Textract supports all in-plane document rotations. However, the response does not include a single number for the rotation in degrees; instead, each word and line has polygon points that can be used to calculate the degree of rotation. The following code adds this information as a custom field to the Amazon Textract JSON response.

```python
from trp.t_pipeline import add_page_orientation
import trp.trp2 as t2
import trp as t1

# assign the Textract JSON dict to j
j = <call_textract(input_document="path_to_some_document (PDF, JPEG, PNG)") or your JSON dict>
t_document: t2.TDocument = t2.TDocumentSchema().load(j)
t_document = add_page_orientation(t_document)

doc = t1.Document(t2.TDocumentSchema().dump(t_document))
# page orientation can be read now for each page
for page in doc.pages:
    print(page.custom['PageOrientationBasedOnWords'])
```

#### Using the pipeline on command line

The amazon-textract-response-parser package also includes a command-line tool to test pipeline components like add_page_orientation or order_blocks_by_geo.

Here is one example of the usage, in combination with the ```amazon-textract``` command from amazon-textract-helper and the ```jq``` tool (https://stedolan.github.io/jq/):

```bash
> amazon-textract --input-document "s3://somebucket/some-multi-page-pdf.pdf" | amazon-textract-pipeline --components add_page_orientation | jq '.Blocks[] | select(.BlockType=="PAGE") | .Custom'

{
  "Orientation": 7
}
{
  "Orientation": 11
}
...
{
  "Orientation": -7
}
{
  "Orientation": 0
}
```


#### Merge or link tables across pages

Sometimes tables start on one page and continue across the next page or pages. This component identifies whether that is the case, based on the number of columns and whether a header is present on the subsequent table, and can modify the output Textract JSON schema for downstream processing. Other custom logic can be developed for specific use cases.

MergeOptions.MERGE combines the tables and makes them appear as one for post-processing, with the drawback that the geometry information is no longer accurate. So overlaying with bounding boxes will not be accurate.

MergeOptions.LINK maintains the geometric structure and enriches the table information with links between the table elements: custom['previus_table'] and custom['next_table'] attributes are added to the TABLE blocks in the Textract JSON schema.

Usage is simple:

```python
from trp.t_pipeline import pipeline_merge_tables
# MergeOptions and HeaderFooterType are provided by the trp package; trp.t_tables is assumed here
from trp.t_tables import MergeOptions, HeaderFooterType
import trp.trp2 as t2

j = <call_textract(input_document="path_to_some_document (PDF, JPEG, PNG)") or your JSON dict>
t_document: t2.TDocument = t2.TDocumentSchema().load(j)
t_document = pipeline_merge_tables(t_document, MergeOptions.MERGE, None, HeaderFooterType.NONE)
```
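
A corresponding sketch for the LINK variant; it assumes MergeOptions and HeaderFooterType can be imported from trp.t_tables and that the trp2 TBlock objects expose snake_case attributes (block_type, custom, id):

```python
import json
from trp.t_pipeline import pipeline_merge_tables
from trp.t_tables import MergeOptions, HeaderFooterType
import trp.trp2 as t2

# hypothetical file containing a saved multi-page Textract response
with open("multi_page_response.json") as f:
    j = json.load(f)

t_document: t2.TDocument = t2.TDocumentSchema().load(j)
t_document = pipeline_merge_tables(t_document, MergeOptions.LINK, None, HeaderFooterType.NONE)

# inspect which tables were linked across pages
for block in t_document.blocks:
    if block.block_type == "TABLE" and block.custom:
        print(block.id, block.custom.get("previus_table"), block.custom.get("next_table"))
```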

Example of using it from the command line:

```bash
# from the root of the repository
cat src-python/tests/data/gib_multi_page_table_merge.json | amazon-textract-pipeline --components merge_tables | amazon-textract --stdin --pretty-print TABLES
# compare to cat src-python/tests/data/gib_multi_page_table_merge.json | amazon-textract --stdin --pretty-print TABLES
```

#### Add OCR confidence score to KEY and VALUE

For some use cases it can be useful to validate the confidence score for a given KEY or VALUE from an Analyze action with the FORMS feature.

The Confidence property of a 'KEY_VALUE_SET' block expresses the confidence that this particular prediction is a KEY or a VALUE, but not the confidence of the underlying text value.

Simplified example:

```json
{
    "Confidence": 95.5,
    "Geometry": {<...>},
    "Id": "v1",
    "Relationships": [{"Type": "CHILD", "Ids": ["c1"]}],
    "EntityTypes": ["VALUE"],
    "BlockType": "KEY_VALUE_SET"
},
{
    "Confidence": 99.2610092163086,
    "TextType": "PRINTED",
    "Geometry": {<...>},
    "Id": "c1",
    "Text": "2021-Apr-08",
    "BlockType": "WORD"
},
```

In this example, the confidence that the VALUE is an actual value in a key/value relationship is 95.5.

The confidence in the actual text representation is 99.2610092163086.
For simplicity the value in this example consists of just one word, but it is not limited to that and could contain multiple words.

The KV_OCR_Confidence pipeline component adds confidence scores for the underlying OCR to the JSON. After executing the component, the example JSON will look like this:

```json
{
    "Confidence": 95.5,
    "Geometry": {<...>},
    "Id": "v1",
    "Relationships": [{"Type": "CHILD", "Ids": ["c1"]}],
    "EntityTypes": ["VALUE"],
    "BlockType": "KEY_VALUE_SET",
    "Custom": {"OCRConfidence": {"mean": 99.2610092163086, "min": 99.2610092163086}}
},
{
    "Confidence": 99.2610092163086,
    "TextType": "PRINTED",
    "Geometry": {<...>},
    "Id": "c1",
    "Text": "2021-Apr-08",
    "BlockType": "WORD"
},
```

Usage is simple:

```python
from trp.t_pipeline import add_kv_ocr_confidence
import trp.trp2 as t2

j = <call_textract(input_document="path_to_some_document (PDF, JPEG, PNG)") or your JSON dict>
t_document: t2.TDocument = t2.TDocumentSchema().load(j)
t_document = add_kv_ocr_confidence(t_document)
# further processing
```
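
To inspect the added scores in code (continuing the snippet above), a small sketch that assumes the trp2 TBlock objects expose snake_case attributes (block_type, custom, id) and flags values with a low underlying OCR confidence:

```python
# hypothetical threshold below which a value should be reviewed manually
LOW_CONFIDENCE = 90.0

for block in t_document.blocks:
    if block.block_type == "KEY_VALUE_SET" and block.custom and "OCRConfidence" in block.custom:
        if block.custom["OCRConfidence"]["min"] < LOW_CONFIDENCE:
            print(f"low OCR confidence for block {block.id}: {block.custom['OCRConfidence']}")
```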

Example of using it from the command line and validating the output:

```bash
# from the root of the repository
cat "src-python/tests/data/employment-application.json" | amazon-textract-pipeline --components kv_ocr_confidence | jq '.Blocks[] | select(.BlockType=="KEY_VALUE_SET") '
```

# Parse JSON response from Textract

```python
from trp import Document

# response holds the JSON dict returned by the Amazon Textract API
doc = Document(response)

# Iterate over elements in the document
for page in doc.pages:
    # Print lines and words
    for line in page.lines:
        print("Line: {}--{}".format(line.text, line.confidence))
        for word in line.words:
            print("Word: {}--{}".format(word.text, word.confidence))

    # Print tables
    for table in page.tables:
        for r, row in enumerate(table.rows):
            for c, cell in enumerate(row.cells):
                print("Table[{}][{}] = {}-{}".format(r, c, cell.text, cell.confidence))

    # Print fields
    for field in page.form.fields:
        print("Field: Key: {}, Value: {}".format(field.key.text, field.value.text))

    # Get field by key
    key = "Phone Number:"
    field = page.form.getFieldByKey(key)
    if field:
        print("Field: Key: {}, Value: {}".format(field.key, field.value))

    # Search fields by key
    key = "address"
    fields = page.form.searchFieldsByKey(key)
    for field in fields:
        print("Field: Key: {}, Value: {}".format(field.key, field.value))

```
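
One way to obtain `response` for a single-page image is the boto3 Textract client; a minimal sketch (bucket and object names are placeholders, and multi-page documents require the asynchronous Textract APIs instead):

```python
import boto3
from trp import Document

textract = boto3.client("textract")
# hypothetical S3 location of a single-page image
response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "somebucket", "Name": "some-image.png"}},
    FeatureTypes=["TABLES", "FORMS"],
)
doc = Document(response)
```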

## Test

- Clone the repo and run pytest

```bash
git clone https://github.com/aws-samples/amazon-textract-response-parser.git
cd amazon-textract-response-parser
python -m venv virtualenv
source virtualenv/bin/activate

python -m pip install pip --upgrade
python -m pip install pytest
python -m pip install setuptools
python -m pip install tabulate
python src-python/setup.py install
pytest
```



## Other Resources

- [Large scale document processing with Amazon Textract - Reference Architecture](https://github.com/aws-samples/amazon-textract-serverless-large-scale-document-processing)
- [Batch processing tool](https://github.com/aws-samples/amazon-textract-textractor)
- [Code samples](https://github.com/aws-samples/amazon-textract-code-samples)

## License Summary

This sample code is made available under the Apache License Version 2.0. See the LICENSE file.



            
