aws-textract-pipeline


:Name: aws-textract-pipeline
:Version: 0.4.1
:Home page: https://github.com/MacHu-GWU/aws_textract_pipeline-project
:Summary: Package short description.
:Upload time: 2024-04-23 17:22:52
:Maintainer: Sanhe Hu
:Author: Sanhe Hu
:Requires Python: >=3.8
:License: MIT
:Requirements: aws_textract, pynamodb, pynamodb_mate, boto_session_manager, s3pathlib, PyMuPDF, python-docx, openpyxl, python-pptx, pillow
            
.. image:: https://readthedocs.org/projects/aws-textract-pipeline/badge/?version=latest
    :target: https://aws-textract-pipeline.readthedocs.io/en/latest/
    :alt: Documentation Status

.. image:: https://github.com/MacHu-GWU/aws_textract_pipeline-project/workflows/CI/badge.svg
    :target: https://github.com/MacHu-GWU/aws_textract_pipeline-project/actions?query=workflow:CI

.. image:: https://codecov.io/gh/MacHu-GWU/aws_textract_pipeline-project/branch/main/graph/badge.svg
    :target: https://codecov.io/gh/MacHu-GWU/aws_textract_pipeline-project

.. image:: https://img.shields.io/pypi/v/aws-textract-pipeline.svg
    :target: https://pypi.python.org/pypi/aws-textract-pipeline

.. image:: https://img.shields.io/pypi/l/aws-textract-pipeline.svg
    :target: https://pypi.python.org/pypi/aws-textract-pipeline

.. image:: https://img.shields.io/pypi/pyversions/aws-textract-pipeline.svg
    :target: https://pypi.python.org/pypi/aws-textract-pipeline

.. image:: https://img.shields.io/badge/Release_History!--None.svg?style=social
    :target: https://github.com/MacHu-GWU/aws_textract_pipeline-project/blob/main/release-history.rst

.. image:: https://img.shields.io/badge/STAR_Me_on_GitHub!--None.svg?style=social
    :target: https://github.com/MacHu-GWU/aws_textract_pipeline-project

------

.. image:: https://img.shields.io/badge/Link-Document-blue.svg
    :target: https://aws-textract-pipeline.readthedocs.io/en/latest/

.. image:: https://img.shields.io/badge/Link-API-blue.svg
    :target: https://aws-textract-pipeline.readthedocs.io/en/latest/py-modindex.html

.. image:: https://img.shields.io/badge/Link-Install-blue.svg
    :target: `install`_

.. image:: https://img.shields.io/badge/Link-GitHub-blue.svg
    :target: https://github.com/MacHu-GWU/aws_textract_pipeline-project

.. image:: https://img.shields.io/badge/Link-Submit_Issue-blue.svg
    :target: https://github.com/MacHu-GWU/aws_textract_pipeline-project/issues

.. image:: https://img.shields.io/badge/Link-Request_Feature-blue.svg
    :target: https://github.com/MacHu-GWU/aws_textract_pipeline-project/issues

.. image:: https://img.shields.io/badge/Link-Download-blue.svg
    :target: https://pypi.org/pypi/aws-textract-pipeline#files


Welcome to ``aws_textract_pipeline`` Documentation
==============================================================================
.. image:: https://aws-textract-pipeline.readthedocs.io/en/latest/_static/aws_textract_pipeline-logo.png
    :target: https://aws-textract-pipeline.readthedocs.io/en/latest/

This project is a low-level implementation of the "Data Store Pipeline" component described in the `Intelligent Document Processing Platform Solution Design <https://dev-exp-share.readthedocs.io/en/latest/search.html?q=Intelligent+Document+Processing+Platform+Solution+Design&check_keywords=yes&area=default>`_.

Here, "low-level implementation" means that the implementation does not rely on AWS services and performs pure in-memory computation. It can therefore be deployed on any platform and is not limited to the AWS ecosystem: it can run as a batch job on virtual machines or containers, or serve real-time processing in an event-driven architecture.
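
The two deployment modes mentioned above can be illustrated with a minimal sketch. It does not use the ``aws_textract_pipeline`` API: ``Document``, ``process_document``, ``run_batch`` and ``handle_event`` are hypothetical names invented for this illustration, and the "processing" step is a placeholder. The point is only that one pure, in-memory function can be reused by both a batch loop and an event handler.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Dict, Iterable, List


    @dataclass
    class Document:
        """Hypothetical in-memory document; not a class from this package."""

        doc_id: str
        content: bytes


    def process_document(doc: Document) -> Dict[str, object]:
        """Hypothetical stand-in for a pure, in-memory processing step."""
        return {"doc_id": doc.doc_id, "size_in_bytes": len(doc.content)}


    def run_batch(docs: Iterable[Document]) -> List[Dict[str, object]]:
        """Batch mode: process many documents in one run (VM or container)."""
        return [process_document(doc) for doc in docs]


    def handle_event(event: Dict[str, object]) -> Dict[str, object]:
        """Event-driven mode: process the single document carried by an event."""
        doc = Document(doc_id=str(event["doc_id"]), content=bytes(event["content"]))
        return process_document(doc)

Because ``process_document`` touches no external service in this sketch, the same function can be called from a scheduled job or from an event handler without modification.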

For a usage example, see `test_pipeline.py <https://github.com/MacHu-GWU/aws_textract_pipeline-project/blob/main/debug/test_pipeline.py>`_.


.. _install:

Install
------------------------------------------------------------------------------

``aws_textract_pipeline`` is released on PyPI, so you can install it with ``pip``:

.. code-block:: console

    $ pip install aws-textract-pipeline

To upgrade to the latest version:

.. code-block:: console

    $ pip install --upgrade aws-textract-pipeline
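
To confirm which version ended up installed, one option is the standard library's ``importlib.metadata`` (available on Python 3.8 and later). The helper name ``installed_version`` below is an assumption made for this sketch; it is not part of the package.

.. code-block:: python

    from importlib.metadata import PackageNotFoundError, version
    from typing import Optional


    def installed_version(dist_name: str = "aws-textract-pipeline") -> Optional[str]:
        """Return the installed version of a distribution, or None if absent."""
        try:
            return version(dist_name)
        except PackageNotFoundError:
            return None


    if __name__ == "__main__":
        # Prints the installed version string, or None if not installed.
        print(installed_version())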

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/MacHu-GWU/aws_textract_pipeline-project",
    "name": "aws-textract-pipeline",
    "maintainer": "Sanhe Hu",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "husanhe@gmail.com",
    "keywords": null,
    "author": "Sanhe Hu",
    "author_email": "husanhe@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/09/a7/b0faf5521930d88f15eb95ad6ad19192ac01366a85a06bc9da6285b545ca/aws_textract_pipeline-0.4.1.tar.gz",
    "platform": "Windows",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Package short description.",
    "version": "0.4.1",
    "project_urls": {
        "Download": "https://pypi.python.org/pypi/aws_textract_pipeline/0.4.1#downloads",
        "Homepage": "https://github.com/MacHu-GWU/aws_textract_pipeline-project"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fe6fe0cf3189af5ce692809d38d846eb788c363b77046f8e66b031cb760ec86b",
                "md5": "f8f86f4322b03b5eb1596600891a62d1",
                "sha256": "18a896da559b2ed498c1c3910844a8bd0a615fe7e288916911033932a474921c"
            },
            "downloads": -1,
            "filename": "aws_textract_pipeline-0.4.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f8f86f4322b03b5eb1596600891a62d1",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 32587,
            "upload_time": "2024-04-23T17:22:45",
            "upload_time_iso_8601": "2024-04-23T17:22:45.982648Z",
            "url": "https://files.pythonhosted.org/packages/fe/6f/e0cf3189af5ce692809d38d846eb788c363b77046f8e66b031cb760ec86b/aws_textract_pipeline-0.4.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "09a7b0faf5521930d88f15eb95ad6ad19192ac01366a85a06bc9da6285b545ca",
                "md5": "85bb23f65deb0d0c26e353c6629a08ca",
                "sha256": "f3c1b77bf9457aeb4c58df4d35246edcc0f7913d2b32f824ad0b7cb237dec18b"
            },
            "downloads": -1,
            "filename": "aws_textract_pipeline-0.4.1.tar.gz",
            "has_sig": false,
            "md5_digest": "85bb23f65deb0d0c26e353c6629a08ca",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 32965,
            "upload_time": "2024-04-23T17:22:52",
            "upload_time_iso_8601": "2024-04-23T17:22:52.570751Z",
            "url": "https://files.pythonhosted.org/packages/09/a7/b0faf5521930d88f15eb95ad6ad19192ac01366a85a06bc9da6285b545ca/aws_textract_pipeline-0.4.1.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-23 17:22:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MacHu-GWU",
    "github_project": "aws_textract_pipeline-project",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [
        {
            "name": "aws_textract",
            "specs": [
                [
                    "<",
                    "1.0.0"
                ],
                [
                    ">=",
                    "0.2.1"
                ]
            ]
        },
        {
            "name": "pynamodb",
            "specs": [
                [
                    "<",
                    "6.0.0"
                ],
                [
                    ">=",
                    "5.5.1"
                ]
            ]
        },
        {
            "name": "pynamodb_mate",
            "specs": [
                [
                    "==",
                    "5.3.4.9"
                ]
            ]
        },
        {
            "name": "boto_session_manager",
            "specs": [
                [
                    ">=",
                    "1.7.2"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "s3pathlib",
            "specs": [
                [
                    "<",
                    "3.0.0"
                ],
                [
                    ">=",
                    "2.1.2"
                ]
            ]
        },
        {
            "name": "PyMuPDF",
            "specs": [
                [
                    ">=",
                    "1.23.26"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "python-docx",
            "specs": [
                [
                    ">=",
                    "1.0.1"
                ],
                [
                    "<",
                    "2.0.0"
                ]
            ]
        },
        {
            "name": "openpyxl",
            "specs": [
                [
                    ">=",
                    "3.0.10"
                ],
                [
                    "<",
                    "4.0.0"
                ]
            ]
        },
        {
            "name": "python-pptx",
            "specs": [
                [
                    "<",
                    "1.0.0"
                ],
                [
                    ">=",
                    "0.6.23"
                ]
            ]
        },
        {
            "name": "pillow",
            "specs": [
                [
                    ">=",
                    "9.5.0"
                ],
                [
                    "<",
                    "10.0.0"
                ]
            ]
        }
    ],
    "lcname": "aws-textract-pipeline"
}
        