# Butler DocAI
DocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications.
---
🚧 DocAI is still a work-in-progress and undergoing early development.
Our goal is to put modern machine-learning technology in the hands of the 20+ million developers in the world. If this excites you, we are looking for early adopters to come along for the ride!
## Requirements
Python >= 3.7
## Installation & Usage
To install DocAI with pip:
```sh
pip install docai-py
```
### System Dependencies
#### Mac
- Install [poppler](http://macappstore.org/poppler/)
#### Linux
- Install poppler-utils via your package manager
## Getting Started
Please follow the [installation procedure](#installation--usage) and then run the following:
```python
from docai import PredictionClient
# Get API Key from https://docs.butlerlabs.ai/reference/uploading-documents-to-the-rest-api#get-your-api-key
api_key = '<api-key>'
# Get Queue ID from https://docs.butlerlabs.ai/reference/uploading-documents-to-the-rest-api#go-to-the-model-details-page
queue_id = '<queue_id>'
# Path to a local JPEG, PNG, or PDF file
local_file_path = 'example.pdf'
extraction_results = PredictionClient(api_key).extract_document(queue_id, local_file_path)
print(extraction_results)
```
## Maintain
### Install Packages for Development
Install [poetry](https://python-poetry.org/docs/#installation) on your host machine
```sh
poetry install
```
### Butler REST API Codegen
To regenerate code updates to REST API:
```sh
openapi-python-client update --url https://app.butlerlabs.ai/api/docs-json --config codegen.yaml
```
### Running Unit Tests
To run all unit tests:
```sh
poetry run pytest -v -m unit_tests
```
Alternatively, you can also use VSCode's "Testing" tab to run/debug individual tests
### Adding a New Dependency
To add a new pip package dependency, see [poetry add](https://python-poetry.org/docs/cli/#add).
For versioning, it is best to use the minimum version that works, combined with `^`, `~`, or `>=` and `<` checks.
For example:
- `poetry add my-package@^1.2.3` is a shorthand for `>=1.2.3,<2.0.0`
- `poetry add my-package@~1.2.3` is a shorthand for `>=1.2.3,<1.3.0`
- `poetry add "my-package>=1.2.3,<4.5.6"`
For development only dependencies, make sure to include the `--dev` flag.
### Build and Publish
#### Build and Publish Setup
```sh
# setup for testpypi
poetry config repositories.testpypi https://test.pypi.org/legacy/
poetry config pypi-token.testpypi <testpypi token>
# setup for pypi
poetry config repositories.pypi https://upload.pypi.org/legacy/
poetry config pypi-token.pypi <pypi token>
```
#### Build and Publish Procedure
Update `pyproject.toml` and `docai/__init__.py` to have a new version number
```sh
# build packages
poetry build
# upload to test pypi
poetry publish -r testpypi
# test install from test pypi
pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple docai-py
# upload to real pypi
poetry publish -r pypi
```
Raw data
{
"_id": null,
"home_page": "https://butlerlabs.ai",
"name": "docai-py",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.7,<4.0",
"maintainer_email": "",
"keywords": "Document AI,Document Processing,OCR,LayoutLM",
"author": "Butler Labs",
"author_email": "support@butlerlabs.ai",
"download_url": "https://files.pythonhosted.org/packages/bf/61/b5c0a1e0441bc0cbc08420c727f258653fd280ce92308cf62e8ffe5035f6/docai_py-0.1.1.tar.gz",
"platform": null,
"description": "# Butler DocAI\n\nDocAI helps developers quickly build document, image and text processing pipelines using open source and cloud-based machine learning models for a wide range of applications.\n\n---\n\n\ud83d\udea7 DocAI is still a work-in-progress and undergoing early development.\n\nOur goal is to put modern machine-learning technology in the hands of the 20+ million developers in the world. If this excites you, we are looking for early adopters to come along for the ride!\n\n## Requirements\n\nPython >= 3.7\n\n## Installation & Usage\n\nTo install DocAI with pip:\n\n```sh\n\npip install docai-py\n\n```\n\n### System Dependencies\n\n#### Mac\n\n- Install [poppler](http://macappstore.org/poppler/)\n\n#### Linux\n\n- Install poppler-utils via your package manager\n\n## Getting Started\n\nPlease follow the [installation procedure](#installation--usage) and then run the following:\n\n```python\n\n\n\nfrom docai import PredictionClient\n\n\n\n# Get API Key from https://docs.butlerlabs.ai/reference/uploading-documents-to-the-rest-api#get-your-api-key\n\napi_key = '<api-key>'\n\n# Get Queue ID from https://docs.butlerlabs.ai/reference/uploading-documents-to-the-rest-api#go-to-the-model-details-page\n\nqueue_id = '<queue_id>'\n\n# Path to a local JPEG, PNG, or PDF file\n\nlocal_file_path = 'example.pdf'\n\n\n\nextraction_results = PredictionClient(api_key).extract_document(queue_id, local_file_path)\n\nprint(extraction_results)\n\n```\n\n## Maintain\n\n### Install Packages for Development\n\nInstall [poetry](https://python-poetry.org/docs/#installation) on your host machine\n\n```sh\n\npoetry install\n\n```\n\n### Butler REST API Codegen\n\nTo regenerate code updates to REST API:\n\n```sh\n\nopenapi-python-client update --url https://app.butlerlabs.ai/api/docs-json --config codegen.yaml\n\n```\n\n### Running Unit Tests\n\nTo run all unit tests:\n\n```sh\npoetry run pytest -v -m unit_tests\n```\n\nAlternatively, you can also use VSCode's \"Testing\" tab to run/debug individual tests\n\n\n### Adding a New Dependency\n\nTo add a new pip package dependency, see [poetry add](https://python-poetry.org/docs/cli/#add).\n\nFor versioning, it is best to use the minimum version that works, combined with `^`, `~`, or `>=` and `<` checks.\n\nFor example:\n\n- `poetry add my-package@^1.2.3` is a shorthand for `>=1.2.3,<2.0.0`\n\n- `poetry add my-package@~1.2.3` is a shorthand for `>=1.2.3,<1.3.0`\n\n- `poetry add \"my-package>=1.2.3,<4.5.6\"`\n\nFor development only dependencies, make sure to include the `--dev` flag.\n\n### Build and Publish\n\n#### Build and Publish Setup\n\n```sh\n\n# setup for testpypi\n\npoetry config repositories.testpypi https://test.pypi.org/legacy/\n\npoetry config pypi-token.testpypi <testpypi token>\n\n\n\n# setup for pypi\n\npoetry config repositories.pypi https://upload.pypi.org/legacy/\n\npoetry config pypi-token.pypi <pypi token>\n\n```\n\n#### Build and Publish Procedure\n\nUpdate `pyproject.toml` and `docai/__init__.py` to have a new version number\n\n```sh\n\n# build packages\n\npoetry build\n\n\n\n# upload to test pypi\n\npoetry publish -r testpypi\n\n\n\n# test install from test pypi\n\npip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple docai-py\n\n\n\n# upload to real pypi\n\npoetry publish -r pypi\n\n```\n",
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "Butler Doc AI",
"version": "0.1.1",
"split_keywords": [
"document ai",
"document processing",
"ocr",
"layoutlm"
],
"urls": [
{
"comment_text": "",
"digests": {
"md5": "5b07b5bad11ebda9695b0d6d7d30c73c",
"sha256": "cc7dbe4b04ccf7f015d532f8353dac44888539a0e3f02732241e185aaf0a6536"
},
"downloads": -1,
"filename": "docai_py-0.1.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "5b07b5bad11ebda9695b0d6d7d30c73c",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7,<4.0",
"size": 344257,
"upload_time": "2022-12-09T05:56:36",
"upload_time_iso_8601": "2022-12-09T05:56:36.371030Z",
"url": "https://files.pythonhosted.org/packages/a6/04/2e071c457acbfb605e0f64633b7343edf32040eb1f4001877b303c27a80a/docai_py-0.1.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"md5": "69050a2bd51a7bcb2cb58f226062da15",
"sha256": "90b81590ebfd342ac860e3f384b0ab0aa9310c77d742b6150adb4bf4ec939e1b"
},
"downloads": -1,
"filename": "docai_py-0.1.1.tar.gz",
"has_sig": false,
"md5_digest": "69050a2bd51a7bcb2cb58f226062da15",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7,<4.0",
"size": 133029,
"upload_time": "2022-12-09T05:56:39",
"upload_time_iso_8601": "2022-12-09T05:56:39.416338Z",
"url": "https://files.pythonhosted.org/packages/bf/61/b5c0a1e0441bc0cbc08420c727f258653fd280ce92308cf62e8ffe5035f6/docai_py-0.1.1.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2022-12-09 05:56:39",
"github": false,
"gitlab": false,
"bitbucket": false,
"lcname": "docai-py"
}