# Unstructured Expanded
The `unstructured_expanded` library wraps the open-source `unstructured` library and adds image-extraction capabilities to its API.
Its only purpose is to provide a more complete API for `unstructured`, since the maintainers of the open-source project
have chosen to lock image extraction for office documents behind a paywall.
## Quick-Start
This library is meant to be installed and used alongside the `unstructured` library, not in place of it.
Version numbers of this library match the `unstructured` version they are based on.
```shell
# Install unstructured with the extras for the document types you need
# (the quotes keep shells such as zsh from treating the brackets as a glob)
pip install "unstructured[all-docs]"
# Install the unstructured_expanded library on top of it
pip install unstructured_expanded
```
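Because the version numbers are meant to line up, a quick check after installation can confirm that the two packages are compatible. The following is a minimal sketch, not part of the library: the helper names (`base_release`, `versions_aligned`, `check_installed`) are hypothetical, and it assumes that a suffix such as `.post3` marks a wrapper-only release on top of the same `unstructured` base version.

```python
import re
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def base_release(ver: str) -> str:
    """Return the leading numeric release segment of a version string.

    Suffixes such as '.post3' are dropped (assumption: they mark
    wrapper-only releases). Epochs like '1!2.0' are not handled.
    """
    match = re.match(r"\d+(?:\.\d+)*", ver)
    if match is None:
        raise ValueError(f"unrecognised version string: {ver!r}")
    return match.group(0)


def versions_aligned(wrapper: str, upstream: str) -> bool:
    """True when both version strings share the same base release."""
    return base_release(wrapper) == base_release(upstream)


def check_installed() -> Optional[bool]:
    """Compare the installed packages; None if either one is missing."""
    try:
        return versions_aligned(
            version("unstructured-expanded"),
            version("unstructured"),
        )
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(check_installed())
```

If the check reports a mismatch, pinning `unstructured` to the base release of the installed `unstructured_expanded` version is a reasonable first step.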
## License
This library is released under the MIT license. See the [LICENSE](LICENSE) file for details.
## Citation
If you use this library in your research, please include a citation:
```bibtex
@misc{unstructured_expanded,
title={Unstructured_expanded: A Python Library for Extracting Text and Images from Documents using the unstructured API.},
author={Kogan, Isaac},
year={2024},
url={https://github.com/isaackogan/unstructured_expanded}
}
```
## Raw data
{
"_id": null,
"home_page": "https://github.com/isaackogan/unstructured_expanded",
"name": "unstructured-expanded",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": "nlp, natural language processing, text, documents, images, image extraction, pdf, docx, pptx, semantic, semantic analysis, semantic parsing, semantic extraction, unstructured",
"author": "Isaac Kogan",
"author_email": "info@isaackogan.com",
"download_url": "https://files.pythonhosted.org/packages/f5/c6/55932bc5cf99da7aa447757b2d8dd81dae07693b043980b9ae6476004f2b/unstructured_expanded-0.16.4.post3.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "Expansion to the unstructured package, adding support for image extraction.",
"version": "0.16.4.post3",
"project_urls": {
"Homepage": "https://github.com/isaackogan/unstructured_expanded"
},
"split_keywords": [
"nlp",
" natural language processing",
" text",
" documents",
" images",
" image extraction",
" pdf",
" docx",
" pptx",
" semantic",
" semantic analysis",
" semantic parsing",
" semantic extraction",
" unstructured"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "7b339f2d69936dd3495172fd508e1d7ee2c78e0d40e9c335f06174a2ef2a96d3",
"md5": "6c03cadafecf3cbfe8042981bf4050b8",
"sha256": "11bba6b3a158ef0996a314e4d5e2c4aa1568a99a3316a37c0cd564c8391f6b7b"
},
"downloads": -1,
"filename": "unstructured_expanded-0.16.4.post3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "6c03cadafecf3cbfe8042981bf4050b8",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 8423,
"upload_time": "2024-11-04T02:38:50",
"upload_time_iso_8601": "2024-11-04T02:38:50.090723Z",
"url": "https://files.pythonhosted.org/packages/7b/33/9f2d69936dd3495172fd508e1d7ee2c78e0d40e9c335f06174a2ef2a96d3/unstructured_expanded-0.16.4.post3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f5c655932bc5cf99da7aa447757b2d8dd81dae07693b043980b9ae6476004f2b",
"md5": "25072a1286c78ae308a61488bc12aa33",
"sha256": "1e414089592e4e20efd160941584acf3cc786d28e644a0af9f2920337ec83ce0"
},
"downloads": -1,
"filename": "unstructured_expanded-0.16.4.post3.tar.gz",
"has_sig": false,
"md5_digest": "25072a1286c78ae308a61488bc12aa33",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 6932,
"upload_time": "2024-11-04T02:38:51",
"upload_time_iso_8601": "2024-11-04T02:38:51.305519Z",
"url": "https://files.pythonhosted.org/packages/f5/c6/55932bc5cf99da7aa447757b2d8dd81dae07693b043980b9ae6476004f2b/unstructured_expanded-0.16.4.post3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-04 02:38:51",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "isaackogan",
"github_project": "unstructured_expanded",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "unstructured-expanded"
}