| Name | any-document-extractor |
| Version | 0.1.2 |
| home_page | None |
| Summary | A Python library for extracting text content from any document format. |
| upload_time | 2025-10-17 12:39:34 |
| maintainer | None |
| docs_url | None |
| author | yeqing |
| requires_python | >=3.9 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |
# Any Document Extractor
A Python library for extracting text content from any document format.
## Features
- Supports multiple document formats (PPTX, DOCX, PDF, XLSX)
- Returns clean extracted text
## Installation
```bash
pip install any-document-extractor
```
## Usage
Basic usage example:
```python
from anydocumentextractor import DocumentExtractor


def main(fp: str):
    # Build an extractor for the given file path and return the extracted text.
    extra = DocumentExtractor(fp)
    return extra.extract()


if __name__ == '__main__':
    fp = 'text.docx'  # Can be any supported document
    content = main(fp)
    print(content)
```
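
The README does not document how the library reports failures, so callers may want a defensive wrapper. The following is a minimal sketch, not part of the library: `safe_extract` is a hypothetical helper, and it takes the extraction step as a callable so that it makes no assumptions about the library beyond the `DocumentExtractor(fp).extract()` call shown above. Because the exception types are unknown, it catches `Exception` broadly; the `str` return type is also an assumption.

```python
import sys
from pathlib import Path
from typing import Callable, Optional


def safe_extract(path: str, extract: Callable[[str], str]) -> Optional[str]:
    """Run `extract(path)` and return its result, or None on any failure.

    `extract` is whatever performs the real extraction; this helper only
    adds a file-existence check and broad error handling around it.
    """
    if not Path(path).is_file():
        print(f"not a file: {path}", file=sys.stderr)
        return None
    try:
        return extract(path)
    except Exception as exc:  # exception types are undocumented, so catch broadly
        print(f"extraction failed for {path}: {exc}", file=sys.stderr)
        return None
```

With the library installed, it could be called as `safe_extract('text.docx', lambda p: DocumentExtractor(p).extract())`.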
## Supported Formats
- Microsoft Office: PPTX, DOCX, XLSX
- OpenDocument: ODT, ODP
- PDF documents
- Plain text files
- And more...
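
A caller could pre-filter files by extension before handing them to the extractor. The sketch below is an illustration only: `SUPPORTED_EXTENSIONS` and `is_supported` are hypothetical names, the set mirrors the formats listed above, and `.txt` as the extension for plain text files is an assumption. The library may accept further formats ("And more...") or detect formats in a different way.

```python
from pathlib import Path

# Extensions derived from the list above; ".txt" for plain text is an assumption.
SUPPORTED_EXTENSIONS = {".pptx", ".docx", ".xlsx", ".odt", ".odp", ".pdf", ".txt"}


def is_supported(path: str) -> bool:
    """Return True if the file extension (case-insensitive) is in the listed set."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported('text.docx')` is true, while a path with no extension or an unlisted one is rejected.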
## License
MIT License - Free for commercial and personal use.
Raw data
{
"_id": null,
"home_page": null,
"name": "any-document-extractor",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": null,
"author": "yeqing",
"author_email": "215777@qq.com",
"download_url": "https://files.pythonhosted.org/packages/91/d8/ef6d95838766884021799d3027863c1dd4eb9b40110463adc97db43e02b3/any_document_extractor-0.1.2.tar.gz",
"platform": null,
"description": "# Any document Extractor\n\nA Python library for extracting text content from any document format.\n\n## Features\n\n- Supports multiple document formats (PPTX, DOCX, PDF, XLSX.)\n- Returns clean extracted text\n\n## Installation\n\n```bash\npip install any-document-extractor\n````\n\n\n\n## Usage\nBasic usage example:\n\n```python\n\nfrom anydocumentextractor import DocumentExtractor\n\n\ndef main(fp: str):\n extra = DocumentExtractor(fp)\n return extra.extract()\n\n\nif __name__ == '__main__':\n fp = 'text.docx' # Can be any supported document\n content = main(fp)\n print(content)\n\n```\n\n## Supported Formats\n- Microsoft Office: PPTX, DOCX, XLSX\n- OpenDocument: ODT, ODP\n- PDF documents\n- Plain text files\n- And more...\n\n## License\nMIT License - Free for commercial and personal use.\n\nYou can customize this further by adding:\n- More detailed installation instructions\n- Specific version requirements\n- Advanced usage examples\n- Error handling documentation\n- Contribution guidelines\n- Project status badges\n\n",
"bugtrack_url": null,
"license": null,
"summary": "A Python library for extracting text content from any document format.",
"version": "0.1.2",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "a4b423acc4da1c0f33723e1423c9d179628c4b7703d230a86439ab85507499ab",
"md5": "518d1c1159602a96aefae5f909cec44f",
"sha256": "f7a30983def65cd0f885930cc38ba16656df239dd4fd6c955c4c1ec8d86a652f"
},
"downloads": -1,
"filename": "any_document_extractor-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "518d1c1159602a96aefae5f909cec44f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.9",
"size": 2665,
"upload_time": "2025-10-17T12:39:33",
"upload_time_iso_8601": "2025-10-17T12:39:33.954365Z",
"url": "https://files.pythonhosted.org/packages/a4/b4/23acc4da1c0f33723e1423c9d179628c4b7703d230a86439ab85507499ab/any_document_extractor-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "91d8ef6d95838766884021799d3027863c1dd4eb9b40110463adc97db43e02b3",
"md5": "2435cb6af77760bb07a2abf101edd829",
"sha256": "14016da860e1e2ad41aecfed4b099117fc9eaa680afe153c1dd6004fac413b11"
},
"downloads": -1,
"filename": "any_document_extractor-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "2435cb6af77760bb07a2abf101edd829",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 2467,
"upload_time": "2025-10-17T12:39:34",
"upload_time_iso_8601": "2025-10-17T12:39:34.869896Z",
"url": "https://files.pythonhosted.org/packages/91/d8/ef6d95838766884021799d3027863c1dd4eb9b40110463adc97db43e02b3/any_document_extractor-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-17 12:39:34",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "any-document-extractor"
}
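
The `digests` entries in the raw data can be used to check a downloaded file before installing it. The following is a minimal sketch using only the standard library; `sha256_of` and `matches_expected` are hypothetical helper names, and the expected value is the `sha256` digest of the sdist copied from the record above.

```python
import hashlib

# SHA-256 of any_document_extractor-0.1.2.tar.gz, copied from the raw data above.
EXPECTED_SHA256 = "14016da860e1e2ad41aecfed4b099117fc9eaa680afe153c1dd6004fac413b11"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare the file's digest with the expected hex digest, ignoring case."""
    return sha256_of(path) == expected.lower()
```

After downloading the sdist, `matches_expected('any_document_extractor-0.1.2.tar.gz')` should return `True` if the file is intact.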