Name | Version | Summary | Date |
docling | 2.43.0 | SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications. | 2025-07-28 09:46:56 |
contextgem | 0.12.1 | Effortless LLM extraction from documents | 2025-07-27 20:11:08 |
docu-devs-api-client | 0.1.9 | A client library for accessing DocuDevs API | 2025-07-27 12:45:54 |
phantomtext | 0.1.1 | A toolkit for content injection, obfuscation, scanning, and sanitization of various document formats. If you use this library, please cite: Castagnaro et al. 'The Hidden Threat in Plain Text: Attacking RAG Data Loaders' (2025). | 2025-07-24 06:39:18 |
palimpzest | 0.7.20 | Palimpzest is a system which enables anyone to process AI-powered analytical queries simply by defining them in a declarative language | 2025-07-23 19:20:06 |
ocr-document-converter | 3.1.0 | Enterprise-grade OCR and document conversion tool with dual OCR engines | 2025-07-22 15:19:03 |
pdfalchemy | 0.1.0 | A Python library for advanced PDF manipulation and processing | 2025-07-19 13:36:21 |
txtify | 0.1.2 | A versatile Python tool to convert documents (PPTX, DOCX, PDF, XLSX) to plain text, ideal for providing context to AI code assistants like GitHub Copilot and Amazon CodeWhisperer. | 2025-07-17 03:24:47 |
attachments | 0.21.0 | The Python funnel for LLM context - turn any file into model-ready text + images, in one line. | 2025-07-14 03:31:59 |
docforge | 0.1.0 | Forge perfect documents from any format with precision, power, and simplicity | 2025-07-13 22:29:47 |
aspose-words | 25.7.0 | Aspose.Words for Python is a Document Processing library that allows developers to work with documents in many popular formats without needing Office Automation. | 2025-07-12 15:20:32 |
docling-ibm-models | 3.8.1 | This package contains the AI models used by the Docling PDF conversion package | 2025-07-10 12:45:29 |
chunknorris | 1.1.4 | A package for chunking documents from various formats | 2025-07-10 12:29:33 |
smartloop | 1.2.3 | Smartloop Command Line interface to process documents using LLM | 2025-02-09 03:55:57 |
llm-document-analysis | 0.1.1 | A Python library for LLM-powered document analysis and processing | 2025-02-08 17:23:15 |
mdocLib | 0.0.3 | mdoc, markdown as docs. well. I think that it's good idea. but watch out. your code can be stinks easly it you didn't write docstring yourself. | 2025-02-03 10:30:40 |
docling-google-ocr | 2.13.1 | SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications. | 2025-02-02 06:56:31 |
adf-lib | 0.2.1 | A Python library for creating and manipulating ADF (Atlassian Document Format) documents | 2025-01-28 05:02:37 |
docowling | 1.0.17 | SDK and CLI for parsing PDF, DOCX, HTML, and more, to a unified document representation for powering downstream workflows such as gen AI applications. | 2025-01-11 17:29:23 |
leadtools | 23.0.0.4 | Powered by patented artificial intelligence and machine learning algorithms, LEADTOOLS is a collection of comprehensive toolkits to integrate recognition, document, medical, imaging, and multimedia technologies into desktop, server, tablet, web and mobile solutions. | 2025-01-02 19:30:22 |