PyDigger - unearthing stuff about Python


Name | Version | Summary | Date
document-data-extractor | 1.0.2 | Best open-source document to markdown extractor for LLM training data. Convert PDF, Word, PowerPoint, Excel, images, URLs to clean markdown, JSON, HTML locally. Alternative to Unstructured, Docling, Marker, MarkItDown, MinerU, PaddleOCR, Tesseract | 2025-07-28 12:27:30
markitdown-pdf-separators | 0.4.3 | MarkItDown with PDF page separators - convert PDFs to Markdown with page boundary markers | 2025-07-28 08:46:51
llm-data-converter | 2.2.0 | Best open-source document to markdown converter for LLM training data. Convert PDF, Word, PowerPoint, Excel, images, URLs to clean markdown, JSON, HTML locally. Alternative to Unstructured, Docling, Marker, MarkItDown, MinerU, PaddleOCR, Tesseract | 2025-07-25 13:32:07