# Nougat OCR loader
This loader extracts the text of an academic PDF, including its equations, symbols, and tables.
Pass the path of the PDF you want to parse as the `file` argument. The underlying Nougat OCR model understands LaTeX math and tables.
## Usage
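Here's an example of how to use `PDFNougatOCR`.
```python
from pathlib import Path

# Import path for the llama-index-readers-nougat-ocr package;
# older llama_hub releases used `from llama_hub.nougat_ocr import PDFNougatOCR`.
from llama_index.readers.nougat_ocr import PDFNougatOCR

reader = PDFNougatOCR()
# Replace with the path to the PDF you want to parse.
pdf_path = Path("/path/to/pdf")
documents = reader.load_data(pdf_path)
```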
## Miscellaneous
An `output` folder will be created, containing a file with the same name as the PDF and a `.mmd` extension.
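
To inspect the generated file directly, the naming rule above can be sketched as follows. This is a minimal sketch, not part of the loader: `expected_mmd_path` and `read_mmd` are hypothetical helpers, and the code assumes the `output` folder is created relative to the current working directory.

```python
from pathlib import Path
from typing import Optional


def expected_mmd_path(pdf_path: Path, output_dir: Path = Path("output")) -> Path:
    """Hypothetical helper: build the expected `.mmd` path for a given PDF.

    Assumes the file keeps the PDF's base name and lives in `output_dir`.
    """
    return output_dir / f"{pdf_path.stem}.mmd"


def read_mmd(pdf_path: Path, output_dir: Path = Path("output")) -> Optional[str]:
    """Hypothetical helper: return the `.mmd` text, or None if the file is missing."""
    mmd_path = expected_mmd_path(pdf_path, output_dir)
    if not mmd_path.is_file():
        return None
    return mmd_path.read_text(encoding="utf-8")
```

For example, `expected_mmd_path(Path("/path/to/paper.pdf"))` gives `output/paper.mmd`, and `read_mmd` returns `None` until the loader has actually produced that file.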
## Raw data
{
"_id": null,
"home_page": "",
"name": "llama-index-readers-nougat-ocr",
"maintainer": "mdarshad1000",
"docs_url": null,
"requires_python": ">=3.8.1,<4.0",
"maintainer_email": "",
"keywords": "academic papers,ocr,pdf",
"author": "Your Name",
"author_email": "you@example.com",
"download_url": "https://files.pythonhosted.org/packages/bf/6b/583f19985927bc30f65fad05e3a1b2ba2a7c94fb183ac6e95c388b63e151/llama_index_readers_nougat_ocr-0.1.3.tar.gz",
"platform": null,
"description": "# Nougat OCR loader\n\nThis loader reads the equations, symbols, and tables included in the PDF.\n\nUsers can input the path of the academic PDF document `file` which they want to parse. This OCR understands LaTeX math and tables.\n\n## Usage\n\nHere's an example usage of the PDFNougatOCR.\n\n```python\nfrom llama_hub.nougat_ocr import PDFNougatOCR\n\nreader = PDFNougatOCR()\n\npdf_path = Path(\"/path/to/pdf\")\n\ndocuments = reader.load_data(pdf_path)\n```\n\n## Miscellaneous\n\nAn `output` folder will be created with the same name as the pdf and `.mmd` extension.\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "llama-index readers nougat_ocr integration",
"version": "0.1.3",
"project_urls": null,
"split_keywords": [
"academic papers",
"ocr",
"pdf"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "5b4031888600f24367804b3f5acfd7ac3438ab7b53fe01741486f4afa8b2d22d",
"md5": "07299a7892361ca1b4f699372f708e9e",
"sha256": "bf4dfb8bb8c0678244b840fee7c72c412ae600acb653c90f6000a60b5a386536"
},
"downloads": -1,
"filename": "llama_index_readers_nougat_ocr-0.1.3-py3-none-any.whl",
"has_sig": false,
"md5_digest": "07299a7892361ca1b4f699372f708e9e",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8.1,<4.0",
"size": 2669,
"upload_time": "2024-02-21T20:30:11",
"upload_time_iso_8601": "2024-02-21T20:30:11.267104Z",
"url": "https://files.pythonhosted.org/packages/5b/40/31888600f24367804b3f5acfd7ac3438ab7b53fe01741486f4afa8b2d22d/llama_index_readers_nougat_ocr-0.1.3-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bf6b583f19985927bc30f65fad05e3a1b2ba2a7c94fb183ac6e95c388b63e151",
"md5": "e7a35e2e89eca3c59e4d5f8cd01ef863",
"sha256": "714a39d47e92a6165c9d1de4d7a8c4f1df65e0320c8b30e1c9065ec621ffdbf9"
},
"downloads": -1,
"filename": "llama_index_readers_nougat_ocr-0.1.3.tar.gz",
"has_sig": false,
"md5_digest": "e7a35e2e89eca3c59e4d5f8cd01ef863",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8.1,<4.0",
"size": 2435,
"upload_time": "2024-02-21T20:30:12",
"upload_time_iso_8601": "2024-02-21T20:30:12.710484Z",
"url": "https://files.pythonhosted.org/packages/bf/6b/583f19985927bc30f65fad05e3a1b2ba2a7c94fb183ac6e95c388b63e151/llama_index_readers_nougat_ocr-0.1.3.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-02-21 20:30:12",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "llama-index-readers-nougat-ocr"
}