<div align="center">
<h1>PdfItDown</h1>
<h2>Convert Everything to PDF</h2>
</div>
<br>
<div align="center">
<img src="https://raw.githubusercontent.com/AstraBert/PdfItDown/main/logo.png" alt="PdfItDown Logo">
</div>
> [!IMPORTANT]
> `markdown-pdf` is now implemented internally in `PdfItDown`, with support for `PyMuPDF` v1.25.1. Install version 0.0.4 or later of the package to avoid errors such as the one in [this issue](https://github.com/AstraBert/PdfItDown/issues/1).
**PdfItDown** is a Python package that relies on [`markitdown` by Microsoft](https://github.com/microsoft/markitdown/) and (a slightly modified version of) [`markdown_pdf`](https://github.com/vb64/markdown-pdf).
### Applicability
**PdfItDown** is applicable to the following file formats:
- Markdown
- PowerPoint
- Word
- Excel
- HTML
- Text-based formats (CSV, XML)
- ZIP files (iterates over contents)
### How does it work?
**PdfItDown** works in a very simple way:
- From **markdown** to PDF
```mermaid
graph LR
2(Input File) --> 3[Markdown content]
3[Markdown content] --> 4[markdown-pdf]
4[markdown-pdf] --> 5(PDF file)
```
- From other **text-based** file formats to PDF
```mermaid
graph LR
2(Input File) --> 3[markitdown]
3[markitdown] --> 4[Markdown content]
4[Markdown content] --> 5[markdown-pdf]
5[markdown-pdf] --> 6(PDF file)
```
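The two flows above differ only in whether `markitdown` runs first. Here is a minimal sketch of that routing decision, assuming the choice is made on the file extension. The function name `plan_conversion` and the extension check are illustrative and are not taken from the package's source:

```python
from pathlib import Path

# Assumption: Markdown input is recognised by its ".md" extension.
MARKDOWN_SUFFIXES = {".md"}


def plan_conversion(input_file: str) -> list[str]:
    """Return the stages a file would pass through, following the diagrams above."""
    suffix = Path(input_file).suffix.lower()
    if suffix in MARKDOWN_SUFFIXES:
        # Markdown is handed to markdown-pdf directly.
        return ["Markdown content", "markdown-pdf", "PDF file"]
    # Every other supported format is first turned into Markdown by markitdown.
    return ["markitdown", "Markdown content", "markdown-pdf", "PDF file"]


print(plan_conversion("README.md"))
print(plan_conversion("BusinessGrowth.xlsx"))
```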
### Installation and Usage
To install **PdfItDown**, just run:
```bash
python3 -m pip install pdfitdown
```
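Since the note at the top asks for version 0.0.4 or later, it can help to check what is actually installed. The following is a small sketch using only the standard library; the helper names `parse_release` and `meets_minimum` are hypothetical, and the comparison only handles plain dotted numeric versions such as `0.0.4`:

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(text: str) -> tuple[int, ...]:
    """Turn a plain dotted release string such as '0.0.4' into (0, 0, 4)."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed: str, minimum: str = "0.0.4") -> bool:
    """Compare two plain dotted release strings numerically."""
    return parse_release(installed) >= parse_release(minimum)


try:
    installed = version("pdfitdown")
except PackageNotFoundError:
    print("pdfitdown is not installed")
else:
    status = "is recent enough" if meets_minimum(installed) else "is too old, please upgrade"
    print(installed, status)
```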
You can now use the command line tool:
```
usage: pdfitdown [-h] -i INPUTFILE -o OUTPUTFILE [-t TITLE]

options:
  -h, --help            show this help message and exit
  -i INPUTFILE, --inputfile INPUTFILE
                        Path to the input file that needs to be converted to PDF
  -o OUTPUTFILE, --outputfile OUTPUTFILE
                        Path to the output PDF file
  -t TITLE, --title TITLE
                        Title to include in the PDF metadata. Default: 'PDF Title'
```
For example:
```bash
pdfitdown -i README.md -o README.pdf -t "README"
```
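To make the option semantics concrete, here is a sketch of an `argparse` parser that would produce the usage message shown above. It is reconstructed from the help text only and is an assumption about how such an interface could be declared, not the package's actual source; `build_parser` is a hypothetical name:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the three options listed in the help text above."""
    parser = argparse.ArgumentParser(prog="pdfitdown")
    # -i and -o appear without brackets in the usage line, so they are required.
    parser.add_argument(
        "-i", "--inputfile", required=True,
        help="Path to the input file that needs to be converted to PDF",
    )
    parser.add_argument(
        "-o", "--outputfile", required=True,
        help="Path to the output PDF file",
    )
    # -t is optional and falls back to the documented default.
    parser.add_argument(
        "-t", "--title", default="PDF Title",
        help="Title to include in the PDF metadata. Default: 'PDF Title'",
    )
    return parser


args = build_parser().parse_args(["-i", "README.md", "-o", "README.pdf", "-t", "README"])
print(args.inputfile, args.outputfile, args.title)
```

Omitting `-t` leaves `args.title` at `'PDF Title'`, which matches the default described in the help text.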
Or you can use it inside your Python scripts:
- To convert a **.pptx/.docx/.xlsx/.csv/.json/.xml/.html/.zip file to PDF**:
```python
from pdfitdown.pdfconversion import convert_to_pdf
output_pdf = convert_to_pdf(file_path="BusinessGrowth.xlsx", output_path="business_growth.pdf", title="Business Growth")
```
- To convert a **.md file to PDF**:
```python
from pdfitdown.pdfconversion import convert_markdown_to_pdf
output_pdf = convert_markdown_to_pdf(file_path="BusinessGrowth.md", output_path="business_growth.pdf", title="Business Growth")
```
In both examples, the output PDF is written to `business_growth.pdf`.
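When several files need converting, the calls above can be wrapped in a loop. Here is a minimal sketch, assuming the converter is passed in as a callable that accepts the same `file_path`, `output_path`, and `title` keyword arguments used above. The helper `batch_convert` and the titles derived from file names are hypothetical and not part of the package:

```python
import tempfile
from pathlib import Path
from typing import Callable, Iterable


def batch_convert(
    inputs: Iterable[str],
    output_dir: str,
    convert: Callable[..., object],
) -> list[Path]:
    """Call `convert` once per input and return the target PDF paths."""
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for name in inputs:
        source = Path(name)
        target = out_dir / f"{source.stem}.pdf"
        # Keyword names mirror the calls shown above; the title is derived
        # from the file name purely for illustration.
        convert(file_path=str(source), output_path=str(target), title=source.stem)
        outputs.append(target)
    return outputs


if __name__ == "__main__":
    # Stand-in converter so the sketch runs without the package installed.
    def fake_convert(file_path: str, output_path: str, title: str) -> str:
        print(f"{file_path} -> {output_path} ({title})")
        return output_path

    with tempfile.TemporaryDirectory() as tmp:
        batch_convert(["BusinessGrowth.xlsx", "Report.docx"], tmp, fake_convert)
```

With the package installed, `convert_to_pdf` from `pdfitdown.pdfconversion` could be passed as `convert` for non-Markdown inputs, and `convert_markdown_to_pdf` for `.md` files.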
### Contributing
Contributions are always welcome!
Find contribution guidelines at [CONTRIBUTING.md](https://github.com/AstraBert/PdfItDown/tree/main/CONTRIBUTING.md)
### License and Funding
This project is open-source and is provided under an [MIT License](https://github.com/AstraBert/PdfItDown/tree/main/LICENSE).
If you find it useful, please consider [funding it](https://github.com/sponsors/AstraBert).
Raw data
{
"_id": null,
"home_page": null,
"name": "pdfitdown",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.10",
"maintainer_email": null,
"keywords": null,
"author": null,
"author_email": "\"Clelia (Astra) Bertelli\" <astraberte9@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/3e/e6/1ffb9fdf51ba43f9b8d7d5f5a03e37b148ca83ada8e9ef62cb82953c8a3e/pdfitdown-0.0.4.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "PdfItDown - Convert Everything to PDF",
"version": "0.0.4",
"project_urls": {
"Homepage": "https://github.com/AstraBert/PdfItDown",
"Issues": "https://github.com/AstraBert/PdfItDown/issues"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "27d0b9790874e759661ffa3b7cd62ecc0dba75c470afd4f9f0b6c7b941566a5a",
"md5": "aa1ecd766b7a59f929d845f25658e8df",
"sha256": "689d18800c70f78b5b9a03d2ab0a6679446e24d134feaad229bf0f73703c07f7"
},
"downloads": -1,
"filename": "pdfitdown-0.0.4-py3-none-any.whl",
"has_sig": false,
"md5_digest": "aa1ecd766b7a59f929d845f25658e8df",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.10",
"size": 7073,
"upload_time": "2025-01-03T11:13:14",
"upload_time_iso_8601": "2025-01-03T11:13:14.692168Z",
"url": "https://files.pythonhosted.org/packages/27/d0/b9790874e759661ffa3b7cd62ecc0dba75c470afd4f9f0b6c7b941566a5a/pdfitdown-0.0.4-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "3ee61ffb9fdf51ba43f9b8d7d5f5a03e37b148ca83ada8e9ef62cb82953c8a3e",
"md5": "9909c7dd1dc83908ed4ff33e163a369e",
"sha256": "af6be5ec24a75c05daf7a90d14b69a09c3fb48f4d2961550828d15bd08d4ef89"
},
"downloads": -1,
"filename": "pdfitdown-0.0.4.tar.gz",
"has_sig": false,
"md5_digest": "9909c7dd1dc83908ed4ff33e163a369e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.10",
"size": 6821,
"upload_time": "2025-01-03T11:13:17",
"upload_time_iso_8601": "2025-01-03T11:13:17.331140Z",
"url": "https://files.pythonhosted.org/packages/3e/e6/1ffb9fdf51ba43f9b8d7d5f5a03e37b148ca83ada8e9ef62cb82953c8a3e/pdfitdown-0.0.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-01-03 11:13:17",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "AstraBert",
"github_project": "PdfItDown",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "pymupdf",
"specs": [
[
"==",
"1.25.1"
]
]
},
{
"name": "markdown-it-py",
"specs": [
[
"==",
"3.0.0"
]
]
},
{
"name": "markitdown",
"specs": []
},
{
"name": "termcolor",
"specs": []
}
],
"lcname": "pdfitdown"
}