# selenium-print
`selenium-print` is an open-source Python package that enables printing HTML to PDF using the Selenium framework. This package is especially useful when you want to render dynamic content that requires a JS engine or ajax calls to a server or complex layouts using state-of-the-art CSS like grid, flexbox, media queries or transform.
Note: This library uses [selenium](https://github.com/SeleniumHQ/selenium) which in turn requires the installation of the Google Chrome browser as well as a chromedriver. This librarydoes not support any other browser at the moment. This makes it **not a lightweight solution** for generating PDF files. But it does make an easy to setup PDF generation solution that supports ALL browser functionality. Also note that starting a selenium browser is not very fast either, so if your priority is not ease-of-use, but performance, this is not the library for you. Although in benchmarks with [weasyprint](https://github.com/Kozea/WeasyPrint), which seems to also have one of the most impressive HTML/CSS support, this framework beat it by an average of 33% time decrease.
## Features
- Utilizes the full flexibility of the JS engine of the browser.
- Supports ajax calls to the server.
- Renders HTML to PDF.
- Easy to install and use.
## Installation
You can install `selenium-print` using pip:
```bash
pip install selenium-print
```
## Usage
### Simple usage
The simplest way to use selenium-print is to use the utility method that wraps all logic.
```python
from seleniumprint import file_to_pdf
input_html_file_path="/Users/user/path/to/input.html"
output_pdf_file_path="/Users/user/path/to/output.pdf"
file_to_pdf(input_html_file_path, output_pdf_file_path)
input_html_url="http://localhost:8000/report/1"
output_pdf_file_path="./report_pdfs/report_1.pdf"
url_to_pdf(input_html_url, output_pdf_file_path)
```
### Class-Based Usage
Another way to use SeleniumPDF is to call the url_to_pdf method with a URL and an optional output file path:
```python
from seleniumprint import SeleniumPDF
url = "https://www.example.com"
pdf_file_path = "example.pdf"
# Initialize SeleniumPDF
selenium_pdf = SeleniumPDF()
# Generate PDF from URL
selenium_pdf.url_to_pdf(url, pdf_file_path)`
```
In this example, `SeleniumPDF` is initialized with the default options and the `url_to_pdf` method is called with a URL and an output file path. The page at the specified URL is loaded and then converted to a PDF, which is saved to the specified file path, if an output file path was provided. If not it will return the raw bytes that you can work with.
### Custom Waiting
If you need to wait for some time after loading the page before converting it to a PDF, you can use the `load_page` and `convert_current_page_to_pdf` methods separately and add a sleep in between or some other kind of waiting logic, e.g. a JS based logic. Here's an example:
```python
from seleniumprint import SeleniumPDF
import time
url = "https://www.example.com"
pdf_file_path = "example.pdf"
# Initialize SeleniumPDF
selenium_pdf = SeleniumPDF()
# Load page and wait for 5 seconds
selenium_pdf.load_page(url)
time.sleep(5)
# Convert page to PDF and save to file
selenium_pdf.convert_current_page_to_pdf(pdf_file_path)
## Alternative JS-based waiting (needs appropriate HTML-side implementation)
selenium_pdf.load_page(url)
# wait manually until page is ready or some other condition is true
while selenium_pdf.driver.driver.execute_js("return window.readyState !== 'complete'"):
time.sleep(0.5)
# Convert page to pdf now
selenium_pdf.convert_current_page_to_pdf(pdf_file_path)
```
In this example, `load_page` is called to load the page at the specified URL, and then a 5-second wait is added using the `time.sleep` function. Finally, `convert_current_page_to_pdf` is called to convert the loaded page to a PDF and save it to the specified file path.
### Specifying chromedriver location
The constructor of `SeleniumPDF` as well as the utility method `file_to_pdf` and `url_to_pdf` take a chrome-driver specific argument called `chrome_driver_path`.
```python
# Build path to HTML file relative to current script
html_file_path = os.path.join(script_dir, "test.html")
# Build path to output PDF file
pdf_file_path = os.path.join(script_dir, "test_direct.pdf")
# Generate PDF from HTML File Path
file_to_pdf(html_file_path, pdf_file_path, chrome_driver_path=CHROME_DRIVER_PATH)
```
## Contributing
Contributions are welcome! Here are some ways you can contribute:
- Report issues and bugs.
- Suggest new features or enhancements.
- Submit pull requests for bug fixes or new features.
Before submitting a pull request, please make sure that your code follows PEP 8 guidelines and includes tests.
## Support
This is a personal project, so support may be limited. However, I will do my best to answer any issues that are reported.
## License
This project is published under MIT license which means its basically free to use for everyone in any way whatsoever.
Raw data
{
"_id": null,
"home_page": "https://github.com/bubblegumsoldier/selenium-print",
"name": "selenium-print",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "print,pdf,converter,selenium",
"author": "Henry M\u00fcssemann",
"author_email": "henry@muessemann.de",
"download_url": "",
"platform": null,
"description": "# selenium-print\n\n`selenium-print` is an open-source Python package that enables printing HTML to PDF using the Selenium framework. This package is especially useful when you want to render dynamic content that requires a JS engine or ajax calls to a server or complex layouts using state-of-the-art CSS like grid, flexbox, media queries or transform.\n\nNote: This library uses [selenium](https://github.com/SeleniumHQ/selenium) which in turn requires the installation of the Google Chrome browser as well as a chromedriver. This librarydoes not support any other browser at the moment. This makes it **not a lightweight solution** for generating PDF files. But it does make an easy to setup PDF generation solution that supports ALL browser functionality. Also note that starting a selenium browser is not very fast either, so if your priority is not ease-of-use, but performance, this is not the library for you. Although in benchmarks with [weasyprint](https://github.com/Kozea/WeasyPrint), which seems to also have one of the most impressive HTML/CSS support, this framework beat it by an average of 33% time decrease.\n\n## Features\n\n- Utilizes the full flexibility of the JS engine of the browser.\n- Supports ajax calls to the server.\n- Renders HTML to PDF.\n- Easy to install and use.\n\n## Installation\n\nYou can install `selenium-print` using pip:\n\n```bash\npip install selenium-print\n```\n\n## Usage\n\n### Simple usage\n\nThe simplest way to use selenium-print is to use the utility method that wraps all logic.\n\n```python\nfrom seleniumprint import file_to_pdf\n\ninput_html_file_path=\"/Users/user/path/to/input.html\"\noutput_pdf_file_path=\"/Users/user/path/to/output.pdf\"\n\nfile_to_pdf(input_html_file_path, output_pdf_file_path)\n\ninput_html_url=\"http://localhost:8000/report/1\"\noutput_pdf_file_path=\"./report_pdfs/report_1.pdf\"\n\nurl_to_pdf(input_html_url, output_pdf_file_path)\n```\n\n### Class-Based Usage\n\nAnother way to use SeleniumPDF is to call the url_to_pdf method with a URL and an optional output file path:\n\n```python\nfrom seleniumprint import SeleniumPDF\n\nurl = \"https://www.example.com\"\npdf_file_path = \"example.pdf\"\n\n# Initialize SeleniumPDF\nselenium_pdf = SeleniumPDF()\n\n# Generate PDF from URL\nselenium_pdf.url_to_pdf(url, pdf_file_path)`\n```\n\nIn this example, `SeleniumPDF` is initialized with the default options and the `url_to_pdf` method is called with a URL and an output file path. The page at the specified URL is loaded and then converted to a PDF, which is saved to the specified file path, if an output file path was provided. If not it will return the raw bytes that you can work with.\n\n### Custom Waiting\n\nIf you need to wait for some time after loading the page before converting it to a PDF, you can use the `load_page` and `convert_current_page_to_pdf` methods separately and add a sleep in between or some other kind of waiting logic, e.g. a JS based logic. Here's an example:\n\n```python\nfrom seleniumprint import SeleniumPDF\nimport time\n\nurl = \"https://www.example.com\"\npdf_file_path = \"example.pdf\"\n\n# Initialize SeleniumPDF\nselenium_pdf = SeleniumPDF()\n\n# Load page and wait for 5 seconds\nselenium_pdf.load_page(url)\ntime.sleep(5)\n\n# Convert page to PDF and save to file\nselenium_pdf.convert_current_page_to_pdf(pdf_file_path)\n\n## Alternative JS-based waiting (needs appropriate HTML-side implementation)\n\nselenium_pdf.load_page(url)\n\n# wait manually until page is ready or some other condition is true\nwhile selenium_pdf.driver.driver.execute_js(\"return window.readyState !== 'complete'\"):\n time.sleep(0.5)\n\n# Convert page to pdf now\nselenium_pdf.convert_current_page_to_pdf(pdf_file_path)\n\n```\n\nIn this example, `load_page` is called to load the page at the specified URL, and then a 5-second wait is added using the `time.sleep` function. Finally, `convert_current_page_to_pdf` is called to convert the loaded page to a PDF and save it to the specified file path.\n\n### Specifying chromedriver location\n\nThe constructor of `SeleniumPDF` as well as the utility method `file_to_pdf` and `url_to_pdf` take a chrome-driver specific argument called `chrome_driver_path`.\n\n```python\n# Build path to HTML file relative to current script\nhtml_file_path = os.path.join(script_dir, \"test.html\")\n\n# Build path to output PDF file\npdf_file_path = os.path.join(script_dir, \"test_direct.pdf\")\n\n# Generate PDF from HTML File Path\nfile_to_pdf(html_file_path, pdf_file_path, chrome_driver_path=CHROME_DRIVER_PATH)\n```\n\n## Contributing\n\nContributions are welcome! Here are some ways you can contribute:\n\n- Report issues and bugs.\n- Suggest new features or enhancements.\n- Submit pull requests for bug fixes or new features.\n\nBefore submitting a pull request, please make sure that your code follows PEP 8 guidelines and includes tests.\n\n## Support\n\nThis is a personal project, so support may be limited. However, I will do my best to answer any issues that are reported.\n\n## License\n\nThis project is published under MIT license which means its basically free to use for everyone in any way whatsoever.\n\n\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A minimal printing utility based on selenium and chrome that is able to render any HTML to PDF.",
"version": "1.0.1",
"split_keywords": [
"print",
"pdf",
"converter",
"selenium"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "aabe0eb11ed4c30c8441241daf54b0f40535f6bca01375ce7c10efdac4cc0087",
"md5": "f08ab2675c9adb1090cb77fc232f8f84",
"sha256": "1796adc5c57254ab499c752f73e9817513f516478f4119422a171f0cc336119a"
},
"downloads": -1,
"filename": "selenium_print-1.0.1-py3-none-any.whl",
"has_sig": false,
"md5_digest": "f08ab2675c9adb1090cb77fc232f8f84",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 6800,
"upload_time": "2023-03-28T14:20:55",
"upload_time_iso_8601": "2023-03-28T14:20:55.362992Z",
"url": "https://files.pythonhosted.org/packages/aa/be/0eb11ed4c30c8441241daf54b0f40535f6bca01375ce7c10efdac4cc0087/selenium_print-1.0.1-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-03-28 14:20:55",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "bubblegumsoldier",
"github_project": "selenium-print",
"travis_ci": false,
"coveralls": true,
"github_actions": true,
"requirements": [],
"lcname": "selenium-print"
}