selenium-print

Name	selenium-print JSON
Version	1.0.1 JSON
	download
home_page	https://github.com/bubblegumsoldier/selenium-print
Summary	A minimal printing utility based on selenium and chrome that is able to render any HTML to PDF.
upload_time	2023-03-28 14:20:55
maintainer
docs_url	None
author	Henry Müssemann
requires_python
license	MIT
keywords	print pdf converter selenium
VCS
bugtrack_url
requirements	No requirements were recorded.
Travis-CI	No Travis.
coveralls test coverage

            # selenium-print

`selenium-print` is an open-source Python package that enables printing HTML to PDF using the Selenium framework. This package is especially useful when you want to render dynamic content that requires a JS engine or ajax calls to a server or complex layouts using state-of-the-art CSS like grid, flexbox, media queries or transform.

Note: This library uses [selenium](https://github.com/SeleniumHQ/selenium) which in turn requires the installation of the Google Chrome browser as well as a chromedriver. This librarydoes not support any other browser at the moment. This makes it **not a lightweight solution** for generating PDF files. But it does make an easy to setup PDF generation solution that supports ALL browser functionality. Also note that starting a selenium browser is not very fast either, so if your priority is not ease-of-use, but performance, this is not the library for you. Although in benchmarks with [weasyprint](https://github.com/Kozea/WeasyPrint), which seems to also have one of the most impressive HTML/CSS support, this framework beat it by an average of 33% time decrease.

## Features

- Utilizes the full flexibility of the JS engine of the browser.
- Supports ajax calls to the server.
- Renders HTML to PDF.
- Easy to install and use.

## Installation

You can install `selenium-print` using pip:

```bash
pip install selenium-print
```

## Usage

### Simple usage

The simplest way to use selenium-print is to use the utility method that wraps all logic.

```python
from seleniumprint import file_to_pdf

input_html_file_path="/Users/user/path/to/input.html"
output_pdf_file_path="/Users/user/path/to/output.pdf"

file_to_pdf(input_html_file_path, output_pdf_file_path)

input_html_url="http://localhost:8000/report/1"
output_pdf_file_path="./report_pdfs/report_1.pdf"

url_to_pdf(input_html_url, output_pdf_file_path)
```

### Class-Based Usage

Another way to use SeleniumPDF is to call the url_to_pdf method with a URL and an optional output file path:

```python
from seleniumprint import SeleniumPDF

url = "https://www.example.com"
pdf_file_path = "example.pdf"

# Initialize SeleniumPDF
selenium_pdf = SeleniumPDF()

# Generate PDF from URL
selenium_pdf.url_to_pdf(url, pdf_file_path)`
```

In this example, `SeleniumPDF` is initialized with the default options and the `url_to_pdf` method is called with a URL and an output file path. The page at the specified URL is loaded and then converted to a PDF, which is saved to the specified file path, if an output file path was provided. If not it will return the raw bytes that you can work with.

### Custom Waiting

If you need to wait for some time after loading the page before converting it to a PDF, you can use the `load_page` and `convert_current_page_to_pdf` methods separately and add a sleep in between or some other kind of waiting logic, e.g. a JS based logic. Here's an example:

```python
from seleniumprint import SeleniumPDF
import time

url = "https://www.example.com"
pdf_file_path = "example.pdf"

# Initialize SeleniumPDF
selenium_pdf = SeleniumPDF()

# Load page and wait for 5 seconds
selenium_pdf.load_page(url)
time.sleep(5)

# Convert page to PDF and save to file
selenium_pdf.convert_current_page_to_pdf(pdf_file_path)

## Alternative JS-based waiting (needs appropriate HTML-side implementation)

selenium_pdf.load_page(url)

# wait manually until page is ready or some other condition is true
while selenium_pdf.driver.driver.execute_js("return window.readyState !== 'complete'"):
  time.sleep(0.5)

# Convert page to pdf now
selenium_pdf.convert_current_page_to_pdf(pdf_file_path)

```

In this example, `load_page` is called to load the page at the specified URL, and then a 5-second wait is added using the `time.sleep` function. Finally, `convert_current_page_to_pdf` is called to convert the loaded page to a PDF and save it to the specified file path.

### Specifying chromedriver location

The constructor of `SeleniumPDF` as well as the utility method `file_to_pdf` and `url_to_pdf` take a chrome-driver specific argument called `chrome_driver_path`.

```python
# Build path to HTML file relative to current script
html_file_path = os.path.join(script_dir, "test.html")

# Build path to output PDF file
pdf_file_path = os.path.join(script_dir, "test_direct.pdf")

# Generate PDF from HTML File Path
file_to_pdf(html_file_path, pdf_file_path, chrome_driver_path=CHROME_DRIVER_PATH)
```

## Contributing

Contributions are welcome! Here are some ways you can contribute:

- Report issues and bugs.
- Suggest new features or enhancements.
- Submit pull requests for bug fixes or new features.

Before submitting a pull request, please make sure that your code follows PEP 8 guidelines and includes tests.

## Support

This is a personal project, so support may be limited. However, I will do my best to answer any issues that are reported.

## License

This project is published under MIT license which means its basically free to use for everyone in any way whatsoever.

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/bubblegumsoldier/selenium-print",
    "name": "selenium-print",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "print,pdf,converter,selenium",
    "author": "Henry M\u00fcssemann",
    "author_email": "henry@muessemann.de",
    "download_url": "",
    "platform": null,
    "description": "# selenium-print\n\n`selenium-print` is an open-source Python package that enables printing HTML to PDF using the Selenium framework. This package is especially useful when you want to render dynamic content that requires a JS engine or ajax calls to a server or complex layouts using state-of-the-art CSS like grid, flexbox, media queries or transform.\n\nNote: This library uses [selenium](https://github.com/SeleniumHQ/selenium) which in turn requires the installation of the Google Chrome browser as well as a chromedriver. This librarydoes not support any other browser at the moment. This makes it **not a lightweight solution** for generating PDF files. But it does make an easy to setup PDF generation solution that supports ALL browser functionality. Also note that starting a selenium browser is not very fast either, so if your priority is not ease-of-use, but performance, this is not the library for you. Although in benchmarks with [weasyprint](https://github.com/Kozea/WeasyPrint), which seems to also have one of the most impressive HTML/CSS support, this framework beat it by an average of 33% time decrease.\n\n## Features\n\n- Utilizes the full flexibility of the JS engine of the browser.\n- Supports ajax calls to the server.\n- Renders HTML to PDF.\n- Easy to install and use.\n\n## Installation\n\nYou can install `selenium-print` using pip:\n\n```bash\npip install selenium-print\n```\n\n## Usage\n\n### Simple usage\n\nThe simplest way to use selenium-print is to use the utility method that wraps all logic.\n\n```python\nfrom seleniumprint import file_to_pdf\n\ninput_html_file_path=\"/Users/user/path/to/input.html\"\noutput_pdf_file_path=\"/Users/user/path/to/output.pdf\"\n\nfile_to_pdf(input_html_file_path, output_pdf_file_path)\n\ninput_html_url=\"http://localhost:8000/report/1\"\noutput_pdf_file_path=\"./report_pdfs/report_1.pdf\"\n\nurl_to_pdf(input_html_url, output_pdf_file_path)\n```\n\n### Class-Based Usage\n\nAnother way to use SeleniumPDF is to call the url_to_pdf method with a URL and an optional output file path:\n\n```python\nfrom seleniumprint import SeleniumPDF\n\nurl = \"https://www.example.com\"\npdf_file_path = \"example.pdf\"\n\n# Initialize SeleniumPDF\nselenium_pdf = SeleniumPDF()\n\n# Generate PDF from URL\nselenium_pdf.url_to_pdf(url, pdf_file_path)`\n```\n\nIn this example, `SeleniumPDF` is initialized with the default options and the `url_to_pdf` method is called with a URL and an output file path. The page at the specified URL is loaded and then converted to a PDF, which is saved to the specified file path, if an output file path was provided. If not it will return the raw bytes that you can work with.\n\n### Custom Waiting\n\nIf you need to wait for some time after loading the page before converting it to a PDF, you can use the `load_page` and `convert_current_page_to_pdf` methods separately and add a sleep in between or some other kind of waiting logic, e.g. a JS based logic. Here's an example:\n\n```python\nfrom seleniumprint import SeleniumPDF\nimport time\n\nurl = \"https://www.example.com\"\npdf_file_path = \"example.pdf\"\n\n# Initialize SeleniumPDF\nselenium_pdf = SeleniumPDF()\n\n# Load page and wait for 5 seconds\nselenium_pdf.load_page(url)\ntime.sleep(5)\n\n# Convert page to PDF and save to file\nselenium_pdf.convert_current_page_to_pdf(pdf_file_path)\n\n## Alternative JS-based waiting (needs appropriate HTML-side implementation)\n\nselenium_pdf.load_page(url)\n\n# wait manually until page is ready or some other condition is true\nwhile selenium_pdf.driver.driver.execute_js(\"return window.readyState !== 'complete'\"):\n  time.sleep(0.5)\n\n# Convert page to pdf now\nselenium_pdf.convert_current_page_to_pdf(pdf_file_path)\n\n```\n\nIn this example, `load_page` is called to load the page at the specified URL, and then a 5-second wait is added using the `time.sleep` function. Finally, `convert_current_page_to_pdf` is called to convert the loaded page to a PDF and save it to the specified file path.\n\n### Specifying chromedriver location\n\nThe constructor of `SeleniumPDF` as well as the utility method `file_to_pdf` and `url_to_pdf` take a chrome-driver specific argument called `chrome_driver_path`.\n\n```python\n# Build path to HTML file relative to current script\nhtml_file_path = os.path.join(script_dir, \"test.html\")\n\n# Build path to output PDF file\npdf_file_path = os.path.join(script_dir, \"test_direct.pdf\")\n\n# Generate PDF from HTML File Path\nfile_to_pdf(html_file_path, pdf_file_path, chrome_driver_path=CHROME_DRIVER_PATH)\n```\n\n## Contributing\n\nContributions are welcome! Here are some ways you can contribute:\n\n- Report issues and bugs.\n- Suggest new features or enhancements.\n- Submit pull requests for bug fixes or new features.\n\nBefore submitting a pull request, please make sure that your code follows PEP 8 guidelines and includes tests.\n\n## Support\n\nThis is a personal project, so support may be limited. However, I will do my best to answer any issues that are reported.\n\n## License\n\nThis project is published under MIT license which means its basically free to use for everyone in any way whatsoever.\n\n\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A minimal printing utility based on selenium and chrome that is able to render any HTML to PDF.",
    "version": "1.0.1",
    "split_keywords": [
        "print",
        "pdf",
        "converter",
        "selenium"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "aabe0eb11ed4c30c8441241daf54b0f40535f6bca01375ce7c10efdac4cc0087",
                "md5": "f08ab2675c9adb1090cb77fc232f8f84",
                "sha256": "1796adc5c57254ab499c752f73e9817513f516478f4119422a171f0cc336119a"
            },
            "downloads": -1,
            "filename": "selenium_print-1.0.1-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f08ab2675c9adb1090cb77fc232f8f84",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 6800,
            "upload_time": "2023-03-28T14:20:55",
            "upload_time_iso_8601": "2023-03-28T14:20:55.362992Z",
            "url": "https://files.pythonhosted.org/packages/aa/be/0eb11ed4c30c8441241daf54b0f40535f6bca01375ce7c10efdac4cc0087/selenium_print-1.0.1-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-28 14:20:55",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "bubblegumsoldier",
    "github_project": "selenium-print",
    "travis_ci": false,
    "coveralls": true,
    "github_actions": true,
    "requirements": [],
    "lcname": "selenium-print"
}

Henry Müssemann