python-pdf


:Name: python-pdf
:Version: 0.39
:Home page: https://github.com/tutorcruncher/pydf
:Summary: PDF generation in python using wkhtmltopdf suitable for heroku
:Upload time: 2021-11-10 10:19:35
:Author: Samuel Colvin
:License: MIT
:Requirements: No requirements were recorded.

pydf
====


|BuildStatus| |codecov| |PyPI| |license| |docker|

PDF generation in python using
`wkhtmltopdf <http://wkhtmltopdf.org/>`__.

wkhtmltopdf binaries are precompiled and included in the package, which makes
pydf easier to use. In particular, this means pydf works on Heroku.

Currently using **wkhtmltopdf 0.12.5 for Ubuntu 18.04 (bionic)**, requires **Python 3.6+**.

**If you're not on Linux amd64:** pydf comes bundled with a wkhtmltopdf binary which only works on Linux amd64
architectures. On another OS or architecture your mileage may vary: you will likely need to supply
your own wkhtmltopdf binary and point pydf towards it by setting the ``WKHTMLTOPDF_PATH`` environment variable.
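
As a rough sketch of how the variable could be set from Python rather than from the shell, the helper below looks for a wkhtmltopdf binary and exports its location. The function name ``point_pydf_at`` is hypothetical, and it is an assumption here that the variable should be set before pydf is imported.

```python
import os
import shutil


def point_pydf_at(binary_path=None):
    """Set WKHTMLTOPDF_PATH to binary_path, or to a wkhtmltopdf found on PATH.

    Hypothetical helper, not part of pydf.
    """
    path = binary_path or shutil.which('wkhtmltopdf')
    if path is None:
        raise FileNotFoundError('no wkhtmltopdf binary found; install one or pass its path')
    os.environ['WKHTMLTOPDF_PATH'] = path
    return path


# Assumption: call this before importing pydf, for example
# point_pydf_at('/usr/local/bin/wkhtmltopdf')  # example path, adjust to your system
```

Setting the variable in the shell or in your deployment configuration achieves the same thing.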

Install
-------

.. code:: shell

    pip install python-pdf

For Python 2, use ``pip install python-pdf==0.30.0``.

Basic Usage
-----------

.. code:: python

    import pydf
    pdf = pydf.generate_pdf('<h1>this is html</h1>')
    with open('test_doc.pdf', 'wb') as f:
        f.write(pdf)

Async Usage
-----------

Generating lots of documents with wkhtmltopdf can be slow, as wkhtmltopdf can only generate one document
per process. To get round this, pydf uses Python 3's asyncio ``create_subprocess_exec`` to generate multiple PDFs
at the same time, so the time taken to spin up processes doesn't slow you down.

.. code:: python

    import asyncio
    from pathlib import Path
    from pydf import AsyncPydf

    async def generate_async():
        apydf = AsyncPydf()

        async def gen(i):
            pdf_content = await apydf.generate_pdf('<h1>this is html</h1>')
            Path(f'output_{i:03}.pdf').write_bytes(pdf_content)

        coros = [gen(i) for i in range(50)]
        await asyncio.gather(*coros)

    loop = asyncio.get_event_loop()
    loop.run_until_complete(generate_async())


See `benchmark/run.py <https://github.com/tutorcruncher/pydf/blob/master/benchmark/run.py>`__
for a full example.

Locally generating an entire invoice goes from 0.372s/pdf to 0.035s/pdf with the async model.
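
To illustrate the idea behind the async model, here is a minimal sketch that starts several child processes concurrently with ``create_subprocess_exec``. It is not pydf's implementation: the child process is a stand-in Python one-liner rather than wkhtmltopdf, the names ``run_one`` and ``run_many`` are hypothetical, and ``asyncio.run`` assumes Python 3.7+.

```python
import asyncio
import sys


async def run_one(i):
    # Stand-in for one wkhtmltopdf invocation: a child Python process
    # that prints a line. Each call gets its own process.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-c', f'print("document {i}")',
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    return stdout


async def run_many(n):
    # gather() schedules all coroutines at once, so the child processes
    # run side by side instead of one after another.
    return await asyncio.gather(*(run_one(i) for i in range(n)))


if __name__ == '__main__':
    for output in asyncio.run(run_many(5)):
        print(output.decode().strip())
```

Because the event loop only waits on the children, process start-up costs overlap rather than add up.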

Docker
------

pydf is available as a docker image with a very simple http API for generating pdfs.

Simply ``POST`` (or ``GET`` with data if possible) your HTML data to ``/generate.pdf``.

Arguments can be passed using http headers; any header starting ``pdf-`` or ``pdf_`` will
have that prefix removed, be converted to lower case and passed to wkhtmltopdf.

For example:

.. code:: shell

   docker run --rm -p 8000:80 -d samuelcolvin/pydf
   curl -d '<h1>this is html</h1>' -H "pdf-orientation: landscape" http://localhost:8000/generate.pdf > created.pdf
   open "created.pdf"
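
The header handling described above could be sketched roughly as follows. This is an illustration of the stated rule only, not the server's actual code: the function name ``headers_to_options`` is hypothetical, and matching the prefix case-insensitively is an assumption.

```python
def headers_to_options(headers):
    """Collect wkhtmltopdf options from HTTP headers (hypothetical helper).

    Headers starting with "pdf-" or "pdf_" lose that prefix and are
    lower-cased; all other headers are ignored.
    """
    options = {}
    for name, value in headers.items():
        lowered = name.lower()
        # Assumption: the prefix is matched case-insensitively.
        if lowered.startswith(('pdf-', 'pdf_')):
            options[lowered[4:]] = value
    return options


print(headers_to_options({
    'Pdf-Orientation': 'landscape',
    'Content-Type': 'text/html',
}))
# {'orientation': 'landscape'}
```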

In docker compose:

.. code:: yaml

   services:
     pdf:
       image: samuelcolvin/pydf

Other services can then generate PDFs by making requests to ``pdf/generate.pdf``. Pretty cool.
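
A client in another service might look something like the sketch below, which uses only the standard library. The helper names ``build_pdf_request`` and ``fetch_pdf`` are hypothetical, the base URL assumes the compose service is named ``pdf`` and listens on port 80 as in the examples above, and ``pdf-orientation`` is the only header taken from the source.

```python
from urllib.request import Request, urlopen


def build_pdf_request(html, base_url='http://pdf', **options):
    """Build a POST request for /generate.pdf (hypothetical helper).

    Each keyword option is sent as a "pdf-" prefixed header.
    """
    request = Request(
        base_url.rstrip('/') + '/generate.pdf',
        data=html.encode('utf-8'),
        method='POST',
    )
    for name, value in options.items():
        request.add_header('pdf-' + name, str(value))
    return request


def fetch_pdf(html, **kwargs):
    # Sends the request and returns the response body (the PDF bytes).
    with urlopen(build_pdf_request(html, **kwargs)) as response:
        return response.read()


# Example, assuming the service is reachable:
# pdf_bytes = fetch_pdf('<h1>this is html</h1>', orientation='landscape')
```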

API
---

**generate\_pdf(source, [\*\*kwargs])**

Generate a pdf from either a url or a html string.

After the ``source`` argument, all other arguments are passed straight
to wkhtmltopdf.

For details on extra arguments see the output of get\_help() and
get\_extended\_help().

All arguments, whether specified or caught with extra\_kwargs, are
converted to command line args with ``'--' + original_name.replace('_', '-')``.

Arguments which are True are passed with no value, e.g. just ``--quiet``;
False and None arguments are omitted; everything else is passed with
``str(value)``.
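
The conversion rules above can be sketched as a small function. This is an illustration of the described behaviour, not pydf's own code, and the name ``to_cli_args`` is hypothetical.

```python
def to_cli_args(**kwargs):
    """Turn keyword arguments into wkhtmltopdf-style command line args."""
    args = []
    for name, value in kwargs.items():
        # False and None arguments are left out entirely.
        if value is None or value is False:
            continue
        flag = '--' + name.replace('_', '-')
        if value is True:
            # True means a bare flag with no value.
            args.append(flag)
        else:
            args.extend([flag, str(value)])
    return args


print(to_cli_args(quiet=True, grayscale=False, margin_top='10mm', image_dpi=600))
# ['--quiet', '--margin-top', '10mm', '--image-dpi', '600']
```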

**Arguments:**

-  ``source``: html string to generate pdf from or url to get
-  ``quiet``: bool
-  ``grayscale``: bool
-  ``lowquality``: bool
-  ``margin_bottom``: string eg. 10mm
-  ``margin_left``: string eg. 10mm
-  ``margin_right``: string eg. 10mm
-  ``margin_top``: string eg. 10mm
-  ``orientation``: Portrait or Landscape
-  ``page_height``: string eg. 10mm
-  ``page_width``: string eg. 10mm
-  ``page_size``: string: A4, Letter, etc.
-  ``image_dpi``: int default 600
-  ``image_quality``: int default 94
-  ``extra_kwargs``: any exotic extra options for wkhtmltopdf

Returns the generated PDF as bytes.

**get\_version()**

Get the version of pydf and of the wkhtmltopdf binary.

**get\_help()**

Get the help string from the wkhtmltopdf binary, using the ``-h`` command line option.

**get\_extended\_help()**

Get the extended help string from the wkhtmltopdf binary, using the ``-H`` command line
option.

**execute\_wk(\*args)**

Low level function to call wkhtmltopdf; arguments are added to the
wkhtmltopdf binary and passed to subprocess with no processing.

.. |BuildStatus| image:: https://travis-ci.org/tutorcruncher/pydf.svg?branch=master
   :target: https://travis-ci.org/tutorcruncher/pydf
.. |codecov| image:: https://codecov.io/github/tutorcruncher/pydf/coverage.svg?branch=master
   :target: https://codecov.io/github/tutorcruncher/pydf?branch=master
.. |PyPI| image:: https://img.shields.io/pypi/v/python-pdf.svg?style=flat
   :target: https://pypi.python.org/pypi/python-pdf
.. |license| image:: https://img.shields.io/pypi/l/python-pdf.svg
   :target: https://github.com/tutorcruncher/pydf
.. |docker| image:: https://img.shields.io/docker/automated/samuelcolvin/pydf.svg
   :target: https://hub.docker.com/r/samuelcolvin/pydf/


Heroku
-------

If you are deploying onto Heroku, you will need to install a couple of dependencies before wkhtmltopdf will work.

Add the Heroku buildpack ``https://buildpack-registry.s3.amazonaws.com/buildpacks/heroku-community/apt.tgz``.

Then create an ``Aptfile`` in your root directory with the dependencies:

.. code:: shell

   libjpeg62
   libc6



            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/tutorcruncher/pydf",
    "name": "python-pdf",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "",
    "author": "Samuel Colvin",
    "author_email": "s@muelcolvin.com",
    "download_url": "https://files.pythonhosted.org/packages/d9/e5/638921f9cb962e2a6bcbbab86ab6e2cf0391073c8874ffd6078dce596d22/python-pdf-0.39.tar.gz",
    "platform": "any",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "PDF generation in python using wkhtmltopdf suitable for heroku",
    "version": "0.39",
    "project_urls": {
        "Homepage": "https://github.com/tutorcruncher/pydf"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "073572684c47579d50649d12a54c6ec68ab42474f40decea401c10a738432516",
                "md5": "3f55dd04ba97733af4af531522f04203",
                "sha256": "3385a992ecc10a7261ae7df7175c675ff010d3486634dfa7ad4bc69de20849fb"
            },
            "downloads": -1,
            "filename": "python_pdf-0.39-py36-none-any.whl",
            "has_sig": false,
            "md5_digest": "3f55dd04ba97733af4af531522f04203",
            "packagetype": "bdist_wheel",
            "python_version": "py36",
            "requires_python": null,
            "size": 16810546,
            "upload_time": "2021-11-10T10:19:32",
            "upload_time_iso_8601": "2021-11-10T10:19:32.321990Z",
            "url": "https://files.pythonhosted.org/packages/07/35/72684c47579d50649d12a54c6ec68ab42474f40decea401c10a738432516/python_pdf-0.39-py36-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d9e5638921f9cb962e2a6bcbbab86ab6e2cf0391073c8874ffd6078dce596d22",
                "md5": "a5903e1461a4b414dff4f5c27328ddf0",
                "sha256": "dedbb63b9af02ccc0edaa013606cd82087238d2d9d67ca779696ce5427e4d343"
            },
            "downloads": -1,
            "filename": "python-pdf-0.39.tar.gz",
            "has_sig": false,
            "md5_digest": "a5903e1461a4b414dff4f5c27328ddf0",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 16758622,
            "upload_time": "2021-11-10T10:19:35",
            "upload_time_iso_8601": "2021-11-10T10:19:35.302213Z",
            "url": "https://files.pythonhosted.org/packages/d9/e5/638921f9cb962e2a6bcbbab86ab6e2cf0391073c8874ffd6078dce596d22/python-pdf-0.39.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2021-11-10 10:19:35",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "tutorcruncher",
    "github_project": "pydf",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "python-pdf"
}
        