doc-workflow


:Name: doc-workflow
:Version: 0.1.2a4
:Home page: https://github.com/iulica/doc-workflow
:Summary: A Python Document Management Framework for generating and sending (pdf, docx, etc) documents to customers
:Upload time: 2025-11-01 20:53:00
:Author: Iulian Ciorăscu
:License: MIT
:Keywords: docx, pdf, split, watermark, email, mailmerge, qrbill, xlsx
:Requirements: PyPDF2, cairosvg, qrbill, openpyxl, svglib, reportlab, docx-mailmerge2, docx2pdf, gspread
            
=================
Document Workflow
=================

.. image:: https://badge.fury.io/py/doc-workflow.png
    :alt: PyPI
    :target: https://pypi.python.org/pypi/doc-workflow

Creates, merges, splits, and edits documents (mainly docx/pdf) and sends them by email.
Originally created for QR bill integration, it is generic and can be used for much more.


Installation
============

Installation with ``pip``:
::

    $ pip install doc-workflow


Usage
=====

From the command line:
::

    $ docwf <path_to_json_config_file>

From Python:
::

    from docwf import DocWorkflow

    config_obj = {
        "globals": {
            "data": {
                "workbook": "source.xlsx",
                "sheet": "mailmergesheet",
            },
            "constants": {
                "language": "fr"
            }
        },
        "tasks": [
            {
                "active": 1, # you can activate/deactivate tasks
                "name": "create bills", # name for debug purpose
                "locals": {
                    "data" : {
                        "sheet": "overridesheetfortask"
                    },
                    "key" : "value", # overrides global arguments for the task
                },
                "task": {
                    "type": "myplugin", # or builtin plugins (see below)
                    "task_dependent_argument": "value{param}",
                }
            },
        ]
    }
    my_plugins = {
        "myplugin": MyPluginClass
    }
    DocWorkflow(config_obj, plugins=my_plugins).gen()
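
The ``locals`` block of a task overrides values from ``globals``. A minimal sketch of how such an override could be resolved, assuming nested dictionaries such as ``data`` are merged key by key (the helper ``merge_settings`` is hypothetical and not part of ``docwf``):

```python
from copy import deepcopy


def merge_settings(global_settings, local_settings):
    """Return a copy of global_settings with local_settings layered on top.

    Hypothetical helper: nested dicts are merged recursively, any other
    value in local_settings replaces the global one.
    """
    merged = deepcopy(global_settings)
    for key, value in local_settings.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


global_settings = {
    "data": {"workbook": "source.xlsx", "sheet": "mailmergesheet"},
    "constants": {"language": "fr"},
}
local_settings = {"data": {"sheet": "overridesheetfortask"}, "key": "value"}

effective = merge_settings(global_settings, local_settings)
print(effective["data"])
```

In this sketch the task keeps the global workbook but reads from its own sheet, and the global settings stay unchanged for the next task.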

Typical workflow tasks
======================

Assume the data is in ``source.xlsx``, in the sheet named ``bills``:

========  ============  ==========  ======  =========  ======
clientnr  email         send_email  total   reference  etc 
========  ============  ==========  ======  =========  ======
1         c1@gmail.com     yes       1032   ref2022c1    ...
2         c2@gmail.com     yes       1232   ref2022c2    ...
========  ============  ==========  ======  =========  ======


Create bills from Word template
-------------------------------
::

    {
        "active": 1, # you can activate/deactivate tasks
        "name": "create bills", # name for debug purpose
        "task": {
            "type": "mailmerge",
            "input_docx": "templates/template_bill.docx",
            "output_docx": "bills/bill_{year}.docx" # output depends on the column year, it should be constant throughout all rows
        }
    },
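
Values such as ``bills/bill_{year}.docx`` contain placeholders that are filled from the columns of the data sheet. The sketch below illustrates the idea with Python's ``str.format``; that ``docwf`` expands placeholders exactly this way is an assumption, and the ``year`` value in the sample row is made up for illustration.

```python
def expand(template, row):
    """Fill {column} placeholders in template from a data row (a dict)."""
    return template.format(**row)


# Sample row: clientnr comes from the table above, year is an assumed column.
row = {"clientnr": 1, "year": 2022}

print(expand("bills/bill_{year}.docx", row))
print(expand("bills/bills_{year}/bill_{year}_{clientnr}.pdf", row))
```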

Create pdf from the generated docx
-----------------------------------

This task uses the Microsoft Word application (macOS/Windows).
If the docx template has dynamic fields (IF, etc.),
Word will ask for permission to update
all fields before saving the document as pdf.
::

    {
        "name": "save pdf from docx (uses Word)",
        "task": {
            "type": "makepdf",
            "input_docx": "bills/bill_{year}.docx",
            "output_pdf": "bills/bill_{year}.pdf"
        }
    },


Fill in QR codes
-------------------------------

Adds QR codes to the bills, either by appending a page to each bill or by merging the QR bill into one of the pages.
::

    {
        "name": "create qr bills",
        "locals": {
            "creditor": {
                "iban": "CH....",
                "name": "The Good Company",
                "pcode": "xyzt",
                "city": "Bern",
                "street": "Dorfstrasse 1"
            },
            "task_params": {
                "extra_infos": "reference", # fixed keys for bill reason ...
                "amount": "total"   # and the amount. With task_params you can create data entries out of existing columns
            }
        },
        "task": {
            "type": "qr",
            "merge_type": "merge", # or "append"
            "input_filename": "bills/bill_{year}.pdf",
            "delete_input": true, # delete the input filename after creating the output
            "pages": 2, # the number of pages per each bill
            "merge_pos": 2, # or "insert_pos" if "append"
            "output_filename": "bills/bill_{year}_with_qr.pdf"
        }
    },
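
``task_params`` maps a parameter name expected by the task to the column that holds its value. A minimal sketch of that mapping, assuming it is a plain key-to-column lookup per row (``apply_task_params`` is a hypothetical helper, not the library's API):

```python
def apply_task_params(row, task_params):
    """Return a copy of row extended with entries named after task_params keys."""
    extended = dict(row)
    for param_name, column_name in task_params.items():
        extended[param_name] = row[column_name]
    return extended


row = {"clientnr": 1, "total": 1032, "reference": "ref2022c1"}
task_params = {"extra_infos": "reference", "amount": "total"}

print(apply_task_params(row, task_params))
```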

Split the bills into separate pdf files
------------------------------------------

From one input to multiple outputs:
::

    {
        "name": "split bills",
        "task": {
            "type": "split_pdf",
            "input_filename": "bills/bill_{year}_with_qr.pdf",
            "pages": 2,
            "makedir": "bills/bills_{year}", # if the output directory doesn't exist, create it
            "output_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf" # output filename using unique name for each customer
        }
    },
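
With ``"pages": 2`` every bill occupies two consecutive pages of the input file. The following sketch computes which pages belong to which row, assuming the bills appear in the same order as the rows of the sheet (an assumption, as is the helper name ``page_range``):

```python
def page_range(row_index, pages_per_bill):
    """Return the zero-based page indices of the bill for the given row."""
    start = row_index * pages_per_bill
    return range(start, start + pages_per_bill)


for row_index, clientnr in enumerate([1, 2]):
    pages = list(page_range(row_index, pages_per_bill=2))
    print(f"client {clientnr}: pages {pages}")
```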

Unify bills that are to be printed
------------------------------------------

This shows how to filter rows. The same ``split_pdf`` plugin is used, this time from multiple inputs to one output.
::

    {
        "name": "unify bills for print",
        "filter": {"column": "send_email", "value": "no"},
        "task": {
            "type": "split_pdf",
            "input_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf",
            "delete_input": true,
            "pages": 2,
            "output_filename": "bills/bills_{year}_paper.pdf"
        }
    },

Send the bills by email
------------------------------------------

::

    {
        "name": "send emails",
        "locals": {
            "sender": {
                "email": "info@domain.com",
                "name": "Info",
                "server": "smtp.gmail.com:587",
                "username": "info@domain.com",
                "password": "strongpassword",
                "bcc": "bills@domain.com",
                "headers": {
                    "Reply-To": "contability@domain.com"
                }
            },
        },
        "filter": {"column": "send_email", "value": "yes"},
        "task": {
            "type": "email",
            "recipient": "email", # the key/column name for the customer email
            "subject" : "Bill for year {year}", # can contain dynamic parts
            "body_template_file" : "templates/email_template.txt", # text template for the email body
            "attachments" : [ "bills/bills_{year}/bill_{year}_{clientnr}.pdf" ] # list of attachments
        }
    },
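
The ``sender`` settings map naturally onto a standard email message. A sketch using only Python's standard library, which builds the message without sending it; how ``docwf`` itself assembles the message is not shown in this README, so the function ``build_message`` and the sample body text are assumptions:

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_message(sender, recipient, subject, body):
    """Build an EmailMessage from a sender dict like the one in the task above."""
    message = EmailMessage()
    message["From"] = formataddr((sender["name"], sender["email"]))
    message["To"] = recipient
    message["Subject"] = subject
    if "bcc" in sender:
        message["Bcc"] = sender["bcc"]
    for header_name, header_value in sender.get("headers", {}).items():
        message[header_name] = header_value
    message.set_content(body)
    return message


sender = {
    "email": "info@domain.com",
    "name": "Info",
    "bcc": "bills@domain.com",
    "headers": {"Reply-To": "contability@domain.com"},
}
row = {"email": "c1@gmail.com", "year": 2022}
message = build_message(
    sender,
    recipient=row["email"],
    subject="Bill for year {year}".format(**row),
    body="Please find your bill attached.",
)
print(message["From"], "->", message["To"])

# The "server" value has the form host:port.
host, port = "smtp.gmail.com:587".rsplit(":", 1)
print(host, int(port))
```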


Watermark PDF files
------------------------------------------

Mark reminder bills
::

    {
        "name": "save reminder",
        "filter": {"column": "reminder", "value": "yes"},
        "task": {
            "type": "watermark",
            "makedir": "bills/bills_{year}/reminders/",
            "watermark": "REMINDER",
            "input_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf",
            "pages": 2,
            "output_filename": "bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf"
        }
    },

Send reminder bills
::

    {
        "name": "send reminder emails",
        "locals": {
            "sender": {
                ...
            },
        },
        "filter": [
            {"column": "send_email", "value": "yes"},
            {"column": "reminder", "value": "yes"}
        ],
        "task": {
            "type": "email",
            "recipient": "email", # the key/column name for the customer email
            "subject" : "Bill for year {year} (reminder)", # can contain dynamic parts
            "body_template_file" : "templates/reminder_email_template.txt", # text template for the email body
            "attachments" : [ "bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf" ] # list of attachments
        }
    },
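
The ``filter`` entry is either a single condition or a list of conditions. A sketch of the row selection, assuming that all conditions in a list must match (logical AND) and that values are compared as strings; ``row_matches`` is a hypothetical helper:

```python
def row_matches(row, row_filter):
    """Return True if row satisfies a single condition or every condition in a list."""
    conditions = [row_filter] if isinstance(row_filter, dict) else row_filter
    return all(
        str(row.get(condition["column"])) == str(condition["value"])
        for condition in conditions
    )


rows = [
    {"clientnr": 1, "send_email": "yes", "reminder": "yes"},
    {"clientnr": 2, "send_email": "yes", "reminder": "no"},
    {"clientnr": 3, "send_email": "no", "reminder": "yes"},
]
reminder_filter = [
    {"column": "send_email", "value": "yes"},
    {"column": "reminder", "value": "yes"},
]

print([row["clientnr"] for row in rows if row_matches(row, reminder_filter)])
```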

Use Google Spreadsheets instead of Excel
------------------------------------------

To use Google Spreadsheets as the data source, you need a service account and its credentials as JSON.
Follow the tutorial `gspread with service account`_.

Then set the "workbook" value to the spreadsheet URL and add the credentials:
::

        "globals": {
            "data": {
                "workbook": "https://docs.google.com/spreadsheets/d/1u...",
                "sheet": "mailmergesheet",
                "credentials": {
                    "type": "service_account",
                    "project_id": "...",
                    "private_key_id": "...",
                    "private_key": "-----BEGIN PRIVATE KEY....\n-----END PRIVATE KEY-----\n",
                    "client_email": "project@project-123.iam.gserviceaccount.com",
                    "client_id": "...",
                    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
                    "token_uri": "https://oauth2.googleapis.com/token",
                    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
                    "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
                }
            },
            ...
        }

Export Google Spreadsheets to a PDF file
------------------------------------------

This task only works when the data source is a Google spreadsheet (gspread):
::

    {
        "#import": ["inc/inc_workbook_gspread.json"],
        "name": "export sheets as pdf",
        "globals": {
            "printsheets_defaults" : {
                "gridlines": true,
                "printnotes": false
            }
        },
        "tasks": [
            {
                "active": 1,
                "name": "bill documents",
                "task": {
                    "makedir": "bills/web",
                    "type": "printsheets",
                    "printsheets": [
                        {
                            "gid": "1571231333"
                        },
                        {
                            "gid": "291382312357"
                        },
                        {
                            "gid": "3712318114",
                            "portrait": false,
                            "printnotes": true
                        }
                    ],
                    "output_filename": "bills/web/heizung_unterlagen_{key_year}.pdf"
                }
            }
        ]
    }
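
Each entry in ``printsheets`` may override the options in ``printsheets_defaults``. A sketch of that resolution, assuming per-sheet values simply take precedence over the defaults (``resolve_sheet_options`` is a hypothetical name):

```python
def resolve_sheet_options(defaults, sheet):
    """Combine default print options with the options of one sheet entry."""
    return {**defaults, **sheet}


defaults = {"gridlines": True, "printnotes": False}
sheets = [
    {"gid": "1571231333"},
    {"gid": "3712318114", "portrait": False, "printnotes": True},
]

for sheet in sheets:
    print(resolve_sheet_options(defaults, sheet))
```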

Todo / Wish List
================

* Create unit tests
* Develop the command line to be able to run simple tasks directly
* Create more advanced filters
* Auto-magically create directories (remove the makedir argument)

Contributing
============

* Fork the repository on GitHub and start hacking
* Send a pull request with your changes


Credits
=======

This repository was created and is maintained by `Iulian Ciorăscu`_.

.. _Iulian Ciorăscu: https://github.com/iulica/
.. _gspread with service account: https://docs.gspread.org/en/latest/oauth2.html#service-account

            
