=================
Document Workflow
=================

.. image:: https://badge.fury.io/py/doc-workflow.png
    :alt: PyPI
    :target: https://pypi.python.org/pypi/doc-workflow

Creates, merges, splits, and edits documents (mainly docx and PDF) and sends them by email.
Originally created for QR bill integration, it is generic and can be used for much more.

Installation
============

Installation with ``pip``:
::

    $ pip install doc-workflow

Usage
=====

From the command line:
::

    $ docwf <path_to_json_config_file>

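The command takes the path to a JSON configuration file. As a minimal sketch, such a file can be produced from a Python dictionary with the standard ``json`` module. The helper ``write_config`` and the file name ``config.json`` are illustrative and not part of the package. Standard JSON has no comment syntax, so a file written this way carries none of the ``#`` comments used in the examples of this README.

```python
import json
import tempfile
from pathlib import Path


def write_config(config: dict, path: Path) -> Path:
    """Serialize a workflow configuration to a JSON file (hypothetical helper)."""
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return path


# Same shape as the configuration objects shown in this README.
config_obj = {
    "globals": {
        "data": {"workbook": "source.xlsx", "sheet": "mailmergesheet"},
        "constants": {"language": "fr"},
    },
    "tasks": [
        {
            "active": 1,
            "name": "create bills",
            "task": {
                "type": "mailmerge",
                "input_docx": "templates/template_bill.docx",
                "output_docx": "bills/bill_{year}.docx",
            },
        }
    ],
}

# Written to the temporary directory only to keep the sketch self-contained.
config_path = write_config(config_obj, Path(tempfile.gettempdir()) / "config.json")
```

The resulting path is what would be passed to ``docwf`` on the command line.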
From Python:
::

    from docwf import DocWorkflow

    config_obj = {
        "globals": {
            "data": {
                "workbook": "source.xlsx",
                "sheet": "mailmergesheet",
            },
            "constants": {
                "language": "fr"
            }
        },
        "tasks": [
            {
                "active": 1, # you can activate/deactivate tasks
                "name": "create bills", # name used for debugging
                "locals": {
                    "data" : {
                        "sheet": "overridesheetfortask"
                    },
                    "key" : "value", # overrides global arguments for the task
                },
                "task": {
                    "type": "myplugin", # or builtin plugins (see below)
                    "task_dependent_argument": "value{param}",
                }
            },
        ]
    }
    # MyPluginClass stands for your own plugin class, defined elsewhere
    my_plugins = {
        "myplugin": MyPluginClass
    }
    DocWorkflow(config_obj, plugins=my_plugins).gen()

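The ``locals`` block overrides ``globals`` for a single task. In the example only ``data.sheet`` is overridden, which suggests a key-by-key merge of nested dictionaries. The sketch below illustrates that reading; ``merge_settings`` is a hypothetical helper and the package's actual merge rules may differ.

```python
from copy import deepcopy


def merge_settings(global_settings: dict, local_settings: dict) -> dict:
    """Return a copy of global_settings with local_settings applied on top.

    Nested dictionaries are merged key by key; any other value is replaced.
    This is an assumed behaviour, not taken from the docwf source code.
    """
    merged = deepcopy(global_settings)
    for key, value in local_settings.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


global_settings = {
    "data": {"workbook": "source.xlsx", "sheet": "mailmergesheet"},
    "constants": {"language": "fr"},
}
local_settings = {"data": {"sheet": "overridesheetfortask"}, "key": "value"}

# The task sees the global workbook but its own sheet.
effective = merge_settings(global_settings, local_settings)
```

Under this reading the task keeps the global workbook and language, and only the sheet name changes.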
Typical workflow tasks
======================

Assume the data is in ``source.xlsx``, in the sheet named ``bills``:

======== ============ ========== ====== ========= ======
clientnr email        send_email total  reference etc
======== ============ ========== ====== ========= ======
1        c1@gmail.com yes        1032   ref2022c1 ...
2        c2@gmail.com yes        1232   ref2022c2 ...
======== ============ ========== ====== ========= ======

Create bills from Word template
-------------------------------
::

    {
        "active": 1, # you can activate/deactivate tasks
        "name": "create bills", # name used for debugging
        "task": {
            "type": "mailmerge",
            "input_docx": "templates/template_bill.docx",
            "output_docx": "bills/bill_{year}.docx" # output depends on the column year, which should be constant across all rows
        }
    },

Create pdf from the generated docx
----------------------------------

This task uses the Word application (Mac/Windows).
If the docx template has dynamic fields (IF, etc.),
opening the generated docx prompts for permission to update
all fields before it is saved as PDF.
::

    {
        "name": "save pdf from docx (uses Word)",
        "task": {
            "type": "makepdf",
            "input_docx": "bills/bill_{year}.docx",
            "output_pdf": "bills/bill_{year}.pdf"
        }
    },

Fill in QR codes
----------------

Adds a QR bill to each bill, either by appending a page or by merging the QR bill into one of the existing pages.
::

    {
        "name": "create qr bills",
        "locals": {
            "creditor": {
                "iban": "CH....",
                "name": "The Good Company",
                "pcode": "xyzt",
                "city": "Bern",
                "street": "Dorfstrasse 1"
            },
            "task_params": {
                "extra_infos": "reference", # fixed keys for the bill reason ...
                "amount": "total" # ... and the amount; task_params creates data entries out of existing columns
            }
        },
        "task": {
            "type": "qr",
            "merge_type": "merge", # or "append"
            "input_filename": "bills/bill_{year}.pdf",
            "delete_input": true, # delete the input file after creating the output
            "pages": 2, # the number of pages per bill
            "merge_pos": 2, # or "insert_pos" if "append"
            "output_filename": "bills/bill_{year}_with_qr.pdf"
        }
    },

Split the bills into separate pdf files
---------------------------------------

From one input to multiple outputs:
::

    {
        "name": "split bills",
        "task": {
            "type": "split_pdf",
            "input_filename": "bills/bill_{year}_with_qr.pdf",
            "pages": 2,
            "makedir": "bills/bills_{year}", # if the output directory doesn't exist, create it
            "output_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf" # output filename using a unique name for each customer
        }
    },

Unify bills that are to be printed
----------------------------------

This shows how to filter rows. The same ``split_pdf`` plugin is used, this time from multiple inputs to one output.
::

    {
        "name": "unify bills for print",
        "filter": {"column": "send_email", "value": "no"},
        "task": {
            "type": "split_pdf",
            "input_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf",
            "delete_input": true,
            "pages": 2,
            "output_filename": "bills/bills_{year}_paper.pdf"
        }
    },

Send the bills by email
-----------------------
::

    {
        "name": "send emails",
        "locals": {
            "sender": {
                "email": "info@domain.com",
                "name": "Info",
                "server": "smtp.gmail.com:587",
                "username": "info@domain.com",
                "password": "strongpassword",
                "bcc": "bills@domain.com",
                "headers": {
                    "Reply-To": "contability@domain.com"
                }
            },
        },
        "filter": {"column": "send_email", "value": "yes"},
        "task": {
            "type": "email",
            "recipient": "email", # the key/column name for the customer email
            "subject" : "Bill for year {year}", # can contain dynamic parts
            "body_template_file" : "templates/email_template.txt", # text template for the email body
            "attachments" : [ "bills/bills_{year}/bill_{year}_{clientnr}.pdf" ] # list of attachments
        }
    },

Watermark PDF files
-------------------

Mark reminder bills with a watermark:
::

    {
        "name": "save reminder",
        "filter": {"column": "reminder", "value": "yes"},
        "task": {
            "type": "watermark",
            "makedir": "bills/bills_{key_year}/reminders/",
            "watermark": "REMINDER",
            "input_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf",
            "pages": 2,
            "output_filename": "bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf"
        }
    },

Send reminder bills:
::

    {
        "name": "send reminder emails",
        "locals": {
            "sender": {
                ...
            },
        },
        "filter": [
            {"column": "send_email", "value": "yes"},
            {"column": "reminder", "value": "yes"}
        ],
        "task": {
            "type": "email",
            "recipient": "email", # the key/column name for the customer email
            "subject" : "Bill for year {year} (reminder)", # can contain dynamic parts
            "body_template_file" : "templates/reminder_email_template.txt", # text template for the email body
            "attachments" : [ "bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf" ] # list of attachments
        }
    },

Use Google Spreadsheets instead of Excel
----------------------------------------

To use Google Spreadsheets you need a service account and its credentials as JSON.
Follow the tutorial `gspread with service account`_.

Change the "workbook" value to the spreadsheet URL and add the credentials:
::

    "globals": {
        "data": {
            "workbook": "https://docs.google.com/spreadsheets/d/1u...",
            "sheet": "mailmergesheet",
            "credentials": {
                "type": "service_account",
                "project_id": "...",
                "private_key_id": "...",
                "private_key": "-----BEGIN PRIVATE KEY....\n-----END PRIVATE KEY-----\n",
                "client_email": "project@project-123.iam.gserviceaccount.com",
                "client_id": "...",
                "auth_uri": "https://accounts.google.com/o/oauth2/auth",
                "token_uri": "https://oauth2.googleapis.com/token",
                "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
                "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
            }
        },
        ...
    }

Todo / Wish List
================

* Create unit tests
* Develop the command line to be able to run simple tasks directly
* Create more advanced filters
* Create directories automatically (removing the need for the ``makedir`` argument)

Contributing
============

* Fork the repository on GitHub and start hacking
* Send a pull request with your changes

Credits
=======

This repository is created and maintained by `Iulian Ciorăscu`_.

.. _Iulian Ciorăscu: https://github.com/iulica/
.. _gspread with service account: https://docs.gspread.org/en/latest/oauth2.html#service-account
Raw data
"_id": null,
"home_page": "https://github.com/iulica/doc-workflow",
"name": "doc-workflow",
"maintainer": "",
"docs_url": null,
"requires_python": "",
"maintainer_email": "",
"keywords": "docx,pdf,split,watermark,email,mailmerge,qrbill,xlsx",
"author": "Iulian Cior\u0103scu",
"author_email": "ciulian@gmail.com",
"download_url": "https://files.pythonhosted.org/packages/a3/20/7a11960b3682500f1c909c8f79770a473a9e2626c242574879f38154802d/doc-workflow-0.1.1a2.tar.gz",
"platform": null,
"description": "\n=================\nDocument Workflow\n=================\n\n.. image:: https://badge.fury.io/py/doc-workflow.png\n :alt: PyPI\n :target: https://pypi.python.org/pypi/doc-workflow\n\nCreates, merges, splits, edits documents(mainly docx/pdf) as well as sending them by email.\nOriginally created for QR bills integration but is generic and can be used for much more.\n\n\nInstallation\n============\n\nInstallation with ``pip``:\n::\n\n $ pip install doc-workflow\n\n\nUsage\n=====\n\nFrom the command line:\n::\n\n $ docwf <path_to_json_config_file>\n\nFrom Python:\n::\n\n from docwf import DocWorkflow\n\n config_obj = {\n \"globals\": {\n \"data\": {\n \"workbook\": \"source.xlsx\",\n \"sheet\": \"mailmergesheet\",\n },\n \"constants\": {\n \"language\": \"fr\"\n }\n },\n \"tasks\": [\n {\n \"active\": 1, # you can activate/deactivate tasks\n \"name\": \"create bills\", # name for debug purpose\n \"locals\": {\n \"data\" : {\n \"sheet\": \"overridesheetfortask\"\n },\n \"key\" : \"value\", # overrides global arguments for the task\n },\n \"task\": {\n \"type\": \"myplugin\", # or builtin plugins (see below)\n \"task_dependent_argument\": \"value{param}\",\n }\n },\n ]\n }\n my_plugins = {\n \"myplugin\": MyPluginClass\n }\n DocWorkflow(config_obj, plugins=my_plugins).gen()\n\nTypical workflow tasks\n======================\n\nAssume the data is in the source.xlsx in the sheet named bills\n\n======== ============ ========== ====== ========= ======\nclientnr email send_email total reference etc \n======== ============ ========== ====== ========= ======\n1 c1@gmail.com yes 1032 ref2022c1 ...\n2 c2@gmail.com yes 1232 ref2022c2 ...\n======== ============ ========== ====== ========= ======\n\n\nCreate bills from Word template\n-------------------------------\n::\n\n {\n \"active\": 1, # you can activate/deactivate tasks\n \"name\": \"create bills\", # name for debug purpose\n \"task\": {\n \"type\": \"mailmerge\",\n \"input_docx\": 
\"templates/template_bill.docx\",\n \"output_docx\": \"bills/bill_{year}.docx\" # output depends on the column year, it should be constant throughout all rows\n }\n },\n\nCreate pdf from the generated docx\n-----------------------------------\n\nIt uses the Word Application (Mac/Windows).\nIf the docx template has dynamic fields (IF, etc), \nthe generated docx will ask permission to update \nall fields before saving it as pdf.\n::\n\n {\n \"name\": \"save pdf from docx (uses Word)\",\n \"task\": {\n \"type\": \"makepdf\",\n \"input_docx\": \"bills/bill_{year}.docx\",\n \"output_pdf\": \"bills/bill_{year}.pdf\"\n }\n },\n\n\nFills in QR codes\n-------------------------------\n\nfor the bills by adding a page to each bill or by merging the QR bill into one of the pages.\n::\n\n {\n \"name\": \"create qr bills\",\n \"locals\": {\n \"creditor\": {\n \"iban\": \"CH....\",\n \"name\": \"The Good Company\",\n \"pcode\": \"xyzt\",\n \"city\": \"Bern\",\n \"street\": \"Dorfstrasse 1\"\n },\n \"task_params\": {\n \"extra_infos\": \"reference\", # fixed keys for bill reason ...\n \"amount\": \"total\" # and the amount. 
With task_params you can create data entries out of existing columns\n }\n },\n \"task\": {\n \"type\": \"qr\",\n \"merge_type\": \"merge\", # or \"append\"\n \"input_filename\": \"bills/bill_{year}.pdf\",\n \"delete_input\": true, # delete the input filename after creating the output\n \"pages\": 2, # the number of pages per each bill\n \"merge_pos\": 2, # or \"insert_pos\" if \"append\"\n \"output_filename\": \"bills/bill_{year}_with_qr.pdf\"\n }\n },\n\nSplit the bills into separate pdf files.\n------------------------------------------\n\nFrom one input to multiple outputs\n::\n\n {\n \"name\": \"split bills\",\n \"task\": {\n \"type\": \"split_pdf\",\n \"input_filename\": \"bills/bill_{year}_with_qr.pdf\",\n \"pages\": 2,\n \"makedir\": \"bills/bills_{year}\", # if the output directory doesn't exist, create it\n \"output_filename\": \"bills/bills_{year}/bill_{year}_{clientnr}.pdf\" # output filename using unique name for each customer\n }\n },\n\nUnify bills that are to be printed\n------------------------------------------\n\nThis shows how to filter rows. 
The same split_pdf plugin is used, from multiple inputs to one output.\n::\n\n {\n \"name\": \"unify bills for print\",\n \"filter\": {\"column\": \"send_email\", \"value\": \"no\"},\n \"task\": {\n \"type\": \"split_pdf\",\n \"input_filename\": \"bills/bills_{year}/bill_{year}_{clientnr}.pdf\",\n \"delete_input\": true,\n \"pages\": 2,\n \"output_filename\": \"bills/bills_{year}_paper.pdf\"\n }\n },\n\nSend the bills by email\n------------------------------------------\n\n::\n\n {\n \"name\": \"send emails\",\n \"locals\": {\n \"sender\": {\n \"email\": \"info@domain.com\",\n \"name\": \"Info\",\n \"server\": \"smtp.gmail.com:587\",\n \"username\": \"info@domain.com\",\n \"password\": \"strongpassword\",\n \"bcc\": \"bills@domain.com\",\n \"headers\": {\n \"Reply-To\": \"contability@domain.com\"\n }\n },\n },\n \"filter\": {\"column\": \"send_email\", \"value\": \"yes\"},\n \"task\": {\n \"type\": \"email\",\n \"recipient\": \"email\", # the key/column name for the customer email\n \"subject\" : \"Bill for year {year}\", # can contain dynamic parts\n \"body_template_file\" : \"templates/email_template.txt\", # text template for the email body\n \"attachments\" : [ \"bills/bills_{year}/bill_{year}_{clientnr}.pdf\" ] # list of attachments\n }\n },\n\n\nWatermark PDF files\n------------------------------------------\n\nMark reminder bills\n::\n\n {\n \"name\": \"save reminder\",\n \"filter\": {\"column\": \"reminder\", \"value\": \"yes\"},\n \"task\": {\n \"type\": \"watermark\",\n \"makedir\": \"bills/bills_{key_year}/reminders/\",\n \"watermark\": \"REMINDER\",\n \"input_filename\": \"bills/bills_{year}/bill_{year}_{clientnr}.pdf\",\n \"pages\": 2,\n \"output_filename\": \"bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf\"\n }\n },\n\nSend reminder bills\n::\n\n {\n \"name\": \"send reminder emails\",\n \"locals\": {\n \"sender\": {\n ...\n },\n },\n \"filter\": [\n {\"column\": \"send_email\", \"value\": \"yes\"},\n {\"column\": \"reminder\", 
\"value\": \"yes\"}\n ],\n \"task\": {\n \"type\": \"email\",\n \"recipient\": \"email\", # the key/column name for the customer email\n \"subject\" : \"Bill for year {year} (reminder)\", # can contain dynamic parts\n \"body_template_file\" : \"templates/reminder_email_template.txt\", # text template for the email body\n \"attachments\" : [ \"bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf\" ] # list of attachments\n }\n },\n\nUse Google Spreadsheets instead of Excel\n------------------------------------------\n\nTo support google spreadsheets you need a service account and credentials as JSON.\nFollow the tutorial `gspread with service account`_.\n\nChange the \"workbook\" value\n::\n\n \"globals\": {\n \"data\": {\n \"workbook\": \"https://docs.google.com/spreadsheets/d/1u...\",\n \"sheet\": \"mailmergesheet\",\n \"credentials\": {\n \"type\": \"service_account\",\n \"project_id\": \"...\",\n \"private_key_id\": \"...\",\n \"private_key\": \"-----BEGIN PRIVATE KEY....\\n-----END PRIVATE KEY-----\\n\",\n \"client_email\": \"project@project-123.iam.gserviceaccount.com\",\n \"client_id\": \"...\",\n \"auth_uri\": \"https://accounts.google.com/o/oauth2/auth\",\n \"token_uri\": \"https://oauth2.googleapis.com/token\",\n \"auth_provider_x509_cert_url\": \"https://www.googleapis.com/oauth2/v1/certs\",\n \"client_x509_cert_url\": \"https://www.googleapis.com/robot/v1/metadata/x509/...\"\n }\n },\n ...\n }\n\n\nTodo / Wish List\n================\n\n* Create unit tests\n* Develop the command line to be able to run simple tasks directly\n* Create more advanced filters\n* Auto-magically create directories (remove the makedir argument)\n\nContributing\n============\n\n* Fork the repository on GitHub and start hacking\n* Send a pull request with your changes\n\n\nCredits\n=======\n\nThis repository is created and maintained by `Iulian Cior\u0103scu`_.\n\n.. _Iulian Cior\u0103scu: https://github.com/iulica/\n.. 
_gspread with service account: https://docs.gspread.org/en/latest/oauth2.html#service-account\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "A Python Document Management Framework for generating and sending (pdf, docx, etc) documents to customers",
"version": "0.1.1a2",
"split_keywords": [
"urls": [
"comment_text": "",
"digests": {
"blake2b_256": "f3250438a2a5b7af747130cfb2e0fe6c8be85fd619f7d67b2baea123ce16c8b4",
"md5": "8b146c3f17c1179efc2fa168a3f04da5",
"sha256": "68a5b733fcf079ff8f975f6e655a7ba4541158d218709170a97f6f60c3439725"
"downloads": -1,
"filename": "doc_workflow-0.1.1a2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "8b146c3f17c1179efc2fa168a3f04da5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": null,
"size": 18801,
"upload_time": "2023-02-02T11:20:22",
"upload_time_iso_8601": "2023-02-02T11:20:22.181130Z",
"url": "https://files.pythonhosted.org/packages/f3/25/0438a2a5b7af747130cfb2e0fe6c8be85fd619f7d67b2baea123ce16c8b4/doc_workflow-0.1.1a2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
"comment_text": "",
"digests": {
"blake2b_256": "a3207a11960b3682500f1c909c8f79770a473a9e2626c242574879f38154802d",
"md5": "27d4d87adf1954e44041542aa96a990e",
"sha256": "490d8f3cbc564b4e71dca7fefe1751adc1b8a75d5a0f7f8b7c291c4ad458882f"
"downloads": -1,
"filename": "doc-workflow-0.1.1a2.tar.gz",
"has_sig": false,
"md5_digest": "27d4d87adf1954e44041542aa96a990e",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 14788,
"upload_time": "2023-02-02T11:20:23",
"upload_time_iso_8601": "2023-02-02T11:20:23.371781Z",
"url": "https://files.pythonhosted.org/packages/a3/20/7a11960b3682500f1c909c8f79770a473a9e2626c242574879f38154802d/doc-workflow-0.1.1a2.tar.gz",
"yanked": false,
"yanked_reason": null
"upload_time": "2023-02-02 11:20:23",
"github": true,
"gitlab": false,
"bitbucket": false,
"github_user": "iulica",
"github_project": "doc-workflow",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [],
"lcname": "doc-workflow"