| Name | workcraft |
| Version | 0.5.15 |
| download | |
| home_page | None |
| Summary | A simple, lightweight, database-only worker library in Python |
| upload_time | 2024-10-25 11:20:14 |
| maintainer | None |
| docs_url | None |
| author | None |
| requires_python | >=3.10 |
| license | None |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

# Workcraft
## A simple, lightweight, database-only worker library in Python
workcraft is a simple, lightweight, database-only worker library in Python with a MySQL database as the single source of truth.
workcraft addresses some of the pain points of using Celery: its trouble with long-running tasks, the need for a separate message broker, and the fact that workers sometimes don't process tasks the way you'd expect.
All workcraft needs is a running MySQL database, which means you could in theory scale workcraft both vertically (get more database resources) and horizontally (get more databases). But so far, I've not tried scaling it that way.
workcraft is **not** the best fit in situations where you need extreme precision and sub-second latency/wait times before a task is fetched and processed.
But if it's OK for you that your workers take at least 1 second to fetch your task AND you want a clear overview of your tasks and workers (using a database GUI, for example), then workcraft is ideal for you.
## Installation
Run
```
pip install workcraft
```
## Getting started
First, you need a running MySQL database. Then, as a one-time step, you need to set up all the tables and events. To do that, first create a `.env` file and add a few variables:
```
WK_DB_HOST="127.0.0.1"
WK_DB_PORT=3306
WK_DB_USER="root"
WK_DB_PASS="workcraft"
WK_DB_NAME="workcraft"
```
(Adjust these to your settings, of course.)
Then, run:
```
python3 -m workcraft setup_database_tables
```
This command takes the connection parameters from your `.env` file, but that isn't strictly required; you can also pass them as arguments. Here are the parameters of the `setup_database_tables` function:
```python
def setup_database_tables(
db_host: str = "127.0.0.1",
db_port: int = 3306,
db_user: str = "root",
db_name: str = "workcraft",
db_password: str | None = None,
read_from_env: bool = True,
drop_tables: bool = False,
):
...
```
E.g.:
```
python3 -m workcraft setup_database_tables --read_from_env=False --db_password=test --drop_tables=True
```
Then implement your worker code, for example in a file called `example.py`:
```python
import asyncio
import random
import time
from multiprocessing import Pool

from loguru import logger

from workcraft.core import workcraft
from workcraft.db import get_db_config


workcraft = workcraft()

global_counter = 0


@workcraft.setup_handler()
def setup_handler():
    global global_counter
    global_counter = 1000
    logger.info("Setting up the worker!")


@workcraft.task("simple_task")
def simple_task(a: int, b: int, c: int) -> int:
    global global_counter
    global_counter += 1
    time.sleep(1)
    logger.info(global_counter)
    # raise ValueError("Random error!")
    return a + b + c


@workcraft.postrun_handler()
def postrun_handler(task_id, task_name, result, status):
    logger.info(
        f"Postrun handler called for {task_id} and {task_name}! "
        f"Got result: {result} and status {status}"
    )


def get_random_number():
    logger.info("Getting random number...")
    time.sleep(random.randint(5, 10))
    return random.randint(1, 100)


@workcraft.task("complex_task_1")
def parallel_task():
    num_processes = 8
    n_random_numbers = 20
    with Pool(processes=num_processes) as pool:
        pool.starmap(
            get_random_number,
            [() for _ in range(n_random_numbers)],
        )


async def main():
    n_tasks = 1

    for _ in range(n_tasks):
        a = random.randint(1, 100)
        b = random.randint(1, 100)
        c = random.randint(1, 100)

        workcraft.send_task_sync(
            "simple_task",
            [a, b],
            task_kwargs={"c": c},
            retry_on_failure=True,
            db_config=get_db_config(),
        )
        # But you could also just directly input the data into the database


if __name__ == "__main__":
    asyncio.run(main())
```
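The final comment in `main()` hints that you don't have to go through `send_task_sync` at all: since the MySQL database is the single source of truth, inserting a row is enough. The sketch below only illustrates the idea, with **made-up** table and column names (`tasks`, `task_name`, `payload`, `status`) and `pymysql` as a generic client; inspect the schema that `setup_database_tables` actually created before adapting it:
```python
# Hypothetical sketch: enqueue a task by inserting a row directly.
# NOTE: the table name "tasks" and the columns "task_name", "payload",
# and "status" are illustrative guesses -- check the real schema that
# setup_database_tables created in your database first.
import json

import pymysql  # any MySQL client works; pymysql is just an example

conn = pymysql.connect(
    host="127.0.0.1", port=3306, user="root",
    password="workcraft", database="workcraft",
)
try:
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO tasks (task_name, payload, status) VALUES (%s, %s, %s)",
            ("simple_task", json.dumps({"args": [1, 2], "kwargs": {"c": 3}}), "PENDING"),
        )
    conn.commit()
finally:
    conn.close()
```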
To run a worker, you would then run:
```
python3 -m workcraft peon --workcraft_path=example.workcraft --worker-id=test1
```
If you then execute `example.py`, you will add a task to the queue and can watch the worker process it.
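Because everything lives in the database, any MySQL client or GUI gives you that overview of tasks and workers. A minimal sketch with `pymysql` (used here purely as an example client, not a workcraft API) that lists whatever tables `setup_database_tables` created:
```python
# List the tables workcraft created; SHOW TABLES makes no assumptions
# about the schema, so this is safe against any workcraft database.
import pymysql

conn = pymysql.connect(
    host="127.0.0.1", port=3306, user="root",
    password="workcraft", database="workcraft",
)
with conn.cursor() as cur:
    cur.execute("SHOW TABLES")
    for (table,) in cur.fetchall():
        print(table)
conn.close()
```
From there you can `SELECT` from the task and worker tables directly to watch task status change as workers pick work up.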
## Configuration
If you have a `workcraft.config.json` file, its settings are used when setting up the tables; it also holds other, worker-related settings:
```json
{
"DB_PEON_HEARTBEAT_INTERVAL": 5,
"DB_POLLING_INTERVAL": 5,
"DB_SETUP_BACKOFF_MULTIPLIER_SECONDS": 30,
"DB_SETUP_BACKOFF_MAX_SECONDS": 3600,
"DB_SETUP_RUN_SELF_CORRECT_TASK_INTERVAL": 10,
"DB_SETUP_RUN_REOPEN_FAILED_TASK_INTERVAL": 10,
"DB_SETUP_WAIT_TIME_BEFORE_WORKER_DECLARED_DEAD": 60,
"DB_SETUP_CHECK_DEAD_WORKER_INTERVAL": 10
}
```
- `DB_PEON_HEARTBEAT_INTERVAL`: the interval at which the peon sends a heartbeat to the database.
- `DB_POLLING_INTERVAL`: the interval at which the peon polls the database for new tasks.
- `DB_SETUP_BACKOFF_MULTIPLIER_SECONDS`: the multiplier for the exponential backoff algorithm.
- `DB_SETUP_BACKOFF_MAX_SECONDS`: the maximum backoff time for the exponential backoff algorithm.
- `DB_SETUP_RUN_SELF_CORRECT_TASK_INTERVAL`: the interval at which the database runs the self-correct task.
- `DB_SETUP_RUN_REOPEN_FAILED_TASK_INTERVAL`: the interval at which the database reopens failed tasks.
- `DB_SETUP_WAIT_TIME_BEFORE_WORKER_DECLARED_DEAD`: how long the database waits before declaring a worker dead.
- `DB_SETUP_CHECK_DEAD_WORKER_INTERVAL`: the interval at which the database checks for dead workers.

The settings that start with `DB_SETUP_` are only used during database setup; in other words, they are applied once. The first two are used at runtime.
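Since the config file is plain JSON, checking the values a worker will pick up takes only a few lines. A minimal sketch, assuming `workcraft.config.json` sits in the current working directory and using the sample values above as fallbacks (whether workcraft itself falls back this way is an assumption):
```python
# Print the effective config: values from workcraft.config.json,
# with the sample values from this README used for any missing key.
import json
from pathlib import Path

SAMPLE_DEFAULTS = {
    "DB_PEON_HEARTBEAT_INTERVAL": 5,
    "DB_POLLING_INTERVAL": 5,
    "DB_SETUP_BACKOFF_MULTIPLIER_SECONDS": 30,
    "DB_SETUP_BACKOFF_MAX_SECONDS": 3600,
    "DB_SETUP_RUN_SELF_CORRECT_TASK_INTERVAL": 10,
    "DB_SETUP_RUN_REOPEN_FAILED_TASK_INTERVAL": 10,
    "DB_SETUP_WAIT_TIME_BEFORE_WORKER_DECLARED_DEAD": 60,
    "DB_SETUP_CHECK_DEAD_WORKER_INTERVAL": 10,
}

path = Path("workcraft.config.json")
overrides = json.loads(path.read_text()) if path.exists() else {}
for key, value in {**SAMPLE_DEFAULTS, **overrides}.items():
    print(f"{key} = {value}")
```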