# grammlog
GrammAcc's structured logging.
[Full Documentation](https://grammacc.github.io/grammlog)
A pure Python package that provides structured logging to JSONL with no dependencies.
Supports asynchronous logging via asyncio. See: [Asynchronous Logging](#asynchronous-logging)
Provides wrappers around the standard `debug`, `info`, `warning`, `error`, and `critical`
functions.
Each function accepts a [`Logger`](https://docs.python.org/3/library/logging.html#logging.Logger)
as its first argument, so you can provide a custom logger with your own handlers to write
structured logs to any target.
This package also provides a `make_logger` convenience function that creates and configures
a file-based logger with rotating log files and a size limit.
## Installation
```bash
pip install grammlog
```
## Basic Usage
```pycon
>>> import os
>>> import grammlog
>>> logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)
>>> grammlog.info(logger, "application initialized")
>>> grammlog.info(
...     logger,
...     "env vars set",
...     {"env": os.environ},
... )
>>> grammlog.set_env(grammlog_dir="logs", default_grammlog_level=grammlog.Level.INFO)
>>> auth_log = grammlog.make_logger("auth")  # Uses the log_dir and log_level from the env.
>>> try:
...     user_id = 42
...     get_user_if_logged_in(user_id)
... except NameError as e:
...     grammlog.error(
...         auth_log,
...         "User was not logged in",
...         {"user_id": user_id},
...         err=e,
...     )
>>> db_logger = grammlog.make_logger(
...     "db_queries", log_dir="logs/db", log_level=grammlog.Level.ERROR
... )
>>> try:
...     user_name = "invalid"
...     db.query(table="users", user_name=user_name)
... except NameError as e:
...     grammlog.error(
...         db_logger,
...         "Unknown error in db query",
...         {"queried_table": "users", "user_name": user_name},
...         err=e,
...     )
```
## Structured Data
The logging functions all take an arbitrary logger as their first argument, which
allows them to output structured logs to any handlers supported by the stdlib's logging module.
They also accept a required `msg` string, an optional `details` dictionary, and an
optional `err` exception.
These arguments will be merged together into a JSON object that is serialized
and written as a single line of JSONL output using the provided logger.
In addition to the data provided in the function arguments, the logged object will also include
the following keys:
 - `level`: The logging level of the message, e.g. `'DEBUG'` or `'ERROR'`.
 - `timestamp`: The timestamp of the logging event in UTC, as given by
[`datetime.datetime.timestamp()`](https://docs.python.org/3/library/datetime.html#datetime.datetime.timestamp).
If the `err` argument is not `None`, then the logged line will also contain the following keys:
- `err`: The `repr` of the `err` object.
- `traceback`: The formatted traceback of the exception.
The `msg` string will be merged into the object under the `msg` key, and all keys and values
in the `details` dict will be merged into the resulting object as-is.

The keys in the `details` dict are assumed to be strings, but the values can be any type.
If a value is not JSON-serializable, the logging function will force a string conversion
by calling `str(value)` on it. This is applied recursively to any nested dictionaries as well.
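To make the merging concrete, here is an illustrative sketch of one logged line (the values
shown in the comment are placeholders, not real output, and key order may differ):

```python
import grammlog

logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)

try:
    raise ValueError("bad input")
except ValueError as e:
    grammlog.error(logger, "validation failed", {"user_id": 42}, err=e)

# The logged JSONL line has roughly this shape (placeholder values):
# {"level": "ERROR", "timestamp": 1730158036.52, "msg": "validation failed",
#  "user_id": 42, "err": "ValueError('bad input')", "traceback": "Traceback ..."}
```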
Falling back to `str` is fine for logging blobs of dictionary data, but it's usually best to
explicitly convert an unsupported value to a JSON-serializable form before logging it so that
the logs contain all of the information expected. For example, when logging dates/datetimes,
you may want a high-precision POSIX timestamp, or you may want a more human-readable
ISO-formatted date string. Converting the value to the desired format before logging it is
preferred.
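For instance, a minimal sketch of converting a datetime explicitly instead of relying on the
`str` fallback:

```python
import datetime

import grammlog

logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)

created_at = datetime.datetime.now(datetime.timezone.utc)

# Relying on the str() fallback logs whatever str(created_at) happens to produce.
grammlog.info(logger, "user created", {"created_at": created_at})

# Converting explicitly makes the logged value unambiguous and queryable.
grammlog.info(logger, "user created", {"created_at": created_at.isoformat()})
# Or, for a high-precision POSIX timestamp:
grammlog.info(logger, "user created", {"created_at": created_at.timestamp()})
```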
### Processing Structured Logs
Logging in a structured format like JSON is useful because we can query the logs
based on keys and values instead of simply scanning through thousands of lines
of text manually or with inaccurate search heuristics.
For example, using the shell tool [`jq`](https://jqlang.github.io/jq/), we can
filter our logs to only the lines that have an `err` object logged.
```bash
cat debug.log | jq 'select(.err != null)'
```
Or we can get a list of all log entries from the last 42 minutes.
```bash
cat debug.log | jq 'select(.timestamp >= (now - 2520))'
```
Or we can count the number of log entries.
```bash
cat debug.log | jq --slurp 'length'
```
Or we can get an array of (msg, traceback) tuples for all of the ValueErrors that we've logged.
```bash
cat debug.log | jq 'select(.err != null) | select(.err | contains("ValueError")) | [.msg, .traceback]'
```
Or we can use any other combination of query-like filters to examine the exact
messages we're concerned with.
This means that if we include queryable keys in our logging calls in the source code, it is
easy to find the specific error messages we need to debug all the nasty issues our applications
give us at 3am on a Saturday.
## Asynchronous Logging
There are async versions of each of the logging functions, as well as functions for
registering and deregistering async loggers:
- `async_debug`
- `async_info`
- `async_warning`
- `async_error`
- `async_critical`
- `register_async_logger`
- `deregister_async_logger`
- `deregister_all_async_loggers`
The async logging functions only work with a logger that has been registered.
This is because the logging calls themselves are synchronous, and they
need to be queued in order to run concurrently with other tasks in the event
loop. Registering a logger to be used asynchronously doesn't mutate the logger in
any way, and *async loggers* are still the same
[`logging.Logger`](https://docs.python.org/3/library/logging.html#logging.Logger)
objects. Calling `register_async_logger` simply creates an
[`asyncio.Queue`](https://docs.python.org/3/library/asyncio-queue.html#asyncio.Queue)
and a [`Task`](https://docs.python.org/3/library/asyncio-task.html#task-object)
to run synchronous logging functions in the background.
The async queues are managed internally in this package and run the logging
events on the event loop in the background. This means that a call like
`await async_info(logger, msg)` doesn't actually wait until the message is logged;
it puts an event into the queue to be logged in the background at the discretion of
the event loop's task scheduler. Consequently, `deregister_async_logger` needs to
be called on any loggers registered as async before application shutdown in order to
guarantee all log messages are flushed to their targets. Failing to deregister a
logger will not raise any errors, but it may result in pending log messages being
lost. To simplify cleanup, the `deregister_all_async_loggers` function can be used to
deregister all registered async loggers during the application's shutdown procedure
without needing a reference to each individual logger in that scope.
Just as any logger with any handlers can be used with the sync functions to write
structured logs to any target, any logger can be registered as an async logger by passing
it to `register_async_logger`. That does not mean that registering another library's
logger will cause that library's logging events to run asynchronously; asynchronous logging
only applies when the `async_*` functions are used. Registering a logger that you don't
control will only add overhead, since the idle background task takes CPU cycles away from
other work on the event loop.
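Here is a minimal sketch of async usage. It assumes `register_async_logger` can be called
from within a running event loop and that `deregister_async_logger` is awaitable so that
pending messages are flushed; check the
[full documentation](https://grammacc.github.io/grammlog) for the exact signatures.

```python
import asyncio

import grammlog


async def main():
    logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)

    # Create the internal queue and background task for this logger.
    # Assumed to be callable from within a running event loop.
    grammlog.register_async_logger(logger)
    try:
        # Returns as soon as the event is enqueued; the actual write happens
        # in the background at the scheduler's discretion.
        await grammlog.async_info(logger, "application initialized")
        await grammlog.async_error(logger, "something went wrong", {"user_id": 42})
    finally:
        # Flush pending messages before shutdown (assumed awaitable).
        await grammlog.deregister_async_logger(logger)


asyncio.run(main())
```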
### Async Performance Considerations
Using async logging won't make your logging any faster. Because writing the actual log messages
is synchronous, excessive logging will still cause a CPU-bound bottleneck in your application.
However, if you are using asyncio already, using async logging should make your code more efficient
by giving the event loop a chance to start other background tasks in between registering a logging
event and actually logging the message. In other words, the main benefit is not to make logging more
efficient but instead to make sure the event loop can keep as many concurrent tasks running as
possible.
One thing to consider with respect to the async event loop is the size limit for the logging
queues. The queues will not block the event loop from
running other tasks regardless of the size limit, but there are tradeoffs.
Due to the way
[`asyncio.Queue`](https://docs.python.org/3/library/asyncio-queue.html#asyncio.Queue)
works, when the queue is full, a `put` simply yields to other tasks until an item is
removed from the queue. This means that in situations where the application is performing
excessive logging due to some unforeseen usage pattern or a programming oversight, the
size limit on the queue will help to throttle the CPU usage of the logging events by not
enqueuing more events until the oldest one is dequeued. This gives the event loop more
chances to start other tasks, such as handling a request through the web framework or
sending an async API response to the frontend. The logging events will still hog the CPU
while they are running, but the size limit maximizes the chances the application has to
start other IO-bound tasks in between logging events. The flipside is that if the async
logging call happens inside a handler for a request, or before sending a response to the
client, then that entire coroutine will wait until there is space in the queue for another
logging event. For this reason, some applications may want a large size limit for their
logging queues, but it is very unlikely that the wait for queue space would cause a more
significant slowdown than the CPU load from the backlog of logging events that an unbounded
queue would allow to accumulate.
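If `register_async_logger` exposes the queue's size limit as a parameter, tuning it might
look like the following sketch (the `maxsize` name here is hypothetical, not confirmed API;
see the full documentation for the real signature):

```python
import asyncio

import grammlog


async def main():
    logger = grammlog.make_logger("app", log_dir="logs", log_level=grammlog.Level.INFO)

    # Hypothetical parameter name: a small limit throttles runaway logging sooner,
    # while a large limit makes `await async_*` calls less likely to wait for space.
    grammlog.register_async_logger(logger, maxsize=256)

    await grammlog.async_info(logger, "tuned async logging")
    await grammlog.deregister_async_logger(logger)


asyncio.run(main())
```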
When in doubt, profile.