lavd

Name	lavd JSON
Version	0.2.7 JSON
	download
home_page	https://github.com/jungomi/lavd
Summary	Log and Visualise Data
upload_time	2023-07-13 09:37:19
maintainer
docs_url	None
author	Michael Jungo
requires_python	>=3.9
license	MIT
keywords	log visualise visualize data
VCS
bugtrack_url
requirements	halo Pillow simplejson types-simplejson tornado tqdm watchdog torch torchvision mypy ruff black isort build
Travis-CI	No Travis.
coveralls test coverage	No coveralls.

            # Log and Visualise Data (Lavd)

[![Node.js][actions-nodejs-badge]][actions-nodejs-link]
[![Python][actions-python-badge]][actions-python-link]

![Lavd Screenshot](.github/lavd.png)

Tracking the progress of Machine Learning (ML) experiments is essential for
deciding what model works well with the available data. Using some kind of
metric, for example Intersection over Union (IoU) for object detection, gives
a good idea of the quality of the model, but additionally, it is important to
also see a particular example to get a better intuition. This, and more can be
logged and visualised in an interactive manner, including:

- Scalars
- Images
- Text
- Log files
- Markdown documents
- CLI commands (how to run the experiment with the same parameters)

_...and more to come._

## Table of Contents

- [Table of Contents](#table-of-contents)
- [Getting Started](#getting-started)
- [Logging Data](#logging-data)
  - [Scalars](#scalars)
  - [Images](#images)
  - [Text](#text)
  - [Markdown](#markdown)
  - [Command Line Options](#command-line-options)
  - [Summary](#summary)
  - [Printing](#printing)
  - [Spinner](#spinner)
  - [Progress Bar](#progress-bar)
  - [Saving a PyTorch Model](#saving-a-pytorch-model)
  - [Saving any Object](#saving-any-object)
  - [Disabling the Logger](#disabling-the-logger)
- [Data Layout](#data-layout)
- [Comparison to TensorBoard](#comparison-to-tensorboard)
- [Known Issues](#known-issues)
  - [Inotify watch limit reached](#inotify-watch-limit-reached)

## Getting Started

Installing with pip:

```sh
pip install --user --upgrade lavd
```

Once installed, the server can be started by running `lavd` with the directory
containing the (logged) data:

```sh
lavd path/to/logs
```

Or if you prefer to run it as a module instead of using the executable (both
versions do exactly the same):

```sh
python -m lavd.server path/to/logs
```

## Logging Data

_Lavd_ includes a logger that can be used to easily log all the desired data.

```python
import lavd

# Logs everything to log/some-experiment-name/
# If no name is giving it the current date and time are used as a name
logger = lavd.Logger("some-experiment-name")
```

### Scalars

Scalars will be plotted and unlike other categories, the have to be assigned to
a step.

```python
logger.log_scalar(0.8, "accuracy", step=1)
logger.log_scalar(0.6, "accuracy", step=2)
logger.log_scalar(0.3, "accuracy", step=3)

logger.log_scalar(0.1, "easy/accuracy", step=7)
logger.log_scalar(0.05, "easy/accuracy", step=14)
```

### Images

Logging images with optional bounding boxes that for interactive visualisation.

```python
 # Saves image to: log/some-experiment-name/0001/bird.png
logger.log_image(image, "bird", step=1)
 # Saves image to: log/some-experiment-name/0001/other/bird.png
logger.log_image(image, "other/bird", step=1)
# No step saves it to: log/some-experiment-name/other/bird.png
logger.log_image(image, "other/bird")

# With bounding boxes, each box is given by 2 points: top-left and bottom-right.
boxes = [
    {
        "xStart": 100,
        "yStart": 100,
        "xEnd": 150,
        "yEnd": 200,
        "className": "bird", # Optional
        "probability": 0.4,  # Optional
    },
    # Another bounding box, without class or probability
    {"xStart": 200, "yStart": 22, "xEnd": 233, "yEnd": 80,},
]
# List of available classes
classes = ["bird", "orange", "background"]
# Threshold to count boxes as valid and threfore visible
threshold = 0.2
logger.log_image(
    img,
    "birds-with-bounding-boxes",
    step=3,
    boxes=boxes,
    classes=class_names,
    threshold=threshold,
)
```

### Text

Text can be logged with an additional expected text, which then shows a diff
between the actual text and the expected, otherwise it's just the text.

```python
logger.log_text("The quick brown fox...", "famous-sentence", step=1)
# With an expected text
logger.log_text("hello world", "with-diff", step=2, expected="Hallo Welt")
```

### Markdown

Markdown files are given as Markdown string.

```python
logger.log_markdown("# Hello\n\nMore markdown...", "some-markdown")

logger.log_markdown("# Step 1\\nn## Result\n\nGood", "for-step", step=1)
```

### Command Line Options

Command line arguments that have been used to start the experiment and all
available options for that script can be saved, if you use `argparse` to parse
the command line options.

```python
import argparse

parser = argparse.ArgumentParser()
# Add all options with: parser.add_argument()
# ...
args = parser.parse_args()

logger.log_command(parser, args)
```

### Summary

A summary of the current experiment, including current git commit, diff of the
working tree and possible more information, can be saved as a Markdown document.
There are two parts of the summary that can be configured, short infos that are
presented is an unordered list given as a dictionary, where the key is the label
(can be nested for nested lists), and additional sections at the end of the
document given as a dictionary where the key is the section title and the value
is either the raw string of the content or a list of lines.

```python
# Regular summary
logger.log_summary()


infos = {"size": {"train": 1000, "validation": 50}}
sections = {
    "Additional raw Information": "Some text\nand more",
    "Same but with Lines": ["Some text", "and more"],
}
logger.log_summary(infos, sections)
```

### Printing

Print functions act as print() combined with "".format() but also log all the
messages into a log file in addition to printing.

```python
#               Message              ...Format arguments
logger.println("{}, message: {num}", "Hello", num=1)
# => Hello, message: 1

# Same but to STDERR instead of STDOUT
logger.eprintln("{}, message: {num}", "Hello", num=1)
```

Nicely formatted tables with `print_table`:

```python
header = ["Name", "Correct", "Total", "Accuracy"]
rows = [
    ["Train", 978, 1000, 0.978],
    ["Validation", 90, 100, 0.9],
    ["Test", None, 50, None],
]
logger.print_table(header, rows)
# | Name       | Correct | Total | Accuracy |
# |------------|---------|-------|----------|
# | Train      | 978     | 1000  | 0.97800  |
# | Validation | 90      | 100   | 0.90000  |
# | Test       | -       | 50    | -        |
```

### Spinner

A small wrapper around [Halo][halo], that also logs start and end of the
duration to a log file. Requires a name as the first argument, which will be
used in the log file and as the text next to the spinner, unless the `text`
argument is supplied. All additional arguments are forwarded to `Halo`.

```python
with logger.spinner("Loading..."):
    # Do something that takes a while

# Without context manager, but also some arguments for Halo
spinner = logger.spinner("Manual spinner", spinner="smiley", placement="right")
spinner.start()
# Do something
spinner.stop()
```

### Progress Bar

A small wrapper around [tqdm][tqdm], that also logs start and end of the
duration to a log file. Requires a name as the first argument, which will be
used in the log file and as the text next to the progress bar, unless the `desc`
argument is supplied. All additional arguments are forwarded to `tqdm`.

```python
for d in logger.progress_bar("Train", data):
    # Trainings loop

# With context manager and manual update
with logger.progress_bar("Manual progress bar", total=100) as pbar:
    while True:
      # Do something

      # Update the progress bar by 4 steps
      pbar.update(4)
```

### Saving a PyTorch Model

_[PyTorch][pytorch] is optional, but must be installed to use this feature._

Saves the state dict of a PyTorch model and optionally its current gradients. If
the model is a Multi-GPU/Node model (i.e. `nn.DataParallel` or
`nn.parallel.DistributedDataParallel`) it will automatically be unwrapped to be
compatible with the initial model.

```python
model = ResNet101()
logger.save_model(model, step=3)

distributed_model = nn.parallel.DistributedDataParallel(
    model, device_ids=[gpu_id], find_unused_parameters=True
)
# Distributed model and also saving the gradients
logger.save_model(distributed_model, step=4, grads=True)
```

### Saving any Object

_[PyTorch][pytorch] is optional, but must be installed to use this feature._

Saves any object by serialising it with `torch.save`.

```python
logger.save_obj({ "key": 99 }, "a-dict")

logger.save_obj((1, 2), "some_tuple", step=4)
```

### Disabling the Logger

All logging actions can be disabled, either creating a disabled logger or by
later disabling it. The logging actions will be a no-op instead and all methods
can be used as if they were a regular logger but without having any output.
Methods that produce some useful information without logging anything continue
to function as usual.

Disabling the logger is particularly useful when the same script is launched
multiple in multiple processes, but only the main process should create the
logs.

```python
# Create a disabled logger
# No log directory will be created
logger = lavd.Logger("some-experiment-name", disabled=True)

# Does nothing
logger.println("hello")

# Won't show the spinner
with logger.spinner("Loading..."):
    # Do something that takes a while

# Create a normal logger
logger = lavd.Logger("some-experiment-name")

# Disable it later
logger.disable()

# Enable it (also works if the logger was initially disabled)
logger.enable()
```

## Data Layout

The server picks up any data that is present in the specified log directory that
can be displayed, but certain functionalities require a specific layout or file
format (but still human readable).

In the root of log directory (i.e. path given to `lavd`) the different
experiments are listed by name. Each experiment is in its own directory. Any
other file at the top level will be ignored, since it cannot be associated to
one experiment.

Within each experiment directory, there are numbered directories, which acts as
a step (epoch) within the experiment, and all files within that directory
belongs to that specific step. They are usually padded with zeros e.g. `0001`,
but that is not required, it only helps to show them in the correct order for
tools, such as `ls`, that shows them in lexicographical order. Since
`0001 == 01 == 1`, multiple representations of the same numbers will be in
conflict with each other, and should therefore be avoided.

Every other file, including nested files, are used as data based on their file
type and the relative path is used as a category label (minus the extension).
For example, the `some-experiment-name/0001/bird.png` is an image belonging to
the `some-experiment-name` experiment, of step 1 and with the label `bird`.
The same label can contain multiple data, let's say there is also a `bird.md`
file, so there would be a Markdown document with the label `bird`. The same
base name can therefore be used for different files, as long as they are in
a different category, this means that just using a different extension will
create a conflict (e.g. `birds.png` and `birds.jpeg`) and should also be
avoided.

Files not within a step directory are considered global to the experiment
(including nested files) and once in a nested directory, numbered directories
are just regular directories, not specific steps, for example
`some-experiment-name/other/0001/still-global.md` is a global to
`some-experiment-name` with the label `other/0001/still-global`.

There is one exception to the rule above, and that's for commands. Only
command is used per experiment, and it's the `commands.json` file, directly
inside the experiment directory, i.e. `some-experiment-name/commands.json`. Any
other file containing the commands, will not be displayed (just the command, all
other categories are still shown even if they are in a nested `command.json`)

JSON files are special in the sense that they can contain multiple kinds of
data. They are separated within the JSON file by using the appropriate keys.

The following directory structure is demonstrates the key ideas:

```sh
log/
├── some-experiment-name
│   ├── 0001                        # Step: 1
│   │   ├── bird.png                # Label: bird
│   │   ├── bird.json               # Label: bird
│   │   ├── bird.md                 # Label: bird
│   │   └── difficult
│   │       └── unlabelled
│   │           └── some image.png  # Label: difficult/unlabelled/some image
│   ├── 0002                        # Step 2
│   │   └── bird.png
│   ├── command.json                # Only command that can contain command definitions
│   ├── events.log
│   ├── other                       # Nested global
│   │   ├── 0001                    # Not a step directory!
│   │   │   └── still-global.md     # Label: other/0001/still-global
│   │   └── instructions.md         # Label: other/instructions
│   └── summary.md
├── yet-another-experiment
│   └── 0007
│       └── bird.png
├── top-level-file.md               # ignored
│
...
```

## Comparison to TensorBoard

[TensorBoard][tensorboard] is one of the most used tools for this purpose and
is undoubtedly incredibly useful. Unfortunately, it also has its drawbacks,
hence _Lavd_ was created with a different set of goals. One of the biggest
annoyances from of TensorBoard is the fact that everything is serialised, which
means that you cannot inspect the data directly without deserialising it first.
If sufficiently large amount of data is logged, it takes a while to see the last
event, because everything is deserialised in order. Additionally, something like
images cannot just be viewed with any other tool, since they are not saved as
image files. TensorBoard also limits the number of steps that can be viewed for
something for images or text, presumably to not blow up the serialised data, and
you cannot see anything in between or even decide which ones you want to keep.

_Lavd_ leverages the file system and the data can be freely viewed and edited
just like any other file, while still being picked up for the visualisation.
That also comes with its own drawbacks compared to serialising the data,
for example using more disk space, although that is almost certainly
a non-issue.

Another difference to TensorBoard is that it does not revolve around TensorFlow
graphs or tensors or Machine Learning in general, but focuses more on the actual
data. Even though TensorBoard has been used for a while with PyTorch, the
graphs are not really the focus of that, but it's mostly used to log the
scalars, text, images etc. While the graph visualisation are nice, they are not
used all that often, because once the graph has been seen there is just no
reason to create it again, but there are countless experiments that use the
exact same graph. This also means that _Lavd_ does not try to cover everything
that TensorBoard does. Of course there are also features that TensorBoard does
not cover.

## Known Issues

### Inotify watch limit reached

```
OSError: [Errno 28] inotify watch limit reached
```

The error occurs if there are too many files that are being watch at the same
time. You can check how many watches are allowed as follows:

```sh
cat /proc/sys/fs/inotify/max_user_watches
# => 8192
```

The default is 8192 on most systems. To solve the issue the limit needs to be
increased:

```sh
sudo sysctl fs.inotify.max_user_watches=524288
```

[actions-nodejs-badge]: https://github.com/jungomi/lavd/actions/workflows/nodejs.yml/badge.svg
[actions-nodejs-link]: https://github.com/jungomi/lavd/actions/workflows/nodejs.yml
[actions-python-badge]: https://github.com/jungomi/lavd/actions/workflows/python.yml/badge.svg
[actions-python-link]: https://github.com/jungomi/lavd/actions/workflows/python.yml
[halo]: https://github.com/manrajgrover/halo
[pytorch]: https://pytorch.org
[tensorboard]: https://github.com/tensorflow/tensorboard
[tqdm]: https://github.com/tqdm/tqdm

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/jungomi/lavd",
    "name": "lavd",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": "",
    "keywords": "log,visualise,visualize,data",
    "author": "Michael Jungo",
    "author_email": "michaeljungo92@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/ba/b1/c5cfea48b60c31326d74baa4d5cfadd65ae42ffedb106ab4e52836d8b0a6/lavd-0.2.7.tar.gz",
    "platform": null,
    "description": "# Log and Visualise Data (Lavd)\n\n[![Node.js][actions-nodejs-badge]][actions-nodejs-link]\n[![Python][actions-python-badge]][actions-python-link]\n\n![Lavd Screenshot](.github/lavd.png)\n\nTracking the progress of Machine Learning (ML) experiments is essential for\ndeciding what model works well with the available data. Using some kind of\nmetric, for example Intersection over Union (IoU) for object detection, gives\na good idea of the quality of the model, but additionally, it is important to\nalso see a particular example to get a better intuition. This, and more can be\nlogged and visualised in an interactive manner, including:\n\n- Scalars\n- Images\n- Text\n- Log files\n- Markdown documents\n- CLI commands (how to run the experiment with the same parameters)\n\n_...and more to come._\n\n## Table of Contents\n\n- [Table of Contents](#table-of-contents)\n- [Getting Started](#getting-started)\n- [Logging Data](#logging-data)\n  - [Scalars](#scalars)\n  - [Images](#images)\n  - [Text](#text)\n  - [Markdown](#markdown)\n  - [Command Line Options](#command-line-options)\n  - [Summary](#summary)\n  - [Printing](#printing)\n  - [Spinner](#spinner)\n  - [Progress Bar](#progress-bar)\n  - [Saving a PyTorch Model](#saving-a-pytorch-model)\n  - [Saving any Object](#saving-any-object)\n  - [Disabling the Logger](#disabling-the-logger)\n- [Data Layout](#data-layout)\n- [Comparison to TensorBoard](#comparison-to-tensorboard)\n- [Known Issues](#known-issues)\n  - [Inotify watch limit reached](#inotify-watch-limit-reached)\n\n## Getting Started\n\nInstalling with pip:\n\n```sh\npip install --user --upgrade lavd\n```\n\nOnce installed, the server can be started by running `lavd` with the directory\ncontaining the (logged) data:\n\n```sh\nlavd path/to/logs\n```\n\nOr if you prefer to run it as a module instead of using the executable (both\nversions do exactly the same):\n\n```sh\npython -m lavd.server path/to/logs\n```\n\n## Logging Data\n\n_Lavd_ includes a logger that can be used to easily log all the desired data.\n\n```python\nimport lavd\n\n# Logs everything to log/some-experiment-name/\n# If no name is giving it the current date and time are used as a name\nlogger = lavd.Logger(\"some-experiment-name\")\n```\n\n### Scalars\n\nScalars will be plotted and unlike other categories, the have to be assigned to\na step.\n\n```python\nlogger.log_scalar(0.8, \"accuracy\", step=1)\nlogger.log_scalar(0.6, \"accuracy\", step=2)\nlogger.log_scalar(0.3, \"accuracy\", step=3)\n\nlogger.log_scalar(0.1, \"easy/accuracy\", step=7)\nlogger.log_scalar(0.05, \"easy/accuracy\", step=14)\n```\n\n### Images\n\nLogging images with optional bounding boxes that for interactive visualisation.\n\n```python\n # Saves image to: log/some-experiment-name/0001/bird.png\nlogger.log_image(image, \"bird\", step=1)\n # Saves image to: log/some-experiment-name/0001/other/bird.png\nlogger.log_image(image, \"other/bird\", step=1)\n# No step saves it to: log/some-experiment-name/other/bird.png\nlogger.log_image(image, \"other/bird\")\n\n# With bounding boxes, each box is given by 2 points: top-left and bottom-right.\nboxes = [\n    {\n        \"xStart\": 100,\n        \"yStart\": 100,\n        \"xEnd\": 150,\n        \"yEnd\": 200,\n        \"className\": \"bird\", # Optional\n        \"probability\": 0.4,  # Optional\n    },\n    # Another bounding box, without class or probability\n    {\"xStart\": 200, \"yStart\": 22, \"xEnd\": 233, \"yEnd\": 80,},\n]\n# List of available classes\nclasses = [\"bird\", \"orange\", \"background\"]\n# Threshold to count boxes as valid and threfore visible\nthreshold = 0.2\nlogger.log_image(\n    img,\n    \"birds-with-bounding-boxes\",\n    step=3,\n    boxes=boxes,\n    classes=class_names,\n    threshold=threshold,\n)\n```\n\n### Text\n\nText can be logged with an additional expected text, which then shows a diff\nbetween the actual text and the expected, otherwise it's just the text.\n\n```python\nlogger.log_text(\"The quick brown fox...\", \"famous-sentence\", step=1)\n# With an expected text\nlogger.log_text(\"hello world\", \"with-diff\", step=2, expected=\"Hallo Welt\")\n```\n\n### Markdown\n\nMarkdown files are given as Markdown string.\n\n```python\nlogger.log_markdown(\"# Hello\\n\\nMore markdown...\", \"some-markdown\")\n\nlogger.log_markdown(\"# Step 1\\\\nn## Result\\n\\nGood\", \"for-step\", step=1)\n```\n\n### Command Line Options\n\nCommand line arguments that have been used to start the experiment and all\navailable options for that script can be saved, if you use `argparse` to parse\nthe command line options.\n\n```python\nimport argparse\n\nparser = argparse.ArgumentParser()\n# Add all options with: parser.add_argument()\n# ...\nargs = parser.parse_args()\n\nlogger.log_command(parser, args)\n```\n\n### Summary\n\nA summary of the current experiment, including current git commit, diff of the\nworking tree and possible more information, can be saved as a Markdown document.\nThere are two parts of the summary that can be configured, short infos that are\npresented is an unordered list given as a dictionary, where the key is the label\n(can be nested for nested lists), and additional sections at the end of the\ndocument given as a dictionary where the key is the section title and the value\nis either the raw string of the content or a list of lines.\n\n```python\n# Regular summary\nlogger.log_summary()\n\n\ninfos = {\"size\": {\"train\": 1000, \"validation\": 50}}\nsections = {\n    \"Additional raw Information\": \"Some text\\nand more\",\n    \"Same but with Lines\": [\"Some text\", \"and more\"],\n}\nlogger.log_summary(infos, sections)\n```\n\n### Printing\n\nPrint functions act as print() combined with \"\".format() but also log all the\nmessages into a log file in addition to printing.\n\n```python\n#               Message              ...Format arguments\nlogger.println(\"{}, message: {num}\", \"Hello\", num=1)\n# => Hello, message: 1\n\n# Same but to STDERR instead of STDOUT\nlogger.eprintln(\"{}, message: {num}\", \"Hello\", num=1)\n```\n\nNicely formatted tables with `print_table`:\n\n```python\nheader = [\"Name\", \"Correct\", \"Total\", \"Accuracy\"]\nrows = [\n    [\"Train\", 978, 1000, 0.978],\n    [\"Validation\", 90, 100, 0.9],\n    [\"Test\", None, 50, None],\n]\nlogger.print_table(header, rows)\n# | Name       | Correct | Total | Accuracy |\n# |------------|---------|-------|----------|\n# | Train      | 978     | 1000  | 0.97800  |\n# | Validation | 90      | 100   | 0.90000  |\n# | Test       | -       | 50    | -        |\n```\n\n### Spinner\n\nA small wrapper around [Halo][halo], that also logs start and end of the\nduration to a log file. Requires a name as the first argument, which will be\nused in the log file and as the text next to the spinner, unless the `text`\nargument is supplied. All additional arguments are forwarded to `Halo`.\n\n```python\nwith logger.spinner(\"Loading...\"):\n    # Do something that takes a while\n\n# Without context manager, but also some arguments for Halo\nspinner = logger.spinner(\"Manual spinner\", spinner=\"smiley\", placement=\"right\")\nspinner.start()\n# Do something\nspinner.stop()\n```\n\n### Progress Bar\n\nA small wrapper around [tqdm][tqdm], that also logs start and end of the\nduration to a log file. Requires a name as the first argument, which will be\nused in the log file and as the text next to the progress bar, unless the `desc`\nargument is supplied. All additional arguments are forwarded to `tqdm`.\n\n```python\nfor d in logger.progress_bar(\"Train\", data):\n    # Trainings loop\n\n# With context manager and manual update\nwith logger.progress_bar(\"Manual progress bar\", total=100) as pbar:\n    while True:\n      # Do something\n\n      # Update the progress bar by 4 steps\n      pbar.update(4)\n```\n\n### Saving a PyTorch Model\n\n_[PyTorch][pytorch] is optional, but must be installed to use this feature._\n\nSaves the state dict of a PyTorch model and optionally its current gradients. If\nthe model is a Multi-GPU/Node model (i.e. `nn.DataParallel` or\n`nn.parallel.DistributedDataParallel`) it will automatically be unwrapped to be\ncompatible with the initial model.\n\n```python\nmodel = ResNet101()\nlogger.save_model(model, step=3)\n\ndistributed_model = nn.parallel.DistributedDataParallel(\n    model, device_ids=[gpu_id], find_unused_parameters=True\n)\n# Distributed model and also saving the gradients\nlogger.save_model(distributed_model, step=4, grads=True)\n```\n\n### Saving any Object\n\n_[PyTorch][pytorch] is optional, but must be installed to use this feature._\n\nSaves any object by serialising it with `torch.save`.\n\n```python\nlogger.save_obj({ \"key\": 99 }, \"a-dict\")\n\nlogger.save_obj((1, 2), \"some_tuple\", step=4)\n```\n\n### Disabling the Logger\n\nAll logging actions can be disabled, either creating a disabled logger or by\nlater disabling it. The logging actions will be a no-op instead and all methods\ncan be used as if they were a regular logger but without having any output.\nMethods that produce some useful information without logging anything continue\nto function as usual.\n\nDisabling the logger is particularly useful when the same script is launched\nmultiple in multiple processes, but only the main process should create the\nlogs.\n\n```python\n# Create a disabled logger\n# No log directory will be created\nlogger = lavd.Logger(\"some-experiment-name\", disabled=True)\n\n# Does nothing\nlogger.println(\"hello\")\n\n# Won't show the spinner\nwith logger.spinner(\"Loading...\"):\n    # Do something that takes a while\n\n# Create a normal logger\nlogger = lavd.Logger(\"some-experiment-name\")\n\n# Disable it later\nlogger.disable()\n\n# Enable it (also works if the logger was initially disabled)\nlogger.enable()\n```\n\n## Data Layout\n\nThe server picks up any data that is present in the specified log directory that\ncan be displayed, but certain functionalities require a specific layout or file\nformat (but still human readable).\n\nIn the root of log directory (i.e. path given to `lavd`) the different\nexperiments are listed by name. Each experiment is in its own directory. Any\nother file at the top level will be ignored, since it cannot be associated to\none experiment.\n\nWithin each experiment directory, there are numbered directories, which acts as\na step (epoch) within the experiment, and all files within that directory\nbelongs to that specific step. They are usually padded with zeros e.g. `0001`,\nbut that is not required, it only helps to show them in the correct order for\ntools, such as `ls`, that shows them in lexicographical order. Since\n`0001 == 01 == 1`, multiple representations of the same numbers will be in\nconflict with each other, and should therefore be avoided.\n\nEvery other file, including nested files, are used as data based on their file\ntype and the relative path is used as a category label (minus the extension).\nFor example, the `some-experiment-name/0001/bird.png` is an image belonging to\nthe `some-experiment-name` experiment, of step 1 and with the label `bird`.\nThe same label can contain multiple data, let's say there is also a `bird.md`\nfile, so there would be a Markdown document with the label `bird`. The same\nbase name can therefore be used for different files, as long as they are in\na different category, this means that just using a different extension will\ncreate a conflict (e.g. `birds.png` and `birds.jpeg`) and should also be\navoided.\n\nFiles not within a step directory are considered global to the experiment\n(including nested files) and once in a nested directory, numbered directories\nare just regular directories, not specific steps, for example\n`some-experiment-name/other/0001/still-global.md` is a global to\n`some-experiment-name` with the label `other/0001/still-global`.\n\nThere is one exception to the rule above, and that's for commands. Only\ncommand is used per experiment, and it's the `commands.json` file, directly\ninside the experiment directory, i.e. `some-experiment-name/commands.json`. Any\nother file containing the commands, will not be displayed (just the command, all\nother categories are still shown even if they are in a nested `command.json`)\n\nJSON files are special in the sense that they can contain multiple kinds of\ndata. They are separated within the JSON file by using the appropriate keys.\n\nThe following directory structure is demonstrates the key ideas:\n\n```sh\nlog/\n\u251c\u2500\u2500 some-experiment-name\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0001                        # Step: 1\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 bird.png                # Label: bird\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 bird.json               # Label: bird\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 bird.md                 # Label: bird\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 difficult\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0     \u2514\u2500\u2500 unlabelled\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0         \u2514\u2500\u2500 some image.png  # Label: difficult/unlabelled/some image\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 0002                        # Step 2\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 bird.png\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 command.json                # Only command that can contain command definitions\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 events.log\n\u2502\u00a0\u00a0 \u251c\u2500\u2500 other                       # Nested global\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u251c\u2500\u2500 0001                    # Not a step directory!\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 still-global.md     # Label: other/0001/still-global\n\u2502\u00a0\u00a0 \u2502\u00a0\u00a0 \u2514\u2500\u2500 instructions.md         # Label: other/instructions\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 summary.md\n\u251c\u2500\u2500 yet-another-experiment\n\u2502\u00a0\u00a0 \u2514\u2500\u2500 0007\n\u2502\u00a0\u00a0     \u2514\u2500\u2500 bird.png\n\u251c\u2500\u2500 top-level-file.md               # ignored\n\u2502\n...\n```\n\n## Comparison to TensorBoard\n\n[TensorBoard][tensorboard] is one of the most used tools for this purpose and\nis undoubtedly incredibly useful. Unfortunately, it also has its drawbacks,\nhence _Lavd_ was created with a different set of goals. One of the biggest\nannoyances from of TensorBoard is the fact that everything is serialised, which\nmeans that you cannot inspect the data directly without deserialising it first.\nIf sufficiently large amount of data is logged, it takes a while to see the last\nevent, because everything is deserialised in order. Additionally, something like\nimages cannot just be viewed with any other tool, since they are not saved as\nimage files. TensorBoard also limits the number of steps that can be viewed for\nsomething for images or text, presumably to not blow up the serialised data, and\nyou cannot see anything in between or even decide which ones you want to keep.\n\n_Lavd_ leverages the file system and the data can be freely viewed and edited\njust like any other file, while still being picked up for the visualisation.\nThat also comes with its own drawbacks compared to serialising the data,\nfor example using more disk space, although that is almost certainly\na non-issue.\n\nAnother difference to TensorBoard is that it does not revolve around TensorFlow\ngraphs or tensors or Machine Learning in general, but focuses more on the actual\ndata. Even though TensorBoard has been used for a while with PyTorch, the\ngraphs are not really the focus of that, but it's mostly used to log the\nscalars, text, images etc. While the graph visualisation are nice, they are not\nused all that often, because once the graph has been seen there is just no\nreason to create it again, but there are countless experiments that use the\nexact same graph. This also means that _Lavd_ does not try to cover everything\nthat TensorBoard does. Of course there are also features that TensorBoard does\nnot cover.\n\n## Known Issues\n\n### Inotify watch limit reached\n\n```\nOSError: [Errno 28] inotify watch limit reached\n```\n\nThe error occurs if there are too many files that are being watch at the same\ntime. You can check how many watches are allowed as follows:\n\n```sh\ncat /proc/sys/fs/inotify/max_user_watches\n# => 8192\n```\n\nThe default is 8192 on most systems. To solve the issue the limit needs to be\nincreased:\n\n```sh\nsudo sysctl fs.inotify.max_user_watches=524288\n```\n\n[actions-nodejs-badge]: https://github.com/jungomi/lavd/actions/workflows/nodejs.yml/badge.svg\n[actions-nodejs-link]: https://github.com/jungomi/lavd/actions/workflows/nodejs.yml\n[actions-python-badge]: https://github.com/jungomi/lavd/actions/workflows/python.yml/badge.svg\n[actions-python-link]: https://github.com/jungomi/lavd/actions/workflows/python.yml\n[halo]: https://github.com/manrajgrover/halo\n[pytorch]: https://pytorch.org\n[tensorboard]: https://github.com/tensorflow/tensorboard\n[tqdm]: https://github.com/tqdm/tqdm\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Log and Visualise Data",
    "version": "0.2.7",
    "project_urls": {
        "Homepage": "https://github.com/jungomi/lavd"
    },
    "split_keywords": [
        "log",
        "visualise",
        "visualize",
        "data"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "84301e98c428a5331b9e593697609441ddab0e499ba0d41d0a1422eef39e4617",
                "md5": "890ecf0a54629d8bca3e0e7b7e2b5e46",
                "sha256": "a896675ee126d9081c2b5258d1ab8994730fa9bcb59eaa15076a05e982feffc6"
            },
            "downloads": -1,
            "filename": "lavd-0.2.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "890ecf0a54629d8bca3e0e7b7e2b5e46",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.9",
            "size": 563252,
            "upload_time": "2023-07-13T09:37:17",
            "upload_time_iso_8601": "2023-07-13T09:37:17.059451Z",
            "url": "https://files.pythonhosted.org/packages/84/30/1e98c428a5331b9e593697609441ddab0e499ba0d41d0a1422eef39e4617/lavd-0.2.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "bab1c5cfea48b60c31326d74baa4d5cfadd65ae42ffedb106ab4e52836d8b0a6",
                "md5": "a7779f70eb462872d6f9f20a74dc5492",
                "sha256": "ed8326337757da1b6a2bedfa7afd4595ef616ee6f51e064db424474f11adf559"
            },
            "downloads": -1,
            "filename": "lavd-0.2.7.tar.gz",
            "has_sig": false,
            "md5_digest": "a7779f70eb462872d6f9f20a74dc5492",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 6155937,
            "upload_time": "2023-07-13T09:37:19",
            "upload_time_iso_8601": "2023-07-13T09:37:19.313520Z",
            "url": "https://files.pythonhosted.org/packages/ba/b1/c5cfea48b60c31326d74baa4d5cfadd65ae42ffedb106ab4e52836d8b0a6/lavd-0.2.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-07-13 09:37:19",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "jungomi",
    "github_project": "lavd",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [
        {
            "name": "halo",
            "specs": [
                [
                    "==",
                    "0.0.31"
                ]
            ]
        },
        {
            "name": "Pillow",
            "specs": [
                [
                    "==",
                    "10.0.0"
                ]
            ]
        },
        {
            "name": "simplejson",
            "specs": [
                [
                    "==",
                    "3.19.1"
                ]
            ]
        },
        {
            "name": "types-simplejson",
            "specs": [
                [
                    "==",
                    "3.19.0.1"
                ]
            ]
        },
        {
            "name": "tornado",
            "specs": [
                [
                    "==",
                    "6.3.2"
                ]
            ]
        },
        {
            "name": "tqdm",
            "specs": [
                [
                    "==",
                    "4.65.0"
                ]
            ]
        },
        {
            "name": "watchdog",
            "specs": [
                [
                    "==",
                    "3.0.0"
                ]
            ]
        },
        {
            "name": "torch",
            "specs": [
                [
                    "==",
                    "2.0.1"
                ]
            ]
        },
        {
            "name": "torchvision",
            "specs": [
                [
                    "==",
                    "0.15.2"
                ]
            ]
        },
        {
            "name": "mypy",
            "specs": [
                [
                    "==",
                    "1.4.1"
                ]
            ]
        },
        {
            "name": "ruff",
            "specs": [
                [
                    "==",
                    "0.0.278"
                ]
            ]
        },
        {
            "name": "black",
            "specs": [
                [
                    "==",
                    "23.7.0"
                ]
            ]
        },
        {
            "name": "isort",
            "specs": [
                [
                    "==",
                    "5.12.0"
                ]
            ]
        },
        {
            "name": "build",
            "specs": [
                [
                    "==",
                    "0.10.0"
                ]
            ]
        }
    ],
    "lcname": "lavd"
}

Michael Jungo