torchbricks

Name: torchbricks
Version: 0.3.0
Summary: Decoupled and modular approach to building multi-task ML models using a single model recipe for all model stages
Upload time: 2024-04-24 22:54:58
Requires Python: >=3.7
License: MIT License (Copyright (c) 2023 Peter Christiansen)
Keywords: torch, multi-task, machine learning
            <!--

---
jupyter:
  jupytext:
    hide_notebook_metadata: true
    text_representation:
      extension: .md
      format_name: markdown
      format_version: '1.3'
      jupytext_version: 1.14.5
  kernelspec:
    display_name: torchbricks
    language: python
    name: python3
---

-->

# TorchBricks
[![codecov](https://codecov.io/gh/pete-machine/torchbricks/branch/main/graph/badge.svg?token=torchbricks_token_here)](https://codecov.io/gh/pete-machine/torchbricks)
[![CI](https://github.com/pete-machine/torchbricks/actions/workflows/main.yml/badge.svg)](https://github.com/pete-machine/torchbricks/actions/workflows/main.yml)


TorchBricks builds pytorch models using small, reusable and decoupled parts - we call them bricks. 

The concept is simple and flexible and allows you to more easily combine, add or swap out parts of the model 
(preprocessor, backbone, neck, head or post-processor), change the task or extend it with multiple tasks.

TorchBricks is a compact recipe for both *how* model parts are connected and *when* parts should be executed 
during different model stages such as training, validation, testing, inference and export.

TorchBricks is NOT a framework! - it is just a thin abstraction on top of pytorch modules. 

<!-- #region -->

## Install it with pip

```bash
pip install torchbricks
```
<!-- #endregion -->

## Bricks by example

To demonstrate the concepts of TorchBricks, we will first specify some dummy parts used in a regular image recognition model: 
a preprocessor, a backbone and a head (in this case a classifier).
*Note: Don't worry about the actual implementation of these modules - they are just dummy examples.*

```python
from typing import Tuple

import torch
from torch import nn


class PreprocessorDummy(nn.Module):
    def forward(self, raw_input: torch.Tensor) -> torch.Tensor:
        return raw_input / 2


class TinyModel(nn.Module):
    def __init__(self, n_channels: int, n_features: int) -> None:
        super().__init__()
        self.conv = nn.Conv2d(n_channels, n_features, kernel_size=1)

    def forward(self, tensor: torch.Tensor) -> torch.Tensor:
        return self.conv(tensor)


class ClassifierDummy(nn.Module):
    def __init__(self, num_classes: int, in_features: int) -> None:
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.softmax = nn.Softmax(dim=1)

    def forward(self, tensor: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        logits = self.fc(torch.flatten(self.avgpool(tensor), start_dim=1))
        return logits, self.softmax(logits)
```


## Concept 1: Bricks are connected
An important concept of TorchBricks is that it defines how modules are connected by specifying the input and output names of
each module, similar to a DAG. 

In the code snippet below, we demonstrate how this would look for our dummy model. 

```python
from torchbricks.brick_collection import BrickCollection
from torchbricks.bricks import BrickNotTrainable, BrickTrainable

bricks = {
    "preprocessor": BrickNotTrainable(PreprocessorDummy(), input_names=["raw_images"], output_names=["processed"]),
    "backbone": BrickTrainable(TinyModel(n_channels=3, n_features=10), input_names=["processed"], output_names=["embedding"]),
    "head": BrickTrainable(ClassifierDummy(num_classes=3, in_features=10), input_names=["embedding"], output_names=["logits", "softmaxed"]),
}
brick_collection = BrickCollection(bricks)
# print(create_mermaid_dag_graph(brick_collection))
print(brick_collection)
```

Each module is placed in a dictionary with a unique name and wrapped inside a brick with input and output names. 
Input and output names specify how the outputs of one module are passed to the inputs of the next module. 

In the above example, we use `BrickNotTrainable` to wrap modules that shouldn't be trained (weights are fixed) and 
`BrickTrainable` to wrap modules that are trainable (weights are updated on each training iteration). 

Finally, the dictionary of bricks is passed to a `BrickCollection`. 

Below we visualize how the brick collection connects bricks together. 


```mermaid
flowchart LR
    %% Brick definitions
    preprocessor(<strong>'preprocessor': PreprocessorDummy</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable
    backbone(<strong>'backbone': TinyModel</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head(<strong>'head': ClassifierDummy</strong><br><i>BrickTrainable</i>):::BrickTrainable
    
    %% Draw input and outputs
    raw_images:::input --> preprocessor
    
    %% Draw nodes and edges
    preprocessor --> |processed| backbone
    backbone --> |embedding| head
    head --> logits:::output
    head --> softmaxed:::output
    
    %% Add styling
    classDef arrow stroke-width:0px,fill-opacity:0.0 
    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 
    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 
    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 
    classDef BrickTrainable stroke-width:0px,fill:#6D597A 
    
    %% Add legends
    subgraph Legends
        input(input):::input
        output(output):::output
    end
```
*Graph is visualized using [mermaid](https://github.com/mermaid-js/mermaid) syntax.*
*We provide the `create_mermaid_dag_graph`-function to create a brick collection visualization*


The `BrickCollection` is used for executing the above graph by passing a dictionary with named input data (`named_inputs`). 

For the above brick collection, we only expect one named input called `raw_images`. 

```python
batch_size = 2
batched_images = torch.rand((batch_size, 3, 100, 200))
named_inputs = {"raw_images": batched_images}
named_outputs = brick_collection(named_inputs=named_inputs)
print("Brick outputs:", named_outputs.keys())
# Brick outputs: dict_keys(['raw_images', 'processed', 'embedding', 'logits', 'softmaxed'])
```

The brick collection accepts a dictionary and returns a dictionary with all intermediate and resulting tensors. 

Running our models as a brick collection has the following advantages:

- A brick collection acts as a regular `nn.Module` with all the familiar features: a `forward`-function, a `to`-function to move 
  to a specific device/precision, you can save/load a model, management of parameters, onnx exportable etc. 
- A brick collection is also a simple DAG: it accepts a dictionary with "named data" (we call this `named_inputs`), 
executes each brick and ensures that the outputs are passed to the inputs of other bricks with matching names (see the sketch below). 
Structuring the model as a DAG makes it easy to add/remove outputs for a given module during development, add new modules to the
collection and build completely new models from reusable parts. 
- A brick collection is actually a dictionary (`nn.ModuleDict`), allowing you to access, pop and update the 
  collection easily as a regular dictionary. It can also handle nested dictionaries, allowing groups of bricks to be added/removed easily. 
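
To make the data flow concrete, here is a minimal, hypothetical sketch of the idea (not the actual `BrickCollection` implementation): each brick reads its inputs from a shared dictionary by name and writes its outputs back under its output names.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical sketch of the data flow only - not the torchbricks implementation.
def run_named_modules(
    modules: Sequence[Tuple[Callable, List[str], List[str]]],
    named_inputs: Dict[str, torch.Tensor],
) -> Dict[str, torch.Tensor]:
    named_data = dict(named_inputs)
    for module, input_names, output_names in modules:
        outputs = module(*(named_data[name] for name in input_names))
        if len(output_names) == 1:
            outputs = (outputs,)
        named_data.update(zip(output_names, outputs))
    return named_data


named_data = run_named_modules(
    modules=[
        (PreprocessorDummy(), ["raw_images"], ["processed"]),
        (TinyModel(n_channels=3, n_features=10), ["processed"], ["embedding"]),
        (ClassifierDummy(num_classes=3, in_features=10), ["embedding"], ["logits", "softmaxed"]),
    ],
    named_inputs={"raw_images": torch.rand((2, 3, 100, 200))},
)
print(named_data.keys())  # dict_keys(['raw_images', 'processed', 'embedding', 'logits', 'softmaxed'])
```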


## Concept 2: Bricks are grouped
Another important concept is that bricks can be executed in groups. 

To demonstrate how and why this is useful, we have added the `group` argument to each brick and introduced the `BrickLoss` brick.

```python
from torchbricks.bricks import BrickLoss

bricks = {
    "preprocessor": BrickNotTrainable(PreprocessorDummy(), input_names=["raw_images"], output_names=["processed"], group="MODEL"),
    "backbone": BrickTrainable(
        TinyModel(n_channels=3, n_features=10), input_names=["processed"], output_names=["embedding"], group="MODEL"
    ),
    "head": BrickTrainable(
        ClassifierDummy(num_classes=3, in_features=10),
        input_names=["embedding"],
        output_names=["logits", "softmaxed"],
        group="MODEL",
    ),
    "loss": BrickLoss(model=nn.CrossEntropyLoss(), input_names=["logits", "targets"], output_names=["loss_ce"], group="LOSS"),
}
brick_collection = BrickCollection(bricks)

print(brick_collection)
# BrickCollection(
#   (preprocessor): BrickNotTrainable(PreprocessorDummy, input_names=['raw_images'], output_names=['processed'], groups={'MODEL'})
#   (backbone): BrickTrainable(TinyModel, input_names=['processed'], output_names=['embedding'], groups={'MODEL'})
#   (head): BrickTrainable(ClassifierDummy, input_names=['embedding'], output_names=['logits', 'softmaxed'], groups={'MODEL'})
#   (loss): BrickLoss(CrossEntropyLoss, input_names=['logits', 'targets'], output_names=['loss_ce'], groups={'LOSS'})
# )
# print(create_mermaid_dag_graph(brick_collection))
```

With group names, it is now possible to execute a desired subset of the model 
by passing `groups` to the forward call.

Here are a few examples: 

```python
named_inputs = {"raw_images": batched_images, "targets": torch.ones((batch_size), dtype=torch.int64)}

# With no groups specified, all bricks are executed
named_outputs = brick_collection(named_inputs=named_inputs)

# With groups specified, only bricks in the specified groups are executed
named_outputs = brick_collection(named_inputs=named_inputs, groups={"MODEL"})
```

Groups are an important concept in our model recipe as they allow us to specify how the model will act during different model stages. 



**Brick collection during inference and export:**

During the `Inference` and `Export` model stages, we do not have ground truth labels and we want to skip loss calculations. 

```python
# Execute only "MODEL" group bricks
named_outputs = brick_collection(named_inputs=named_inputs, groups={"MODEL"})
```

The graph will look like this and note that the graph only requires `raw_images` as input:
```mermaid
flowchart LR
    %% Brick definitions
    preprocessor(<strong>'preprocessor': PreprocessorDummy</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable
    backbone(<strong>'backbone': TinyModel</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head(<strong>'head': ClassifierDummy</strong><br><i>BrickTrainable</i>):::BrickTrainable
    
    %% Draw input and outputs
    raw_images:::input --> preprocessor
    
    %% Draw nodes and edges
    preprocessor --> |processed| backbone
    backbone --> |embedding| head
    head --> logits:::output
    head --> softmaxed:::output
    
    %% Add styling
    classDef arrow stroke-width:0px,fill-opacity:0.0 
    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 
    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 
    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 
    classDef BrickTrainable stroke-width:0px,fill:#6D597A 
    
    %% Add legends
    subgraph Legends
        input(input):::input
        output(output):::output
    end
```


**Brick collection during train, test and validation:**

During "Train", "Test" and "Validation", `targets` are available and we want to calculate the loss to 
both improve the model and track loss curves. 

```python
# Execute all groups
named_outputs = brick_collection(named_inputs=named_inputs)

# Or execute explicitly "MODEL" and "LOSS" group bricks
named_outputs = brick_collection(named_inputs=named_inputs, groups={"MODEL", "LOSS"})
```

The graph will look like this and note that the graph now requires `raw_images` and `targets` as input:

```mermaid
flowchart LR
    %% Brick definitions
    preprocessor(<strong>'preprocessor': PreprocessorDummy</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable
    backbone(<strong>'backbone': TinyModel</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head(<strong>'head': ClassifierDummy</strong><br><i>BrickTrainable</i>):::BrickTrainable
    loss(<strong>'loss': CrossEntropyLoss</strong><br><i>BrickLoss</i>):::BrickLoss
    
    %% Draw input and outputs
    raw_images:::input --> preprocessor
    targets:::input --> loss
    
    %% Draw nodes and edges
    preprocessor --> |processed| backbone
    backbone --> |embedding| head
    head --> |logits| loss
    head --> softmaxed:::output
    loss --> loss_ce:::output
    
    %% Add styling
    classDef arrow stroke-width:0px,fill-opacity:0.0 
    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 
    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 
    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 
    classDef BrickTrainable stroke-width:0px,fill:#6D597A 
    classDef BrickLoss stroke-width:0px,fill:#5C677D 
    
    %% Add legends
    subgraph Legends
        input(input):::input
        output(output):::output
    end
```


As demonstrated in the above example, we can easily change the required inputs by changing the model stage.
That allows us to support two basic use cases:

1) When labels/targets are available, we have the option of getting model predictions along with losses and metrics.

2) When labels/targets are **not** available, we only produce model predictions, as used for model inference/export.

The mechanism of activating different parts of the model and making losses, metrics and visualizations part of the model recipe 
allows us to more easily investigate/debug/visualize model parts in a notebook or scratch scripts.


## Brick features: 



### Brick features: TorchMetrics
**We are not creating a training framework**, but to easily use the brick collection in your favorite training framework or custom 
training/validation/test loop, we need the option of **calculating model metrics**.

To easily inject model, losses and metrics together, we also need to support metrics and to calculate them across a dataset. 
We will extend our example from before by adding metric bricks. 

To calculate metrics across a dataset, we heavily rely on concepts and functions used in the 
[TorchMetrics](https://torchmetrics.readthedocs.io/en/stable/) library.

The use of TorchMetrics in a brick collection is demonstrated in the code snippet below. 

```python
import torchvision
from torchbricks.bag_of_bricks.backbones import resnet_to_brick
from torchbricks.bag_of_bricks.image_classification import ImageClassifier
from torchbricks.bag_of_bricks.preprocessors import Preprocessor
from torchbricks.bricks import BrickLoss, BrickMetricSingle
from torchmetrics.classification import MulticlassAccuracy

num_classes = 10
resnet = torchvision.models.resnet18(weights=None, num_classes=num_classes)
resnet_brick = resnet_to_brick(resnet=resnet, input_name="normalized", output_name="features")
n_features = resnet_brick.model.n_backbone_features
bricks = {
    "preprocessor": BrickNotTrainable(Preprocessor(), input_names=["raw"], output_names=["normalized"]),
    "backbone": resnet_brick,
    "head": BrickTrainable(
        ImageClassifier(num_classes=num_classes, n_features=n_features),
        input_names=["features"],
        output_names=["logits", "probabilities", "class_prediction"],
    ),
    "accuracy": BrickMetricSingle(MulticlassAccuracy(num_classes=num_classes), input_names=["class_prediction", "targets"]),
    "loss": BrickLoss(model=nn.CrossEntropyLoss(), input_names=["logits", "targets"], output_names=["loss_ce"]),
}
brick_collection = BrickCollection(bricks)
```

We will now use the brick collection above to simulate how a user can iterate over a dataset.

```python
# Simulate dataloader
named_input_simulated = {"raw": batched_images, "targets": torch.ones((batch_size), dtype=torch.int64)}
dataloader_simulated = [named_input_simulated for _ in range(5)]

# Loop over the dataset
for named_inputs in dataloader_simulated:  # Simulates iterating over the dataset
    named_outputs = brick_collection(named_inputs=named_inputs)
    named_outputs_losses_only = brick_collection.extract_losses(named_outputs=named_outputs)

metrics = brick_collection.summarize(reset=True)
print(f"{named_outputs.keys()=}")
# named_outputs.keys()=dict_keys(['raw', 'targets', 'stage', 'normalized', 'features', 'logits', 'probabilities', 'class_prediction', 'loss_ce'])
print(f"{metrics=}")
# metrics={'MulticlassAccuracy': tensor(0.)}
```

For each iteration in our (simulated) dataset, we calculate model outputs, losses and metrics for each batch. 

Losses are calculated and returned in `named_outputs` together with other model outputs. 
We provide `extract_losses` as a simple function to filter `named_outputs` and only return losses in a new dictionary. 

Unlike other bricks, metric bricks (`BrickMetrics`/`BrickMetricSingle`) will not (by default) output metrics for each batch. 
Instead, metrics are stored internally and only aggregated and returned when
the `summarize` function is called. In the above example, the metric is aggregated over 5 batches and summarized to a single value. 

It is important to note that we set `reset=True` to reset the internal aggregation of metrics.  
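
As a usage sketch (reusing the simulated dataloader from above), a typical pattern is to summarize and reset once per epoch:

```python
# Usage sketch: aggregate metrics over an epoch, then reset the internal state.
for epoch in range(2):
    for named_inputs in dataloader_simulated:
        brick_collection(named_inputs=named_inputs)
    epoch_metrics = brick_collection.summarize(reset=True)  # aggregated over the whole epoch
    print(f"{epoch=}, {epoch_metrics=}")
```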

**Additional notes on metrics**

You have the option of either using a single metric (`torchmetrics.Metric`) with `BrickMetricSingle` or a collection of 
metrics (`torchmetrics.MetricCollection`) with `BrickMetrics`.

For multiple metrics, we advise using `BrickMetrics` with a `torchmetrics.MetricCollection` 
([doc](https://torchmetrics.readthedocs.io/en/stable/pages/overview.html#metriccollection)). 
It has some intelligent mechanisms for efficiently sharing calculations between multiple metrics.

Note also that metrics are not passed to other bricks or returned as output of the brick collection - they are only stored internally. 
To also pass metrics to other bricks, you can set `return_metrics=True` for `BrickMetrics` and `BrickMetricSingle`. 
But be aware, this will add computational cost. 
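
Below is a hedged sketch of how this could look, assuming `BrickMetrics` accepts a `torchmetrics.MetricCollection` the same way `BrickMetricSingle` accepts a single metric:

```python
from torchbricks.bricks import BrickMetrics
from torchmetrics import MetricCollection
from torchmetrics.classification import MulticlassAccuracy, MulticlassPrecision

# Sketch: one metric brick wrapping a MetricCollection (computations are shared between metrics).
# It could replace the single-metric "accuracy" brick in the collection above.
metric_collection = MetricCollection(
    {
        "accuracy": MulticlassAccuracy(num_classes=num_classes),
        "precision": MulticlassPrecision(num_classes=num_classes),
    }
)
metrics_brick = BrickMetrics(metric_collection, input_names=["class_prediction", "targets"])
```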


### Brick features: Act as a nn.Module
A brick collection acts as a `nn.Module`, meaning:

```python
from pathlib import Path

# Move to a specific device (CPU/GPU) or precision to automatically move all model parameters
brick_collection.to(torch.float16)
brick_collection.to(torch.float32)

# Save model parameters
path_model = Path("build/readme_model.pt")
torch.save(brick_collection.state_dict(), path_model)

# Load model parameters
brick_collection.load_state_dict(torch.load(path_model))

# Iterate all parameters
for name, params in brick_collection.named_parameters():
    pass

# Iterate all layers
for name, module in brick_collection.named_modules():
    pass

# Using compile with pytorch >= 2.0
torch.compile(brick_collection)
```

### Brick features: Nested bricks and relative input/output names
To more easily add, remove and swap out a subset of bricks in a brick collection (e.g. bricks related to a specific task), we
support passing a nested dictionary of bricks to a `BrickCollection` and using relative input and output names. 

First we create a function (`create_image_classification_head`) that returns a dictionary with image classification specific 
bricks. 

```python
from typing import Dict

from torchbricks.bricks import BrickInterface


def create_image_classification_head(
    num_classes: int, in_channels: int, features_name: str, targets_name: str
) -> Dict[str, BrickInterface]:
    """Image classifier bricks: Classifier, loss and metrics"""
    head = {
        "classify": BrickTrainable(
            ImageClassifier(num_classes=num_classes, n_features=in_channels),
            input_names=[features_name],
            output_names=["./logits", "./probabilities", "./class_prediction"],
        ),
        "accuracy": BrickMetricSingle(MulticlassAccuracy(num_classes=num_classes), input_names=["./class_prediction", targets_name]),
        "loss": BrickLoss(model=nn.CrossEntropyLoss(), input_names=["./logits", targets_name], output_names=["./loss_ce"]),
    }
    return head
```

We now create the full model containing a `preprocessor`, `backbone` and two independent heads called `head0` and `head1`.
Each head is a dictionary of bricks, making our brick collection a nested dictionary. 

```python
from torchbricks.graph_plotter import create_mermaid_dag_graph

n_features = resnet_brick.model.n_backbone_features
bricks = {
    "preprocessor": BrickNotTrainable(Preprocessor(), input_names=["raw"], output_names=["normalized"]),
    "backbone": resnet_brick,
    "head0": create_image_classification_head(num_classes=3, in_channels=n_features, features_name="features", targets_name="targets0"),
    "head1": create_image_classification_head(num_classes=5, in_channels=n_features, features_name="features", targets_name="targets1"),
}
brick_collections = BrickCollection(bricks)
print(brick_collections)
print(create_mermaid_dag_graph(brick_collections))
```

Also demonstrated in the above example is the use of relative input and output names. 
Looking at our `create_image_classification_head` function again, you will notice that we actually make use of relative input and output names 
(`./logits`, `./probabilities`, `./class_prediction` and `./loss_ce`). 

Relative names will use the brick name to derive "absolute" names. E.g. for `head0` the relative 
name `./logits` becomes `head0/logits` and for `head1` the relative name `./logits` becomes `head1/logits`.
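
The resolution rule can be illustrated with a tiny, hypothetical helper (the real resolution happens inside `BrickCollection`):

```python
def resolve_relative_name(brick_group: str, name: str) -> str:
    """Hypothetical helper illustrating how relative names become 'absolute' names."""
    if name.startswith("./"):
        return f"{brick_group}/{name[2:]}"
    return name


assert resolve_relative_name("head0", "./logits") == "head0/logits"
assert resolve_relative_name("head1", "./logits") == "head1/logits"
assert resolve_relative_name("head0", "features") == "features"  # non-relative names are unchanged
```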

We visualize the above graph: 


```mermaid
flowchart LR
    %% Brick definitions
    preprocessor(<strong>'preprocessor': Preprocessor</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable
    backbone(<strong>'backbone': BackboneResnet</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head0/classify(<strong>'head0/classify': ImageClassifier</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head0/accuracy(<strong>'head0/accuracy': 'MulticlassAccuracy'</strong><br><i>BrickMetricSingle</i>):::BrickMetricSingle
    head0/loss(<strong>'head0/loss': CrossEntropyLoss</strong><br><i>BrickLoss</i>):::BrickLoss
    head1/classify(<strong>'head1/classify': ImageClassifier</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head1/accuracy(<strong>'head1/accuracy': 'MulticlassAccuracy'</strong><br><i>BrickMetricSingle</i>):::BrickMetricSingle
    head1/loss(<strong>'head1/loss': CrossEntropyLoss</strong><br><i>BrickLoss</i>):::BrickLoss
    
    %% Draw input and outputs
    raw:::input --> preprocessor
    targets0:::input --> head0/accuracy
    targets0:::input --> head0/loss
    targets1:::input --> head1/accuracy
    targets1:::input --> head1/loss
    
    %% Draw nodes and edges
    preprocessor --> |normalized| backbone
    backbone --> |features| head0/classify
    backbone --> |features| head1/classify
    subgraph head0
        head0/classify --> |head0/class_prediction| head0/accuracy
        head0/classify --> |head0/logits| head0/loss
        head0/classify --> head0/probabilities:::output
        head0/loss --> head0/loss_ce:::output
    end
    subgraph head1
        head1/classify --> |head1/class_prediction| head1/accuracy
        head1/classify --> |head1/logits| head1/loss
        head1/classify --> head1/probabilities:::output
        head1/loss --> head1/loss_ce:::output
    end
    
    %% Add styling
    classDef arrow stroke-width:0px,fill-opacity:0.0 
    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 
    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 
    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 
    classDef BrickTrainable stroke-width:0px,fill:#6D597A 
    classDef BrickMetricSingle stroke-width:0px,fill:#1450A3 
    classDef BrickLoss stroke-width:0px,fill:#5C677D 
    
    %% Add legends
    subgraph Legends
        input(input):::input
        output(output):::output
    end
```


### Brick features: Saving and loading bricks
A brick collection can be saved and loaded as a regular pytorch `nn.Module`. For more information you can look up the official 
pytorch guide on [Saving and Loading Models](https://pytorch.org/tutorials/beginner/saving_loading_models.html). 

However, we have also added a brick-collection-specific saving/loading format. It uses the pytorch weight format, 
but creates a model file for each brick and keeps files in a nested folder structure. 

The idea is that a user can more easily add or remove weights for a specific model by simply moving model files and folders around.
Time will tell if this is a useful abstraction or dead code. 

But it looks like this: 

```python
path_model_folder = Path("build/bricks")

# Saving model parameters brick-collection style
brick_collections.save_bricks(path_model_folder=path_model_folder, exist_ok=True)

print("Model files: ")
print("\n".join(str(path) for path in path_model_folder.rglob("*.pt")))

# Loading model parameters brick-collection style
brick_collections.load_bricks(path_model_folder=path_model_folder)
```

### Brick features: Export as ONNX
To export a brick collection as ONNX, we provide the `export_bricks_as_onnx`-function. 

Pass an example input (`named_inputs`) to trace the brick collection.
Set `dynamic_batch_size=True` to support inputs of any batch size. The export uses `stage=Stage.EXPORT`, which is also 
the default.

```python
from torchbricks.brick_collection_utils import export_bricks_as_onnx

path_build = Path("build")
path_build.mkdir(exist_ok=True)
path_onnx = path_build / "readme_model.onnx"

export_bricks_as_onnx(path_onnx=path_onnx, brick_collection=brick_collection, named_inputs=named_inputs, dynamic_batch_size=True)
```

### Brick features: Bag of bricks - reusable brick modules
Note also that in the above example we use the bag-of-bricks to import commonly used `nn.Module`s. 

This includes a `Preprocessor`, an `ImageClassifier` and `resnet_to_brick` to convert a torchvision resnet model to a backbone brick 
without a classifier.


### Brick features: Training with pytorch-lightning trainer
I like and love pytorch-lightning! We can avoid writing the easy-to-get-wrong training loop and validation/test scripts.

Pytorch lightning creates logs, ensures training is done efficiently on any device (CPU, GPU, TPU), on multiple/distributed devices 
with reduced precision and much more.

However, one issue I found myself having when wanting to extend my custom pytorch-lightning module (`LightningModule`) is that it forces an
object-oriented style with multiple levels of inheritance. This is not necessarily bad, but it makes it hard to reuse 
code across projects and generally makes the code complicated. 

With a brick collection you should rarely change or inherit your lightning module, instead you can inject the model, metrics and loss functions
into a lightning module. Changes to preprocessor, backbone, necks, heads, metrics and losses are done on the outside
and injected into the lightning module. 

Below is an example of how you could inject a brick collection with pytorch-lightning. 

We have created `LightningBrickCollection` ([available here](https://github.com/PeteHeine/torchbricks/blob/main/scripts/lightning_module.py)) 
as an example for you to use. 


```python
from functools import partial
from pathlib import Path

import pytorch_lightning as pl
import torchvision
from utils_testing.datamodule_cifar10 import CIFAR10DataModule
from utils_testing.lightning_module import LightningBrickCollection

experiment_name = "CIFAR10"
transform = torchvision.transforms.ToTensor()
data_module = CIFAR10DataModule(data_dir="data", batch_size=5, num_workers=12, test_transforms=transform, train_transforms=transform)
create_optimizer_func = partial(torch.optim.SGD, lr=0.05, momentum=0.9, weight_decay=5e-4)
bricks_lightning_module = LightningBrickCollection(
    path_experiments=Path("build") / "experiments",
    experiment_name=experiment_name,
    brick_collection=brick_collection,
    create_optimizers_func=create_optimizer_func,
)

trainer = pl.Trainer(max_epochs=1, limit_train_batches=2, limit_val_batches=2, limit_test_batches=2)
# Train and test model by injecting 'bricks_lightning_module'
trainer.fit(bricks_lightning_module, datamodule=data_module)
trainer.test(bricks_lightning_module, datamodule=data_module)
```


### Brick features: Pass all intermediate tensors to Brick
By adding `'__all__'` to `input_names`, it is possible to access all tensors as a dictionary inside a brick module. 
For production code, this may not be the best option, but this feature can be valuable during an exploration phase or 
when doing some live debugging of a new model/module. 

We will demonstrate in code by introducing a (dummy) module `MyNewPostProcessor`.

*Note: It is just a dummy class, don't worry too much about the actual implementation.*

The important thing to notice is that `input_names=['__all__']` is used for our `post_processor`-brick to
pass all tensors as a dictionary to the forward call. 

```python
from typing import Any


class MyNewPostProcessor(torch.nn.Module):
    def forward(self, named_inputs: Dict[str, Any]):
        ## Here `named_inputs` contains all intermediate tensors
        assert "raw" in named_inputs
        assert "embedding" in named_inputs
        return named_inputs["embedding"]


bricks = {
    "backbone": BrickTrainable(TinyModel(n_channels=3, n_features=10), input_names=["raw"], output_names=["embedding"]),
    "post_processor": BrickNotTrainable(MyNewPostProcessor(), input_names=["__all__"], output_names=["postprocessed"]),
}
brick_collection = BrickCollection(bricks)
named_outputs = brick_collection(named_inputs={"raw": torch.rand((2, 3, 100, 200))})
```

### Brick features: Visualizations in TorchBricks
We provide `BrickPerImageVisualization` as a base brick for doing visualizations in a brick collection. 
The advantage of brick-based visualization is that it can be bundled together with a specific task/head. 

Secondly, visualization/drawing functions typically operate on a single image and on non-`torch.Tensor` data types,
e.g. OpenCV/matplotlib use `np.array` and pillow uses `Image`. 

(Torchvision actually has functions to draw rectangles, key-points and segmentation masks directly on `torch.Tensor`s -
but it still operates on a single image and it has no option for rendering text).

The goal of `BrickPerImageVisualization` is to convert batched tensors/data to per image data in a desired format/datatype 
and pass it to a draw function. Look up the documentation of `BrickPerImageVisualization` to see all options.

First we create a callable to do per-image visualizations. It can be a simple function, but as demonstrated in the example below, it 
can also be a callable class. 

The callable visualizes image classification predictions using pillow and requires two `np.array`s as input: 
`input_image` of shape [H, W, C] and `target_prediction` of shape [1].

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont
from torchbricks.tensor_conversions import float2uint8


class VisualizeImageClassification:
    def __init__(self, class_names: list, font_size: int = 50):
        self.class_names = class_names
        self.font = ImageFont.truetype("tests/data/font_ASMAN.TTF", size=font_size)

    def __call__(self, input_image: np.ndarray, target_prediction: np.ndarray) -> Image.Image:
        """Draws image classification results"""
        assert input_image.ndim == 3  # Converted to single image channel last numpy array [H, W, C]
        image = Image.fromarray(float2uint8(input_image))
        draw = ImageDraw.Draw(image)
        draw.text((25, 25), text=self.class_names[target_prediction[0]], font=self.font)
        return image
```

The drawing class `VisualizeImageClassification` is now passed to `BrickPerImageVisualization` and used in a brick collection.

```python
from torchbricks.bag_of_bricks.brick_visualizer import BrickPerImageVisualization

bricks = {
    "visualizer": BrickPerImageVisualization(
        callable=VisualizeImageClassification(class_names=["cat", "dog"]),
        input_names=["input_image", "target"],
        output_names=["visualization"],
    )
}

batched_inputs = {"input_image": torch.zeros((2, 3, 100, 200)), "target": torch.tensor([0, 1], dtype=torch.int64)}
brick_collection = BrickCollection(bricks)
outputs = brick_collection(named_inputs=batched_inputs)

display(outputs["visualization"][0], outputs["visualization"][1])
```

`BrickPerImageVisualization` will by default convert a batched tensor of shape [B, C, H, W] to a channel-last numpy image of shape [H, W, C]. 
This is the default behavior, and it allows the callable of `VisualizeImageClassification` to operate directly on numpy arrays. 

However, `BrickPerImageVisualization` also gives the user the option of unpacking batch data in a desired way, as we will demonstrate in the 
next example.
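
A minimal, hypothetical illustration of that default unpacking (a channel-first batch becoming a list of channel-last numpy images):

```python
# Hypothetical illustration of the default unpacking described above - not the torchbricks code.
batched_tensor = torch.zeros((2, 3, 100, 200))  # [B, C, H, W]
per_image_arrays = [image.permute(1, 2, 0).numpy() for image in batched_tensor]  # list of [H, W, C] arrays
print(per_image_arrays[0].shape)  # (100, 200, 3)
```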


Below we create a class, `BrickVisualizeImageClassification`, that inherits `BrickPerImageVisualization` to create a brick for visualizing
image classification. The functionality is similar to above, but demonstrates 
other options of the `BrickPerImageVisualization` class. 

*It is important to note that `visualize_image_classification_pillow` is passed as a callable, and we do not override functionality of 
`BrickPerImageVisualization`. We only use it to simplify the constructor of `BrickVisualizeImageClassification`.*

```python
from typing import List

from torchbricks.tensor_conversions import function_composer, torch_to_numpy, unpack_batched_tensor_to_pillow_images


class BrickVisualizeImageClassification(BrickPerImageVisualization):
    def __init__(self, input_image: str, target_name: str, class_names: List[str], output_name: str):
        self.class_names = class_names
        self.font = ImageFont.truetype("tests/data/font_ASMAN.TTF", 50)
        super().__init__(
            callable=self.visualize_image_classification_pillow,
            input_names=[input_image, target_name],
            output_names=[output_name],
            unpack_functions_for_type={torch.Tensor: unpack_batched_tensor_to_pillow_images},
            unpack_functions_for_input_name={target_name: function_composer(torch_to_numpy, list)},
        )

    def visualize_image_classification_pillow(self, image: Image.Image, target_prediction: np.int64) -> Image.Image:
        """Draws image classification results"""
        draw = ImageDraw.Draw(image)

        draw.text((25, 25), text=self.class_names[target_prediction], font=self.font)
        return image


visualizer = BrickVisualizeImageClassification(
    input_image="input_image", target_name="target", class_names=["cat", "dog"], output_name="VisualizeImageClassification"
)
batched_inputs = {"input_image": torch.zeros((2, 3, 100, 200)), "target": torch.tensor([0, 1], dtype=torch.int64)}
visualizer(batched_inputs)
```

Unlike before, the callable (here `visualize_image_classification_pillow`) accepts an `Image.Image` image and an `int64` value directly,
and we are not required to do conversions inside the drawing function. 

This can be achieved by using the two input arguments: 
- `unpack_functions_for_type: Dict[Type, Callable]` specifying how each type should be unpacked.
  In above example we use `unpack_functions_for_type={torch.Tensor: unpack_batched_tensor_to_pillow_images}` to unpack all `torch.Tensor`s 
  of shape [B, 3, H, W] as pillow images.
- `unpack_functions_for_input_name: Dict[str, Callable]` specifies how a specific input name should be unpacked. 
  In above example we use `unpack_functions_for_input_name={target_name: function_composer(torch_to_numpy, list)}` to unpack a 
  `torch.Tensor` of shape [B] to one int64 value per image. 

Specifying unpacking by input name (`unpack_functions_for_input_name`) will override the per type unpacking of `unpack_functions_for_type`. 


## Motivation

The main motivation:
- Shareable models: Packing model parts, metrics, loss-functions and visualizations into a single recipe makes the model more shareable with
  other projects and supports sharing models for different use cases such as: only inference, inference+visualizations and 
  training+metrics+losses.
- Shareable parts: The brick collection encourages users to decouple parts, also making each part more shareable. 
- Multiple tasks: Makes it easier to add and remove tasks. Each task can be expressed as model parts in a dictionary, 
  which we can easily add to or remove from a brick collection. 
- By packing model modules, metrics, loss-functions and visualizations into a single brick collection, we can more easily 
  inject it into your custom trainer and evaluation code without doing per-task/model modifications. 
- Your model is **not** required to only return logits. Some training frameworks expect you to only return logits - values that go into 
  your loss function. Then at inference/test/evaluation you need to do post-processing or pass additional outputs to 
  calculate metrics, do visualizations and make predictions human interpretable. This encourages unclear control flow (if/else statements) 
  in the model that depends on the model stage. 
- Using input and output names makes it easier to describe how parts are connected. Internally, data is passed between bricks in a 
  dictionary of any type - making it flexible. But for each module, you can specify, add and check type hints for input and output 
  data to both improve readability and make it more production-ready. 
- When I started making a framework suited for multiple tasks, I passed dictionaries around to all modules and pulled out tensors by
  name in each module. Bookkeeping and updating names was messy. 
  I also started using the typical backbone (encoder) / head (decoder) separation... But some heads may share a common neck. 
  The decoder might also take different inputs and
  split into different representations and merge again... Also, to avoid code duplication, I ended up doing 
  multiple layers of inheritance for the decoder, making reuse bad, and generally everything became too complicated and a new task would 
  require me to refactor the whole concept. Yes, it was probably not a super great attempt either, but it made me realize it should be 
  easier to make a new task and it should be easier to reuse parts. 


<!-- #region -->

## What are we missing?
- [ ] Demonstrate model configuration with hydra in this document
- [ ] Make common Visualizations with pillow - not opencv to not blow up the required dependencies. ImageClassification, Segmentation, ObjectDetection
  - [ ] VideoModule to store data as a video
  - [ ] DisplayModule to show data
- [ ] Consider caching unpacked data for `PerImageVisualizer`
- [ ] Multiple named tensors caching module. 
- [ ] Use mypy, pyright or pyre to do static code checks. 
- [ ] Collection of helper modules. Preprocessors, Backbones, Necks/Upsamplers, ImageClassification, SemanticSegmentation, ObjectDetection
  - [ ] Make common brick collections: BricksImageClassification, BricksSegmentation, BricksPointDetection, BricksObjectDetection
- [ ] Support preparing data in the dataloader?
- [ ] Support torch.jit.scripting? 

## How does it really work?
????



## Development

Read the [CONTRIBUTING.md](CONTRIBUTING.md) file.

### Install

    conda create --name torchbricks --file conda-linux-64.lock
    conda activate torchbricks
    poetry install

### Activating the environment

    conda activate torchbricks

<!-- #endregion -->

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "torchbricks",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.7",
    "maintainer_email": null,
    "keywords": "torch, multi-task, machine learning",
    "author": null,
    "author_email": "Peter Hviid Christiansen <PeterHviidChristiansen@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/6c/4e/6514a26be6f642ca0582bd5b21beb4d63426f8fb6159b9c77949fccc373e/torchbricks-0.3.0.tar.gz",
    "platform": null,
    "description": "<!--\n\n---\njupyter:\n  jupytext:\n    hide_notebook_metadata: true\n    text_representation:\n      extension: .md\n      format_name: markdown\n      format_version: '1.3'\n      jupytext_version: 1.14.5\n  kernelspec:\n    display_name: torchbricks\n    language: python\n    name: python3\n---\n\n-->\n\n# TorchBricks\n[![codecov](https://codecov.io/gh/pete-machine/torchbricks/branch/main/graph/badge.svg?token=torchbricks_token_here)](https://codecov.io/gh/pete-machine/torchbricks)\n[![CI](https://github.com/pete-machine/torchbricks/actions/workflows/main.yml/badge.svg)](https://github.com/pete-machine/torchbricks/actions/workflows/main.yml)\n\n\nTorchBricks builds pytorch models using small reuseable and decoupled parts - we call them bricks. \n\nThe concept is simple and flexible and allows you to more easily combine, add or swap out parts of the model \n(preprocessor, backbone, neck, head or post-processor), change the task or extend it with multiple tasks.\n\nTorchBricks is a compact recipe on both *how* model parts are connected and *when* parts should be executed \nduring different model stages such as training, validation, testing, inference and export.\n\nTorchBricks is NOT a framework! - it just a thin abstraction on top of pytorch modules. \n\n<!-- #region -->\n\n## Install it with pip\n\n```bash\npip install torchbricks\n```\n<!-- #endregion -->\n\n## Bricks by example\n\nTo demonstrate the the concepts of TorchBricks, we will first specify some dummy parts used in a regular image recognition model: \nA preprocessor, a backbone and a head (in this case a classifier).\n*Note: Don't worry about the actually implementation of these modules - they are just dummy examples.*\n\n```python\nfrom typing import Tuple\n\nimport torch\nfrom torch import nn\n\n\nclass PreprocessorDummy(nn.Module):\n    def forward(self, raw_input: torch.Tensor) -> torch.Tensor:\n        return raw_input / 2\n\n\nclass TinyModel(nn.Module):\n    def __init__(self, n_channels: int, n_features: int) -> None:\n        super().__init__()\n        self.conv = nn.Conv2d(n_channels, n_features, kernel_size=1)\n\n    def forward(self, tensor: torch.Tensor) -> torch.Tensor:\n        return self.conv(tensor)\n\n\nclass ClassifierDummy(nn.Module):\n    def __init__(self, num_classes: int, in_features: int) -> None:\n        super().__init__()\n        self.fc = nn.Linear(in_features, num_classes)\n        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))\n        self.softmax = nn.Softmax(dim=1)\n\n    def forward(self, tensor: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:\n        logits = self.fc(torch.flatten(self.avgpool(tensor), start_dim=1))\n        return logits, self.softmax(logits)\n```\n\n\n## Concept 1: Bricks are connected\nAn important concept of TorchBricks is that it defines how modules are connected by specifying input and output names of\neach module similar to a DAG. \n\nIn below code snippet, we demonstrate how this would look for our dummy model. 
\n\n```python\nfrom torchbricks.brick_collection import BrickCollection\nfrom torchbricks.bricks import BrickNotTrainable, BrickTrainable\n\nbricks = {\n    \"preprocessor\": BrickNotTrainable(PreprocessorDummy(), input_names=[\"raw_images\"], output_names=[\"processed\"]),\n    \"backbone\": BrickTrainable(TinyModel(n_channels=3, n_features=10), input_names=[\"processed\"], output_names=[\"embedding\"]),\n    \"head\": BrickTrainable(ClassifierDummy(num_classes=3, in_features=10), input_names=[\"embedding\"], output_names=[\"logits\", \"softmaxed\"]),\n}\nbrick_collection = BrickCollection(bricks)\n# print(create_mermaid_dag_graph(brick_collection))\nprint(brick_collection)\n```\n\nEach module is placed in a dictionary with a unique name and wrapped inside a brick with input and output names. \nInput and output names specifies how outputs of one module is passed to inputs of the next module. \n\nIn above example, we use `BrickNotTrainable` to wrap modules that are shouldn't be trained (weights are fixed) and \n`BrickTrainable` to wrap modules that are trainable (weights are updated on each training iteration). \n\nFinally, the dictionary of bricks is passed to a `BrickCollection`. \n\nBelow we visualize how the brick collection connects bricks together. \n\n\n```mermaid\nflowchart LR\n    %% Brick definitions\n    preprocessor(<strong>'preprocessor': PreprocessorDummy</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable\n    backbone(<strong>'backbone': TinyModel</strong><br><i>BrickTrainable</i>):::BrickTrainable\n    head(<strong>'head': ClassifierDummy</strong><br><i>BrickTrainable</i>):::BrickTrainable\n    \n    %% Draw input and outputs\n    raw_images:::input --> preprocessor\n    \n    %% Draw nodes and edges\n    preprocessor --> |processed| backbone\n    backbone --> |embedding| head\n    head --> logits:::output\n    head --> softmaxed:::output\n    \n    %% Add styling\n    classDef arrow stroke-width:0px,fill-opacity:0.0 \n    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 \n    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 \n    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 \n    classDef BrickTrainable stroke-width:0px,fill:#6D597A \n    \n    %% Add legends\n    subgraph Legends\n        input(input):::input\n        output(output):::output\n    end\n```\n*Graph is visualized using [mermaid](https://github.com/mermaid-js/mermaid) syntax.*\n*We provide the `create_mermaid_dag_graph`-function to create a brick collection visualization*\n\n\nThe `BrickCollection` is used for executing above graph by passing a dictionary with named input data (`named_inputs`). \n\nFor above brick collection, we only expect one named input called `raw_images`. \n\n```python\nbatch_size = 2\nbatched_images = torch.rand((batch_size, 3, 100, 200))\nnamed_inputs = {\"raw_images\": batched_images}\nnamed_outputs = brick_collection(named_inputs=named_inputs)\nprint(\"Brick outputs:\", named_outputs.keys())\n# Brick outputs: dict_keys(['raw_images', 'processed', 'embedding', 'logits', 'softmaxed'])\n```\n\nThe brick collection accepts a dictionary and returns a dictionary with all intermediated and resulting tensors. \n\nRunning our models as a brick collection has the following advantages:\n\n- A brick collection act as a regular `nn.Module` with all the familiar features: a `forward`-function, a `to`-function to move \n  to a specific device/precision, you can save/load a model, management of parameters, onnx exportable etc. 
\n- A brick collection is also a simple DAG, it accepts a dictionary with \"named data\" (we call this `named_inputs`), \nexecutes each bricks and ensures that the outputs are passed to the inputs of other bricks with matching names. \nStructuring the model as a DAG, makes it easy to add/remove outputs for a given module during development, add new modules to the\ncollection and build completely new models from reusable parts. \n- A brick collection is actually a dictionary (`nn.DictModule`). Allowing you to access, pop and update the \n  collection easily as a regular dictionary. It can also handle nested dictionary, allowing groups of bricks to be added/removed easily. \n\n\n## Concept 2: Bricks are grouped\nAnother important concept is that bricks can be executed in groups. \n\nTo demonstrate how and why this is useful, we have added the `group` argument to each brick and introduced `BrickLoss` brick.\n\n```python\nfrom torchbricks.bricks import BrickLoss\n\nbricks = {\n    \"preprocessor\": BrickNotTrainable(PreprocessorDummy(), input_names=[\"raw_images\"], output_names=[\"processed\"], group=\"MODEL\"),\n    \"backbone\": BrickTrainable(\n        TinyModel(n_channels=3, n_features=10), input_names=[\"processed\"], output_names=[\"embedding\"], group=\"MODEL\"\n    ),\n    \"head\": BrickTrainable(\n        ClassifierDummy(num_classes=3, in_features=10),\n        input_names=[\"embedding\"],\n        output_names=[\"logits\", \"softmaxed\"],\n        group=\"MODEL\",\n    ),\n    \"loss\": BrickLoss(model=nn.CrossEntropyLoss(), input_names=[\"logits\", \"targets\"], output_names=[\"loss_ce\"], group=\"LOSS\"),\n}\nbrick_collection = BrickCollection(bricks)\n\nprint(brick_collection)\n# BrickCollection(\n#   (preprocessor): BrickNotTrainable(PreprocessorDummy, input_names=['raw_images'], output_names=['processed'], groups={'MODEL'})\n#   (backbone): BrickTrainable(TinyModel, input_names=['processed'], output_names=['embedding'], groups={'MODEL'})\n#   (head): BrickTrainable(ClassifierDummy, input_names=['embedding'], output_names=['logits', 'softmaxed'], groups={'MODEL'})\n#   (loss): BrickLoss(CrossEntropyLoss, input_names=['logits', 'targets'], output_names=['loss_ce'], groups={'LOSS'})\n# )\n# print(create_mermaid_dag_graph(brick_collection))\n```\n\nWith group names, it is now possible to execute desired subsets of the model \nduring execution by adding `groups`.\n\nHere is a few examples: \n\n```python\nnamed_inputs = {\"raw_images\": batched_images, \"targets\": torch.ones((batch_size), dtype=torch.int64)}\n\n# With no groups specified, all bricks are executed\nnamed_outputs = brick_collection(named_inputs=named_inputs)\n\n# With groups specified, only bricks in the specified groups are executed\nnamed_outputs = brick_collection(named_inputs=named_inputs, groups={\"MODEL\"})\n```\n\nGroups are important concept in our model recipe as it allows us to specify how model will act during different model stages. \n\n\n\n**Brick collection during inference and export:**\n\nDuring `Inference` and `Export` model stages, we do not have ground truth labels and we wan to skip loss calculations. 
\n\n```python\n# Execution only \"MODEL\" group bricks\nnamed_outputs = brick_collection(named_inputs=named_inputs, groups={\"MODEL\"})\n```\n\nThe graph will look like this and note that the graph only requires `raw_images` as input:\n```mermaid\nflowchart LR\n    %% Brick definitions\n    preprocessor(<strong>'preprocessor': PreprocessorDummy</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable\n    backbone(<strong>'backbone': TinyModel</strong><br><i>BrickTrainable</i>):::BrickTrainable\n    head(<strong>'head': ClassifierDummy</strong><br><i>BrickTrainable</i>):::BrickTrainable\n    \n    %% Draw input and outputs\n    raw_images:::input --> preprocessor\n    \n    %% Draw nodes and edges\n    preprocessor --> |processed| backbone\n    backbone --> |embedding| head\n    head --> logits:::output\n    head --> softmaxed:::output\n    \n    %% Add styling\n    classDef arrow stroke-width:0px,fill-opacity:0.0 \n    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 \n    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 \n    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 \n    classDef BrickTrainable stroke-width:0px,fill:#6D597A \n    \n    %% Add legends\n    subgraph Legends\n        input(input):::input\n        output(output):::output\n    end\n```\n\n\n**Brick collection during train, test and validation:**\n\nDuring \"Train\", \"Test\" and \"Validation\", `targets` are available and we want to calculate loss to \nboth improve model and track loss curves. \n\n```python\n# Execution all groups\nnamed_outputs = brick_collection(named_inputs=named_inputs)\n\n# Or execute explicitly \"MODEL\" and \"LOSS\" group bricks\nnamed_outputs = brick_collection(named_inputs=named_inputs, groups={\"MODEL\", \"LOSS\"})\n```\n\nThe graph will look like this and note that the graph now requires `raw_images` and `targets` as input:\n\n```mermaid\nflowchart LR\n    %% Brick definitions\n    preprocessor(<strong>'preprocessor': PreprocessorDummy</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable\n    backbone(<strong>'backbone': TinyModel</strong><br><i>BrickTrainable</i>):::BrickTrainable\n    head(<strong>'head': ClassifierDummy</strong><br><i>BrickTrainable</i>):::BrickTrainable\n    loss(<strong>'loss': CrossEntropyLoss</strong><br><i>BrickLoss</i>):::BrickLoss\n    \n    %% Draw input and outputs\n    raw_images:::input --> preprocessor\n    targets:::input --> loss\n    \n    %% Draw nodes and edges\n    preprocessor --> |processed| backbone\n    backbone --> |embedding| head\n    head --> |logits| loss\n    head --> softmaxed:::output\n    loss --> loss_ce:::output\n    \n    %% Add styling\n    classDef arrow stroke-width:0px,fill-opacity:0.0 \n    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699 \n    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22 \n    classDef BrickNotTrainable stroke-width:0px,fill:#B56576 \n    classDef BrickTrainable stroke-width:0px,fill:#6D597A \n    classDef BrickLoss stroke-width:0px,fill:#5C677D \n    \n    %% Add legends\n    subgraph Legends\n        input(input):::input\n        output(output):::output\n    end\n```\n\n\nAs demonstrated in above example, we can easily change the required inputs by change the model stage.\nThat allows us to support two basic use cases:\n\n1) When labels/targets are available, we have the option of getting model prediction along with loss and metrics.\n\n2) When labels/targets are **not** available, we do only model predictions used for model 
inference/export.\n\nThe mechanism of activating different parts of the model and making loss, metrics and visualizations part of the model recipe, \nallows us to more easily investigate/debug/visualize model parts in a notebook or scratch scripts.\n\n\n## Brick features: \n\n\n\n### Brick feature: TorchMetrics\n**We are not creating a training framework**, but to easily use the brick collection in your favorite training framework or custom \ntraining/validation/test loop, we need the option of **calculating model metrics** \n\nTo easily inject both model, losses and metrics, we also need to easily support metrics and calculate metrics across a dataset. \nWe will extend our example from before by adding metric bricks. \n\nTo calculate metrics across a dataset, we heavily rely on concepts and functions used in the \n[TorchMetrics](https://torchmetrics.readthedocs.io/en/stable/) library.\n\nThe used of TorchMetrics in a brick collection is demonstrated in below code snippet. \n\n```python\nimport torchvision\nfrom torchbricks.bag_of_bricks.backbones import resnet_to_brick\nfrom torchbricks.bag_of_bricks.image_classification import ImageClassifier\nfrom torchbricks.bag_of_bricks.preprocessors import Preprocessor\nfrom torchbricks.bricks import BrickLoss, BrickMetricSingle\nfrom torchmetrics.classification import MulticlassAccuracy\n\nnum_classes = 10\nresnet = torchvision.models.resnet18(weights=None, num_classes=num_classes)\nresnet_brick = resnet_to_brick(resnet=resnet, input_name=\"normalized\", output_name=\"features\")\nn_features = resnet_brick.model.n_backbone_features\nbricks = {\n    \"preprocessor\": BrickNotTrainable(Preprocessor(), input_names=[\"raw\"], output_names=[\"normalized\"]),\n    \"backbone\": resnet_brick,\n    \"head\": BrickTrainable(\n        ImageClassifier(num_classes=num_classes, n_features=n_features),\n        input_names=[\"features\"],\n        output_names=[\"logits\", \"probabilities\", \"class_prediction\"],\n    ),\n    \"accuracy\": BrickMetricSingle(MulticlassAccuracy(num_classes=num_classes), input_names=[\"class_prediction\", \"targets\"]),\n    \"loss\": BrickLoss(model=nn.CrossEntropyLoss(), input_names=[\"logits\", \"targets\"], output_names=[\"loss_ce\"]),\n}\nbrick_collection = BrickCollection(bricks)\n```\n\nWe will now use the brick collection above to simulate how a user can iterate over a dataset.\n\n```python\n# Simulate dataloader\nnamed_input_simulated = {\"raw\": batched_images, \"targets\": torch.ones((batch_size), dtype=torch.int64)}\ndataloader_simulated = [named_input_simulated for _ in range(5)]\n\n# Loop over the dataset\nfor named_inputs in dataloader_simulated:  # Simulates iterating over the dataset\n    named_outputs = brick_collection(named_inputs=named_inputs)\n    named_outputs_losses_only = brick_collection.extract_losses(named_outputs=named_outputs)\n\nmetrics = brick_collection.summarize(reset=True)\nprint(f\"{named_outputs.keys()=}\")\n# named_outputs.keys()=dict_keys(['raw', 'targets', 'stage', 'normalized', 'features', 'logits', 'probabilities', 'class_prediction', 'loss_ce'])\nprint(f\"{metrics=}\")\n# metrics={'MulticlassAccuracy': tensor(0.)}\n```\n\nFor each iteration in our (simulated) dataset, we calculate model outputs, losses and metrics for each batch. \n\nLosses are calculated and returned in `named_outputs` together with other model outputs. \nWe provide `extract_losses` as simple function to filter `named_outputs` and only return losses in a new dictionary. 

Unlike other bricks, metric bricks (`BrickMetricSingle`/`BrickMetrics`) will not (by default) output metrics for each batch.
Instead, metrics are stored internally and only aggregated and returned when
the `summarize` function is called. In the example above, the metric is aggregated over 5 batches and summarized to a single value.

It is important to note that we set `reset=True` to reset the internal aggregation of metrics.

**Additional notes on metrics**

You have the option of either using a single metric (`torchmetrics.Metric`) with `BrickMetricSingle` or a collection of
metrics (`torchmetrics.MetricCollection`) with `BrickMetrics`.

For multiple metrics, we advise using `BrickMetrics` with a `torchmetrics.MetricCollection`
[doc](https://torchmetrics.readthedocs.io/en/stable/pages/overview.html#metriccollection).
It has some intelligent mechanisms for efficiently sharing calculations between metrics.

Note also that metrics are not passed to other bricks or returned as output of the brick collection - they are only stored internally.
To also pass metrics to other bricks, you can set `return_metrics=True` for `BrickMetrics` and `BrickMetricSingle`.
But be aware that this will add computational cost.
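
As a minimal sketch of the multi-metric option, the single-metric `accuracy` brick from the example above could be replaced by a metric-collection brick. This assumes `BrickMetrics` is importable from `torchbricks.bricks` alongside `BrickMetricSingle`; the variable names are illustrative.

```python
from torchbricks.bricks import BrickMetrics  # assumption: lives next to BrickMetricSingle
from torchmetrics import MetricCollection
from torchmetrics.classification import MulticlassAccuracy, MulticlassPrecision

# A metric collection can share intermediate computations between metrics
classification_metrics = MetricCollection(
    {
        "Accuracy": MulticlassAccuracy(num_classes=num_classes),
        "Precision": MulticlassPrecision(num_classes=num_classes),
    }
)
bricks_multi_metric = dict(bricks)  # copy the bricks from the example above
bricks_multi_metric["metrics"] = BrickMetrics(classification_metrics, input_names=["class_prediction", "targets"])
brick_collection_multi_metric = BrickCollection(bricks_multi_metric)
```

Iterating over a dataset with this collection and calling `summarize` should then return both metrics of the collection.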

### Brick feature: Act as an nn.Module
A brick collection acts as a regular `nn.Module`, meaning:

```python
from pathlib import Path

# Move the model to a specific device (CPU/GPU) or precision to automatically move all model parameters
brick_collection.to(torch.float16)
brick_collection.to(torch.float32)

# Save model parameters
path_model = Path("build/readme_model.pt")
torch.save(brick_collection.state_dict(), path_model)

# Load model parameters
brick_collection.load_state_dict(torch.load(path_model))

# Iterate all parameters
for name, params in brick_collection.named_parameters():
    pass

# Iterate all layers
for name, module in brick_collection.named_modules():
    pass

# Use torch.compile with pytorch >= 2.0
torch.compile(brick_collection)
```

### Brick feature: Nested bricks and relative input/output names
To more easily add, remove and swap out a subset of bricks in a brick collection (e.g. bricks related to a specific task), we
support passing a nested dictionary of bricks to a `BrickCollection` and using relative input and output names.

First we create a function (`create_image_classification_head`) that returns a dictionary with image-classification-specific
bricks.

```python
from typing import Dict

from torchbricks.bricks import BrickInterface


def create_image_classification_head(
    num_classes: int, in_channels: int, features_name: str, targets_name: str
) -> Dict[str, BrickInterface]:
    """Image classifier bricks: Classifier, loss and metrics"""
    head = {
        "classify": BrickTrainable(
            ImageClassifier(num_classes=num_classes, n_features=in_channels),
            input_names=[features_name],
            output_names=["./logits", "./probabilities", "./class_prediction"],
        ),
        "accuracy": BrickMetricSingle(MulticlassAccuracy(num_classes=num_classes), input_names=["./class_prediction", targets_name]),
        "loss": BrickLoss(model=nn.CrossEntropyLoss(), input_names=["./logits", targets_name], output_names=["./loss_ce"]),
    }
    return head
```

We now create the full model containing a `preprocessor`, a `backbone` and two independent heads called `head0` and `head1`.
Each head is a dictionary of bricks, making our brick collection a nested dictionary.

```python
from torchbricks.graph_plotter import create_mermaid_dag_graph

n_features = resnet_brick.model.n_backbone_features
bricks = {
    "preprocessor": BrickNotTrainable(Preprocessor(), input_names=["raw"], output_names=["normalized"]),
    "backbone": resnet_brick,
    "head0": create_image_classification_head(num_classes=3, in_channels=n_features, features_name="features", targets_name="targets0"),
    "head1": create_image_classification_head(num_classes=5, in_channels=n_features, features_name="features", targets_name="targets1"),
}
brick_collections = BrickCollection(bricks)
print(brick_collections)
print(create_mermaid_dag_graph(brick_collections))
```

Also demonstrated in the example above is the use of relative input and output names.
Looking at our `create_image_classification_head` function again, you will notice that we use relative input and output names
(`./logits`, `./probabilities`, `./class_prediction` and `./loss_ce`).

Relative names use the brick name to derive "absolute" names. E.g. for `head0` the relative
input name `./logits` becomes `head0/logits` and for `head1` the relative input name `./logits` becomes `head1/logits`.
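
As a minimal sketch (the dummy inputs, shapes and variable names below are illustrative assumptions), the derived absolute names show up in the output dictionary after a forward pass:

```python
# Minimal sketch (dummy inputs): relative names are resolved to per-head absolute names.
named_inputs_nested = {
    "raw": torch.rand((2, 3, 64, 64)),       # dummy image batch
    "targets0": torch.randint(0, 3, (2,)),   # dummy targets for head0 (3 classes)
    "targets1": torch.randint(0, 5, (2,)),   # dummy targets for head1 (5 classes)
}
named_outputs_nested = brick_collections(named_inputs=named_inputs_nested)
assert "head0/logits" in named_outputs_nested and "head1/logits" in named_outputs_nested
assert "head0/loss_ce" in named_outputs_nested and "head1/loss_ce" in named_outputs_nested
```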

We visualize the graph above:


```mermaid
flowchart LR
    %% Brick definitions
    preprocessor(<strong>'preprocessor': Preprocessor</strong><br><i>BrickNotTrainable</i>):::BrickNotTrainable
    backbone(<strong>'backbone': BackboneResnet</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head0/classify(<strong>'head0/classify': ImageClassifier</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head0/accuracy(<strong>'head0/accuracy': 'MulticlassAccuracy'</strong><br><i>BrickMetricSingle</i>):::BrickMetricSingle
    head0/loss(<strong>'head0/loss': CrossEntropyLoss</strong><br><i>BrickLoss</i>):::BrickLoss
    head1/classify(<strong>'head1/classify': ImageClassifier</strong><br><i>BrickTrainable</i>):::BrickTrainable
    head1/accuracy(<strong>'head1/accuracy': 'MulticlassAccuracy'</strong><br><i>BrickMetricSingle</i>):::BrickMetricSingle
    head1/loss(<strong>'head1/loss': CrossEntropyLoss</strong><br><i>BrickLoss</i>):::BrickLoss

    %% Draw input and outputs
    raw:::input --> preprocessor
    targets0:::input --> head0/accuracy
    targets0:::input --> head0/loss
    targets1:::input --> head1/accuracy
    targets1:::input --> head1/loss

    %% Draw nodes and edges
    preprocessor --> |normalized| backbone
    backbone --> |features| head0/classify
    backbone --> |features| head1/classify
    subgraph head0
        head0/classify --> |head0/class_prediction| head0/accuracy
        head0/classify --> |head0/logits| head0/loss
        head0/classify --> head0/probabilities:::output
        head0/loss --> head0/loss_ce:::output
    end
    subgraph head1
        head1/classify --> |head1/class_prediction| head1/accuracy
        head1/classify --> |head1/logits| head1/loss
        head1/classify --> head1/probabilities:::output
        head1/loss --> head1/loss_ce:::output
    end

    %% Add styling
    classDef arrow stroke-width:0px,fill-opacity:0.0
    classDef input stroke-width:0px,fill-opacity:0.3,fill:#22A699
    classDef output stroke-width:0px,fill-opacity:0.3,fill:#F2BE22
    classDef BrickNotTrainable stroke-width:0px,fill:#B56576
    classDef BrickTrainable stroke-width:0px,fill:#6D597A
    classDef BrickMetricSingle stroke-width:0px,fill:#1450A3
    classDef BrickLoss stroke-width:0px,fill:#5C677D

    %% Add legends
    subgraph Legends
        input(input):::input
        output(output):::output
    end
```


### Brick feature: Saving and loading bricks
A brick collection can be saved and loaded as a regular pytorch `nn.Module`. For more information, you can look up the official
pytorch guide on [Saving and Loading Models](https://pytorch.org/tutorials/beginner/saving_loading_models.html).

However, we have also added a brick-collection-specific saving/loading format. It uses the pytorch weight format,
but creates a model file for each brick and keeps the files in a nested folder structure.

The idea is that a user can more easily add or remove weights for a specific model by simply moving model files and folders around.
Time will tell if this is a useful abstraction or dead code.

It looks like this:

```python
path_model_folder = Path("build/bricks")

# Saving model parameters brick-collection style
brick_collections.save_bricks(path_model_folder=path_model_folder, exist_ok=True)

print("Model files: ")
print("\n".join(str(path) for path in path_model_folder.rglob("*.pt")))

# Loading model parameters brick-collection style
brick_collections.load_bricks(path_model_folder=path_model_folder)
```

### Brick feature: Export as ONNX
To export a brick collection as ONNX, we provide the `export_bricks_as_onnx` function.

Pass an example input (`named_inputs`) to trace a brick collection.
Set `dynamic_batch_size=True` to support inputs of any batch size. The export uses `stage=Stage.EXPORT`, which is also
the default.

```python
from torchbricks.brick_collection_utils import export_bricks_as_onnx

path_build = Path("build")
path_build.mkdir(exist_ok=True)
path_onnx = path_build / "readme_model.onnx"

export_bricks_as_onnx(path_onnx=path_onnx, brick_collection=brick_collection, named_inputs=named_inputs, dynamic_batch_size=True)
```
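
To sanity-check the exported file, you could run it with ONNX Runtime. This is a sketch under two assumptions: `onnxruntime` is installed separately, and the ONNX graph reuses the brick input names (otherwise inspect `session.get_inputs()` manually).

```python
# Minimal sketch (assumes `pip install onnxruntime`): run the exported model with ONNX Runtime.
import onnxruntime as ort

session = ort.InferenceSession(str(path_onnx))
# Build the input feed from the graph's declared inputs, assuming they match the brick input names
onnx_inputs = {graph_input.name: named_inputs[graph_input.name].detach().cpu().numpy() for graph_input in session.get_inputs()}
onnx_outputs = session.run(None, onnx_inputs)
```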

### Brick feature: Bag of bricks - reusable brick modules
Note also that in the example above we use the bag-of-bricks to import commonly used `nn.Module`s.

This includes a `Preprocessor`, an `ImageClassifier` and `resnet_to_brick`, which converts a torchvision ResNet model into a backbone brick
without a classifier.


### Brick feature: Training with the pytorch-lightning trainer
I love pytorch-lightning! We can avoid writing the easy-to-get-wrong training loop and validation/test scripts.

Pytorch-lightning creates logs, ensures training is done efficiently on any device (CPU, GPU, TPU), on multiple/distributed devices
with reduced precision and much more.

However, one issue I found myself having when wanting to extend my custom pytorch-lightning module (`LightningModule`) is that it forces an
object-oriented style with multiple levels of inheritance. This is not necessarily bad, but it makes it hard to reuse
code across projects and generally makes the code complicated.

With a brick collection, you should rarely need to change or inherit from your lightning module; instead, you inject the model, metrics and loss functions
into a lightning module. Changes to the preprocessor, backbone, necks, heads, metrics and losses are done on the outside
and injected into the lightning module.

Below is an example of how you could inject a brick collection with pytorch-lightning.

We have created `LightningBrickCollection` ([available here](https://github.com/PeteHeine/torchbricks/blob/main/scripts/lightning_module.py))
as an example for you to use.

```python
from functools import partial
from pathlib import Path

import pytorch_lightning as pl
import torchvision
from utils_testing.datamodule_cifar10 import CIFAR10DataModule
from utils_testing.lightning_module import LightningBrickCollection

experiment_name = "CIFAR10"
transform = torchvision.transforms.ToTensor()
data_module = CIFAR10DataModule(data_dir="data", batch_size=5, num_workers=12, test_transforms=transform, train_transforms=transform)
create_optimizer_func = partial(torch.optim.SGD, lr=0.05, momentum=0.9, weight_decay=5e-4)
bricks_lightning_module = LightningBrickCollection(
    path_experiments=Path("build") / "experiments",
    experiment_name=experiment_name,
    brick_collection=brick_collection,
    create_optimizers_func=create_optimizer_func,
)

trainer = pl.Trainer(max_epochs=1, limit_train_batches=2, limit_val_batches=2, limit_test_batches=2)
# Train and test the model by injecting 'bricks_lightning_module'
trainer.fit(bricks_lightning_module, datamodule=data_module)
trainer.test(bricks_lightning_module, datamodule=data_module)
```


### Brick feature: Pass all intermediate tensors to a brick
By adding `'__all__'` to `input_names`, it is possible to access all tensors as a dictionary inside a brick module.
For production code, this may not be the best option, but the feature can be valuable during an exploration phase or
when doing some live debugging of a new model/module.

We will demonstrate this in code by introducing a (dummy) module, `MyNewPostProcessor`.

*Note: It is just a dummy class, don't worry too much about the actual implementation.*

The important thing to notice is that `input_names = ["__all__"]` is used for our `post_processor` brick to
pass all tensors as a dictionary as an argument in the forward call.

```python
from typing import Any


class MyNewPostProcessor(torch.nn.Module):
    def forward(self, named_inputs: Dict[str, Any]):
        ## Here `named_inputs` contains all intermediate tensors
        assert "raw" in named_inputs
        assert "embedding" in named_inputs
        return named_inputs["embedding"]


bricks = {
    "backbone": BrickTrainable(TinyModel(n_channels=3, n_features=10), input_names=["raw"], output_names=["embedding"]),
    "post_processor": BrickNotTrainable(MyNewPostProcessor(), input_names=["__all__"], output_names=["postprocessed"]),
}
brick_collection = BrickCollection(bricks)
named_outputs = brick_collection(named_inputs={"raw": torch.rand((2, 3, 100, 200))})
```

### Brick feature: Visualizations in TorchBricks
We provide `BrickPerImageVisualization` as a base brick for doing visualizations in a brick collection.
The advantage of brick-based visualization is that it can be bundled together with a specific task/head.

Secondly, visualization/drawing functions typically operate on a single image and on non-`torch.Tensor` data types.
E.g. OpenCV/matplotlib use `np.array` and pillow uses `Image`.

(Torchvision actually has functions to draw rectangles, key-points and segmentation masks directly on `torch.Tensor`s -
but it still operates on a single image and it has no option for rendering text.)

The goal of `BrickPerImageVisualization` is to convert batched tensors/data to per-image data in a desired format/datatype
and pass it to a draw function. Look up the documentation of `BrickPerImageVisualization` to see all options.

First we create a callable to do per-image visualizations.
It can be a simple function, but as demonstrated in the example below, it can also be a callable class.

The callable visualizes image classification predictions using pillow and requires two `np.array`s as input:
`input_image` of shape [H, W, C] and `target_prediction` of shape [1].

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont
from torchbricks.tensor_conversions import float2uint8


class VisualizeImageClassification:
    def __init__(self, class_names: list, font_size: int = 50):
        self.class_names = class_names
        self.font = ImageFont.truetype("tests/data/font_ASMAN.TTF", size=font_size)

    def __call__(self, input_image: np.ndarray, target_prediction: np.ndarray) -> Image.Image:
        """Draws image classification results"""
        assert input_image.ndim == 3  # Converted to a single-image, channel-last numpy array [H, W, C]
        image = Image.fromarray(float2uint8(input_image))
        draw = ImageDraw.Draw(image)
        draw.text((25, 25), text=self.class_names[target_prediction[0]], font=self.font)
        return image
```

The drawing class `VisualizeImageClassification` is now passed to `BrickPerImageVisualization` and used in a brick collection.

```python
from torchbricks.bag_of_bricks.brick_visualizer import BrickPerImageVisualization

bricks = {
    "visualizer": BrickPerImageVisualization(
        callable=VisualizeImageClassification(class_names=["cat", "dog"]),
        input_names=["input_image", "target"],
        output_names=["visualization"],
    )
}

batched_inputs = {"input_image": torch.zeros((2, 3, 100, 200)), "target": torch.tensor([0, 1], dtype=torch.int64)}
brick_collection = BrickCollection(bricks)
outputs = brick_collection(named_inputs=batched_inputs)

display(outputs["visualization"][0], outputs["visualization"][1])
```

`BrickPerImageVisualization` will by default convert a batched tensor of shape [B, C, H, W] to a channel-last numpy image of shape [H, W, C].
This default behavior allows the callable `VisualizeImageClassification` to operate directly on numpy arrays.

However, `BrickPerImageVisualization` also gives the user the option of unpacking batched data in a desired way, as we will demonstrate in the
next example.


Below we create a class, `BrickVisualizeImageClassification`, that inherits from `BrickPerImageVisualization` to create a brick for visualizing
image classification. The functionality is similar to above, but demonstrates other options of the `BrickPerImageVisualization` class.
*It is important to note that `visualize_image_classification_pillow` is passed as a callable, and we do not override functionality of
`BrickPerImageVisualization`. We only use it to simplify the constructor of `BrickVisualizeImageClassification`.*

```python
from typing import List

from torchbricks.tensor_conversions import function_composer, torch_to_numpy, unpack_batched_tensor_to_pillow_images


class BrickVisualizeImageClassification(BrickPerImageVisualization):
    def __init__(self, input_image: str, target_name: str, class_names: List[str], output_name: str):
        self.class_names = class_names
        self.font = ImageFont.truetype("tests/data/font_ASMAN.TTF", 50)
        super().__init__(
            callable=self.visualize_image_classification_pillow,
            input_names=[input_image, target_name],
            output_names=[output_name],
            unpack_functions_for_type={torch.Tensor: unpack_batched_tensor_to_pillow_images},
            unpack_functions_for_input_name={target_name: function_composer(torch_to_numpy, list)},
        )

    def visualize_image_classification_pillow(self, image: Image.Image, target_prediction: np.int64) -> Image.Image:
        """Draws image classification results"""
        draw = ImageDraw.Draw(image)
        draw.text((25, 25), text=self.class_names[target_prediction], font=self.font)
        return image


visualizer = BrickVisualizeImageClassification(
    input_image="input_image", target_name="target", class_names=["cat", "dog"], output_name="VisualizeImageClassification"
)
batched_inputs = {"input_image": torch.zeros((2, 3, 100, 200)), "target": torch.tensor([0, 1], dtype=torch.int64)}
visualizer(batched_inputs)
```

Unlike before, the callable (here `visualize_image_classification_pillow`) accepts an `Image.Image` and an `int64` value directly,
and we are not required to do conversions inside the drawing function.

This is achieved by using the two input arguments:
- `unpack_functions_for_type: Dict[Type, Callable]` specifies how each type should be unpacked.
  In the example above we use `unpack_functions_for_type={torch.Tensor: unpack_batched_tensor_to_pillow_images}` to unpack all `torch.Tensor`s
  of shape [B, 3, H, W] as pillow images.
- `unpack_functions_for_input_name: Dict[str, Callable]` specifies how a specific input name should be unpacked.
  In the example above we use `unpack_functions_for_input_name={target_name: function_composer(torch_to_numpy, list)}` to unpack a
  `torch.Tensor` of shape [B] to one int64 value per image.

Specifying unpacking by input name (`unpack_functions_for_input_name`) will override the per-type unpacking of `unpack_functions_for_type`.


## Motivation

The main motivations:
- Sharable models: Packing model parts, metrics, loss functions and visualizations into a single recipe makes the model easier to share with
  other projects and supports sharing models for different use cases such as: inference only, inference + visualizations and
  training + metrics + losses.
- Sharable parts: The brick collection encourages users to decouple parts, which also makes each part more sharable.
- Multiple tasks: Makes it easier to add and remove tasks. Each task can be expressed as model parts in a dictionary,
  and we can easily add them to or remove them from a brick collection.
- By packing model modules, metrics, loss functions and visualizations into a single brick collection, we can more easily
  inject it into your custom trainer and evaluation code without doing per-task/per-model modifications.
- Your model is **not** required to only return logits.
  Some training frameworks expect you to only return logits - the values that go into
  your loss function. Then, at inference/test/evaluation time, you need to do post-processing or pass additional outputs to
  calculate metrics, do visualizations and make predictions human-interpretable. This encourages unclear control flow (if/else statements)
  in the model that depends on the model stage.
- Using input and output names makes it easier to describe how parts are connected. Internally, data is passed between bricks in a
  dictionary of any type - making it flexible. But for each module, you can specify, add and check type hints for input and output
  data to both improve readability and make it more production-ready.
- When I started making a framework suited for multiple tasks, I would pass dictionaries around to all modules and pull out tensors by
  name in each module. Bookkeeping and updating names was messy.
  I also started using the typical backbone (encoder) / head (decoder) separation... but some heads may share a common neck.
  The decoder might also take different inputs,
  split into different representations and merge again... Also, to avoid code duplication, I ended up with
  multiple layers of inheritance for the decoder, making reuse bad; everything became too complicated, and a new task would
  require me to refactor the whole concept. Yes, it was probably not a super great attempt either, but it made me realize that it should be
  easier to make a new task and it should be easier to reuse parts.


<!-- #region -->

## What are we missing?
- [ ] Demonstrate model configuration with hydra in this document
- [ ] Make common visualizations with pillow - not OpenCV - to avoid blowing up the required dependencies: ImageClassification, Segmentation, ObjectDetection
  - [ ] VideoModule to store data as a video
  - [ ] DisplayModule to show data
- [ ] Consider caching unpacked data for `PerImageVisualizer`
- [ ] Multiple named tensors caching module.
- [ ] Use mypy, pyright or pyre to do static code checks.
- [ ] Collection of helper modules: Preprocessors, Backbones, Necks/Upsamplers, ImageClassification, SemanticSegmentation, ObjectDetection
  - [ ] Make common brick collections: BricksImageClassification, BricksSegmentation, BricksPointDetection, BricksObjectDetection
- [ ] Support preparing data in the dataloader?
- [ ] Support torch.jit.scripting?

## How does it really work?
????


## Development

Read the [CONTRIBUTING.md](CONTRIBUTING.md) file.

### Install

    conda create --name torchbricks --file conda-linux-64.lock
    conda activate torchbricks
    poetry install

### Activating the environment

    conda activate torchbricks

<!-- #endregion -->
    "bugtrack_url": null,
    "license": "MIT License  Copyright (c) 2023 Peter Christiansen  Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the \"Software\"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.  THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ",
    "summary": "Decoupled and modular approach to building multi-task ML models using a single model recipe for all model stages",
    "version": "0.3.0",
    "project_urls": {
        "Homepage": "https://github.com/pete-machine/torchbricks"
    },
    "split_keywords": [
        "torch",
        " multi-task",
        " machine learning"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "44cbf361ab78e394e4d9436d0c3a2978c5110ff3d2405f07745bbfa985e92de5",
                "md5": "64bad85b738171f79e72d660c2a7980e",
                "sha256": "6e8556b4bec7e97b0df274e0692e910b2d5eb5ddf307aa7058699dfaaa6cf789"
            },
            "downloads": -1,
            "filename": "torchbricks-0.3.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "64bad85b738171f79e72d660c2a7980e",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.7",
            "size": 31672,
            "upload_time": "2024-04-24T22:54:57",
            "upload_time_iso_8601": "2024-04-24T22:54:57.354068Z",
            "url": "https://files.pythonhosted.org/packages/44/cb/f361ab78e394e4d9436d0c3a2978c5110ff3d2405f07745bbfa985e92de5/torchbricks-0.3.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "6c4e6514a26be6f642ca0582bd5b21beb4d63426f8fb6159b9c77949fccc373e",
                "md5": "cc9b54e6e0ddda52ecd48126760de908",
                "sha256": "40d0f82680298ea17f260cc6a8cf5b388d0fd9b268387bd3a68c1db7e2904064"
            },
            "downloads": -1,
            "filename": "torchbricks-0.3.0.tar.gz",
            "has_sig": false,
            "md5_digest": "cc9b54e6e0ddda52ecd48126760de908",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.7",
            "size": 53878,
            "upload_time": "2024-04-24T22:54:58",
            "upload_time_iso_8601": "2024-04-24T22:54:58.616049Z",
            "url": "https://files.pythonhosted.org/packages/6c/4e/6514a26be6f642ca0582bd5b21beb4d63426f8fb6159b9c77949fccc373e/torchbricks-0.3.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-24 22:54:58",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "pete-machine",
    "github_project": "torchbricks",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "torchbricks"
}
        