gcpds-cv-pykit

- Name: gcpds-cv-pykit
- Version: 0.1.0.68
- Home page: https://github.com/UN-GCPDS/gcpds-cv-pykit
- Summary: A comprehensive toolkit for computer vision and segmentation tasks
- Upload time: 2025-11-05 03:42:12
- Maintainer: GCPDS Team
- Author: GCPDS Team
- Requires Python: >=3.8
- License: MIT
- Keywords: computer vision, segmentation, deep learning, pytorch, unet, medical imaging, image processing, machine learning, artificial intelligence
- Requirements: none recorded
# GCPDS Computer Vision Python Kit

<div align="center">

[![Python Version](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0%2B-ee4c2c.svg)](https://pytorch.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Version](https://img.shields.io/badge/version-0.1.0.63-green.svg)](https://github.com/UN-GCPDS/gcpds-cv-pykit)

A comprehensive PyTorch-based toolkit for computer vision and semantic segmentation tasks, developed by the GCPDS Team at Universidad Nacional de Colombia.

[Features](#features) • [Installation](#installation) • [Quick Start](#quick-start) • [Documentation](#documentation) • [Examples](#examples)

</div>

---

## 📋 Table of Contents

- [Overview](#overview)
- [Features](#features)
- [Installation](#installation)
- [Quick Start](#quick-start)
- [Architecture](#architecture)
- [Available Models](#available-models)
- [Loss Functions](#loss-functions)
- [Datasets](#datasets)
- [Usage Examples](#usage-examples)
- [Configuration](#configuration)
- [Experiment Tracking](#experiment-tracking)
- [Performance Evaluation](#performance-evaluation)
- [Contributing](#contributing)
- [License](#license)
- [Citation](#citation)
- [Contact](#contact)

---

## 🔍 Overview

**gcpds-cv-pykit** is a powerful and flexible toolkit designed for semantic segmentation tasks in computer vision. Built on PyTorch, it provides a complete pipeline from dataset preparation to model training and evaluation, with built-in support for experiment tracking via Weights & Biases.

### Key Highlights

- 🎯 **Multiple State-of-the-Art Models**: UNet, ResUNet, DeepLabV3+, FCN
- 📊 **Comprehensive Loss Functions**: DICE, Cross-Entropy, Focal, Tversky
- 🗂️ **Easy Dataset Management**: Built-in support for Kaggle datasets
- 📈 **Experiment Tracking**: Seamless integration with Weights & Biases
- 🔧 **Flexible Configuration**: Dictionary-based configuration system
- 🚀 **Production Ready**: Mixed precision training, GPU optimization
- 📉 **Rich Evaluation Metrics**: DICE, Jaccard, Sensitivity, Specificity

---

## ✨ Features

### Core Capabilities

- **Baseline Segmentation Models**
  - UNet with customizable depth and filters
  - ResUNet with residual connections
  - DeepLabV3+ with atrous spatial pyramid pooling
  - Fully Convolutional Networks (FCN)

- **Advanced Training Features**
  - Mixed precision training (AMP) for faster computation
  - Automatic learning rate scheduling
  - Training phases to leverage features learned by pre-trained models
  - Early stopping with patience
  - Model checkpointing (best and last)
  - GPU memory optimization
  - Multi-GPU support

- **Loss Functions**
  - DICE Loss for imbalanced segmentation
  - Cross-Entropy Loss
  - Focal Loss for hard example mining
  - Tversky Loss for precision-recall trade-off

- **Dataset Utilities**
  - Automatic Kaggle dataset download
  - Pre-configured datasets: OxfordIITPet, SeedGermination, BreastCancer, FeetMamitas
  - Support for crowd-sourced datasets
  - Custom dataset integration

- **Visualization Tools**
  - Random sample visualizations
  - Training progress plots
  - Prediction overlays

- **Performance Evaluation**
  - Comprehensive metrics calculation
  - Per-class and global statistics
  - Results export to NumPy (.npy) files

---

## 📦 Installation

### Prerequisites

- Python 3.8 or higher
- CUDA-compatible GPU (recommended)
- pip or conda package manager

### Basic Installation

```bash
pip install gcpds-cv-pykit
```

### Installation from Source

```bash
# Clone the repository
git clone https://github.com/UN-GCPDS/gcpds-cv-pykit.git
cd gcpds-cv-pykit

# Install in development mode
pip install -e .
```

### Installation with Optional Dependencies

```bash
# For development (includes testing and linting tools)
pip install gcpds-cv-pykit[dev]

# For documentation building
pip install gcpds-cv-pykit[docs]

# For Jupyter notebook support
pip install gcpds-cv-pykit[jupyter]

# Install all optional dependencies
pip install gcpds-cv-pykit[all]
```

### Dependencies

Core dependencies include:
- PyTorch >= 2.0.0
- torchvision >= 0.15.0
- numpy >= 1.21.0
- opencv-python >= 4.6.0
- matplotlib >= 3.5.0
- wandb >= 0.15.0
- tqdm >= 4.64.0
- kagglehub (for dataset downloads)

---

## 🚀 Quick Start

### 1. Download a Dataset

```python
from gcpds_cv_pykit.segmentation.datasets import OxfordIITPet

# Download and prepare the Oxford-IIIT Pet dataset
dataset_path = OxfordIITPet()
print(f"Dataset ready at: {dataset_path}")
```

### 2. Prepare Your Data Loaders

This toolkit includes ready-to-use, high-performance datasets and dataloader helpers for semantic segmentation, supporting both standard single/multi-class masks and multi-annotator scenarios.

#### Option A: Standard Segmentation (single or multi-class)

Use Segmentation_Dataset and Segmentation_DataLoader for typical datasets organized as:
- {data_dir}/{Partition}/images/*.png|jpg|jpeg
- {data_dir}/{Partition}/masks/class_0/*.png
- {data_dir}/{Partition}/masks/class_1/*.png
- ...
Notes:
- Masks are loaded per-class; missing masks are handled as empty (zeros).
- Augmentations (flips, rotations, color jitter, light noise) are applied only during training.
- Images are normalized to [0, 1] and resized to the target image_size.

Example:
```python
from gcpds_cv_pykit.segmentation.baseline.dataloaders import Segmentation_DataLoader

data_dir = "/path/to/dataset"
image_size = (256, 256)   # (H, W)
num_classes = 3
batch_size = 8

train_loader = Segmentation_DataLoader(
    data_dir=data_dir,
    batch_size=batch_size,
    image_size=image_size,
    num_classes=num_classes,
    partition="Train",
    single_class=None,          # set to an int (e.g., 0) to load only that class
    augment=True,               # augmentations only applied when partition == "Train"
    images_folder="images",     # custom images folder name (default: "images")
    num_workers=4,
    prefetch_factor=2,
    pin_memory=True,
)

val_loader = Segmentation_DataLoader(
    data_dir=data_dir,
    batch_size=batch_size,
    image_size=image_size,
    num_classes=num_classes,
    partition="Val",
    augment=False,
    images_folder="images",
    num_workers=4,
    pin_memory=True,
)

# Each batch returns:
#   images: FloatTensor [B, 3, H, W], in [0,1]
#   masks:  FloatTensor [B, C, H, W], binary per class
```

Key behaviors:
- Natural/alphanumeric file sorting ensures consistent pairing between images and masks.
- Supported image formats: .png, .jpg, .jpeg
- Missing mask files are treated as zeros (no annotation for that class).
- Rotation preserves mask binarization.
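As a quick sanity check of this batch contract, you can pull one batch and verify shapes (a minimal sketch reusing the variables from the example above; it assumes the loader is iterable as shown):

```python
# Grab one training batch and confirm the documented shapes and value range
images, masks = next(iter(train_loader))
assert tuple(images.shape) == (batch_size, 3, *image_size)            # [B, 3, H, W]
assert tuple(masks.shape) == (batch_size, num_classes, *image_size)   # [B, C, H, W]
assert 0.0 <= images.min().item() and images.max().item() <= 1.0      # normalized to [0, 1]
```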

#### Option B: Multi-Annotator + Ground Truth (Annotator Harmony)

Use AnnotHarmonyDataset and AnnotHarmonyDataloader when you have multiple annotators per sample and optional ground truth. Directory structure:
- {data_dir}/{Partition}/patches/*.png
- {data_dir}/{Partition}/masks/{annotator_id}/class_{k}/*.png  (for each annotator and class)
- {data_dir}/{Partition}/masks/ground_truth/class_{k}/*.png    (optional GT)

Notes:
- Concatenates annotator masks along channel dimension as [num_annotators * num_classes, H, W].
- Returns a one-hot vector indicating which annotators provided a valid mask per sample.
- Handles missing annotator masks by filling with an ignored value (default 0.6), which is not treated as foreground/background.
- Supports training-time augmentations applied consistently to image, annotator masks, and ground truth.

Example:
```python
from gcpds_cv_pykit.segmentation.baseline.dataloaders import AnnotHarmonyDataloader

data_dir = "/path/to/harmony_dataset"
image_size = (256, 256)
num_classes = 3
num_annotators = 5
batch_size = 4

# Train with annotators + ground truth
train_loader = AnnotHarmonyDataloader(
    data_dir=data_dir,
    batch_size=batch_size,
    image_size=image_size,
    num_classes=num_classes,
    num_annotators=num_annotators,
    partition="Train",
    annotators=True,
    ground_truth=True,
    single_class=None,      # or an int to restrict to a single class
    augment=True,
    num_workers=4,
    prefetch_factor=2,
    pin_memory=True,
)

# Validation without augmentations
val_loader = AnnotHarmonyDataloader(
    data_dir=data_dir,
    batch_size=batch_size,
    image_size=image_size,
    num_classes=num_classes,
    num_annotators=num_annotators,
    partition="Val",
    annotators=True,
    ground_truth=True,
    augment=False,
    num_workers=4,
    pin_memory=True,
)

# Each batch returns (depending on annotators/ground_truth flags):
# - If annotators and ground_truth:
#     images:      [B, 3, H, W]
#     masks:       [B, num_annotators * num_classes, H, W]
#     anns_onehot: [B, num_annotators]  (1 if annotator present in sample, else 0)
#     gt:          [B, C, H, W]
# - If only annotators:
#     images, masks, anns_onehot
# - If only ground_truth:
#     images, gt
```

Tips:
- Set single_class to focus training on a specific class while preserving API consistency.
- Use num_workers > 0 with pin_memory=True for faster GPU input pipelines.
- For custom image folder names, use images_folder in Segmentation_DataLoader; AnnotHarmony expects images under patches by default.
- Augmentations include horizontal/vertical flips, small rotations, brightness/contrast/saturation jitter, and light Gaussian noise on images only.
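To confirm which tuple layout your flags produce, unpack one batch (a minimal sketch using the train_loader built above with annotators=True and ground_truth=True):

```python
# With both annotators and ground_truth enabled, each batch is a 4-tuple
images, ann_masks, anns_onehot, gt = next(iter(train_loader))
print(images.shape)       # [B, 3, H, W]
print(ann_masks.shape)    # [B, num_annotators * num_classes, H, W]
print(anns_onehot.shape)  # [B, num_annotators]
print(gt.shape)           # [B, C, H, W]
```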

### 3. Configure and Train a Model

You can train either a standard segmentation model (UNet/ResUNet/DeepLabV3+/FCN) or the Annotator Harmony model for multi-annotator learning. Both trainers support: phased fine-tuning, AMP, W&B logging, best/last checkpointing, and rich metric plots.

#### Option A: Baseline Segmentation (UNet, ResUNet, DeepLabV3+, FCN)

Trainer: `SegmentationModel_Trainer`

- Models: UNet, ResUNet, DeepLabV3+, FCN
- Losses: DICE, CrossEntropy, Focal, Tversky
- Phased training: progressively unfreeze encoder (phases 1→4)
- Metrics per-epoch: global/per-class DICE, IoU, Sensitivity, Specificity
- Visualizations every 5 epochs
- Saves: best_model.pt (by best Val DICE), last_model.pt
- Plots saved under results/experiment_X/

Example:

```python
from gcpds_cv_pykit.segmentation.baseline.trainers import SegmentationModel_Trainer

# Configuration dictionary
config = {
    # Model
    'Model': 'UNet',                      # 'UNet' | 'ResUNet' | 'DeepLabV3+' | 'FCN'
    'Backbone': 'resnet34',
    'Number of classes': 3,
    'Input size': [3, 256, 256],          # channels, H, W
    'Image size': (256, 256),             # used for plotting/metadata
    'Pretrained': True,
    'Activation function': None,          # None | 'sigmoid' (applied as model final_activation)

    # Loss
    'Loss function': 'DICE',              # 'DICE' | 'CrossEntropy' | 'Focal' | 'Tversky'
    'Smooth': 1.0,
    'Reduction': 'mean',
    'Alpha': 0.75,                        # Focal/Tversky
    'Beta': 0.3,                          # Tversky
    'Gamma': 2.0,                         # Focal

    # Training
    'Epochs': 50,
    'Device': 'cuda',                     # 'cuda' | 'cpu' | 'cuda:0', ...
    'AMixPre': True,                      # Automatic Mixed Precision (AMP)
    'Train phases': True,                 # phased fine-tuning (see below)
    'Single class train': None,           # int to train/evaluate a single class
    'Single class valid': None,

    # Monitoring
    'Wandb monitoring': None,             # or ['api_key', 'project', 'run_name']

    # Checkpoints/dirs (saved automatically)
    # models_dir = './models' (internal default)
}

# Initialize and start training
trainer = SegmentationModel_Trainer(
    train_loader=train_loader,            # yields (images, masks)
    valid_loader=valid_loader,            # yields (images, masks)
    config=config
)
trainer.start()                           # trains, logs, saves plots and checkpoints
```

Training phases (when 'Train phases' = True):
- Phase 1 (epochs 0–9): Freeze encoder, train decoder + segmentation_head (lr=1e-4)
- Phase 2 (10–19): + Unfreeze encoder BatchNorm layers (lr=1e-5)
- Phase 3 (20–29): + Unfreeze encoder layer4 (lr=1e-5)
- Phase 4 (30+): + Unfreeze encoder layer3 (lr=1e-5), scheduler ExponentialLR(gamma=0.94)
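For intuition, the schedule above corresponds to roughly the following unfreezing logic for a ResNet-style encoder. This is an illustrative sketch only, not the trainer's internal code; `model.encoder` and the `layer3`/`layer4` attribute names are assumptions:

```python
import torch.nn as nn

def set_phase(model: nn.Module, epoch: int) -> float:
    """Freeze/unfreeze encoder parts per the phase schedule; return the phase learning rate."""
    # Phase 1 baseline: encoder fully frozen, decoder/head stay trainable
    for p in model.encoder.parameters():
        p.requires_grad = False
    if epoch >= 10:  # Phase 2: unfreeze encoder BatchNorm layers
        for m in model.encoder.modules():
            if isinstance(m, nn.BatchNorm2d):
                for p in m.parameters():
                    p.requires_grad = True
    if epoch >= 20:  # Phase 3: unfreeze encoder layer4
        for p in model.encoder.layer4.parameters():
            p.requires_grad = True
    if epoch >= 30:  # Phase 4: unfreeze encoder layer3
        for p in model.encoder.layer3.parameters():
            p.requires_grad = True
    return 1e-4 if epoch < 10 else 1e-5
```

In practice the trainer handles this internally when 'Train phases' is True; the sketch only makes the schedule concrete.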

Notes:
- For Focal/CrossEntropy, predictions are sigmoid-ed internally for metric computation and visualizations.
- Visualizations sample random classes; for single-class mode, GT display adapts automatically.

#### Option B: Annotator Harmony (Multi-Annotator Learning)

Trainer: `AnnotHarmonyTrainer`

- Model: `AnnotHarmonyModel` (joint segmentation + annotator reliability)
- Input: images + stacked annotator masks + annotator presence one-hot (+ optional GT)
- Loss: TGCE_SS (robust to noisy annotations; supports ignored value)
- Flexible validation: with/without annotator masks and/or ground truth
- Metrics over GT when provided: global/per-class DICE, IoU, Sensitivity, Specificity
- Reliability map visualizations per annotator
- Saves: best_model.pt (by best Val DICE when GT available), last_model.pt
- Plots saved under results/experiment_X/

Example (with annotators + GT):

```python
from gcpds_cv_pykit.segmentation.baseline.trainers import AnnotHarmonyTrainer

config = {
    # Model
    'Model': 'AnnotHarmony',              # informational; model is constructed internally
    'Input size': [3, 256, 256],
    'Number of classes': 3,
    'Num of annotators': 5,
    'Activation seg': 'sparse_softmax',   # segmentation head activation inside model
    'Activation rel': 'softmax',          # reliability head activation

    # Loss (TGCE_SS)
    'Loss function': 'TGCE_SS',
    'Ignore value': 0.6,                  # value used to fill missing annotator masks
    'Q parameter': 0.7243854912956864,    # TGCE hyperparam

    # Training
    'Epochs': 50,
    'Device': 'cuda',
    'AMixPre': True,                      # AMP
    'Train phases': True,                 # phased fine-tuning similar to baseline
    'Single class train': None,           # int to focus metrics on one class
    'Single class valid': None,

    # Data flags
    'Ground truth train': True,           # train loader returns GT masks
    'Ground truth valid': True,           # valid loader returns GT masks
    'Annotators valid': True,             # valid loader returns annotator masks

    # Monitoring
    'Wandb monitoring': None,             # or ['api_key', 'project', 'run_name']
}

# Initialize and start training
trainer = AnnotHarmonyTrainer(
    train_loader=train_loader,            # yields (images, ann_masks, ann_onehot, gt) or (images, ann_masks, ann_onehot)
    valid_loader=valid_loader,            # supports (images, ann_masks, ann_onehot, gt), (images, ann_masks, ann_onehot), or (images, gt)
    config=config
)
trainer.start()
```

Validation/data combinations supported:
- annotators + ground truth: returns loss and metrics
- annotators only: returns loss
- ground truth only: returns metrics

Training phases (when 'Train phases' = True):
- Phase 1 (0–9): Freeze encoder; train decoder, seg_head, ann_rel (lr=1e-4)
- Phase 2 (10–19): + Unfreeze encoder BatchNorm (lr=1e-5)
- Phase 3 (20–29): + Unfreeze encoder layer4 (lr=1e-5)
- Phase 4 (30+): + Unfreeze encoder layer3 (lr=1e-5), scheduler ExponentialLR(gamma=0.94)

Outputs and logging (both trainers):
- Console: device info (CUDA, memory), per-epoch losses and metrics (global and per-class)
- Weights & Biases (optional): losses and metrics per epoch, prediction visualizations
- Files:
  - ./models/best_model.pt (best Val DICE when available)
  - ./models/last_model.pt
  - ./results/experiment_X/{Loss.png, DICE.png, Jaccard.png, Sensitivity.png, Specificity.png}

Tips:
- Set 'AMixPre': True for faster training with AMP on CUDA.
- Use 'Single class train/valid' to focus metrics on one class without changing labels.
- Enable 'Train phases' for stable fine-tuning of pretrained backbones.
- For W&B, set 'Wandb monitoring' to ['api_key', 'project', 'run_name'].

### 4. Evaluate Model Performance

Once training is complete, use the evaluation utilities to compute test-set metrics and optionally save detailed results. We provide two evaluators:

- Baseline evaluator for standard segmentation models: `PerformanceModels` (class)
- Annotator Harmony evaluator for multi-annotator models: `PerformanceAnnotHarmony` (function)

Both evaluators report global and per-class metrics:
- DICE (F1), Jaccard (IoU), Sensitivity (Recall), Specificity
- Mean ± std across the test set
- Optional .npy dumps of global and per-class metric arrays

#### 4.1 Baseline Segmentation Evaluator

Use `PerformanceModels` to evaluate UNet/ResUNet/DeepLabV3+/FCN models on a test DataLoader that yields (images, gt_masks).

Example:

```python
from gcpds_cv_pykit.segmentation.baseline import PerformanceModels
from torch.utils.data import DataLoader

# Build test loader
test_loader = DataLoader(test_dataset, batch_size=8, shuffle=False, num_workers=4, pin_memory=True)

# Config notes:
# - Must include keys used during training, e.g. 'Device', 'Number of classes', 'Loss function'
# - Optional: 'AMixPre' (AMP), 'Single class test' (int), 'Ignored value' (float), 'Save results' (bool), 'drive_dir'
config.update({
    'Device': 'cuda:0',
    'Save results': True,         # to save .npy files
    'drive_dir': './',            # base folder for results
    # 'Single class test': 0,     # optionally restrict evaluation to a single class index
    # 'Ignored value': 0.6,       # pixels equal to this value are ignored in metrics
})

# Evaluate
evaluator = PerformanceModels(
    model=trainer.model,          # already-trained baseline model
    test_dataset=test_loader,     # DataLoader yielding (images, gt_masks)
    config=config
)

# Quick access to global means (convenience attributes)
print(f"Mean DICE Score: {evaluator.mean_dice:.4f}")
print(f"Mean Jaccard: {evaluator.mean_jjacard:.4f}")
```

What it does:
- Moves the model to config['Device'], switches to eval mode, and runs inference with or without AMP (config['AMixPre']).
- Applies sigmoid to predictions when evaluating CrossEntropy/Focal losses so metrics use probabilities.
- Thresholds predictions at 0.5 to compute discrete masks and then computes metrics per-batch and per-class.
- Ignores pixels that match config['Ignored value'] (default 0.6) when computing metrics.
- Prints global and per-class summaries; optionally saves .npy arrays under:
  - results/{Dataset}_{Model}_{Loss function}_DICE_global.npy, etc.
  - results/{Dataset}_{Model}_{Loss function}_DICE_class{c}.npy, etc.

Notes and tips:
- Single-class evaluation: set `'Single class test': class_index` to evaluate only that class.
- Ensure config['Number of classes'] matches the model output.
- If using Focal/CrossEntropy losses for training, the evaluator handles activation for fair metric computation.
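If 'Save results' is enabled, the dumped arrays can be reloaded later for custom analysis. A minimal sketch, assuming the filename pattern above with example values Dataset='MyDataset', Model='UNet', and Loss function='DICE' (adjust to your own config):

```python
import numpy as np

# Hypothetical filename following results/{Dataset}_{Model}_{Loss function}_DICE_global.npy
dice_global = np.load("results/MyDataset_UNet_DICE_DICE_global.npy")
print(f"Test DICE: {dice_global.mean():.4f} ± {dice_global.std():.4f} (n={dice_global.size})")
```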

#### 4.2 Annotator Harmony Evaluator (Multi-Annotator)

Use `PerformanceAnnotHarmony` when evaluating the Annotator Harmony model trained with TGCE_SS. The test DataLoader must yield batches like:
- (images, annotator_masks, annotator_presence_one_hot, gt_masks)

If ground-truth masks are available, the evaluator computes metrics against GT; it always computes the TGCE_SS loss against annotator masks.

Example:

```python
from gcpds_cv_pykit.segmentation.baseline import PerformanceAnnotHarmony
from torch.utils.data import DataLoader

# Build test loader for AnnotHarmonyDataset
# Each batch must be (images, ann_masks, ann_onehot, gt_masks)
test_loader = DataLoader(test_dataset, batch_size=4, shuffle=False, num_workers=4, pin_memory=True)

config.update({
    'Device': 'cuda:0',
    'Num of annotators': 5,
    'Number of classes': 3,
    'AMixPre': True,                 # AMP for faster inference
    'Ignored value': 0.6,            # ignore marker in GT
    'Q paramater': 0.7243854912956864,  # TGCE_SS q
    'Smooth': 1e-7,
    'Main_model': 'AnnotHarmony',
    'Dataset': 'MyDataset',
    'drive_dir': './',
})

# Evaluate; set save_results=True to dump .npy files
PerformanceAnnotHarmony(
    model=trainer.model,              # trained AnnotHarmonyModel
    test_dataset=test_loader,
    config=config,
    save_results=True
)
```

What it does:
- Runs the model with inputs (images, annotator_presence_one_hot).
- Computes TGCE_SS loss against annotator masks.
- Uses ground-truth masks (if present in the batch) to compute DICE, Jaccard, Sensitivity, Specificity.
- Supports optional single-class evaluation via `'Single class test': int`.
- Saves arrays to results/{Main_model}_{Dataset}_*.npy when save_results=True.

Expected saved files (when saving enabled):
- Global: Loss, Dice_global, Jaccard_global, Sensitivity_global, Specificity_global
- Per class: Dice_class{c}, Jaccard_class{c}, Sensitivity_class{c}, Specificity_class{c}

#### 4.3 Configuration Keys Reference

Common keys:
- Device: 'cuda', 'cuda:0', or 'cpu'
- Number of classes: int
- AMixPre: bool, use autocast AMP for speed on CUDA
- Single class test: int or None, restrict metrics to one class channel
- Ignored value: float, pixels equal to this in GT are excluded from metrics
- Smooth: float, epsilon for metric stability
- Save results: bool, save .npy arrays of metrics
- drive_dir: str, base directory for results/

Annotator Harmony-specific:
- Num of annotators: int
- Q paramater: float, TGCE_SS hyperparameter
- Main_model, Dataset: strings used in saved filenames
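Put together, an evaluation config assembled from these keys might look like the following (placeholder values; a sketch rather than an exhaustive schema, with key spellings kept as documented above):

```python
eval_config = {
    # Common keys
    'Device': 'cuda:0',
    'Number of classes': 3,
    'AMixPre': True,
    'Single class test': None,
    'Ignored value': 0.6,
    'Smooth': 1e-7,
    'Save results': True,
    'drive_dir': './',
    # Annotator Harmony-specific keys
    'Num of annotators': 5,
    'Q paramater': 0.7243854912956864,
    'Main_model': 'AnnotHarmony',
    'Dataset': 'MyDataset',
}
```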

#### 4.4 Metric Definitions

Given per-pixel TP, FP, FN, TN and smoothing parameter s:
- DICE: (2·TP + s) / (2·TP + FP + FN + s)
- Jaccard (IoU): (TP + s) / (TP + FP + FN + s)
- Sensitivity (Recall): (TP + s) / (TP + FN + s)
- Specificity: (TN + s) / (TN + FP + s)

These are computed per-class, then aggregated.
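The same formulas, written out in NumPy for a single class channel (an illustrative sketch; the toolkit computes these internally):

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, gt: np.ndarray, s: float = 1e-7) -> dict:
    """pred, gt: binary arrays of identical shape for one class."""
    tp = np.sum((pred == 1) & (gt == 1))
    fp = np.sum((pred == 1) & (gt == 0))
    fn = np.sum((pred == 0) & (gt == 1))
    tn = np.sum((pred == 0) & (gt == 0))
    return {
        'DICE':        (2 * tp + s) / (2 * tp + fp + fn + s),
        'Jaccard':     (tp + s) / (tp + fp + fn + s),
        'Sensitivity': (tp + s) / (tp + fn + s),
        'Specificity': (tn + s) / (tn + fp + s),
    }
```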

#### 4.5 Troubleshooting

- All-zero predictions or masks: smoothing avoids NaNs; still verify class balance.
- Mismatched shapes: ensure DataLoader yields [B, C, H, W] masks with correct number of channels.
- Class indexing: for single-class tests, ensure the chosen class aligns with your dataset label mapping.
- AMP on CPU: set AMixPre=False if running on CPU (AMP is CUDA-optimized).
- CrossEntropy/Focal outputs: evaluator applies sigmoid before thresholding for fair comparison.

This evaluation section complements the training pipeline, giving consistent, reproducible metrics and optional artifacts for post-analysis.

---

## 🏗️ Architecture

```
gcpds_cv_pykit/
├── segmentation/
│   ├── baseline/
│   │   ├── models/          # Segmentation architectures
│   │   │   ├── UNet.py
│   │   │   ├── ResUNet.py
│   │   │   ├── DeepLabV3Plus.py
│   │   │   └── FCN.py
│   │   ├── losses/          # Loss functions
│   │   │   ├── DICE.py
│   │   │   ├── CrossEntropy.py
│   │   │   ├── Focal.py
│   │   │   └── Tversky.py
│   │   ├── trainers/        # Training pipeline
│   │   │   └── trainer.py
│   │   ├── dataloaders/     # Data loading utilities
│   │   │   └── dataloader.py
│   │   └── performance_model.py  # Evaluation metrics
│   ├── datasets/            # Dataset utilities
│   │   └── datasets.py
│   ├── visuals/             # Visualization tools
│   │   └── random_sample_visualizations.py
│   ├── crowd/               # Crowd-sourced annotation support
│   │   ├── models/
│   │   │   └── AnnotHarmony.py
│   │   ├── losses/
│   │   │   └── TGCE_SS.py
│   │   ├── trainers/
│   │   │   └── AnnotHarmonyTrainer.py
│   │   ├── dataloaders/
│   │   │   └── annot_harmony_dataloader.py
│   │   └── performance/
│   │       └── performance_annotharmony.py
└── _version.py
```

---

## 🧠 Available Models

### UNet
Classic U-Net architecture with encoder-decoder structure and skip connections.

```python
config = {
    'Model': 'UNet',
    'Number of classes': 3,
    'Image size': (256, 256),
}
```

**Features:**
- Symmetric encoder-decoder architecture
- Skip connections for feature preservation
- Customizable depth and filter sizes
- Pre-trained backbone support
- Batch normalization and dropout support

### ResUNet
U-Net with residual blocks for improved gradient flow.

```python
config = {
    'Model': 'ResUNet',
    'Number of classes': 3,
    'Backbone': 'resnet34',  # or mobilenetv3
}
```

**Features:**
- Residual connections in encoder blocks
- Pre-trained backbone support
- Better convergence for deep networks
- Reduced vanishing gradient problems

### DeepLabV3+
State-of-the-art model with atrous spatial pyramid pooling.

```python
config = {
    'Model': 'DeepLabV3Plus',
    'Number of classes': 3,
    'Backbone': 'resnet34',
}
```

**Features:**
- Atrous Spatial Pyramid Pooling (ASPP)
- Multi-scale feature extraction
- Encoder-decoder with atrous convolution
- Pre-trained backbone support
- Excellent for complex scenes

### FCN (Fully Convolutional Network)
Efficient fully convolutional architecture.

```python
config = {
    'Model': 'FCN',
    'Number of classes': 3,
    'Backbone': 'resnet34',
}
```

**Features:**
- End-to-end convolutional architecture
- No fully connected layers
- Fast inference
- Pre-trained backbone support
- Good baseline model

---

## 📉 Loss Functions

### DICE Loss
Optimizes the DICE coefficient directly, ideal for imbalanced segmentation.

```python
config = {
    'Loss function': 'DICE',
    'Smooth': 1.0,
    'Reduction': 'mean',
}
```

### Cross-Entropy Loss
Standard pixel-wise classification loss.

```python
config = {
    'Loss function': 'CrossEntropy',
    'Reduction': 'mean',
}
```

### Focal Loss
Addresses class imbalance by down-weighting easy examples.

```python
config = {
    'Loss function': 'Focal',
    'Alpha': 0.25,
    'Gamma': 2.0,
    'Reduction': 'mean',
}
```

### Tversky Loss
Allows control over false positives and false negatives trade-off.

```python
config = {
    'Loss function': 'Tversky',
    'Alpha': 0.5,  # Weight for false positives
    'Beta': 0.5,   # Weight for false negatives
    'Smooth': 1.0,
}
```
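For intuition, the soft DICE and Tversky losses reduce to a few tensor operations. The sketch below is illustrative only (not the toolkit's implementation) and assumes predictions and targets are probabilities or binary masks of shape [B, C, H, W]:

```python
import torch

def soft_dice_loss(pred: torch.Tensor, target: torch.Tensor, smooth: float = 1.0) -> torch.Tensor:
    # Per-class soft DICE, averaged over batch and classes
    inter = (pred * target).sum(dim=(2, 3))
    denom = pred.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return 1.0 - ((2.0 * inter + smooth) / (denom + smooth)).mean()

def tversky_loss(pred: torch.Tensor, target: torch.Tensor,
                 alpha: float = 0.5, beta: float = 0.5, smooth: float = 1.0) -> torch.Tensor:
    # alpha weights false positives, beta weights false negatives
    tp = (pred * target).sum(dim=(2, 3))
    fp = (pred * (1.0 - target)).sum(dim=(2, 3))
    fn = ((1.0 - pred) * target).sum(dim=(2, 3))
    return 1.0 - ((tp + smooth) / (tp + alpha * fp + beta * fn + smooth)).mean()
```

With alpha = beta = 0.5 the Tversky index reduces to the DICE coefficient; raising Beta penalizes false negatives more heavily.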

---

## 🗂️ Datasets

### Built-in Dataset Support

The toolkit provides easy access to several pre-configured datasets from Kaggle:

```python
from gcpds_cv_pykit.segmentation.datasets import (
    OxfordIITPet,
    SeedGermination,
    BreastCancer,
    FeetMamitas,
    OxfordIITPet_Crowd,
    BreastCancer_Crowd
)

# Download datasets
oxford_path = OxfordIITPet()
seeds_path = SeedGermination()
cancer_path = BreastCancer()
feet_path = FeetMamitas()

# Crowd-sourced annotation datasets
oxford_crowd_path = OxfordIITPet_Crowd()
cancer_crowd_path = BreastCancer_Crowd()
```
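The returned paths can be plugged straight into the dataloaders from the Quick Start, assuming the downloaded dataset follows the {data_dir}/{Partition}/images and masks/class_k layout described earlier (a brief sketch):

```python
from gcpds_cv_pykit.segmentation.baseline.dataloaders import Segmentation_DataLoader

train_loader = Segmentation_DataLoader(
    data_dir=oxford_path,       # path returned by OxfordIITPet()
    batch_size=8,
    image_size=(256, 256),
    num_classes=3,
    partition="Train",
    augment=True,
)
```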

### Custom Dataset Integration

You can easily integrate your own datasets. The minimal example below pairs images with single-class masks stored in sibling folders; adapt the loading logic to your own layout:

```python
import os
from glob import glob

import cv2
import numpy as np
import torch
from torch.utils.data import Dataset


class CustomSegmentationDataset(Dataset):
    """Minimal example: pairs {root_dir}/images/*.png with {root_dir}/masks/*.png."""

    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        self.images = sorted(glob(os.path.join(root_dir, "images", "*.png")))
        self.masks = sorted(glob(os.path.join(root_dir, "masks", "*.png")))

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Image as float RGB tensor in [0, 1], mask as a single-channel binary tensor
        image = cv2.cvtColor(cv2.imread(self.images[idx]), cv2.COLOR_BGR2RGB)
        image = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0
        mask = cv2.imread(self.masks[idx], cv2.IMREAD_GRAYSCALE)
        mask = torch.from_numpy((mask > 0).astype(np.float32)).unsqueeze(0)

        if self.transform:
            image = self.transform(image)
            mask = self.transform(mask)

        return image, mask
```
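Wrapped in a standard DataLoader, the custom dataset plugs directly into the trainers described above (a brief usage sketch with hypothetical paths):

```python
from torch.utils.data import DataLoader

train_dataset = CustomSegmentationDataset("/path/to/dataset/Train")
valid_dataset = CustomSegmentationDataset("/path/to/dataset/Val")

train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=4, pin_memory=True)
valid_loader = DataLoader(valid_dataset, batch_size=16, shuffle=False, num_workers=4, pin_memory=True)
```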

---

## 💡 Usage Examples

### Complete Training Pipeline

```python
import torch
from torch.utils.data import DataLoader
from gcpds_cv_pykit.segmentation.baseline.trainers import SegmentationModel_Trainer
from gcpds_cv_pykit.segmentation.datasets import OxfordIITPet

# 1. Download dataset
dataset_path = OxfordIITPet()

# 2. Create data loaders (assuming you have a dataset class)
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=4)
valid_loader = DataLoader(valid_dataset, batch_size=16, shuffle=False, num_workers=4)

# 3. Configure training
config = {
    # Model configuration
    'Model': 'ResUNet',
    'Number of classes': 3,
    'Image size': (256, 256),
    'Backbone': 'resnet34',
    'Activation function': 'relu',
    
    # Training configuration
    'Loss function': 'DICE',
    'Learning rate': 0.001,
    'Optimizer': 'Adam',
    'Epochs': 100,
    'Batch size': 16,
    
    # Device and performance
    'Device': 'cuda' if torch.cuda.is_available() else 'cpu',
    'Mixed precision': True,
    'Num workers': 4,
    
    # Regularization
    'Weight decay': 1e-5,
    'Dropout': 0.2,
    
    # Early stopping
    'Early stopping': True,
    'Patience': 15,
    
    # Model saving
    'Save model': True,
    'Model name': 'resunet_oxford_pets',
    'Save path': './checkpoints/',
    
    # Monitoring
    'Verbose': True,
}

# 4. Initialize and train
trainer = SegmentationModel_Trainer(
    train_loader=train_loader,
    valid_loader=valid_loader,
    config=config
)

# 5. Start training
trainer.start()

# 6. Access trained model
model = trainer.model
```

### Training with Weights & Biases

```python
# Add WandB configuration
config['WandB monitoring'] = [
    'your_wandb_api_key',
    'project_name',
    'experiment_name'
]

trainer = SegmentationModel_Trainer(
    train_loader=train_loader,
    valid_loader=valid_loader,
    config=config
)

trainer.start()
```

### Visualization

```python
from gcpds_cv_pykit.segmentation.visuals import random_sample_visualizations

# Visualize random samples from dataset
random_sample_visualizations(
    dataset=train_dataset,
    num_samples=5,
    save_path='./visualizations/'
)
```

---

## ⚙️ Configuration

### Complete Configuration Reference

```python
config = {
    # ============ Model Architecture ============
    'Model': 'UNet',  # Options: 'UNet', 'ResUNet', 'DeepLabV3Plus', 'FCN'
    'Number of classes': 3,
    'Image size': (256, 256),
    'Backbone': 'resnet34',  # For ResUNet, DeepLabV3+, FCN
    'Activation function': 'relu',  # Options: 'sigmoid', 'softmax'
    'Pretrained': True,  # Use pretrained backbone
    
    # ============ Loss Function ============
    'Loss function': 'DICE',  # Options: 'DICE', 'CrossEntropy', 'Focal', 'Tversky'
    'Smooth': 1.0,  # For DICE and Tversky
    'Alpha': 0.25,  # For Focal and Tversky
    'Beta': 0.5,   # For Tversky
    'Gamma': 2.0,  # For Focal
    'Reduction': 'mean',  # Options: 'mean', 'sum', 'none'
    
    # ============ Training ============
    'Epochs': 100,
    'Batch size': 16,
    'Num workers': 4,
    'Pin memory': True,
    
    # ============ Device & Performance ============
    'Device': 'cuda',  # Options: 'cuda', 'cpu', 'cuda:0', 'cuda:1'
    'Mixed precision': True,  # Use AMP for faster training
    'Gradient clipping': 1.0,  # Max gradient norm
    
    # ============ Model Saving ============
    'Save model': True,
    'Model name': 'my_segmentation_model',
    'Save path': './checkpoints/',
    
    # ============ Monitoring ============
    'Print frequency': 10,  # Print every N batches
    'WandB monitoring': None,  # Or ['api_key', 'project', 'run_name']
    
    # ============ Evaluation ============
    'Save results': True,
    'Results path': './results/',
    'Results format': 'npz',
}
```

---

## 📊 Experiment Tracking

### Weights & Biases Integration

The toolkit seamlessly integrates with Weights & Biases for experiment tracking:

```python
# Configure WandB
config['WandB monitoring'] = [
    'your_api_key',      # Your WandB API key
    'project_name',      # Project name
    'experiment_name'    # Run name
]

# Training will automatically log:
# - Training and validation loss
# - Learning rate changes
# - Model architecture
# - Hyperparameters
# - System metrics (GPU usage, etc.)
# - Sample predictions (if configured)
```

### Logged Metrics

- **Per Epoch**: Loss, DICE score, IoU, learning rate
- **Per Batch**: Training loss, batch processing time
- **System**: GPU memory usage, CPU usage
- **Model**: Parameter count, model architecture

---

## 📈 Performance Evaluation

Quick evaluation for models trained with *TGCE-SS* loss:

```python
from gcpds_cv_pykit.segmentation.baseline import PerformanceAnnotHarmony

PerformanceAnnotHarmony(
    model        = model,
    test_dataset = test_loader,
    config       = config,   # needs: Num of annotators, Number of classes, …
    save_results = True,     # saves *.npy to config["drive_dir"]/results/
    probabilistic= False     # True → no GT, average over 9 thresholds
)
```

Metrics reported (global + per-class):  
**DICE** | **Jaccard (IoU)** | **Sensitivity** | **Specificity**

Probabilistic mode averages each metric over thresholds `[0.1 … 0.9]`.

Saved files:  
`<model>_<dataset>_{probabilistic}_Dice_global.npy`, `_class0.npy`, …


---

## 🤝 Contributing

We welcome contributions from the community! Here's how you can help:

### Ways to Contribute

- 🐛 Report bugs and issues
- 💡 Suggest new features or improvements
- 📝 Improve documentation
- 🔧 Submit pull requests
- ⭐ Star the repository

### Development Setup

```bash
# Clone the repository
git clone https://github.com/UN-GCPDS/gcpds-cv-pykit.git
cd gcpds-cv-pykit

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest tests/

# Check code style
black gcpds_cv_pykit/
flake8 gcpds_cv_pykit/
isort gcpds_cv_pykit/
```

### Pull Request Process

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests and linting
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

```
MIT License

Copyright (c) 2025 SPRG - GCPDS Team

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
```

---

## 📚 Citation

If you use this toolkit in your research, please cite:

```bibtex
@software{gcpds_cv_pykit,
  title = {GCPDS Computer Vision Python Kit},
  author = {GCPDS Team},
  year = {2025},
  url = {https://github.com/UN-GCPDS/gcpds-cv-pykit},
  version = {0.1.0}
}
```

---

## 📞 Contact

**GCPDS Team** - Universidad Nacional de Colombia

- 📧 Email: gcpds_man@unal.edu.co
- 🌐 GitHub: [UN-GCPDS](https://github.com/UN-GCPDS)
- 📖 Documentation: [https://gcpds-cv-pykit.readthedocs.io/](https://gcpds-cv-pykit.readthedocs.io/)
- 🐛 Issues: [GitHub Issues](https://github.com/UN-GCPDS/gcpds-cv-pykit/issues)

---

## 🙏 Acknowledgments

- PyTorch team for the excellent deep learning framework
- The computer vision research community
- Contributors and users of this toolkit
- Universidad Nacional de Colombia

---

<div align="center">

**Made with ❤️ by the GCPDS Team**

[⬆ Back to Top](#gcpds-computer-vision-python-kit)

</div>

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/UN-GCPDS/gcpds-cv-pykit",
    "name": "gcpds-cv-pykit",
    "maintainer": "GCPDS Team",
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": "gcpds_man@unal.edu.co",
    "keywords": "computer vision, segmentation, deep learning, pytorch, unet, medical imaging, image processing, machine learning, artificial intelligence",
    "author": "GCPDS Team",
    "author_email": "gcpds_man@unal.edu.co",
    "download_url": null,
    "platform": "any",
    "description": "# GCPDS Computer Vision Python Kit\r\n\r\n<div align=\"center\">\r\n\r\n[![Python Version](https://img.shields.io/badge/python-3.8%2B-blue.svg)](https://www.python.org/downloads/)\r\n[![PyTorch](https://img.shields.io/badge/PyTorch-2.0%2B-ee4c2c.svg)](https://pytorch.org/)\r\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\r\n[![Version](https://img.shields.io/badge/version-0.1.0.63-green.svg)](https://github.com/UN-GCPDS/gcpds-cv-pykit)\r\n\r\nA comprehensive PyTorch-based toolkit for computer vision and semantic segmentation tasks, developed by the GCPDS Team at Universidad Nacional de Colombia.\r\n\r\n[Features](#features) \u2022 [Installation](#installation) \u2022 [Quick Start](#quick-start) \u2022 [Documentation](#documentation) \u2022 [Examples](#examples)\r\n\r\n</div>\r\n\r\n---\r\n\r\n## \ud83d\udccb Table of Contents\r\n\r\n- [Overview](#overview)\r\n- [Features](#features)\r\n- [Installation](#installation)\r\n- [Quick Start](#quick-start)\r\n- [Architecture](#architecture)\r\n- [Available Models](#available-models)\r\n- [Loss Functions](#loss-functions)\r\n- [Datasets](#datasets)\r\n- [Usage Examples](#usage-examples)\r\n- [Configuration](#configuration)\r\n- [Experiment Tracking](#experiment-tracking)\r\n- [Performance Evaluation](#performance-evaluation)\r\n- [Contributing](#contributing)\r\n- [License](#license)\r\n- [Citation](#citation)\r\n- [Contact](#contact)\r\n\r\n---\r\n\r\n## \ud83d\udd0d Overview\r\n\r\n**gcpds-cv-pykit** is a powerful and flexible toolkit designed for semantic segmentation tasks in computer vision. Built on PyTorch, it provides a complete pipeline from dataset preparation to model training and evaluation, with built-in support for experiment tracking via Weights & Biases.\r\n\r\n### Key Highlights\r\n\r\n- \ud83c\udfaf **Multiple State-of-the-Art Models**: UNet, ResUNet, DeepLabV3+, FCN\r\n- \ud83d\udcca **Comprehensive Loss Functions**: DICE, Cross-Entropy, Focal, Tversky\r\n- \ud83d\uddc2\ufe0f **Easy Dataset Management**: Built-in support for Kaggle datasets\r\n- \ud83d\udcc8 **Experiment Tracking**: Seamless integration with Weights & Biases\r\n- \ud83d\udd27 **Flexible Configuration**: Dictionary-based configuration system\r\n- \ud83d\ude80 **Production Ready**: Mixed precision training, GPU optimization\r\n- \ud83d\udcc9 **Rich Evaluation Metrics**: DICE, Jaccard, Sensitivity, Specificity\r\n\r\n---\r\n\r\n## \u2728 Features\r\n\r\n### Core Capabilities\r\n\r\n- **Baseline Segmentation Models**\r\n  - UNet with customizable depth and filters\r\n  - ResUNet with residual connections\r\n  - DeepLabV3+ with atrous spatial pyramid pooling\r\n  - Fully Convolutional Networks (FCN)\r\n\r\n- **Advanced Training Features**\r\n  - Mixed precision training (AMP) for faster computation\r\n  - Automatic learning rate scheduling\r\n  - Training phases to leverage features learned by pre-trained models\r\n  - Early stopping with patience\r\n  - Model checkpointing (best and last)\r\n  - GPU memory optimization\r\n  - Multi-GPU support\r\n\r\n- **Loss Functions**\r\n  - DICE Loss for imbalanced segmentation\r\n  - Cross-Entropy Loss\r\n  - Focal Loss for hard example mining\r\n  - Tversky Loss for precision-recall trade-off\r\n\r\n- **Dataset Utilities**\r\n  - Automatic Kaggle dataset download\r\n  - Pre-configured datasets: OxfordIITPet, SeedGermination, BreastCancer, FeetMamitas\r\n  - Support for crowd-sourced datasets\r\n  - Custom dataset integration\r\n\r\n- 
**Visualization Tools**\r\n  - Random sample visualizations\r\n  - Training progress plots\r\n  - Prediction overlays\r\n\r\n- **Performance Evaluation**\r\n  - Comprehensive metrics calculation\r\n  - Per-class and global statistics\r\n  - Results export to Numpy files\r\n\r\n---\r\n\r\n## \ud83d\udce6 Installation\r\n\r\n### Prerequisites\r\n\r\n- Python 3.8 or higher\r\n- CUDA-compatible GPU (recommended)\r\n- pip or conda package manager\r\n\r\n### Basic Installation\r\n\r\n```bash\r\npip install gcpds-cv-pykit\r\n```\r\n\r\n### Installation from Source\r\n\r\n```bash\r\n# Clone the repository\r\ngit clone https://github.com/UN-GCPDS/gcpds-cv-pykit.git\r\ncd gcpds-cv-pykit\r\n\r\n# Install in development mode\r\npip install -e .\r\n```\r\n\r\n### Installation with Optional Dependencies\r\n\r\n```bash\r\n# For development (includes testing and linting tools)\r\npip install gcpds-cv-pykit[dev]\r\n\r\n# For documentation building\r\npip install gcpds-cv-pykit[docs]\r\n\r\n# For Jupyter notebook support\r\npip install gcpds-cv-pykit[jupyter]\r\n\r\n# Install all optional dependencies\r\npip install gcpds-cv-pykit[all]\r\n```\r\n\r\n### Dependencies\r\n\r\nCore dependencies include:\r\n- PyTorch >= 2.0.0\r\n- torchvision >= 0.15.0\r\n- numpy >= 1.21.0\r\n- opencv-python >= 4.6.0\r\n- matplotlib >= 3.5.0\r\n- wandb >= 0.15.0\r\n- tqdm >= 4.64.0\r\n- kagglehub (for dataset downloads)\r\n\r\n---\r\n\r\n## \ud83d\ude80 Quick Start\r\n\r\n### 1. Download a Dataset\r\n\r\n```python\r\nfrom gcpds_cv_pykit.segmentation.datasets import OxfordIITPet\r\n\r\n# Download and prepare the Oxford-IIIT Pet dataset\r\ndataset_path = OxfordIITPet()\r\nprint(f\"Dataset ready at: {dataset_path}\")\r\n```\r\n\r\n### 2. Prepare Your Data Loaders\r\n\r\nThis toolkit includes ready-to-use, high-performance datasets and dataloader helpers for semantic segmentation, supporting both standard single/multi-class masks and multi-annotator scenarios.\r\n\r\n#### Option A: Standard Segmentation (single or multi-class)\r\n\r\nUse Segmentation_Dataset and Segmentation_DataLoader for typical datasets organized as:\r\n- {data_dir}/{Partition}/images/*.png|jpg|jpeg\r\n- {data_dir}/{Partition}/masks/class_0/*.png\r\n- {data_dir}/{Partition}/masks/class_1/*.png\r\n- ...\r\nNotes:\r\n- Masks are loaded per-class; missing masks are handled as empty (zeros).\r\n- Augmentations (flips, rotations, color jitter, light noise) are applied only during training.\r\n- Images are normalized to [0, 1] and resized to the target image_size.\r\n\r\nExample:\r\n```python\r\nfrom gcpds_cv_pykit.segmentation.baseline.dataloaders import Segmentation_DataLoader\r\n\r\ndata_dir = \"/path/to/dataset\"\r\nimage_size = (256, 256)   # (H, W)\r\nnum_classes = 3\r\nbatch_size = 8\r\n\r\ntrain_loader = Segmentation_DataLoader(\r\n    data_dir=data_dir,\r\n    batch_size=batch_size,\r\n    image_size=image_size,\r\n    num_classes=num_classes,\r\n    partition=\"Train\",\r\n    single_class=None,          # set to an int (e.g., 0) to load only that class\r\n    augment=True,               # augmentations only applied when partition == \"Train\"\r\n    images_folder=\"images\",     # custom images folder name (default: \"images\")\r\n    num_workers=4,\r\n    prefetch_factor=2,\r\n    pin_memory=True,\r\n)\r\n\r\nval_loader = Segmentation_DataLoader(\r\n    data_dir=data_dir,\r\n    batch_size=batch_size,\r\n    image_size=image_size,\r\n    num_classes=num_classes,\r\n    partition=\"Val\",\r\n    augment=False,\r\n    images_folder=\"images\",\r\n    
num_workers=4,\r\n    pin_memory=True,\r\n)\r\n\r\n# Each batch returns:\r\n#   images: FloatTensor [B, 3, H, W], in [0,1]\r\n#   masks:  FloatTensor [B, C, H, W], binary per class\r\n```\r\n\r\nKey behaviors:\r\n- Natural/alphanumeric file sorting ensures consistent pairing between images and masks.\r\n- Supported image formats: .png, .jpg, .jpeg\r\n- Missing mask files are treated as zeros (no annotation for that class).\r\n- Rotation preserves mask binarization.\r\n\r\n#### Option B: Multi-Annotator + Ground Truth (Annotator Harmony)\r\n\r\nUse AnnotHarmonyDataset and AnnotHarmonyDataloader when you have multiple annotators per sample and optional ground truth. Directory structure:\r\n- {data_dir}/{Partition}/patches/*.png\r\n- {data_dir}/{Partition}/masks/{annotator_id}/class_{k}/*.png  (for each annotator and class)\r\n- {data_dir}/{Partition}/masks/ground_truth/class_{k}/*.png    (optional GT)\r\n\r\nNotes:\r\n- Concatenates annotator masks along channel dimension as [num_annotators * num_classes, H, W].\r\n- Returns a one-hot vector indicating which annotators provided a valid mask per sample.\r\n- Handles missing annotator masks by filling with an ignored value (default 0.6), which is not treated as foreground/background.\r\n- Supports training-time augmentations applied consistently to image, annotator masks, and ground truth.\r\n\r\nExample:\r\n```python\r\nfrom gcpds_cv_pykit.segmentation.baseline.dataloaders import AnnotHarmonyDataloader\r\n\r\ndata_dir = \"/path/to/harmony_dataset\"\r\nimage_size = (256, 256)\r\nnum_classes = 3\r\nnum_annotators = 5\r\nbatch_size = 4\r\n\r\n# Train with annotators + ground truth\r\ntrain_loader = AnnotHarmonyDataloader(\r\n    data_dir=data_dir,\r\n    batch_size=batch_size,\r\n    image_size=image_size,\r\n    num_classes=num_classes,\r\n    num_annotators=num_annotators,\r\n    partition=\"Train\",\r\n    annotators=True,\r\n    ground_truth=True,\r\n    single_class=None,      # or an int to restrict to a single class\r\n    augment=True,\r\n    num_workers=4,\r\n    prefetch_factor=2,\r\n    pin_memory=True,\r\n)\r\n\r\n# Validation without augmentations\r\nval_loader = AnnotHarmonyDataloader(\r\n    data_dir=data_dir,\r\n    batch_size=batch_size,\r\n    image_size=image_size,\r\n    num_classes=num_classes,\r\n    num_annotators=num_annotators,\r\n    partition=\"Val\",\r\n    annotators=True,\r\n    ground_truth=True,\r\n    augment=False,\r\n    num_workers=4,\r\n    pin_memory=True,\r\n)\r\n\r\n# Each batch returns (depending on annotators/ground_truth flags):\r\n# - If annotators and ground_truth:\r\n#     images:      [B, 3, H, W]\r\n#     masks:       [B, num_annotators * num_classes, H, W]\r\n#     anns_onehot: [B, num_annotators]  (1 if annotator present in sample, else 0)\r\n#     gt:          [B, C, H, W]\r\n# - If only annotators:\r\n#     images, masks, anns_onehot\r\n# - If only ground_truth:\r\n#     images, gt\r\n```\r\n\r\nTips:\r\n- Set single_class to focus training on a specific class while preserving API consistency.\r\n- Use num_workers > 0 with pin_memory=True for faster GPU input pipelines.\r\n- For custom image folder names, use images_folder in Segmentation_DataLoader; AnnotHarmony expects images under patches by default.\r\n- Augmentations include horizontal/vertical flips, small rotations, brightness/contrast/saturation jitter, and light Gaussian noise on images only.\r\n\r\n### 3. 
Configure and Train a Model\r\n\r\nYou can train either a standard segmentation model (UNet/ResUNet/DeepLabV3+/FCN) or the Annotator Harmony model for multi-annotator learning. Both trainers support: phased fine-tuning, AMP, W&B logging, best/last checkpointing, and rich metric plots.\r\n\r\n#### Option A: Baseline Segmentation (UNet, ResUNet, DeepLabV3+, FCN)\r\n\r\nTrainer: `SegmentationModel_Trainer`\r\n\r\n- Models: UNet, ResUNet, DeepLabV3+, FCN\r\n- Losses: DICE, CrossEntropy, Focal, Tversky\r\n- Phased training: progressively unfreeze encoder (phases 1\u21924)\r\n- Metrics per-epoch: global/per-class DICE, IoU, Sensitivity, Specificity\r\n- Visualizations every 5 epochs\r\n- Saves: best_model.pt (by best Val DICE), last_model.pt\r\n- Plots saved under results/experiment_X/\r\n\r\nExample:\r\n\r\n```python\r\nfrom gcpds_cv_pykit.segmentation.baseline.trainers import SegmentationModel_Trainer\r\n\r\n# Configuration dictionary\r\nconfig = {\r\n    # Model\r\n    'Model': 'UNet',                      # 'UNet' | 'ResUNet' | 'DeepLabV3+' | 'FCN'\r\n    'Backbone': 'resnet34',\r\n    'Number of classes': 3,\r\n    'Input size': [3, 256, 256],          # channels, H, W\r\n    'Image size': (256, 256),             # used for plotting/metadata\r\n    'Pretrained': True,\r\n    'Activation function': None,          # None | 'sigmoid' (applied as model final_activation)\r\n\r\n    # Loss\r\n    'Loss function': 'DICE',              # 'DICE' | 'CrossEntropy' | 'Focal' | 'Tversky'\r\n    'Smooth': 1.0,\r\n    'Reduction': 'mean',\r\n    'Alpha': 0.75,                        # Focal/Tversky\r\n    'Beta': 0.3,                          # Tversky\r\n    'Gamma': 2.0,                         # Focal\r\n\r\n    # Training\r\n    'Epochs': 50,\r\n    'Device': 'cuda',                     # 'cuda' | 'cpu' | 'cuda:0', ...\r\n    'AMixPre': True,                      # Automatic Mixed Precision (AMP)\r\n    'Train phases': True,                 # phased fine-tuning (see below)\r\n    'Single class train': None,           # int to train/evaluate a single class\r\n    'Single class valid': None,\r\n\r\n    # Monitoring\r\n    'Wandb monitoring': None,             # or ['api_key', 'project', 'run_name']\r\n\r\n    # Checkpoints/dirs (saved automatically)\r\n    # models_dir = './models' (internal default)\r\n}\r\n\r\n# Initialize and start training\r\ntrainer = SegmentationModel_Trainer(\r\n    train_loader=train_loader,            # yields (images, masks)\r\n    valid_loader=valid_loader,            # yields (images, masks)\r\n    config=config\r\n)\r\ntrainer.start()                           # trains, logs, saves plots and checkpoints\r\n```\r\n\r\nTraining phases (when 'Train phases' = True):\r\n- Phase 1 (epochs 0\u20139): Freeze encoder, train decoder + segmentation_head (lr=1e-4)\r\n- Phase 2 (10\u201319): + Unfreeze encoder BatchNorm layers (lr=1e-5)\r\n- Phase 3 (20\u201329): + Unfreeze encoder layer4 (lr=1e-5)\r\n- Phase 4 (30+): + Unfreeze encoder layer3 (lr=1e-5), scheduler ExponentialLR(gamma=0.94)\r\n\r\nNotes:\r\n- For Focal/CrossEntropy, predictions are sigmoid-ed internally for metric computation and visualizations.\r\n- Visualizations sample random classes; for single-class mode, GT display adapts automatically.\r\n\r\n#### Option B: Annotator Harmony (Multi-Annotator Learning)\r\n\r\nTrainer: `AnnotHarmonyTrainer`\r\n\r\n- Model: `AnnotHarmonyModel` (joint segmentation + annotator reliability)\r\n- Input: images + stacked annotator masks + annotator presence one-hot (+ optional GT)\r\n- 
Loss: TGCE_SS (robust to noisy annotations; supports ignored value)\r\n- Flexible validation: with/without annotator masks and/or ground truth\r\n- Metrics over GT when provided: global/per-class DICE, IoU, Sensitivity, Specificity\r\n- Reliability map visualizations per annotator\r\n- Saves: best_model.pt (by best Val DICE when GT available), last_model.pt\r\n- Plots saved under results/experiment_X/\r\n\r\nExample (with annotators + GT):\r\n\r\n```python\r\nfrom gcpds_cv_pykit.segmentation.baseline.trainers import AnnotHarmonyTrainer\r\n\r\nconfig = {\r\n    # Model\r\n    'Model': 'AnnotHarmony',              # informational; model is constructed internally\r\n    'Input size': [3, 256, 256],\r\n    'Number of classes': 3,\r\n    'Num of annotators': 5,\r\n    'Activation seg': 'sparse_softmax',   # segmentation head activation inside model\r\n    'Activation rel': 'softmax',          # reliability head activation\r\n\r\n    # Loss (TGCE_SS)\r\n    'Loss function': 'TGCE_SS',\r\n    'Ignore value': 0.6,                  # value used to fill missing annotator masks\r\n    'Q parameter': 0.7243854912956864,    # TGCE hyperparam\r\n\r\n    # Training\r\n    'Epochs': 50,\r\n    'Device': 'cuda',\r\n    'AMixPre': True,                      # AMP\r\n    'Train phases': True,                 # phased fine-tuning similar to baseline\r\n    'Single class train': None,           # int to focus metrics on one class\r\n    'Single class valid': None,\r\n\r\n    # Data flags\r\n    'Ground truth train': True,           # train loader returns GT masks\r\n    'Ground truth valid': True,           # valid loader returns GT masks\r\n    'Annotators valid': True,             # valid loader returns annotator masks\r\n\r\n    # Monitoring\r\n    'Wandb monitoring': None,             # or ['api_key', 'project', 'run_name']\r\n}\r\n\r\n# Initialize and start training\r\ntrainer = AnnotHarmonyTrainer(\r\n    train_loader=train_loader,            # yields (images, ann_masks, ann_onehot, gt) or (images, ann_masks, ann_onehot)\r\n    valid_loader=valid_loader,            # supports (images, ann_masks, ann_onehot, gt), (images, ann_masks, ann_onehot), or (images, gt)\r\n    config=config\r\n)\r\ntrainer.start()\r\n```\r\n\r\nValidation/data combinations supported:\r\n- annotators + ground truth: returns loss and metrics\r\n- annotators only: returns loss\r\n- ground truth only: returns metrics\r\n\r\nTraining phases (when 'Train phases' = True):\r\n- Phase 1 (0\u20139): Freeze encoder; train decoder, seg_head, ann_rel (lr=1e-4)\r\n- Phase 2 (10\u201319): + Unfreeze encoder BatchNorm (lr=1e-5)\r\n- Phase 3 (20\u201329): + Unfreeze encoder layer4 (lr=1e-5)\r\n- Phase 4 (30+): + Unfreeze encoder layer3 (lr=1e-5), scheduler ExponentialLR(gamma=0.94)\r\n\r\nOutputs and logging (both trainers):\r\n- Console: device info (CUDA, memory), per-epoch losses and metrics (global and per-class)\r\n- Weights & Biases (optional): losses and metrics per epoch, prediction visualizations\r\n- Files:\r\n  - ./models/best_model.pt (best Val DICE when available)\r\n  - ./models/last_model.pt\r\n  - ./results/experiment_X/{Loss.png, DICE.png, Jaccard.png, Sensitivity.png, Specificity.png}\r\n\r\nTips:\r\n- Set 'AMixPre': True for faster training with AMP on CUDA.\r\n- Use 'Single class train/valid' to focus metrics on one class without changing labels.\r\n- Enable 'Train phases' for stable fine-tuning of pretrained backbones.\r\n- For W&B, set 'Wandb monitoring' to ['api_key', 'project', 'run_name'].\r\n\r\n### 4. 
Evaluate Model Performance\r\n\r\nOnce training is complete, use the evaluation utilities to compute test-set metrics and optionally save detailed results. We provide two evaluators:\r\n\r\n- Baseline evaluator for standard segmentation models: `PerformanceModels` (class)\r\n- Annotator Harmony evaluator for multi-annotator models: `PerformanceAnnotHarmony` (function)\r\n\r\nBoth evaluators report global and per-class metrics:\r\n- DICE (F1), Jaccard (IoU), Sensitivity (Recall), Specificity\r\n- Mean \u00b1 std across the test set\r\n- Optional .npy dumps of global and per-class metric arrays\r\n\r\n#### 4.1 Baseline Segmentation Evaluator\r\n\r\nUse `PerformanceModels` to evaluate UNet/ResUNet/DeepLabV3+/FCN models on a test DataLoader that yields (images, gt_masks).\r\n\r\nExample:\r\n\r\n```python\r\nfrom gcpds_cv_pykit.segmentation.baseline import PerformanceModels\r\nfrom torch.utils.data import DataLoader\r\n\r\n# Build test loader\r\ntest_loader = DataLoader(test_dataset, batch_size=8, shuffle=False, num_workers=4, pin_memory=True)\r\n\r\n# Config notes:\r\n# - Must include keys used during training, e.g. 'Device', 'Number of classes', 'Loss function'\r\n# - Optional: 'AMixPre' (AMP), 'Single class test' (int), 'Ignored value' (float), 'Save results' (bool), 'drive_dir'\r\nconfig.update({\r\n    'Device': 'cuda:0',\r\n    'Save results': True,         # to save .npy files\r\n    'drive_dir': './',            # base folder for results\r\n    # 'Single class test': 0,     # optionally restrict evaluation to a single class index\r\n    # 'Ignored value': 0.6,       # pixels equal to this value are ignored in metrics\r\n})\r\n\r\n# Evaluate\r\nevaluator = PerformanceModels(\r\n    model=trainer.model,          # already-trained baseline model\r\n    test_dataset=test_loader,     # DataLoader yielding (images, gt_masks)\r\n    config=config\r\n)\r\n\r\n# Quick access to global means (convenience attributes)\r\nprint(f\"Mean DICE Score: {evaluator.mean_dice:.4f}\")\r\nprint(f\"Mean Jaccard: {evaluator.mean_jjacard:.4f}\")\r\n```\r\n\r\nWhat it does:\r\n- Moves the model to config['Device'], switches to eval mode, and runs inference with or without AMP (config['AMixPre']).\r\n- Applies sigmoid to predictions when evaluating CrossEntropy/Focal losses so metrics use probabilities.\r\n- Thresholds predictions at 0.5 to compute discrete masks and then computes metrics per-batch and per-class.\r\n- Ignores pixels that match config['Ignored value'] (default 0.6) when computing metrics.\r\n- Prints global and per-class summaries; optionally saves .npy arrays under:\r\n  - results/{Dataset}_{Model}_{Loss function}_DICE_global.npy, etc.\r\n  - results/{Dataset}_{Model}_{Loss function}_DICE_class{c}.npy, etc.\r\n\r\nNotes and tips:\r\n- Single-class evaluation: set `'Single class test': class_index` to evaluate only that class.\r\n- Ensure config['Number of classes'] matches the model output.\r\n- If using Focal/CrossEntropy losses for training, the evaluator handles activation for fair metric computation.\r\n\r\n#### 4.2 Annotator Harmony Evaluator (Multi-Annotator)\r\n\r\nUse `PerformanceAnnotHarmony` when evaluating the Annotator Harmony model trained with TGCE_SS. 
#### 4.2 Annotator Harmony Evaluator (Multi-Annotator)

Use `PerformanceAnnotHarmony` when evaluating the Annotator Harmony model trained with TGCE_SS. The test DataLoader must yield batches like:
- (images, annotator_masks, annotator_presence_one_hot, gt_masks)

If ground-truth masks are available, the evaluator computes metrics against GT; it always computes the TGCE_SS loss against the annotator masks.

Example:

```python
from gcpds_cv_pykit.segmentation.baseline import PerformanceAnnotHarmony
from torch.utils.data import DataLoader

# Build test loader for AnnotHarmonyDataset
# Each batch must be (images, ann_masks, ann_onehot, gt_masks)
test_loader = DataLoader(test_dataset, batch_size=4, shuffle=False, num_workers=4, pin_memory=True)

config.update({
    'Device': 'cuda:0',
    'Num of annotators': 5,
    'Number of classes': 3,
    'AMixPre': True,                 # AMP for faster inference
    'Ignored value': 0.6,            # ignore marker in GT
    'Q paramater': 0.7243854912956864,  # TGCE_SS q
    'Smooth': 1e-7,
    'Main_model': 'AnnotHarmony',
    'Dataset': 'MyDataset',
    'drive_dir': './',
})

# Evaluate; set save_results=True to dump .npy files
PerformanceAnnotHarmony(
    model=trainer.model,              # trained AnnotHarmonyModel
    test_dataset=test_loader,
    config=config,
    save_results=True
)
```

What it does:
- Runs the model with inputs (images, annotator_presence_one_hot).
- Computes the TGCE_SS loss against annotator masks.
- Uses ground-truth masks (if present in the batch) to compute DICE, Jaccard, Sensitivity, Specificity.
- Supports optional single-class evaluation via `'Single class test': int`.
- Saves arrays to results/{Main_model}_{Dataset}_*.npy when save_results=True.

Expected saved files (when saving is enabled):
- Global: Loss, Dice_global, Jaccard_global, Sensitivity_global, Specificity_global
- Per class: Dice_class{c}, Jaccard_class{c}, Sensitivity_class{c}, Specificity_class{c}

#### 4.3 Configuration Keys Reference

Common keys:
- Device: 'cuda', 'cuda:0', or 'cpu'
- Number of classes: int
- AMixPre: bool, use autocast AMP for speed on CUDA
- Single class test: int or None, restrict metrics to one class channel
- Ignored value: float, pixels equal to this value in the GT are excluded from metrics
- Smooth: float, epsilon for metric stability
- Save results: bool, save .npy arrays of metrics
- drive_dir: str, base directory for results/

Annotator Harmony-specific:
- Num of annotators: int
- Q paramater: float, TGCE_SS hyperparameter
- Main_model, Dataset: strings used in saved filenames

#### 4.4 Metric Definitions

Given per-pixel TP, FP, FN, TN counts and a smoothing parameter s (a code version appears after the troubleshooting list below):
- DICE: (2·TP + s) / (2·TP + FP + FN + s)
- Jaccard (IoU): (TP + s) / (TP + FP + FN + s)
- Sensitivity (Recall): (TP + s) / (TP + FN + s)
- Specificity: (TN + s) / (TN + FP + s)

These are computed per class, then aggregated.

#### 4.5 Troubleshooting

- All-zero predictions or masks: smoothing avoids NaNs; still verify class balance.
- Mismatched shapes: ensure the DataLoader yields [B, C, H, W] masks with the correct number of channels.
- Class indexing: for single-class tests, ensure the chosen class aligns with your dataset label mapping.
- AMP on CPU: set AMixPre=False if running on CPU (AMP is CUDA-optimized).
- CrossEntropy/Focal outputs: the evaluator applies sigmoid before thresholding for fair comparison.

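As a concrete companion to the definitions in 4.4, the sketch below computes the four metrics per class from binary prediction/ground-truth tensors with a smoothing term. It mirrors the formulas above but is not the toolkit's internal implementation.

```python
import torch


def per_class_metrics(pred: torch.Tensor, gt: torch.Tensor, smooth: float = 1e-7):
    """pred, gt: binary tensors [B, C, H, W]; returns a dict of per-class tensors [C]."""
    dims = (0, 2, 3)                      # reduce over batch and spatial dimensions
    tp = (pred * gt).sum(dims)            # true positives per class
    fp = (pred * (1 - gt)).sum(dims)      # false positives per class
    fn = ((1 - pred) * gt).sum(dims)      # false negatives per class
    tn = ((1 - pred) * (1 - gt)).sum(dims)

    return {
        "dice":        (2 * tp + smooth) / (2 * tp + fp + fn + smooth),
        "jaccard":     (tp + smooth) / (tp + fp + fn + smooth),
        "sensitivity": (tp + smooth) / (tp + fn + smooth),
        "specificity": (tn + smooth) / (tn + fp + smooth),
    }
```

Per-class values can then be averaged to obtain the global numbers reported by the evaluators.
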
This evaluation section complements the training pipeline, giving consistent, reproducible metrics and optional artifacts for post-analysis.

---

## 🏗️ Architecture

```
gcpds_cv_pykit/
├── segmentation/
│   ├── baseline/
│   │   ├── models/          # Segmentation architectures
│   │   │   ├── UNet.py
│   │   │   ├── ResUNet.py
│   │   │   ├── DeepLabV3Plus.py
│   │   │   └── FCN.py
│   │   ├── losses/          # Loss functions
│   │   │   ├── DICE.py
│   │   │   ├── CrossEntropy.py
│   │   │   ├── Focal.py
│   │   │   └── Tversky.py
│   │   ├── trainers/        # Training pipeline
│   │   │   └── trainer.py
│   │   ├── dataloaders/     # Data loading utilities
│   │   │   └── dataloader.py
│   │   └── performance_model.py  # Evaluation metrics
│   ├── datasets/            # Dataset utilities
│   │   └── datasets.py
│   ├── visuals/             # Visualization tools
│   │   └── random_sample_visualizations.py
│   └── crowd/               # Crowd-sourced annotation support
│       ├── models/
│       │   └── AnnotHarmony.py
│       ├── losses/
│       │   └── TGCE_SS.py
│       ├── trainers/
│       │   └── AnnotHarmonyTrainer.py
│       ├── dataloaders/
│       │   └── annot_harmony_dataloader.py
│       └── performance/
│           └── performance_annotharmony.py
└── _version.py
```

---

## 🧠 Available Models

### UNet
Classic U-Net architecture with encoder-decoder structure and skip connections.

```python
config = {
    'Model': 'UNet',
    'Number of classes': 3,
    'Image size': (256, 256),
}
```

**Features:**
- Symmetric encoder-decoder architecture
- Skip connections for feature preservation
- Customizable depth and filter sizes
- Pre-trained backbone support
- Batch normalization and dropout support

### ResUNet
U-Net with residual blocks for improved gradient flow.

```python
config = {
    'Model': 'ResUNet',
    'Number of classes': 3,
    'Backbone': 'resnet34',  # or mobilenetv3
}
```

**Features:**
- Residual connections in encoder blocks
- Pre-trained backbone support
- Better convergence for deep networks
- Reduced vanishing gradient problems

### DeepLabV3+
State-of-the-art model with atrous spatial pyramid pooling.

```python
config = {
    'Model': 'DeepLabV3Plus',
    'Number of classes': 3,
    'Backbone': 'resnet34',
}
```

**Features:**
- Atrous Spatial Pyramid Pooling (ASPP)
- Multi-scale feature extraction
- Encoder-decoder with atrous convolution
- Pre-trained backbone support
- Excellent for complex scenes

### FCN (Fully Convolutional Network)
Efficient fully convolutional architecture.

```python
config = {
    'Model': 'FCN',
    'Number of classes': 3,
    'Backbone': 'resnet34',
}
```

**Features:**
- End-to-end convolutional architecture
- No fully connected layers
- Fast inference
- Pre-trained backbone support
- Good baseline model

---

## 📉 Loss Functions

### DICE Loss
Optimizes the DICE coefficient directly; ideal for imbalanced segmentation.

```python
config = {
    'Loss function': 'DICE',
    'Smooth': 1.0,
    'Reduction': 'mean',
}
```

### Cross-Entropy Loss
Standard pixel-wise classification loss.

```python
config = {
    'Loss function': 'CrossEntropy',
    'Reduction': 'mean',
}
```

### Focal Loss
Addresses class imbalance by down-weighting easy examples.

```python
config = {
    'Loss function': 'Focal',
    'Alpha': 0.25,
    'Gamma': 2.0,
    'Reduction': 'mean',
}
```

### Tversky Loss
Allows control over the false positive / false negative trade-off; an illustrative sketch follows the config snippet below.

```python
config = {
    'Loss function': 'Tversky',
    'Alpha': 0.5,  # Weight for false positives
    'Beta': 0.5,   # Weight for false negatives
    'Smooth': 1.0,
}
```

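To make the `Alpha`/`Beta` trade-off concrete, here is a minimal, self-contained Tversky loss sketch consistent with the parameters above. It is an illustration of the standard Tversky index, not the toolkit's `Tversky.py`.

```python
import torch
import torch.nn as nn


class TverskyLossSketch(nn.Module):
    """Illustrative Tversky loss: TI = TP / (TP + alpha*FP + beta*FN)."""

    def __init__(self, alpha: float = 0.5, beta: float = 0.5, smooth: float = 1.0):
        super().__init__()
        self.alpha, self.beta, self.smooth = alpha, beta, smooth

    def forward(self, probs: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # probs and target are expected as [B, C, H, W] with values in [0, 1].
        dims = (0, 2, 3)
        tp = (probs * target).sum(dims)
        fp = (probs * (1 - target)).sum(dims)        # weighted by alpha
        fn = ((1 - probs) * target).sum(dims)        # weighted by beta
        tversky = (tp + self.smooth) / (tp + self.alpha * fp + self.beta * fn + self.smooth)
        return 1.0 - tversky.mean()
```

With `Alpha = Beta = 0.5` the index reduces to the DICE coefficient; increasing `Beta` penalizes false negatives more heavily, which favors recall.
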
---

## 🗂️ Datasets

### Built-in Dataset Support

The toolkit provides easy access to several pre-configured datasets from Kaggle:

```python
from gcpds_cv_pykit.segmentation.datasets import (
    OxfordIITPet,
    SeedGermination,
    BreastCancer,
    FeetMamitas,
    OxfordIITPet_Crowd,
    BreastCancer_Crowd
)

# Download datasets
oxford_path = OxfordIITPet()
seeds_path = SeedGermination()
cancer_path = BreastCancer()
feet_path = FeetMamitas()

# Crowd-sourced annotation datasets
oxford_crowd_path = OxfordIITPet_Crowd()
cancer_crowd_path = BreastCancer_Crowd()
```

### Custom Dataset Integration

You can easily integrate your own datasets:

```python
from torch.utils.data import Dataset

class CustomSegmentationDataset(Dataset):
    def __init__(self, root_dir, transform=None):
        self.root_dir = root_dir
        self.transform = transform
        # Your initialization code

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        image = self.load_image(idx)
        mask = self.load_mask(idx)

        if self.transform:
            image = self.transform(image)
            mask = self.transform(mask)

        return image, mask
```

---

## 💡 Usage Examples

### Complete Training Pipeline

```python
import torch
from torch.utils.data import DataLoader
from gcpds_cv_pykit.segmentation.baseline.trainers import SegmentationModel_Trainer
from gcpds_cv_pykit.segmentation.datasets import OxfordIITPet

# 1. Download dataset
dataset_path = OxfordIITPet()

# 2. Create data loaders (assuming you have a dataset class)
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=4)
valid_loader = DataLoader(valid_dataset, batch_size=16, shuffle=False, num_workers=4)

# 3. Configure training
config = {
    # Model configuration
    'Model': 'ResUNet',
    'Number of classes': 3,
    'Image size': (256, 256),
    'Backbone': 'resnet34',
    'Activation function': 'relu',

    # Training configuration
    'Loss function': 'DICE',
    'Learning rate': 0.001,
    'Optimizer': 'Adam',
    'Epochs': 100,
    'Batch size': 16,

    # Device and performance
    'Device': 'cuda' if torch.cuda.is_available() else 'cpu',
    'Mixed precision': True,
    'Num workers': 4,

    # Regularization
    'Weight decay': 1e-5,
    'Dropout': 0.2,

    # Early stopping
    'Early stopping': True,
    'Patience': 15,

    # Model saving
    'Save model': True,
    'Model name': 'resunet_oxford_pets',
    'Save path': './checkpoints/',

    # Monitoring
    'Verbose': True,
}

# 4. Initialize and train
trainer = SegmentationModel_Trainer(
    train_loader=train_loader,
    valid_loader=valid_loader,
    config=config
)

# 5. Start training
trainer.train()

# 6. Access trained model
model = trainer.model
```

### Training with Weights & Biases

```python
# Add WandB configuration
config['WandB monitoring'] = [
    'your_wandb_api_key',
    'project_name',
    'experiment_name'
]

trainer = SegmentationModel_Trainer(
    train_loader=train_loader,
    valid_loader=valid_loader,
    config=config
)

trainer.train()
```

### Visualization

```python
from gcpds_cv_pykit.segmentation.visuals import random_sample_visualizations

# Visualize random samples from dataset
random_sample_visualizations(
    dataset=train_dataset,
    num_samples=5,
    save_path='./visualizations/'
)
```

---

## ⚙️ Configuration

### Complete Configuration Reference

```python
config = {
    # ============ Model Architecture ============
    'Model': 'UNet',  # Options: 'UNet', 'ResUNet', 'DeepLabV3Plus', 'FCN'
    'Number of classes': 3,
    'Image size': (256, 256),
    'Backbone': 'resnet34',  # For ResUNet, DeepLabV3+, FCN
    'Activation function': 'relu',  # Options: 'sigmoid', 'softmax'
    'Pretrained': True,  # Use pretrained backbone

    # ============ Loss Function ============
    'Loss function': 'DICE',  # Options: 'DICE', 'CrossEntropy', 'Focal', 'Tversky'
    'Smooth': 1.0,  # For DICE and Tversky
    'Alpha': 0.25,  # For Focal and Tversky
    'Beta': 0.5,   # For Tversky
    'Gamma': 2.0,  # For Focal
    'Reduction': 'mean',  # Options: 'mean', 'sum', 'none'

    # ============ Training ============
    'Epochs': 100,
    'Batch size': 16,
    'Num workers': 4,
    'Pin memory': True,

    # ============ Device & Performance ============
    'Device': 'cuda',  # Options: 'cuda', 'cpu', 'cuda:0', 'cuda:1'
    'Mixed precision': True,  # Use AMP for faster training
    'Gradient clipping': 1.0,  # Max gradient norm

    # ============ Model Saving ============
    'Save model': True,
    'Model name': 'my_segmentation_model',
    'Save path': './checkpoints/',

    # ============ Monitoring ============
    'Print frequency': 10,  # Print every N batches
    'WandB monitoring': None,  # Or ['api_key', 'project', 'run_name']

    # ============ Evaluation ============
    'Save results': True,
    'Results path': './results/',
    'Results format': 'npz',
}
```

---

## 📊 Experiment Tracking

### Weights & Biases Integration

The toolkit integrates with Weights & Biases for experiment tracking:

```python
# Configure WandB
config['WandB monitoring'] = [
    'your_api_key',      # Your WandB API key
    'project_name',      # Project name
    'experiment_name'    # Run name
]

# Training will automatically log:
# - Training and validation loss
# - Learning rate changes
# - Model architecture
# - Hyperparameters
# - System metrics (GPU usage, etc.)
# - Sample predictions (if configured)
```

### Logged Metrics

- **Per Epoch**: Loss, DICE score, IoU, learning rate
- **Per Batch**: Training loss, batch processing time
- **System**: GPU memory usage, CPU usage
- **Model**: Parameter count, model architecture

---

## 📈 Performance Evaluation

Quick evaluation for models trained with the *TGCE-SS* loss:

```python
from gcpds_cv_pykit.segmentation.baseline import PerformanceAnnotHarmony

PerformanceAnnotHarmony(
    model        = model,
    test_dataset = test_loader,
    config       = config,   # needs: Num of annotators, Number of classes, …
    save_results = True,     # saves *.npy to config["drive_dir"]/results/
    probabilistic= False     # True → no GT, average over 9 thresholds
)
```

Metrics reported (global + per-class):
**DICE** | **Jaccard (IoU)** | **Sensitivity** | **Specificity**

Probabilistic mode averages each metric over thresholds `[0.1 … 0.9]`.

Saved files:
`<model>_<dataset>_{probabilistic}_Dice_global.npy`, `_class0.npy`, …

---

## 🤝 Contributing

We welcome contributions from the community! Here's how you can help:

### Ways to Contribute

- 🐛 Report bugs and issues
- 💡 Suggest new features or improvements
- 📝 Improve documentation
- 🔧 Submit pull requests
- ⭐ Star the repository

### Development Setup

```bash
# Clone the repository
git clone https://github.com/UN-GCPDS/gcpds-cv-pykit.git
cd gcpds-cv-pykit

# Install development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

# Run tests
pytest tests/

# Check code style
black gcpds_cv_pykit/
flake8 gcpds_cv_pykit/
isort gcpds_cv_pykit/
```

### Pull Request Process

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Run tests and linting
5. Commit your changes (`git commit -m 'Add amazing feature'`)
6. Push to the branch (`git push origin feature/amazing-feature`)
7. Open a Pull Request

---

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

```
MIT License

Copyright (c) 2025 SPRG - GCPDS Team

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
```

---

## 📚 Citation

If you use this toolkit in your research, please cite:

```bibtex
@software{gcpds_cv_pykit,
  title = {GCPDS Computer Vision Python Kit},
  author = {GCPDS Team},
  year = {2025},
  url = {https://github.com/UN-GCPDS/gcpds-cv-pykit},
  version = {0.1.0}
}
```

---

## 📞 Contact

**GCPDS Team** - Universidad Nacional de Colombia

- 📧 Email: gcpds_man@unal.edu.co
- 🌐 GitHub: [UN-GCPDS](https://github.com/UN-GCPDS)
- 📖 Documentation: [https://gcpds-cv-pykit.readthedocs.io/](https://gcpds-cv-pykit.readthedocs.io/)
- 🐛 Issues: [GitHub Issues](https://github.com/UN-GCPDS/gcpds-cv-pykit/issues)

---

## 🙏 Acknowledgments

- PyTorch team for the excellent deep learning framework
- The computer vision research community
- Contributors and users of this toolkit
- Universidad Nacional de Colombia

---

<div align="center">

**Made with ❤️ by the GCPDS Team**

[⬆ Back to Top](#gcpds-computer-vision-python-kit)

</div>