# EEG Analysis Pipeline
A comprehensive Python package for EEG signal processing, trigger detection, and frequency-domain analysis.
## Overview
This package provides a complete pipeline for analyzing EEG data stored in European Data Format (EDF) files. It includes tools for signal loading, trigger detection, inter-trigger window analysis, and multi-band frequency-domain processing.
## Features
- **EDF File Loading**: Load and inspect EEG signals with flexible duration and channel selection
- **Trigger Detection**: Automated detection of trigger events with customizable thresholds
- **Window Analysis**: Generate and analyze inter-trigger intervals with multiple aggregation methods
- **Advanced Spectral Analysis**: Multi-band power analysis with temporal smoothing and spectral parameterization
- **Connectivity Analysis**: Graph-based network analysis with correlation, coherence, and phase metrics
- **Traditional Analysis**: Multi-band EEG analysis (Delta, Theta, Alpha, Beta, Gamma)
- **Spectral Parameterization**: Separate aperiodic (1/f) and periodic (oscillatory) components using SpecParam/FOOOF
- **Memory-Safe Processing**: HDF5-based analysis for large EEG files with bounded memory usage
- **Professional Organization**: Structured output directories with consistent naming conventions
- **Analysis Metadata**: Complete tracking of analysis parameters, timing, and results
- **Visualizations**: Plots and comprehensive analysis reports
- **ML Integration**: Optional machine learning-based window quality filtering
## Installation
```bash
pip install krembil-kit
```
## Quick Start
```python
from krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer, ConnectivityAnalyzer
# Load EEG data
loader = EDFLoader("data", "subject_name")
loader.load_and_plot_signals(signal_indices=[15, 25], duration=1200.0) # T6, T2
# Detect and plot triggers for temporal segmentation
detector = TriggerDetector(loader, 'T2')
detector.detect_triggers()
detector.plot_triggers()
# Option 1: Advanced spectral analysis
spectral_analyzer = SpectralAnalyzer(loader=loader, trigger_detector=detector)
spectral_analyzer.analyze_comprehensive() # Multi-band + spectral parameterization
# Option 2: Graph-based connectivity analysis
connectivity_analyzer = ConnectivityAnalyzer(edf_loader=loader)
# Level 1: Quick exploration
connectivity_analyzer.compute_correlation(start_time=0, stop_time=300, interval=10)
connectivity_analyzer.compute_coherence_average(start_time=0, stop_time=300, interval=30)
# Level 2: Detailed analysis
connectivity_analyzer.compute_coherence_bands(start_time=0, stop_time=300, interval=50)
# Plot results
connectivity_analyzer.plot_connectivity_matrices()
# Level 3: Full graph representations
# Memory-safe graph generation (works for any file size):
hdf5_path = connectivity_analyzer.generate_graphs(segment_duration=60.0, overlap_ratio=0.125)
```
## Data Structure Requirements
### Input Data Format
Your EDF files should be organized as follows:
```
data/
└── subject_name/
    └── subject_name.edf
```
### EDF File Requirements
- **Format**: European Data Format (.edf)
- **Channels**: Standard EEG channel names (Fp1, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz)
- **Sample Rate**: Typically 500 Hz (automatically detected)
- **Duration**: Minimum 10 minutes recommended for trigger detection
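You can sanity-check a recording against these requirements before running the pipeline by reading the EDF header directly with pyedflib (already a dependency of this package). A minimal sketch; the path is illustrative:
```python
import pyedflib

# Illustrative path following the layout described above
edf_path = "data/subject_name/subject_name.edf"

reader = pyedflib.EdfReader(edf_path)
labels = reader.getSignalLabels()                                   # channel names, e.g. ['Fp1', 'F3', ...]
rates = [reader.getSampleFrequency(i) for i in range(reader.signals_in_file)]
duration_s = reader.getFileDuration()                               # recording length in seconds
reader.close()

print(f"{len(labels)} channels, sample rates {sorted(set(rates))} Hz, {duration_s / 60:.1f} min")
```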
## Classes and Methods
### EDFLoader
Handles loading and inspection of EDF files.
#### Initialization
```python
loader = EDFLoader(folder_path, name)
```
**Parameters:**
- `folder_path` (str): Path to the data directory
- `name` (str): Name of the subject or experiment. The EDF file must sit inside a subdirectory of this name and be named `{name}.edf`.
For example, if our data were laid out under the current directory as below:
```
data/
└── Sebastian/
    └── Sebastian.edf
```
We would call (from the current directory):
```python
loader = EDFLoader("path/to/data/folder", "Sebastian")
```
#### Methods
##### `inspect_data()`
Inspects the EDF file for the specified subject, printing out various signal information including:
- File header details
- Number of signals and their properties
- Sample rates and signal ranges
- Physical and digital maximum/minimum values
- First 10 samples of each channel
```python
loader.inspect_data()
```
##### `load_and_plot_signals(signal_indices=None, duration=None, save_plots=False, save_path=None)`
Loads and plots the signals from the EDF file, storing them along with their sample rates in a dictionary.
**Parameters:**
- `signal_indices` (list of int, optional): Specific signal indices to load (None = load all signals)
- `duration` (float, optional): Duration in seconds to plot (None = entire duration)
- `save_plots` (bool, optional): Whether to save plots instead of showing them (default: False)
- `save_path` (str, optional): Directory to save plots (default: plots/{subject_name})
**Examples:**
```python
# Load T6 and T2 channels for 20 minutes
loader.load_and_plot_signals(signal_indices=[15, 25], duration=1200.0)
# Load all channels and save plots
loader.load_and_plot_signals(save_plots=True)
# Load specific duration with custom save path
loader.load_and_plot_signals(duration=1200.0, save_plots=True, save_path="custom_plots")
```
**Output:**
- Time-series plots with time axis in seconds
- Signals stored in `signals_dict` attribute with data and sample rates
- Plots saved to `plots/{subject_name}/signals_plot.png` (if save_plots=True)
- A warning is displayed when loading all signals, because of the potential memory usage
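After loading, the raw arrays are available through the loader's `signals_dict` for custom processing. A minimal sketch, assuming each entry stores the samples and sample rate under keys like `'data'` and `'sample_rate'` (the exact key names are an assumption; print one entry to confirm them in your installed version):
```python
import numpy as np

# Assumed layout: signals_dict = {channel_name: {'data': ndarray, 'sample_rate': float}}
entry = loader.signals_dict['T2']
signal = np.asarray(entry['data'])
fs = entry['sample_rate']

print(f"T2: {signal.size} samples at {fs} Hz ({signal.size / fs:.1f} s)")
```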
### TriggerDetector
Detects triggers and analyzes inter-trigger windows using amplitude thresholding and machine learning-based quality filtering.
#### Initialization
```python
detector = TriggerDetector(edf_loader, signal_choice)
```
**Parameters:**
- `edf_loader` (EDFLoader): An instance of the EDFLoader class that contains the signal data
- `signal_choice` (str): The key corresponding to the desired signal in the EDFLoader's signals_dict (e.g., 'T2', 'O1')
#### Methods
##### `butterworth_filter(signal, cutoff=30, order=5, btype='low')`
Applies a Butterworth filter to the signal.
**Parameters:**
- `signal` (numpy.ndarray): The input signal data to filter
- `cutoff` (float): The cutoff frequency in Hz (default: 30)
- `order` (int): The order of the filter (default: 5)
- `btype` (str): The type of the filter ('low', 'high', 'bandpass', 'bandstop') (default: 'low')
**Returns:** numpy.ndarray - The filtered signal
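For example, the same low-pass setting used internally by `detect_triggers()` can be applied to any 1-D signal array:
```python
import numpy as np

raw = np.abs(np.random.randn(5000))  # stand-in for a rectified EEG trace

# 30 Hz low-pass, the same configuration detect_triggers() uses internally
smoothed = detector.butterworth_filter(raw, cutoff=30, order=5, btype='low')
```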
##### `detect_triggers()`
Detects triggers in the signal based on a threshold and filters events to be between 55 seconds and 62 seconds long.
**Algorithm:**
1. Rectifies the signal using absolute value
2. Applies Butterworth low-pass filter (30 Hz cutoff, order 5)
3. Detects events above hardcoded threshold (60 µV)
4. Filters events by duration, keeping those 52-65 seconds long (a tolerance band around the nominal 55-62 second trigger length)
5. Handles edge cases for events at signal boundaries
```python
detector.detect_triggers()
print(f"Found {len(detector.df_triggers)} triggers")
```
**Output:**
- `df_triggers` DataFrame with columns:
- `start_index`, `end_index`: Sample indices
- `duration_samples`: Duration in samples
- `start_time (s)`, `end_time (s)`: Time in seconds
- `duration_time (s)`: Trigger duration in seconds
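The DataFrame can also be used directly, for example to measure the inter-trigger gaps that define the windows analyzed by `plot_windows()`. A small sketch using the documented columns:
```python
# Gap from the end of one trigger to the start of the next (one value per inter-trigger window)
df = detector.df_triggers
gaps = df['start_time (s)'].iloc[1:].to_numpy() - df['end_time (s)'].iloc[:-1].to_numpy()

for i, gap in enumerate(gaps):
    print(f"window {i}: {gap:.1f} s between trigger {i} and trigger {i + 1}")
```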
##### `plot_triggers()`
Plots the filtered signal with detected trigger periods highlighted.
```python
detector.plot_triggers()
```
**Output:** Interactive matplotlib plot showing filtered signal with red highlighted trigger periods
##### `save_triggers()`
Saves the DataFrame of detected triggers to a CSV file.
```python
detector.save_triggers()
```
**Output:** `{subject_folder}/triggers.csv`
##### `plot_windows()`
Generates individual plots for each inter-trigger window (signal segments between consecutive triggers).
```python
detector.plot_windows()
```
**Output:** `{subject_folder}/window plots/plot_{i}.png` - Individual plots for each window with time in minutes and amplitude range 0-300
##### `convert_to_video()`
Creates MP4 video from window plots for rapid review, sorted numerically by plot index.
```python
detector.convert_to_video()
```
**Output:** `{subject_folder}/trigger.mp4` - Video at 10 fps showing all window plots in sequence
##### `filter_bad_windows(clf_path=None, classes_path=None)`
Uses a ResNet-50 + logistic regression pipeline to drop triggers whose adjoining window plots are classified as 'bad'. Built-in models are used automatically when no custom paths are provided.
**Important:** Call after `plot_windows()` to ensure window plots exist for classification.
```python
# Use built-in models (recommended)
detector.plot_windows()
detector.filter_bad_windows()
# Or use custom models
detector.filter_bad_windows(
clf_path="path/to/custom_classifier.pkl",
classes_path="path/to/custom_classes.npy"
)
```
**Parameters:**
- `clf_path` (str, optional): Path to custom classifier (.pkl file). Uses built-in model from package resources if None
- `classes_path` (str, optional): Path to custom class labels (.npy file). Uses built-in classes from package resources if None
**Output:** Overwrites `{subject_folder}/triggers.csv` with filtered results, removing triggers adjacent to windows classified as 'bad'
### SpectralAnalyzer
Advanced spectral analysis tool providing comprehensive frequency-domain characterization of EEG signals with both time-domain power analysis and modern spectral parameterization methods. Features professional output organization and complete analysis metadata tracking.
The SpectralAnalyzer offers **two complementary analysis approaches**:
1. **Multi-band Power Analysis** - Time-resolved power across canonical EEG frequency bands
2. **Spectral Parameterization** - FOOOF/SpecParam analysis separating aperiodic and periodic components
#### Initialization
```python
from krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer
# Load EEG data and detect triggers
loader = EDFLoader(folder_path="data", name="subject_name")
loader.load_and_plot_signals()
trigger_detector = TriggerDetector(edf_loader=loader, signal_choice='T2')
trigger_detector.detect_triggers()
# Initialize analyzer with optional custom output directory
analyzer = SpectralAnalyzer(
loader=loader,
trigger_detector=trigger_detector,
target_length=50,
output_dir=None # Optional: defaults to subject_folder/spectral_analysis_results/
)
```
**Parameters:**
- `loader` (EDFLoader): Configured EDFLoader instance containing loaded EEG signals. Must have signals loaded via load_and_plot_signals() method.
- `trigger_detector` (TriggerDetector, optional): TriggerDetector instance for event-based signal segmentation. Required for multi-band power analysis with temporal segmentation.
- `target_length` (int, default=50): Number of resampled points per segment for temporal aggregation. Controls the resolution of time-series outputs.
- `output_dir` (str, optional): Output directory path for analysis results. If None, defaults to 'spectral_analysis_results' subdirectory in the same directory as the EDF file.
#### Methods
##### `analyze_multiband_power(channels_to_analyze=None)`
Executes comprehensive multi-band power analysis across canonical EEG frequency bands (Delta, Theta, Alpha, Beta, Gamma) with configurable temporal smoothing.
**Features:**
- Butterworth bandpass filtering for frequency band isolation
- Signal rectification and moving-average smoothing
- Multiple smoothing windows (100ms, 250ms, 500ms) for different temporal scales
- Structured CSV output and publication-ready visualizations
```python
# Analyze all loaded channels
analyzer.analyze_multiband_power()
# Analyze specific channels
analyzer.analyze_multiband_power(channels_to_analyze=['T2', 'O1', 'F3'])
```
**Output Structure:**
```
subject_folder/
└── spectral_analysis_results/
    ├── multiband_power/
    │   ├── csv/
    │   │   ├── subject_multiband_Delta_ma100ms.csv
    │   │   ├── subject_multiband_Theta_ma250ms.csv
    │   │   └── subject_multiband_Gamma_ma500ms.csv
    │   └── plots/
    │       ├── subject_multiband_Delta_T2.png
    │       └── subject_multiband_Theta_T2.png
    └── analysis_metadata.json              # Complete analysis tracking
```
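The per-band CSVs can be read back with pandas for downstream statistics. A minimal sketch; the path follows the structure above, but the exact column layout is an assumption, so inspect `df.columns` before indexing:
```python
import pandas as pd

# Illustrative path following the output structure above
csv_path = ("subject_folder/spectral_analysis_results/"
            "multiband_power/csv/subject_multiband_Delta_ma100ms.csv")

df = pd.read_csv(csv_path)
print(df.columns.tolist())   # confirm the actual layout first
print(df.describe())         # quick summary of the smoothed Delta-band power
```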
##### `analyze_spectral_parameterization(channels_to_analyze=None)`
Executes advanced spectral parameterization using FOOOF/SpecParam methodology to separate neural power spectra into aperiodic (1/f) and periodic (oscillatory) components.
**Features:**
- Automatic library detection (SpecParam preferred, FOOOF fallback)
- Robust model fitting with configurable parameters
- Comprehensive validation metrics and goodness-of-fit assessment
- Frequency band power quantification with aperiodic correction
```python
# Analyze all loaded channels
analyzer.analyze_spectral_parameterization()
# Analyze specific channels
analyzer.analyze_spectral_parameterization(channels_to_analyze=['T2', 'O1'])
```
**Output Structure:**
```
subject_folder/
└── spectral_analysis_results/
    ├── spectral_parameterization/
    │   ├── individual/
    │   │   ├── subject_fooof_T2.png
    │   │   ├── subject_fooof_parameters_T2.csv
    │   │   └── subject_band_powers_T2.csv
    │   ├── summary/
    │   │   ├── subject_fooof_parameters_summary.csv
    │   │   └── subject_band_powers_summary.csv
    │   └── plots/
    │       ├── subject_aperiodic_exponent_comparison.png
    │       └── subject_spectral_peaks_comparison.png
    └── analysis_metadata.json              # Complete analysis tracking
```
##### `analyze_comprehensive(channels_to_analyze=None)`
Executes complete spectral analysis suite combining both multi-band power analysis and spectral parameterization for comprehensive frequency-domain characterization.
```python
# Complete analysis workflow
analyzer.analyze_comprehensive(channels_to_analyze=['T2', 'O1', 'F3'])
```
#### Configuration Methods
##### `set_frequency_bands(bands_dict)`
Configure custom frequency bands for multi-band analysis.
```python
# Custom frequency bands
custom_bands = {
'slow_alpha': (8, 10),
'fast_alpha': (10, 12),
'low_beta': (12, 20),
'high_beta': (20, 30)
}
analyzer.set_frequency_bands(custom_bands)
```
##### `set_fooof_parameters(freq_range=None, **fooof_kwargs)`
Configure spectral parameterization parameters.
```python
# Custom FOOOF settings
analyzer.set_fooof_parameters(
freq_range=(1, 40),
peak_width_limits=(1, 8),
max_n_peaks=6,
min_peak_height=0.1
)
```
##### `set_smoothing_windows(window_secs_list)`
Configure temporal smoothing parameters for multi-band power analysis.
**Parameters:**
- `window_secs_list` (list of float): Moving-average window sizes in seconds. Multiple windows enable comparison of different temporal smoothing scales.
**Notes:**
- Shorter windows preserve temporal dynamics but may be noisier
- Longer windows provide smoother estimates but reduce temporal resolution
- Window sizes are automatically converted to samples based on signal sampling frequency during analysis
```python
# Custom smoothing windows
analyzer.set_smoothing_windows([0.05, 0.1, 0.25]) # 50ms, 100ms, 250ms
```
##### `get_analysis_info()`
Retrieve comprehensive analysis configuration and system information for documentation and reproducibility purposes.
**Returns:**
- `dict`: Configuration dictionary containing channels, frequency bands, smoothing windows, FOOOF settings, library information, and trigger detector status.
```python
# Get current analyzer configuration
config = analyzer.get_analysis_info()
print(f"Available channels: {config['channels']}")
print(f"Frequency bands: {config['frequency_bands']}")
print(f"Library: {config['spectral_param_library']['library']}")
```
#### Visualization Methods
##### `plot_raw_signal_window(window_index, channel)`
Generate publication-ready visualization of raw EEG data for specified trigger-defined window.
**Parameters:**
- `window_index` (int): Zero-based index of the trigger-defined window to visualize. Must be less than (total_triggers - 1) to ensure valid window bounds.
- `channel` (str): EEG channel name to plot. Must exist in loaded dataset.
**Raises:**
- `ValueError`: If trigger detector is not provided or window_index is out of range.
```python
# Plot specific window
analyzer.plot_raw_signal_window(window_index=5, channel='T2')
```
##### `plot_averaged_signal_window(channel, start_window=None, end_window=None, target_length=500, aggregation_method='mean', trim_ratio=0.1)`
Create ensemble-averaged signal visualization across multiple temporal windows with robust statistical aggregation.
**Parameters:**
- `channel` (str): EEG channel name to analyze and visualize
- `start_window` (int, optional): Starting window index for aggregation. If None, uses first window (0)
- `end_window` (int, optional): Ending window index for aggregation. If None, uses last available window
- `target_length` (int, default=500): Number of temporal points for signal resampling and standardization
- `aggregation_method` ({'mean', 'median', 'trimmed'}, default='mean'): Statistical method for cross-window aggregation
- `trim_ratio` (float, default=0.1): Proportion of extreme values to exclude for 'trimmed' aggregation method
```python
# Plot averaged signal across windows 10-20
analyzer.plot_averaged_signal_window(
channel='T2',
start_window=10,
end_window=20,
aggregation_method='median'
)
# Plot with trimmed mean aggregation
analyzer.plot_averaged_signal_window(
channel='T2',
aggregation_method='trimmed',
trim_ratio=0.2
)
```
##### `plot_fooof_comparison(channels=None, metric='aperiodic_exponent')`
Generate comparative visualization of spectral parameterization metrics across channels.
**Parameters:**
- `channels` (list of str, optional): EEG channel names to include in comparison. If None, includes all channels with completed spectral parameterization analysis.
- `metric` ({'aperiodic_exponent', 'aperiodic_offset', 'n_peaks', 'r_squared', 'error'}, default='aperiodic_exponent'): Spectral parameterization metric to visualize:
- 'aperiodic_exponent': 1/f slope reflecting neural population dynamics
- 'aperiodic_offset': Broadband power offset parameter
- 'n_peaks': Number of detected oscillatory peaks
- 'r_squared': Model fit quality (coefficient of determination)
- 'error': Root mean square error of model fit
**Notes:**
- Requires prior execution of analyze_spectral_parameterization() method
- Visualization includes professional formatting with grid lines, proper axis labels, and publication-ready styling
```python
# Compare aperiodic exponents across channels
analyzer.plot_fooof_comparison(
channels=['T2', 'O1', 'F3'],
metric='aperiodic_exponent'
)
# Compare number of peaks across all analyzed channels
analyzer.plot_fooof_comparison(metric='n_peaks')
```
#### Complete Analysis Example
```python
from krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer
# Step 1: Load data and detect triggers
loader = EDFLoader(folder_path="data", name="subject_name")
loader.load_and_plot_signals()
trigger_detector = TriggerDetector(edf_loader=loader, signal_choice='T2')
trigger_detector.detect_triggers()
trigger_detector.plot_triggers()
# Step 2: Initialize analyzer
analyzer = SpectralAnalyzer(
loader=loader,
trigger_detector=trigger_detector
)
# Step 3: Configure analysis parameters
analyzer.set_frequency_bands({
'Delta': (0.5, 4),
'Theta': (4, 8),
'Alpha': (8, 12),
'Beta': (12, 30),
'Gamma': (30, 80)
})
analyzer.set_fooof_parameters(freq_range=(1, 40))
# Step 4: Execute comprehensive analysis
channels_of_interest = ['T2', 'O1', 'F3', 'C3', 'C4']
analyzer.analyze_comprehensive(channels_to_analyze=channels_of_interest)
# Step 5: Generate comparative visualizations
analyzer.plot_fooof_comparison(
channels=channels_of_interest,
metric='aperiodic_exponent'
)
```
### ConnectivityAnalyzer
Converts EEG data to graph representations for network analysis and computes time-varying connectivity measures. Features professional output organization, complete analysis metadata tracking, and memory-efficient processing for any file size.
The ConnectivityAnalyzer provides **three levels of analysis complexity** to match different research needs:
1. **Simple Connectivity Analysis** - Fast correlation/coherence for exploration
2. **Detailed Connectivity Analysis** - Time-varying connectivity with visualizations
3. **Advanced Graph Analysis** - Full graph representations for machine learning
#### Initialization
```python
processor = ConnectivityAnalyzer(
edf_loader=loader,
output_dir=None, # Optional: custom output directory
window_size=1000, # Optional: analysis window size in samples
adj_window_size=20000 # Optional: adjacency matrix window size (40s at 500Hz)
)
```
**Key Parameters:**
- **`window_size`**: Analysis window size in samples (1000 in the example above)
- **`adj_window_size`**: Window size in samples for adjacency calculations (20000 above, i.e. 40 seconds at 500 Hz, for statistical robustness)
- **`output_dir`**: Custom output directory (default: subject_folder/connectivity_analysis_results/)
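Both window parameters are given in samples, so convert from a desired duration using the recording's sampling rate. A minimal sketch (the 500 Hz figure and 2-second analysis window are illustrative choices, not package defaults):
```python
fs = 500  # Hz; use the sampling rate reported by loader.inspect_data()

processor = ConnectivityAnalyzer(
    edf_loader=loader,
    window_size=int(2 * fs),        # 2-second analysis windows (1000 samples at 500 Hz)
    adj_window_size=int(40 * fs),   # 40-second adjacency windows (20000 samples at 500 Hz)
)
```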
**Default Output Structure:**
```
data/subject_name/
├── subject_name.edf
└── connectivity_analysis_results/               # Professional organization
    ├── graphs/                                  # HDF5 graph representations
    │   └── subject_graphs.h5
    ├── correlation/                             # Correlation matrices
    │   ├── subject_correlation_0s-300s.pickle
    │   └── plots/                               # Correlation visualizations
    ├── coherence/
    │   ├── average/                             # Average coherence matrices
    │   │   ├── subject_coherence_avg_0s-300s.pickle
    │   │   └── plots/                           # Average coherence visualizations
    │   └── bands/                               # Frequency-band coherence matrices
    │       ├── subject_coherence_bands_0s-300s.pickle
    │       └── plots/                           # Band-specific visualizations
    └── analysis_metadata.json                   # Complete analysis tracking
```
#### Methods
##### `generate_graphs(segment_duration=180.0, start_time=None, stop_time=None, overlap_ratio=0.875)`
Creates comprehensive graph representations with adjacency matrices and node/edge features using **memory-safe HDF5 format** with segmented processing.
**Features Generated:**
- **Adjacency matrices**: Correlation, coherence, phase relationships
- **Node features**: Energy, band-specific energy across frequency bands
- **Edge features**: Connectivity measures across frequency bands
- **High temporal resolution**: 87.5% overlapping windows by default
**Key Advantages:**
- **Memory-safe**: Processes any file size without memory issues
- **Segmented processing**: Divides large files into manageable segments
- **Immediate storage**: Results saved incrementally to prevent data loss
- **Progress tracking**: Real-time progress bars and detailed logging
- **HDF5 format**: Compressed, efficient storage with selective data access
**Parameters:**
- `segment_duration` (float): Duration of each processing segment in seconds (default: 180.0)
- `start_time` (float, optional): Start time for analysis window in seconds
- `stop_time` (float, optional): End time for analysis window in seconds
- `overlap_ratio` (float): Window overlap ratio (default: 0.875 = 87.5% overlap)
```python
# Generate comprehensive graph representations
hdf5_path = processor.generate_graphs(segment_duration=300.0)
# Analyze specific time window with high temporal resolution
hdf5_path = processor.generate_graphs(
segment_duration=180.0,
start_time=300,
stop_time=900,
overlap_ratio=0.95 # Very high resolution
)
# Output: graphs/{filename}_graphs.h5 with compressed graph data
```
**HDF5 Output Structure:**
```python
# HDF5 file contains:
{
    'adjacency_matrices': (n_windows, n_adj_types, n_electrodes, n_electrodes),
    'node_features': (n_windows,),     # Variable-length arrays
    'edge_features': (n_windows,),     # Variable-length arrays
    'window_starts': (n_windows,),     # Timestamp for each window
    # Plus comprehensive metadata as attributes
}
```
**Loading HDF5 Results:**
```python
import h5py
import numpy as np
# Load specific data without loading entire file
with h5py.File('subject_graphs.h5', 'r') as f:
    # Load a specific time range
    correlation_matrices = f['adjacency_matrices'][100:200, 1, :, :]  # Windows 100-200, correlation type

    # Load metadata
    sampling_freq = f.attrs['sampling_frequency']
    total_windows = f.attrs['total_windows_processed']

    # Load a specific electrode pair
    electrode_pair_data = f['adjacency_matrices'][:, 1, 5, 12]  # All windows, electrode pair (5, 12)
```
##### `compute_correlation(start_time, stop_time, interval, overlap_ratio=0.0)`
Computes time-varying correlation matrices over specified time segments.
**Parameters:**
- `start_time` (float): Start time in seconds
- `stop_time` (float): End time in seconds
- `interval` (float): Window duration for each correlation matrix in seconds
- `overlap_ratio` (float): Overlap between windows (0.0 = no overlap, 0.5 = 50% overlap)
```python
# Compute correlation every 5 seconds from 10-60s with 50% overlap
path = processor.compute_correlation(
start_time=10.0,
stop_time=60.0,
interval=5.0,
overlap_ratio=0.5
)
```
**Output:** `correlation/{filename}_correlation_{start}s-{stop}s.pickle` containing:
```python
{
"starts": [10.0, 12.5, 15.0, ...], # Window start times
"corr_matrices": [matrix1, matrix2, ...] # Correlation matrices
}
```
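The pickle can be reloaded later without recomputing anything. A minimal sketch using the documented keys (the file name is illustrative):
```python
import pickle
import numpy as np

with open("correlation/subject_correlation_10s-60s.pickle", "rb") as f:
    result = pickle.load(f)

starts = result["starts"]           # window start times in seconds
matrices = result["corr_matrices"]  # one correlation matrix per window

# Example: mean absolute off-diagonal correlation per window as a simple connectivity index
strength = [np.mean(np.abs(m[np.triu_indices_from(m, k=1)])) for m in matrices]
print(list(zip(starts[:3], np.round(strength[:3], 3))))
```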
##### `compute_coherence_average(start_time, stop_time, interval, overlap_ratio=0.0)`
Computes time-varying coherence matrices averaged across all frequency bands.
```python
# Simple averaged coherence analysis
path = processor.compute_coherence_average(
start_time=10.0,
stop_time=60.0,
interval=5.0
)
```
**Output:** `coherence/average/{filename}_coherence_avg_{start}s-{stop}s.pickle` containing:
```python
{
"starts": [10.0, 15.0, 20.0, ...],
"coherence_matrices": [matrix1, matrix2, ...] # Averaged coherence
}
```
##### `compute_coherence_bands(start_time, stop_time, interval, overlap_ratio=0.0)`
Computes detailed frequency-specific coherence analysis across EEG bands.
```python
# Detailed frequency-band coherence analysis
path = processor.compute_coherence_bands(
start_time=10.0,
stop_time=60.0,
interval=5.0,
overlap_ratio=0.25
)
```
**Output:** `coherence/bands/{filename}_coherence_bands_{start}s-{stop}s.pickle` containing:
```python
{
    "starts": [10.0, 15.0, 20.0, ...],
    "coherence_by_band": {
        "delta": [matrix1, matrix2, ...],    # 1-4 Hz
        "theta": [matrix1, matrix2, ...],    # 4-8 Hz
        "alpha": [matrix1, matrix2, ...],    # 8-13 Hz
        "beta": [matrix1, matrix2, ...],     # 13-30 Hz
        "gamma": [matrix1, matrix2, ...],    # 30-70 Hz
        "gammaHi": [matrix1, matrix2, ...],  # 70-100 Hz
        # Additional bands based on sampling frequency
    },
    "frequency_bands": {
        "delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), ...
    }
}
```
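Band-specific matrices are reloaded the same way; for instance, to track mean alpha-band coherence over time (file name illustrative, keys as documented above):
```python
import pickle
import numpy as np

with open("coherence/bands/subject_coherence_bands_10s-60s.pickle", "rb") as f:
    result = pickle.load(f)

alpha_matrices = result["coherence_by_band"]["alpha"]   # 8-13 Hz coherence matrices
for t, m in zip(result["starts"], alpha_matrices):
    print(f"{t:.1f}s: mean alpha coherence = {np.mean(m):.3f}")
```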
##### `plot_connectivity_matrices(plot_types=None, time_range=None, output_subdir="plots", save_individual=True, save_summary=True, dpi=150, figsize=(10, 8))`
Generates comprehensive visualizations of connectivity matrices with **full EEG channel names** on axes.
**Parameters:**
- `plot_types` (list): Types to plot - `["correlation", "coherence_avg", "coherence_bands"]` (default: all available)
- `time_range` (tuple): `(start_time, stop_time)` to filter plots (default: all time windows)
- `output_subdir` (str): Subdirectory name for plots (default: "plots")
- `save_individual` (bool): Save individual matrix plots (default: True)
- `save_summary` (bool): Save summary/comparison plots (default: True)
- `dpi` (int): Plot resolution (default: 150)
- `figsize` (tuple): Figure size as (width, height) (default: (10, 8))
**Features:**
- **Full channel names**: All EEG channel names (Fp1, F3, C3, etc.) displayed on both axes
- **Organized output**: Plots saved alongside data in intuitive directory structure
- **Multiple plot types**: Individual matrices, time series summaries, frequency band comparisons
- **Flexible filtering**: Plot specific time ranges or connectivity types
- **High-quality output**: Publication-ready plots with proper labeling
```python
# Plot all available connectivity data
results = processor.plot_connectivity_matrices()
# Plot only correlation matrices
results = processor.plot_connectivity_matrices(plot_types=["correlation"])
# Plot coherence with time filtering
results = processor.plot_connectivity_matrices(
plot_types=["coherence_avg", "coherence_bands"],
time_range=(100, 200), # Only plot 100-200 second window
save_individual=True,
save_summary=True
)
# Custom plot settings
results = processor.plot_connectivity_matrices(
dpi=300, # High resolution
figsize=(12, 10), # Larger plots
output_subdir="publication_plots"
)
```
#### Progressive Analysis Workflow
The ConnectivityAnalyzer supports a **progressive complexity approach** - start simple and add detail as needed:
##### Level 1: Exploratory Analysis (Fast)
```python
from krembil_kit import EDFLoader, ConnectivityAnalyzer
# Load EEG data
loader = EDFLoader("data", "subject_name")
loader.load_and_plot_signals(duration=1200.0)
# Initialize processor
processor = ConnectivityAnalyzer(edf_loader=loader)
# Quick correlation overview (5-minute windows)
corr_path = processor.compute_correlation(
start_time=0, stop_time=3600, interval=300
)
# Quick coherence overview
coh_path = processor.compute_coherence_average(
start_time=0, stop_time=3600, interval=300
)
# Generate overview plots
processor.plot_connectivity_matrices(
plot_types=["correlation", "coherence_avg"],
save_individual=False, # Only summary plots
save_summary=True
)
```
##### Level 2: Detailed Time-Varying Analysis
```python
# Identify interesting periods from Level 1 results
# Focus on specific time ranges with higher resolution
# High-resolution analysis of interesting periods
processor_detailed = ConnectivityAnalyzer(edf_loader=loader)
# Detailed correlation analysis (10-second windows)
detailed_corr = processor_detailed.compute_correlation(
start_time=100, stop_time=400, interval=10, overlap_ratio=0.5
)
# Frequency-specific coherence analysis
detailed_coh = processor_detailed.compute_coherence_bands(
start_time=100, stop_time=400, interval=10, overlap_ratio=0.5
)
# Generate detailed visualizations
processor_detailed.plot_connectivity_matrices(
plot_types=["correlation", "coherence_bands"],
time_range=(100, 400),
save_individual=True,
save_summary=True
)
```
##### Level 3: Advanced Graph Analysis
```python
# For machine learning, GNN analysis, or comprehensive connectivity studies
# Memory-safe graph generation for any file size
processor_advanced = ConnectivityAnalyzer(
edf_loader=loader
)
hdf5_path = processor_advanced.generate_graphs(
segment_duration=180.0
)
# Load and analyze HDF5 results
import h5py
with h5py.File(hdf5_path, 'r') as f:
    # Access specific connectivity types
    correlations = f['adjacency_matrices'][:, 1, :, :]  # All correlation matrices
    coherences = f['adjacency_matrices'][:, 2, :, :]    # All coherence matrices

    # Get metadata
    n_windows = f.attrs['total_windows_processed']
    sampling_freq = f.attrs['sampling_frequency']
    print(f"Processed {n_windows} windows at {sampling_freq} Hz")
```
#### Method Selection Guide
**Use `compute_correlation()` when:**
- ✅ Quick data exploration and quality assessment
- ✅ Identifying periods of high/low connectivity
- ✅ Simple statistical comparisons between conditions
- ✅ Real-time or streaming analysis needs
- ✅ Memory-constrained environments
**Use `compute_coherence_average()` when:**
- ✅ Frequency-domain connectivity without band-specific details
- ✅ Robust connectivity measures (coherence is less sensitive to artifacts)
- ✅ Comparing connectivity strength across different time periods
- ✅ Preprocessing for more detailed analysis
**Use `compute_coherence_bands()` when:**
- ✅ Need frequency-specific connectivity (alpha, beta, gamma, etc.)
- ✅ Studying oscillatory coupling between brain regions
- ✅ Clinical applications requiring band-specific analysis
- ✅ Research into frequency-specific network dynamics
**Use `generate_graphs()` when:**
- ✅ Machine learning applications (GNNs, classification)
- ✅ Complex network analysis requiring multiple connectivity measures
- ✅ Research requiring high temporal resolution connectivity tracking
- ✅ Any size EDF files (memory-safe processing)
- ✅ Production environments requiring reliability
- ✅ Need for incremental processing and progress tracking
- ✅ Long-term storage with efficient HDF5 compression
#### Complete Analysis Example
```python
from krembil_kit import EDFLoader, ConnectivityAnalyzer
import h5py
import numpy as np
# Load EEG data
loader = EDFLoader("data", "subject_name")
loader.load_and_plot_signals(duration=3600.0) # 1 hour
# Step 1: Quick exploration (Level 1)
explorer = ConnectivityAnalyzer(edf_loader=loader)
# Overview analysis
corr_overview = explorer.compute_correlation(0, 3600, 300) # 5-min windows
coh_overview = explorer.compute_coherence_average(0, 3600, 300)
# Generate overview plots
explorer.plot_connectivity_matrices(
plot_types=["correlation", "coherence_avg"],
save_summary=True
)
# Step 2: Identify interesting periods (hypothetical analysis)
# ... analyze overview results to find periods of interest ...
interesting_start, interesting_stop = 1200, 1800 # Example: 20-30 minutes
# Step 3: Detailed analysis of interesting period (Level 2)
detailed = ConnectivityAnalyzer(edf_loader=loader)
detailed_corr = detailed.compute_correlation(
interesting_start, interesting_stop, 30, overlap_ratio=0.5
)
detailed_coh = detailed.compute_coherence_bands(
interesting_start, interesting_stop, 30, overlap_ratio=0.5
)
# Step 4: Full graph analysis for ML (Level 3)
# Memory-safe HDF5 processing
advanced = ConnectivityAnalyzer(edf_loader=loader)
hdf5_path = advanced.generate_graphs(segment_duration=300.0)
# Step 5: Load and analyze results
with h5py.File(hdf5_path, 'r') as f:
    # Extract features for machine learning
    correlation_features = f['adjacency_matrices'][:, 1, :, :].flatten()
    coherence_features = f['adjacency_matrices'][:, 2, :, :].flatten()

    # Get temporal information
    window_times = f['window_starts'][:]

    # Print summary
    print(f"Generated {len(window_times)} windows")
    print(f"Time range: {window_times[0]:.1f}s - {window_times[-1]:.1f}s")
    print(f"Feature dimensions: {correlation_features.shape}")
# Step 6: Generate comprehensive visualizations
advanced.plot_connectivity_matrices(
plot_types=["correlation", "coherence_avg", "coherence_bands"],
time_range=(interesting_start, interesting_stop),
save_individual=True,
save_summary=True,
dpi=300 # High resolution for publication
)
```
## Advanced Usage
### Memory Management for Large Files
**For EDF Loading:**
- Use `duration` parameter to limit data loading
- Use `signal_indices` to select specific channels
- Enable `save_plots=True` to avoid memory issues with display
**For Graph Processing:**
- **Any file size**: `generate_graphs()` uses memory-safe HDF5 processing
- **Adjust segment size**: Smaller segments use less memory but have more boundary losses
```python
# Memory-efficient settings for large files
processor = ConnectivityAnalyzer(edf_loader=loader)
# Process in small segments for maximum memory efficiency
hdf5_path = processor.generate_graphs(
segment_duration=120.0 # Smaller segments = less memory
)
```
### Custom Trigger Detection Parameters
Trigger detection currently uses fixed, hardcoded parameters:
- **Threshold**: 60 µV
- **Duration range**: 52-65 seconds
- **Filter**: 30 Hz low-pass Butterworth
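If you need to reproduce or tune these steps outside the class, they map directly onto scipy (already a dependency). A minimal sketch of the documented rectify-filter-threshold sequence, not the package's internal code; `signal` and `fs` are assumed inputs:
```python
import numpy as np
from scipy.signal import butter, filtfilt

def candidate_trigger_edges(signal, fs, threshold_uv=60.0, cutoff_hz=30.0, order=5):
    """Rectify, low-pass filter, and threshold a trigger channel (sketch of the documented steps)."""
    rectified = np.abs(signal)
    b, a = butter(order, cutoff_hz / (fs / 2), btype='low')   # normalized cutoff
    smoothed = filtfilt(b, a, rectified)
    above = smoothed > threshold_uv
    # Indices where the signal crosses the threshold (candidate event boundaries);
    # duration filtering (52-65 s) would then be applied to each start/end pair.
    return np.flatnonzero(np.diff(above.astype(int)))
```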
### Temporal Resolution vs Performance Trade-offs
**Overlap Ratio Impact:**
```python
# High resolution, high computational cost
hdf5_path = processor.generate_graphs(overlap_ratio=0.875) # 87.5% overlap
# Result: 8x more windows, 8x longer processing, 8x more storage
# Moderate resolution, balanced performance
hdf5_path = processor.generate_graphs(overlap_ratio=0.5) # 50% overlap
# Result: 2x more windows, 2x longer processing
# Low resolution, fast processing
hdf5_path = processor.generate_graphs(overlap_ratio=0.0) # No overlap
# Result: Fastest processing, lowest memory usage
```
### HDF5 Data Access Patterns
**Efficient HDF5 Loading:**
```python
import h5py
# Load specific time ranges without loading entire file
with h5py.File('subject_graphs.h5', 'r') as f:
    # Load only correlation matrices for a specific time window
    correlations = f['adjacency_matrices'][100:200, 1, :, :]

    # Load a specific electrode pair across all time
    electrode_pair = f['adjacency_matrices'][:, 1, 5, 12]

    # Load metadata without loading data
    total_windows = f.attrs['total_windows_processed']
    sampling_freq = f.attrs['sampling_frequency']
```
### ML-Based Quality Control
For automated window quality assessment:
1. Train a ResNet-50 + logistic regression model on labeled window images
2. Save the classifier and class labels
3. Use `filter_bad_windows()` to automatically remove poor-quality segments
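A hedged sketch of steps 1-2, using torchvision's ResNet-50 as a frozen feature extractor and scikit-learn's logistic regression (all already listed as dependencies). The labeled file list and output paths are illustrative; only the model combination and the `.pkl`/`.npy` artifacts follow the documented `filter_bad_windows()` interface:
```python
import numpy as np
import torch
import joblib
from PIL import Image
from sklearn.linear_model import LogisticRegression
from torchvision import models, transforms

# Frozen ResNet-50 backbone with the classification head removed
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(image_path):
    """Return the 2048-d ResNet-50 embedding of one window plot."""
    img = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        return backbone(preprocess(img).unsqueeze(0)).squeeze(0).numpy()

# Illustrative labeled examples: (window plot path, label)
labeled = [
    ("window plots/plot_1.png", "good"),
    ("window plots/plot_2.png", "bad"),
    # ... more labeled plots ...
]
X = np.stack([embed(path) for path, _ in labeled])
y = [label for _, label in labeled]

clf = LogisticRegression(max_iter=1000).fit(X, y)
joblib.dump(clf, "custom_classifier.pkl")    # pass as clf_path
np.save("custom_classes.npy", clf.classes_)  # pass as classes_path
```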
### Production Deployment Considerations
**For Large-Scale Processing:**
- Use `generate_graphs()` for reliability and memory safety with HDF5 storage
- Set appropriate `segment_duration` based on available RAM
- Monitor disk space - HDF5 files can be large but are compressed
- Use progress tracking to monitor long-running jobs
- Consider processing multiple files in parallel with separate processes
**Error Recovery:**
- HDF5 processing saves incrementally - partial results preserved on interruption
- Check for existing HDF5 files before reprocessing
- Use validation scripts to verify data integrity
## Dependencies
- numpy
- scipy
- mne
- pyedflib
- matplotlib
- seaborn
- pandas
- opencv-python
- torch
- torchvision
- joblib
- scikit-learn
- Pillow
- specparam
- fooof
- h5py
- tqdm
## Requirements
- Python ≥ 3.7
- Sufficient RAM for EEG data (recommend 8GB+ for large files)
- GPU optional (for ML-based filtering)
## Citation
If you use this package in your research, please cite:
```
[KrembilKit - Raiyan, Yousif, Srikar]
```
## License
MIT License
## Support
For questions or issues, please contact the package maintainer.
## Analysis Metadata and Reproducibility
Both SpectralAnalyzer and ConnectivityAnalyzer automatically track comprehensive metadata for all analyses.
### Metadata Features
**Automatic Tracking:**
- Analysis timestamps and duration
- All analysis parameters and settings
- Input data information (file paths, channels, etc.)
- Results summary (files created, processing statistics)
- Software version and library information
**Metadata File Location:**
```
subject_folder/
├── spectral_analysis_results/
│   └── analysis_metadata.json
└── connectivity_analysis_results/
    └── analysis_metadata.json
```
### Metadata Structure
```json
[
  {
    "timestamp": "2024-12-19T14:30:52.123456",
    "analysis_type": "comprehensive",
    "analysis_duration_seconds": 245.7,
    "parameters": {
      "channels_analyzed": ["T2", "O1", "F3"],
      "frequency_bands": {"Delta": [0.5, 4], "Theta": [4, 8]},
      "fooof_settings": {"max_n_peaks": 6, "peak_threshold": 2.0}
    },
    "data_info": {
      "subject_name": "subject_001",
      "channels": ["T2", "O1", "F3", "C3", "C4"]
    },
    "results": {
      "analysis_type": "comprehensive",
      "methods_executed": ["multiband_power", "spectral_parameterization"],
      "channels_processed": 3
    }
  }
]
```
### Using Metadata
```python
import json
# Load analysis history
with open('spectral_analysis_results/analysis_metadata.json', 'r') as f:
    metadata = json.load(f)

# Find specific analysis
for analysis in metadata:
    if analysis['analysis_type'] == 'comprehensive':
        print(f"Analysis run on: {analysis['timestamp']}")
        print(f"Duration: {analysis['analysis_duration_seconds']} seconds")
        print(f"Channels: {analysis['parameters']['channels_analyzed']}")
```
Raw data
{
"_id": null,
"home_page": null,
"name": "krembil-kit",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": "Raiyan <raiyan0511@gmail.com>",
"keywords": "eeg, signal-processing, neuroscience, spectral-analysis, connectivity-analysis, graph-analysis, coherence, correlation, fooof, specparam, spectral-parameterization, aperiodic, oscillations, hdf5, memory-safe, large-files, multi-band, frequency-bands",
"author": null,
"author_email": "Raiyan <raiyan0511@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/93/24/7c49171be414dde87eebc65aa7347d68389b87647dfbcfaf3170c7e32f6d/krembil_kit-1.0.0.tar.gz",
"platform": null,
"description": "# EEG Analysis Pipeline\n\nA comprehensive Python package for EEG signal processing, trigger detection, and frequency-domain analysis.\n\n## Overview\n\nThis package provides a complete pipeline for analyzing EEG data stored in European Data Format (EDF) files. It includes tools for signal loading, trigger detection, inter-trigger window analysis, and multi-band frequency-domain processing.\n\n## Features\n\n- **EDF File Loading**: Load and inspect EEG signals with flexible duration and channel selection\n- **Trigger Detection**: Automated detection of trigger events with customizable thresholds\n- **Window Analysis**: Generate and analyze inter-trigger intervals with multiple aggregation methods\n- **Advanced Spectral Analysis**: Multi-band power analysis with temporal smoothing and spectral parameterization\n- **Connectivity Analysis**: Graph-based network analysis with correlation, coherence, and phase metrics\n- **Traditional Analysis**: Multi-band EEG analysis (Delta, Theta, Alpha, Beta, Gamma) \n- **Spectral Parameterization**: Separate aperiodic (1/f) and periodic (oscillatory) components using SpecParam/FOOOF\n- **Memory-Safe Processing**: HDF5-based analysis for large EEG files with bounded memory usage\n- **Professional Organization**: Structured output directories with consistent naming conventions\n- **Analysis Metadata**: Complete tracking of analysis parameters, timing, and results\n- **Visualizations**: Plots and comprehensive analysis reports\n- **ML Integration**: Optional machine learning-based window quality filtering\n\n## Installation\n\n```bash\npip install krembil-kit\n```\n\n## Quick Start\n\n```python\nfrom krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer, ConnectivityAnalyzer\n\n# Load EEG data\nloader = EDFLoader(\"data\", \"subject_name\")\nloader.load_and_plot_signals(signal_indices=[15, 25], duration=1200.0) # T6, T2\n\n# Detect and plot triggers for temporal segmentation\ndetector = TriggerDetector(loader, 'T2')\ndetector.detect_triggers()\ndetector.plot_triggers()\n\n# Option 1: Advanced spectral analysis\nspectral_analyzer = SpectralAnalyzer(loader=loader, trigger_detector=detector)\nspectral_analyzer.analyze_comprehensive() # Multi-band + spectral parameterization\n\n# Option 2: Graph-based connectivity analysis\nconnectivity_analyzer = ConnectivityAnalyzer(edf_loader=loader)\n\n# Level 1: Quick exploration\nconnectivity_analyzer.compute_correlation(start_time=0, stop_time=300, interval=10)\nconnectivity_analyzer.compute_coherence_average(start_time=0, stop_time=300, interval=30)\n\n# Level 2: Detailed analysis \nconnectivity_analyzer.compute_coherence_bands(start_time=0, stop_time=300, interval=50)\n\n# Plot results\nconnectivity_analyzer.plot_connectivity_matrices()\n\n# Level 3: Full graph representations\n# Memory-safe graph generation (works for any file size):\nhdf5_path = connectivity_analyzer.generate_graphs(segment_duration=60.0, overlap_ratio=0.125)\n```\n\n## Data Structure Requirements\n\n### Input Data Format\n\nYour EDF files should be organized as follows:\n\n```\ndata/\n\u2514\u2500\u2500 subject_name/\n \u2514\u2500\u2500 subject_name.edf\n```\n\n### EDF File Requirements\n\n- **Format**: European Data Format (.edf)\n- **Channels**: Standard EEG channel names (Fp1, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz)\n- **Sample Rate**: Typically 500 Hz (automatically detected)\n- **Duration**: Minimum 10 minutes recommended for trigger detection\n\n## Classes and Methods\n\n### EDFLoader\n\nHandles 
loading and inspection of EDF files.\n\n#### Initialization\n```python\nloader = EDFLoader(folder_path, name)\n```\n\n**Parameters:**\n- `folder_path` (str): Path to the data directory\n- `name` (str): Name of the subject or experiment (the EDF file will be under this subject directory, and should match this name)\n\nFor example, if our data was in the current directory as below:\n\n```\ndata/\n\u2514\u2500\u2500 Sebastian/\n \u2514\u2500\u2500 Sebastian.edf\n```\n\nWe would call (from the current directory):\n\n```python\nloader = EDFLoader(\"path/to/data/folder\", \"Sebastian\")\n```\n\n#### Methods\n\n##### `inspect_data()`\nInspects the EDF file for the specified subject, printing out various signal information including:\n- File header details\n- Number of signals and their properties\n- Sample rates and signal ranges\n- Physical and digital maximum/minimum values\n- First 10 samples of each channel\n\n```python\nloader.inspect_data()\n```\n\n##### `load_and_plot_signals(signal_indices=None, duration=None, save_plots=False, save_path=None)`\nLoads and plots the signals from the EDF file, storing them along with their sample rates in a dictionary.\n\n**Parameters:**\n- `signal_indices` (list of int, optional): Specific signal indices to load (None = load all signals)\n- `duration` (float, optional): Duration in seconds to plot (None = entire duration)\n- `save_plots` (bool, optional): Whether to save plots instead of showing them (default: False)\n- `save_path` (str, optional): Directory to save plots (default: plots/{subject_name})\n\n**Examples:**\n```python\n# Load T6 and T2 channels for 20 minutes\nloader.load_and_plot_signals(signal_indices=[15, 25], duration=1200.0)\n\n# Load all channels and save plots\nloader.load_and_plot_signals(save_plots=True)\n\n# Load specific duration with custom save path\nloader.load_and_plot_signals(duration=1200.0, save_plots=True, save_path=\"custom_plots\")\n```\n\n**Output:**\n- Time-series plots with time axis in seconds\n- Signals stored in `signals_dict` attribute with data and sample rates\n- Plots saved to `plots/{subject_name}/signals_plot.png` (if save_plots=True)\n- Warning message displayed when loading all signals due to potential memory usage\n\n### TriggerDetector\n\nDetects triggers and analyzes inter-trigger windows using amplitude thresholding and machine learning-based quality filtering.\n\n#### Initialization\n```python\ndetector = TriggerDetector(edf_loader, signal_choice)\n```\n\n**Parameters:**\n- `edf_loader` (EDFLoader): An instance of the EDFLoader class that contains the signal data\n- `signal_choice` (str): The key corresponding to the desired signal in the EDFLoader's signals_dict (e.g., 'T2', 'O1')\n\n#### Methods\n\n##### `butterworth_filter(signal, cutoff=30, order=5, btype='low')`\nApplies a Butterworth filter to the signal.\n\n**Parameters:**\n- `signal` (numpy.ndarray): The input signal data to filter\n- `cutoff` (float): The cutoff frequency in Hz (default: 30)\n- `order` (int): The order of the filter (default: 5)\n- `btype` (str): The type of the filter ('low', 'high', 'bandpass', 'bandstop') (default: 'low')\n\n**Returns:** numpy.ndarray - The filtered signal\n\n##### `detect_triggers()`\nDetects triggers in the signal based on a threshold and filters events to be between 55 seconds and 62 seconds long.\n\n**Algorithm:**\n1. Rectifies the signal using absolute value\n2. Applies Butterworth low-pass filter (30 Hz cutoff, order 5)\n3. Detects events above hardcoded threshold (60 \u00b5V)\n4. 
Filters events by duration (52-65 seconds, corresponding to 55-62 second range)\n5. Handles edge cases for events at signal boundaries\n\n```python\ndetector.detect_triggers()\nprint(f\"Found {len(detector.df_triggers)} triggers\")\n```\n\n**Output:**\n- `df_triggers` DataFrame with columns:\n - `start_index`, `end_index`: Sample indices\n - `duration_samples`: Duration in samples\n - `start_time (s)`, `end_time (s)`: Time in seconds\n - `duration_time (s)`: Trigger duration in seconds\n\n##### `plot_triggers()`\nPlots the filtered signal with detected trigger periods highlighted.\n\n```python\ndetector.plot_triggers()\n```\n\n**Output:** Interactive matplotlib plot showing filtered signal with red highlighted trigger periods\n\n##### `save_triggers()`\nSaves the DataFrame of detected triggers to a CSV file.\n\n```python\ndetector.save_triggers()\n```\n\n**Output:** `{subject_folder}/triggers.csv`\n\n##### `plot_windows()`\nGenerates individual plots for each inter-trigger window (signal segments between consecutive triggers).\n\n```python\ndetector.plot_windows()\n```\n\n**Output:** `{subject_folder}/window plots/plot_{i}.png` - Individual plots for each window with time in minutes and amplitude range 0-300\n\n##### `convert_to_video()`\nCreates MP4 video from window plots for rapid review, sorted numerically by plot index.\n\n```python\ndetector.convert_to_video()\n```\n\n**Output:** `{subject_folder}/trigger.mp4` - Video at 10 fps showing all window plots in sequence\n\n##### `filter_bad_windows(clf_path=None, classes_path=None)`\nUses ResNet-50 + logistic regression pipeline to drop triggers whose adjoining window plots are classified as 'bad'. Automatically uses built-in models when no custom paths are provided.\n\n**Important:** Call after `plot_windows()` to ensure window plots exist for classification.\n\n```python\n# Use built-in models (recommended)\ndetector.plot_windows()\ndetector.filter_bad_windows()\n\n# Or use custom models\ndetector.filter_bad_windows(\n clf_path=\"path/to/custom_classifier.pkl\",\n classes_path=\"path/to/custom_classes.npy\"\n)\n```\n\n**Parameters:**\n- `clf_path` (str, optional): Path to custom classifier (.pkl file). Uses built-in model from package resources if None\n- `classes_path` (str, optional): Path to custom class labels (.npy file). Uses built-in classes from package resources if None\n\n**Output:** Overwrites `{subject_folder}/triggers.csv` with filtered results, removing triggers adjacent to windows classified as 'bad'\n\n\n\n\n\n### SpectralAnalyzer\n\nAdvanced spectral analysis tool providing comprehensive frequency-domain characterization of EEG signals with both time-domain power analysis and modern spectral parameterization methods. Features professional output organization and complete analysis metadata tracking.\n\nThe SpectralAnalyzer offers **two complementary analysis approaches**:\n\n1. **Multi-band Power Analysis** - Time-resolved power across canonical EEG frequency bands\n2. 
**Spectral Parameterization** - FOOOF/SpecParam analysis separating aperiodic and periodic components\n\n#### Initialization\n```python\nfrom krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer\n\n# Load EEG data and detect triggers\nloader = EDFLoader(folder_path=\"data\", name=\"subject_name\")\nloader.load_and_plot_signals()\n\ntrigger_detector = TriggerDetector(edf_loader=loader, signal_choice='T2')\ntrigger_detector.detect_triggers()\n\n# Initialize analyzer with optional custom output directory\nanalyzer = SpectralAnalyzer(\n loader=loader, \n trigger_detector=trigger_detector, \n target_length=50,\n output_dir=None # Optional: defaults to subject_folder/spectral_analysis_results/\n)\n```\n\n**Parameters:**\n- `loader` (EDFLoader): Configured EDFLoader instance containing loaded EEG signals. Must have signals loaded via load_and_plot_signals() method.\n- `trigger_detector` (TriggerDetector, optional): TriggerDetector instance for event-based signal segmentation. Required for multi-band power analysis with temporal segmentation.\n- `target_length` (int, default=50): Number of resampled points per segment for temporal aggregation. Controls the resolution of time-series outputs.\n- `output_dir` (str, optional): Output directory path for analysis results. If None, defaults to 'spectral_analysis_results' subdirectory in the same directory as the EDF file.\n\n#### Methods\n\n##### `analyze_multiband_power(channels_to_analyze=None)`\n\nExecutes comprehensive multi-band power analysis across canonical EEG frequency bands (Delta, Theta, Alpha, Beta, Gamma) with configurable temporal smoothing.\n\n**Features:**\n- Butterworth bandpass filtering for frequency band isolation\n- Signal rectification and moving-average smoothing\n- Multiple smoothing windows (100ms, 250ms, 500ms) for different temporal scales\n- Structured CSV output and publication-ready visualizations\n\n```python\n# Analyze all loaded channels\nanalyzer.analyze_multiband_power()\n\n# Analyze specific channels\nanalyzer.analyze_multiband_power(channels_to_analyze=['T2', 'O1', 'F3'])\n```\n\n**Output Structure:**\n```\nsubject_folder/\n\u2514\u2500\u2500 spectral_analysis_results/\n \u251c\u2500\u2500 multiband_power/\n \u2502 \u251c\u2500\u2500 csv/\n \u2502 \u2502 \u251c\u2500\u2500 subject_multiband_Delta_ma100ms.csv\n \u2502 \u2502 \u251c\u2500\u2500 subject_multiband_Theta_ma250ms.csv\n \u2502 \u2502 \u2514\u2500\u2500 subject_multiband_Gamma_ma500ms.csv\n \u2502 \u2514\u2500\u2500 plots/\n \u2502 \u251c\u2500\u2500 subject_multiband_Delta_T2.png\n \u2502 \u2514\u2500\u2500 subject_multiband_Theta_T2.png\n \u2514\u2500\u2500 analysis_metadata.json # Complete analysis tracking\n```\n\n##### `analyze_spectral_parameterization(channels_to_analyze=None)`\n\nExecutes advanced spectral parameterization using FOOOF/SpecParam methodology to separate neural power spectra into aperiodic (1/f) and periodic (oscillatory) components.\n\n**Features:**\n- Automatic library detection (SpecParam preferred, FOOOF fallback)\n- Robust model fitting with configurable parameters\n- Comprehensive validation metrics and goodness-of-fit assessment\n- Frequency band power quantification with aperiodic correction\n\n```python\n# Analyze all loaded channels\nanalyzer.analyze_spectral_parameterization()\n\n# Analyze specific channels\nanalyzer.analyze_spectral_parameterization(channels_to_analyze=['T2', 'O1'])\n```\n\n**Output Structure:**\n```\nsubject_folder/\n\u2514\u2500\u2500 spectral_analysis_results/\n \u251c\u2500\u2500 
spectral_parameterization/\n \u2502 \u251c\u2500\u2500 individual/\n \u2502 \u2502 \u251c\u2500\u2500 subject_fooof_T2.png\n \u2502 \u2502 \u251c\u2500\u2500 subject_fooof_parameters_T2.csv\n \u2502 \u2502 \u2514\u2500\u2500 subject_band_powers_T2.csv\n \u2502 \u251c\u2500\u2500 summary/\n \u2502 \u2502 \u251c\u2500\u2500 subject_fooof_parameters_summary.csv\n \u2502 \u2502 \u2514\u2500\u2500 subject_band_powers_summary.csv\n \u2502 \u2514\u2500\u2500 plots/\n \u2502 \u251c\u2500\u2500 subject_aperiodic_exponent_comparison.png\n \u2502 \u2514\u2500\u2500 subject_spectral_peaks_comparison.png\n \u2514\u2500\u2500 analysis_metadata.json # Complete analysis tracking\n```\n\n##### `analyze_comprehensive(channels_to_analyze=None)`\n\nExecutes complete spectral analysis suite combining both multi-band power analysis and spectral parameterization for comprehensive frequency-domain characterization.\n\n```python\n# Complete analysis workflow\nanalyzer.analyze_comprehensive(channels_to_analyze=['T2', 'O1', 'F3'])\n```\n\n#### Configuration Methods\n\n##### `set_frequency_bands(bands_dict)`\n\nConfigure custom frequency bands for multi-band analysis.\n\n```python\n# Custom frequency bands\ncustom_bands = {\n 'slow_alpha': (8, 10),\n 'fast_alpha': (10, 12),\n 'low_beta': (12, 20),\n 'high_beta': (20, 30)\n}\nanalyzer.set_frequency_bands(custom_bands)\n```\n\n##### `set_fooof_parameters(freq_range=None, **fooof_kwargs)`\n\nConfigure spectral parameterization parameters.\n\n```python\n# Custom FOOOF settings\nanalyzer.set_fooof_parameters(\n freq_range=(1, 40),\n peak_width_limits=(1, 8),\n max_n_peaks=6,\n min_peak_height=0.1\n)\n```\n\n##### `set_smoothing_windows(window_secs_list)`\n\nConfigure temporal smoothing parameters for multi-band power analysis.\n\n**Parameters:**\n- `window_secs_list` (list of float): Moving-average window sizes in seconds. Multiple windows enable comparison of different temporal smoothing scales.\n\n**Notes:**\n- Shorter windows preserve temporal dynamics but may be noisier\n- Longer windows provide smoother estimates but reduce temporal resolution\n- Window sizes are automatically converted to samples based on signal sampling frequency during analysis\n\n```python\n# Custom smoothing windows\nanalyzer.set_smoothing_windows([0.05, 0.1, 0.25]) # 50ms, 100ms, 250ms\n```\n\n##### `get_analysis_info()`\n\nRetrieve comprehensive analysis configuration and system information for documentation and reproducibility purposes.\n\n**Returns:**\n- `dict`: Configuration dictionary containing channels, frequency bands, smoothing windows, FOOOF settings, library information, and trigger detector status.\n\n```python\n# Get current analyzer configuration\nconfig = analyzer.get_analysis_info()\nprint(f\"Available channels: {config['channels']}\")\nprint(f\"Frequency bands: {config['frequency_bands']}\")\nprint(f\"Library: {config['spectral_param_library']['library']}\")\n```\n\n#### Visualization Methods\n\n##### `plot_raw_signal_window(window_index, channel)`\n\nGenerate publication-ready visualization of raw EEG data for specified trigger-defined window.\n\n**Parameters:**\n- `window_index` (int): Zero-based index of the trigger-defined window to visualize. Must be less than (total_triggers - 1) to ensure valid window bounds.\n- `channel` (str): EEG channel name to plot. 
Must exist in loaded dataset.\n\n**Raises:**\n- `ValueError`: If trigger detector is not provided or window_index is out of range.\n\n```python\n# Plot specific window\nanalyzer.plot_raw_signal_window(window_index=5, channel='T2')\n```\n\n##### `plot_averaged_signal_window(channel, start_window=None, end_window=None, target_length=500, aggregation_method='mean', trim_ratio=0.1)`\n\nCreate ensemble-averaged signal visualization across multiple temporal windows with robust statistical aggregation.\n\n**Parameters:**\n- `channel` (str): EEG channel name to analyze and visualize\n- `start_window` (int, optional): Starting window index for aggregation. If None, uses first window (0)\n- `end_window` (int, optional): Ending window index for aggregation. If None, uses last available window\n- `target_length` (int, default=500): Number of temporal points for signal resampling and standardization\n- `aggregation_method` ({'mean', 'median', 'trimmed'}, default='mean'): Statistical method for cross-window aggregation\n- `trim_ratio` (float, default=0.1): Proportion of extreme values to exclude for 'trimmed' aggregation method\n\n```python\n# Plot averaged signal across windows 10-20\nanalyzer.plot_averaged_signal_window(\n channel='T2',\n start_window=10,\n end_window=20,\n aggregation_method='median'\n)\n\n# Plot with trimmed mean aggregation\nanalyzer.plot_averaged_signal_window(\n channel='T2',\n aggregation_method='trimmed',\n trim_ratio=0.2\n)\n```\n\n##### `plot_fooof_comparison(channels=None, metric='aperiodic_exponent')`\n\nGenerate comparative visualization of spectral parameterization metrics across channels.\n\n**Parameters:**\n- `channels` (list of str, optional): EEG channel names to include in comparison. If None, includes all channels with completed spectral parameterization analysis.\n- `metric` ({'aperiodic_exponent', 'aperiodic_offset', 'n_peaks', 'r_squared', 'error'}, default='aperiodic_exponent'): Spectral parameterization metric to visualize:\n - 'aperiodic_exponent': 1/f slope reflecting neural population dynamics\n - 'aperiodic_offset': Broadband power offset parameter\n - 'n_peaks': Number of detected oscillatory peaks\n - 'r_squared': Model fit quality (coefficient of determination)\n - 'error': Root mean square error of model fit\n\n**Notes:**\n- Requires prior execution of analyze_spectral_parameterization() method\n- Visualization includes professional formatting with grid lines, proper axis labels, and publication-ready styling\n\n```python\n# Compare aperiodic exponents across channels\nanalyzer.plot_fooof_comparison(\n channels=['T2', 'O1', 'F3'],\n metric='aperiodic_exponent'\n)\n\n# Compare number of peaks across all analyzed channels\nanalyzer.plot_fooof_comparison(metric='n_peaks')\n```\n\n#### Complete Analysis Example\n\n```python\nfrom krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer\n\n# Step 1: Load data and detect triggers\nloader = EDFLoader(folder_path=\"data\", name=\"subject_name\")\nloader.load_and_plot_signals()\n\ntrigger_detector = TriggerDetector(edf_loader=loader, signal_choice='T2')\ntrigger_detector.detect_triggers()\ntrigger_detector.plot_triggers()\n\n# Step 2: Initialize analyzer\nanalyzer = SpectralAnalyzer(\n loader=loader,\n trigger_detector=trigger_detector\n)\n\n# Step 3: Configure analysis parameters\nanalyzer.set_frequency_bands({\n 'Delta': (0.5, 4),\n 'Theta': (4, 8),\n 'Alpha': (8, 12),\n 'Beta': (12, 30),\n 'Gamma': (30, 80)\n})\n\nanalyzer.set_fooof_parameters(freq_range=(1, 40))\n\n# Step 4: Execute comprehensive 
##### `plot_fooof_comparison(channels=None, metric='aperiodic_exponent')`

Generates a comparative visualization of spectral parameterization metrics across channels.

**Parameters:**
- `channels` (list of str, optional): EEG channel names to include in the comparison. If None, includes all channels with completed spectral parameterization analysis.
- `metric` ({'aperiodic_exponent', 'aperiodic_offset', 'n_peaks', 'r_squared', 'error'}, default='aperiodic_exponent'): Spectral parameterization metric to visualize:
  - 'aperiodic_exponent': 1/f slope reflecting neural population dynamics
  - 'aperiodic_offset': Broadband power offset parameter
  - 'n_peaks': Number of detected oscillatory peaks
  - 'r_squared': Model fit quality (coefficient of determination)
  - 'error': Root mean square error of the model fit

**Notes:**
- Requires prior execution of the `analyze_spectral_parameterization()` method
- Visualization includes professional formatting with grid lines, proper axis labels, and publication-ready styling

```python
# Compare aperiodic exponents across channels
analyzer.plot_fooof_comparison(
    channels=['T2', 'O1', 'F3'],
    metric='aperiodic_exponent'
)

# Compare number of peaks across all analyzed channels
analyzer.plot_fooof_comparison(metric='n_peaks')
```

#### Complete Analysis Example

```python
from krembil_kit import EDFLoader, TriggerDetector, SpectralAnalyzer

# Step 1: Load data and detect triggers
loader = EDFLoader(folder_path="data", name="subject_name")
loader.load_and_plot_signals()

trigger_detector = TriggerDetector(edf_loader=loader, signal_choice='T2')
trigger_detector.detect_triggers()
trigger_detector.plot_triggers()

# Step 2: Initialize analyzer
analyzer = SpectralAnalyzer(
    loader=loader,
    trigger_detector=trigger_detector
)

# Step 3: Configure analysis parameters
analyzer.set_frequency_bands({
    'Delta': (0.5, 4),
    'Theta': (4, 8),
    'Alpha': (8, 12),
    'Beta': (12, 30),
    'Gamma': (30, 80)
})

analyzer.set_fooof_parameters(freq_range=(1, 40))

# Step 4: Execute comprehensive analysis
channels_of_interest = ['T2', 'O1', 'F3', 'C3', 'C4']
analyzer.analyze_comprehensive(channels_to_analyze=channels_of_interest)

# Step 5: Generate comparative visualizations
analyzer.plot_fooof_comparison(
    channels=channels_of_interest,
    metric='aperiodic_exponent'
)
```

### ConnectivityAnalyzer

Converts EEG data to graph representations for network analysis and computes time-varying connectivity measures. Features professional output organization, complete analysis metadata tracking, and memory-efficient processing for any file size.

The ConnectivityAnalyzer provides **three levels of analysis complexity** to match different research needs:

1. **Simple Connectivity Analysis** - Fast correlation/coherence for exploration
2. **Detailed Connectivity Analysis** - Time-varying connectivity with visualizations
3. **Advanced Graph Analysis** - Full graph representations for machine learning

#### Initialization
```python
processor = ConnectivityAnalyzer(
    edf_loader=loader,
    output_dir=None,         # Optional: custom output directory
    window_size=1000,        # Optional: analysis window size in samples
    adj_window_size=20000    # Optional: adjacency matrix window size (40 s at 500 Hz)
)
```

**Key Parameters:**
- **`window_size`**: Analysis window duration (default: 1 second at the sampling rate)
- **`adj_window_size`**: Window size for adjacency calculations (default: 40 seconds for statistical robustness)
- **`output_dir`**: Custom output directory (default: subject_folder/connectivity_analysis_results/)

**Default Output Structure:**
```
data/subject_name/
├── subject_name.edf
└── connectivity_analysis_results/                  # Professional organization
    ├── graphs/                                     # HDF5 graph representations
    │   └── subject_graphs.h5
    ├── correlation/                                # Correlation matrices
    │   ├── subject_correlation_0s-300s.pickle
    │   └── plots/                                  # Correlation visualizations
    ├── coherence/
    │   ├── average/                                # Average coherence matrices
    │   │   ├── subject_coherence_avg_0s-300s.pickle
    │   │   └── plots/                              # Average coherence visualizations
    │   └── bands/                                  # Frequency-band coherence matrices
    │       ├── subject_coherence_bands_0s-300s.pickle
    │       └── plots/                              # Band-specific visualizations
    └── analysis_metadata.json                      # Complete analysis tracking
```
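Because `window_size` and `adj_window_size` are specified in samples, recordings at a sampling rate other than 500 Hz need adjusted values. A brief sketch under that assumption (a hypothetical 256 Hz recording), keeping the same window durations as the defaults:

```python
from krembil_kit import ConnectivityAnalyzer

# Hypothetical: adapt the sample-based window sizes to a 256 Hz recording.
# 'loader' is an EDFLoader created as in the examples above.
fs = 256  # sampling rate of this particular EDF file

processor = ConnectivityAnalyzer(
    edf_loader=loader,
    window_size=1 * fs,        # keep ~1 s analysis windows
    adj_window_size=40 * fs    # keep ~40 s adjacency windows
)
```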
#### Methods

##### `generate_graphs(segment_duration=180.0, start_time=None, stop_time=None, overlap_ratio=0.875)`
Creates comprehensive graph representations with adjacency matrices and node/edge features using a **memory-safe HDF5 format** with segmented processing.

**Features Generated:**
- **Adjacency matrices**: Correlation, coherence, phase relationships
- **Node features**: Energy, band-specific energy across frequency bands
- **Edge features**: Connectivity measures across frequency bands
- **High temporal resolution**: 87.5% overlapping windows by default

**Key Advantages:**
- **Memory-safe**: Processes any file size without memory issues
- **Segmented processing**: Divides large files into manageable segments
- **Immediate storage**: Results saved incrementally to prevent data loss
- **Progress tracking**: Real-time progress bars and detailed logging
- **HDF5 format**: Compressed, efficient storage with selective data access

**Parameters:**
- `segment_duration` (float): Duration of each processing segment in seconds (default: 180.0)
- `start_time` (float, optional): Start time for the analysis window in seconds
- `stop_time` (float, optional): End time for the analysis window in seconds
- `overlap_ratio` (float): Window overlap ratio (default: 0.875 = 87.5% overlap)

```python
# Generate comprehensive graph representations
hdf5_path = processor.generate_graphs(segment_duration=300.0)

# Analyze specific time window with high temporal resolution
hdf5_path = processor.generate_graphs(
    segment_duration=180.0,
    start_time=300,
    stop_time=900,
    overlap_ratio=0.95  # Very high resolution
)
# Output: graphs/{filename}_graphs.h5 with compressed graph data
```

**HDF5 Output Structure:**
```python
# HDF5 file contains:
{
    'adjacency_matrices': (n_windows, n_adj_types, n_electrodes, n_electrodes),
    'node_features': (n_windows,),   # Variable-length arrays
    'edge_features': (n_windows,),   # Variable-length arrays
    'window_starts': (n_windows,),   # Timestamp for each window
    # Plus comprehensive metadata as attributes
}
```

**Loading HDF5 Results:**
```python
import h5py
import numpy as np

# Load specific data without loading entire file
with h5py.File('subject_graphs.h5', 'r') as f:
    # Load specific time range
    correlation_matrices = f['adjacency_matrices'][100:200, 1, :, :]  # Windows 100-200, correlation type

    # Load metadata
    sampling_freq = f.attrs['sampling_frequency']
    total_windows = f.attrs['total_windows_processed']

    # Load specific electrode pairs
    electrode_pair_data = f['adjacency_matrices'][:, 1, 5, 12]  # All windows, electrode pair (5, 12)
```

##### `compute_correlation(start_time, stop_time, interval, overlap_ratio=0.0)`
Computes time-varying correlation matrices over specified time segments.

**Parameters:**
- `start_time` (float): Start time in seconds
- `stop_time` (float): End time in seconds
- `interval` (float): Window duration for each correlation matrix in seconds
- `overlap_ratio` (float): Overlap between windows (0.0 = no overlap, 0.5 = 50% overlap)

```python
# Compute correlation every 5 seconds from 10-60 s with 50% overlap
path = processor.compute_correlation(
    start_time=10.0,
    stop_time=60.0,
    interval=5.0,
    overlap_ratio=0.5
)
```

**Output:** `correlation/{filename}_correlation_{start}s-{stop}s.pickle` containing:
```python
{
    "starts": [10.0, 12.5, 15.0, ...],         # Window start times
    "corr_matrices": [matrix1, matrix2, ...]   # Correlation matrices
}
```
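The saved pickle can be read back with the standard library; a minimal sketch (the file name below is hypothetical but follows the naming pattern above):

```python
import pickle
import numpy as np

# Hypothetical path following the naming pattern above
with open("correlation/subject_correlation_10s-60s.pickle", "rb") as f:
    result = pickle.load(f)

starts = result["starts"]                  # window start times in seconds
matrices = result["corr_matrices"]         # one channel-by-channel matrix per window

# Example: mean absolute correlation per window as a crude connectivity index
mean_abs_corr = [np.abs(m).mean() for m in matrices]
print(f"{len(matrices)} windows, first window starts at {starts[0]} s")
```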
##### `compute_coherence_average(start_time, stop_time, interval, overlap_ratio=0.0)`
Computes time-varying coherence matrices averaged across all frequency bands.

```python
# Simple averaged coherence analysis
path = processor.compute_coherence_average(
    start_time=10.0,
    stop_time=60.0,
    interval=5.0
)
```

**Output:** `coherence/average/{filename}_coherence_avg_{start}s-{stop}s.pickle` containing:
```python
{
    "starts": [10.0, 15.0, 20.0, ...],
    "coherence_matrices": [matrix1, matrix2, ...]   # Averaged coherence
}
```

##### `compute_coherence_bands(start_time, stop_time, interval, overlap_ratio=0.0)`
Computes detailed frequency-specific coherence analysis across EEG bands.

```python
# Detailed frequency-band coherence analysis
path = processor.compute_coherence_bands(
    start_time=10.0,
    stop_time=60.0,
    interval=5.0,
    overlap_ratio=0.25
)
```

**Output:** `coherence/bands/{filename}_coherence_bands_{start}s-{stop}s.pickle` containing:
```python
{
    "starts": [10.0, 15.0, 20.0, ...],
    "coherence_by_band": {
        "delta": [matrix1, matrix2, ...],     # 1-4 Hz
        "theta": [matrix1, matrix2, ...],     # 4-8 Hz
        "alpha": [matrix1, matrix2, ...],     # 8-13 Hz
        "beta": [matrix1, matrix2, ...],      # 13-30 Hz
        "gamma": [matrix1, matrix2, ...],     # 30-70 Hz
        "gammaHi": [matrix1, matrix2, ...],   # 70-100 Hz
        # Additional bands based on sampling frequency
    },
    "frequency_bands": {
        "delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), ...
    }
}
```
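To work with a single band from the saved result, index into `coherence_by_band`; a short sketch (the file name below is hypothetical but follows the naming pattern above):

```python
import pickle
import numpy as np

# Hypothetical path following the naming pattern above
with open("coherence/bands/subject_coherence_bands_10s-60s.pickle", "rb") as f:
    result = pickle.load(f)

alpha_matrices = result["coherence_by_band"]["alpha"]   # 8-13 Hz, one matrix per window
alpha_range = result["frequency_bands"]["alpha"]        # (8, 13)

# Time-averaged alpha-band coherence matrix across all windows
alpha_mean = np.mean(np.stack(alpha_matrices), axis=0)
print(f"Alpha band {alpha_range}: averaged over {len(alpha_matrices)} windows")
```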
##### `plot_connectivity_matrices(plot_types=None, time_range=None, output_subdir="plots", save_individual=True, save_summary=True, dpi=150, figsize=(10, 8))`
Generates comprehensive visualizations of connectivity matrices with **full EEG channel names** on the axes.

**Parameters:**
- `plot_types` (list): Types to plot - `["correlation", "coherence_avg", "coherence_bands"]` (default: all available)
- `time_range` (tuple): `(start_time, stop_time)` to filter plots (default: all time windows)
- `output_subdir` (str): Subdirectory name for plots (default: "plots")
- `save_individual` (bool): Save individual matrix plots (default: True)
- `save_summary` (bool): Save summary/comparison plots (default: True)
- `dpi` (int): Plot resolution (default: 150)
- `figsize` (tuple): Figure size as (width, height) (default: (10, 8))

**Features:**
- **Full channel names**: All EEG channel names (Fp1, F3, C3, etc.) displayed on both axes
- **Organized output**: Plots saved alongside data in intuitive directory structure
- **Multiple plot types**: Individual matrices, time series summaries, frequency band comparisons
- **Flexible filtering**: Plot specific time ranges or connectivity types
- **High-quality output**: Publication-ready plots with proper labeling

```python
# Plot all available connectivity data
results = processor.plot_connectivity_matrices()

# Plot only correlation matrices
results = processor.plot_connectivity_matrices(plot_types=["correlation"])

# Plot coherence with time filtering
results = processor.plot_connectivity_matrices(
    plot_types=["coherence_avg", "coherence_bands"],
    time_range=(100, 200),  # Only plot 100-200 second window
    save_individual=True,
    save_summary=True
)

# Custom plot settings
results = processor.plot_connectivity_matrices(
    dpi=300,            # High resolution
    figsize=(12, 10),   # Larger plots
    output_subdir="publication_plots"
)
```

#### Progressive Analysis Workflow

The ConnectivityAnalyzer supports a **progressive complexity approach** - start simple and add detail as needed:

##### Level 1: Exploratory Analysis (Fast)
```python
from krembil_kit import EDFLoader, ConnectivityAnalyzer

# Load EEG data
loader = EDFLoader("data", "subject_name")
loader.load_and_plot_signals(duration=1200.0)

# Initialize processor
processor = ConnectivityAnalyzer(edf_loader=loader)

# Quick correlation overview (5-minute windows)
corr_path = processor.compute_correlation(
    start_time=0, stop_time=3600, interval=300
)

# Quick coherence overview
coh_path = processor.compute_coherence_average(
    start_time=0, stop_time=3600, interval=300
)

# Generate overview plots
processor.plot_connectivity_matrices(
    plot_types=["correlation", "coherence_avg"],
    save_individual=False,  # Only summary plots
    save_summary=True
)
```

##### Level 2: Detailed Time-Varying Analysis
```python
# Identify interesting periods from Level 1 results
# Focus on specific time ranges with higher resolution

# High-resolution analysis of interesting periods
processor_detailed = ConnectivityAnalyzer(edf_loader=loader)

# Detailed correlation analysis (10-second windows)
detailed_corr = processor_detailed.compute_correlation(
    start_time=100, stop_time=400, interval=10, overlap_ratio=0.5
)

# Frequency-specific coherence analysis
detailed_coh = processor_detailed.compute_coherence_bands(
    start_time=100, stop_time=400, interval=10, overlap_ratio=0.5
)

# Generate detailed visualizations
processor_detailed.plot_connectivity_matrices(
    plot_types=["correlation", "coherence_bands"],
    time_range=(100, 400),
    save_individual=True,
    save_summary=True
)
```

##### Level 3: Advanced Graph Analysis
```python
# For machine learning, GNN analysis, or comprehensive connectivity studies

# Memory-safe graph generation for any file size
processor_advanced = ConnectivityAnalyzer(
    edf_loader=loader
)
hdf5_path = processor_advanced.generate_graphs(
    segment_duration=180.0
)

# Load and analyze HDF5 results
import h5py
with h5py.File(hdf5_path, 'r') as f:
    # Access specific connectivity types
    correlations = f['adjacency_matrices'][:, 1, :, :]  # All correlation matrices
    coherences = f['adjacency_matrices'][:, 2, :, :]    # All coherence matrices

    # Get metadata
    n_windows = f.attrs['total_windows_processed']
    sampling_freq = f.attrs['sampling_frequency']

    print(f"Processed {n_windows} windows at {sampling_freq} Hz")
```
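For downstream network or machine-learning work, each stored adjacency matrix can be treated as a weighted graph. A minimal NumPy-only sketch that thresholds one correlation matrix into a binary adjacency and computes node degrees; the threshold is arbitrary and purely illustrative, and in practice you would slice `adjacency_matrices` from the HDF5 file as in the Level 3 example above:

```python
import numpy as np

# Placeholder correlation matrix for a single window (16 electrodes)
n_electrodes = 16
corr = np.corrcoef(np.random.randn(n_electrodes, 1000))

adj = np.abs(corr)
np.fill_diagonal(adj, 0.0)                       # ignore self-connections
binary_adj = (adj > 0.5).astype(int)             # arbitrary threshold, illustration only
degrees = binary_adj.sum(axis=1)                 # node degree per electrode
edges = np.argwhere(np.triu(binary_adj, k=1))    # undirected edge list (i, j)
print(f"{len(edges)} edges, max degree {degrees.max()}")
```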
#### Method Selection Guide

**Use `compute_correlation()` when:**
- ✅ Quick data exploration and quality assessment
- ✅ Identifying periods of high/low connectivity
- ✅ Simple statistical comparisons between conditions
- ✅ Real-time or streaming analysis needs
- ✅ Memory-constrained environments

**Use `compute_coherence_average()` when:**
- ✅ Frequency-domain connectivity without band-specific details
- ✅ Robust connectivity measures (coherence is less sensitive to artifacts)
- ✅ Comparing connectivity strength across different time periods
- ✅ Preprocessing for more detailed analysis

**Use `compute_coherence_bands()` when:**
- ✅ Need frequency-specific connectivity (alpha, beta, gamma, etc.)
- ✅ Studying oscillatory coupling between brain regions
- ✅ Clinical applications requiring band-specific analysis
- ✅ Research into frequency-specific network dynamics

**Use `generate_graphs()` when:**
- ✅ Machine learning applications (GNNs, classification)
- ✅ Complex network analysis requiring multiple connectivity measures
- ✅ Research requiring high temporal resolution connectivity tracking
- ✅ Any size EDF files (memory-safe processing)
- ✅ Production environments requiring reliability
- ✅ Need for incremental processing and progress tracking
- ✅ Long-term storage with efficient HDF5 compression

#### Complete Analysis Example
```python
from krembil_kit import EDFLoader, ConnectivityAnalyzer
import h5py
import numpy as np

# Load EEG data
loader = EDFLoader("data", "subject_name")
loader.load_and_plot_signals(duration=3600.0)  # 1 hour

# Step 1: Quick exploration (Level 1)
explorer = ConnectivityAnalyzer(edf_loader=loader)

# Overview analysis
corr_overview = explorer.compute_correlation(0, 3600, 300)  # 5-min windows
coh_overview = explorer.compute_coherence_average(0, 3600, 300)

# Generate overview plots
explorer.plot_connectivity_matrices(
    plot_types=["correlation", "coherence_avg"],
    save_summary=True
)

# Step 2: Identify interesting periods (hypothetical analysis)
# ... analyze overview results to find periods of interest ...
interesting_start, interesting_stop = 1200, 1800  # Example: 20-30 minutes

# Step 3: Detailed analysis of interesting period (Level 2)
detailed = ConnectivityAnalyzer(edf_loader=loader)

detailed_corr = detailed.compute_correlation(
    interesting_start, interesting_stop, 30, overlap_ratio=0.5
)
detailed_coh = detailed.compute_coherence_bands(
    interesting_start, interesting_stop, 30, overlap_ratio=0.5
)

# Step 4: Full graph analysis for ML (Level 3)
# Memory-safe HDF5 processing
advanced = ConnectivityAnalyzer(edf_loader=loader)
hdf5_path = advanced.generate_graphs(segment_duration=300.0)

# Step 5: Load and analyze results
with h5py.File(hdf5_path, 'r') as f:
    # Extract features for machine learning
    correlation_features = f['adjacency_matrices'][:, 1, :, :].flatten()
    coherence_features = f['adjacency_matrices'][:, 2, :, :].flatten()

    # Get temporal information
    window_times = f['window_starts'][:]

    # Print summary
    print(f"Generated {len(window_times)} windows")
    print(f"Time range: {window_times[0]:.1f}s - {window_times[-1]:.1f}s")
    print(f"Feature dimensions: {correlation_features.shape}")

# Step 6: Generate comprehensive visualizations
advanced.plot_connectivity_matrices(
    plot_types=["correlation", "coherence_avg", "coherence_bands"],
    time_range=(interesting_start, interesting_stop),
    save_individual=True,
    save_summary=True,
    dpi=300  # High resolution for publication
)
```

## Advanced Usage

### Memory Management for Large Files

**For EDF Loading:**
- Use the `duration` parameter to limit data loading
- Use `signal_indices` to select specific channels
- Enable `save_plots=True` to avoid memory issues with display

**For Graph Processing:**
- **Any file size**: `generate_graphs()` uses memory-safe HDF5 processing
- **Adjust segment size**: Smaller segments use less memory but lose more windows at segment boundaries

```python
# Memory-efficient settings for large files
processor = ConnectivityAnalyzer(edf_loader=loader)

# Process in small segments for maximum memory efficiency
hdf5_path = processor.generate_graphs(
    segment_duration=120.0  # Smaller segments = less memory
)
```

### Custom Trigger Detection Parameters

Trigger detection currently uses hardcoded parameters (an illustrative sketch of the filtering-and-thresholding approach follows this list):
- **Threshold**: 60 µV
- **Duration range**: 52-65 seconds
- **Filter**: 30 Hz low-pass Butterworth
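The sketch below shows what a 30 Hz low-pass Butterworth filter followed by a 60 µV threshold crossing looks like in SciPy. It is purely illustrative and is not the package's implementation; `TriggerDetector` applies its own filtering and the 52-65 s duration check internally.

```python
import numpy as np
from scipy.signal import butter, filtfilt

# Illustrative only -- TriggerDetector performs its own filtering and duration checks.
fs = 500.0                                      # Hz (typical sample rate)
trace_uv = np.random.randn(int(60 * fs)) * 20   # placeholder trace in microvolts

b, a = butter(N=4, Wn=30, btype="low", fs=fs)   # 30 Hz low-pass Butterworth
filtered = filtfilt(b, a, trace_uv)

above = filtered > 60.0                                    # 60 µV threshold
onsets = np.flatnonzero(np.diff(above.astype(int)) == 1)   # rising edges
print(f"{len(onsets)} threshold crossings")
```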
### Temporal Resolution vs Performance Trade-offs

**Overlap Ratio Impact:**
```python
# High resolution, high computational cost
hdf5_path = processor.generate_graphs(overlap_ratio=0.875)  # 87.5% overlap
# Result: 8x more windows, 8x longer processing, 8x more storage

# Moderate resolution, balanced performance
hdf5_path = processor.generate_graphs(overlap_ratio=0.5)  # 50% overlap
# Result: 2x more windows, 2x longer processing

# Low resolution, fast processing
hdf5_path = processor.generate_graphs(overlap_ratio=0.0)  # No overlap
# Result: Fastest processing, lowest memory usage
```

### HDF5 Data Access Patterns

**Efficient HDF5 Loading:**
```python
import h5py

# Load specific time ranges without loading entire file
with h5py.File('subject_graphs.h5', 'r') as f:
    # Load only correlation matrices for a specific time window
    correlations = f['adjacency_matrices'][100:200, 1, :, :]

    # Load specific electrode pairs across all time
    electrode_pair = f['adjacency_matrices'][:, 1, 5, 12]

    # Load metadata without loading data
    total_windows = f.attrs['total_windows_processed']
    sampling_freq = f.attrs['sampling_frequency']
```

### ML-Based Quality Control

For automated window quality assessment (a rough training sketch follows these steps):
1. Train a ResNet-50 + logistic regression model on labeled window images
2. Save the classifier and class labels
3. Use `filter_bad_windows()` to automatically remove poor-quality segments
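A rough sketch of what steps 1-2 could look like, using torchvision's ResNet-50 as a frozen feature extractor and scikit-learn's logistic regression. The placeholder data, the saved-file format, and how `filter_bad_windows()` consumes the model are assumptions, not the package's documented training code:

```python
import joblib
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from torchvision.models import resnet50

# Frozen ResNet-50 backbone as a feature extractor (use pretrained weights in real training)
backbone = resnet50(weights=None)
backbone.fc = torch.nn.Identity()
backbone.eval()

# Placeholder "window images" and labels; in practice, load labeled plots of windows
images = torch.randn(8, 3, 224, 224)
labels = np.array([0, 1, 0, 0, 1, 0, 1, 0])  # 0 = good window, 1 = bad window

with torch.no_grad():
    features = backbone(images).numpy()  # (n_images, 2048) feature vectors

clf = LogisticRegression(max_iter=1000).fit(features, labels)

# Hypothetical saved format: classifier plus class labels for later filtering
joblib.dump({"classifier": clf, "classes": ["good", "bad"]}, "window_quality_model.joblib")
```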
### Production Deployment Considerations

**For Large-Scale Processing:**
- Use `generate_graphs()` for reliability and memory safety with HDF5 storage
- Set an appropriate `segment_duration` based on available RAM
- Monitor disk space - HDF5 files can be large but are compressed
- Use progress tracking to monitor long-running jobs
- Consider processing multiple files in parallel with separate processes

**Error Recovery:**
- HDF5 processing saves incrementally, so partial results are preserved if a run is interrupted
- Check for existing HDF5 files before reprocessing
- Use validation scripts to verify data integrity

## Dependencies

- numpy
- scipy
- mne
- pyedflib
- matplotlib
- seaborn
- pandas
- opencv-python
- torch
- torchvision
- joblib
- scikit-learn
- Pillow
- specparam
- fooof
- h5py
- tqdm

## Requirements

- Python ≥ 3.7
- Sufficient RAM for EEG data (8 GB+ recommended for large files)
- GPU optional (for ML-based filtering)

## Citation

If you use this package in your research, please cite:

```
[KrembilKit - Raiyan, Yousif, Srikar]
```

## License

MIT License

## Support

For questions or issues, please open an issue at https://github.com/raiyanUofT/krembil-kit/issues or contact the package maintainer.

## Analysis Metadata and Reproducibility

Both SpectralAnalyzer and ConnectivityAnalyzer automatically track comprehensive metadata for all analyses.

### Metadata Features

**Automatic Tracking:**
- Analysis timestamps and duration
- All analysis parameters and settings
- Input data information (file paths, channels, etc.)
- Results summary (files created, processing statistics)
- Software version and library information

**Metadata File Location:**
```
subject_folder/
├── spectral_analysis_results/
│   └── analysis_metadata.json
└── connectivity_analysis_results/
    └── analysis_metadata.json
```

### Metadata Structure

```json
[
    {
        "timestamp": "2024-12-19T14:30:52.123456",
        "analysis_type": "comprehensive",
        "analysis_duration_seconds": 245.7,
        "parameters": {
            "channels_analyzed": ["T2", "O1", "F3"],
            "frequency_bands": {"Delta": [0.5, 4], "Theta": [4, 8]},
            "fooof_settings": {"max_n_peaks": 6, "peak_threshold": 2.0}
        },
        "data_info": {
            "subject_name": "subject_001",
            "channels": ["T2", "O1", "F3", "C3", "C4"]
        },
        "results": {
            "analysis_type": "comprehensive",
            "methods_executed": ["multiband_power", "spectral_parameterization"],
            "channels_processed": 3
        }
    }
]
```

### Using Metadata

```python
import json

# Load analysis history
with open('spectral_analysis_results/analysis_metadata.json', 'r') as f:
    metadata = json.load(f)

# Find specific analysis
for analysis in metadata:
    if analysis['analysis_type'] == 'comprehensive':
        print(f"Analysis run on: {analysis['timestamp']}")
        print(f"Duration: {analysis['analysis_duration_seconds']} seconds")
        print(f"Channels: {analysis['parameters']['channels_analyzed']}")
```