autoclean-icvision

Name: autoclean-icvision
Version: 0.1.0
Summary: Automated ICA component classification using OpenAI Vision API for EEG data
Author: Gavin Gammoh <gavin.gammoh@cchmc.org>
Upload time: 2025-07-27 15:22:50
Requires Python: >=3.8
License: MIT (Copyright (c) 2024 Cincibrainlab)
Keywords: artifact-detection, eeg, ica, mne, neuroscience, openai, vision
# Autoclean EEG ICVision (Standalone)

[![PyPI version](https://badge.fury.io/py/autoclean-icvision.svg)](https://badge.fury.io/py/autoclean-icvision)
[![Python versions](https://img.shields.io/pypi/pyversions/autoclean-icvision.svg)](https://pypi.org/project/autoclean-icvision/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)

Automated ICA component classification for EEG data using OpenAI's Vision API.

## Overview

ICVision automates the tedious process of classifying ICA components from EEG data by generating component visualizations and sending them to OpenAI's Vision API for intelligent artifact identification.

**Workflow**: Raw EEG + ICA → Generate component plots → OpenAI Vision classification → Automated artifact removal → Clean EEG data

**Key Features**:
- Automated classification of 7 component types (brain, eye, muscle, heart, line noise, channel noise, other)
- **🔄 Drop-in replacement for MNE-ICALabel**: Same API, enhanced with OpenAI Vision
- Multi-panel component plots (topography, time series, PSD, ERP-image)
- MNE-Python integration with `.fif` and `.set` file support
- **EEGLAB .set file auto-detection**: Single file input with automatic ICA detection
- **Smart file organization**: Basename-prefixed output files prevent overwrites when processing multiple datasets
- **Continuous data only**: Graceful error handling for epoched data with helpful conversion instructions
- **Enhanced PDF reports**: Professional dual-header layout with color-coded classification results
- **OpenAI cost tracking**: Automatic cost estimation and logging for budget monitoring
- Parallel processing with configurable batch sizes
- Command-line and Python API interfaces
- Comprehensive PDF reports and CSV results

## Installation

```bash
pip install autoclean-icvision
```

**Requirements**: Python 3.8+ and an OpenAI API key with vision model access (e.g., `gpt-4.1`)

```bash
export OPENAI_API_KEY='your_api_key_here'
```

## Usage

### Command-Line Interface (CLI)

The primary way to use ICVision is through its command-line interface.

**Basic Usage:**

**Single EEGLAB .set file (Recommended):**
```bash
autoclean-icvision /path/to/your_data.set
# or legacy command: icvision /path/to/your_data.set
```

**Separate files:**
```bash
autoclean-icvision /path/to/your_raw_data.set /path/to/your_ica_decomposition.fif
# or legacy command: icvision /path/to/your_raw_data.set /path/to/your_ica_decomposition.fif
```

ICVision can automatically detect and read ICA data from EEGLAB `.set` files, making single-file usage possible when your `.set` file contains both raw data and ICA decomposition.

This command will:
1.  Load the raw EEG data and ICA solution (auto-detected from `.set` file or from separate files).
2.  Classify components using the default settings.
3.  Create an `autoclean_icvision_results/` directory in your current working directory.
4.  Save the following into the output directory (with input filename prefix for organization):
    *   Cleaned raw data (artifacts removed): `{basename}_icvis_cleaned_raw.{format}`
    *   Updated ICA object with component labels: `{basename}_icvis_classified_ica.fif`
    *   `{basename}_icvis_results.csv` detailing classifications for each component.
    *   `{basename}_icvis_summary.txt` with overall statistics.
    *   `{basename}_icvis_report_all_comps.pdf` (comprehensive PDF report with visualizations).

**Note**: `{basename}` is extracted from your input filename (e.g., `sub-01_task-rest_eeg.set` → `sub-01_task-rest_eeg` prefix). This prevents file overwrites when processing multiple datasets.
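The prefix appears to correspond to the filename stem, which you can reproduce with `pathlib` when predicting output names (an assumption about the implementation, not a documented API):

```python
from pathlib import Path

# The output prefix is the input filename minus its extension, which
# matches pathlib's .stem (assumed to mirror ICVision's behavior):
basename = Path("sub-01_task-rest_eeg.set").stem
print(f"{basename}_icvis_results.csv")
# sub-01_task-rest_eeg_icvis_results.csv
```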

### Recent Improvements

**Enhanced File Organization (v2024.12)**:
- **Shared workspace**: All results now saved to `autoclean_icvision_results/` directory by default
- **Smart naming**: Input filename prefixes (e.g., `sub-01_task-rest_eeg_icvis_results.csv`) prevent conflicts
- **Multi-file friendly**: Process multiple datasets without overwrites - perfect for batch processing subjects

**Improved User Experience**:
- **Epoched data handling**: Clear error messages with EEGLAB conversion instructions for unsupported epoched data
- **Enhanced PDF reports**: Professional layout with IC Component titles and color-coded Vision Classification results
- **Clean logging output**: Professional, user-focused logging with optional verbose mode for debugging
- **Better error messages**: Informative CLI output with suggested solutions

**Common Options (with defaults):**

*   `--api-key YOUR_API_KEY`: Specify OpenAI API key (default: `OPENAI_API_KEY` env variable)
*   `--output-dir /path/to/output/`: Output directory (default: `./autoclean_icvision_results`)
*   `--model MODEL_NAME`: OpenAI model (default: `gpt-4.1`)
*   `--confidence-threshold 0.8`: Confidence threshold for auto-exclusion (default: `0.8`)
*   `--psd-fmax 40`: Maximum frequency for PSD plots in Hz (default: `80` or Nyquist)
*   `--labels-to-exclude eye muscle heart`: Artifact labels to exclude (default: all non-brain types)
*   `--batch-size 10`: Components per API request (default: `10`)
*   `--max-concurrency 4`: Max parallel requests (default: `4`)
*   `--no-auto-exclude`: Disable auto-exclusion (default: auto-exclude enabled)
*   `--prompt-file /path/to/prompt.txt`: Custom classification prompt (default: built-in prompt)
*   `--no-report`: Disable PDF report (default: report generation enabled)
*   `--verbose`: Enable detailed logging (default: standard logging)
*   `--version`: Show ICVision version
*   `--help`: Show full list of commands and options

**Examples with options:**

Single .set file usage:
```bash
autoclean-icvision data/subject01_eeg.set \
    --api-key sk-xxxxxxxxxxxxxxxxxxxx \
    --confidence-threshold 0.9 \
    --verbose
```

Traditional separate files:
```bash
autoclean-icvision data/subject01_raw.fif data/subject01_ica.fif \
    --api-key sk-xxxxxxxxxxxxxxxxxxxx \
    --model gpt-4.1 \
    --confidence-threshold 0.8 \
    --labels-to-exclude eye muscle line_noise channel_noise \
    --batch-size 8 \
    --verbose
```

For ERP studies with low-pass filtered data:
```bash
autoclean-icvision data/erp_study.set \
    --psd-fmax 40 \
    --confidence-threshold 0.85 \
    --verbose
```

Multi-file batch processing:
```bash
# Process multiple subjects - all results go to shared directory
autoclean-icvision data/sub-01_task-rest_eeg.set --verbose
autoclean-icvision data/sub-02_task-rest_eeg.set --verbose
autoclean-icvision data/sub-03_task-rest_eeg.set --verbose

# Results organized in autoclean_icvision_results/ with prefixed filenames
ls autoclean_icvision_results/
# sub-01_task-rest_eeg_icvis_results.csv
# sub-01_task-rest_eeg_icvis_classified_ica.fif
# sub-02_task-rest_eeg_icvis_results.csv
# sub-02_task-rest_eeg_icvis_classified_ica.fif
# ...
```

### Python API

You can also use ICVision programmatically within your Python scripts.

**Single .set file usage (NEW):**
```python
from pathlib import Path
from icvision.core import label_components

# --- Configuration ---
API_KEY = "your_openai_api_key"  # Or set as environment variable OPENAI_API_KEY
DATA_PATH = "path/to/your_data.set"  # EEGLAB .set file with ICA
OUTPUT_DIR = Path("icvision_output")

# --- Run ICVision (ICA auto-detected from .set file) ---
try:
    raw_cleaned, ica_updated, results_df = label_components(
        raw_data=DATA_PATH,              # EEGLAB .set file path
        # ica_data is optional here - auto-detected from the .set file
        api_key=API_KEY,                 # Optional if OPENAI_API_KEY env var is set
        output_dir=OUTPUT_DIR,
    )
except Exception as e:
    print(f"An error occurred: {e}")
```

**Traditional separate files:**
```python
from pathlib import Path
from icvision.core import label_components

# --- Configuration ---
API_KEY = "your_openai_api_key"  # Or set as environment variable OPENAI_API_KEY
RAW_DATA_PATH = "path/to/your_raw_data.set"
ICA_DATA_PATH = "path/to/your_ica_data.fif"
OUTPUT_DIR = Path("icvision_output")

# --- Run ICVision with all parameters ---
try:
    raw_cleaned, ica_updated, results_df = label_components(
        raw_data=RAW_DATA_PATH,          # Can be MNE object or path string/Path object
        ica_data=ICA_DATA_PATH,          # Can be MNE object, path, or None for auto-detection
        api_key=API_KEY,                 # Optional if OPENAI_API_KEY env var is set
        output_dir=OUTPUT_DIR,
        model_name="gpt-4.1",            # Default: "gpt-4.1"
        confidence_threshold=0.80,       # Default: 0.8
        labels_to_exclude=["eye", "muscle", "heart", "line_noise", "channel_noise"],  # Example subset; the default excludes all non-brain labels
        generate_report=True,            # Default: True
        batch_size=5,                    # Default: 10
        max_concurrency=3,               # Default: 4
        auto_exclude=True,               # Default: True
        custom_prompt=None,              # Default: None (uses built-in prompt)
        psd_fmax=40.0                    # Default: None (uses 80 Hz); useful for ERP studies
    )

    print("\n--- ICVision Processing Complete ---")
    print(f"Cleaned raw data channels: {raw_cleaned.info['nchan']}")
    print(f"Updated ICA components: {ica_updated.n_components_}")
    print(f"Number of components classified: {len(results_df)}")

    if not results_df.empty:
        print(f"Number of components marked for exclusion: {results_df['exclude_vision'].sum()}")
        print("\nClassification Summary:")
        print(results_df[['component_name', 'label', 'confidence', 'exclude_vision']].head())

    print(f"\nResults saved in: {OUTPUT_DIR.resolve()}")

except Exception as e:
    print(f"An error occurred: {e}")

```

## 🔄 ICLabel Drop-in Replacement

ICVision can serve as a **drop-in replacement** for MNE-ICALabel with identical API and output format. This means you can upgrade existing ICLabel workflows to use OpenAI Vision API without changing any other code.

### Quick Migration

**Before (using MNE-ICALabel):**
```python
from mne_icalabel import label_components

# Classify components with ICLabel
result = label_components(raw, ica, method='iclabel')
print(result['labels'])  # ['brain', 'eye blink', 'other', ...]
print(ica.labels_scores_.shape)  # (n_components, 7)
```

**After (using ICVision):**
```python
from icvision.compat import label_components  # <-- Only line that changes!

# Classify components with ICVision (same API!)
result = label_components(raw, ica, method='icvision')
print(result['labels'])  # Same format: ['brain', 'eye blink', 'other', ...]
print(ica.labels_scores_.shape)  # Same shape: (n_components, 7)
```

### What You Get

- **🎯 Identical API**: Same function signature, same return format
- **📊 Same Output**: Returns dict with `'y_pred_proba'` and `'labels'` keys
- **⚙️ Same ICA Modifications**: Sets `ica.labels_scores_` and `ica.labels_` exactly like ICLabel
- **🚀 Enhanced Intelligence**: OpenAI Vision API instead of fixed neural network
- **💡 Detailed Reasoning**: Each classification includes explanation (available in full API)

### Why Use ICVision over ICLabel?

| Feature | ICLabel | ICVision |
|---------|---------|----------|
| **Classification Method** | Fixed neural network (2019) | OpenAI Vision API (latest models) |
| **Accuracy** | Good on typical datasets | Enhanced with modern vision AI |
| **Reasoning** | No explanations | Detailed reasoning for each decision |
| **Customization** | Fixed model | Customizable prompts and models |
| **Updates** | Static model | Benefits from OpenAI improvements |
| **API Compatibility** | ✅ Original | ✅ Drop-in replacement |

### Integration Example

The compatibility layer works seamlessly with existing MNE workflows:

```python
def analyze_ica_components(raw, ica, method='icvision'):
    """Generic function that works with both ICLabel and ICVision"""

    if method == 'icvision':
        from icvision.compat import label_components
    else:
        from mne_icalabel import label_components

    # Same API for both!
    result = label_components(raw, ica, method=method)

    # Same return format for both
    print(f"Classified {len(result['labels'])} components")

    # Same ICA object modifications for both
    brain_components = ica.labels_['brain']
    artifact_components = [idx for key, indices in ica.labels_.items()
                          if key != 'brain' for idx in indices]

    print(f"Brain components: {brain_components}")
    print(f"Artifact components: {artifact_components}")

    return result

# Works with either classifier
result = analyze_ica_components(raw, ica, method='icvision')
```

### Two APIs, Same Power

ICVision provides **two complementary interfaces**:

1. **Original ICVision API**: Rich output with detailed results and file generation
   ```python
   from icvision.core import label_components
   raw_cleaned, ica_updated, results_df = label_components(...)
   ```

2. **ICLabel-Compatible API**: Simple output matching ICLabel exactly
   ```python
   from icvision.compat import label_components
   result = label_components(raw, ica, method='icvision')
   ```

Choose the API that best fits your workflow - both use the same underlying OpenAI Vision classification.

---

## Configuration Details

### Input File Support

**EEGLAB .set files:**
- **Raw data**: Supports EEGLAB `.set` files for raw EEG data
- **ICA data**: Now supports automatic ICA detection from `.set` files using `mne.preprocessing.read_ica_eeglab()`
- **Single file mode**: Use just a `.set` file when it contains both raw data and ICA decomposition

**Other MNE-supported formats:**
- **Raw data**: `.fif`, `.edf`, `.raw`
- **ICA data**: `.fif` files containing MNE ICA objects
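As a point of reference, the same files can be opened directly with MNE-Python's readers. This sketch (the `load_inputs` helper is illustrative and not part of ICVision; its internal loading may differ) shows the mapping:

```python
def load_inputs(set_path):
    """Map the documented formats onto MNE readers: a single EEGLAB .set
    file can carry both the raw data and the ICA decomposition.
    MNE is imported inside the function so the sketch can be read
    without MNE installed."""
    import mne

    raw = mne.io.read_raw_eeglab(set_path, preload=True)
    ica = mne.preprocessing.read_ica_eeglab(set_path)  # ICA stored in the .set
    return raw, ica
```

For MNE-native files, `mne.io.read_raw_fif()` and `mne.preprocessing.read_ica()` play the corresponding roles.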

### Default Parameter Values

| Parameter | Default Value | Description |
|-----------|---------------|-------------|
| `model_name` | `"gpt-4.1"` | OpenAI model for classification |
| `confidence_threshold` | `0.8` | Minimum confidence for auto-exclusion |
| `auto_exclude` | `True` | Automatically exclude artifact components |
| `labels_to_exclude` | `["eye", "muscle", "heart", "line_noise", "channel_noise", "other_artifact"]` | Labels to exclude (all non-brain) |
| `output_dir` | `"./autoclean_icvision_results"` | Output directory for results |
| `generate_report` | `True` | Generate PDF report |
| `batch_size` | `10` | Components per API request |
| `max_concurrency` | `4` | Maximum parallel API requests |
| `api_key` | `None` | Uses `OPENAI_API_KEY` environment variable |
| `custom_prompt` | `None` | Uses built-in classification prompt |

### Component Labels

The standard set of labels ICVision uses (and expects from the API) is:
- `brain` - Neural brain activity (retained)
- `eye` - Eye movement artifacts
- `muscle` - Muscle artifacts
- `heart` - Cardiac artifacts
- `line_noise` - Electrical line noise
- `channel_noise` - Channel-specific noise
- `other_artifact` - Other artifacts

These are defined in `src/icvision/config.py`.
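These labels appear in the `label` column of the results CSV. A small pandas sketch (with made-up rows, using the column names from the Python API example above) shows how you might separate brain from artifact components after a run:

```python
import pandas as pd

# Toy rows mimicking {basename}_icvis_results.csv (values are made up):
results = pd.DataFrame({
    "component_name": ["IC0", "IC1", "IC2"],
    "label": ["brain", "eye", "muscle"],
    "confidence": [0.95, 0.92, 0.88],
    "exclude_vision": [False, True, True],
})

# Components kept vs. flagged for exclusion:
brain = results[results["label"] == "brain"]
artifacts = results[results["exclude_vision"]]
print(artifacts["component_name"].tolist())  # ['IC1', 'IC2']
```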

### Output Files

ICVision creates organized output files with input filename prefixes to prevent overwrites when processing multiple datasets:

*   `{basename}_icvis_classified_ica.fif`: MNE ICA object with labels and exclusions
*   `{basename}_icvis_results.csv`: Detailed classification results per component
*   `{basename}_icvis_cleaned_raw.{format}`: Cleaned EEG data with artifacts removed
*   `{basename}_icvis_summary.txt`: Summary statistics by label type
*   `{basename}_icvis_report_all_comps.pdf`: Comprehensive PDF report (if enabled)
*   `component_IC{N}_vision_analysis.webp`: Individual component plots used for API classification

**Example**: Processing `sub-01_task-rest_eeg.set` creates files like:
- `sub-01_task-rest_eeg_icvis_results.csv`
- `sub-01_task-rest_eeg_icvis_classified_ica.fif`
- `sub-01_task-rest_eeg_icvis_cleaned_raw.set`

**Multi-file Processing**: All results are saved to the same `autoclean_icvision_results/` directory, with basename prefixes ensuring no conflicts:
```bash
autoclean_icvision_results/
├── sub-01_task-rest_eeg_icvis_results.csv
├── sub-01_task-rest_eeg_icvis_classified_ica.fif
├── sub-02_task-rest_eeg_icvis_results.csv
├── sub-02_task-rest_eeg_icvis_classified_ica.fif
└── pilot_data_icvis_results.csv
```
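With that layout, gathering each subject's results afterwards is a matter of globbing on the shared filename suffix. A convenience sketch (`results_by_subject` is a hypothetical helper, not part of ICVision):

```python
from pathlib import Path

def results_by_subject(out_dir="autoclean_icvision_results"):
    """Map each basename prefix to its results CSV in the shared
    output directory."""
    suffix = "_icvis_results.csv"
    return {
        p.name[: -len(suffix)]: p
        for p in sorted(Path(out_dir).glob("*" + suffix))
    }
```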

### Custom Classification Prompt

The default prompt is optimized for EEG component classification on EGI128 nets. You can customize it by:
- **CLI**: `--prompt-file /path/to/custom_prompt.txt`
- **Python API**: `custom_prompt="Your custom prompt here"`
- **View default**: Check `src/icvision/config.py`

### OpenAI API Costs

ICVision automatically tracks and estimates OpenAI API costs during processing:

**Typical Costs (2025-05-29 pricing)**:
- **gpt-4.1**: ~$0.0012 per component
- **gpt-4.1-mini**: ~$0.0002 per component (recommended)
- **gpt-4.1-nano**: ~$0.0001 per component (budget option)

**Example costs for a full ICA analysis** (multiplying out the per-component estimates above):
- 10 components: ~$0.001-0.012 depending on model
- 30 components: ~$0.003-0.036 depending on model
- 64 components: ~$0.006-0.077 depending on model

Cost estimates are automatically logged during processing. Use `--verbose` flag to see detailed per-component cost tracking.
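For a quick budget check before a run, the per-component estimates above can be multiplied out (prices are approximate and change over time; `estimate_cost` is an illustrative helper, not an ICVision function):

```python
# Approximate per-component prices from the list above (subject to change):
PRICE_PER_COMPONENT = {
    "gpt-4.1": 0.0012,
    "gpt-4.1-mini": 0.0002,
    "gpt-4.1-nano": 0.0001,
}

def estimate_cost(n_components, model="gpt-4.1-mini"):
    """Back-of-the-envelope run cost in USD."""
    return n_components * PRICE_PER_COMPONENT[model]

print(f"${estimate_cost(64):.4f}")  # 64 components with gpt-4.1-mini -> $0.0128
```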

### Logging and Verbosity

ICVision provides two logging modes for different use cases:

**Normal Mode** (Default - Clean output for researchers):
```bash
autoclean-icvision data.set
# Output:
# 2025-05-29 13:33:43 - INFO - Starting ICVision CLI v0.1.0
# 2025-05-29 13:33:44 - INFO - OpenAI classification complete. Processed 20/20 components
# 2025-05-29 13:33:45 - INFO - ICVision workflow completed successfully!
```

**Verbose Mode** (Detailed debugging information):
```bash
autoclean-icvision data.set --verbose
# Output:
# 2025-05-29 13:33:43 - icvision - INFO - Verbose logging enabled - showing module details
# 2025-05-29 13:33:44 - icvision.core - DEBUG - Loading and validating input data...
# 2025-05-29 13:33:45 - icvision.api - DEBUG - Response ID: resp_123..., Tokens: 400/50, Cost: $0.001200
# 2025-05-29 13:33:45 - icvision.plotting - DEBUG - Plotting progress: 10/20 components completed
```

**Verbose mode provides**:
- Module-level debugging information
- Detailed OpenAI API cost tracking per component
- Progress indicators for long-running operations
- External library logging (httpx, openai, etc.)
- Full error stack traces for troubleshooting

**Use verbose mode when**:
- Debugging processing issues
- Monitoring API costs in detail
- Contributing to development
- Troubleshooting unexpected behavior
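In scripts, a rough equivalent of `--verbose` can be configured with the standard `logging` module (the `icvision` logger name follows the sample output above, but treat it as an assumption about the package internals):

```python
import logging

# Script-side equivalent of the --verbose CLI flag:
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s",
)
logging.getLogger("icvision").setLevel(logging.DEBUG)
```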

## Development

Contributions are welcome! Please see `CONTRIBUTING.md` for guidelines.

## License

This project is licensed under the MIT License - see the `LICENSE` file for details.

## Citation

If you use ICVision in your research, please consider citing it (details to be added upon publication/DOI generation).

## Acknowledgements

*   This project relies heavily on the [MNE-Python](https://mne.tools/) library.
*   Utilizes the [OpenAI API](https://openai.com/api/).

            

`channel_noise` - Channel-specific noise\n- `other_artifact` - Other artifacts\n\nThese are defined in `src/icvision/config.py`.\n\n### Output Files\n\nICVision creates organized output files with input filename prefixes to prevent overwrites when processing multiple datasets:\n\n*   `{basename}_icvis_classified_ica.fif`: MNE ICA object with labels and exclusions\n*   `{basename}_icvis_results.csv`: Detailed classification results per component\n*   `{basename}_icvis_cleaned_raw.{format}`: Cleaned EEG data with artifacts removed\n*   `{basename}_icvis_summary.txt`: Summary statistics by label type\n*   `{basename}_icvis_report_all_comps.pdf`: Comprehensive PDF report (if enabled)\n*   `component_IC{N}_vision_analysis.webp`: Individual component plots used for API classification\n\n**Example**: Processing `sub-01_task-rest_eeg.set` creates files like:\n- `sub-01_task-rest_eeg_icvis_results.csv`\n- `sub-01_task-rest_eeg_icvis_classified_ica.fif`\n- `sub-01_task-rest_eeg_icvis_cleaned_raw.set`\n\n**Multi-file Processing**: All results are saved to the same `autoclean_icvision_results/` directory, with basename prefixes ensuring no conflicts:\n```bash\nautoclean_icvision_results/\n\u251c\u2500\u2500 sub-01_task-rest_eeg_icvis_results.csv\n\u251c\u2500\u2500 sub-01_task-rest_eeg_icvis_classified_ica.fif\n\u251c\u2500\u2500 sub-02_task-rest_eeg_icvis_results.csv\n\u251c\u2500\u2500 sub-02_task-rest_eeg_icvis_classified_ica.fif\n\u2514\u2500\u2500 pilot_data_icvis_results.csv\n```\n\n### Custom Classification Prompt\n\nThe default prompt is optimized for EEG component classification on EGI128 nets. 
You can customize it by:\n- **CLI**: `--prompt-file /path/to/custom_prompt.txt`\n- **Python API**: `custom_prompt=\"Your custom prompt here\"`\n- **View default**: Check `src/icvision/config.py`\n\n### OpenAI API Costs\n\nICVision automatically tracks and estimates OpenAI API costs during processing:\n\n**Typical Costs (2025-05-29 pricing)**:\n- **gpt-4.1**: ~$0.0012 per component\n- **gpt-4.1-mini**: ~$0.0002 per component (recommended)\n- **gpt-4.1-nano**: ~$0.0001 per component (budget option)\n\n**Example costs for full ICA analysis**:\n- 10 components: $0.0006-0.012 depending on model\n- 30 components: $0.002-0.036 depending on model\n- 64 components: $0.004-0.077 depending on model\n\nCost estimates are automatically logged during processing. Use `--verbose` flag to see detailed per-component cost tracking.\n\n### Logging and Verbosity\n\nICVision provides two logging modes for different use cases:\n\n**Normal Mode** (Default - Clean output for researchers):\n```bash\nautoclean-icvision data.set\n# Output:\n# 2025-05-29 13:33:43 - INFO - Starting ICVision CLI v0.1.0\n# 2025-05-29 13:33:44 - INFO - OpenAI classification complete. 
Processed 20/20 components\n# 2025-05-29 13:33:45 - INFO - ICVision workflow completed successfully!\n```\n\n**Verbose Mode** (Detailed debugging information):\n```bash\nautoclean-icvision data.set --verbose\n# Output:\n# 2025-05-29 13:33:43 - icvision - INFO - Verbose logging enabled - showing module details\n# 2025-05-29 13:33:44 - icvision.core - DEBUG - Loading and validating input data...\n# 2025-05-29 13:33:45 - icvision.api - DEBUG - Response ID: resp_123..., Tokens: 400/50, Cost: $0.001200\n# 2025-05-29 13:33:45 - icvision.plotting - DEBUG - Plotting progress: 10/20 components completed\n```\n\n**Verbose mode provides**:\n- Module-level debugging information\n- Detailed OpenAI API cost tracking per component\n- Progress indicators for long-running operations\n- External library logging (httpx, openai, etc.)\n- Full error stack traces for troubleshooting\n\n**Use verbose mode when**:\n- Debugging processing issues\n- Monitoring API costs in detail\n- Contributing to development\n- Troubleshooting unexpected behavior\n\n## Development\n\nContributions are welcome! Please see `CONTRIBUTING.md` for guidelines.\n\n## License\n\nThis project is licensed under the MIT License - see the `LICENSE` file for details.\n\n## Citation\n\nIf you use ICVision in your research, please consider citing it (details to be added upon publication/DOI generation).\n\n## Acknowledgements\n\n*   This project relies heavily on the [MNE-Python](https://mne.tools/) library.\n*   Utilizes the [OpenAI API](https://openai.com/api/).\n",