cve-report-aggregator

- **Version**: 0.11.0
- **Summary**: Aggregate and deduplicate vulnerability scan reports from Grype and Trivy
- **Upload time**: 2025-10-21 00:59:52
- **Requires Python**: >=3.12
- **License**: MIT
- **Keywords**: cve, grype, sbom, security, trivy, vulnerability
# CVE Report Aggregation and Deduplication Tool

[![Python Version](https://img.shields.io/badge/python-3.12%20%7C%203.13-blue.svg)](https://www.python.org/downloads/)
[![PyPI version](https://img.shields.io/pypi/v/cve-report-aggregator.svg)](https://pypi.org/project/cve-report-aggregator/)
[![PyPI downloads](https://img.shields.io/pypi/dm/cve-report-aggregator.svg)](https://pypi.org/project/cve-report-aggregator/)
[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![CI](https://github.com/mkm29/cve-report-aggregator/actions/workflows/test.yml/badge.svg)](https://github.com/mkm29/cve-report-aggregator/actions/workflows/test.yml)
[![codecov](https://codecov.io/gh/mkm29/cve-report-aggregator/branch/main/graph/badge.svg?token=mJcMNSlBIM)](https://codecov.io/gh/mkm29/cve-report-aggregator)
[![Latest Release](https://img.shields.io/github/v/release/mkm29/cve-report-aggregator)](https://github.com/mkm29/cve-report-aggregator/releases)
[![Docker](https://img.shields.io/badge/docker-available-blue.svg)](https://github.com/mkm29/cve-report-aggregator/pkgs/container/cve-report-aggregator)
[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)

![CVE Report Aggregator Logo](./images/logo.png)

A Python package for aggregating and deduplicating Grype and Trivy vulnerability scan reports, extracted from Zarf packages. Optionally enrich CVE data using OpenAI GPT models to provide actionable mitigation summaries in the context of UDS Core security controls.

> [!IMPORTANT]
> Customizable prompts and support for additional AI providers are planned for future releases.

## Features

- **Self-Contained Docker Image**: Includes all scanning tools (Grype, Syft, Trivy, UDS CLI) in a single hardened
  Alpine-based image
- **Supply Chain Security**: SLSA Level 3 compliant with signed images, SBOMs, and provenance attestations
- **AI-Powered CVE Enrichment**: Optional OpenAI integration for automated vulnerability mitigation analysis
- **Production-Ready Package**: Installable via pip/pipx with proper dependency management
- **Rich Terminal Output**: Beautiful, color-coded tables and progress indicators using the Rich library
- **Multi-Scanner Support**: Works with both Grype and Trivy scanners
- **SBOM Auto-Scan**: Automatically detects and scans Syft SBOM files with Grype
- **Auto-Conversion**: Automatically converts Grype reports to CycloneDX format for Trivy scanning
- **CVE Deduplication**: Combines identical vulnerabilities across multiple scans
- **Automatic Null CVSS Filtering**: Filters out invalid CVSS scores (null, N/A, or zero) from all vulnerability reports
- **CVSS 3.x-Based Severity Selection**: Optional mode to select highest severity based on actual CVSS 3.x base scores
- **Scanner Source Tracking**: Identifies which scanner (Grype or Trivy) provided the vulnerability data
- **Occurrence Tracking**: Counts how many times each CVE appears
- **Parallel Processing**: Concurrent package downloading with configurable worker pools (10-14x speedup)
- **Flexible CLI**: Click-based interface with rich-click styling and sensible defaults
- **Full Test Coverage**: Comprehensive test suite with pytest (237 tests, 91% coverage)
- **Security Hardened**: Non-root user (UID 1001), minimal Alpine base, pinned dependencies, and vulnerability-scanned
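
The deduplication and highest-score selection described above can be sketched as follows. This is an illustrative example only, not the package's actual API (field names like `cvss3` are assumptions):

```python
from collections import defaultdict

def deduplicate(findings):
    """Group findings by CVE ID; keep the entry with the highest CVSS 3.x
    base score and count how many times each CVE was seen."""
    grouped = defaultdict(list)
    for finding in findings:
        grouped[finding["id"]].append(finding)
    unified = []
    for cve_id, group in grouped.items():
        # Treat null/missing CVSS scores as 0.0 so they never win selection
        best = max(group, key=lambda f: f.get("cvss3", 0.0) or 0.0)
        unified.append({**best, "count": len(group)})
    return unified
```

For example, a CVE reported by both Grype and Trivy collapses to a single record carrying the higher CVSS score, the winning scanner, and an occurrence count of 2.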

## Configuration

CVE Report Aggregator supports flexible configuration through multiple sources with the following precedence (highest to lowest):

1. **CLI Arguments** - Command-line flags and options
1. **YAML Configuration File** - `.cve-aggregator.yaml` or `.cve-aggregator.yml`
1. **Environment Variables** - Prefixed with `CVE_AGGREGATOR_`
1. **Default Values**
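
The precedence chain can be sketched as a simple lookup (illustrative only, not the tool's internal implementation):

```python
import os

DEFAULTS = {"scanner": "grype", "mode": "highest-score", "log_level": "INFO"}

def resolve(key, cli_args=None, yaml_config=None):
    """Resolve one setting: CLI > YAML file > environment variable > default."""
    if cli_args and cli_args.get(key) is not None:
        return cli_args[key]
    if yaml_config and key in yaml_config:
        return yaml_config[key]
    env_val = os.environ.get(f"CVE_AGGREGATOR_{key.upper()}")
    if env_val is not None:
        return env_val
    return DEFAULTS.get(key)
```

So `--scanner trivy` on the command line always wins, even if the YAML file and `CVE_AGGREGATOR_SCANNER` say otherwise.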

### CLI Options

| Option                      | Short | Description                                                                       | Default            |
| --------------------------- | ----- | --------------------------------------------------------------------------------- | ------------------ |
| `--input-dir`               | `-i`  | Input directory containing scan reports or SBOMs                                  | `./reports`        |
| `--scanner`                 | `-s`  | Scanner type to process (`grype` or `trivy`)                                      | `grype`            |
| `--log-level`               | `-l`  | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)                             | `INFO`             |
| `--mode`                    | `-m`  | Aggregation mode: `highest-score`, `first-occurrence`, `grype-only`, `trivy-only` | `highest-score`    |
| `--enrich-cves`             |       | Enable CVE enrichment with OpenAI                                                 | `false`            |
| `--openai-api-key`          |       | OpenAI API key (defaults to `OPENAI_API_KEY` env var)                             | None               |
| `--openai-model`            |       | OpenAI model to use for enrichment                                                | `gpt-5-nano`       |
| `--openai-reasoning-effort` |       | Reasoning effort level (`low`, `medium`, `high`)                                  | `medium`           |
| `--max-cves-to-enrich`      |       | Maximum number of CVEs to enrich                                                  | None (all)         |
| `--enrich-severity-filter`  |       | Severity levels to enrich (can be used multiple times)                            | `Critical`, `High` |
| `--help`                    | `-h`  | Show help message and exit                                                        | N/A                |
| `--version`                 |       | Show version and exit                                                             | N/A                |

### YAML Configuration File

Create a `.cve-aggregator.yaml` or `.cve-aggregator.yml` file in your project directory:

```yaml
# Scanner and processing settings
scanner: grype                          # Scanner type: grype or trivy
mode: highest-score                     # Aggregation mode
log_level: INFO                         # Logging level

input_dir: ./reports                    # Input directory for reports

# Parallel processing
maxWorkers: 14                          # Concurrent download workers (auto-detect if omitted)

# Remote package downloads
downloadRemotePackages: true            # Enable remote SBOM downloads
registry: registry.defenseunicorns.com
organization: sld-45
packages:
  - name: gitlab
    version: 18.4.2-uds.0-unicorn
    architecture: amd64
  - name: gitlab-runner
    version: 18.4.0-uds.0-unicorn
    architecture: amd64

# CVE Enrichment (OpenAI)
enrich:
  enabled: true
  provider: openai  # only openai is supported currently
  model: gpt-5  # OpenAI model (gpt-5-nano, gpt-4o, etc.)
  # apiKey: YOUR_OPENAI_API_KEY_HERE # or set via OPENAI_API_KEY environment variable
  reasoningEffort: medium  # Level of reasoning effort: minimal, low, medium, high
  severities:  # Severity levels to enrich
    - Critical
    - High
  verbosity: medium  # Verbosity level: low, medium, high
  seed: 42  # Optional: Seed for reproducibility
  metadata:  # Optional: Metadata tags for OpenAI requests
    project: cve-report-aggregator
    organization: defenseunicorns
```

See [.cve-aggregator.example.yaml](.cve-aggregator.example.yaml) for a complete example.

### Environment Variables

All configuration options can be set via environment variables with the `CVE_AGGREGATOR_` prefix (with the exception of the `OPENAI_API_KEY`, which has no prefix). For example:

```bash
# Scanner settings
export CVE_AGGREGATOR_SCANNER=grype
export CVE_AGGREGATOR_MODE=highest-score
export CVE_AGGREGATOR_LOG_LEVEL=DEBUG

# Input/output
export CVE_AGGREGATOR_INPUT_DIR=/path/to/reports
export CVE_AGGREGATOR_OUTPUT_FILE=/path/to/output.json

# Parallel processing
export CVE_AGGREGATOR_MAX_WORKERS=14

# Remote packages
export CVE_AGGREGATOR_DOWNLOAD_REMOTE_PACKAGES=true
export CVE_AGGREGATOR_REGISTRY=registry.example.com
export CVE_AGGREGATOR_ORGANIZATION=my-org

# CVE Enrichment
export OPENAI_API_KEY=sk-...                            # OpenAI API key (no prefix)
export CVE_AGGREGATOR_ENRICH_CVES=true
export CVE_AGGREGATOR_OPENAI_MODEL=gpt-5-nano
export CVE_AGGREGATOR_OPENAI_REASONING_EFFORT=medium
export CVE_AGGREGATOR_MAX_CVES_TO_ENRICH=50
```

### Configuration Examples

#### Basic Usage with Defaults

```bash
# Process reports from ./reports/ with default settings
cve-report-aggregator

# Output: $HOME/output/unified-YYYYMMDDhhmmss.json
```

#### Custom Scanner and Verbosity

```bash
# Use Trivy scanner with debug logging
cve-report-aggregator --scanner trivy --log-level DEBUG
```

#### CVE Enrichment

```bash
# Enable AI-powered enrichment for Critical and High CVEs
export OPENAI_API_KEY=sk-...
cve-report-aggregator --enrich-cves

# Customize enrichment settings
cve-report-aggregator \
  --enrich-cves \
  --openai-model gpt-4o \
  --openai-reasoning-effort high \
  --max-cves-to-enrich 10 \
  --enrich-severity-filter Critical
```

#### Remote Package Downloads

```yaml
# .cve-aggregator.yaml
downloadRemotePackages: true
registry: registry.defenseunicorns.com
organization: sld-45
maxWorkers: 14
packages:
  - name: gitlab
    version: 18.4.2-uds.0-unicorn
```

```bash
# Run with config file
cve-report-aggregator --config .cve-aggregator.yaml
```

## Performance

CVE Report Aggregator supports **parallel processing** for significantly faster execution with large package sets:

### Parallel Package Downloading

When downloading SBOM reports from remote registries (e.g., using UDS Zarf), packages are downloaded concurrently using a configurable worker pool:

```yaml
# .cve-aggregator.yaml
maxWorkers: 14  # Number of concurrent download workers (optional)
```

You can expect the following performance improvements when utilizing parallel downloads (`ThreadPoolExecutor`):

- `~10-15` seconds for 14 packages
- A **10-14x** speedup compared to sequential downloads (which can take `~150s` for 14 packages)

**Auto-Detection:** If `maxWorkers` is not specified, the optimal worker count is automatically detected using the formula: `min(<number_of_packages>, cpu_cores * 2 - 2)`. Set to `1` to disable parallelization.

**Thread Safety:** All parallel operations use thread-safe data structures (`Lock()`) to ensure data integrity across concurrent workers.
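
The worker auto-detection formula and the thread-safe download pattern described above can be sketched like this (the download callable and result handling are illustrative, not the package's actual code):

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed
from threading import Lock

def auto_workers(num_packages, max_workers=None):
    """min(<number_of_packages>, cpu_cores * 2 - 2), unless overridden."""
    if max_workers is not None:
        return max_workers
    return max(1, min(num_packages, (os.cpu_count() or 1) * 2 - 2))

def download_all(packages, download_one, max_workers=None):
    """Download packages concurrently, appending results under a Lock."""
    results, lock = [], Lock()
    workers = auto_workers(len(packages), max_workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(download_one, pkg) for pkg in packages]
        for fut in as_completed(futures):
            with lock:  # protect the shared results list
                results.append(fut.result())
    return results
```

Passing `max_workers=1` degrades gracefully to sequential behavior, matching the documented way to disable parallelization.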

## Prerequisites

**Depending on scanner choice:**

- [grype](https://github.com/anchore/grype) - For Grype scanning (default scanner)
- [syft](https://github.com/anchore/syft) - For converting reports to CycloneDX format (Trivy workflow)
- [trivy](https://github.com/aquasecurity/trivy) - For Trivy scanning

```bash
# Install Grype
brew install grype

# Install syft (for Trivy workflow)
brew install syft

# Install trivy
brew install aquasecurity/trivy/trivy
```

## Installation

### Using Docker (Recommended)

The easiest way to use CVE Report Aggregator is via the pre-built Docker image, which includes all necessary scanning tools (Grype, Syft, Trivy, UDS CLI):

```bash
# Pull the latest signed image from GitHub Container Registry
docker pull ghcr.io/mkm29/cve-report-aggregator:latest

# Or build locally
docker build -t cve-report-aggregator .

# Or use Docker Compose
docker compose run cve-aggregator --help

# Run with mounted volumes for reports and output
docker run --rm \
  -v $(pwd)/reports:/workspace/reports:ro \
  -v $(pwd)/output:/home/cve-aggregator/output \
  ghcr.io/mkm29/cve-report-aggregator:latest \
  --input-dir /workspace/reports \
  --verbose

# Note: Output files are automatically saved to $HOME/output with package name and version:
# Format: <package_name>-<package_version>.json (e.g., core-logging-0.54.1-unicorn.json)
```

#### Image Security & Supply Chain

All container images are built with enterprise-grade security:

- **Signed with Cosign**: Keyless signing using GitHub OIDC identity
- **SBOM Included**: CycloneDX and SPDX attestations attached to every image
- **Provenance**: SLSA Level 3 compliant build attestations
- **Multi-Architecture**: Supports both amd64 and arm64
- **Vulnerability Scanned**: Regularly scanned with Grype and Trivy

##### Verify Image Signature

```bash
# Install cosign
brew install cosign

# Verify the image signature
cosign verify ghcr.io/mkm29/cve-report-aggregator:latest \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com'

# Output shows verified signature with GitHub Actions identity
```

##### Download and Verify SBOM

```bash
# Download CycloneDX SBOM (JSON format)
cosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \
  --type cyclonedx \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \
  jq -r '.payload' | base64 -d | jq . > sbom-cyclonedx.json

# Download SPDX SBOM (JSON format)
cosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \
  --type spdx \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \
  jq -r '.payload' | base64 -d | jq . > sbom-spdx.json

# View all attestations and signatures
cosign tree ghcr.io/mkm29/cve-report-aggregator:latest
```

##### Download Build Provenance

```bash
# Download SLSA provenance attestation
cosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \
  --type slsaprovenance \
  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \
  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \
  jq -r '.payload' | base64 -d | jq . > provenance.json
```

#### Available Image Tags

Images are published to GitHub Container Registry with the following tags:

- `latest` - Latest stable release (recommended for production)
- `v*.*.*` - Specific version tags (e.g., `v0.5.1`, `v0.5.2`)
- `rc` - Release candidate builds (for testing pre-release versions)

```bash
# Pull specific version
docker pull ghcr.io/mkm29/cve-report-aggregator:v0.5.1

# Pull latest stable
docker pull ghcr.io/mkm29/cve-report-aggregator:latest

# Pull release candidate (if available)
docker pull ghcr.io/mkm29/cve-report-aggregator:rc
```

All tags are signed and include full attestations (signature, SBOM, provenance).

## CVE Enrichment

CVE Report Aggregator supports optional AI-powered enrichment using OpenAI GPT models to automatically analyze vulnerabilities in the context of UDS Core security controls. This feature generates concise, actionable mitigation summaries that explain how defense-in-depth security measures help protect against specific CVEs.

### Key Features

- **gpt-5-nano with Batch API**: Cost-optimized analysis with 50% discount on already low token costs
- **Asynchronous Processing**: Submits all CVEs to OpenAI Batch API and polls for completion
- **UDS Core Security Context**: Analyzes 20+ NetworkPolicies and 19 Pepr admission policies
- **Single-Sentence Summaries**: Format "UDS helps to mitigate {CVE_ID} by {explanation}"
- **Configurable Reasoning Effort**: Tune analysis depth with `low`, `medium`, or `high` settings
- **Severity Filtering**: Default enrichment for `Critical` and `High` severity only
- **Flexible Configuration**: CLI, YAML, or environment variables

**Note:** Batch API enrichment typically completes within minutes to hours (up to 24-hour maximum). The CLI will poll for completion automatically and display progress updates.
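
The polling loop can be sketched generically; `fetch_status` below stands in for whatever call retrieves the batch status from the OpenAI API (the exact client call is intentionally not shown, and the state names are assumptions based on common Batch API terminal states):

```python
import time

TERMINAL_STATES = {"completed", "failed", "expired", "cancelled"}

def poll_until_complete(fetch_status, interval_s=30.0, timeout_s=24 * 3600):
    """Poll a batch job until it reaches a terminal state or the
    24-hour completion window elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        time.sleep(interval_s)
    raise TimeoutError("batch did not complete within the window")
```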

### Quick Start

```bash
# Set API key
export OPENAI_API_KEY=sk-...

# Enable enrichment (enriches Critical and High severity CVEs by default)
cve-report-aggregator --enrich-cves

# Customize enrichment with higher reasoning effort
cve-report-aggregator \
  --enrich-cves \
  --openai-model gpt-4o \
  --openai-reasoning-effort high \
  --max-cves-to-enrich 10 \
  --enrich-severity-filter Critical
```

### Reasoning Effort

The `openai_reasoning_effort` parameter controls how deeply the AI model analyzes each CVE:

- **`minimal`**: Basic analysis with minimal token usage
- **`low`**: Faster, more concise analysis with lower token usage
- **`medium`** (default): Balanced analysis with good quality and reasonable token usage
- **`high`**: Most thorough analysis with higher quality but increased token usage

**When to adjust:**

- Use `minimal` for quick overviews or large CVE sets
- Use `low` for large CVE sets where speed and cost are priorities
- Use `medium` (default) for most production use cases
- Use `high` for critical vulnerabilities requiring detailed analysis

**Note:** The `reasoning_effort` parameter is only supported by GPT-5 models (gpt-5-nano, gpt-5-mini). The temperature parameter is fixed at `1.0` for GPT-5 models as required by OpenAI.

```bash
# Example: High-quality analysis for critical CVEs only
cve-report-aggregator \
  --enrich-cves \
  --openai-reasoning-effort high \
  --enrich-severity-filter Critical
```

### Cost Optimization

The system achieves extremely low costs through:

1. **gpt-5-nano**: Ultra cost-effective model ($0.150/1M input, $0.600/1M output tokens)
1. **OpenAI Batch API**: 50% cost discount compared to synchronous API calls
1. **Single-Sentence Format**: 80% fewer output tokens (100 vs 500 tokens per CVE)
1. **Severity Filtering**: ~70% fewer CVEs enriched (Critical/High only by default)

**Batch API Benefits:**

The OpenAI Batch API processes requests asynchronously with significant cost savings:

- **50% cost discount** on all API calls (applied automatically)
- Processes all CVEs in a single batch submission
- Results available within 24 hours (typically much faster)
- Automatic retry and error handling

**Cost Examples (gpt-5-nano with Batch API @ 50% discount):**

- 10 CVEs: ~$0.0006 (11,000 tokens @ $0.075/1M input, $0.300/1M output)
- 100 CVEs: ~$0.006 (110,000 tokens)
- 1,000 CVEs: ~$0.06 (1,100,000 tokens)

**Comparison with Standard Pricing:**

- 100 CVEs with Batch API (gpt-5-nano): $0.006
- 100 CVEs without Batch API (gpt-5-nano): $0.012
- 100 CVEs with GPT-4: ~$12.00
- **Cost Reduction vs GPT-4: 99.95%**
- **Cost Reduction vs Synchronous API: 50%**

### Output Format

Enrichments are added to the unified report under the `enrichments` key:

```json
{
  "enrichments": {
    "CVE-2024-12345": {
      "cve_id": "CVE-2024-12345",
      "mitigation_summary": "UDS helps to mitigate CVE-2024-12345 by enforcing non-root container execution through Pepr admission policies and blocking unauthorized external network access via default-deny NetworkPolicies.",
      "analysis_model": "gpt-5-nano",
      "analysis_timestamp": "2025-01-20T12:34:56.789Z"
    }
  },
  "summary": {
    "enrichment": {
      "enabled": true,
      "total_cves": 150,
      "enriched_cves": 45,
      "model": "gpt-5-nano",
      "severity_filter": ["Critical", "High"]
    }
  }
}
```
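
To consume the `enrichments` block programmatically, something like the following works (field names match the example above; the helper itself is illustrative):

```python
import json

def mitigation_summaries(report_json):
    """Extract CVE ID -> mitigation summary pairs from a unified report."""
    report = json.loads(report_json)
    return {
        cve_id: entry["mitigation_summary"]
        for cve_id, entry in report.get("enrichments", {}).items()
    }
```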

## Docker Credentials Management

The Docker container supports two methods for providing registry credentials:

1. **Build-Time Secrets**
1. **Environment Variables**

### Method 1: Build-Time Secrets (Recommended)

**Best for**: Private container images where credentials can be baked in securely.

Create a credentials file in JSON format with `username`, `password`, and `registry` fields:

```bash
cat > docker/config.json <<EOF
{
  "username": "myuser",
  "password": "mypassword",
  "registry": "ghcr.io"
}
EOF
chmod 600 docker/config.json
```

**Important**: Always encrypt the credentials file with SOPS before committing:

```bash
# Encrypt the credentials file
sops -e docker/config.json > docker/config.json.enc

# Or encrypt in place
sops -e -i docker/config.json
```

Build the image with the secret:

```bash
# If using encrypted file, decrypt first
sops -d docker/config.json.enc > docker/config.json.dec

# Build with the decrypted credentials
docker buildx build \
  --secret id=credentials,src=./docker/config.json.dec \
  -f docker/Dockerfile \
  -t cve-report-aggregator:latest .

# Remove decrypted file after build
rm docker/config.json.dec
```

Or build directly with unencrypted file (for local development):

```bash
docker buildx build \
  --secret id=credentials,src=./docker/config.json \
  -f docker/Dockerfile \
  -t cve-report-aggregator:latest .
```

The credentials will be stored in the image at `$DOCKER_CONFIG/config.json` (defaults to `/home/cve-aggregator/.docker/config.json`) in proper Docker authentication format with base64-encoded credentials.
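
The resulting file follows Docker's standard auth format, where the credential is the base64 encoding of `username:password`. A minimal sketch of that conversion:

```python
import base64
import json

def to_docker_config(username, password, registry):
    """Build a Docker config.json 'auths' entry with base64-encoded credentials."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": token}}}, indent=2)
```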

Run the container (no runtime credentials needed - uses baked-in `config.json`):

```bash
docker run --rm cve-report-aggregator:latest --help
```

**Important**: This method bakes credentials into the image. Only use for private registries and **never** push images with credentials to public registries.

### Method 2: Environment Variables

```bash
docker run -it --rm \
  -e REGISTRY_URL="$UDS_URL" \
  -e UDS_USERNAME="$UDS_USERNAME" \
  -e UDS_PASSWORD="$UDS_PASSWORD" \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  cve-report-aggregator:latest --help
```

### How Credentials Are Handled

The `entrypoint.sh` script checks for Docker authentication on startup:

1. **Docker config.json** (Build-Time): Checks if `$DOCKER_CONFIG/config.json` exists

   - If found: Skips all credential checks and login - uses existing Docker auth
   - Location: `/home/cve-aggregator/.docker/config.json`

1. **Environment Variables** (if config.json not found): Requires all three variables:

   - `REGISTRY_URL` - Registry URL (e.g., `registry.defenseunicorns.com`)
   - `UDS_USERNAME` - Registry username
   - `UDS_PASSWORD` - Registry password

If `config.json` doesn't exist and environment variables are not provided, the container exits with an error.

### From Source

```bash
# Clone the repository
git clone https://github.com/mkm29/cve-report-aggregator.git
cd cve-report-aggregator

# Install in development mode
pip install -e .

# Or install with dev dependencies
pip install -e ".[dev]"
```

### From PyPI

```bash
# Install globally
pip install cve-report-aggregator

# Or install with pipx (recommended)
pipx install cve-report-aggregator
```

## Usage

### Basic Usage (Default Locations)

Process reports from `./reports/` and automatically save timestamped output to `$HOME/output/`:

```bash
cve-report-aggregator
# Output:
#   $HOME/output/<package>/<package>-<version>.json
#   $HOME/output/<package>/<package>-<version>.csv
```

### Use Trivy Scanner

Automatically convert reports to CycloneDX and scan with Trivy:

```bash
cve-report-aggregator --scanner trivy
```

### Process SBOM Files

The script automatically detects and scans Syft SBOM files:

```bash
cve-report-aggregator -i /path/to/sboms -v
```

### Custom Input Directory

```bash
# Specify custom input directory (output still goes to $HOME/output)
cve-report-aggregator -i /path/to/reports
```

### Verbose Mode

Enable detailed processing output:

```bash
cve-report-aggregator -v
```

### Combined Options

```bash
cve-report-aggregator -i ./scans --scanner trivy -v
# Output:
#   $HOME/output/<package>/<package>-<version>.json
#   $HOME/output/<package>/<package>-<version>.csv
```

### Use Highest Severity Across Scanners

When scanning with multiple scanners (or multiple runs of the same scanner), automatically select the highest severity rating:

```bash
# Scan the same image with both Grype and Trivy, use highest severity
grype myapp:latest -o json > reports/grype-app.json
trivy image myapp:latest -f json -o reports/trivy-app.json
cve-report-aggregator -i reports/ --mode highest-score
# Output:
#   $HOME/output/<package>/<package>-<version>.json
#   $HOME/output/<package>/<package>-<version>.csv
```

This is particularly useful when:

- Combining results from multiple scanners with different severity assessments
- Ensuring conservative (worst-case) severity ratings for compliance
- Aggregating multiple scans over time where severity data may have been updated

**Note:** All output files are automatically saved to `$HOME/output/` in a `<package>` subdirectory with the package version in the format `<package_name>-<package_version>.json`.

For complete configuration options, see the [Configuration](#configuration) section.

## Output Formats

The tool generates reports in two formats for maximum flexibility:

### 1. JSON Format (Unified Report)

The unified report includes:

#### Metadata

- Generation timestamp
- Scanner type and version
- Source report count and filenames
- Package name and version

#### Summary

- Total vulnerability occurrences
- Unique vulnerability count
- Severity breakdown (Critical, High, Medium, Low, Negligible, Unknown)
- Per-image scan results

#### Vulnerabilities (Deduplicated)

For each unique CVE/GHSA:

- Vulnerability ID
- Occurrence count
- Selected scanner (which scanner provided the vulnerability data)
- Severity and CVSS scores
- Fix availability and versions
- All affected sources (images and artifacts)
- Detailed match information

### 2. CSV Format (Simplified Export)

A simplified CSV export is automatically generated alongside each unified JSON report for easy consumption in spreadsheet applications and reporting tools.

**Filename Format**: `<package_name>-<package_version>.csv`

**Columns**:

- `CVE ID`: Vulnerability identifier
- `Severity`: Severity level (Critical, High, Medium, Low, etc.)
- `Count`: Number of occurrences across all scanned images
- `CVSS`: Highest CVSS 3.x score (or "N/A" if unavailable)
- `Impact`: Impact analysis from OpenAI enrichment (if enabled)
- `Mitigation`: Mitigation summary from OpenAI enrichment (if enabled)

**Example**:

```csv
"CVE-2023-4863","Critical","5","9.8","Without UDS Core controls, this critical vulnerability...","UDS helps to mitigate CVE-2023-4863 by..."
"CVE-2023-4973","High","3","7.5","This vulnerability could allow...","UDS helps to mitigate CVE-2023-4973 by..."
```

**Features**:

- Sorted by severity (Critical > High > Medium > Low) and CVSS score
- Includes enrichment data when CVE enrichment is enabled
- UTF-8 encoded with proper CSV escaping
- Compatible with Excel, Google Sheets, and data analysis tools
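
The severity-then-CVSS ordering described above can be expressed as a compound sort key (a sketch, not the tool's exact code):

```python
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}

def csv_sort_key(row):
    """Sort Critical > High > Medium > Low, then by descending CVSS score;
    'N/A' scores sort last within their severity band."""
    cvss = row["cvss"]
    score = float(cvss) if cvss != "N/A" else -1.0
    return (SEVERITY_RANK.get(row["severity"], 99), -score)

rows = [
    {"cve": "CVE-B", "severity": "High", "cvss": "7.5"},
    {"cve": "CVE-A", "severity": "Critical", "cvss": "9.8"},
    {"cve": "CVE-C", "severity": "High", "cvss": "N/A"},
]
rows.sort(key=csv_sort_key)
```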

**Location**: `$HOME/output/<package_name>/<package_name>-<package_version>.csv`

## Development

### Running Tests

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=cve_report_aggregator --cov-report=html

# Run specific test file
pytest tests/test_severity.py
```

### Code Quality

```bash
# Format code
black src/ tests/

# Lint code
ruff check src/ tests/

# Type checking
mypy src/
```

### Building the Package

```bash
# Build distribution packages
python -m build

# Install locally
pip install dist/cve_report_aggregator-0.1.0-py3-none-any.whl
```

## Project Structure

```bash
cve-report-aggregator/
├── src/
│   └── cve_report_aggregator/
│       ├── __init__.py           # Package exports and metadata
│       ├── main.py               # CLI entry point
│       ├── models.py             # Type definitions
│       ├── utils.py              # Utility functions
│       ├── severity.py           # CVSS and severity logic
│       ├── scanner.py            # Scanner integrations
│       ├── aggregator.py         # Deduplication engine
│       └── report.py             # Report generation
├── tests/
│   ├── __init__.py
│   ├── conftest.py               # Pytest fixtures
│   ├── test_severity.py          # Severity tests
│   └── test_aggregator.py        # Aggregation tests
├── pyproject.toml                # Project configuration
├── README.md                     # This file
└── LICENSE                       # MIT License
```

## Example Workflows

### Grype Workflow (Default)

```bash
# Scan multiple container images with Grype
grype registry.io/app/service1:v1.0 -o json > reports/service1.json
grype registry.io/app/service2:v1.0 -o json > reports/service2.json
grype registry.io/app/service3:v1.0 -o json > reports/service3.json

# Aggregate all reports (output saved to $HOME/output with timestamp)
cve-report-aggregator --log-level DEBUG

# Query results with jq (use the timestamped file)
REPORT=$(ls -t $HOME/output/unified-*.json | head -1)
jq '.summary' "$REPORT"
jq '.vulnerabilities[] | select(.vulnerability.severity == "Critical")' "$REPORT"
```

### SBOM Workflow

```bash
# Generate SBOMs with Syft (or use Zarf-generated SBOMs)
syft registry.io/app/service1:v1.0 -o json > sboms/service1.json
syft registry.io/app/service2:v1.0 -o json > sboms/service2.json

# Script automatically detects and scans SBOMs with Grype
cve-report-aggregator -i ./sboms --log-level DEBUG

# Results include all vulnerabilities found (use timestamped file)
REPORT=$(ls -t $HOME/output/unified-*.json | head -1)
jq '.summary.by_severity' "$REPORT"
```

### Trivy Workflow

```bash
# Start with Grype reports (script will convert to CycloneDX)
grype registry.io/app/service1:v1.0 -o json > reports/service1.json
grype registry.io/app/service2:v1.0 -o json > reports/service2.json

# Aggregate and scan with Trivy (auto-converts to CycloneDX)
cve-report-aggregator --scanner trivy --log-level DEBUG

# Or scan SBOMs directly with Trivy
cve-report-aggregator -i ./sboms --scanner trivy --log-level DEBUG

# View most recent output
REPORT=$(ls -t $HOME/output/unified-*.json | head -1)
jq '.summary' "$REPORT"
```

## License

MIT License. See the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please:

1. Fork the repository
1. Create a feature branch
1. Add tests for new functionality
1. Ensure all tests pass
1. Submit a pull request

## Changelog

See [CHANGELOG.md](CHANGELOG.md) for version history and changes.

            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "cve-report-aggregator",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.12",
    "maintainer_email": null,
    "keywords": "cve, grype, sbom, security, trivy, vulnerability",
    "author": null,
    "author_email": "Mitchell Murphy <mitchell.murphy@defenseunicorns.com>",
    "download_url": "https://files.pythonhosted.org/packages/e9/8b/c3ce7badf6e770376234f71430d5f0762494a56207bc193a50593478270f/cve_report_aggregator-0.11.0.tar.gz",
    "platform": null,
    "description": "# CVE Report Aggregation and Deduplication Tool\n\n[![Python Version](https://img.shields.io/badge/python-3.12%20%7C%203.13-blue.svg)](https://www.python.org/downloads/)\n[![PyPI version](https://img.shields.io/pypi/v/cve-report-aggregator.svg)](https://pypi.org/project/cve-report-aggregator/)\n[![PyPI downloads](https://img.shields.io/pypi/dm/cve-report-aggregator.svg)](https://pypi.org/project/cve-report-aggregator/)\n[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)\n[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)\n[![CI](https://github.com/mkm29/cve-report-aggregator/actions/workflows/test.yml/badge.svg)](https://github.com/mkm29/cve-report-aggregator/actions/workflows/test.yml)\n[![codecov](https://codecov.io/gh/mkm29/cve-report-aggregator/branch/main/graph/badge.svg?token=mJcMNSlBIM)](https://codecov.io/gh/mkm29/cve-report-aggregator)\n[![Latest Release](https://img.shields.io/github/v/release/mkm29/cve-report-aggregator)](https://github.com/mkm29/cve-report-aggregator/releases)\n[![Docker](https://img.shields.io/badge/docker-available-blue.svg)](https://github.com/mkm29/cve-report-aggregator/pkgs/container/cve-report-aggregator)\n[![Code style: ruff](https://img.shields.io/badge/code%20style-ruff-000000.svg)](https://github.com/astral-sh/ruff)\n\n![CVE Report Aggregator Logo](./images/logo.png)\n\nA Python package for aggregating and deduplicating Grype and Trivy vulnerability scan reports, extracted from Zarf packages. 
Optionally enrich CVE data using OpenAI GPT models to provide actionable mitigation summaries in the context of UDS Core security controls.\n\n> [!IMPORTANT]\n> Customizable prompts and support for additional AI providers are planned for future releases.\n\n## Features\n\n- **Self-Contained Docker Image**: Includes all scanning tools (Grype, Syft, Trivy, UDS CLI) in a single hardened\n  Alpine-based image\n- **Supply Chain Security**: SLSA Level 3 compliant with signed images, SBOMs, and provenance attestations\n- **AI-Powered CVE Enrichment**: Optional OpenAI integration for automated vulnerability mitigation analysis\n- **Production-Ready Package**: Installable via pip/pipx with proper dependency management\n- **Rich Terminal Output**: Beautiful, color-coded tables and progress indicators using the Rich library\n- **Multi-Scanner Support**: Works with both Grype and Trivy scanners\n- **SBOM Auto-Scan**: Automatically detects and scans Syft SBOM files with Grype\n- **Auto-Conversion**: Automatically converts Grype reports to CycloneDX format for Trivy scanning\n- **CVE Deduplication**: Combines identical vulnerabilities across multiple scans\n- **Automatic Null CVSS Filtering**: Filters out invalid CVSS scores (null, N/A, or zero) from all vulnerability reports\n- **CVSS 3.x-Based Severity Selection**: Optional mode to select highest severity based on actual CVSS 3.x base scores\n- **Scanner Source Tracking**: Identifies which scanner (Grype or Trivy) provided the vulnerability data\n- **Occurrence Tracking**: Counts how many times each CVE appears\n- **Parallel Processing**: Concurrent package downloading with configurable worker pools (10-14x speedup)\n- **Flexible CLI**: Click-based interface with rich-click styling and sensible defaults\n- **Extensive Test Coverage**: Comprehensive test suite with pytest (237 tests, 91% coverage)\n- **Security Hardened**: Non-root user (UID 1001), minimal Alpine base, pinned dependencies, and vulnerability-scanned\n\n## 
Configuration\n\nCVE Report Aggregator supports flexible configuration through multiple sources with the following precedence (highest to lowest):\n\n1. **CLI Arguments** - Command-line flags and options\n1. **YAML Configuration File** - `.cve-aggregator.yaml` or `.cve-aggregator.yml`\n1. **Environment Variables** - Prefixed with `CVE_AGGREGATOR_`\n1. **Default Values**\n\n### CLI Options\n\n| Option                      | Short | Description                                                                       | Default            |\n| --------------------------- | ----- | --------------------------------------------------------------------------------- | ------------------ |\n| `--input-dir`               | `-i`  | Input directory containing scan reports or SBOMs                                  | `./reports`        |\n| `--scanner`                 | `-s`  | Scanner type to process (`grype` or `trivy`)                                      | `grype`            |\n| `--log-level`               | `-l`  | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)                             | `INFO`             |\n| `--mode`                    | `-m`  | Aggregation mode: `highest-score`, `first-occurrence`, `grype-only`, `trivy-only` | `highest-score`    |\n| `--enrich-cves`             |       | Enable CVE enrichment with OpenAI                                                 | `false`            |\n| `--openai-api-key`          |       | OpenAI API key (defaults to `OPENAI_API_KEY` env var)                             | None               |\n| `--openai-model`            |       | OpenAI model to use for enrichment                                                | `gpt-5-nano`       |\n| `--openai-reasoning-effort` |       | Reasoning effort level (`low`, `medium`, `high`)                                  | `medium`           |\n| `--max-cves-to-enrich`      |       | Maximum number of CVEs to enrich                                                  | None (all)         
|\n| `--enrich-severity-filter`  |       | Severity levels to enrich (can be used multiple times)                            | `Critical`, `High` |\n| `--help`                    | `-h`  | Show help message and exit                                                        | N/A                |\n| `--version`                 |       | Show version and exit                                                             | N/A                |\n\n### YAML Configuration File\n\nCreate a `.cve-aggregator.yaml` or `.cve-aggregator.yml` file in your project directory:\n\n```yaml\n# Scanner and processing settings\nscanner: grype                          # Scanner type: grype or trivy\nmode: highest-score                     # Aggregation mode\nlog_level: INFO                         # Logging level\n\ninput_dir: ./reports                    # Input directory for reports\n\n# Parallel processing\nmaxWorkers: 14                          # Concurrent download workers (auto-detect if omitted)\n\n# Remote package downloads\ndownloadRemotePackages: true            # Enable remote SBOM downloads\nregistry: registry.defenseunicorns.com\norganization: sld-45\npackages:\n  - name: gitlab\n    version: 18.4.2-uds.0-unicorn\n    architecture: amd64\n  - name: gitlab-runner\n    version: 18.4.0-uds.0-unicorn\n    architecture: amd64\n\n# CVE Enrichment (OpenAI)\nenrich:\n  enabled: true\n  provider: openai  # only openai is supported currently\n  model: gpt-5  # OpenAI model (gpt-5-nano, gpt-4o, etc.)\n  # apiKey: YOUR_OPENAI_API_KEY_HERE # or set via OPENAI_API_KEY environment variable\n  reasoningEffort: medium  # Level of reasoning effort: minimal, low, medium, high\n  severities:  # Severity levels to enrich\n    - Critical\n    - High\n  verbosity: medium  # Verbosity level: low, medium, high\n  seed: 42  # Optional: Seed for reproducibility\n  metadata:  # Optional: Metadata tags for OpenAI requests\n    project: cve-report-aggregator\n    organization: defenseunicorns\n```\n\nSee 
[.cve-aggregator.example.yaml](.cve-aggregator.example.yaml) for a complete example.\n\n### Environment Variables\n\nAll configuration options can be set via environment variables with the `CVE_AGGREGATOR_` prefix (with the exception of the `OPENAI_API_KEY`, which has no prefix). For example:\n\n```bash\n# Scanner settings\nexport CVE_AGGREGATOR_SCANNER=grype\nexport CVE_AGGREGATOR_MODE=highest-score\nexport CVE_AGGREGATOR_LOG_LEVEL=DEBUG\n\n# Input/output\nexport CVE_AGGREGATOR_INPUT_DIR=/path/to/reports\nexport CVE_AGGREGATOR_OUTPUT_FILE=/path/to/output.json\n\n# Parallel processing\nexport CVE_AGGREGATOR_MAX_WORKERS=14\n\n# Remote packages\nexport CVE_AGGREGATOR_DOWNLOAD_REMOTE_PACKAGES=true\nexport CVE_AGGREGATOR_REGISTRY=registry.example.com\nexport CVE_AGGREGATOR_ORGANIZATION=my-org\n\n# CVE Enrichment\nexport OPENAI_API_KEY=sk-...                            # OpenAI API key (no prefix)\nexport CVE_AGGREGATOR_ENRICH_CVES=true\nexport CVE_AGGREGATOR_OPENAI_MODEL=gpt-5-nano\nexport CVE_AGGREGATOR_OPENAI_REASONING_EFFORT=medium\nexport CVE_AGGREGATOR_MAX_CVES_TO_ENRICH=50\n```\n\n### Configuration Examples\n\n#### Basic Usage with Defaults\n\n```bash\n# Process reports from ./reports/ with default settings\ncve-report-aggregator\n\n# Output: $HOME/output/unified-YYYYMMDDhhmmss.json\n```\n\n#### Custom Scanner and Verbosity\n\n```bash\n# Use Trivy scanner with debug logging\ncve-report-aggregator --scanner trivy --log-level DEBUG\n```\n\n#### CVE Enrichment\n\n```bash\n# Enable AI-powered enrichment for Critical and High CVEs\nexport OPENAI_API_KEY=sk-...\ncve-report-aggregator --enrich-cves\n\n# Customize enrichment settings\ncve-report-aggregator \\\n  --enrich-cves \\\n  --openai-model gpt-4o \\\n  --openai-reasoning-effort high \\\n  --max-cves-to-enrich 10 \\\n  --enrich-severity-filter Critical\n```\n\n#### Remote Package Downloads\n\n```yaml\n# .cve-aggregator.yaml\ndownloadRemotePackages: true\nregistry: registry.defenseunicorns.com\norganization: 
sld-45\nmaxWorkers: 14\npackages:\n  - name: gitlab\n    version: 18.4.2-uds.0-unicorn\n```\n\n```bash\n# Run with config file\ncve-report-aggregator --config .cve-aggregator.yaml\n```\n\n## Performance\n\nCVE Report Aggregator now supports **parallel processing** for significantly faster execution with large package sets:\n\n### Parallel Package Downloading\n\nWhen downloading SBOM reports from remote registries (e.g., using UDS Zarf), packages are downloaded concurrently using a configurable worker pool:\n\n```yaml\n# .cve-aggregator.yaml\nmaxWorkers: 14  # Number of concurrent download workers (optional)\n```\n\nYou can expect the following performance improvements when utilizing parallel downloads (`ThreadPoolExecutor`):\n\n- `~10-15` seconds for 14 packages\n- A **10-14x** speedup compared to sequential downloads (which can take `~150s` for 14 packages)\n\n**Auto-Detection:** If `maxWorkers` is not specified, the optimal worker count is automatically detected using the formula: `min(<number_of_packages>, cpu_cores * 2 - 2)`. 
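As an illustration, the formula can be expressed as a small Python helper (the name `auto_worker_count` is hypothetical, not part of the package's API, and the clamp to at least 1 worker is an assumption for single-core hosts):

```python
def auto_worker_count(num_packages: int, cpu_cores: int) -> int:
    """Sketch of the documented formula: min(num_packages, cpu_cores * 2 - 2).

    Clamped to at least 1 (an assumption, so a single-core host
    still gets one worker rather than zero).
    """
    return max(1, min(num_packages, cpu_cores * 2 - 2))


print(auto_worker_count(14, 8))  # 8 * 2 - 2 = 14 -> all 14 packages in parallel
print(auto_worker_count(14, 4))  # 4 * 2 - 2 = 6  -> capped at 6 workers
print(auto_worker_count(3, 8))   # fewer packages than the cap -> 3 workers
```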
Set to `1` to disable parallelization.\n\n**Thread Safety:** All parallel operations use thread-safe data structures (`Lock()`) to ensure data integrity across concurrent workers.\n\n## Prerequisites\n\n**Depending on scanner choice:**\n\n- [grype](https://github.com/anchore/grype) - For Grype scanning (default scanner)\n- [syft](https://github.com/anchore/syft) - For converting reports to CycloneDX format (Trivy workflow)\n- [trivy](https://github.com/aquasecurity/trivy) - For Trivy scanning\n\n```bash\n# Install Grype\nbrew install grype\n\n# Install syft (for Trivy workflow)\nbrew install syft\n\n# Install trivy\nbrew install aquasecurity/trivy/trivy\n```\n\n## Installation\n\n### Using Docker (Recommended)\n\nThe easiest way to use CVE Report Aggregator is via the pre-built Docker image, which includes all necessary scanning tools (Grype, Syft, Trivy, UDS CLI):\n\n```bash\n# Pull the latest signed image from GitHub Container Registry\ndocker pull ghcr.io/mkm29/cve-report-aggregator:latest\n\n# Or build locally\ndocker build -t cve-report-aggregator .\n\n# Or use Docker Compose\ndocker compose run cve-aggregator --help\n\n# Run with mounted volumes for reports and output\ndocker run --rm \\\n  -v $(pwd)/reports:/workspace/reports:ro \\\n  -v $(pwd)/output:/home/cve-aggregator/output \\\n  ghcr.io/mkm29/cve-report-aggregator:latest \\\n  --input-dir /workspace/reports \\\n  --verbose\n\n# Note: Output files are automatically saved to $HOME/output with package name and version:\n# Format: <package_name>-<package_version>.json (e.g., core-logging-0.54.1-unicorn.json)\n```\n\n#### Image Security & Supply Chain\n\nAll container images are built with enterprise-grade security:\n\n- **Signed with Cosign**: Keyless signing using GitHub OIDC identity\n- **SBOM Included**: CycloneDX and SPDX attestations attached to every image\n- **Provenance**: SLSA Level 3 compliant build attestations\n- **Multi-Architecture**: Supports both amd64 and arm64\n- **Vulnerability 
Scanned**: Regularly scanned with Grype and Trivy\n\n##### Verify Image Signature\n\n```bash\n# Install cosign\nbrew install cosign\n\n# Verify the image signature\ncosign verify ghcr.io/mkm29/cve-report-aggregator:latest \\\n  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \\\n  --certificate-oidc-issuer='https://token.actions.githubusercontent.com'\n\n# Output shows verified signature with GitHub Actions identity\n```\n\n##### Download and Verify SBOM\n\n```bash\n# Download CycloneDX SBOM (JSON format)\ncosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \\\n  --type cyclonedx \\\n  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \\\n  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \\\n  jq -r '.payload' | base64 -d | jq . > sbom-cyclonedx.json\n\n# Download SPDX SBOM (JSON format)\ncosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \\\n  --type spdx \\\n  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \\\n  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \\\n  jq -r '.payload' | base64 -d | jq . > sbom-spdx.json\n\n# View all attestations and signatures\ncosign tree ghcr.io/mkm29/cve-report-aggregator:latest\n```\n\n##### Download Build Provenance\n\n```bash\n# Download SLSA provenance attestation\ncosign verify-attestation ghcr.io/mkm29/cve-report-aggregator:latest \\\n  --type slsaprovenance \\\n  --certificate-identity-regexp='https://github.com/mkm29/cve-report-aggregator' \\\n  --certificate-oidc-issuer='https://token.actions.githubusercontent.com' | \\\n  jq -r '.payload' | base64 -d | jq . 
> provenance.json\n```\n\n#### Available Image Tags\n\nImages are published to GitHub Container Registry with the following tags:\n\n- `latest` - Latest stable release (recommended for production)\n- `v*.*.*` - Specific version tags (e.g., `v0.5.1`, `v0.5.2`)\n- `rc` - Release candidate builds (for testing pre-release versions)\n\n```bash\n# Pull specific version\ndocker pull ghcr.io/mkm29/cve-report-aggregator:v0.5.1\n\n# Pull latest stable\ndocker pull ghcr.io/mkm29/cve-report-aggregator:latest\n\n# Pull release candidate (if available)\ndocker pull ghcr.io/mkm29/cve-report-aggregator:rc\n```\n\nAll tags are signed and include full attestations (signature, SBOM, provenance).\n\n## CVE Enrichment\n\nCVE Report Aggregator supports optional AI-powered enrichment using OpenAI GPT models to automatically analyze vulnerabilities in the context of UDS Core security controls. This feature generates concise, actionable mitigation summaries that explain how defense-in-depth security measures help protect against specific CVEs.\n\n### Key Features\n\n- **gpt-5-nano with Batch API**: Cost-optimized analysis with 50% discount on already low token costs\n- **Asynchronous Processing**: Submits all CVEs to OpenAI Batch API and polls for completion\n- **UDS Core Security Context**: Analyzes 20+ NetworkPolicies and 19 Pepr admission policies\n- **Single-Sentence Summaries**: Format \"UDS helps to mitigate {CVE_ID} by {explanation}\"\n- **Configurable Reasoning Effort**: Tune analysis depth with `low`, `medium`, or `high` settings\n- **Severity Filtering**: Default enrichment for `Critical` and `High` severity only\n- **Flexible Configuration**: CLI, YAML, or environment variables\n\n**Note:** Batch API enrichment typically completes within minutes to hours (up to 24-hour maximum). 
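The poll-until-terminal pattern behind this behaves roughly like the following sketch (`fetch_batch_status` is a hypothetical stand-in for the provider's status call, not the OpenAI SDK, and the status names are illustrative):

```python
import time

# Terminal states for an asynchronous batch job (names illustrative).
TERMINAL = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(fetch_batch_status, batch_id, poll_seconds=30, timeout=24 * 3600):
    """Poll a batch job until it reaches a terminal state or times out.

    fetch_batch_status is a caller-supplied function (hypothetical here)
    that returns the current status string for batch_id.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_batch_status(batch_id)
        if status in TERMINAL:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_id} not finished within {timeout}s")


# Example with a stubbed status source that completes on the third poll:
statuses = iter(["validating", "in_progress", "completed"])
print(wait_for_batch(lambda _id: next(statuses), "batch_123", poll_seconds=0))
# prints "completed"
```

In the real tool the status source is the Batch API; the stub above only shows the control flow.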
The CLI will poll for completion automatically and display progress updates.\n\n### Quick Start\n\n```bash\n# Set API key\nexport OPENAI_API_KEY=sk-...\n\n# Enable enrichment (enriches Critical and High severity CVEs by default)\ncve-report-aggregator --enrich-cves\n\n# Customize enrichment with higher reasoning effort\ncve-report-aggregator \\\n  --enrich-cves \\\n  --openai-model gpt-4o \\\n  --openai-reasoning-effort high \\\n  --max-cves-to-enrich 10 \\\n  --enrich-severity-filter Critical\n```\n\n### Reasoning Effort\n\nThe `openai_reasoning_effort` parameter controls how deeply the AI model analyzes each CVE:\n\n- **minimal**: Basic analysis with minimal token usage\n- **`low`**: Faster, more concise analysis with lower token usage\n- **`medium`** (default): Balanced analysis with good quality and reasonable token usage\n- **`high`**: Most thorough analysis with higher quality but increased token usage\n\n**When to adjust:**\n\n- Use `minimal` for quick overviews or large CVE sets\n- Use `low` for large CVE sets where speed and cost are priorities\n- Use `medium` (default) for most production use cases\n- Use `high` for critical vulnerabilities requiring detailed analysis\n\n**Note:** The `reasoning_effort` parameter is only supported by GPT-5 models (gpt-5-nano, gpt-5-mini). The temperature parameter is fixed at `1.0` for GPT-5 models as required by OpenAI.\n\n```bash\n# Example: High-quality analysis for critical CVEs only\ncve-report-aggregator \\\n  --enrich-cves \\\n  --openai-reasoning-effort high \\\n  --enrich-severity-filter Critical\n```\n\n### Cost Optimization\n\nThe system achieves extremely low costs through:\n\n1. **gpt-5-nano**: Ultra cost-effective model ($0.150/1M input, $0.600/1M output tokens)\n1. **OpenAI Batch API**: 50% cost discount compared to synchronous API calls\n1. **Single-Sentence Format**: 80% fewer output tokens (100 vs 500 tokens per CVE)\n1. 
**Severity Filtering**: ~70% fewer CVEs enriched (Critical/High only by default)\n\n**Batch API Benefits:**\n\nThe OpenAI Batch API processes requests asynchronously with significant cost savings:\n\n- **50% cost discount** on all API calls (applied automatically)\n- Processes all CVEs in a single batch submission\n- Results available within 24 hours (typically much faster)\n- Automatic retry and error handling\n\n**Cost Examples (gpt-5-nano with Batch API @ 50% discount):**\n\n- 10 CVEs: ~$0.0006 (11,000 tokens @ $0.075/1M input, $0.300/1M output)\n- 100 CVEs: ~$0.006 (110,000 tokens)\n- 1,000 CVEs: ~$0.06 (1,100,000 tokens)\n\n**Comparison with Standard Pricing:**\n\n- 100 CVEs with Batch API (gpt-5-nano): $0.006\n- 100 CVEs without Batch API (gpt-5-nano): $0.012\n- 100 CVEs with GPT-4: ~$12.00\n- **Cost Reduction vs GPT-4: 99.95%**\n- **Cost Reduction vs Synchronous API: 50%**\n\n### Output Format\n\nEnrichments are added to the unified report under the `enrichments` key:\n\n```json\n{\n  \"enrichments\": {\n    \"CVE-2024-12345\": {\n      \"cve_id\": \"CVE-2024-12345\",\n      \"mitigation_summary\": \"UDS helps to mitigate CVE-2024-12345 by enforcing non-root container execution through Pepr admission policies and blocking unauthorized external network access via default-deny NetworkPolicies.\",\n      \"analysis_model\": \"gpt-5-nano\",\n      \"analysis_timestamp\": \"2025-01-20T12:34:56.789Z\"\n    }\n  },\n  \"summary\": {\n    \"enrichment\": {\n      \"enabled\": true,\n      \"total_cves\": 150,\n      \"enriched_cves\": 45,\n      \"model\": \"gpt-5-nano\",\n      \"severity_filter\": [\"Critical\", \"High\"]\n    }\n  }\n}\n```\n\n## Docker Credentials Management\n\nThe Docker container supports two methods for providing registry credentials:\n\n1. **Build-Time Secrets**\n1. 
**Environment Variables**\n\n### Method 1: Build-Time Secrets (Recommended)\n\n**Best for**: Private container images where credentials can be baked in securely.\n\nCreate a credentials file in JSON format with `username`, `password`, and `registry` fields:\n\n```bash\ncat > docker/config.json <<EOF\n{\n  \"username\": \"myuser\",\n  \"password\": \"mypassword\",\n  \"registry\": \"ghcr.io\"\n}\nEOF\nchmod 600 docker/config.json\n```\n\n**Important**: Always encrypt the credentials file with SOPS before committing:\n\n```bash\n# Encrypt the credentials file\nsops -e docker/config.json > docker/config.json.enc\n\n# Or encrypt in place\nsops -e -i docker/config.json\n```\n\nBuild the image with the secret:\n\n```bash\n# If using encrypted file, decrypt first\nsops -d docker/config.json.enc > docker/config.json.dec\n\n# Build with the decrypted credentials\ndocker buildx build \\\n  --secret id=credentials,src=./docker/config.json.dec \\\n  -f docker/Dockerfile \\\n  -t cve-report-aggregator:latest .\n\n# Remove decrypted file after build\nrm docker/config.json.dec\n```\n\nOr build directly with unencrypted file (for local development):\n\n```bash\ndocker buildx build \\\n  --secret id=credentials,src=./docker/config.json \\\n  -f docker/Dockerfile \\\n  -t cve-report-aggregator:latest .\n```\n\nThe credentials will be stored in the image at `$DOCKER_CONFIG/config.json` (defaults to `/home/cve-aggregator/.docker/config.json`) in proper Docker authentication format with base64-encoded credentials.\n\nRun the container (no runtime credentials needed - uses baked-in `config.json`):\n\n```bash\ndocker run --rm cve-report-aggregator:latest --help\n```\n\n**Important**: This method bakes credentials into the image. 
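The stored auth format mentioned above amounts to base64-encoding `username:password`, as this sketch shows (the helper name `to_docker_config` is hypothetical):

```python
import base64
import json


def to_docker_config(username: str, password: str, registry: str) -> str:
    """Render a credentials trio as Docker's config.json auth structure."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": token}}}, indent=2)


# The "auth" value is simply base64("myuser:mypassword")
print(to_docker_config("myuser", "mypassword", "ghcr.io"))
```

Note that base64 is encoding, not encryption, so the baked-in file remains sensitive.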
Only use for private registries and **never** push images with credentials to public registries.\n\n### Method 2: Environment Variables\n\n```bash\ndocker run -it --rm \\\n  -e REGISTRY_URL=\"$UDS_URL\" \\\n  -e UDS_USERNAME=\"$UDS_USERNAME\" \\\n  -e UDS_PASSWORD=\"$UDS_PASSWORD\" \\\n  -e OPENAI_API_KEY=\"$OPENAI_API_KEY\" \\\n  cve-report-aggregator:latest --help\n```\n\n### How Credentials Are Handled\n\nThe `entrypoint.sh` script checks for Docker authentication on startup:\n\n1. **Docker config.json** (Build-Time): Checks if `$DOCKER_CONFIG/config.json` exists\n\n   - If found: Skips all credential checks and login - uses existing Docker auth\n   - Location: `/home/cve-aggregator/.docker/config.json`\n\n1. **Environment Variables** (if config.json not found): Requires all three variables:\n\n   - `REGISTRY_URL` - Registry URL (e.g., `registry.defenseunicorns.com`)\n   - `UDS_USERNAME` - Registry username\n   - `UDS_PASSWORD` - Registry password\n\nIf `config.json` doesn't exist and environment variables are not provided, the container exits with an error.\n\n### From Source\n\n```bash\n# Clone the repository\ngit clone https://github.com/mkm29/cve-report-aggregator.git\ncd cve-report-aggregator\n\n# Install in development mode\npip install -e .\n\n# Or install with dev dependencies\npip install -e \".[dev]\"\n```\n\n### From PyPI\n\n```bash\n# Install globally\npip install cve-report-aggregator\n\n# Or install with pipx (recommended)\npipx install cve-report-aggregator\n```\n\n## Usage\n\n### Basic Usage (Default Locations)\n\nProcess reports from `./reports/` and automatically save output to `$HOME/output/`:\n\n```bash\ncve-report-aggregator\n# Output:\n#   $HOME/output/<package>/<package>-<version>.json\n#   $HOME/output/<package>/<package>-<version>.csv\n```\n\n### Use Trivy Scanner\n\nAutomatically convert reports to CycloneDX and scan with Trivy:\n\n```bash\ncve-report-aggregator --scanner trivy\n```\n\n### Process SBOM Files\n\nThe script 
automatically detects and scans Syft SBOM files:\n\n```bash\ncve-report-aggregator -i /path/to/sboms -v\n```\n\n### Custom Input Directory\n\n```bash\n# Specify custom input directory (output still goes to $HOME/output)\ncve-report-aggregator -i /path/to/reports\n```\n\n### Verbose Mode\n\nEnable detailed processing output:\n\n```bash\ncve-report-aggregator -v\n```\n\n### Combined Options\n\n```bash\ncve-report-aggregator -i ./scans --scanner trivy -v\n# Output:\n#   $HOME/output/<package>/<package>-<version>.json\n#   $HOME/output/<package>/<package>-<version>.csv\n```\n\n### Use Highest Severity Across Scanners\n\nWhen scanning with multiple scanners (or multiple runs of the same scanner), automatically select the highest severity rating:\n\n```bash\n# Scan the same image with both Grype and Trivy, use highest severity\ngrype myapp:latest -o json > reports/grype-app.json\ntrivy image myapp:latest -f json -o reports/trivy-app.json\ncve-report-aggregator -i reports/ --mode highest-score\n# Output:\n#   $HOME/output/<package>/<package>-<version>.json\n#   $HOME/output/<package>/<package>-<version>.csv\n```\n\nThis is particularly useful when:\n\n- Combining results from multiple scanners with different severity assessments\n- Ensuring conservative (worst-case) severity ratings for compliance\n- Aggregating multiple scans over time where severity data may have been updated\n\n**Note:** All output files are automatically saved to `$HOME/output/` in a `<package>` subdirectory with the package version in the format `<package_name>-<package_version>.json`.\n\nFor complete configuration options, see the [Configuration](#configuration) section.\n\n## Output Formats\n\nThe tool generates reports in two formats for maximum flexibility:\n\n### 1. 
JSON Format (Unified Report)\n\nThe unified report includes:\n\n#### Metadata\n\n- Generation timestamp\n- Scanner type and version\n- Source report count and filenames\n- Package name and version\n\n#### Summary\n\n- Total vulnerability occurrences\n- Unique vulnerability count\n- Severity breakdown (Critical, High, Medium, Low, Negligible, Unknown)\n- Per-image scan results\n\n#### Vulnerabilities (Deduplicated)\n\nFor each unique CVE/GHSA:\n\n- Vulnerability ID\n- Occurrence count\n- Selected scanner (which scanner provided the vulnerability data)\n- Severity and CVSS scores\n- Fix availability and versions\n- All affected sources (images and artifacts)\n- Detailed match information\n\n### 2. CSV Format (Simplified Export)\n\nA simplified CSV export is automatically generated alongside each unified JSON report for easy consumption in spreadsheet applications and reporting tools.\n\n**Filename Format**: `<package_name>-<package_version>.csv`\n\n**Columns**:\n\n- `CVE ID`: Vulnerability identifier\n- `Severity`: Severity level (Critical, High, Medium, Low, etc.)\n- `Count`: Number of occurrences across all scanned images\n- `CVSS`: Highest CVSS 3.x score (or \"N/A\" if unavailable)\n- `Impact`: Impact analysis from OpenAI enrichment (if enabled)\n- `Mitigation`: Mitigation summary from OpenAI enrichment (if enabled)\n\n**Example**:\n\n```csv\n\"CVE-2023-4863\",\"Critical\",\"5\",\"9.8\",\"Without UDS Core controls, this critical vulnerability...\",\"UDS helps to mitigate CVE-2023-4863 by...\"\n\"CVE-2023-4973\",\"High\",\"3\",\"7.5\",\"This vulnerability could allow...\",\"UDS helps to mitigate CVE-2023-4973 by...\"\n```\n\n**Features**:\n\n- Sorted by severity (Critical > High > Medium > Low) and CVSS score\n- Includes enrichment data when CVE enrichment is enabled\n- UTF-8 encoded with proper CSV escaping\n- Compatible with Excel, Google Sheets, and data analysis tools\n\n**Location**: `$HOME/output/<package_name>/<package_name>-<package_version>.csv`\n\n## 
Development\n\n### Running Tests\n\n```bash\n# Run all tests\npytest\n\n# Run with coverage\npytest --cov=cve_report_aggregator --cov-report=html\n\n# Run specific test file\npytest tests/test_severity.py\n```\n\n### Code Quality\n\n```bash\n# Format code\nblack src/ tests/\n\n# Lint code\nruff check src/ tests/\n\n# Type checking\nmypy src/\n```\n\n### Building the Package\n\n```bash\n# Build distribution packages\npython -m build\n\n# Install locally\npip install dist/cve_report_aggregator-0.1.0-py3-none-any.whl\n```\n\n## Project Structure\n\n```bash\ncve-report-aggregator/\n\u251c\u2500\u2500 src/\n\u2502   \u2514\u2500\u2500 cve_report_aggregator/\n\u2502       \u251c\u2500\u2500 __init__.py           # Package exports and metadata\n\u2502       \u251c\u2500\u2500 main.py               # CLI entry point\n\u2502       \u251c\u2500\u2500 models.py             # Type definitions\n\u2502       \u251c\u2500\u2500 utils.py              # Utility functions\n\u2502       \u251c\u2500\u2500 severity.py           # CVSS and severity logic\n\u2502       \u251c\u2500\u2500 scanner.py            # Scanner integrations\n\u2502       \u251c\u2500\u2500 aggregator.py         # Deduplication engine\n\u2502       \u2514\u2500\u2500 report.py             # Report generation\n\u251c\u2500\u2500 tests/\n\u2502   \u251c\u2500\u2500 __init__.py\n\u2502   \u251c\u2500\u2500 conftest.py               # Pytest fixtures\n\u2502   \u251c\u2500\u2500 test_severity.py          # Severity tests\n\u2502   \u2514\u2500\u2500 test_aggregator.py        # Aggregation tests\n\u251c\u2500\u2500 pyproject.toml                # Project configuration\n\u251c\u2500\u2500 README.md                     # This file\n\u2514\u2500\u2500 LICENSE                       # MIT License\n```\n\n## Example Workflows\n\n### Grype Workflow (Default)\n\n```bash\n# Scan multiple container images with Grype\ngrype registry.io/app/service1:v1.0 -o json > reports/service1.json\ngrype registry.io/app/service2:v1.0 -o json 
> reports/service2.json\ngrype registry.io/app/service3:v1.0 -o json > reports/service3.json\n\n# Aggregate all reports (output saved to $HOME/output with timestamp)\ncve-report-aggregator --log-level DEBUG\n\n# Query results with jq (use the timestamped file)\nREPORT=$(ls -t $HOME/output/unified-*.json | head -1)\njq '.summary' \"$REPORT\"\njq '.vulnerabilities[] | select(.vulnerability.severity == \"Critical\")' \"$REPORT\"\n```\n\n### SBOM Workflow\n\n```bash\n# Generate SBOMs with Syft (or use Zarf-generated SBOMs)\nsyft registry.io/app/service1:v1.0 -o json > sboms/service1.json\nsyft registry.io/app/service2:v1.0 -o json > sboms/service2.json\n\n# Script automatically detects and scans SBOMs with Grype\ncve-report-aggregator -i ./sboms --log-level DEBUG\n\n# Results include all vulnerabilities found (use timestamped file)\nREPORT=$(ls -t $HOME/output/unified-*.json | head -1)\njq '.summary.by_severity' \"$REPORT\"\n```\n\n### Trivy Workflow\n\n```bash\n# Start with Grype reports (script will convert to CycloneDX)\ngrype registry.io/app/service1:v1.0 -o json > reports/service1.json\ngrype registry.io/app/service2:v1.0 -o json > reports/service2.json\n\n# Aggregate and scan with Trivy (auto-converts to CycloneDX)\ncve-report-aggregator --scanner trivy --log-level DEBUG\n\n# Or scan SBOMs directly with Trivy\ncve-report-aggregator -i ./sboms --scanner trivy --log-level DEBUG\n\n# View most recent output\nREPORT=$(ls -t $HOME/output/unified-*.json | head -1)\njq '.summary' \"$REPORT\"\n```\n\n## License\n\nMIT License - See LICENSE file for details\n\n## Contributing\n\nContributions are welcome! Please:\n\n1. Fork the repository\n1. Create a feature branch\n1. Add tests for new functionality\n1. Ensure all tests pass\n1. Submit a pull request\n\n## Changelog\n\nSee [CHANGELOG.md](CHANGELOG.md) for version history and changes.\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "Aggregate and deduplicate vulnerability scan reports from Grype and Trivy",
    "version": "0.11.0",
    "project_urls": {
        "Homepage": "https://github.com/mkm29/cve-report-aggregator",
        "Issues": "https://github.com/mkm29/cve-report-aggregator/issues",
        "Repository": "https://github.com/mkm29/cve-report-aggregator"
    },
    "split_keywords": [
        "cve",
        " grype",
        " sbom",
        " security",
        " trivy",
        " vulnerability"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "7f8632356434bfa0f22d8c52aebe4281e6cd958dc7d6a60e067bc8fee3949d45",
                "md5": "cd462b7895a6de7914b96569fcfdb5f7",
                "sha256": "cd67f51908a117cd7a006e2c2cfa493b6caaeb463d36d8649c710978f4f9157a"
            },
            "downloads": -1,
            "filename": "cve_report_aggregator-0.11.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "cd462b7895a6de7914b96569fcfdb5f7",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12",
            "size": 98726,
            "upload_time": "2025-10-21T00:59:51",
            "upload_time_iso_8601": "2025-10-21T00:59:51.710987Z",
            "url": "https://files.pythonhosted.org/packages/7f/86/32356434bfa0f22d8c52aebe4281e6cd958dc7d6a60e067bc8fee3949d45/cve_report_aggregator-0.11.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e98bc3ce7badf6e770376234f71430d5f0762494a56207bc193a50593478270f",
                "md5": "a32a7eca12568698cfa2b7dbb1f61546",
                "sha256": "88d9a8b92dd161e0b970a5be69f411710b2d523b77888dbcf00713575af02e94"
            },
            "downloads": -1,
            "filename": "cve_report_aggregator-0.11.0.tar.gz",
            "has_sig": false,
            "md5_digest": "a32a7eca12568698cfa2b7dbb1f61546",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.12",
            "size": 319782,
            "upload_time": "2025-10-21T00:59:52",
            "upload_time_iso_8601": "2025-10-21T00:59:52.633313Z",
            "url": "https://files.pythonhosted.org/packages/e9/8b/c3ce7badf6e770376234f71430d5f0762494a56207bc193a50593478270f/cve_report_aggregator-0.11.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-21 00:59:52",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "mkm29",
    "github_project": "cve-report-aggregator",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "cve-report-aggregator"
}
        