llama-stack-provider-trustyai-garak

Name	llama-stack-provider-trustyai-garak
Version	0.1.5
home_page	None
Summary	Out-Of-Tree Llama Stack provider for Garak Red-teaming
upload_time	2025-10-07 12:33:03
maintainer	None
docs_url	None
author	TrustyAI Team
requires_python	>=3.12
license	None
keywords	llama-stack garak red-teaming security ai-safety

# TrustyAI Garak (`trustyai_garak`): Out-of-Tree Llama Stack Eval Provider for Garak Red Teaming

## About
This repository implements [Garak](https://github.com/NVIDIA/garak) as an out-of-tree Llama Stack eval provider for **security testing and red teaming** of large language models, with optional **shield integration** for comparing results with and without guardrails. To get started, see the [red-teaming tutorial](https://trustyai.org/docs/main/red-teaming-introduction).

## What It Does

- **Automated Security Testing**: Detects prompt injection, jailbreaks, toxicity, and bias vulnerabilities
- **Compliance Scanning**: OWASP LLM Top 10, AVID taxonomy benchmarks
- **Shield Testing**: Compare LLM security with/without guardrails
- **Scalable Deployment**: Local or Kubernetes/Kubeflow execution
- **Comprehensive Reporting**: JSON, HTML, and detailed logs with vulnerability scores (0.0-1.0)

## Installation

```bash
git clone https://github.com/trustyai-explainability/llama-stack-provider-trustyai-garak.git
cd llama-stack-provider-trustyai-garak
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
# For remote execution: pip install -e ".[remote]"
```

## Quick Start

### 1. Configure Environment

```bash
export VLLM_URL="http://your-model-endpoint/v1"
export INFERENCE_MODEL="your-model-name"
```

### 2. Start Server

```bash
# Basic mode (standard scanning)
llama stack run run.yaml --image-type venv

# Enhanced mode (with shield testing)
llama stack run run-with-safety.yaml --image-type venv

# Remote mode (Kubernetes/KFP)
llama stack run run-remote.yaml --image-type venv
```

The server runs at `http://localhost:8321`.

### 3. Run Security Scan

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Quick 5-minute scan
job = client.eval.run_eval(
    benchmark_id="trustyai_garak::quick",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "your-model-name",
            "sampling_params": {"max_tokens": 100}
        }
    }
)

# Check status
status = client.eval.jobs.status(job_id=job.job_id, benchmark_id="trustyai_garak::quick")
print(f"Status: {status.status}")

# Get results when complete
if status.status == "completed":
    results = client.eval.get_eval_job_result(job_id=job.job_id, benchmark_id="trustyai_garak::quick")
```
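
The snippet above checks the status only once, while scans take minutes to hours. A minimal polling sketch follows. `wait_for_job` is a hypothetical helper, not part of the provider or the client, and the `"failed"` and `"cancelled"` state names are assumptions; only `"completed"` appears in the example above.

```python
import time
from typing import Callable

# Assumed terminal states; only "completed" is confirmed by the example above.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_status() until it returns a terminal state or the timeout elapses."""
    waited = 0.0
    while True:
        state = get_status()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still '{state}' after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with the client from the example above (not executed here):
# final = wait_for_job(
#     lambda: client.eval.jobs.status(
#         job_id=job.job_id, benchmark_id="trustyai_garak::quick"
#     ).status
# )
```

Passing the status lookup as a callable keeps the helper independent of the client, so it can be exercised without a running server.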

## Available Benchmarks

### Compliance Frameworks
| Benchmark ID | Framework | Duration |
|-------------|-----------|----------|
| `trustyai_garak::owasp_llm_top10` | [OWASP LLM Top 10](https://genai.owasp.org/llm-top-10/) | ~8 hours |
| `trustyai_garak::avid_security` | [AVID Security](https://docs.avidml.org/taxonomy/effect-sep-view/security) | ~8 hours |
| `trustyai_garak::avid_ethics` | [AVID Ethics](https://docs.avidml.org/taxonomy/effect-sep-view/ethics) | ~30 minutes |
| `trustyai_garak::avid_performance` | [AVID Performance](https://docs.avidml.org/taxonomy/effect-sep-view/performance) | ~40 minutes |

### Test Profiles
| Benchmark ID | Description | Duration |
|-------------|-------------|----------|
| `trustyai_garak::quick` | Essential security checks (3 probes) | ~5 minutes |
| `trustyai_garak::standard` | Standard attack vectors (5 categories) | ~1 hour |

_Duration estimates are based on Qwen2.5 7B served via vLLM._
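
When time is limited, the estimates in the two tables can drive benchmark selection. The sketch below hard-codes those estimates in minutes; `benchmarks_within` is a hypothetical helper, and actual durations will vary with the model and hardware.

```python
# Approximate durations in minutes, copied from the tables above.
BENCHMARK_MINUTES = {
    "trustyai_garak::quick": 5,
    "trustyai_garak::avid_ethics": 30,
    "trustyai_garak::avid_performance": 40,
    "trustyai_garak::standard": 60,
    "trustyai_garak::owasp_llm_top10": 480,
    "trustyai_garak::avid_security": 480,
}


def benchmarks_within(budget_minutes: int) -> list[str]:
    """Return benchmark IDs whose estimated duration fits the budget, shortest first."""
    fitting = [(m, b) for b, m in BENCHMARK_MINUTES.items() if m <= budget_minutes]
    return [b for _, b in sorted(fitting)]


print(benchmarks_within(45))
```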

## Advanced Usage

### Other Garak Probes

```python
client.benchmarks.register(
    benchmark_id="custom",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_benchmark_id="custom",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["latentinjection.LatentJailbreak", "snowball.GraphConnectivity"],
        "timeout": 900
    }
)
```
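
The registration above repeats the benchmark ID in two fields and always uses the same dataset, scoring function, and provider. A small builder can keep those fields consistent. `custom_benchmark_kwargs` is a hypothetical helper; the field values mirror the example above, and treating `timeout` as seconds is an assumption based on `GARAK_TIMEOUT`.

```python
def custom_benchmark_kwargs(benchmark_id: str, probes: list[str], timeout: int = 900) -> dict:
    """Build keyword arguments for client.benchmarks.register() from a probe list."""
    if not probes or any(not p.strip() for p in probes):
        raise ValueError("at least one non-empty probe name is required")
    return {
        "benchmark_id": benchmark_id,
        "dataset_id": "garak",
        "scoring_functions": ["garak_scoring"],
        "provider_benchmark_id": benchmark_id,  # kept identical to benchmark_id, as above
        "provider_id": "trustyai_garak",
        "metadata": {"probes": list(probes), "timeout": timeout},  # timeout assumed to be seconds
    }


kwargs = custom_benchmark_kwargs(
    "custom", ["latentinjection.LatentJailbreak", "snowball.GraphConnectivity"]
)
# With a running server (not executed here):
# client.benchmarks.register(**kwargs)
# job = client.eval.run_eval(benchmark_id="custom", benchmark_config={...})
```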

### Shield Testing

```python
# Test with input shield
client.benchmarks.register(
    benchmark_id="with_shield",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_benchmark_id="with_shield",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["promptinject.HijackHateHumans"],
        "shield_ids": ["Prompt-Guard-86M"]  # Input shield only
    }
)

# Test with input and output shields: pass this as `metadata` to client.benchmarks.register(...)
metadata={
    "probes": ["promptinject.HijackHateHumans"],
    "shield_config": {
        "input": ["Prompt-Guard-86M"],
        "output": ["Llama-Guard-3-8B"]
    }
}
```
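
The two metadata shapes above (`shield_ids` for input-only shields, `shield_config` for separate input and output shields) can be produced by one function. This is a sketch under the assumption that the two keys are alternatives and should not be combined; `shield_metadata` is a hypothetical helper.

```python
from typing import Sequence


def shield_metadata(
    probes: Sequence[str],
    input_shields: Sequence[str] = (),
    output_shields: Sequence[str] = (),
) -> dict:
    """Return benchmark metadata using shield_ids or shield_config as appropriate."""
    meta: dict = {"probes": list(probes)}
    if output_shields:
        # Output shields need the explicit input/output form.
        meta["shield_config"] = {
            "input": list(input_shields),
            "output": list(output_shields),
        }
    elif input_shields:
        # Input-only shields use the shorter shield_ids form.
        meta["shield_ids"] = list(input_shields)
    return meta


input_only = shield_metadata(["promptinject.HijackHateHumans"], ["Prompt-Guard-86M"])
both = shield_metadata(
    ["promptinject.HijackHateHumans"], ["Prompt-Guard-86M"], ["Llama-Guard-3-8B"]
)
```

Registering one benchmark with shields and one without, over the same probes, gives the with/without-guardrails comparison described under "What It Does".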

### Accessing Reports

```python
# Get report file IDs from job status
scan_report_id = status.metadata["scan.report.jsonl"]
scan_html_id = status.metadata["scan.report.html"]

# Download via Files API
content = client.files.content(scan_report_id)

# Or via HTTP
import requests
report = requests.get(f"http://localhost:8321/v1/openai/v1/files/{scan_html_id}/content")
```
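
Once the JSONL report has been downloaded, it can be summarized line by line. The sketch below counts records by their `entry_type` field. That field name is an assumption about Garak's report format, and the function expects the report as decoded text, so the object returned by `client.files.content` may need converting first.

```python
import json
from collections import Counter


def summarize_jsonl(text: str) -> Counter:
    """Count report records by entry_type (assumed field name), skipping blank lines."""
    counts: Counter = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record.get("entry_type", "unknown")] += 1
    return counts


# Synthetic sample for illustration only; real reports carry many more fields.
sample = '{"entry_type": "attempt"}\n{"entry_type": "attempt"}\n\n{"entry_type": "eval"}\n'
summary = summarize_jsonl(sample)
```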

## Remote Execution (Kubernetes/KFP)

### Setup

```bash
# KFP Configuration
export KUBEFLOW_PIPELINES_ENDPOINT="https://your-kfp-endpoint"
export KUBEFLOW_NAMESPACE="your-namespace"
export KUBEFLOW_EXPERIMENT_NAME="trustyai-garak-scans"
export KUBEFLOW_BASE_IMAGE="quay.io/rh-ee-spandraj/trustyai-garak-provider-dsp:cpu" # for gpu - "quay.io/rh-ee-spandraj/trustyai-garak-provider-dsp:gpu"

# S3 Configuration (for artifacts)
export AWS_ACCESS_KEY_ID="your-key"
export AWS_SECRET_ACCESS_KEY="your-secret"
export AWS_S3_ENDPOINT="https://your-s3-endpoint"
export AWS_S3_BUCKET="pipeline-artifacts"
export AWS_DEFAULT_REGION="us-east-1"

# Start server
llama stack run run-remote.yaml --image-type venv
```

_Note: If you're running the Llama Stack server locally, make sure `BASE_URL` in `run-remote*.yaml` is reachable from the KFP pods (you can use [ngrok](https://ngrok.com/) to expose your local Llama Stack service)._
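
A missing variable from the setup above tends to surface only after the pipeline has been submitted, so a pre-flight check can save a round trip. This sketch treats every variable exported above as required, which is an assumption; `missing_remote_env` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Variables exported in the setup above; treating all of them as required is an assumption.
REMOTE_ENV_VARS = (
    "KUBEFLOW_PIPELINES_ENDPOINT",
    "KUBEFLOW_NAMESPACE",
    "KUBEFLOW_EXPERIMENT_NAME",
    "KUBEFLOW_BASE_IMAGE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_S3_ENDPOINT",
    "AWS_S3_BUCKET",
    "AWS_DEFAULT_REGION",
)


def missing_remote_env(environ: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of remote-mode variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REMOTE_ENV_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_remote_env()
    if missing:
        print("Missing:", ", ".join(missing))
```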

### Usage

```python
# Same API, runs as KFP pipeline
job = client.eval.run_eval(benchmark_id="trustyai_garak::owasp_llm_top10", ...)

# Monitor pipeline
status = client.eval.jobs.status(job_id=job.job_id, benchmark_id="trustyai_garak::owasp_llm_top10")
print(f"KFP Run ID: {status.metadata['kfp_run_id']}")
```

## Configuration Reference

### Provider Config (`run.yaml`)

```yaml
providers:
  eval:
    - provider_id: trustyai_garak
      config:
        base_url: ${env.BASE_URL:=http://localhost:8321/v1}
        timeout: ${env.GARAK_TIMEOUT:=10800}
        max_concurrent_jobs: ${env.GARAK_MAX_CONCURRENT_JOBS:=5}
        max_workers: ${env.GARAK_MAX_WORKERS:=5}
```

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `BASE_URL` | `http://localhost:8321/v1` | Llama Stack service URL |
| `GARAK_TIMEOUT` | `10800` | Max scan timeout (seconds) |
| `GARAK_MAX_CONCURRENT_JOBS` | `5` | Max concurrent scans |
| `GARAK_MAX_WORKERS` | `5` | Shield scanning parallelism |
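
The `${env.NAME:=default}` placeholders in the provider config take their value from the environment and fall back to the default after `:=`. The sketch below imitates that behaviour to show how the table's defaults apply. It is an illustration, not Llama Stack's actual resolver, and `resolve` is a hypothetical name.

```python
import os
import re
from typing import Mapping, Optional

# Matches ${env.NAME:=default}; the default runs up to the closing brace.
_PLACEHOLDER = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*):=([^}]*)\}")


def resolve(value: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Replace each placeholder with the environment value, or its default if unset."""
    env = os.environ if environ is None else environ
    return _PLACEHOLDER.sub(lambda m: env.get(m.group(1), m.group(2)), value)


default_timeout = resolve("${env.GARAK_TIMEOUT:=10800}", {})
custom_timeout = resolve("${env.GARAK_TIMEOUT:=10800}", {"GARAK_TIMEOUT": "600"})
default_url = resolve("${env.BASE_URL:=http://localhost:8321/v1}", {})
```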

## Deployment Modes

| Mode | Config File | Features |
|------|------------|----------|
| **Basic** | `run.yaml` | Standard scanning |
| **Enhanced** | `run-with-safety.yaml` | Standard scanning + shield integration |
| **Remote** | `run-remote.yaml` | KFP execution |
| **Remote+Safety** | `run-remote-safety.yaml` | KFP execution + shield integration |

## Result Interpretation

- **Score Range**: 0.0 (secure) to 1.0 (vulnerable)
- **Threshold**: Scores ≥ 0.5 indicate security concerns
- **Reports**: Available in JSON, HTML, and log formats via Files API
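
The threshold rule above can be applied mechanically to a set of scores. The sketch assumes the scores have already been extracted into a plain mapping of probe name to score. That shape is hypothetical and is not the provider's result schema.

```python
CONCERN_THRESHOLD = 0.5  # scores at or above this indicate security concerns


def flag_concerns(scores: dict[str, float], threshold: float = CONCERN_THRESHOLD) -> dict[str, float]:
    """Return the entries at or above the threshold, most vulnerable first."""
    for name, score in scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{name}: score {score} is outside the 0.0-1.0 range")
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return {name: score for name, score in ranked if score >= threshold}


# Hypothetical scores for illustration only.
flagged = flag_concerns({"probe_a": 0.2, "probe_b": 0.5, "probe_c": 0.9})
```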

## Examples & Demos

| Notebook | Description |
|----------|-------------|
| [01-getting_started](demos/01-getting_started_with_garak.ipynb) | Basic usage and custom probes |
| [02-scan_with_shields](demos/02-scan_with_shields.ipynb) | Shield integration testing |
| [03-remote_garak](demos/03-remote_garak.ipynb) | KFP remote execution |
