# TrustyAI Garak (`trustyai_garak`): Out-of-Tree Llama Stack Eval Provider for Garak Red Teaming
## About
This repository implements [Garak](https://github.com/NVIDIA/garak) as an out-of-tree Llama Stack eval provider for **security testing and red teaming** of Large Language Models, with optional **shield integration** for testing guardrailed deployments. A getting-started tutorial is available [here](https://trustyai.org/docs/main/red-teaming-introduction).
## What It Does
- **Automated Security Testing**: Detects prompt injection, jailbreaks, toxicity, and bias vulnerabilities
- **Compliance Scanning**: OWASP LLM Top 10, AVID taxonomy benchmarks
- **Shield Testing**: Compare LLM security with/without guardrails
- **Scalable Deployment**: Local or Kubernetes/Kubeflow execution
- **Comprehensive Reporting**: JSON, HTML, and detailed logs with vulnerability scores (0.0-1.0)
## Installation
```bash
git clone https://github.com/trustyai-explainability/llama-stack-provider-trustyai-garak.git
cd llama-stack-provider-trustyai-garak
python3 -m venv .venv && source .venv/bin/activate
pip install -e .
# For remote execution: pip install -e ".[remote]"
```
## Quick Start
### 1. Configure Environment
```bash
export VLLM_URL="http://your-model-endpoint/v1"
export INFERENCE_MODEL="your-model-name"
```
### 2. Start Server
```bash
# Basic mode (standard scanning)
llama stack run run.yaml --image-type venv
# Enhanced mode (with shield testing)
llama stack run run-with-safety.yaml --image-type venv
# Remote mode (Kubernetes/KFP)
llama stack run run-remote.yaml --image-type venv
```
Server runs at `http://localhost:8321`
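Before running scans, it can be useful to confirm the server is reachable. The sketch below is a minimal liveness probe; the `/v1/version` endpoint path is an assumption, so adjust it to whatever health endpoint your deployment exposes.

```python
import urllib.request
import urllib.error

def server_reachable(base_url: str, timeout: float = 3) -> bool:
    """Return True if the Llama Stack server answers HTTP at base_url.

    Uses a version endpoint as a lightweight liveness probe
    (the endpoint path is an assumption; adjust to your deployment).
    """
    try:
        with urllib.request.urlopen(f"{base_url}/v1/version", timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False
```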
### 3. Run Security Scan
```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")

# Quick 5-minute scan
job = client.eval.run_eval(
    benchmark_id="trustyai_garak::quick",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "your-model-name",
            "sampling_params": {"max_tokens": 100}
        }
    }
)

# Check status
status = client.eval.jobs.status(job_id=job.job_id, benchmark_id="trustyai_garak::quick")
print(f"Status: {status.status}")

# Get results when complete
if status.status == "completed":
    results = client.eval.get_eval_job_result(job_id=job.job_id, benchmark_id="trustyai_garak::quick")
```
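Since scans run from minutes to hours, a small polling helper saves repeated manual status checks. This is an illustrative sketch, not part of the provider API: `fetch_status` is any zero-argument callable you wire to the status call shown above (e.g. `lambda: client.eval.jobs.status(...).status`), and the terminal state names are assumptions.

```python
import time

def wait_for_job(fetch_status, timeout: float = 600, poll_interval: float = 5) -> str:
    """Poll fetch_status() until it reports a terminal state or timeout.

    fetch_status: zero-arg callable returning a status string
    (hypothetical terminal states: completed / failed / cancelled).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = fetch_status()
        if state in ("completed", "failed", "cancelled"):
            return state
        time.sleep(poll_interval)
    raise TimeoutError(f"job still running after {timeout}s")
```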
## Available Benchmarks
### Compliance Frameworks
| Benchmark ID | Framework | Duration |
|-------------|-----------|----------|
| `trustyai_garak::owasp_llm_top10` | [OWASP LLM Top 10](https://genai.owasp.org/llm-top-10/) | ~8 hours |
| `trustyai_garak::avid_security` | [AVID Security](https://docs.avidml.org/taxonomy/effect-sep-view/security) | ~8 hours |
| `trustyai_garak::avid_ethics` | [AVID Ethics](https://docs.avidml.org/taxonomy/effect-sep-view/ethics) | ~30 minutes |
| `trustyai_garak::avid_performance` | [AVID Performance](https://docs.avidml.org/taxonomy/effect-sep-view/performance) | ~40 minutes |
### Test Profiles
| Benchmark ID | Description | Duration |
|-------------|-------------|----------|
| `trustyai_garak::quick` | Essential security checks (3 probes) | ~5 minutes |
| `trustyai_garak::standard` | Standard attack vectors (5 categories) | ~1 hour |
_Duration estimates based on Qwen2.5 7B via vLLM_
## Advanced Usage
### Other Garak Probes
```python
client.benchmarks.register(
    benchmark_id="custom",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_benchmark_id="custom",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["latentinjection.LatentJailbreak", "snowball.GraphConnectivity"],
        "timeout": 900
    }
)
```
### Shield Testing
```python
# Test with input shield
client.benchmarks.register(
    benchmark_id="with_shield",
    dataset_id="garak",
    scoring_functions=["garak_scoring"],
    provider_benchmark_id="with_shield",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["promptinject.HijackHateHumans"],
        "shield_ids": ["Prompt-Guard-86M"]  # Input shield only
    }
)

# Test with input/output shields
metadata={
    "probes": ["promptinject.HijackHateHumans"],
    "shield_config": {
        "input": ["Prompt-Guard-86M"],
        "output": ["Llama-Guard-3-8B"]
    }
}
```
### Accessing Reports
```python
# Get report file IDs from job status
scan_report_id = status.metadata["scan.report.jsonl"]
scan_html_id = status.metadata["scan.report.html"]
# Download via Files API
content = client.files.content(scan_report_id)
# Or via HTTP
import requests
report = requests.get(f"http://localhost:8321/v1/openai/v1/files/{scan_html_id}/content")
```
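The downloaded JSONL report can also be summarized offline. The sketch below assumes each report line is a JSON object carrying an `entry_type` field, as in garak's report format; exact field names may vary across garak versions.

```python
import json
from collections import Counter

def summarize_report(jsonl_lines) -> dict:
    """Count report entries by their 'entry_type' field.

    jsonl_lines: any iterable of JSON-encoded lines (e.g. an open file).
    Assumes each non-empty line is an object with an 'entry_type' key,
    as in garak's scan.report.jsonl format (an assumption to verify).
    """
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        counts[entry.get("entry_type", "unknown")] += 1
    return dict(counts)
```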
## Remote Execution (Kubernetes/KFP)
### Setup
```bash
# KFP Configuration
export KUBEFLOW_PIPELINES_ENDPOINT="https://your-kfp-endpoint"
export KUBEFLOW_NAMESPACE="your-namespace"
export KUBEFLOW_EXPERIMENT_NAME="trustyai-garak-scans"
export KUBEFLOW_BASE_IMAGE="quay.io/rh-ee-spandraj/trustyai-garak-provider-dsp:cpu" # for gpu - "quay.io/rh-ee-spandraj/trustyai-garak-provider-dsp:gpu"
# S3 Configuration (for artifacts)
export AWS_ACCESS_KEY_ID="your-key"
export AWS_SECRET_ACCESS_KEY="your-secret"
export AWS_S3_ENDPOINT="https://your-s3-endpoint"
export AWS_S3_BUCKET="pipeline-artifacts"
export AWS_DEFAULT_REGION="us-east-1"
# Start server
llama stack run run-remote.yaml --image-type venv
```
_Note: If you're running the Llama Stack server locally, make sure `BASE_URL` in run-remote*.yaml is reachable from the KFP pods (you can use [ngrok](https://ngrok.com/) to create a publicly accessible endpoint for your local Llama Stack service)._
### Usage
```python
# Same API, runs as KFP pipeline
job = client.eval.run_eval(benchmark_id="trustyai_garak::owasp_llm_top10", ...)
# Monitor pipeline
status = client.eval.jobs.status(job_id=job.job_id, benchmark_id="trustyai_garak::owasp_llm_top10")
print(f"KFP Run ID: {status.metadata['kfp_run_id']}")
```
## Configuration Reference
### Provider Config (`run.yaml`)
```yaml
providers:
  eval:
    - provider_id: trustyai_garak
      config:
        base_url: ${env.BASE_URL:=http://localhost:8321/v1}
        timeout: ${env.GARAK_TIMEOUT:=10800}
        max_concurrent_jobs: ${env.GARAK_MAX_CONCURRENT_JOBS:=5}
        max_workers: ${env.GARAK_MAX_WORKERS:=5}
```
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `BASE_URL` | `http://localhost:8321/v1` | Llama Stack service URL |
| `GARAK_TIMEOUT` | `10800` | Max scan timeout (seconds) |
| `GARAK_MAX_CONCURRENT_JOBS` | `5` | Max concurrent scans |
| `GARAK_MAX_WORKERS` | `5` | Shield scanning parallelism |
## Deployment Modes
| Mode | Config File | Features |
|------|------------|----------|
| **Basic** | `run.yaml` | Standard scanning |
| **Enhanced** | `run-with-safety.yaml` | + Shield integration |
| **Remote** | `run-remote.yaml` | KFP execution |
| **Remote+Safety** | `run-remote-safety.yaml` | + Shield integration |
## Result Interpretation
- **Score Range**: 0.0 (secure) to 1.0 (vulnerable)
- **Threshold**: Scores ≥ 0.5 indicate security concerns
- **Reports**: Available in JSON, HTML, and log formats via Files API
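Applying the 0.5 threshold to a set of probe results can be sketched as follows. The `scores` mapping of probe name to score is a hypothetical result layout used for illustration, not the provider's actual result schema.

```python
def flag_vulnerabilities(scores: dict, threshold: float = 0.5) -> list:
    """Return probe names whose vulnerability score meets the threshold.

    scores maps probe name -> score in [0.0, 1.0], where 0.0 is secure
    and 1.0 is fully vulnerable (hypothetical shape for illustration).
    """
    return sorted(name for name, score in scores.items() if score >= threshold)
```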
## Examples & Demos
| Notebook | Description |
|----------|-------------|
| [01-getting_started](demos/01-getting_started_with_garak.ipynb) | Basic usage and custom probes |
| [02-scan_with_shields](demos/02-scan_with_shields.ipynb) | Shield integration testing |
| [03-remote_garak](demos/03-remote_garak.ipynb) | KFP remote execution |