# TrustyAI Garak (`trustyai_garak`): Out-of-Tree Llama Stack Eval Provider for Garak Red Teaming
## About
This repository implements [Garak](https://github.com/NVIDIA/garak) as an out-of-tree Llama Stack provider for **security testing and red teaming** of Large Language Models, with optional **Shield Integration** for testing guardrailed systems.
## Features
- **Security Vulnerability Detection**: Automated testing for prompt injection, jailbreaks, toxicity, and bias
- **Compliance Framework Support**: Pre-built benchmarks for established standards ([OWASP LLM Top 10](https://genai.owasp.org/llm-top-10/), [AVID taxonomy](https://docs.avidml.org/taxonomy/effect-sep-view))
- **Shield Integration**: Test LLMs with and without Llama Stack shields for comparative security analysis
- **Concurrency Control**: Configurable limits for concurrent scans and shield operations
- **Custom Probe Support**: Run specific garak security probes
- **Enhanced Reporting**: Multiple garak output formats including HTML reports and detailed logs
## Quick Start
### Prerequisites
- Python 3.12+
- Access to an OpenAI-compatible model endpoint
### Installation
```bash
# Clone the repository
git clone https://github.com/trustyai-explainability/llama-stack-provider-trustyai-garak.git
cd llama-stack-provider-trustyai-garak
# Create & activate venv
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -e .
```
### Configuration
Set up your environment variables:
```bash
export VLLM_URL="http://your-model-endpoint/v1"
export INFERENCE_MODEL="your-model-name"
# Optional: Configure scan behavior
export GARAK_TIMEOUT="10800" # 3 hours default
export GARAK_MAX_CONCURRENT_JOBS="5" # Max concurrent scans
export GARAK_MAX_WORKERS="5" # Max workers for shield scanning
```
### Run Security Scans
#### Basic Mode (Standard Garak Scanning)
```bash
# Start the Llama Stack server
llama stack run run.yaml --image-type venv
# The server will be available at http://localhost:8321
```
#### Enhanced Mode (With Shield Integration)
```bash
# Start with safety and shield capabilities
llama stack run run-with-safety.yaml --image-type venv
# Includes safety, shields, and telemetry APIs
```
## Demos
Interactive examples are available in the `demos/` directory:
- **[Getting Started](demos/01-getting_started_with_garak.ipynb)**: Basic usage with predefined scan profiles and user-defined garak probes
- **[Scan Guardrailed System](demos/02-scan_with_shields.ipynb)**: Llama Stack shield integration for scanning a guardrailed LLM system
- **[concurrency_limit_test.ipynb](demos/concurrency_limit_test.ipynb)**: Testing concurrent scan limits
## Compliance Frameworks
Pre-registered compliance framework benchmarks are available as soon as the server starts:
### Compliance Standards
| Framework | Benchmark ID | Description | Duration |
|-----------|--------------|--------------| --------|
| **[OWASP LLM Top 10](https://genai.owasp.org/llm-top-10/)** | `owasp_llm_top10` | OWASP Top 10 for Large Language Model Applications | ~8 hours |
| **[AVID Security](https://docs.avidml.org/taxonomy/effect-sep-view/security)** | `avid_security` | AI Vulnerability Database - Security vulnerabilities | ~8 hours |
| **[AVID Ethics](https://docs.avidml.org/taxonomy/effect-sep-view/ethics)** | `avid_ethics` | AI Vulnerability Database - Ethical concerns | ~30 minutes |
| **[AVID Performance](https://docs.avidml.org/taxonomy/effect-sep-view/performance)** | `avid_performance` | AI Vulnerability Database - Performance issues | ~40 minutes |
### Scan Profiles for Testing
| Profile | Benchmark ID | Duration | Probes |
|---------|--------------|----------|---------|
| **Quick** | `quick` | ~5 minutes | Essential security checks (3 specific probes) |
| **Standard** | `standard` | ~1 hour | Standard attack vectors (5 probe categories) |
_Note: All duration estimates above were measured with a Qwen2.5 7B model deployed via vLLM on OpenShift._
## Usage Examples
### Discover Available Benchmarks
```python
from llama_stack_client import LlamaStackClient
client = LlamaStackClient(base_url="http://localhost:8321")
# List all available benchmarks (auto-registered)
benchmarks = client.benchmarks.list()
for benchmark in benchmarks.data:
    print(f"- {benchmark.identifier}: {benchmark.metadata.get('name', 'No name')}")
```
### Compliance Framework Testing
```python
# Run OWASP LLM Top 10 security assessment
job = client.eval.run_eval(
    benchmark_id="owasp_llm_top10",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "qwen2",  # change this to your inference model name
            "sampling_params": {
                "max_tokens": 100
            },
        }
    },
)
# Run AVID Security assessment
job = client.eval.run_eval(
    benchmark_id="avid_security",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "qwen2",  # change this to your inference model name
            "sampling_params": {
                "max_tokens": 100
            },
        }
    },
)
```
### Built-in Scan Profiles for Testing
```python
# Quick security scan (5 min)
job = client.eval.run_eval(
    benchmark_id="quick",
    benchmark_config={
        "eval_candidate": {
            "type": "model",
            "model": "qwen2",  # change this to your inference model name
            "sampling_params": {
                "max_tokens": 100
            },
        }
    },
)
```
### Custom Garak Probes
```python
# Register custom probes
client.benchmarks.register(
    benchmark_id="custom",
    dataset_id="garak",  # placeholder
    scoring_functions=["garak_scoring"],  # placeholder
    provider_benchmark_id="custom",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["latentinjection.LatentJailbreak", "snowball.GraphConnectivity"],
        "timeout": 900  # 15 minutes
    }
)
```
### Shield Integration (Enhanced Mode)
```python
# Test with input shields only
client.benchmarks.register(
    benchmark_id="PI_with_input_shield",
    dataset_id="garak",  # placeholder
    scoring_functions=["garak_scoring"],  # placeholder
    provider_benchmark_id="PI_with_input_shield",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["promptinject.HijackHateHumans"],
        "timeout": 600,
        "shield_ids": ["Prompt-Guard-86M"]  # Applied to input only
    }
)
# Test with separate input/output shields
client.benchmarks.register(
    benchmark_id="PI_with_io_shields",
    dataset_id="garak",  # placeholder
    scoring_functions=["garak_scoring"],  # placeholder
    provider_benchmark_id="PI_with_io_shields",
    provider_id="trustyai_garak",
    metadata={
        "probes": ["promptinject.HijackHateHumans"],
        "timeout": 600,
        "shield_config": {
            "input": ["Prompt-Guard-86M"],
            "output": ["Llama-Guard-3-8B"]
        }
    }
)
```
### Job Management
```python
# Check job status
job_status = client.eval.jobs.status(job_id=job.job_id, benchmark_id="quick")
print(f"Job status: {job_status.status}")
print(f"Running jobs: {job_status.metadata.get('running_jobs', 'N/A')}")
# Cancel a running job
client.eval.jobs.cancel(job_id=job.job_id, benchmark_id="quick")
# Get evaluation results
if job_status.status == "completed":
    results = client.eval.get_eval_job_result(job_id=job.job_id, benchmark_id="quick")
```
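Because scans can run for minutes to hours, the status call is usually wrapped in a polling loop. The sketch below is one way to do that. `wait_for_job` is a hypothetical helper, and the terminal status names other than `"completed"` (`"failed"`, `"cancelled"`) are assumptions; adjust them to the values your server actually reports.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_wait_seconds: float = 10800.0,  # matches the default GARAK_TIMEOUT
    terminal: frozenset[str] = frozenset({"completed", "failed", "cancelled"}),
) -> str:
    """Call get_status() repeatedly until it returns a terminal status."""
    deadline = time.monotonic() + max_wait_seconds
    while True:
        status = get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {max_wait_seconds} seconds")
        time.sleep(poll_seconds)
```

With the client from the examples above, this could be called as `wait_for_job(lambda: client.eval.jobs.status(job_id=job.job_id, benchmark_id="quick").status)`.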
### Accessing Scan Reports
```python
# Get file metadata
scan_report_id = job_status.metadata["scan_report_file_id"]
scan_log_id = job_status.metadata["scan_log_file_id"]
scan_html_id = job_status.metadata["scan_report_html_file_id"]
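# Download files using the Files API or direct HTTP calls
import requests
files_url = "http://localhost:8321/v1/openai/v1/files"
response = requests.get(f"{files_url}/{scan_report_id}/content")
response.raise_for_status()  # fail loudly on HTTP errors
report_content = response.content  # raw bytes of the report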
```
## Configuration Options
### Provider Configuration (`run.yaml` / `run-with-safety.yaml`)
```yaml
providers:
  eval:
    - provider_id: trustyai_garak
      provider_type: inline::trustyai_garak
      config:
        base_url: ${env.BASE_URL:=http://localhost:8321/v1} # llama-stack service base url
        timeout: ${env.GARAK_TIMEOUT:=10800} # 3 hours default
        max_concurrent_jobs: ${env.GARAK_MAX_CONCURRENT_JOBS:=5} # Concurrent scan limit
        max_workers: ${env.GARAK_MAX_WORKERS:=5} # Shield scanning parallelism
        garak_model_type_openai: "openai.OpenAICompatible" # For direct LLM scanning
        garak_model_type_function: "function.Single" # For shield-integrated scanning
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `BASE_URL` | `http://localhost:8321/v1` | Llama Stack service base URL |
| `GARAK_TIMEOUT` | `10800` | Maximum timeout for garak scan (seconds) |
| `GARAK_MAX_CONCURRENT_JOBS` | `5` | Maximum concurrent garak scans |
| `GARAK_MAX_WORKERS` | `5` | Maximum workers for shield scanning |
## Results Interpretation
- **Vulnerability Score**: 0.0 (secure) to 1.0 (vulnerable)
- **Threshold**: Scores ≥ 0.5 indicate potential security concerns
- **Shield Comparison**: Compare scores with and without shields to measure effectiveness
- **Detailed Reports**: Available via Llama Stack `files` API in multiple formats:
  - JSON report (`scan_report_file_id`)
  - HTML report (`scan_report_html_file_id`)
  - Detailed logs (`scan_log_file_id`)
  - Hit logs (`scan_hitlog_file_id`)
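
The threshold and shield comparison can be expressed in a few lines. The sketch below assumes scores have already been extracted into plain dictionaries that map a probe name to a vulnerability score; that shape, the helper names, and the sample numbers are illustrative assumptions, not the provider's actual result schema or real scan output.

```python
THRESHOLD = 0.5  # scores at or above this indicate potential security concerns


def flag_concerns(scores: dict[str, float], threshold: float = THRESHOLD) -> list[str]:
    """Return the probes whose score meets or exceeds the threshold, sorted by name."""
    return sorted(name for name, score in scores.items() if score >= threshold)


def shield_effect(baseline: dict[str, float], shielded: dict[str, float]) -> dict[str, float]:
    """Return the score reduction per probe; a positive value means the shield helped."""
    shared = baseline.keys() & shielded.keys()
    return {name: baseline[name] - shielded[name] for name in sorted(shared)}


# Made-up scores for illustration only.
without_shield = {"promptinject.HijackHateHumans": 0.75, "snowball.GraphConnectivity": 0.25}
with_shield = {"promptinject.HijackHateHumans": 0.25, "snowball.GraphConnectivity": 0.25}

print(flag_concerns(without_shield))
print(shield_effect(without_shield, with_shield))
```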
## Deployment Modes
### Basic Mode (`run.yaml`)
- Standard garak scanning against OpenAI-compatible endpoints
- APIs: `inference`, `eval`, `files`
- Best for: Basic security testing
### Enhanced Mode (`run-with-safety.yaml`)
- Shield-integrated scanning to test guardrailed systems
- APIs: `inference`, `eval`, `files`, `safety`, `shields`, `telemetry`
- Best for: Advanced security testing with defense evaluation