judge-llm

Name: judge-llm
Version: 1.0.4
Summary: A lightweight LLM evaluation framework for comparing and testing AI providers
Upload time: 2025-10-27 03:21:05
Requires Python: >=3.9
License: CC-BY-NC-SA-4.0
Keywords: llm, evaluation, testing, ai, gemini, openai, claude

<div align="center">
  <img src="assets/icon.png" alt="Judge LLM" width="200"/>

  # JUDGE LLM

  A lightweight, extensible Python framework for **evaluating and comparing LLM providers**. Test your AI agents systematically with multi-turn conversations, cost tracking, and comprehensive reporting.

  [Quick Start](#quick-start) • [Demo](#demo) • [Features](#features) • [Examples](#testing-examples) • [Reports](#reports--dashboard)
</div>

<div align="center">
  <img src="assets/judge-llm.gif" alt="Judge LLM Demo" width="100%"/>
</div>

## Purpose

JUDGE LLM helps you **evaluate AI agents and LLM providers** by running test cases against your models and measuring:
- **Response quality** (exact matching, semantic similarity, ROUGE scores)
- **Cost & latency** (token usage, execution time, budget compliance)
- **Conversation flow** (tool uses, multi-turn interactions)
- **Safety & custom metrics** (extensible evaluation logic)

Perfect for regression testing, A/B testing providers, and ensuring production-grade quality.

## Features

- **Multiple Providers**: Gemini, Mock, and custom providers with registry-based extensibility
- **Built-in Evaluators**: Response similarity, trajectory validation, cost/latency checks
- **Custom Components**: Create and register custom providers, evaluators, and reporters
- **Registry System**: Register once in defaults, use everywhere by name
- **Rich Reports**: Console tables, interactive HTML dashboard, JSON exports, SQLite database, plus custom reporters
- **Parallel Execution**: Run evaluations concurrently with configurable workers
- **Quality Gates**: Fail CI/CD builds when thresholds are violated (configurable)
- **Config-Driven**: YAML configs with smart defaults or programmatic Python API
- **Default Config**: Reusable configurations with component registration
- **Per-Test Overrides**: Fine-tune evaluator thresholds per test case
- **Environment Variables**: Auto-loads `.env` for secure API key management

## Installation

### From Source

```bash
git clone https://github.com/HiHelloAI/judge-llm.git
cd judge-llm
pip install -e .
```

### From PyPI

```bash
pip install judge-llm
```

### With Optional Dependencies

```bash
# Install with Gemini provider support
pip install judge-llm[gemini]

# Install with dev dependencies
pip install judge-llm[dev]
```

### Setup Environment Variables

JUDGE LLM automatically loads environment variables from a `.env` file:

```bash
# Copy the example file
cp .env.example .env

# Edit .env and add your API keys
nano .env
```

**`.env` file:**
```bash
# Google Gemini API Key
GOOGLE_API_KEY=your-google-api-key-here
```

The `.env` file is automatically loaded when you import the library or run the CLI. **Never commit `.env` to version control** - it's already in `.gitignore`.
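Because the key is read from the process environment, a quick sanity check before a long run can save time. A minimal sketch, relying only on the documented `GOOGLE_API_KEY` variable and the load-on-import behavior described above:

```python
import os

# Importing the package triggers the automatic .env loading described above.
import judge_llm  # noqa: F401

if not os.environ.get("GOOGLE_API_KEY"):
    raise SystemExit("GOOGLE_API_KEY is not set - check your .env file before using the Gemini provider")
```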

## Quick Start

### CLI Usage

```bash
# Run evaluation from config file
judge-llm run --config config.yaml

# Run with inline arguments (supports .json, .yaml, or .yml)
judge-llm run --dataset ./data/eval.yaml --provider mock --agent-id my_agent --report html --output report.html

# Validate configuration
judge-llm validate --config config.yaml

# List available components
judge-llm list providers
judge-llm list evaluators
judge-llm list reporters

# Generate dashboard from database
judge-llm dashboard --db results.db --output dashboard.html
```

### Python API

```python
from judge_llm import evaluate

# From config file
report = evaluate(config="config.yaml")

# Programmatic API (supports .json, .yaml, or .yml datasets)
report = evaluate(
    dataset={"loader": "local_file", "paths": ["./data/eval.yaml"]},
    providers=[{"type": "mock", "agent_id": "my_agent"}],
    evaluators=[{"type": "response_evaluator", "config": {"similarity_threshold": 0.8}}],
    reporters=[{"type": "console"}, {"type": "html", "output_path": "./report.html"}]
)

print(f"Success: {report.success_rate:.1%} | Cost: ${report.total_cost:.4f}")
```
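The quality gates described in the configuration section below are the built-in way to fail a run, but if you orchestrate everything from Python you can also gate on the report yourself. A minimal sketch using only the fields shown above (`success_rate` and `total_cost`); the 90% and $1.00 thresholds are arbitrary placeholders:

```python
import sys

from judge_llm import evaluate

report = evaluate(config="config.yaml")

# Placeholder thresholds for this sketch -- tune them to your own quality bar and budget.
if report.success_rate < 0.90 or report.total_cost > 1.00:
    print(f"Gate failed: {report.success_rate:.1%} success, ${report.total_cost:.4f} spent")
    sys.exit(1)
```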

## Configuration

**Minimal config.yaml:**
```yaml
dataset:
  loader: local_file
  paths: [./data/eval.json]  # Supports .json, .yaml, or .yml files

providers:
  - type: gemini
    agent_id: my_agent
    model: gemini-2.0-flash-exp

evaluators:
  - type: response_evaluator
    config: {similarity_threshold: 0.8}

reporters:
  - type: console
  - type: html
    output_path: ./report.html
```

**Advanced config with quality gates:**
```yaml
agent:
  fail_on_threshold_violation: true  # Exit with error if evaluations fail (default: true)
  parallel_execution: true            # Run tests in parallel
  max_workers: 4                      # Number of parallel workers
  num_runs: 3                         # Run each test 3 times

dataset:
  loader: local_file
  paths: [./data/eval.yaml]

providers:
  - type: gemini
    agent_id: production_agent
    model: gemini-2.0-flash-exp

evaluators:
  - type: response_evaluator
    config:
      similarity_threshold: 0.85  # Minimum 85% similarity required
  - type: cost_evaluator
    config:
      max_cost_per_case: 0.05      # Maximum $0.05 per test

reporters:
  - type: database
    db_path: ./results.db  # Track results over time
```

**Use in CI/CD:**
```bash
# Fails with exit code 1 if any evaluator thresholds are violated
judge-llm run --config ci-config.yaml

# Or disable failures for monitoring
# Set fail_on_threshold_violation: false in config
```
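If your pipeline is driven from Python rather than shell, the same gate works by propagating the CLI's exit code; a minimal wrapper sketch:

```python
import subprocess
import sys

# judge-llm exits non-zero when an evaluator threshold is violated (see above),
# so the return code can drive any surrounding automation.
proc = subprocess.run(["judge-llm", "run", "--config", "ci-config.yaml"])
sys.exit(proc.returncode)
```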

**Dataset File Formats:**

JUDGE LLM supports both JSON and YAML formats for evaluation datasets. Use whichever format you prefer:

```yaml
# Using JSON dataset
dataset:
  loader: local_file
  paths: [./data/eval.json]

# Using YAML dataset
dataset:
  loader: local_file
  paths: [./data/eval.yaml]

# Using multiple datasets (mixed formats)
dataset:
  loader: local_file
  paths:
    - ./data/eval1.json
    - ./data/eval2.yaml

# Using directory loader with pattern
dataset:
  loader: directory
  paths: [./data]
  pattern: "*.yaml"  # or "*.json" or "*.yml"
```
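The individual case schema is not shown in this README; the bundled datasets have the real format. As a purely hypothetical illustration, with `id`, `prompt`, and `expected_response` as assumed field names, you could generate a small JSON dataset and point the `local_file` loader above at it:

```python
import json

# Hypothetical case fields -- the real schema lives in the examples/ datasets.
cases = [
    {"id": "greeting", "prompt": "Say hello to the user.", "expected_response": "Hello! How can I help?"},
    {"id": "math", "prompt": "What is 2 + 2?", "expected_response": "4"},
]

with open("data/eval.json", "w") as f:
    json.dump(cases, f, indent=2)
```

The generated `./data/eval.json` can then be listed under `paths:` exactly as in the snippets above.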

See the [examples/](examples/) directory for complete configuration examples including default configs, custom evaluators, and advanced features.

## Custom Component Registration

JUDGE LLM supports registering custom providers, evaluators, and reporters for reuse across projects.

### Method 1: Register in Default Config

Create `.judge_llm.defaults.yaml` in your project root:

```yaml
# Register custom components once
providers:
  - type: custom
    module_path: ./my_providers/anthropic.py
    class_name: AnthropicProvider
    register_as: anthropic  # ← Use this name in test configs

evaluators:
  - type: custom
    module_path: ./my_evaluators/safety.py
    class_name: SafetyEvaluator
    register_as: safety

reporters:
  - type: custom
    module_path: ./my_reporters/slack.py
    class_name: SlackReporter
    register_as: slack
```

Then use them by name in any test config:

```yaml
# test.yaml - clean and simple!
providers:
  - type: anthropic  # ← Uses registered custom provider
    agent_id: claude

evaluators:
  - type: safety  # ← Uses registered custom evaluator

reporters:
  - type: slack  # ← Uses registered custom reporter
    config: {webhook_url: ${SLACK_WEBHOOK}}
```

### Method 2: Programmatic Registration

```python
from judge_llm import evaluate, register_provider, register_evaluator, register_reporter
from my_components import CustomProvider, SafetyEvaluator, SlackReporter

# Register components
register_provider("my_provider", CustomProvider)
register_evaluator("safety", SafetyEvaluator)
register_reporter("slack", SlackReporter)

# Use by name
report = evaluate(
    dataset={"loader": "local_file", "paths": ["./tests.json"]},
    providers=[{"type": "my_provider", "agent_id": "test"}],
    evaluators=[{"type": "safety"}],
    reporters=[{"type": "slack", "config": {"webhook_url": "..."}}]
)
```
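The snippet assumes `my_components` already defines these classes. The README names `BaseEvaluator` but does not document its interface, so the sketch below is hypothetical: the import path, the `evaluate` method name, its arguments, and the returned dict are all assumptions meant only to show where custom logic would live (see `examples/03-custom-evaluator` for the real contract).

```python
# my_components.py -- hypothetical sketch; the BaseEvaluator contract shown here is assumed,
# not taken from the judge-llm documentation.
from judge_llm import BaseEvaluator  # assumption: the actual import path may differ

BANNED_TERMS = {"ssn", "credit card number"}


class SafetyEvaluator(BaseEvaluator):
    def evaluate(self, case, result):  # assumed signature
        text = str(result.get("response", "")).lower()
        violations = [term for term in BANNED_TERMS if term in text]
        return {
            "passed": not violations,
            "score": 0.0 if violations else 1.0,
            "details": {"violations": violations},
        }
```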

**Benefits:**
- ✅ **DRY** - Register once, use everywhere
- ✅ **Team Standardization** - Share defaults across team
- ✅ **Clean Configs** - Test configs reference components by name
- ✅ **Easy Updates** - Change implementation in one place

See [examples/default_config_reporters/](examples/default_config_reporters/) for complete examples.

## Testing Examples

Explore **8 complete examples** in the `examples/` directory:

| Example | Description |
|---------|-------------|
| **01-gemini-agent** | Real Gemini API evaluation with response & trajectory checks |
| **02-default-config** | Reusable config patterns with `.judge_llm.defaults.yaml` |
| **03-custom-evaluator** | Build custom evaluators (sentiment analysis example) |
| **04-safety-long-conversation** | Multi-turn safety evaluation (PII, toxicity, hate speech) |
| **05-evaluator-config-override** | Per-test-case threshold overrides |
| **06-database-reporter** | SQLite persistence for historical tracking & trend analysis |
| **custom_reporter_example** | Create custom reporters (CSV, programmatic registration) |
| **default_config_reporters** | Register all custom components in defaults (providers, evaluators, reporters) |

Each example includes config files, datasets, and instructions. Run any example:

```bash
cd examples/01-gemini-agent
judge-llm run --config config.yaml
```

## Built-in Components

### Providers
- **Gemini** - Google's Gemini models (requires `GOOGLE_API_KEY` in `.env`)
- **Mock** - Built-in test provider, no setup required
- **Custom** - Extend `BaseProvider` for your own LLM providers (OpenAI, Anthropic, etc.)

### Evaluators
- **ResponseEvaluator** - Compare responses (exact, semantic similarity, ROUGE)
- **TrajectoryEvaluator** - Validate tool uses and conversation flow
- **CostEvaluator** - Enforce cost thresholds
- **LatencyEvaluator** - Enforce latency thresholds
- **Custom** - Extend `BaseEvaluator` for custom logic (safety, compliance, etc.)

### Reporters
- **ConsoleReporter** - Rich terminal output with colored tables
- **HTMLReporter** - Interactive HTML dashboard
- **JSONReporter** - Machine-readable JSON export
- **DatabaseReporter** - SQLite database for historical tracking
- **Custom** - Extend `BaseReporter` for custom formats (CSV, Slack, Datadog, etc.)

## Reports & Dashboard

### HTML Dashboard
Interactive web interface with:
- **Sidebar**: Summary metrics + execution list with color-coded status
- **Main Panel**: Execution details, evaluator scores, conversation history
- **Features**: Dark mode, responsive, self-contained (works offline)

### Console Output
Rich formatted tables with live execution progress

### JSON Export
Machine-readable results for programmatic analysis

### SQLite Database
Persistent storage for:
- Historical trend tracking
- Regression detection
- Cost analysis over time
- SQL-based queries

```bash
# Generate dashboard from database
judge-llm dashboard --db results.db --output dashboard.html
```
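For ad-hoc analysis you can also open the database directly. The table layout produced by `DatabaseReporter` is not documented in this README, so the sketch below only discovers what is there before you write real queries:

```python
import sqlite3

con = sqlite3.connect("results.db")

# List the tables DatabaseReporter created, then adapt your SQL to that schema.
tables = [row[0] for row in con.execute("SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)

con.close()
```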

## Development

```bash
# Setup
pip install -e ".[dev]"

# Run tests
pytest

# Format code
black judge_llm && ruff check judge_llm
```

Contributions welcome! Fork, create a feature branch, add tests, and submit a PR.

## License

Licensed under **CC BY-NC-SA 4.0** - Free for non-commercial use with attribution. See [LICENSE](LICENSE) for details.

For commercial licensing, contact the maintainers.

            
