# outcomeForge
A modular framework to forge and verify intelligent outcomes.
## Features
- Clear separation between Common and Custom Nodes
- Truly pluggable Scenario system
- 7 core scenarios: Snapshot, Adaptation, Regression, Architecture Drift, Local RAG, Code Review, Wiki Generation
- LLM-powered analysis (Anthropic Claude, OpenAI)
- Support for parallel/async execution
- Structured output with YAML metadata
- Complete Pass → Fail → Pass verification cycles
- Lightweight RAG with files-to-prompt integration
- Code review with security, quality, and performance checks
- Codebase wiki generation with smart abstraction identification
## Quick Start
### Installation
#### Option 1: pip install (Recommended)
```bash
# Install from source
git clone https://github.com/yourusername/outcomeforge.git
cd outcomeforge
pip install -e .
# Or install directly (once published to PyPI)
pip install outcomeforge
# Set up your API key (REQUIRED - no mock mode)
export ANTHROPIC_API_KEY="your-api-key-here"
```
After installation, you can use the `outcomeforge` command globally:
```bash
outcomeforge --help
outcomeforge snapshot --patterns "**/*.py"
outcomeforge wiki --local-dir ./my-project
```
#### Option 2: Manual installation (Development)
```bash
# Clone the repository
git clone https://github.com/yourusername/outcomeforge.git
cd outcomeforge
# Install dependencies
pip install -r requirements.txt
# Set up your API key
export ANTHROPIC_API_KEY="your-api-key-here"
# Use cli.py directly
python cli.py --help
```
**IMPORTANT**: This system requires a real API key to function. Mock mode has been removed to ensure all AI interactions are genuine. You must configure at least one API key before using any scenario.
### Verify API Configuration
Before running scenarios, verify your API setup:
```bash
# Check API configuration and test connectivity
python check_api_config.py
```
This tool will:
- Check if API keys are set
- Verify required packages are installed
- Test actual API connectivity
- Provide clear error messages if something is wrong
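The first of those checks amounts to inspecting the environment for provider keys. A minimal sketch of that step (the real `check_api_config.py` also verifies packages and tests live connectivity; the function name and return shape here are illustrative):

```python
import os

def check_api_keys(env=os.environ):
    """List which provider API keys are configured.

    A sketch of the first check check_api_config.py performs; the real
    script goes further and tests actual API connectivity.
    """
    known = {
        "ANTHROPIC_API_KEY": "Anthropic Claude",
        "OPENAI_API_KEY": "OpenAI",
    }
    configured = [name for name in known if env.get(name)]
    if not configured:
        raise RuntimeError(
            "No API key set - export ANTHROPIC_API_KEY or OPENAI_API_KEY first"
        )
    return configured
```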
### One-Click Verification
Run Scenarios 1–4 with their complete Pass ↔ Fail ↔ Pass cycles:
```bash
bash examples/run_all_scenarios.sh
```
This will:
- Verify Scenarios 1–4 end-to-end
- Generate structured outputs (JSON + YAML + MD)
### Try Local RAG
Quick start with the lightweight RAG feature:
```bash
# Ask questions about the codebase
python cli.py rag --patterns "**/*.py" --query "How does this project work?" --cxml
# Generate documentation from tests
python cli.py rag --patterns "tests/**/*.py" --query "Generate API documentation" --format markdown
# Locate features
python cli.py rag --query "Where is the snapshot functionality implemented?" --line-numbers
```
### Try Code Review (New!)
Quick start with the code review feature:
```bash
# Review current changes
python cli.py code-review --git-diff
# Review against specific commit
python cli.py code-review --git-ref HEAD~1 --output review.yaml
# Security-only scan
python cli.py code-review --git-diff --security-only
```
## Seven Core Scenarios
### Scenario 1 - Local Snapshot and Rollback (no GitHub dependency)
**Purpose**: Create AI-powered code snapshots and restore files byte-for-byte
#### Features
- File content + hash snapshot (SHA-256)
- LLM-powered code health analysis
- Byte-for-byte restoration with hash verification
- **Complete cycle**: Create → Modify → Snapshot → Rollback → Verify
#### Commands
```bash
# Create snapshot
python cli.py snapshot --patterns "**/*.py" --model "claude-3-haiku-20240307"
# List snapshots
python cli.py snapshot-list
# Restore from snapshot
python cli.py snapshot-restore 20250118_120000
```
#### Demo Script
```bash
bash examples/demo_scenario1_snapshot_rollback.sh
```
**Verification Points**:
- Creates snapshot with file contents and hashes
- LLM generates code health report
- Modifies files and creates second snapshot
- Restores from first snapshot
- Hash verification passes (byte-for-byte match)
**Output Files**:
- `.ai-snapshots/snapshot-{timestamp}.json` - Full snapshot with file contents
- `.ai-snapshots/snapshot-{timestamp}.md` - LLM analysis report
---
### Scenario 2 - Open-source Repository Understanding and Adaptation
**Purpose**: Analyze open-source projects and generate organization-compliant adaptation plans
#### Features
- Clone and analyze real GitHub repositories
- Detect organization standard violations
- Generate executable YAML plans
- 10-point repository understanding
#### Commands
```bash
python cli.py adapt "https://github.com/pallets/click" --model "claude-3-haiku-20240307"
```
**Verification Points**:
- Clones real repository (Click project)
- Generates 10 understanding points
- Detects rule violations
- Creates executable `plan` YAML with steps
**Output Files**:
- `.ai-snapshots/repo_adapt_plan-{timestamp}.md`
---
### Scenario 3 - Regression Detection with Diff Analysis
**Purpose**: AI-powered quality gate decisions based on test metrics
#### Features
- Collects test pass rate, coverage, lint metrics
- LLM evaluates PASS/FAIL with reasoning
- **Complete cycle**: Baseline PASS → Inject Failure → FAIL → Fix → PASS
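The gate decision can be pictured as a comparison of build metrics against the baseline. A toy heuristic version (the actual scenario hands these metrics to an LLM, which returns the PASS/FAIL verdict with its reasoning; the metric names below are illustrative):

```python
def quality_gate(build, baseline):
    """Toy PASS/FAIL gate from test metrics. The real scenario delegates
    this judgment to an LLM and records its reasons in the gate YAML."""
    reasons = []
    if build["pass_rate"] < baseline["pass_rate"]:
        reasons.append("test pass rate regressed")
    if build["coverage"] < baseline["coverage"]:
        reasons.append("coverage dropped")
    return {"gate": "FAIL" if reasons else "PASS", "reasons": reasons}
```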
#### Commands
```bash
python cli.py regression --baseline "HEAD~1" --build "HEAD" --model "claude-3-haiku-20240307"
```
#### Demo Script
```bash
bash examples/demo_scenario3_regression_cycle.sh
```
**Verification Points**:
- Baseline test returns PASS
- Simulates failure injection
- After fix returns PASS
- Generates `gate` YAML with reasons and actions
**Output Files**:
- `.ai-snapshots/regression_gate-{timestamp}.md`
---
### Scenario 4 - Architecture Drift and Impact Scanning
**Purpose**: Detect architecture violations and structural drift
#### Features
- Dependency graph analysis
- Layer violation detection
- Complexity metrics tracking
- **Complete cycle**: Baseline PASS → Inject Drift → FAIL → Fix → PASS
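Layer violation detection boils down to checking the import graph against a layer ranking. A sketch under assumed layer names and ranks (these are illustrative, not the project's actual rules):

```python
# Illustrative layer ranks: higher layers may import lower ones, never the reverse.
LAYERS = {"cli": 3, "scenarios": 2, "nodes": 1, "engine": 0}

def layer_violations(import_edges):
    """Return imports that point 'upward' in the layering, e.g. the
    engine importing from the CLI. Edges are (importer, imported)
    dotted module names."""
    def rank(module):
        return LAYERS.get(module.split(".")[0])
    return [
        (src, dst)
        for src, dst in import_edges
        if rank(src) is not None and rank(dst) is not None and rank(src) < rank(dst)
    ]
```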
#### Commands
```bash
python cli.py arch-drift --model "claude-3-haiku-20240307"
```
#### Demo Script
```bash
bash examples/demo_scenario4_arch_drift_cycle.sh
```
**Verification Points**:
- Baseline returns PASS with score (e.g., 90/100)
- Simulates architecture drift
- After fix returns PASS
- Generates `arch_gate` YAML with score and pass/fail
**Output Files**:
- `.ai-snapshots/arch_gate-{timestamp}.md`
---
### Scenario 5 - Local Lightweight RAG (Files-to-Prompt)
**Purpose**: Lightweight RAG for quick codebase understanding and Q&A
Inspired by [Simon Willison's files-to-prompt](https://github.com/simonw/files-to-prompt), this scenario formats code files into LLM-friendly prompts for rapid codebase analysis.
#### Features
- Quick project onboarding (seconds to understand architecture)
- Auto-generate documentation from tests/source
- Code navigation ("Where is JWT validation?")
- Optimized for long-context models (Claude XML format)
#### Commands
```bash
# Quick overview
python cli.py rag --patterns "**/*.py" --query "How does this project work?" --cxml
# Generate docs from tests
python cli.py rag --patterns "tests/**/*.py" --query "Generate API documentation" --format markdown
# Locate features with line numbers
python cli.py rag --query "Where is snapshot implemented?" --line-numbers
# Code review
python cli.py rag --patterns "nodes/**/*.py" --query "Review code quality"
```
#### Demo Script
```bash
bash examples/demo_scenario5_local_rag.sh
```
#### Use Cases
1. **Project Onboarding**: "What's the architecture? What are the core modules?"
2. **Documentation Generation**: Extract API docs from test cases
3. **Feature Location**: "Where is JWT validation implemented?"
4. **Code Review**: Quality analysis with specific focus areas
#### Format Options
- `--format xml`: Standard XML output
- `--format xml --cxml`: Compact XML (recommended for long context)
- `--format markdown`: Markdown code blocks
- `--line-numbers`: Include line numbers for precise location
**Output**: Direct LLM response to terminal (no files saved)
**Documentation**: See `docs/scenario_5_rag_usage.md` for detailed usage
---
### Scenario 6 - Code Review Pipeline (Security, Quality, Performance)
**Purpose**: Comprehensive code review for security, quality, and performance
Integrates CodeReviewAgent capabilities with modular review nodes for detecting issues across multiple categories.
#### Features
- **Security Review**: SQL injection, hardcoded secrets, command injection, path traversal, unsafe deserialization, weak cryptography
- **Quality Review**: Magic numbers, deep nesting, long functions, print statements, broad exceptions, missing docstrings
- **Performance Review**: Nested loops (O(n²)), string concatenation in loops, N+1 queries, loading all data
- **Flexible Execution**: Run all checks or individual categories (security-only mode)
- **Multiple Output Formats**: YAML, JSON, Markdown
- **Security Gate**: Auto-fail on critical issues
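Because the pipeline reviews diffs, checks only need to look at added lines. A toy version of a pattern-based security pass (the patterns below are illustrative; the real review node's rule set is richer and the finding schema may differ):

```python
import re

# Illustrative rules only - the security review node covers far more cases.
SECURITY_PATTERNS = {
    "hardcoded_secret": re.compile(
        r"(?i)(api[_-]?key|password|secret|token)\s*=\s*['\"][^'\"]+['\"]"
    ),
    "sql_string_building": re.compile(r"execute\([^)]*(%s|\+)"),
}

def scan_diff(diff_text):
    """Scan only the added ('+') lines of a unified diff, mirroring how a
    diff-based review pipeline scopes its findings to new code."""
    findings = []
    for line in diff_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue  # skip context, removals, and file headers
        for rule, pattern in SECURITY_PATTERNS.items():
            if pattern.search(line):
                findings.append({"rule": rule, "line": line[1:].strip()})
    return findings
```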
#### Commands
```bash
# Full review of working directory changes
python cli.py code-review --git-diff
# Review specific commit
python cli.py code-review --git-ref HEAD~1 --output review.yaml
# Review a diff file
python cli.py code-review --diff changes.patch --format markdown
# Security-only scan
python cli.py code-review --git-diff --security-only
```
#### Demo Script
```bash
bash examples/demo_scenario6_code_review.sh
```
---
### Scenario 7 - Codebase Wiki Generation
**Purpose**: Generate structured wiki documentation from any codebase (GitHub or local)
Automatically analyzes your codebase, identifies core abstractions, understands their relationships, and generates beginner-friendly wiki documentation with intelligent chapter ordering.
#### Features
- **Smart Abstraction Identification**: Uses LLM to identify 5-15 core concepts in your codebase
- **Relationship Analysis**: Understands how abstractions relate to each other
- **Intelligent Chapter Ordering**: Orders tutorial chapters based on dependencies for optimal learning
- **Batch Processing**: Automatically handles large repositories with token limit management
- **Multi-language Support**: Generate wikis in English or Chinese
- **LLM Caching**: Resume from checkpoints when processing large repos
- **Dual Output Modes**: Single TUTORIAL.md file or multi-file format (index + chapters)
- **Source Flexibility**: Works with GitHub repos or local directories
#### Commands
```bash
# Generate wiki from GitHub repository
python cli.py wiki --repo https://github.com/The-Pocket/PocketFlow-Rust
# Generate wiki from local directory
python cli.py wiki --local-dir ./my-project
# Chinese language output
python cli.py wiki --local-dir . --language chinese
# Custom abstraction limit
python cli.py wiki --repo https://github.com/pallets/flask --max-abstractions 15
# Multi-file output (index.md + chapter files)
python cli.py wiki --local-dir . --multi-file --output ./docs
# With custom file patterns
python cli.py wiki --repo https://github.com/example/repo \
--include-pattern "*.rs" --include-pattern "*.md" \
--exclude-pattern "**/tests/*"
```
#### Workflow
1. **Fetch Files**: Crawl GitHub repo or local directory
2. **Identify Abstractions**: LLM analyzes code structure and extracts core concepts
3. **Analyze Relationships**: Understand dependencies between abstractions
4. **Order Chapters**: Smart ordering based on learning progression
5. **Write Chapters**: Generate detailed chapter for each abstraction
6. **Combine Wiki**: Merge into final TUTORIAL.md with table of contents and Mermaid diagrams
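Step 4 amounts to a dependency-respecting ordering of the abstractions. A plain topological sort captures the idea (the real flow asks the LLM to choose the order, so ties can be broken pedagogically rather than arbitrarily):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def order_chapters(abstractions, depends_on):
    """Order chapters so prerequisite concepts come first.

    depends_on maps an abstraction to the set of abstractions it builds on.
    """
    graph = {a: depends_on.get(a, set()) for a in abstractions}
    return list(TopologicalSorter(graph).static_order())
```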
#### Output Structure
**Single-file mode** (default):
```
output/ProjectName/
└── TUTORIAL.md # Complete wiki with all chapters
```
**Multi-file mode** (`--multi-file`):
```
output/ProjectName/
├── index.md # Main index with Mermaid diagram
├── 01_concept_name.md # Chapter 1
├── 02_another_concept.md # Chapter 2
└── ... # More chapters
```
#### Example Output
The generated wiki includes:
- **Project Overview**: Summary of what the project does
- **Mermaid Diagram**: Visual representation of abstraction relationships
- **Chapter List**: Intelligently ordered learning path
- **Detailed Chapters**: Beginner-friendly explanations with code examples
- **Cross-references**: Links between related concepts
#### Advanced Options
- `--model`: Choose LLM model (default: claude-3-haiku-20240307)
- `--max-abstractions`: Limit number of concepts (default: 10)
- `--output`: Custom output directory (default: ./output)
- `--include-pattern`: File patterns to include (can specify multiple)
- `--exclude-pattern`: File patterns to exclude (can specify multiple)
#### Demo Script
```bash
bash examples/demo_scenario7_wiki_generation.sh
```
**Verification Points**:
- Fetches and analyzes repository files
- Identifies core abstractions (concepts, classes, modules)
- Generates relationship graph
- Creates ordered learning path
- Produces comprehensive TUTORIAL.md
**Example Repositories Tested**:
- [PocketFlow-Rust](https://github.com/The-Pocket/PocketFlow-Rust) - Rust flow framework (32 files, 5 abstractions)
- Custom Python projects
- Multi-language repositories
**Documentation**: See `SCENARIO_7_INTEGRATION.md` and `SCENARIO_7_SUCCESS_SUMMARY.md` for implementation details
---
## CLI Commands
### Snapshot Commands
```bash
# Create snapshot
python cli.py snapshot --patterns "**/*.py" --model "claude-3-haiku-20240307"
# List all snapshots
python cli.py snapshot-list
# Restore from snapshot (with hash verification)
python cli.py snapshot-restore <snapshot-id>
```
### Analysis Commands
```bash
# Repository adaptation
python cli.py adapt "https://github.com/pallets/click" --model "claude-3-haiku-20240307"
# Regression detection
python cli.py regression --baseline "HEAD~1" --build "HEAD" --model "claude-3-haiku-20240307"
# Architecture drift
python cli.py arch-drift --model "claude-3-haiku-20240307"
```
### RAG Commands
```bash
# Quick project overview
python cli.py rag --patterns "**/*.py" --query "How does this project work?" --cxml
# Generate documentation from tests
python cli.py rag --patterns "tests/**/*.py" --query "Generate API documentation" --format markdown
# Locate feature implementation
python cli.py rag --query "Where is snapshot functionality implemented?" --line-numbers
# Multiple patterns
python cli.py rag --patterns "**/*.py" --patterns "**/*.md" --query "Summarize the project"
# Code review
python cli.py rag --patterns "nodes/**/*.py" --query "Review error handling and code reuse"
```
### Code Review Commands (New!)
```bash
# Review current working directory changes
python cli.py code-review --git-diff
# Review against specific commit
python cli.py code-review --git-ref HEAD~1 --output review.yaml
# Review a diff/patch file
python cli.py code-review --diff changes.patch --format markdown
# Security-only scan (fast)
python cli.py code-review --git-diff --security-only
# Full review with JSON output
python cli.py code-review --git-ref main --format json --output report.json
```
### Wiki Generation Commands (New!)
```bash
# Generate wiki from GitHub repository
python cli.py wiki --repo https://github.com/The-Pocket/PocketFlow-Rust
# Generate wiki from local directory
python cli.py wiki --local-dir ./my-project
# Chinese language wiki
python cli.py wiki --local-dir . --language chinese --output ./docs_cn
# Multi-file output mode
python cli.py wiki --local-dir . --multi-file --output ./wiki
# Custom abstraction limit and patterns
python cli.py wiki --repo https://github.com/example/project \
--max-abstractions 15 \
--include-pattern "*.py" --include-pattern "*.md" \
--exclude-pattern "**/tests/*"
```
## How It Works
### Flow/Node Architecture
Each scenario is a **Flow** composed of **Nodes**:
```python
def create_local_snapshot_scenario(config):
    f = flow()
    f.add(get_files_node(), name="get_files")
    f.add(parse_code_node(), name="parse_code")
    f.add(snapshot_files_node(), name="snapshot_files")  # Saves file contents + hashes
    f.add(call_llm_node(), name="llm_snapshot", params={
        "prompt_file": "prompts/snapshot.prompt.md",
        "model": config.get("model", "gpt-4"),
    })
    f.add(save_snapshot_node(), name="save_snapshot")  # Saves JSON + MD
    return f
```
### Three-Phase Node Execution
Each node has three phases:
1. **prep**: Prepare parameters from context
2. **exec**: Execute the operation
3. **post**: Update context and determine next state
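A minimal sketch of that contract (class and method shapes are simplified; the engine's actual base class may differ):

```python
class Node:
    """Three-phase node contract: prep -> exec -> post (illustrative)."""

    def prep(self, ctx):
        """Phase 1: pull this node's inputs out of the shared context."""
        return {}

    def exec(self, inputs):
        """Phase 2: do the actual work."""
        raise NotImplementedError

    def post(self, ctx, result):
        """Phase 3: write results back and name the next state."""
        ctx.update(result)
        return "default"

class UpperCaseNode(Node):
    """Trivial example node: upper-cases a string from the context."""

    def prep(self, ctx):
        return {"text": ctx["text"]}

    def exec(self, inputs):
        return {"text_upper": inputs["text"].upper()}

def run_node(node, ctx):
    """Drive one node through all three phases; returns the next state."""
    return node.post(ctx, node.exec(node.prep(ctx)))
```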
### Example: Local RAG Flow
```python
from engine import flow
from nodes.common import get_files_node, files_to_prompt_node, call_llm_node
def create_rag_scenario(config):
    f = flow()

    # Step 1: Get files matching patterns
    f.add(get_files_node(), name="get_files", params={
        "patterns": config.get("patterns", ["**/*.py"])
    })

    # Step 2: Format files for LLM (files-to-prompt style)
    f.add(files_to_prompt_node(), name="format", params={
        "format": "xml",
        "cxml": True,  # Compact XML for long context
        "include_line_numbers": False
    })

    # Step 3: Query LLM
    f.add(call_llm_node(), name="query", params={
        "prompt_file": "prompts/rag_query.prompt.md",
        "model": "claude-3-haiku-20240307"
    })

    return f

# Run the flow
result = create_rag_scenario(config).run({
    "project_root": ".",
    "query": "How does this work?"
})
print(result.get("llm_response"))
```
## Requirements
- Python 3.7+
- anthropic (for Claude API)
- openai (for OpenAI API)
- click (for CLI)
- pyyaml (for config parsing)
- gitpython (for git operations)
## Contributing
Contributions welcome! Areas for improvement:
- Additional language support beyond Python
- LLM-powered semantic review for Scenario 6
- CI/CD pipeline integration (GitHub Actions, GitLab CI)
- Custom reporting formats and templates
- More review patterns and rules
- Integration with issue trackers
## License
MIT
Raw data
{
"_id": null,
"home_page": "https://github.com/yourusername/outcomeforge",
"name": "outcomeforge",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "code-analysis, llm, ai, devops, quality-assurance",
"author": "outcomeForge Team",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/7d/b4/cdf2393032d2a62185398eef31317df7817952fd270d4263ea71a03854cb/outcomeforge-0.1.2.tar.gz",
"platform": null,
"description": "# outcomeForge\n\nA modular framework to forge and verify intelligent outcomes.\n\n## Features\n- Clear separation between Common and Custom Nodes\n- Truly pluggable Scenario system\n- 7 core scenarios: Snapshot, Adaptation, Regression, Architecture Drift, Local RAG, Code Review, Wiki Generation\n- LLM-powered analysis (Anthropic Claude, OpenAI)\n- Support for parallel/async execution\n- Structured output with YAML metadata\n- Complete Pass to Fail to Pass verification cycles\n- Lightweight RAG with files-to-prompt integration\n- Code review with security, quality, and performance checks\n- Codebase wiki generation with smart abstraction identification\n\n## Quick Start\n\n### Installation\n\n#### Option 1: pip install (Recommended)\n\n```bash\n# Install from source\ngit clone https://github.com/yourusername/outcomeforge.git\ncd outcomeforge\npip install -e .\n\n# Or install directly (once published to PyPI)\npip install outcomeforge\n\n# Set up your API key (REQUIRED - no mock mode)\nexport ANTHROPIC_API_KEY=\"your-api-key-here\"\n```\n\nAfter installation, you can use the `outcomeforge` command globally:\n```bash\noutcomeforge --help\noutcomeforge snapshot --patterns \"**/*.py\"\noutcomeforge wiki --local-dir ./my-project\n```\n\n#### Option 2: Manual installation (Development)\n\n```bash\n# Clone the repository\ngit clone https://github.com/yourusername/outcomeforge.git\ncd outcomeforge\n\n# Install dependencies\npip install -r requirements.txt\n\n# Set up your API key\nexport ANTHROPIC_API_KEY=\"your-api-key-here\"\n\n# Use cli.py directly\npython cli.py --help\n```\n\n**IMPORTANT**: This system requires a real API key to function. Mock mode has been removed to ensure all AI interactions are genuine. 
You must configure at least one API key before using any scenario.\n\n### Verify API Configuration\n\nBefore running scenarios, verify your API setup:\n\n```bash\n# Check API configuration and test connectivity\npython check_api_config.py\n```\n\nThis tool will:\n- Check if API keys are set\n- Verify required packages are installed\n- Test actual API connectivity\n- Provide clear error messages if something is wrong\n\n### One-Click Verification\n\nRun all four scenarios with complete Pass \u2194 Fail \u2194 Pass cycles:\n\n```bash\nbash examples/run_all_scenarios.sh\n```\n\nThis will:\n- Verify all four scenarios end-to-end\n- Generate structured outputs (JSON + YAML + MD)\n\n### Try Local RAG\n\nQuick start with the lightweight RAG feature:\n\n```bash\n# Ask questions about the codebase\npython cli.py rag --patterns \"**/*.py\" --query \"How does this project work?\" --cxml\n\n# Generate documentation from tests\npython cli.py rag --patterns \"tests/**/*.py\" --query \"Generate API documentation\" --format markdown\n\n# Locate features\npython cli.py rag --query \"Where is the snapshot functionality implemented?\" --line-numbers\n```\n\n### Try Code Review (New!)\n\nQuick start with the new code review feature:\n\n```bash\n# Review current changes\npython cli.py code-review --git-diff\n\n# Review against specific commit\npython cli.py code-review --git-ref HEAD~1 --output review.yaml\n\n# Security-only scan\npython cli.py code-review --git-diff --security-only\n```\n\n## Seven Core Scenarios\n\n### Scenario 1 - Local Snapshot and Rollback (no GitHub dependency)\n\n**Purpose**: Create AI-powered code snapshots and restore files byte-for-byte\n\n#### Features\n- File content + hash snapshot (SHA-256)\n- LLM-powered code health analysis\n- Byte-for-byte restoration with hash verification\n- **Complete cycle**: Create \u2192 Modify \u2192 Snapshot \u2192 Rollback \u2192 Verify\n\n#### Commands\n```bash\n# Create snapshot\npython cli.py snapshot --patterns \"**/*.py\" 
--model \"claude-3-haiku-20240307\"\n\n# List snapshots\npython cli.py snapshot-list\n\n# Restore from snapshot\npython cli.py snapshot-restore 20250118_120000\n```\n\n#### Demo Script\n```bash\nbash examples/demo_scenario1_snapshot_rollback.sh\n```\n\n**Verification Points**:\n- Creates snapshot with file contents and hashes\n- LLM generates code health report\n- Modifies files and creates second snapshot\n- Restores from first snapshot\n- Hash verification passes (byte-for-byte match)\n\n**Output Files**:\n- `.ai-snapshots/snapshot-{timestamp}.json` - Full snapshot with file contents\n- `.ai-snapshots/snapshot-{timestamp}.md` - LLM analysis report\n\n---\n\n### Scenario 2 - Open-source Repository Understanding and Adaptation\n\n**Purpose**: Analyze open-source projects and generate organization-compliant adaptation plans\n\n#### Features\n- Clone and analyze real GitHub repositories\n- Detect organization standard violations\n- Generate executable YAML plans\n- 10-point repository understanding\n\n#### Commands\n```bash\npython cli.py adapt \"https://github.com/pallets/click\" --model \"claude-3-haiku-20240307\"\n```\n\n**Verification Points**:\n- Clones real repository (Click project)\n- Generates 10 understanding points\n- Detects rule violations\n- Creates executable `plan` YAML with steps\n\n**Output Files**:\n- `.ai-snapshots/repo_adapt_plan-{timestamp}.md`\n\n---\n\n### Scenario 3 - Regression Detection with Diff Analysis\n\n**Purpose**: AI-powered quality gate decisions based on test metrics\n\n#### Features\n- Collects test pass rate, coverage, lint metrics\n- LLM evaluates PASS/FAIL with reasoning\n- **Complete cycle**: Baseline PASS \u2192 Inject Failure \u2192 FAIL \u2192 Fix \u2192 PASS\n\n#### Commands\n```bash\npython cli.py regression --baseline \"HEAD~1\" --build \"HEAD\" --model \"claude-3-haiku-20240307\"\n```\n\n#### Demo Script\n```bash\nbash examples/demo_scenario3_regression_cycle.sh\n```\n\n**Verification Points**:\n- Baseline test returns 
PASS\n- Simulates failure injection\n- After fix returns PASS\n- Generates `gate` YAML with reasons and actions\n\n**Output Files**:\n- `.ai-snapshots/regression_gate-{timestamp}.md`\n\n---\n\n### Scenario 4 - Architecture Drift and Impact Scanning\n\n**Purpose**: Detect architecture violations and structural drift\n\n#### Features\n- Dependency graph analysis\n- Layer violation detection\n- Complexity metrics tracking\n- **Complete cycle**: Baseline PASS \u2192 Inject Drift \u2192 FAIL \u2192 Fix \u2192 PASS\n\n#### Commands\n```bash\npython cli.py arch-drift --model \"claude-3-haiku-20240307\"\n```\n\n#### Demo Script\n```bash\nbash examples/demo_scenario4_arch_drift_cycle.sh\n```\n\n**Verification Points**:\n- Baseline returns PASS with score (e.g., 90/100)\n- Simulates architecture drift\n- After fix returns PASS\n- Generates `arch_gate` YAML with score and pass/fail\n\n**Output Files**:\n- `.ai-snapshots/arch_gate-{timestamp}.md`\n\n---\n\n### Scenario 5 - Local Lightweight RAG (Files-to-Prompt)\n\n**Purpose**: Lightweight RAG for quick codebase understanding and Q&A\n\nInspired by [Simon Willison's files-to-prompt](https://github.com/simonw/files-to-prompt), this scenario formats code files into LLM-friendly prompts for rapid codebase analysis.\n\n#### Features\n- Quick project onboarding (seconds to understand architecture)\n- Auto-generate documentation from tests/source\n- Code navigation (\"Where is JWT validation?\")\n- Optimized for long-context models (Claude XML format)\n\n#### Commands\n```bash\n# Quick overview\npython cli.py rag --patterns \"**/*.py\" --query \"How does this project work?\" --cxml\n\n# Generate docs from tests\npython cli.py rag --patterns \"tests/**/*.py\" --query \"Generate API documentation\" --format markdown\n\n# Locate features with line numbers\npython cli.py rag --query \"Where is snapshot implemented?\" --line-numbers\n\n# Code review\npython cli.py rag --patterns \"nodes/**/*.py\" --query \"Review code 
quality\"\n```\n\n#### Demo Script\n```bash\nbash examples/demo_scenario5_local_rag.sh\n```\n\n#### Use Cases\n1. **Project Onboarding**: \"What's the architecture? What are the core modules?\"\n2. **Documentation Generation**: Extract API docs from test cases\n3. **Feature Location**: \"Where is JWT validation implemented?\"\n4. **Code Review**: Quality analysis with specific focus areas\n\n#### Format Options\n- `--format xml`: Standard XML output\n- `--format xml --cxml`: Compact XML (recommended for long context)\n- `--format markdown`: Markdown code blocks\n- `--line-numbers`: Include line numbers for precise location\n\n**Output**: Direct LLM response to terminal (no files saved)\n\n**Documentation**: See `docs/scenario_5_rag_usage.md` for detailed usage\n\n---\n\n### Scenario 6 - Code Review Pipeline (Security, Quality, Performance)\n\n**Purpose**: Comprehensive code review for security, quality, and performance\n\nIntegrates CodeReviewAgent capabilities with modular review nodes for detecting issues across multiple categories.\n\n#### Features\n- **Security Review**: SQL injection, hardcoded secrets, command injection, path traversal, unsafe deserialization, weak cryptography\n- **Quality Review**: Magic numbers, deep nesting, long functions, print statements, broad exceptions, missing docstrings\n- **Performance Review**: Nested loops (O(n\u00b2)), string concatenation in loops, N+1 queries, loading all data\n- **Flexible Execution**: Run all checks or individual categories (security-only mode)\n- **Multiple Output Formats**: YAML, JSON, Markdown\n- **Security Gate**: Auto-fail on critical issues\n\n#### Commands\n```bash\n# Full review of working directory changes\npython cli.py code-review --git-diff\n\n# Review specific commit\npython cli.py code-review --git-ref HEAD~1 --output review.yaml\n\n# Review a diff file\npython cli.py code-review --diff changes.patch --format markdown\n\n# Security-only scan\npython cli.py code-review --git-diff 
--security-only\n```\n\n#### Demo Script\n```bash\nbash examples/demo_scenario6_code_review.sh\n```\n\n---\n\n### Scenario 7 - Codebase Wiki Generation\n\n**Purpose**: Generate structured wiki documentation from any codebase (GitHub or local)\n\nAutomatically analyze your codebase, identify core abstractions, understand their relationships, and generate beginner-friendly wiki documentation with intelligent chapter ordering.\n\n#### Features\n- **Smart Abstraction Identification**: Uses LLM to identify 5-15 core concepts in your codebase\n- **Relationship Analysis**: Understands how abstractions relate to each other\n- **Intelligent Chapter Ordering**: Orders tutorial chapters based on dependencies for optimal learning\n- **Batch Processing**: Automatically handles large repositories with token limit management\n- **Multi-language Support**: Generate wikis in English or Chinese\n- **LLM Caching**: Resume from checkpoints when processing large repos\n- **Dual Output Modes**: Single TUTORIAL.md file or multi-file format (index + chapters)\n- **Source Flexibility**: Works with GitHub repos or local directories\n\n#### Commands\n```bash\n# Generate wiki from GitHub repository\npython cli.py wiki --repo https://github.com/The-Pocket/PocketFlow-Rust\n\n# Generate wiki from local directory\npython cli.py wiki --local-dir ./my-project\n\n# Chinese language output\npython cli.py wiki --local-dir . --language chinese\n\n# Custom abstraction limit\npython cli.py wiki --repo https://github.com/pallets/flask --max-abstractions 15\n\n# Multi-file output (index.md + chapter files)\npython cli.py wiki --local-dir . --multi-file --output ./docs\n\n# With custom file patterns\npython cli.py wiki --repo https://github.com/example/repo \\\n --include-pattern \"*.rs\" --include-pattern \"*.md\" \\\n --exclude-pattern \"**/tests/*\"\n```\n\n#### Workflow\n1. **Fetch Files**: Crawl GitHub repo or local directory\n2. 
2. **Identify Abstractions**: LLM analyzes code structure and extracts core concepts
3. **Analyze Relationships**: Understand dependencies between abstractions
4. **Order Chapters**: Smart ordering based on learning progression
5. **Write Chapters**: Generate a detailed chapter for each abstraction
6. **Combine Wiki**: Merge into a final TUTORIAL.md with a table of contents and Mermaid diagrams

#### Output Structure

**Single-file mode** (default):
```
output/ProjectName/
└── TUTORIAL.md             # Complete wiki with all chapters
```

**Multi-file mode** (`--multi-file`):
```
output/ProjectName/
├── index.md                # Main index with Mermaid diagram
├── 01_concept_name.md      # Chapter 1
├── 02_another_concept.md   # Chapter 2
└── ...                     # More chapters
```

#### Example Output

The generated wiki includes:
- **Project Overview**: Summary of what the project does
- **Mermaid Diagram**: Visual representation of abstraction relationships
- **Chapter List**: Intelligently ordered learning path
- **Detailed Chapters**: Beginner-friendly explanations with code examples
- **Cross-references**: Links between related concepts

#### Advanced Options
- `--model`: Choose the LLM model (default: claude-3-haiku-20240307)
- `--max-abstractions`: Limit the number of concepts (default: 10)
- `--output`: Custom output directory (default: ./output)
- `--include-pattern`: File patterns to include (can be specified multiple times)
- `--exclude-pattern`: File patterns to exclude (can be specified multiple times)

#### Demo Script
```bash
bash examples/demo_scenario7_wiki_generation.sh
```

**Verification Points**:
- Fetches and analyzes repository files
- Identifies core abstractions (concepts, classes, modules)
- Generates a relationship graph
- Creates an ordered learning path
- Produces a comprehensive TUTORIAL.md

**Example Repositories Tested**:
- [PocketFlow-Rust](https://github.com/The-Pocket/PocketFlow-Rust) - Rust flow framework (32 files, 5 abstractions)
- Custom Python projects
- Multi-language repositories

**Documentation**: See `SCENARIO_7_INTEGRATION.md` and `SCENARIO_7_SUCCESS_SUMMARY.md` for implementation details

---

## CLI Commands

### Snapshot Commands
```bash
# Create a snapshot
python cli.py snapshot --patterns "**/*.py" --model "claude-3-haiku-20240307"

# List all snapshots
python cli.py snapshot-list

# Restore from a snapshot (with hash verification)
python cli.py snapshot-restore <snapshot-id>
```

### Analysis Commands
```bash
# Repository adaptation
python cli.py adapt "https://github.com/pallets/click" --model "claude-3-haiku-20240307"

# Regression detection
python cli.py regression --baseline "HEAD~1" --build "HEAD" --model "claude-3-haiku-20240307"

# Architecture drift
python cli.py arch-drift --model "claude-3-haiku-20240307"
```

### RAG Commands
```bash
# Quick project overview
python cli.py rag --patterns "**/*.py" --query "How does this project work?" --cxml

# Generate documentation from tests
python cli.py rag --patterns "tests/**/*.py" --query "Generate API documentation" --format markdown

# Locate a feature implementation
python cli.py rag --query "Where is snapshot functionality implemented?" --line-numbers

# Multiple patterns
python cli.py rag --patterns "**/*.py" --patterns "**/*.md" --query "Summarize the project"

# Code review
python cli.py rag --patterns "nodes/**/*.py" --query "Review error handling and code reuse"
```

### Code Review Commands (New!)
```bash
# Review current working directory changes
python cli.py code-review --git-diff

# Review against a specific commit
python cli.py code-review --git-ref HEAD~1 --output review.yaml

# Review a diff/patch file
python cli.py code-review --diff changes.patch --format markdown

# Security-only scan (fast)
python cli.py code-review --git-diff --security-only

# Full review with JSON output
python cli.py code-review --git-ref main --format json --output report.json
```

### Wiki Generation Commands (New!)
```bash
# Generate a wiki from a GitHub repository
python cli.py wiki --repo https://github.com/The-Pocket/PocketFlow-Rust

# Generate a wiki from a local directory
python cli.py wiki --local-dir ./my-project

# Chinese-language wiki
python cli.py wiki --local-dir . --language chinese --output ./docs_cn

# Multi-file output mode
python cli.py wiki --local-dir . --multi-file --output ./wiki

# Custom abstraction limit and patterns
python cli.py wiki --repo https://github.com/example/project \
    --max-abstractions 15 \
    --include-pattern "*.py" --include-pattern "*.md" \
    --exclude-pattern "**/tests/*"
```

## How It Works

### Flow/Node Architecture
Each scenario is a **Flow** composed of **Nodes**:
```python
def create_local_snapshot_scenario(config):
    f = flow()
    f.add(get_files_node(), name="get_files")
    f.add(parse_code_node(), name="parse_code")
    f.add(snapshot_files_node(), name="snapshot_files")  # Saves file contents + hashes
    f.add(call_llm_node(), name="llm_snapshot", params={
        "prompt_file": "prompts/snapshot.prompt.md",
        "model": config.get("model", "gpt-4"),
    })
    f.add(save_snapshot_node(), name="save_snapshot")  # Saves JSON + MD
    return f
```

### Three-Phase Node Execution
Each node has three phases:
1. **prep**: Prepare parameters from the context
2. **exec**: Execute the operation
3. **post**: Update the context and determine the next state

### Example: Local RAG Flow
```python
from engine import flow
from nodes.common import get_files_node, files_to_prompt_node, call_llm_node

def create_rag_scenario(config):
    f = flow()

    # Step 1: Get files matching patterns
    f.add(get_files_node(), name="get_files", params={
        "patterns": config.get("patterns", ["**/*.py"])
    })

    # Step 2: Format files for the LLM (files-to-prompt style)
    f.add(files_to_prompt_node(), name="format", params={
        "format": "xml",
        "cxml": True,  # Compact XML for long contexts
        "include_line_numbers": False
    })

    # Step 3: Query the LLM
    f.add(call_llm_node(), name="query", params={
        "prompt_file": "prompts/rag_query.prompt.md",
        "model": "claude-3-haiku-20240307"
    })

    return f

# Run the flow
result = create_rag_scenario(config).run({
    "project_root": ".",
    "query": "How does this work?"
})
print(result.get("llm_response"))
```

## Requirements

- Python 3.7+
- anthropic (for the Claude API)
- openai (for the OpenAI API)
- click (for the CLI)
- pyyaml (for config parsing)
- gitpython (for git operations)

## Contributing

Contributions are welcome! Areas for improvement:
- Additional language support beyond Python
- LLM-powered semantic review for Scenario 6
- CI/CD pipeline integration (GitHub Actions, GitLab CI)
- Custom reporting formats and templates
- More review patterns and rules
- Integration with issue trackers

## License

MIT
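To make the prep/exec/post lifecycle from "How It Works" concrete, here is a minimal, self-contained sketch of how a three-phase node and a flow might compose. This is an illustration only, not the actual `engine` implementation; the `Node`, `Flow`, and `UpperCaseNode` names are hypothetical.

```python
class Node:
    """Minimal three-phase node: prep -> exec -> post (illustrative sketch)."""

    def __init__(self, name):
        self.name = name

    def prep(self, ctx):
        # Phase 1: pull parameters/input out of the shared context
        return {"input": ctx.get("data")}

    def exec(self, prepared):
        # Phase 2: do the actual work (override in subclasses)
        raise NotImplementedError

    def post(self, ctx, result):
        # Phase 3: write the result back and return the next state
        ctx[self.name] = result
        return "default"


class UpperCaseNode(Node):
    def exec(self, prepared):
        return str(prepared["input"]).upper()


class Flow:
    """Runs nodes in order, threading a shared context dict through all phases."""

    def __init__(self):
        self.nodes = []

    def add(self, node):
        self.nodes.append(node)
        return self

    def run(self, ctx):
        for node in self.nodes:
            result = node.exec(node.prep(ctx))
            node.post(ctx, result)
        return ctx


ctx = Flow().add(UpperCaseNode("shout")).run({"data": "hello"})
print(ctx["shout"])  # HELLO
```

Separating the phases keeps I/O and context handling (prep/post) apart from the core operation (exec), which is what makes nodes reusable across scenarios.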
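The `files_to_prompt_node` used in the RAG flow follows the files-to-prompt convention of wrapping each file in an indexed XML document before sending it to the LLM. The following standalone sketch shows the general shape of that formatting; the function name and exact tags here are illustrative, not the node's real internals.

```python
import os
import tempfile
from pathlib import Path


def format_files_as_xml(paths):
    """Wrap each file's contents in an indexed <document> block (illustrative)."""
    parts = ["<documents>"]
    for i, path in enumerate(paths, 1):
        text = Path(path).read_text(encoding="utf-8")
        parts.append(f'<document index="{i}">')
        parts.append(f"<source>{path}</source>")
        parts.append(f"<document_contents>\n{text}\n</document_contents>")
        parts.append("</document>")
    parts.append("</documents>")
    return "\n".join(parts)


# Demo with a throwaway file
with tempfile.TemporaryDirectory() as d:
    demo = os.path.join(d, "hello.py")
    with open(demo, "w", encoding="utf-8") as fh:
        fh.write("print('hi')")
    prompt = format_files_as_xml([demo])

print(prompt.splitlines()[0])  # <documents>
```

Indexed XML wrappers like this give the model unambiguous file boundaries, which is why the RAG commands above expose `--cxml` for long-context queries.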
"bugtrack_url": null,
"license": null,
"summary": "A modular framework to forge and verify intelligent outcomes",
"version": "0.1.2",
"project_urls": {
"Documentation": "https://github.com/yourusername/outcomeforge#readme",
"Homepage": "https://github.com/yourusername/outcomeforge",
"Issues": "https://github.com/yourusername/outcomeforge/issues",
"Repository": "https://github.com/yourusername/outcomeforge"
},
"split_keywords": [
"code-analysis",
" llm",
" ai",
" devops",
" quality-assurance"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "015bcc2ce44e967009d3cbe627c30b77f7a4e5d98bc3a3ad4be94ea2d6de3d33",
"md5": "0465e67c121663e3e27476532942a618",
"sha256": "3dee362cabf1a46e754158a909681e91f3b13aec8edae6a3e48117e6d26576c1"
},
"downloads": -1,
"filename": "outcomeforge-0.1.2-py3-none-any.whl",
"has_sig": false,
"md5_digest": "0465e67c121663e3e27476532942a618",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 99520,
"upload_time": "2025-10-19T08:31:55",
"upload_time_iso_8601": "2025-10-19T08:31:55.742331Z",
"url": "https://files.pythonhosted.org/packages/01/5b/cc2ce44e967009d3cbe627c30b77f7a4e5d98bc3a3ad4be94ea2d6de3d33/outcomeforge-0.1.2-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "7db4cdf2393032d2a62185398eef31317df7817952fd270d4263ea71a03854cb",
"md5": "929cd763552ea77e4ff9fd9c10e6c889",
"sha256": "761c15701c997fe4ba22dbb30ea4fcd2d9d494e0adbb3ef5fec673fb0026e9fe"
},
"downloads": -1,
"filename": "outcomeforge-0.1.2.tar.gz",
"has_sig": false,
"md5_digest": "929cd763552ea77e4ff9fd9c10e6c889",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 115424,
"upload_time": "2025-10-19T08:31:57",
"upload_time_iso_8601": "2025-10-19T08:31:57.227123Z",
"url": "https://files.pythonhosted.org/packages/7d/b4/cdf2393032d2a62185398eef31317df7817952fd270d4263ea71a03854cb/outcomeforge-0.1.2.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-19 08:31:57",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "yourusername",
"github_project": "outcomeforge",
"github_not_found": true,
"lcname": "outcomeforge"
}