pantsonfire

- **Name**: pantsonfire
- **Version**: 0.3.7
- **Home page**: https://github.com/seanmcdonald/pantsonfire
- **Summary**: Find wrong information in technical docs online
- **Upload time**: 2025-10-29 15:51:48
- **Author**: Sean McDonald
- **Requires Python**: >=3.8
- **License**: None recorded
- **Keywords**: documentation, verification, fact-checking, technical-docs, ai, llm, web-scraping
- **Requirements**: No requirements were recorded.

# pantsonfire 🔥

Find wrong information in technical documentation online. A tool for detecting outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing against official documentation.

## ✨ Key Features

- **🧠 Natural Language Analysis**: Use simple English commands like "find outdated API info on tech blogs"
- **🕷️ Intelligent Web Crawling**: Automatically discover similar issues across entire websites
- **📚 Oxen AI Integration**: Versioned, traceable storage with complete audit trails
- **🔍 Multi-Level Detection**: Pattern matching + AI-powered analysis for comprehensive coverage
- **🌐 Universal Sources**: Websites, GitHub repos, documentation sites, local files
- **📊 Rich Reporting**: Browser-integrated reports with JSON/CSV export
- **🚀 Dual Analysis Modes**: Basic pattern matching or full LLM analysis via OpenRouter
- **🔗 Automatic Report Opening**: Direct links to versioned analysis results
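
This README does not specify how the basic pattern-matching mode works. As a rough illustration only, one simple form of cross-referencing is to flag phrases that appear in a blog post but no longer appear in the official docs. The function name `find_stale_phrases` and the sample texts below are hypothetical and are not part of pantsonfire's API.

```python
import re
from typing import List


def find_stale_phrases(blog_text: str, docs_text: str, phrases: List[str]) -> List[str]:
    """Return phrases that occur in the blog text but not in the official docs.

    Hypothetical helper for illustration; not taken from pantsonfire itself.
    """
    def contains(text: str, phrase: str) -> bool:
        # Case-insensitive literal match; re.escape keeps punctuation literal.
        return re.search(re.escape(phrase), text, flags=re.IGNORECASE) is not None

    return [p for p in phrases if contains(blog_text, p) and not contains(docs_text, p)]


# Made-up sample texts, loosely modelled on the "Get Early Access" example.
blog = "Click Get Early Access to try fine-tuning."
docs = "Fine-tuning is available to all users from the dashboard."
stale = find_stale_phrases(blog, docs, ["Get Early Access", "fine-tuning"])
print(stale)
```

A check like this only catches literal wording, which is presumably why the tool also offers LLM analysis for claims that are phrased differently in each source.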

## Installation

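```bash
# Install the released package from PyPI
pip install pantsonfire

# Or, from a local clone of the repository, install in editable mode
pip install -e .
```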

### Environment Setup

Create a `.env` file or set environment variables:

```bash
# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here

# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here
```
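
The comments above describe two fallbacks: pattern matching when no OpenRouter key is set, and local storage when no Oxen key is set. The sketch below shows one way such a decision could be expressed. `resolve_modes` and the mode labels are assumptions made for illustration and do not reflect pantsonfire's actual internals.

```python
import os
from typing import Dict, Mapping


def resolve_modes(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Pick analysis and storage modes from environment variables.

    Hypothetical helper: the variable names come from the README,
    the returned labels are made up for this example.
    """
    # An unset or empty key counts as "not configured".
    analysis = "llm" if env.get("OPENROUTER_API_KEY") else "pattern"
    storage = "oxen" if env.get("OXEN_API_KEY") else "local"
    return {"analysis": analysis, "storage": storage}


# With no keys set, both optional integrations fall back.
print(resolve_modes({}))
# With only the OpenRouter key, LLM analysis is used with local storage.
print(resolve_modes({"OPENROUTER_API_KEY": "your_openrouter_key_here"}))
```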

## 🚀 Quick Start

### Natural Language Analysis

```bash
# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report
```

### Traditional Analysis

```bash
# Check specific content
pantsonfire --mode external check \
    "https://blog.example.com/outdated-tutorial" \
    "https://docs.example.com/current-api" \
    --crawl --open-report
```

## 📚 Oxen AI Integration

pantsonfire uses [Oxen AI](https://oxen.ai) for versioned, traceable data storage:

- **Automatic Repository Creation**: Each analysis gets its own Oxen repository
- **Versioned Branches**: Findings stored in timestamped branches
- **Complete Traceability**: All prompts, content, and metadata preserved
- **Web Interface**: Direct links to browse analysis results
- **Collaborative**: Multiple analysts can contribute to findings

### Storage Structure

```
your-namespace/
├── analysis_check_20241023_143052/
│   ├── data/
│   │   ├── findings.json
│   │   └── findings.csv
│   ├── reports/
│   │   └── findings.txt
│   ├── sources/
│   │   ├── extracted_content.txt
│   └── metadata/
│       └── analysis_metadata.json
```
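
For scripts that post-process stored results, the layout above can be navigated with plain path handling. The sketch below assumes that the directory suffix is a `YYYYMMDD_HHMMSS` timestamp, which the example name suggests but the README does not state. Both helper names are hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath
from typing import List, Tuple


def parse_analysis_dir(name: str) -> Tuple[str, datetime]:
    """Split a directory name into its label and timestamp.

    Assumes the last two underscore-separated parts are date and time.
    """
    label, date_part, time_part = name.rsplit("_", 2)
    return label, datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")


def expected_files(namespace: str, analysis_dir: str) -> List[str]:
    """List the file paths shown in the storage structure above."""
    root = PurePosixPath(namespace) / analysis_dir
    return [
        str(root / "data" / "findings.json"),
        str(root / "data" / "findings.csv"),
        str(root / "reports" / "findings.txt"),
        str(root / "sources" / "extracted_content.txt"),
        str(root / "metadata" / "analysis_metadata.json"),
    ]


label, created = parse_analysis_dir("analysis_check_20241023_143052")
print(label, created.isoformat())
for path in expected_files("your-namespace", "analysis_check_20241023_143052"):
    print(path)
```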

## Configuration

1. Get an OpenRouter API key from [openrouter.ai/keys](https://openrouter.ai/keys)
2. Set your API key:

```bash
export OPENROUTER_API_KEY="your_key_here"
```

Or create a `.env` file:

```bash
cp .env.example .env
# Edit .env with your API key
```

## Usage

### Basic Check

Check a blog post against official documentation:

```bash
# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md

# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current
```

### View Results

```bash
# View recent detections
pantsonfire logs

# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv
```

### Configuration

```bash
# Test LLM connection
pantsonfire config --test

# View current config
pantsonfire config
```

## Real-World Example: Oxen AI Documentation Analysis

pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See `oxen-ai-example.md` for a complete demonstration.

### Contextual Hints

Provide natural language hints to guide the LLM analysis:

```bash
pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"
```

This helps the LLM focus on specific types of changes you're looking for.

### Natural Language Analysis

```bash
pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report
```

### Direct URL Analysis

```bash
pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
  "https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
  "https://github.com/Oxen-AI/Oxen" \
  --hints "the early access program is done and the api docs show the ground truth today" \
  --openrouter --open-report
```

## Example Output

```
🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
```
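
If only the text report is at hand, a block like the one above can be turned into a dictionary with a few lines of Python. This is a minimal sketch that assumes every field sits on its own line in `Key: value` form, as in the sample. `parse_issue` is a hypothetical helper and not part of pantsonfire.

```python
from typing import Dict

# The sample issue from the Example Output section, copied verbatim.
SAMPLE = """\
🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
"""


def parse_issue(block: str) -> Dict[str, object]:
    """Parse one issue block into a dict keyed by lower-cased field name."""
    fields: Dict[str, object] = {}
    for line in block.splitlines():
        # Split on the first ": " only, so URLs and timestamps stay intact.
        key, sep, value = line.partition(": ")
        if sep:
            fields[key.strip().lower()] = value.strip()
    if "confidence" in fields:
        fields["confidence"] = float(str(fields["confidence"]))
    return fields


issue = parse_issue(SAMPLE)
print(issue["confidence"], issue["truth"])
```

For anything beyond quick inspection, the JSON and CSV exports are likely a more stable input than the human-readable report.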

## Architecture

- **Factory Pattern**: Simple app creation with mode switching
- **Modular Extractors**: Separate handling for local vs web content
- **LLM Integration**: Structured prompts for factual verification
- **Storage Backends**: Extensible result storage (JSON default)
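
To make the first two bullets concrete, here is a minimal sketch of a factory that switches between a local and a web extractor. The mode names `internal` and `external` come from the usage examples. All class and function names are assumptions for illustration, and the real package may be organized differently.

```python
from pathlib import Path
from typing import Protocol
from urllib.request import urlopen


class Extractor(Protocol):
    """Anything that can turn a source string into text content."""

    def extract(self, source: str) -> str: ...


class LocalExtractor:
    def extract(self, source: str) -> str:
        # Internal mode: treat the source as a path to a local file.
        return Path(source).read_text(encoding="utf-8")


class WebExtractor:
    def extract(self, source: str) -> str:
        # External mode: fetch the raw response body over HTTP(S).
        with urlopen(source, timeout=10) as response:
            return response.read().decode("utf-8", errors="replace")


def create_extractor(mode: str) -> Extractor:
    """Factory: map a mode name to the matching extractor instance."""
    extractors = {"internal": LocalExtractor, "external": WebExtractor}
    try:
        return extractors[mode]()
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None


extractor = create_extractor("internal")
print(type(extractor).__name__)
```

Keeping the mode lookup in one place means a new source type only needs a new class and one extra dictionary entry.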

## Development

Run tests:

```bash
python tests/test_sample.py
```

## License

MIT

            

Raw data

{
    "_id": null,
    "home_page": "https://github.com/seanmcdonald/pantsonfire",
    "name": "pantsonfire",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "documentation, verification, fact-checking, technical-docs, ai, llm, web-scraping",
    "author": "Sean McDonald",
    "author_email": null,
    "download_url": "https://files.pythonhosted.org/packages/10/5c/f12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d/pantsonfire-0.3.7.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "Find wrong information in technical docs online",
    "version": "0.3.7",
    "project_urls": {
        "Bug Reports": "https://github.com/seanmcdonald/pantsonfire/issues",
        "Documentation": "https://github.com/seanmcdonald/pantsonfire#readme",
        "Homepage": "https://github.com/seanmcdonald/pantsonfire",
        "Source": "https://github.com/seanmcdonald/pantsonfire"
    },
    "split_keywords": [
        "documentation",
        " verification",
        " fact-checking",
        " technical-docs",
        " ai",
        " llm",
        " web-scraping"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b68b7314d2ffd62183d2d1bf7b39260b149e96758e154f2ea6bc3ab42edc3105",
                "md5": "7f4719a38ae74763bf62d6b2046da58b",
                "sha256": "9a3d54ed2e889ae5ab57a7b890847371762467ba45ba7ebd1104965f046b16ba"
            },
            "downloads": -1,
            "filename": "pantsonfire-0.3.7-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "7f4719a38ae74763bf62d6b2046da58b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 61268,
            "upload_time": "2025-10-29T15:51:47",
            "upload_time_iso_8601": "2025-10-29T15:51:47.439507Z",
            "url": "https://files.pythonhosted.org/packages/b6/8b/7314d2ffd62183d2d1bf7b39260b149e96758e154f2ea6bc3ab42edc3105/pantsonfire-0.3.7-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "105cf12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d",
                "md5": "59301363eb7042bda8575c23e92c32bb",
                "sha256": "d4ddfc2c73e59bba75723cdfdefec9394bc40336c99b2d418a6588c4f5cbe44b"
            },
            "downloads": -1,
            "filename": "pantsonfire-0.3.7.tar.gz",
            "has_sig": false,
            "md5_digest": "59301363eb7042bda8575c23e92c32bb",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 59982,
            "upload_time": "2025-10-29T15:51:48",
            "upload_time_iso_8601": "2025-10-29T15:51:48.494784Z",
            "url": "https://files.pythonhosted.org/packages/10/5c/f12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d/pantsonfire-0.3.7.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-29 15:51:48",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "seanmcdonald",
    "github_project": "pantsonfire",
    "github_not_found": true,
    "lcname": "pantsonfire"
}
        