# pantsonfire 🔥
Find wrong information in technical documentation online. pantsonfire detects outdated, incorrect, or deprecated information in blog posts and technical articles by cross-referencing them against official documentation.
## ✨ Key Features
- **🧠 Natural Language Analysis**: Use simple English commands like "find outdated API info on tech blogs"
- **🕷️ Intelligent Web Crawling**: Automatically discover similar issues across entire websites
- **📚 Oxen AI Integration**: Versioned, traceable storage with complete audit trails
- **🔍 Multi-Level Detection**: Pattern matching + AI-powered analysis for comprehensive coverage
- **🌐 Universal Sources**: Websites, GitHub repos, documentation sites, local files
- **📊 Rich Reporting**: Browser-integrated reports with JSON/CSV export
- **🚀 Dual Analysis Modes**: Basic pattern matching or full LLM analysis via OpenRouter
- **🔗 Automatic Report Opening**: Direct links to versioned analysis results
## Installation
```bash
# From PyPI
pip install pantsonfire
# Or, from a source checkout (editable install)
pip install -e .
```
### Environment Setup
Create a `.env` file or set environment variables:
```bash
# For LLM analysis (optional - falls back to pattern matching)
OPENROUTER_API_KEY=your_openrouter_key_here
# For Oxen AI storage (optional - uses local storage if not set)
OXEN_API_KEY=your_oxen_key_here
```
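The fallback behavior described in the comments above can be sketched in Python. This is a minimal illustration, not pantsonfire's actual code: the function name `describe_modes` and the returned labels are assumptions made for this example. Only the two environment variable names come from this README.

```python
import os
from typing import Mapping


def describe_modes(env: Mapping[str, str]) -> dict:
    """Report which analysis and storage paths the given settings would select.

    Hypothetical helper: it mirrors the documented fallbacks, nothing more.
    """
    # An empty or whitespace-only value is treated the same as an unset variable.
    has_llm_key = bool(env.get("OPENROUTER_API_KEY", "").strip())
    has_oxen_key = bool(env.get("OXEN_API_KEY", "").strip())
    return {
        # Without an OpenRouter key, analysis falls back to pattern matching.
        "analysis": "llm" if has_llm_key else "pattern-matching",
        # Without an Oxen key, results are kept in local storage.
        "storage": "oxen" if has_oxen_key else "local",
    }


if __name__ == "__main__":
    print(describe_modes(os.environ))
```

Running a check like this before a long crawl makes it obvious whether the run will use the LLM or only pattern matching.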
## 🚀 Quick Start
### Natural Language Analysis
```bash
# Analyze a website for outdated information
pantsonfire analyze "find outdated API references on python-requests blog posts" --crawl --openrouter --open-report
```
### Traditional Analysis
```bash
# Check specific content
pantsonfire --mode external check \
  "https://blog.example.com/outdated-tutorial" \
  "https://docs.example.com/current-api" \
  --crawl --open-report
```
## 📚 Oxen AI Integration
pantsonfire uses [Oxen AI](https://oxen.ai) for versioned, traceable data storage:
- **Automatic Repository Creation**: Each analysis gets its own Oxen repository
- **Versioned Branches**: Findings stored in timestamped branches
- **Complete Traceability**: All prompts, content, and metadata preserved
- **Web Interface**: Direct links to browse analysis results
- **Collaborative**: Multiple analysts can contribute to findings
### Storage Structure
```
your-namespace/
├── analysis_check_20241023_143052/
│   ├── data/
│   │   ├── findings.json
│   │   └── findings.csv
│   ├── reports/
│   │   └── findings.txt
│   ├── sources/
│   │   ├── extracted_content.txt
│   └── metadata/
│       └── analysis_metadata.json
```
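The layout above can be reproduced programmatically, for example to locate files after a run. The sketch below assumes the directory name follows the pattern `analysis_<command>_<YYYYMMDD>_<HHMMSS>`, which is inferred from the single example shown and not confirmed elsewhere. The helper name `analysis_paths` is hypothetical.

```python
from datetime import datetime
from pathlib import PurePosixPath


def analysis_paths(namespace: str, command: str, when: datetime) -> dict:
    """Build the expected file paths for one analysis run.

    Assumption: the directory name is analysis_<command>_<timestamp>,
    inferred from the example tree; the real naming rule may differ.
    """
    stamp = when.strftime("%Y%m%d_%H%M%S")
    root = PurePosixPath(namespace) / f"analysis_{command}_{stamp}"
    return {
        "findings_json": root / "data" / "findings.json",
        "findings_csv": root / "data" / "findings.csv",
        "report": root / "reports" / "findings.txt",
        "extracted_content": root / "sources" / "extracted_content.txt",
        "metadata": root / "metadata" / "analysis_metadata.json",
    }


paths = analysis_paths("your-namespace", "check", datetime(2024, 10, 23, 14, 30, 52))
for name, path in paths.items():
    print(f"{name}: {path}")
```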
## Configuration
1. Get an OpenRouter API key from [openrouter.ai/keys](https://openrouter.ai/keys)
2. Set your API key:
```bash
export OPENROUTER_API_KEY="your_key_here"
```
Or create a `.env` file:
```bash
cp .env.example .env
# Edit .env with your API key
```
## Usage
### Basic Check
Check a blog post against official documentation:
```bash
# Internal mode (local files)
pantsonfire check blog_post.md official_docs.md
# External mode (web URLs)
pantsonfire --mode external check https://blog.example.com/old-post https://docs.example.com/current
```
### View Results
```bash
# View recent detections
pantsonfire logs
# Export results
pantsonfire export results.json --format json
pantsonfire export results.csv --format csv
```
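An exported JSON file can be post-processed with a few lines of Python. The sketch below assumes the export is a JSON array of objects with `confidence` and `problem` fields. Those field names are borrowed from the example output in this README, and the real export schema may differ. The sample data and the `high_confidence` helper are illustrative only.

```python
import json

# Hypothetical export contents; the real schema may differ.
SAMPLE_EXPORT = """
[
  {"confidence": 0.90, "problem": "References 'Get Early Access' which appears to be outdated"},
  {"confidence": 0.40, "problem": "Possible version mismatch"}
]
"""


def high_confidence(findings_json: str, threshold: float = 0.8) -> list:
    """Return findings at or above the threshold, most confident first."""
    findings = json.loads(findings_json)
    # Findings without a confidence value are treated as 0.0.
    kept = [f for f in findings if float(f.get("confidence", 0.0)) >= threshold]
    return sorted(kept, key=lambda f: float(f.get("confidence", 0.0)), reverse=True)


for finding in high_confidence(SAMPLE_EXPORT):
    print(f"{finding['confidence']:.2f}  {finding['problem']}")
```

To use it on a real export, read the file produced by `pantsonfire export results.json --format json` and pass its text to `high_confidence`, adjusting the field names if they differ.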
### Configuration
```bash
# Test LLM connection
pantsonfire config --test
# View current config
pantsonfire config
```
## Real-World Example: Oxen AI Documentation Analysis
pantsonfire successfully identified outdated "Get Early Access" references across Oxen AI's website. See `oxen-ai-example.md` for a complete demonstration.
### Contextual Hints
Provide natural language hints to guide the LLM analysis:
```bash
pantsonfire check "blog-url" "docs-url" --hints "the beta program ended in 2024 and docs now show the production API"
```
This helps the LLM focus on specific types of changes you're looking for.
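One way hints could reach the model is as an extra section in the prompt. The sketch below is an assumption about how such a prompt might be assembled, not pantsonfire's actual prompt. The function name `build_prompt` and the prompt wording are invented for illustration.

```python
from typing import Optional


def build_prompt(blog_text: str, docs_text: str, hints: Optional[str] = None) -> str:
    """Assemble a comparison prompt, adding reviewer hints when provided.

    Hypothetical sketch: the real prompt structure is not documented here.
    """
    parts = [
        "Compare the BLOG text against the DOCS text and list statements "
        "in the BLOG that the DOCS contradict or no longer support.",
    ]
    # Hints are optional; blank hints are skipped so the prompt stays clean.
    if hints and hints.strip():
        parts.append(f"Reviewer hints: {hints.strip()}")
    parts.append(f"BLOG:\n{blog_text}")
    parts.append(f"DOCS:\n{docs_text}")
    return "\n\n".join(parts)


print(build_prompt(
    "Join the beta to get an API key.",
    "Create an API key from your account settings.",
    hints="the beta program ended in 2024 and docs now show the production API",
))
```

Placing the hints before the source texts keeps them visible even when the texts are long.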
### Natural Language Analysis
```bash
pantsonfire analyze "the oxen website has outdated get early access buttons for fine tuning, find all similar issues on their site" --openrouter --crawl --open-report
```
### Direct URL Analysis
```bash
pantsonfire check "https://www.oxen.ai/entry/fine-tuning-a-with-oxen-ai" \
"https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune" \
"https://github.com/Oxen-AI/Oxen" \
--hints "the early access program is done and the api docs show the ground truth today" \
--openrouter --open-report
```
## Example Output
```
🔥 ISSUE #1
Blog: unknown
Truth: https://docs.oxen.ai/examples/fine-tuning/image_editing#kicking-off-the-fine-tune
Confidence: 0.90
Problem: References 'Get Early Access' which appears to be outdated
Evidence: Official documentation no longer mentions 'Get Early Access'
Time: 2025-10-23T22:52:35
```
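A finding like the one above could come from a simple phrase check: flag a phrase that appears in the blog text but not in the official documentation. The sketch below shows that idea only. It is not pantsonfire's detection code; the function name `find_stale_phrases` and the caller-supplied phrase list are assumptions.

```python
from typing import Dict, Iterable, List


def find_stale_phrases(blog_text: str, docs_text: str,
                       phrases: Iterable[str]) -> List[Dict[str, str]]:
    """Flag phrases present in the blog text but absent from the docs text.

    Hypothetical sketch of basic pattern matching; matching is
    case-insensitive and uses plain substring search.
    """
    blog_lower = blog_text.lower()
    docs_lower = docs_text.lower()
    findings = []
    for phrase in phrases:
        needle = phrase.lower()
        if needle in blog_lower and needle not in docs_lower:
            findings.append({
                "problem": f"References '{phrase}' which appears to be outdated",
                "evidence": f"Official documentation no longer mentions '{phrase}'",
            })
    return findings


blog = "Click Get Early Access to start fine-tuning."
docs = "Start a fine-tune from the API."
for finding in find_stale_phrases(blog, docs, ["Get Early Access"]):
    print(finding["problem"])
    print(finding["evidence"])
```

Absence from the documentation is only weak evidence that a phrase is outdated, which is one reason to pair a check like this with LLM analysis.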
## Architecture
- **Factory Pattern**: Simple app creation with mode switching
- **Modular Extractors**: Separate handling for local vs web content
- **LLM Integration**: Structured prompts for factual verification
- **Storage Backends**: Extensible result storage (JSON default)
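
The factory and extractor bullets above might look roughly like the following. This is a sketch under assumptions: the class names, `create_app`, and the `App` dataclass are hypothetical and may not match the package's real API. Only the mode names `internal` (local files) and `external` (web URLs) come from the usage examples in this README.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol
from urllib.request import urlopen


class Extractor(Protocol):
    """Anything that can turn a source reference into text."""

    def extract(self, source: str) -> str: ...


class LocalExtractor:
    """Reads content from a local file (internal mode)."""

    def extract(self, source: str) -> str:
        return Path(source).read_text(encoding="utf-8")


class WebExtractor:
    """Fetches content from a URL (external mode)."""

    def extract(self, source: str) -> str:
        with urlopen(source, timeout=30) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")


@dataclass
class App:
    mode: str
    extractor: Extractor


# Mode names follow the README; the mapping itself is an assumption.
_EXTRACTORS = {"internal": LocalExtractor, "external": WebExtractor}


def create_app(mode: str = "internal") -> App:
    """Factory: pick the extractor that matches the requested mode."""
    try:
        extractor_cls = _EXTRACTORS[mode]
    except KeyError:
        raise ValueError(f"unknown mode: {mode!r}") from None
    return App(mode=mode, extractor=extractor_cls())


app = create_app("external")
print(type(app.extractor).__name__)
```

Keeping extraction behind one small interface means a new source type only needs a new extractor class and one entry in the mapping.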
## Development
Run tests:
```bash
python tests/test_sample.py
```
## License
MIT
Raw data
{
"_id": null,
"home_page": "https://github.com/seanmcdonald/pantsonfire",
"name": "pantsonfire",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "documentation, verification, fact-checking, technical-docs, ai, llm, web-scraping",
"author": "Sean McDonald",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/10/5c/f12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d/pantsonfire-0.3.7.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "Find wrong information in technical docs online",
"version": "0.3.7",
"project_urls": {
"Bug Reports": "https://github.com/seanmcdonald/pantsonfire/issues",
"Documentation": "https://github.com/seanmcdonald/pantsonfire#readme",
"Homepage": "https://github.com/seanmcdonald/pantsonfire",
"Source": "https://github.com/seanmcdonald/pantsonfire"
},
"split_keywords": [
"documentation",
" verification",
" fact-checking",
" technical-docs",
" ai",
" llm",
" web-scraping"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "b68b7314d2ffd62183d2d1bf7b39260b149e96758e154f2ea6bc3ab42edc3105",
"md5": "7f4719a38ae74763bf62d6b2046da58b",
"sha256": "9a3d54ed2e889ae5ab57a7b890847371762467ba45ba7ebd1104965f046b16ba"
},
"downloads": -1,
"filename": "pantsonfire-0.3.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "7f4719a38ae74763bf62d6b2046da58b",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 61268,
"upload_time": "2025-10-29T15:51:47",
"upload_time_iso_8601": "2025-10-29T15:51:47.439507Z",
"url": "https://files.pythonhosted.org/packages/b6/8b/7314d2ffd62183d2d1bf7b39260b149e96758e154f2ea6bc3ab42edc3105/pantsonfire-0.3.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "105cf12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d",
"md5": "59301363eb7042bda8575c23e92c32bb",
"sha256": "d4ddfc2c73e59bba75723cdfdefec9394bc40336c99b2d418a6588c4f5cbe44b"
},
"downloads": -1,
"filename": "pantsonfire-0.3.7.tar.gz",
"has_sig": false,
"md5_digest": "59301363eb7042bda8575c23e92c32bb",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 59982,
"upload_time": "2025-10-29T15:51:48",
"upload_time_iso_8601": "2025-10-29T15:51:48.494784Z",
"url": "https://files.pythonhosted.org/packages/10/5c/f12dd8319c8f361535176dfec06607cdae4d993bee0a5ad545ba51c95f3d/pantsonfire-0.3.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-10-29 15:51:48",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "seanmcdonald",
"github_project": "pantsonfire",
"github_not_found": true,
"lcname": "pantsonfire"
}