Sider-CAPTCHA-Solver


- **Name**: Sider-CAPTCHA-Solver
- **Version**: 1.0.0
- **Summary**: Industrial-grade slider CAPTCHA recognition system based on deep learning
- **Home page**: https://github.com/TomokotoKiyoshi/Sider_CAPTCHA_Solver
- **Author / Maintainer**: TomokotoKiyoshi
- **Upload time**: 2025-07-23 07:27:03
- **Requires Python**: >=3.8
- **License**: MIT
- **Keywords**: captcha, slider, recognition, deep-learning, pytorch, centernet, computer-vision
- **Requirements**: numpy>=1.21.0, opencv-python>=4.5.0, Pillow>=9.0.0, torch>=1.10.0, torchvision>=0.11.0, pandas>=1.3.0, scikit-learn>=0.24.0, requests>=2.26.0, aiohttp>=3.8.0, albumentations>=1.1.0, scipy>=1.7.0, tqdm>=4.62.0, tensorboard>=2.7.0, loguru>=0.5.3, pyyaml>=5.4.0, python-dotenv>=0.19.0, pytest>=6.2.0, pytest-cov>=2.12.0, black>=21.7b0, flake8>=3.9.0, isort>=5.9.0
# Industrial-Grade Slider CAPTCHA Recognition System

<div align="center">

[English](https://github.com/TomokotoKiyoshi/Sider_CAPTCHA_Solver/blob/main/README.md) | [įŽ€äŊ“中文](https://github.com/TomokotoKiyoshi/Sider_CAPTCHA_Solver/blob/main/README_zh.md)

</div>

<div align="center">

[![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
[![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)

A high-precision slider CAPTCHA recognition solution based on deep learning, utilizing an improved CenterNet architecture to achieve 80% accuracy on real CAPTCHA datasets.

</div>

## 📋 Project Overview

This project is an industrial-grade slider CAPTCHA recognition system that overcomes the accuracy bottleneck of traditional template matching algorithms through deep learning methods. The system is trained on **over 300,000** synthetic CAPTCHA images, employing a lightweight CNN architecture that ensures high precision while maintaining real-time inference capabilities.

### đŸŽ¯ Core Features

- **High-Precision Recognition**: 80% accuracy at a 7px error tolerance and 73% at 5px on real CAPTCHAs
- **Real-Time Inference**: 1.30 ms GPU inference (RTX 5090) and 5.21 ms CPU inference (AMD Ryzen 9 9950X), fast enough for real-time applications
- **Lightweight Architecture**: Only 3.5M parameters, model file approximately 36MB
- **Industrial-Grade Design**: Complete data generation, training, and evaluation pipeline
- **Sub-pixel Precision**: Achieves sub-pixel level localization using CenterNet offset mechanism

### đŸ–ŧī¸ Recognition Performance Demo

#### Real CAPTCHA Dataset Recognition Results

![Real Dataset Recognition Results](https://github.com/TomokotoKiyoshi/Sider_CAPTCHA_Solver/blob/main/results/best_model_evaluation/real_captchas/visualizations/sample_0031.png?raw=true)

*Figure: Recognition results on real website CAPTCHAs, with red circles marking gap positions and blue circles marking slider positions*

#### Test Dataset Recognition Results

![Test Dataset Recognition Results](https://github.com/TomokotoKiyoshi/Sider_CAPTCHA_Solver/blob/main/results/best_model_evaluation/test_dataset/visualizations/sample_0014.png?raw=true)

*Figure: Recognition results on synthetic test set, demonstrating the model's adaptability to different shapes and lighting conditions*

## 🚀 Quick Start

### Requirements

```bash
# Python 3.8+
pip install -r requirements.txt
```

### Installation

#### Install via pip

```bash
pip install Sider_CAPTCHA_Solver
```

### Basic Usage

After installing via pip, you can import and use the package directly:

#### 1. Basic Prediction - Get Sliding Distance

```python
from sider_captcha_solver import CaptchaPredictor

# Initialize predictor
predictor = CaptchaPredictor(
    model_path='best',  # Use built-in best model, or specify custom model path
    device='auto'       # Auto-select GPU/CPU
)

# Predict single image
result = predictor.predict('path/to/captcha.png')

# Get sliding distance
if result['slider_x'] and result['gap_x']:
    sliding_distance = result['gap_x'] - result['slider_x']
    print(f"Sliding distance: {sliding_distance:.2f} px")
    print(f"Gap position: ({result['gap_x']:.2f}, {result['gap_y']:.2f})")
    print(f"Slider position: ({result['slider_x']:.2f}, {result['slider_y']:.2f})")
else:
    print("Detection failed")
```

#### 2. Batch Processing - Process Multiple Images

```python
from sider_captcha_solver import CaptchaPredictor
import glob
import os

# Initialize predictor
predictor = CaptchaPredictor(model_path='best', device='auto')

# Batch process CAPTCHAs
captcha_folder = 'path/to/captchas'

for img_path in glob.glob(os.path.join(captcha_folder, '*.png')):
    result = predictor.predict(img_path)

    if result['slider_x'] and result['gap_x']:
        distance = result['gap_x'] - result['slider_x']
        confidence = (result['slider_confidence'] + result['gap_confidence']) / 2
        print(f"{os.path.basename(img_path)}: Slide {distance:.1f} px (Confidence: {confidence:.3f})")
    else:
        print(f"{os.path.basename(img_path)}: Detection failed")
```

#### 3. Visualization and Debugging

```python
from sider_captcha_solver import CaptchaPredictor
import matplotlib.pyplot as plt

# Initialize predictor
predictor = CaptchaPredictor(model_path='best', device='auto')

# Test image path
test_image = 'path/to/captcha.png'

# Generate and save prediction visualization
predictor.visualize_prediction(
    test_image,
    save_path='prediction_result.png',  # Save path
    show=True                           # Display window
)

# Generate heatmap visualization (view model internal activations)
predictor.visualize_heatmaps(
    test_image,
    save_path='heatmap_result.png',    # Save 4-panel heatmap
    show=True
)

# Compare different threshold effects
thresholds = [0.0, 0.1, 0.3, 0.5]
fig, axes = plt.subplots(1, len(thresholds), figsize=(15, 4))

for idx, threshold in enumerate(thresholds):
    # Create predictor with different thresholds
    pred = CaptchaPredictor(model_path='best', hm_threshold=threshold)
    result = pred.predict(test_image)

    # Visualize to subplot
    ax = axes[idx]
    img = plt.imread(test_image)
    ax.imshow(img)
    ax.set_title(f'Threshold={threshold}')

    if result['slider_x'] and result['gap_x']:
        ax.plot(result['slider_x'], result['slider_y'], 'bo', markersize=10)
        ax.plot(result['gap_x'], result['gap_y'], 'ro', markersize=10)
    ax.axis('off')

plt.tight_layout()
plt.savefig('threshold_comparison.png')
plt.show()
```

#### 4. Complete Production Environment Example

```python
from sider_captcha_solver import CaptchaPredictor
import logging
import time
from typing import Optional, Dict

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class CaptchaSolver:
    """Production environment CAPTCHA solver wrapper"""

    def __init__(self, model_path: str = 'best', device: str = 'auto'):
        self.predictor = CaptchaPredictor(
            model_path=model_path,
            device=device,
            hm_threshold=0.1  # Balance accuracy and recall
        )
        logger.info(f"CAPTCHA solver initialized, device: {device}")

    def solve(self, image_path: str, max_retries: int = 3) -> Optional[Dict]:
        """
        Solve CAPTCHA with retry mechanism

        Args:
            image_path: CAPTCHA image path
            max_retries: Maximum retry attempts

        Returns:
            Dictionary containing sliding distance and confidence, None on failure
        """
        for attempt in range(max_retries):
            try:
                # Record start time
                start_time = time.time()

                # Execute prediction
                result = self.predictor.predict(image_path)

                # Calculate elapsed time
                elapsed_time = (time.time() - start_time) * 1000

                # Check result validity
                if result['slider_x'] and result['gap_x']:
                    sliding_distance = result['gap_x'] - result['slider_x']
                    confidence = (result['slider_confidence'] + result['gap_confidence']) / 2

                    logger.info(f"Solve success: distance={sliding_distance:.1f}px, "
                              f"confidence={confidence:.3f}, time={elapsed_time:.1f}ms")

                    return {
                        'success': True,
                        'sliding_distance': sliding_distance,
                        'confidence': confidence,
                        'elapsed_ms': elapsed_time,
                        'details': result
                    }
                else:
                    logger.warning(f"Attempt {attempt + 1} failed: no valid result detected")

            except Exception as e:
                logger.error(f"Attempt {attempt + 1} error: {str(e)}")

            # Brief delay if not last attempt
            if attempt < max_retries - 1:
                time.sleep(0.1)

        logger.error(f"Solve failed: reached maximum retries {max_retries}")
        return None

# Usage example
if __name__ == "__main__":
    solver = CaptchaSolver()

    # Solve single CAPTCHA
    result = solver.solve('path/to/captcha.png')

    if result and result['success']:
        print(f"Sliding distance: {result['sliding_distance']:.1f} px")
        print(f"Confidence: {result['confidence']:.3f}")
        print(f"Processing time: {result['elapsed_ms']:.1f} ms")
    else:
        print("CAPTCHA solving failed")
```

### Advanced Features

#### 1. Custom Model and Configuration

```python
from sider_captcha_solver import CaptchaPredictor
import torch

# Use your own trained model
custom_predictor = CaptchaPredictor(
    model_path='path/to/your_trained_model.pth',
    device='cuda:0',    # Specify GPU
    hm_threshold=0.15   # Adjust based on model characteristics
)

# Check model info
if torch.cuda.is_available():
    print(f"Using GPU: {torch.cuda.get_device_name(0)}")
    print(f"VRAM usage: {torch.cuda.memory_allocated(0) / 1024**2:.1f} MB")

# Predict
result = custom_predictor.predict('captcha.png')
```

#### 2. Performance Benchmarking

```python
from sider_captcha_solver import CaptchaPredictor
import time
import numpy as np

# Initialize predictor
predictor = CaptchaPredictor(model_path='best', device='auto')

# Test image list
test_images = ['captcha1.png', 'captcha2.png', 'captcha3.png']

# Warm up (first inference is slower)
_ = predictor.predict(test_images[0])

# Performance test
times = []
for _ in range(10):  # Test each image 10 times
    for img_path in test_images:
        start = time.time()
        result = predictor.predict(img_path)
        elapsed = (time.time() - start) * 1000  # Convert to milliseconds
        times.append(elapsed)

# Statistics
print(f"Average inference time: {np.mean(times):.2f} ms")
print(f"Fastest: {np.min(times):.2f} ms")
print(f"Slowest: {np.max(times):.2f} ms")
print(f"Std deviation: {np.std(times):.2f} ms")
print(f"FPS: {1000 / np.mean(times):.1f}")
```

## 📊 Data Generation Process

### 1. Data Collection

High-quality background images were downloaded from Pixabay across 10 categories: Minecraft, Pixel Food, Block Public Square, Block Illustration, Backgrounds, Buildings, Nature, Anime Cityscape, Abstract Geometric Art, etc. With up to 200 images per category, this yields approximately 2,000 raw images.
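The snippet below is a minimal, hedged sketch of such a category download, assuming the standard Pixabay REST API and a hypothetical `PIXABAY_API_KEY` environment variable; the project's own downloader lives in `src/data_collection/pixabay_downloader.py` and may differ.

```python
# Minimal download sketch (assumes the public Pixabay REST API; not the project's downloader)
import os
import requests

CATEGORIES = ["minecraft", "pixel food", "block public square", "block illustration",
              "backgrounds", "buildings", "nature", "anime cityscape", "abstract geometric art"]

def download_category(query: str, out_dir: str, limit: int = 200):
    os.makedirs(out_dir, exist_ok=True)
    resp = requests.get("https://pixabay.com/api/", params={
        "key": os.environ["PIXABAY_API_KEY"],  # hypothetical environment variable
        "q": query,
        "image_type": "photo",
        "per_page": limit,                     # Pixabay caps per_page at 200
    })
    resp.raise_for_status()
    for i, hit in enumerate(resp.json().get("hits", [])):
        img = requests.get(hit["largeImageURL"], timeout=30)
        with open(os.path.join(out_dir, f"{query.replace(' ', '_')}_{i:04d}.jpg"), "wb") as f:
            f.write(img.content)

for cat in CATEGORIES:
    download_category(cat, out_dir=os.path.join("data", "raw_images"))
```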

### 2. CAPTCHA Generation Logic

```
Raw Images (2000+) → Resize(320×160) → Puzzle Generation
                                        ↓
                              11 shapes × 3 sizes × 4 positions
                                        ↓
                              132 CAPTCHAs per original image
                                        ↓
                              Total: 354,024 training images generated
```
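As a rough illustration of the grid above (11 shapes × 3 sizes × 4 positions = 132 CAPTCHAs per background), the following sketch enumerates the variants for one background image. The shape names, sampling ranges, and placement rule are assumptions based on the parameters described below, not the actual generator in `src/captcha_generator/`.

```python
# Hedged sketch of the per-background variant grid: 11 shapes x 3 sizes x 4 positions = 132
import itertools
import random

PUZZLE_SHAPES  = [f"puzzle_{i}" for i in range(1, 6)]      # 5 regular jigsaw shapes (hypothetical names)
SPECIAL_SHAPES = ["circle", "square", "triangle", "hexagon", "pentagon", "star"]
ALL_SHAPES = PUZZLE_SHAPES + SPECIAL_SHAPES                 # 11 shapes in total

def variants_for_background(bg_path: str):
    sizes = random.sample(range(40, 71), 3)                 # 3 random sizes in the 40-70 px range
    for shape, size in itertools.product(ALL_SHAPES, sizes):
        for _ in range(4):                                  # 4 gap positions per (shape, size)
            # place the gap beyond slider width + 10 px so it cannot overlap the slider
            gap_x = random.randint(size + 10, 320 - size)
            gap_y = random.randint(0, 160 - size)
            yield dict(background=bg_path, shape=shape, size=size, gap=(gap_x, gap_y))

print(len(list(variants_for_background("data/raw_images/example.png"))))  # 11 * 3 * 4 = 132
```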

**Puzzle Shape Design**:

- 5 regular puzzle shapes (combinations of convex, concave, and flat edges)
- 6 special shapes (circle, square, triangle, hexagon, pentagon, star)

**Random Parameters**:

- Puzzle size: 40-70 pixels (3 random sizes)
- Position distribution: the gap's x-coordinate is placed beyond the slider width + 10 px to avoid overlap with the slider
- Lighting effects: Randomly added lighting variations for robustness

### 3. Dataset Split

- Training set: 90% (split by original images to avoid data leakage; see the sketch below)
- Test set: 10% (Test Set 1)
- Real CAPTCHA test set: 100 NetEase Yidun CAPTCHAs (Test Set 2)
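Below is a hedged sketch of such a group-wise split, keeping every variant generated from the same background (same `Pic{XXXX}` filename prefix, see Project Structure) in a single split; the actual logic is in `scripts/split_dataset.py`.

```python
# Group-wise split sketch: all 132 variants of one background stay in the same split
import glob
import os
import random
from collections import defaultdict

groups = defaultdict(list)
for path in glob.glob(os.path.join("data", "captchas", "Pic*.png")):
    pic_id = os.path.basename(path).split("_")[0]   # e.g. "Pic0042" identifies the background
    groups[pic_id].append(path)

pic_ids = sorted(groups)
random.seed(42)
random.shuffle(pic_ids)

split = int(0.9 * len(pic_ids))                     # 90% of backgrounds go to training
train_files = [p for pid in pic_ids[:split] for p in groups[pid]]
test_files  = [p for pid in pic_ids[split:] for p in groups[pid]]
print(len(train_files), "train /", len(test_files), "test CAPTCHAs")
```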

## đŸ—ī¸ Network Architecture

### Model Structure

```
Input (3×160×320)
    │
    ├─ Stem Conv (3×3, stride=2) ──────→ 32×80×160
    │
    ├─ ResBlock Stage-1 (×2, stride=2) ─→ 64×40×80
    │
    ├─ ResBlock Stage-2 (×2, stride=2) ─→ 128×20×40
    │
    ├─ ResBlock Stage-3 (×2, stride=2) ─→ 256×10×20
    │
    ├─ Neck (1×1 Conv) ─────────────────→ 128×10×20
    │
    ├─ UpConv-1 (3×3, stride=2) ────────→ 64×20×40
    │
    ├─ UpConv-2 (3×3, stride=2) ────────→ 64×40×80
    │
    └─â”Ŧ─ Gap Detection Head
      │    ├─ Heatmap (1×40×80)
      │    └─ Offset (2×40×80)
      │
      └─ Piece Detection Head
           ├─ Heatmap (1×40×80)
           └─ Offset (2×40×80)
```
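For reference, here is a minimal PyTorch sketch that follows the channel/resolution plan in the diagram; block internals, the transposed-convolution upsampling, and the head design are assumptions rather than the released implementation.

```python
# PyTorch sketch of the structure diagrammed above (sizes follow the diagram; details are assumed)
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, c_in, c_out, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(c_in, c_out, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(c_out)
        self.conv2 = nn.Conv2d(c_out, c_out, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(c_out)
        self.skip = (nn.Sequential(nn.Conv2d(c_in, c_out, 1, stride, bias=False),
                                   nn.BatchNorm2d(c_out))
                     if (stride != 1 or c_in != c_out) else nn.Identity())
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + self.skip(x))

def _stage(c_in, c_out):
    # two residual blocks per stage, the first downsampling by 2 (as in the diagram)
    return nn.Sequential(ResBlock(c_in, c_out, stride=2), ResBlock(c_out, c_out))

class CaptchaNetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1, bias=False),
                                  nn.BatchNorm2d(32), nn.ReLU(inplace=True))      # 32×80×160
        self.stage1, self.stage2, self.stage3 = _stage(32, 64), _stage(64, 128), _stage(128, 256)
        self.neck = nn.Conv2d(256, 128, 1)                                         # 128×10×20
        self.up1 = nn.ConvTranspose2d(128, 64, 3, stride=2, padding=1, output_padding=1)  # 64×20×40
        self.up2 = nn.ConvTranspose2d(64, 64, 3, stride=2, padding=1, output_padding=1)   # 64×40×80
        def head(out_ch):
            return nn.Sequential(nn.Conv2d(64, 64, 3, 1, 1), nn.ReLU(inplace=True),
                                 nn.Conv2d(64, out_ch, 1))
        self.gap_hm, self.gap_off = head(1), head(2)        # gap branch: heatmap + offset
        self.piece_hm, self.piece_off = head(1), head(2)    # piece (slider) branch

    def forward(self, x):                                   # x: (B, 3, 160, 320)
        f = self.up2(self.up1(self.neck(self.stage3(self.stage2(self.stage1(self.stem(x)))))))
        return {
            "gap_heatmap":   torch.sigmoid(self.gap_hm(f)),    # (B, 1, 40, 80)
            "gap_offset":    self.gap_off(f),                  # (B, 2, 40, 80)
            "piece_heatmap": torch.sigmoid(self.piece_hm(f)),  # (B, 1, 40, 80)
            "piece_offset":  self.piece_off(f),                # (B, 2, 40, 80)
        }

model = CaptchaNetSketch()
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters in this sketch")
```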

### Key Design Elements

- **Backbone**: ResNet18-Lite, removed global pooling and fully connected layers
- **Detection Heads**: Dual-branch CenterNet design, detecting gap and slider centers separately
- **Loss Function**: Focal Loss (heatmap) + L1 Loss (offset regression)
- **Downsampling Rate**: 4x, output resolution 80×40 (see the decoding sketch below)
- **Activation**: ReLU (except output layers)
- **Normalization**: BatchNorm
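The offset mechanism can be made concrete with a short decoding sketch: take the heatmap peak, add the predicted sub-cell offset, and scale by the 4× output stride. Function and tensor names follow the architecture sketch above and are not taken from the project's `predict.py`.

```python
# Hedged CenterNet-style decoding: heatmap peak + offset, back-projected by the 4x stride
import torch

def decode_center(heatmap: torch.Tensor, offset: torch.Tensor, stride: int = 4):
    """heatmap: (1, H, W) in [0, 1]; offset: (2, H, W) holding (dx, dy) per cell."""
    _, h, w = heatmap.shape
    flat_idx = torch.argmax(heatmap.reshape(-1))         # index of the strongest response
    cy, cx = divmod(int(flat_idx), w)                    # cell coordinates on the 40×80 grid
    dx, dy = offset[0, cy, cx].item(), offset[1, cy, cx].item()
    score = heatmap[0, cy, cx].item()
    # back-project to the 320×160 input image with sub-pixel correction
    return (cx + dx) * stride, (cy + dy) * stride, score

# Usage with the sketch model's outputs (hypothetical variable names):
# out = model(torch.randn(1, 3, 160, 320))
# gap_x, gap_y, conf = decode_center(out["gap_heatmap"][0], out["gap_offset"][0])
```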

### Model Parameters

| Component       | Parameters | Description      |
| -------------- | ---------- | ---------------- |
| Backbone       | ~3.0M      | ResNet18-Lite    |
| Neck + UpConv  | ~0.3M      | Feature fusion   |
| Detection Heads| ~0.2M      | Dual-branch heads|
| **Total**      | **~3.5M**  | FP32 ~36MB       |

## 📈 Performance Metrics

### Accuracy (Based on Sliding Distance Error)

| Dataset       | 5px Threshold | 7px Threshold | Best Epoch |
| ------------- | ------------- | ------------- | ---------- |
| Test Set (Synthetic) | 99.4%  | 99.4%        | 16         |
| Real CAPTCHAs | **73%**       | **80%**       | 15/16      |

### Inference Performance

| Hardware          | Inference Time | FPS | Batch (×32) |
| ----------------- | -------------- | --- | ----------- |
| RTX 5090          | 1.30ms         | 771 | 11.31ms     |
| AMD Ryzen 9 9950X | 5.21ms         | 192 | 144.89ms    |

### Mean Absolute Error (MAE)

- Test set: Slider 0.30px, Gap 1.14px
- Real CAPTCHAs: Slider 2.84px, Gap 9.98px (how these metrics are computed is sketched below)
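For clarity, here is a small sketch of how these metrics can be derived from predicted and ground-truth positions: threshold accuracy on the sliding-distance error, plus per-coordinate MAE. It illustrates the definitions only and is not the project's evaluation script; MAE is shown on the x-coordinate for brevity.

```python
# Metric sketch: accuracy under a pixel threshold on sliding-distance error, plus MAE
import numpy as np

def evaluate(preds, labels, thresholds=(5, 7)):
    """preds/labels: lists of dicts with 'gap_x' and 'slider_x' in pixels."""
    pred_dist = np.array([p["gap_x"] - p["slider_x"] for p in preds])
    true_dist = np.array([t["gap_x"] - t["slider_x"] for t in labels])
    dist_err = np.abs(pred_dist - true_dist)             # sliding-distance error per sample

    metrics = {f"acc@{t}px": float((dist_err <= t).mean()) for t in thresholds}
    metrics["gap_mae"] = float(np.abs([p["gap_x"] - t["gap_x"] for p, t in zip(preds, labels)]).mean())
    metrics["slider_mae"] = float(np.abs([p["slider_x"] - t["slider_x"] for p, t in zip(preds, labels)]).mean())
    return metrics
```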

## đŸ› ī¸ Main Features

### 1. Data Generation

- Auto-download Pixabay images
- Batch generate slider CAPTCHAs
- Support multiple puzzle shapes

### 2. Model Training

- Automatic learning rate scheduling
- Training process visualization

### 3. Inference Deployment

- Support batch prediction
- REST API interface (a minimal sketch follows this list)
- Heatmap visualization support
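As an illustration of such a REST interface, here is a minimal FastAPI sketch wrapping `CaptchaPredictor`; the endpoint name and response fields are assumptions, and the repository's `api_example.py` should be treated as the reference.

```python
# Minimal FastAPI wrapper sketch around CaptchaPredictor (endpoint/fields are assumptions)
import shutil
import tempfile
from fastapi import FastAPI, UploadFile
from sider_captcha_solver import CaptchaPredictor

app = FastAPI()
predictor = CaptchaPredictor(model_path="best", device="auto")

@app.post("/solve")
async def solve(file: UploadFile):
    # Persist the upload to a temporary file, since predict() takes an image path
    with tempfile.NamedTemporaryFile(suffix=".png", delete=False) as tmp:
        shutil.copyfileobj(file.file, tmp)
        tmp_path = tmp.name
    result = predictor.predict(tmp_path)
    if result["slider_x"] and result["gap_x"]:
        return {"success": True,
                "sliding_distance": result["gap_x"] - result["slider_x"],
                "gap": [result["gap_x"], result["gap_y"]],
                "slider": [result["slider_x"], result["slider_y"]]}
    return {"success": False}

# Run with e.g.: uvicorn api_sketch:app --host 0.0.0.0 --port 8000  (module name is hypothetical)
```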

### 4. Evaluation Analysis

- Training curve analysis

## âš ī¸ Disclaimer

**This project is for learning and research purposes only. Commercial or illegal use is prohibited.**

1. This project aims to promote academic research in computer vision and deep learning
2. Users must comply with relevant laws and regulations, and must not use this project to bypass website security mechanisms
3. Any legal liability arising from the use of this project shall be borne by the user
4. Please do not use this project for any behavior that may harm others' interests

## 📁 Project Structure

```
Sider_CAPTCHA_Solver/
│
├── configs/                       # Configuration files
│   └── config.yaml               # Project configuration
│
├── data/                          # Data directory
│   ├── captchas/                  # Generated CAPTCHAs (354,024 images)
│   │   └── Pic*.png              # Format: Pic{XXXX}_Bgx{X}Bgy{Y}_Sdx{X}Sdy{Y}_{hash}.png
│   ├── raw_images/                # Raw images (2000 images)
│   ├── real_captchas/             # Real CAPTCHA test set
│   │   └── annotated/             # Annotated data (100 images)
│   ├── annotations.json           # Training set annotations
│   ├── test_annotations.json      # Test set annotations
│   ├── generation_stats.json      # Generation statistics
│   └── dataset_split_stats.json   # Dataset split statistics
│
├── logs/                          # Log files
│   ├── training_accuracy_curves_all.png    # Training accuracy curves
│   ├── accuracy_comparison.png             # Test set vs real data comparison
│   ├── training_analysis_report.txt        # Training analysis report
│   ├── training_accuracy_results.csv       # Accuracy CSV data
│   ├── training_accuracy_results.json      # Accuracy JSON data
│   ├── evaluation_*.log                    # Evaluation logs
│   ├── training_log.txt                    # Training log
│   └── benchmark_results_*.json            # Performance benchmark results
│
├── results/                       # Evaluation results
│   └── best_model_evaluation/     # Best model evaluation
│       ├── test_dataset/          # Test set results
│       │   ├── evaluation_results.json     # Evaluation metrics
│       │   └── visualizations/             # Visualizations (100 images)
│       ├── real_captchas/         # Real CAPTCHA results
│       │   ├── evaluation_results.json     # Evaluation metrics
│       │   └── visualizations/             # Visualizations (50 images)
│       └── summary_report.json    # Summary report
│
├── scripts/                       # Core scripts
│   ├── annotation/                # Annotation tools
│   │   ├── annotate_captchas_matplotlib.py  # Matplotlib annotation UI
│   │   └── annotate_captchas_web.py         # Web annotation UI
│   │
│   ├── data_generation/           # Data generation scripts
│   │   ├── geometry_generator.py  # Geometry shape generator
│   │   └── puzzle_background_generator.py   # Puzzle background generator
│   │
│   ├── training/                  # Training related
│   │   ├── train.py              # Main training script
│   │   ├── dataset.py            # PyTorch dataset class
│   │   └── analyze_training.py   # Training analysis tool
│   │
│   ├── inference/                 # Inference related
│   │   └── predict.py            # Prediction interface (CaptchaPredictor class)
│   │
│   ├── evaluation/                # Evaluation scripts
│   │   └── evaluate_model.py      # Comprehensive evaluation tool (multi-mode support)
│   │
│   ├── download_images.py         # Pixabay image downloader
│   ├── generate_captchas.py       # Batch CAPTCHA generator
│   └── split_dataset.py           # Dataset splitting script
│
├── src/                          # Source code
│   ├── __init__.py
│   │
│   ├── checkpoints/               # Model weights
│   │   ├── best_model.pth         # Best model (highest accuracy)
│   │   ├── checkpoint_epoch_0001.pth ~ checkpoint_epoch_0020.pth  # Epoch checkpoints
│   │   ├── latest_checkpoint.pth  # Latest checkpoint
│   │   ├── training_log_*.txt     # Training logs
│   │   └── logs/                  # TensorBoard logs
│   │       └── events.out.tfevents.*
│   │
│   ├── captcha_generator/         # CAPTCHA generation module
│   │   ├── __init__.py
│   │   ├── batch_generator.py    # Batch generator
│   │   ├── lighting_effects.py   # Lighting effects
│   │   ├── simple_puzzle_generator.py  # Puzzle generator
│   │   └── slider_effects.py     # Slider effects
│   │
│   ├── data_collection/           # Data collection module
│   │   ├── __init__.py
│   │   └── pixabay_downloader.py # Pixabay downloader
│   │
│   ├── models/                   # Model definitions
│   │   ├── __init__.py
│   │   ├── captcha_solver.py     # CaptchaSolver main model
│   │   ├── centernet_heads.py    # CenterNet detection heads
│   │   ├── losses.py             # Loss functions (Focal Loss + L1)
│   │   └── resnet18_lite.py      # ResNet18-Lite backbone
│   │
│   └── utils/                    # Utility functions
│       ├── __init__.py
│       └── logger.py             # Logging utilities
│
├── tests/                        # Test scripts
│   ├── benchmark_inference.py     # Inference performance benchmark
│   ├── merge_real_captchas.py     # Real CAPTCHA merge tool
│   ├── test_all_puzzle_shapes.py  # All puzzle shapes test
│   ├── test_captcha_generation.py # CAPTCHA generation test
│   ├── test_darkness_levels.py    # Brightness level test
│   ├── test_distance_error_visualization.py  # Distance error visualization
│   ├── test_generate_captchas.py  # Generation function test
│   ├── test_model.py             # Model unit test
│   ├── test_model_architecture.py # Model architecture test
│   ├── test_real_captchas.py     # Real CAPTCHA test
│   └── test_slider_effects.py    # Slider effects test
│
├── outputs/                      # Test output files
│   └── *.png                     # Various test result images
│
├── api_example.py                # API usage examples
├── requirements.txt              # Dependencies
├── README.md                     # English documentation
├── README_zh.md                  # Chinese documentation
└── CLAUDE.md                     # Project technical specifications
```

## 🔧 Tech Stack

- **Deep Learning Framework**: PyTorch 2.0+
- **Image Processing**: OpenCV, Pillow
- **Data Processing**: NumPy, Pandas
- **Visualization**: Matplotlib, Seaborn
- **Web Framework**: FastAPI
- **Others**: tqdm, requests, psutil

## 📞 Contact

For questions or suggestions, please submit an Issue or Pull Request.

---

<div align="center">
<i>This project is licensed under MIT License, for learning and research purposes only</i>
</div>
