chameleon-engine


name: chameleon-engine
version: 1.0.0
summary: Advanced stealth web scraping framework with browser fingerprinting and network obfuscation
upload_time: 2025-10-21 14:32:12
requires_python: >=3.8
license: MIT License (Copyright (c) 2024 Chameleon Engine Contributors)
keywords: web-scraping, browser-fingerprinting, stealth-scraping, anti-bot, automation, data-collection, fingerprinting, proxy, microservices
# 🦎 Chameleon Engine

[![Python Version](https://img.shields.io/badge/python-3.8+-blue.svg)](https://python.org)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
[![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg)](https://github.com/your-org/chameleon-engine)
[![Coverage](https://img.shields.io/badge/coverage-95%25-green.svg)](https://codecov.io)

**Advanced stealth web scraping framework with cutting-edge browser fingerprinting and network obfuscation capabilities.**

Chameleon Engine is a comprehensive microservices-based solution designed to bypass modern anti-bot detection systems through sophisticated browser fingerprinting, TLS fingerprint masking, and human behavior simulation.

## ✨ Key Features

### 🎭 Advanced Browser Fingerprinting
- **Dynamic Profile Generation**: Create realistic browser profiles based on real-world data
- **TLS Fingerprint Masking**: JA3/JA4 hash manipulation with uTLS integration
- **HTTP/2 Header Rewriting**: Sophisticated header manipulation for advanced stealth
- **Multi-Browser Support**: Chrome, Firefox, Safari, Edge fingerprint profiles

### 🚀 Microservices Architecture
- **Fingerprint Service**: FastAPI-based profile management (Python)
- **Proxy Service**: High-performance proxy with TLS fingerprinting (Go)
- **Data Collection Pipeline**: Automated real-world fingerprint gathering
- **Real-time Monitoring**: WebSocket-based dashboard and metrics

### 🎯 Human Behavior Simulation
- **Mouse Movement Patterns**: Bezier curve-based natural movements
- **Typing Simulation**: Realistic typing with variable speed and errors
- **Scrolling Behavior**: Natural scroll patterns and pauses
- **Timing Obfuscation**: Human-like delays and interaction patterns

### 🛡️ Network Obfuscation
- **Advanced Proxy Management**: Multi-format proxy loading (TXT, CSV, JSON) with automatic rotation
- **Proxy Generation**: Dynamic generation of residential, datacenter, and geo-targeted proxies
- **Request Obfuscation**: Timing and header randomization
- **TLS Certificate Generation**: Dynamic cert creation per profile
- **HTTP/2 Settings Manipulation**: Protocol-level fingerprinting

## πŸ—οΈ Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Python App    │    │  Fingerprint     │    │   Data Source   │
│                 │◄──►│   Service        │◄──►│   Collection    │
│  Chameleon      │    │   (FastAPI)      │    │     Pipeline    │
│     Engine      │    │                  │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                       │
         ▼                       ▼                       ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Browser       │    │     Proxy        │    │    Database     │
│  Management     │    │    Service       │    │   PostgreSQL    │
│   (Playwright)  │◄──►│     (Go)         │◄──►│   + Redis       │
│                 │    │   uTLS + HTTP2   │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
```

## 🚀 Quick Start

### 🎯 Automated Installation (Recommended)

**Linux/macOS:**
```bash
# Clone and install with one command
git clone https://github.com/your-org/chameleon-engine.git
cd chameleon-engine
./install.sh

# Start services
docker-compose -f examples/docker_compose_example.yaml up -d

# Run your first scrape
python examples/simple_scrape.py https://example.com
```

**Windows:**
```powershell
# Clone and install
git clone https://github.com/your-org/chameleon-engine.git
cd chameleon-engine
.\install.ps1

# Start services
docker-compose -f examples/docker_compose_example.yaml up -d

# Run your first scrape
python examples/simple_scrape.py https://example.com
```

### 📋 Prerequisites

- **Python 3.8+**
- **Go 1.21+** (for proxy service)
- **Docker & Docker Compose** (optional, for easy deployment)
- **PostgreSQL** (optional, for persistent storage)
- **Redis** (optional, for caching)

### 🔧 Manual Installation

```bash
# Clone the repository
git clone https://github.com/your-org/chameleon-engine.git
cd chameleon-engine

# Install Python package in development mode
pip install -e .

# Install Playwright browsers
playwright install

# Install Go dependencies (proxy service)
cd proxy_service
go mod tidy
cd ..
```

### Basic Usage

```python
import asyncio
from chameleon_engine import ChameleonEngine

async def main():
    # Initialize Chameleon Engine
    engine = ChameleonEngine(
        fingerprint_service_url="http://localhost:8000",
        proxy_service_url="http://localhost:8080"
    )

    await engine.initialize()

    # Create stealth browser session
    browser = await engine.create_browser(
        profile_type="chrome_windows",
        stealth_mode=True
    )

    # Perform scraping
    page = await browser.new_page()
    await page.goto("https://example.com")

    content = await page.content()
    print(f"Scraped content length: {len(content)}")

    # Cleanup
    await browser.close()
    await engine.cleanup()

asyncio.run(main())
```

## 📚 Services Setup

### Option 1: Manual Setup

1. **Start Fingerprint Service**:
   ```bash
   python -m chameleon_engine.fingerprint.main
   ```

2. **Start Proxy Service**:
   ```bash
   cd proxy_service
   make run
   ```

3. **Run Your Application**:
   ```bash
   python your_scraping_script.py
   ```

### Option 2: Docker Deployment

```bash
# Start all services
docker-compose -f examples/docker_compose_example.yaml up -d

# Check service status
docker-compose ps
```

## 🎯 Use Cases

### E-commerce Data Collection
```python
# Scrape product pages while avoiding bot detection
await engine.scrape_ecommerce(
    target_urls=["https://shop.example.com/products/*"],
    rotate_fingerprints=True,
    human_behavior=True,
    rate_limit="1-3 requests per minute"
)
```

### Market Research
```python
# Collect competitive intelligence
await engine.market_research(
    competitors=["competitor1.com", "competitor2.com"],
    data_types=["pricing", "products", "reviews"],
    stealth_level="high"
)
```

### SEO Monitoring
```python
# Monitor search engine rankings
await engine.seo_monitoring(
    keywords=["python web scraping"],
    search_engines=["google", "bing"],
    geo_locations=["US", "UK", "DE"]
)
```

### Academic Research
```python
# Collect data for research purposes
await engine.academic_research(
    target_sites=["scholar.google.com", "arxiv.org"],
    data_types=["papers", "citations", "metadata"],
    ethical_scraping=True
)
```

## 🔧 Configuration

### Environment Variables

```bash
# Fingerprint Service
export DATABASE_URL="postgresql://user:pass@localhost/chameleon"
export REDIS_URL="redis://localhost:6379"
export LOG_LEVEL="info"

# Proxy Service
export FINGERPRINT_SERVICE_URL="http://localhost:8000"
export TLS_ENABLED="false"
export PROXY_TARGET_HOST=""
```

### Configuration File

Create `chameleon_config.yaml`:

```yaml
fingerprint:
  service_url: "http://localhost:8000"
  cache_size: 1000
  rotation_interval: 300

proxy:
  service_url: "http://localhost:8080"
  upstream_proxies:
    - url: "http://proxy1.example.com:8080"
      auth:
        username: "user"
        password: "pass"
        type: "basic"
    - url: "http://proxy2.example.com:8080"
      weight: 2
      auth: null
  rotation_settings:
    strategy: "round_robin"
    interval: 300
    request_count: 100
  health_check:
    enabled: true
    interval: 60

behavior:
  mouse_movements: true
  typing_patterns: true
  human_delays: true

logging:
  level: "info"
  format: "json"
```
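
The file can then be loaded with PyYAML (an assumption — Chameleon Engine may ship its own loader); a minimal sketch:

```python
import yaml  # PyYAML: pip install pyyaml


def load_config(path="chameleon_config.yaml"):
    """Load the YAML configuration into a plain dict.
    Hypothetical helper, not part of the Chameleon Engine API."""
    with open(path) as f:
        return yaml.safe_load(f)


if __name__ == "__main__":
    config = load_config()
    print(config["proxy"]["rotation_settings"]["strategy"])
```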

### Proxy Configuration Details

The Go proxy service manages upstream proxies in two ways:

1. **No Upstream Proxies** (Default):
   ```yaml
   proxy:
     service_url: "http://localhost:8080"
     upstream_proxies: []
   ```
   Flow: Your App β†’ Go Proxy Service β†’ Target Website

2. **With Upstream Proxies**:
   ```yaml
   proxy:
     service_url: "http://localhost:8080"
     upstream_proxies:
       - url: "http://proxy1.example.com:8080"
         auth:
           username: "user"
           password: "pass"
           type: "basic"
       - url: "http://proxy2.example.com:8080"
         weight: 2
   ```
   Flow: Your App β†’ Go Proxy Service β†’ External Proxy β†’ Target Website

**See [Proxy Management Guide](docs/proxy_management.md) for detailed configuration.**

### Advanced Proxy Loading

Chameleon Engine supports multiple proxy loading methods:

```python
from chameleon_engine.proxy_loader import ProxyLoader

loader = ProxyLoader()

# Load from text files
proxies = loader.load_from_txt("proxies.txt", format_type="mixed")

# Load from CSV
proxies = loader.load_from_csv("proxies.csv")

# Generate dynamic proxies
residential_proxies = loader.generate_proxies(
    count=10,
    pattern="residential",
    geolocations=["US", "EU", "AS"]
)

# Filter proxies
http_proxies = loader.filter_proxies(proxies, protocol="http")
auth_proxies = loader.filter_proxies(proxies, has_auth=True)
```

**See [Proxy Usage Guide](PROXY_USAGE_GUIDE.md) for comprehensive examples.**
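
The `mixed` text format is not specified here; as an illustration only (not `ProxyLoader`'s actual parser), lines such as `host:port`, `user:pass@host:port`, or full proxy URLs can be normalized like this:

```python
from urllib.parse import urlparse


def parse_proxy_line(line):
    """Hypothetical helper: normalize one proxy line into a dict.
    Accepts 'host:port', 'user:pass@host:port', and full URLs
    such as 'socks5://host:port'."""
    line = line.strip()
    if "://" not in line:
        line = "http://" + line  # assume HTTP when no scheme is given
    u = urlparse(line)
    return {
        "protocol": u.scheme,
        "host": u.hostname,
        "port": u.port,
        "username": u.username,
        "password": u.password,
    }


print(parse_proxy_line("user:pass@10.0.0.1:3128"))
```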

## 📦 Installation Options

### 📖 Detailed Installation Guide
See [INSTALL.md](INSTALL.md) for comprehensive installation instructions including:
- System-specific setup (Linux, macOS, Windows)
- Docker installation
- Database configuration
- Troubleshooting common issues

### 🚀 Quick Start Guide
See [QUICK_START.md](QUICK_START.md) for a streamlined getting started experience.

## 📊 Monitoring & Debugging

### Health Checks

```bash
# Check fingerprint service
curl http://localhost:8000/health

# Check proxy service
curl http://localhost:8080/api/v1/health
```
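
The same checks can be scripted; a sketch using only the standard library, against the endpoints documented above:

```python
import urllib.request


def check_health(url, timeout=5):
    """Return True if the service answers HTTP 200 on its health endpoint."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False


if __name__ == "__main__":
    for name, url in [
        ("fingerprint", "http://localhost:8000/health"),
        ("proxy", "http://localhost:8080/api/v1/health"),
    ]:
        print(f"{name}: {'up' if check_health(url) else 'down'}")
```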

### Real-time Monitoring

```python
# Get live statistics
stats = await engine.get_proxy_stats()
print(f"Active connections: {stats['active_connections']}")
print(f"Total requests: {stats['total_requests']}")

# WebSocket monitoring (requires the `websocket-client` package)
import websocket
ws = websocket.WebSocketApp(
    "ws://localhost:8080/ws",
    on_message=lambda ws, msg: print(f"Update: {msg}"),
)
ws.run_forever()
```

### API Documentation

- **Fingerprint Service**: http://localhost:8000/docs
- **Proxy Service**: http://localhost:8080/api/v1/health

## 🧪 Testing

```bash
# Run all tests
pytest

# Run with coverage
pytest --cov=chameleon_engine --cov-report=html

# Run specific test suite
pytest tests/test_fingerprint.py -v
```

## 📖 Examples

### Quick Start Example
```bash
python examples/quick_start.py
```

### Advanced Scraping Demo
```bash
python examples/advanced_scraping_example.py
```

### Direct API Usage
```bash
python examples/api_client_example.py
```

### Proxy Management Examples
```bash
# Test proxy loading functionality
python examples/test_proxy_standalone.py

# Run proxy configuration examples
python examples/proxy_loader_examples.py
```

For more examples, see the [examples directory](examples/).

## 🔍 Advanced Features

### Custom Fingerprint Profiles

```python
# Create custom browser profile
custom_profile = {
    "browser_type": "chrome",
    "os": "windows",
    "version": "120.0.0.0",
    "screen_resolution": "1920x1080",
    "timezone": "America/New_York",
    "language": "en-US",
    "custom_headers": {
        "X-Custom-Header": "MyValue"
    }
}

profile = await fingerprint_client.create_profile(custom_profile)
```

### Behavior Simulation

```python
# Simulate human mouse movements
mouse_path = behavior_simulator.generate_mouse_path(
    start=(100, 100),
    end=(500, 300),
    duration=2.0,
    curve_type="bezier"
)

# Simulate typing with natural patterns
typing_pattern = behavior_simulator.generate_typing_pattern(
    text="Hello, World!",
    wpm=80,
    error_rate=0.02
)
```
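
`generate_mouse_path` above is the framework's API; as a standalone sketch of the underlying idea, a cubic Bezier curve with randomized control points yields a plausibly human trajectory between two screen coordinates:

```python
import random


def bezier_mouse_path(start, end, steps=50):
    """Illustrative cubic Bezier path between two points -- a sketch of the
    idea behind generate_mouse_path, not its actual implementation."""
    (x0, y0), (x3, y3) = start, end
    # Random control points between start and end add natural curvature
    x1 = x0 + (x3 - x0) * random.uniform(0.2, 0.4) + random.uniform(-50, 50)
    y1 = y0 + (y3 - y0) * random.uniform(0.2, 0.4) + random.uniform(-50, 50)
    x2 = x0 + (x3 - x0) * random.uniform(0.6, 0.8) + random.uniform(-50, 50)
    y2 = y0 + (y3 - y0) * random.uniform(0.6, 0.8) + random.uniform(-50, 50)
    path = []
    for i in range(steps + 1):
        t = i / steps
        mt = 1 - t
        # Standard cubic Bezier interpolation
        x = mt**3 * x0 + 3 * mt**2 * t * x1 + 3 * mt * t**2 * x2 + t**3 * x3
        y = mt**3 * y0 + 3 * mt**2 * t * y1 + 3 * mt * t**2 * y2 + t**3 * y3
        path.append((x, y))
    return path


points = bezier_mouse_path((100, 100), (500, 300))
```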

### Network Obfuscation

```python
# Obfuscate request timing
original_delay = 1.0
obfuscated_delay = network_obfuscator.obfuscate_timing(original_delay)

# Obfuscate headers
headers = {"User-Agent": "Mozilla/5.0..."}
obfuscated_headers = network_obfuscator.obfuscate_headers(headers)
```
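
`obfuscate_timing` is the framework's call; the underlying idea can be sketched as clamped Gaussian jitter around a base delay, so request intervals never form a machine-regular pattern (real human pauses are heavier-tailed than a Gaussian):

```python
import random


def jitter_delay(base, spread=0.3, floor=0.05):
    """Sketch of timing obfuscation, not the network_obfuscator API:
    perturb a base delay with Gaussian noise, clamped to a floor so the
    result is never zero or negative."""
    return max(floor, random.gauss(base, base * spread))


delays = [jitter_delay(1.0) for _ in range(5)]
```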

## 🛠️ Development

### Setting Up Development Environment

```bash
# Clone repository
git clone https://github.com/your-org/chameleon-engine.git
cd chameleon-engine

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install
```

### Code Quality

```bash
# Format code
black chameleon_engine/
isort chameleon_engine/

# Lint code
flake8 chameleon_engine/
mypy chameleon_engine/

# Run security checks
bandit -r chameleon_engine/
```

### Building Documentation

```bash
# Install documentation dependencies
pip install -r requirements-docs.txt

# Build docs
mkdocs build

# Serve docs locally
mkdocs serve
```

## 📈 Performance

### Benchmarks

- **Request Processing**: < 10ms average latency
- **Profile Generation**: < 50ms for complex profiles
- **Memory Usage**: ~50MB base + ~5MB per concurrent session
- **Concurrent Sessions**: 1000+ simultaneous connections

### Optimization Tips

1. **Enable Redis caching** for fingerprint profiles
2. **Use connection pooling** for database connections
3. **Configure appropriate timeouts** for target websites
4. **Monitor resource usage** with built-in metrics
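
Tip 1 in practice: cache generated profiles under a TTL so repeated sessions don't re-hit the fingerprint service. The sketch below uses an in-process dict so it runs anywhere; in production you would back the same interface with Redis (e.g. `SETEX`/`GET` via redis-py):

```python
import json
import time


class ProfileCache:
    """Minimal TTL cache for fingerprint profiles; an illustration of
    the caching idea, not a Chameleon Engine class."""

    def __init__(self, ttl=300):
        self.ttl = ttl
        self._store = {}  # key -> (expiry timestamp, serialized profile)

    def put(self, key, profile):
        self._store[key] = (time.monotonic() + self.ttl, json.dumps(profile))

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires, value = entry
        if time.monotonic() > expires:  # lazily evict stale entries
            del self._store[key]
            return None
        return json.loads(value)
```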

## 🔒 Security Considerations

### Ethical Usage

- ✅ **Respect robots.txt** files
- ✅ **Implement rate limiting** for target websites
- ✅ **Check terms of service** before scraping
- ✅ **Identify your bot** when required
- ❌ **Don't overload target servers**
- ❌ **Don't scrape personal data** without consent
- ❌ **Don't bypass security measures** illegally

### Best Practices

```python
# Ethical scraping configuration
ethical_config = {
    "rate_limit": "1 request per second",
    "respect_robots_txt": True,
    "user_agent": "MyBot/1.0 (+http://mywebsite.com/bot-info)",
    "timeout": 30,
    "max_retries": 3,
    "retry_delay": 5
}
```
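
The `respect_robots_txt` flag above can be honored with the standard library's parser; `allowed_by_robots` is a hypothetical helper operating on robots.txt content you have already fetched:

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(url, robots_txt, user_agent="MyBot/1.0"):
    """Check a URL against robots.txt rules using the stdlib parser.
    Pass the robots.txt body you fetched yourself (or use
    RobotFileParser.set_url + read to have it fetched for you)."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return rp.can_fetch(user_agent, url)


rules = "User-agent: *\nDisallow: /private/\n"
allowed_by_robots("https://example.com/products", rules)  # True
```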

## 🤝 Contributing

We welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.

### Development Workflow

1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes
4. Add tests for new functionality
5. Run the test suite: `pytest`
6. Commit your changes: `git commit -m 'Add amazing feature'`
7. Push to the branch: `git push origin feature/amazing-feature`
8. Open a Pull Request

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [uTLS](https://github.com/refraction-networking/utls) for TLS fingerprinting
- [Playwright](https://playwright.dev/) for browser automation
- [FastAPI](https://fastapi.tiangolo.com/) for the API framework
- [Gin](https://gin-gonic.com/) for the Go web framework

## 📞 Support

- 📖 [Documentation](https://chameleon-engine.readthedocs.io/)
- 🐛 [Issue Tracker](https://github.com/your-org/chameleon-engine/issues)
- 💬 [Discussions](https://github.com/your-org/chameleon-engine/discussions)
- 📧 [Email Support](mailto:support@chameleon-engine.com)

## 🗺️ Roadmap

### Version 2.0
- [ ] Machine learning-based behavior optimization
- [ ] Advanced CAPTCHA solving integration
- [ ] Cloud deployment templates
- [ ] Web-based management dashboard

### Version 1.5
- [ ] Enhanced mobile browser fingerprinting
- [ ] WebGL and Canvas fingerprinting
- [ ] Audio fingerprinting capabilities
- [x] Advanced proxy pool management
- [x] Multi-format proxy loading (TXT, CSV, JSON)
- [x] Dynamic proxy generation (residential, datacenter, geo-targeted)
- [x] Comprehensive proxy filtering and validation

### Version 1.2
- [x] Microservices architecture
- [x] Go-based proxy service
- [x] Real-time monitoring
- [x] Docker deployment support

---

**Made with ❤️ for the ethical web scraping community**

If you find this project useful, please consider giving it a ⭐ on GitHub!

            

install -r requirements-docs.txt\n\n# Build docs\nmkdocs build\n\n# Serve docs locally\nmkdocs serve\n```\n\n## \ud83d\udcc8 Performance\n\n### Benchmarks\n\n- **Request Processing**: < 10ms average latency\n- **Profile Generation**: < 50ms for complex profiles\n- **Memory Usage**: ~50MB base + ~5MB per concurrent session\n- **Concurrent Sessions**: 1000+ simultaneous connections\n\n### Optimization Tips\n\n1. **Enable Redis caching** for fingerprint profiles\n2. **Use connection pooling** for database connections\n3. **Configure appropriate timeouts** for target websites\n4. **Monitor resource usage** with built-in metrics\n\n## \ud83d\udd12 Security Considerations\n\n### Ethical Usage\n\n- \u2705 **Respect robots.txt** files\n- \u2705 **Implement rate limiting** for target websites\n- \u2705 **Check terms of service** before scraping\n- \u2705 **Identify your bot** when required\n- \u274c **Don't overload target servers**\n- \u274c **Don't scrape personal data** without consent\n- \u274c **Don't bypass security measures** illegally\n\n### Best Practices\n\n```python\n# Ethical scraping configuration\nethical_config = {\n    \"rate_limit\": \"1 request per second\",\n    \"respect_robots_txt\": True,\n    \"user_agent\": \"MyBot/1.0 (+http://mywebsite.com/bot-info)\",\n    \"timeout\": 30,\n    \"max_retries\": 3,\n    \"retry_delay\": 5\n}\n```\n\n## \ud83e\udd1d Contributing\n\nWe welcome contributions! Please see our [Contributing Guide](CONTRIBUTING.md) for details.\n\n### Development Workflow\n\n1. Fork the repository\n2. Create a feature branch: `git checkout -b feature/amazing-feature`\n3. Make your changes\n4. Add tests for new functionality\n5. Run the test suite: `pytest`\n6. Commit your changes: `git commit -m 'Add amazing feature'`\n7. Push to the branch: `git push origin feature/amazing-feature`\n8. 
Open a Pull Request\n\n## \ud83d\udcc4 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## \ud83d\ude4f Acknowledgments\n\n- [uTLS](https://github.com/refraction-networking/utls) for TLS fingerprinting\n- [Playwright](https://playwright.dev/) for browser automation\n- [FastAPI](https://fastapi.tiangolo.com/) for the API framework\n- [Gin](https://gin-gonic.com/) for the Go web framework\n\n## \ud83d\udcde Support\n\n- \ud83d\udcd6 [Documentation](https://chameleon-engine.readthedocs.io/)\n- \ud83d\udc1b [Issue Tracker](https://github.com/your-org/chameleon-engine/issues)\n- \ud83d\udcac [Discussions](https://github.com/your-org/chameleon-engine/discussions)\n- \ud83d\udce7 [Email Support](mailto:support@chameleon-engine.com)\n\n## \ud83d\uddfa\ufe0f Roadmap\n\n### Version 2.0\n- [ ] Machine learning-based behavior optimization\n- [ ] Advanced CAPTCHA solving integration\n- [ ] Cloud deployment templates\n- [ ] Web-based management dashboard\n\n### Version 1.5\n- [ ] Enhanced mobile browser fingerprinting\n- [ ] WebGL and Canvas fingerprinting\n- [ ] Audio fingerprinting capabilities\n- [x] Advanced proxy pool management\n- [x] Multi-format proxy loading (TXT, CSV, JSON)\n- [x] Dynamic proxy generation (residential, datacenter, geo-targeted)\n- [x] Comprehensive proxy filtering and validation\n\n### Version 1.2\n- [x] Microservices architecture\n- [x] Go-based proxy service\n- [x] Real-time monitoring\n- [x] Docker deployment support\n\n---\n\n**Made with \u2764\ufe0f for the ethical web scraping community**\n\nIf you find this project useful, please consider giving it a \u2b50 on GitHub!\n",
    "bugtrack_url": null,
    "license": "MIT License\n        \n        Copyright (c) 2024 Chameleon Engine Contributors\n        \n        Permission is hereby granted, free of charge, to any person obtaining a copy\n        of this software and associated documentation files (the \"Software\"), to deal\n        in the Software without restriction, including without limitation the rights\n        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell\n        copies of the Software, and to permit persons to whom the Software is\n        furnished to do so, subject to the following conditions:\n        \n        The above copyright notice and this permission notice shall be included in all\n        copies or substantial portions of the Software.\n        \n        THE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\n        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\n        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\n        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\n        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\n        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\n        SOFTWARE.",
    "summary": "Advanced stealth web scraping framework with browser fingerprinting and network obfuscation",
    "version": "1.0.0",
    "project_urls": {
        "Bug Tracker": "https://github.com/your-org/chameleon-engine/issues",
        "Changelog": "https://github.com/your-org/chameleon-engine/blob/main/CHANGELOG.md",
        "Discussions": "https://github.com/your-org/chameleon-engine/discussions",
        "Documentation": "https://chameleon-engine.readthedocs.io",
        "Homepage": "https://github.com/your-org/chameleon-engine",
        "Repository": "https://github.com/your-org/chameleon-engine.git"
    },
    "split_keywords": [
        "web-scraping",
        " browser-fingerprinting",
        " stealth-scraping",
        " anti-bot",
        " automation",
        " data-collection",
        " fingerprinting",
        " proxy",
        " microservices"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "47f84395f0ddaaa48db1b3510ffaa6d5603276ba12e442fe76767d5dc7259bcb",
                "md5": "f3c3d1a04589ff558294336b4034668b",
                "sha256": "880168bbb6d84969fec58bc721864f8d515c315144b8fa7f6adc1ef5bc53ca7a"
            },
            "downloads": -1,
            "filename": "chameleon_engine-1.0.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "f3c3d1a04589ff558294336b4034668b",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 178966,
            "upload_time": "2025-10-21T14:32:08",
            "upload_time_iso_8601": "2025-10-21T14:32:08.969311Z",
            "url": "https://files.pythonhosted.org/packages/47/f8/4395f0ddaaa48db1b3510ffaa6d5603276ba12e442fe76767d5dc7259bcb/chameleon_engine-1.0.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "c7d54b93d3d5a357424f7bddcd99075a96460183332545da8dcefd5398a4746d",
                "md5": "555e62548b1db630c743a7e5a616e05b",
                "sha256": "4144dae885baa7e3e8c1e858c8e714a68dcb5f84a5f479b3678b6869de8cd30b"
            },
            "downloads": -1,
            "filename": "chameleon_engine-1.0.0.tar.gz",
            "has_sig": false,
            "md5_digest": "555e62548b1db630c743a7e5a616e05b",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 207655,
            "upload_time": "2025-10-21T14:32:12",
            "upload_time_iso_8601": "2025-10-21T14:32:12.383535Z",
            "url": "https://files.pythonhosted.org/packages/c7/d5/4b93d3d5a357424f7bddcd99075a96460183332545da8dcefd5398a4746d/chameleon_engine-1.0.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-10-21 14:32:12",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "your-org",
    "github_project": "chameleon-engine",
    "github_not_found": true,
    "lcname": "chameleon-engine"
}
        