nextjs-hydration-parser

Name	nextjs-hydration-parser
Version	0.2.0
home_page	https://github.com/kennyaires/nextjs-hydration-parser
Summary	A Python library for extracting and parsing Next.js hydration data from HTML content
upload_time	2025-07-30 00:36:16
maintainer	None
docs_url	None
author	Kenny Aires
requires_python	>=3.7
license	None
keywords	nextjs hydration html parser web-scraping javascript

# Next.js Hydration Parser

[![PyPI version](https://badge.fury.io/py/nextjs-hydration-parser.svg)](https://badge.fury.io/py/nextjs-hydration-parser)
[![Python versions](https://img.shields.io/pypi/pyversions/nextjs-hydration-parser.svg)](https://pypi.org/project/nextjs-hydration-parser/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

A specialized Python library for extracting and parsing Next.js 13+ hydration data from raw HTML pages. When scraping Next.js applications, the server-side rendered HTML contains complex hydration data chunks embedded in `self.__next_f.push()` calls that need to be properly assembled and parsed to access the underlying application data.

## The Problem

Next.js 13+ applications with App Router use a sophisticated hydration system that splits data across multiple script chunks in the raw HTML. When you scrape these pages (before JavaScript execution), you get fragments like:

```html
<script>self.__next_f.push([1,"partial data chunk 1"])</script>
<script>self.__next_f.push([1,"continuation of data"])</script>
<script>self.__next_f.push([2,"{\"products\":[{\"id\":1,\"name\":\"Product\"}]}"])</script>
```

This data is:
- **Split across multiple chunks** that need to be reassembled
- **Encoded in various formats** (JSON strings, base64, escaped content)
- **Mixed with rendering metadata** that needs to be filtered out
- **Difficult to parse** due to complex escaping and nested structures

This library solves these challenges by intelligently combining chunks, handling multiple encoding formats, and extracting the meaningful application data.
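
To make the reassembly step concrete, here is a minimal standard-library sketch of the idea. It is not the library's implementation: the `PUSH_RE` pattern and the `reassemble_chunks` helper are hypothetical names, and the sketch assumes every payload is a double-quoted string literal whose escapes are also valid JSON escapes.

```python
import json
import re
from collections import defaultdict

# Hypothetical pattern: self.__next_f.push([<id>,"<string literal>"])
PUSH_RE = re.compile(r'self\.__next_f\.push\(\[(\d+),\s*("(?:[^"\\]|\\.)*")\]\)')


def reassemble_chunks(html):
    """Group payloads by chunk ID and join them in document order."""
    parts = defaultdict(list)
    for match in PUSH_RE.finditer(html):
        chunk_id, literal = match.groups()
        # json.loads decodes the string literal (unescapes \" and \n).
        parts[chunk_id].append(json.loads(literal))
    return {chunk_id: "".join(pieces) for chunk_id, pieces in parts.items()}


html = (
    '<script>self.__next_f.push([1,"partial data chunk 1"])</script>'
    '<script>self.__next_f.push([1," continuation of data"])</script>'
    '<script>self.__next_f.push([2,"{\\"products\\":[{\\"id\\":1,\\"name\\":\\"Product\\"}]}"])</script>'
)

combined = reassemble_chunks(html)
print(combined["1"])
print(json.loads(combined["2"])["products"][0]["name"])
```

Real pages are messier than this (payloads that are not plain strings, non-JSON escapes, metadata rows), which is the part the library is meant to handle for you.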

## Features

- 🕷️ **Web Scraping Focused** - Designed specifically for parsing raw Next.js 13+ pages before JavaScript execution
- 🧩 **Chunk Reassembly** - Intelligently combines data fragments split across multiple `self.__next_f.push()` calls
- 🔍 **Multi-format Parsing** - Handles JSON strings, base64-encoded data, escaped content, and complex nested structures
- 🎯 **Data Extraction** - Filters out rendering metadata to extract meaningful application data (products, users, API responses, etc.)
- 🛠️ **Robust Error Handling** - Continues processing even with malformed chunks, providing debugging information
- 🔎 **Pattern Matching** - Search and filter extracted data by keys or content patterns
- ⚡ **Performance Optimized** - Efficiently processes large HTML files with hundreds of hydration chunks

## Use Cases

Perfect for:
- **E-commerce scraping** - Extract product catalogs, prices, and inventory data
- **Content aggregation** - Collect articles, blog posts, and structured content
- **API reverse engineering** - Understand data structures used by Next.js applications
- **SEO analysis** - Extract meta information and structured data for analysis

## Installation

```bash
pip install nextjs-hydration-parser
```

### Requirements

- Python 3.7+
- `chompjs` for JavaScript object parsing
- `requests` (for scraping examples)

The library is lightweight with minimal dependencies, designed for integration into existing scraping pipelines.

## Quick Start

```python
from nextjs_hydration_parser import NextJSHydrationDataExtractor
import requests

# Create an extractor instance
extractor = NextJSHydrationDataExtractor()

# Scrape a Next.js page (before JavaScript execution)
response = requests.get('https://example-nextjs-ecommerce.com/products', timeout=30)
html_content = response.text

# Extract and parse the hydration data
chunks = extractor.parse(html_content)

# Process the results to find meaningful data
for chunk in chunks:
    print(f"Chunk ID: {chunk['chunk_id']}")
    for item in chunk['extracted_data']:
        if item['type'] == 'colon_separated':
            # Often contains API response data
            print(f"API Data: {item['data']}")
        elif 'products' in str(item['data']):
            # Found product data
            print(f"Products: {item['data']}")
```

### Real-world Example: E-commerce Scraping

```python
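from nextjs_hydration_parser import NextJSHydrationDataExtractor

# Extract product data from a saved Next.js e-commerce page
extractor = NextJSHydrationDataExtractor()
with open('product_page.html', 'r', encoding='utf-8') as f:
    html_content = f.read()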

chunks = extractor.parse(html_content)

# Find product information
products = extractor.find_data_by_pattern(chunks, 'product')
for product_data in products:
    if isinstance(product_data['value'], dict):
        product = product_data['value']
        print(f"Product: {product.get('name', 'Unknown')}")
        print(f"Price: ${product.get('price', 'N/A')}")
        print(f"Stock: {product.get('inventory', 'Unknown')}")
```

## Advanced Usage

### Scraping Complex Next.js Applications

```python
import requests
from nextjs_hydration_parser import NextJSHydrationDataExtractor

def scrape_nextjs_data(url):
    """Scrape and extract data from a Next.js application"""
    
    # Get raw HTML (before JavaScript execution)
    headers = {'User-Agent': 'Mozilla/5.0 (compatible; DataExtractor/1.0)'}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # Fail early instead of parsing an error page
    
    # Parse hydration data
    extractor = NextJSHydrationDataExtractor()
    chunks = extractor.parse(response.text)
    
    # Extract meaningful data
    extracted_data = {}
    
    for chunk in chunks:
        if chunk['chunk_id'] == 'error':
            continue  # Skip malformed chunks
            
        for item in chunk['extracted_data']:
            data = item['data']
            
            # Look for common data patterns
            if isinstance(data, dict):
                # API responses often contain these keys
                for key in ['products', 'users', 'posts', 'data', 'results']:
                    if key in data:
                        extracted_data[key] = data[key]
                        
    return extracted_data

# Usage
data = scrape_nextjs_data('https://nextjs-shop.example.com')
print(f"Found {len(data.get('products', []))} products")
```

### Handling Large HTML Files

When scraping large Next.js applications, you might encounter hundreds of hydration chunks:

```python
from nextjs_hydration_parser import NextJSHydrationDataExtractor

# Read from file
with open('large_nextjs_page.html', 'r', encoding='utf-8') as f:
    html_content = f.read()

# Parse and extract
extractor = NextJSHydrationDataExtractor()
chunks = extractor.parse(html_content)

print(f"Found {len(chunks)} hydration chunks")

# Get overview of all available data keys
all_keys = extractor.get_all_keys(chunks)
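print("Most common data keys:")
# Sort by occurrence count so the top 20 really are the most common keys
top_keys = sorted(all_keys.items(), key=lambda kv: kv[1], reverse=True)[:20]
for key, count in top_keys: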
    print(f"  {key}: {count} occurrences")

# Focus on specific data types
api_data = []
for chunk in chunks:
    for item in chunk['extracted_data']:
        if item['type'] == 'colon_separated' and 'api' in item.get('identifier', '').lower():
            api_data.append(item['data'])

print(f"Found {len(api_data)} API data chunks")
```

## API Reference

### `NextJSHydrationDataExtractor`

The main class for extracting Next.js hydration data.

#### Methods

- **`parse(html_content: str) -> List[Dict[str, Any]]`**
  
  Parse Next.js hydration data from HTML content.
  
  - `html_content`: Raw HTML string containing script tags
  - Returns: List of parsed data chunks

- **`get_all_keys(parsed_chunks: List[Dict], max_depth: int = 3) -> Dict[str, int]`**
  
  Extract all unique keys from parsed chunks.
  
  - `parsed_chunks`: Output from `parse()` method
  - `max_depth`: Maximum depth to traverse
  - Returns: Dictionary of keys and their occurrence counts

- **`find_data_by_pattern(parsed_chunks: List[Dict], pattern: str) -> List[Any]`**
  
  Find data matching a specific pattern.
  
  - `parsed_chunks`: Output from `parse()` method  
  - `pattern`: Key pattern to search for
  - Returns: List of matching data items
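
As a rough illustration of what depth-limited key counting and key-pattern search can look like, here is a standard-library sketch. It is not the library's code: `count_keys` and `find_by_key` are hypothetical helpers, lists are assumed not to consume a depth level, and only the `value` field of each match is confirmed by the examples above (`path` and `key` are assumptions).

```python
from collections import Counter


def count_keys(value, max_depth=3, _depth=0, _counts=None):
    """Count dict keys down to max_depth nested dict levels."""
    counts = Counter() if _counts is None else _counts
    if _depth >= max_depth:
        return counts
    if isinstance(value, dict):
        for key, child in value.items():
            counts[key] += 1
            count_keys(child, max_depth, _depth + 1, counts)
    elif isinstance(value, list):
        # Assumption: list nesting does not count as a depth level.
        for child in value:
            count_keys(child, max_depth, _depth, counts)
    return counts


def find_by_key(value, pattern, _path=""):
    """Yield a match for every dict key containing pattern (case-insensitive)."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{_path}.{key}" if _path else str(key)
            if pattern.lower() in str(key).lower():
                yield {"path": path, "key": key, "value": child}
            yield from find_by_key(child, pattern, path)
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_by_key(child, pattern, f"{_path}[{index}]")


sample = {
    "products": [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}],
    "meta": {"page": {"size": 10}},
}

shallow = count_keys(sample, max_depth=2)
matches = list(find_by_key(sample, "prod"))
print(dict(shallow))
print(matches[0]["path"])
```

With `max_depth=2` the `size` key under `meta.page` is not visited, which is why a low depth gives a quick overview while a higher one gives a fuller inventory.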

## Data Structure

The parser returns data in the following structure:

```python
[
    {
        "chunk_id": "1",  # ID from self.__next_f.push([ID, data])
        "extracted_data": [
            {
                "type": "colon_separated|standalone_json|whole_text",
                "data": {...},  # Parsed JavaScript/JSON object
                "identifier": "...",  # For colon_separated type
                "start_position": 123  # For standalone_json type
            }
        ],
        "chunk_count": 1,  # Number of chunks with this ID
        "_positions": [123]  # Original positions in HTML
    }
]
```
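
A small sketch of walking this structure, using hand-written sample data in the documented shape rather than real parser output. The `iter_items` helper is hypothetical, and the sample values are made up for illustration.

```python
def iter_items(chunks, item_type=None):
    """Yield (chunk_id, item) pairs, optionally filtered by item type."""
    for chunk in chunks:
        if chunk.get("chunk_id") == "error":
            continue  # Skip chunks the parser flagged as malformed
        for item in chunk.get("extracted_data", []):
            if item_type is None or item.get("type") == item_type:
                yield chunk["chunk_id"], item


# Hand-written sample that mimics the documented shape
sample_chunks = [
    {
        "chunk_id": "1",
        "extracted_data": [
            {"type": "colon_separated", "identifier": "a", "data": {"users": [{"id": 1}]}},
            {"type": "standalone_json", "start_position": 42, "data": {"products": []}},
        ],
        "chunk_count": 2,
        "_positions": [10, 90],
    },
    {"chunk_id": "error", "extracted_data": [], "chunk_count": 1, "_positions": [200]},
]

found = [(cid, item["type"]) for cid, item in iter_items(sample_chunks, "standalone_json")]
print(found)
```

Using `.get()` for `identifier` and `start_position` is the safer habit here, since each of those fields is only documented for one item type.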

## Supported Data Formats

The parser handles various data formats commonly found in Next.js 13+ hydration chunks:

### 1. JSON Strings
```javascript
self.__next_f.push([1, "{\"products\":[{\"id\":1,\"name\":\"Laptop\",\"price\":999}]}"])
```

### 2. Base64 + JSON Combinations  
```javascript
self.__next_f.push([2, "eyJhcGlLZXkiOiJ4eXoifQ==:{\"data\":{\"users\":[{\"id\":1}]}}"])
```
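
For illustration, a payload like the one above can be taken apart by hand with the standard library. This is only a sketch under the assumption that the part before the first colon is the identifier and the remainder is JSON; `split_colon_separated` is a hypothetical helper, not part of the library.

```python
import base64
import json


def split_colon_separated(payload):
    """Split 'identifier:json' at the first colon and parse the JSON part."""
    identifier, _, body = payload.partition(":")
    return identifier, json.loads(body)


payload = 'eyJhcGlLZXkiOiJ4eXoifQ==:{"data":{"users":[{"id":1}]}}'
identifier, data = split_colon_separated(payload)

# In this example the identifier happens to be base64-encoded JSON.
decoded_identifier = json.loads(base64.b64decode(identifier))

print(decoded_identifier)
print(data["data"]["users"])
```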

### 3. JavaScript Objects
```javascript
self.__next_f.push([3, "{key: 'value', items: [1, 2, 3], nested: {deep: true}}"])
```

### 4. Escaped Content
```javascript  
self.__next_f.push([4, "\"escaped content with \\\"quotes\\\" and newlines\\n\""])
```
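
The double escaping in this example can be undone in two passes. The sketch below assumes the JavaScript string literal only uses escapes that are also valid in JSON, so `json.loads` can stand in for a JavaScript string decoder.

```python
import json

# The string literal exactly as it appears in the push() call above
js_literal = r'"\"escaped content with \\\"quotes\\\" and newlines\\n\""'

payload = json.loads(js_literal)  # First pass: decode the JavaScript string literal
text = json.loads(payload)        # Second pass: decode the JSON string it contains

print(repr(payload))
print(repr(text))
```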

### 5. Multi-chunk Data
```javascript
// Data split across multiple chunks with same ID
self.__next_f.push([5, "first part of data"])
self.__next_f.push([5, " continued here"])
self.__next_f.push([5, " and final part"])
```

### 6. Complex Nested Structures
Next.js often embeds API responses, page props, and component data in deeply nested formats that the parser can extract and flatten for easy access.

## How Next.js 13+ Hydration Works

Understanding the hydration process helps explain why this library is necessary:

1. **Server-Side Rendering**: Next.js renders your page on the server, generating static HTML
2. **Data Embedding**: Instead of making separate API calls, Next.js may embed the data directly in the HTML using `self.__next_f.push()` calls
3. **Chunk Splitting**: Large data sets are split across multiple chunks to optimize loading
4. **Client Hydration**: When JavaScript loads, these chunks are reassembled and used to hydrate React components

When scraping, you're intercepting step 2 - getting the raw HTML with embedded data before the JavaScript processes it. This gives you access to all the data the application uses, but in a fragmented format that needs intelligent parsing.

**Why not just use the rendered page?** 
- Faster scraping (no JavaScript execution wait time)
- Access to internal data structures not visible in the DOM
- Bypasses client-side anti-scraping measures
- Gets raw API responses before component filtering/transformation

## Error Handling

The parser includes robust error handling:

- **Malformed data**: Continues processing and marks chunks with errors
- **Multiple parsing strategies**: Falls back to alternative parsing methods
- **Partial data**: Handles incomplete or truncated data gracefully
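
The fallback idea can be sketched as a chain of parsing strategies that never raises. This is an illustration only: the strategy names and the returned `type` values (`json`, `embedded_json`, `raw_text`) are hypothetical and do not correspond to the library's own types.

```python
import json


def _embedded_json(text):
    """Parse the span between the first '{' and the last '}'."""
    return json.loads(text[text.find("{"): text.rfind("}") + 1])


def parse_with_fallbacks(text):
    """Try stricter strategies first; fall back to raw text instead of raising."""
    strategies = [("json", json.loads), ("embedded_json", _embedded_json)]
    for name, strategy in strategies:
        try:
            return {"type": name, "data": strategy(text)}
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            continue
    return {"type": "raw_text", "data": text, "error": "no strategy matched"}


clean = parse_with_fallbacks('{"a": 1}')
prefixed = parse_with_fallbacks('12:{"a": 1}')
truncated = parse_with_fallbacks('{"a": ')

print(clean["type"], prefixed["type"], truncated["type"])
```

Returning a marked result for unparseable input, rather than raising, is what lets a caller keep processing the remaining chunks and inspect the failures afterwards.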

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

## Development Setup

```bash
# Clone the repository
git clone https://github.com/kennyaires/nextjs-hydration-parser.git
cd nextjs-hydration-parser

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode with testing dependencies
pip install -e ".[dev]"  # Quoted so shells like zsh do not expand the brackets

# Run tests
pytest tests/ -v

# Run formatting
black nextjs_hydration_parser/ tests/

# Test with real Next.js sites
python examples/scrape_example.py
```

### Testing with Real Sites

The library includes examples for testing with popular Next.js sites:

```bash
# Test with different types of Next.js applications
python examples/test_ecommerce.py
python examples/test_blog.py  
python examples/test_social.py
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Legal Disclaimer

This project is not affiliated with or endorsed by Vercel, Next.js, or any related entity.  
All trademarks and brand names are the property of their respective owners.

This library is intended for ethical use only. Users are solely responsible for ensuring that their use of this software complies with applicable laws, website terms of service, and data usage policies. The authors disclaim any liability for misuse or violations resulting from the use of this tool.
