lean4url


- Name: lean4url
- Version: 0.1.0
- Home page: https://github.com/rexwzh/lean4url
- Summary: High-performance lzstring compression library compatible with the JavaScript implementation
- Upload time: 2025-08-02 18:53:37
- Author: Rex Wang
- Requires Python: >=3.8
- Requirements: none recorded

# lean4url

[![PyPI version](https://badge.fury.io/py/lean4url.svg)](https://badge.fury.io/py/lean4url)
[![Python Version](https://img.shields.io/pypi/pyversions/lean4url.svg)](https://pypi.org/project/lean4url/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Tests](https://github.com/rexwzh/lean4url/workflows/Tests/badge.svg)](https://github.com/rexwzh/lean4url/actions)
[![Coverage](https://codecov.io/gh/rexwzh/lean4url/branch/main/graph/badge.svg)](https://codecov.io/gh/rexwzh/lean4url)

A high-performance lzstring compression library that is fully compatible with the JavaScript implementation.

## Features

✅ **Fully Compatible** - 100% compatible with the [pieroxy/lz-string](https://github.com/pieroxy/lz-string) JavaScript implementation

✅ **Unicode Support** - Correctly handles all Unicode characters, including emoji and special symbols

✅ **URL Friendly** - Built-in URL encoding/decoding functionality

✅ **High Performance** - Optimized algorithm implementation

✅ **Type Safe** - Complete type annotation support

✅ **Thoroughly Tested** - Includes comparative tests against the JavaScript version

## Background

Existing Python lzstring packages mishandle some Unicode characters. For example, compressing the character "𝔓" to Base64 gives:

- **Existing package output**: `sirQ`
- **JavaScript original output**: `qwbmRdo=`
- **lean4url output**: `qwbmRdo=` ✅

lean4url solves this problem by correctly simulating JavaScript's UTF-16 encoding behavior.

## Installation

```bash
pip install lean4url
```

## Quick Start

### Basic Compression/Decompression

```python
from lean4url import LZString

# Create instance
lz = LZString()

# Compress string
original = "Hello, 世界! 🌍"
compressed = lz.compress_to_base64(original)
print(f"Compressed: {compressed}")

# Decompress string
decompressed = lz.decompress_from_base64(compressed)
print(f"Decompressed: {decompressed}")
# Output: Hello, 世界! 🌍
```

### URL Encoding/Decoding

```python
from lean4url import encode_url, decode_url

# Encode data to URL
data = "This is data to be encoded"
url = encode_url(data, base_url="https://example.com/share")
print(f"Encoded URL: {url}")
# Output: https://example.com/share/#codez=BIUwNmD2A0AEDukBOYAmBMYAZhAY...

# Decode data from URL
result = decode_url(url)
print(f"Decoded result: {result['codez']}")
# Output: This is data to be encoded
```

### URL Encoding with Parameters

```python
from lean4url import encode_url, decode_url

# Add extra parameters when encoding
code = "function hello() { return 'world'; }"
url = encode_url(
    code, 
    base_url="https://playground.example.com",
    lang="javascript",
    theme="dark",
    url="https://docs.example.com"  # This parameter will be URL encoded
)

print(f"Complete URL: {url}")
# Output: https://playground.example.com/#codez=BIUwNmD2A0A...&lang=javascript&theme=dark&url=https%3A//docs.example.com

# Decode URL to get all parameters
params = decode_url(url)
print(f"Code: {params['codez']}")
print(f"Language: {params['lang']}")
print(f"Theme: {params['theme']}")
print(f"Documentation link: {params['url']}")
```
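
The `%3A//` in the sample output above matches what `urllib.parse.quote` produces with its default `safe="/"` setting. The sketch below rebuilds that URL shape with the standard library only. `build_fragment_url` is a hypothetical helper, not part of lean4url, and it assumes the compressed payload is already safe to place in a URL fragment.

```python
from urllib.parse import quote


def build_fragment_url(base_url: str, payload: str, **params: str) -> str:
    """Hypothetical helper: append '#codez=<payload>&key=value...' to base_url."""
    # Assumption: the payload is already URL-safe, so it is inserted as-is.
    parts = [f"codez={payload}"]
    # Extra parameters are percent-encoded; quote() keeps "/" unescaped by default.
    parts += [f"{key}={quote(value)}" for key, value in params.items()]
    return base_url.rstrip("/") + "/#" + "&".join(parts)


print(build_fragment_url(
    "https://playground.example.com",
    "PAYLOAD",  # placeholder standing in for the compressed data
    lang="javascript",
    theme="dark",
    url="https://docs.example.com",
))
# https://playground.example.com/#codez=PAYLOAD&lang=javascript&theme=dark&url=https%3A//docs.example.com
```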

## API Reference

### LZString Class

```python
class LZString:
    def compress_to_base64(self, input_str: str) -> str:
        """Compress string to Base64 format"""
        
    def decompress_from_base64(self, input_str: str) -> str:
        """Decompress string from Base64 format"""
        
    def compress_to_utf16(self, input_str: str) -> str:
        """Compress string to UTF16 format"""
        
    def decompress_from_utf16(self, input_str: str) -> str:
        """Decompress string from UTF16 format"""
```

### URL Utility Functions

```python
from typing import Optional

def encode_url(data: str, base_url: Optional[str] = None, **kwargs) -> str:
    """
    Encode input string and build complete URL.
    
    Args:
        data: Data to be encoded
        base_url: URL prefix
        **kwargs: Additional URL parameters
        
    Returns:
        Built complete URL
    """

def decode_url(url: str) -> dict:
    """
    Decode original data from URL.
    
    Args:
        url: Complete URL
        
    Returns:
        Dictionary containing all parameters, with codez decoded
    """
```
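
As a rough illustration of the parsing half of `decode_url`, the following sketch splits the URL fragment into a dictionary using only the standard library. `parse_fragment_params` is a hypothetical name, and unlike the real function it leaves the `codez` value compressed; how lean4url parses URLs internally is not documented here, so treat this as an assumption.

```python
from typing import Dict
from urllib.parse import unquote, urlsplit


def parse_fragment_params(url: str) -> Dict[str, str]:
    """Hypothetical helper: read 'key=value' pairs from the part after '#'."""
    fragment = urlsplit(url).fragment
    params: Dict[str, str] = {}
    for pair in fragment.split("&"):
        if not pair:
            continue
        # Split on the first "=" only, so Base64 padding in the value survives.
        key, _, value = pair.partition("=")
        params[key] = unquote(value)
    return params


params = parse_fragment_params(
    "https://playground.example.com/#codez=qwbmRdo=&lang=javascript"
    "&url=https%3A//docs.example.com"
)
print(params["codez"])  # qwbmRdo= (still compressed in this sketch)
print(params["url"])    # https://docs.example.com
```

The sketch avoids `urllib.parse.parse_qs` because that function turns `+` into a space, which would corrupt a Base64 payload.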

## Development

### Environment Setup

```bash
# Clone repository
git clone https://github.com/rexwzh/lean4url.git
cd lean4url

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# or
venv\Scripts\activate  # Windows

# Install development dependencies
pip install -e ".[dev]"
```

### Running Tests

```bash
# Start JavaScript test service
cd tests/js_service
npm install
node server.js &
cd ../.. 

# Run Python tests
pytest

# Run tests with coverage
pytest --cov=lean4url --cov-report=html
```

### Code Formatting

```bash
# Format code
black src tests
isort src tests

# Type checking
mypy src

# Code checking
flake8 src tests
```

## Algorithm Principles

lean4url implements the lz-string scheme, an LZW-style variant of the LZ78 compression algorithm. Its core ideas are:

1. **Dictionary Building** - Dynamically build a dictionary of character sequences while scanning the input
2. **Sequence Matching** - At each step, find the longest sequence already in the dictionary and emit its code
3. **UTF-16 Compatibility** - Simulate JavaScript's UTF-16 surrogate pair behavior
4. **Base64 Encoding** - Encode the compressed output as Base64 text for embedding in URLs
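
The dictionary-building and longest-match steps can be illustrated with a textbook LZW coder. This is a simplified sketch, not lean4url's implementation: it works on Python characters rather than UTF-16 code units, seeds the dictionary with the input's own alphabet, and returns integer codes instead of the bit-packed stream that lz-string produces. The function names are made up for this example.

```python
from typing import Dict, List, Tuple


def lzw_compress(text: str) -> Tuple[List[str], List[int]]:
    """Return (alphabet, codes); each code names the longest known phrase."""
    alphabet = sorted(set(text))
    dictionary: Dict[str, int] = {ch: i for i, ch in enumerate(alphabet)}
    codes: List[int] = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate  # keep extending the current match
        else:
            codes.append(dictionary[phrase])         # emit the longest match
            dictionary[candidate] = len(dictionary)  # learn the new phrase
            phrase = ch
    if phrase:
        codes.append(dictionary[phrase])
    return alphabet, codes


def lzw_decompress(alphabet: List[str], codes: List[int]) -> str:
    """Rebuild the text by replaying the same dictionary growth."""
    if not codes:
        return ""
    entries: List[str] = list(alphabet)
    previous = entries[codes[0]]
    pieces = [previous]
    for code in codes[1:]:
        if code < len(entries):
            current = entries[code]
        else:
            # The code refers to the phrase that is being defined right now.
            current = previous + previous[0]
        entries.append(previous + current[0])
        pieces.append(current)
        previous = current
    return "".join(pieces)


alphabet, codes = lzw_compress("abababab")
print(codes)                            # [0, 1, 2, 4, 1]
print(lzw_decompress(alphabet, codes))  # abababab
```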

### Unicode Handling

The key difference from existing Python packages is in Unicode character handling:

- **JavaScript**: Uses UTF-16 surrogate pairs, "𝔓" → `[0xD835, 0xDD13]`
- **Existing Python packages**: Use Unicode code points, "𝔓" → `[0x1D513]`
- **lean4url**: Simulates JavaScript behavior, ensuring compatibility
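
The conversion from a Python string to the UTF-16 code units that JavaScript sees can be sketched as follows. This is a minimal illustration of the surrogate pair arithmetic, with `utf16_code_units` as a hypothetical name; lean4url's internal implementation may differ.

```python
from typing import List


def utf16_code_units(text: str) -> List[int]:
    """Return the UTF-16 code units JavaScript would see for `text`."""
    units: List[int] = []
    for ch in text:
        code_point = ord(ch)
        if code_point < 0x10000:
            units.append(code_point)  # BMP character: a single code unit
        else:
            offset = code_point - 0x10000               # 20-bit offset
            units.append(0xD800 + (offset >> 10))       # high surrogate
            units.append(0xDC00 + (offset & 0x3FF))     # low surrogate
    return units


print(len("🌍"))                                  # 1 (Python counts code points)
print([hex(u) for u in utf16_code_units("🌍")])   # ['0xd83c', '0xdf0d']
```

A compressor that iterates over these code units instead of Python characters sees the same input sequence as the JavaScript implementation does.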

## License

MIT License - See the [LICENSE](LICENSE) file for details.

## Contributing

Issues and Pull Requests are welcome!

## Changelog

### v0.1.0
- Initial version release
- Complete lzstring algorithm implementation
- JavaScript compatibility
- URL encoding/decoding functionality
- Complete test suite

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/rexwzh/lean4url",
    "name": "lean4url",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": null,
    "author": "Rex Wang",
    "author_email": "1073853456@qq.com",
    "download_url": "https://files.pythonhosted.org/packages/76/3c/e9ca6992172ff905e5db7995b488da3f91c04f108561de3691e459c6833c/lean4url-0.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": null,
    "summary": "High-performance lzstring compression library compatible with JavaScript implementation",
    "version": "0.1.0",
    "project_urls": {
        "Bug Reports": "https://github.com/rexwzh/lean4url/issues",
        "Homepage": "https://github.com/rexwzh/lean4url",
        "Source": "https://github.com/rexwzh/lean4url"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "304702cb9d65068c69d9533408bfbf878fd50b6ec464664e57b3e317b58e2292",
                "md5": "b68c68482447315adb1d0e012327d027",
                "sha256": "e69830064bc9f7a6e350708bd0c21369b026c95e1ce1d31a509e005fb4383371"
            },
            "downloads": -1,
            "filename": "lean4url-0.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "b68c68482447315adb1d0e012327d027",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 10303,
            "upload_time": "2025-08-02T18:53:35",
            "upload_time_iso_8601": "2025-08-02T18:53:35.567793Z",
            "url": "https://files.pythonhosted.org/packages/30/47/02cb9d65068c69d9533408bfbf878fd50b6ec464664e57b3e317b58e2292/lean4url-0.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "763ce9ca6992172ff905e5db7995b488da3f91c04f108561de3691e459c6833c",
                "md5": "9e9e52beafbd5702aa58835867ae2708",
                "sha256": "564329adb16d93c05fa61fe4d8351402e39796b1781a5017a7040abf31d28b91"
            },
            "downloads": -1,
            "filename": "lean4url-0.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "9e9e52beafbd5702aa58835867ae2708",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 13964,
            "upload_time": "2025-08-02T18:53:37",
            "upload_time_iso_8601": "2025-08-02T18:53:37.307715Z",
            "url": "https://files.pythonhosted.org/packages/76/3c/e9ca6992172ff905e5db7995b488da3f91c04f108561de3691e459c6833c/lean4url-0.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-02 18:53:37",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "rexwzh",
    "github_project": "lean4url",
    "github_not_found": true,
    "lcname": "lean4url"
}
        