Name | lean4url |
Version | 0.1.0 |
home_page | https://github.com/rexwzh/lean4url |
Summary | High-performance lzstring compression library compatible with JavaScript implementation |
upload_time | 2025-08-02 18:53:37 |
maintainer | None |
docs_url | None |
author | Rex Wang |
requires_python | >=3.8 |
license | None |
# lean4url
A high-performance lzstring compression library, fully compatible with the JavaScript implementation.
## Features
✅ **Fully Compatible** - 100% compatible with the [pieroxy/lz-string](https://github.com/pieroxy/lz-string) JavaScript implementation
✅ **Unicode Support** - Correctly handles all Unicode characters, including emoji and special symbols
✅ **URL Friendly** - Built-in URL encoding/decoding functionality
✅ **High Performance** - Optimized algorithm implementation
✅ **Type Safe** - Complete type annotation support
✅ **Thoroughly Tested** - Includes comparative tests against the JavaScript version
## Background
Existing Python lzstring packages mishandle characters that JavaScript stores as UTF-16 surrogate pairs. For example, compressing the character "𝔓" gives:
- **Existing package output**: `sirQ`
- **Original JavaScript output**: `qwbmRdo=`
- **lean4url output**: `qwbmRdo=` ✅
lean4url solves this problem by reproducing JavaScript's UTF-16 encoding behavior.
## Installation
```bash
pip install lean4url
```
## Quick Start
### Basic Compression/Decompression
```python
from lean4url import LZString
# Create instance
lz = LZString()
# Compress string
original = "Hello, 世界! 🌍"
compressed = lz.compress_to_base64(original)
print(f"Compressed: {compressed}")
# Decompress string
decompressed = lz.decompress_from_base64(compressed)
print(f"Decompressed: {decompressed}")
# Output: Hello, 世界! 🌍
```
### URL Encoding/Decoding
```python
from lean4url import encode_url, decode_url
# Encode data to URL
data = "This is data to be encoded"
url = encode_url(data, base_url="https://example.com/share")
print(f"Encoded URL: {url}")
# Output: https://example.com/share/#codez=BIUwNmD2A0AEDukBOYAmBMYAZhAY...
# Decode data from URL
result = decode_url(url)
print(f"Decoded result: {result['codez']}")
# Output: This is data to be encoded
```
### URL Encoding with Parameters
```python
from lean4url import encode_url, decode_url
# Add extra parameters when encoding
code = "function hello() { return 'world'; }"
url = encode_url(
    code,
    base_url="https://playground.example.com",
    lang="javascript",
    theme="dark",
    url="https://docs.example.com",  # This parameter will be URL encoded
)
print(f"Complete URL: {url}")
# Output: https://playground.example.com/#codez=BIUwNmD2A0A...&lang=javascript&theme=dark&url=https%3A//docs.example.com
# Decode URL to get all parameters
params = decode_url(url)
print(f"Code: {params['codez']}")
print(f"Language: {params['lang']}")
print(f"Theme: {params['theme']}")
print(f"Documentation link: {params['url']}")
```
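The URLs above carry their data in the fragment (the part after `#`) as `key=value` pairs. As a rough illustration of how such a fragment can be split back into parameters, here is a minimal sketch using only the standard library. It is not lean4url's implementation: `split_fragment_params` is a hypothetical helper, `PAYLOAD` is a placeholder rather than real compressed data, and no decompression is performed.

```python
from urllib.parse import parse_qs, quote, urlsplit


def split_fragment_params(url: str) -> dict:
    """Return the key/value pairs stored in the URL fragment (hypothetical helper)."""
    fragment = urlsplit(url).fragment
    # parse_qs percent-decodes values and returns a list per key; keep the first value.
    return {key: values[0] for key, values in parse_qs(fragment).items()}


# Percent-encode the extra parameter the same way the example output shows it.
docs = quote("https://docs.example.com", safe="/")

# "PAYLOAD" is a placeholder, not real lzstring output.
example = (
    "https://playground.example.com/#codez=PAYLOAD"
    f"&lang=javascript&theme=dark&url={docs}"
)

params = split_fragment_params(example)
print(params["lang"], params["url"])
```

One caveat of this sketch: `parse_qs` treats `+` as a space, so a real Base64 payload containing `+` would need escaping or a different parser.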
## API Reference
### LZString Class
```python
class LZString:
    def compress_to_base64(self, input_str: str) -> str:
        """Compress string to Base64 format"""

    def decompress_from_base64(self, input_str: str) -> str:
        """Decompress string from Base64 format"""

    def compress_to_utf16(self, input_str: str) -> str:
        """Compress string to UTF16 format"""

    def decompress_from_utf16(self, input_str: str) -> str:
        """Decompress string from UTF16 format"""
```
### URL Utility Functions
```python
def encode_url(data: str, base_url: str = None, **kwargs) -> str:
    """
    Encode input string and build complete URL.

    Args:
        data: Data to be encoded
        base_url: URL prefix
        **kwargs: Additional URL parameters

    Returns:
        Built complete URL
    """

def decode_url(url: str) -> dict:
    """
    Decode original data from URL.

    Args:
        url: Complete URL

    Returns:
        Dictionary containing all parameters, with codez decoded
    """
```
## Development
### Environment Setup
```bash
# Clone repository
git clone https://github.com/rexwzh/lean4url.git
cd lean4url
# Create virtual environment
python -m venv venv
source venv/bin/activate  # Linux/macOS
# On Windows, run this instead:
# venv\Scripts\activate
# Install development dependencies
pip install -e ".[dev]"
```
### Running Tests
```bash
# Start JavaScript test service
cd tests/js_service
npm install
node server.js &
cd ../..
# Run Python tests
pytest
# Run tests with coverage
pytest --cov=lean4url --cov-report=html
```
### Code Formatting
```bash
# Format code
black src tests
isort src tests
# Type checking
mypy src
# Code checking
flake8 src tests
```
## Algorithm Principles
lean4url is based on a variant of the LZ78 compression algorithm. Its core ideas are:
1. **Dictionary Building** - Dynamically build character sequence dictionary
2. **Sequence Matching** - Find longest matching sequences
3. **UTF-16 Compatibility** - Simulate JavaScript's UTF-16 surrogate pair behavior
4. **Base64 Encoding** - Encode compression results in URL-safe format
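To make the dictionary-building and longest-match steps concrete, here is a minimal sketch of textbook LZW, a well-known LZ78 variant. It is an illustration only, not lean4url's code: it works on Python characters, emits plain integer codes, and leaves out the bit packing, UTF-16 handling, and Base64 step. The function names are hypothetical.

```python
from typing import List, Tuple


def lzw_compress(text: str) -> Tuple[List[str], List[int]]:
    """Return (alphabet, codes) for `text` using textbook LZW."""
    alphabet = sorted(set(text))
    # Seed the dictionary with every single character that occurs.
    dictionary = {ch: i for i, ch in enumerate(alphabet)}
    codes: List[int] = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            # Keep extending: we want the longest sequence already known.
            current = candidate
        else:
            codes.append(dictionary[current])
            # Learn the new sequence so later repeats become a single code.
            dictionary[candidate] = len(dictionary)
            current = ch
    if current:
        codes.append(dictionary[current])
    return alphabet, codes


def lzw_decompress(alphabet: List[str], codes: List[int]) -> str:
    """Rebuild the text by replaying the same dictionary growth."""
    if not codes:
        return ""
    entries = list(alphabet)
    previous = entries[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code < len(entries):
            entry = entries[code]
        else:
            # The code refers to the entry that is being defined right now.
            entry = previous + previous[0]
        out.append(entry)
        entries.append(previous + entry[0])
        previous = entry
    return "".join(out)


alphabet, codes = lzw_compress("abababab")
print(codes)
print(lzw_decompress(alphabet, codes))
```

The decompressor never receives the learned dictionary; it reconstructs it from the code stream, which is why only the initial alphabet has to be shared.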
### Unicode Handling
The key difference from existing Python packages is in Unicode character handling:
- **JavaScript**: Uses UTF-16 surrogate pairs, "𝔓" → `[0xD835, 0xDD13]`
- **Existing Python packages**: Use Unicode code points, "𝔓" → `[0x1D513]`
- **lean4url**: Simulates JavaScript behavior, ensuring compatibility
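The difference can be observed with the standard library alone. The sketch below shows one way to obtain the UTF-16 code units that JavaScript would see for a Python string; `utf16_code_units` is a hypothetical helper written for this illustration and is not part of lean4url's documented API.

```python
import struct
from typing import List


def utf16_code_units(text: str) -> List[int]:
    """Return the 16-bit code units of `text` as JavaScript would index them."""
    # "surrogatepass" lets lone surrogates through instead of raising an error.
    raw = text.encode("utf-16-le", "surrogatepass")
    # Each code unit is one little-endian unsigned 16-bit integer.
    return list(struct.unpack(f"<{len(raw) // 2}H", raw))


fraktur_p = "\U0001D513"  # the character 𝔓
print(hex(ord(fraktur_p)))                            # 0x1d513
print([hex(u) for u in utf16_code_units(fraktur_p)])  # ['0xd835', '0xdd13']
print(len(fraktur_p), len(utf16_code_units(fraktur_p)))  # 1 2
```

A compressor that iterates over the one Python code point sees a different input sequence than one that iterates over the two code units, which is enough to change the compressed output.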
## License
MIT License - See the [LICENSE](LICENSE) file for details.
## Contributing
Issues and Pull Requests are welcome!
## Changelog
### v0.1.0
- Initial version release
- Complete lzstring algorithm implementation
- JavaScript compatibility
- URL encoding/decoding functionality
- Complete test suite
Raw data
{
"_id": null,
"home_page": "https://github.com/rexwzh/lean4url",
"name": "lean4url",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Rex Wang",
"author_email": "1073853456@qq.com",
"download_url": "https://files.pythonhosted.org/packages/76/3c/e9ca6992172ff905e5db7995b488da3f91c04f108561de3691e459c6833c/lean4url-0.1.0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": null,
"summary": "High-performance lzstring compression library compatible with JavaScript implementation",
"version": "0.1.0",
"project_urls": {
"Bug Reports": "https://github.com/rexwzh/lean4url/issues",
"Homepage": "https://github.com/rexwzh/lean4url",
"Source": "https://github.com/rexwzh/lean4url"
},
"split_keywords": [],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "304702cb9d65068c69d9533408bfbf878fd50b6ec464664e57b3e317b58e2292",
"md5": "b68c68482447315adb1d0e012327d027",
"sha256": "e69830064bc9f7a6e350708bd0c21369b026c95e1ce1d31a509e005fb4383371"
},
"downloads": -1,
"filename": "lean4url-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "b68c68482447315adb1d0e012327d027",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 10303,
"upload_time": "2025-08-02T18:53:35",
"upload_time_iso_8601": "2025-08-02T18:53:35.567793Z",
"url": "https://files.pythonhosted.org/packages/30/47/02cb9d65068c69d9533408bfbf878fd50b6ec464664e57b3e317b58e2292/lean4url-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "763ce9ca6992172ff905e5db7995b488da3f91c04f108561de3691e459c6833c",
"md5": "9e9e52beafbd5702aa58835867ae2708",
"sha256": "564329adb16d93c05fa61fe4d8351402e39796b1781a5017a7040abf31d28b91"
},
"downloads": -1,
"filename": "lean4url-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "9e9e52beafbd5702aa58835867ae2708",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 13964,
"upload_time": "2025-08-02T18:53:37",
"upload_time_iso_8601": "2025-08-02T18:53:37.307715Z",
"url": "https://files.pythonhosted.org/packages/76/3c/e9ca6992172ff905e5db7995b488da3f91c04f108561de3691e459c6833c/lean4url-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-02 18:53:37",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "rexwzh",
"github_project": "lean4url",
"github_not_found": true,
"lcname": "lean4url"
}