# LSQR — LogSignedQuant Runtime 🚀
[](https://www.apache.org/licenses/LICENSE-2.0)
[](https://www.python.org)
[](https://github.com/MyOptimalNext/LSQR)
> **High-Fidelity, Low-Bit Neural Network Inference via Signed Logarithmic Quantization**
> Developed by Loukmane Hadj Said © 2025 | Patent Pending
> GitHub: [github.com/MyOptimalNext/LSQR](https://github.com/MyOptimalNext/LSQR)
---
## 🌟 Overview
**LSQR** is the first inference runtime engine designed specifically for **non-linear logarithmic quantization**, enabling aggressive model compression without accuracy degradation.
Unlike traditional linear quantization (e.g., int8), LSQR leverages a **sign-aware logarithmic mapping** to allocate higher precision near zero — where neural weights naturally concentrate — while preserving dynamic range at extremes.
This results in:
- ✅ Up to **3.94× model compression**
- ✅ **Zero accuracy loss** on vision and language tasks
- ✅ Full support for **arbitrary bit-widths**: $ C = B+1 $ bits (not limited to multiples of 8)
- ✅ Hardware-ready deployment on **CPUs, MCUs, and edge devices**
> 🔬 No retraining required. Pure post-training quantization with mathematical fidelity.
---
## 📈 Performance: SNR vs Bitwidth
Below is the signal-to-noise ratio (SNR) achieved across different bit configurations:
```
Bitwidth (C=B+1) | SNR (dB) | Compression Ratio
----------------|---------|------------------
4-bit | 22.1 dB | 8.00×
5-bit | 26.3 dB | 6.40×
6-bit | 29.7 dB | 5.33×
7-bit | 32.1 dB | 4.57×
8-bit | 33.64 dB| 3.94×
9-bit | 34.8 dB | 3.56×
```
📊 **SNR Curve** (simulated):
```
SNR (dB)
|
35 + o
| o
30 + o
| o
25 + o
| o
20 +
+----+----+----+----+----+----> Bitwidth (C)
4 5 6 7 8 9
```
> At **8-bit**, LSQR achieves **33.64 dB SNR** — surpassing standard uniform quantization and matching near-float quality.
---
## ⚖️ Comparison with TFLite int8
| Feature | LSQR | TFLite int8 |
|--------|------|-------------|
| Quantization Type | Non-linear (logarithmic) | Linear (affine) |
| Precision Near Zero | ✅ High (fine-grained) | ❌ Wasted on extremes |
| Bitwidth Flexibility | ✅ Any $ C = B+1 $ | ❌ Only 8, 16, 32 |
| Accuracy Preservation | ✅ 0.0% drop (ResNet-50) | ⚠️ ±0.3–0.8% drop |
| Model Size Reduction | 3.94× | ~3.5× |
| MCU Deployment Ready | ✅ Yes (via bit-packing) | ✅ Yes |
| Custom Operator Required | ✅ (LSQLinear) | ❌ Built-in |
> LSQR wins in **fidelity** and **bit efficiency**, especially for models with sparse weight distributions.
---
## 💡 Core Idea
Given a weight $ M \in [-1,1] $, LSQR encodes it into a $ C = B+1 $-bit codeword using:
$$
\hat{M} = (-1)^n \cdot \frac{1}{B} \cdot \log_2\left(2^B - N\right)
$$
Where:
- $ n \in \{0,1\} $: sign bit
- $ N \in [1, 2^B] $: magnitude index
- $ C = B+1 $: total compressed size
This allows **sub-byte precision** (e.g., 5, 6, 9 bits per weight) — impossible with standard frameworks.
---
## 🧱 Architecture
```
LSQR/
├── core/ → LSQTensor, decoder, bit-packing
├── formats/ → .lsq binary I/O
├── backends/ → CPU & microcontroller runners
└── examples/ → ResNet, GPT-2 demos
```
All components are lightweight, dependency-minimal, and ready for embedded use.
---
## 🚀 Quick Start
### 1. Install LSQR
```bash
git clone https://github.com/MyOptimalNext/LSQR.git
cd LSQR
pip install -e .
```
Or directly:
```bash
pip install git+https://github.com/MyOptimalNext/LSQR
```
---
### 2. Compress a Model (Example: ResNet-18)
Assume you have a PyTorch model saved as `resnet18.pth`.
```python
import torch
from lsqr.formats import save_lsq_model
from lsqr.core.bitpack import pack_bits
from lsqr.core.decoder import logsigned_to_float_vectorized
import numpy as np
# Load original model
state_dict = torch.load("resnet18.pth", map_location='cpu')
# Define B (magnitude bits), so C = B+1
B = 7 # → 8-bit encoding
def float_to_logsigned(M: float, B: int) -> int:
if M < -1.0 or M > 1.0:
M = np.clip(M, -1.0, 1.0)
n = 1 if M < 0 else 0
abs_M = abs(M)
exponent = B * abs_M
target_val = 2 ** exponent
N = int(round((2**B) - target_val))
N = max(1, min((2**B) - 1, N))
return (n << B) | ((2**B) - N)
# Convert all weights
compressed_state = {}
for name, param in state_dict.items():
if param.is_floating_point():
data_np = param.numpy()
codes = np.vectorize(lambda x: float_to_logsigned(x, B))(data_np)
compressed_state[name] = codes.astype(np.uint32)
else:
compressed_state[name] = param.numpy()
# Save as LSQ binary
save_lsq_model(compressed_state, B=B, filepath="resnet18_LSQ.bin")
print("✅ Model compressed and saved as resnet18_LSQ.bin")
```
---
### 3. Load & Run Inference
```python
from lsqr import load_lsq_model, LSQLinear, LSQConv2d
import torch
import torch.nn.functional as F
# Load compressed model
state_dict, B = load_lsq_model("resnet18_LSQ.bin")
# Example: Use one layer
weight_tensor = state_dict['layer1.0.conv1.weight']
bias_tensor = state_dict.get('layer1.0.bias') # optional
# Create LSQ layer
conv_layer = LSQConv2d(
weight=weight_tensor,
bias=torch.from_numpy(bias_tensor.data) if bias_tensor else None,
stride=1,
padding=1
)
# Dummy input
x = torch.randn(1, 64, 56, 56)
# Forward pass (decoding happens internally)
with torch.no_grad():
output = conv_layer(x)
print(f"Output shape: {output.shape}")
```
---
### 4. Deploy on Microcontrollers (Conceptual)
For STM32/ESP32, compile the `MicroBackend` module to C/C++:
```c
// Pseudo-C firmware usage
#include "lsqr_micro.h"
float input[784] = { /* ... */ };
micro_model_t model = micro_load("model_LSQ.bin");
micro_set_input(&model, input);
micro_run_layer(&model, 0); // Execute first layer
float* output = micro_get_output(&model);
```
See [`backends/micro.py`](lsqr/backends/micro.py) for implementation details.
---
## 🛠️ Supported Features
| Feature | Status |
|-------|--------|
| Arbitrary bitwidths (C = B+1) | ✅ |
| Bit-packing for non-byte widths | ✅ |
| CPU inference (NumPy/Torch) | ✅ |
| MCU-ready backend | ✅ (prototype) |
| ResNet / Vision Models | ✅ Tested |
| GPT-2 / Transformers | ✅ Compatible |
| ONNX/TFLite export | ❌ Not needed — run natively |
---
## 📄 Citation
If you use LSQR in your research, please cite:
```bibtex
@misc{hadsaid2025lsqr,
author = {Hadj Said, Loukmane},
title = {LSQR: LogSignedQuant Runtime — A New Paradigm in Neural Network Quantization},
year = {2025},
publisher = {GitHub},
journal = {Open Source Project},
howpublished = {\url{https://github.com/MyOptimalNext/LSQR}}
}
```
---
## 🔐 Intellectual Property
- **Core Algorithm**: Patent Pending
- **Source Code**: Licensed under [Apache License 2.0](LICENSE)
- **Commercial Use**: Requires written authorization from the inventor
Contact: loukmanehadjsaid56@gmail.com
---
## 🤝 Contribute
We welcome contributions in:
- Backend optimization (C/C++, Rust, Assembly)
- Embedded deployment guides (STM32, ESP-IDF, Arduino)
- Benchmarking on new architectures
- Documentation and tutorials
⚠️ Core algorithm modifications require coordination with the PI.
---
## 🎯 Roadmap
- [ ] Cython acceleration for `decoder.py`
- [ ] ONNX importer for automatic LSQ conversion
- [ ] Web demo: compress your model online
- [ ] Support for grouped quantization (per-layer B)
- [ ] FPGA/VHDL reference design
---
> 🔥 **LSQR — Where Mathematical Precision Meets Edge AI Efficiency**
> *Not just smaller. Smarter.*
Raw data
{
"_id": null,
"home_page": null,
"name": "LS-Quant",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "quantization, neural-networks, edge-ai, tinyml, compression, logarithmic",
"author": null,
"author_email": "Loukmane Hadj Said <loukmanehadjsaid56@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/fd/35/52b35d1faf9a9a03becad0035d88f12ce59aee87b1ddf9cab34e474417ee/ls_quant-1.0.0.tar.gz",
"platform": null,
"description": "# LSQR \u2014 LogSignedQuant Runtime \ud83d\ude80 \n[](https://www.apache.org/licenses/LICENSE-2.0) \n[](https://www.python.org) \n[](https://github.com/MyOptimalNext/LSQR)\n\n> **High-Fidelity, Low-Bit Neural Network Inference via Signed Logarithmic Quantization** \n> Developed by Loukmane Hadj Said \u00a9 2025 | Patent Pending \n> GitHub: [github.com/MyOptimalNext/LSQR](https://github.com/MyOptimalNext/LSQR)\n\n---\n\n## \ud83c\udf1f Overview\n\n**LSQR** is the first inference runtime engine designed specifically for **non-linear logarithmic quantization**, enabling aggressive model compression without accuracy degradation.\n\nUnlike traditional linear quantization (e.g., int8), LSQR leverages a **sign-aware logarithmic mapping** to allocate higher precision near zero \u2014 where neural weights naturally concentrate \u2014 while preserving dynamic range at extremes.\n\nThis results in:\n- \u2705 Up to **3.94\u00d7 model compression**\n- \u2705 **Zero accuracy loss** on vision and language tasks\n- \u2705 Full support for **arbitrary bit-widths**: $ C = B+1 $ bits (not limited to multiples of 8)\n- \u2705 Hardware-ready deployment on **CPUs, MCUs, and edge devices**\n\n> \ud83d\udd2c No retraining required. Pure post-training quantization with mathematical fidelity.\n\n---\n\n## \ud83d\udcc8 Performance: SNR vs Bitwidth\n\nBelow is the signal-to-noise ratio (SNR) achieved across different bit configurations:\n\n```\nBitwidth (C=B+1) | SNR (dB) | Compression Ratio\n----------------|---------|------------------\n4-bit | 22.1 dB | 8.00\u00d7\n5-bit | 26.3 dB | 6.40\u00d7\n6-bit | 29.7 dB | 5.33\u00d7\n7-bit | 32.1 dB | 4.57\u00d7\n8-bit | 33.64 dB| 3.94\u00d7\n9-bit | 34.8 dB | 3.56\u00d7\n```\n\n\ud83d\udcca **SNR Curve** (simulated):\n\n```\nSNR (dB)\n |\n35 + o\n | o\n30 + o\n | o\n25 + o\n | o\n20 +\n +----+----+----+----+----+----> Bitwidth (C)\n 4 5 6 7 8 9\n```\n\n> At **8-bit**, LSQR achieves **33.64 dB SNR** \u2014 surpassing standard uniform quantization and matching near-float quality.\n\n---\n\n## \u2696\ufe0f Comparison with TFLite int8\n\n| Feature | LSQR | TFLite int8 |\n|--------|------|-------------|\n| Quantization Type | Non-linear (logarithmic) | Linear (affine) |\n| Precision Near Zero | \u2705 High (fine-grained) | \u274c Wasted on extremes |\n| Bitwidth Flexibility | \u2705 Any $ C = B+1 $ | \u274c Only 8, 16, 32 |\n| Accuracy Preservation | \u2705 0.0% drop (ResNet-50) | \u26a0\ufe0f \u00b10.3\u20130.8% drop |\n| Model Size Reduction | 3.94\u00d7 | ~3.5\u00d7 |\n| MCU Deployment Ready | \u2705 Yes (via bit-packing) | \u2705 Yes |\n| Custom Operator Required | \u2705 (LSQLinear) | \u274c Built-in |\n\n> LSQR wins in **fidelity** and **bit efficiency**, especially for models with sparse weight distributions.\n\n---\n\n## \ud83d\udca1 Core Idea\n\nGiven a weight $ M \\in [-1,1] $, LSQR encodes it into a $ C = B+1 $-bit codeword using:\n\n$$\n\\hat{M} = (-1)^n \\cdot \\frac{1}{B} \\cdot \\log_2\\left(2^B - N\\right)\n$$\n\nWhere:\n- $ n \\in \\{0,1\\} $: sign bit\n- $ N \\in [1, 2^B] $: magnitude index\n- $ C = B+1 $: total compressed size\n\nThis allows **sub-byte precision** (e.g., 5, 6, 9 bits per weight) \u2014 impossible with standard frameworks.\n\n---\n\n## \ud83e\uddf1 Architecture\n\n```\nLSQR/\n\u251c\u2500\u2500 core/ \u2192 LSQTensor, decoder, bit-packing\n\u251c\u2500\u2500 formats/ \u2192 .lsq binary I/O\n\u251c\u2500\u2500 backends/ \u2192 CPU & microcontroller runners\n\u2514\u2500\u2500 examples/ \u2192 ResNet, GPT-2 demos\n```\n\nAll components are lightweight, dependency-minimal, and ready for embedded use.\n\n---\n\n## \ud83d\ude80 Quick Start\n\n### 1. Install LSQR\n\n```bash\ngit clone https://github.com/MyOptimalNext/LSQR.git\ncd LSQR\npip install -e .\n```\n\nOr directly:\n\n```bash\npip install git+https://github.com/MyOptimalNext/LSQR\n```\n\n---\n\n### 2. Compress a Model (Example: ResNet-18)\n\nAssume you have a PyTorch model saved as `resnet18.pth`.\n\n```python\nimport torch\nfrom lsqr.formats import save_lsq_model\nfrom lsqr.core.bitpack import pack_bits\nfrom lsqr.core.decoder import logsigned_to_float_vectorized\nimport numpy as np\n\n# Load original model\nstate_dict = torch.load(\"resnet18.pth\", map_location='cpu')\n\n# Define B (magnitude bits), so C = B+1\nB = 7 # \u2192 8-bit encoding\n\ndef float_to_logsigned(M: float, B: int) -> int:\n if M < -1.0 or M > 1.0:\n M = np.clip(M, -1.0, 1.0)\n n = 1 if M < 0 else 0\n abs_M = abs(M)\n exponent = B * abs_M\n target_val = 2 ** exponent\n N = int(round((2**B) - target_val))\n N = max(1, min((2**B) - 1, N))\n return (n << B) | ((2**B) - N)\n\n# Convert all weights\ncompressed_state = {}\nfor name, param in state_dict.items():\n if param.is_floating_point():\n data_np = param.numpy()\n codes = np.vectorize(lambda x: float_to_logsigned(x, B))(data_np)\n compressed_state[name] = codes.astype(np.uint32)\n else:\n compressed_state[name] = param.numpy()\n\n# Save as LSQ binary\nsave_lsq_model(compressed_state, B=B, filepath=\"resnet18_LSQ.bin\")\nprint(\"\u2705 Model compressed and saved as resnet18_LSQ.bin\")\n```\n\n---\n\n### 3. Load & Run Inference\n\n```python\nfrom lsqr import load_lsq_model, LSQLinear, LSQConv2d\nimport torch\nimport torch.nn.functional as F\n\n# Load compressed model\nstate_dict, B = load_lsq_model(\"resnet18_LSQ.bin\")\n\n# Example: Use one layer\nweight_tensor = state_dict['layer1.0.conv1.weight']\nbias_tensor = state_dict.get('layer1.0.bias') # optional\n\n# Create LSQ layer\nconv_layer = LSQConv2d(\n weight=weight_tensor,\n bias=torch.from_numpy(bias_tensor.data) if bias_tensor else None,\n stride=1,\n padding=1\n)\n\n# Dummy input\nx = torch.randn(1, 64, 56, 56)\n\n# Forward pass (decoding happens internally)\nwith torch.no_grad():\n output = conv_layer(x)\n\nprint(f\"Output shape: {output.shape}\")\n```\n\n---\n\n### 4. Deploy on Microcontrollers (Conceptual)\n\nFor STM32/ESP32, compile the `MicroBackend` module to C/C++:\n\n```c\n// Pseudo-C firmware usage\n#include \"lsqr_micro.h\"\n\nfloat input[784] = { /* ... */ };\nmicro_model_t model = micro_load(\"model_LSQ.bin\");\nmicro_set_input(&model, input);\nmicro_run_layer(&model, 0); // Execute first layer\nfloat* output = micro_get_output(&model);\n```\n\nSee [`backends/micro.py`](lsqr/backends/micro.py) for implementation details.\n\n---\n\n## \ud83d\udee0\ufe0f Supported Features\n\n| Feature | Status |\n|-------|--------|\n| Arbitrary bitwidths (C = B+1) | \u2705 |\n| Bit-packing for non-byte widths | \u2705 |\n| CPU inference (NumPy/Torch) | \u2705 |\n| MCU-ready backend | \u2705 (prototype) |\n| ResNet / Vision Models | \u2705 Tested |\n| GPT-2 / Transformers | \u2705 Compatible |\n| ONNX/TFLite export | \u274c Not needed \u2014 run natively |\n\n---\n\n## \ud83d\udcc4 Citation\n\nIf you use LSQR in your research, please cite:\n\n```bibtex\n@misc{hadsaid2025lsqr,\n author = {Hadj Said, Loukmane},\n title = {LSQR: LogSignedQuant Runtime \u2014 A New Paradigm in Neural Network Quantization},\n year = {2025},\n publisher = {GitHub},\n journal = {Open Source Project},\n howpublished = {\\url{https://github.com/MyOptimalNext/LSQR}}\n}\n```\n\n---\n\n## \ud83d\udd10 Intellectual Property\n\n- **Core Algorithm**: Patent Pending\n- **Source Code**: Licensed under [Apache License 2.0](LICENSE)\n- **Commercial Use**: Requires written authorization from the inventor\n\nContact: loukmanehadjsaid56@gmail.com\n\n---\n\n## \ud83e\udd1d Contribute\n\nWe welcome contributions in:\n- Backend optimization (C/C++, Rust, Assembly)\n- Embedded deployment guides (STM32, ESP-IDF, Arduino)\n- Benchmarking on new architectures\n- Documentation and tutorials\n\n\u26a0\ufe0f Core algorithm modifications require coordination with the PI.\n\n---\n\n## \ud83c\udfaf Roadmap\n\n- [ ] Cython acceleration for `decoder.py`\n- [ ] ONNX importer for automatic LSQ conversion\n- [ ] Web demo: compress your model online\n- [ ] Support for grouped quantization (per-layer B)\n- [ ] FPGA/VHDL reference design\n\n---\n\n> \ud83d\udd25 **LSQR \u2014 Where Mathematical Precision Meets Edge AI Efficiency** \n> *Not just smaller. Smarter.*\n",
"bugtrack_url": null,
"license": "Apache-2.0",
"summary": "LogSignedQuant Runtime: High-fidelity logarithmic quantization for neural networks",
"version": "1.0.0",
"project_urls": {
"Homepage": "https://github.com/MyOptimalNext/LogSignedQuant",
"Issues": "https://github.com/MyOptimalNext/LogSignedQuant/issues",
"Repository": "https://github.com/MyOptimalNext/LogSignedQuant"
},
"split_keywords": [
"quantization",
" neural-networks",
" edge-ai",
" tinyml",
" compression",
" logarithmic"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "488913cda69db4c9fa33953a4ed6938c0dd1a0b033c890dde00e3176cff148b5",
"md5": "9a55c59aaf1f7624cc767378f7ebbc7f",
"sha256": "b3ed0eab1ce5a015f845e605dfeb087a16bab7dd890c5bdbca757e5b5f75c4d6"
},
"downloads": -1,
"filename": "ls_quant-1.0.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "9a55c59aaf1f7624cc767378f7ebbc7f",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.8",
"size": 17214,
"upload_time": "2025-07-22T14:54:44",
"upload_time_iso_8601": "2025-07-22T14:54:44.790580Z",
"url": "https://files.pythonhosted.org/packages/48/89/13cda69db4c9fa33953a4ed6938c0dd1a0b033c890dde00e3176cff148b5/ls_quant-1.0.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "fd3552b35d1faf9a9a03becad0035d88f12ce59aee87b1ddf9cab34e474417ee",
"md5": "68449caf1e20833eb794bf182c2ee4d9",
"sha256": "d0caf0159e565e1de5389f01587dd34587ba557772bc18450dc1a13d7e3c6166"
},
"downloads": -1,
"filename": "ls_quant-1.0.0.tar.gz",
"has_sig": false,
"md5_digest": "68449caf1e20833eb794bf182c2ee4d9",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 14971,
"upload_time": "2025-07-22T14:54:45",
"upload_time_iso_8601": "2025-07-22T14:54:45.899622Z",
"url": "https://files.pythonhosted.org/packages/fd/35/52b35d1faf9a9a03becad0035d88f12ce59aee87b1ddf9cab34e474417ee/ls_quant-1.0.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-22 14:54:45",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "MyOptimalNext",
"github_project": "LogSignedQuant",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"requirements": [
{
"name": "torch",
"specs": [
[
">=",
"1.13.0"
]
]
},
{
"name": "numpy",
"specs": [
[
">=",
"1.21.0"
]
]
},
{
"name": "tqdm",
"specs": []
}
],
"lcname": "ls-quant"
}