<p align="center">
<img src="docs/assets/logo.svg" alt="DASPack Logo" />
</p>
# DASPack: Controlled data compression for Distributed Acoustic Sensing
DASPack is a fast, open-source compressor for huge Distributed Acoustic Sensing (DAS) datasets.
It supports **lossless** and **fixed-accuracy lossy** modes, letting you store data with an exact bound on reconstruction error.
The core is written in Rust for speed and safety, with a thin Python API for convenient integration into your workflows.
**DISCLAIMER:** The code is still being tested; the official release is planned for mid-August.
---
## ✨ Highlights
- **Lossless or fixed-accuracy** — pick zero error or a max absolute error and get exactly what you asked for.
- **Multi-threaded** — control the number of threads per encode/decode call.
- **High throughput** — 800 MB/s+ on an 8-core laptop in typical workloads.
- **Self-describing streams** — all parameters (codec, quantizer, shape) are stored in the bitstream; no sidecars needed.
- **Pure Rust core** — no unsafe C buffers exposed to user code.
- **Python bindings** — direct `encode` / `decode` interface for NumPy arrays.
---
## 🚀 Quick start
### 1. Install (Python ≥ 3.9)
```bash
pip install daspack
# or, from source (Rust ≥ 1.74):
# maturin develop --release
```
### 2. Encode and store with h5py
You can store the compressed DASPack bitstream as raw bytes in HDF5:
```python
import h5py
import numpy as np
from daspack import DASCoder, Quantizer

# Example: lossless compression with 4 threads
data = np.random.randint(-1000, 1000, size=(4096, 8192), dtype=np.int32)
coder = DASCoder(threads=4)

# Encode in Lossless mode
stream = coder.encode(
    data,
    Quantizer.Lossless(),
    blocksize=(1024, 1024),
    levels=0,
    order=0,
)

with h5py.File("example.h5", "w") as f:
    f.create_dataset("compressed", data=np.frombuffer(stream, dtype=np.uint8))
```
### 3. Read and decode
```python
import h5py
import numpy as np
from daspack import DASCoder

coder = DASCoder(threads=4)

with h5py.File("example.h5") as f:
    raw = f["compressed"][:].tobytes()

# Decode: dtype is inferred from the stream
restored = coder.decode(raw)
```
### 4. Lossy example with fixed error bound
```python
import numpy as np
from daspack import DASCoder, Quantizer
# Generate some example data
data = np.random.uniform(-100, 100, size=(6, 8)).astype(np.float64)
coder = DASCoder(threads=2)
# Target: absolute error ≤ step/2
step = 0.5
# Encode with Uniform quantizer (lossy) and given step
stream = coder.encode(
    data,
    Quantizer.Uniform(step=step),
)
# Decode (dtype inferred from stream)
restored = coder.decode(stream)
# Verify bound
tol = step / 2 + 1e-12
max_err = np.max(np.abs(restored - data))
print(f"Max abs error: {max_err:.6f} (tolerance {tol:.6f})")
assert max_err <= tol
print("Original data:\n", data)
print("Restored data:\n", restored)
```
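The output should look like this (the exact maximum error depends on the random data, but it never exceeds the tolerance):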
```
Max abs error: 0.250000 (tolerance 0.250000)
Original data:
[[ ... ]]
Restored data:
[[ ... ]]
```
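The `step / 2` tolerance checked above is the usual property of a uniform quantizer: rounding a value to the nearest multiple of `step` moves it by at most half a step. Here is a minimal, dependency-free sketch of that idea, assuming round-to-nearest quantization. The helpers `quantize` and `dequantize` are illustrative names and not part of the DASPack API, and DASPack's internal quantizer may differ in detail.

```python
import random


def quantize(values, step):
    """Map each float to the integer index of its nearest multiple of `step`."""
    return [round(v / step) for v in values]


def dequantize(indices, step):
    """Reconstruct each value as the multiple of `step` selected by its index."""
    return [q * step for q in indices]


random.seed(0)
step = 0.5
data = [random.uniform(-100.0, 100.0) for _ in range(1000)]

indices = quantize(data, step)        # integers; this is the only lossy step
restored = dequantize(indices, step)

# Rounding to the nearest bin moves a value by at most half a bin width.
max_err = max(abs(r - d) for r, d in zip(restored, data))
print(f"max abs error = {max_err:.6f}, bound = {step / 2:.6f}")
assert max_err <= step / 2 + 1e-12
```

Everything after the quantizer works on the integer indices, so no further error is introduced as long as those later stages are reversible.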
---
## ⚙️ How it works
```
(float mode)  Quantize → Wavelet (5/3) → 2-D LPC → Arithmetic coding
(int mode)    Identity → Wavelet (5/3) → 2-D LPC → Arithmetic coding
```
The lossy path is bounded-error thanks to uniform quantization; the rest of the chain is perfectly reversible.
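To illustrate what "perfectly reversible" means for the wavelet stage, here is a sketch of one level of the standard integer 5/3 lifting transform, the same reversible filter used for lossless JPEG 2000. It is a 1-D, even-length toy with mirrored edges. The function names and the boundary handling are assumptions made for illustration, not DASPack's actual implementation.

```python
import random


def forward_53(x):
    """One level of the integer 5/3 lifting transform (even-length input)."""
    s = list(x[0::2])  # even samples, become the low-pass band
    d = list(x[1::2])  # odd samples, become the high-pass (detail) band
    n = len(d)
    # Predict: subtract from each odd sample the mean of its even neighbours.
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]  # mirror at the right edge
        d[i] -= (s[i] + right) // 2
    # Update: smooth the even samples with the freshly computed details.
    for i in range(n):
        left = d[i - 1] if i > 0 else d[i]  # mirror at the left edge
        s[i] += (left + d[i] + 2) // 4
    return s, d


def inverse_53(s, d):
    """Undo forward_53 by running the lifting steps in reverse order."""
    s, d = list(s), list(d)
    n = len(d)
    for i in range(n):
        left = d[i - 1] if i > 0 else d[i]
        s[i] -= (left + d[i] + 2) // 4
    for i in range(n):
        right = s[i + 1] if i + 1 < n else s[i]
        d[i] += (s[i] + right) // 2
    x = [0] * (2 * n)
    x[0::2] = s  # interleave even and odd samples back together
    x[1::2] = d
    return x


random.seed(1)
signal = [random.randint(-1000, 1000) for _ in range(64)]
low, high = forward_53(signal)
restored = inverse_53(low, high)
assert restored == signal  # bit-exact round trip
```

Both lifting steps use only integer additions and floor divisions, and the inverse recomputes exactly the same terms before subtracting them, so the round trip is exact regardless of the input values.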
Read the paper (see citation below!) for more information 😄
---
## 📄 License
DASPack is released under the 3-Clause BSD License.
---
## 🤝 Contributing
Bug reports and pull requests are welcome.
If you plan a large change, please open an issue first so we can discuss the design.
---
## 📣 Citing
If you use DASPack in academic work, please cite:
> Seguí, A. *et al.* (2025). **DASPack: Controlled Data Compression for Distributed Acoustic Sensing**. *Geophysical Journal International*.\
> DOI: *pending*
Thanks for supporting open science!
Raw data
{
"_id": null,
"home_page": null,
"name": "daspack-dev",
"maintainer": "Aleix Segui",
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": "Distributed Acoustic Sensing, compression, Data Compression",
"author": "Aleix Segui",
"author_email": null,
"download_url": "https://files.pythonhosted.org/packages/98/9c/c0adce7e5c6c286ef29d930efb8e5fb222a732f64b2d929b2132e0ee3d62/daspack_dev-0.0.1a0.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "BSD-3-Clause",
"summary": "DASPack: Controlled data compression for Distributed Acoustic Sensing",
"version": "0.0.1a0",
"project_urls": {
"homepage": "https://aleixsegui.dev/",
"repository": "https://github.com/asleix/daspack"
},
"split_keywords": [
"distributed acoustic sensing",
" compression",
" data compression"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "4d343d8b92f84f5602949eb699f91d26c4939014c2f799ae7f3d5b1e415876ef",
"md5": "3413e4c1c245983ebfe817a44735a422",
"sha256": "f42f08335d38f9d3b6ccf2d956f2a4487fd157dde4406534ca4eed767d8ac15c"
},
"downloads": -1,
"filename": "daspack_dev-0.0.1a0-cp311-cp311-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "3413e4c1c245983ebfe817a44735a422",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 324822,
"upload_time": "2025-08-09T18:31:59",
"upload_time_iso_8601": "2025-08-09T18:31:59.664060Z",
"url": "https://files.pythonhosted.org/packages/4d/34/3d8b92f84f5602949eb699f91d26c4939014c2f799ae7f3d5b1e415876ef/daspack_dev-0.0.1a0-cp311-cp311-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "989cc0adce7e5c6c286ef29d930efb8e5fb222a732f64b2d929b2132e0ee3d62",
"md5": "c09aeccda57b5f46bb6524acf9cb1b4b",
"sha256": "d0036a05a0dfe608e2409f8c23dfdc6a9afea32b83def137348533d38cc1b789"
},
"downloads": -1,
"filename": "daspack_dev-0.0.1a0.tar.gz",
"has_sig": false,
"md5_digest": "c09aeccda57b5f46bb6524acf9cb1b4b",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.8",
"size": 63024,
"upload_time": "2025-08-09T18:32:01",
"upload_time_iso_8601": "2025-08-09T18:32:01.345685Z",
"url": "https://files.pythonhosted.org/packages/98/9c/c0adce7e5c6c286ef29d930efb8e5fb222a732f64b2d929b2132e0ee3d62/daspack_dev-0.0.1a0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-09 18:32:01",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "asleix",
"github_project": "daspack",
"travis_ci": false,
"coveralls": false,
"github_actions": false,
"lcname": "daspack-dev"
}