daspack-dev

Name: daspack-dev
Version: 0.0.1a0
Summary: DASPack: Controlled data compression for Distributed Acoustic Sensing
Author / maintainer: Aleix Segui
License: BSD-3-Clause
Requires Python: >=3.8
Keywords: distributed acoustic sensing, compression, data compression
Homepage: https://aleixsegui.dev/
Repository: https://github.com/asleix/daspack
Upload time: 2025-08-09 18:32:01
            <p align="center">
  <img src="docs/assets/logo.svg" alt="DASPack Logo" />
</p>

# DASPack: Controlled data compression for Distributed Acoustic Sensing

DASPack is a fast, open-source compressor for huge Distributed Acoustic Sensing (DAS) datasets.  
It supports **lossless** and **fixed-accuracy lossy** modes, letting you store data with an exact bound on reconstruction error.

The core is written in Rust for speed and safety, with a thin Python API for convenient integration into your workflows.

**DISCLAIMER:** We are still testing the code; the official release is expected by mid-August.

---

## ✨ Highlights

- **Lossless or fixed-accuracy** — pick zero error or a max absolute error and get exactly what you asked for.
- **Multi-threaded** — control the number of threads per encode/decode call.
- **High throughput** — 800 MB/s+ on an 8-core laptop in typical workloads.
- **Self-describing streams** — all parameters (codec, quantizer, shape) are stored in the bitstream; no sidecars needed.
- **Pure Rust core** — no unsafe C buffers exposed to user code.
- **Python bindings** — direct `encode` / `decode` interface for NumPy arrays.

---

## 🚀 Quick start

### 1. Install (Python ≥ 3.9)

```bash
pip install daspack
# or, from source (Rust ≥ 1.74):
# maturin develop --release
```

### 2. Encode and store with h5py

You can store the compressed DASPack bitstream as raw bytes in HDF5:

```python
import numpy as np, h5py
from daspack import DASCoder, Quantizer

# Example: lossless compression with 4 threads
data = np.random.randint(-1000, 1000, size=(4096, 8192), dtype=np.int32)
coder = DASCoder(threads=4)

# Encode in Lossless mode
stream = coder.encode(
    data,
    Quantizer.Lossless(),
    blocksize=(1024, 1024),
    levels=0,
    order=0,
)

with h5py.File("example.h5", "w") as f:
    f.create_dataset("compressed", data=np.frombuffer(stream, dtype=np.uint8))
```

### 3. Read and decode

```python
import numpy as np, h5py
from daspack import DASCoder

coder = DASCoder(threads=4)

with h5py.File("example.h5", "r") as f:
    raw = f["compressed"][:].tobytes()

# Decode: dtype is inferred from the stream
restored = coder.decode(raw)
```


### 4. Lossy example with fixed error bound

```python
import numpy as np
from daspack import DASCoder, Quantizer

# Generate some example data
data = np.random.uniform(-100, 100, size=(6, 8)).astype(np.float64)

coder = DASCoder(threads=2)

# Target: absolute error ≤ step/2
step = 0.5

# Encode with Uniform quantizer (lossy) and given step
stream = coder.encode(
    data,
    Quantizer.Uniform(step=step),
)

# Decode (dtype inferred from stream)
restored = coder.decode(stream)

# Verify bound
tol = step / 2 + 1e-12
max_err = np.max(np.abs(restored - data))
print(f"Max abs error: {max_err:.6f} (tolerance {tol})")
assert max_err <= tol

print("Original data:\n", data)
print("Restored data:\n", restored)
```

The output will look like this (exact values depend on the random data):
```
Max abs error: 0.250000 (tolerance 0.250000)
Original data:
 [[ ... ]]
Restored data:
 [[ ... ]]
```

---


## ⚙️ How it works

```
(float mode) Quantize → Wavelet (5/3) → 2-D LPC → Arithmetic coding
(int mode)   Identity  → Wavelet (5/3) → 2-D LPC → Arithmetic coding
```
The lossy path is bounded-error thanks to uniform quantization; the rest of the chain is perfectly reversible.
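Both claims can be checked in a few lines of NumPy. The sketch below is not the DASPack implementation, just the underlying arithmetic: rounding to a uniform grid caps the absolute error at `step/2`, and an integer LeGall 5/3 lifting step (with periodic boundary handling, chosen here for brevity) reconstructs its input bit for bit:

```python
import numpy as np

def fwd53(x):
    """Forward integer LeGall 5/3 lifting on a 1-D int array of even length."""
    s, d = x[0::2].copy(), x[1::2].copy()
    # Predict step: detail = odd sample minus floored average of neighbouring evens
    d -= (s + np.roll(s, -1)) // 2
    # Update step: approximation = even sample plus rounded quarter-sum of details
    s += (np.roll(d, 1) + d + 2) // 4
    return s, d

def inv53(s, d):
    """Inverse lifting: undo the update, then the predict, in reverse order."""
    s = s - (np.roll(d, 1) + d + 2) // 4
    x = np.empty(2 * s.size, dtype=s.dtype)
    x[0::2] = s
    x[1::2] = d + (s + np.roll(s, -1)) // 2
    return x

rng = np.random.default_rng(0)

# 1) Uniform quantization: reconstruction error is bounded by step/2
data = rng.uniform(-100, 100, size=1000)
step = 0.5
restored = np.round(data / step) * step
assert np.max(np.abs(restored - data)) <= step / 2

# 2) Integer 5/3 lifting: perfectly reversible
x = rng.integers(-1000, 1000, size=64, dtype=np.int64)
assert np.array_equal(inv53(*fwd53(x)), x)
print("quantization bound and lossless lifting verified")
```

Because each lifting step only adds a deterministic function of the other half of the samples, the inverse simply subtracts it back, which is why the integer chain is exactly reversible regardless of rounding.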

Read the paper (see citation below!) for more information 😄


---

## 📄 License

DASPack is released under the 3-Clause BSD License.

---

## 🤝 Contributing

Bug reports and pull requests are welcome.
If you plan a large change, please open an issue first so we can discuss the design.

---

## 📣 Citing

If you use DASPack in academic work, please cite:

> Seguí, A. *et al.* (2025). **DASPack: Controlled Data Compression for Distributed Acoustic Sensing**. *Geophysical Journal International*.\
> DOI: *pending*

Thanks for supporting open science!



            
