# polystore
**Framework-agnostic multi-backend storage abstraction for ML and scientific computing**
## Features
- **Pluggable Backends**: Disk, memory, Zarr, and streaming backends with auto-registration
- **Multi-Framework I/O**: Seamless support for NumPy, PyTorch, JAX, TensorFlow, CuPy
- **Atomic Operations**: Cross-platform atomic file writes with automatic locking
- **Batch Operations**: Efficient batch loading and saving
- **Format Detection**: Automatic format detection and routing
- **Type-Safe**: Full type hints and mypy support
- **Minimal Dependencies**: Core requires only NumPy (framework support is optional)
## Quick Start
```python
import numpy as np

from polystore import FileManager, BackendRegistry

# Create registry and file manager
registry = BackendRegistry()
fm = FileManager(registry)

# Save data to disk
data = np.array([[1, 2], [3, 4]])
fm.save(data, "output.npy", backend="disk")

# Load data back
loaded = fm.load("output.npy", backend="disk")

# Use memory backend for testing
fm.save(data, "test.npy", backend="memory")
cached = fm.load("test.npy", backend="memory")
```
## Installation
```bash
# Base installation (NumPy only)
pip install polystore

# With specific frameworks (quotes keep shells such as zsh from expanding the brackets)
pip install "polystore[zarr]"
pip install "polystore[torch]"
pip install "polystore[jax]"
pip install "polystore[tensorflow]"
pip install "polystore[cupy]"

# With streaming support
pip install "polystore[streaming]"

# With all optional dependencies
pip install "polystore[all]"
```
## Supported Backends
| Backend | Description | Storage | Dependencies |
|---------|-------------|---------|--------------|
| **disk** | Local filesystem | Persistent | None |
| **memory** | In-memory cache | Volatile | None |
| **zarr** | Zarr/OME-Zarr arrays | Persistent | zarr, ome-zarr |
| **streaming** | ZeroMQ streaming | None | pyzmq |
## Supported Formats
| Format | Extensions | Frameworks |
|--------|-----------|------------|
| **NumPy** | `.npy`, `.npz` | NumPy, PyTorch, JAX, TensorFlow, CuPy |
| **TIFF** | `.tif`, `.tiff` | NumPy, PyTorch, JAX, TensorFlow, CuPy |
| **Zarr** | `.zarr` | NumPy, PyTorch, JAX, TensorFlow, CuPy |
| **PyTorch** | `.pt`, `.pth` | PyTorch |
| **CSV** | `.csv` | NumPy, pandas |
| **JSON** | `.json` | Python dicts |
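
The extension-based routing implied by the table above can be sketched as a plain lookup. This is only an illustration: `EXTENSION_TO_FORMAT` and `detect_format` are hypothetical names, the mapping is copied from the table, and polystore's own detection logic may work differently.

```python
from pathlib import Path

# Hypothetical mapping copied from the "Supported Formats" table;
# not polystore's internal data structure.
EXTENSION_TO_FORMAT = {
    ".npy": "numpy",
    ".npz": "numpy",
    ".tif": "tiff",
    ".tiff": "tiff",
    ".zarr": "zarr",
    ".pt": "pytorch",
    ".pth": "pytorch",
    ".csv": "csv",
    ".json": "json",
}


def detect_format(path):
    """Return a format name based on the path's case-insensitive suffix."""
    suffix = Path(path).suffix.lower()
    try:
        return EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
```

A caller could use the returned name to pick a reader or writer, for example `detect_format("plate.zarr")` to select the Zarr path.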
## Architecture
```
polystore/
├── base.py # Abstract interfaces (DataSink, DataSource, StorageBackend)
├── backend_registry.py # Auto-registration system
├── disk.py # Disk storage backend
├── memory.py # In-memory backend
├── zarr.py # Zarr backend
├── streaming.py # ZeroMQ streaming backend
├── filemanager.py # High-level API
├── atomic.py # Atomic file operations
└── exceptions.py # Custom exceptions
```
## Advanced Usage
### Custom Backends
```python
from polystore import StorageBackend


class MyBackend(StorageBackend):
    _backend_type = 'my_backend'  # Auto-registers

    def save(self, data, file_path, **kwargs):
        # Your save logic
        pass

    def load(self, file_path, **kwargs):
        # Your load logic
        pass
```
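
The README does not say how setting `_backend_type` triggers registration. One common way to get this behavior in Python is the `__init_subclass__` hook, sketched below with a stand-in base class. `SketchBackend`, its `_registry` dict, and `DictBackend` are hypothetical names for illustration; polystore may use a metaclass or another mechanism.

```python
class SketchBackend:
    """Stand-in base class for illustration; NOT polystore's StorageBackend."""

    _registry = {}        # maps backend name -> backend class
    _backend_type = None  # subclasses set this to opt in to registration

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register only classes that declare their own _backend_type.
        name = cls.__dict__.get("_backend_type")
        if name is not None:
            SketchBackend._registry[name] = cls

    def save(self, data, file_path, **kwargs):
        raise NotImplementedError

    def load(self, file_path, **kwargs):
        raise NotImplementedError


class DictBackend(SketchBackend):
    _backend_type = "dict"  # registered as soon as the class body is executed

    def __init__(self):
        self._store = {}

    def save(self, data, file_path, **kwargs):
        self._store[str(file_path)] = data

    def load(self, file_path, **kwargs):
        return self._store[str(file_path)]


# Look the backend up by name, the way a registry-driven manager might.
backend = SketchBackend._registry["dict"]()
backend.save([1, 2, 3], "a.npy")
```

With this pattern, defining the subclass is enough; no explicit `register()` call is needed.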
### Batch Operations
```python
import numpy as np

# `fm` is the FileManager created in Quick Start

# Save multiple files
data_list = [np.random.rand(100, 100) for _ in range(10)]
paths = [f"image_{i}.npy" for i in range(10)]
fm.save_batch(data_list, paths, backend="disk")

# Load multiple files
loaded_list = fm.load_batch(paths, backend="disk")
```
### Atomic Writes
```python
from polystore import atomic_write, atomic_write_json

# Atomic file write with automatic locking
with atomic_write("output.txt") as f:
    f.write("data")

# Atomic JSON write
atomic_write_json({"key": "value"}, "config.json")
```
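
For readers unfamiliar with the technique, an atomic write is usually built from a temporary file in the target directory followed by a rename. The sketch below uses only the standard library. `atomic_write_sketch` is a hypothetical name, it omits the locking that polystore advertises, and it is not a copy of polystore's `atomic_write`.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_write_sketch(path, mode="w"):
    """Write to a temp file, then rename it over `path` in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target
    # so that os.replace can swap it in atomically.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, mode) as f:
            yield f
            f.flush()
            os.fsync(f.fileno())  # push the data to disk before the rename
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, remove the temp file and leave `path` untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the body of the `with` block raises, the target keeps its previous contents, so readers never see a half-written file.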
## Why polystore?
**Before** (Manual backend management):
```python
if backend == 'disk':
    np.save(path, data)
elif backend == 'memory':
    cache[path] = data
elif backend == 'zarr':
    zarr.save(path, data)
# ... 50 more lines of if/elif ...
```
**After** (polystore):
```python
fm.save(data, path, backend=backend)
```
## Documentation
Full documentation is available at [polystore.readthedocs.io](https://polystore.readthedocs.io).
## Addons
Extend polystore with additional backends:
- **polystore-napari**: Napari viewer streaming backend
- **polystore-fiji**: Fiji/ImageJ streaming backend
- **polystore-omero**: OMERO server backend
## Performance
- **Zero-copy** conversions between frameworks via DLPack (when possible)
- **Lazy loading** for optional dependencies
- **Batch operations** for efficient I/O
- **Atomic writes** with minimal overhead
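
Lazy loading of optional dependencies, as listed above, is typically done by deferring the import until a feature is first used. A minimal sketch, assuming a hypothetical helper named `optional_import` (not part of polystore's documented API) and reusing the extras names from the Installation section:

```python
import importlib


def optional_import(module_name, extra):
    """Import `module_name` on demand, or raise an error with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install it with: pip install 'polystore[{extra}]'"
        ) from exc


def load_torch():
    # The import cost is paid only when a PyTorch code path is actually hit.
    return optional_import("torch", "torch")
```

Because nothing is imported at module load time, `import polystore`-style startup stays cheap even when heavy frameworks are installed.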
## License
MIT License - see LICENSE file for details
## Contributing
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
## Credits
Developed by Tristan Simas. Extracted from the [OpenHCS](https://github.com/trissim/openhcs) project.