# Caterva2: On-demand access to Blosc2/HDF5 data repositories
## What is it?
Caterva2 is a service meant for serving [Blosc2][] and [HDF5][] datasets among authenticated users, work groups, or the public. There are several interfaces to Caterva2, including a web GUI, a REST API, a Python API, and a command-line client.
<img src="./doc/_static/caterva2-block-diagram2.png" alt="Figure: Caterva2 block diagram" width="100%"/>
It can be used either remotely or locally, as a simple way to access datasets in a directory hierarchy, or to share them with other users in the same network.
<img src="./doc/_static/caterva2-data-sharing.png" alt="Figure: How data can be shared" width="50%"/>
The Python API is the recommended way for building your own Caterva2 clients, whereas the web client provides a more user-friendly interface for browsing and accessing datasets.
<img src="./doc/_static/web-tomo-view.png" alt="Figure: web viewer for tomography" width="100%"/>
[Blosc2]: https://www.blosc.org/pages/blosc-in-depth/
"What Is Blosc? (Blosc blog)"
[HDF5]: https://www.hdfgroup.org/solutions/hdf5/
"HDF5 (HDF Group)"
## Caterva2 Clients
The main role of the Caterva2 package is to provide a simple and lightweight library to build your own Caterva2 clients. The variety of interfaces available allows you to choose the one that best fits your needs. For example, querying a dataset can be accomplished in any of the following ways:
- Via the [web GUI](https://ironarray.io/caterva2-doc/tutorials/web-client.html) using a browser <img src="./doc/_static/web-data-view.png" alt="Figure: web data browser and viewer" width="100%"/>
- Via the [Python API](https://ironarray.io/caterva2-doc/tutorials/API.html)
```python
import caterva2 as cat2

# Connect to the public demo server and fetch a dataset object
client = cat2.Client("https://cat2.cloud/demo")
client.get("@public/examples/tomo-guess-test.b2nd")
```
- Via the [command line client](https://ironarray.io/caterva2-doc/tutorials/cli.html)
```sh
cat2cli info @public/kevlar/entry/data/data.b2nd
```
- Via the [REST API](https://ironarray.io/caterva2-doc/tutorials/RESTAPI.html) using a REST client like [Postman](https://www.postman.com/) or [curl](https://curl.se/) (see [here](https://cat2.cloud/demo/docs)).
In addition, as Caterva2 supports authentication, all client interfaces provide a way to log in and access private datasets. Authenticated users can be administered with Caterva2's built-in tools (see section "User authentication" below).
## Installation
You may install Caterva2 in several ways:
- Pre-built wheel from PyPI:
```sh
python -m pip install caterva2
```
- Wheel built from source code:
```sh
git clone https://github.com/ironArray/Caterva2
cd Caterva2
python -m build
python -m pip install dist/caterva2-*.whl
```
- Developer setup:
```sh
git clone https://github.com/ironArray/Caterva2
cd Caterva2
python -m pip install -e .
```
When a user queries datasets through a client (web GUI, REST API, Python API, or command line), the client connects to a Caterva2 **subscriber** service, which
accesses the relevant datasets stored either locally or remotely. To manage subscriber services from the command line, install the `caterva2` package with the `subscriber` extra. The quick start below also uses the command line client, so we install the `clients` extra as well:
```sh
python -m pip install "caterva2[subscriber,clients]"
```
In general, if you intend to run Caterva2 services, client programs, or the test suite, you need to enable the proper extra features by appending `[feature1,feature2...]` to the last argument of `pip` commands above. The following extras are supported:
- `subscriber` for running the Caterva2 subscriber service
- `clients` to use Caterva2 client programs (command-line or terminal)
- `blosc2-plugins` to enable extra Blosc2 features like Btune or JPEG 2000 support
- `plugins` to enable web GUI features like the tomography display
- `tools` for additional utilities like `cat2import` and `cat2export` (see below)
- `tests` if you want to run the Caterva2 test suite
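As a minimal sketch of how such an extras specifier is assembled, the following Python snippet builds the requirement string that is passed to `pip`. The helper name `requirement_with_extras` is hypothetical and not part of Caterva2; it only illustrates that the bracketed list is attached directly to the package name, with no spaces.

```python
def requirement_with_extras(package, extras):
    """Return a pip requirement string such as 'caterva2[subscriber,clients]'."""
    # Drop empty entries and surrounding whitespace; pip expects no spaces here.
    cleaned = [extra.strip() for extra in extras if extra.strip()]
    if not cleaned:
        return package
    return f"{package}[{','.join(cleaned)}]"


spec = requirement_with_extras("caterva2", ["subscriber", "clients"])
print(spec)
```

When typing the resulting string in a shell, it is safest to quote it (for example `python -m pip install "caterva2[subscriber,clients]"`), since some shells treat square brackets as glob patterns.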
### Testing
After installing with the `[tests]` extra, you can quickly check that the package is sane by running the test suite (that comes with the package):
```sh
python -m caterva2.tests -v
```
You may also run tests from source code:
```sh
cd Caterva2
python -m pytest -v
```
Tests will use a copy of Caterva2's `root-example` directory. After they finish, state files will be left under the `_caterva2_tests` directory for inspection (it will be re-created when tests are run again).
## Quick start
(Find more detailed step-by-step [tutorials](Tutorials) in Caterva2 documentation.)
For the purpose of this quick start, let's use the datasets within the `root-example` folder:
```sh
cd Caterva2
ls -F root-example/
```
```
README.md dir2/ ds-1d-fields.b2nd ds-2d-fields.b2nd ds-sc-attr.b2nd
dir1/ ds-1d-b.b2nd ds-1d.b2nd ds-hello.b2frame
```
Now:
- create a virtual environment and install Caterva2 with the `[subscriber,clients]` extras (see above).
- copy the configuration file `caterva2.sample.toml` to `caterva2.toml`.
Subscribers (and clients, to a limited extent) may get their configuration from a `caterva2.toml` file at the current directory (or an alternative file given with the `--conf` option).
See also [configuration.md](configuration.md) in Caterva2 tutorials.
Then run the subscriber:
```sh
CATERVA2_SECRET=c2sikrit cat2sub & # subscriber
```
The `CATERVA2_SECRET` environment variable is required; it is explained in the next section.
### User authentication
The Caterva2 subscriber includes some support for authenticating users. To enable it, run the subscriber with the environment variable `CATERVA2_SECRET` set to some non-empty, secure string that will be used for various user management operations. Note that new accounts may be registered, but their addresses are not verified. Password recovery does not work either.
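One possible way to obtain a "non-empty, secure string" is to generate it with Python's standard `secrets` module and hand it to the subscriber process through its environment. The sketch below assumes only what is stated above (the variable name `CATERVA2_SECRET` and the `cat2sub` command); the helper `subscriber_env` is hypothetical, and the actual launch is left as a comment so the snippet runs without Caterva2 installed.

```python
import os
import secrets


def subscriber_env(secret=None, base_env=None):
    """Return a copy of the environment with CATERVA2_SECRET set."""
    env = dict(os.environ if base_env is None else base_env)
    if secret is None:
        # 32 random bytes, URL-safe encoded; hard to guess.
        secret = secrets.token_urlsafe(32)
    if not secret:
        raise ValueError("CATERVA2_SECRET must be a non-empty string")
    env["CATERVA2_SECRET"] = secret
    return env


env = subscriber_env()
# To launch the subscriber with this environment (requires the subscriber extra):
#   import subprocess
#   subprocess.Popen(["cat2sub"], env=env)
```

Keeping the secret out of shell history and scripts, as this approach does, is a design choice rather than a Caterva2 requirement.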
To create a user, you can use the `cat2adduser` command line client. For example:
```sh
cat2adduser user@example.com foobar11
```
Client queries then require the same user credentials:
- The user will be prompted to log in when accessing the web client using a browser
- The Python API client can be authenticated in the following way:
```python
import caterva2 as cat2

# Pass the credentials as a (username, password) tuple
client = cat2.Client("https://cat2.cloud/demo", ('user@example.com', 'foobar11'))
```
- The command line client can be authenticated with the `--username` and `--password` options
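To avoid hard-coding credentials in scripts, one option is to read them from environment variables and build the `(username, password)` tuple used in the Python example above. This is only a sketch: the variable names `MY_CAT2_USER` and `MY_CAT2_PASS` and the helper `credentials_from_env` are assumptions made for illustration, not names defined by Caterva2.

```python
import os


def credentials_from_env(environ=None, user_var="MY_CAT2_USER", pass_var="MY_CAT2_PASS"):
    """Return (username, password) from the environment, or None if either is missing."""
    environ = os.environ if environ is None else environ
    user = environ.get(user_var)
    password = environ.get(pass_var)
    if not user or not password:
        return None
    return (user, password)


# Demonstration with an explicit mapping instead of the real environment.
creds = credentials_from_env(
    {"MY_CAT2_USER": "user@example.com", "MY_CAT2_PASS": "foobar11"}
)
print(creds)
# With caterva2 installed, the tuple would be passed as the second argument:
#   client = cat2.Client("https://cat2.cloud/demo", creds)
```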
### The command line client
Now that the services are running, we can use the `cat2cli` client to talk
to the subscriber. In another shell, let's list all the available roots in the system:
```sh
cat2cli --username "user@example.com" --password "foobar11" roots
```
```
@public (subscribed)
@personal (subscribed)
@shared (subscribed)
```
First, let's upload a file from the `root-example` folder to the `@personal` root:
```sh
cat2cli --username user@example.com --password foobar11 upload root-example/ds-1d.b2nd @personal/ds-1d.b2nd
```
Now we can list the datasets in the `@personal` root and see that the uploaded file appears:
```sh
cat2cli --username user@example.com --password foobar11 list @personal
>> ds-1d.b2nd
```
Let's ask the subscriber for more info about the dataset:
```sh
cat2cli --username user@example.com --password foobar11 info @personal/ds-1d.b2nd
```
```
Getting info for @personal/ds-1d.b2nd
{
'shape': [1000],
'chunks': [100],
'blocks': [10],
'dtype': 'int64',
'schunk': {
'cbytes': 5022,
'chunkshape': 100,
'chunksize': 800,
'contiguous': True,
'cparams': {'codec': 5, 'codec_meta': 0, 'clevel': 1, 'filters': [0, 0, 0, 0, 0, 1], 'filters_meta': [0, 0, 0, 0, 0, 0], 'typesize': 8, 'blocksize': 80, 'nthreads': 1, 'splitmode': 1, 'tuner': 0, 'use_dict': False, 'filters, meta': [[1, 0]]},
'cratio': 1.5929908403026682,
'nbytes': 8000,
'urlpath': '/home/lshaw/Caterva2/_caterva2/sub/personal/2fa87091-84c6-44f9-a57e-7f04290630b1/ds-1d.b2nd',
'vlmeta': {},
'nchunks': 10,
'mtime': None
},
'mtime': '2025-05-29T09:11:26.860956Z'
}
```
This command prints a JSON-like object with the dataset's metadata, including its shape, chunks, blocks, data type, and compression parameters. The `schunk` field contains information about the underlying Blosc2 super-chunk that stores the dataset's data.
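The numbers in this output are related to each other by simple arithmetic. As a small worked check (using only values copied from the listing above, with the dictionary trimmed to the fields involved), the following sketch recomputes the sizes and the compression ratio:

```python
# Values copied from the `cat2cli info` output above (trimmed).
info = {
    "shape": [1000],
    "chunks": [100],
    "blocks": [10],
    "schunk": {
        "cbytes": 5022,   # compressed size in bytes
        "nbytes": 8000,   # uncompressed size in bytes
        "cparams": {"typesize": 8},  # int64 -> 8 bytes per item
    },
}

typesize = info["schunk"]["cparams"]["typesize"]
nbytes = info["shape"][0] * typesize          # 1000 items * 8 bytes = 8000
chunksize = info["chunks"][0] * typesize      # 100 items * 8 bytes = 800
blocksize = info["blocks"][0] * typesize      # 10 items * 8 bytes = 80
nchunks = -(-info["shape"][0] // info["chunks"][0])  # ceil(1000 / 100) = 10
cratio = info["schunk"]["nbytes"] / info["schunk"]["cbytes"]  # uncompressed / compressed

print(nbytes, chunksize, blocksize, nchunks, round(cratio, 4))
```

These match the reported `nbytes`, `chunksize`, `blocksize`, `nchunks`, and `cratio` fields.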
There are more commands available in the `cat2cli` client; ask for help with:
```sh
cat2cli --help
```
### Docs
To see how to use the Python API, the REST API, and the web GUI, check out the [Caterva2 documentation](https://ironarray.io/caterva2-doc/tutorials/API.html). You'll also find more information on how to use Caterva2, including tutorials, API references, and examples, [here](https://ironarray.io/caterva2-doc/index.html).
That's all folks!
Raw data
{
"_id": null,
"home_page": null,
"name": "caterva2",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": null,
"keywords": "blosc2",
"author": null,
"author_email": "ironArray SLU <contact@ironarray.io>",
"download_url": "https://files.pythonhosted.org/packages/d1/c7/f88c5f898446359a3520d27294ccd0f7c51c52bfa267e039c052843938af/caterva2-2025.8.7.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GNU Affero General Public License version 3",
"summary": "A high-performance storage and computation system for Blosc2 datasets",
"version": "2025.8.7",
"project_urls": {
"Home": "https://github.com/ironArray/Caterva2"
},
"split_keywords": [
"blosc2"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "e96b0826e845ca608660e56ac2ea4ea4610815c5f2ef5d896df67d174c1554a8",
"md5": "d02ec8676a035c25024cd9666f9ff8b5",
"sha256": "551a0f146ffcbb727aac118149c7f0f026308de2336864a513ba0aec2c5bdae7"
},
"downloads": -1,
"filename": "caterva2-2025.8.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "d02ec8676a035c25024cd9666f9ff8b5",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 916699,
"upload_time": "2025-08-07T08:40:19",
"upload_time_iso_8601": "2025-08-07T08:40:19.954513Z",
"url": "https://files.pythonhosted.org/packages/e9/6b/0826e845ca608660e56ac2ea4ea4610815c5f2ef5d896df67d174c1554a8/caterva2-2025.8.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "d1c7f88c5f898446359a3520d27294ccd0f7c51c52bfa267e039c052843938af",
"md5": "7bb58893742f80874827bf5192c70665",
"sha256": "4604377b8301b8ca96a4769495e75a41cb8b34dd304c90558ff66e32b546fa50"
},
"downloads": -1,
"filename": "caterva2-2025.8.7.tar.gz",
"has_sig": false,
"md5_digest": "7bb58893742f80874827bf5192c70665",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 5567617,
"upload_time": "2025-08-07T08:40:21",
"upload_time_iso_8601": "2025-08-07T08:40:21.514704Z",
"url": "https://files.pythonhosted.org/packages/d1/c7/f88c5f898446359a3520d27294ccd0f7c51c52bfa267e039c052843938af/caterva2-2025.8.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-07 08:40:21",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "ironArray",
"github_project": "Caterva2",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "caterva2"
}