jdata


Namejdata JSON
Version 0.8.0 PyPI version JSON
download
home_pagehttps://github.com/NeuroJSON/pyjdata
SummaryJSON/binary JSON formats for exchanging Python and Numpy data
upload_time2025-08-03 22:33:50
maintainerQianqian Fang
docs_urlNone
authorQianqian Fang
requires_pythonNone
licenseApache license 2.0
keywords json jdata ubjson bjdata openjdata neurojson jnifti jmesh encoder decoder
VCS
bugtrack_url
requirements No requirements were recorded.
Travis-CI
coveralls test coverage No coveralls.
            ![](https://neurojson.org/wiki/upload/neurojson_banner_long.png)

# JData - NeuroJSON client with fast parsers for JSON, binary JSON, NIFTI, SNIRF, CSV/TSV, HDF5 data files

- Copyright: (C) Qianqian Fang (2019-2025) <q.fang at neu.edu>
- License: Apache License, Version 2.0
- Version: 0.8.0
- URL: https://github.com/NeuroJSON/pyjdata
- Acknowledgement: This project is supported by US National Institute of Health (NIH)
  grant [U24-NS124027](https://reporter.nih.gov/project-details/10308329)

![Build Status](https://github.com/NeuroJSON/pyjdata/actions/workflows/run_test.yml/badge.svg)

## Table of Contents

- [Introduction](#introduction)
- [File formats](#file-formats)
- [Submodules](#submodules)
- [How to install](#how-to-install)
- [How to build](#how-to-build)
- [How to use](#how-to-use)
- [Advanced interfaces](#advanced-interfaces)
- [Reading JSON via REST-API](#reading-json-via-rest-api)
- [Using JSONPath to access and query complex datasets](#using-jsonpath-to-access-and-query-complex-datasets)
- [Downloading and caching `_DataLink_` referenced external data files](#downloading-and-caching-_datalink_-referenced-external-data-files)
- [Utility](#utility)
- [How to contribute](#how-to-contribute)
- [Test](#test)

## Introduction

`jdata` is a lightweight and fast neuroimaging data file parser, with built
in support for NIfTI-1/2 (`.nii`, `.nii.gz`), two-part Analyze 7.5 (`.img/.hdr`, `.img.gz`),
HDF5 (`.h5`), SNIRF (`.snirf`), MATLAB .mat files (`.mat`), CSV/TSV (`.csv`, `.csv.gz`,
`.tsv`, `.tsv.gz`), JSON (`.json`), and various binary-JSON data formats, including
BJData (`.bjd`), UBJSON (`.ubj`), and MessagePack (`.msgpack`) formats. `jdata` can
load data files both from local storage and REST-API via URLs. To maximize portability,
the outputs of `jdata` data parsers are intentionally based upon only the **native Python**
data structures (`dict/list/tuple`) plus `numpy` arrays. The entire package is less than
60KB in size and is platform-independent.

`jdata` highly compatible to the [JSONLab toolbox](https://github.com/NeuroJSON/jsonlab)
for MATLAB/Octave, serving as the reference library for Python for the
[JData Specification](https://github.com/NeuroJSON/jdata/),
The JData Specification defines a lightweight
language-independent data annotation interface enabling easy storing
and sharing of complex data structures across different programming
languages such as MATLAB, JavaScript, Python etc. Using JData formats, a 
complex Python data structure, including numpy objects, can be encoded
as a simple `dict` object that is easily serialized as a JSON/binary JSON
file and share such data between programs of different languages.

Since 2021, the development of the `jdata` module and the underlying data format specificaitons
[JData](https://neurojson.org/jdata/draft3) and [BJData](https://neurojson.org/bjdata/draft3)
have been funded by the US National Institute of Health (NIH) as
part of the NeuroJSON project (https://neurojson.org and https://neurojson.io).

The goal of the NeuroJSON project is to develop scalable, searchable, and
reusable neuroimaging data formats and data sharing platforms. All data
produced from the NeuroJSON project will be using JSON/Binary JData formats as the
underlying serialization standards and the lightweight JData specification as
language-independent data annotation standard.

## File formats

The supported data formats can be found in the below table. All file types
support reading and writing, except those specified below.

| Format | Name |       |  Format                           | Name   |
| ------ | ------ | --- |-----------------------------------| ------ |
| **JSON-compatible files**  | |  | **Binary JSON (same format)** **[1]** | |
| ✅ `.json` | ✅ JSON files |                        | ✅ `.bjd`    | ✅ binary JSON (BJD) files |
| ✅ `.jnii` | ✅ JSON-wrapper for NIfTI data (JNIfTI)|       | ✅ `.bnii`   | ✅ BJD-wrapper for NIfTI data |
| ✅ `.jnirs` | ✅ JSON-wrapper for SNIRF data (JSNIRF)|      | ✅ `.bnirs`  | ✅ BJD-wrapper for SNIRF data |
| ✅ `.jmsh` | ✅ JSON-encoded mesh data (JMesh)  |   | ✅ `.bmsh`   | ✅ BJD-encoded for mesh data  |
| ✅ `.jdt` | ✅ JSON files with JData annotations |  | ✅ `.jdb`    | ✅ BJD files with JData annotations |
| ✅ `.jdat` | ✅ JSON files with JData annotations | | ✅ `.jbat`   | ✅ BJD files with JData annotations |
| ✅ `.jbids` | ✅ JSON digest of a BIDS dataset |    | ✅ `.pmat`   | ✅ BJD encoded .mat files |
| **NIfTI formats**                      | |           | **CSV/TSV formats** | |
| ✅ `.nii` | ✅ uncompressed NIfTI-1/2 files |       | ✅ `.csv`    | ✅ CSV files |
| ✅ `.nii.gz` | ✅ compressed NIfTI files |          | ✅ `.csv.gz` | ✅ compressed CSV files |
| ✅ `.img/.hdr` | ✅ Analyze 7.5 two-part files |    | ✅ `.tsv`    | ✅ TSV files |
| ✅ `.img.gz` | ✅ compressed Analyze files |        | ✅ `.tsv.gz` | ✅ compressed TSV files |
| **HDF5 formats** **[2]**              | |           | **Other formats (read-only)** | |
| ✅ `.h5` | ✅ HDF5 files |                          | ✅ `.mat`    | ✅ MATLAB .mat files **[3]** |
| ✅ `.hdf5` | ✅ HDF5 files |                        | ✅ `.bval`   | ✅ EEG .bval files |
| ✅ `.snirf` | ✅ HDF5-based SNIRF data |            | ✅ `.bvec`   | ✅ EEG .bvec files |
| ✅ `.nwb` | ✅ HDF5-based NWB files |               | ✅ `.msgpack`| ✅ Binary JSON MessagePack format **[4]** |

- [1] requires `bjdata` Python module when needed, `pip install bjdata`
- [2] requires `h5py` Python module when needed, `pip install h5py`
- [3] requires `scipy` Python module when needed, `pip install scipy`
- [4] requires `msgpack` Python module when needed, `pip install msgpack`

## Submodules

The `jdata` module further partition the functions into smaller submodules, including
- **jdata.jfile** provides `loadjd`, `savejd`, `load`, `save`, `loadt`, `savet`, `loadb`, `saveb`, `loadts`, `loadbs`, `jsoncache`, `jdlink`, ...
- **jdata.jdata** provides `encode`, `decode`, `jdataencode`, `jdatadecode`, `{zlib,gzip,lzma,lz4,base64}encode`, `{zlib,gzip,lzma,lz4,base64}decode`
- **jdata.jpath** provides `jsonpath`
- **jdata.jnifti** provides `load{jnifti,nifti}`, `save{jnifti,nifti,jnii,bnii}`, `nii2jnii`, `jnii2nii`, `nifticreate`, `jnifticreate`, `niiformat`, `niicodemap`
- **jdata.neurojson** provides `neuroj`, `neurojgui`
- **jdata.h5** provides `loadh5`, `saveh5`, `regrouph5`, `aos2soa`, `soa2aos`, `jsnirfcreate`, `snirfcreate`, `snirfdecode`

All these functions can be found in the MATLAB/GNU Octave equivalent, JSONLab toolbox. Each function can be individually imported
```
# individually imported
from jdata.jfile import loadjd
data=loadjd(...)

# import everything
from jdata import *
data=loadjd(...)

# import under jdata namespace
import jdata as jd
data=jd.loadjd(...)
```

## How to install

* Github: download from https://github.com/NeuroJSON/pyjdata
* PIP: run `pip install jdata` see https://pypi.org/project/jdata/

This package can also be installed on Ubuntu 21.04 or Debian Bullseye via
```
sudo apt-get install python3-jdata
```

On older Ubuntu or Debian releases, you may install jdata via the below PPA:
```
sudo add-apt-repository ppa:fangq/ppa
sudo apt-get update
sudo apt-get install python3-jdata
```

Dependencies:
* **numpy**: PIP: run `pip install numpy` or `sudo apt-get install python3-numpy`
* (optional) **bjdata**: PIP: run `pip install bjdata` or `sudo apt-get install python3-bjdata`, see https://pypi.org/project/bjdata/, only needed to read/write BJData/UBJSON files
* (optional) **lz4**: PIP: run `pip install lz4`, only needed when encoding/decoding lz4-compressed data
* (optional) **h5py**: PIP: run `pip install h5py`, only needed when reading/writing .h5 and .snirf files
* (optional) **scipy**: PIP: run `pip install scipy`, only needed when loading MATLAB .mat files
* (optional) **msgpack**: PIP: run `pip install msgpack`, only needed when loading MessagePack .msgpack files
* (optional) **blosc2**: PIP: run `pip install blosc2`, only needed when encoding/decoding blosc2-compressed data
* (optional) **backports.lzma**: PIP: run `sudo apt-get install liblzma-dev` and `pip install backports.lzma` (needed for Python 2.7), only needed when encoding/decoding lzma-compressed data
* (optional) **python3-tk**: run `sudo apt-get install python3-tk` to install the Tk support on a Linux in order to run `neurojgui` function

Replacing `pip` by `pip3` if you are using Python 3.x. If either `pip` or `pip3` 
does not exist on your system, please run
```
sudo apt-get install python3-pip
```
Please note that in some OS releases (such as Ubuntu 20.04), python2.x and python-pip 
are no longer supported.

## How to build

One can also install this module from the source code. To do this, you first
check out a copy of the latest code from Github by
```
git clone https://github.com/NeuroJSON/pyjdata.git
cd pyjdata
```
then install the module to your local user folder by
```
python3 setup.py install --user
```
or, if you prefer, install to the system folder for all users by
```
sudo python3 setup.py install
```

Instead of installing the module, you can also import the jdata module directly from 
your local copy by cd the root folder of the unzipped pyjdata package, and run
```
import jdata as jd
```


## How to use

The `jdata` module provides a unified data parsing and saving interface: `jd.loadjd()` and `jd.savejd()`.
These two functions supports all file format described in the above "File formats" section.
The `jd.loadjd()` function also supports loading online data via URLs.

```
import jdata as jd
nii = jd.loadjd('/path/to/img.nii.gz')
snirf = jd.loadjd('/path/to/mydata.snirf')
nii2 = jd.loadjd('https://example.com/data/vol.nii.gz')
jsondata = jd.loadjd('https://example.com/rest/api/')
matlabdata = jd.loadjd('matlabdata.mat')
jd.savejd(matlabdata, 'newdata.mat')
jd.savejd(matlabdata, 'newdata.jdb', compression='zlib')

jd.savejd(nii2, 'newdata.jnii', compression='lzma')
jd.savejd(nii, 'newdata.bnii', compression='gzip')
jd.savejd(nii, 'newdata.nii.gz')
```

The `jdata` module also serves as the front-end for the free data resources hosted at
NeuroJSON.io. The NeuroJSON client (`neuroj()`) can be started in the GUI mode using

```
import jdata as jd
jd.neuroj('gui')
```

the above command will pop up a window displaying the databases, datasets and data
records for the over 1500 datasets currently hosted on NeuroJSON.io.

The `neuroj` client also supports command-line mode, using the below format

```
import jdata as jd
help(jd.neuroj)                            # print help info for jd.neuroj()
jd.neuroj('list')                          # list all databases on NeuroJSON.io
[db['id'] for db in jd.neuroj('list')['database']]  # list all database IDs
jd.neuroj('list', 'openneuro')             # list all datasets under the `openneuro` database
jd.neuroj('list', 'openneuro', limit=5, skip=5)  # list the 6th to 10th datasets under the `openneuro` database
jd.neuroj('list', 'openneuro', 'ds000001') # list all versions for the `openneuro/ds00001` dataset
jd.neuroj('get', 'openneuro', 'ds000001')  # download and parse the `openneuro/ds00001` dataset as a Python object
jd.neuroj('info', 'openneuro', 'ds000001') # lightweight header information of the `openneuro/ds00001` dataset
jd.neuroj('find', '/abide/')               # find both abide-1 and abide-2 databases using filters
jd.neuroj('find', 'openneuro', '/00[234]$/') # use regular experssion to filter all openneuro datasets
jd.neuroj('find', 'mcx', {'selector': ..., 'find': ...}) # use CouchDB _find API to search data
jd.neuroj('find', 'mcx', {'selector': ..., 'find': ...}) # use CouchDB _find API to search data
jd.neuroj('info', db='mcx', ds='colin27')  # use named inputs
jd.neuroj('get', db='mcx', ds='colin27', file='att1')  # download the attachment `att1` for the `mcx/colin27` dataset
jd.neuroj('put', 'sandbox1d', 'test', '{"obj":1}')  # update `sandbox1d/test` dataset with a new JSON string (need admin account)
jd.neuroj('delete', 'sandbox1d', 'test')   # delete `sandbox1d/test` dataset (need admin account)
```


## Advanced interfaces

The `jdata` module is easy to use. You can use the `encode()/decode()` functions to
encode Python data into JData annotation format, or decode JData structures into
native Python data, for example

```
import jdata as jd
import numpy as np

a={'str':'test','num':1.2,'list':[1.1,[2.1]],'nan':float('nan'),'np':np.arange(1,5,dtype=np.uint8)}
jd.encode(a)
jd.decode(jd.encode(a))
d1=jd.encode(a, compression='zlib',base64=True})
d1
jd.decode(d1,base64=True)
```

One can further save the JData annotated data into JSON or binary JSON (UBJSON) files using
the `jdata.save` function, or loading JData-formatted data to Python using `jdata.load`

```
import jdata as jd
import numpy as np

a={'str':'test','num':1.2,'list':[1.1,[2.1]],'nan':float('nan'),'np':np.arange(1,5,dtype=np.uint8)}
jd.save(a,'test.json')
newdata=jd.load('test.json')
newdata
```

One can use `loadt` or `savet` to read/write JSON-based data files and `loadb` and `saveb` to
read/write binary-JSON based data files. By default, JData annotations are automatically decoded
after loading and encoded before saving. One can set `{'encode': False}` in the save functions
or `{'decode': False}` in the load functions as the `opt` to disable further processing of JData
annotations. We also provide `loadts` and `loadbs` for parsing a string-buffer made of text-based
JSON or binary JSON stream.

PyJData supports multiple N-D array data compression/decompression methods (i.e. codecs), similar
to HDF5 filters. Currently supported codecs include `zlib`, `gzip`, `lz4`, `lzma`, `base64` and various
`blosc2` compression methods, including `blosc2blosclz`, `blosc2lz4`, `blosc2lz4hc`, `blosc2zlib`,
`blosc2zstd`. To apply a selected compression method, one simply set `{'compression':'method'}` as
the option to `jdata.encode` or `jdata.save` function; `jdata.load` or `jdata.decode` automatically
decompress the data based on the `_ArrayZipType_` annotation present in the data. Only `blosc2`
compression methods support multi-threading. To set the thread number, one should define an `nthread`
value in the option (`opt`) for both encoding and decoding.

## Reading JSON via REST-API

If a REST-API (URL) is given as the first input of `load`, it reads the JSON data directly
from the URL and parse the content to native Python data structures. To avoid repetitive download,
`load` automatically cache the downloaded file so that future calls directly load the
locally cached file. If one prefers to always load from the URL without local cache, one should
use `loadurl()` instead. Here is an example

```
import jdata as jd
data = jd.load('https://neurojson.io:7777/openneuro/ds000001');
data.keys()
```

## Using JSONPath to access and query complex datasets

Starting from v0.6.0, PyJData provides a lightweight implementation [JSONPath](https://goessner.net/articles/JsonPath/),
a widely used format for query and access a hierarchical dict/list structure, such as those
parsed by `load` or `loadurl`. Here is an example

```
import jdata as jd

data = jd.loadurl('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json');
jd.jsonpath(data, '$.age')
jd.jsonpath(data, '$.address.city')
jd.jsonpath(data, '$.phoneNumber')
jd.jsonpath(data, '$.phoneNumber[0]')
jd.jsonpath(data, '$.phoneNumber[0].type')
jd.jsonpath(data, '$.phoneNumber[-1]')
jd.jsonpath(data, '$.phoneNumber..number')
jd.jsonpath(data, '$[phoneNumber][type]')
jd.jsonpath(data, '$[phoneNumber][type][1]')
```

The `jd.jsonpath` function does not support all JSONPath features. If more complex JSONPath
queries are needed, one should install `jsonpath_ng` or other more advanced JSONPath support.
Here is an example using `jsonpath_ng`

```
import jdata as jd
from jsonpath_ng.ext import parse

data = jd.loadurl('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json');

val = [match.value for match in parse('$.address.city').find(data)]
val = [match.value for match in parse('$.phoneNumber').find(data)]
```

## Downloading and caching `_DataLink_` referenced external data files

Similarly to [JSONLab](https://github.com/fangq/jsonlab?tab=readme-ov-file#jsoncachem),
PyJData also provides similar external data file downloading/caching capability.

The `_DataLink_` annotation in the JData specification permits linking of external data files
in a JSON file - to make downloading/parsing externally linked data files efficient, such as
processing large neuroimaging datasets hosted on http://neurojson.io, we have developed a system
to download files on-demand and cache those locally. jsoncache.m is responsible of searching
the local cache folders, if found the requested file, it returns the path to the local cache;
if not found, it returns a SHA-256 hash of the URL as the file name, and the possible cache folders

When loading a file from URL, below is the order of cache file search paths, ranking in search order
```
   global-variable NEUROJSON_CACHE | if defined, this path will be searched first
   [pwd '/.neurojson']  	   | on all OSes
   /home/USERNAME/.neurojson	   | on all OSes (per-user)
   /home/USERNAME/.cache/neurojson | if on Linux (per-user)
   /var/cache/neurojson 	   | if on Linux (system wide)
   /home/USERNAME/Library/neurojson| if on MacOS (per-user)
   /Library/neurojson		   | if on MacOS (system wide)
   C:\ProgramData\neurojson	   | if on Windows (system wide)
```
When saving a file from a URL, under the root cache folder, subfolders can be created;
if the URL is one of a standard NeuroJSON.io URLs as below
```
   https://neurojson.org/io/stat.cgi?action=get&db=DBNAME&doc=DOCNAME&file=sub-01/anat/datafile.nii.gz
   https://neurojson.io:7777/DBNAME/DOCNAME
   https://neurojson.io:7777/DBNAME/DOCNAME/datafile.suffix
```
the file datafile.nii.gz will be downloaded to /home/USERNAME/.neurojson/io/DBNAME/DOCNAME/sub-01/anat/ folder
if a URL does not follow the neurojson.io format, the cache folder has the below form
```
   CACHEFOLDER{i}/domainname.com/XX/YY/XXYYZZZZ...
```
where XXYYZZZZ.. is the SHA-256 hash of the full URL, XX is the first two digit, YY is the 3-4 digits

In PyJData, we provide `jdata.jdlink()` function to dynamically download and locally cache
externally linked data files. `jdata.jdlink()` only parse files with JSON/binary JSON suffixes that
`load` supports. Here is a example

```
import jdata as jd

data = jd.load('https://neurojson.io:7777/openneuro/ds000001');
extlinks = jd.jsonpath(data, '$..anat.._DataLink_')  # deep-scan of all anatomical folders and find all linked NIfTI files
jd.jdlink(extlinks, {'regex': 'sub-0[12]_.*nii'})  # download only the nii files for sub-01 and sub-02
jd.jdlink(extlinks)                                # download all links
```

## Utility

One can convert from JSON based data files (`.json, .jdt, .jnii, .jmsh, .jnirs`) to binary-JData
based binary files (`.bjd, .jdb, .bnii, .bmsh, .bnirs`) and vice versa using command
```
python3 -m jdata /path/to/file.json           # convert to /path/to/text/json/file.jdb
python3 -m jdata /path/to/file.jdb            # convert to /path/to/text/json/file.json
python3 -m jdata /path/to/file.jdb -t 2       # convert to /path/to/text/json/file.json with indentation of 2 spaces
python3 -m jdata file1 file2 ...              # batch convert multiple files
python3 -m jdata file1 -f                     # force overwrite output files if exist (`-f`/`--force`)
python3 -m jdata file1 -O /output/dir         # save output files to /output/dir (`-O`/`--outdir`)
python3 -m jdata file1.json -s .bnii          # force output suffix/file type (`-s`/`--suffix`)
python3 -m jdata file1.json -c zlib           # set compression method (`-c`/`--compression`)
python3 -m jdata -h                           # show help info (`-h`/`--help`)
```

## How to contribute

`jdata` uses an open-source license - the Apache 2.0 license. This license is a "permissive" license
and can be used in commercial products without needing to release the source code.

To contribute `jdata` source code, you can modify the Python units inside the `jdata/` folder. Please
minimize the dependencies to external 3rd party packages. Please use Python's built-in packages whenever
pissible.

All jdata source codes have been formatted using `black`. To reformat all units, please type
```
make pretty
```
inside the top-folder of the source repository

For every newly added function, please add a unittest unit or test inside the files under `test/`, and run
```
make test
```
to make sure the modified code can pass all tests.

To build a local installer, please install the `build` python module, and run
```
make build
```
The output wheel can be found inside the `dist/` folder.

## Test

To see additional data type support, please run the built-in test using below command

```
python3 -m unittest discover -v test
```
or one can run individual set of unittests by calling
```
python3 -m unittest -v test.testnifti
python3 -m unittest -v test.testsnirf
```

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/NeuroJSON/pyjdata",
    "name": "jdata",
    "maintainer": "Qianqian Fang",
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "JSON, JData, UBJSON, BJData, OpenJData, NeuroJSON, JNIfTI, JMesh, Encoder, Decoder",
    "author": "Qianqian Fang",
    "author_email": "fangqq@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/43/c7/85d8b22c40f88e2362b863e8e11230e7272d8f1573941effe162f8a0dfb8/jdata-0.8.0.tar.gz",
    "platform": "any",
    "description": "![](https://neurojson.org/wiki/upload/neurojson_banner_long.png)\n\n# JData - NeuroJSON client with fast parsers for JSON, binary JSON, NIFTI, SNIRF, CSV/TSV, HDF5 data files\n\n- Copyright: (C) Qianqian Fang (2019-2025) <q.fang at neu.edu>\n- License: Apache License, Version 2.0\n- Version: 0.8.0\n- URL: https://github.com/NeuroJSON/pyjdata\n- Acknowledgement: This project is supported by US National Institute of Health (NIH)\n  grant [U24-NS124027](https://reporter.nih.gov/project-details/10308329)\n\n![Build Status](https://github.com/NeuroJSON/pyjdata/actions/workflows/run_test.yml/badge.svg)\n\n## Table of Contents\n\n- [Introduction](#introduction)\n- [File formats](#file-formats)\n- [Submodules](#submodules)\n- [How to install](#how-to-install)\n- [How to build](#how-to-build)\n- [How to use](#how-to-use)\n- [Advanced interfaces](#advanced-interfaces)\n- [Reading JSON via REST-API](#reading-json-via-rest-api)\n- [Using JSONPath to access and query complex datasets](#using-jsonpath-to-access-and-query-complex-datasets)\n- [Downloading and caching `_DataLink_` referenced external data files](#downloading-and-caching-_datalink_-referenced-external-data-files)\n- [Utility](#utility)\n- [How to contribute](#how-to-contribute)\n- [Test](#test)\n\n## Introduction\n\n`jdata` is a lightweight and fast neuroimaging data file parser, with built\nin support for NIfTI-1/2 (`.nii`, `.nii.gz`), two-part Analyze 7.5 (`.img/.hdr`, `.img.gz`),\nHDF5 (`.h5`), SNIRF (`.snirf`), MATLAB .mat files (`.mat`), CSV/TSV (`.csv`, `.csv.gz`,\n`.tsv`, `.tsv.gz`), JSON (`.json`), and various binary-JSON data formats, including\nBJData (`.bjd`), UBJSON (`.ubj`), and MessagePack (`.msgpack`) formats. `jdata` can\nload data files both from local storage and REST-API via URLs. To maximize portability,\nthe outputs of `jdata` data parsers are intentionally based upon only the **native Python**\ndata structures (`dict/list/tuple`) plus `numpy` arrays. The entire package is less than\n60KB in size and is platform-independent.\n\n`jdata` highly compatible to the [JSONLab toolbox](https://github.com/NeuroJSON/jsonlab)\nfor MATLAB/Octave, serving as the reference library for Python for the\n[JData Specification](https://github.com/NeuroJSON/jdata/),\nThe JData Specification defines a lightweight\nlanguage-independent data annotation interface enabling easy storing\nand sharing of complex data structures across different programming\nlanguages such as MATLAB, JavaScript, Python etc. Using JData formats, a \ncomplex Python data structure, including numpy objects, can be encoded\nas a simple `dict` object that is easily serialized as a JSON/binary JSON\nfile and share such data between programs of different languages.\n\nSince 2021, the development of the `jdata` module and the underlying data format specificaitons\n[JData](https://neurojson.org/jdata/draft3) and [BJData](https://neurojson.org/bjdata/draft3)\nhave been funded by the US National Institute of Health (NIH) as\npart of the NeuroJSON project (https://neurojson.org and https://neurojson.io).\n\nThe goal of the NeuroJSON project is to develop scalable, searchable, and\nreusable neuroimaging data formats and data sharing platforms. All data\nproduced from the NeuroJSON project will be using JSON/Binary JData formats as the\nunderlying serialization standards and the lightweight JData specification as\nlanguage-independent data annotation standard.\n\n## File formats\n\nThe supported data formats can be found in the below table. All file types\nsupport reading and writing, except those specified below.\n\n| Format | Name |       |  Format                           | Name   |\n| ------ | ------ | --- |-----------------------------------| ------ |\n| **JSON-compatible files**  | |  | **Binary JSON (same format)** **[1]** | |\n| \u2705 `.json` | \u2705 JSON files |                        | \u2705 `.bjd`    | \u2705 binary JSON (BJD) files |\n| \u2705 `.jnii` | \u2705 JSON-wrapper for NIfTI data (JNIfTI)|       | \u2705 `.bnii`   | \u2705 BJD-wrapper for NIfTI data |\n| \u2705 `.jnirs` | \u2705 JSON-wrapper for SNIRF data (JSNIRF)|      | \u2705 `.bnirs`  | \u2705 BJD-wrapper for SNIRF data |\n| \u2705 `.jmsh` | \u2705 JSON-encoded mesh data (JMesh)  |   | \u2705 `.bmsh`   | \u2705 BJD-encoded for mesh data  |\n| \u2705 `.jdt` | \u2705 JSON files with JData annotations |  | \u2705 `.jdb`    | \u2705 BJD files with JData annotations |\n| \u2705 `.jdat` | \u2705 JSON files with JData annotations | | \u2705 `.jbat`   | \u2705 BJD files with JData annotations |\n| \u2705 `.jbids` | \u2705 JSON digest of a BIDS dataset |    | \u2705 `.pmat`   | \u2705 BJD encoded .mat files |\n| **NIfTI formats**                      | |           | **CSV/TSV formats** | |\n| \u2705 `.nii` | \u2705 uncompressed NIfTI-1/2 files |       | \u2705 `.csv`    | \u2705 CSV files |\n| \u2705 `.nii.gz` | \u2705 compressed NIfTI files |          | \u2705 `.csv.gz` | \u2705 compressed CSV files |\n| \u2705 `.img/.hdr` | \u2705 Analyze 7.5 two-part files |    | \u2705 `.tsv`    | \u2705 TSV files |\n| \u2705 `.img.gz` | \u2705 compressed Analyze files |        | \u2705 `.tsv.gz` | \u2705 compressed TSV files |\n| **HDF5 formats** **[2]**              | |           | **Other formats (read-only)** | |\n| \u2705 `.h5` | \u2705 HDF5 files |                          | \u2705 `.mat`    | \u2705 MATLAB .mat files **[3]** |\n| \u2705 `.hdf5` | \u2705 HDF5 files |                        | \u2705 `.bval`   | \u2705 EEG .bval files |\n| \u2705 `.snirf` | \u2705 HDF5-based SNIRF data |            | \u2705 `.bvec`   | \u2705 EEG .bvec files |\n| \u2705 `.nwb` | \u2705 HDF5-based NWB files |               | \u2705 `.msgpack`| \u2705 Binary JSON MessagePack format **[4]** |\n\n- [1] requires `bjdata` Python module when needed, `pip install bjdata`\n- [2] requires `h5py` Python module when needed, `pip install h5py`\n- [3] requires `scipy` Python module when needed, `pip install scipy`\n- [4] requires `msgpack` Python module when needed, `pip install msgpack`\n\n## Submodules\n\nThe `jdata` module further partition the functions into smaller submodules, including\n- **jdata.jfile** provides `loadjd`, `savejd`, `load`, `save`, `loadt`, `savet`, `loadb`, `saveb`, `loadts`, `loadbs`, `jsoncache`, `jdlink`, ...\n- **jdata.jdata** provides `encode`, `decode`, `jdataencode`, `jdatadecode`, `{zlib,gzip,lzma,lz4,base64}encode`, `{zlib,gzip,lzma,lz4,base64}decode`\n- **jdata.jpath** provides `jsonpath`\n- **jdata.jnifti** provides `load{jnifti,nifti}`, `save{jnifti,nifti,jnii,bnii}`, `nii2jnii`, `jnii2nii`, `nifticreate`, `jnifticreate`, `niiformat`, `niicodemap`\n- **jdata.neurojson** provides `neuroj`, `neurojgui`\n- **jdata.h5** provides `loadh5`, `saveh5`, `regrouph5`, `aos2soa`, `soa2aos`, `jsnirfcreate`, `snirfcreate`, `snirfdecode`\n\nAll these functions can be found in the MATLAB/GNU Octave equivalent, JSONLab toolbox. Each function can be individually imported\n```\n# individually imported\nfrom jdata.jfile import loadjd\ndata=loadjd(...)\n\n# import everything\nfrom jdata import *\ndata=loadjd(...)\n\n# import under jdata namespace\nimport jdata as jd\ndata=jd.loadjd(...)\n```\n\n## How to install\n\n* Github: download from https://github.com/NeuroJSON/pyjdata\n* PIP: run `pip install jdata` see https://pypi.org/project/jdata/\n\nThis package can also be installed on Ubuntu 21.04 or Debian Bullseye via\n```\nsudo apt-get install python3-jdata\n```\n\nOn older Ubuntu or Debian releases, you may install jdata via the below PPA:\n```\nsudo add-apt-repository ppa:fangq/ppa\nsudo apt-get update\nsudo apt-get install python3-jdata\n```\n\nDependencies:\n* **numpy**: PIP: run `pip install numpy` or `sudo apt-get install python3-numpy`\n* (optional) **bjdata**: PIP: run `pip install bjdata` or `sudo apt-get install python3-bjdata`, see https://pypi.org/project/bjdata/, only needed to read/write BJData/UBJSON files\n* (optional) **lz4**: PIP: run `pip install lz4`, only needed when encoding/decoding lz4-compressed data\n* (optional) **h5py**: PIP: run `pip install h5py`, only needed when reading/writing .h5 and .snirf files\n* (optional) **scipy**: PIP: run `pip install scipy`, only needed when loading MATLAB .mat files\n* (optional) **msgpack**: PIP: run `pip install msgpack`, only needed when loading MessagePack .msgpack files\n* (optional) **blosc2**: PIP: run `pip install blosc2`, only needed when encoding/decoding blosc2-compressed data\n* (optional) **backports.lzma**: PIP: run `sudo apt-get install liblzma-dev` and `pip install backports.lzma` (needed for Python 2.7), only needed when encoding/decoding lzma-compressed data\n* (optional) **python3-tk**: run `sudo apt-get install python3-tk` to install the Tk support on a Linux in order to run `neurojgui` function\n\nReplacing `pip` by `pip3` if you are using Python 3.x. If either `pip` or `pip3` \ndoes not exist on your system, please run\n```\nsudo apt-get install python3-pip\n```\nPlease note that in some OS releases (such as Ubuntu 20.04), python2.x and python-pip \nare no longer supported.\n\n## How to build\n\nOne can also install this module from the source code. To do this, you first\ncheck out a copy of the latest code from Github by\n```\ngit clone https://github.com/NeuroJSON/pyjdata.git\ncd pyjdata\n```\nthen install the module to your local user folder by\n```\npython3 setup.py install --user\n```\nor, if you prefer, install to the system folder for all users by\n```\nsudo python3 setup.py install\n```\n\nInstead of installing the module, you can also import the jdata module directly from \nyour local copy by cd the root folder of the unzipped pyjdata package, and run\n```\nimport jdata as jd\n```\n\n\n## How to use\n\nThe `jdata` module provides a unified data parsing and saving interface: `jd.loadjd()` and `jd.savejd()`.\nThese two functions supports all file format described in the above \"File formats\" section.\nThe `jd.loadjd()` function also supports loading online data via URLs.\n\n```\nimport jdata as jd\nnii = jd.loadjd('/path/to/img.nii.gz')\nsnirf = jd.loadjd('/path/to/mydata.snirf')\nnii2 = jd.loadjd('https://example.com/data/vol.nii.gz')\njsondata = jd.loadjd('https://example.com/rest/api/')\nmatlabdata = jd.loadjd('matlabdata.mat')\njd.savejd(matlabdata, 'newdata.mat')\njd.savejd(matlabdata, 'newdata.jdb', compression='zlib')\n\njd.savejd(nii2, 'newdata.jnii', compression='lzma')\njd.savejd(nii, 'newdata.bnii', compression='gzip')\njd.savejd(nii, 'newdata.nii.gz')\n```\n\nThe `jdata` module also serves as the front-end for the free data resources hosted at\nNeuroJSON.io. The NeuroJSON client (`neuroj()`) can be started in the GUI mode using\n\n```\nimport jdata as jd\njd.neuroj('gui')\n```\n\nthe above command will pop up a window displaying the databases, datasets and data\nrecords for the over 1500 datasets currently hosted on NeuroJSON.io.\n\nThe `neuroj` client also supports command-line mode, using the below format\n\n```\nimport jdata as jd\nhelp(jd.neuroj)                            # print help info for jd.neuroj()\njd.neuroj('list')                          # list all databases on NeuroJSON.io\n[db['id'] for db in jd.neuroj('list')['database']]  # list all database IDs\njd.neuroj('list', 'openneuro')             # list all datasets under the `openneuro` database\njd.neuroj('list', 'openneuro', limit=5, skip=5)  # list the 6th to 10th datasets under the `openneuro` database\njd.neuroj('list', 'openneuro', 'ds000001') # list all versions for the `openneuro/ds00001` dataset\njd.neuroj('get', 'openneuro', 'ds000001')  # download and parse the `openneuro/ds00001` dataset as a Python object\njd.neuroj('info', 'openneuro', 'ds000001') # lightweight header information of the `openneuro/ds00001` dataset\njd.neuroj('find', '/abide/')               # find both abide-1 and abide-2 databases using filters\njd.neuroj('find', 'openneuro', '/00[234]$/') # use regular experssion to filter all openneuro datasets\njd.neuroj('find', 'mcx', {'selector': ..., 'find': ...}) # use CouchDB _find API to search data\njd.neuroj('find', 'mcx', {'selector': ..., 'find': ...}) # use CouchDB _find API to search data\njd.neuroj('info', db='mcx', ds='colin27')  # use named inputs\njd.neuroj('get', db='mcx', ds='colin27', file='att1')  # download the attachment `att1` for the `mcx/colin27` dataset\njd.neuroj('put', 'sandbox1d', 'test', '{\"obj\":1}')  # update `sandbox1d/test` dataset with a new JSON string (need admin account)\njd.neuroj('delete', 'sandbox1d', 'test')   # delete `sandbox1d/test` dataset (need admin account)\n```\n\n\n## Advanced interfaces\n\nThe `jdata` module is easy to use. You can use the `encode()/decode()` functions to\nencode Python data into JData annotation format, or decode JData structures into\nnative Python data, for example\n\n```\nimport jdata as jd\nimport numpy as np\n\na={'str':'test','num':1.2,'list':[1.1,[2.1]],'nan':float('nan'),'np':np.arange(1,5,dtype=np.uint8)}\njd.encode(a)\njd.decode(jd.encode(a))\nd1=jd.encode(a, compression='zlib',base64=True})\nd1\njd.decode(d1,base64=True)\n```\n\nOne can further save the JData annotated data into JSON or binary JSON (UBJSON) files using\nthe `jdata.save` function, or loading JData-formatted data to Python using `jdata.load`\n\n```\nimport jdata as jd\nimport numpy as np\n\na={'str':'test','num':1.2,'list':[1.1,[2.1]],'nan':float('nan'),'np':np.arange(1,5,dtype=np.uint8)}\njd.save(a,'test.json')\nnewdata=jd.load('test.json')\nnewdata\n```\n\nOne can use `loadt` or `savet` to read/write JSON-based data files and `loadb` and `saveb` to\nread/write binary-JSON based data files. By default, JData annotations are automatically decoded\nafter loading and encoded before saving. One can set `{'encode': False}` in the save functions\nor `{'decode': False}` in the load functions as the `opt` to disable further processing of JData\nannotations. We also provide `loadts` and `loadbs` for parsing a string-buffer made of text-based\nJSON or binary JSON stream.\n\nPyJData supports multiple N-D array data compression/decompression methods (i.e. codecs), similar\nto HDF5 filters. Currently supported codecs include `zlib`, `gzip`, `lz4`, `lzma`, `base64` and various\n`blosc2` compression methods, including `blosc2blosclz`, `blosc2lz4`, `blosc2lz4hc`, `blosc2zlib`,\n`blosc2zstd`. To apply a selected compression method, one simply set `{'compression':'method'}` as\nthe option to `jdata.encode` or `jdata.save` function; `jdata.load` or `jdata.decode` automatically\ndecompress the data based on the `_ArrayZipType_` annotation present in the data. Only `blosc2`\ncompression methods support multi-threading. To set the thread number, one should define an `nthread`\nvalue in the option (`opt`) for both encoding and decoding.\n\n## Reading JSON via REST-API\n\nIf a REST-API (URL) is given as the first input of `load`, it reads the JSON data directly\nfrom the URL and parse the content to native Python data structures. To avoid repetitive download,\n`load` automatically cache the downloaded file so that future calls directly load the\nlocally cached file. If one prefers to always load from the URL without local cache, one should\nuse `loadurl()` instead. Here is an example\n\n```\nimport jdata as jd\ndata = jd.load('https://neurojson.io:7777/openneuro/ds000001');\ndata.keys()\n```\n\n## Using JSONPath to access and query complex datasets\n\nStarting from v0.6.0, PyJData provides a lightweight implementation [JSONPath](https://goessner.net/articles/JsonPath/),\na widely used format for query and access a hierarchical dict/list structure, such as those\nparsed by `load` or `loadurl`. Here is an example\n\n```\nimport jdata as jd\n\ndata = jd.loadurl('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json');\njd.jsonpath(data, '$.age')\njd.jsonpath(data, '$.address.city')\njd.jsonpath(data, '$.phoneNumber')\njd.jsonpath(data, '$.phoneNumber[0]')\njd.jsonpath(data, '$.phoneNumber[0].type')\njd.jsonpath(data, '$.phoneNumber[-1]')\njd.jsonpath(data, '$.phoneNumber..number')\njd.jsonpath(data, '$[phoneNumber][type]')\njd.jsonpath(data, '$[phoneNumber][type][1]')\n```\n\nThe `jd.jsonpath` function does not support all JSONPath features. If more complex JSONPath\nqueries are needed, one should install `jsonpath_ng` or other more advanced JSONPath support.\nHere is an example using `jsonpath_ng`\n\n```\nimport jdata as jd\nfrom jsonpath_ng.ext import parse\n\ndata = jd.loadurl('https://raw.githubusercontent.com/fangq/jsonlab/master/examples/example1.json');\n\nval = [match.value for match in parse('$.address.city').find(data)]\nval = [match.value for match in parse('$.phoneNumber').find(data)]\n```\n\n## Downloading and caching `_DataLink_` referenced external data files\n\nSimilarly to [JSONLab](https://github.com/fangq/jsonlab?tab=readme-ov-file#jsoncachem),\nPyJData also provides similar external data file downloading/caching capability.\n\nThe `_DataLink_` annotation in the JData specification permits linking of external data files\nin a JSON file - to make downloading/parsing externally linked data files efficient, such as\nprocessing large neuroimaging datasets hosted on http://neurojson.io, we have developed a system\nto download files on-demand and cache those locally. jsoncache.m is responsible of searching\nthe local cache folders, if found the requested file, it returns the path to the local cache;\nif not found, it returns a SHA-256 hash of the URL as the file name, and the possible cache folders\n\nWhen loading a file from URL, below is the order of cache file search paths, ranking in search order\n```\n   global-variable NEUROJSON_CACHE | if defined, this path will be searched first\n   [pwd '/.neurojson']  \t   | on all OSes\n   /home/USERNAME/.neurojson\t   | on all OSes (per-user)\n   /home/USERNAME/.cache/neurojson | if on Linux (per-user)\n   /var/cache/neurojson \t   | if on Linux (system wide)\n   /home/USERNAME/Library/neurojson| if on MacOS (per-user)\n   /Library/neurojson\t\t   | if on MacOS (system wide)\n   C:\\ProgramData\\neurojson\t   | if on Windows (system wide)\n```\nWhen saving a file from a URL, under the root cache folder, subfolders can be created;\nif the URL is one of a standard NeuroJSON.io URLs as below\n```\n   https://neurojson.org/io/stat.cgi?action=get&db=DBNAME&doc=DOCNAME&file=sub-01/anat/datafile.nii.gz\n   https://neurojson.io:7777/DBNAME/DOCNAME\n   https://neurojson.io:7777/DBNAME/DOCNAME/datafile.suffix\n```\nthe file datafile.nii.gz will be downloaded to /home/USERNAME/.neurojson/io/DBNAME/DOCNAME/sub-01/anat/ folder\nif a URL does not follow the neurojson.io format, the cache folder has the below form\n```\n   CACHEFOLDER{i}/domainname.com/XX/YY/XXYYZZZZ...\n```\nwhere XXYYZZZZ.. is the SHA-256 hash of the full URL, XX is the first two digit, YY is the 3-4 digits\n\nIn PyJData, we provide `jdata.jdlink()` function to dynamically download and locally cache\nexternally linked data files. `jdata.jdlink()` only parse files with JSON/binary JSON suffixes that\n`load` supports. Here is a example\n\n```\nimport jdata as jd\n\ndata = jd.load('https://neurojson.io:7777/openneuro/ds000001');\nextlinks = jd.jsonpath(data, '$..anat.._DataLink_')  # deep-scan of all anatomical folders and find all linked NIfTI files\njd.jdlink(extlinks, {'regex': 'sub-0[12]_.*nii'})  # download only the nii files for sub-01 and sub-02\njd.jdlink(extlinks)                                # download all links\n```\n\n## Utility\n\nOne can convert from JSON based data files (`.json, .jdt, .jnii, .jmsh, .jnirs`) to binary-JData\nbased binary files (`.bjd, .jdb, .bnii, .bmsh, .bnirs`) and vice versa using command\n```\npython3 -m jdata /path/to/file.json           # convert to /path/to/text/json/file.jdb\npython3 -m jdata /path/to/file.jdb            # convert to /path/to/text/json/file.json\npython3 -m jdata /path/to/file.jdb -t 2       # convert to /path/to/text/json/file.json with indentation of 2 spaces\npython3 -m jdata file1 file2 ...              # batch convert multiple files\npython3 -m jdata file1 -f                     # force overwrite output files if exist (`-f`/`--force`)\npython3 -m jdata file1 -O /output/dir         # save output files to /output/dir (`-O`/`--outdir`)\npython3 -m jdata file1.json -s .bnii          # force output suffix/file type (`-s`/`--suffix`)\npython3 -m jdata file1.json -c zlib           # set compression method (`-c`/`--compression`)\npython3 -m jdata -h                           # show help info (`-h`/`--help`)\n```\n\n## How to contribute\n\n`jdata` uses an open-source license - the Apache 2.0 license. This license is a \"permissive\" license\nand can be used in commercial products without needing to release the source code.\n\nTo contribute `jdata` source code, you can modify the Python units inside the `jdata/` folder. Please\nminimize the dependencies to external 3rd party packages. Please use Python's built-in packages whenever\npissible.\n\nAll jdata source codes have been formatted using `black`. To reformat all units, please type\n```\nmake pretty\n```\ninside the top-folder of the source repository\n\nFor every newly added function, please add a unittest unit or test inside the files under `test/`, and run\n```\nmake test\n```\nto make sure the modified code can pass all tests.\n\nTo build a local installer, please install the `build` python module, and run\n```\nmake build\n```\nThe output wheel can be found inside the `dist/` folder.\n\n## Test\n\nTo see additional data type support, please run the built-in test using below command\n\n```\npython3 -m unittest discover -v test\n```\nor one can run individual set of unittests by calling\n```\npython3 -m unittest -v test.testnifti\npython3 -m unittest -v test.testsnirf\n```\n",
    "bugtrack_url": null,
    "license": "Apache license 2.0",
    "summary": "JSON/binary JSON formats for exchanging Python and Numpy data",
    "version": "0.8.0",
    "project_urls": {
        "Homepage": "https://github.com/NeuroJSON/pyjdata"
    },
    "split_keywords": [
        "json",
        " jdata",
        " ubjson",
        " bjdata",
        " openjdata",
        " neurojson",
        " jnifti",
        " jmesh",
        " encoder",
        " decoder"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "855301958fef8dcf842aa9bc3d879669ed99ae0c369ab692217881beb35630df",
                "md5": "5dfb3b9c7345cf18f4a30c41edc6192c",
                "sha256": "554b2e5c73832c81ab92ab2b2bc9c9ebef7554da1a519da9ef47e850e0d3a7a6"
            },
            "downloads": -1,
            "filename": "jdata-0.8.0-py2.py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "5dfb3b9c7345cf18f4a30c41edc6192c",
            "packagetype": "bdist_wheel",
            "python_version": "py2.py3",
            "requires_python": null,
            "size": 59256,
            "upload_time": "2025-08-03T22:33:48",
            "upload_time_iso_8601": "2025-08-03T22:33:48.806815Z",
            "url": "https://files.pythonhosted.org/packages/85/53/01958fef8dcf842aa9bc3d879669ed99ae0c369ab692217881beb35630df/jdata-0.8.0-py2.py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "43c785d8b22c40f88e2362b863e8e11230e7272d8f1573941effe162f8a0dfb8",
                "md5": "151fb5b98564f250dde15c64de2153d3",
                "sha256": "1248b728a57be70eee1840e6cac42caa1b3d9fdb0b377b47d8db858048e7dc54"
            },
            "downloads": -1,
            "filename": "jdata-0.8.0.tar.gz",
            "has_sig": false,
            "md5_digest": "151fb5b98564f250dde15c64de2153d3",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 68031,
            "upload_time": "2025-08-03T22:33:50",
            "upload_time_iso_8601": "2025-08-03T22:33:50.003314Z",
            "url": "https://files.pythonhosted.org/packages/43/c7/85d8b22c40f88e2362b863e8e11230e7272d8f1573941effe162f8a0dfb8/jdata-0.8.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-03 22:33:50",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "NeuroJSON",
    "github_project": "pyjdata",
    "travis_ci": true,
    "coveralls": false,
    "github_actions": true,
    "lcname": "jdata"
}
        
Elapsed time: 2.23857s