datastock


* Name: datastock
* Version: 0.0.42
* Home page: https://github.com/ToFuProject/datastock
* Summary: A Python library for generic class and data handling
* Upload time: 2024-11-25 16:09:34
* Maintainer: None
* Docs URL: None
* Author: Didier VEZINET
* Requires Python: >=3.6
* License: MIT
* Keywords: data, analysis, class, container, generic, interactive, plot
* Requirements: numpy, scipy, matplotlib, astropy

[![Conda](https://anaconda.org/conda-forge/datastock/badges/version.svg)](https://anaconda.org/conda-forge/datastock)
[![](https://anaconda.org/conda-forge/datastock/badges/downloads.svg)](https://anaconda.org/conda-forge/datastock)
[![](https://anaconda.org/conda-forge/datastock/badges/latest_release_date.svg)](https://anaconda.org/conda-forge/datastock)
[![](https://anaconda.org/conda-forge/datastock/badges/platforms.svg)](https://anaconda.org/conda-forge/datastock)
[![](https://anaconda.org/conda-forge/datastock/badges/license.svg)](https://github.com/conda-forge/datastock/blob/master/LICENSE.txt)
[![](https://anaconda.org/conda-forge/datastock/badges/installer/conda.svg)](https://anaconda.org/conda-forge/datastock)
[![](https://badge.fury.io/py/datastock.svg)](https://badge.fury.io/py/datastock)



datastock
=========

Provides a generic class for storing multiple heterogeneous numpy arrays with non-uniform shapes, along with built-in interactive visualization routines.
It also stores the relationships between arrays (e.g., matching dimensions).
It also provides a convenient way of storing objects of various categories that depend on the stored arrays.


The full power of datastock is unveiled when using the DataStock class and sub-classing it for your own use.

A simpler, more direct use is also possible: if you just need a ready-to-use interactive visualization tool for 1d, 2d and 3d numpy arrays, you can use a shortcut function.


Installation:
-------------

datastock is available on PyPI and anaconda.org:

```
pip install datastock
```

```
conda install -c conda-forge datastock
```

Examples:
=========
 

Straightforward array visualization:
------------------------------------

```
import numpy as np
import datastock as ds

# any 1d, 2d or 3d array
aa = np.random.random((100, 100, 100))

# plot interactive figure using shortcut to method
dax = ds.plot_as_array(aa)
```

Now do **shift + left click** on any axes; the rest of the interactive commands are automatically printed in your python console.

<p align="center">
<img align="middle" src="https://github.com/ToFuProject/datastock/blob/devel/README_figures/DirectVisualization_3d.png" width="600" alt="Direct 3d array visualization"/>
</p>


The DataStock class:
--------------------

You will want to instantiate the DataStock class (which is the core of datastock) if:
* You have many numpy arrays, not just one, especially if they do not have the same shape
* You want to define a variety of objects from these data arrays (DataStock can be seen as a class storing many sub-classes)


DataStock has 3 main dict attributes:
* `dref`: to store the size of each dimension, each under a unique key
* `ddata`: to store all numpy arrays, each under a unique key
* `dobj`: to store any number of arbitrary sub-dict, each containing a category of object

Thanks to `dref`, the class knows the relationships between all numpy arrays.
In particular, it knows which arrays share the same references / dimensions.
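
To make the role of `dref` more concrete, here is a minimal plain-Python sketch of the idea. It is not datastock's actual implementation: the dict layout and the helpers `check_consistency` and `arrays_sharing_ref` are hypothetical, and shapes stand in for real numpy arrays so that the sketch has no dependencies.

```python
# Hypothetical, simplified model of the dref / ddata idea (not datastock internals).
# dref maps a reference key to the size of one dimension.
dref = {"nx": 80, "nt0": 100, "nt1": 90}

# ddata maps a data key to its shape and to the reference of each axis.
ddata = {
    "x": {"shape": (80,), "ref": ("nx",)},
    "t0": {"shape": (100,), "ref": ("nt0",)},
    "prof0": {"shape": (100, 80), "ref": ("nt0", "nx")},
    "prof1": {"shape": (90, 80), "ref": ("nt1", "nx")},
}


def check_consistency(dref, ddata):
    """Return True if every array shape matches the sizes of its references."""
    return all(
        tuple(dref[r] for r in d["ref"]) == d["shape"]
        for d in ddata.values()
    )


def arrays_sharing_ref(ddata, ref):
    """Return the sorted keys of all arrays having at least one axis along ref."""
    return sorted(k for k, d in ddata.items() if ref in d["ref"])


consistent = check_consistency(dref, ddata)
shared_nx = arrays_sharing_ref(ddata, "nx")
print(consistent, shared_nx)
```

Because each axis is tagged with a reference key rather than only a size, two arrays can be recognized as sharing a dimension even when other dimensions happen to have the same length.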


```
import numpy as np
import datastock as ds

# -----------
# Define data
# Here: time-varying profiles representing velocity measurement across the radius of a tube
# we assume 5 measurement campaigns were conducted, each yielding a different number of measurements, all sampled on 80 radial points

nc = 5
nx = 80
lnt = [100, 90, 80, 120, 110]

x = np.linspace(1, 2, nx)
lt = [np.linspace(0, 10, nt) for nt in lnt]
lprof = [(1 + np.cos(t)[:, None]) * x[None, :] for t in lt]

# ------------------
# Populate DataStock

# instantiate
st = ds.DataStock()

# add references (i.e.: store size of each dimension under a unique key)
st.add_ref(key='nc', size=nc)
st.add_ref(key='nx', size=nx)
for ii, nt in enumerate(lnt):
    st.add_ref(key=f'nt{ii}', size=nt)

# add data depending on these references
# you can, optionally, specify units, physical dimensionality (e.g.: distance, time...), quantity (e.g.: radius, height...) and name (to your liking)

st.add_data(key='x', data=x, dimension='distance', quant='radius', units='m', ref='nx')
for ii, nt in enumerate(lnt):
    st.add_data(key=f't{ii}', data=lt[ii], dimension='time', units='s', ref=f'nt{ii}')
    st.add_data(key=f'prof{ii}', data=lprof[ii], dimension='velocity', units='m/s', ref=(f'nt{ii}', 'nx'))

# print in the console the content of st
st
```

<p align="center">
<img align="middle" src="https://github.com/ToFuProject/datastock/blob/devel/README_figures/DataStock_refdata.png" width="600" alt="DataStock content: references and data"/>
</p>

You can see that DataStock stores the relationships between each array and each reference.
Specifying the references explicitly is only necessary if there is an ambiguity (i.e., several references have the same size, like `nx` and `nt2` in our case, which are both 80).
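
The ambiguity mentioned above can be illustrated with a small plain-Python sketch. This is only an assumption about how shape-based matching could work in principle, not a description of datastock's internal logic; `candidate_refs` and `needs_explicit_ref` are hypothetical helpers.

```python
# Hypothetical helpers showing why explicit refs are sometimes required.
def candidate_refs(dref, shape):
    """For each axis of shape, list the reference keys whose size matches."""
    return [sorted(k for k, size in dref.items() if size == n) for n in shape]


def needs_explicit_ref(dref, shape):
    """True if at least one axis matches zero or several references."""
    return any(len(cand) != 1 for cand in candidate_refs(dref, shape))


# same reference sizes as in the example above
dref = {
    "nc": 5, "nx": 80,
    "nt0": 100, "nt1": 90, "nt2": 80, "nt3": 120, "nt4": 110,
}

cand_prof0 = candidate_refs(dref, (100, 80))  # shape of prof0
cand_t1 = candidate_refs(dref, (90,))         # shape of t1
print(cand_prof0, needs_explicit_ref(dref, (100, 80)))
print(cand_t1, needs_explicit_ref(dref, (90,)))
```

Here the second axis of `prof0` matches both `nt2` and `nx`, so the shape alone cannot tell which reference is meant, whereas `t1` matches only `nt1`.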


```
# plot any array interactively
dax = st.plot_as_array('x')
dax = st.plot_as_array('t0')
dax = st.plot_as_array('prof0')
dax = st.plot_as_array('prof0', keyX='t0', keyY='x', aspect='auto')
```

You can then decide to store any object category.
Let's create a 'campaign' category to store the characteristics of each measurement campaign,
and let's add a 'campaign' parameter to each data array.

```
# add arbitrary object category as sub-dict of self.dobj
for ii in range(nc):
    st.add_obj(
        which='campaign',
        key=f'c{ii}',
        start_date=f'{ii}.04.2022',
        end_date=f'{ii+5}.05.2022',
        operator='Barnaby' if ii > 2 else 'Jack Sparrow',
        comment='leak on tube' if ii == 1 else 'none',
        index=ii,
    )

# create new 'campaign' parameter for data arrays
st.add_param('campaign', which='data')

# tag each data with its campaign
for ii in range(nc):
    st.set_param(which='data', key=f't{ii}', param='campaign', value=f'c{ii}')
    st.set_param(which='data', key=f'prof{ii}', param='campaign', value=f'c{ii}')

# print in the console the content of st
st
```

<p align="center">
<img align="middle" src="https://github.com/ToFuProject/datastock/blob/devel/README_figures/DataStock_Obj.png" width="600" alt="DataStock content: references, data and objects"/>
</p>

DataStock also provides a built-in object selection method that returns all
objects matching a criterion, as a list of int indices, bool indices or keys.

```
In [9]: st.select(which='campaign', index=2, returnas=int)
Out[9]: array([2])

# list of 2 => return all matches inside the interval
In [10]: st.select(which='campaign', index=[2, 4], returnas=int)
Out[10]: array([2, 3, 4])

# tuple of 2 => return all matches outside the interval
In [11]: st.select(which='campaign', index=(2, 4), returnas=int)
Out[11]: array([0, 1])

# return as keys
In [12]: st.select(which='campaign', index=(2, 4), returnas=str)
Out[12]: array(['c0', 'c1'], dtype='<U2')

# return as bool indices
In [13]: st.select(which='campaign', index=(2, 4), returnas=bool)
Out[13]: array([ True,  True, False, False, False])

# You can combine as many constraints as needed
In [17]: st.select(which='campaign', index=[2, 4], operator='Barnaby', returnas=str)
Out[17]: array(['c3', 'c4'], dtype='<U2')

```
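
The interval convention used in the session above (a list of 2 values selects inside the interval, a tuple of 2 values selects outside it, bounds belonging to the interval) can be mimicked in plain Python. The sketch below is only an illustration of that convention: `match` and `select_sketch` are hypothetical helpers, not datastock's implementation.

```python
def match(value, criterion):
    """Hypothetical matching rule mirroring the session above."""
    if isinstance(criterion, list) and len(criterion) == 2:
        # list of 2 => inside the interval (bounds included)
        return criterion[0] <= value <= criterion[1]
    if isinstance(criterion, tuple) and len(criterion) == 2:
        # tuple of 2 => outside the interval
        return not (criterion[0] <= value <= criterion[1])
    # anything else => plain equality
    return value == criterion


def select_sketch(dobj, returnas=int, **criteria):
    """Return the objects of dobj matching all criteria, as int, bool or str."""
    keys = list(dobj)
    mask = [
        all(match(dobj[k][param], crit) for param, crit in criteria.items())
        for k in keys
    ]
    if returnas is bool:
        return mask
    if returnas is str:
        return [k for k, ok in zip(keys, mask) if ok]
    return [ii for ii, ok in enumerate(mask) if ok]


# same campaigns as above, reduced to the two parameters used for selection
campaigns = {
    f"c{ii}": {"index": ii, "operator": "Barnaby" if ii > 2 else "Jack Sparrow"}
    for ii in range(5)
}

inside = select_sketch(campaigns, index=[2, 4])
outside_keys = select_sketch(campaigns, index=(2, 4), returnas=str)
combined = select_sketch(campaigns, index=[2, 4], operator="Barnaby", returnas=str)
print(inside, outside_keys, combined)
```

Combining several criteria simply takes the logical AND of the individual matches, which is why the last call keeps only `c3` and `c4`.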

You can also decide to sub-class DataStock to implement methods and visualizations specific to your needs.


Other useful built-in methods:
------------------------------

DataStock provides built-in methods like:
* `get_nbytes()`: return a tuple (size, dsize) where:
    - size is the total size of all data stored in the instance in bytes
    - dsize is a dict with the detail (size for each item in each sub-dict of the instance)
* `save()`: will save the instance
* `ds.load()`: will load a saved instance
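
As an illustration of the `(size, dsize)` structure described for `get_nbytes()`, here is a dependency-free sketch. It is an assumption about the general shape of the result, not datastock's code: `nbytes_summary` is a hypothetical helper, and `array.array` from the standard library stands in for numpy arrays.

```python
from array import array


def nbytes_summary(dstock):
    """Hypothetical stand-in: return (size, dsize) for a dict of sub-dicts of arrays."""
    # dsize: size in bytes of each item in each sub-dict
    dsize = {
        which: {key: arr.itemsize * len(arr) for key, arr in sub.items()}
        for which, sub in dstock.items()
    }
    # size: total over all items of all sub-dicts
    size = sum(sum(sub.values()) for sub in dsize.values())
    return size, dsize


# two arrays of doubles standing in for 'x' and 't0'
dstock = {
    "ddata": {
        "x": array("d", [0.0] * 80),
        "t0": array("d", [0.0] * 100),
    },
}

size, dsize = nbytes_summary(dstock)
print(size, dsize)
```

The detailed dict makes it easy to spot which stored item dominates the memory footprint, while the scalar total gives a quick overall figure.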



            
