# \[k\]array: labeled multi-dimensional arrays
[![karray Status Badge](https://img.shields.io/pypi/v/karray.svg)](https://pypi.org/project/karray/)
[![karray Python Versions](https://img.shields.io/pypi/pyversions/karray.svg)](https://pypi.org/project/karray/)
[![karray license](https://img.shields.io/pypi/l/karray.svg)](https://pypi.org/project/karray/)
[![Downloads](https://static.pepy.tech/badge/karray)](https://pepy.tech/project/karray)
[![Pipeline](https://gitlab.com/diw-evu/karray/badges/main/pipeline.svg)](https://gitlab.com/diw-evu/karray/-/commits/main)
karray is a lightweight tool that hides much of the complexity of working with labelled multi-dimensional arrays. NumPy is at its core, providing an extensive collection of high-level mathematical functions that operate efficiently on multi-dimensional arrays thanks to its well-optimized C code. karray aims to keep its own objects lightweight, to reduce overhead and to avoid the large loops that cause bottlenecks and hurt performance. NumPy is the only core dependency, while Polars, Pandas, sparse and Pyarrow are needed to import, export and store the arrays. `karray` is developed by the research group `Transformation of the Energy Economy` at [DIW Berlin](https://www.diw.de/en/diw_01.c.604205.en/energy__transportation__environment_department.html) (German Institute of Economic Research).
**Links**
* Documentation: https://diw-evu.gitlab.io/karray
* Source code: https://gitlab.com/diw-evu/karray
* PyPI releases: https://pypi.org/project/karray
**Table of contents**
* [Quick installation](#quick-installation)
* [Importing karray](#importing-karray)
* [Usage Examples](#usage-examples)
    * [Creating an Array](#creating-an-array)
    * [Accessing Array Elements](#accessing-array-elements)
    * [Array Operations](#array-operations)
    * [Saving and Loading Arrays](#saving-and-loading-arrays)
    * [Interoperability with Other Libraries](#interoperability-with-other-libraries)
Getting started
===============
Quick installation
------------------
To install karray, you can use pip:
`pip install karray`
Importing karray
----------------
To start using karray, import the necessary classes and functions:
```python
import karray as ka
# then you can use ka.Array, ka.Long, and ka.settings
```
The `Array` class represents a labeled multidimensional array, while the `Long` class represents a labeled one-dimensional array. The `settings` object allows you to configure various options for karray.
Usage Examples
--------------
### Creating an Array
You can create an `Array` object in several ways:
1. From a `Long` object (no coordinates are passed here, so they are derived from the index):
```python
import pandas as pd
index = {'dim1': ['a', 'b'],
'dim2': [1, 2],
'dim3': pd.to_datetime(['2020-01-01', '2020-01-02'], utc=True)}
value = [10., 20.]
long = ka.Long(index=index, value=value)
arr1 = ka.Array(data=long)
arr1
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 64 bytes |
| Data object type | dense |
| Data object size | 64 bytes |
| Dimensions | \['dim1', 'dim2', 'dim3'\] |
| Shape | \[2, 2, 2\] |
| Capacity | 8 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
| **dim3** | 2 | datetime64\[ns\] | \['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\] |
Data
| | dim1 | dim2 | dim3 | value |
| --- | --- | --- | --- | --- |
| **0** | a | 1 | 2020-01-01T00:00:00.000000000 | 10.00 |
| **1** | b | 2 | 2020-01-02T00:00:00.000000000 | 20.00 |
2. From a tuple of index and value, and coordinates:
```python
index2 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}
value2 = [10, 20]
coords2 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}
arr2 = ka.Array(data=(index2, value2), coords=coords2)
arr2
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10 |
| **1** | b | 2 | 20 |
3. From a dense NumPy array and coordinates:
```python
import numpy as np
dense = np.array([[10, 20], [30, 40]])
coords3 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}
arr3 = ka.Array(data=dense, coords=coords3)
arr3
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 96 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 4 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10.00 |
| **1** | a | 2 | 20.00 |
| **2** | b | 1 | 30.00 |
| **3** | b | 2 | 40.00 |
4. From a sparse array (using the `sparse` library) and coordinates:
```python
import sparse as sp
sparse_arr = sp.COO(data=[10, 20], coords=[[0, 1], [0, 1]], shape=(2, 2))
coords4 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}
arr4 = ka.Array(data=sparse_arr, coords=coords4)
arr4
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10 |
| **1** | b | 2 | 20 |
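The constructors above all describe the same idea: rows of labels plus a value are placed into a dense grid whose axes are given by the coordinates. As a rough illustration in plain NumPy (this is not karray's internal implementation, and the helper name `long_to_dense` is hypothetical), the sketch below fills a zero-initialised dense array from long-format rows and derives figures comparable to `Capacity` and `Rows` in the summary tables, assuming `Rows` counts the non-zero entries.

```python
import numpy as np

def long_to_dense(index, value, coords):
    """Place long-format rows into a zero-filled dense array (illustrative helper)."""
    dims = list(coords)
    shape = tuple(len(coords[d]) for d in dims)
    dense = np.zeros(shape)
    # For each dimension, translate every row's label into its position on that axis.
    positions = tuple(
        np.array([list(coords[d]).index(label) for label in index[d]])
        for d in dims
    )
    dense[positions] = value
    return dense

index2 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}
value2 = [10, 20]
coords2 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}

dense2 = long_to_dense(index2, value2, coords2)
capacity = dense2.size                 # product of the shape
rows = int(np.count_nonzero(dense2))   # cells that actually hold a value
print(dense2)
print(capacity, rows)
```

With the inputs of `arr2`, this gives a 2 x 2 grid with two populated cells, which matches the `Capacity` of 4 and `Rows` of 2 reported above.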
### Accessing Array Elements
You can access elements of an `Array` object using various methods:
1. Using the `items()` method to iterate over the array elements:
```python
for item in arr3.items():
    print(item)
```
    ('dim1', array(['a', 'a', 'b', 'b'], dtype=object))
    ('dim2', array([1, 2, 1, 2]))
    ('value', array([10., 20., 30., 40.]))
2. Using the `to_pandas()` method to convert the array to a pandas DataFrame:
```python
df = arr1.to_pandas()
print(df)
```
      dim1  dim2       dim3  value
    0    a     1 2020-01-01   10.0
    1    b     2 2020-01-02   20.0
3. Using the `to_polars()` method to convert the array to a polars DataFrame:
```python
df = arr1.to_polars()
print(df)
```
    shape: (2, 4)
    ┌──────┬──────┬─────────────────────┬───────┐
    │ dim1 ┆ dim2 ┆ dim3                ┆ value │
    │ ---  ┆ ---  ┆ ---                 ┆ ---   │
    │ str  ┆ i64  ┆ datetime[ns]        ┆ f64   │
    ╞══════╪══════╪═════════════════════╪═══════╡
    │ a    ┆ 1    ┆ 2020-01-01 00:00:00 ┆ 10.0  │
    │ b    ┆ 2    ┆ 2020-01-02 00:00:00 ┆ 20.0  │
    └──────┴──────┴─────────────────────┴───────┘
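The `items()` output shown earlier is column-oriented: one `(name, array)` pair per dimension plus one for the values. If row-wise access is more convenient, the pairs can be zipped together. The sketch below uses hand-written pairs that mimic the printed output of `arr3.items()` rather than calling karray, and the variable names are illustrative only.

```python
import numpy as np

# Pairs copied from the printed output of arr3.items() (not produced by karray here).
pairs = [
    ('dim1', np.array(['a', 'a', 'b', 'b'], dtype=object)),
    ('dim2', np.array([1, 2, 1, 2])),
    ('value', np.array([10., 20., 30., 40.])),
]

columns = dict(pairs)                    # column name -> 1-D array
rows = list(zip(*columns.values()))      # one (dim1, dim2, value) tuple per row

# Build a label-based lookup from the rows.
lookup = {(d1, int(d2)): float(v) for d1, d2, v in rows}
print(lookup[('b', 1)])
```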
### Array Operations
karray provides various operations that can be performed on `Array` objects:
1. Arithmetic operations:
```python
result = arr1 + arr2
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 128 bytes |
| Data object type | dense |
| Data object size | 64 bytes |
| Dimensions | \['dim1', 'dim2', 'dim3'\] |
| Shape | \[2, 2, 2\] |
| Capacity | 8 |
| Rows | 4 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
| **dim3** | 2 | datetime64\[ns\] | \['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\] |
Data
| | dim1 | dim2 | dim3 | value |
| --- | --- | --- | --- | --- |
| **0** | a | 1 | 2020-01-01T00:00:00.000000000 | 20.00 |
| **1** | a | 1 | 2020-01-02T00:00:00.000000000 | 10.00 |
| **2** | b | 2 | 2020-01-01T00:00:00.000000000 | 20.00 |
| **3** | b | 2 | 2020-01-02T00:00:00.000000000 | 40.00 |
```python
result = arr3 * 2
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 96 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 4 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 20.00 |
| **1** | a | 2 | 40.00 |
| **2** | b | 1 | 60.00 |
| **3** | b | 2 | 80.00 |
```python
result = arr4 - 1
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 96 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 4 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 9.00 |
| **1** | a | 2 | -1.00 |
| **2** | b | 1 | -1.00 |
| **3** | b | 2 | 19.00 |
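The results above are consistent with ordinary dense broadcasting: an operand that lacks a dimension is repeated along it, and empty cells behave as zeros. The following plain NumPy sketch reproduces the numbers of `arr1 + arr2` and `arr4 - 1` under that assumption; it illustrates the arithmetic only and is not karray's implementation.

```python
import numpy as np

# Dense equivalents of arr1 (dim1, dim2, dim3) and arr2/arr4 (dim1, dim2); empty cells are 0.
dense1 = np.zeros((2, 2, 2))
dense1[0, 0, 0] = 10.0   # ('a', 1, 2020-01-01)
dense1[1, 1, 1] = 20.0   # ('b', 2, 2020-01-02)
dense2 = np.array([[10.0, 0.0], [0.0, 20.0]])

# arr2 has no dim3, so it is repeated along a new trailing axis before adding.
total = dense1 + dense2[:, :, np.newaxis]

# A scalar is applied to every cell, including the empty ones.
minus_one = dense2 - 1

print(total[0, 0], total[1, 1])
print(minus_one)
```

This also explains why `arr4 - 1` reports four rows: the two previously empty cells now hold `-1`.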
2. Comparison operations:
```python
mask = arr2 > 10
mask = arr2 == 5
```
3. Logical operations:
```python
result = arr2 & arr4
result = arr2 | arr4
result = ~arr2
```
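The source does not show the output of the comparison and logical operators, so the sketch below only illustrates the element-wise semantics of the same operators on plain NumPy arrays. How karray treats empty cells or non-boolean values in these operations is not shown above and is left open here.

```python
import numpy as np

dense2 = np.array([[10, 0], [0, 20]])   # dense equivalent of arr2, empty cells as 0

greater = dense2 > 10     # element-wise comparison, yields a boolean mask
equal = dense2 == 5

# On boolean masks, &, | and ~ act as element-wise and, or and not.
populated = dense2 != 0
both = populated & greater
either = populated | greater
negated = ~populated

print(greater)
print(negated)
```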
4. Reduction operations:
```python
result = arr1.reduce('dim1', aggfunc='sum')
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim2', 'dim3'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim2** | 2 | int64 | \[1 2\] |
| **dim3** | 2 | datetime64\[ns\] | \['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\] |
Data
| | dim2 | dim3 | value |
| --- | --- | --- | --- |
| **0** | 1 | 2020-01-01T00:00:00.000000000 | 10.00 |
| **1** | 2 | 2020-01-02T00:00:00.000000000 | 20.00 |
```python
result = arr1.reduce('dim2', aggfunc=np.mean)
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim3'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim3** | 2 | datetime64\[ns\] | \['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\] |
Data
| | dim1 | dim3 | value |
| --- | --- | --- | --- |
| **0** | a | 2020-01-01T00:00:00.000000000 | 5.00 |
| **1** | b | 2020-01-02T00:00:00.000000000 | 10.00 |
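Both reductions can be cross-checked against a dense array in which empty cells are zero. In this plain NumPy sketch (an illustration, not karray's code path), `dim1`, `dim2` and `dim3` are assumed to map to axes 0, 1 and 2.

```python
import numpy as np

# Dense equivalent of arr1 with axes (dim1, dim2, dim3); empty cells are 0.
dense1 = np.zeros((2, 2, 2))
dense1[0, 0, 0] = 10.0   # ('a', 1, 2020-01-01)
dense1[1, 1, 1] = 20.0   # ('b', 2, 2020-01-02)

summed = dense1.sum(axis=0)      # reduce dim1 -> axes (dim2, dim3)
averaged = dense1.mean(axis=1)   # reduce dim2 -> axes (dim1, dim3)

print(summed)
print(averaged)
```

The mean over `dim2` yields 5 and 10 rather than 10 and 20 because each average includes one empty (zero) cell.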
5. Shifting and rolling operations:
```python
shifted = arr3.shift(dim1=1, dim2=-1, fill_value=0.)
shifted
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 24 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 1 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | b | 1 | 20.00 |
```python
rolled = arr3.roll(dim1=2)
rolled
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 96 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 4 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10.00 |
| **1** | a | 2 | 20.00 |
| **2** | b | 1 | 30.00 |
| **3** | b | 2 | 40.00 |
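The difference between the two operations is that a shift discards values pushed past the edge and fills the vacated cells, whereas a roll wraps them around. The NumPy sketch below reproduces both results above; the `shift` helper is hypothetical and written only for this illustration.

```python
import numpy as np

def shift(arr, count, axis, fill_value=0.0):
    """Move values along `axis` by `count` positions, filling vacated cells."""
    if count == 0:
        return arr.copy()
    out = np.full_like(arr, fill_value)
    src = [slice(None)] * arr.ndim
    dst = [slice(None)] * arr.ndim
    if count > 0:
        src[axis], dst[axis] = slice(None, -count), slice(count, None)
    else:
        src[axis], dst[axis] = slice(-count, None), slice(None, count)
    out[tuple(dst)] = arr[tuple(src)]
    return out

dense3 = np.array([[10.0, 20.0], [30.0, 40.0]])   # axes (dim1, dim2)

# Equivalent of shift(dim1=1, dim2=-1, fill_value=0.)
shifted = shift(shift(dense3, 1, axis=0), -1, axis=1)

# Equivalent of roll(dim1=2): rolling by the full axis length returns the original.
rolled = np.roll(dense3, 2, axis=0)

print(shifted)
print(rolled)
```

Only one non-zero cell survives the shift, at position `('b', 1)`, which is why the shifted array reports a single row.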
6. Inserting new dimensions:
```python
# One dimension with one element
result = arr2.insert(dim3='x')
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 64 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim3', 'dim1', 'dim2'\] |
| Shape | \[1, 2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim3** | 1 | object | \['x'\] |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim3 | dim1 | dim2 | value |
| --- | --- | --- | --- | --- |
| **0** | x | a | 1 | 10 |
| **1** | x | b | 2 | 20 |
```python
# One dimension with several elements related to an existing dimension using a dict
result = arr2.insert(dim3={'dim1': {'a': -1, 'b': -2}})
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 64 bytes |
| Data object type | dense |
| Data object size | 64 bytes |
| Dimensions | \['dim3', 'dim1', 'dim2'\] |
| Shape | \[2, 2, 2\] |
| Capacity | 8 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim3** | 2 | int64 | \[-2 -1\] |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim3 | dim1 | dim2 | value |
| --- | --- | --- | --- | --- |
| **0** | -1 | a | 1 | 10 |
| **1** | -2 | b | 2 | 20 |
```python
# One dimension with several elements related to an existing dimension using two lists
result = arr2.insert(dim3={'dim1': [['a', 'b'], [-1, -2]]})
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 64 bytes |
| Data object type | dense |
| Data object size | 64 bytes |
| Dimensions | \['dim3', 'dim1', 'dim2'\] |
| Shape | \[2, 2, 2\] |
| Capacity | 8 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim3** | 2 | int64 | \[-1 -2\] |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim3 | dim1 | dim2 | value |
| --- | --- | --- | --- | --- |
| **0** | -1 | a | 1 | 10 |
| **1** | -2 | b | 2 | 20 |
7. Drop a dimension:
```python
result = arr1.drop('dim3')
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10.00 |
| **1** | b | 2 | 20.00 |
!Note
Dropping a dimension works only if the resulting array still has unique coordinates. If removing the dimension would leave duplicate coordinates, karray raises an error.
```python
# Assertion error due to duplicate coords
try:
    arr3.drop('dim2')
except AssertionError as e:
    print(e)
```
    Index items per row must be unique. By removing ['dim2'] leads the existence of repeated indexes
    e.g.:
      ('dim1',)  value
    0    ('a',)   10.0
    1    ('a',)   20.0
    Intead, you can use obj.reduce('dim2')
    With an aggfunc: sum() by default
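Whether a drop is safe can be checked in advance by looking for repeated label tuples once the dimension is removed. The standard-library sketch below does this on plain index dictionaries; the helper name `duplicated_after_drop` is hypothetical and not part of karray, and the `dim3` labels are written as strings for brevity.

```python
from collections import Counter

def duplicated_after_drop(index, dim):
    """Return the label tuples that would repeat if `dim` were removed."""
    remaining = [d for d in index if d != dim]
    rows = list(zip(*(index[d] for d in remaining)))
    counts = Counter(rows)
    return [row for row, n in counts.items() if n > 1]

# Long-format indexes corresponding to arr1 and arr3.
index1 = {'dim1': ['a', 'b'], 'dim2': [1, 2], 'dim3': ['2020-01-01', '2020-01-02']}
index3 = {'dim1': ['a', 'a', 'b', 'b'], 'dim2': [1, 2, 1, 2]}

print(duplicated_after_drop(index1, 'dim3'))   # no duplicates: drop is safe
print(duplicated_after_drop(index3, 'dim2'))   # duplicates: use reduce instead
```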
8. Expanding a dimension (Broadcasting)
```python
result = arr3.expand(dim3=['x', 'y', 'z'])
result
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 384 bytes |
| Data object type | dense |
| Data object size | 96 bytes |
| Dimensions | \['dim1', 'dim2', 'dim3'\] |
| Shape | \[2, 2, 3\] |
| Capacity | 12 |
| Rows | 12 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
| **dim3** | 3 | object | \['x' 'y' 'z'\] |
Data
| | dim1 | dim2 | dim3 | value |
| --- | --- | --- | --- | --- |
| **0** | a | 1 | x | 10.00 |
| **1** | a | 1 | y | 10.00 |
| **2** | a | 1 | z | 10.00 |
| **3** | a | 2 | x | 20.00 |
| **4** | a | 2 | y | 20.00 |
| **5** | a | 2 | z | 20.00 |
| **6** | b | 1 | x | 30.00 |
| **7** | b | 1 | y | 30.00 |
| **8** | b | 1 | z | 30.00 |
| **9** | b | 2 | x | 40.00 |
| **10** | b | 2 | y | 40.00 |
| **11** | b | 2 | z | 40.00 |
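In dense terms, expanding amounts to repeating the existing values along a new axis, once per new label. A minimal NumPy sketch of that broadcasting step, shown as an illustration rather than karray's implementation:

```python
import numpy as np

dense3 = np.array([[10.0, 20.0], [30.0, 40.0]])   # axes (dim1, dim2)
new_labels = ['x', 'y', 'z']                      # labels of the added dim3

# Add a trailing axis of length 1, then broadcast it to the number of new labels.
expanded = np.broadcast_to(
    dense3[:, :, np.newaxis], dense3.shape + (len(new_labels),)
)

print(expanded.shape)
print(expanded[0, 1])   # values for ('a', 2) across x, y, z
```

`np.broadcast_to` returns a read-only view, so call `.copy()` on the result if it needs to be modified.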
9. ufunc operations
```python
arr3.ufunc(dim='dim2', func=np.prod, keepdims=True)
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 96 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 4 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 200.00 |
| **1** | a | 2 | 200.00 |
| **2** | b | 1 | 1200.00 |
| **3** | b | 2 | 1200.00 |
!Note
The `dim` argument is passed to the NumPy function as its `axis` argument, and `keepdims` is passed through under the same name. You can supply further keyword arguments, depending on the function.
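The table above can be reproduced with NumPy directly. In this sketch `dim2` is assumed to be axis 1, and the final broadcast back to the original shape is an assumption inferred from the repeated values in the output, not a documented karray step.

```python
import numpy as np

dense3 = np.array([[10.0, 20.0], [30.0, 40.0]])   # axes (dim1, dim2)

# keepdims=True keeps the reduced axis with length 1, so the result stays 2-D.
product = np.prod(dense3, axis=1, keepdims=True)

# Repeat the per-row product for every dim2 label, as in the table above.
filled = np.broadcast_to(product, dense3.shape)

print(product)
print(filled)
```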
### Saving and Loading Arrays
karray supports saving and loading arrays using the Feather format:
1. Saving an array to a Feather file:
```python
arr1.to_feather('array.feather')
```
2. Loading an array from a Feather file:
```python
loaded_arr1 = ka.from_feather('array.feather')
loaded_arr1
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 64 bytes |
| Data object type | dense |
| Data object size | 64 bytes |
| Dimensions | \['dim1', 'dim2', 'dim3'\] |
| Shape | \[2, 2, 2\] |
| Capacity | 8 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
| **dim3** | 2 | int64 | \[1577836800000000000 1577923200000000000\] |
Data
| | dim1 | dim2 | dim3 | value |
| --- | --- | --- | --- | --- |
| **0** | a | 1 | 2020-01-01T00:00:00.000000000 | 10.00 |
| **1** | b | 2 | 2020-01-02T00:00:00.000000000 | 20.00 |
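In the output above, the `dim3` coordinates of the loaded array are listed as `int64` values. These are the original timestamps expressed as nanoseconds since the Unix epoch. The source does not show whether karray offers a built-in way to restore the datetime type, so the following is only a NumPy sketch of the conversion itself.

```python
import numpy as np

# Coordinate values as listed for dim3 after loading.
raw = np.array([1577836800000000000, 1577923200000000000], dtype='int64')

# Reinterpret the integers as nanosecond-resolution timestamps.
restored = raw.astype('datetime64[ns]')

print(restored)
```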
### Interoperability with Other Libraries
karray provides interoperability with other popular data manipulation libraries:
1. Converting an array to a pandas DataFrame and then back to an array:
```python
df = arr2.to_pandas()
new_arr = ka.from_pandas(df, coords=coords2)
new_arr
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10 |
| **1** | b | 2 | 20 |
2. Converting an array to a polars DataFrame and then back to an array:
```python
df = arr2.to_polars()
new_arr = ka.from_polars(df, coords=coords2)
new_arr
```
**\[k\]array**
| | |
| --- | --- |
| Long object size | 48 bytes |
| Data object type | dense |
| Data object size | 32 bytes |
| Dimensions | \['dim1', 'dim2'\] |
| Shape | \[2, 2\] |
| Capacity | 4 |
| Rows | 2 |
Coords
| Dimension | Length | Type | Items |
| --- | --- | --- | --- |
| **dim1** | 2 | object | \['a' 'b'\] |
| **dim2** | 2 | int64 | \[1 2\] |
Data
| | dim1 | dim2 | value |
| --- | --- | --- | --- |
| **0** | a | 1 | 10 |
| **1** | b | 2 | 20 |
karray offers many more features than those shown here. Please refer to the documentation and source code listed under Links for more details.
!Note
karray is a work in progress, and the API is subject to change. Feedback, suggestions and contributions are welcome.
© 2024 [Carlos Gaete-Morales](https://github.com/cdgaete)
Raw data
{
"_id": null,
"home_page": null,
"name": "karray",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.7",
"maintainer_email": null,
"keywords": "ndarray, labeled, multidimensional, element-wise, dense, sparse, arrays",
"author": null,
"author_email": "Carlos Gaete-Morales <cdgaete@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/d3/25/70252e7678b94dae32e2cf5755309043cc67cd24e8607326991e1025b528/karray-2024.3.7.tar.gz",
"platform": null,
"description": "\n# \\[k\\]array: labeled multi-dimensional arrays\n\n[![karray Status Badge](https://img.shields.io/pypi/v/karray.svg)](https://pypi.org/project/karray/)\n[![karray Python Versions](https://img.shields.io/pypi/pyversions/karray.svg)](https://pypi.org/project/karray/)\n[![karray license](https://img.shields.io/pypi/l/karray.svg)](https://pypi.org/project/karray/)\n[![Downloads](https://static.pepy.tech/badge/karray)](https://pepy.tech/project/karray)\n[![Pipeline](https://gitlab.com/diw-evu/karray/badges/main/pipeline.svg)](https://gitlab.com/diw-evu/karray/-/commits/main)\n\nKarray is a simple tool that intends to abstract the users from the complexity of working with labelled multi-dimensional arrays. Numpy is the tool\u2019s core, with an extensive collection of high-level mathematical functions to operate on multi-dimensional arrays efficiently thanks to its well-optimized C code. With Karray, we put effort into generating lightweight objects expecting to reduce overheads and avoid large loops that cause bottlenecks and impact performance. Numpy is the only relevant dependency, while Polars, Pandas, sparse and Pyarrow are required to import, export and store the arrays. 
`karray` is developed by the research group `Transformation of the Energy Economy` at [DIW Berlin](https://www.diw.de/en/diw_01.c.604205.en/energy__transportation__environment_department.html) (German Institute of Economic Research).\n\n**Links**\n\n* Documentation: https://diw-evu.gitlab.io/karray\n* Source code: https://gitlab.com/diw-evu/karray\n* PyPI releases: https://pypi.org/project/karray\n\n**Table of contents**\n\n* [Quick installation](#quick-installation)\n* [Importing karray](#importing-karray)\n* [Usage Examples](#usage-examples)\n \n * [Creating an Array](#creating-an-array)\n * [Accessing Array Elements](#accessing-array-elements)\n * [Array Operations](#array-operations)\n * [Saving and Loading Arrays](#saving-and-loading-arrays)\n * [Interoperability with Other Libraries](#interoperability-with-other-libraries)\n \n\nGetting started\n===============\n\nQuick installation\n------------------\n\nTo install karray, you can use pip:\n\n`pip install karray` \n\nImporting karray\n----------------\n\nTo start using karray, import the necessary classes and functions:\n\n```python\nimport karray as ka\n\n# then you can use ka.Array, ka.Long, and ka.settings\n```\n\nThe `Array` class represents a labeled multidimensional array, while the `Long` class represents a labeled one-dimensional array. The `settings` object allows you to configure various options for karray.\n\nUsage Examples\n--------------\n\n### Creating an Array\n\nYou can create an `Array` object in several ways:\n\n1. 
From a `Long` object and coordinates:\n\n```python\nimport pandas as pd\n\nindex = {'dim1': ['a', 'b'],\n 'dim2': [1, 2],\n 'dim3': pd.to_datetime(['2020-01-01', '2020-01-02'], utc=True)}\nvalue = [10., 20.]\nlong = ka.Long(index=index, value=value)\n\narr1 = ka.Array(data=long)\narr1\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 64 bytes |\n| Data object type | dense |\n| Data object size | 64 bytes |\n| Dimensions | \\['dim1', 'dim2', 'dim3'\\] |\n| Shape | \\[2, 2, 2\\] |\n| Capacity | 8 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n| **dim3** | 2 | datetime64\\[ns\\] | \\['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\\] |\n\nData\n\n| | dim1 | dim2 | dim3 | value |\n| --- | --- | --- | --- | --- |\n| **0** | a | 1 | 2020-01-01T00:00:00.000000000 | 10.00 |\n| **1** | b | 2 | 2020-01-02T00:00:00.000000000 | 20.00 |\n\n2. From a tuple of index and value, and coordinates:\n\n```python\nindex2 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}\nvalue2 = [10, 20]\ncoords2 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}\n\narr2 = ka.Array(data=(index2, value2), coords=coords2)\narr2\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10 |\n| **1** | b | 2 | 20 |\n\n3. 
From a dense NumPy array and coordinates:\n\n```python\nimport numpy as np\ndense = np.array([[10, 20], [30, 40]])\ncoords3 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}\n\narr3 = ka.Array(data=dense, coords=coords3)\narr3\n```\n\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 96 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 4 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10.00 |\n| **1** | a | 2 | 20.00 |\n| **2** | b | 1 | 30.00 |\n| **3** | b | 2 | 40.00 |\n\n4. From a sparse array (using the `sparse` library) and coordinates:\n\n```python\nimport sparse as sp\n\nsparse_arr = sp.COO(data=[10, 20], coords=[[0, 1], [0, 1]], shape=(2, 2))\ncoords4 = {'dim1': ['a', 'b'], 'dim2': [1, 2]}\n\narr4 = ka.Array(data=sparse_arr, coords=coords4)\narr4\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10 |\n| **1** | b | 2 | 20 |\n\n### Accessing Array Elements\n\nYou can access elements of an `Array` object using various methods:\n\n1. Using the `items()` method to iterate over the array elements:\n\n```python\nfor item in arr3.items():\n print(item)\n```\n\n ('dim1', array(['a', 'a', 'b', 'b'], dtype=object))\n ('dim2', array([1, 2, 1, 2]))\n ('value', array([10., 20., 30., 40.]))\n \n\n2. 
Using the `to_pandas()` method to convert the array to a pandas DataFrame:\n\n```python\ndf = arr1.to_pandas()\nprint(df)\n```\n\n dim1 dim2 dim3 value\n 0 a 1 2020-01-01 10.0\n 1 b 2 2020-01-02 20.0\n \n\n3. Using the `to_polars()` method to convert the array to a polars DataFrame:\n\n```python\ndf = arr1.to_polars()\nprint(df)\n```\n\n shape: (2, 4)\n \u250c\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u252c\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2510\n \u2502 dim1 \u2506 dim2 \u2506 dim3 \u2506 value \u2502\n \u2502 --- \u2506 --- \u2506 --- \u2506 --- \u2502\n \u2502 str \u2506 i64 \u2506 datetime[ns] \u2506 f64 \u2502\n \u255e\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u256a\u2550\u2550\u2550\u2550\u2550\u2550\u2550\u2561\n \u2502 a \u2506 1 \u2506 2020-01-01 00:00:00 \u2506 10.0 \u2502\n \u2502 b \u2506 2 \u2506 2020-01-02 00:00:00 \u2506 20.0 \u2502\n \u2514\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2534\u2500\u2500\u2500\u2500\u2500\u2500\u2500\u2518\n \n\n### Array Operations\n\nkarray provides various operations that can be performed on `Array` objects:\n\n1. 
Arithmetic operations:\n\n```python\nresult = arr1 + arr2\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 128 bytes |\n| Data object type | dense |\n| Data object size | 64 bytes |\n| Dimensions | \\['dim1', 'dim2', 'dim3'\\] |\n| Shape | \\[2, 2, 2\\] |\n| Capacity | 8 |\n| Rows | 4 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n| **dim3** | 2 | datetime64\\[ns\\] | \\['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\\] |\n\nData\n\n| | dim1 | dim2 | dim3 | value |\n| --- | --- | --- | --- | --- |\n| **0** | a | 1 | 2020-01-01T00:00:00.000000000 | 20.00 |\n| **1** | a | 1 | 2020-01-02T00:00:00.000000000 | 10.00 |\n| **2** | b | 2 | 2020-01-01T00:00:00.000000000 | 20.00 |\n| **3** | b | 2 | 2020-01-02T00:00:00.000000000 | 40.00 |\n\n```python\nresult = arr3 * 2\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 96 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 4 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 20.00 |\n| **1** | a | 2 | 40.00 |\n| **2** | b | 1 | 60.00 |\n| **3** | b | 2 | 80.00 |\n\n```python\nresult = arr4 - 1\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 96 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 4 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | 
--- | --- | --- |\n| **0** | a | 1 | 9.00 |\n| **1** | a | 2 | -1.00 |\n| **2** | b | 1 | -1.00 |\n| **3** | b | 2 | 19.00 |\n\n2. Comparison operations:\n\n```python\nmask = arr2 > 10\n\nmask = arr2 == 5\n```\n\n3. Logical operations:\n\n```python\nresult = arr2 & arr4\n\nresult = arr2 | arr4\n\nresult = ~arr2\n```\n\n4. Reduction operations:\n\n```python\nresult = arr1.reduce('dim1', aggfunc='sum')\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim2', 'dim3'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n| **dim3** | 2 | datetime64\\[ns\\] | \\['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\\] |\n\nData\n\n| | dim2 | dim3 | value |\n| --- | --- | --- | --- |\n| **0** | 1 | 2020-01-01T00:00:00.000000000 | 10.00 |\n| **1** | 2 | 2020-01-02T00:00:00.000000000 | 20.00 |\n\n```python\nresult = arr1.reduce('dim2', aggfunc=np.mean)\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim3'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim3** | 2 | datetime64\\[ns\\] | \\['2020-01-01T00:00:00.000000000' '2020-01-02T00:00:00.000000000'\\] |\n\nData\n\n| | dim1 | dim3 | value |\n| --- | --- | --- | --- |\n| **0** | a | 2020-01-01T00:00:00.000000000 | 5.00 |\n| **1** | b | 2020-01-02T00:00:00.000000000 | 10.00 |\n\n5. 
Shifting and rolling operations:\n\n```python\nshifted = arr3.shift(dim1=1, dim2=-1, fill_value=0.)\nshifted\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 24 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 1 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | b | 1 | 20.00 |\n\n```python\nrolled = arr3.roll(dim1=2)\nrolled\n```\n\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 96 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 4 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10.00 |\n| **1** | a | 2 | 20.00 |\n| **2** | b | 1 | 30.00 |\n| **3** | b | 2 | 40.00 |\n\n6. 
Inserting new dimensions:\n\n```python\n# One dimension with one element\nresult = arr2.insert(dim3='x')\nresult\n```\n\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 64 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim3', 'dim1', 'dim2'\\] |\n| Shape | \\[1, 2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim3** | 1 | object | \\['x'\\] |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim3 | dim1 | dim2 | value |\n| --- | --- | --- | --- | --- |\n| **0** | x | a | 1 | 10 |\n| **1** | x | b | 2 | 20 |\n\n```python\n# One dimension with several elements related to an existing dimension using a dict\nresult = arr2.insert(dim3={'dim1': {'a': -1, 'b': -2}})\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 64 bytes |\n| Data object type | dense |\n| Data object size | 64 bytes |\n| Dimensions | \\['dim3', 'dim1', 'dim2'\\] |\n| Shape | \\[2, 2, 2\\] |\n| Capacity | 8 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim3** | 2 | int64 | \\[-2 -1\\] |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim3 | dim1 | dim2 | value |\n| --- | --- | --- | --- | --- |\n| **0** | -1 | a | 1 | 10 |\n| **1** | -2 | b | 2 | 20 |\n\n```python\n# One dimension with several elements related to an existing dimension using two lists\nresult = arr2.insert(dim3={'dim1': [['a', 'b'], [-1, -2]]})\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 64 bytes |\n| Data object type | dense |\n| Data object size | 64 bytes |\n| Dimensions | \\['dim3', 'dim1', 'dim2'\\] |\n| Shape | \\[2, 2, 2\\] |\n| Capacity | 8 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim3** | 2 | int64 | \\[-1 -2\\] |\n| **dim1** | 2 
| object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim3 | dim1 | dim2 | value |\n| --- | --- | --- | --- | --- |\n| **0** | -1 | a | 1 | 10 |\n| **1** | -2 | b | 2 | 20 |\n\n7. Drop a dimension:\n\n```python\nresult = arr1.drop('dim3')\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10.00 |\n| **1** | b | 2 | 20.00 |\n\n!Note\n\n Dropping a dimension works only if the resulting array still has unique coordinates. If removing the dimension would leave duplicate coordinates, karray raises an error.\n\n```python\n# Assertion error due to duplicate coords\ntry:\n    arr3.drop('dim2')\nexcept AssertionError as e:\n    print(e)\n```\n\n Index items per row must be unique. By removing ['dim2'] leads the existence of repeated indexes \n e.g.:\n ('dim1',) value\n 0 ('a',) 10.0\n 1 ('a',) 20.0\n Intead, you can use obj.reduce('dim2')\n With an aggfunc: sum() by default\n \n\n8. 
Expanding a dimension (broadcasting):\n\n```python\nresult = arr3.expand(dim3=['x', 'y', 'z'])\nresult\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 384 bytes |\n| Data object type | dense |\n| Data object size | 96 bytes |\n| Dimensions | \\['dim1', 'dim2', 'dim3'\\] |\n| Shape | \\[2, 2, 3\\] |\n| Capacity | 12 |\n| Rows | 12 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n| **dim3** | 3 | object | \\['x' 'y' 'z'\\] |\n\nData\n\n| | dim1 | dim2 | dim3 | value |\n| --- | --- | --- | --- | --- |\n| **0** | a | 1 | x | 10.00 |\n| **1** | a | 1 | y | 10.00 |\n| **2** | a | 1 | z | 10.00 |\n| **3** | a | 2 | x | 20.00 |\n| **4** | a | 2 | y | 20.00 |\n| **5** | a | 2 | z | 20.00 |\n| **6** | b | 1 | x | 30.00 |\n| **7** | b | 1 | y | 30.00 |\n| **8** | b | 1 | z | 30.00 |\n| **9** | b | 2 | x | 40.00 |\n| **10** | b | 2 | y | 40.00 |\n| **11** | b | 2 | z | 40.00 |\n\n9. ufunc operations:\n\n```python\narr3.ufunc(dim='dim2', func=np.prod, keepdims=True)\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 96 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 4 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 200.00 |\n| **1** | a | 2 | 200.00 |\n| **2** | b | 1 | 1200.00 |\n| **3** | b | 2 | 1200.00 |\n\n!Note\n\n The `dim` argument is passed to the NumPy function as its `axis` argument, and the `keepdims` argument is passed through under the same name. You can pass additional arguments, depending on the ufunc.\n\n### Saving and Loading Arrays\n\nkarray supports saving and loading arrays using the Feather format:\n\n1. 
Saving an array to a Feather file:\n\n```python\narr1.to_feather('array.feather')\n```\n\n2. Loading an array from a Feather file:\n\n```python\nloaded_arr1 = ka.from_feather('array.feather')\nloaded_arr1\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 64 bytes |\n| Data object type | dense |\n| Data object size | 64 bytes |\n| Dimensions | \\['dim1', 'dim2', 'dim3'\\] |\n| Shape | \\[2, 2, 2\\] |\n| Capacity | 8 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n| **dim3** | 2 | int64 | \\[1577836800000000000 1577923200000000000\\] |\n\nData\n\n| | dim1 | dim2 | dim3 | value |\n| --- | --- | --- | --- | --- |\n| **0** | a | 1 | 2020-01-01T00:00:00.000000000 | 10.00 |\n| **1** | b | 2 | 2020-01-02T00:00:00.000000000 | 20.00 |\n\n### Interoperability with Other Libraries\n\nkarray provides interoperability with other popular data manipulation libraries:\n\n1. Converting an array to a pandas DataFrame and then back to an array:\n\n```python\ndf = arr2.to_pandas()\nnew_arr = ka.from_pandas(df, coords=coords2)\nnew_arr\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10 |\n| **1** | b | 2 | 20 |\n\n2. 
Converting an array to a polars DataFrame and then back to an array:\n\n```python\ndf = arr2.to_polars()\nnew_arr = ka.from_polars(df, coords=coords2)\nnew_arr\n```\n\n**\\[k\\]array**\n\n| | |\n| --- | --- |\n| Long object size | 48 bytes |\n| Data object type | dense |\n| Data object size | 32 bytes |\n| Dimensions | \\['dim1', 'dim2'\\] |\n| Shape | \\[2, 2\\] |\n| Capacity | 4 |\n| Rows | 2 |\n\nCoords\n\n| Dimension | Length | Type | Items |\n| --- | --- | --- | --- |\n| **dim1** | 2 | object | \\['a' 'b'\\] |\n| **dim2** | 2 | int64 | \\[1 2\\] |\n\nData\n\n| | dim1 | dim2 | value |\n| --- | --- | --- | --- |\n| **0** | a | 1 | 10 |\n| **1** | b | 2 | 20 |\n\nkarray offers many more features. Please refer to the source code for more details.\n\n!Note\n\n karray is a work in progress, and the API is subject to change. We welcome feedback, suggestions, and contributions.\n\n\n\u00a9 2024 [Carlos Gaete-Morales](https://github.com/cdgaete)\n",
"bugtrack_url": null,
"license": "MIT License",
"summary": "Lightweight labelled multidimensional arrays with NumPy arrays under the hood.",
"version": "2024.3.7",
"project_urls": {
"Documentation": "https://diw-evu.gitlab.io/karray",
"Issue-tracker": "https://gitlab.com/diw-evu/karray/issues",
"Source-code": "https://gitlab.com/diw-evu/karray"
},
"split_keywords": [
"ndarray",
" labeled",
" multidimensional",
" element-wise",
" dense",
" sparse",
" arrays"
],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "098a7a72b5dd2d9d75d0e7e5e991909fbfa73b1de1c1751a6f8674897e55509b",
"md5": "faa193ba4eca86fdadbe1511bc50b830",
"sha256": "0d915adfd1ef0c2abdfc460432fe2d778f336a1b636a2e17edd5a0687da30b51"
},
"downloads": -1,
"filename": "karray-2024.3.7-py3-none-any.whl",
"has_sig": false,
"md5_digest": "faa193ba4eca86fdadbe1511bc50b830",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.7",
"size": 35562,
"upload_time": "2024-04-17T21:29:28",
"upload_time_iso_8601": "2024-04-17T21:29:28.775867Z",
"url": "https://files.pythonhosted.org/packages/09/8a/7a72b5dd2d9d75d0e7e5e991909fbfa73b1de1c1751a6f8674897e55509b/karray-2024.3.7-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "d32570252e7678b94dae32e2cf5755309043cc67cd24e8607326991e1025b528",
"md5": "516a62184640a6d0475b9cb6c4d6e12d",
"sha256": "c811a17469e6d9109d25e072fef060fdd2eb28cd17cdb6ade3fcaed4d529ae9f"
},
"downloads": -1,
"filename": "karray-2024.3.7.tar.gz",
"has_sig": false,
"md5_digest": "516a62184640a6d0475b9cb6c4d6e12d",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.7",
"size": 40083,
"upload_time": "2024-04-17T21:29:30",
"upload_time_iso_8601": "2024-04-17T21:29:30.878917Z",
"url": "https://files.pythonhosted.org/packages/d3/25/70252e7678b94dae32e2cf5755309043cc67cd24e8607326991e1025b528/karray-2024.3.7.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-04-17 21:29:30",
"github": false,
"gitlab": true,
"bitbucket": false,
"codeberg": false,
"gitlab_user": "diw-evu",
"gitlab_project": "karray",
"lcname": "karray"
}