batch-dev


Name: batch-dev
Version: 0.0.3
Home page: https://github.com/Peter-Kocsis/batch
Summary: Generic Python module for handling dictionary-based batch data
Upload time: 2024-03-07 12:00:08
Author: Peter Kocsis
Requires Python: >=3.8, <4
Keywords: data-processing, datastructure, machine-learning, deep-learning, ml, pytorch, numpy
Requirements: No requirements were recorded.
            <div align="center">

<h1 align="center">Batch</h1>


Generic Python module for handling dictionary-based batch data. 
______________________________________________________________________

[![PyPI - Python Version](https://img.shields.io/badge/python-3.8_|_3.9_|_3.10_|_3.11_|3.12-blue)](https://pypi.org/project/batch/)
[![PyPI Status](https://img.shields.io/badge/pip-v0.1-green)](https://pypi.org/project/batch/)

<!--
[![CodeFactor](https://www.codefactor.io/repository/github/Lightning-AI/lightning/badge)](https://www.codefactor.io/repository/github/Lightning-AI/lightning)
-->

</div>

# Purpose
Are you working with data of similar modalities, and often have to apply the same function to multiple elements? 
Are you using something similar to this:

```python
batch = {
    "image_a": image_a,
    "image_b": image_b,
    "image_c": image_c
}

# Move to another device
for key in batch:
    batch[key] = batch[key].to(device)

# Transform
for key in batch:
    batch[key] = batch[key] * 2 + 1
    
# Combine
for key in batch:
    batch[key] = batch[key] + batch_2[key]

# Process
for key in batch:
    batch[key] = batch[key].max()
```
If the answer is yes, **then this module is for you!**

Our ***Batch*** package is a generic wrapper for dictionary-based batch data. 
It provides a simple way to apply the same function or operator to the whole batch. 
The module is device and container independent, so you can use it with PyTorch, NumPy or any other library.

```python
batch = Batch(
    image_a=image_a, 
    image_b=image_b, 
    image_c=image_c)

# Move to another device
batch = batch.to(device)

# Transform
batch = batch * 2 + 1

# Combine
batch = batch + batch_2

# Process
batch = batch.max()
```
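To make the idea concrete, here is a minimal sketch of how such a wrapper could forward operators and method calls to every value. `MiniBatch` is a hypothetical class written only for illustration; it is not the package's implementation and covers just `+`, `*` and method forwarding on plain Python floats.

```python
import operator


class MiniBatch(dict):
    """Hypothetical stand-in for Batch: applies an operation to every value."""

    def _apply(self, op, other):
        # If the other operand is also a mapping, combine values key by key;
        # otherwise use the same operand for every value.
        if isinstance(other, dict):
            return MiniBatch({k: op(v, other[k]) for k, v in self.items()})
        return MiniBatch({k: op(v, other) for k, v in self.items()})

    def __add__(self, other):
        return self._apply(operator.add, other)

    def __mul__(self, other):
        return self._apply(operator.mul, other)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails: forward the method
        # call to every value and collect the results in a new MiniBatch.
        if name.startswith("_"):
            raise AttributeError(name)

        def forwarded(*args, **kwargs):
            return MiniBatch(
                {k: getattr(v, name)(*args, **kwargs) for k, v in self.items()}
            )

        return forwarded


batch = MiniBatch(image_a=1.5, image_b=2.0)
batch_2 = MiniBatch(image_a=10.0, image_b=20.0)

transformed = batch * 2 + 1       # applied to each value
combined = transformed + batch_2  # combined key by key
flags = batch.is_integer()        # float.is_integer() forwarded to each value
```

The same pattern extends to the other operators; the real package may handle nesting, missing keys and reverse operators differently.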

# Installation
```
pip install batch-dev
```

# Usage
The examples below demonstrate a few basic use cases with NumPy. PyTorch or other containers can be used in the same way.

## Import

```python
import numpy as np  # used by the examples below
from batch import Batch
```

## Instantiation
### Direct
```python
# Create a batch directly
batch = Batch(
    image_a=np.random.rand(256, 256, 3), 
    image_b=np.random.rand(256, 256, 3), 
    image_c=np.random.rand(256, 256, 3))
```

### From dictionary
```python
batch = {
    "image_a": np.random.rand(256, 256, 3), 
    "image_b": np.random.rand(256, 256, 3), 
    "image_c": np.random.rand(256, 256, 3)}
# Create a batch from a dictionary
batch = Batch.from_dict(batch)
```

### From tensor
```python
image = np.random.rand(256, 256, 9)

# Create a batch from a tensor by splitting it along one dimension and storing the splits
data_splits = {
    "image_a": 1, 
    "image_b": 1, 
    "image_c": 1}
dim = 2
batch = Batch.from_tensor(image, data_splits, dim=dim)
```

## Indexing
A Batch is a string-keyed dictionary whose values may themselves be mappings or iterables. 

### String index
A string index is always interpreted as a key. 

#### Single key
Querying a single key returns the associated value:
```python
image_a = batch["image_a"]
```
You can even index deeper using `.` as separator:
```python
batch_2 = Batch(input=batch)
image_a = batch_2["input.image_a"]
```
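Conceptually, a dotted key is resolved one segment at a time. The sketch below shows that lookup on plain nested dictionaries; `get_nested` is a hypothetical helper used only for illustration, and the package's own resolution logic may differ.

```python
def get_nested(data, dotted_key, sep="."):
    """Walk a nested mapping one key segment at a time."""
    current = data
    for part in dotted_key.split(sep):
        current = current[part]  # raises KeyError if a segment is missing
    return current


nested = {"input": {"image_a": "pixels of A", "image_b": "pixels of B"}}
image_a = get_nested(nested, "input.image_a")
inner = get_nested(nested, "input")
```

One consequence of this scheme is that a key containing the separator itself cannot be addressed unambiguously.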

#### Multiple keys
Querying multiple keys (tuple or list) returns a new batch with the selected keys:
```python
batch_out = batch["image_a", "image_b"]
```

#### Wildcard query
Wildcard queries are also supported and return a new batch with the matching keys:
```python
batch_out = batch.query_wildcard("image_*")
```
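As a rough illustration of what a wildcard query does, the following sketch filters a plain dictionary with the standard library's `fnmatch` module. It assumes shell-style patterns (`*`, `?`); the pattern rules of the package's `query_wildcard` are not documented here and may differ. The function name simply mirrors the method above and is not the package's code.

```python
from fnmatch import fnmatchcase


def query_wildcard(data, pattern):
    """Return a new dict holding only the entries whose key matches pattern."""
    return {key: value for key, value in data.items() if fnmatchcase(key, pattern)}


data = {"image_a": 1, "image_b": 2, "mask_a": 3}
images = query_wildcard(data, "image_*")  # keys starting with "image_"
a_entries = query_wildcard(data, "*_a")   # keys ending with "_a"
```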
### Integer index
An integer index is always interpreted as an index into the elements and returns a new batch with the indexed elements: 
```python
batch_out = batch[:,:,0]
```


## Processing a batch
### Operators
You can use the following unary, binary, reverse and in-place operators: 
```python
# Unary operators
"__not__", "__abs__", "__index__", "__inv__", "__invert__", "__neg__", "__pos__",

# Binary operators
"__add__", "__and__", "__concat__", "__floordiv__", "__lshift__", "__mod__", "__mul__",
"__or__", "__pow__", "__rshift__", "__sub__", "__truediv__", "__xor__", "__eq__",

# Reverse operators
"__radd__", "__rand__", "__rmul__",
"__ror__", "__rsub__",  "__rxor__",

# In-place operators
"__iadd__", "__iand__", "__iconcat__", "__ifloordiv__", "__ilshift__", "__imod__", "__imul__",
"__ior__", "__ipow__", "__irshift__", "__isub__", "__itruediv__", "__ixor__"
```
Example:
```python
# Use operators
batch_out = batch + batch_2 * 2
```

### Member functions
You can use any member functions of the underlying container, for example:
```python
# Use member functions
batch_out = batch.mean(axis=2)
```

### Map
You can easily apply a function to the whole batch:
```python
batch = batch.map(list)  # Converts all elements to list
batch = batch.map(np.stack, axis=0)  # Concatenates all elements to a single tensor
```
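In plain-dictionary terms, mapping amounts to calling the function on each value and forwarding any extra arguments. The helper below, `map_values`, is a hypothetical equivalent written for illustration, not the package's implementation.

```python
def map_values(data, func, *args, **kwargs):
    """Apply func to every value, passing extra arguments through unchanged."""
    return {key: func(value, *args, **kwargs) for key, value in data.items()}


data = {"image_a": (3, 1, 2), "image_b": (9, 8)}
as_lists = map_values(data, list)                   # convert each tuple to a list
as_sorted = map_values(data, sorted, reverse=True)  # keyword argument is forwarded
```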

### Map keys
You can also apply a function to the keys:
```python
batch = batch.map_keys(lambda x: f"{x}_2")  # Add suffix
```
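Renaming keys can make two entries collide if the function is not one-to-one. The sketch below shows the same renaming on a plain dictionary; `map_keys` here is a hypothetical helper, and raising on collisions is a design choice of this example, not documented behaviour of the package.

```python
def map_keys(data, func):
    """Rename every key with func, refusing to silently merge two entries."""
    renamed = {}
    for key, value in data.items():
        new_key = func(key)
        if new_key in renamed:
            raise KeyError(f"duplicate key after renaming: {new_key!r}")
        renamed[new_key] = value
    return renamed


data = {"image_a": 1, "image_b": 2}
suffixed = map_keys(data, lambda key: f"{key}_2")  # add a suffix to each key
```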


# Limitations
A few limitations to consider when using this module:
* Use only string keys for the batch.
* Don't use keys starting with underscore (`_`).
* Slice indexing is not implemented yet.
* Generic iterable indexing is not implemented, only tuple and list.
* Code documentation is in progress. 
* Some features are not yet documented here; please refer to the code directly. 

If you have any ideas or requests, feel free to open an issue or a pull request.



            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/Peter-Kocsis/batch",
    "name": "batch-dev",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.8, <4",
    "maintainer_email": "",
    "keywords": "data-processing,datastructure,machine-learning,deep-learning,ml,pytorch,numpy",
    "author": "Peter Kocsis",
    "author_email": "peti0510@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/f4/de/209dffda637493691850b9d1f47d7fc3766e8134c9db63bbc235fb248bec/batch-dev-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Generic python module for handling dictionary-based batch data",
    "version": "0.0.3",
    "project_urls": {
        "Bug Reports": "https://github.com/Peter-Kocsis/batch/issues",
        "Homepage": "https://github.com/Peter-Kocsis/batch",
        "Source": "https://github.com/Peter-Kocsis/batch"
    },
    "split_keywords": [
        "data-processing",
        "datastructure",
        "machine-learning",
        "deep-learning",
        "ml",
        "pytorch",
        "numpy"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "0a56dd85d8a00ca89108af35ffbceec75a027c88bc252aab15f03802b9c5e196",
                "md5": "893863408d780384c3eb56c6ea99bdf9",
                "sha256": "995a5c06c0f782bb26a55006c9b74a1491efb9048ceec4388d348c6410c63d9e"
            },
            "downloads": -1,
            "filename": "batch_dev-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "893863408d780384c3eb56c6ea99bdf9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8, <4",
            "size": 9302,
            "upload_time": "2024-03-07T12:00:07",
            "upload_time_iso_8601": "2024-03-07T12:00:07.536412Z",
            "url": "https://files.pythonhosted.org/packages/0a/56/dd85d8a00ca89108af35ffbceec75a027c88bc252aab15f03802b9c5e196/batch_dev-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "f4de209dffda637493691850b9d1f47d7fc3766e8134c9db63bbc235fb248bec",
                "md5": "82e195016313a1eccfaccd746b2160ff",
                "sha256": "4d96ed4c4896db173db19f27514fc4fc844abc2e4827513fc7f1354f360aa499"
            },
            "downloads": -1,
            "filename": "batch-dev-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "82e195016313a1eccfaccd746b2160ff",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8, <4",
            "size": 11886,
            "upload_time": "2024-03-07T12:00:08",
            "upload_time_iso_8601": "2024-03-07T12:00:08.943472Z",
            "url": "https://files.pythonhosted.org/packages/f4/de/209dffda637493691850b9d1f47d7fc3766e8134c9db63bbc235fb248bec/batch-dev-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-03-07 12:00:08",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Peter-Kocsis",
    "github_project": "batch",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "batch-dev"
}
        