# Introduction
MetaBatch provides convenient `TaskSet` and `TaskLoader` classes for **batch-aware online task creation for meta-learning**.
## Efficient batching
Training meta-learning models efficiently can be challenging, especially when it comes to building
random tasks of a consistent shape within one batch. Task creation can be time-consuming, and
typically all tasks in a batch must share the same number of context and target points.
This can become a bottleneck during training:
```python
# Sample code for creating a batch of tasks with the traditional approach
import random

from torch.nn import Module
from torch.utils.data import DataLoader, Dataset

class MyTaskDataset(Dataset):
    ...
    def __getitem__(self, idx):
        task = self.task_data[idx]
        return task

class Model(Module):
    ...
    def forward(self, tasks):
        ctx_batch = tasks['context']
        tgt_batch = tasks['target']
        ...

# create dataset
task_data = [{'images': [...], 'label': 'dog'},
             {'images': [...], 'label': 'cat'}, ...]
dataset = MyTaskDataset(task_data)
dataloader = DataLoader(dataset, batch_size=16, num_workers=8)

for batch in dataloader:
    ...
    # Construct a batch of random tasks in the training loop (bottleneck!)
    n_context = random.randint(1, 5)
    n_target = random.randint(1, 10)
    tasks = {'context': [], 'target': []}
    for task in batch:
        context_images = sample_n_images(task['images'], n_context)
        target_images = sample_n_images(task['images'], n_target)
        tasks['context'].append(context_images)
        tasks['target'].append(target_images)
    model(tasks)
    ...
```
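The `sample_n_images` helper above is not part of any library; a minimal sketch of what it could
look like, assuming each task stores its images as a list:

```python
import random

def sample_n_images(images, n):
    # Hypothetical helper: draw n images uniformly without replacement,
    # assuming `images` is a list and n <= len(images).
    return random.sample(images, n)
```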
### Multiprocessing
Wouldn't it be better to offload task creation to the dataloader, so that it can run in
parallel on multiple cores?
**MetaBatch** lets you do just that.
We provide a `TaskSet` wrapper, where you implement the
`__gettask__(self, index, n_context, n_target)` method instead of PyTorch's
`__getitem__(self, index)`. Our `TaskLoader` and its custom sampler take care of synchronizing
`n_context` and `n_target` across all batch elements dispatched to the workers. With
**MetaBatch**, the bottleneck in the example above disappears:
```python
# Sample code for creating a batch of tasks with MetaBatch
from metabatch import TaskSet, TaskLoader
from torch.nn import Module

class MyTaskSet(TaskSet):
    ...
    def __gettask__(self, idx, n_context, n_target):
        data = self.task_data[idx]
        context_images = sample_n_images(data['images'], n_context)
        target_images = sample_n_images(data['images'], n_target)
        return {
            'context': context_images,
            'target': target_images,
        }

class Model(Module):
    ...
    def forward(self, tasks):
        ctx_batch = tasks['context']
        tgt_batch = tasks['target']
        ...

# create dataset
task_data = [{'images': [...], 'label': 'dog'},
             {'images': [...], 'label': 'cat'}, ...]
dataset = MyTaskSet(task_data, min_pts=1, max_ctx_pts=5, max_tgt_pts=10)
dataloader = TaskLoader(dataset, batch_size=16, num_workers=8)

for batch in dataloader:
    ...
    # Simply access the batch of constructed tasks (no bottleneck!)
    model(batch)
    ...
```
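Under the hood, the synchronization can be pictured as a batch sampler that draws one
`(n_context, n_target)` pair per batch and attaches it to every index in that batch. The sketch
below is only an illustration of the idea, not MetaBatch's actual implementation; the name
`SynchronizedTaskSampler` is made up for the example:

```python
import random
from torch.utils.data import Sampler

class SynchronizedTaskSampler(Sampler):
    # Illustrative sketch: yields batches of (index, n_context, n_target)
    # tuples so that every element of a batch shares the same task shape,
    # even when items are built in parallel by multiple workers.
    def __init__(self, dataset_len, batch_size, max_ctx_pts, max_tgt_pts, min_pts=1):
        self.dataset_len = dataset_len
        self.batch_size = batch_size
        self.max_ctx_pts = max_ctx_pts
        self.max_tgt_pts = max_tgt_pts
        self.min_pts = min_pts

    def __iter__(self):
        indices = list(range(self.dataset_len))
        random.shuffle(indices)
        for start in range(0, self.dataset_len, self.batch_size):
            # One (n_context, n_target) pair per batch...
            n_ctx = random.randint(self.min_pts, self.max_ctx_pts)
            n_tgt = random.randint(self.min_pts, self.max_tgt_pts)
            # ...shared by every index in the batch.
            yield [(i, n_ctx, n_tgt) for i in indices[start:start + self.batch_size]]

    def __len__(self):
        return (self.dataset_len + self.batch_size - 1) // self.batch_size
```

A `DataLoader` given such a `batch_sampler` passes each `(index, n_context, n_target)` tuple to
the dataset, which is where a `__gettask__`-style hook can unpack it and build the task.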
## Installation & usage
Install it: `pip install metabatch`
Requirements:
- `pytorch`
Look at the example above for an idea of how to use `TaskLoader` with `TaskSet`, or go through the
examples in `examples/` (**TODO**).
## Advantages
- MetaBatch allows for efficient task creation and batching during training, resulting in more task
variations since you are no longer limited to precomputed tasks.
- Reduces boilerplate needed to precompute and load tasks.
MetaBatch is a micro-framework for meta-learning in PyTorch that provides convenient tools for
(potentially faster) meta-training. It simplifies the task creation process and allows for efficient batching,
making it a useful tool for researchers and engineers working on meta-learning projects.
## How much faster?
**TODO**: benchmark the MAML and CNP examples against a typical implementation and other repos.
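Until those benchmarks land, a rough way to measure the difference yourself is to time the data
path of both loops in isolation; a minimal sketch (the loader arguments stand in for the two
pipelines shown above):

```python
import time

def time_data_path(loader, n_batches=100):
    # Rough wall-clock timing of the data pipeline alone: iterate batches
    # without running the model, so that task construction dominates the cost.
    start = time.perf_counter()
    for i, _batch in enumerate(loader):
        if i >= n_batches:
            break
    return time.perf_counter() - start

# e.g. compare time_data_path(plain_dataloader) vs time_data_path(task_loader)
```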
## License
MetaBatch is released under the MIT License. See the LICENSE file for more information.