# Lightning + Hivemind
[![lightning](https://img.shields.io/badge/-Lightning_2.0+-792ee5?logo=pytorchlightning&logoColor=white)](https://lightning.ai/)
[![PyPI Status](https://badge.fury.io/py/lightning-hivemind.svg)](https://badge.fury.io/py/lightning-hivemind)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/lightning-hivemind)](https://pypi.org/project/lightning-hivemind/)
[![PyPI Downloads](https://pepy.tech/badge/lightning-hivemind)](https://pepy.tech/project/lightning-hivemind)
[![Docs](https://github.com/Lightning-AI/lightning-Hivemind/actions/workflows/docs-deploy.yml/badge.svg?event=push)](https://lightning-ai.github.io/lightning-Hivemind/)
[![General checks](https://github.com/Lightning-AI/lightning-hivemind/actions/workflows/ci-checks.yml/badge.svg?event=push)](https://github.com/Lightning-AI/lightning-hivemind/actions/workflows/ci-checks.yml)
[![CI testing](https://github.com/Lightning-AI/lightning-hivemind/actions/workflows/ci-testing.yml/badge.svg?event=push)](https://github.com/Lightning-AI/lightning-hivemind/actions/workflows/ci-testing.yml)
[![Build Status](https://dev.azure.com/Lightning-AI/compatibility/_apis/build/status/Lightning-AI.lightning-Hivemind?branchName=main)](https://dev.azure.com/Lightning-AI/compatibility/_build/latest?definitionId=43&branchName=main)
[![pre-commit status](https://results.pre-commit.ci/badge/github/Lightning-AI/lightning-Hivemind/main.svg)](https://results.pre-commit.ci/latest/github/Lightning-AI/lightning-Hivemind/main)
Collaborative Training aims to eliminate the need for top-tier multi-GPU servers by allowing you to train across unreliable machines,
such as local machines or even preemptible cloud compute across the internet.
Under the hood, we use [Hivemind](https://github.com/learning-at-home/hivemind), which provides decentralized training across the internet.
To use Collaborative Training, you first need to install this extension:
```bash
pip install -U lightning-Hivemind
```
The `HivemindStrategy` accumulates gradients from all collaborating processes until they reach a `target_batch_size`. By default, the size
of the first batch is used to determine how much each local batch contributes towards the `target_batch_size`. Once the `target_batch_size` is reached, an optimizer step
is made on all processes.
When using `HivemindStrategy`, note that you cannot use gradient accumulation (`accumulate_grad_batches`), because Hivemind manages accumulation internally.
```py
from lightning import Trainer
from lightning_hivemind.strategy import HivemindStrategy
trainer = Trainer(strategy=HivemindStrategy(target_batch_size=8192), accelerator="gpu", devices=1)
```
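Putting this together, a complete `train.py` might look like the following minimal sketch. The toy model and random dataset are illustrative placeholders, not part of this package:
```py
# train.py -- minimal sketch; the model and data below are placeholders for your own.
import torch
from torch.utils.data import DataLoader, TensorDataset

from lightning import LightningModule, Trainer
from lightning_hivemind.strategy import HivemindStrategy


class LitModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, 2)

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = torch.nn.functional.cross_entropy(self.layer(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)


if __name__ == "__main__":
    # Random toy data, just to make the sketch runnable.
    dataset = TensorDataset(torch.randn(1024, 32), torch.randint(0, 2, (1024,)))
    trainer = Trainer(
        strategy=HivemindStrategy(target_batch_size=8192),
        accelerator="gpu",  # matches the snippet above; requires a CUDA GPU
        devices=1,
        max_epochs=1,
    )
    trainer.fit(LitModel(), DataLoader(dataset, batch_size=32))
```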
Then start training with:
```bash
python train.py
# Other machines can connect by running the same command:
# INITIAL_PEERS=... python train.py
# or by passing the peers to the strategy:
# HivemindStrategy(initial_peers=...)
```
A helper message is printed once your training begins, which shows you how to start training on other machines using the same code.
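To join an existing run without editing the script each time, one option is to read the peers from an environment variable. This is only a sketch; the `INITIAL_PEERS` variable name mirrors the command above, and the comma-separated format is an assumption:
```py
# Sketch: pass the multiaddresses printed by the first peer via the INITIAL_PEERS env var.
import os

from lightning import Trainer
from lightning_hivemind.strategy import HivemindStrategy

# e.g. INITIAL_PEERS="/ip4/.../tcp/.../p2p/...,/ip4/..." (copied from the helper message)
peers = os.environ.get("INITIAL_PEERS")
strategy = HivemindStrategy(
    target_batch_size=8192,
    # If no peers are given, this process starts a new collaboration instead of joining one.
    initial_peers=peers.split(",") if peers else None,
)
trainer = Trainer(strategy=strategy, accelerator="gpu", devices=1)
```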