Name | metaflow-checkpoint |
Version | 0.1.4 |
home_page | None |
Summary | An EXPERIMENTAL checkpoint decorator for Metaflow |
upload_time | 2024-11-13 23:25:55 |
maintainer | None |
docs_url | None |
author | Valay Dave |
requires_python | None |
license | None |
keywords | |
VCS | |
bugtrack_url | |
requirements | No requirements were recorded. |
Travis-CI | No Travis. |
coveralls test coverage | No coveralls. |
# Metaflow Checkpoint
Imagine running a machine learning training job or any data processing task that takes hours or even days to complete. In such scenarios, you don't want failures or collaboration complexities to force you to start over and lose all the progress made. **This is where Metaflow's new decorators, `@checkpoint`, `@model`, and `@huggingface_hub`, come into play.** They simplify checkpointing, model management, and the loading of external models, **so that long-running jobs can resume after a failure and models and checkpoints are properly versioned in multi-user environments.**
This repository introduces three new decorators for [Metaflow](https://metaflow.org) that address these challenges:
- **`@checkpoint`**: Simplifies saving and reloading checkpoints within your Metaflow flows.
- **`@huggingface_hub`**: Enables efficient loading and caching of large models from Hugging Face Hub.
- **`@model`**: Allows for easy saving and loading of models created during your Metaflow flows.
Examples for these decorators can be found in [this repository](https://github.com/outerbounds/metaflow-checkpoint-examples/tree/master).
## Features
### `@checkpoint` Decorator
The `@checkpoint` decorator simplifies saving and reloading the state of a Metaflow `@step`. It also handles versioning in multi-user settings by isolating checkpoints per user and run. Whether the state is a checkpoint written by a machine learning model or intermediate data needed to recover from a crash, the decorator simplifies state management and failure recovery.
- **Checkpointing**: Save the state of your `@step` at designated points.
- **Seamless Recovery**: Restart your job from the last checkpoint upon retries without any manual intervention.
- **User Isolation**: Checkpoints are managed per user to prevent overwriting in collaborative environments.
- **Ease of Use**: Minimal code changes required to implement checkpointing.
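The save-and-resume pattern described above can be illustrated without Metaflow at all. The following is a minimal plain-Python sketch of the general technique, not the package's API: the function names, the `state.json` file, and the `fail_at` parameter are all hypothetical and exist only for illustration.

```python
import json
import os
import tempfile


def train_with_checkpoints(checkpoint_dir, total_epochs, fail_at=None):
    """Run a toy loop, saving state each epoch and resuming if a checkpoint exists."""
    path = os.path.join(checkpoint_dir, "state.json")

    # Resume from the last saved state if one is present; otherwise start fresh.
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    else:
        state = {"epoch": 0, "total": 0}

    for epoch in range(state["epoch"], total_epochs):
        if fail_at is not None and epoch == fail_at:
            raise RuntimeError(f"simulated crash in epoch {epoch}")
        state["total"] += epoch      # stand-in for real training work
        state["epoch"] = epoch + 1   # next epoch to run after a resume

        # Write to a temporary file, then rename it into place, so a crash
        # mid-write never leaves a half-written checkpoint behind.
        tmp_path = path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, path)

    return state


def run_demo():
    """Crash partway through, then retry and finish from the checkpoint."""
    with tempfile.TemporaryDirectory() as checkpoint_dir:
        try:
            train_with_checkpoints(checkpoint_dir, total_epochs=5, fail_at=3)
        except RuntimeError:
            pass  # a retry would normally be triggered by the scheduler
        return train_with_checkpoints(checkpoint_dir, total_epochs=5)
```

In this sketch the second call skips the three epochs that already completed and runs only the remaining two. According to the description above, the decorator handles the equivalent bookkeeping automatically on retries.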
### `@huggingface_hub` Decorator
The `@huggingface_hub` decorator lets you load large models from Hugging Face Hub and cache them so that later loads avoid repeated downloads. It also ensures that models are versioned and managed appropriately in multi-user environments.
- **Efficient Model Loading**: Load models on-the-fly from Hugging Face Hub.
- **Caching Mechanism**: Cache models locally to avoid redundant downloads.
- **Version Control**: Manages different versions of models to prevent conflicts.
- **Integration with Metaflow**: Easily incorporate models across your Metaflow flows.
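The caching and versioning ideas in this list can be sketched in plain Python. This is only an illustration of a revision-keyed cache under assumed names (`cached_fetch`, the `.complete` marker file, and the injected `fetch` callable are hypothetical); it does not show how the package or Hugging Face Hub implements caching.

```python
import hashlib
import os
import tempfile


def cached_fetch(cache_root, repo_id, revision, fetch):
    """Return (local_dir, was_cached), calling fetch only on a cache miss."""
    # Key the cache on both the repository and the revision so different
    # versions of the same model never overwrite each other.
    key = hashlib.sha256(f"{repo_id}@{revision}".encode()).hexdigest()[:16]
    target = os.path.join(cache_root, key)
    marker = os.path.join(target, ".complete")

    if os.path.exists(marker):
        return target, True  # cache hit: nothing is downloaded

    os.makedirs(target, exist_ok=True)
    fetch(repo_id, revision, target)  # hypothetical downloader supplied by the caller
    # Write the marker last so an interrupted download is retried, not reused.
    with open(marker, "w") as f:
        f.write("ok")
    return target, False


def run_demo():
    """Fetch the same revision twice, then a different revision once."""
    calls = []

    def fake_fetch(repo_id, revision, target):
        # Stand-in for a real download; records the call and writes a dummy file.
        calls.append((repo_id, revision))
        with open(os.path.join(target, "weights.bin"), "w") as f:
            f.write("placeholder")

    with tempfile.TemporaryDirectory() as root:
        _, hit1 = cached_fetch(root, "org/model", "main", fake_fetch)
        _, hit2 = cached_fetch(root, "org/model", "main", fake_fetch)
        _, hit3 = cached_fetch(root, "org/model", "v2", fake_fetch)
    return hit1, hit2, hit3, len(calls)
```

The second request for the same revision is served from the cache, while a request for a different revision triggers a separate download into its own directory.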
### `@model` Decorator
The `@model` decorator provides a simple way to save and load models and checkpoints created as part of your Metaflow flow.
- **Simplified Model Loading**: Automatically load models based on references and identifiers created by decorators such as `@model`/`@checkpoint`/`@huggingface_hub`.
- **Model Identity**: Associates a unique identity with each model so that different versions are clearly distinguished and their lineage is easy to track.
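To make the ideas of model references and lineage concrete, here is a small in-memory sketch. It is an assumption-laden illustration rather than the package's implementation: the `ModelRegistry` class, the fields of the reference dictionary, and the hashing scheme are all hypothetical.

```python
import hashlib
import json


class ModelRegistry:
    """Toy in-memory registry mapping a unique reference to saved model bytes."""

    def __init__(self):
        self._store = {}

    def save(self, payload, run_id, step, parent=None):
        # Derive the identity from content plus provenance, so the same bytes
        # saved by two different runs still receive distinct references.
        provenance = json.dumps([run_id, step, parent]).encode()
        digest = hashlib.sha256(payload + provenance).hexdigest()[:12]
        ref = {"id": digest, "run_id": run_id, "step": step, "parent": parent}
        self._store[digest] = (ref, payload)
        return ref

    def load(self, ref):
        # Loading needs only the reference, not a file path.
        return self._store[ref["id"]][1]

    def lineage(self, ref):
        # Walk parent links back to the first ancestor.
        chain = [ref["id"]]
        while ref["parent"] is not None:
            ref = self._store[ref["parent"]][0]
            chain.append(ref["id"])
        return chain
```

A caller would keep the returned reference and pass it to `load` later, for example `registry.load(ref)`, while `registry.lineage(ref)` lists the identifiers of the model and its ancestors, newest first.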
Raw data
{
"_id": null,
"home_page": null,
"name": "metaflow-checkpoint",
"maintainer": null,
"docs_url": null,
"requires_python": null,
"maintainer_email": null,
"keywords": null,
"author": "Valay Dave",
"author_email": "help@outerbounds.com",
"download_url": "https://files.pythonhosted.org/packages/92/45/949e13915f8b0480fff35206c22797f607fffb6f115aafe2b4b034c800ba/metaflow_checkpoint-0.1.4.tar.gz",
"platform": null,
"description": "# Metaflow Checkpoint\n\nImagine running a machine learning training job or any data processing task that takes hours or even days to complete. In such scenarios, you don't want failures or collaboration complexities to force you to start over and lose all the progress made. **This is where Metaflow's new decorators\u2014`@checkpoint`, `@model`, and `@huggingface_hub`\u2014come into play.** These decorators are specifically designed to address these challenges by simplifying checkpointing, model management, and efficient loading of external models, **ensuring that your long-running jobs can be resumed seamlessly after a failure and that models and checkpoints are properly versioned in multi-user environments.**\n\nThis repository introduces three new decorators for [Metaflow](https://metaflow.org) that address these challenges:\n\n- **`@checkpoint`**: Simplifies saving and reloading checkpoints within your Metaflow flows.\n- **`@huggingface_hub`**: Enables efficient loading and caching of large models from Hugging Face Hub.\n- **`@model`**: Allows for easy saving and loading of models created during your Metaflow flows.\n\nExamples for these decorators can be found in [this repository](https://github.com/outerbounds/metaflow-checkpoint-examples/tree/master). \n\n## Features\n\n### `@checkpoint` Decorator\n\nThe `@checkpoint` decorator alleviates the pain points associated with saving and reloading the state of your program (a Metaflow `@step`) in Metaflow flows. It also handles version control in multi-user settings by isolating checkpoints per user and run. 
Whether it's a checkpoint created by a machine learning model or intermediate data required in case of crashes, this decorator simplifies state management and failure recovery.\n\n- **Checkpointing**: Save the state of your `@step` at designated points.\n- **Seamless Recovery**: Restart your job from the last checkpoint upon retries without any manual intervention.\n- **User Isolation**: Checkpoints are managed per user to prevent overwriting in collaborative environments.\n- **Ease of Use**: Minimal code changes required to implement checkpointing.\n\n### `@huggingface_hub` Decorator\n\nThe `@huggingface_hub` decorator allows you to load large models from Hugging Face Hub and cache them for increased performance benefits. It also ensures that models are versioned and managed appropriately in multi-user environments.\n\n- **Efficient Model Loading**: Load models on-the-fly from Hugging Face Hub.\n- **Caching Mechanism**: Cache models locally to avoid redundant downloads.\n- **Version Control**: Manages different versions of models to prevent conflicts.\n- **Integration with Metaflow**: Easily incorporate models across your Metaflow flows.\n\n### `@model` Decorator\n\nThe `@model` decorator provides a trivial way to save and load models/checkpoints created as part of your Metaflow flow. \n\n- **Simplified Model Loading**: Automatically load models based on references and identifiers created by decorators such as `@model`/`@checkpoint`/`@huggingface_hub`. \n- **Model Identity**: Associates a uniquie identity to models so that there is clear distinction between different versions making it easy to track their lineage. \n\n\n",
"bugtrack_url": null,
"license": null,
"summary": "An EXPERIMENTAL checkpoint decorator for Metaflow",
"version": "0.1.4",
"project_urls": null,
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "e3df60d862745bd1df98cf83f31de111375f0bdaa0ff88a4f10d66b1007eb9d8",
"md5": "38d27489485b3ce5c2fe46486643628e",
"sha256": "d803e153af6f79b8a99eb7611c66847c483232bde495bd5f2a11ecd40e3d25c8"
},
"downloads": -1,
"filename": "metaflow_checkpoint-0.1.4-py2.py3-none-any.whl",
"has_sig": false,
"md5_digest": "38d27489485b3ce5c2fe46486643628e",
"packagetype": "bdist_wheel",
"python_version": "py2.py3",
"requires_python": null,
"size": 78426,
"upload_time": "2024-11-13T23:25:53",
"upload_time_iso_8601": "2024-11-13T23:25:53.730989Z",
"url": "https://files.pythonhosted.org/packages/e3/df/60d862745bd1df98cf83f31de111375f0bdaa0ff88a4f10d66b1007eb9d8/metaflow_checkpoint-0.1.4-py2.py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "9245949e13915f8b0480fff35206c22797f607fffb6f115aafe2b4b034c800ba",
"md5": "f529a2f63aa91e27f4169d1d1811c6b0",
"sha256": "8e649d44fd6b68362a4486482203a25b8c4481caebf9cdd8a5f99c5a799c91cf"
},
"downloads": -1,
"filename": "metaflow_checkpoint-0.1.4.tar.gz",
"has_sig": false,
"md5_digest": "f529a2f63aa91e27f4169d1d1811c6b0",
"packagetype": "sdist",
"python_version": "source",
"requires_python": null,
"size": 57364,
"upload_time": "2024-11-13T23:25:55",
"upload_time_iso_8601": "2024-11-13T23:25:55.288333Z",
"url": "https://files.pythonhosted.org/packages/92/45/949e13915f8b0480fff35206c22797f607fffb6f115aafe2b4b034c800ba/metaflow_checkpoint-0.1.4.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-11-13 23:25:55",
"github": false,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"lcname": "metaflow-checkpoint"
}