huggingface-hub-storj-patch

- Name: huggingface-hub-storj-patch
- Version: 0.0.6
- Home page: https://github.com/storj/huggingface-hub-storj-patch
- Summary: Monkey patch for huggingface_hub to download Git-LFS blobs from Storj
- Upload time: 2023-06-08 17:22:46
- Author: Kaloyan Raev
- Requires Python: >=3.7.0
- License: Apache
- Keywords: model-hub, machine-learning, models, natural-language-processing, deep-learning, pytorch, pretrained-models, storj, patch, linksharing, decentralized, cloud, storage

# Monkey patch for HuggingFace Hub to download Git-LFS blobs from Storj

This patch aims to demonstrate the transfer speed that the `huggingface_hub` Python library can achieve when it downloads from [Storj Decentralized Cloud Storage](https://storj.io).

HuggingFace Hub stores all large files in Git-LFS.

![image](https://github.com/storj/huggingface-hub-storj-patch/assets/468091/b3c8d6d6-14fd-43c2-9396-91d4d3eba62f)

When the `huggingface_hub` Python library requests to download such a file, the download request is redirected to the Git-LFS CDN hosted at `cdn-lfs.huggingface.co`.

This monkey patch modifies the `huggingface_hub` library to redirect Git-LFS downloads to the Storj Linksharing service hosted at `link.storjshare.io`.
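The redirect can be pictured as a URL rewrite. The following is a minimal sketch, not the patch's actual code: the function name `to_storj_url` is hypothetical, and it assumes that the bucket mirrors the path layout of the CDN, which the patch may not do. The prefix is the shared StarCoder bucket URL that this README uses as the default.

```python
from urllib.parse import urlsplit

# Host of the Git-LFS CDN named in this README.
CDN_HOST = "cdn-lfs.huggingface.co"

# Shared StarCoder bucket from this README (the documented default prefix).
DEFAULT_PREFIX = (
    "https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface"
)


def to_storj_url(url: str, prefix: str = DEFAULT_PREFIX) -> str:
    """Hypothetical helper: map a Git-LFS CDN URL onto a Linksharing URL.

    Assumption: the object key in the bucket equals the path of the CDN URL.
    The query string (for example, signing parameters) is dropped.
    """
    parts = urlsplit(url)
    if parts.hostname != CDN_HOST:
        # Not a Git-LFS CDN download: leave the URL untouched.
        return url
    return prefix.rstrip("/") + parts.path


if __name__ == "__main__":
    # The path below is a made-up placeholder, not a real blob location.
    print(to_storj_url("https://cdn-lfs.huggingface.co/repos/example/blob?sig=abc"))
```

URLs that do not point at the CDN host pass through unchanged, which matches the idea that only Git-LFS downloads are redirected.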

## Prerequisites

The Git-LFS blobs of the respective AI model must be replicated to a Storj bucket, and that bucket must be shared through the [Storj Linksharing Service](https://docs.storj.io/dcs/api-reference/linksharing-service).

We have already replicated the Git-LFS blobs of the [StarCoder](https://huggingface.co/bigcode/starcoder) model to a Storj bucket and shared it: https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface

To use another AI model, replicate its Git-LFS blobs to your own Storj bucket and configure the patch to use that bucket. See [Configuration](#hf_hub_storj_url_prefix) for more details.

## Installation

First, install the patch module:

```sh
pip install huggingface-hub-storj-patch
```

Then add the following import statement at the top of your Python script, before any other import:

```python
import huggingface_hub_storj_patch
```

Now you can run your script. If the patch is applied successfully, it prints the URLs from which the `huggingface_hub` library is downloading.

![image](https://github.com/storj/huggingface-hub-storj-patch/assets/468091/ad50968c-7959-4a6a-8f63-540eb70372ba)

## Configuration

The following environment variables configure the behavior of the patch.

### HF_HUB_NO_STORJ

If set to `true`, downloads are not redirected to the Storj Linksharing Service, as if the patch were not applied.

### HF_HUB_STORJ_PARALLELISM

Configures how many parallel download connections are opened to the Storj Linksharing Service. The default value is `16`.
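To illustrate what a parallelism setting can mean in practice, here is a minimal sketch of splitting one blob into contiguous byte ranges, one per connection. It assumes the parallel connections use HTTP `Range` requests, which this README does not confirm, and the helper names `split_ranges` and `range_header` are hypothetical.

```python
from typing import List, Tuple


def split_ranges(size: int, parts: int = 16) -> List[Tuple[int, int]]:
    """Split `size` bytes into at most `parts` contiguous (start, end) ranges.

    Both bounds are inclusive, as in an HTTP Range header.
    """
    if size <= 0 or parts <= 0:
        return []
    parts = min(parts, size)  # never create empty ranges
    base, extra = divmod(size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` ranges carry one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start: int, end: int) -> str:
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"


if __name__ == "__main__":
    for start, end in split_ranges(10, 3):
        print(range_header(start, end))
```

With the default of `16`, a large blob would be fetched as sixteen roughly equal ranges under this assumption. A lower value opens fewer connections at once.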

### HF_HUB_STORJ_URL_PREFIX

Configures the URL to the shared Storj bucket that replicates the Git-LFS blobs of the AI model. The default value is the bucket that replicates the StarCoder model: https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface
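Taken together, the three variables and their documented defaults can be summarized in a small sketch. This is not the patch's actual code: the `StorjPatchSettings` class and the `load_settings` function are hypothetical, and treating only the value `true` (case-insensitive) as enabling `HF_HUB_NO_STORJ` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Defaults as documented in this README.
DEFAULT_PARALLELISM = 16
DEFAULT_URL_PREFIX = (
    "https://link.storjshare.io/raw/juzlwaj7ovnst5gtkv2km3rkriha/lfs-huggingface"
)


@dataclass(frozen=True)
class StorjPatchSettings:
    """Hypothetical container for the three documented settings."""

    disabled: bool
    parallelism: int
    url_prefix: str


def load_settings(env: Mapping[str, str] = os.environ) -> StorjPatchSettings:
    """Read the documented environment variables, falling back to defaults."""
    # Assumption: only "true" (any letter case) disables the redirect.
    disabled = env.get("HF_HUB_NO_STORJ", "").strip().lower() == "true"
    parallelism = int(env.get("HF_HUB_STORJ_PARALLELISM", DEFAULT_PARALLELISM))
    url_prefix = env.get("HF_HUB_STORJ_URL_PREFIX", DEFAULT_URL_PREFIX)
    return StorjPatchSettings(disabled, parallelism, url_prefix)


if __name__ == "__main__":
    print(load_settings())
```

Because these are environment variables, the safest place to set them is in the shell, or at the very top of the script before `import huggingface_hub_storj_patch`.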

            

        