compressed-tensors


Name: compressed-tensors
Version: 0.3.3
Home page: https://github.com/neuralmagic/compressed-tensors
Summary: Library for utilization of compressed safetensors of neural network models
Upload time: 2024-05-07 15:30:55
Author: Neuralmagic, Inc.
License: Apache 2.0
# compressed_tensors

This repository extends the [safetensors](https://github.com/huggingface/safetensors) format to efficiently store sparse and/or quantized tensors on disk. The `compressed-tensors` format supports multiple compression types to minimize disk space and facilitate tensor manipulation.

## Motivation

### Reduce disk space by saving sparse tensors in a compressed format

The compressed format stores the data much more efficiently by taking advantage of two properties of tensors:

- Sparsity: a large number of entries are equal to zero (a rough sketch of this idea follows the list).
- Quantization: values are stored in a low-precision representation.
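
To make the sparsity point concrete, here is a rough sketch of the bitmask idea in plain PyTorch. It only illustrates why dropping zeros saves space; it is not the library's actual on-disk layout:

```python
import torch

# a sparse tensor: most entries are zero
dense = torch.tensor([[0.0, 0.0, 3.0, 0.0],
                      [1.0, 0.0, 0.0, 2.0]])

# conceptually, a bitmask scheme keeps a boolean mask plus only the nonzero values
mask = dense != 0          # one bit per element once packed
values = dense[mask]       # the nonzero entries, in row-major order

# decompression scatters the values back into a zero tensor using the mask
restored = torch.zeros_like(dense)
restored[mask] = values
assert torch.equal(restored, dense)

print(f"kept {values.numel()} of {dense.numel()} values")  # kept 3 of 8 values
```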

### Introduce an elegant interface to save/load compressed tensors

The library provides the ability to compress and decompress tensors. The properties of the tensors are defined by human-readable configs, allowing users to understand the compression format at a glance.
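
For instance, the `BitmaskConfig` used in the examples below is one such config; a minimal sketch of inspecting it (only the `format` attribute, which the save functions consume, is shown):

```python
from compressed_tensors import BitmaskConfig

# construct a config with its default settings and inspect the
# compression format identifier that gets passed to the save functions
config = BitmaskConfig()
print(config.format)
```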

## Installation

### Pip

```bash
pip install compressed-tensors
```

### From source

```bash
git clone https://github.com/neuralmagic/compressed-tensors
cd compressed-tensors
pip install -e .
```

## Getting started

### Saving/Loading Compressed Tensors (Bitmask Compression)

The function `save_compressed` uses the `compression_format` argument to apply compression when saving tensors.
The function `load_compressed` reverses the process: it converts the compressed weights on disk back into decompressed weights in device memory.

```python
from compressed_tensors import save_compressed, load_compressed, BitmaskConfig
from torch import Tensor
from typing import Dict

# the example BitmaskConfig method efficiently compresses
# tensors with a large number of zero entries
compression_config = BitmaskConfig()

tensors: Dict[str, Tensor] = {"tensor_1": Tensor(
    [[0.0, 0.0, 0.0], 
     [1.0, 1.0, 1.0]]
)}
# compress tensors using BitmaskConfig compression format (save them efficiently on disk)
save_compressed(tensors, "model.safetensors", compression_format=compression_config.format)

# decompress tensors (load_compressed returns a generator for memory efficiency)
decompressed_tensors = {}
for tensor_name, tensor in load_compressed("model.safetensors", compression_config=compression_config):
    decompressed_tensors[tensor_name] = tensor
```

### Saving/Loading Compressed Models (Bitmask Compression)

We can apply bitmask compression to a whole model. For a more detailed example, see the `examples` directory.

```python
from compressed_tensors import save_compressed_model, load_compressed, BitmaskConfig
from transformers import AutoModelForCausalLM

model_name = "neuralmagic/llama2.c-stories110M-pruned50"
model = AutoModelForCausalLM.from_pretrained(model_name)

original_state_dict = model.state_dict()

compression_config = BitmaskConfig()

# save compressed model weights
save_compressed_model(model, "compressed_model.safetensors", compression_format=compression_config.format)

# load compressed model weights (`dict` turns generator into a dictionary)
state_dict = dict(load_compressed("compressed_model.safetensors", compression_config))
```
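
Because bitmask compression only removes zero entries, the round trip should reproduce the original weights exactly. A minimal sanity check, assuming the decompressed state dict uses the same parameter names as `original_state_dict` from the snippet above:

```python
import torch

# reload the decompressed weights into the model and compare against the originals
model.load_state_dict(state_dict)

for name, original in original_state_dict.items():
    assert torch.equal(state_dict[name], original), f"mismatch in {name}"
```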

For a more in-depth tutorial on bitmask compression, refer to the [notebook](https://github.com/neuralmagic/compressed-tensors/blob/d707c5b84bc3fef164aebdcd97cb6eaa571982f8/examples/bitmask_compression.ipynb).



            
