gptq

Name	gptq
Version	0.0.3
home_page
Summary	GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers
upload_time	2023-03-23 06:48:51
maintainer
docs_url	None
author	Juncong Moo
requires_python
license	Apache 2.0
keywords	gptq
VCS
bugtrack_url
requirements	No requirements were recorded.

# 🔮 GPTQ - Accurate Post-Training Compression for Generative Pretrained Transformers

> This repo is an extended and polished version of the original code for the paper [GPTQ: Accurate Post-training Compression for Generative Pretrained Transformers](https://arxiv.org/abs/2210.17323).



## 🔥 SOTA on LLM PTQ

* An efficient implementation of the GPTQ algorithm
* CUDA kernels that multiply a 2/3/4/8-bit quantized matrix by a full-precision vector
* Bug fixes for older consumer-grade GPUs
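
To make the second bullet concrete, here is a minimal pure-Python sketch of what multiplying a low-bit quantized matrix by a full-precision vector involves. It uses plain per-row round-to-nearest quantization, which is an assumption made for illustration: it is not the GPTQ algorithm itself and not this package's CUDA kernel, and the names `quantize_row` and `quantized_matvec` are hypothetical.

```python
def quantize_row(row, bits):
    """Round-to-nearest quantization of one row to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(row), max(row)
    # Guard against a constant row, where the range would be zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in row]
    return codes, scale, lo


def quantized_matvec(qrows, x):
    """Multiply a quantized matrix (list of quantized rows) by a full-precision vector."""
    out = []
    for codes, scale, zero in qrows:
        # Dequantize each weight on the fly: w is approximately code * scale + zero.
        out.append(sum((c * scale + zero) * xi for c, xi in zip(codes, x)))
    return out


# Illustrative weights and input, not taken from any real model.
W = [[0.10, -0.40, 0.25, 0.90],
     [-1.00, 0.50, 0.00, 0.30]]
x = [1.0, 2.0, -1.0, 0.5]

exact = [sum(w * xi for w, xi in zip(row, x)) for row in W]
for bits in (2, 3, 4, 8):
    qrows = [quantize_row(row, bits) for row in W]
    approx = quantized_matvec(qrows, x)
    err = max(abs(a - e) for a, e in zip(approx, exact))
    print(f"{bits}-bit: max abs error = {err:.4f}")
```

Only the integer codes, one scale, and one offset per row need to be stored, which is where the memory saving comes from. A real kernel would additionally pack several codes into each machine word.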


![](https://images.deepai.org/converted-papers/2210.17323/x3.png)


## 📥 Installation

```bash
pip install gptq
```


### 🛟 Install PyTorch

`gptq` requires PyTorch and a CUDA-capable GPU. Installing a PyTorch build that matches your CUDA version can be tricky, so the following steps are recommended:

- Run `nvcc --version` to get the CUDA compiler version. For example, the following output shows CUDA release 11.6, which corresponds to the backend tag `cu116`:

```
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
```
- Run `pip install light-the-torch` to install `ltt`.
- Run `ltt install --pytorch-computation-backend=cu116 torch torchvision torchaudio` to install the torch suite. Replace `116` with the CUDA version of your own environment.
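
The steps above can be sketched in Python as follows. The helper `cuda_backend_tag` is a hypothetical name introduced here, not part of `gptq` or `light-the-torch`; it only assumes that the `nvcc --version` output contains a `release X.Y` string, as in the sample shown earlier.

```python
import re


def cuda_backend_tag(nvcc_output):
    """Return a backend tag such as 'cu116' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        raise ValueError("could not find a CUDA release number in the nvcc output")
    major, minor = match.groups()
    return f"cu{major}{minor}"


# Sample output copied from the example above.
sample = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Tue_Mar__8_18:18:20_PST_2022
Cuda compilation tools, release 11.6, V11.6.124
Build cuda_11.6.r11.6/compiler.31057947_0
"""

tag = cuda_backend_tag(sample)
# Print the install command to run, rather than running it automatically.
print(f"ltt install --pytorch-computation-backend={tag} torch torchvision torchaudio")
```

To use live output instead of the sample, pass `subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout` to the function. A PyTorch build may not exist for every tag derived this way, so check the printed command before running it.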

## TODO

- GPTQ with CNN

----

Algorithm credits go to [IST Austria Distributed Algorithms and Systems Lab](https://ist.ac.at/en/research/alistarh-group)

Raw data

            {
    "_id": null,
    "home_page": "",
    "name": "gptq",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "gptq",
    "author": "Juncong Moo",
    "author_email": "<juncongmoo@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/25/06/2e8e087ec1572fca200c591442ed92c4df7ac8a854ecdf32a7b8065ce14d/gptq-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "Apache 2.0",
    "summary": "GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers",
    "version": "0.0.3",
    "split_keywords": [
        "gptq"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "25062e8e087ec1572fca200c591442ed92c4df7ac8a854ecdf32a7b8065ce14d",
                "md5": "e36064eeaae8f9c0edb7864648f58317",
                "sha256": "05121652e59fd5cc9c6cf9530bb999bb4d843fdbbe81ee532e06c6f8023b812f"
            },
            "downloads": -1,
            "filename": "gptq-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "e36064eeaae8f9c0edb7864648f58317",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 21430,
            "upload_time": "2023-03-23T06:48:51",
            "upload_time_iso_8601": "2023-03-23T06:48:51.069411Z",
            "url": "https://files.pythonhosted.org/packages/25/06/2e8e087ec1572fca200c591442ed92c4df7ac8a854ecdf32a7b8065ce14d/gptq-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-03-23 06:48:51",
    "github": false,
    "gitlab": false,
    "bitbucket": false,
    "lcname": "gptq"
}
