# Fast Hadamard Transform in CUDA, with a PyTorch interface
Features:
- Supports fp32, fp16, and bf16, for dimensions up to 32768.
- Implicitly pads the input with zeros if the dimension is not a power of 2.
## How to use
```
from fast_hadamard_transform import hadamard_transform
```
```
def hadamard_transform(x, scale=1.0):
    """
    Arguments:
        x: (..., dim)
        scale: float. Multiply the output by this number.
    Returns:
        out: (..., dim)

    Multiply each row of x by the Hadamard transform matrix.
    Equivalent to F.linear(x, torch.tensor(scipy.linalg.hadamard(dim))) * scale.
    If dim is not a power of 2, we implicitly pad x with zeros so that dim
    becomes the next power of 2.
    """
```
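
As a quick sanity check, here is a minimal usage sketch (illustrative only; it assumes the package is installed and a CUDA device is available) that compares the output against the scipy.linalg.hadamard reference mentioned in the docstring:

```
import math

import torch
import torch.nn.functional as F
from scipy.linalg import hadamard

from fast_hadamard_transform import hadamard_transform

dim = 1024  # power of 2, so no implicit padding
x = torch.randn(4, dim, dtype=torch.float16, device="cuda")

# Scale by 1/sqrt(dim) so the transform is orthonormal.
out = hadamard_transform(x, scale=1.0 / math.sqrt(dim))

# Reference: multiply each row by the dim x dim Hadamard matrix (symmetric).
H = torch.tensor(hadamard(dim), dtype=torch.float32, device="cuda")
ref = F.linear(x.float(), H) / math.sqrt(dim)

# Loose tolerance since the input is fp16.
torch.testing.assert_close(out.float(), ref, atol=1e-2, rtol=1e-3)
```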
## Speed
Benchmarked on A100, for not too small batch size, compared to memcpy
(torch.clone), which is a lower bound for the time taken as we'd need to read
inputs from GPU memory and write output to GPU memory anyway.
| Data type | Dimension | Time taken vs memcpy |
| --------- | ---------- | -------------------- |
| fp16/bf16 | <= 512 | 1.0x |
| | 512 - 8192 | <= 1.2x |
| | 16384 | 1.3x |
| | 32768 | 1.8x |
| fp32 | <= 8192 | 1.0x |
| | 16384 | 1.1x |
| | 32768 | 1.2x |
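
For reference, a rough timing sketch along these lines (a hypothetical script, not the one used to produce the table above; it times with CUDA events and uses torch.clone as the memcpy baseline):

```
import torch

from fast_hadamard_transform import hadamard_transform

def time_ms(fn, iters=100):
    # Warm up, then time the average iteration with CUDA events.
    for _ in range(10):
        fn()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

x = torch.randn(8192, 8192, dtype=torch.float16, device="cuda")
t_ht = time_ms(lambda: hadamard_transform(x))
t_copy = time_ms(lambda: x.clone())
print(f"hadamard_transform: {t_ht:.3f} ms | memcpy: {t_copy:.3f} ms | "
      f"ratio: {t_ht / t_copy:.2f}x")
```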