# cumm
CUda Matrix Multiply library.
[![Build Status](https://github.com/FindDefinition/cumm/workflows/build/badge.svg)](https://github.com/FindDefinition/cumm/actions?query=workflow%3Abuild)
```cumm``` was developed while learning [CUTLASS](https://github.com/NVIDIA/cutlass), which uses too many C++ templates and makes code unmaintainable. So I developed [pccm](https://github.com/FindDefinition/PCCM), which uses Python as a meta-programming language to replace C++ template meta-programming.
Now ```pccm``` has become the foundational framework of ```cumm``` and my other C++ projects such as [spconv](https://github.com/traveller59/spconv).
```cumm``` also contains a Python asyncio-based GEMM simulator that **shares the same meta program** with the CUDA code, enabling GEMM visualization and an easier debugging experience.
## BREAKING CHANGES
* 0.3.1: the tv::DType enum values changed. This affects all binary code that uses tv::Tensor, so you must recompile all such code if you upgrade to cumm >= 0.3.1.
## News
* Ampere feature support (by [EvernightAurora](https://github.com/EvernightAurora))
## Install
### Prebuilt
We offer prebuilt binaries for Python 3.7-3.11 and CUDA 10.2/11.3/11.4/11.7/12.0 on Linux (manylinux) and Windows 10/11.
```pip install cumm``` for CPU-only
```pip install cumm-cu102``` for CUDA 10.2
```pip install cumm-cu113``` for CUDA 11.3
```pip install cumm-cu114``` for CUDA 11.4
```pip install cumm-cu117``` for CUDA 11.7
```pip install cumm-cu120``` for CUDA 12.0
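The naming pattern in the commands above can be sketched as a small lookup. This is only an illustration: `prebuilt_package_name` and `PREBUILT_CUDA_VERSIONS` are hypothetical names, not part of cumm, and the version list is simply copied from the install commands above.

```python
# Hypothetical helper, not part of cumm. The versions below are copied
# from the prebuilt install commands listed in this README.
PREBUILT_CUDA_VERSIONS = ("10.2", "11.3", "11.4", "11.7", "12.0")


def prebuilt_package_name(cuda_version=None):
    """Return the pip package name for a CUDA version (None means CPU-only)."""
    if cuda_version is None:
        return "cumm"
    if cuda_version not in PREBUILT_CUDA_VERSIONS:
        raise ValueError(f"no prebuilt cumm wheel listed for CUDA {cuda_version}")
    # "11.4" -> "cumm-cu114"
    return "cumm-cu" + cuda_version.replace(".", "")


if __name__ == "__main__":
    print("pip install", prebuilt_package_name("11.4"))
```

If your CUDA version is not in the list, building a wheel from source is the remaining option.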
### Build from source for development (JIT, recommended for development)
**WARNING** Use code from [tags](https://github.com/FindDefinition/cumm/releases)! Code in the main branch may contain bugs.
The C++ code is rebuilt automatically when you change C++ code in the project.
#### Linux
0. Uninstall any cumm installed by pip. You must ensure that ```pip list | grep cumm``` shows no "cumm".
1. Install build-essential and CUDA.
2. ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, ```pip install -e .```
3. In Python, ```import cumm``` and wait for the build to finish.
#### Windows
0. Uninstall any spconv and cumm installed by pip. You must ensure that ```pip list | grep cumm``` shows no "cumm".
1. Install Visual Studio 2019 or newer and make sure the C++ development component is installed. Install CUDA.
2. Set the [powershell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1).
3. Start a new PowerShell session and run ```tools/msvc_setup.ps1```.
4. ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, ```pip install -e .```
5. In Python, ```import cumm``` and wait for the build to finish.
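Step 0 on both platforms asks you to confirm that no pip-installed cumm remains. A minimal sketch of that check from Python, using only the standard library `importlib.metadata` (Python 3.8+), is shown here. The function names are hypothetical, and the match is case-insensitive, which is slightly broader than `grep cumm`. Run it before `pip install -e .`, since the editable install will itself show up afterwards.

```python
from importlib import metadata


def cumm_like(names):
    """Keep names containing 'cumm' (case-insensitive), sorted and de-duplicated."""
    return sorted({name for name in names if name and "cumm" in name.lower()})


def installed_cumm_distributions():
    """List installed distributions whose name contains 'cumm'."""
    names = (dist.metadata["Name"] for dist in metadata.distributions())
    return cumm_like(names)


if __name__ == "__main__":
    leftovers = installed_cumm_distributions()
    if leftovers:
        print("Uninstall these before building from source:", ", ".join(leftovers))
    else:
        print("No pip-installed cumm found.")
```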
### Build wheel from source
**WARNING** Use code from [tags](https://github.com/FindDefinition/cumm/releases)! Code in the main branch may contain bugs.
**WARNING**: If ```CUMM_CUDA_VERSION``` is set to a CUDA version, the following steps create a wheel named "cumm-cuxxx", not "cumm". Projects that depend on cumm must then list ```cumm-cuxxx``` as the dependency, not ```cumm```. If ```CUMM_CUDA_VERSION``` isn't set, ```cumm``` is always built with CUDA, so CUDA must exist on your system, and the wheel is named ```cumm``` even though it is built with CUDA.
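The naming rule in the warning above can be summarized in a short sketch. `describe_build` is a hypothetical name and this is not the project's actual `setup.py` logic. In particular, naming the CPU-only wheel `cumm` when the variable is set to an empty string is an assumption, inferred from the `pip install cumm` CPU-only package listed under Prebuilt.

```python
def describe_build(environ):
    """Return (wheel_name, build_mode) for a given environment mapping.

    Sketch of the rule described in this README, not the real setup.py.
    """
    value = environ.get("CUMM_CUDA_VERSION")
    if value is None:
        # Unset: built with CUDA, but the wheel is still named "cumm".
        return "cumm", "cuda"
    if value == "":
        # Empty string: CPU-only build. The name "cumm" is an assumption here.
        return "cumm", "cpu"
    # Set to a version such as "11.4" or "114": name carries the CUDA suffix.
    return "cumm-cu" + value.replace(".", ""), "cuda"


if __name__ == "__main__":
    import os

    print(describe_build(os.environ))
```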
#### Linux
It's recommended to build Linux packages in the [official build docker](https://github.com/FindDefinition/cumm/blob/main/.github/workflows/build.yaml). Building with CUDA support doesn't need a real GPU.
##### Build in Official Docker
1. Select a CUDA version. Available: CUDA 11.1, 11.3, 11.4, 11.5, 12.0
2. (Example for CUDA 11.4) ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```docker run --rm -e PLAT=manylinux2014_x86_64 -e CUMM_CUDA_VERSION=114 -v `pwd`:/io scrin/manylinux2014-cuda:cu114-devel-1.0.0 bash -c "source /etc/bashrc && /io/tools/build-wheels.sh"```
##### Build in your environment
1. Install build-essential and CUDA.
2. Set the environment variable for the installed CUDA version, for example ```export CUMM_CUDA_VERSION="11.4"```. To build CPU-only, run ```export CUMM_CUDA_VERSION=""```. If ```CUMM_CUDA_VERSION``` isn't set, you must ensure the CUDA libraries are in the OS search path, and the built wheel is named ```cumm```. If it is set to a CUDA version, the wheel is named ```cumm-cuxxx```.
3. Run ```export CUMM_DISABLE_JIT="1"```.
4. Run ```python setup.py bdist_wheel```, then ```pip install dists/xxx.whl```.
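The steps above can also be driven from Python. The sketch below only prepares the environment and the command. `wheel_build_plan` is a hypothetical helper, and the actual `subprocess.run` call is left commented out because it must be run from the repository root with the toolchain installed.

```python
import os
import sys


def wheel_build_plan(cuda_version, base_env=None):
    """Return (env, command) for a non-JIT wheel build, per steps 2-4 above."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUMM_CUDA_VERSION"] = cuda_version  # "" requests a CPU-only build
    env["CUMM_DISABLE_JIT"] = "1"
    command = [sys.executable, "setup.py", "bdist_wheel"]
    return env, command


if __name__ == "__main__":
    env, command = wheel_build_plan("11.4")
    print("would run:", " ".join(command))
    # To actually build, run this from the repository root:
    # import subprocess
    # subprocess.run(command, env=env, check=True)
```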
#### Windows 10/11
1. Install Visual Studio 2019 or newer and make sure the C++ development package is installed. Install CUDA.
2. Set the [powershell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1).
3. Start a new PowerShell session and run ```tools/msvc_setup.ps1```.
4. Set the environment variable for the installed CUDA version, for example ```$Env:CUMM_CUDA_VERSION = "11.4"```. To build CPU-only, run ```$Env:CUMM_CUDA_VERSION = ""```. If ```CUMM_CUDA_VERSION``` isn't set, you must ensure the CUDA libraries are in the OS search path, and the built wheel is named ```cumm```. If it is set to a CUDA version, the wheel is named ```cumm-cuxxx```.
5. Run ```$Env:CUMM_DISABLE_JIT = "1"```.
6. Run ```python setup.py bdist_wheel```, then ```pip install dists/xxx.whl```.
## Contributors
* [EvernightAurora](https://github.com/EvernightAurora): added Ampere feature support.
## Note
This work was done while the author was an employee at [TuSimple](https://www.tusimple.com/).
## LICENSE
Apache 2.0
Raw data
{
"_id": null,
"home_page": "https://github.com/FindDefinition/cumm",
"name": "cumm",
"maintainer": "",
"docs_url": null,
"requires_python": ">=3.6",
"maintainer_email": "",
"keywords": "",
"author": "Yan Yan",
"author_email": "yanyan.sub@outlook.com",
"download_url": "",
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "CUda Matrix Multiply library",
"version": "0.5.1",
"project_urls": {
"Homepage": "https://github.com/FindDefinition/cumm"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "c72616597829feca1fcc2b6f2533c461a448ec6d6fd6f09a58f3dae5f4d3e131",
"md5": "76a7c78e1f1e490b5390ca93a66249a4",
"sha256": "8572ce46fed5b337424fd68d2096a649a1aad86f3b2e974488ad5a157c41cbe2"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "76a7c78e1f1e490b5390ca93a66249a4",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.6",
"size": 2538428,
"upload_time": "2023-12-26T04:25:28",
"upload_time_iso_8601": "2023-12-26T04:25:28.343722Z",
"url": "https://files.pythonhosted.org/packages/c7/26/16597829feca1fcc2b6f2533c461a448ec6d6fd6f09a58f3dae5f4d3e131/cumm-0.5.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "7725bdc0708581bda1d2c778375d2fdf9cf4f8d675589adf5edabca9f95e8847",
"md5": "05ba4c095b98294a597d66216203414b",
"sha256": "b5c494f0d8b3504e4ae1aa69ea2fc08ee4a37a6d2da27412b9a534ccb9e50418"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp310-cp310-win_amd64.whl",
"has_sig": false,
"md5_digest": "05ba4c095b98294a597d66216203414b",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.6",
"size": 1107504,
"upload_time": "2023-12-26T03:59:22",
"upload_time_iso_8601": "2023-12-26T03:59:22.974663Z",
"url": "https://files.pythonhosted.org/packages/77/25/bdc0708581bda1d2c778375d2fdf9cf4f8d675589adf5edabca9f95e8847/cumm-0.5.1-cp310-cp310-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ccee1a53197dea5630d4ffec8dd8452eacbf12f1aec1ce22ce360c9479c0cdb9",
"md5": "052daff950133f2381dd928b2ce54c12",
"sha256": "d3b7064517881a617366aec9149fdf8631a6c9d8aa7eec88b9add1c15a3cddfa"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "052daff950133f2381dd928b2ce54c12",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.6",
"size": 2542625,
"upload_time": "2023-12-26T04:25:32",
"upload_time_iso_8601": "2023-12-26T04:25:32.082157Z",
"url": "https://files.pythonhosted.org/packages/cc/ee/1a53197dea5630d4ffec8dd8452eacbf12f1aec1ce22ce360c9479c0cdb9/cumm-0.5.1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "71889a15a31941dfc5019599de32da65267b840f0515c7c7ae2cf1ca5ad7398d",
"md5": "e65382d05cc50c3c5149dd6ac02a45d7",
"sha256": "b91cbfe30f3acdcacd31872c6694f3bb3d116811e6066265a1e1398c483b4097"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp311-cp311-win_amd64.whl",
"has_sig": false,
"md5_digest": "e65382d05cc50c3c5149dd6ac02a45d7",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.6",
"size": 1108973,
"upload_time": "2023-12-26T03:59:35",
"upload_time_iso_8601": "2023-12-26T03:59:35.466027Z",
"url": "https://files.pythonhosted.org/packages/71/88/9a15a31941dfc5019599de32da65267b840f0515c7c7ae2cf1ca5ad7398d/cumm-0.5.1-cp311-cp311-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e1fbe5df17d03f1521d8e70421f4cbdcbdec3ddf0742bc55e71c48a9edcd33d1",
"md5": "ade32aa872a0111aa7e21430b78b375e",
"sha256": "61130eab09b6d64bcdae71bd6e6f6d280c1b0ce815956d3ace21bc03e4d0acc1"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp312-cp312-win_amd64.whl",
"has_sig": false,
"md5_digest": "ade32aa872a0111aa7e21430b78b375e",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.6",
"size": 1106932,
"upload_time": "2023-12-26T03:59:51",
"upload_time_iso_8601": "2023-12-26T03:59:51.554378Z",
"url": "https://files.pythonhosted.org/packages/e1/fb/e5df17d03f1521d8e70421f4cbdcbdec3ddf0742bc55e71c48a9edcd33d1/cumm-0.5.1-cp312-cp312-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "2e1dbba85dcb83894e852cae6681278d18b449746434d9b25330462aa265bcab",
"md5": "84b7c60e5ee4adbee89b7c1af09ea4c3",
"sha256": "19d41d12b6fc5818d88652cc6b21119a0029f8b7b74ea61f1068f2d01b2dd2e6"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "84b7c60e5ee4adbee89b7c1af09ea4c3",
"packagetype": "bdist_wheel",
"python_version": "cp37",
"requires_python": ">=3.6",
"size": 2534698,
"upload_time": "2023-12-26T04:25:33",
"upload_time_iso_8601": "2023-12-26T04:25:33.625153Z",
"url": "https://files.pythonhosted.org/packages/2e/1d/bba85dcb83894e852cae6681278d18b449746434d9b25330462aa265bcab/cumm-0.5.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0d64207e052d07aebacf7b1e4df190db52ede0b252f4c34e8aefca9c60401296",
"md5": "ce0162a933e4c10a7f36aec2e8b3c3d4",
"sha256": "aa8fbad84d176d7137d74d709042583e0891469bdfebbbf14b48f42bb94d89ce"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "ce0162a933e4c10a7f36aec2e8b3c3d4",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.6",
"size": 2535991,
"upload_time": "2023-12-26T04:25:35",
"upload_time_iso_8601": "2023-12-26T04:25:35.754594Z",
"url": "https://files.pythonhosted.org/packages/0d/64/207e052d07aebacf7b1e4df190db52ede0b252f4c34e8aefca9c60401296/cumm-0.5.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "53c43e118181ad68dea5c7e80e827ac2c3b3602516d389871df2967617b33e3c",
"md5": "f749501473fa0d09e1bbaea035dada8c",
"sha256": "a761e086e5007c27722114bcb4055069a1ae7a23addc2bbe42332de19a797f29"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp38-cp38-win_amd64.whl",
"has_sig": false,
"md5_digest": "f749501473fa0d09e1bbaea035dada8c",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.6",
"size": 1107440,
"upload_time": "2023-12-26T03:59:12",
"upload_time_iso_8601": "2023-12-26T03:59:12.923798Z",
"url": "https://files.pythonhosted.org/packages/53/c4/3e118181ad68dea5c7e80e827ac2c3b3602516d389871df2967617b33e3c/cumm-0.5.1-cp38-cp38-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "ab4ab66ebb1be96de6d754493e06452451af23b64bf3257a4a3221ce0ecb23c4",
"md5": "9db0c4e70d296dafcff26a23b010d841",
"sha256": "e9cba8e0b134361a306bf688913dcd6a4aa6ecfb1f1765772c47caf74b45397d"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "9db0c4e70d296dafcff26a23b010d841",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.6",
"size": 2537334,
"upload_time": "2023-12-26T04:25:37",
"upload_time_iso_8601": "2023-12-26T04:25:37.860843Z",
"url": "https://files.pythonhosted.org/packages/ab/4a/b66ebb1be96de6d754493e06452451af23b64bf3257a4a3221ce0ecb23c4/cumm-0.5.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "8eb21c63d0ace83e8ee35d8e26741ab99a3704a31d4593a2239fe7ab4447c9a3",
"md5": "4dcff1bc740f105899f194567b2de6c0",
"sha256": "158f7b6c210f25e5b2578ae3b23c68f2c3efa62f81eec58ca9e8bde6022e9883"
},
"downloads": -1,
"filename": "cumm-0.5.1-cp39-cp39-win_amd64.whl",
"has_sig": false,
"md5_digest": "4dcff1bc740f105899f194567b2de6c0",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.6",
"size": 1107675,
"upload_time": "2023-12-26T03:59:49",
"upload_time_iso_8601": "2023-12-26T03:59:49.615732Z",
"url": "https://files.pythonhosted.org/packages/8e/b2/1c63d0ace83e8ee35d8e26741ab99a3704a31d4593a2239fe7ab4447c9a3/cumm-0.5.1-cp39-cp39-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2023-12-26 04:25:28",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "FindDefinition",
"github_project": "cumm",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "cumm"
}