# cumm
CUda Matrix Multiply library.
[Build status](https://github.com/FindDefinition/cumm/actions?query=workflow%3Abuild)
```cumm``` was developed while learning [CUTLASS](https://github.com/NVIDIA/cutlass), which relies so heavily on C++ templates that the code becomes hard to maintain. So I developed [pccm](https://github.com/FindDefinition/PCCM), which uses Python as the meta programming language in place of C++ template meta programming.
```pccm``` has since become the foundational framework of ```cumm``` and of my other C++ projects such as [spconv](https://github.com/traveller59/spconv).
```cumm``` also contains a Python asyncio-based GEMM simulator that **shares the same meta program** with the CUDA code, which enables GEMM visualization and an easier debugging experience.
## BREAKING CHANGES
* 0.3.1: the tv::DType enum values changed, which affects all binary code that uses tv::Tensor. You must recompile all such code when upgrading to cumm >= 0.3.1.
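To make the 0.3.1 boundary concrete, here is a minimal sketch of a check that tells whether an upgrade crosses it. The helper names `parse_version` and `needs_recompile` are hypothetical and not part of cumm, and the parser assumes plain dotted integer version strings.

```python
ABI_BREAK = (0, 3, 1)  # release in which the tv::DType enum values changed


def parse_version(text):
    """Turn '0.3.1' into (0, 3, 1). Assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def needs_recompile(old_version, new_version):
    """Return True when upgrading from old_version to new_version crosses 0.3.1."""
    return parse_version(old_version) < ABI_BREAK <= parse_version(new_version)
```

An upgrade that starts and ends on the same side of 0.3.1 returns `False`, so only the crossing itself triggers a rebuild of code that uses tv::Tensor.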
## News
* Ampere feature support (by [EvernightAurora](https://github.com/EvernightAurora))
## Install
### Prebuilt
We offer Python 3.9-3.13 and CUDA 11.4/11.8/12.1/12.4/12.6 prebuilt binaries for Linux (`manylinux_2_28`).
We offer Python 3.9-3.13 and CUDA 11.4/11.8/12.1/12.4/12.6 prebuilt binaries for Windows 10/11.
We offer Python 3.9-3.13 prebuilt binaries for macOS >= 14.0 (Apple Silicon only).
```pip install cumm``` for CPU-only
```pip install cumm-cu114``` for CUDA 11.4
```pip install cumm-cu126``` for CUDA 12.6
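The support matrix above can be sketched as a small lookup that returns the pip package name for a given interpreter and platform. This is only an illustration: `prebuilt_package` is a hypothetical helper, the names for CUDA 11.8, 12.1 and 12.4 are assumed to follow the same `cumm-cuXXX` pattern as the two examples above, and the macOS >= 14.0 requirement and the Linux/Windows CPU architecture are not checked.

```python
import platform
import sys

# CUDA versions listed for the prebuilt Linux and Windows binaries.
SUPPORTED_CUDA = {"11.4", "11.8", "12.1", "12.4", "12.6"}


def prebuilt_package(python_version, system, machine, cuda=None):
    """Return the pip package name for a prebuilt wheel, or None.

    python_version: (major, minor) tuple, e.g. sys.version_info[:2]
    system: platform.system() value ('Linux', 'Windows', 'Darwin')
    machine: platform.machine() value (only used for macOS here)
    cuda: CUDA version string such as '11.4', or None for CPU-only
    """
    if not ((3, 9) <= tuple(python_version) <= (3, 13)):
        return None
    if system == "Darwin":
        # macOS wheels are Apple Silicon only and have no CUDA variant.
        return "cumm" if machine == "arm64" and cuda is None else None
    if system not in ("Linux", "Windows"):
        return None
    if cuda is None:
        return "cumm"
    if cuda not in SUPPORTED_CUDA:
        return None
    # Assumed naming pattern: '11.4' -> 'cumm-cu114'.
    return "cumm-cu" + cuda.replace(".", "")


if __name__ == "__main__":
    name = prebuilt_package(
        sys.version_info[:2], platform.system(), platform.machine(), cuda="12.6"
    )
    print(name or "no prebuilt wheel listed; build from source")
```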
### Build from source for development (JIT, recommended for development)
**WARNING** Use code from [tags](https://github.com/FindDefinition/cumm/releases)! Code in the main branch may contain bugs.
The C++ code is rebuilt automatically whenever you change C++ code in the project.
#### Linux
0. Uninstall any cumm installed by pip. Make sure ```pip list | grep cumm``` shows no "cumm" entry.
1. Install build-essential and CUDA.
2. ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, ```pip install -e .```
3. In Python, ```import cumm``` and wait for the build to finish.
#### Windows
0. Uninstall any spconv and cumm installed by pip. Make sure ```pip list | grep cumm``` shows no "cumm" entry (in a stock Windows shell without ```grep```, ```pip list | findstr cumm``` does the same).
1. Install Visual Studio 2019 or newer and make sure the C++ development component is installed. Install CUDA.
2. Set the [powershell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1).
3. Start a new PowerShell and run ```tools/msvc_setup.ps1```
4. ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, ```pip install -e .```
5. In Python, ```import cumm``` and wait for the build to finish.
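Step 0 on either platform can also be checked from Python, which avoids depending on a particular shell tool. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper names `leftover_packages` and `installed_names` are hypothetical. Run it before the ```pip install -e .``` step, since the editable install itself registers a cumm distribution.

```python
from importlib import metadata


def leftover_packages(names, prefixes=("cumm", "spconv")):
    """Return the distribution names that start with any of the prefixes."""
    prefixes = tuple(prefixes)
    found = []
    for name in names:
        # Treat 'cumm_cu114' and 'cumm-cu114' as the same spelling.
        normalized = name.lower().replace("_", "-")
        if normalized.startswith(prefixes):
            found.append(name)
    return sorted(found)


def installed_names():
    """Names of all distributions visible to the current interpreter."""
    return [
        dist.metadata["Name"]
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    ]


if __name__ == "__main__":
    leftovers = leftover_packages(installed_names())
    if leftovers:
        print("uninstall these first:", ", ".join(leftovers))
    else:
        print("no pip-installed cumm or spconv found")
```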
### Build wheel from source
**WARNING** Use code in [tags](https://github.com/FindDefinition/cumm/releases)!!! code in main branch may contain bugs.
**WARNING**: If ```CUMM_CUDA_VERSION``` is set to a CUDA version, the following steps create a wheel named "cumm-cuxxx" rather than "cumm", so any project that depends on cumm must list ```cumm-cuxxx``` as its dependency, not ```cumm```. If ```CUMM_CUDA_VERSION``` isn't set, ```cumm``` is always built with CUDA, so CUDA must exist on your system, and the wheel is still named ```cumm``` even though it is built with CUDA.
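The three states of ```CUMM_CUDA_VERSION``` described in this section can be summarized in a small sketch. `describe_build` is a hypothetical helper, not part of the cumm build scripts. It accepts both the dotted form (`11.4`) and the compact form (`114`) because both appear in this README, and it assumes that a CPU-only build is named plain `cumm`, matching the prebuilt CPU-only package.

```python
def describe_build(environ):
    """Map CUMM_CUDA_VERSION to (wheel_name, built_with_cuda).

    environ: a mapping such as os.environ.
    """
    value = environ.get("CUMM_CUDA_VERSION")
    if value is None:
        # Unset: built against the CUDA found on the system, named plain 'cumm'.
        return ("cumm", True)
    if value == "":
        # Empty string: CPU-only build. The name 'cumm' is an assumption here.
        return ("cumm", False)
    # Set to a version: '11.4' and '114' both give 'cumm-cu114'.
    return ("cumm-cu" + value.replace(".", ""), True)
```

For example, calling `describe_build(os.environ)` before a build shows which package name downstream projects would have to depend on.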
#### Linux
It's recommended to build Linux packages in the [official build docker](https://github.com/FindDefinition/cumm/blob/main/.github/workflows/build.yaml). Building with CUDA support doesn't require a real GPU.
##### Build in Official Docker
1. Select a CUDA version. Available: CUDA 11.1, 11.3, 11.4, 11.5, 12.0
2. (Example for CUDA 11.4) ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```docker run --rm -e PLAT=manylinux2014_x86_64 -e CUMM_CUDA_VERSION=114 -v `pwd`:/io scrin/manylinux2014-cuda:cu114-devel-1.0.0 bash -c "source /etc/bashrc && /io/tools/build-wheels.sh"```
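If the Docker build is driven from a script, the command above can be assembled as an argument list, which avoids shell quoting issues around the backtick `pwd` substitution. This is a sketch only: `docker_build_command` is a hypothetical helper, and the image name is passed in by the caller because only the `cu114` image tag is confirmed by this README.

```python
import os
import shlex


def docker_build_command(repo_dir, cuda_tag, image):
    """Build the 'docker run' argument list for the wheel build.

    repo_dir: path to the cumm checkout, mounted at /io in the container
    cuda_tag: value for CUMM_CUDA_VERSION, e.g. '114'
    image: build image, e.g. 'scrin/manylinux2014-cuda:cu114-devel-1.0.0'
    """
    repo_dir = os.path.abspath(repo_dir)
    return [
        "docker", "run", "--rm",
        "-e", "PLAT=manylinux2014_x86_64",
        "-e", "CUMM_CUDA_VERSION=" + cuda_tag,
        "-v", repo_dir + ":/io",
        image,
        "bash", "-c", "source /etc/bashrc && /io/tools/build-wheels.sh",
    ]


if __name__ == "__main__":
    cmd = docker_build_command(
        ".", "114", "scrin/manylinux2014-cuda:cu114-devel-1.0.0"
    )
    # Print the command for review; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```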
##### Build in your environment
1. Install build-essential and CUDA.
2. Set the environment variable for the installed CUDA version, for example ```export CUMM_CUDA_VERSION="11.4"```. To build CPU-only, run ```export CUMM_CUDA_VERSION=""```. If ```CUMM_CUDA_VERSION``` isn't set, make sure the CUDA libraries are on the OS search path; the built wheel is then named ```cumm```, otherwise ```cumm-cuxxx```
3. Run ```export CUMM_DISABLE_JIT="1"```
4. Run ```python setup.py bdist_wheel```, then ```pip install dist/xxx.whl```
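Because the wheel file name depends on the version, Python tag and platform, a script may prefer to look it up rather than type it. A minimal sketch, assuming the wheel output directory is passed in explicitly; `newest_wheel` is a hypothetical helper.

```python
from pathlib import Path


def newest_wheel(dist_dir):
    """Return the most recently modified .whl file in dist_dir, or None."""
    wheels = sorted(Path(dist_dir).glob("*.whl"), key=lambda p: p.stat().st_mtime)
    return wheels[-1] if wheels else None
```

The returned path can then be handed to pip, for example with `subprocess.run([sys.executable, "-m", "pip", "install", str(path)])`.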
#### Windows 10/11
1. Install Visual Studio 2019 or newer and make sure the C++ development package is installed. Install CUDA.
2. Set the [powershell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1).
3. Start a new PowerShell and run ```tools/msvc_setup.ps1```
4. Set the environment variable for the installed CUDA version, for example ```$Env:CUMM_CUDA_VERSION = "11.4"```. To build CPU-only, run ```$Env:CUMM_CUDA_VERSION = ""```. If ```CUMM_CUDA_VERSION``` isn't set, make sure the CUDA libraries are on the OS search path; the built wheel is then named ```cumm```, otherwise ```cumm-cuxxx```
5. Run ```$Env:CUMM_DISABLE_JIT = "1"```
6. Run ```python setup.py bdist_wheel```, then ```pip install dist/xxx.whl```
## Contributors
* [EvernightAurora](https://github.com/EvernightAurora): added Ampere feature support.
## Note
This work was done while the author was an employee at [Tusimple](https://www.tusimple.com/).
## LICENSE
Apache 2.0
Raw data
{
"_id": null,
"home_page": "https://github.com/FindDefinition/cumm",
"name": "cumm-cu118",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Yan Yan",
"author_email": "yanyan.sub@outlook.com",
"download_url": null,
"platform": null,
"description": "\r\n# cumm\r\nCUda Matrix Multiply library.\r\n\r\n[](https://github.com/FindDefinition/cumm/actions?query=workflow%3Abuild)\r\n\r\n```cumm``` is developed during learning of [CUTLASS](https://github.com/NVIDIA/cutlass), which use too much c++ template and make code unmaintainable. So I develop [pccm](https://github.com/FindDefinition/PCCM), use python as meta programming language, to replace c++ template meta programming. \r\nNow ```pccm``` become a foundational framework of ```cumm``` and my other c++ project such as [spconv](https://github.com/traveller59/spconv). \r\n```cumm``` also contains a python asyncio-based gemm simulator that **share same meta program** with CUDA code, enable gemm visualization and easy debug experience.\r\n\r\n## BREAKING CHANGES\r\n\r\n* 0.3.1: tv::DType enum value changed, this will affect all binary code of tv::Tensor user. you must recompile all code if upgrade to cumm >= 0.3.1.\r\n\r\n## News\r\n\r\n* Ampere feature support (by [EvernightAurora](https://github.com/EvernightAurora))\r\n\r\n## Install\r\n\r\n### Prebuilt\r\n\r\nWe offer python 3.9-3.13 and cuda 11.4/11.8/12.1/12.4/12.6 prebuilt binaries for linux (`manylinux_2_28`).\r\n\r\nWe offer python 3.9-3.13 and cuda 11.4/11.8/12.1/12.4/12.6 prebuilt binaries for windows 10/11.\r\n\r\nWe offer python 3.9-3.13 prebuilt binaries for Mac OS X >= 14.0 (Apple Silicon Only).\r\n\r\n```pip install cumm``` for CPU-only\r\n\r\n```pip install cumm-cu114``` for CUDA 11.4\r\n\r\n```pip install cumm-cu126``` for CUDA 12.6\r\n\r\n### Build from source for development (JIT, recommend for develop)\r\n\r\n**WARNING** Use code in [tags](https://github.com/FindDefinition/cumm/releases)!!! code in main branch may contain bugs.\r\n\r\nThe c++ code will be built automatically when you change c++ code in project.\r\n\r\n#### Linux\r\n\r\n0. uninstall cumm installed by pip. you must ensure no \"cumm\" exists in ```pip list | grep cumm```\r\n1. 
install build-essential, install CUDA\r\n2. ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, ```pip install -e .```\r\n3. in python, ```import cumm``` and wait for build finish.\r\n\r\n#### Windows\r\n0. uninstall spconv and cumm installed by pip. you must ensure no \"cumm\" exists in ```pip list | grep cumm```\r\n1. install visual studio 2019 or newer. make sure C++ development component is installed. install CUDA\r\n2. set [powershell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1)\r\n3. start a new powershell, run ```tools/msvc_setup.ps1```\r\n4. ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, ```pip install -e .```\r\n5. in python, ```import cumm``` and wait for build finish.\r\n\r\n### Build wheel from source \r\n\r\n**WARNING** Use code in [tags](https://github.com/FindDefinition/cumm/releases)!!! code in main branch may contain bugs.\r\n\r\n**WARNING**: If ```CUMM_CUDA_VERSION``` is set with a CUDA version, following steps will create a wheel named \"cumm-cuxxx\", not \"cumm\", this means you must use ```cumm-cuxxx``` in dependency of your project which depend on cumm, not ```cumm```. If ```CUMM_CUDA_VERSION``` isn't set, ```cumm``` will always built with CUDA, so the CUDA must exists in your system. The wheel name will be ```cumm``` even if it is built with cuda.\r\n\r\n#### Linux\r\n\r\nIt's recommend to build Linux packages in [official build docker](https://github.com/FindDefinition/cumm/blob/main/.github/workflows/build.yaml). Build with CUDA support don't need a real GPU.\r\n\r\n##### Build in Official Docker\r\n\r\n1. select a cuda version. available: CUDA 11.1, 11.3, 11.4, 11.5, 12.0\r\n2. 
(Example for CUDA 11.4) ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```docker run --rm -e PLAT=manylinux2014_x86_64 -e CUMM_CUDA_VERSION=114 -v `pwd`:/io scrin/manylinux2014-cuda:cu114-devel-1.0.0 bash -c \"source /etc/bashrc && /io/tools/build-wheels.sh\"```\r\n\r\n##### Build in your environment\r\n\r\n1. install build-essential, install CUDA\r\n2. set env for installed cuda version. for example, ```export CUMM_CUDA_VERSION=\"11.4\"```. If you want to build CPU-only, run ```export CUMM_CUDA_VERSION=\"\"```. If ```CUMM_CUDA_VERSION``` isn't set, you need to ensure cuda libraries are inside OS search path, and the built wheel name will be ```cumm```, otherwise ```cumm-cuxxx```\r\n3. run ```export CUMM_DISABLE_JIT=\"1\"```\r\n4. run ```python setup.py bdist_wheel```+```pip install dists/xxx.whl```\r\n\r\n#### Windows 10/11\r\n\r\n1. install visual studio 2019 or newer. make sure C++ development package is installed. install CUDA\r\n2. set [powershell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1)\r\n3. start a new powershell, run ```tools/msvc_setup.ps1```\r\n4. set env for installed cuda version. for example, ```$Env:CUMM_CUDA_VERSION = \"11.4\"```. If you want to build CPU-only, run ```$Env:CUMM_CUDA_VERSION = \"\"```. . If ```CUMM_CUDA_VERSION``` isn't set, you need to ensure cuda libraries are inside OS search path, and the built wheel name will be ```cumm```, otherwise ```cumm-cuxxx```\r\n4. run ```$Env:CUMM_DISABLE_JIT = \"1\"```\r\n5. run ```python setup.py bdist_wheel```+```pip install dists/xxx.whl```\r\n\r\n## Contributers\r\n\r\n* [EvernightAurora](https://github.com/EvernightAurora): add ampere feature.\r\n\r\n## Note\r\nThe work is done when the author is an employee at [Tusimple](https://www.tusimple.com/).\r\n\r\n## LICENSE\r\n\r\nApache 2.0\r\n",
"bugtrack_url": null,
"license": "MIT",
"summary": "CUda Matrix Multiply library",
"version": "0.7.11",
"project_urls": {
"Homepage": "https://github.com/FindDefinition/cumm"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "0635b560a2b58bbf8b08e92fc5b5befdf738f4d1b231e92f55281db3a81c65e9",
"md5": "621a3c897fa08137f239bf10c3c2953b",
"sha256": "f3d89ec43d1db13a7e80d4d5bba96116c8a80855bbab0db14be64a8e1537a169"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "621a3c897fa08137f239bf10c3c2953b",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 25606440,
"upload_time": "2024-12-15T13:57:27",
"upload_time_iso_8601": "2024-12-15T13:57:27.943834Z",
"url": "https://files.pythonhosted.org/packages/06/35/b560a2b58bbf8b08e92fc5b5befdf738f4d1b231e92f55281db3a81c65e9/cumm_cu118-0.7.11-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "454310548d09dd36ac583bd74d72b2fd6ab900b36f84fda77af9fb3ed555215e",
"md5": "227b0ef98e1b70caf862f2f83345da7b",
"sha256": "2ee8a31ec6cd81a60392008306399ad1de0659fd50e227280ce81b8234dce9bd"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp310-cp310-win_amd64.whl",
"has_sig": false,
"md5_digest": "227b0ef98e1b70caf862f2f83345da7b",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 1220817,
"upload_time": "2024-12-15T13:37:32",
"upload_time_iso_8601": "2024-12-15T13:37:32.096115Z",
"url": "https://files.pythonhosted.org/packages/45/43/10548d09dd36ac583bd74d72b2fd6ab900b36f84fda77af9fb3ed555215e/cumm_cu118-0.7.11-cp310-cp310-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "60c7922d3e0218284a9dc6c3246afa77b1898c7497df43bb226591e151f31e6a",
"md5": "8abdf1849ab7b20d201eb2d21ae8317d",
"sha256": "16534760e1a22c27c3993804df043a4f675a2a03baf4f80e5c055bf8fa609b95"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "8abdf1849ab7b20d201eb2d21ae8317d",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 25610582,
"upload_time": "2024-12-15T13:57:34",
"upload_time_iso_8601": "2024-12-15T13:57:34.213424Z",
"url": "https://files.pythonhosted.org/packages/60/c7/922d3e0218284a9dc6c3246afa77b1898c7497df43bb226591e151f31e6a/cumm_cu118-0.7.11-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "817fa51464c940d76716e2e26f6a5d06968092d5127bdb4e3e3c141ee65c9bed",
"md5": "eef190b6f5e479d72162c47e3cc590a1",
"sha256": "0439021079a2d6955a95cf1ed656649473081570abbb924b318c6e738da072d0"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp311-cp311-win_amd64.whl",
"has_sig": false,
"md5_digest": "eef190b6f5e479d72162c47e3cc590a1",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 1221872,
"upload_time": "2024-12-15T13:42:16",
"upload_time_iso_8601": "2024-12-15T13:42:16.995647Z",
"url": "https://files.pythonhosted.org/packages/81/7f/a51464c940d76716e2e26f6a5d06968092d5127bdb4e3e3c141ee65c9bed/cumm_cu118-0.7.11-cp311-cp311-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "670163ad206570c918dc111c8598ba7319a845a0ad661a42cefb0aad6daa54df",
"md5": "8e13e650dbcf2d2e1fa5e49ea5630e5c",
"sha256": "5ef923bb0de45513e9b48b6065b12625b30d71502dd50ec826a2304b21a85f0e"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "8e13e650dbcf2d2e1fa5e49ea5630e5c",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.8",
"size": 25605413,
"upload_time": "2024-12-15T13:57:40",
"upload_time_iso_8601": "2024-12-15T13:57:40.953726Z",
"url": "https://files.pythonhosted.org/packages/67/01/63ad206570c918dc111c8598ba7319a845a0ad661a42cefb0aad6daa54df/cumm_cu118-0.7.11-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "41029e6f99ed2fe64928a0e423202ba63a22ae81e80ec85bb2c82f9bca4475aa",
"md5": "ccd87d15f4d051a9bb1cd36eec6536c4",
"sha256": "951b0b5060b03028c0adffca99f4641b3f6d2848be207738ade3bc705dd92be6"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp312-cp312-win_amd64.whl",
"has_sig": false,
"md5_digest": "ccd87d15f4d051a9bb1cd36eec6536c4",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.8",
"size": 1221205,
"upload_time": "2024-12-15T13:38:38",
"upload_time_iso_8601": "2024-12-15T13:38:38.867331Z",
"url": "https://files.pythonhosted.org/packages/41/02/9e6f99ed2fe64928a0e423202ba63a22ae81e80ec85bb2c82f9bca4475aa/cumm_cu118-0.7.11-cp312-cp312-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "e18bb550ddb512c0424d3e7f010e5b7bec56544938181871d08cd777a38ab398",
"md5": "e26297b3cf18f51271cc736862d46ad6",
"sha256": "515a937749ff9fafc9cb5fcb610e479bddcfe65fe99ba7cbe442659951afd17b"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "e26297b3cf18f51271cc736862d46ad6",
"packagetype": "bdist_wheel",
"python_version": "cp313",
"requires_python": ">=3.8",
"size": 25609378,
"upload_time": "2024-12-15T13:57:47",
"upload_time_iso_8601": "2024-12-15T13:57:47.154197Z",
"url": "https://files.pythonhosted.org/packages/e1/8b/b550ddb512c0424d3e7f010e5b7bec56544938181871d08cd777a38ab398/cumm_cu118-0.7.11-cp313-cp313-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f1ca7b793dfb776732da0e8474d27b54145e0e5b32dd3e462133b59fcc112c3a",
"md5": "695c39edd2ac9834e5468bd6720eb9d0",
"sha256": "86714aad909a063fd8b8cd18df41def4f4d12c4a978d91885f13fdeed23a7895"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp313-cp313-win_amd64.whl",
"has_sig": false,
"md5_digest": "695c39edd2ac9834e5468bd6720eb9d0",
"packagetype": "bdist_wheel",
"python_version": "cp313",
"requires_python": ">=3.8",
"size": 1221950,
"upload_time": "2024-12-15T13:45:30",
"upload_time_iso_8601": "2024-12-15T13:45:30.445978Z",
"url": "https://files.pythonhosted.org/packages/f1/ca/7b793dfb776732da0e8474d27b54145e0e5b32dd3e462133b59fcc112c3a/cumm_cu118-0.7.11-cp313-cp313-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "bba4e54063d04ae674729745f5e70ed22d09dbab3417c78d26c8bad5d0d0e2c9",
"md5": "56d2034f59db9af17674d1010d89c0e6",
"sha256": "f78953eac131dd9893761c22a6a510c546698696b477c6fa24c3c65bfa25889c"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "56d2034f59db9af17674d1010d89c0e6",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.8",
"size": 25603686,
"upload_time": "2024-12-15T13:57:54",
"upload_time_iso_8601": "2024-12-15T13:57:54.687818Z",
"url": "https://files.pythonhosted.org/packages/bb/a4/e54063d04ae674729745f5e70ed22d09dbab3417c78d26c8bad5d0d0e2c9/cumm_cu118-0.7.11-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "494607c717a74539a27a766e88697c9d6b53647023b7bddb462b2e1b749dacea",
"md5": "89a597eb0347afbbccd84864a85d028e",
"sha256": "db23842d1bef7beadb5a2064cf3324a50f02f5eca42d374d155499a3782622ff"
},
"downloads": -1,
"filename": "cumm_cu118-0.7.11-cp39-cp39-win_amd64.whl",
"has_sig": false,
"md5_digest": "89a597eb0347afbbccd84864a85d028e",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.8",
"size": 1220719,
"upload_time": "2024-12-15T13:46:12",
"upload_time_iso_8601": "2024-12-15T13:46:12.599887Z",
"url": "https://files.pythonhosted.org/packages/49/46/07c717a74539a27a766e88697c9d6b53647023b7bddb462b2e1b749dacea/cumm_cu118-0.7.11-cp39-cp39-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-12-15 13:57:27",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "FindDefinition",
"github_project": "cumm",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "cumm-cu118"
}