# cumm
CUda Matrix Multiply library.
```cumm``` was developed while learning [CUTLASS](https://github.com/NVIDIA/cutlass), which relies so heavily on C++ templates that the code becomes unmaintainable. So I developed [pccm](https://github.com/FindDefinition/PCCM), which uses Python as a meta-programming language to replace C++ template meta-programming.
```pccm``` is now the foundational framework of ```cumm``` and of my other C++ projects such as [spconv](https://github.com/traveller59/spconv).
```cumm``` also contains a Python asyncio-based GEMM simulator that **shares the same meta program** with the CUDA code, which enables GEMM visualization and easier debugging.
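
To make the idea of replacing C++ templates with Python more concrete, here is a minimal sketch in plain Python. It does not use the pccm API: the function name `emit_tile_struct` and the generated struct are hypothetical and exist only for illustration. The point is the general pattern, where compile-time parameters become ordinary Python arguments and the specialized C++ source is produced as text.

```python
def emit_tile_struct(name: str, tile_m: int, tile_n: int, dtype: str = "float") -> str:
    """Return C++ source for a struct specialized on tile shape and element type.

    Hypothetical illustration only; this is not pccm's API.
    """
    if tile_m <= 0 or tile_n <= 0:
        raise ValueError("tile sizes must be positive")
    lines = [
        f"struct {name} {{",
        # Values a C++ template would take as parameters are baked in as constants.
        f"  static constexpr int kTileM = {tile_m};",
        f"  static constexpr int kTileN = {tile_n};",
        f"  using Element = {dtype};",
        "  Element accum[kTileM * kTileN];",
        "};",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_tile_struct("Tile64x32", 64, 32))
```

Because the generator is ordinary Python, the same parameters could in principle also drive a pure-Python model of the computation, which is the spirit of sharing one meta program between the CUDA code and a simulator.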
## BREAKING CHANGES
* 0.3.1: the tv::DType enum values changed. This affects all binary code that uses tv::Tensor, so you must recompile all such code when upgrading to cumm >= 0.3.1.
## News
* Ampere feature support (by [EvernightAurora](https://github.com/EvernightAurora))
## Install
### Prebuilt
We offer prebuilt binaries for Python 3.7-3.11 and CUDA 10.2/11.3/11.4/11.7/12.0 on Linux (manylinux) and Windows 10/11.
```pip install cumm``` for CPU-only
```pip install cumm-cu102``` for CUDA 10.2
```pip install cumm-cu113``` for CUDA 11.3
```pip install cumm-cu114``` for CUDA 11.4
```pip install cumm-cu117``` for CUDA 11.7
```pip install cumm-cu120``` for CUDA 12.0
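
If you select the package from a script, the list above can be expressed as a small lookup. This is only a sketch: the helper name `prebuilt_package` is hypothetical, and the table simply mirrors the package names listed in this section.

```python
from typing import Optional

# Prebuilt CUDA versions listed above, mapped to their pip package names.
PREBUILT = {
    "10.2": "cumm-cu102",
    "11.3": "cumm-cu113",
    "11.4": "cumm-cu114",
    "11.7": "cumm-cu117",
    "12.0": "cumm-cu120",
}


def prebuilt_package(cuda_version: Optional[str]) -> str:
    """Return the pip package name for a CUDA version, or "cumm" for CPU-only."""
    if cuda_version is None:
        return "cumm"
    try:
        return PREBUILT[cuda_version]
    except KeyError:
        raise ValueError(f"no prebuilt package listed for CUDA {cuda_version}") from None


if __name__ == "__main__":
    print("pip install " + prebuilt_package("11.4"))
```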
### Build from source for development (JIT, recommended for development)
**WARNING**: Use code from a [tagged release](https://github.com/FindDefinition/cumm/releases). Code in the main branch may contain bugs.
The C++ code is rebuilt automatically whenever you change C++ code in the project.
#### Linux
0. Uninstall any cumm installed by pip. Make sure ```pip list | grep cumm``` prints nothing.
1. Install build-essential and CUDA.
2. Run ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, then ```pip install -e .```
3. In Python, run ```import cumm``` and wait for the build to finish.
#### Windows
0. Uninstall any spconv and cumm installed by pip. Make sure ```pip list | grep cumm``` (or ```pip list | findstr cumm``` in a shell without grep) prints nothing.
1. Install Visual Studio 2019 or newer and make sure the C++ development component is installed. Install CUDA.
2. Set the [PowerShell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1).
3. Start a new PowerShell session and run ```tools/msvc_setup.ps1```.
4. Run ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```git checkout tags/<tag_name>```, then ```pip install -e .```
5. In Python, run ```import cumm``` and wait for the build to finish.
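
The "no pip-installed cumm" check in step 0 can also be done from Python, which behaves the same on Linux and Windows. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper names `conflicting` and `installed_names` are hypothetical. It treats every `cumm`/`cumm-cuXXX` and `spconv`/`spconv-cuXXX` distribution as a conflict.

```python
from importlib import metadata
from typing import Iterable, List


def conflicting(names: Iterable[str]) -> List[str]:
    """Return the names that are cumm or spconv distributions (any CUDA variant)."""
    hits = []
    for name in names:
        # Treat "cumm_cu114" and "cumm-cu114" the same way.
        normalized = name.lower().replace("_", "-")
        if normalized.split("-")[0] in ("cumm", "spconv"):
            hits.append(name)
    return sorted(hits)


def installed_names() -> List[str]:
    """List the names of all distributions visible to this interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip broken installs without a Name field
            names.append(name)
    return names


if __name__ == "__main__":
    found = conflicting(installed_names())
    print("uninstall first: " + ", ".join(found) if found else "no conflicting packages")
```

Run it with the same interpreter you will use for ```pip install -e .```, since each Python environment has its own set of installed packages.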
### Build wheel from source
**WARNING**: Use code from a [tagged release](https://github.com/FindDefinition/cumm/releases). Code in the main branch may contain bugs.
**WARNING**: If ```CUMM_CUDA_VERSION``` is set to a CUDA version, the following steps create a wheel named "cumm-cuxxx", not "cumm". Any project that depends on cumm must then list ```cumm-cuxxx``` as its dependency, not ```cumm```. If ```CUMM_CUDA_VERSION``` isn't set, ```cumm``` is always built with CUDA, so CUDA must exist on your system, and the wheel is named ```cumm``` even though it is built with CUDA.
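
The naming rule is easy to get wrong, so here is a small sketch of it as a function. It is not the logic from cumm's ```setup.py```, only a restatement of the rules in this README. The helper name `wheel_name` is hypothetical, and the name used for the CPU-only case (empty value, described in the build steps below) is an assumption based on ```pip install cumm``` being the CPU-only package.

```python
from typing import Mapping, Tuple


def wheel_name(env: Mapping[str, str]) -> Tuple[str, bool]:
    """Return (wheel name, built_with_cuda) for a given environment mapping."""
    value = env.get("CUMM_CUDA_VERSION")
    if value is None:
        # Variable not set: always built with CUDA, but still named "cumm".
        return "cumm", True
    if value == "":
        # Empty string: CPU-only build (the name "cumm" here is an assumption).
        return "cumm", False
    # Both "11.4" and "114" appear in this README; normalize to digits only.
    return "cumm-cu" + value.replace(".", ""), True


if __name__ == "__main__":
    import os

    print(wheel_name(os.environ))
```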
#### Linux
We recommend building Linux packages in the [official build docker](https://github.com/FindDefinition/cumm/blob/main/.github/workflows/build.yaml). Building with CUDA support doesn't require a real GPU.
##### Build in Official Docker
1. Select a CUDA version. Available: CUDA 11.1, 11.3, 11.4, 11.5, 12.0.
2. (Example for CUDA 11.4) ```git clone https://github.com/FindDefinition/cumm```, ```cd ./cumm```, ```docker run --rm -e PLAT=manylinux2014_x86_64 -e CUMM_CUDA_VERSION=114 -v `pwd`:/io scrin/manylinux2014-cuda:cu114-devel-1.0.0 bash -c "source /etc/bashrc && /io/tools/build-wheels.sh"```
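
If you drive the Docker build from a script, the same command can be assembled as an argument list, which avoids shell quoting problems. This is a sketch only: the helper name `docker_build_command` is hypothetical, and the image name is passed in by the caller because only the CUDA 11.4 image tag is given in this README.

```python
import shlex
from pathlib import Path
from typing import List


def docker_build_command(repo_dir: str, cuda_tag: str, image: str) -> List[str]:
    """Build the argument list for the wheel-building container."""
    repo = Path(repo_dir).resolve()  # replaces `pwd` in the shell version
    return [
        "docker", "run", "--rm",
        "-e", "PLAT=manylinux2014_x86_64",
        "-e", f"CUMM_CUDA_VERSION={cuda_tag}",
        "-v", f"{repo}:/io",
        image,
        "bash", "-c", "source /etc/bashrc && /io/tools/build-wheels.sh",
    ]


if __name__ == "__main__":
    cmd = docker_build_command(".", "114", "scrin/manylinux2014-cuda:cu114-devel-1.0.0")
    print(shlex.join(cmd))
```

To actually start the build from the cloned repository, pass the list to `subprocess.run(cmd, check=True)`.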
##### Build in your environment
1. Install build-essential and CUDA.
2. Set the environment variable for the installed CUDA version, for example ```export CUMM_CUDA_VERSION="11.4"```. To build CPU-only, run ```export CUMM_CUDA_VERSION=""```. If ```CUMM_CUDA_VERSION``` isn't set, you need to ensure the CUDA libraries are in the OS search path, and the built wheel will be named ```cumm```; otherwise it will be named ```cumm-cuxxx```.
3. Run ```export CUMM_DISABLE_JIT="1"```.
4. Run ```python setup.py bdist_wheel```, then ```pip install dist/xxx.whl```.
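
The steps above can also be scripted so the variables are set only for the build process and not for your whole shell session. A minimal sketch, with hypothetical helper names `wheel_build_env` and `run_wheel_build`; it assumes it is run from the repository root with the Python interpreter you want to build for.

```python
import os
import subprocess
import sys
from typing import Dict, Mapping, Optional


def wheel_build_env(base: Mapping[str, str], cuda_version: Optional[str]) -> Dict[str, str]:
    """Copy `base` and apply the build variables described in the steps above.

    cuda_version: "11.4" style string, "" for CPU-only, None to leave it unset.
    """
    env = dict(base)
    env["CUMM_DISABLE_JIT"] = "1"
    if cuda_version is None:
        env.pop("CUMM_CUDA_VERSION", None)
    else:
        env["CUMM_CUDA_VERSION"] = cuda_version
    return env


def run_wheel_build(cuda_version: Optional[str]) -> None:
    """Run `python setup.py bdist_wheel` in the current directory."""
    env = wheel_build_env(os.environ, cuda_version)
    subprocess.run([sys.executable, "setup.py", "bdist_wheel"], env=env, check=True)
```

For example, `run_wheel_build("11.4")` builds against CUDA 11.4 and `run_wheel_build("")` requests a CPU-only build.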
#### Windows 10/11
1. Install Visual Studio 2019 or newer and make sure the C++ development package is installed. Install CUDA.
2. Set the [PowerShell script execution policy](https://docs.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_execution_policies?view=powershell-7.1).
3. Start a new PowerShell session and run ```tools/msvc_setup.ps1```.
4. Set the environment variable for the installed CUDA version, for example ```$Env:CUMM_CUDA_VERSION = "11.4"```. To build CPU-only, run ```$Env:CUMM_CUDA_VERSION = ""```. If ```CUMM_CUDA_VERSION``` isn't set, you need to ensure the CUDA libraries are in the OS search path, and the built wheel will be named ```cumm```; otherwise it will be named ```cumm-cuxxx```.
5. Run ```$Env:CUMM_DISABLE_JIT = "1"```.
6. Run ```python setup.py bdist_wheel```, then ```pip install dist/xxx.whl```.
## Contributors
* [EvernightAurora](https://github.com/EvernightAurora): added Ampere feature support.
## Note
This work was done while the author was an employee at [Tusimple](https://www.tusimple.com/).
## LICENSE
Apache 2.0
Raw data
{
"_id": null,
"home_page": "https://github.com/FindDefinition/cumm",
"name": "cumm-cu120",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.8",
"maintainer_email": null,
"keywords": null,
"author": "Yan Yan",
"author_email": "yanyan.sub@outlook.com",
"download_url": null,
"platform": null,
"bugtrack_url": null,
"license": "MIT",
"summary": "CUda Matrix Multiply library",
"version": "0.6.3",
"project_urls": {
"Homepage": "https://github.com/FindDefinition/cumm"
},
"split_keywords": [],
"urls": [
{
"comment_text": "",
"digests": {
"blake2b_256": "24e9e44d1c4e52ebbc130bc439b562a212efa409dd22c62ae55a3ad7061395cb",
"md5": "6ae910c66c05ffeea524807915ed90fa",
"sha256": "a44a0e4443acfa6748bac834f74fab60e8754bd623fa9b6c0bedec9902134afa"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "6ae910c66c05ffeea524807915ed90fa",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 26261620,
"upload_time": "2024-08-18T08:58:46",
"upload_time_iso_8601": "2024-08-18T08:58:46.285893Z",
"url": "https://files.pythonhosted.org/packages/24/e9/e44d1c4e52ebbc130bc439b562a212efa409dd22c62ae55a3ad7061395cb/cumm_cu120-0.6.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "c2fa438abc6e0139cd9860895359ccd3476495b8744875df7e144157dc77446a",
"md5": "5def9d2c80b4012cb5ccb2bcc897c56f",
"sha256": "714586bd2b9329eba163302c0d34d417b6a79792aa6be9737adea615e3e681a9"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp310-cp310-win_amd64.whl",
"has_sig": false,
"md5_digest": "5def9d2c80b4012cb5ccb2bcc897c56f",
"packagetype": "bdist_wheel",
"python_version": "cp310",
"requires_python": ">=3.8",
"size": 1206556,
"upload_time": "2024-08-18T08:45:33",
"upload_time_iso_8601": "2024-08-18T08:45:33.076623Z",
"url": "https://files.pythonhosted.org/packages/c2/fa/438abc6e0139cd9860895359ccd3476495b8744875df7e144157dc77446a/cumm_cu120-0.6.3-cp310-cp310-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "0466915d405fc411bd14d5405b537034788ddc087516d766b6767a2bbba59c80",
"md5": "4b25ba73716051fa1318021e42532297",
"sha256": "ebfb6a8c72d81c5376212cfebd98c05cad0e9d325d956d7aca6307a09f1bddd4"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "4b25ba73716051fa1318021e42532297",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 26269245,
"upload_time": "2024-08-18T08:58:50",
"upload_time_iso_8601": "2024-08-18T08:58:50.310803Z",
"url": "https://files.pythonhosted.org/packages/04/66/915d405fc411bd14d5405b537034788ddc087516d766b6767a2bbba59c80/cumm_cu120-0.6.3-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "30fb4b9bb764e00c9d65b30978b104f43676ce55e50ce09fbb549612dbdd325a",
"md5": "232a1c30094e94783285511f228a6028",
"sha256": "4ea2995e72f3280d6cad34ea6b0dfa303bd1f8f111edd38735fa9cb209f15e92"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp311-cp311-win_amd64.whl",
"has_sig": false,
"md5_digest": "232a1c30094e94783285511f228a6028",
"packagetype": "bdist_wheel",
"python_version": "cp311",
"requires_python": ">=3.8",
"size": 1208021,
"upload_time": "2024-08-18T08:47:27",
"upload_time_iso_8601": "2024-08-18T08:47:27.727978Z",
"url": "https://files.pythonhosted.org/packages/30/fb/4b9bb764e00c9d65b30978b104f43676ce55e50ce09fbb549612dbdd325a/cumm_cu120-0.6.3-cp311-cp311-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "cede7e6556d7f2c04c2a317f4c15dd2b9e83056ba1dba643b48b72f4e9beff7c",
"md5": "f9d34ec924b9926c39b9a6c89b3b5ee4",
"sha256": "4e7c5911ccc99855154f5f2e1dd8602464769847232f8fcef3110b8ab2f5abd4"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "f9d34ec924b9926c39b9a6c89b3b5ee4",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.8",
"size": 26271963,
"upload_time": "2024-08-18T08:58:55",
"upload_time_iso_8601": "2024-08-18T08:58:55.109019Z",
"url": "https://files.pythonhosted.org/packages/ce/de/7e6556d7f2c04c2a317f4c15dd2b9e83056ba1dba643b48b72f4e9beff7c/cumm_cu120-0.6.3-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "5ca75cf6f4feb41de4fd376b484ed9d41aee69747efbc59ca16ac60548154808",
"md5": "22ed57a332c81717f596235d4d90b3bb",
"sha256": "c81bdf9acf9db19582c29fff95140c2f799c266d0081c958b836a2d3ef1bf290"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp312-cp312-win_amd64.whl",
"has_sig": false,
"md5_digest": "22ed57a332c81717f596235d4d90b3bb",
"packagetype": "bdist_wheel",
"python_version": "cp312",
"requires_python": ">=3.8",
"size": 1205208,
"upload_time": "2024-08-18T08:47:02",
"upload_time_iso_8601": "2024-08-18T08:47:02.684701Z",
"url": "https://files.pythonhosted.org/packages/5c/a7/5cf6f4feb41de4fd376b484ed9d41aee69747efbc59ca16ac60548154808/cumm_cu120-0.6.3-cp312-cp312-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "f883cb21c01623255259dc5f29754352cacdc6828220f0cf6290b5c97e386287",
"md5": "1555de8fcfcf59969b6682f6a3dcf074",
"sha256": "056b69985e0c994a3ea2db473391848ecb7ade13d74d2a0e7252dd25494dab41"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "1555de8fcfcf59969b6682f6a3dcf074",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 26264115,
"upload_time": "2024-08-18T08:58:59",
"upload_time_iso_8601": "2024-08-18T08:58:59.182426Z",
"url": "https://files.pythonhosted.org/packages/f8/83/cb21c01623255259dc5f29754352cacdc6828220f0cf6290b5c97e386287/cumm_cu120-0.6.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "028d613fd5dc21158f67b144180aa206e8b048bcf9a4a159fe4ab0abb3130c28",
"md5": "0f70e122e8d5160d1858e718c6514a5f",
"sha256": "c242e5767ad710cecfee968ee8862fe3d6a9a8566fdc93bf7140e8e60d82d439"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp38-cp38-win_amd64.whl",
"has_sig": false,
"md5_digest": "0f70e122e8d5160d1858e718c6514a5f",
"packagetype": "bdist_wheel",
"python_version": "cp38",
"requires_python": ">=3.8",
"size": 1206487,
"upload_time": "2024-08-18T08:47:07",
"upload_time_iso_8601": "2024-08-18T08:47:07.831755Z",
"url": "https://files.pythonhosted.org/packages/02/8d/613fd5dc21158f67b144180aa206e8b048bcf9a4a159fe4ab0abb3130c28/cumm_cu120-0.6.3-cp38-cp38-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "977163cc3c8faa49a3e02221eccb52b83c5e21930c71bd0357b2bf95a78e1352",
"md5": "60084a2f899f1c095cf887ca3e7e74d8",
"sha256": "20ff40fc91a4165ae4891a4eb231491c4298b9b573dd80ab788bb2c75b6dcc47"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"has_sig": false,
"md5_digest": "60084a2f899f1c095cf887ca3e7e74d8",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.8",
"size": 26263438,
"upload_time": "2024-08-18T08:59:04",
"upload_time_iso_8601": "2024-08-18T08:59:04.171904Z",
"url": "https://files.pythonhosted.org/packages/97/71/63cc3c8faa49a3e02221eccb52b83c5e21930c71bd0357b2bf95a78e1352/cumm_cu120-0.6.3-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": "",
"digests": {
"blake2b_256": "118d254629026807910edd49cded30b4756574b6f245b0beefe1e0c7f2c1e57a",
"md5": "cc123e677a497f580072c98f5b0ad8d9",
"sha256": "aea9583f74c0cb0c907d65c33c4f58dcc65dadd9fc11f03c233e61f1667ae5e6"
},
"downloads": -1,
"filename": "cumm_cu120-0.6.3-cp39-cp39-win_amd64.whl",
"has_sig": false,
"md5_digest": "cc123e677a497f580072c98f5b0ad8d9",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.8",
"size": 1206567,
"upload_time": "2024-08-18T08:46:10",
"upload_time_iso_8601": "2024-08-18T08:46:10.824958Z",
"url": "https://files.pythonhosted.org/packages/11/8d/254629026807910edd49cded30b4756574b6f245b0beefe1e0c7f2c1e57a/cumm_cu120-0.6.3-cp39-cp39-win_amd64.whl",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2024-08-18 08:58:46",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "FindDefinition",
"github_project": "cumm",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "cumm-cu120"
}