# MIG Profiler
![GitHub](https://img.shields.io/github/license/MLSysOps/MIGProfiler)
MIGProfiler is a toolkit for benchmarking NVIDIA [MIG](https://www.nvidia.com/en-sg/technologies/multi-instance-gpu/) (Multi-Instance GPU). It profiles a range of deep learning training and inference tasks on MIG-enabled GPUs.
MIGProfiler features:
- 🎨 Support for many deep learning tasks and open-source models across a variety of benchmark types
- 📈 Present **comprehensive** benchmark results
- 🐣 **Easy to use** with a configuration file (WIP)
*The project is under rapid development! Please check our [benchmark website](#benchmark-website-) and join us!*
- [Benchmark Website](#benchmark-website-)
- [Install](#install-)
- [Quick Start](#quick-start-)
- [Cite Us](#cite-us-)
- [Contributors](#contributors-)
- [Acknowledgement](#acknowledgement)
- [License](#license)
## Benchmark Website 📈
Coming soon!
## Install 📦️
### Manual install
Requirements:
- PyTorch with CUDA
- OpenCV
- Sanic
- Transformers
- Tqdm
- Prometheus client
```shell
# create virtual environment
conda create -n mig-perf python=3.8
conda activate mig-perf
# install required packages
conda install pytorch torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
conda install -c conda-forge opencv
pip install transformers
pip install sanic tqdm prometheus_client
```
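To confirm that the environment is complete before profiling, a small check along these lines may help. It is a sketch rather than part of MIGProfiler: the helper name `missing_modules` is hypothetical, and the list assumes the usual import names for the packages installed above (for example, `cv2` for OpenCV).

```python
import importlib.util

# Import names for the packages installed above (cv2 is the OpenCV module).
REQUIRED_MODULES = ["torch", "torchvision", "cv2", "transformers",
                    "sanic", "tqdm", "prometheus_client"]


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

Note that this only checks that the modules can be located; it does not verify that PyTorch was built with CUDA support.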
### PyPI install
WIP
### Use Docker
WIP
## Quick Start 🚚
MIGProfiler makes it easy to profile workloads on a MIG GPU. Below are some common deep learning tasks to try.
### 1. MIG training benchmark
First, create a `1g.10gb` MIG device:
```shell
# enable MIG
sudo nvidia-smi -i 0 -mig 1
# create MIG instance
sudo nvidia-smi mig -cgi 1g.10gb -C
```
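The profile name passed to `-cgi` encodes the size of the instance: `1g.10gb` requests one compute slice and 10 GB of GPU memory. The helper below is a hypothetical illustration of that naming scheme, not a MIGProfiler API. It handles only the plain `<N>g.<M>gb` form, and which profiles are actually available depends on the GPU model (`nvidia-smi mig -lgip` lists them).

```python
import re


def parse_mig_profile(profile):
    """Split a MIG profile name such as '1g.10gb' into (compute_slices, memory_gb)."""
    match = re.fullmatch(r"(\d+)g\.(\d+)gb", profile)
    if match is None:
        raise ValueError(f"unrecognized MIG profile name: {profile!r}")
    return int(match.group(1)), int(match.group(2))


print(parse_mig_profile("1g.10gb"))  # (1, 10)
```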
Start the DCGM metrics exporter:
```shell
docker run -d --rm --gpus all --net mig_perf -p 9400:9400 \
-v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
--name dcgm_exporter --cap-add SYS_ADMIN nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
-c 500 -f /etc/dcgm-exporter/customized.csv -d f
```
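Once the exporter is running, its metrics should be readable in Prometheus text format on the published port (9400 in the command above). The sketch below fetches and parses them with the standard library only. It rests on a few assumptions: the exporter is reachable from the host at `localhost`, it serves the conventional `/metrics` path, and samples carry no trailing timestamp. The function names are hypothetical.

```python
import urllib.request


def parse_prometheus_text(text):
    """Parse Prometheus text exposition lines into {series: value}.

    Simplified: comment lines are skipped and samples are assumed to have
    no trailing timestamp.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Everything before the last space is the metric name plus labels.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url="http://localhost:9400/metrics"):
    """Fetch and parse the exporter's metrics (the URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_prometheus_text(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` returns a dictionary keyed by metric name and label set, which is convenient for a quick check that the MIG device is being reported before starting a long run.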
Start profiling:
```shell
cd mig_perf/profiler
export PYTHONPATH=$PWD
python train/train_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
```
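To compare several batch sizes, the same command can be generated in a loop. The sketch below only assembles command lines from the flags shown above; the helper name `build_train_cmd` and the list of batch sizes are hypothetical, and it assumes it is run from `mig_perf/profiler` with `PYTHONPATH` set as above.

```python
import shlex
import sys


def build_train_cmd(bs, model="resnet50", num_batches=500, mig_device_id=0):
    """Assemble the train_cv.py command line using the flags shown above."""
    return [
        sys.executable, "train/train_cv.py",
        f"--bs={bs}",
        f"--model={model}",
        f"--num_batches={num_batches}",
        f"--mig-device-id={mig_device_id}",
    ]


for bs in (8, 16, 32):  # example batch sizes, not a recommendation
    print(shlex.join(build_train_cmd(bs)))
    # To actually launch a run: subprocess.run(build_train_cmd(bs), check=True)
```

Printing the commands first makes it easy to review a sweep before committing GPU time to it.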
Remember to disable MIG after the benchmark finishes:
```shell
# destroy compute instances, then GPU instances, before turning MIG mode off
sudo nvidia-smi mig -i 0 -dci
sudo nvidia-smi mig -i 0 -dgi
sudo nvidia-smi -i 0 -mig 0
```
### 2. MIG inference benchmark
Start the DCGM metrics exporter, if it is not already running:
```shell
docker run -d --rm --gpus all --net mig_perf -p 9400:9400 \
-v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
--name dcgm_exporter --cap-add SYS_ADMIN nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
-c 500 -f /etc/dcgm-exporter/customized.csv -d f
```
Start profiling:
```shell
cd mig_perf/profiler
export PYTHONPATH=$PWD
python client/block_infernece_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
```
See more benchmark experiments in [`./exp`](./exp).
### 3. Visualize
- [x] in notebook
- [ ] in Prometheus (under improvement)
## Cite Us 🌱
```bibtex
@article{zhang2022migperf,
title={MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs},
author={Zhang, Huaizheng and Li, Yuanming and Xiao, Wencong and Huang, Yizheng and Di, Xing and Yin, Jianxiong and See, Simon and Luo, Yong and Lau, Chiew Tong and You, Yang},
journal={arXiv preprint arXiv:2301.00407},
year={2023}
}
```
## Contributors 👥
- Yuanming Li
- Huaizheng Zhang
- Yizheng Huang
- Xing Di
## Acknowledgement
Special thanks to Aliyun and NVIDIA AI Tech Center for providing the MIG GPU server used for benchmarking.
## License
This repository is released under the [MIT License](./LICENSE).