MIG-Profiler


- Name: MIG-Profiler
- Version: 0.0.3
- Home page: https://github.com/MLSysOps/MIGProfiler
- Summary: Multi-Instance-GPU profiling tool
- Upload time: 2023-01-08 07:10:18
- Author: Yizheng Huang
- Keywords: benchmark, deep learning, mlops, neural networks
- Requirements: No requirements were recorded.

# MIG Profiler

![GitHub](https://img.shields.io/github/license/MLSysOps/MIGProfiler)

MIGProfiler is a toolkit for benchmarking NVIDIA [MIG](https://www.nvidia.com/en-sg/technologies/multi-instance-gpu/) (Multi-Instance GPU) technology. It profiles a range of deep learning training and inference tasks on MIG GPUs.

MIGProfiler offers:
- 🎨 Support for many deep learning tasks and open-source models across a variety of benchmark types
- 📈 **Comprehensive** benchmark results
- 🐣 **Easy-to-use** setup through a configuration file (WIP)

*The project is under rapid development! Please check our [benchmark website](#benchmark-website-) and join us!*

- [Benchmark Website](#benchmark-website-)
- [Install](#install-)
- [Quick Start](#quick-start-)
- [Cite Us](#cite-us-)
- [Contributors](#contributors-)
- [Acknowledgement](#acknowledgement)
- [License](#license)

## Benchmark Website 📈
 Coming soon!

## Install 📦️

### Manual install

Requirements:
- PyTorch with CUDA
- OpenCV
- Sanic
- Transformers
- Tqdm
- Prometheus client

```shell
# create virtual environment
conda create -n mig-perf python=3.8
conda activate mig-perf

# install required packages
conda install pytorch torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
conda install -c conda-forge opencv
pip install transformers
pip install sanic tqdm prometheus_client
```
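
As a quick sanity check after installation, the sketch below reports which of the listed requirements cannot be imported from the active environment. The mapping from package name to import name (for example `cv2` for OpenCV) and the helper `find_missing` are illustrative and not part of MIGProfiler itself.

```python
import importlib.util

# Import names for the packages listed above (OpenCV is imported as `cv2`).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "torchvision": "torchvision",
    "OpenCV": "cv2",
    "Transformers": "transformers",
    "Sanic": "sanic",
    "tqdm": "tqdm",
    "Prometheus client": "prometheus_client",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All requirements found.")
```

This only checks that the modules are importable. It does not verify that PyTorch was built with CUDA support; `torch.cuda.is_available()` can be used for that once PyTorch is installed.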

### PyPI install
WIP

### Use Docker
WIP

## Quick Start 🚚
You can easily profile workloads on a MIG GPU. Below are some common deep learning tasks to try.
### 1. MIG training benchmark

First, create a `1g.10gb` MIG device:
```shell
# enable MIG
sudo nvidia-smi -i 0 -mig 1
# create MIG instance
sudo nvidia-smi mig -cgi 1g.10gb -C
```
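
To confirm that the MIG device was created, one option is to parse the output of `nvidia-smi -L`. The sketch below assumes the listing prints each MIG device on a line of the form `MIG <profile> Device <index>: (UUID: MIG-...)`; the exact layout may differ between driver versions, and the helper names are hypothetical.

```python
import re
import subprocess

# Assumed line format: "  MIG 1g.10gb     Device  0: (UUID: MIG-...)".
MIG_LINE = re.compile(
    r"^\s*MIG\s+(?P<profile>\S+)\s+Device\s+(?P<index>\d+):"
    r"\s+\(UUID:\s+(?P<uuid>[^)]+)\)"
)


def parse_mig_devices(listing):
    """Extract MIG devices from the text printed by `nvidia-smi -L`."""
    devices = []
    for line in listing.splitlines():
        match = MIG_LINE.match(line)
        if match:
            devices.append(
                {
                    "profile": match["profile"],
                    "index": int(match["index"]),
                    "uuid": match["uuid"],
                }
            )
    return devices


def list_mig_devices():
    """Run `nvidia-smi -L` and return the MIG devices it reports."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_mig_devices(result.stdout)
```

An empty list from `list_mig_devices()` would suggest that MIG mode is off or that no instance has been created yet.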

Start the DCGM metrics exporter:
```shell
docker run -d --rm --gpus all --net mig_perf -p 9400:9400  \
    -v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
    --name dcgm_exporter --cap-add SYS_ADMIN   nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
    -c 500 -f /etc/dcgm-exporter/customized.csv -d f
```
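
Once the exporter is running, its metrics can be read over HTTP in the Prometheus text format. The sketch below is a minimal parser using only the standard library. It assumes the exporter serves `/metrics` on the published port 9400, that samples carry no timestamps, and that label values contain no commas. Which metrics appear depends on the CSV file mounted above.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        head, _, value = line.rpartition(" ")
        name, _, rest = head.partition("{")
        labels = {}
        if rest:
            # Simplification: assumes no commas inside label values.
            for pair in rest.rstrip("}").split(","):
                if "=" in pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:9400/metrics", timeout=5.0):
    """Fetch and parse metrics from the exporter (URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

For anything beyond a quick check, the parser shipped with the `prometheus_client` package handles the full exposition format more robustly.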

Start profiling:
```shell
cd mig_perf/profiler
export PYTHONPATH=$PWD
python train/train_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
```
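
To repeat the run over several settings, the command line above can be generated programmatically. The sketch below only builds the argument lists from the flags shown in the command (`--bs`, `--model`, `--num_batches`, `--mig-device-id`). The helper `build_train_commands` and the extra batch sizes are illustrative, and which values the script accepts beyond those shown is not confirmed here.

```python
import itertools
import shlex


def build_train_commands(batch_sizes, models, num_batches=500, mig_device_id=0):
    """Build one train_cv.py command per (model, batch size) combination."""
    commands = []
    for model, batch_size in itertools.product(models, batch_sizes):
        commands.append(
            [
                "python",
                "train/train_cv.py",
                f"--bs={batch_size}",
                f"--model={model}",
                f"--num_batches={num_batches}",
                f"--mig-device-id={mig_device_id}",
            ]
        )
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be reviewed first.
    for command in build_train_commands([8, 16, 32], ["resnet50"]):
        print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from within `mig_perf/profiler` with `PYTHONPATH` set as above.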

Remember to disable MIG after the benchmark finishes:
```shell
# destroy compute instances, then GPU instances, then disable MIG mode
sudo nvidia-smi mig -i 0 -dci
sudo nvidia-smi mig -i 0 -dgi
sudo nvidia-smi -i 0 -mig 0
```
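
Because teardown is easy to forget, setup and cleanup can be wrapped in a context manager so that cleanup runs even when the benchmark raises an error. This is a sketch rather than part of MIGProfiler: the `mig_session` helper is hypothetical, and the teardown commands assume the `nvidia-smi mig` subcommand form for destroying instances, which is worth checking against `nvidia-smi mig -h` on your driver.

```python
import subprocess
from contextlib import contextmanager

# Setup mirrors the commands from the training benchmark above.
SETUP = [
    ["sudo", "nvidia-smi", "-i", "0", "-mig", "1"],
    ["sudo", "nvidia-smi", "mig", "-cgi", "1g.10gb", "-C"],
]
# Assumed teardown order: compute instances, GPU instances, then MIG mode.
TEARDOWN = [
    ["sudo", "nvidia-smi", "mig", "-i", "0", "-dci"],
    ["sudo", "nvidia-smi", "mig", "-i", "0", "-dgi"],
    ["sudo", "nvidia-smi", "-i", "0", "-mig", "0"],
]


def run_checked(command):
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


@contextmanager
def mig_session(setup=SETUP, teardown=TEARDOWN, runner=run_checked):
    """Run setup commands, yield, then run teardown commands on exit."""
    for command in setup:
        runner(command)
    try:
        yield
    finally:
        for command in teardown:
            runner(command)
```

A benchmark would then run inside `with mig_session(): ...`. Passing a different `runner`, such as `print`, gives a dry run that shows the commands without executing them.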

### 2. MIG inference benchmark

Start the DCGM metrics exporter:
```shell
docker run -d --rm --gpus all --net mig_perf -p 9400:9400  \
    -v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
    --name dcgm_exporter --cap-add SYS_ADMIN   nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
    -c 500 -f /etc/dcgm-exporter/customized.csv -d f
```

Start profiling:
```shell
cd mig_perf/profiler
export PYTHONPATH=$PWD
python client/block_infernece_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
```

See more benchmark experiments in [`./exp`](./exp).

### 3. Visualize

- [x] in notebook
- [ ] in Prometheus (under improvement)

## Cite Us 🌱

```bibtex
@article{zhang2022migperf,
  title={MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs},
  author={Zhang, Huaizheng and Li, Yuanming and Xiao, Wencong and Huang, Yizheng and Di, Xing and Yin, Jianxiong and See, Simon and Luo, Yong and Lau, Chiew Tong and You, Yang},
  journal={arXiv preprint arXiv:2301.00407},
  year={2023}
}
```

## Contributors 👥

- Yuanming Li
- Huaizheng Zhang
- Yizheng Huang
- Xing Di

## Acknowledgement
Special thanks to Aliyun and NVIDIA AI Tech Center for providing the MIG GPU server used for benchmarking.

## License
This repository is open-sourced under the [MIT License](./LICENSE).

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/MLSysOps/MIGProfiler",
    "name": "MIG-Profiler",
    "maintainer": "",
    "docs_url": null,
    "requires_python": "",
    "maintainer_email": "",
    "keywords": "benchmark,deep learning,MLOps,neural networks",
    "author": "Yizheng Huang",
    "author_email": "huangyz0918@gmail.com",
    "download_url": "https://files.pythonhosted.org/packages/03/ae/30d20fa3e467a3e9cc0c9f4cf27828ca332ab8217e07e258722a5eee2ab0/MIG%20Profiler-0.0.3.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "",
    "summary": "Multi-Instance-GPU profiling tool",
    "version": "0.0.3",
    "split_keywords": [
        "benchmark",
        "deep learning",
        "mlops",
        "neural networks"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d429a1dafb99a1574fdacbd5b2acc06607d8e4bbdf547a70beee93ad5e5caa71",
                "md5": "55422aa9c6cca8d183bda9d80f04b0d9",
                "sha256": "ca929e00a9e4240189fb8300176e40f919b539e25046142525400afd1abcf038"
            },
            "downloads": -1,
            "filename": "MIG_Profiler-0.0.3-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "55422aa9c6cca8d183bda9d80f04b0d9",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 20516,
            "upload_time": "2023-01-08T07:10:16",
            "upload_time_iso_8601": "2023-01-08T07:10:16.952443Z",
            "url": "https://files.pythonhosted.org/packages/d4/29/a1dafb99a1574fdacbd5b2acc06607d8e4bbdf547a70beee93ad5e5caa71/MIG_Profiler-0.0.3-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "03ae30d20fa3e467a3e9cc0c9f4cf27828ca332ab8217e07e258722a5eee2ab0",
                "md5": "367547d8b98fc7deb2ba564460ce62b1",
                "sha256": "66ffefb42460058574d2f9c5dc224daf48597285817f09057169e087111b780c"
            },
            "downloads": -1,
            "filename": "MIG Profiler-0.0.3.tar.gz",
            "has_sig": false,
            "md5_digest": "367547d8b98fc7deb2ba564460ce62b1",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 16514,
            "upload_time": "2023-01-08T07:10:18",
            "upload_time_iso_8601": "2023-01-08T07:10:18.831614Z",
            "url": "https://files.pythonhosted.org/packages/03/ae/30d20fa3e467a3e9cc0c9f4cf27828ca332ab8217e07e258722a5eee2ab0/MIG%20Profiler-0.0.3.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-01-08 07:10:18",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "github_user": "MLSysOps",
    "github_project": "MIGProfiler",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "requirements": [],
    "lcname": "mig-profiler"
}
        