migperf


- Name: migperf
- Version: 0.0.1
- Home page: https://github.com/MLSysOps/MIGProfiler
- Summary: Multi-Instance-GPU profiling tool
- Upload time: 2023-01-08 07:52:26
- Author: Xing Di
- Keywords: benchmark, deep learning, mlops, neural networks
- Requirements: none recorded

# MIG Profiler

![GitHub](https://img.shields.io/github/license/MLSysOps/MIGProfiler)

MIGProfiler is a toolkit for benchmarking NVIDIA [MIG](https://www.nvidia.com/en-sg/technologies/multi-instance-gpu/) (Multi-Instance GPU) technology. It profiles multiple deep learning training and inference tasks on MIG GPUs.

MIGProfiler features:
- 🎨 Support for many deep learning tasks and open-source models across a variety of benchmark types
- 📈 **Comprehensive** benchmark results
- 🐣 **Easy-to-use** setup through a configuration file (WIP)

*The project is under rapid development! Please check our [benchmark website](#benchmark-website-) and join us!*

- [Benchmark Website](#benchmark-website-)
- [Install](#install-)
- [Quick Start](#quick-start-)
- [Cite Us](#cite-us-)
- [Contributors](#contributors-)
- [Acknowledgement](#acknowledgement)
- [License](#license)

## Benchmark Website 📈
 Coming soon!

## Install 📦️

### Manual install

Requirements:
- PyTorch with CUDA
- OpenCV
- Sanic
- Transformers
- Tqdm
- Prometheus client

```shell
# create virtual environment
conda create -n mig-perf python=3.8
conda activate mig-perf

# install required packages
conda install pytorch torchvision pytorch-cuda=11.6 -c pytorch -c nvidia
conda install -c conda-forge opencv
pip install transformers
pip install sanic tqdm prometheus_client
```
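After installing, a quick import check can confirm the environment is complete. The snippet below is a minimal sketch, not part of MIGProfiler: the module list is an assumption derived from the requirements above (`cv2` is the import name of OpenCV), and `find_missing` is a hypothetical helper name.

```python
import importlib.util

# Import names assumed from the requirements list above (cv2 is OpenCV's module name).
REQUIRED_MODULES = [
    "torch", "torchvision", "cv2", "sanic",
    "transformers", "tqdm", "prometheus_client",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages can be located; it does not verify that PyTorch was built with CUDA support.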

### PyPI install
WIP

### Use Docker
WIP

## Quick Start 🚚
You can easily profile workloads on a MIG GPU. Below are some common deep learning tasks to play with.
### 1. MIG training benchmark

We first create a `1g.10gb` MIG device:
```shell
# enable MIG
sudo nvidia-smi -i 0 -mig 1
# create MIG instance
sudo nvidia-smi mig -cgi 1g.10gb -C
```
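To confirm that the instance was created, the device listing from `nvidia-smi -L` can be parsed. The following is a rough sketch: the line layout matched by the regular expression is an assumption based on typical driver output and may differ between driver versions, and `parse_mig_devices` and `list_mig_devices` are hypothetical helper names.

```python
import re
import subprocess

# Assumed line layout, e.g. "  MIG 1g.10gb     Device  0: (UUID: MIG-...)".
MIG_LINE = re.compile(
    r"^\s*MIG\s+(?P<profile>\S+)\s+Device\s+(?P<index>\d+):\s+\(UUID:\s+(?P<uuid>[^)]+)\)"
)


def parse_mig_devices(listing):
    """Extract (profile, index, uuid) tuples from `nvidia-smi -L` text."""
    devices = []
    for line in listing.splitlines():
        match = MIG_LINE.match(line)
        if match:
            devices.append((match["profile"], int(match["index"]), match["uuid"]))
    return devices


def list_mig_devices():
    """Run `nvidia-smi -L` and parse its output (requires an NVIDIA driver)."""
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=True
    )
    return parse_mig_devices(result.stdout)
```

An empty list from `list_mig_devices()` would suggest that no MIG device is visible, in which case the creation step above is worth re-checking.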

Start the DCGM metric exporter:
```shell
docker run -d --rm --gpus all --net mig_perf -p 9400:9400  \
    -v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
    --name dcgm_exporter --cap-add SYS_ADMIN   nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
    -c 500 -f /etc/dcgm-exporter/customized.csv -d f
```
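Once the exporter container is running, its metrics can be read over HTTP in the Prometheus text format. The sketch below assumes the exporter serves `/metrics` on the port mapped above (9400) and uses only the standard library; the parser is deliberately simplified (it does not handle escaped quotes inside label values), and `parse_metrics` and `fetch_metrics` are hypothetical names rather than MIGProfiler functions.

```python
import re
import urllib.request

# One sample line in Prometheus text format: name{labels} value
SAMPLE_LINE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)"
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus exposition text into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = SAMPLE_LINE.match(line)
        if match:
            labels = dict(LABEL.findall(match["labels"] or ""))
            samples.append((match["name"], labels, float(match["value"])))
    return samples


def fetch_metrics(url="http://localhost:9400/metrics"):
    """Fetch and parse the exporter's metrics page (URL assumes the port mapping above)."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Which metric names appear depends on the CSV file mounted into the container, so filtering by name is best done after inspecting the actual output.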

Start profiling:
```shell
cd mig_perf/profiler
export PYTHONPATH=$PWD
python train/train_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
```
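To compare several batch sizes, the command above can be generated in a loop. This is a sketch under the assumption that `train/train_cv.py` accepts the same four flags for other values; the batch sizes are arbitrary examples, and `build_command` and `sweep` are hypothetical helpers, not part of the project.

```python
import subprocess
import sys


def build_command(bs, model="resnet50", num_batches=500, mig_device_id=0):
    """Build the train_cv.py command line using the flags shown above."""
    return [
        sys.executable, "train/train_cv.py",
        f"--bs={bs}",
        f"--model={model}",
        f"--num_batches={num_batches}",
        f"--mig-device-id={mig_device_id}",
    ]


def sweep(batch_sizes, dry_run=True):
    """Run (or, with dry_run, only print) one profiling job per batch size."""
    for bs in batch_sizes:
        command = build_command(bs)
        if dry_run:
            print(" ".join(command))
        else:
            # Expects the working directory and PYTHONPATH set as in the shell block above.
            subprocess.run(command, check=True)


if __name__ == "__main__":
    sweep([1, 8, 32])  # example batch sizes; printed without executing
```

Keeping `dry_run=True` by default makes it easy to review the generated commands before occupying the GPU.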

Remember to disable MIG after the benchmark finishes:
```shell
sudo nvidia-smi mig -i 0 -dci
sudo nvidia-smi mig -i 0 -dgi
sudo nvidia-smi -i 0 -mig 0
```
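Because forgetting the cleanup leaves the GPU in MIG mode, the setup and teardown steps can be wrapped so that teardown always runs. The context manager below is only a sketch: `mig_instance` and its `runner` parameter are hypothetical, and the instance commands are issued through the `nvidia-smi mig` subcommand, which is an assumption about the installed driver's CLI.

```python
import subprocess
from contextlib import contextmanager


def run(command):
    """Default runner: execute a command and raise if it fails."""
    subprocess.run(command, check=True)


@contextmanager
def mig_instance(gpu=0, profile="1g.10gb", runner=run):
    """Enable MIG and create one instance, then tear everything down on exit."""
    gpu = str(gpu)
    teardown = [
        # Compute instances are destroyed before their GPU instances.
        ["sudo", "nvidia-smi", "mig", "-i", gpu, "-dci"],
        ["sudo", "nvidia-smi", "mig", "-i", gpu, "-dgi"],
        ["sudo", "nvidia-smi", "-i", gpu, "-mig", "0"],
    ]
    runner(["sudo", "nvidia-smi", "-i", gpu, "-mig", "1"])
    try:
        runner(["sudo", "nvidia-smi", "mig", "-i", gpu, "-cgi", profile, "-C"])
        yield
    finally:
        for command in teardown:
            try:
                runner(command)
            except subprocess.CalledProcessError as error:
                # Keep going so that MIG mode is still switched off.
                print("Teardown step failed:", error)
```

A benchmark would then run inside `with mig_instance(): ...`. Passing a custom `runner` (for example, one that only records commands) allows a dry run without touching the GPU.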

### 2. MIG inference benchmark

Start the DCGM metric exporter:
```shell
docker run -d --rm --gpus all --net mig_perf -p 9400:9400  \
    -v "${PWD}/mig_perf/profiler/client/dcp-metrics-included.csv:/etc/dcgm-exporter/customized.csv" \
    --name dcgm_exporter --cap-add SYS_ADMIN   nvcr.io/nvidia/k8s/dcgm-exporter:2.4.7-2.6.11-ubuntu20.04 \
    -c 500 -f /etc/dcgm-exporter/customized.csv -d f
```

Start profiling:
```shell
cd mig_perf/profiler
export PYTHONPATH=$PWD
python client/block_infernece_cv.py --bs=32 --model=resnet50 --num_batches=500 --mig-device-id=0
```

See more benchmark experiments in [`./exp`](./exp).

### 3. Visualize

- [x] in notebook
- [ ] in Prometheus (under improvement)

## Cite Us 🌱

```bibtex
@article{zhang2022migperf,
  title={MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs},
  author={Zhang, Huaizheng and Li, Yuanming and Xiao, Wencong and Huang, Yizheng and Di, Xing and Yin, Jianxiong and See, Simon and Luo, Yong and Lau, Chiew Tong and You, Yang},
  journal={arXiv preprint arXiv:2301.00407},
  year={2023}
}
```

## Contributors 👥

- Yuanming Li
- Huaizheng Zhang
- Yizheng Huang
- Xing Di

## Acknowledgement
Special thanks to Aliyun and NVIDIA AI Tech Center for providing MIG GPU servers for benchmarking.

## License
This repository is released under the [MIT License](./LICENSE).

            
