# simple-cocotools
A simple, modern alternative to `pycocotools`.
## About
Why not just use [Pycocotools](https://github.com/cocodataset/cocoapi/tree/master/PythonAPI/pycocotools)?
* Code is more readable and hackable.
* Metrics are more transparent and understandable.
* Evaluation is fast.
* Only dependencies are `numpy` and `scipy`. No `cython` extensions.
* Code is more modern (type annotations, linting, etc).
## Install
### From PyPI
```bash
pip install simple-cocotools
```
### From Repo
```bash
pip install "simple-cocotools @ git+ssh://git@github.com/fkodom/simple-cocotools.git"
```
### For Contributors
```bash
# Clone this repository
gh repo clone fkodom/simple-cocotools
cd simple-cocotools
# Install all dev dependencies (tests etc.)
pip install -e .[all]
# Setup pre-commit hooks
pre-commit install
```
## Usage
The evaluator expects target annotations in the same format as model predictions (the format used by all `torchvision` detection models). You may already have code to convert annotations into this format, since it's required to train many detection models. If not, use ['AnnotationsToDetectionFormat' from this repo](./simple_cocotools/utils/coco.py#L83) as an example of how to do that.
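For reference, a minimal sketch of that per-image format (shapes and values are illustrative; the `masks` key is only needed if you want mask metrics):

```python
import torch

# Targets and predictions are both lists of per-image dicts.
target = {
    "boxes": torch.tensor([[10.0, 20.0, 50.0, 80.0]]),    # (N, 4) boxes in xyxy format
    "labels": torch.tensor([1]),                           # (N,) integer class ids
    "masks": torch.zeros(1, 480, 640, dtype=torch.uint8),  # (N, H, W), only for mask metrics
}
prediction = {
    "boxes": torch.tensor([[12.0, 18.0, 49.0, 82.0]]),
    "labels": torch.tensor([1]),
    "scores": torch.tensor([0.97]),                        # predictions also carry scores
    "masks": torch.zeros(1, 480, 640, dtype=torch.uint8),
}
```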
A minimal example:
```python
from torchvision.models.detection import maskrcnn_resnet50_fpn
from simple_cocotools import CocoEvaluator

evaluator = CocoEvaluator()
model = maskrcnn_resnet50_fpn(pretrained=True).eval()

for images, targets in data_loader:
    predictions = model(images)
    evaluator.update(predictions, targets)

metrics = evaluator.summarize()
```
`metrics` will be a dictionary with the format:
```json
{
    "box": {
        "mAP": 0.40,
        "mAR": 0.41,
        "class_AP": {
            "cat": 0.39,
            "dog": 0.42,
            ...
        },
        "class_AR": {
            # Same as 'class_AP' above.
        }
    },
    "mask": {
        # Same as 'box' above.
    }
}
```
For a more complete example, see [`scripts/mask_rcnn_example.py`](./scripts/mask_rcnn_example.py).
## Benchmarks
I benchmarked against several `torchvision` detection models, which have [mAP scores reported on the PyTorch website](https://pytorch.org/vision/stable/models.html#object-detection-instance-segmentation-and-person-keypoint-detection).
Using a default score threshold of 0.5:
Model | Backbone | box mAP<br>(official) | box mAP | box mAR | mask mAP<br>(official) | mask mAP | mask mAR
-------------|-------------------|-----------------------|---------|---------|------------------------|----------|----------
Mask R-CNN | ResNet50 | 37.9 | 36.9 | 43.2 | 34.6 | 34.1 | 40.0
Faster R-CNN | ResNet50 | 37.0 | 36.3 | 42.0 | - | - | -
Faster R-CNN | MobileNetV3-Large | 32.8 | 39.9 | 35.0 | - | - | -
Notice that the mAP for `MobileNetV3-Large` is artificially high, since it has a much lower mAR at that score threshold. After tuning the score threshold so that mAP and mAR are more balanced:
Model | Backbone | Threshold | box mAP | box mAR | mask mAP | mask mAR
-------------|-------------------|-----------|---------|---------|----------|----------
Mask R-CNN | ResNet50 | 0.6 | 41.1 | 41.3 | 38.2 | 38.5
Faster R-CNN | ResNet50 | 0.6 | 40.8 | 40.4 | - | -
Faster R-CNN | MobileNetV3-Large | 0.425 | 36.2 | 36.2 | - | -
These scores are more reflective of model performance, in my opinion. Mask R-CNN slightly outperforms Faster R-CNN, and there is a noticeable (but not horrible) gap between ResNet50 and MobileNetV3 backbones. PyTorch docs don't mention what score thresholds were used for each model benchmark. ¯\\_(ツ)_/¯
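To run that kind of threshold sweep yourself, a rough sketch (reusing `model` and `data_loader` from the usage example above; the `filter_by_score` helper is hypothetical, not part of this library):

```python
import torch

from simple_cocotools import CocoEvaluator


def filter_by_score(prediction: dict, threshold: float) -> dict:
    # Hypothetical helper: keep only detections whose score clears the threshold.
    keep = prediction["scores"] >= threshold
    return {key: value[keep] for key, value in prediction.items()}


for threshold in (0.4, 0.5, 0.6):
    evaluator = CocoEvaluator()
    for images, targets in data_loader:
        with torch.no_grad():
            predictions = model(images)
        predictions = [filter_by_score(p, threshold) for p in predictions]
        evaluator.update(predictions, targets)

    metrics = evaluator.summarize()
    print(f"threshold={threshold}: mAP={metrics['box']['mAP']:.3f}, mAR={metrics['box']['mAR']:.3f}")
```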
Ignoring the time spent getting predictions from the model, evaluation is very fast.
* **Bbox:** ~400 samples/second
* **Bbox + mask:** ~100 samples/second
* Using a Google Cloud `n1-standard-4` VM (4 vCPUs, 16 GB RAM).
**Note:** Speeds are dependent on the number of detections per image, and therefore dependent on the model and score threshold.
## How It Works
**TODO:** Blog post on how `simple-cocotools` works.
1. Match predictions to ground truth labels, maximizing the IoU between pairs with the same object class. SciPy's `linear_sum_assignment` method does most of the heavy lifting here (see the sketch after this list).
2. For each IoU threshold, determine the number of "correct" predictions from the assignments above. Pairs with IoU < threshold are incorrect.
3. For each image, count the total predictions, correct predictions, and ground truth labels for each object class and IoU threshold.
4. Compute AP/AR for each class from the prediction counts above. Then compute mAP and mAR by averaging over all object classes.
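A simplified sketch of the matching step for a single image, class, and IoU threshold (illustrative only, not the library's exact implementation; `box_iou` and `count_correct` are hypothetical helper names):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def box_iou(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """Pairwise IoU between two sets of xyxy boxes, shapes (N, 4) and (M, 4)."""
    lt = np.maximum(boxes_a[:, None, :2], boxes_b[None, :, :2])  # intersection top-left corners
    rb = np.minimum(boxes_a[:, None, 2:], boxes_b[None, :, 2:])  # intersection bottom-right corners
    inter = np.clip(rb - lt, 0, None).prod(axis=-1)
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter)


def count_correct(pred_boxes: np.ndarray, true_boxes: np.ndarray, iou_threshold: float) -> int:
    # Step 1: assign predictions to labels, maximizing total IoU.
    iou = box_iou(pred_boxes, true_boxes)
    rows, cols = linear_sum_assignment(iou, maximize=True)
    # Step 2: a matched pair only counts as "correct" if its IoU clears the threshold.
    return int((iou[rows, cols] >= iou_threshold).sum())


# Steps 3-4: accumulate (correct, predicted, ground truth) counts over all images, then
#   AP = correct / total predictions,  AR = correct / total ground truths,
# per class and IoU threshold, and average over classes to get mAP / mAR.
```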