## 🍿 Intro
Juxtapose is a 2D multi-person pose detection, tracking, and estimation inference toolbox for sports and kinematics analysis. Visit the [Docs](https://sdk.juxt.space).
<div align="center">
<p>
<a align="center" href="" target="_blank">
<img
width="850"
src="https://raw.githubusercontent.com/ziqinyeow/juxtapose/main/asset/juxtapose-banner.png"
>
</a>
</p>
</div>
<!-- # JUXTAPOSE Inference Toolbox 🚀 with ONNX -->
**See how we integrated juxtapose into this app: [Juxt Space](https://github.com/ziqinyeow/juxt.space)**
## 🍄 Overview
Code is mostly adapted from four repos: [ultralytics](https://github.com/ultralytics/ultralytics), [mmdeploy](https://github.com/open-mmlab/mmdeploy), [mmdetection](https://github.com/open-mmlab/mmdetection), and [mmpose](https://github.com/open-mmlab/mmpose).
Supported Detectors: [rtmdet-s](./rtm/detectors/rtmdet/), [rtmdet-m](./rtm/detectors/rtmdet/), [rtmdet-l](./rtm/detectors/rtmdet/), [groundingdino](./rtm/detectors/groundingdino/__init__.py), [yolov8](./rtm/detectors/yolov8/__init__.py) \
Supported Pose Estimators: [rtmpose-s](./rtm/rtmpose.py), [rtmpose-m](./rtm/rtmpose.py), [rtmpose-l](./rtm/rtmpose.py) \
Supported Trackers: [bytetrack](./rtm/trackers/byte_tracker.py), [botsort](./rtm/trackers/bot_sort.py) \
Supported Point Trackers: [Tapnet](https://github.com/google-deepmind/tapnet)
## 🥒 Updates
- **`2024/05/16`** Removed the ultralytics dependency and ported YOLOv8 to run directly in ONNX for better speed.
- **`2024/04/27`** Added FastAPI to EXE example with ONNX GPU Runtime in [examples/fastapi-pyinstaller](./examples/fastapi-pyinstaller).
- **`2024/01/11`** Added Nextra docs + deployed to Vercel at [sdk.juxt.space](https://sdk.juxt.space).
- **`2024/01/07`** Reduced dependencies by removing the MMCV, MMDet, and MMPose SDKs; the toolbox now runs fully on ONNX.
- **`2023/11/01`** Published juxtapose to PyPI so it can be installed with `pip install juxtapose`.
- **`2023/08/25`** Added custom [region of interest (ROI) drawing tools](rtm/utils/roi.py) that enable filtering by multiple ROIs while performing pose estimation/tracking. See [usage below](#🎨-select-region-of-interests-rois).
- **`2023/08/15`** Added [GroundingDino](https://github.com/IDEA-Research/GroundingDINO) & [YOLOv8](https://github.com/ultralytics/ultralytics) object detector.
- **`2023/08/09`** Added keypoint streaming to a CSV file using the `csv` module.
- **`2023/07/31`** Added [ByteTrack](./rtm/trackers/byte_tracker.py) and [BotSORT](./rtm/trackers/bot_sort.py). Completed the engineering effort for top-down inference on any source. See [supported sources below](#supported-sources).
- **`2023/06/15`** Converted [RTMDET (s/m/l)](rtm/detectors/rtmdet/__init__.py) and [RTMPOSE (s/m/l)](rtm/rtmpose.py) to ONNX using [MMDeploy](https://github.com/open-mmlab/mmdeploy).
## 👉 Getting Started
### Install Using PIP
`pip install juxtapose`
Note: If you face any issues, kindly review this [GitHub issue](https://github.com/ziqinyeow/juxtapose/issues/2).
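As a quick sanity check after installing, you can construct a default model and confirm the package imports cleanly. A minimal sketch; note that building the model may download the ONNX weights on first run (an assumption, not verified here):

```python
# Post-install sanity check (a minimal sketch).
# Constructing RTM may download ONNX weights on first use (assumption).
from juxtapose import RTM

model = RTM()  # default detector/pose/tracker, as in the Basic Usage example below
print("juxtapose is ready")
```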
## 🧀 Local Development
```bash
git clone https://github.com/ziqinyeow/juxtapose
cd juxtapose
pip install .
```
## 🤩 Feel The Magic
### 🌄 Basic Usage
```python
from juxtapose import RTM
# Initialize an RTM model (bundles rtmdet, rtmpose, and a tracker)
model = RTM(
    det="rtmdet-m",  # see type hints for supported detectors
    pose="rtmpose-m",  # see type hints for supported pose estimators
    tracker="bytetrack",  # see type hints for supported trackers
    device="cpu",  # see type hints for supported devices
)
# Inference on a directory (every image and video inside is processed sequentially)
model("data")
# Inference on an image
model("data/football.jpeg", verbose=False)  # verbose=False disables terminal printing
# Inference on a video
model("data/bike.mp4")
# Inference on a YouTube source
model("https://www.youtube.com/watch?v=1vYvTbDJuFs&ab_channel=PeterGrant", save=True)
```
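The `device` argument also accepts `"cuda"` (the same device string used in the custom forward pass below). A sketch, assuming a CUDA-capable GPU and the GPU build of ONNX Runtime are available:

```python
# A sketch: run the same pipeline on GPU.
# Assumes a CUDA-capable GPU and the onnxruntime-gpu package (assumption).
model = RTM(det="rtmdet-m", pose="rtmpose-m", tracker="bytetrack", device="cuda")
model("data/bike.mp4")
```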
### 🎨 Select Region of Interests (ROIs)
The model first prompts you to draw the ROIs; press `r` to remove the ROI drawn so far.
After drawing, press `SPACE`, `ENTER`, or `q` to accept the ROIs. The model will then
filter out bounding boxes based on the ROIs.
😁 Note: Press `SPACE` again to redraw the bounding boxes. See the custom `cv2` implementation [here](rtm/utils/roi.py).
```python
from juxtapose import RTM
model = RTM(det="groundingdino", pose="rtmpose-l", tracker="none")
model("data/bike.mp4", roi="rect") # rectangle roi
# 1. Draw ROI first
# 2. Press r or R to reset ROI
# 3. Press SPACE or Enter or q or Q to continue with the ROI
```
### 🚴‍♂️ Accessing Results for Each Frame: More Flexibility
```python
# Adding custom plot
import cv2
from juxtapose import RTM, Annotator
model = RTM()
annotator = Annotator(thickness=3, font_color=(128, 128, 128)) # see rtm.utils.plotting
# show=True -> display each frame with cv2.imshow (you can use cv2 to draw anything on the frame)
# plot=False -> skip the default plotting (see the `if plot:` branch in rtm.rtm)
for result in model("data/bike.mp4", show=True, plot=False, stream=True):
    # do whatever you want with the per-frame data
    im, bboxes, kpts = result.im, result.bboxes, result.kpts

    # e.g. draw anything custom using the cv2 API
    cv2.putText(
        im, "custom text", (100, 100), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (128, 128, 128)
    )

    # or use the Annotator class -> see rtm.utils.plotting
    annotator.draw_bboxes(
        im, bboxes, labels=[f"children_{i}" for i in range(len(bboxes))]
    )
    annotator.draw_kpts(im, kpts, thickness=4)
    annotator.draw_skeletons(im, kpts)
```
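Per-frame results also make it easy to stream keypoints to a CSV file with the standard `csv` module, as mentioned in the updates above. A minimal sketch, assuming `result.kpts` holds one `(17, 2)` array of `(x, y)` keypoints per detected person:

```python
# A minimal sketch: stream keypoints to CSV while iterating per-frame results.
# Assumes result.kpts yields one (17, 2) array of (x, y) keypoints per person.
import csv

from juxtapose import RTM

model = RTM()
with open("keypoints.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["frame", "person", "keypoint", "x", "y"])
    for frame_idx, result in enumerate(model("data/bike.mp4", stream=True, verbose=False)):
        for person_idx, person_kpts in enumerate(result.kpts):
            for kpt_idx, (x, y) in enumerate(person_kpts):
                writer.writerow([frame_idx, person_idx, kpt_idx, float(x), float(y)])
```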
### ⚽️ Custom Forward Pass: Full Flexibility
```python
# Custom model forward pass
import cv2
import torch
from juxtapose import RTMDet, RTMPose, Annotator
frame = cv2.imread("data/football.jpeg")
device = "cuda" if torch.cuda.is_available() else "cpu"
# model size: s, m, or l
rtmdet = RTMDet("l", device=device)
rtmpose = RTMPose("l", device=device)
annotator = Annotator()

bboxes, scores, labels = rtmdet(frame)  # [[x1, y1, x2, y2], ...], scores, labels
kpts = rtmpose(frame, bboxes=bboxes)  # shape: (num_people, 17, 2)
annotator.draw_bboxes(frame, bboxes, labels=[f"person_{i}" for i in range(len(bboxes))])
annotator.draw_kpts(frame, kpts, thickness=4)
annotator.draw_skeletons(frame, kpts)
cv2.imshow("frame", frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
```
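The same building blocks extend naturally to video: read frames with `cv2.VideoCapture` and run the detector and pose estimator on each one. A sketch under the same assumptions as above (the file path is illustrative):

```python
# A sketch: per-frame custom forward pass over a video with cv2.VideoCapture.
import cv2
from juxtapose import RTMDet, RTMPose, Annotator

rtmdet = RTMDet("m", device="cpu")
rtmpose = RTMPose("m", device="cpu")
annotator = Annotator()

cap = cv2.VideoCapture("data/bike.mp4")  # illustrative path
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    bboxes, scores, labels = rtmdet(frame)
    kpts = rtmpose(frame, bboxes=bboxes)
    annotator.draw_bboxes(frame, bboxes, labels=[f"person_{i}" for i in range(len(bboxes))])
    annotator.draw_skeletons(frame, kpts)
    cv2.imshow("frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()
```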
## Supported Sources
Adapted from the ultralytics repository; see [https://docs.ultralytics.com/modes/predict/](https://docs.ultralytics.com/modes/predict/).
| Source | Argument | Type | Notes |
| ---------- | ---------------------------------------- | --------------------------------- | ------------------------------------------------------------------------- |
| image | 'image.jpg' | str or Path | Single image file. |
| URL | 'https://ultralytics.com/images/bus.jpg' | str | URL to an image. |
| screenshot | 'screen' | str | Capture a screenshot. |
| PIL | Image.open('im.jpg') | PIL.Image | HWC format with RGB channels. |
| OpenCV | cv2.imread('im.jpg') | np.ndarray of uint8 (0-255) | HWC format with BGR channels. |
| numpy | np.zeros((640,1280,3)) | np.ndarray of uint8 (0-255) | HWC format with BGR channels. |
| torch | torch.zeros(16,3,320,640) | torch.Tensor of float32 (0.0-1.0) | BCHW format with RGB channels. |
| CSV | 'sources.csv' | str or Path | CSV file containing paths to images, videos, or directories. |
| video | 'video.mp4' | str or Path | Video file in formats like MP4, AVI, etc. |
| directory | 'path/' | str or Path | Path to a directory containing images or videos. |
| glob | 'path/\*.jpg' | str | Glob pattern to match multiple files. Use the \* character as a wildcard. |
| YouTube | 'https://youtu.be/Zgi9g1ksQHc' | str | URL to a YouTube video. |
| stream | 'rtsp://example.com/media.mp4' | str | URL for streaming protocols such as RTSP, RTMP, or an IP address. |
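Since `model(...)` accepts any of these source types, switching inputs only changes the first argument. A short sketch using a few rows from the table (paths and URLs are illustrative):

```python
# A sketch: the same model call works across the source types above.
import cv2
import numpy as np
from juxtapose import RTM

model = RTM()
model("image.jpg")                                # single image path
model("https://ultralytics.com/images/bus.jpg")   # image URL
model(cv2.imread("image.jpg"))                    # OpenCV BGR ndarray
model(np.zeros((640, 1280, 3), dtype=np.uint8))   # raw numpy HWC array
model("path/*.jpg")                               # glob pattern
model("rtsp://example.com/media.mp4")             # RTSP stream
```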