# Sapiens-Pytorch-Inference
Minimal code and examples for running inference with the Sapiens foundation human models in PyTorch
![ONNX Sapiens_normal_segmentation](https://github.com/user-attachments/assets/a8f433f0-5f43-4797-89c6-5b33c58cbd01)
# Why
- Make the models easy to run with a `SapiensPredictor` class that can run multiple tasks simultaneously
- Add several examples to run the models on images, videos, and with a webcam in real-time.
- Download models automatically from Hugging Face if not available locally.
- Add a script for ONNX export. However, ONNX inference is not recommended due to the slow speed.
- Added object detection so the model can be run separately on each detected person. However, this mode is disabled by default, as it produces worse results.
> [!CAUTION]
> - Use the 1B models, since the accuracy of the smaller models is poor (especially for segmentation).
> - Exported ONNX models are too slow.
> - Input sizes other than 768x1024 don't produce good results.
> - Running Sapiens models on a cropped person produces worse results, even if you crop a wider rectangle around the person.
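Given the caution above about input sizes, one option is to letterbox frames to 768x1024 before inference instead of stretching them. This is a minimal dependency-free sketch (a hypothetical helper, not part of this repo; it uses nearest-neighbor indexing to avoid extra dependencies, whereas `cv2.resize` would be preferable in practice):

```python
import numpy as np

TARGET_W, TARGET_H = 768, 1024  # recommended Sapiens input size

def letterbox(img: np.ndarray, target_w: int = TARGET_W, target_h: int = TARGET_H) -> np.ndarray:
    """Scale img to fit the target size, padding with black to keep the aspect ratio."""
    h, w = img.shape[:2]
    scale = min(target_w / w, target_h / h)
    new_w, new_h = int(w * scale), int(h * scale)
    # Nearest-neighbor resize via index arrays (dependency-free illustration)
    ys = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    xs = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = img[ys[:, None], xs[None, :]]
    # Center the resized image on a black canvas of the target size
    canvas = np.zeros((target_h, target_w, 3), dtype=img.dtype)
    top = (target_h - new_h) // 2
    left = (target_w - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas

frame = np.full((480, 640, 3), 255, dtype=np.uint8)  # dummy white 640x480 frame
padded = letterbox(frame)
print(padded.shape)  # (1024, 768, 3)
```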
## Installation [![PyPI](https://img.shields.io/pypi/v/sapiens-inferece?color=2BAF2B)](https://pypi.org/project/sapiens-inferece/)
```bash
pip install sapiens-inferece
```
Or, clone this repository:
```bash
git clone https://github.com/ibaiGorordo/Sapiens-Pytorch-Inference.git
cd Sapiens-Pytorch-Inference
pip install -r requirements.txt
```
## Usage
```python
import cv2
from imread_from_url import imread_from_url
from sapiens_inference import SapiensPredictor, SapiensConfig, SapiensDepthType, SapiensNormalType
# Load the model
config = SapiensConfig()
config.depth_type = SapiensDepthType.DEPTH_03B # Disabled by default
config.normal_type = SapiensNormalType.NORMAL_1B # Disabled by default
predictor = SapiensPredictor(config)
# Load the image
img = imread_from_url("https://github.com/ibaiGorordo/Sapiens-Pytorch-Inference/blob/assets/test2.png?raw=true")
# Estimate the maps
result = predictor(img)
cv2.namedWindow("Combined", cv2.WINDOW_NORMAL)
cv2.imshow("Combined", result)
cv2.waitKey(0)
```
### SapiensPredictor
The `SapiensPredictor` class runs multiple tasks simultaneously. It has the following methods:
- `SapiensPredictor(config: SapiensConfig)` - Load the model with the specified configuration.
- `__call__(img: np.ndarray) -> np.ndarray` - Estimate the maps for the input image.
### SapiensConfig
The `SapiensConfig` class configures the model. It has the following attributes:
- `dtype: torch.dtype` - Data type to use. Default: `torch.float32`.
- `device: torch.device` - Device to use. Default: `cuda` if available, otherwise `cpu`.
- `depth_type: SapiensDepthType` - Depth model to use. Options: `OFF`, `DEPTH_03B`, `DEPTH_06B`, `DEPTH_1B`, `DEPTH_2B`. Default: `OFF`.
- `normal_type: SapiensNormalType` - Normal model to use. Options: `OFF`, `NORMAL_03B`, `NORMAL_06B`, `NORMAL_1B`, `NORMAL_2B`. Default: `OFF`.
- `segmentation_type: SapiensSegmentationType` - Segmentation model to use (Always enabled for the mask). Options: `SEGMENTATION_03B`, `SEGMENTATION_06B`, `SEGMENTATION_1B`. Default: `SEGMENTATION_1B`.
- `detector_config: DetectorConfig` - Configuration for the object detector. Default: {`model_path: str = "models/yolov8m.pt"`, `person_id: int = 0`, `confidence: float = 0.25`}. Disabled by default, as it produces worse results.
- `minimum_person_height: float` - Minimum height ratio of the person to detect. Default: `0.5` (50%). Not used if the object detector is disabled.
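As a sketch, enabling all three tasks at once might look like this (assumes the package is installed; the `SapiensSegmentationType` import path and the use of half precision are assumptions for illustration):

```python
import torch
from sapiens_inference import (SapiensConfig, SapiensDepthType,
                               SapiensNormalType, SapiensSegmentationType)

config = SapiensConfig()
config.dtype = torch.float16  # assumption: halves memory at some accuracy cost
config.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
config.depth_type = SapiensDepthType.DEPTH_1B        # enable depth estimation
config.normal_type = SapiensNormalType.NORMAL_1B     # enable normal estimation
config.segmentation_type = SapiensSegmentationType.SEGMENTATION_1B  # always enabled
```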
## Examples
* **Image Sapiens Predictor (Normal, Depth, Segmentation)**:
```
python image_predictor.py
```
![sapiens_human_model](https://github.com/user-attachments/assets/988c7551-061a-4b69-8b7c-4546cba336da)
* **Video Sapiens Predictor (Normal, Depth, Segmentation)**: (https://youtu.be/hOyrnkQz1NE?si=jC76W7AY3zJnZhH4)
```
python video_predictor.py
```
* **Webcam Sapiens Predictor (Normal, Depth, Segmentation)**:
```
python webcam_predictor.py
```
* **Image Normal Estimation**:
```
python image_normal_estimation.py
```
* **Image Human Part Segmentation**:
```
python image_segmentation.py
```
* **Video Normal Estimation**:
```
python video_normal_estimation.py
```
* **Video Human Part Segmentation**:
```
python video_segmentation.py
```
* **Webcam Normal Estimation**:
```
python webcam_normal_estimation.py
```
* **Webcam Human Part Segmentation**:
```
python webcam_segmentation.py
```
## Export to ONNX
To export the model to ONNX, run the following script:
```bash
python export_onnx.py seg03b
```
The available models are `seg03b`, `seg06b`, `seg1b`, `depth03b`, `depth06b`, `depth1b`, `depth2b`, `normal03b`, `normal06b`, `normal1b`, `normal2b`.
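The model names follow a task-plus-size convention (`seg`/`depth`/`normal` followed by `03b`/`06b`/`1b`/`2b`). A small hypothetical helper (not part of the repo) to validate and split them:

```python
import re

MODEL_NAMES = ["seg03b", "seg06b", "seg1b", "depth03b", "depth06b", "depth1b",
               "depth2b", "normal03b", "normal06b", "normal1b", "normal2b"]

def parse_model_name(name: str) -> tuple[str, str]:
    """Split a CLI model name like 'depth03b' into (task, size)."""
    m = re.fullmatch(r"(seg|depth|normal)(03b|06b|1b|2b)", name)
    if m is None:
        raise ValueError(f"unknown model name: {name}")
    return m.group(1), m.group(2)

print(parse_model_name("depth03b"))  # ('depth', '03b')
```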
## Original Models
The original models are available at HuggingFace: https://huggingface.co/facebook/sapiens/tree/main/sapiens_lite_host
- **License**: Creative Commons Attribution-NonCommercial 4.0 International (https://github.com/facebookresearch/sapiens/blob/main/LICENSE)
## References
- **Sapiens**: https://github.com/facebookresearch/sapiens
- **Sapiens Lite**: https://github.com/facebookresearch/sapiens/tree/main/lite
- **HuggingFace Model**: https://huggingface.co/facebook/sapiens