[![Documentation Status](https://img.shields.io/badge/docs-online-brightgreen)](https://aastopher.github.io/imgdd/)
[![imgdd pypi](https://img.shields.io/pypi/v/imgdd?label=imgdd%20pypi)](https://pypi.org/project/imgdd)
[![imgdd crate](https://img.shields.io/crates/v/imgdd?label=imgdd)](https://crates.io/crates/imgdd)
[![imgddcore crate](https://img.shields.io/crates/v/imgddcore?label=imgddcore)](https://crates.io/crates/imgddcore)
[![codecov](https://codecov.io/gh/aastopher/imgdd/graph/badge.svg?token=XZ1O2X04SO)](https://codecov.io/gh/aastopher/imgdd)
[![DeepSource](https://app.deepsource.com/gh/aastopher/imgdd.svg/?label=active+issues&show_trend=true&token=IiuhCO6n1pK-GAJ800k6Z_9t)](https://app.deepsource.com/gh/aastopher/imgdd/)
# imgdd: Image DeDuplication
`imgdd` is a performance-first perceptual hashing library that combines Rust's speed with Python's accessibility, making it well suited to large datasets. It is designed to quickly process the nested folder structures commonly found in image datasets.
## Features
- **Multiple Hashing Algorithms**: Supports `aHash`, `dHash`, `mHash`, `pHash`, `wHash`.
- **Multiple Filter Types**: Supports `Nearest`, `Triangle`, `CatmullRom`, `Gaussian`, `Lanczos3`.
- **Identify Duplicates**: Quickly identify duplicate hash pairs.
- **Simplicity**: Simple interface, robust performance.
## Why imgdd?
`imgdd` is inspired by [imagehash](https://github.com/JohannesBuchner/imagehash) and aims to be a lightning-fast replacement with additional features. To verify the performance gains, `imgdd` has been benchmarked against `imagehash`. In Python, **imgdd consistently outperforms imagehash by ~60%–95%**, a significant reduction in hashing time per image.
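
A percentage like this is commonly computed as the relative reduction in time per image against a baseline. The sketch below shows that arithmetic together with a simple timing loop. It is an illustration only: `time_per_item`, `percent_reduction`, and the timings are hypothetical and are not taken from the project's benchmark suite.

```python
import time

def time_per_item(fn, items, repeats=3):
    """Return the best observed mean wall-clock time per item, in seconds."""
    runs = []
    for _ in range(repeats):
        start = time.perf_counter()
        for item in items:
            fn(item)
        runs.append((time.perf_counter() - start) / len(items))
    return min(runs)

def percent_reduction(baseline, candidate):
    """Relative reduction in time per image, as a percentage of the baseline."""
    return (baseline - candidate) / baseline * 100.0

# Hypothetical timings in milliseconds per image, for illustration only.
print(percent_reduction(4.0, 1.0))  # 75.0
```

In a real comparison, `fn` would wrap the hashing call of each library and `items` would be the same list of image paths for both.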
---
# Quick Start
## Installation
```bash
pip install imgdd
```
## Usage Examples
### Hash Images
```python
import imgdd as dd

results = dd.hash(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    sort=False,  # Optional: default = False
)
print(results)
```
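
Perceptual hashes are usually compared by Hamming distance: the fewer bits that differ, the more alike two images are likely to be. The sketch below assumes the hashes are available as equal-length hexadecimal strings, which is an assumption about the output format rather than something stated here. `hamming_distance` is a hypothetical helper and not part of `imgdd`.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

# Made-up 64-bit hashes that differ only in the last hex digit
# (f = 1111, c = 1100, so two bits differ).
a = "8f0f0f0f0f0f0f0f"
b = "8f0f0f0f0f0f0f0c"
print(hamming_distance(a, b))  # 2
```

A distance of 0 means identical hashes; a small non-zero threshold is a common way to treat images as near-duplicates.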
### Find Duplicates
```python
import imgdd as dd

duplicates = dd.dupes(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    remove=False,  # Optional: default = False
)
print(duplicates)
```
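
Conceptually, finding exact duplicates amounts to grouping file paths by hash and keeping the groups with more than one member. The sketch below assumes a mapping from path to hash string. `group_duplicates` and the sample data are hypothetical, and this is not a description of how `dd.dupes` is implemented.

```python
from collections import defaultdict

def group_duplicates(path_to_hash):
    """Group paths that share a hash; keep only groups of two or more."""
    groups = defaultdict(list)
    for path, digest in path_to_hash.items():
        groups[digest].append(path)
    return {
        digest: sorted(paths)
        for digest, paths in groups.items()
        if len(paths) > 1
    }

# Made-up paths and hashes, for illustration only.
sample = {
    "cats/a.png": "ffee",
    "cats/copy_of_a.png": "ffee",
    "dogs/b.png": "0123",
}
print(group_duplicates(sample))
# {'ffee': ['cats/a.png', 'cats/copy_of_a.png']}
```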
## Supported Algorithms
- **aHash**: Average Hash
- **mHash**: Median Hash
- **dHash**: Difference Hash
- **pHash**: Perceptual Hash
- **wHash**: Wavelet Hash
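
As an illustration of what one of these algorithms does, a difference hash compares each pixel of a small grayscale thumbnail with its right-hand neighbour and records one bit per comparison. The sketch below shows the general technique in plain Python. It is not `imgdd`'s Rust implementation, and the bit order and hex encoding are choices made for this example.

```python
def dhash_bits(gray_rows):
    """Compute difference-hash bits from a grid of grayscale values.

    Each row contributes len(row) - 1 bits: 1 where a pixel is brighter
    than its right-hand neighbour, otherwise 0.
    """
    bits = []
    for row in gray_rows:
        for left, right in zip(row, row[1:]):
            bits.append(1 if left > right else 0)
    return bits

def bits_to_hex(bits):
    """Pack bits (most significant first) into a zero-padded hex string."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    width = (len(bits) + 3) // 4
    return f"{value:0{width}x}"

# An 8-row, 9-column thumbnail whose brightness falls from left to right,
# so every comparison yields 1 and the 64-bit hash is all ones.
thumb = [[90 - 10 * x for x in range(9)] for _ in range(8)]
print(bits_to_hex(dhash_bits(thumb)))  # ffffffffffffffff
```

A 9x8 thumbnail is the usual choice because it produces exactly 64 bits.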
## Supported Filters
- `Nearest`, `Triangle`, `CatmullRom`, `Gaussian`, `Lanczos3`
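
These names match the `FilterType` variants of the Rust `image` crate, which suggests they select the resampling kernel used when an image is shrunk before hashing; that reading is an assumption. The sketch below contrasts nearest-sample picking with a triangle (linear) kernel on a single row of pixels. `resize_row` and `triangle` are hypothetical helpers written for this example, not `imgdd` functions.

```python
import math

def triangle(x):
    """Triangle (linear) kernel with support 1."""
    return max(0.0, 1.0 - abs(x))

def resize_row(row, new_len, kernel=None, support=1.0):
    """Resample a 1D row of samples to new_len values.

    With kernel=None the nearest source sample is picked. Otherwise each
    output is a normalised weighted average of the samples under the kernel.
    """
    ratio = len(row) / new_len
    scale = max(ratio, 1.0)  # widen the kernel when shrinking
    out = []
    for i in range(new_len):
        centre = (i + 0.5) * ratio - 0.5  # output centre in source coordinates
        if kernel is None:
            idx = min(len(row) - 1, max(0, math.floor(centre + 0.5)))
            out.append(float(row[idx]))
            continue
        lo = max(0, math.floor(centre - support * scale))
        hi = min(len(row) - 1, math.ceil(centre + support * scale))
        indices = range(lo, hi + 1)
        weights = [kernel((j - centre) / scale) for j in indices]
        total = sum(weights)
        out.append(sum(w * row[j] for w, j in zip(weights, indices)) / total)
    return out

# A row alternating between dark and bright pixels, halved in length.
row = [0, 100, 0, 100, 0, 100, 0, 100]
print(resize_row(row, 4))                   # nearest keeps only bright pixels
print(resize_row(row, 4, kernel=triangle))  # triangle blends neighbours
```

On this input, nearest sampling returns only the bright pixels while the triangle kernel returns values near the row's average, which is why the filter choice can change the resulting hash.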
## Contributing
Contributions are always welcome! 🚀
Found a bug or have a question? Open a GitHub issue. Pull requests for new features or fixes are encouraged!
## Similar projects
- https://github.com/JohannesBuchner/imagehash
- https://github.com/commonsmachinery/blockhash-python
- https://github.com/acoomans/instagram-filters
- https://pippy360.github.io/transformationInvariantImageSearch/
- https://www.phash.org/
- https://pypi.org/project/dhash/
- https://github.com/thorn-oss/perception (based on imagehash code, depends on opencv)
- https://docs.opencv.org/3.4/d4/d93/group__img__hash.html

## Raw data
{
"_id": null,
"home_page": "https://github.com/aastopher/imgdd",
"name": "imgdd",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.9",
"maintainer_email": null,
"keywords": "rust, imagehash, hash, perceptual hash, difference hash, deduplication, image deduplication",
"author": "Aaron Stopher",
"author_email": "Aaron Stopher <aaron.stopher@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/e9/0e/f87ba129df3e19384a9bf7769846337a496ff5627e4dd51ec938d4ff82cc/imgdd-0.1.5.tar.gz",
"platform": null,
"bugtrack_url": null,
"license": "GPL-3.0-or-later",
"summary": "Performance-first perceptual hashing library; perfect for handling large datasets. Designed to quickly process nested folder structures, commonly found in image datasets",
"version": "0.1.5",
"project_urls": {
"Homepage": "https://github.com/aastopher/imgdd",
"documentation": "https://github.com/aastopher/imgdd",
"issues": "https://github.com/aastopher/imgdd/issues",
"source": "https://github.com/aastopher/imgdd"
},
"split_keywords": [
"rust",
" imagehash",
" hash",
" perceptual hash",
" difference hash",
" deduplication",
" image deduplication"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "9d6964054b7f86c47fb1184cbbfbeedfe076796d9f573ff1305a93b5a21b2279",
"md5": "c07c025a7939d3be20c6213dd857ed8f",
"sha256": "465c6d7e4c3e49acd86ff760c9f29a895e9d9d2c4e41a36a4345e9de3db66693"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-macosx_10_12_x86_64.whl",
"has_sig": false,
"md5_digest": "c07c025a7939d3be20c6213dd857ed8f",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 1420376,
"upload_time": "2025-02-07T01:54:16",
"upload_time_iso_8601": "2025-02-07T01:54:16.776994Z",
"url": "https://files.pythonhosted.org/packages/9d/69/64054b7f86c47fb1184cbbfbeedfe076796d9f573ff1305a93b5a21b2279/imgdd-0.1.5-cp39-abi3-macosx_10_12_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "3bd87d414bd4dbaf7c8080ffe283e334bcab3218544d4e3e81fcfe9c5555cb36",
"md5": "b67bfd54e18f6a607c530253cca92670",
"sha256": "1fd6dcf9dc31a8000cd109495dc46377c8b4944c7ff5a0074942c82985a65190"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-macosx_11_0_arm64.whl",
"has_sig": false,
"md5_digest": "b67bfd54e18f6a607c530253cca92670",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 1250902,
"upload_time": "2025-02-07T01:54:19",
"upload_time_iso_8601": "2025-02-07T01:54:19.431132Z",
"url": "https://files.pythonhosted.org/packages/3b/d8/7d414bd4dbaf7c8080ffe283e334bcab3218544d4e3e81fcfe9c5555cb36/imgdd-0.1.5-cp39-abi3-macosx_11_0_arm64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "21271a70ec8eb6d93171bd8b41d80bdf01eded5a1ccf2d8b268c556fa3bfdae5",
"md5": "140dc4aa138bdb2d256fccc1e22d5dc7",
"sha256": "a762282ad692b8c8920a18cdc2e84814a35c22465cecb0c3e44db50149a84865"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-manylinux_2_34_aarch64.whl",
"has_sig": false,
"md5_digest": "140dc4aa138bdb2d256fccc1e22d5dc7",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 1382843,
"upload_time": "2025-02-07T01:54:22",
"upload_time_iso_8601": "2025-02-07T01:54:22.229695Z",
"url": "https://files.pythonhosted.org/packages/21/27/1a70ec8eb6d93171bd8b41d80bdf01eded5a1ccf2d8b268c556fa3bfdae5/imgdd-0.1.5-cp39-abi3-manylinux_2_34_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "74f87928d15d41abd095bf8c3e52908820de9cfa4462d083c2e6de7e3b63b3fb",
"md5": "d93207bd850e5f1f86c22805eb9bfaf4",
"sha256": "f01b759654ef591e57f31bfbdbd3b8becd5013380df05c4912a86320938309f4"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-manylinux_2_34_x86_64.whl",
"has_sig": false,
"md5_digest": "d93207bd850e5f1f86c22805eb9bfaf4",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 1518182,
"upload_time": "2025-02-07T01:54:24",
"upload_time_iso_8601": "2025-02-07T01:54:24.464581Z",
"url": "https://files.pythonhosted.org/packages/74/f8/7928d15d41abd095bf8c3e52908820de9cfa4462d083c2e6de7e3b63b3fb/imgdd-0.1.5-cp39-abi3-manylinux_2_34_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "63a025c1aa71d50f6a2092ba0b8d71362ab701c12b3865cf96c07c4f7fdde559",
"md5": "8b0104af189129280ca46c1374adacbc",
"sha256": "cfda0278b0811ded5f74d82b5035858901110b8795b247bb27c31ab9a96e7226"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-musllinux_1_2_aarch64.whl",
"has_sig": false,
"md5_digest": "8b0104af189129280ca46c1374adacbc",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 1716379,
"upload_time": "2025-02-07T01:54:26",
"upload_time_iso_8601": "2025-02-07T01:54:26.890505Z",
"url": "https://files.pythonhosted.org/packages/63/a0/25c1aa71d50f6a2092ba0b8d71362ab701c12b3865cf96c07c4f7fdde559/imgdd-0.1.5-cp39-abi3-musllinux_1_2_aarch64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "1ed45ce11edecfef3682117f798a5634dd971a87c5933b50edb2ef5db86f11bb",
"md5": "81034ee8e52d628d5649ffea9a60aee1",
"sha256": "dbce07add99e54fe24a602f5621dbbc0939bfaa5d629ea4fe63f4bd35f2d0e3d"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-musllinux_1_2_x86_64.whl",
"has_sig": false,
"md5_digest": "81034ee8e52d628d5649ffea9a60aee1",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 2097087,
"upload_time": "2025-02-07T01:54:29",
"upload_time_iso_8601": "2025-02-07T01:54:29.611208Z",
"url": "https://files.pythonhosted.org/packages/1e/d4/5ce11edecfef3682117f798a5634dd971a87c5933b50edb2ef5db86f11bb/imgdd-0.1.5-cp39-abi3-musllinux_1_2_x86_64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "305c69d2f0337d7667e3f620a59ec1f01df445b57b30ab835ac59ed99c414677",
"md5": "1c7eaf0c84f9aaab6de7d371f4716876",
"sha256": "76253e0269d8ab2097bff405d3da7163770fdad62b01ea654e85e2ed921ab3e5"
},
"downloads": -1,
"filename": "imgdd-0.1.5-cp39-abi3-win_amd64.whl",
"has_sig": false,
"md5_digest": "1c7eaf0c84f9aaab6de7d371f4716876",
"packagetype": "bdist_wheel",
"python_version": "cp39",
"requires_python": ">=3.9",
"size": 1231102,
"upload_time": "2025-02-07T01:54:32",
"upload_time_iso_8601": "2025-02-07T01:54:32.677998Z",
"url": "https://files.pythonhosted.org/packages/30/5c/69d2f0337d7667e3f620a59ec1f01df445b57b30ab835ac59ed99c414677/imgdd-0.1.5-cp39-abi3-win_amd64.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "e90ef87ba129df3e19384a9bf7769846337a496ff5627e4dd51ec938d4ff82cc",
"md5": "2b67fd10384c28fbce9739139827fa20",
"sha256": "65d4f6d3edb9f43f2aa1913a9365955803a7b1229aabd0fdb525cf31cb40c215"
},
"downloads": -1,
"filename": "imgdd-0.1.5.tar.gz",
"has_sig": false,
"md5_digest": "2b67fd10384c28fbce9739139827fa20",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.9",
"size": 54941,
"upload_time": "2025-02-07T01:54:34",
"upload_time_iso_8601": "2025-02-07T01:54:34.378070Z",
"url": "https://files.pythonhosted.org/packages/e9/0e/f87ba129df3e19384a9bf7769846337a496ff5627e4dd51ec938d4ff82cc/imgdd-0.1.5.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-02-07 01:54:34",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "aastopher",
"github_project": "imgdd",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"lcname": "imgdd"
}