imgdd


| Field | Value |
| --- | --- |
| Name | imgdd |
| Version | 0.1.5 |
| Home page | https://github.com/aastopher/imgdd |
| Summary | Performance-first perceptual hashing library; perfect for handling large datasets. Designed to quickly process nested folder structures, commonly found in image datasets |
| Upload time | 2025-02-07 01:54:34 |
| Maintainer | None |
| Docs URL | None |
| Author | Aaron Stopher |
| Requires Python | >=3.9 |
| License | GPL-3.0-or-later |
| Keywords | rust, imagehash, hash, perceptual hash, difference hash, deduplication, image deduplication |
| Requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| Coveralls test coverage | No coveralls. |

[![Documentation Status](https://img.shields.io/badge/docs-online-brightgreen)](https://aastopher.github.io/imgdd/)
[![imgdd pypi](https://img.shields.io/pypi/v/imgdd?label=imgdd%20pypi)](https://pypi.org/project/imgdd)
[![imgdd crate](https://img.shields.io/crates/v/imgdd?label=imgdd)](https://crates.io/crates/imgdd)
[![imgddcore crate](https://img.shields.io/crates/v/imgddcore?label=imgddcore)](https://crates.io/crates/imgddcore)
[![codecov](https://codecov.io/gh/aastopher/imgdd/graph/badge.svg?token=XZ1O2X04SO)](https://codecov.io/gh/aastopher/imgdd)
[![DeepSource](https://app.deepsource.com/gh/aastopher/imgdd.svg/?label=active+issues&show_trend=true&token=IiuhCO6n1pK-GAJ800k6Z_9t)](https://app.deepsource.com/gh/aastopher/imgdd/)

# imgdd: Image DeDuplication

`imgdd` is a performance-first perceptual hashing library that combines Rust's speed with Python's accessibility, making it perfect for handling large datasets. It is designed to quickly process nested folder structures, which are common in image datasets.

## Features
- **Multiple Hashing Algorithms**: Supports `aHash`, `dHash`, `mHash`, `pHash`, `wHash`.
- **Multiple Filter Types**: Supports `Nearest`, `Triangle`, `CatmullRom`, `Gaussian`, `Lanczos3`.
- **Identify Duplicates**: Quickly identify duplicate hash pairs.
- **Simplicity**: Simple interface, robust performance.

## Why imgdd?

`imgdd` is inspired by [imagehash](https://github.com/JohannesBuchner/imagehash) and aims to be a lightning-fast replacement with additional features. To verify the performance gains, `imgdd` has been benchmarked against `imagehash`. In Python, **imgdd consistently outperforms imagehash by ~60%–95%**, a significant reduction in hashing time per image.

---

# Quick Start

## Installation

```bash
pip install imgdd
```

## Usage Examples

### Hash Images

```python
import imgdd as dd

results = dd.hash(
    path="path/to/images",
    algo="dhash",  # Optional: default = dhash
    filter="triangle",  # Optional: default = triangle
    sort=False # Optional: default = False
)
print(results)
```
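
To make "nested folder structures" concrete, the following is a minimal standard-library sketch of a recursive walk that collects image files under a root directory. It is not how `imgdd` traverses folders internally; the helper name `find_images` and the extension set are assumptions chosen for illustration.

```python
from pathlib import Path

# Hypothetical extension set; imgdd may accept a different list of formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".webp"}


def find_images(root):
    """Return a sorted list of image paths found anywhere under root."""
    root = Path(root)
    return sorted(
        p for p in root.rglob("*")  # rglob descends into every subfolder
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

A call such as `find_images("path/to/images")` would list files in the root folder and in every subfolder below it, which is the kind of layout the `path` argument is presumably meant to handle.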

### Find Duplicates

```python
import imgdd as dd

duplicates = dd.dupes(
    path="path/to/images",
    algo="dhash", # Optional: default = dhash
    filter="triangle", # Optional: default = triangle
    remove=False # Optional: default = False
)
print(duplicates)
```
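
Conceptually, finding exact duplicates amounts to grouping files that share a hash. The sketch below shows that idea in plain Python. The `(hash, path)` pair shape and the helper name `group_duplicates` are assumptions for illustration, not the documented return type of `dd.hash` or the internals of `dd.dupes`.

```python
from collections import defaultdict


def group_duplicates(hashed):
    """Group file paths by hash and keep only hashes seen more than once.

    `hashed` is an iterable of (hash_value, path) pairs. This input shape is
    an assumption for illustration, not imgdd's documented return type.
    """
    groups = defaultdict(list)
    for hash_value, path in hashed:
        groups[hash_value].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


# Made-up sample data: two files share a hash, one is unique.
sample = [
    ("f0e1", "cats/a.png"),
    ("f0e1", "cats/copies/a_copy.png"),
    ("0b2c", "dogs/b.png"),
]
print(group_duplicates(sample))
# → {'f0e1': ['cats/a.png', 'cats/copies/a_copy.png']}
```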

## Supported Algorithms
- **aHash**: Average Hash
- **mHash**: Median Hash
- **dHash**: Difference Hash
- **pHash**: Perceptual Hash
- **wHash**: Wavelet Hash
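
As a rough illustration of what a difference hash computes, here is a pure-Python sketch that works on a tiny grayscale grid instead of a real image. It is not `imgdd`'s implementation: the bit convention (a bit is 1 when a pixel is brighter than its left-hand neighbour), the bit order, and the helper names are assumptions, and a real dHash first resizes the image to a small fixed grid.

```python
def dhash_bits(gray):
    """Difference hash of a grayscale grid (list of rows of ints).

    Each bit records whether a pixel is brighter than its left-hand
    neighbour, so a grid with H rows and W+1 columns yields H*W bits.
    """
    value = 0
    for row in gray:
        for left, right in zip(row, row[1:]):
            value = (value << 1) | (1 if right > left else 0)
    return value


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


original = [
    [10, 20, 30],
    [90, 50, 70],
]
# A uniform brightness shift keeps every left/right comparison unchanged.
brighter = [[p + 5 for p in row] for row in original]

print(format(dhash_bits(original), "04b"))  # → 1101
print(hamming(dhash_bits(original), dhash_bits(brighter)))  # → 0
```

Because only the relative brightness of neighbouring pixels matters, the shifted grid produces the same hash, which is the property that makes this family of hashes useful for spotting visually identical images.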

## Supported Filters
- `Nearest`, `Triangle`, `CatmullRom`, `Gaussian`, `Lanczos3`
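
The filter names suggest resampling filters, presumably applied when an image is shrunk before hashing (an assumption; the README does not spell this out). The sketch below contrasts a nearest-neighbour pick with a triangle (linear) kernel on a single row of pixels. It is a simplified 1-D illustration with hypothetical helper names, not the code `imgdd` runs.

```python
def resize_nearest(samples, out_len):
    """Downsample by picking the source sample nearest each output centre."""
    ratio = len(samples) / out_len
    last = len(samples) - 1
    return [samples[min(last, int((i + 0.5) * ratio))] for i in range(out_len)]


def resize_triangle(samples, out_len):
    """Resample with a triangle (linear) kernel, widened when shrinking."""
    ratio = len(samples) / out_len
    scale = max(ratio, 1.0)  # widen the kernel only when downsampling
    out = []
    for i in range(out_len):
        centre = (i + 0.5) * ratio
        # Weight falls off linearly with distance from the output centre.
        weights = [
            max(0.0, 1.0 - abs((j + 0.5 - centre) / scale))
            for j in range(len(samples))
        ]
        total = sum(weights)
        out.append(sum(w * s for w, s in zip(weights, samples)) / total)
    return out


row = [0, 0, 255, 0, 0, 0, 255, 0]  # two bright spikes on a dark row
print(resize_nearest(row, 4))   # spikes fall between the picked samples
print(resize_triangle(row, 4))  # spikes are blended into their neighbours
```

In this example the nearest-neighbour result loses both spikes entirely, while the triangle kernel keeps a trace of them. That trade-off between speed and fidelity is the usual reason a library exposes a choice of filter.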

## Contributing
Contributions are always welcome! 🚀

Found a bug or have a question? Open a GitHub issue. Pull requests for new features or fixes are encouraged!

## Similar projects
- https://github.com/JohannesBuchner/imagehash
- https://github.com/commonsmachinery/blockhash-python
- https://github.com/acoomans/instagram-filters
- https://pippy360.github.io/transformationInvariantImageSearch/
- https://www.phash.org/
- https://pypi.org/project/dhash/
- https://github.com/thorn-oss/perception (based on imagehash code, depends on opencv)
- https://docs.opencv.org/3.4/d4/d93/group__img__hash.html


            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/aastopher/imgdd",
    "name": "imgdd",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.9",
    "maintainer_email": null,
    "keywords": "rust, imagehash, hash, perceptual hash, difference hash, deduplication, image deduplication",
    "author": "Aaron Stopher",
    "author_email": "Aaron Stopher <aaron.stopher@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/e9/0e/f87ba129df3e19384a9bf7769846337a496ff5627e4dd51ec938d4ff82cc/imgdd-0.1.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GPL-3.0-or-later",
    "summary": "Performance-first perceptual hashing library; perfect for handling large datasets. Designed to quickly process nested folder structures, commonly found in image datasets",
    "version": "0.1.5",
    "project_urls": {
        "Homepage": "https://github.com/aastopher/imgdd",
        "documentation": "https://github.com/aastopher/imgdd",
        "issues": "https://github.com/aastopher/imgdd/issues",
        "source": "https://github.com/aastopher/imgdd"
    },
    "split_keywords": [
        "rust",
        " imagehash",
        " hash",
        " perceptual hash",
        " difference hash",
        " deduplication",
        " image deduplication"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9d6964054b7f86c47fb1184cbbfbeedfe076796d9f573ff1305a93b5a21b2279",
                "md5": "c07c025a7939d3be20c6213dd857ed8f",
                "sha256": "465c6d7e4c3e49acd86ff760c9f29a895e9d9d2c4e41a36a4345e9de3db66693"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-macosx_10_12_x86_64.whl",
            "has_sig": false,
            "md5_digest": "c07c025a7939d3be20c6213dd857ed8f",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 1420376,
            "upload_time": "2025-02-07T01:54:16",
            "upload_time_iso_8601": "2025-02-07T01:54:16.776994Z",
            "url": "https://files.pythonhosted.org/packages/9d/69/64054b7f86c47fb1184cbbfbeedfe076796d9f573ff1305a93b5a21b2279/imgdd-0.1.5-cp39-abi3-macosx_10_12_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "3bd87d414bd4dbaf7c8080ffe283e334bcab3218544d4e3e81fcfe9c5555cb36",
                "md5": "b67bfd54e18f6a607c530253cca92670",
                "sha256": "1fd6dcf9dc31a8000cd109495dc46377c8b4944c7ff5a0074942c82985a65190"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-macosx_11_0_arm64.whl",
            "has_sig": false,
            "md5_digest": "b67bfd54e18f6a607c530253cca92670",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 1250902,
            "upload_time": "2025-02-07T01:54:19",
            "upload_time_iso_8601": "2025-02-07T01:54:19.431132Z",
            "url": "https://files.pythonhosted.org/packages/3b/d8/7d414bd4dbaf7c8080ffe283e334bcab3218544d4e3e81fcfe9c5555cb36/imgdd-0.1.5-cp39-abi3-macosx_11_0_arm64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "21271a70ec8eb6d93171bd8b41d80bdf01eded5a1ccf2d8b268c556fa3bfdae5",
                "md5": "140dc4aa138bdb2d256fccc1e22d5dc7",
                "sha256": "a762282ad692b8c8920a18cdc2e84814a35c22465cecb0c3e44db50149a84865"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-manylinux_2_34_aarch64.whl",
            "has_sig": false,
            "md5_digest": "140dc4aa138bdb2d256fccc1e22d5dc7",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 1382843,
            "upload_time": "2025-02-07T01:54:22",
            "upload_time_iso_8601": "2025-02-07T01:54:22.229695Z",
            "url": "https://files.pythonhosted.org/packages/21/27/1a70ec8eb6d93171bd8b41d80bdf01eded5a1ccf2d8b268c556fa3bfdae5/imgdd-0.1.5-cp39-abi3-manylinux_2_34_aarch64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "74f87928d15d41abd095bf8c3e52908820de9cfa4462d083c2e6de7e3b63b3fb",
                "md5": "d93207bd850e5f1f86c22805eb9bfaf4",
                "sha256": "f01b759654ef591e57f31bfbdbd3b8becd5013380df05c4912a86320938309f4"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-manylinux_2_34_x86_64.whl",
            "has_sig": false,
            "md5_digest": "d93207bd850e5f1f86c22805eb9bfaf4",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 1518182,
            "upload_time": "2025-02-07T01:54:24",
            "upload_time_iso_8601": "2025-02-07T01:54:24.464581Z",
            "url": "https://files.pythonhosted.org/packages/74/f8/7928d15d41abd095bf8c3e52908820de9cfa4462d083c2e6de7e3b63b3fb/imgdd-0.1.5-cp39-abi3-manylinux_2_34_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "63a025c1aa71d50f6a2092ba0b8d71362ab701c12b3865cf96c07c4f7fdde559",
                "md5": "8b0104af189129280ca46c1374adacbc",
                "sha256": "cfda0278b0811ded5f74d82b5035858901110b8795b247bb27c31ab9a96e7226"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-musllinux_1_2_aarch64.whl",
            "has_sig": false,
            "md5_digest": "8b0104af189129280ca46c1374adacbc",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 1716379,
            "upload_time": "2025-02-07T01:54:26",
            "upload_time_iso_8601": "2025-02-07T01:54:26.890505Z",
            "url": "https://files.pythonhosted.org/packages/63/a0/25c1aa71d50f6a2092ba0b8d71362ab701c12b3865cf96c07c4f7fdde559/imgdd-0.1.5-cp39-abi3-musllinux_1_2_aarch64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "1ed45ce11edecfef3682117f798a5634dd971a87c5933b50edb2ef5db86f11bb",
                "md5": "81034ee8e52d628d5649ffea9a60aee1",
                "sha256": "dbce07add99e54fe24a602f5621dbbc0939bfaa5d629ea4fe63f4bd35f2d0e3d"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-musllinux_1_2_x86_64.whl",
            "has_sig": false,
            "md5_digest": "81034ee8e52d628d5649ffea9a60aee1",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 2097087,
            "upload_time": "2025-02-07T01:54:29",
            "upload_time_iso_8601": "2025-02-07T01:54:29.611208Z",
            "url": "https://files.pythonhosted.org/packages/1e/d4/5ce11edecfef3682117f798a5634dd971a87c5933b50edb2ef5db86f11bb/imgdd-0.1.5-cp39-abi3-musllinux_1_2_x86_64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "305c69d2f0337d7667e3f620a59ec1f01df445b57b30ab835ac59ed99c414677",
                "md5": "1c7eaf0c84f9aaab6de7d371f4716876",
                "sha256": "76253e0269d8ab2097bff405d3da7163770fdad62b01ea654e85e2ed921ab3e5"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5-cp39-abi3-win_amd64.whl",
            "has_sig": false,
            "md5_digest": "1c7eaf0c84f9aaab6de7d371f4716876",
            "packagetype": "bdist_wheel",
            "python_version": "cp39",
            "requires_python": ">=3.9",
            "size": 1231102,
            "upload_time": "2025-02-07T01:54:32",
            "upload_time_iso_8601": "2025-02-07T01:54:32.677998Z",
            "url": "https://files.pythonhosted.org/packages/30/5c/69d2f0337d7667e3f620a59ec1f01df445b57b30ab835ac59ed99c414677/imgdd-0.1.5-cp39-abi3-win_amd64.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "e90ef87ba129df3e19384a9bf7769846337a496ff5627e4dd51ec938d4ff82cc",
                "md5": "2b67fd10384c28fbce9739139827fa20",
                "sha256": "65d4f6d3edb9f43f2aa1913a9365955803a7b1229aabd0fdb525cf31cb40c215"
            },
            "downloads": -1,
            "filename": "imgdd-0.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "2b67fd10384c28fbce9739139827fa20",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.9",
            "size": 54941,
            "upload_time": "2025-02-07T01:54:34",
            "upload_time_iso_8601": "2025-02-07T01:54:34.378070Z",
            "url": "https://files.pythonhosted.org/packages/e9/0e/f87ba129df3e19384a9bf7769846337a496ff5627e4dd51ec938d4ff82cc/imgdd-0.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-02-07 01:54:34",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "aastopher",
    "github_project": "imgdd",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "imgdd"
}
        