link-duplicates


| Field | Value |
| --- | --- |
| Name | link-duplicates |
| Version | 1.1.0 |
| Summary | Identify duplicate files and optionally create hardlinks to save storage |
| Upload time | 2023-12-15 15:46:21 |
| Author | Mike Foster |
| Requires Python | >=3.12 |
| License | EUPL 1.2 |
| Keywords | duplicate files hardlink windows linux mac backup |
| Requirements | No requirements were recorded. |

# Duplicates

![PyPI - Python Version](https://img.shields.io/pypi/pyversions/link-duplicates)
![PyPI - Version](https://img.shields.io/pypi/v/link-duplicates)
![Tests](https://github.com/MusicalNinjaRandInt/duplicates/actions/workflows/CI.yaml/badge.svg?branch=main)
[![codecov](https://codecov.io/gh/MusicalNinjaRandInt/duplicates/graph/badge.svg?token=WGZ7PR5IXC)](https://codecov.io/gh/MusicalNinjaRandInt/duplicates)

Identify duplicate files and replace them with hardlinks on any OS.

Intended to reduce the storage space taken up by multiple copies of similar backups (e.g. regular Google Takeout exports).

## Usage

Run `dupes` from the command line on Linux, macOS or Windows. It recursively scans one or more directories, identifies duplicate files and optionally replaces them with hardlinks.
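To make the idea of "identifying duplicates" concrete, here is a minimal sketch of one common approach: group files by size, then hash only the files that share a size. This is an illustration using the standard library, not the package's actual implementation; the names `file_digest` and `find_duplicates` are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(*roots: Path) -> list[set[Path]]:
    """Group regular files under the given roots by size and content hash."""
    # Pass 1: group by size; files of different sizes cannot be identical.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for root in roots:
        for path in root.rglob("*"):
            if path.is_file() and not path.is_symlink():
                by_size[path.stat().st_size].append(path)

    # Pass 2: hash only the files that share a size with at least one other file.
    by_content: dict[tuple[int, str], set[Path]] = defaultdict(set)
    for size, paths in by_size.items():
        if len(paths) > 1:
            for path in paths:
                by_content[(size, file_digest(path))].add(path)

    # Keep only groups with more than one member.
    return [group for group in by_content.values() if len(group) > 1]
```

Note that this sketch also reports files that are already hardlinked to each other, since they trivially have the same size and content.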

**WARNING:** Hardlinked files share their contents: if you change any one "copy", all "copies" change.

**WARNING:** If other hardlinks are present _outside_ the directories scanned, these may no longer point to the same inode as those within the scanned directories. Treat the outcome in this case as _undefined_.
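The first warning can be seen with a few lines of standard-library Python. This sketch does not use this package at all; it only shows what a hard link is: two names for the same inode, so an in-place write through one name is visible through the other.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "original.txt"
    linked = Path(tmp) / "linked.txt"

    original.write_text("first version")
    os.link(original, linked)  # create a second name for the same inode

    # Both names report the same inode, and the link count is now 2.
    same_inode = original.stat().st_ino == linked.stat().st_ino
    link_count = original.stat().st_nlink

    # An in-place write through one name is visible through the other.
    linked.write_text("second version")
    content_via_original = original.read_text()

print(same_inode, link_count, content_via_original)
```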

### Command line

`dupes PATH` displays the number of duplicate files found under `PATH`.

`dupes PATH1 PATH2 ...` displays the number of duplicate files found under _and across_ `PATH1` and `PATH2`.

`dupes --list PATHS...` lists the full sets of duplicate files found.

`dupes --short PATHS...` lists only those sets of duplicates in which the file names differ.

and finally ...

`dupes --link PATHS...` replaces duplicate files with hard links.
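For readers wondering what replacing a duplicate with a hard link involves, the following is a sketch of one way to do it with the standard library. It is an assumption for illustration, not necessarily how `dupes --link` works; the function name `replace_with_hardlink` and the `.linktmp` suffix are hypothetical.

```python
import os
from pathlib import Path


def replace_with_hardlink(keep: Path, duplicate: Path) -> None:
    """Make `duplicate` a hard link to `keep`, never leaving `duplicate` missing."""
    # Already the same inode: nothing to do (and a rename would be a no-op).
    if os.path.samefile(keep, duplicate):
        return

    # Hypothetical temporary name next to the duplicate, on the same filesystem.
    temporary = duplicate.with_name(duplicate.name + ".linktmp")

    os.link(keep, temporary)          # add a new name for keep's inode
    os.replace(temporary, duplicate)  # move that name over the duplicate
```

Linking to a temporary name first and then renaming means there is no moment at which `duplicate` does not exist. Any other name that was hardlinked to the old `duplicate` inode keeps pointing to that old inode, which is the situation the second warning above describes.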

### Python

You can also use the `DuplicateFiles` class to identify and optionally link duplicates.

Additionally, `BufferedIOFile` provides a binary file that knows its `Path` and offers a `readchunk()` method, similar to `readline()` on text files.
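The signatures of `DuplicateFiles` and `BufferedIOFile` are not documented here, so the sketch below does not call them. Instead it illustrates the general idea behind chunk-wise reading of binary files with the standard library only; `read_chunks` and `same_content` are hypothetical helper names, not part of this package.

```python
from collections.abc import Iterator
from pathlib import Path


def read_chunks(path: Path, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the file's bytes in pieces of at most `chunk_size` bytes."""
    with path.open("rb") as handle:
        while chunk := handle.read(chunk_size):
            yield chunk


def same_content(first: Path, second: Path) -> bool:
    """Compare two files chunk by chunk, stopping at the first difference."""
    # Different sizes can never match, so skip reading entirely.
    if first.stat().st_size != second.stat().st_size:
        return False
    return all(a == b for a, b in zip(read_chunks(first), read_chunks(second)))
```

Reading in chunks keeps memory use bounded for large backup files and lets a comparison stop early as soon as two files diverge.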

## Up Next

- [Keep original file mode after hardlinking](https://github.com/MusicalNinjaRandInt/duplicates/issues/13)
- [Select leading inode for linking](https://github.com/MusicalNinjaRandInt/duplicates/issues/14)
- [Improved exception handling from the command line](https://github.com/MusicalNinjaRandInt/duplicates/issues/15)

Please vote on any issues which are important to you.

![PyPI - Downloads](https://img.shields.io/pypi/dm/link-duplicates)

            

Raw data

{
    "_id": null,
    "home_page": "",
    "name": "link-duplicates",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.12",
    "maintainer_email": "",
    "keywords": "duplicate files hardlink windows linux mac backup",
    "author": "Mike Foster",
    "author_email": "",
    "download_url": "https://files.pythonhosted.org/packages/e4/93/405475e355e3cdac72bef303e1e98817353ae752cc436761e7b7dfa7cbd5/link-duplicates-1.1.0.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "EUPL 1.2",
    "summary": "Identify duplicate files and optionally create hardlinks to save storage",
    "version": "1.1.0",
    "project_urls": {
        "Source": "https://github.com/MusicalNinjaRandInt/duplicates"
    },
    "split_keywords": [
        "duplicate",
        "files",
        "hardlink",
        "windows",
        "linux",
        "mac",
        "backup"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "dbb01b8bfe932f75feb03100972c50da93d23849424320a91c0f8f90252ed7df",
                "md5": "2a09c3874f373b688ccb28af200ad7a2",
                "sha256": "c90abad337e1d1d5925e73ba4d756a500e9887d64b10b23cf72dd8d8acbbdb85"
            },
            "downloads": -1,
            "filename": "link_duplicates-1.1.0-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "2a09c3874f373b688ccb28af200ad7a2",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.12",
            "size": 7620,
            "upload_time": "2023-12-15T15:46:20",
            "upload_time_iso_8601": "2023-12-15T15:46:20.179342Z",
            "url": "https://files.pythonhosted.org/packages/db/b0/1b8bfe932f75feb03100972c50da93d23849424320a91c0f8f90252ed7df/link_duplicates-1.1.0-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "e493405475e355e3cdac72bef303e1e98817353ae752cc436761e7b7dfa7cbd5",
                "md5": "95b3464d40aee283a31587762007adee",
                "sha256": "9a0f9528b91dbe69e968d20f7fc6a059e311c54bbab61e50bc24a3af9ee9f7db"
            },
            "downloads": -1,
            "filename": "link-duplicates-1.1.0.tar.gz",
            "has_sig": false,
            "md5_digest": "95b3464d40aee283a31587762007adee",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.12",
            "size": 7215,
            "upload_time": "2023-12-15T15:46:21",
            "upload_time_iso_8601": "2023-12-15T15:46:21.588188Z",
            "url": "https://files.pythonhosted.org/packages/e4/93/405475e355e3cdac72bef303e1e98817353ae752cc436761e7b7dfa7cbd5/link-duplicates-1.1.0.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-12-15 15:46:21",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "MusicalNinjaRandInt",
    "github_project": "duplicates",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "requirements": [],
    "lcname": "link-duplicates"
}
        