cs.hashindex


* Name: cs.hashindex
* Version: 20240412
* Upload time: 2024-04-12 05:46:44
* License: GNU General Public License v3 or later (GPLv3+)
* Keywords: python3

A command and utility functions for making listings of file content hashcodes
and manipulating directory trees based on such a hash index.

*Latest release 20240412*:
* file_checksum: skip any nonregular file, only use run_task when checksumming more than 1MiB.
* HashIndexCommand.cmd_rearrange: run the refdir index in relative mode.
* Small fixes.

This largely exists to solve my "what has changed remotely?" or
"what has been filed where?" problems by comparing file trees
using the files' content hashcodes.

This does require reading every file once to compute its hashcode,
but the hashcodes (and file sizes and mtimes when read) are
stored beside the file in `.fstags` files (see the `cs.fstags`
module), so that a file does not need to be reread on subsequent
comparisons.
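
The caching idea can be illustrated with a minimal sketch. It assumes a plain dictionary as the cache rather than the `.fstags` files the module really uses, and the names `cached_sha256` and `cache` are hypothetical, not part of `cs.hashindex`:

```python
import hashlib
import os

def cached_sha256(fspath, cache):
    """Return the SHA-256 hex digest of the file at fspath.

    `cache` maps a path to a (size, mtime, hexdigest) tuple.
    The file is reread only when its current size or mtime
    differs from the cached values.
    """
    st = os.stat(fspath)
    entry = cache.get(fspath)
    if entry is not None and entry[:2] == (st.st_size, st.st_mtime):
        # Size and mtime are unchanged: trust the cached digest.
        return entry[2]
    h = hashlib.sha256()
    with open(fspath, 'rb') as f:
        # Read in 1MiB chunks to bound memory use.
        for chunk in iter(lambda: f.read(1 << 20), b''):
            h.update(chunk)
    cache[fspath] = (st.st_size, st.st_mtime, h.hexdigest())
    return cache[fspath][2]
```

The first call for a path reads the file; later calls return the stored digest until the file's size or mtime changes.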

`hashindex` knows how to invoke itself remotely using `ssh`
(this does require `hashindex` to be installed on the remote host)
and can thus be used to compare a local and remote tree, for example:

    hashindex comm -1 localtree remotehost:remotetree

When you point `hashindex` at a remote tree, it uses `ssh` to
run `hashindex` on the remote host, so all the content hashing
is done locally to the remote host instead of copying files
over the network.
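
Conceptually, the `comm -1` comparison is a set difference on content hashcodes. The following is a rough local-only sketch of that idea, not the module's implementation: `index_tree` and `only_in_first` are hypothetical names, SHA-256 is assumed as the hash, and files are read whole for brevity.

```python
import hashlib
import os
from collections import defaultdict

def index_tree(dirpath):
    """Map each content digest to the relative paths holding that content."""
    paths_by_digest = defaultdict(list)
    for base, _dirnames, filenames in os.walk(dirpath):
        for name in sorted(filenames):
            fspath = os.path.join(base, name)
            with open(fspath, 'rb') as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            paths_by_digest[digest].append(os.path.relpath(fspath, dirpath))
    return paths_by_digest

def only_in_first(index1, index2):
    """Yield (digest, path) for content present in index1 but not in index2."""
    for digest in sorted(index1.keys() - index2.keys()):
        for path in index1[digest]:
            yield digest, path
```

Because the comparison keys on content rather than on names, a file which was merely renamed or moved in the second tree is not reported as missing.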

You can also use it to rearrange a tree based on the locations
of corresponding files in another tree. Consider a media tree
replicated between 2 hosts. If the source tree gets rearranged,
the destination can be equivalently rearranged without copying
the files, for example:

    hashindex rearrange sourcehost:sourcetree localtree

If `fstags mv` was used to do the original rearrangement then
the hashcodes will be copied to the new locations, saving a
rescan of the source file. I keep a shell alias `mv="fstags mv"`
so this is routine for me.

I have a backup script [`histbackup`](https://hg.sr.ht/~cameron-simpson/css/browse/bin/histbackup)
which works by making a hard link tree of the previous backup
and `rsync`ing into it.  It has long been subject to huge
transfers if the source tree gets rearranged. Now it has a
`--hashindex` option to get it to run a `hashindex rearrange`
between the hard linking to the new backup tree and the `rsync`
from the source to the new tree.

If network bandwidth is limited or quotaed, you can use the
comparison function to prepare a list of files missing from the
remote location and copy them to a transfer drive for carrying
to the remote site when opportune. Example:

    hashindex comm -1 -o '{fspath}' src rhost:dst \
    | rsync -a --files-from=- src/ xferdir/

I've got a script [`prep-xfer`](https://hg.sr.ht/~cameron-simpson/css/browse/bin-cs/prep-xfer)
which does this with some conveniences and sanity checks.

## Function `dir_filepaths(dirpath: str, *, fstags: Optional[cs.fstags.FSTags] = <function <lambda> at 0x101bcd510>)`

Generator yielding the filesystem paths of the files in `dirpath`.

## Function `dir_remap(srcdirpath: str, fspaths_by_hashcode: Mapping[cs.hashutils.BaseHashCode, List[str]], *, hashname: str)`

Generator yielding `(srcpath,[remapped_paths])` 2-tuples
based on the hashcodes keying `fspaths_by_hashcode`.

## Function `file_checksum(fspath: str, hashname: str = 'sha256', *, fstags: Optional[cs.fstags.FSTags] = <function <lambda> at 0x101bcd510>) -> Optional[cs.hashutils.BaseHashCode]`

Return the hashcode for the contents of the file at `fspath`.
Warn and return `None` on `OSError`.
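
A minimal sketch of this behaviour using only the standard library follows. The name `checksum_or_none` is hypothetical, the result is a hex string rather than a `BaseHashCode`, and the regular-file test reflects the release notes rather than the actual code:

```python
import hashlib
import logging
import os
import stat

def checksum_or_none(fspath, hashname='sha256'):
    """Return the hex digest of a regular file's contents.

    Return None if the path is not a regular file (symlinks included),
    or if an OSError occurs, in which case a warning is logged.
    """
    try:
        # lstat does not follow symlinks, so a symlink is never "regular".
        if not stat.S_ISREG(os.lstat(fspath).st_mode):
            return None
        h = hashlib.new(hashname)
        with open(fspath, 'rb') as f:
            for chunk in iter(lambda: f.read(1 << 20), b''):
                h.update(chunk)
    except OSError as e:
        logging.warning("cannot checksum %r: %s", fspath, e)
        return None
    return h.hexdigest()
```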

## Function `get_fstags_hashcode(fspath: str, hashname: str, fstags: Optional[cs.fstags.FSTags] = <function <lambda> at 0x101bcd510>) -> Tuple[Optional[cs.hashutils.BaseHashCode], Optional[os.stat_result]]`

Obtain the hashcode cached in the fstags if still valid.
Return a 2-tuple of `(hashcode,stat_result)`
where `hashcode` is a `BaseHashCode` subclass instance if valid
or `None` if missing or no longer valid
and `stat_result` is the current `os.stat` result for `fspath`.

## Function `hashindex(fspath: Union[str, io.TextIOBase, Tuple[Optional[str], str]], *, hashname: str, hashindex_exe: str, ssh_exe: str, relative: bool = False, **kw) -> Iterable[Tuple[Optional[cs.hashutils.BaseHashCode], Optional[str]]]`

Generator yielding `(hashcode,filepath)` 2-tuples
for the files in `fspath`, which may be a file or directory path.
Note that it yields `(None,filepath)` for files which cannot be accessed.

## Class `HashIndexCommand(cs.cmdutils.BaseCommand)`

A tool to generate indices of file content hashcodes
and to link or otherwise rearrange files to destinations based
on their hashcode.

Command line usage:

    Usage: hashindex subcommand...
        Generate or process file content hash listings.
      Subcommands:
        comm {-1|-2|-3} {path1|-} {path2|-}
          Compare the filepaths in path1 and path2 by content.
          -1            List hashes and paths only present in path1.
          -2            List hashes and paths only present in path2.
          -3            List hashes and paths present in path1 and path2.
          -e ssh_exe    Specify the ssh executable.
          -h hashname   Specify the file content hash algorithm name.
          -H hashindex_exe
                        Specify the remote hashindex executable.
          -o output_format Default: '{hashcode} {fspath}'.
          -r            Emit relative paths in the listing.
        help [-l] [subcommand-names...]
          Print help for subcommands.
          This outputs the full help for the named subcommands,
          or the short help for all subcommands if no names are specified.
          -l  Long help even if no subcommand-names provided.
        ls [options...] [[host:]path...]
          Walk filesystem paths and emit a listing.
          The default path is the current directory.
          Options:
          -e ssh_exe    Specify the ssh executable.
          -h hashname   Specify the file content hash algorithm name.
          -H hashindex_exe
                        Specify the remote hashindex executable.
          -o output_format Default: '{hashcode} {fspath}'.
          -r            Emit relative paths in the listing.
                        This requires each path to be a directory.
        rearrange [options...] {[[user@]host:]refdir|-} [[user@]rhost:]targetdir [dstdir]
          Rearrange files from targetdir into dstdir based on their positions in refdir.
          Options:
            -e ssh_exe  Specify the ssh executable.
            -h hashname Specify the file content hash algorithm name.
            -H hashindex_exe
                        Specify the remote hashindex executable.
            --mv        Move mode.
            -n          No action, dry run.
            -o output_format Default: '{hashcode} {fspath}'.
            -s          Symlink mode.
          Other arguments:
            refdir      The reference directory, which may be local or remote
                        or "-" indicating that a hash index will be read from
                        standard input.
            targetdir   The directory containing the files to be rearranged,
                        which may be local or remote.
            dstdir      Optional destination directory for the rearranged files.
                        Default is the targetdir.
                        It is taken to be on the same host as targetdir.
        shell
          Run a command prompt via cmd.Cmd using this command's subcommands.

*`HashIndexCommand.Options`*

*Method `HashIndexCommand.cmd_comm(self, argv, *, runstate: Optional[cs.resources.RunState] = <function <lambda> at 0x101a85d80>)`*:

    Usage: {cmd} {{-1|-2|-3}} {{path1|-}} {{path2|-}}
    Compare the filepaths in path1 and path2 by content.
    -1            List hashes and paths only present in path1.
    -2            List hashes and paths only present in path2.
    -3            List hashes and paths present in path1 and path2.
    -e ssh_exe    Specify the ssh executable.
    -h hashname   Specify the file content hash algorithm name.
    -H hashindex_exe
                  Specify the remote hashindex executable.
    -o output_format Default: {OUTPUT_FORMAT_DEFAULT!r}.
    -r            Emit relative paths in the listing.

*Method `HashIndexCommand.cmd_ls(self, argv, *, runstate: Optional[cs.resources.RunState] = <function <lambda> at 0x101a85d80>)`*:

    Usage: {cmd} [options...] [[host:]path...]
    Walk filesystem paths and emit a listing.
    The default path is the current directory.
    Options:
    -e ssh_exe    Specify the ssh executable.
    -h hashname   Specify the file content hash algorithm name.
    -H hashindex_exe
                  Specify the remote hashindex executable.
    -o output_format Default: {OUTPUT_FORMAT_DEFAULT!r}.
    -r            Emit relative paths in the listing.
                  This requires each path to be a directory.

*Method `HashIndexCommand.cmd_rearrange(self, argv)`*:

    Usage: {cmd} [options...] {{[[user@]host:]refdir|-}} [[user@]rhost:]targetdir [dstdir]
    Rearrange files from targetdir into dstdir based on their positions in refdir.
    Options:
      -e ssh_exe  Specify the ssh executable.
      -h hashname Specify the file content hash algorithm name.
      -H hashindex_exe
                  Specify the remote hashindex executable.
      --mv        Move mode.
      -n          No action, dry run.
      -o output_format Default: {OUTPUT_FORMAT_DEFAULT!r}.
      -s          Symlink mode.
    Other arguments:
      refdir      The reference directory, which may be local or remote
                  or "-" indicating that a hash index will be read from
                  standard input.
      targetdir   The directory containing the files to be rearranged,
                  which may be local or remote.
      dstdir      Optional destination directory for the rearranged files.
                  Default is the targetdir.
                  It is taken to be on the same host as targetdir.

## Function `localpath(fspath: str) -> str`

Return a filesystem path modified so that it cannot be
misinterpreted as a remote path such as `user@host:path`.

If `fspath` contains no colon (`:`) or is an absolute path
or starts with `./` then it is returned unchanged.
Otherwise a leading `./` is prepended.
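
The rule as described is small enough to sketch directly. This is written from the description above rather than copied from the module, and the name `localpath_sketch` is hypothetical:

```python
import os.path

def localpath_sketch(fspath: str) -> str:
    """Return fspath, prefixed with './' if it could be read as host:path."""
    if ':' not in fspath or os.path.isabs(fspath) or fspath.startswith('./'):
        # No colon, or already unambiguous: return unchanged.
        return fspath
    return './' + fspath
```

For example, a local file literally named `host:path` becomes `./host:path`, which cannot be mistaken for a path on a remote host.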

## Function `main(argv=None)`

Commandline implementation.

## Function `merge(srcpath: str, dstpath: str, *, opname=None, hashname: str, move_mode: bool = False, symlink_mode=False, doit=False, quiet=False, fstags: Optional[cs.fstags.FSTags] = <function <lambda> at 0x101bcd510>)`

Merge `srcpath` to `dstpath`.

If `dstpath` does not exist, move/link/symlink `srcpath` to `dstpath`.
Otherwise checksum their contents and raise `FileExistsError` if they differ.
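
The decision logic might be sketched as follows. This is an illustration under assumptions, not the module's code: `merge_sketch` is a hypothetical name, SHA-256 is assumed as the hash, there is no dry run mode or fstags handling, and doing nothing when the destination already holds identical content is my guess at a reasonable behaviour.

```python
import hashlib
import os
import shutil

def _digest(fspath):
    """Return the SHA-256 digest of the file contents."""
    with open(fspath, 'rb') as f:
        return hashlib.sha256(f.read()).digest()

def merge_sketch(srcpath, dstpath, *, move_mode=False, symlink_mode=False):
    """Place srcpath at dstpath; return True if a file was placed.

    Raise FileExistsError if dstpath exists with different content.
    """
    if os.path.exists(dstpath):
        if _digest(srcpath) != _digest(dstpath):
            raise FileExistsError(f"{dstpath!r} exists with different content")
        # Same content already in place: nothing to do (assumption).
        return False
    os.makedirs(os.path.dirname(dstpath) or '.', exist_ok=True)
    if move_mode:
        shutil.move(srcpath, dstpath)
    elif symlink_mode:
        os.symlink(os.path.abspath(srcpath), dstpath)
    else:
        os.link(srcpath, dstpath)
    return True
```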

## Function `paths_remap(srcpaths: Iterable[str], fspaths_by_hashcode: Mapping[cs.hashutils.BaseHashCode, List[str]], *, hashname: str)`

Generator yielding `(srcpath,fspaths)` 2-tuples.

## Function `read_hashindex(f, start=1, *, hashname: str) -> Iterable[Tuple[Optional[cs.hashutils.BaseHashCode], Optional[str]]]`

A generator which reads lines from the file `f`
and yields `(hashcode,fspath)` 2-tuples for each line.
If there are parse errors the `hashcode` or `fspath` may be `None`.
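
As an illustration only, a parser for such a listing could look like the sketch below. It assumes each line is the hashcode text, a single space, then the path, matching the default output format `'{hashcode} {fspath}'`. The name `parse_index_lines` is hypothetical, and the hashcode is kept as text rather than converted to a `BaseHashCode`.

```python
def parse_index_lines(lines):
    """Yield (hashcode_text, fspath) for each line of a hash index listing.

    A field which is missing or empty is yielded as None.
    """
    for line in lines:
        # Split on the first space only: paths may contain spaces.
        hashcode_text, _sep, fspath = line.rstrip('\n').partition(' ')
        yield (hashcode_text or None, fspath or None)
```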

## Function `read_remote_hashindex(rhost: str, rdirpath: str, *, hashname: str, ssh_exe=None, hashindex_exe=None, relative: bool = False, check=True) -> Iterable[Tuple[Optional[cs.hashutils.BaseHashCode], Optional[str]]]`

A generator which reads a hashindex of a remote directory.
This runs: `hashindex ls -h hashname -r rdirpath` on the remote host.
It yields `(hashcode,fspath)` 2-tuples.

Parameters:
* `rhost`: the remote host, or `user@host`
* `rdirpath`: the remote directory path
* `hashname`: the file content hash algorithm name
* `ssh_exe`: the `ssh` executable,
  default `SSH_EXE_DEFAULT`: `'ssh'`
* `hashindex_exe`: the remote `hashindex` executable,
  default `HASHINDEX_EXE_DEFAULT`: `'hashindex'`
* `relative`: optional flag, default `False`;
  if true pass `'-r'` to the remote `hashindex ls` command
* `check`: whether to check that the remote command has a `0` return code,
  default `True`
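
To make the remote invocation concrete, here is a sketch of how such an `ssh` command line could be assembled. The function `remote_ls_argv` is hypothetical, and the exact argument order and quoting used by the module are assumptions; the sketch only builds the argument list and does not run anything.

```python
import shlex

def remote_ls_argv(rhost, rdirpath, *, hashname, ssh_exe='ssh',
                   hashindex_exe='hashindex', relative=False):
    """Return an argv list which would run `hashindex ls` on rhost via ssh."""
    remote = [hashindex_exe, 'ls', '-h', hashname]
    if relative:
        remote.append('-r')
    remote.append(rdirpath)
    # ssh passes its command to the remote shell as a single string,
    # so quote the remote arguments for that shell.
    return [ssh_exe, rhost, shlex.join(remote)]
```

The resulting list could be handed to `subprocess.Popen` with `stdout=subprocess.PIPE` and the output read line by line.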

## Function `rearrange(srcdirpath: str, rfspaths_by_hashcode, dstdirpath=None, *, hashname: str, move_mode: bool = False, symlink_mode=False, doit: bool, quiet: bool = False, fstags: cs.fstags.FSTags, runstate: Optional[cs.resources.RunState] = <function <lambda> at 0x101a85d80>)`

Rearrange the files in `srcdirpath` according to the
hashcode->[relpaths] mapping `rfspaths_by_hashcode`.

Parameters:
* `srcdirpath`: the directory whose files are to be rearranged
* `rfspaths_by_hashcode`: a mapping of hashcode to relative
  pathname to which the original file is to be moved
* `dstdirpath`: optional target directory for the rearranged files;
  defaults to `srcdirpath`, rearranging the files in place
* `hashname`: the file content hash algorithm name
* `move_mode`: move files instead of linking them
* `symlink_mode`: symlink files instead of linking them
* `doit`: if true do the link/move/symlink, otherwise just print
* `quiet`: default `False`; if true do not print
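
The overall procedure can be sketched as below. This is a simplified illustration, not the module's implementation: `rearrange_sketch` is a hypothetical name, digests are SHA-256 hex strings computed afresh rather than taken from fstags, symlink mode is omitted, and existing destination files are simply left alone.

```python
import hashlib
import os

def rearrange_sketch(srcdirpath, rpaths_by_digest, dstdirpath=None, *,
                     move_mode=False, doit=False):
    """Link (or move) files to the relative paths given for their digest.

    Return the list of (srcpath, dstpath) actions, performed only if doit.
    """
    if dstdirpath is None:
        dstdirpath = srcdirpath
    # Collect the source files first so that files created during the
    # rearrangement are not themselves rescanned.
    srcpaths = sorted(
        os.path.join(base, name)
        for base, _dirnames, filenames in os.walk(srcdirpath)
        for name in filenames
    )
    actions = []
    for srcpath in srcpaths:
        with open(srcpath, 'rb') as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        placed = 0
        for rpath in rpaths_by_digest.get(digest, ()):
            dstpath = os.path.join(dstdirpath, rpath)
            if os.path.exists(dstpath):
                continue  # already present, possibly srcpath itself
            actions.append((srcpath, dstpath))
            if doit:
                os.makedirs(os.path.dirname(dstpath), exist_ok=True)
                os.link(srcpath, dstpath)
                placed += 1
        if doit and move_mode and placed:
            # Move mode: the original name goes once the new links exist.
            os.remove(srcpath)
    return actions
```

Calling it with `doit=False` gives a dry run in the spirit of the `-n` option: the planned actions are returned but nothing is changed.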

## Function `run_remote_hashindex(rhost: str, argv, *, ssh_exe=None, hashindex_exe=None, check: bool = True, doit: bool = None, quiet: Optional[bool] = None, options: Optional[cs.cmdutils.BaseCommandOptions] = <function uses_cmd_options.<locals>.<lambda> at 0x101c29bd0>, **subp_options)`

Run a remote `hashindex` command.
Return the `CompletedProcess` result or `None` if `doit` is false.
Note that as with `cs.psutils.run`, the arguments are resolved
via `cs.psutils.prep_argv`.

Parameters:
* `rhost`: the remote host, or `user@host`
* `argv`: the command line arguments to be passed to the
  remote `hashindex` command
* `ssh_exe`: the `ssh` executable,
  default `SSH_EXE_DEFAULT`: `'ssh'`
* `hashindex_exe`: the remote `hashindex` executable,
  default `HASHINDEX_EXE_DEFAULT`: `'hashindex'`
* `check`: whether to check that the remote command has a `0` return code,
  default `True`
* `doit`: whether to actually run the command, default `True`

Other keyword parameters are passed through to `cs.psutils.run`.

## Function `set_fstags_hashcode(fspath: str, hashcode, S: os.stat_result, fstags: Optional[cs.fstags.FSTags] = <function <lambda> at 0x101bcd510>)`

Record `hashcode` against `fspath`.

# Release Log



*Release 20240412*:
* file_checksum: skip any nonregular file, only use run_task when checksumming more than 1MiB.
* HashIndexCommand.cmd_rearrange: run the refdir index in relative mode.
* Small fixes.

*Release 20240317*:
* HashIndexCommand.cmd_ls: default to listing the current directory.
* HashIndexCommand: new -o output_format to allow outputting only hashcodes or fspaths.
* HashIndexCommand.cmd_comm: new -r (relative) option.

*Release 20240316*:
Fixed release upload artifacts.

*Release 20240305*:
* HashIndexCommand.cmd_ls: support rhost:rpath paths, honour interrupts in the remote mode.
* HashIndexCommand.cmd_rearrange: new optional dstdir command line argument, passed to rearrange.
* merge: symlink_mode: leave identical symlinks alone, just merge tags.
* rearrange: new optional dstdirpath parameter, default srcdirpath.

*Release 20240216*:
* HashIndexCommand.cmdlinkto,cmd_rearrange: run the link/mv stuff with sys.stdout in line buffered mode.
* Do not get hashcodes from symlinks.
* HashIndexCommand.cmd_ls: ignore None hashcodes, do not set xit=1.
* New run_remote_hashindex() and read_remote_hashindex() functions.
* dir_filepaths: skip dot files, the fstags .fstags file and nonregular files.

*Release 20240211.1*:
Better module docstring.

*Release 20240211*:
Initial PyPI release: "hashindex" command and utility functions for listing file hashcodes and rearranging trees based on a hash index.


            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "cs.hashindex",
    "maintainer": null,
    "docs_url": null,
    "requires_python": null,
    "maintainer_email": null,
    "keywords": "python3",
    "author": null,
    "author_email": "Cameron Simpson <cs@cskk.id.au>",
    "download_url": "https://files.pythonhosted.org/packages/c5/f1/ac0e2d60c387f05c03cfb8918535f72d1f5923bd08b3f3c68d0e170d42f1/cs.hashindex-20240412.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GNU General Public License v3 or later (GPLv3+)",
    "summary": "A command and utility functions for making listings of file content hashcodes and manipulating directory trees based on such a hash index.",
    "version": "20240412",
    "project_urls": {
        "URL": "https://bitbucket.org/cameron_simpson/css/commits/all"
    },
    "split_keywords": [
        "python3"
    ],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "fff1d7933a17717c80f844ee160ca529467525726ace463147b293f811df069d",
                "md5": "aab6775d211abe8e48d26dc2833509f3",
                "sha256": "552a47f73639eb6a88bfa7a57673544752a5ad54e87bc23fa5303135dba5eaf2"
            },
            "downloads": -1,
            "filename": "cs.hashindex-20240412-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "aab6775d211abe8e48d26dc2833509f3",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": null,
            "size": 14983,
            "upload_time": "2024-04-12T05:46:43",
            "upload_time_iso_8601": "2024-04-12T05:46:43.218973Z",
            "url": "https://files.pythonhosted.org/packages/ff/f1/d7933a17717c80f844ee160ca529467525726ace463147b293f811df069d/cs.hashindex-20240412-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "c5f1ac0e2d60c387f05c03cfb8918535f72d1f5923bd08b3f3c68d0e170d42f1",
                "md5": "ec798ef74557a57ef59a1a25ec0fb42e",
                "sha256": "38c5db075c97d2e27f87aa83c9c7e0ef47be5d258c56675333b9dbec36aacb8d"
            },
            "downloads": -1,
            "filename": "cs.hashindex-20240412.tar.gz",
            "has_sig": false,
            "md5_digest": "ec798ef74557a57ef59a1a25ec0fb42e",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": null,
            "size": 16794,
            "upload_time": "2024-04-12T05:46:44",
            "upload_time_iso_8601": "2024-04-12T05:46:44.722458Z",
            "url": "https://files.pythonhosted.org/packages/c5/f1/ac0e2d60c387f05c03cfb8918535f72d1f5923bd08b3f3c68d0e170d42f1/cs.hashindex-20240412.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2024-04-12 05:46:44",
    "github": false,
    "gitlab": false,
    "bitbucket": true,
    "codeberg": false,
    "bitbucket_user": "cameron_simpson",
    "bitbucket_project": "css",
    "lcname": "cs.hashindex"
}
        