# Alfeios
### Enrich your command-line shell with Herculean cleaning capabilities
___
![full](doc/augias.jpg)
As his fifth Labour, Heracles was charged with cleaning the [Augean stables
](https://en.wikipedia.org/wiki/Labours_of_Hercules#Fifth:_Augean_stables).
The beautiful stables had not been cleaned for thirty years and were
buried in filth.
Instead of turning to the mop and bucket,
Heracles used a radically innovative tool:
the waters of the [Alfeios river](https://en.wikipedia.org/wiki/Alfeios),
with which he managed to wash everything in just one day.

Now compare this with the data on your hard drives.
Backups have been made, files have been renamed, directories have been moved...
Slowly but surely, things have diverged,
up to the point where you no longer feel safe deleting anything.
Things then got worse as you started accumulating
duplicates, giving up all hope of controlling your data.
As a result, cleaning your hard drives now looks like the fifth Labour
of Heracles: humanly impossible.

Alfeios is an innovative tool that makes this overwhelming task feasible.
It recursively indexes the content of your hard drives, going inside zip, tar,
gztar, bztar and xztar compressed files.
Its index is content-based, meaning that two files with different names and
different dates are identified as duplicates if they share the same content.
This tells you which files can safely be removed,
freeing space and tidying the data on your hard drives.
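
To make the idea of a content-based index concrete, here is a minimal sketch of how files can be grouped by a hash of their bytes. It is not Alfeios's actual implementation: the function names `file_digest` and `group_by_content` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import pathlib
from collections import defaultdict


def file_digest(path: pathlib.Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file's bytes; its name and dates play no part."""
    hasher = hashlib.sha256()  # assumption: any stable hash would do
    with path.open('rb') as stream:
        # Read in chunks so large files do not have to fit in memory.
        while chunk := stream.read(chunk_size):
            hasher.update(chunk)
    return hasher.hexdigest()


def group_by_content(root: pathlib.Path) -> dict[str, list[pathlib.Path]]:
    """Map each digest to the files under root that share it (2 or more)."""
    groups = defaultdict(list)
    for path in sorted(root.rglob('*')):
        if path.is_file():
            groups[file_digest(path)].append(path)
    return {digest: paths for digest, paths in groups.items()
            if len(paths) > 1}
```

Calling `group_by_content(pathlib.Path('D:/Pictures'))` would return one entry per set of identical files, whatever they are called and wherever they sit under the root.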
## Install
```
pip install alfeios
```
## Run
Alfeios is a program that runs from a
[command-line interface](https://en.wikipedia.org/wiki/Command-line_interface)
in a shell.

Once it is installed, three commands are available in your shell on any
operating system, thanks to the magic of [Python
entry points](https://amir.rachum.com/blog/2017/07/28/python-entry-points):
one low-level command, `alfeios index`, and two high-level
commands, `alfeios duplicate` and `alfeios missing`.
### `alfeios index`
Index the content of a root directory:

- Index all file and directory contents in a root directory,
  including the inside of zip, tar, gztar, bztar and xztar compressed files
- Identify contents by their hash code, type (file or directory) and
  size
- Save two files tagged with the current time in a .alfeios folder
  in the root directory:
  - A tree.json file that is a dictionary: path -> content
  - A forbidden.json file that lists paths with no access

Example:
```
alfeios index
alfeios idx D:/Pictures
alfeios i
```
`alfeios idx` and `alfeios i` can be used as aliases for `alfeios index`.

If no positional argument is passed, the root directory
defaults to the current working directory.
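
The shape of such an index can be sketched as follows. This is only an illustration under stated assumptions, not the code Alfeios runs: the function `index_tree`, the `'FILE'` and `'DIR'` labels, the use of SHA-256 and the way a directory hash is derived from its children are all hypothetical. Archives and inaccessible paths (the forbidden.json part) are left out.

```python
import hashlib
import json
import pathlib

FILE, DIR = 'FILE', 'DIR'  # hypothetical type labels


def index_tree(root: pathlib.Path) -> dict[str, tuple[str, str, int]]:
    """Return {relative path: (hash, type, size)} for everything under root."""
    tree = {}

    def visit(path: pathlib.Path) -> tuple[str, str, int]:
        if path.is_dir():
            children = [visit(child) for child in sorted(path.iterdir())]
            # Assumption: a directory hash depends only on what its children
            # contain, so a renamed copy of a directory still matches.
            hasher = hashlib.sha256()
            for token in sorted(c[0] + c[1] for c in children):
                hasher.update(token.encode())
            content = (hasher.hexdigest(), DIR, sum(c[2] for c in children))
        else:
            data = path.read_bytes()  # fine for a sketch; stream large files
            content = (hashlib.sha256(data).hexdigest(), FILE, len(data))
        tree[path.relative_to(root).as_posix()] = content
        return content

    visit(root)
    return tree
```

The result serializes directly with `json.dumps(index_tree(root), indent=2)`, which gives a path -> content dictionary in the spirit of tree.json (tuples become JSON arrays).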
### `alfeios duplicate`
Find duplicate content in a root directory:

- List all duplicated files and directories in a root directory
- Save the result as a duplicate_listing.json file tagged with the current time
  in a .alfeios folder in the root directory
- Print the potential space gain

Example:
```
alfeios duplicate
alfeios dup -s D:/Pictures
alfeios d D:/Pictures/.alfeios/2020_01_29_10_29_39_listing.json
```
`alfeios dup` and `alfeios d` can be used as aliases for `alfeios duplicate`.

If no positional argument is passed, the root directory
defaults to the current working directory.

The `-s` or `--save-index` optional flag saves the tree.json and forbidden.json
files tagged with the current time in a .alfeios folder in the root directory.

If a tree.json file is passed as positional argument instead of a root
directory, the tree is deserialized from the JSON file
instead of being generated, which is significantly quicker but of course
less up to date.
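
Once a path -> content dictionary is available, finding duplicates amounts to inverting it. The sketch below assumes each content is a `(hash, type, size)` triple, as described for `alfeios index`. The function names, the `'FILE'` label and the rule used for the space gain are assumptions for illustration, not Alfeios's actual code.

```python
from collections import defaultdict

Content = tuple[str, str, int]  # (hash, type, size)


def find_duplicates(tree: dict[str, Content]) -> dict[Content, list[str]]:
    """Invert path -> content and keep contents found at 2 or more paths."""
    by_content = defaultdict(list)
    for path, content in tree.items():
        # tuple() also accepts the lists produced by loading a JSON file.
        by_content[tuple(content)].append(path)
    return {content: sorted(paths)
            for content, paths in by_content.items() if len(paths) > 1}


def potential_gain(duplicates: dict[Content, list[str]]) -> int:
    """Bytes freed by keeping a single copy of each duplicated file.

    Assumption: only files are counted, so that a duplicated directory
    is not counted a second time on top of the files it contains.
    """
    return sum(size * (len(paths) - 1)
               for (_, kind, size), paths in duplicates.items()
               if kind == 'FILE')
```

For example, a 100-byte photo stored at two paths yields one duplicate entry and a potential gain of 100 bytes.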
### `alfeios missing`
Find missing content in a new root directory from an old root directory:

- List all files and directories that are present in an old root directory
  and that are missing in a new one
- Save the result as a missing_listing.json file tagged with the current time
  in a .alfeios folder in the old root directory
- Print the number of missing files

Example:
```
alfeios missing D:/Pictures E:/AllPictures
alfeios mis -s D:/Pictures E:/AllPictures
alfeios m D:/Pictures/.alfeios/2020_01_29_10_29_39_listing.json E:/AllPics
```
`alfeios mis` and `alfeios m` can be used as aliases for `alfeios missing`.

The `-s` or `--save-index` optional flag saves the tree.json and forbidden.json
files tagged with the current time in a .alfeios folder in both root
directories.

If a tree.json file is passed as positional argument instead of a root
directory, the corresponding tree is deserialized from the JSON file
instead of being generated, which is significantly quicker but of course
less up to date.
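
With two path -> content dictionaries, the comparison can be sketched as a set lookup on contents. This is a hypothetical illustration: the names `find_missing` and `count_missing_files` and the `'FILE'` label are assumptions. It treats content as missing when no path in the new tree holds it, regardless of file names.

```python
Content = tuple[str, str, int]  # (hash, type, size)


def find_missing(old_tree: dict[str, Content],
                 new_tree: dict[str, Content]) -> dict[str, Content]:
    """Return the old paths whose content appears nowhere in the new tree."""
    new_contents = {tuple(content) for content in new_tree.values()}
    return {path: tuple(content) for path, content in old_tree.items()
            if tuple(content) not in new_contents}


def count_missing_files(missing: dict[str, Content]) -> int:
    """Count missing entries that are files rather than directories."""
    return sum(1 for _, kind, _ in missing.values() if kind == 'FILE')
```

A file that was merely renamed or moved in the new root is not reported, since its content is still found there.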
## For developers
```
git clone https://github.com/hoduche/alfeios
```
Then from the newly created alfeios directory, run:
```
pip install -e .
```
And in a Python file, call:
```python
import pathlib
import alfeios.api
folder_path = pathlib.Path('D:/Pictures')
alfeios.api.index(folder_path)
```
## Areas for improvement
### Viewer
For the moment, Alfeios outputs are raw JSON files that are left at the user's
disposal.
A dedicated JSON viewer with graph display could be a better decision support
tool.
### File Manager
For the moment, Alfeios is read-only. It could be enriched with other
file manager
[CRUD](https://en.wikipedia.org/wiki/Create,_read,_update_and_delete)
functions, in particular duplicate removal.
### File System
For the moment, Alfeios is only an add-on to the command-line shell.
Its content-based index could be further rooted in the file system and
refreshed incrementally after each file system operation, supporting the
[copy-on-write principle
](https://en.wikipedia.org/wiki/Copy-on-write#In_computer_storage).