jdfile


| Field | Value |
| --- | --- |
| Name | jdfile |
| Version | 1.1.5 |
| home_page | https://github.com/natelandau/jdfile |
| Summary | File Manager for the Johnny Decimal System |
| upload_time | 2023-05-14 20:40:30 |
| maintainer | |
| docs_url | None |
| author | Nate Landau |
| requires_python | >=3.10,<4.0 |
| license | GNU AFFERO |
| keywords | |
| VCS | |
| bugtrack_url | |
| requirements | No requirements were recorded. |
| Travis-CI | No Travis. |
| coveralls test coverage | No coveralls. |

[![PyPI version](https://badge.fury.io/py/jdfile.svg)](https://badge.fury.io/py/jdfile) ![PyPI - Python Version](https://img.shields.io/pypi/pyversions/jdfile) [![Python Code Checker](https://github.com/natelandau/jdfile/actions/workflows/automated-tests.yml/badge.svg)](https://github.com/natelandau/jdfile/actions/workflows/automated-tests.yml) [![codecov](https://codecov.io/gh/natelandau/jdfile/branch/main/graph/badge.svg?token=Y11Z883PMI)](https://codecov.io/gh/natelandau/jdfile)

# jdfile

`jdfile` cleans and normalizes filenames. In addition, if you have directories that follow the [Johnny Decimal](https://johnnydecimal.com) system, `jdfile` can move your files into the appropriate directory.

`jdfile` cleans filenames based on your preferences.

-   Remove special characters
-   Trim multiple separators (`word----word` becomes `word-word`)
-   Normalize to `lower case`, `upper case`, `sentence case`, or `title case`
-   Normalize all files to a common word separator (`_`, `-`, ` `)
-   Enforce lowercase file extensions
-   Remove common English stopwords
-   Split `camelCase` words into separate words (`camel Case`)
-   Parse the filename for a date in many different formats
-   Remove or reformat the date and add it to the beginning of the filename
-   Avoid overwriting files by adding a unique integer when renaming/moving
-   Clean entire directory trees
-   Optionally, show previews of changes to be made before committing
-   Ignore files listed in a config file by filename or by regex
-   Specify casing for words which should never be changed (e.g. `iMac` will never be re-cased)
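
As a rough illustration of a few of the steps listed above, here is a minimal Python sketch. It is not `jdfile`'s actual implementation: the helper name `clean_name` and its defaults are hypothetical, and it only covers camelCase splitting, special-character removal, separator trimming, and lowercasing.

```python
import re
from pathlib import PurePath


def clean_name(filename: str, separator: str = "_") -> str:
    """Hypothetical helper (not part of jdfile): normalize one filename."""
    path = PurePath(filename)
    stem, ext = path.stem, path.suffix

    # Split camelCase: insert a space between a lowercase and an uppercase letter.
    stem = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", stem)
    # Replace every run of non-alphanumeric characters with a single separator,
    # which removes special characters and trims repeated separators in one pass.
    stem = re.sub(r"[^A-Za-z0-9]+", separator, stem).strip(separator)

    # Lowercase both the stem and the file extension.
    return stem.lower() + ext.lower()


print(clean_name("Project_mockups(WIP)___sep92022.pdf"))  # → project_mockups_wip_sep92022.pdf
print(clean_name("John&Jane-meeting-notes.txt", separator="-"))  # → john-jane-meeting-notes.txt
print(clean_name("camelCase.TXT"))  # → camel_case.txt
```

Handling everything in one regular-expression pass keeps the sketch short, but it also discards the information needed for features such as protected casing (`iMac`) or date parsing, which would have to run before this step.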

`jdfile` can organize your files into folders.

-   Move files into directory trees following the [Johnny Decimal](https://johnnydecimal.com) system
-   Parse files and folder names looking for matching terms
-   Use [nltk](https://www.nltk.org) to look up synonyms to improve matching
-   Add `.jdfile` files to directories containing a list of words that will match files

### Why build this?

It's nearly impossible to file away documents with normalized names when everyone has a different convention for naming files. On any given day, tons of files are attached to emails or sent via Slack by people who have their own way of naming files. For example:

-   `department 2023 financials and budget 08232002.xlsx`
-   `some contract Jan7 reviewed NOT FINAL (NL comments) v13.docx`
-   `John&Jane-meeting-notes.txt`
-   `Project_mockups(WIP)___sep92022.pdf`
-   `FIRSTNAMElastname Resume (#1) [companyname].PDF`
-   `code_to_review.js`

If you archive documents, these files present a number of problems.

-   No self-evident way to organize them into folders
-   No common patterns to search for
-   Dates all over the place or nonexistent
-   No consistent casing
-   No consistent word separators
-   Special characters within text
-   I could go on and on...

Additionally, even if the filenames were normalized, filing documents manually is a pain.

`jdfile` was created to solve these problems by providing an easy CLI that normalizes a filename and moves the file into an appropriate directory on your computer.

## Install

`jdfile` requires Python 3.10 or above.

```bash
pip install jdfile
```

## Usage

Run `jdfile --help` for usage.

### Configuration

To organize files into folders, a valid [toml](https://toml.io/en/) configuration file is required at `~/.jdfile/jdfile.toml`.

```toml
# The name of the project is used as a command line option.
# (e.g. --project=project_name)
[project_name]
    # (Required) Path to the folder containing the Johnny Decimal project
    path = "~/johnnydecimal"

    # An optional date format. If specified, the date will be added to the filename
    # See https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes for details on how to specify a date.
    date_format = "None"

    # Ignores dotfiles (files that start with a period) when cleaning a directory.  true or false
    ignore_dotfiles = true

    # Files in this list will be skipped.
    ignored_files = ['file1.txt', 'file2.txt']

    # File names matching this regex will be skipped.
    # IMPORTANT: You must double escape within the pattern
    ignored_regex = [".*\\.tar\\.gz$"]

    # Force the casing of certain words. Great for acronyms or proper nouns.
    match_case = ["CEO", "CEOs", "iMac", "iPhone"]

    # Overwrite existing files. true or false. If false, unique integers will be appended to the filename.
    overwrite_existing = false

    # Separator to use between words. Options: "ignore", "underscore", "space", "dash", "none"
    separator = "ignore"

    # Split CamelCase words into separate words. true or false
    split_words = false

    # Optional list of project specific stopwords to be stripped from filenames
    stopwords = ["stopword1", "stopword2"]

    # Strip stopwords from filenames. true or false
    strip_stopwords = true

    # Transform case of filenames.
    # Options: "lower", "upper", "title", "CamelCase", "sentence", "ignore"
    transform_case = "ignore"

    # Use the nltk wordnet corpus to find synonyms for words in filenames. true or false
    # Note, this will download a large corpus (~400mb) the first time it is run.
    use_synonyms = false
```

### Example usage

```bash
# Normalize all files in a directory to lowercase, with underscore separators
$ jdfile --case=lower --separator=underscore /path/to/directory

# Clean all files in a directory and confirm all changes before committing them
$ jdfile --clean /path/to/directory

# Strip common English stopwords from all files in a directory
$ jdfile --stopwords /path/to/directory

# Transform a date and add it to the filename
$ jdfile --date-format="%Y-%m-%d" "./somefile_march 3rd, 2022.txt"

# Print a tree representation of a Johnny Decimal project
$ jdfile --project=[project_name] --tree

# Use the settings of a project in the config file to clean filenames without
# organizing them into folders
$ jdfile --project=[project_name] --no-organize path/to/some_file.jpg

# Organize files into a Johnny Decimal project with specified terms with title casing
$ jdfile --project=[project_name] --term=term1 --term=term2 path/to/some_file.jpg
```

### Tips

Adding custom functions to your `.bashrc` or `.zshrc` can save time and ensure your filename preferences are always used.

```bash
# ~/.bashrc
if command -v jdfile &>/dev/null; then

    clean() {
        # DESC:	 Clean filenames using the jdfile package
        if [[ $1 == "--help" || $1 == "-h" ]]; then
            jdfile --help
        else
            jdfile --sep=space --case=title --confirm "$@"
        fi
    }

    wfile() {
        # DESC:	 File work documents
        if [[ $1 == "--help" || $1 == "-h" ]]; then
            jdfile --help
        else
            jdfile --project=work "$@"
        fi
    }
fi
```

## Caveats

`jdfile` is built for my own personal use. YMMV depending on your system and requirements. I make no warranties for any data loss that may result from use. I strongly recommend running in `--dry-run` mode prior to updating files.

## Contributing

### Setup: Once per project

There are two ways to contribute to this project.

#### 1. Local development

1. Install Python 3.10 and [Poetry](https://python-poetry.org).
2. Clone this repository. `git clone https://github.com/natelandau/jdfile.git`
3. Install the Poetry environment with `poetry install`.
4. Activate your Poetry environment with `poetry shell`.
5. Install the pre-commit hooks with `pre-commit install --install-hooks`.

#### 2. Containerized development

1. Clone this repository. `git clone https://github.com/natelandau/jdfile.git`
2. Open the repository in Visual Studio Code.
3. Start the [Dev Container](https://code.visualstudio.com/docs/remote/containers). Run <kbd>Ctrl/⌘</kbd> + <kbd>⇧</kbd> + <kbd>P</kbd> → _Remote-Containers: Reopen in Container_.
4. Run `poetry env info -p` to find the PATH to the Python interpreter if needed by VSCode.

### Developing

-   This project follows the [Conventional Commits](https://www.conventionalcommits.org/) standard to automate [Semantic Versioning](https://semver.org/) and [Keep A Changelog](https://keepachangelog.com/) with [Commitizen](https://github.com/commitizen-tools/commitizen).
    -   When you're ready to commit changes, run `cz c`.
-   Run `poe` from within the development environment to print a list of [Poe the Poet](https://github.com/nat-n/poethepoet) tasks available to run on this project. Common commands:
    -   `poe lint` runs all linters
    -   `poe test` runs all tests with Pytest
-   Run `poetry add {package}` from within the development environment to install a runtime dependency and add it to `pyproject.toml` and `poetry.lock`.
-   Run `poetry remove {package}` from within the development environment to uninstall a runtime dependency and remove it from `pyproject.toml` and `poetry.lock`.
-   Run `poetry update` from within the development environment to upgrade all dependencies to the latest versions allowed by `pyproject.toml`.

            

Raw data

            {
    "_id": null,
    "home_page": "https://github.com/natelandau/jdfile",
    "name": "jdfile",
    "maintainer": "",
    "docs_url": null,
    "requires_python": ">=3.10,<4.0",
    "maintainer_email": "",
    "keywords": "",
    "author": "Nate Landau",
    "author_email": "github@natenate.org",
    "download_url": "https://files.pythonhosted.org/packages/d2/0d/f133a0034137f5e46e554ed719d3476993b14b021d93d9bcbb6dd81e55f4/jdfile-1.1.5.tar.gz",
    "platform": null,
    "bugtrack_url": null,
    "license": "GNU AFFERO",
    "summary": "File Manager for the Johnny Decimal System",
    "version": "1.1.5",
    "project_urls": {
        "Homepage": "https://github.com/natelandau/jdfile",
        "Repository": "https://github.com/natelandau/jdfile"
    },
    "split_keywords": [],
    "urls": [
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "76623751f68c8a2585c30101530f13feb1ff5866fa991696d608c183b0449fc1",
                "md5": "aa57ee6dde356047ed854133689831ee",
                "sha256": "c5b0f618de3ce5894bd763a81e587b864d50b4bee15f1e0ff40c90ad14ef783c"
            },
            "downloads": -1,
            "filename": "jdfile-1.1.5-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "aa57ee6dde356047ed854133689831ee",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.10,<4.0",
            "size": 45012,
            "upload_time": "2023-05-14T20:40:29",
            "upload_time_iso_8601": "2023-05-14T20:40:29.013438Z",
            "url": "https://files.pythonhosted.org/packages/76/62/3751f68c8a2585c30101530f13feb1ff5866fa991696d608c183b0449fc1/jdfile-1.1.5-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": "",
            "digests": {
                "blake2b_256": "d20df133a0034137f5e46e554ed719d3476993b14b021d93d9bcbb6dd81e55f4",
                "md5": "c937c011ba4b692dc5c2f6f3bcd0deaf",
                "sha256": "62a442ccc98f0ec27b9159a95c8d92c288d9e9c7132ddf7f129cc037cadeadca"
            },
            "downloads": -1,
            "filename": "jdfile-1.1.5.tar.gz",
            "has_sig": false,
            "md5_digest": "c937c011ba4b692dc5c2f6f3bcd0deaf",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.10,<4.0",
            "size": 45388,
            "upload_time": "2023-05-14T20:40:30",
            "upload_time_iso_8601": "2023-05-14T20:40:30.895907Z",
            "url": "https://files.pythonhosted.org/packages/d2/0d/f133a0034137f5e46e554ed719d3476993b14b021d93d9bcbb6dd81e55f4/jdfile-1.1.5.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2023-05-14 20:40:30",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "natelandau",
    "github_project": "jdfile",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": true,
    "lcname": "jdfile"
}
        