* **Name**: fylex
* **Version**: 0.7.2
* **Summary**: A fast, intelligent file copier/mover with hashing, filters, conflict resolution, and backup support, built for power users.
* **Upload time**: 2025-08-29 08:46:49
* **Requires Python**: >=3.8
* **License**: MIT
* **Keywords**: file, copy, move, cli, utility, xxhash, filter, conflict, backup, multithread, devtool, python-tool, fylex, deduplication, smart-copy, smart-move

# Fylex: Your Intelligent File & Directory Orchestrator


[![Python 3.x](https://img.shields.io/badge/Python-3.x-blue.svg)](https://www.python.org/)
[![PyPI Downloads](https://static.pepy.tech/badge/fylex)](https://pepy.tech/projects/fylex)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

Fylex is a flexible Python utility that simplifies complex file management tasks. From intelligent copying and moving to flattening chaotic directory structures and resolving file conflicts, Fylex provides a robust, concurrent, and thoroughly logged way to organize your files.



## Table of Contents

1.  [Introduction](#introduction)
2.  [Key Features](#key-features)
3.  [Installation](#installation)
4.  [Usage](#usage)
    * [Core Functions Overview](#core-functions-overview)
    * [Common Parameters](#common-parameters)
    * [Conflict Resolution Modes (`on_conflict`)](#conflict-resolution-modes-on_conflict)
    * [Examples](#examples)
        * [`copy_files`: Smart file copying](#copy_files-smart-copying)
        * [`move_files`: Smart file moving](#move_files-smart-moving)
        * [`copy_dirs`: Smart folder copying](#copy_dirs-smart-directory-copying)
        * [`move_dirs`: Smart folder moving](#move_dirs-smart-directory-moving)
        * [`super_copy`: Smart unified copying](#super_copy-unified-smart-copy-files-and-directories)
        * [`super_move`: Smart unified moving](#super_move-unified-smart-move-files-and-directories)
        * [`spill`: Consolidating Files from Subdirectories](#spill-consolidating-files-from-subdirectories)
        * [`flatten`: Flattening Directory Structures](#flatten-flattening-directory-structures)
        * [`categorize`: Categorizing files](#categorizing-files)
        * [`refine`: Refining directories](#refining-directories)
        * [Handling Junk Files](#handling-junk-files)
        * [Dry Run and Interactive Modes](#dry-run-and-interactive-modes)
        * [Working with Regex and Glob Patterns](#working-with-regex-and-glob-patterns)
        
5.  [Why Fylex is BETTER](#why-fylex-is-better)
6.  [Error Handling](#error-handling)
7.  [Logging](#logging)
8.  [Development & Contributing](#development--contributing)
9.  [License](#license)

## 1. Introduction

Managing files can quickly become a tedious and error-prone process, especially when dealing with large collections, duplicate files, or disorganized directory structures. Traditional command-line tools offer basic copy/move functionalities, but often lack the intelligence to handle conflicts, filter effectively, or automate complex reorganization patterns.

Fylex steps in to fill this gap. It's built on a foundation of robust error handling, concurrent processing, and intelligent decision-making, ensuring your file operations are efficient, safe, and tailored to your needs.

## 2. Key Features

* **Smart Copy (`copy_files`)**: Copy files with advanced filtering, conflict resolution, and integrity verification.
* **Smart Move (`move_files`)**: Move files with the same options as `copy_files`; each source file is deleted only after its transfer has succeeded and been verified.
* **Smart Directory Copy (`copy_dirs`)**: Copy entire directory structures with advanced filtering and intelligent conflict resolution, including content-based duplicate folder detection.
* **Smart Directory Move (`move_dirs`)**: Move directory structures, maintaining the same smart features as `copy_dirs`.
* **Unified Super Operations (`super_copy`, `super_move`)**: Perform both file and folder copy/move operations simultaneously, applying distinct filtering and conflict resolution rules for files and directories in a single command. This allows for powerful, combined management tasks.
* **File Hashing for Reliability**: Utilizes `xxhash` for fast, non-cryptographic hashing to ensure file integrity post-transfer and detect true content duplicates.
* **Sophisticated Conflict Resolution**: Offers a comprehensive set of strategies to handle name collisions at the destination (e.g., rename, replace, keep larger/smaller/newer/older, skip, or prompt).
* **Accident Prevention with Deprecated Folders**: Crucially, when an `on_conflict` mode leads to an existing destination file/folder being replaced (e.g., by a "newer" or "larger" incoming item), Fylex automatically moves the *superseded* item into a timestamped `.fylex_deprecated/` subfolder within the destination directory. This acts as a robust safety net against accidental data loss, allowing you to recover older versions if needed. Additionally, if a source file/folder is skipped because its identical counterpart already exists at the destination, it is also moved to `fylex.deprecated/` by default.
* **Flexible File and Folder Filtering**:
    * **Inclusion**: Specify files and/or folders to process using regular expressions (`match_regex`, `folder_match_regex`), exact names (`match_names`, `folder_match_names`), or glob patterns (`match_glob`, `folder_match_glob`).
    * **Exclusion**: Prevent specific files and/or folders from being processed using `exclude_regex`, `folder_exclude_regex`, `exclude_names`, `folder_exclude_names`, or `exclude_glob`, `folder_exclude_glob`.
    * **Junk File Awareness**: Predefined `JUNK_EXTENSIONS` helps easily exclude common temporary, system, and development artifacts.
* **Intelligent Duplicate Folder Detection**: Fylex identifies identical folders (based on their content structure and file hashes) to prevent redundant operations and optimize storage.
* **Directory Reorganization Utilities**:
    * **`spill`**: Consolidate files from nested subdirectories up to a specified depth into a parent directory.
    * **`flatten`**: Move all files from an entire directory tree into a single target directory, then automatically delete the empty subdirectories.
    * **`categorize`**: Organize files into logical subdirectories based on criteria like file name patterns (regex/glob), size ranges, or file extensions.
* **Concurrency for Speed**: Leverages Python's `ThreadPoolExecutor` to perform file and folder operations in parallel, significantly speeding up tasks involving many files.
* **Dry Run Mode**: Simulate any operation without making actual changes to the filesystem. Essential for verifying complex commands before execution.
* **Interactive Mode**: Prompts for user confirmation before each file or folder operation, providing fine-grained control.
* **Comprehensive Logging**: All actions, warnings, and errors are meticulously logged to `fylex.log` for easy auditing and debugging.
* **Robust Path Validation**: Prevents common pitfalls like attempting to copy a directory into itself, or operating on non-existent paths.
* **Retry Mechanism**: Failed file operations are retried up to `MAX_RETRIES` to handle transient network issues or temporary file locks.
* **Intelligent Duplicate Refinement**: Identifies true content duplicates using hashing and safely moves them to a deprecated folder, freeing up disk space.
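
The inclusion and exclusion filters listed above can be pictured with a small standard-library sketch. This is not Fylex's implementation: the helper name `is_selected` is hypothetical, and the precedence rules (exclusions win, inclusion rules are OR-ed together) are assumptions made for illustration.

```python
import fnmatch
import re
from typing import Iterable, Optional


def is_selected(
    name: str,
    match_regex: Optional[str] = None,
    match_names: Optional[Iterable[str]] = None,
    match_glob: Optional[str] = None,
    exclude_regex: Optional[str] = None,
    exclude_names: Optional[Iterable[str]] = None,
    exclude_glob: Optional[str] = None,
) -> bool:
    """Hypothetical helper: decide whether a file name passes the filters."""
    # Assumption: exclusion rules win, so any match drops the file.
    if exclude_regex and re.search(exclude_regex, name):
        return False
    if exclude_names and name in set(exclude_names):
        return False
    if exclude_glob and fnmatch.fnmatchcase(name, exclude_glob):
        return False

    # With no inclusion rule at all, everything not excluded is selected.
    if not (match_regex or match_names or match_glob):
        return True

    # Assumption: satisfying any one inclusion rule is enough.
    return bool(
        (match_regex and re.search(match_regex, name))
        or (match_names and name in set(match_names))
        or (match_glob and fnmatch.fnmatchcase(name, match_glob))
    )
```

For example, `is_selected("main.py", match_glob="*.py")` is true, while adding `exclude_names=["main.py"]` makes it false.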

## 3. Installation

Fylex is designed to be integrated into your Python projects or run as a standalone script.

1.  **Clone the repository**:
    ```bash
    git clone https://github.com/Crystallinecore/fylex.git
    cd fylex
    ```
2.  **Install dependencies**:
    Fylex requires `xxhash`.
    ```bash
    pip install xxhash
    ```
3.  **Include in your project**:
    You can import Fylex functions directly into your Python scripts:
    ```python
    from fylex import copy_files, move_files, spill, flatten, delete_empty_dirs
    from fylex.exceptions import InvalidPathError, PermissionDeniedError
    ```

## 4. Usage

Fylex functions are designed to be intuitive. Here's a breakdown of the core functions and their parameters.

### Core Functions Overview

| Function | Type | Description |
| :--- | :--- | :--- |
| `copy_files(src, dest, **kwargs)` | `copy` | Copies files from `src` to `dest`. |
| `move_files(src, dest, **kwargs)` | `move` | Moves files from `src` to `dest`. |
| `copy_dirs(src, dest, **kwargs)` | `copy` | Copies directories and their contents from `src` to `dest`. |
| `move_dirs(src, dest, **kwargs)` | `move` | Moves directories and their contents from `src` to `dest`. |
| `super_copy(src, dest, **kwargs)` | `copy` | Copies both files and directories from `src` to `dest` with separate rules. |
| `super_move(src, dest, **kwargs)` | `move` | Moves both files and directories from `src` to `dest` with separate rules. |
| `spill(target, **kwargs)` | `reorganize` | Moves files from subdirectories within `target` to `target`. |
| `flatten(target, **kwargs)` | `reorganize` | Moves all files from subdirectories within `target` to `target` and deletes empty subdirectories. |
| `categorize(target, categorize_by, grouping=None, default=None, **kwargs)` | `organize` | Orchestrates categorization based on specified `categorize_by` mode. |
| `categorize_by_name(target, grouping, default=None, **kwargs)` | `organize` | Categorizes files by name using regex/glob patterns. |
| `categorize_by_size(target, grouping, default=None, **kwargs)` | `organize` | Categorizes files by size, using specific sizes or ranges. |
| `categorize_by_ext(target, default=None, **kwargs)` | `organize` | Categorizes files by their file extension. |
| `refine(target, **kwargs)` | `deduplicate` | Identifies and manages duplicate files within a target directory, moving redundant copies to a deprecated folder. |
| `delete_empty_dirs(target)` | `cleanup` | Recursively deletes all empty subdirectories within `target`. |



### Common Parameters


### Conflict Resolution Modes (`on_conflict`)

Fylex offers smart handling of file name conflicts at the destination. The `on_conflict` parameter accepts one of the following string values:

* **`"rename"` (Default)**: If a file with the same name exists, the incoming file will be renamed (e.g., `document.txt` becomes `document(1).txt`, `document(2).txt`, etc.) to avoid overwriting.
* **`"replace"`**: The incoming file will unconditionally overwrite the existing file at the destination. **The original file will be moved to a timestamped `.fylex_deprecated/` folder within the destination for safety.**
* **`"larger"`**: The file with the larger file size will be kept. If the existing file is larger or equal, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is larger, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.
* **`"smaller"`**: The file with the smaller file size will be kept. If the existing file is smaller or equal, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is smaller, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.
* **`"newer"`**: The file with the more recent modification timestamp will be kept. If the existing file is newer or has the same timestamp, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is newer, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.
* **`"older"`**: The file with the older modification timestamp will be kept. If the existing file is older or has the same timestamp, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is older, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.
* **`"skip"`**: The incoming file will be skipped entirely if a file with the same name exists at the destination. **The skipped source file is moved to `fylex.deprecated/` for review.**
* **`"prompt"`**: Fylex will ask the user interactively (via console) whether to replace the existing file or skip the incoming one. If "replace" is chosen, the **original file is moved to `.fylex_deprecated/`**. If "skip" is chosen, the **skipped source file is moved to `fylex.deprecated/`**.
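
The comparison rules above reduce to a small decision function. The sketch below is an illustration only: `FileInfo`, `decide`, and `next_free_name` are hypothetical names, not part of Fylex, and the `"prompt"` mode is left out because it needs console interaction.

```python
import os
from dataclasses import dataclass


@dataclass
class FileInfo:
    size: int     # size in bytes
    mtime: float  # modification time, seconds since the epoch


def decide(on_conflict: str, existing: FileInfo, incoming: FileInfo) -> str:
    """Hypothetical helper: return "rename", "replace", or "skip"."""
    if on_conflict in ("rename", "replace", "skip"):
        return on_conflict
    # Ties count as "keep the existing file", matching the descriptions above.
    if on_conflict == "larger":
        return "replace" if incoming.size > existing.size else "skip"
    if on_conflict == "smaller":
        return "replace" if incoming.size < existing.size else "skip"
    if on_conflict == "newer":
        return "replace" if incoming.mtime > existing.mtime else "skip"
    if on_conflict == "older":
        return "replace" if incoming.mtime < existing.mtime else "skip"
    raise ValueError(f"unsupported on_conflict mode: {on_conflict!r}")


def next_free_name(name: str, taken: set) -> str:
    """Return name, or name(1), name(2), ... for the first unused variant."""
    if name not in taken:
        return name
    stem, ext = os.path.splitext(name)
    n = 1
    while f"{stem}({n}){ext}" in taken:
        n += 1
    return f"{stem}({n}){ext}"
```

A caller would act on the returned string: copy under `next_free_name(...)` for `"rename"`, set the existing file aside and copy for `"replace"`, and leave the destination alone for `"skip"`.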

### Examples

Let's assume the following directory structure for the examples:

````
/data/
├── project_A/
│  ├── main.py
│  ├── config.ini
│  └── docs/
│      ├── readme.md
│      └── images/
│          └── img_01.png
├── project_B/
│  ├── index.html
│  └── style.css
├── temp/
│  ├── .tmp
│  ├── old_data.bak
│  └── report.log
├── my_files/
│  ├── photo.jpg
│  ├── document.pdf
│  └── sub_folder/
│      └── nested_file.txt
├── important_notes.txt
├── large_archive.zip (assume large size, e.g., 50MB)
├── small_image.png (assume small size, e.g., 100KB)
└── duplicate_photo.jpg (exact same content as photo.jpg)

````

The destination directory, `/backup/`, is initially empty.

##
#### `copy_files`: Smart Copying

`copy_files` copies files from a source to a destination, with advanced conflict resolution and filtering options.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `src`            | **Required**   | Source path (directory or iterable of files). |
| `dest`           | **Required**   | Destination directory path. |
| `no_create`      | `False`        | If `True`, raises error if destination does not exist. |
| `interactive`    | `False`        | If `True`, prompts user before each copy. |
| `dry_run`        | `False`        | Simulates the copy without modifying files. |
| `match_regex`    | `None`         | Regex pattern to include matching files. |
| `match_names`    | `None`         | List of exact filenames to include. |
| `match_glob`     | `None`         | Glob pattern(s) to include. |
| `exclude_regex`  | `None`         | Regex pattern to exclude matching files. |
| `exclude_names`  | `None`         | List of exact filenames to exclude. |
| `exclude_glob`   | `None`         | Glob pattern(s) to exclude. |
| `summary`        | `None`         | Optional list to append summary messages. |
| `on_conflict`    | `"rename"`     | Strategy on conflict: `rename`, `skip`, `replace`, `newer`, `older`, `larger`, `smaller`, `prompt`. |
| `max_workers`    | `4`            | Number of threads for parallel operations. |
| `recursive_check`| `False`        | Recursively filter files in subdirectories. |
| `verbose`        | `False`        | If `True`, prints detailed log for each operation. |

#### Example
```python
from fylex import copy_files

# Example 1: Copy all Python files from project_A to /backup, resolving conflicts by renaming.
# Only scans the top-level files of project_A if recursive_check=False
copy_files(src="/data/project_A", dest="/backup",
           match_glob="*.py", on_conflict="rename", verbose=True)
# Result: /backup/main.py

# Example 2: Copy all files from /data/my_files including subdirectories,
# excluding .txt files, and keep the newer version on conflict.
# If a file like 'photo.jpg' exists in /backup/my_backup and is older,
# the existing 'photo.jpg' would be moved to '/backup/my_backup/.fylex_deprecated/YYYY-MM-DD_HH-MM-SS/'
# before the new 'photo.jpg' is copied.
copy_files(src="/data/my_files", dest="/backup/my_backup",
           recursive_check=True, exclude_glob="*.txt", on_conflict="newer", verbose=True)
# Result: /backup/my_backup/photo.jpg, /backup/my_backup/document.pdf
# (nested_file.txt would be skipped due to exclusion)

# Example 3: Copy only 'important_notes.txt' from /data to /backup
copy_files(src="/data", dest="/backup",
           match_names=["important_notes.txt"], verbose=True)
# Result: /backup/important_notes.txt
```
##
#### `move_files`: Smart Moving

`move_files` works identically to `copy_files` but deletes the source file after successful transfer.
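
The idea of deleting the source only after a successful, verified transfer can be sketched with the standard library. This is an assumption-laden illustration, not Fylex's code: `file_digest` and `verified_move` are hypothetical names, `hashlib` stands in for `xxhash` so the sketch has no third-party dependency, and conflict handling is omitted (an existing destination file is simply overwritten).

```python
import hashlib
import os
import shutil


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files are never loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verified_move(src: str, dest_dir: str) -> str:
    """Hypothetical helper: copy, compare hashes, then delete the source."""
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, os.path.basename(src))
    shutil.copy2(src, dest)  # copy2 also preserves timestamps
    if file_digest(src) != file_digest(dest):
        os.remove(dest)  # discard the bad copy and keep the source
        raise OSError(f"hash mismatch after copying {src}")
    os.remove(src)  # only reached once the copy has been verified
    return dest
```

Because the source is removed last, a failure at any earlier step leaves the original file in place.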

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `src`            | **Required**   | Source path (directory or iterable of files). |
| `dest`           | **Required**   | Destination directory path. |
| `no_create`      | `False`        | If `True`, raises error if destination does not exist. |
| `interactive`    | `False`        | If `True`, prompts user before each move. |
| `dry_run`        | `False`        | Simulates the move without modifying files. |
| `match_regex`    | `None`         | Regex pattern to include matching files. |
| `match_names`    | `None`         | List of exact filenames to include. |
| `match_glob`     | `None`         | Glob pattern(s) to include. |
| `exclude_regex`  | `None`         | Regex pattern to exclude matching files. |
| `exclude_names`  | `None`         | List of exact filenames to exclude. |
| `exclude_glob`   | `None`         | Glob pattern(s) to exclude. |
| `summary`        | `None`         | Optional list to append summary messages. |
| `on_conflict`    | `"rename"`     | Strategy on conflict: `rename`, `skip`, `replace`, `newer`, `older`, `larger`, `smaller`, `prompt`. |
| `max_workers`    | `4`            | Number of threads for parallel operations. |
| `recursive_check`| `False`        | Recursively filter files in subdirectories. |
| `verbose`        | `False`        | If `True`, prints detailed log for each operation. |

#### Example
```python
from fylex import move_files

# Example: Move all .html and .css files from project_B to /web_files,
# prompting on conflict.
# If the user chooses to replace, the existing file in /web_files would be moved
# to '/web_files/.fylex_deprecated/YYYY-MM-DD_HH-MM-SS/'.
# If the user chooses to skip, the source file (e.g., /data/project_B/index.html)
# would be moved to 'fylex.deprecated/' (in the current working directory).
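# Brace patterns such as "*.{html,css}" are not expanded by Python's fnmatch-style
# globbing, so this call selects both extensions with match_regex instead.
move_files(src="/data/project_B", dest="/web_files",
           match_regex=r".*\.(html|css)$", on_conflict="prompt", interactive=True, verbose=True)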
# User would be prompted for each file if it already exists in /web_files.
# After successful move: /data/project_B will no longer contain index.html or style.css
```
##

#### `copy_dirs`: Smart Directory Copying

`copy_dirs` allows you to copy entire directory structures with advanced filtering and conflict resolution, including content-based duplicate folder detection.
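
Content-based duplicate folder detection can be pictured as comparing one fingerprint per folder, built from relative paths and file hashes. The sketch below is an assumption about how such a check might look, not Fylex's actual algorithm: `file_digest` and `folder_fingerprint` are hypothetical names, `hashlib` is used in place of `xxhash`, and empty subdirectories are ignored.

```python
import hashlib
import os


def file_digest(path: str) -> bytes:
    """SHA-256 of one file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.digest()


def folder_fingerprint(root: str) -> str:
    """Hypothetical helper: digest of relative paths plus file contents."""
    h = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the traversal order so results are reproducible
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            # Mix in where the file sits and what it contains.
            h.update(rel.encode("utf-8") + b"\0" + file_digest(full))
    return h.hexdigest()
```

Two folders with the same layout and the same file contents yield the same fingerprint, regardless of the folder names or timestamps.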

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `src`            | **Required**   | Source directory path. |
| `dest`           | **Required**   | Target directory path. |
| `no_create`      | `False`        | Raise error if destination doesn't exist. |
| `interactive`    | `False`        | Prompt user before copying each folder or file. |
| `dry_run`        | `False`        | Simulate the directory copy without changes. |
| `match_regex`    | `None`         | Include files matching regex pattern. |
| `match_names`    | `None`         | Include files with specific names. |
| `match_glob`     | `None`         | Include files matching glob pattern. |
| `exclude_regex`  | `None`         | Exclude files matching regex. |
| `exclude_names`  | `None`         | Exclude files with specific names. |
| `exclude_glob`   | `None`         | Exclude files matching glob pattern. |
| `summary`        | `None`         | Store summary/logging messages. |
| `on_conflict`    | `"rename"`     | Strategy for existing files or directories. |
| `max_workers`    | `4`            | Number of worker threads. |
| `recursive_check`| `False`        | Recursively apply filters during directory traversal. |
| `verbose`        | `False`        | Enable detailed output. |

#### Example
```python
from fylex import copy_dirs

# Example: Copy a specific project folder from a development drive to a backup drive.
# If 'my_project' already exists in '/backups', Fylex will keep the newer version.
# If the existing folder in '/backups' is older, it will be moved to
# '/backups/.fylex_deprecated/YYYY-MM-DD_HH-MM-SS/my_project/' before the new one is copied.
copy_dirs(src="/dev_drive/projects", dest="/backups",
          folder_match_names=["my_project"], on_conflict="newer", verbose=True)
# If '/backups/my_project' was older than '/dev_drive/projects/my_project',
# the old '/backups/my_project' is deprecated, and the new one is copied.

# Example: Copy all folders related to 'docs' or 'reports', excluding sensitive ones,
# and use dry_run to see what would happen.
copy_dirs(src="/shared_drive/department_data", dest="/archive/department_docs",
          folder_match_regex="^(docs|reports)_.*",
          folder_exclude_names=["docs_sensitive", "reports_internal_only"],
          dry_run=True, verbose=True)
# This command will only print logs about which directories *would* be copied and where.
```
##
#### `move_dirs`: Smart Directory Moving

`move_dirs` works identically to `copy_dirs` but deletes the source directory after successful transfer and verification.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `src`            | **Required**   | Source directory path. |
| `dest`           | **Required**   | Target directory path. |
| `no_create`      | `False`        | Raise error if destination doesn't exist. |
| `interactive`    | `False`        | Prompt user before moving each folder or file. |
| `dry_run`        | `False`        | Simulate the directory move without changes. |
| `match_regex`    | `None`         | Include files matching regex pattern. |
| `match_names`    | `None`         | Include files with specific names. |
| `match_glob`     | `None`         | Include files matching glob pattern. |
| `exclude_regex`  | `None`         | Exclude files matching regex. |
| `exclude_names`  | `None`         | Exclude files with specific names. |
| `exclude_glob`   | `None`         | Exclude files matching glob pattern. |
| `summary`        | `None`         | Store summary/logging messages. |
| `on_conflict`    | `"rename"`     | Strategy for existing files or directories. |
| `max_workers`    | `4`            | Number of worker threads. |
| `recursive_check`| `False`        | Recursively apply filters during directory traversal. |
| `verbose`        | `False`        | Enable detailed output. |

#### Example
```python
from fylex import move_dirs

# Example: Move a completed project folder from "in progress" to "completed" archives.
# If a folder with the same name exists in '/archive/completed_projects',
# the larger one will be kept. If the incoming folder is smaller or equal,
# the source folder will be moved to 'fylex.deprecated/'.
move_dirs(src="/project_workspace/in_progress", dest="/archive/completed_projects",
          folder_match_names=["ProjectX_Final"], on_conflict="larger", verbose=True)
# If '/archive/completed_projects/ProjectX_Final' was larger than or equal to
# '/project_workspace/in_progress/ProjectX_Final', the source folder
# '/project_workspace/in_progress/ProjectX_Final' would be moved to 'fylex.deprecated/'.
# Otherwise, the existing one in '/archive/completed_projects' would be deprecated,
# and the source would be moved.
```
##
#### `super_copy`: Unified Smart Copy (Files and Directories)

`super_copy` allows you to copy both files and directories simultaneously from a source to a destination, applying distinct filtering and conflict resolution rules for each.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `src`            | **Required**   | Source path (file or directory). |
| `dest`           | **Required**   | Target path. |
| `no_create`      | `False`        | Raise error if destination doesn't exist. |
| `interactive`    | `False`        | Prompt before copying. |
| `dry_run`        | `False`        | Run in preview-only mode. |
| `file_match_regex`    | `None`         | File inclusion pattern. |
| `file_match_names`    | `None`         | Files to include by name. |
| `file_match_glob`     | `None`         | Files to include using glob. |
| `folder_match_regex`    | `None`         | Folder inclusion pattern. |
| `folder_match_names`    | `None`         | Folders to include by name. |
| `folder_match_glob`     | `None`         | Folders to include using glob. |
| `file_exclude_regex`  | `None`         | File exclusion regex. |
| `file_exclude_names`  | `None`         | Files to exclude by name. |
| `file_exclude_glob`   | `None`         | Files to exclude using glob. |
| `folder_exclude_regex`  | `None`         | Folder exclusion regex. |
| `folder_exclude_names`  | `None`         | Folders to exclude by name. |
| `folder_exclude_glob`   | `None`         | Folders to exclude using glob. |
| `summary`        | `None`         | Collects operation summaries. |
| `file_on_conflict`    | `"rename"`     | How to resolve naming conflicts for files. |
| `folder_on_conflict`    | `"rename"`     | How to resolve naming conflicts for folders. |
| `max_workers`    | `4`            | Parallel thread pool size. |
| `recursive_check`| `False`        | Traverse and filter recursively. |
| `verbose`        | `False`        | Enable verbose mode. |

#### Example
```python
from fylex import super_copy

# Example: Copy a mixed-content project folder to a backup.
# Copy all .py files, and specific 'config' and 'data' folders.
# For files, rename on conflict. For folders, replace if newer.
super_copy(src="/my_dev_project", dest="/backup_dev",
           file_match_glob="*.py", file_on_conflict="rename",
           folder_match_names=["config", "data"], folder_on_conflict="newer",
           file_recursive_check=True, folder_recursive_check=True, verbose=True)
# This will copy all .py files found recursively within /my_dev_project,
# renaming them if they conflict in /backup_dev.
# It will also copy 'config' and 'data' subfolders, recursively, replacing them
# in /backup_dev if the source version is newer (deprecating the old one).

# Example: Copy an entire repository structure, excluding certain files and hidden directories.
super_copy(src="/my_repo", dest="/clean_archive",
           file_exclude_glob="*.log",
           folder_exclude_regex=r"^\.", # Exclude hidden folders like .git, .vscode etc.
           file_on_conflict="skip",
           folder_on_conflict="skip",
           recursive_check=True, # Apply file filtering recursively
           folder_recursive_check=True, # Apply folder filtering recursively
           dry_run=True, verbose=True)
# This dry run will show which files (excluding .log) and which folders (excluding hidden ones)
# would be copied, skipping any conflicts.
```
##
#### `super_move`: Unified Smart Move (Files and Directories)

`super_move` works identically to `super_copy` but deletes the source files and directories upon successful transfer and verification.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `src`            | **Required**   | Source path (file or directory). |
| `dest`           | **Required**   | Target path. |
| `no_create`      | `False`        | Raise error if destination doesn't exist. |
| `interactive`    | `False`        | Prompt before moving. |
| `dry_run`        | `False`        | Run in preview-only mode. |
| `file_match_regex`    | `None`         | File inclusion pattern. |
| `file_match_names`    | `None`         | Files to include by name. |
| `file_match_glob`     | `None`         | Files to include using glob. |
| `folder_match_regex`    | `None`         | Folder inclusion pattern. |
| `folder_match_names`    | `None`         | Folders to include by name. |
| `folder_match_glob`     | `None`         | Folders to include using glob. |
| `file_exclude_regex`  | `None`         | File exclusion regex. |
| `file_exclude_names`  | `None`         | Files to exclude by name. |
| `file_exclude_glob`   | `None`         | Files to exclude using glob. |
| `folder_exclude_regex`  | `None`         | Folder exclusion regex. |
| `folder_exclude_names`  | `None`         | Folders to exclude by name. |
| `folder_exclude_glob`   | `None`         | Folders to exclude using glob. |
| `summary`        | `None`         | Collects operation summaries. |
| `file_on_conflict`    | `"rename"`     | How to resolve naming conflicts for files. |
| `folder_on_conflict`    | `"rename"`     | How to resolve naming conflicts for folders. |
| `max_workers`    | `4`            | Parallel thread pool size. |
| `recursive_check`| `False`        | Traverse and filter recursively. |
| `verbose`        | `False`        | Enable verbose mode. |

#### Example
```python
from fylex import super_move

# Example: Migrate a project from a staging area to production.
# Move all image files (.jpg, .png) and a specific 'assets' folder.
# Images that conflict will keep the larger version.
# The 'assets' folder will be replaced if the incoming one is newer.
super_move(src="/staging/prod_build", dest="/production/app_data",
           # Regex instead of "*.{jpg,png}": fnmatch-style globs do not expand braces.
           file_match_regex=r".*\.(jpg|png)$", file_on_conflict="larger",
           folder_match_names=["assets"], folder_on_conflict="newer",
           file_recursive_check=True, folder_recursive_check=True, verbose=True)
# This will move all specified image files, keeping the larger version on conflict.
# It will move the 'assets' folder, replacing the destination if newer (deprecating the old one).
# Source files and folders will be deleted after successful move.
```

##
#### `spill`: Consolidating Files from Subdirectories

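`spill` moves files from nested directories into the `target` root directory. The example below also passes a `levels` argument that sets how deep to look: `-1` covers every nesting level and `1` covers only the immediate subdirectories.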
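
As a rough picture of depth-limited consolidation, here is a standard-library sketch. It is not Fylex's implementation: `spill_sketch` is a hypothetical name, the meaning of `levels` is inferred from the examples in this section, and name conflicts are simply skipped where a real tool would apply its conflict strategy.

```python
import os
import shutil


def spill_sketch(target: str, levels: int = -1) -> list:
    """Hypothetical helper: move nested files up into target.

    levels=-1 means unlimited depth; levels=1 means immediate subfolders only.
    Returns the new paths of the files that were moved.
    """
    moved = []
    target = os.path.abspath(target)
    for dirpath, dirnames, filenames in os.walk(target):
        rel = os.path.relpath(dirpath, target)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if levels >= 0 and depth >= levels:
            dirnames[:] = []  # do not descend below the requested depth
        if depth == 0:
            continue  # files already at the root stay where they are
        for name in filenames:
            dest = os.path.join(target, name)
            if os.path.exists(dest):
                continue  # conflict handling is omitted in this sketch
            shutil.move(os.path.join(dirpath, name), dest)
            moved.append(dest)
    return moved
```

The emptied subdirectories are left in place, which matches the behaviour described for `spill` in the example below.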

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `target`         | **Required**   | Directory whose nested files are moved up into its root. |
| `interactive`    | `False`        | Prompts user before moving each file. |
| `dry_run`        | `False`        | Simulates the moves without changing the filesystem. |
| `match_regex`    | `None`         | Include files matching regex. |
| `match_names`    | `None`         | Include specific filenames. |
| `match_glob`     | `None`         | Include files by glob pattern. |
| `exclude_regex`  | `None`         | Exclude files by regex. |
| `exclude_names`  | `None`         | Exclude specific filenames. |
| `exclude_glob`   | `None`         | Exclude files using glob pattern. |
| `summary`        | `None`         | Optional log collector list. |
| `on_conflict`    | `"rename"`     | Conflict resolution strategy. |
| `max_workers`    | `4`            | Number of threads for parallel operations. |
| `recursive_check`| `False`        | Traverse and check all subfolders. |
| `verbose`        | `False`        | Print each step for transparency. |

#### Example
```python
from fylex import spill
import os
import shutil

# Setup for spill example:
os.makedirs("/data/temp_spill/level1/level2", exist_ok=True)
with open("/data/temp_spill/fileA.txt", "w") as f: f.write("A")
with open("/data/temp_spill/level1/fileB.txt", "w") as f: f.write("B")
with open("/data/temp_spill/level1/level2/fileC.txt", "w") as f: f.write("C")
with open("/data/temp_spill/level1/level2/image.jpg", "w") as f: f.write("C")

# Example 1: Spill all files from subdirectories (infinite levels) into /data/temp_spill.
# If fileB.txt already existed in /data/temp_spill, it would be deprecated based on conflict mode.
spill(target="/data/temp_spill", levels=-1, verbose=True)
# Result: /data/temp_spill/fileA.txt, /data/temp_spill/fileB.txt, /data/temp_spill/fileC.txt, /data/temp_spill/image.jpg
# (fileA.txt is already at root, so not moved)
# The empty subdirectories /data/temp_spill/level1 and /data/temp_spill/level1/level2 will remain.

# Clean up for next example:
shutil.rmtree("/data/temp_spill")
os.makedirs("/data/temp_spill/level1/level2", exist_ok=True)
with open("/data/temp_spill/fileA.txt", "w") as f: f.write("A")
with open("/data/temp_spill/level1/fileB.txt", "w") as f: f.write("B")
with open("/data/temp_spill/level1/level2/fileC.txt", "w") as f: f.write("C")

# Example 2: Spill only files from immediate subdirectories (level 1), excluding .txt files.
spill(target="/data/temp_spill", levels=1, exclude_glob="*.txt", verbose=True)
# Result: Only files directly inside /data/temp_spill/level1 are considered (level2 is too deep).
# fileB.txt is the only such file and it is excluded by the glob, so nothing moves.
# If image.jpg were in level1, it would be moved to /data/temp_spill.
```
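
To make the `levels` parameter concrete, the sketch below lists which files sit within a given depth of a root directory, using only the standard library. It illustrates the idea of a depth limit and is not Fylex's implementation: `files_within_levels` is a hypothetical helper, and treating `levels=-1` as "no limit" is an assumption taken from the example above.

```python
import os
import tempfile

def files_within_levels(root, levels):
    """Return files in subdirectories of root that are at most `levels` deep.

    levels=-1 means no depth limit (assumption). Files directly in root are
    skipped, since they are already where a spill would put them.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if levels != -1 and depth >= levels:
            dirnames[:] = []  # do not descend past the depth limit
        if depth == 0:
            continue  # top-level files stay in place
        found.extend(os.path.join(dirpath, name) for name in filenames)
    return sorted(found)

# Build a small tree shaped like the one in the spill example.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "level1", "level2"))
    for rel in ("fileA.txt", "level1/fileB.txt", "level1/level2/fileC.txt"):
        with open(os.path.join(tmp, *rel.split("/")), "w") as f:
            f.write("x")
    one_level = [os.path.relpath(p, tmp) for p in files_within_levels(tmp, 1)]
    all_levels = [os.path.relpath(p, tmp) for p in files_within_levels(tmp, -1)]

print(one_level)   # only the file directly inside level1
print(all_levels)  # files from level1 and level1/level2
```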

##
#### `flatten`: Flattening Directory Structures

`flatten` takes a messy, deeply nested folder, moves all of its files into the top level, and then removes the emptied subfolders.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `target`         | **Required**   | Directory whose nested files are to be flattened. |
| `interactive`    | `False`        | Prompts before each move. |
| `dry_run`        | `False`        | Preview the changes without modifying anything. |
| `summary`        | `None`         | Collect log summary here. |
| `max_workers`    | `4`            | Number of threads used for sorting. |
| `verbose`        | `False`        | Show per-file operations and logs. |
| `recursive_check`| `False`        | Include subdirectories in file selection. |

#### Example
```python
from fylex import flatten
import os

# Setup for flatten example (a nested tree like the spill setup):
os.makedirs("/data/temp_flatten/level1/level2", exist_ok=True)
with open("/data/temp_flatten/fileX.log", "w") as f: f.write("X") # Will be ignored by default junk filter
with open("/data/temp_flatten/level1/fileY.jpg", "w") as f: f.write("Y")
with open("/data/temp_flatten/level1/level2/fileZ.pdf", "w") as f: f.write("Z")

# Example: Flatten the entire /data/temp_flatten directory.
# Any files in subdirectories that would overwrite an existing file in /data/temp_flatten
# would first cause the existing file to be moved to '/data/temp_flatten/.fylex_deprecated/'.
flatten(target="/data/temp_flatten", verbose=True)
# Result: /data/temp_flatten/fileX.log, /data/temp_flatten/fileY.jpg, /data/temp_flatten/fileZ.pdf
# After operation, /data/temp_flatten/level1/ and /data/temp_flatten/level1/level2/ will be deleted.
```
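
The cleanup step that follows a flatten can be pictured as a bottom-up walk that removes every directory left empty. The sketch below uses only the standard library; `remove_empty_dirs` is a hypothetical helper written for illustration, not the code behind `flatten` or `delete_empty_dirs`.

```python
import os
import tempfile

def remove_empty_dirs(root):
    """Delete empty subdirectories of root, deepest first; return how many were removed."""
    removed = 0
    # topdown=False yields children before their parents, so a parent that
    # becomes empty after its children are removed is caught in the same pass.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # never remove the root itself
        if not os.listdir(dirpath):  # re-check on disk, children may be gone now
            os.rmdir(dirpath)
            removed += 1
    return removed

with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "level1", "level2"))  # empty chain
    os.makedirs(os.path.join(tmp, "keep"))
    with open(os.path.join(tmp, "keep", "file.txt"), "w") as f:
        f.write("x")
    removed_count = remove_empty_dirs(tmp)
    remaining = sorted(os.listdir(tmp))

print(removed_count, remaining)
```

Directories that still contain files are left alone, which is why `keep` survives in this run.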
##
#### Categorizing Files

Fylex offers flexible ways to categorize files into new or existing directories.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `target`         | **Required**   | Directory whose files are to be categorized. |
| `categorize_by`  | **Required**   | Mode: `"name"`, `"size"`, or `"ext"`. |
| `grouping`       | `None`         | Mapping of patterns to folder paths. |
| `default`        | `None`         | Path to move unmatched files. |
| `interactive`    | `False`        | Prompts before each categorization. |
| `dry_run`        | `False`        | Preview the changes without modifying anything. |
| `summary`        | `None`         | Collect log summary here. |
| `max_workers`    | `4`            | Number of threads used for sorting. |
| `verbose`        | `False`        | Show per-file operations and logs. |
| `recursive_check`| `False`        | Include subdirectories in file selection. |

#### Example
```python
from fylex import categorize
import os

# Create dummy files for categorization
os.makedirs("/data/categorize_source", exist_ok=True)
with open("/data/categorize_source/report_april.pdf", "w") as f: f.write("report content")
with open("/data/categorize_source/meeting_notes.txt", "w") as f: f.write("notes content")
with open("/data/categorize_source/photo_2023.jpg", "w") as f: f.write("photo content")
with open("/data/categorize_source/large_video.mp4", "w") as f: f.write("large content" * 1000000) # ~13 MB
with open("/data/categorize_source/small_doc.docx", "w") as f: f.write("small content") # 13 bytes

# Example 1: Categorize by file extension
categorize(
    target="/data/categorize_source",
    categorize_by="ext",
    default="/data/categorize_destination/misc", # Files without common extensions or uncategorized
    dry_run=True,
    verbose=True
)
# Expected Dry Run Output:
# Would move /data/categorize_source/report_april.pdf to /data/categorize_destination/pdf/report_april.pdf
# Would move /data/categorize_source/meeting_notes.txt to /data/categorize_destination/txt/meeting_notes.txt
# etc.

# Example 2: Categorize by file name using regex and glob
grouping_by_name = {
    r"^report_.*\.pdf$": "/data/categorize_destination/Reports", # Regex for reports
    ("photo_*.jpg", "glob"): "/data/categorize_destination/Images" # Glob for photos
}
categorize(
    target="/data/categorize_source",
    categorize_by="name",
    grouping=grouping_by_name,
    default="/data/categorize_destination/Other",
    dry_run=True,
    verbose=True
)
# Expected Dry Run Output:
# Would move /data/categorize_source/report_april.pdf to /data/categorize_destination/Reports/report_april.pdf
# Would move /data/categorize_source/photo_2023.jpg to /data/categorize_destination/Images/photo_2023.jpg
# Others would go to /data/categorize_destination/Other

# Example 3: Categorize by file size (ranges in bytes)
grouping_by_size = {
    (0, 1024): "/data/categorize_destination/SmallFiles", # 0 to 1KB
    (1024 * 1024, "max"): "/data/categorize_destination/LargeFiles" # 1MB and above
}
categorize(
    target="/data/categorize_source",
    categorize_by="size",
    grouping=grouping_by_size,
    default="/data/categorize_destination/MediumFiles",
    dry_run=True,
    verbose=True
)
# Expected Dry Run Output:
# Would move /data/categorize_source/small_doc.docx to /data/categorize_destination/SmallFiles/small_doc.docx
# Would move /data/categorize_source/large_video.mp4 to /data/categorize_destination/LargeFiles/large_video.mp4
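# The remaining files in this setup are also under 1 KB, so they would go to SmallFiles as well.
# Only files between 1 KB and 1 MB would go to /data/categorize_destination/MediumFiles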
```
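
The size-based grouping can be read as a lookup over `(lower, upper)` ranges. The sketch below shows one plausible way to resolve it; `pick_size_group` is a hypothetical helper, and the boundary rule (lower bound inclusive, upper bound exclusive, `"max"` meaning unbounded) is an assumption rather than documented Fylex behavior.

```python
def pick_size_group(size, grouping, default=None):
    """Return the folder whose (lower, upper) range contains size, else default."""
    for (lower, upper), folder in grouping.items():
        # Assumed rule: lower bound inclusive, upper bound exclusive.
        if size >= lower and (upper == "max" or size < upper):
            return folder
    return default

grouping_by_size = {
    (0, 1024): "SmallFiles",             # 0 bytes up to, but not including, 1 KB
    (1024 * 1024, "max"): "LargeFiles",  # 1 MB and above
}

results = {
    size: pick_size_group(size, grouping_by_size, default="MediumFiles")
    for size in (13, 1024, 500_000, 13_000_000)
}
print(results)
```

Under this rule a file of exactly 1024 bytes falls through to the default, so it is worth testing boundary sizes with `dry_run=True` before relying on them.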
##
#### Refining Directories (Deduplicating)

`refine` identifies and safely handles duplicate files based on their content.

| Parameter        | Default        | Description |
|------------------|----------------|-------------|
| `target`         | **Required**   | Target directory to refine. |
| `interactive`    | `False`        | Prompts user before moving duplicate files. |
| `dry_run`        | `False`        | Simulates changes without actual deletion or move. |
| `match_regex`    | `None`         | Include files matching regex. |
| `match_names`    | `None`         | Include specific filenames. |
| `match_glob`     | `None`         | Include files by glob pattern. |
| `exclude_regex`  | `None`         | Exclude files by regex. |
| `exclude_names`  | `None`         | Exclude specific filenames. |
| `exclude_glob`   | `None`         | Exclude files using glob pattern. |
| `summary`        | `None`         | Optional log collector list. |
| `on_conflict`    | `"rename"`     | Conflict resolution strategy. |
| `max_workers`    | `4`            | Threads for parallel hashing. |
| `recursive_check`| `False`        | Traverse and check all subfolders. |
| `verbose`        | `False`        | Print each step for transparency. |

#### Example
```python
from fylex import refine
import os

# Setup for refine example
os.makedirs("/data/refine_test", exist_ok=True)
with open("/data/refine_test/file1.txt", "w") as f: f.write("unique content A")
with open("/data/refine_test/file2_copy.txt", "w") as f: f.write("unique content A") # Duplicate of file1.txt
with open("/data/refine_test/image.jpg", "w") as f: f.write("image data")
os.makedirs("/data/refine_test/sub_folder", exist_ok=True)
with open("/data/refine_test/sub_folder/file1_in_sub.txt", "w") as f: f.write("unique content A") # Duplicate of file1.txt

# Example 1: Find and deprecate duplicates in /data/refine_test (dry run)
refine(
    target="/data/refine_test",
    recursive_check=True, # Check subdirectories
    dry_run=True,
    verbose=True
)
# Expected Dry Run Output:
# [DRY RUN] Duplicate: /data/refine_test/file2_copy.txt would have been safely backed up at /data/refine_test/fylex.deprecated
# [DRY RUN] Duplicate: /data/refine_test/sub_folder/file1_in_sub.txt would have been safely backed up at /data/refine_test/fylex.deprecated
# File retained: /data/refine_test/file1.txt
# File retained: /data/refine_test/image.jpg

# Example 2: Actual deduplication (remove dry_run=True to execute)
# This would create /data/refine_test/.fylex_deprecated/ and move duplicates into it.
# refine(
#     target="/data/refine_test",
#     recursive_check=True,
#     verbose=True
# )
```
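
Content-based duplicate detection comes down to grouping files by a digest of their bytes. The sketch below is a dependency-free illustration, not Fylex's code: it swaps in `hashlib.blake2b` from the standard library where Fylex uses `xxhash`, and `file_digest` and `find_duplicates` are hypothetical helpers. Treating the first file encountered as the one to retain is also an assumption.

```python
import hashlib
import os
import tempfile
from collections import defaultdict

def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never loaded whole."""
    h = hashlib.blake2b(digest_size=16)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def find_duplicates(root):
    """Map each first-seen file to the later files that have identical content."""
    first_seen = {}
    duplicates = defaultdict(list)
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            key = (os.path.getsize(path), file_digest(path))
            if key in first_seen:
                duplicates[first_seen[key]].append(path)
            else:
                first_seen[key] = path
    return dict(duplicates)

# Same layout as the refine example, built in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "sub_folder"))
    samples = {
        "file1.txt": "unique content A",
        "file2_copy.txt": "unique content A",
        "image.jpg": "image data",
        os.path.join("sub_folder", "file1_in_sub.txt"): "unique content A",
    }
    for rel, text in samples.items():
        with open(os.path.join(tmp, rel), "w") as f:
            f.write(text)
    report = {
        os.path.relpath(kept, tmp): [os.path.relpath(d, tmp) for d in dups]
        for kept, dups in find_duplicates(tmp).items()
    }

print(report)
```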

##
#### Handling Junk Files

Fylex comes with a predefined list of common "junk" file extensions and names. You can leverage this via the `exclude_names` and `exclude_glob` parameters or modify the `JUNK_EXTENSIONS` dictionary in the source.

```python
from fylex import copy_files, JUNK_EXTENSIONS

# Combine all junk extensions and names into lists for exclusion
all_junk_extensions = [ext for sublist in JUNK_EXTENSIONS.values() for ext in sublist if ext.startswith(".")]
all_junk_names = [name for sublist in JUNK_EXTENSIONS.values() for name in sublist if not name.startswith(".")]

# Example: Copy all files from /data/temp to /archive, excluding some common junk.
# For simplicity, this call uses a few explicit exclusions rather than the full junk lists.
copy_files(src="/data/temp", dest="/archive",
           exclude_glob="*.tmp", # Exclude temporary files
           exclude_names=["thumbs.db", "desktop.ini"], # Exclude specific names
           recursive_check=True, verbose=True)
# Result: files ending in .tmp are excluded. old_data.bak and report.log are still copied,
# because nothing in this call excludes .bak or .log files.
```
##
#### Dry Run and Interactive Modes

```python
from fylex import copy_files

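# Example: See what would happen if you copied all .md files, without actually doing it.
# recursive_check=True is needed because readme.md lives in the docs/ subdirectory.
copy_files(src="/data/project_A", dest="/backup",
           match_glob="*.md", recursive_check=True, dry_run=True, verbose=True)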
# Output in log/console: "[DRY RUN] Would have copied: /data/project_A/docs/readme.md -> /backup/readme.md"
# No files are actually copied.

# Example: Be prompted for every action
copy_files(src="/data/project_A", dest="/backup",
           match_glob="*.ini", interactive=True, verbose=True)
# Console: "Copy /data/project_A/config.ini to /backup/config.ini? [y/N]:"
# User input determines if the copy proceeds.
```
##
#### Working with Regex and Glob Patterns

Fylex allows you to combine regex and glob patterns for precise filtering.

```python
from fylex import copy_files

# Example 1: Copy files that are either .jpg OR start with 'report'
copy_files(src="/data/my_files", dest="/backup",
           match_regex=r".*\.jpg$", # Matches any .jpg
           match_glob="report*", # Matches files starting with 'report'
           verbose=True)

# Example 2: Exclude files that are either log files or contain 'temp' in their name
copy_files(src="/data/", dest="/filtered_data",
           recursive_check=True,
           exclude_regex=r".*\.log$",
           exclude_glob="*temp*", # Matches files containing 'temp'
           verbose=True)
```
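
One way to picture how include and exclude patterns combine is the small predicate below, built on the standard `re` and `fnmatch` modules. It is a sketch under stated assumptions, not Fylex's filter: `is_selected` is a hypothetical helper, include patterns are assumed to be OR-ed together (as Example 1 suggests), and any matching exclude pattern is assumed to win.

```python
import fnmatch
import re

def is_selected(name, match_regex=None, match_glob=None,
                exclude_regex=None, exclude_glob=None):
    """Return True if name passes the include patterns and hits no exclude pattern."""
    included = True
    if match_regex or match_glob:
        # Assumption: a file is included if it matches ANY include pattern.
        included = bool(
            (match_regex and re.search(match_regex, name))
            or (match_glob and fnmatch.fnmatchcase(name, match_glob))
        )
    # Assumption: a single matching exclude pattern is enough to reject the file.
    excluded = bool(
        (exclude_regex and re.search(exclude_regex, name))
        or (exclude_glob and fnmatch.fnmatchcase(name, exclude_glob))
    )
    return included and not excluded

names = ["photo.jpg", "report_q1.pdf", "notes.txt", "report_temp.log"]
selected = [
    n for n in names
    if is_selected(n, match_regex=r".*\.jpg$", match_glob="report*",
                   exclude_glob="*temp*")
]
print(selected)
```

Running the real call with `dry_run=True` is the reliable way to confirm how a given pattern combination behaves.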


## 5\. Why Fylex is Superior

Compared to standard shell commands (`cp`, `mv`, `rm`, `find`, `robocopy` / `rsync`) or even basic scripting, Fylex offers significant advantages:

1.  **Intelligent Conflict Resolution (Beyond Overwrite/Rename) *with Accident Prevention***:

      * **Shell**: `cp -f` overwrites, `cp -n` skips. `robocopy` offers more, but still lacks integrated safeguards.
      * **Fylex**: Provides `rename`, `replace`, `larger`, `smaller`, `newer`, `older`, `skip`, and `prompt`. **Crucially, when Fylex replaces an existing file at the destination or skips a source file (due to a conflict), it first moves the affected file into a dedicated, timestamped `.fylex_deprecated/` folder.** This greatly reduces the risk of accidental data loss, because superseded or skipped files can be reviewed and retrieved later, something plain overwrite/skip options in other tools do not offer.

2.  **Built-in Data Integrity Verification (Hashing):**

      * OS commands perform a basic copy. You'd need to manually run `md5sum` or `sha256sum` after the copy and compare.
      * `Fylex` uses `xxhash` for fast post-copy verification, ensuring that the copied file is an exact, uncorrupted duplicate of the source. This is crucial for critical data.

3.  **Unified and Advanced Filtering:**

      * `find` combined with `grep`, `xargs`, and `egrep` is powerful but often requires complex, multi-stage commands. Glob patterns are simpler but less flexible than regex.
      * `Fylex` integrates regex, glob, and exact name matching/exclusion directly into its functions, allowing for highly specific and readable filtering with a single API call.

4.  **Specialized Directory Reorganization (`spill`, `flatten`):**

      * Achieving "spill", "flatten", or "categorize" with OS commands means chaining `find`, `mv`, `rmdir`, and potentially `xargs` with very specific and often platform-dependent syntax. This is notoriously difficult to get right and can lead to accidental data loss if a mistake is made.
      * `Fylex` provides these as high-level, single-function operations with built-in safety (like dry run and empty directory cleanup), making them much safer and easier to use.

5.  **Intelligent Duplicate Management (`refine`):**

      * Dedicated deduplication tools like `fdupes` exist but often perform direct deletion or require additional scripting for safe archiving.
      * `Fylex`'s `refine` function provides content-based duplicate detection using `xxhash` and, most importantly, safely moves duplicates to a `.fylex_deprecated/` folder, offering a non-destructive alternative to immediate deletion.

6.  **Concurrency Out-of-the-Box:**

      * Basic OS commands are single-threaded. Parallelization requires advanced shell scripting with `xargs -P` or similar, which adds complexity.
      * `Fylex` automatically utilizes a `ThreadPoolExecutor` to process files concurrently, significantly boosting performance for large datasets without any extra effort from the user.

7.  **Comprehensive Logging & Dry Run Safety Net:**

      * OS commands typically dump output to stdout/stderr. Comprehensive logging requires redirection and parsing. Dry run is often simulated or requires specific flags that may not exist for all commands.
      * `Fylex` generates a detailed `fylex.log` for every operation, providing an auditable trail. The `dry_run` mode is a built-in safeguard, allowing you to preview complex operations safely.

8.  **Python Integration & Extensibility:**

      * While powerful, shell scripts can be less maintainable and harder to integrate into larger software systems.
      * `Fylex`, being a Python library, is easily callable from any Python application, making it highly extensible and automatable within existing Python workflows.

9.  **User Interactivity:**

      * Shell: Limited options for user prompts during bulk operations.
      * `Fylex`: `interactive` mode provides a safety net by prompting for confirmation before each file transfer, giving you granular control.

In essence, Fylex transforms common, complex, and risky file management scenarios into straightforward, reliable, and efficient operations, saving time, preventing data loss, and simplifying automation.
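
The "move aside before replacing" safety net from item 1 can be sketched with the standard library alone. This is an illustration of the pattern, not Fylex's implementation: `replace_with_backup` is a hypothetical helper, and the folder name and timestamp format are taken from the descriptions in this README.

```python
import os
import shutil
import tempfile
from datetime import datetime

def replace_with_backup(src, dest_dir, deprecated_name=".fylex_deprecated"):
    """Copy src into dest_dir; an existing same-named file is moved aside first."""
    dest = os.path.join(dest_dir, os.path.basename(src))
    backup = None
    if os.path.exists(dest):
        stamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        backup_dir = os.path.join(dest_dir, deprecated_name, stamp)
        os.makedirs(backup_dir, exist_ok=True)
        backup = os.path.join(backup_dir, os.path.basename(dest))
        shutil.move(dest, backup)  # keep the superseded file instead of losing it
    shutil.copy2(src, dest)
    return dest, backup

with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, "src")
    dest_dir = os.path.join(tmp, "dest")
    os.makedirs(src_dir)
    os.makedirs(dest_dir)
    with open(os.path.join(src_dir, "photo.jpg"), "w") as f:
        f.write("new")
    with open(os.path.join(dest_dir, "photo.jpg"), "w") as f:
        f.write("old")
    dest, backup = replace_with_backup(os.path.join(src_dir, "photo.jpg"), dest_dir)
    with open(dest) as f:
        dest_text = f.read()
    with open(backup) as f:
        backup_text = f.read()
    backup_rel = os.path.relpath(backup, dest_dir)

print(dest_text, backup_text, backup_rel)
```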

## 6\. Error Handling

Fylex implements robust error handling to ensure operations are performed safely and to provide clear feedback when issues arise.

  * `InvalidPathError`: Raised if a specified source path does not exist, or if `no_create` is `True` and the destination path does not exist.
  * `PermissionDeniedError`: Raised if Fylex lacks the necessary read or write permissions for a given path.
  * `ValueError`: Raised for logical inconsistencies, such as trying to copy a directory into itself when `recursive_check` is enabled, or unsupported categorization modes.
  * **Retry Mechanism**: Transient errors during file copy/move operations are automatically retried up to `MAX_RETRIES` (default: 5). If retries are exhausted, an error is logged.
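
The retry behavior can be pictured as a bounded loop around a single file operation. The sketch below is a generic pattern rather than Fylex's code: `with_retries` is a hypothetical helper, retrying on `OSError` is an assumption about what counts as transient, and `MAX_RETRIES` is interpreted here as the total number of attempts.

```python
import time

MAX_RETRIES = 5  # mirrors the default named above

def with_retries(operation, max_retries=MAX_RETRIES, delay=0.0):
    """Call operation(); on OSError try again, up to max_retries attempts in total."""
    last_error = None
    for _attempt in range(max_retries):
        try:
            return operation()
        except OSError as exc:
            last_error = exc
            time.sleep(delay)  # a real tool might back off between attempts
    raise last_error

# A stand-in operation that fails twice before succeeding.
calls = {"count": 0}

def flaky_copy():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("temporary failure")
    return "copied"

result = with_retries(flaky_copy)
print(result, calls["count"])
```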

## 7\. Logging

Fylex provides detailed logging to `fylex.log` in the current working directory by default.

  * **INFO**: Records successful operations, dry run simulations, and significant events, including deprecation actions.
  * **WARNING**: Indicates potential issues, such as hash mismatches requiring retries.
  * **ERROR**: Logs failures, permissions issues, or unhandled exceptions.

You can control log output:

  * `verbose=True`: Prints log messages to the console in real-time, in addition to the file.
  * `summary="path/to/my_log.log"`: Copies the `fylex.log` file to the specified summary path upon completion.
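
The file-plus-console behavior described above maps naturally onto Python's standard `logging` module. The sketch below shows that mapping under stated assumptions: `build_logger` and the logger name `fylex_sketch` are hypothetical, and Fylex's actual log format and setup may differ.

```python
import logging
import os
import tempfile

def build_logger(log_path, verbose=False):
    """Log to a file always, and also to the console when verbose is True."""
    logger = logging.getLogger("fylex_sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    fmt = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setFormatter(fmt)
    logger.addHandler(file_handler)
    if verbose:
        console = logging.StreamHandler()  # writes to stderr by default
        console.setFormatter(fmt)
        logger.addHandler(console)
    return logger

with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "fylex.log")
    logger = build_logger(log_path, verbose=True)
    logger.info("[DRY RUN] Would have copied: a.txt -> b.txt")
    logger.warning("Hash mismatch, retrying")
    handler_count = len(logger.handlers)
    # Close handlers so the log file is flushed and can be removed with tmp.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    with open(log_path, encoding="utf-8") as f:
        log_lines = f.read().splitlines()

print(handler_count, len(log_lines))
```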

## 8\. Development & Contributing

Fylex is open to contributions! If you have ideas for new features, bug fixes, or improvements, feel free to:

1.  Fork the repository.
2.  Create a new branch (`git checkout -b feature/your-feature-name`).
3.  Make your changes.
4.  Write clear commit messages.
5.  Submit a Pull Request.

## 9\. License

Fylex is released under the [MIT License](https://opensource.org/licenses/MIT).

xxHash is used under the BSD License.
##

## 10\. Author

**Sivaprasad Murali** —
[sivaprasad.off@gmail.com](mailto:sivaprasad.off@gmail.com)


##
<center>Your files. Your rules. Just smarter.</center>

## 



            

Raw data

            {
    "_id": null,
    "home_page": null,
    "name": "fylex",
    "maintainer": null,
    "docs_url": null,
    "requires_python": ">=3.8",
    "maintainer_email": null,
    "keywords": "file, copy, move, cli, utility, xxhash, filter, conflict, backup, multithread, devtool, python-tool, fylex, deduplication, smart-copy, smart-move",
    "author": null,
    "author_email": "Sivaprasad Murali <sivaprasad.off@gmail.com>",
    "download_url": "https://files.pythonhosted.org/packages/9f/c5/5ddfe8fb0ef9e35a7a26e56522ebba45f873ff1fae759152bd6f6c178d1c/fylex-0.7.2.tar.gz",
    "platform": null,
    "description": "\n# Fylex: Your Intelligent File & Directory Orchestrator\n\n\n[![Python 3.x](https://img.shields.io/badge/Python-3.x-blue.svg)](https://www.python.org/)\n[![PyPI Downloads](https://static.pepy.tech/badge/fylex)](https://pepy.tech/projects/fylex)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n\nFylex is a powerful and flexible Python utility designed to simplify complex file management tasks. From intelligent copying and moving to flattening chaotic directory structures and resolving file conflicts, Fylex provides a robust, concurrent, and log-detailed solution for organizing your digital life.\n\n\n\n## Table of Contents\n\n1.  [Introduction](#introduction)\n2.  [Key Features](#key-features)\n3.  [Installation](#installation)\n4.  [Usage](#usage)\n    * [Core Functions Overview](#core-functions-overview)\n    * [Common Parameters](#common-parameters)\n    * [Conflict Resolution Modes (`on_conflict`)](#conflict-resolution-modes-on_conflict)\n    * [Examples](#examples)\n        * [`copy_files`: Smart file copying](#copy_files-smart-copying)\n        * [`move_files`: Smart file moving](#move_files-smart-moving)\n        * [`copy_dirs`: Smart folder copying](#copy_files-smart-copying)\n        * [`move_dirs`: Smart folder moving](#move_files-smart-moving)\n        * [`super_copy`: Smart unified copying](#copy_files-smart-copying)\n        * [`super_move`: Smart unified moving](#move_files-smart-moving)\n        * [`spill`: Consolidating Files from Subdirectories](#spill-consolidating-files-from-subdirectories)\n        * [`flatten`: Flattening Directory Structures](#flatten-flattening-directory-structures)\n        * [`categorize`: Categorizing files](#categorizing-files)\n        * [`refine`: Refining directories](#refining-directories)\n        * [Handling Junk Files](#handling-junk-files)\n        * [Dry Run and Interactive Modes](#dry-run-and-interactive-modes)\n        * [Working 
with Regex and Glob Patterns](#working-with-regex-and-glob-patterns)\n        \n5.  [Why Fylex is BETTER](#why-fylex-is-better)\n6.  [Error Handling](#error-handling)\n7.  [Logging](#logging)\n8.  [Development & Contributing](#development--contributing)\n9.  [License](#license)\n\n## 1. Introduction\n\nManaging files can quickly become a tedious and error-prone process, especially when dealing with large collections, duplicate files, or disorganized directory structures. Traditional command-line tools offer basic copy/move functionalities, but often lack the intelligence to handle conflicts, filter effectively, or automate complex reorganization patterns.\n\nFylex steps in to fill this gap. It's built on a foundation of robust error handling, concurrent processing, and intelligent decision-making, ensuring your file operations are efficient, safe, and tailored to your needs.\n\n## 2. Key Features\n\n* **Smart Copy (`copy_files`)**: Copy files with advanced filtering, conflict resolution, and integrity verification.\n* **Smart Move (`move_files`)**: Move files, similar to copying, but with source file deletion upon successful transfer and verification.\n* **Smart Directory Copy (`copy_dirs`)**: Copy entire directory structures with advanced filtering and intelligent conflict resolution, including content-based duplicate folder detection.\n* **Smart Directory Move (`move_dirs`)**: Move directory structures, maintaining the same smart features as `copy_dirs`.\n* **Unified Super Operations (`super_copy`, `super_move`)**: Perform both file and folder copy/move operations simultaneously, applying distinct filtering and conflict resolution rules for files and directories in a single command. 
This allows for powerful, combined management tasks.\n* **File Hashing for Reliability**: Utilizes `xxhash` for fast, non-cryptographic hashing to ensure file integrity post-transfer and detect true content duplicates.\n* **Sophisticated Conflict Resolution**: Offers a comprehensive set of strategies to handle name collisions at the destination (e.g., rename, replace, keep larger/smaller/newer/older, skip, or prompt).\n* **Accident Prevention with Deprecated Folders**: Crucially, when an `on_conflict` mode leads to an existing destination file/folder being replaced (e.g., by a \"newer\" or \"larger\" incoming item), Fylex automatically moves the *superseded* item into a timestamped `.fylex_deprecated/` subfolder within the destination directory. This acts as a robust safety net against accidental data loss, allowing you to recover older versions if needed. Additionally, if a source file/folder is skipped because its identical counterpart already exists at the destination, it is also moved to `fylex.deprecated/` by default.\n* **Flexible File and Folder Filtering**:\n    * **Inclusion**: Specify files and/or folders to process using regular expressions (`match_regex`, `folder_match_regex`), exact names (`match_names`, `folder_match_names`), or glob patterns (`match_glob`, `folder_match_glob`).\n    * **Exclusion**: Prevent specific files and/or folders from being processed using `exclude_regex`, `folder_exclude_regex`, `exclude_names`, `folder_exclude_names`, or `exclude_glob`, `folder_exclude_glob`.\n    * **Junk File Awareness**: Predefined `JUNK_EXTENSIONS` helps easily exclude common temporary, system, and development artifacts.\n* **Intelligent Duplicate Folder Detection**: Fylex identifies identical folders (based on their content structure and file hashes) to prevent redundant operations and optimize storage.\n* **Directory Reorganization Utilities**:\n    * **`spill`**: Consolidate files from nested subdirectories up to a specified depth into a parent 
directory.\n    * **`flatten`**: Move all files from an entire directory tree into a single target directory, then automatically delete the empty subdirectories.\n    * **`categorize`**: Organize files into logical subdirectories based on criteria like file name patterns (regex/glob), size ranges, or file extensions.\n* **Concurrency for Speed**: Leverages Python's `ThreadPoolExecutor` to perform file and folder operations in parallel, significantly speeding up tasks involving many files.\n* **Dry Run Mode**: Simulate any operation without making actual changes to the filesystem. Essential for verifying complex commands before execution.\n* **Interactive Mode**: Prompts for user confirmation before each file or folder operation, providing fine-grained control.\n* **Comprehensive Logging**: All actions, warnings, and errors are meticulously logged to `fylex.log` for easy auditing and debugging.\n* **Robust Path Validation**: Prevents common pitfalls like attempting to copy a directory into itself, or operating on non-existent paths.\n* **Retry Mechanism**: Failed file operations are retried up to `MAX_RETRIES` to handle transient network issues or temporary file locks.\n* **Intelligent Duplicate Refinement**: Identifies true content duplicates using hashing and safely moves them to a deprecated folder, freeing up disk space.\n\n## 3. Installation\n\nFylex is designed to be integrated into your Python projects or run as a standalone script.\n\n1.  **Clone the repository**:\n    ```bash\n    git clone https://github.com/Crystallinecore/fylex.git\n    cd fylex\n    ```\n2.  **Install dependencies**:\n    Fylex requires `xxhash`.\n    ```bash\n    pip install xxhash\n    ```\n3.  **Include in your project**:\n    You can import Fylex functions directly into your Python scripts:\n    ```python\n    from fylex import copy_files, move_files, spill, flatten, delete_empty_dirs\n    from fylex.exceptions import InvalidPathError, PermissionDeniedError\n    ```\n\n## 4. 
Usage\n\nFylex functions are designed to be intuitive. Here's a breakdown of the core functions and their parameters.\n\n### Core Functions Overview\n\n| Function | Type | Description |\n| :--- | :--- | :--- |\n| `copy_files(src, dest, **kwargs)` | `copy` | Copies files from `src` to `dest`. |\n| `move_files(src, dest, **kwargs)` | `move` | Moves files from `src` to `dest`. |\n| `copy_dirs(src, dest, **kwargs)` | `copy` | Copies directories and their contents from `src` to `dest`. |\n| `move_dirs(src, dest, **kwargs)` | `move` | Moves directories and their contents from `src` to `dest`. |\n| `super_copy(src, dest, **kwargs)` | `copy` | Copies both files and directories from `src` to `dest` with separate rules. |\n| `super_move(src, dest, **kwargs)` | `move` | Moves both files and directories from `src` to `dest` with separate rules. |\n| `spill(target, **kwargs)` | `reorganize` | Moves files from subdirectories within `target` to `target`. |\n| `flatten(target, **kwargs)` | `reorganize` | Moves all files from subdirectories within `target` to `target` and deletes empty subdirectories. |\n| `categorize(target, categorize_by, grouping=None, default=None, **kwargs)` | `organize` | Orchestrates categorization based on specified `categorize_by` mode. |\n| `categorize_by_name(target, grouping, default=None, **kwargs)` | `organize` | Categorizes files by name using regex/glob patterns. |\n| `categorize_by_size(target, grouping, default=None, **kwargs)` | `organize` | Categorizes files by size, using specific sizes or ranges. |\n| `categorize_by_ext(target, default=None, **kwargs)` | `organize` | Categorizes files by their file extension. |\n| `refine(target, **kwargs)` | `deduplicate` | Identifies and manages duplicate files within a target directory, moving redundant copies to a deprecated folder. |\n| `delete_empty_dirs(target)` | `cleanup` | Recursively deletes all empty subdirectories within `target`. 
|\n\n\n\n### Common Parameters\n\n\n### Conflict Resolution Modes (`on_conflict`)\n\nFylex offers smart handling of file name conflicts at the destination. The `on_conflict` parameter accepts one of the following string values:\n\n* **`\"rename\"` (Default)**: If a file with the same name exists, the incoming file will be renamed (e.g., `document.txt` becomes `document(1).txt`, `document(2).txt`, etc.) to avoid overwriting.\n* **`\"replace\"`**: The incoming file will unconditionally overwrite the existing file at the destination. **The original file will be moved to a timestamped `.fylex_deprecated/` folder within the destination for safety.**\n* **`\"larger\"`**: The file with the larger file size will be kept. If the existing file is larger or equal, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is larger, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.\n* **`\"smaller\"`**: The file with the smaller file size will be kept. If the existing file is smaller or equal, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is smaller, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.\n* **`\"newer\"`**: The file with the more recent modification timestamp will be kept. If the existing file is newer or has the same timestamp, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. If the incoming file is newer, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.\n* **`\"older\"`**: The file with the older modification timestamp will be kept. If the existing file is older or has the same timestamp, the incoming file is skipped. **If skipped, the source file is moved to `fylex.deprecated/`**. 
If the incoming file is older, it replaces the existing one, and the **original file is moved to `.fylex_deprecated/`**.\n* **`\"skip\"`**: The incoming file will be skipped entirely if a file with the same name exists at the destination. **The skipped source file is moved to `fylex.deprecated/` for review.**\n* **`\"prompt\"`**: Fylex will ask the user interactively (via console) whether to replace the existing file or skip the incoming one. If \"replace\" is chosen, the **original file is moved to `.fylex_deprecated/`**. If \"skip\" is chosen, the **skipped source file is moved to `fylex.deprecated/`**.\n\n### Examples\n\nLet's assume the following directory structure for the examples:\n\n````\n/data/\n\u251c\u2500\u2500 project_A/\n\u2502  \u251c\u2500\u2500 main.py\n\u2502  \u251c\u2500\u2500 config.ini\n\u2502  \u2514\u2500\u2500 docs/\n\u2502      \u251c\u2500\u2500 readme.md\n\u2502      \u2514\u2500\u2500 images/\n\u2502          \u2514\u2500\u2500 img_01.png\n\u251c\u2500\u2500 project_B/\n\u2502  \u251c\u2500\u2500 index.html\n\u2502  \u2514\u2500\u2500 style.css\n\u251c\u2500\u2500 temp/\n\u2502  \u251c\u2500\u2500 .tmp\n\u2502  \u251c\u2500\u2500 old_data.bak\n\u2502  \u2514\u2500\u2500 report.log\n\u251c\u2500\u2500 my_files/\n\u2502  \u251c\u2500\u2500 photo.jpg\n\u2502  \u251c\u2500\u2500 document.pdf\n\u2502  \u2514\u2500\u2500 sub_folder/\n\u2502      \u2514\u2500\u2500 nested_file.txt\n\u251c\u2500\u2500 important_notes.txt\n\u2514\u2500\u2500 large_archive.zip (assume large size, e.g., 50MB)\n\u2514\u2500\u2500 small_image.png (assume small size, e.g., 100KB)\n\u2514\u2500\u2500 duplicate_photo.jpg (exact same content as photo.jpg)\n\n````\n\nAnd your destination directory is initially empty: `/backup/`\n\n##\n#### `copy_files`: Smart Copying\n\n`copy_files` Copies files from a source to a destination, with advanced conflict resolution and filtering options.\n\n| Parameter        | Default        | Description 
|\n|------------------|----------------|-------------|\n| `src`            | **Required**   | Source path (directory or iterable of files). |\n| `dest`           | **Required**   | Destination directory path. |\n| `no_create`      | `False`        | If `True`, raises error if destination does not exist. |\n| `interactive`    | `False`        | If `True`, prompts user before each copy. |\n| `dry_run`        | `False`        | Simulates the copy without modifying files. |\n| `match_regex`    | `None`         | Regex pattern to include matching files. |\n| `match_names`    | `None`         | List of exact filenames to include. |\n| `match_glob`     | `None`         | Glob pattern(s) to include. |\n| `exclude_regex`  | `None`         | Regex pattern to exclude matching files. |\n| `exclude_names`  | `None`         | List of exact filenames to exclude. |\n| `exclude_glob`   | `None`         | Glob pattern(s) to exclude. |\n| `summary`        | `None`         | Optional list to append summary messages. |\n| `on_conflict`    | `\"rename\"`     | Strategy on conflict: `rename`, `skip`, `replace`, `newer`, `older`, `larger`, `smaller`, `prompt`. |\n| `max_workers`    | `4`            | Number of threads for parallel operations. |\n| `recursive_check`| `False`        | Recursively filter files in subdirectories. |\n| `verbose`        | `False`        | If `True`, prints detailed log for each operation. 
|\n\n#### Example\n```python\nfrom fylex import copy_files\n\n# Example 1: Copy all Python files from project_A to /backup, resolving conflicts by renaming.\n# With recursive_check=False (the default), only the top-level files of project_A are scanned.\ncopy_files(src=\"/data/project_A\", dest=\"/backup\",\n           match_glob=\"*.py\", on_conflict=\"rename\", verbose=True)\n# Result: /backup/main.py\n\n# Example 2: Copy all files from /data/my_files including subdirectories,\n# excluding .txt files, and keep the newer version on conflict.\n# If a file like 'photo.jpg' exists in /backup/my_backup and is older,\n# the existing 'photo.jpg' would be moved to '/backup/my_backup/.fylex_deprecated/YYYY-MM-DD_HH-MM-SS/'\n# before the new 'photo.jpg' is copied.\ncopy_files(src=\"/data/my_files\", dest=\"/backup/my_backup\",\n           recursive_check=True, exclude_glob=\"*.txt\", on_conflict=\"newer\", verbose=True)\n# Result: /backup/my_backup/photo.jpg, /backup/my_backup/document.pdf\n# (nested_file.txt would be skipped due to exclusion)\n\n# Example 3: Copy only 'important_notes.txt' from /data to /backup\ncopy_files(src=\"/data\", dest=\"/backup\",\n           match_names=[\"important_notes.txt\"], verbose=True)\n# Result: /backup/important_notes.txt\n```\n##\n#### `move_files`: Smart Moving\n\n`move_files` works identically to `copy_files` but deletes the source file after successful transfer.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `src`            | **Required**   | Source path (directory or iterable of files). |\n| `dest`           | **Required**   | Destination directory path. |\n| `no_create`      | `False`        | If `True`, raises error if destination does not exist. |\n| `interactive`    | `False`        | If `True`, prompts user before each move. |\n| `dry_run`        | `False`        | Simulates the move without modifying files. 
|\n| `match_regex`    | `None`         | Regex pattern to include matching files. |\n| `match_names`    | `None`         | List of exact filenames to include. |\n| `match_glob`     | `None`         | Glob pattern(s) to include. |\n| `exclude_regex`  | `None`         | Regex pattern to exclude matching files. |\n| `exclude_names`  | `None`         | List of exact filenames to exclude. |\n| `exclude_glob`   | `None`         | Glob pattern(s) to exclude. |\n| `summary`        | `None`         | Optional list to append summary messages. |\n| `on_conflict`    | `\"rename\"`     | Strategy on conflict: `rename`, `skip`, `replace`, `newer`, `older`, `larger`, `smaller`, `prompt`. |\n| `max_workers`    | `4`            | Number of threads for parallel operations. |\n| `recursive_check`| `False`        | Recursively filter files in subdirectories. |\n| `verbose`        | `False`        | If `True`, prints detailed log for each operation. |\n\n#### Example\n```python\nfrom fylex import move_files\n\n# Example: Move all .html and .css files from project_B to /web_files,\n# prompting on conflict.\n# If the user chooses to replace, the existing file in /web_files would be moved\n# to '/web_files/.fylex_deprecated/YYYY-MM-DD_HH-MM-SS/'.\n# If the user chooses to skip, the source file (e.g., /data/project_B/index.html)\n# would be moved to 'fylex.deprecated/' (in the current working directory).\n# Brace alternation such as \"*.{html,css}\" is not part of Python's fnmatch-style glob syntax,\n# so a regex is used here to match both extensions.\nmove_files(src=\"/data/project_B\", dest=\"/web_files\",\n           match_regex=r\".*\\.(html|css)$\", on_conflict=\"prompt\", interactive=True, verbose=True)\n# User would be prompted for each file if it already exists in /web_files.\n# After successful move: /data/project_B will no longer contain index.html or style.css\n```\n##\n\n#### `copy_dirs`: Smart Directory Copying\n\n`copy_dirs` allows you to copy entire directory structures with advanced filtering and conflict resolution, including content-based duplicate folder detection.\n\n| Parameter        | Default        | Description 
|\n|------------------|----------------|-------------|\n| `src`            | **Required**   | Source directory path. |\n| `dest`           | **Required**   | Target directory path. |\n| `no_create`      | `False`        | Raise error if destination doesn't exist. |\n| `interactive`    | `False`        | Prompt user before copying each folder or file. |\n| `dry_run`        | `False`        | Simulate the directory copy without changes. |\n| `match_regex`    | `None`         | Include files matching regex pattern. |\n| `match_names`    | `None`         | Include files with specific names. |\n| `match_glob`     | `None`         | Include files matching glob pattern. |\n| `exclude_regex`  | `None`         | Exclude files matching regex. |\n| `exclude_names`  | `None`         | Exclude files with specific names. |\n| `exclude_glob`   | `None`         | Exclude files matching glob pattern. |\n| `summary`        | `None`         | Store summary/logging messages. |\n| `on_conflict`    | `\"rename\"`     | Strategy for existing files or directories. |\n| `max_workers`    | `4`            | Number of worker threads. |\n| `recursive_check`| `False`        | Recursively apply filters during directory traversal. |\n| `verbose`        | `False`        | Enable detailed output. 
|\n\n#### Example\n```python\nfrom fylex import copy_dirs\n\n# Example: Copy a specific project folder from a development drive to a backup drive.\n# If 'my_project' already exists in '/backups', Fylex will keep the newer version.\n# If the existing folder in '/backups' is older, it will be moved to\n# '/backups/.fylex_deprecated/YYYY-MM-DD_HH-MM-SS/my_project/' before the new one is copied.\ncopy_dirs(src=\"/dev_drive/projects\", dest=\"/backups\",\n          folder_match_names=[\"my_project\"], on_conflict=\"newer\", verbose=True)\n# If '/backups/my_project' was older than '/dev_drive/projects/my_project',\n# the old '/backups/my_project' is deprecated, and the new one is copied.\n\n# Example: Copy all folders related to 'docs' or 'reports', excluding sensitive ones,\n# and use dry_run to see what would happen.\ncopy_dirs(src=\"/shared_drive/department_data\", dest=\"/archive/department_docs\",\n          folder_match_regex=\"^(docs|reports)_.*\",\n          folder_exclude_names=[\"docs_sensitive\", \"reports_internal_only\"],\n          dry_run=True, verbose=True)\n# This command will only print logs about which directories *would* be copied and where.\n```\n##\n#### `move_dirs`: Smart Directory Moving\n\n`move_dirs` works identically to `copy_dirs` but deletes the source directory after successful transfer and verification.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `src`            | **Required**   | Source directory path. |\n| `dest`           | **Required**   | Target directory path. |\n| `no_create`      | `False`        | Raise error if destination doesn't exist. |\n| `interactive`    | `False`        | Prompt user before copying each folder or file. |\n| `dry_run`        | `False`        | Simulate the directory copy without changes. |\n| `match_regex`    | `None`         | Include files matching regex pattern. |\n| `match_names`    | `None`         | Include files with specific names. 
|\n| `match_glob`     | `None`         | Include files matching glob pattern. |\n| `exclude_regex`  | `None`         | Exclude files matching regex. |\n| `exclude_names`  | `None`         | Exclude files with specific names. |\n| `exclude_glob`   | `None`         | Exclude files matching glob pattern. |\n| `summary`        | `None`         | Store summary/logging messages. |\n| `on_conflict`    | `\"rename\"`     | Strategy for existing files or directories. |\n| `max_workers`    | `4`            | Number of worker threads. |\n| `recursive_check`| `False`        | Recursively apply filters during directory traversal. |\n| `verbose`        | `False`        | Enable detailed output. |\n\n#### Example\n```python\nfrom fylex import move_dirs\n\n# Example: Move a completed project folder from \"in progress\" to \"completed\" archives.\n# If a folder with the same name exists in '/archive/completed_projects',\n# the larger one will be kept. If the incoming folder is smaller or equal,\n# the source folder will be moved to 'fylex.deprecated/'.\nmove_dirs(src=\"/project_workspace/in_progress\", dest=\"/archive/completed_projects\",\n          folder_match_names=[\"ProjectX_Final\"], on_conflict=\"larger\", verbose=True)\n# If '/project_workspace/in_progress/ProjectX_Final' was smaller than or equal to\n# '/archive/completed_projects/ProjectX_Final', the source folder\n# '/project_workspace/in_progress/ProjectX_Final' would be moved to 'fylex.deprecated/'.\n# Otherwise, the existing one in '/archive/completed_projects' would be deprecated,\n# and the source would be moved.\n```\n##\n#### `super_copy`: Unified Smart Copy (Files and Directories)\n\n`super_copy` allows you to copy both files and directories simultaneously from a source to a destination, applying distinct filtering and conflict resolution rules for each.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `src`            | **Required**   | Source path 
(file or directory). |\n| `dest`           | **Required**   | Target path. |\n| `no_create`      | `False`        | Raise error if destination doesn't exist. |\n| `interactive`    | `False`        | Prompt before copying. |\n| `dry_run`        | `False`        | Run in preview-only mode. |\n| `file_match_regex`    | `None`         | File inclusion pattern. |\n| `file_match_names`    | `None`         | Files to include by name. |\n| `file_match_glob`     | `None`         | Files to include using glob. |\n| `folder_match_regex`    | `None`         | Folder inclusion pattern. |\n| `folder_match_names`    | `None`         | Folders to include by name. |\n| `folder_match_glob`     | `None`         | Folders to include using glob. |\n| `file_exclude_regex`  | `None`         | File exclusion regex. |\n| `file_exclude_names`  | `None`         | Files to exclude by name. |\n| `file_exclude_glob`   | `None`         | Files to exclude using glob. |\n| `folder_exclude_regex`  | `None`         | Folder exclusion regex. |\n| `folder_exclude_names`  | `None`         | Folders to exclude by name. |\n| `folder_exclude_glob`   | `None`         | Folders to exclude using glob. |\n| `summary`        | `None`         | Collects operation summaries. |\n| `file_on_conflict`    | `\"rename\"`     | How to resolve naming conflicts for files. |\n| `folder_on_conflict`    | `\"rename\"`     | How to resolve naming conflicts for folders. |\n| `max_workers`    | `4`            | Parallel thread pool size. |\n| `recursive_check`| `False`        | Traverse and filter recursively. |\n| `verbose`        | `False`        | Enable verbose mode. |\n\n#### Example\n```python\nfrom fylex import super_copy\n\n# Example: Copy a mixed-content project folder to a backup.\n# Copy all .py files, and specific 'config' and 'data' folders.\n# For files, rename on conflict. 
For folders, replace if newer.\nsuper_copy(src=\"/my_dev_project\", dest=\"/backup_dev\",\n           file_match_glob=\"*.py\", file_on_conflict=\"rename\",\n           folder_match_names=[\"config\", \"data\"], folder_on_conflict=\"newer\",\n           file_recursive_check=True, folder_recursive_check=True, verbose=True)\n# This will copy all .py files found recursively within /my_dev_project,\n# renaming them if they conflict in /backup_dev.\n# It will also copy 'config' and 'data' subfolders, recursively, replacing them\n# in /backup_dev if the source version is newer (deprecating the old one).\n\n# Example: Copy an entire repository structure, excluding certain files and hidden directories.\nsuper_copy(src=\"/my_repo\", dest=\"/clean_archive\",\n           file_exclude_glob=\"*.log\",\n           folder_exclude_regex=r\"^\\.\", # Raw string; excludes hidden folders like .git, .vscode etc.\n           file_on_conflict=\"skip\",\n           folder_on_conflict=\"skip\",\n           recursive_check=True, # Apply file filtering recursively\n           folder_recursive_check=True, # Apply folder filtering recursively\n           dry_run=True, verbose=True)\n# This dry run will show which files (excluding .log) and which folders (excluding hidden ones)\n# would be copied, skipping any conflicts.\n```\n##\n#### `super_move`: Unified Smart Move (Files and Directories)\n\n`super_move` works identically to `super_copy` but deletes the source files and directories upon successful transfer and verification.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `src`            | **Required**   | Source path (file or directory). |\n| `dest`           | **Required**   | Target path. |\n| `no_create`      | `False`        | Raise error if destination doesn't exist. |\n| `interactive`    | `False`        | Prompt before moving. |\n| `dry_run`        | `False`        | Run in preview-only mode. 
|\n| `file_match_regex`    | `None`         | File inclusion pattern. |\n| `file_match_names`    | `None`         | Files to include by name. |\n| `file_match_glob`     | `None`         | Files to include using glob. |\n| `folder_match_regex`    | `None`         | Folder inclusion pattern. |\n| `folder_match_names`    | `None`         | Folders to include by name. |\n| `folder_match_glob`     | `None`         | Folders to include using glob. |\n| `file_exclude_regex`  | `None`         | File exclusion regex. |\n| `file_exclude_names`  | `None`         | Files to exclude by name. |\n| `file_exclude_glob`   | `None`         | Files to exclude using glob. |\n| `folder_exclude_regex`  | `None`         | Folder exclusion regex. |\n| `folder_exclude_names`  | `None`         | Folders to exclude by name. |\n| `folder_exclude_glob`   | `None`         | Folders to exclude using glob. |\n| `summary`        | `None`         | Collects operation summaries. |\n| `file_on_conflict`    | `\"rename\"`     | How to resolve naming conflicts for files. |\n| `folder_on_conflict`    | `\"rename\"`     | How to resolve naming conflicts for folders. |\n| `max_workers`    | `4`            | Parallel thread pool size. |\n| `recursive_check`| `False`        | Traverse and filter recursively. |\n| `verbose`        | `False`        | Enable verbose mode. 
|\n\n#### Example\n```python\nfrom fylex import super_move\n\n# Example: Migrate a project from a staging area to production.\n# Move all image files (.jpg, .png) and a specific 'assets' folder.\n# Images that conflict will keep the larger version.\n# The 'assets' folder will be replaced if the incoming one is newer.\n# Brace alternation such as \"*.{jpg,png}\" is not part of Python's fnmatch-style glob syntax,\n# so a regex is used here to match both extensions.\nsuper_move(src=\"/staging/prod_build\", dest=\"/production/app_data\",\n           file_match_regex=r\".*\\.(jpg|png)$\", file_on_conflict=\"larger\",\n           folder_match_names=[\"assets\"], folder_on_conflict=\"newer\",\n           file_recursive_check=True, folder_recursive_check=True, verbose=True)\n# This will move all specified image files, keeping the larger version on conflict.\n# It will move the 'assets' folder, replacing the destination if newer (deprecating the old one).\n# Source files and folders will be deleted after successful move.\n```\n\n##\n#### `spill`: Consolidating Files from Subdirectories\n\n`spill` moves files from nested directories into the `target` root directory.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `target`         | **Required**   | Target directory whose nested files are moved to its root. |\n| `interactive`    | `False`        | Prompts user before moving duplicate files. |\n| `dry_run`        | `False`        | Simulates changes without actual deletion or move. |\n| `match_regex`    | `None`         | Include files matching regex. |\n| `match_names`    | `None`         | Include specific filenames. |\n| `match_glob`     | `None`         | Include files by glob pattern. |\n| `exclude_regex`  | `None`         | Exclude files by regex. |\n| `exclude_names`  | `None`         | Exclude specific filenames. |\n| `exclude_glob`   | `None`         | Exclude files using glob pattern. |\n| `summary`        | `None`         | Optional log collector list. |\n| `on_conflict`    | `\"rename\"`     | Conflict resolution strategy. 
|\n| `max_workers`    | `4`            | Threads for parallel hashing. |\n| `recursive_check`| `False`        | Traverse and check all subfolders. |\n| `verbose`        | `False`        | Print each step for transparency. |\n\n#### Example\n```python\nfrom fylex import spill\nimport os\nimport shutil\n\n# Setup for spill example:\nos.makedirs(\"/data/temp_spill/level1/level2\", exist_ok=True)\nwith open(\"/data/temp_spill/fileA.txt\", \"w\") as f: f.write(\"A\")\nwith open(\"/data/temp_spill/level1/fileB.txt\", \"w\") as f: f.write(\"B\")\nwith open(\"/data/temp_spill/level1/level2/fileC.txt\", \"w\") as f: f.write(\"C\")\nwith open(\"/data/temp_spill/level1/level2/image.jpg\", \"w\") as f: f.write(\"C\")\n\n# Example 1: Spill all files from subdirectories (infinite levels) into /data/temp_spill.\n# If fileB.txt already existed in /data/temp_spill, it would be deprecated based on conflict mode.\nspill(target=\"/data/temp_spill\", levels=-1, verbose=True)\n# Result: /data/temp_spill/fileA.txt, /data/temp_spill/fileB.txt, /data/temp_spill/fileC.txt, /data/temp_spill/image.jpg\n# (fileA.txt is already at root, so not moved)\n# The empty subdirectories /data/temp_spill/level1 and /data/temp_spill/level1/level2 will remain.\n\n# Clean up for next example:\nshutil.rmtree(\"/data/temp_spill\")\nos.makedirs(\"/data/temp_spill/level1/level2\", exist_ok=True)\nwith open(\"/data/temp_spill/fileA.txt\", \"w\") as f: f.write(\"A\")\nwith open(\"/data/temp_spill/level1/fileB.txt\", \"w\") as f: f.write(\"B\")\nwith open(\"/data/temp_spill/level1/level2/fileC.txt\", \"w\") as f: f.write(\"C\")\n\n# Example 2: Spill only files from immediate subdirectories (level 1), excluding .txt files.\nspill(target=\"/data/temp_spill\", levels=1, exclude_glob=\"*.txt\", verbose=True)\n# Result: Only files from /data/temp_spill/level1 (like fileB.txt if not excluded) would be considered.\n# In this specific setup, since only .txt files are present, nothing would move.\n# If image.jpg was in 
level1, it would move.\n```\n\n##\n#### `flatten`: Flattening Directory Structures\n\n`flatten` is ideal for taking a messy, deeply nested folder and putting all its files into one level, then cleaning up the empty folders.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `target`         | **Required**   | Directory whose files are to be categorized. |\n| `categorize_by`  | **Required**   | Mode: `\"name\"`, `\"size\"`, or `\"ext\"`. |\n| `grouping`       | `None`         | Mapping of patterns to folder paths. |\n| `default`        | `None`         | Path to move unmatched files. |\n| `interactive`    | `False`        | Prompts before each categorization. |\n| `dry_run`        | `False`        | Preview the changes without modifying anything. |\n| `summary`        | `None`         | Collect log summary here. |\n| `max_workers`    | `4`            | Number of threads used for sorting. |\n| `verbose`        | `False`        | Show per-file operations and logs. |\n| `recursive_check`| `False`        | Include subdirectories in file selection. 
|\n\n#### Example\n```python\nfrom fylex import flatten\nimport os\nimport shutil\n\n# Setup for flatten example (same as spill setup):\nos.makedirs(\"/data/temp_flatten/level1/level2\", exist_ok=True)\nwith open(\"/data/temp_flatten/fileX.log\", \"w\") as f: f.write(\"X\") # Will be ignored by default junk filter\nwith open(\"/data/temp_flatten/level1/fileY.jpg\", \"w\") as f: f.write(\"Y\")\nwith open(\"/data/temp_flatten/level1/level2/fileZ.pdf\", \"w\") as f: f.write(\"Z\")\n\n# Example: Flatten the entire /data/temp_flatten directory.\n# Any files in subdirectories that would overwrite an existing file in /data/temp_flatten\n# would first cause the existing file to be moved to '/data/temp_flatten/.fylex_deprecated/'.\nflatten(target=\"/data/temp_flatten\", verbose=True)\n# Result: /data/temp_flatten/fileX.log, /data/temp_flatten/fileY.jpg, /data/temp_flatten/fileZ.pdf\n# After operation, /data/temp_flatten/level1/ and /data/temp_flatten/level1/level2/ will be deleted.\n```\n##\n#### Categorizing Files\n\nFylex offers flexible ways to categorize files into new or existing directories.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `target`         | **Required**   | Directory whose files are to be categorized. |\n| `categorize_by`  | **Required**   | Mode: `\"name\"`, `\"size\"`, or `\"ext\"`. |\n| `grouping`       | `None`         | Mapping of patterns to folder paths. |\n| `default`        | `None`         | Path to move unmatched files. |\n| `interactive`    | `False`        | Prompts before each categorization. |\n| `dry_run`        | `False`        | Preview the changes without modifying anything. |\n| `summary`        | `None`         | Collect log summary here. |\n| `max_workers`    | `4`            | Number of threads used for sorting. |\n| `verbose`        | `False`        | Show per-file operations and logs. 
|\n| `recursive_check`| `False`        | Include subdirectories in file selection. |\n\n#### Example\n```python\nfrom fylex import categorize\nimport os\n\n# Create dummy files for categorization\nos.makedirs(\"/data/categorize_source\", exist_ok=True)\nwith open(\"/data/categorize_source/report_april.pdf\", \"w\") as f: f.write(\"report content\")\nwith open(\"/data/categorize_source/meeting_notes.txt\", \"w\") as f: f.write(\"notes content\")\nwith open(\"/data/categorize_source/photo_2023.jpg\", \"w\") as f: f.write(\"photo content\")\nwith open(\"/data/categorize_source/large_video.mp4\", \"w\") as f: f.write(\"large content\" * 1000000) # ~13 MB (13 bytes x 1,000,000)\nwith open(\"/data/categorize_source/small_doc.docx\", \"w\") as f: f.write(\"small content\") # 13 bytes\n\n# Example 1: Categorize by file extension\ncategorize(\n    target=\"/data/categorize_source\",\n    categorize_by=\"ext\",\n    default=\"/data/categorize_destination/misc\", # Files without common extensions or uncategorized\n    dry_run=True,\n    verbose=True\n)\n# Expected Dry Run Output:\n# Would move /data/categorize_source/report_april.pdf to /data/categorize_destination/pdf/report_april.pdf\n# Would move /data/categorize_source/meeting_notes.txt to /data/categorize_destination/txt/meeting_notes.txt\n# etc.\n\n# Example 2: Categorize by file name using regex and glob\ngrouping_by_name = {\n    r\"^report_.*\\.pdf$\": \"/data/categorize_destination/Reports\", # Regex for reports\n    (\"photo_*.jpg\", \"glob\"): \"/data/categorize_destination/Images\" # Glob for photos\n}\ncategorize(\n    target=\"/data/categorize_source\",\n    categorize_by=\"name\",\n    grouping=grouping_by_name,\n    default=\"/data/categorize_destination/Other\",\n    dry_run=True,\n    verbose=True\n)\n# Expected Dry Run Output:\n# Would move /data/categorize_source/report_april.pdf to /data/categorize_destination/Reports/report_april.pdf\n# Would move /data/categorize_source/photo_2023.jpg to 
/data/categorize_destination/Images/photo_2023.jpg\n# Others would go to /data/categorize_destination/Other\n\n# Example 3: Categorize by file size (ranges in bytes)\ngrouping_by_size = {\n    (0, 1024): \"/data/categorize_destination/SmallFiles\", # 0 to 1KB\n    (1024 * 1024, \"max\"): \"/data/categorize_destination/LargeFiles\" # 1MB and above\n}\ncategorize(\n    target=\"/data/categorize_source\",\n    categorize_by=\"size\",\n    grouping=grouping_by_size,\n    default=\"/data/categorize_destination/MediumFiles\",\n    dry_run=True,\n    verbose=True\n)\n# Expected Dry Run Output:\n# Would move /data/categorize_source/small_doc.docx to /data/categorize_destination/SmallFiles/small_doc.docx\n# Would move /data/categorize_source/large_video.mp4 to /data/categorize_destination/LargeFiles/large_video.mp4\n# The remaining dummy files are also under 1KB, so they would land in SmallFiles as well.\n# Only files between 1KB and 1MB would go to /data/categorize_destination/MediumFiles\n```\n##\n#### Refining Directories (Deduplicating)\n\n`refine` identifies and safely handles duplicate files based on their content.\n\n| Parameter        | Default        | Description |\n|------------------|----------------|-------------|\n| `target`         | **Required**   | Target directory to refine. |\n| `interactive`    | `False`        | Prompts user before moving duplicate files. |\n| `dry_run`        | `False`        | Simulates changes without actual deletion or move. |\n| `match_regex`    | `None`         | Include files matching regex. |\n| `match_names`    | `None`         | Include specific filenames. |\n| `match_glob`     | `None`         | Include files by glob pattern. |\n| `exclude_regex`  | `None`         | Exclude files by regex. |\n| `exclude_names`  | `None`         | Exclude specific filenames. |\n| `exclude_glob`   | `None`         | Exclude files using glob pattern. |\n| `summary`        | `None`         | Optional log collector list. |\n| `on_conflict`    | `\"rename\"`     | Conflict resolution strategy. 
|\n| `max_workers`    | `4`            | Threads for parallel hashing. |\n| `recursive_check`| `False`        | Traverse and check all subfolders. |\n| `verbose`        | `False`        | Print each step for transparency. |\n\n#### Example\n```python\nfrom fylex import refine\nimport os\nimport shutil\nimport hashlib\n\n# Setup for refine example\nos.makedirs(\"/data/refine_test\", exist_ok=True)\nwith open(\"/data/refine_test/file1.txt\", \"w\") as f: f.write(\"unique content A\")\nwith open(\"/data/refine_test/file2_copy.txt\", \"w\") as f: f.write(\"unique content A\") # Duplicate of file1.txt\nwith open(\"/data/refine_test/image.jpg\", \"w\") as f: f.write(\"image data\")\nos.makedirs(\"/data/refine_test/sub_folder\", exist_ok=True)\nwith open(\"/data/refine_test/sub_folder/file1_in_sub.txt\", \"w\") as f: f.write(\"unique content A\") # Duplicate of file1.txt\n\n# Example 1: Find and deprecate duplicates in /data/refine_test (dry run)\nrefine(\n    target=\"/data/refine_test\",\n    recursive_check=True, # Check subdirectories\n    dry_run=True,\n    verbose=True\n)\n# Expected Dry Run Output:\n# [DRY RUN] Duplicate: /data/refine_test/file2_copy.txt would have been safely backed up at /data/refine_test/fylex.deprecated\n# [DRY RUN] Duplicate: /data/refine_test/sub_folder/file1_in_sub.txt would have been safely backed up at /data/refine_test/fylex.deprecated\n# File retained: /data/refine_test/file1.txt\n# File retained: /data/refine_test/image.jpg\n\n# Example 2: Actual deduplication (remove dry_run=True to execute)\n# This would create /data/refine_test/.fylex_deprecated/ and move duplicates into it.\n# refine(\n#     target=\"/data/refine_test\",\n#     recursive_check=True,\n#     verbose=True\n# )\n```\n\n##\n#### Handling Junk Files\n\nFylex comes with a predefined list of common \"junk\" file extensions and names. 
You can leverage this via the `exclude_names` and `exclude_glob` parameters or modify the `JUNK_EXTENSIONS` dictionary in the source.\n\n```python\nfrom fylex import copy_files, JUNK_EXTENSIONS\n\n# Combine all junk extensions and names into lists for exclusion\nall_junk_extensions = [ext for sublist in JUNK_EXTENSIONS.values() for ext in sublist if ext.startswith(\".\")]\nall_junk_names = [name for sublist in JUNK_EXTENSIONS.values() for name in sublist if not name.startswith(\".\")]\n\n# Example: Copy files from /data/temp to /archive, excluding some common junk.\n# The lists above are shown for reference only; for simplicity,\n# this call passes a few explicit exclusions instead.\ncopy_files(src=\"/data/temp\", dest=\"/archive\",\n           exclude_glob=\"*.tmp\", # Exclude temporary files\n           exclude_names=[\"thumbs.db\", \"desktop.ini\"], # Exclude specific names\n           recursive_check=True, verbose=True)\n# Result: .tmp is excluded by the \"*.tmp\" glob (assuming fnmatch-style matching, where \"*\" may match nothing).\n# old_data.bak and report.log match none of these exclusions, so they would still be copied.\n```\n##\n#### Dry Run and Interactive Modes\n\n```python\nfrom fylex import copy_files\n\n# Example: See what would happen if you were to copy all .md files without actually doing it.\n# recursive_check=True is needed so that docs/readme.md, which sits below the top level, is found.\ncopy_files(src=\"/data/project_A\", dest=\"/backup\",\n           match_glob=\"*.md\", recursive_check=True, dry_run=True, verbose=True)\n# Output in log/console: \"[DRY RUN] Would have copied: /data/project_A/docs/readme.md -> /backup/readme.md\"\n# No files are actually copied.\n\n# Example: Be prompted for every action\ncopy_files(src=\"/data/project_A\", dest=\"/backup\",\n           match_glob=\"*.ini\", interactive=True, verbose=True)\n# Console: \"Copy /data/project_A/config.ini to /backup/config.ini? 
[y/N]:\"\n# User input determines if the copy proceeds.\n```\n##\n#### Working with Regex and Glob Patterns\n\nFylex allows you to combine regex and glob patterns for precise filtering.\n\n```python\nfrom fylex import copy_files\n\n# Example 1: Copy files that are either .jpg OR start with 'report'\ncopy_files(src=\"/data/my_files\", dest=\"/backup\",\n           match_regex=r\".*\\.jpg$\", # Matches any .jpg\n           match_glob=\"report*\", # Matches files starting with 'report'\n           verbose=True)\n\n# Example 2: Exclude files that are either log files or contain 'temp' in their name\ncopy_files(src=\"/data/\", dest=\"/filtered_data\",\n           recursive_check=True,\n           exclude_regex=r\".*\\.log$\",\n           exclude_glob=\"*temp*\", # Matches files containing 'temp'\n           verbose=True)\n```\n\n\n## 5\\. Why Fylex is Superior\n\nCompared to standard shell commands (`cp`, `mv`, `rm`, `find`, `robocopy` / `rsync`) or even basic scripting, Fylex offers significant advantages:\n\n1.  **Intelligent Conflict Resolution (Beyond Overwrite/Rename) *with Accident Prevention***:\n\n      * **Shell**: `cp -f` overwrites, `cp -n` skips. `robocopy` offers more, but still lacks integrated safe-guards.\n      * **Fylex**: Provides `rename`, `replace`, `larger`, `smaller`, `newer`, `older`, `skip`, and `prompt`. **Crucially, when Fylex replaces an existing file at the destination or skips a source file (due to a conflict), it first moves the affected file into a dedicated, timestamped `.fylex_deprecated/` folder.** This virtually eliminates the risk of accidental data loss, allowing users to review and retrieve superseded or skipped files later. This safety net is a major leap beyond simple overwrite/skip options in other tools.\n\n2.  **Built-in Data Integrity Verification (Hashing):**\n\n      * OS commands perform a basic copy. 
You'd need to manually run `md5sum` or `sha256sum` after the copy and compare the results.\n      * `Fylex` uses `xxhash` for fast post-copy verification, ensuring that the copied file is an exact, uncorrupted duplicate of the source. This is crucial for critical data.\n\n3.  **Unified and Advanced Filtering:**\n\n      * `find` combined with `grep`, `xargs`, and `egrep` is powerful but often requires complex, multi-stage commands. Glob patterns are simpler but less flexible than regex.\n      * `Fylex` integrates regex, glob, and exact name matching/exclusion directly into its functions, allowing for highly specific and readable filtering with a single API call.\n\n4.  **Specialized Directory Reorganization (`spill`, `flatten`):**\n\n      * Achieving \"spill\", \"flatten\", or \"categorize\" with OS commands means chaining `find`, `mv`, `rmdir`, and potentially `xargs` with very specific and often platform-dependent syntax. This is notoriously difficult to get right and can lead to accidental data loss if a mistake is made.\n      * `Fylex` provides these as high-level, single-function operations with built-in safety (like dry run and empty directory cleanup), making them much safer and easier to use.\n\n5.  **Intelligent Duplicate Management (`refine`):**\n\n      * Dedicated deduplication tools like `fdupes` exist but often perform direct deletion or require additional scripting for safe archiving.\n      * `Fylex`'s `refine` function provides content-based duplicate detection using `xxhash` and, most importantly, safely moves duplicates to a `.fylex_deprecated/` folder, offering a non-destructive alternative to immediate deletion.\n\n6.  **Concurrency Out-of-the-Box:**\n\n      * Basic OS commands are single-threaded. 
Parallelization requires advanced shell scripting with `xargs -P` or similar, which adds complexity.\n      * `Fylex` automatically utilizes a `ThreadPoolExecutor` to process files concurrently, significantly boosting performance for large datasets without any extra effort from the user.\n\n7.  **Comprehensive Logging & Dry Run Safety Net:**\n\n      * OS commands typically dump output to stdout/stderr. Comprehensive logging requires redirection and parsing. Dry run is often simulated or requires specific flags that may not exist for all commands.\n      * `Fylex` generates detailed `fylex.log` for every operation, providing an auditable trail. The `dry_run` mode is a built-in safeguard, allowing you to preview complex operations safely.\n\n8.  **Python Integration & Extensibility:**\n\n      * While powerful, shell scripts can be less maintainable and harder to integrate into larger software systems.\n      * `Fylex`, being a Python library, is easily callable from any Python application, making it highly extensible and automatable within existing Python workflows.\n\n9.  **User Interactivity:**\n\n      * Shell: Limited options for user prompts during bulk operations.\n      * `Fylex`: `interactive` mode provides a safety net by prompting for confirmation before each file transfer, giving you granular control.\n\nIn essence, Fylex transforms common, complex, and risky file management scenarios into straightforward, reliable, and efficient operations, saving time, preventing data loss, and simplifying automation.\n\n## 6\\. 
Error Handling\n\nFylex implements robust error handling to ensure operations are performed safely and to provide clear feedback when issues arise.\n\n  * `InvalidPathError`: Raised if a specified source path does not exist, or if `no_create` is `True` and the destination path does not exist.\n  * `PermissionDeniedError`: Raised if Fylex lacks the necessary read or write permissions for a given path.\n  * `ValueError`: Raised for logical inconsistencies, such as trying to copy a directory into itself when `recursive_check` is enabled, or for unsupported categorization modes.\n  * **Retry Mechanism**: Transient errors during file copy/move operations are automatically retried up to `MAX_RETRIES` times (default: 5). If retries are exhausted, an error is logged.\n\n## 7\\. Logging\n\nFylex provides detailed logging to `fylex.log` in the current working directory by default.\n\n  * **INFO**: Records successful operations, dry run simulations, and significant events, including deprecation actions.\n  * **WARNING**: Indicates potential issues, such as hash mismatches requiring retries.\n  * **ERROR**: Logs failures, permission issues, or unhandled exceptions.\n\nYou can control log output:\n\n  * `verbose=True`: Prints log messages to the console in real time, in addition to the file.\n  * `summary=\"path/to/my_log.log\"`: Copies the `fylex.log` file to the specified summary path upon completion.\n\n## 8\\. Development & Contributing\n\nFylex is open to contributions\\! If you have ideas for new features, bug fixes, or improvements, feel free to:\n\n1.  Fork the repository.\n2.  Create a new branch (`git checkout -b feature/your-feature-name`).\n3.  Make your changes.\n4.  Write clear commit messages.\n5.  Submit a Pull Request.\n\n## 9\\. License\n\nFylex is released under the [MIT License](https://www.google.com/search?q=LICENSE).\n\nxxHash is used under the BSD License.\n\n## 10\\. 
Author\n\n**Sivaprasad Murali**: [sivaprasad.off@gmail.com](mailto:sivaprasad.off@gmail.com)\n\n<center>Your files. Your rules. Just smarter.</center>\n",
    "bugtrack_url": null,
    "license": "MIT",
    "summary": "A fast, intelligent file copier/mover with hashing, filters, conflict resolution, and backup support \u2014 built for power users.",
    "version": "0.7.2",
    "project_urls": {
        "Documentation": "https://github.com/Crystallinecore/fylex#readme",
        "Homepage": "https://github.com/Crystallinecore/fylex",
        "Source": "https://github.com/Crystallinecore/fylex",
        "Tracker": "https://github.com/Crystallinecore/fylex/issues"
    },
    "split_keywords": [
        "file",
        " copy",
        " move",
        " cli",
        " utility",
        " xxhash",
        " filter",
        " conflict",
        " backup",
        " multithread",
        " devtool",
        " python-tool",
        " fylex",
        " deduplication",
        " smart-copy",
        " smart-move"
    ],
    "urls": [
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "b7d16092aef67506368731fe1fe3cf22ea5819f8f19d992a0a66e71a7e734550",
                "md5": "661de5214578f493d218fcfce3dd49b6",
                "sha256": "65205cd702c421d9e51da3d10434176065c641756b7b513b5c6a67cfda581288"
            },
            "downloads": -1,
            "filename": "fylex-0.7.2-py3-none-any.whl",
            "has_sig": false,
            "md5_digest": "661de5214578f493d218fcfce3dd49b6",
            "packagetype": "bdist_wheel",
            "python_version": "py3",
            "requires_python": ">=3.8",
            "size": 30631,
            "upload_time": "2025-08-29T08:46:47",
            "upload_time_iso_8601": "2025-08-29T08:46:47.408312Z",
            "url": "https://files.pythonhosted.org/packages/b7/d1/6092aef67506368731fe1fe3cf22ea5819f8f19d992a0a66e71a7e734550/fylex-0.7.2-py3-none-any.whl",
            "yanked": false,
            "yanked_reason": null
        },
        {
            "comment_text": null,
            "digests": {
                "blake2b_256": "9fc55ddfe8fb0ef9e35a7a26e56522ebba45f873ff1fae759152bd6f6c178d1c",
                "md5": "6acabb982374262e1eb4a78d9cdad646",
                "sha256": "4616f1320db0b08d8aeb880b7a1d7648b59e8b3d13f393134ae5dd7068674ab0"
            },
            "downloads": -1,
            "filename": "fylex-0.7.2.tar.gz",
            "has_sig": false,
            "md5_digest": "6acabb982374262e1eb4a78d9cdad646",
            "packagetype": "sdist",
            "python_version": "source",
            "requires_python": ">=3.8",
            "size": 53603,
            "upload_time": "2025-08-29T08:46:49",
            "upload_time_iso_8601": "2025-08-29T08:46:49.618434Z",
            "url": "https://files.pythonhosted.org/packages/9f/c5/5ddfe8fb0ef9e35a7a26e56522ebba45f873ff1fae759152bd6f6c178d1c/fylex-0.7.2.tar.gz",
            "yanked": false,
            "yanked_reason": null
        }
    ],
    "upload_time": "2025-08-29 08:46:49",
    "github": true,
    "gitlab": false,
    "bitbucket": false,
    "codeberg": false,
    "github_user": "Crystallinecore",
    "github_project": "fylex#readme",
    "travis_ci": false,
    "coveralls": false,
    "github_actions": false,
    "lcname": "fylex"
}
        