# Image Filename AI
## Overview
This application uses AI (Gemini) to automatically rename image files based on their content and generate descriptive alt text. It supports both flat and nested folder structures, making it perfect for organizing project-based image collections.
## Features
- **AI-powered image analysis**: Uses Google's Gemini model to understand image content
- **Intelligent filename generation**: Creates descriptive, SEO-friendly filenames
- **Alt text generation**: Generates accessible alt text for images
- **Nested folder support**: Preserves directory structure for project-based organization
- **Image processing**: Resize and reformat images during processing
- **Multiple logging modes**: Flexible logging options for different use cases
- **Language support**: Generate filenames and alt text in multiple languages
## Requirements
- **Python**: 3.11+ (tested on 3.11, 3.12, 3.13)
- **Google Cloud Platform**: Project with Vertex AI enabled
- **Service Account**: With required permissions (see Authentication section)
## Installation
### Option 1: Install from PyPI (Recommended)
```bash
# Install the core CLI tool
pip install image-filename-ai
# Or install with API dependencies
pip install "image-filename-ai[api]"
# Or install with development dependencies
pip install "image-filename-ai[dev]"
```
### Option 2: Local Development
1. Clone the repository:
```bash
git clone https://github.com/matija2209/image-filename-ai.git
cd image-filename-ai
```
2. Create virtual environment:
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. Install in development mode:
```bash
pip install -e ".[dev,api]"
```
4. Set up credentials (see Authentication section below)
### Option 3: Docker (Recommended for API)
1. Clone the repository
2. Copy `.env.example` to `.env` and configure
3. Run with Docker Compose:
```bash
docker-compose up --build
```
## Authentication & Credentials
Choose **one** of the following methods:
### Method 1: Environment Variable (Recommended)
```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-key.json"
```
### Method 2: Place credentials in repo root
Place your `serviceAccountKey.json` file in the project root directory (automatically gitignored).
### For Docker Usage
Uncomment the volume mount in `compose.yml`:
```yaml
volumes:
  - ./serviceAccountKey.json:/app/credentials/credentials.json:ro
```
### Required GCP Permissions
Your service account needs:
- `aiplatform.endpoints.predict` (Vertex AI predictions)
- `storage.objects.get` (read images from GCS)
- `storage.objects.create` (create processed images)
- `firestore.documents.read/write` (if using job tracking)
## Usage
### CLI Usage (Local Processing)
For a full, step-by-step CLI tutorial, see: [CLI_GUIDE.md](CLI_GUIDE.md)
For minimal GCP setup steps, see: [GCP_SETUP.md](GCP_SETUP.md)
**Basic command:**
```bash
python cli.py --input-dir input --output-dir output --lang en
```
**With custom settings:**
```bash
python cli.py \
  --input-dir ./images \
  --output-dir ./processed \
  --lang de \
  --log-mode project_level \
  --max-width 1920 \
  --format webp
```
### API Usage (Docker/Server)
**Start the API server:**
```bash
# Using Docker Compose (recommended)
docker-compose up
# Or locally
uvicorn app.main:app --host 0.0.0.0 --port 8000
```
**Access the API:**
- Interactive docs: http://localhost:8000/docs
- API endpoint: http://localhost:8000/api/v1/process
- Health check: http://localhost:8000/
**⚠️ Note**: The API is currently **unauthenticated** - suitable for development only.
### Environment Configuration
Copy `.env.example` to `.env` and adjust:
```bash
cp .env.example .env
# Edit .env with your settings
```
**Key environment variables:**
```bash
# Core GCP settings (used by both CLI and API)
PROJECT_ID=your-gcp-project-id
LOCATION=us-central1
MODEL_NAME=gemini-2.0-flash-exp
GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
# CLI-specific settings (optional)
MAX_RETRIES=5 # Number of retry attempts
BASE_RETRY_DELAY=10 # Base delay between retries (seconds)
MAX_RETRY_DELAY=300 # Maximum delay cap (seconds)
RATE_LIMIT_DELAY=60 # Extra delay for rate limit errors
# Docker settings
COMPOSE_PORT_API=8000 # Port mapping for Docker Compose
```
**📝 Note**: The CLI automatically loads `.env` file from the project root if present.
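For illustration, the optional integer settings above can be read with a small stdlib helper (a sketch, not the actual loader used by `cli.py`):

```python
import os

def env_int(name: str, default: int) -> int:
    """Read an integer setting from the environment, falling back to a default."""
    raw = os.environ.get(name)
    return int(raw) if raw else default

# Defaults mirror the values documented above
max_retries = env_int("MAX_RETRIES", 5)
base_retry_delay = env_int("BASE_RETRY_DELAY", 10)
```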
### Advanced Options
```bash
python cli.py \
  --input-dir input/laneks \
  --output-dir output/laneks \
  --lang sl \
  --format webp \
  --max-width 1920 \
  --log-mode project_level
```
### Arguments
- `--input-dir`: Directory containing input images (default: "input")
- `--output-dir`: Base directory for processed images and logs (default: "output")
- `--lang`: Target language code (e.g., 'en', 'sl', 'de') (default: "en")
- `--format`: Output image format - jpg, png, webp, avif (default: original format)
- `--max-width`: Maximum width in pixels for output images (default: original size)
- `--log-mode`: Logging mode for results (default: "per_folder")
### Logging Modes
The application supports four logging modes to suit different organizational needs:
#### `per_folder` (Default)
Creates `results.json` and `results.csv` files in each folder where images are processed.
```
output/
├── project1/
│   ├── results.json
│   ├── results.csv
│   └── renamed-images...
└── project2/
    ├── results.json
    ├── results.csv
    └── renamed-images...
```
#### `project_level`
Creates one log file per top-level project folder.
```
output/
├── project1/
│   ├── results.json
│   ├── results.csv
│   ├── subfolder1/renamed-images...
│   └── subfolder2/renamed-images...
└── project2/
    ├── results.json
    ├── results.csv
    └── renamed-images...
```
#### `central`
Creates a single log file in the main output directory.
```
output/
├── results.json
├── results.csv
├── project1/renamed-images...
└── project2/renamed-images...
```
#### `flat`
Flattens the output structure - all processed images go directly to the main output directory with a single central log file. Perfect for processing deeply nested input folders when you want a simple flat output structure.
```
output/
├── results.json
├── results.csv
├── descriptive-name-1.webp
├── descriptive-name-2.webp
├── descriptive-name-3.webp
└── descriptive-name-4.webp
```
*Note: In flat mode, filename conflicts are automatically resolved by adding a counter suffix (e.g., `name-1.webp`, `name-2.webp`).*
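A minimal sketch of this counter-suffix strategy (illustrative only; the repo's actual helper may differ):

```python
from pathlib import Path

def resolve_conflict(directory: Path, stem: str, suffix: str) -> Path:
    """Return a path that does not clash, appending -1, -2, ... as needed."""
    candidate = directory / f"{stem}{suffix}"
    counter = 1
    while candidate.exists():
        candidate = directory / f"{stem}-{counter}{suffix}"
        counter += 1
    return candidate
```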
## Nested Folder Support
The application automatically preserves your input directory structure in the output:
**Input Structure:**
```
input/
├── laneks/
│   ├── projekt1/
│   │   ├── image1.jpg
│   │   └── image2.jpg
│   └── projekt2/
│       └── image3.jpg
└── other-client/
    └── flat-images/
        └── image4.jpg
```
**Output Structure:**
```
output/
├── laneks/
│   ├── projekt1/
│   │   ├── descriptive-name-1.webp
│   │   └── descriptive-name-2.webp
│   └── projekt2/
│       └── descriptive-name-3.webp
└── other-client/
    └── flat-images/
        └── descriptive-name-4.webp
```
This makes it perfect for:
- **Project-based workflows**: Each client/project maintains its own folder structure
- **Mixed structures**: Support both flat folders and deeply nested hierarchies
- **Team collaboration**: Preserve organizational structure that teams are familiar with
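The mirroring itself reduces to relocating each image's path relative to the input root. A sketch using `pathlib` (function name is illustrative, not the repo's actual code):

```python
from pathlib import Path

def mirrored_output_path(input_dir: Path, output_dir: Path,
                         image_path: Path, new_name: str) -> Path:
    """Place the renamed image at the same relative location under output_dir."""
    relative = image_path.parent.relative_to(input_dir)
    return output_dir / relative / new_name

# input/laneks/projekt1/image1.jpg -> output/laneks/projekt1/descriptive-name-1.webp
```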
## Authentication
See the Authentication & Credentials section above: either export `GOOGLE_APPLICATION_CREDENTIALS`, place your service account key as `serviceAccountKey.json` in the project root, or use another standard Google Cloud authentication method.
## API Documentation
For web API usage, see [API_DOCUMENTATION.md](API_DOCUMENTATION.md).
## Examples
See [EXAMPLE_DATA.json](EXAMPLE_DATA.json) for sample API responses and data structures.
## FastAPI Application
A FastAPI application for processing images stored in Google Cloud Storage.
### Features (FastAPI)
- Process images from Google Cloud Storage
- Generate descriptive, SEO-friendly filenames
- Create alt text for accessibility and SEO
- Support for multiple languages
- REST API for easy integration
### Docker Setup
1. Build and start the container:
```bash
docker compose build
docker compose up -d
```
2. Alternatively, export the credentials path before starting; Compose passes the variable through when it is referenced in `compose.yml`:
```bash
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/service-account-key.json"
docker compose up
```
3. To run with specific environment variables:
```bash
docker compose run -e GOOGLE_APPLICATION_CREDENTIALS="/path/to/your/credentials.json" api
```
### Requirements
- Python 3.11+
- Google Cloud Project with Vertex AI API enabled
- Google Cloud credentials configured
### Setup
1. Clone the repository
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Configure the application (optional):
Create a `.env` file in the project root with:
```bash
PROJECT_ID=your-gcp-project-id
LOCATION=us-central1
MODEL_NAME=gemini-2.0-flash-exp
```
### Usage
1. Start the server:
```bash
python run.py
```
2. Access the API documentation at `http://localhost:8000/docs`
3. Make API requests:
```bash
curl -X POST http://localhost:8000/api/v1/process \
  -H "Content-Type: application/json" \
  -d '{
    "gcs_input_path": "gs://your-bucket/images",
    "language_code": "en"
  }'
```
### API Endpoints
- **GET /** - Health check endpoint
- **POST /api/v1/process** - Process images from GCS bucket
### Configuration (FastAPI)
The application can be configured using environment variables or a `.env` file:
- `PROJECT_ID` - Google Cloud project ID
- `LOCATION` - Google Cloud region
- `MODEL_NAME` - Gemini model to use
- `HOST` - Server host (default: 0.0.0.0)
- `PORT` - Server port (default: 8000)
## Command-Line Interface (CLI)
A CLI script (`cli.py`) for processing local image files.
### Features (CLI)
- Process images recursively from a local input directory.
- Generate descriptive, SEO-friendly filenames using Vertex AI Gemini.
- Create alt text for accessibility and SEO using Vertex AI Gemini.
- Support for multiple languages for filenames and alt text.
- Optionally convert images to different formats (JPG, PNG, WEBP, AVIF).
- Optionally resize images to a maximum width, preserving aspect ratio.
- Mirrors the input directory structure in the output directory.
- Logs processing results to JSON and CSV files within each output subdirectory.
### Requirements (CLI)
- Python 3.11+
- Google Cloud Project with Vertex AI API enabled
- Google Cloud credentials configured (e.g., via `gcloud auth application-default login`)
- Dependencies installed: `pip install -r requirements.txt` (Ensure `Pillow` is included for image processing)
### Usage (CLI)
Run the script from the project root directory.
```bash
python cli.py --input-dir <path/to/input> --output-dir <path/to/output> [options]
```
**Arguments:**
* `--input-dir`: Path to the directory containing input images (default: `input`).
* `--output-dir`: Path to the base directory for processed images and logs (default: `output`). The script will maintain the subdirectory structure from the input directory.
* `--lang`: Target language code for filename/alt text (e.g., 'en', 'sl', 'de') (default: `en`).
* `--format`: Optional output image format ('jpg', 'png', 'webp', 'avif'). If omitted, the original format is kept.
* `--max-width`: Optional maximum width in pixels for output images. Aspect ratio is preserved. If omitted, the original size is kept.
**Examples:**
1. **Basic usage (English, keep original format/size):**
```bash
python cli.py --input-dir path/to/your/images --output-dir processed/images
```
2. **Process images, translate to German, resize to 800px max width:**
```bash
python cli.py --input-dir images_raw --output-dir images_processed --lang de --max-width 800
```
3. **Process images, convert to WEBP format:**
```bash
python cli.py --input-dir photos --output-dir web_ready --format webp
```
4. **Process specific subfolder, convert to AVIF (see Known Issues), max width 900px:**
```bash
python cli.py --input-dir input/specific_folder --output-dir output --format avif --max-width 900
```
## Known Issues
* **AVIF Conversion:** There is a known issue when using the `--format avif` option with the CLI tool (`cli.py`). The underlying Pillow library might raise an error (`Error processing image: 'AVIF'`) during the save operation, causing images to be skipped. This might be related to specific image modes (e.g., RGBA) or Pillow's AVIF encoder capabilities.
* **Troubleshooting (macOS):** AVIF support in Pillow often depends on the `libavif` system library. If you encounter errors with AVIF:
1. Install the library using Homebrew: `brew install libavif`
2. Reinstall Pillow *from source* within your virtual environment to ensure it detects `libavif`: `pip install --force-reinstall --no-cache-dir --no-binary Pillow Pillow`
* Using other formats like JPG, PNG, or WEBP is recommended if AVIF conversion fails or the troubleshooting steps are not feasible.
## License
MIT
## Practical Examples
### Example 1: Process a single project folder
```bash
# Process images from a specific project, resize to max 1920px width, convert to WebP
python cli.py \
  --input-dir input/laneks/projekt2 \
  --output-dir output/laneks/projekt2 \
  --lang en \
  --format webp \
  --max-width 1920 \
  --log-mode per_folder
```
### Example 2: Process all projects for a client with project-level logs
```bash
# Process all projects for the 'laneks' client, create one log per project
python cli.py \
  --input-dir input/laneks \
  --output-dir output/laneks \
  --lang sl \
  --format webp \
  --max-width 1920 \
  --log-mode project_level
```
### Example 3: Batch process multiple clients with central logging
```bash
# Process everything with a single centralized log file
python cli.py \
  --input-dir input \
  --output-dir output \
  --lang en \
  --format avif \
  --max-width 1600 \
  --log-mode central
```
### Example 4: Keep original format but resize
```bash
# Just resize images without changing format
python cli.py \
  --input-dir input/large-images \
  --output-dir output/resized \
  --max-width 800 \
  --log-mode per_folder
```
### Example 5: Flatten deeply nested structure
```bash
# Process deeply nested folders but output everything to a flat structure
python cli.py \
  --input-dir input/complex-nested-structure \
  --output-dir output/flattened \
  --lang en \
  --format webp \
  --max-width 1600 \
  --log-mode flat
```
## Common Use Cases
### Photography Studios
- **Input**: Client folders with project subfolders
- **Settings**: `--log-mode project_level --format webp --max-width 2048`
- **Result**: Each project gets its own log, images optimized for web
### E-commerce
- **Input**: Product category folders
- **Settings**: `--log-mode central --format webp --max-width 1200`
- **Result**: All products processed with central tracking
### Web Development
- **Input**: Mixed folder structures
- **Settings**: `--format avif --max-width 1920 --log-mode per_folder`
- **Result**: Modern format with excellent compression, detailed logs
### Digital Asset Management
- **Input**: Complex nested folder structures from various sources
- **Settings**: `--log-mode flat --format webp --max-width 1600`
- **Result**: All assets in one flat directory with descriptive names, single tracking log
**Quick examples:**

```bash
# Simple renaming in English
python cli.py --input-dir input/photos --output-dir output/renamed --lang en

# German language with WebP conversion and resizing
python cli.py --input-dir input/photos --output-dir output/optimized \
  --lang de --format webp --max-width 1024

# Project-level logging for organized results
python cli.py --input-dir input/company-photos --output-dir output/processed \
  --lang en --log-mode project_level
```
## 📖 Command Line Options
| Option | Description | Default |
|--------|-------------|---------|
| `--input-dir` | Directory containing input images | `input` |
| `--output-dir` | Base directory for processed images | `output` |
| `--lang` | Target language code (en, de, sl, fr, etc.) | `en` |
| `--format` | Output format (jpg, png, webp, avif) | Original |
| `--max-width` | Maximum width in pixels | Original |
| `--log-mode` | Logging mode (central, project_level, per_folder, flat) | `per_folder` |
| `--max-retries` | Maximum retry attempts for API calls | `5` |
## 📊 Logging Modes
### `per_folder` (Default)
Creates `results.json` and `results.csv` in each output subdirectory.
### `project_level`
Creates one log file per top-level project folder.
### `central`
Single log file in the main output directory.
### `flat`
Flattens directory structure with central logging.
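The four modes boil down to where `results.json` lands relative to each processed image. A sketch of that mapping (illustrative names, not the repo's actual code):

```python
from pathlib import Path

def log_path(mode: str, output_dir: Path, image_out_dir: Path) -> Path:
    """Where results.json lands for a given --log-mode."""
    if mode in ("central", "flat"):
        return output_dir / "results.json"           # one log at the root
    if mode == "project_level":
        relative = image_out_dir.relative_to(output_dir)
        top = relative.parts[0] if relative.parts else ""
        return output_dir / top / "results.json"     # one log per top-level folder
    return image_out_dir / "results.json"            # per_folder (default)
```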
## 🔄 Resume Functionality
The tool automatically resumes interrupted processing:
1. **Scans existing logs**: Checks all `results.json` files in output directory
2. **Identifies processed files**: Uses `original_filename` field for tracking
3. **Skips completed work**: Only processes new or failed images
4. **Handles rate limits**: Exponential backoff with up to 5 retry attempts
Example resume scenario:
```bash
# First run - processes 20 files, hits rate limit
python cli.py --input-dir photos --output-dir output --lang de
# Resume run - skips 20 completed files, continues with remaining
python cli.py --input-dir photos --output-dir output --lang de
```
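A sketch of the scan-and-skip step (assumed shape: each `results.json` is a list of entries with an `original_filename` field, as in the Example Output section; the tool's real implementation may differ):

```python
import json
from pathlib import Path

def already_processed(output_dir: Path) -> set[str]:
    """Collect original_filename values from every results.json under output_dir."""
    done: set[str] = set()
    for log_file in Path(output_dir).rglob("results.json"):
        try:
            entries = json.loads(log_file.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or partial log: its images get reprocessed
        for entry in entries:
            name = entry.get("original_filename")
            if name:
                done.add(name)
    return done
```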
## 🛠️ Advanced Configuration
### Retry Logic
- **Base delay**: 10 seconds, doubles with each retry
- **Rate limit delay**: Additional 60 seconds for quota errors
- **Maximum delay**: Capped at 5 minutes
- **Smart detection**: Recognizes various rate limiting error messages
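Under the documented defaults, the delay schedule can be sketched as follows (illustrative; whether the cap applies before or after the rate-limit extra is an assumption):

```python
def retry_delay(attempt: int, rate_limited: bool,
                base: float = 10.0, cap: float = 300.0,
                rate_limit_extra: float = 60.0) -> float:
    """Exponential backoff using the documented defaults."""
    delay = base * (2 ** attempt)   # 10s, 20s, 40s, ...
    if rate_limited:
        delay += rate_limit_extra   # extra 60s on quota errors
    return min(delay, cap)          # capped at 5 minutes
```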
### Image Processing
- **Supported formats**: JPG, JPEG, PNG, WebP
- **Output formats**: JPG, PNG, WebP, AVIF
- **Resizing**: Maintains aspect ratio when using `--max-width`
- **Quality**: WebP output at 90% quality
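Resizing to `--max-width` while preserving aspect ratio is just a scale computation (a sketch; the tool presumably delegates the actual resample to Pillow):

```python
def resized_dimensions(width: int, height: int, max_width: int) -> tuple[int, int]:
    """New (width, height) after applying --max-width, preserving aspect ratio."""
    if width <= max_width:
        return width, height  # never upscale
    scale = max_width / width
    return max_width, round(height * scale)
```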
## 📁 Project Structure
```
image-filename-ai/
├── cli.py                      # Main application
├── app/
│   └── utils/
│       ├── ai_handler.py       # Gemini AI integration
│       ├── file_utils.py       # File operations and logging
│       └── image_processor.py  # Image processing and conversion
├── input/                      # Your source images
└── output/                     # Generated results
    ├── project1/
    │   ├── results.json        # Processing log
    │   ├── results.csv         # CSV export
    │   └── *.webp              # Renamed images
    └── project2/
        └── ...
```
## 📈 Example Output
### Generated Filenames
- `IMG_1234.jpg` → `sunset-mountain-landscape-golden-hour.webp`
- `photo.png` → `office-desk-computer-workspace-clean.webp`
- `image.jpg` → `family-portrait-garden-summer-happy.webp`
### Log Entry
```json
{
"timestamp": "2025-05-25 09:21:31",
"original_path": "input/photos/IMG_1234.jpg",
"new_path": "output/photos/sunset-mountain-landscape.webp",
"original_filename": "IMG_1234.jpg",
"new_filename": "sunset-mountain-landscape.webp",
"alt_text": "A beautiful sunset over mountain peaks with golden light illuminating the landscape."
}
```
## 🌍 Language Support
The tool supports any language supported by Gemini AI. Common examples:
- `--lang en` - English
- `--lang de` - German (Deutsch)
- `--lang sl` - Slovenian
- `--lang fr` - French
- `--lang es` - Spanish
- `--lang it` - Italian
- `--lang pt` - Portuguese
## 🔧 Development & Testing
### Running Tests
```bash
# Run all tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=app --cov=cli
# Test specific module
pytest tests/test_cli.py -v
```
### Code Quality
```bash
# Format code
black .
# Lint code
ruff check .
# Fix linting issues
ruff check . --fix
```
### Development Setup
```bash
# Install development dependencies (included in requirements.txt)
pip install -r requirements.txt
# Run API in development mode with auto-reload
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
### Architecture
**CLI Mode**: Direct local processing using Gemini API
- Input: Local image directories
- Output: Processed images with generated names
- Use case: Batch processing, one-time organization
**API Mode**: Web service for on-demand processing
- Input: GCS bucket URLs or direct uploads
- Output: Background job processing with status tracking
- Use case: Integration with other systems, web applications
## 📋 Production TODO
- [ ] **Add API authentication** (API keys, JWT, OAuth)
- [ ] **Add rate limiting** per client/endpoint
- [ ] **Add input validation** and sanitization
- [ ] **Add comprehensive logging** and monitoring
- [ ] **Add image virus scanning** before processing
- [ ] **Add batch processing** for large image sets
- [ ] **Add webhook notifications** for job completion
- [ ] **Add cost monitoring** for Vertex AI usage
- [ ] **Package CLI as standalone executable** (PyInstaller)
- [ ] **Add retry logic** for failed AI requests
- [ ] **Add progress bars** for CLI processing
Raw data
{
"_id": null,
"home_page": null,
"name": "image-filename-ai",
"maintainer": null,
"docs_url": null,
"requires_python": ">=3.11",
"maintainer_email": "Matija Ziberna <matijazib@gmail.com>",
"keywords": "ai, image, filename, gemini, google-cloud, cli, batch-processing, seo, automation, computer-vision, machine-learning",
"author": null,
"author_email": "Matija Ziberna <matijazib@gmail.com>",
"download_url": "https://files.pythonhosted.org/packages/6b/d1/3e86c4fc258616090391aa26a68f2c2e394ba76bf752bf55b840b3327897/image_filename_ai-0.1.0.tar.gz",
"platform": null,
"description": "# Image Filename AI\n\n## Overview\n\nThis application uses AI (Gemini) to automatically rename image files based on their content and generate descriptive alt text. It supports both flat and nested folder structures, making it perfect for organizing project-based image collections.\n\n## Features\n\n- **AI-powered image analysis**: Uses Google's Gemini model to understand image content\n- **Intelligent filename generation**: Creates descriptive, SEO-friendly filenames\n- **Alt text generation**: Generates accessible alt text for images\n- **Nested folder support**: Preserves directory structure for project-based organization\n- **Image processing**: Resize and reformat images during processing\n- **Multiple logging modes**: Flexible logging options for different use cases\n- **Language support**: Generate filenames and alt text in multiple languages\n\n## Requirements\n\n- **Python**: 3.11+ (tested on 3.11, 3.12, 3.13)\n- **Google Cloud Platform**: Project with Vertex AI enabled\n- **Service Account**: With required permissions (see Authentication section)\n\n## Installation\n\n### Option 1: Install from PyPI (Recommended)\n\n```bash\n# Install the core CLI tool\npip install image-filename-ai\n\n# Or install with API dependencies\npip install \"image-filename-ai[api]\"\n\n# Or install with development dependencies \npip install \"image-filename-ai[dev]\"\n```\n\n### Option 2: Local Development\n\n1. Clone the repository:\n```bash\ngit clone https://github.com/matija2209/image-filename-ai.git\ncd image-filename-ai\n```\n\n2. Create virtual environment:\n```bash\npython -m venv venv\nsource venv/bin/activate # On Windows: venv\\Scripts\\activate\n```\n\n3. Install in development mode:\n```bash\npip install -e \".[dev,api]\"\n```\n\n4. Set up credentials (see Authentication section below)\n\n### Option 3: Docker (Recommended for API)\n\n1. Clone the repository\n2. Copy `.env.example` to `.env` and configure\n3. 
Run with Docker Compose:\n```bash\ndocker-compose up --build\n```\n\n## Authentication & Credentials\n\nChoose **one** of the following methods:\n\n### Method 1: Environment Variable (Recommended)\n```bash\nexport GOOGLE_APPLICATION_CREDENTIALS=\"/path/to/your/service-account-key.json\"\n```\n\n### Method 2: Place credentials in repo root\nPlace your `serviceAccountKey.json` file in the project root directory (automatically gitignored).\n\n### For Docker Usage\nUncomment the volume mount in `compose.yml`:\n```yaml\nvolumes:\n - ./serviceAccountKey.json:/app/credentials/credentials.json:ro\n```\n\n### Required GCP Permissions\nYour service account needs:\n- `aiplatform.endpoints.predict` (Vertex AI predictions)\n- `storage.objects.get` (read images from GCS) \n- `storage.objects.create` (create processed images)\n- `firestore.documents.read/write` (if using job tracking)\n\n## Usage\n\n### CLI Usage (Local Processing)\n\nFor a full, step-by-step CLI tutorial, see: [CLI_GUIDE.md](CLI_GUIDE.md)\n\nFor minimal GCP setup steps, see: [GCP_SETUP.md](GCP_SETUP.md)\n\n**Basic command:**\n```bash\npython cli.py --input-dir input --output-dir output --lang en\n```\n\n**With custom settings:**\n```bash\npython cli.py \\\n --input-dir ./images \\\n --output-dir ./processed \\\n --lang de \\\n --log-mode nested \\\n --max-size 1920 \\\n --quality 85 \\\n --format webp\n```\n\n### API Usage (Docker/Server)\n\n**Start the API server:**\n```bash\n# Using Docker Compose (recommended)\ndocker-compose up\n\n# Or locally\nuvicorn app.main:app --host 0.0.0.0 --port 8000\n```\n\n**Access the API:**\n- Interactive docs: http://localhost:8000/docs\n- API endpoint: http://localhost:8000/api/v1/process\n- Health check: http://localhost:8000/\n\n**\u26a0\ufe0f Note**: The API is currently **unauthenticated** - suitable for development only.\n\n### Environment Configuration\n\nCopy `.env.example` to `.env` and adjust:\n```bash\ncp .env.example .env\n# Edit .env with your settings\n```\n\n**Key 
environment variables:**\n```bash\n# Core GCP settings (used by both CLI and API)\nPROJECT_ID=your-gcp-project-id\nLOCATION=us-central1\nMODEL_NAME=gemini-2.0-flash-exp\nGOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json\n\n# CLI-specific settings (optional)\nMAX_RETRIES=5 # Number of retry attempts\nBASE_RETRY_DELAY=10 # Base delay between retries (seconds)\nMAX_RETRY_DELAY=300 # Maximum delay cap (seconds)\nRATE_LIMIT_DELAY=60 # Extra delay for rate limit errors\n\n# Docker settings\nCOMPOSE_PORT_API=8000 # Port mapping for Docker Compose\n```\n\n**\ud83d\udcdd Note**: The CLI automatically loads `.env` file from the project root if present.\n\n### Advanced Options\n\n```bash\npython cli.py \\\n --input-dir input/laneks \\\n --output-dir output/laneks \\\n --lang sl \\\n --format webp \\\n --max-width 1920 \\\n --log-mode project_level\n```\n\n### Arguments\n\n- `--input-dir`: Directory containing input images (default: \"input\")\n- `--output-dir`: Base directory for processed images and logs (default: \"output\")\n- `--lang`: Target language code (e.g., 'en', 'sl', 'de') (default: \"en\")\n- `--format`: Output image format - jpg, png, webp, avif (default: original format)\n- `--max-width`: Maximum width in pixels for output images (default: original size)\n- `--log-mode`: Logging mode for results (default: \"per_folder\")\n\n### Logging Modes\n\nThe application supports three different logging modes to suit different organizational needs:\n\n#### `per_folder` (Default)\nCreates `results.json` and `results.csv` files in each folder where images are processed.\n```\noutput/\n\u251c\u2500\u2500 project1/\n\u2502 \u251c\u2500\u2500 results.json\n\u2502 \u251c\u2500\u2500 results.csv\n\u2502 \u2514\u2500\u2500 renamed-images...\n\u2514\u2500\u2500 project2/\n \u251c\u2500\u2500 results.json\n \u251c\u2500\u2500 results.csv\n \u2514\u2500\u2500 renamed-images...\n```\n\n#### `project_level`\nCreates one log file per top-level project 
folder.\n```\noutput/\n\u251c\u2500\u2500 project1/\n\u2502 \u251c\u2500\u2500 results.json\n\u2502 \u251c\u2500\u2500 results.csv\n\u2502 \u251c\u2500\u2500 subfolder1/renamed-images...\n\u2502 \u2514\u2500\u2500 subfolder2/renamed-images...\n\u2514\u2500\u2500 project2/\n \u251c\u2500\u2500 results.json\n \u251c\u2500\u2500 results.csv\n \u2514\u2500\u2500 renamed-images...\n```\n\n#### `central`\nCreates a single log file in the main output directory.\n```\noutput/\n\u251c\u2500\u2500 results.json\n\u251c\u2500\u2500 results.csv\n\u251c\u2500\u2500 project1/renamed-images...\n\u2514\u2500\u2500 project2/renamed-images...\n```\n\n#### `flat`\nFlattens the output structure - all processed images go directly to the main output directory with a single central log file. Perfect for processing deeply nested input folders when you want a simple flat output structure.\n```\noutput/\n\u251c\u2500\u2500 results.json\n\u251c\u2500\u2500 results.csv\n\u251c\u2500\u2500 descriptive-name-1.webp\n\u251c\u2500\u2500 descriptive-name-2.webp\n\u251c\u2500\u2500 descriptive-name-3.webp\n\u2514\u2500\u2500 descriptive-name-4.webp\n```\n*Note: In flat mode, filename conflicts are automatically resolved by adding a counter suffix (e.g., `name-1.webp`, `name-2.webp`).*\n\n## Nested Folder Support\n\nThe application automatically preserves your input directory structure in the output:\n\n**Input Structure:**\n```\ninput/\n\u251c\u2500\u2500 laneks/\n\u2502 \u251c\u2500\u2500 projekt1/\n\u2502 \u2502 \u251c\u2500\u2500 image1.jpg\n\u2502 \u2502 \u2514\u2500\u2500 image2.jpg\n\u2502 \u2514\u2500\u2500 projekt2/\n\u2502 \u2514\u2500\u2500 image3.jpg\n\u2514\u2500\u2500 other-client/\n \u2514\u2500\u2500 flat-images/\n \u2514\u2500\u2500 image4.jpg\n```\n\n**Output Structure:**\n```\noutput/\n\u251c\u2500\u2500 laneks/\n\u2502 \u251c\u2500\u2500 projekt1/\n\u2502 \u2502 \u251c\u2500\u2500 descriptive-name-1.webp\n\u2502 \u2502 \u2514\u2500\u2500 descriptive-name-2.webp\n\u2502 
\u2514\u2500\u2500 projekt2/\n\u2502 \u2514\u2500\u2500 descriptive-name-3.webp\n\u2514\u2500\u2500 other-client/\n \u2514\u2500\u2500 flat-images/\n \u2514\u2500\u2500 descriptive-name-4.webp\n```\n\nThis makes it perfect for:\n- **Project-based workflows**: Each client/project maintains its own folder structure\n- **Mixed structures**: Support both flat folders and deeply nested hierarchies\n- **Team collaboration**: Preserve organizational structure that teams are familiar with\n\n## Authentication\n\nSet up Google Cloud authentication by placing your service account key file as `serviceAccountKey.json` in the project root, or use other Google Cloud authentication methods.\n\n## API Documentation\n\nFor web API usage, see [API_DOCUMENTATION.md](API_DOCUMENTATION.md).\n\n## Examples\n\nSee [EXAMPLE_DATA.json](EXAMPLE_DATA.json) for sample API responses and data structures.\n\ngit pull && docker compose build && export GOOGLE_APPLICATION_CREDENTIALS='filename-ai-21694d9b8f6c.json' && docker compose up\n\n## FastAPI Application\n\nA FastAPI application for processing images stored in Google Cloud Storage.\n\n### Features (FastAPI)\n\n- Process images from Google Cloud Storage\n- Generate descriptive, SEO-friendly filenames\n- Create alt text for accessibility and SEO\n- Support for multiple languages\n- REST API for easy integration\n\n### Docker Setup\n\n1. Build and start the container:\n ```\n docker compose build\n docker compose up -d\n ```\n\n2. Alternatively, pass the credentials path at runtime:\n ```\n export GOOGLE_APPLICATION_CREDENTIALS=\"/path/to/your/service-account-key.json\"\n docker compose up -e GOOGLE_APPLICATION_CREDENTIALS\n ```\n\n3. To run with specific environment variables:\n ```\n docker compose run -e GOOGLE_APPLICATION_CREDENTIALS=\"/path/to/your/credentials.json\" api\n ```\n\n### Requirements\n\n- Python 3.9+\n- Google Cloud Project with Vertex AI API enabled\n- Google Cloud credentials configured\n\n### Setup\n\n1. 
Clone the repository\n2. Install dependencies:\n ```\n pip install -r requirements.txt\n ```\n3. Configure the application (optional):\n Create a `.env` file in the project root with:\n ```\n PROJECT_ID=your-gcp-project-id\n LOCATION=us-central1\n MODEL_NAME=gemini-2.0-flash-exp\n ```\n\n### Usage\n\n1. Start the server:\n ```\n python run.py\n ```\n2. Access the API documentation at `http://localhost:8000/docs`\n3. Make API requests:\n ```\n curl -X POST http://localhost:8000/api/v1/process \\\n -H \"Content-Type: application/json\" \\\n -d '{\n \"gcs_input_path\": \"gs://your-bucket/images\",\n \"language_code\": \"en\"\n }'\n ```\n\n### API Endpoints\n\n- **GET /** - Health check endpoint\n- **POST /api/v1/process** - Process images from GCS bucket\n\n### Configuration (FastAPI)\n\nThe application can be configured using environment variables or a `.env` file:\n\n- `PROJECT_ID` - Google Cloud project ID\n- `LOCATION` - Google Cloud region\n- `MODEL_NAME` - Gemini model to use\n- `HOST` - Server host (default: 0.0.0.0)\n- `PORT` - Server port (default: 8000)\n\n## Command-Line Interface (CLI)\n\nA CLI script (`cli.py`) for processing local image files.\n\n### Features (CLI)\n\n- Process images recursively from a local input directory.\n- Generate descriptive, SEO-friendly filenames using Vertex AI Gemini.\n- Create alt text for accessibility and SEO using Vertex AI Gemini.\n- Support for multiple languages for filenames and alt text.\n- Optionally convert images to different formats (JPG, PNG, WEBP, AVIF).\n- Optionally resize images to a maximum width, preserving aspect ratio.\n- Mirrors the input directory structure in the output directory.\n- Logs processing results to JSON and CSV files within each output subdirectory.\n\n### Requirements (CLI)\n\n- Python 3.9+\n- Google Cloud Project with Vertex AI API enabled\n- Google Cloud credentials configured (e.g., via `gcloud auth application-default login`)\n- Dependencies installed: `pip install -r 
### Usage (CLI)

Run the script from the project root directory.

```bash
python cli.py --input-dir <path/to/input> --output-dir <path/to/output> [options]
```

**Arguments:**

* `--input-dir`: Path to the directory containing input images (default: `input`).
* `--output-dir`: Path to the base directory for processed images and logs (default: `output`). The script mirrors the subdirectory structure of the input directory.
* `--lang`: Target language code for filenames/alt text (e.g., `en`, `sl`, `de`) (default: `en`).
* `--format`: Optional output image format (`jpg`, `png`, `webp`, `avif`). If omitted, the original format is kept.
* `--max-width`: Optional maximum width in pixels for output images. Aspect ratio is preserved. If omitted, the original size is kept.
* `--log-mode`: Logging mode (`central`, `project_level`, `per_folder`, `flat`) (default: `per_folder`).
* `--max-retries`: Maximum retry attempts for API calls (default: `5`).

**Examples:**

1. **Basic usage (English, keep original format/size):**
   ```bash
   python cli.py --input-dir path/to/your/images --output-dir processed/images
   ```

2. **Process images, translate to German, resize to 800px max width:**
   ```bash
   python cli.py --input-dir images_raw --output-dir images_processed --lang de --max-width 800
   ```

3. **Process images, convert to WEBP format:**
   ```bash
   python cli.py --input-dir photos --output-dir web_ready --format webp
   ```

4. **Process a specific subfolder, convert to AVIF (see Known Issues), max width 900px:**
   ```bash
   python cli.py --input-dir input/specific_folder --output-dir output --format avif --max-width 900
   ```

## Known Issues

* **AVIF conversion:** There is a known issue when using the `--format avif` option with the CLI tool (`cli.py`). The underlying Pillow library may raise an error (`Error processing image: 'AVIF'`) during the save operation, causing images to be skipped. This may be related to specific image modes (e.g., RGBA) or Pillow's AVIF encoder capabilities.
  * **Troubleshooting (macOS):** AVIF support in Pillow often depends on the `libavif` system library. If you encounter errors with AVIF:
    1. Install the library using Homebrew: `brew install libavif`
    2. Reinstall Pillow *from source* within your virtual environment so it detects `libavif`: `pip install --force-reinstall --no-cache-dir --no-binary Pillow Pillow`
  * Using other formats such as JPG, PNG, or WEBP is recommended if AVIF conversion fails or the troubleshooting steps are not feasible.

## License

MIT

## Practical Examples

### Example 1: Process a single project folder
```bash
# Process images from a specific project, resize to max 1920px width, convert to WebP
python cli.py \
  --input-dir input/laneks/projekt2 \
  --output-dir output/laneks/projekt2 \
  --lang en \
  --format webp \
  --max-width 1920 \
  --log-mode per_folder
```

### Example 2: Process all projects for a client with project-level logs
```bash
# Process all projects for the 'laneks' client, create one log per project
python cli.py \
  --input-dir input/laneks \
  --output-dir output/laneks \
  --lang sl \
  --format webp \
  --max-width 1920 \
  --log-mode project_level
```

### Example 3: Batch process multiple clients with central logging
```bash
# Process everything with a single centralized log file
python cli.py \
  --input-dir input \
  --output-dir output \
  --lang en \
  --format avif \
  --max-width 1600 \
  --log-mode central
```

### Example 4: Keep original format but resize
```bash
# Just resize images without changing format
python cli.py \
  --input-dir input/large-images \
  --output-dir output/resized \
  --max-width 800 \
  --log-mode per_folder
```
### Example 5: Flatten deeply nested structure
```bash
# Process deeply nested folders but output everything to a flat structure
python cli.py \
  --input-dir input/complex-nested-structure \
  --output-dir output/flattened \
  --lang en \
  --format webp \
  --max-width 1600 \
  --log-mode flat
```

## Common Use Cases

### Photography Studios
- **Input**: Client folders with project subfolders
- **Settings**: `--log-mode project_level --format webp --max-width 2048`
- **Result**: Each project gets its own log; images optimized for web

### E-commerce
- **Input**: Product category folders
- **Settings**: `--log-mode central --format webp --max-width 1200`
- **Result**: All products processed with central tracking

### Web Development
- **Input**: Mixed folder structures
- **Settings**: `--format avif --max-width 1920 --log-mode per_folder`
- **Result**: Modern format with excellent compression, detailed logs

### Digital Asset Management
- **Input**: Complex nested folder structures from various sources
- **Settings**: `--log-mode flat --format webp --max-width 1600`
- **Result**: All assets in one flat directory with descriptive names and a single tracking log

```bash
# Simple renaming in English
python cli.py --input-dir input/photos --output-dir output/renamed --lang en

# German language with WebP conversion and resizing
python cli.py --input-dir input/photos --output-dir output/optimized \
  --lang de --format webp --max-width 1024

# Project-level logging for organized results
python cli.py --input-dir input/company-photos --output-dir output/processed \
  --lang en --log-mode project_level
```
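Every command above ultimately renames files to hyphenated slugs like `office-desk-computer-workspace-clean.webp`. A rough sketch of how an AI-generated description might be reduced to such a filename stem; `to_slug` is a hypothetical illustration, not the package's actual implementation:

```python
import re
import unicodedata


def to_slug(description: str, max_words: int = 6) -> str:
    """Turn a free-text description into a lowercase, hyphenated filename stem."""
    # Strip accents so the name is safe across file systems and URLs
    ascii_text = (
        unicodedata.normalize("NFKD", description)
        .encode("ascii", "ignore")
        .decode()
    )
    # Keep only alphanumeric runs, capped at a few words for readability
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    return "-".join(words[:max_words])
```

For example, `to_slug("Škofja Loka café")` yields `skofja-loka-cafe`, which stays readable in URLs and logs.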
## 📖 Command Line Options

| Option | Description | Default |
|--------|-------------|---------|
| `--input-dir` | Directory containing input images | `input` |
| `--output-dir` | Base directory for processed images | `output` |
| `--lang` | Target language code (en, de, sl, fr, etc.) | `en` |
| `--format` | Output format (jpg, png, webp, avif) | Original |
| `--max-width` | Maximum width in pixels | Original |
| `--log-mode` | Logging mode (central, project_level, per_folder, flat) | `per_folder` |
| `--max-retries` | Maximum retry attempts for API calls | `5` |

## 📊 Logging Modes

### `per_folder` (Default)
Creates `results.json` and `results.csv` in each output subdirectory.

### `project_level`
Creates one log file per top-level project folder.

### `central`
Creates a single log file in the main output directory.

### `flat`
Flattens the directory structure and uses central logging.

## 🔄 Resume Functionality

The tool automatically resumes interrupted processing:

1. **Scans existing logs**: Checks all `results.json` files in the output directory
2. **Identifies processed files**: Uses the `original_filename` field for tracking
3. **Skips completed work**: Only processes new or previously failed images
4. **Handles rate limits**: Exponential backoff with up to 5 retry attempts

Example resume scenario:
```bash
# First run - processes 20 files, hits rate limit
python cli.py --input-dir photos --output-dir output --lang de

# Resume run - skips 20 completed files, continues with the remainder
python cli.py --input-dir photos --output-dir output --lang de
```

## 🛠️ Advanced Configuration

### Retry Logic
- **Base delay**: 10 seconds, doubling with each retry
- **Rate limit delay**: An additional 60 seconds for quota errors
- **Maximum delay**: Capped at 5 minutes
- **Smart detection**: Recognizes a variety of rate-limiting error messages

### Image Processing
- **Supported input formats**: JPG, JPEG, PNG, WebP
- **Output formats**: JPG, PNG, WebP, AVIF
- **Resizing**: Maintains aspect ratio when using `--max-width`
- **Quality**: WebP output at 90% quality
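The retry schedule documented above (10-second base delay doubling per attempt, an extra 60 seconds for quota errors, capped at 5 minutes) can be expressed as a small function. This is a sketch of the documented behaviour, not the package's actual code:

```python
def retry_delay(attempt: int, rate_limited: bool = False,
                base: float = 10.0, cap: float = 300.0) -> float:
    """Seconds to wait before retry `attempt` (0-based).

    Exponential backoff from `base`, padded by 60 s for rate-limit errors,
    and capped at five minutes.
    """
    delay = base * (2 ** attempt)
    if rate_limited:
        delay += 60.0
    return min(delay, cap)
```

With the defaults, five ordinary retries wait 10, 20, 40, 80, and 160 seconds; a quota error on the first attempt waits 70 seconds instead.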
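The resume mechanism described above boils down to collecting every `original_filename` already recorded in a `results.json` log. A stdlib-only sketch, assuming each `results.json` holds a JSON array of entries like the Log Entry example below; `already_processed` is a hypothetical helper, not the package's implementation:

```python
import json
from pathlib import Path


def already_processed(output_dir: str) -> set[str]:
    """Gather original_filename values from every results.json under output_dir."""
    done: set[str] = set()
    for log_file in Path(output_dir).rglob("results.json"):
        try:
            entries = json.loads(log_file.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            continue  # unreadable or partial log: treat its images as unprocessed
        for entry in entries:
            if "original_filename" in entry:
                done.add(entry["original_filename"])
    return done
```

A resume run would then skip any input file whose name appears in this set and process only the rest.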
## 📁 Project Structure

```
image-filename-ai/
├── cli.py                     # Main application
├── app/
│   └── utils/
│       ├── ai_handler.py      # Gemini AI integration
│       ├── file_utils.py      # File operations and logging
│       └── image_processor.py # Image processing and conversion
├── input/                     # Your source images
└── output/                    # Generated results
    ├── project1/
    │   ├── results.json       # Processing log
    │   ├── results.csv        # CSV export
    │   └── *.webp             # Renamed images
    └── project2/
        └── ...
```

## 📈 Example Output

### Generated Filenames
- `IMG_1234.jpg` → `sunset-mountain-landscape-golden-hour.webp`
- `photo.png` → `office-desk-computer-workspace-clean.webp`
- `image.jpg` → `family-portrait-garden-summer-happy.webp`

### Log Entry
```json
{
  "timestamp": "2025-05-25 09:21:31",
  "original_path": "input/photos/IMG_1234.jpg",
  "new_path": "output/photos/sunset-mountain-landscape.webp",
  "original_filename": "IMG_1234.jpg",
  "new_filename": "sunset-mountain-landscape.webp",
  "alt_text": "A beautiful sunset over mountain peaks with golden light illuminating the landscape."
}
```

## 🌍 Language Support

The tool supports any language supported by Gemini AI. Common examples:

- `--lang en` - English
- `--lang de` - German (Deutsch)
- `--lang sl` - Slovenian
- `--lang fr` - French
- `--lang es` - Spanish
- `--lang it` - Italian
- `--lang pt` - Portuguese

## 🔧 Development & Testing

### Running Tests
```bash
# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=app --cov=cli

# Test a specific module
pytest tests/test_cli.py -v
```

### Code Quality
```bash
# Format code
black .

# Lint code
ruff check .

# Fix linting issues
ruff check . --fix
```
### Development Setup
```bash
# Install development dependencies (included in requirements.txt)
pip install -r requirements.txt

# Run the API in development mode with auto-reload
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```

### Architecture

**CLI mode**: Direct local processing using the Gemini API
- Input: Local image directories
- Output: Processed images with generated names
- Use case: Batch processing, one-time organization

**API mode**: Web service for on-demand processing
- Input: GCS bucket URLs or direct uploads
- Output: Background job processing with status tracking
- Use case: Integration with other systems, web applications

## 📋 Production TODO

- [ ] **Add API authentication** (API keys, JWT, OAuth)
- [ ] **Add rate limiting** per client/endpoint
- [ ] **Add input validation** and sanitization
- [ ] **Add comprehensive logging** and monitoring
- [ ] **Add image virus scanning** before processing
- [ ] **Add batch processing** for large image sets
- [ ] **Add webhook notifications** for job completion
- [ ] **Add cost monitoring** for Vertex AI usage
- [ ] **Package CLI as standalone executable** (PyInstaller)
- [ ] **Add retry logic** for failed AI requests
- [ ] **Add progress bars** for CLI processing
"bugtrack_url": null,
"license": null,
"summary": "AI-powered image filename generator using Google Gemini - Transform generic image files into descriptive, SEO-friendly names",
"version": "0.1.0",
"project_urls": {
"Documentation": "https://github.com/matija2209/image-filename-ai#readme",
"Homepage": "https://github.com/matija2209/image-filename-ai",
"Issues": "https://github.com/matija2209/image-filename-ai/issues",
"Repository": "https://github.com/matija2209/image-filename-ai"
},
"split_keywords": [
"ai",
" image",
" filename",
" gemini",
" google-cloud",
" cli",
" batch-processing",
" seo",
" automation",
" computer-vision",
" machine-learning"
],
"urls": [
{
"comment_text": null,
"digests": {
"blake2b_256": "05cde52ae0e226d794a4f6a059dbe1a4803a83620306aebec28ecbf352e97c4c",
"md5": "56a901a202ddb6ad6569401e6bb07eb7",
"sha256": "a0ae8a3617045e9ecfc87ec17e3a4a9009d687cbf3b00a450631228f78249194"
},
"downloads": -1,
"filename": "image_filename_ai-0.1.0-py3-none-any.whl",
"has_sig": false,
"md5_digest": "56a901a202ddb6ad6569401e6bb07eb7",
"packagetype": "bdist_wheel",
"python_version": "py3",
"requires_python": ">=3.11",
"size": 33802,
"upload_time": "2025-08-12T15:20:48",
"upload_time_iso_8601": "2025-08-12T15:20:48.491072Z",
"url": "https://files.pythonhosted.org/packages/05/cd/e52ae0e226d794a4f6a059dbe1a4803a83620306aebec28ecbf352e97c4c/image_filename_ai-0.1.0-py3-none-any.whl",
"yanked": false,
"yanked_reason": null
},
{
"comment_text": null,
"digests": {
"blake2b_256": "6bd13e86c4fc258616090391aa26a68f2c2e394ba76bf752bf55b840b3327897",
"md5": "5ef9367e05992ee5a8c169afe5625df6",
"sha256": "9a9bc6c8fc4177107da5234382c323dc1926c140c1134a7a95410d5554846c9d"
},
"downloads": -1,
"filename": "image_filename_ai-0.1.0.tar.gz",
"has_sig": false,
"md5_digest": "5ef9367e05992ee5a8c169afe5625df6",
"packagetype": "sdist",
"python_version": "source",
"requires_python": ">=3.11",
"size": 45659,
"upload_time": "2025-08-12T15:20:49",
"upload_time_iso_8601": "2025-08-12T15:20:49.669214Z",
"url": "https://files.pythonhosted.org/packages/6b/d1/3e86c4fc258616090391aa26a68f2c2e394ba76bf752bf55b840b3327897/image_filename_ai-0.1.0.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-08-12 15:20:49",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "matija2209",
"github_project": "image-filename-ai#readme",
"github_not_found": true,
"lcname": "image-filename-ai"
}