# AI Content Generation Suite
A comprehensive AI content generation package with multiple providers and services, consolidated into a single installable package.
[Python 3.10+](https://www.python.org/downloads/)
[License: MIT](https://opensource.org/licenses/MIT)
[Code style: black](https://github.com/psf/black)
[PyPI: video-ai-studio](https://pypi.org/project/video-ai-studio/)
> **⚡ Production-ready Python package with comprehensive CLI, parallel execution, and enterprise-grade architecture**
## 🎬 **Demo Video**
[Watch the demo video](https://www.youtube.com/watch?v=xzvPrlKnXqk)
*Click to watch the complete demo of AI Content Generation Suite in action*
## 🎨 Available AI Models
### Text-to-Image Models
| Model Name | Provider | Cost per Image | Resolution | Special Features |
|------------|----------|----------------|------------|------------------|
| `flux_dev` | FAL AI | $0.003 | 1024x1024 | High quality, FLUX.1 Dev |
| `flux_schnell` | FAL AI | $0.001 | 1024x1024 | Fast generation, FLUX.1 Schnell |
| `imagen4` | FAL AI | $0.004 | 1024x1024 | Google Imagen 4, photorealistic |
| `seedream_v3` | FAL AI | $0.002 | 1024x1024 | Seedream v3, bilingual support |
| `seedream3` | Replicate | $0.003 | Up to 2048px | ByteDance Seedream-3, high-res |
| `gen4` | Replicate | $0.08 | 720p/1080p | **Runway Gen-4, multi-reference guidance** |
### Image-to-Image Models
| Model Name | Provider | Cost per Image | Special Features |
|------------|----------|----------------|------------------|
| `photon_flash` | FAL AI | $0.02 | Luma Photon Flash, creative & fast |
| `photon_base` | FAL AI | $0.03 | Luma Photon Base, high quality |
| `flux_kontext` | FAL AI | $0.025 | FLUX Kontext Dev, contextual editing |
| `flux_kontext_multi` | FAL AI | $0.04 | FLUX Kontext Multi, multi-image |
| `seededit_v3` | FAL AI | $0.02 | ByteDance SeedEdit v3, precise editing |
| `clarity_upscaler` | FAL AI | $0.05 | Clarity AI upscaler |
### Image-to-Video Models
| Model Name | Provider | Cost per Video | Resolution | Special Features |
|------------|----------|----------------|------------|------------------|
| `veo3` | FAL AI | $3.00 | Up to 1080p | Google Veo 3.0, latest model |
| `veo3_fast` | FAL AI | $2.00 | Up to 1080p | Google Veo 3.0 Fast |
| `veo2` | FAL AI | $2.50 | Up to 1080p | Google Veo 2.0 |
| `hailuo` | FAL AI | $0.08 | 720p | MiniMax Hailuo-02, budget-friendly |
| `kling` | FAL AI | $0.10 | 720p | Kling Video 2.1 |
### Image Understanding Models
| Model Name | Provider | Cost per Analysis | Special Features |
|------------|----------|-------------------|------------------|
| `gemini_describe` | Google | $0.001 | Basic image description |
| `gemini_detailed` | Google | $0.002 | Detailed image analysis |
| `gemini_classify` | Google | $0.001 | Image classification |
| `gemini_objects` | Google | $0.002 | Object detection |
| `gemini_ocr` | Google | $0.001 | Text extraction (OCR) |
| `gemini_composition` | Google | $0.002 | Artistic & technical analysis |
| `gemini_qa` | Google | $0.001 | Question & answer system |
### Text-to-Speech Models
| Model Name | Provider | Cost per Request | Special Features |
|------------|----------|------------------|------------------|
| `elevenlabs` | ElevenLabs | $0.05 | High quality TTS |
| `elevenlabs_turbo` | ElevenLabs | $0.03 | Fast generation |
| `elevenlabs_v3` | ElevenLabs | $0.08 | Latest v3 model |
### Prompt Generation Models
| Model Name | Provider | Cost per Request | Special Features |
|------------|----------|------------------|------------------|
| `openrouter_video_prompt` | OpenRouter | $0.002 | General video prompts |
| `openrouter_video_cinematic` | OpenRouter | $0.002 | Cinematic style prompts |
| `openrouter_video_realistic` | OpenRouter | $0.002 | Realistic style prompts |
| `openrouter_video_artistic` | OpenRouter | $0.002 | Artistic style prompts |
| `openrouter_video_dramatic` | OpenRouter | $0.002 | Dramatic style prompts |
### Audio & Video Processing
| Model Name | Provider | Cost per Request | Special Features |
|------------|----------|------------------|------------------|
| `thinksound` | FAL AI | $0.05 | AI audio generation |
| `topaz` | FAL AI | $1.50 | Video upscaling |
### 🌟 **Featured Model: Runway Gen-4**
The **`gen4`** model is our most advanced text-to-image model, offering unique capabilities:
- **Multi-Reference Guidance**: Use up to 3 reference images with tagging
- **Cinematic Quality**: Premium model for high-end generation
- **@ Syntax**: Reference tagged elements in prompts (`@woman`, `@park`)
- **Variable Pricing**: $0.05 (720p) / $0.08 (1080p)
**Total Models: 35+ AI models across 7 categories**
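The model tables above can be treated as a small catalog in code. The sketch below is illustrative only — the dictionary mirrors a few rows of the text-to-image table, and `cheapest` is a hypothetical helper, not part of the package:

```python
# Illustrative catalog mirroring a few rows of the tables above.
# This is NOT the package's internal registry -- just a sketch.
TEXT_TO_IMAGE = {
    "flux_dev":     {"provider": "FAL AI",    "cost": 0.003},
    "flux_schnell": {"provider": "FAL AI",    "cost": 0.001},
    "imagen4":      {"provider": "FAL AI",    "cost": 0.004},
    "gen4":         {"provider": "Replicate", "cost": 0.08},
}

def cheapest(models: dict) -> str:
    """Return the model name with the lowest per-image cost."""
    return min(models, key=lambda name: models[name]["cost"])

print(cheapest(TEXT_TO_IMAGE))  # flux_schnell
```

This is the same trade-off the tables encode: prototype with `flux_schnell` at $0.001/image, switch to `gen4` only when multi-reference guidance is worth 80x the cost.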
## 🚀 **FLAGSHIP: AI Content Pipeline**
The unified AI content generation pipeline with parallel execution support, multi-model integration, and YAML-based configuration.
### Core Capabilities
- **🔄 Unified Pipeline Architecture** - YAML/JSON-based configuration for complex multi-step workflows
- **⚡ Parallel Execution Engine** - 2-3x performance improvement with thread-based parallel processing
- **🎯 Type-Safe Configuration** - Pydantic models with comprehensive validation
- **💰 Cost Management** - Real-time cost estimation and tracking across all services
- **📊 Rich Logging** - Beautiful console output with progress tracking and performance metrics
### AI Service Integrations
- **🖼️ FAL AI** - Text-to-image, image-to-image, text-to-video, video generation, avatar creation
- **🗣️ ElevenLabs** - Professional text-to-speech with 20+ voice options
- **🎥 Google Vertex AI** - Veo video generation and Gemini text generation
- **🔗 OpenRouter** - Alternative TTS and chat completion services
### Developer Experience
- **🛠️ Professional CLI** - Comprehensive command-line interface with Click
- **📦 Modular Architecture** - Clean separation of concerns with extensible design
- **🧪 Comprehensive Testing** - Unit and integration tests with pytest
- **📚 Type Hints** - Full type coverage for excellent IDE support
## 📦 Installation
### Quick Start
```bash
# Install from PyPI
pip install video-ai-studio
# Or install in development mode
pip install -e .
```
### 🔑 API Keys Setup
After installation, you need to configure your API keys:
1. **Download the example configuration:**
```bash
# Option 1: Download from GitHub
curl -o .env https://raw.githubusercontent.com/donghaozhang/veo3-fal-video-ai/main/.env.example
# Option 2: Create manually
touch .env
```
2. **Add your API keys to `.env`:**
```env
# Required for most functionality
FAL_KEY=your_fal_api_key_here
# Optional - add as needed
GEMINI_API_KEY=your_gemini_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
```
3. **Get API keys from:**
- **FAL AI**: https://fal.ai/dashboard (required for most models)
- **Google Gemini**: https://makersuite.google.com/app/apikey
- **OpenRouter**: https://openrouter.ai/keys
- **ElevenLabs**: https://elevenlabs.io/app/settings
### 📋 Dependencies
The package installs core dependencies automatically. See [requirements.txt](requirements.txt) for the complete list.
## 🛠️ Quick Start
### Console Commands
```bash
# List all available AI models
ai-content-pipeline list-models
# Generate image from text
ai-content-pipeline generate-image --text "epic space battle" --model flux_dev
# Create video (text → image → video)
ai-content-pipeline create-video --text "serene mountain lake"
# Run custom pipeline from YAML config
ai-content-pipeline run-chain --config config.yaml --input "cyberpunk city"
# Create example configurations
ai-content-pipeline create-examples
# Shortened command alias
aicp --help
```
### Python API
```python
from packages.core.ai_content_pipeline.pipeline.manager import AIPipelineManager
# Initialize manager
manager = AIPipelineManager()
# Quick video creation
result = manager.quick_create_video(
    text="serene mountain lake",
    image_model="flux_dev",
    video_model="auto"
)
# Run custom chain
chain = manager.create_chain_from_config("config.yaml")
result = manager.execute_chain(chain, "input text")
```
## 📚 Package Structure
### Core Packages
- **[ai_content_pipeline](packages/core/ai_content_pipeline/)** - Main unified pipeline with parallel execution
### Provider Packages
#### Google Services
- **[google-veo](packages/providers/google/veo/)** - Google Veo video generation (Vertex AI)
#### FAL AI Services
- **[fal-video](packages/providers/fal/video/)** - Video generation (MiniMax Hailuo-02, Kling Video 2.1)
- **[fal-text-to-video](packages/providers/fal/text-to-video/)** - Text-to-video (MiniMax Hailuo-02 Pro, Google Veo 3)
- **[fal-avatar](packages/providers/fal/avatar/)** - Avatar generation with TTS integration
- **[fal-text-to-image](packages/providers/fal/text-to-image/)** - Text-to-image (Imagen 4, Seedream v3, FLUX.1)
- **[fal-image-to-image](packages/providers/fal/image-to-image/)** - Image transformation (Luma Photon Flash)
- **[fal-video-to-video](packages/providers/fal/video-to-video/)** - Video processing (ThinkSound + Topaz)
### Service Packages
- **[text-to-speech](packages/services/text-to-speech/)** - ElevenLabs TTS integration (20+ voices)
- **[video-tools](packages/services/video-tools/)** - Video processing utilities with AI analysis
## 🔧 Configuration
### Environment Setup
Create a `.env` file in the project root:
```env
# FAL AI API Configuration
FAL_KEY=your_fal_api_key
# Google Cloud Configuration (for Veo)
PROJECT_ID=your-project-id
OUTPUT_BUCKET_PATH=gs://your-bucket/veo_output/
# ElevenLabs Configuration
ELEVENLABS_API_KEY=your_elevenlabs_api_key
# Optional: Gemini for AI analysis
GEMINI_API_KEY=your_gemini_api_key
# Optional: OpenRouter for additional models
OPENROUTER_API_KEY=your_openrouter_api_key
```
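The `.env` format above is plain `KEY=value` lines plus `#` comments. The package presumably loads it with a library such as python-dotenv; purely for illustration, here is a minimal hand-rolled parser showing what that loading amounts to:

```python
import os

def load_env(path: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Demo: write and read back a tiny .env file
with open("demo.env", "w") as f:
    f.write("# comment\nFAL_KEY=abc123\n\nGEMINI_API_KEY=xyz\n")
env = load_env("demo.env")
os.environ.update(env)  # make keys visible to the services
print(env["FAL_KEY"])  # abc123
```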
### YAML Pipeline Configuration
```yaml
name: "Text to Video Pipeline"
description: "Generate video from text prompt"
steps:
  - name: "generate_image"
    type: "text_to_image"
    model: "flux_dev"
    aspect_ratio: "16:9"

  - name: "create_video"
    type: "image_to_video"
    model: "kling_video"
    input_from: "generate_image"
    duration: 8
```
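One property worth checking in a config like this is that every `input_from` names an earlier step. A minimal validator sketch, written against a plain dict so it stays self-contained (the package's Pydantic models presumably enforce rules of this kind; `validate_wiring` is not a package function):

```python
# The YAML above, expressed as the dict a YAML loader would produce.
config = {
    "name": "Text to Video Pipeline",
    "steps": [
        {"name": "generate_image", "type": "text_to_image", "model": "flux_dev"},
        {"name": "create_video", "type": "image_to_video",
         "model": "kling_video", "input_from": "generate_image"},
    ],
}

def validate_wiring(cfg: dict) -> None:
    """Raise ValueError if any step reads from a step not defined before it."""
    seen = set()
    for step in cfg["steps"]:
        source = step.get("input_from")
        if source is not None and source not in seen:
            raise ValueError(f"step {step['name']!r} reads from unknown step {source!r}")
        seen.add(step["name"])

validate_wiring(config)  # the chain above is well-ordered, so no error
```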
### Parallel Execution
Enable parallel processing for 2-3x speedup:
```bash
# Enable parallel execution
PIPELINE_PARALLEL_ENABLED=true ai-content-pipeline run-chain --config config.yaml
```
Example parallel pipeline configuration:
```yaml
name: "Parallel Processing Example"
steps:
  - type: "parallel_group"
    steps:
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A cat"
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A dog"
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A bird"
```
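The speedup comes from running the members of a `parallel_group` concurrently: the steps are I/O-bound API calls, so threads overlap their waiting time. A rough sketch of the mechanism using only the standard library (`run_step` simulates an API call; the real executor lives inside the pipeline package):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_step(prompt: str) -> str:
    """Stand-in for a text_to_image API call (simulated network latency)."""
    time.sleep(0.1)
    return f"image for {prompt!r}"

prompts = ["A cat", "A dog", "A bird"]

start = time.perf_counter()
# Three blocking "calls" run concurrently instead of back-to-back.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_step, prompts))
elapsed = time.perf_counter() - start

print(results)
print(f"took ~{elapsed:.2f}s; sequential would be ~0.30s")
```

Because the work is waiting on the network rather than the CPU, threads (not processes) are enough to get the advertised 2-3x improvement.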
## 💰 Cost Management
### Cost Estimation
Always estimate costs before running pipelines:
```bash
# Estimate cost for a pipeline
ai-content-pipeline estimate-cost --config config.yaml
```
### Typical Costs
- **Text-to-Image**: $0.001-0.004 per image
- **Image-to-Image**: $0.01-0.05 per modification
- **Text-to-Video**: $0.08-6.00 per video (model dependent)
- **Avatar Generation**: $0.02-0.05 per video
- **Text-to-Speech**: Varies by usage (ElevenLabs pricing)
- **Video Processing**: $0.05-2.50 per video (model dependent)
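Using the figures above, a back-of-envelope estimate for a text → image → video chain is simply the sum of its per-step costs. A tiny helper along those lines (per-unit numbers copied from the model tables earlier; this is not the package's estimator):

```python
# Per-unit costs (USD) copied from the model tables above.
COSTS = {"flux_schnell": 0.001, "flux_dev": 0.003, "hailuo": 0.08, "veo3": 3.00}

def estimate(steps: list[str]) -> float:
    """Sum per-step costs for a simple linear chain."""
    return round(sum(COSTS[s] for s in steps), 3)

print(estimate(["flux_schnell", "hailuo"]))  # 0.081 -- budget prototype
print(estimate(["flux_dev", "veo3"]))        # 3.003 -- premium output
```

The 37x gap between those two chains is why prototyping on the cheap models first matters.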
### Cost-Conscious Usage
- Use cheaper models for prototyping (`flux_schnell`, `hailuo`)
- Test with small batches before large-scale generation
- Monitor costs with built-in tracking
## 🧪 Testing
```bash
# Quick tests
python tests/run_all_tests.py --quick
```
📋 See [tests/README.md](tests/README.md) for the complete testing guide.
## 💰 Provider Cost Reference
### Estimation
- **FAL AI Video**: ~$0.05-0.10 per video
- **FAL AI Text-to-Video**: ~$0.08 (MiniMax) to $2.50-6.00 (Google Veo 3)
- **FAL AI Avatar**: ~$0.02-0.05 per video
- **FAL AI Images**: ~$0.001-0.01 per image
- **Text-to-Speech**: Varies by usage (ElevenLabs pricing)
### Best Practices
1. Always run `test_setup.py` first (FREE)
2. Use cost estimation in pipeline manager
3. Start with cheaper models for testing
4. Monitor usage through provider dashboards
## 🔄 Development Workflow
### Making Changes
```bash
# Make your changes to the codebase
git add .
git commit -m "Your changes"
git push origin main
```
### Testing Installation
```bash
# Create test environment
python3 -m venv test_env
source test_env/bin/activate
# Install and test
pip install -e .
ai-content-pipeline --help
```
## 📋 Available Commands
### AI Content Pipeline Commands
- `ai-content-pipeline list-models` - List all available models
- `ai-content-pipeline generate-image` - Generate image from text
- `ai-content-pipeline create-video` - Create video from text
- `ai-content-pipeline run-chain` - Run custom YAML pipeline
- `ai-content-pipeline create-examples` - Create example configs
- `aicp` - Shortened alias for all commands
### Individual Package Commands
See [CLAUDE.md](CLAUDE.md) for detailed commands for each package.
## 📚 Documentation
- **[Project Instructions](CLAUDE.md)** - Comprehensive development guide
- **[Documentation](docs/)** - Additional documentation and guides
- **Package READMEs** - Each package has its own README with specific instructions
## 🏗️ Architecture
- **Unified Package Structure** - Single `setup.py` with consolidated dependencies
- **Consolidated Configuration** - Single `.env` file for all services
- **Modular Design** - Each service can be used independently or through the unified pipeline
- **Parallel Execution** - Optional parallel processing for improved performance
- **Cost-Conscious Design** - Built-in cost estimation and management
## 📚 Resources
### 🚀 AI Content Pipeline Resources
- [Pipeline Documentation](packages/core/ai_content_pipeline/docs/README.md)
- [Getting Started Guide](packages/core/ai_content_pipeline/docs/GETTING_STARTED.md)
- [YAML Configuration Reference](packages/core/ai_content_pipeline/docs/YAML_CONFIGURATION.md)
- [Parallel Execution Design](packages/core/ai_content_pipeline/docs/parallel_pipeline_design.md)
### Google Veo Resources
- [Veo API Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo-video-generation)
- [Google GenAI SDK](https://github.com/google/generative-ai-python)
- [Vertex AI Console](https://console.cloud.google.com/vertex-ai)
### FAL AI Resources
- [FAL AI Platform](https://fal.ai/)
- [MiniMax Hailuo Documentation](https://fal.ai/models/fal-ai/minimax-video-01)
- [Kling Video 2.1 Documentation](https://fal.ai/models/fal-ai/kling-video/v2.1/standard/image-to-video/api)
- [FAL AI Avatar Documentation](https://fal.ai/models/fal-ai/avatar-video)
- [ThinkSound API Documentation](https://fal.ai/models/fal-ai/thinksound/api)
- [Topaz Video Upscale Documentation](https://fal.ai/models/fal-ai/topaz/upscale/video/api)
### Text-to-Speech Resources
- [ElevenLabs API Documentation](https://elevenlabs.io/docs/capabilities/text-to-speech)
- [OpenRouter Platform](https://openrouter.ai/)
- [ElevenLabs Voice Library](https://elevenlabs.io/app/speech-synthesis/text-to-speech)
- [Text-to-Dialogue Documentation](https://elevenlabs.io/docs/cookbooks/text-to-dialogue)
- [Package Migration Guide](packages/services/text-to-speech/docs/MIGRATION_GUIDE.md)
### Additional Documentation
- [Project Instructions](CLAUDE.md) - Comprehensive development guide
- [Documentation](docs/) - Additional documentation and guides
- [Package Organization](docs/repository_organization_guide.md) - Package structure guide
## 🤝 Contributing
1. Follow the development patterns in [CLAUDE.md](CLAUDE.md)
2. Add tests for new features
3. Update documentation as needed
4. Test installation in fresh virtual environment
5. Commit with descriptive messages
"upload_time_iso_8601": "2025-07-14T06:27:47.169721Z",
"url": "https://files.pythonhosted.org/packages/96/d7/60fac39a1e4836e19e303ae1992cdc5a2c19e43af1e4dc3552280a9a1980/video_ai_studio-1.0.15.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-14 06:27:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "donghaozhang",
"github_project": "veo3-fal-video-ai",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "python-dotenv",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "requests",
"specs": [
[
">=",
"2.31.0"
]
]
},
{
"name": "typing-extensions",
"specs": [
[
">=",
"4.0.0"
]
]
},
{
"name": "pyyaml",
"specs": [
[
">=",
"6.0"
]
]
},
{
"name": "pathlib2",
"specs": [
[
">=",
"2.3.7"
]
]
},
{
"name": "argparse",
"specs": [
[
">=",
"1.4.0"
]
]
},
{
"name": "fal-client",
"specs": [
[
">=",
"0.4.0"
]
]
},
{
"name": "replicate",
"specs": [
[
">=",
"0.15.0"
]
]
},
{
"name": "openai",
"specs": [
[
">=",
"1.0.0"
],
[
"<",
"2.0.0"
]
]
},
{
"name": "google-cloud-aiplatform",
"specs": [
[
">=",
"1.20.0"
]
]
},
{
"name": "google-cloud-storage",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "google-auth",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "google-generativeai",
"specs": [
[
">=",
"0.2.0"
]
]
},
{
"name": "elevenlabs",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "Pillow",
"specs": [
[
">=",
"10.0.0"
]
]
},
{
"name": "moviepy",
"specs": [
[
">=",
"1.0.3"
]
]
},
{
"name": "ffmpeg-python",
"specs": [
[
">=",
"0.2.0"
]
]
},
{
"name": "aiohttp",
"specs": [
[
">=",
"3.8.0"
]
]
},
{
"name": "httpx",
"specs": [
[
">=",
"0.25.0"
]
]
},
{
"name": "jupyter",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "ipython",
"specs": [
[
">=",
"8.0.0"
]
]
},
{
"name": "notebook",
"specs": [
[
">=",
"7.0.0"
]
]
},
{
"name": "pytest",
"specs": [
[
">=",
"7.0.0"
]
]
},
{
"name": "pytest-asyncio",
"specs": [
[
">=",
"0.21.0"
]
]
}
],
"lcname": "video-ai-studio"
}