# AI Content Generation Suite
A comprehensive AI content generation package with multiple providers and services, consolidated into a single installable package.
[Python 3.10+](https://www.python.org/downloads/)
[License: MIT](https://opensource.org/licenses/MIT)
[Code style: black](https://github.com/psf/black)
[PyPI: video-ai-studio](https://pypi.org/project/video-ai-studio/)
> **⚡ Production-ready Python package with comprehensive CLI, parallel execution, and enterprise-grade architecture**
## 🎬 **Demo Video**
[Watch the demo video](https://www.youtube.com/watch?v=xzvPrlKnXqk)
*Click to watch the complete demo of AI Content Generation Suite in action*
## 🎨 Available AI Models
### Text-to-Image Models
| Model Name | Provider | Cost per Image | Resolution | Special Features |
|------------|----------|----------------|------------|------------------|
| `flux_dev` | FAL AI | $0.003 | 1024x1024 | High quality, FLUX.1 Dev |
| `flux_schnell` | FAL AI | $0.001 | 1024x1024 | Fast generation, FLUX.1 Schnell |
| `imagen4` | FAL AI | $0.004 | 1024x1024 | Google Imagen 4, photorealistic |
| `seedream_v3` | FAL AI | $0.002 | 1024x1024 | Seedream v3, bilingual support |
| `seedream3` | Replicate | $0.003 | Up to 2048px | ByteDance Seedream-3, high-res |
| `gen4` | Replicate | $0.08 | 720p/1080p | **Runway Gen-4, multi-reference guidance** |
### Image-to-Image Models
| Model Name | Provider | Cost per Image | Special Features |
|------------|----------|----------------|------------------|
| `photon_flash` | FAL AI | $0.02 | Luma Photon Flash, creative & fast |
| `photon_base` | FAL AI | $0.03 | Luma Photon Base, high quality |
| `flux_kontext` | FAL AI | $0.025 | FLUX Kontext Dev, contextual editing |
| `flux_kontext_multi` | FAL AI | $0.04 | FLUX Kontext Multi, multi-image |
| `seededit_v3` | FAL AI | $0.02 | ByteDance SeedEdit v3, precise editing |
| `clarity_upscaler` | FAL AI | $0.05 | Clarity AI upscaler |
### Image-to-Video Models
| Model Name | Provider | Cost per Video | Resolution | Special Features |
|------------|----------|----------------|------------|------------------|
| `veo3` | FAL AI | $3.00 | Up to 1080p | Google Veo 3.0, latest model |
| `veo3_fast` | FAL AI | $2.00 | Up to 1080p | Google Veo 3.0 Fast |
| `veo2` | FAL AI | $2.50 | Up to 1080p | Google Veo 2.0 |
| `hailuo` | FAL AI | $0.08 | 720p | MiniMax Hailuo-02, budget-friendly |
| `kling` | FAL AI | $0.10 | 720p | Kling Video 2.1 |
### Image Understanding Models
| Model Name | Provider | Cost per Analysis | Special Features |
|------------|----------|-------------------|------------------|
| `gemini_describe` | Google | $0.001 | Basic image description |
| `gemini_detailed` | Google | $0.002 | Detailed image analysis |
| `gemini_classify` | Google | $0.001 | Image classification |
| `gemini_objects` | Google | $0.002 | Object detection |
| `gemini_ocr` | Google | $0.001 | Text extraction (OCR) |
| `gemini_composition` | Google | $0.002 | Artistic & technical analysis |
| `gemini_qa` | Google | $0.001 | Question & answer system |
### Text-to-Speech Models
| Model Name | Provider | Cost per Request | Special Features |
|------------|----------|------------------|------------------|
| `elevenlabs` | ElevenLabs | $0.05 | High quality TTS |
| `elevenlabs_turbo` | ElevenLabs | $0.03 | Fast generation |
| `elevenlabs_v3` | ElevenLabs | $0.08 | Latest v3 model |
### Prompt Generation Models
| Model Name | Provider | Cost per Request | Special Features |
|------------|----------|------------------|------------------|
| `openrouter_video_prompt` | OpenRouter | $0.002 | General video prompts |
| `openrouter_video_cinematic` | OpenRouter | $0.002 | Cinematic style prompts |
| `openrouter_video_realistic` | OpenRouter | $0.002 | Realistic style prompts |
| `openrouter_video_artistic` | OpenRouter | $0.002 | Artistic style prompts |
| `openrouter_video_dramatic` | OpenRouter | $0.002 | Dramatic style prompts |
### Audio & Video Processing
| Model Name | Provider | Cost per Request | Special Features |
|------------|----------|------------------|------------------|
| `thinksound` | FAL AI | $0.05 | AI audio generation |
| `topaz` | FAL AI | $1.50 | Video upscaling |
### 🌟 **Featured Model: Runway Gen-4**
The **`gen4`** model is our most advanced text-to-image model, offering unique capabilities:
- **Multi-Reference Guidance**: Use up to 3 reference images with tagging
- **Cinematic Quality**: Premium model for high-end generation
- **@ Syntax**: Reference tagged elements in prompts (`@woman`, `@park`)
- **Variable Pricing**: $0.05 (720p) / $0.08 (1080p)
**Total Models: 35+ AI models across 7 categories**
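The model tables above can be treated as a small catalog in code. The sketch below is illustrative only — the dictionary mirrors a few rows of the text-to-image table, and `cheapest` is a hypothetical helper, not part of the package:

```python
# Illustrative catalog mirroring a few rows of the tables above.
# This is NOT the package's internal registry -- just a sketch.
TEXT_TO_IMAGE = {
    "flux_dev":     {"provider": "FAL AI",    "cost": 0.003},
    "flux_schnell": {"provider": "FAL AI",    "cost": 0.001},
    "imagen4":      {"provider": "FAL AI",    "cost": 0.004},
    "gen4":         {"provider": "Replicate", "cost": 0.08},
}

def cheapest(models: dict) -> str:
    """Return the model name with the lowest per-image cost."""
    return min(models, key=lambda name: models[name]["cost"])

print(cheapest(TEXT_TO_IMAGE))  # flux_schnell
```

This is the same trade-off the tables encode: prototype with `flux_schnell` at $0.001/image, switch to `gen4` only when multi-reference guidance is worth 80x the cost.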
## 🚀 **FLAGSHIP: AI Content Pipeline**
The unified AI content generation pipeline with parallel execution support, multi-model integration, and YAML-based configuration.
### Core Capabilities
- **🔄 Unified Pipeline Architecture** - YAML/JSON-based configuration for complex multi-step workflows
- **⚡ Parallel Execution Engine** - 2-3x performance improvement with thread-based parallel processing
- **🎯 Type-Safe Configuration** - Pydantic models with comprehensive validation
- **💰 Cost Management** - Real-time cost estimation and tracking across all services
- **📊 Rich Logging** - Beautiful console output with progress tracking and performance metrics
### AI Service Integrations
- **🖼️ FAL AI** - Text-to-image, image-to-image, text-to-video, video generation, avatar creation
- **🗣️ ElevenLabs** - Professional text-to-speech with 20+ voice options
- **🎥 Google Vertex AI** - Veo video generation and Gemini text generation
- **🔗 OpenRouter** - Alternative TTS and chat completion services
### Developer Experience
- **🛠️ Professional CLI** - Comprehensive command-line interface with Click
- **📦 Modular Architecture** - Clean separation of concerns with extensible design
- **🧪 Comprehensive Testing** - Unit and integration tests with pytest
- **📚 Type Hints** - Full type coverage for excellent IDE support
## 📦 Installation
### Quick Start
```bash
# Install from PyPI
pip install video-ai-studio
# Or install in development mode
pip install -e .
```
### 🔑 API Keys Setup
After installation, you need to configure your API keys:
1. **Download the example configuration:**
```bash
# Option 1: Download from GitHub
curl -o .env https://raw.githubusercontent.com/donghaozhang/veo3-fal-video-ai/main/.env.example
# Option 2: Create manually
touch .env
```
2. **Add your API keys to `.env`:**
```env
# Required for most functionality
FAL_KEY=your_fal_api_key_here
# Optional - add as needed
GEMINI_API_KEY=your_gemini_api_key_here
OPENROUTER_API_KEY=your_openrouter_api_key_here
ELEVENLABS_API_KEY=your_elevenlabs_api_key_here
```
3. **Get API keys from:**
- **FAL AI**: https://fal.ai/dashboard (required for most models)
- **Google Gemini**: https://makersuite.google.com/app/apikey
- **OpenRouter**: https://openrouter.ai/keys
- **ElevenLabs**: https://elevenlabs.io/app/settings
### 📋 Dependencies
The package installs core dependencies automatically. See [requirements.txt](requirements.txt) for the complete list.
## 🛠️ Quick Start
### Console Commands
```bash
# List all available AI models
ai-content-pipeline list-models
# Generate image from text
ai-content-pipeline generate-image --text "epic space battle" --model flux_dev
# Create video (text → image → video)
ai-content-pipeline create-video --text "serene mountain lake"
# Run custom pipeline from YAML config
ai-content-pipeline run-chain --config config.yaml --input "cyberpunk city"
# Create example configurations
ai-content-pipeline create-examples
# Shortened command alias
aicp --help
```
### Python API
```python
from packages.core.ai_content_pipeline.pipeline.manager import AIPipelineManager
# Initialize manager
manager = AIPipelineManager()
# Quick video creation
result = manager.quick_create_video(
    text="serene mountain lake",
    image_model="flux_dev",
    video_model="auto"
)
# Run custom chain
chain = manager.create_chain_from_config("config.yaml")
result = manager.execute_chain(chain, "input text")
```
## 📚 Package Structure
### Core Packages
- **[ai_content_pipeline](packages/core/ai_content_pipeline/)** - Main unified pipeline with parallel execution
### Provider Packages
#### Google Services
- **[google-veo](packages/providers/google/veo/)** - Google Veo video generation (Vertex AI)
#### FAL AI Services
- **[fal-video](packages/providers/fal/video/)** - Video generation (MiniMax Hailuo-02, Kling Video 2.1)
- **[fal-text-to-video](packages/providers/fal/text-to-video/)** - Text-to-video (MiniMax Hailuo-02 Pro, Google Veo 3)
- **[fal-avatar](packages/providers/fal/avatar/)** - Avatar generation with TTS integration
- **[fal-text-to-image](packages/providers/fal/text-to-image/)** - Text-to-image (Imagen 4, Seedream v3, FLUX.1)
- **[fal-image-to-image](packages/providers/fal/image-to-image/)** - Image transformation (Luma Photon Flash)
- **[fal-video-to-video](packages/providers/fal/video-to-video/)** - Video processing (ThinkSound + Topaz)
### Service Packages
- **[text-to-speech](packages/services/text-to-speech/)** - ElevenLabs TTS integration (20+ voices)
- **[video-tools](packages/services/video-tools/)** - Video processing utilities with AI analysis
## 🔧 Configuration
### Environment Setup
Create a `.env` file in the project root:
```env
# FAL AI API Configuration
FAL_KEY=your_fal_api_key
# Google Cloud Configuration (for Veo)
PROJECT_ID=your-project-id
OUTPUT_BUCKET_PATH=gs://your-bucket/veo_output/
# ElevenLabs Configuration
ELEVENLABS_API_KEY=your_elevenlabs_api_key
# Optional: Gemini for AI analysis
GEMINI_API_KEY=your_gemini_api_key
# Optional: OpenRouter for additional models
OPENROUTER_API_KEY=your_openrouter_api_key
```
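The `.env` format above is plain `KEY=value` lines plus `#` comments. The package presumably loads it with a library such as python-dotenv; purely for illustration, here is a minimal hand-rolled parser showing what that loading amounts to:

```python
import os

def load_env(path: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Demo: write and read back a tiny .env file
with open("demo.env", "w") as f:
    f.write("# comment\nFAL_KEY=abc123\n\nGEMINI_API_KEY=xyz\n")
env = load_env("demo.env")
os.environ.update(env)  # make keys visible to the services
print(env["FAL_KEY"])  # abc123
```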
### YAML Pipeline Configuration
```yaml
name: "Text to Video Pipeline"
description: "Generate video from text prompt"
steps:
  - name: "generate_image"
    type: "text_to_image"
    model: "flux_dev"
    aspect_ratio: "16:9"

  - name: "create_video"
    type: "image_to_video"
    model: "kling_video"
    input_from: "generate_image"
    duration: 8
```
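One property worth checking in a config like this is that every `input_from` names an earlier step. A minimal validator sketch, written against a plain dict so it stays self-contained (the package's Pydantic models presumably enforce rules of this kind; `validate_wiring` is not a package function):

```python
# The YAML above, expressed as the dict a YAML loader would produce.
config = {
    "name": "Text to Video Pipeline",
    "steps": [
        {"name": "generate_image", "type": "text_to_image", "model": "flux_dev"},
        {"name": "create_video", "type": "image_to_video",
         "model": "kling_video", "input_from": "generate_image"},
    ],
}

def validate_wiring(cfg: dict) -> None:
    """Raise ValueError if any step reads from a step not defined before it."""
    seen = set()
    for step in cfg["steps"]:
        source = step.get("input_from")
        if source is not None and source not in seen:
            raise ValueError(f"step {step['name']!r} reads from unknown step {source!r}")
        seen.add(step["name"])

validate_wiring(config)  # the chain above is well-ordered, so no error
```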
### Parallel Execution
Enable parallel processing for 2-3x speedup:
```bash
# Enable parallel execution
PIPELINE_PARALLEL_ENABLED=true ai-content-pipeline run-chain --config config.yaml
```
Example parallel pipeline configuration:
```yaml
name: "Parallel Processing Example"
steps:
  - type: "parallel_group"
    steps:
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A cat"
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A dog"
      - type: "text_to_image"
        model: "flux_schnell"
        params:
          prompt: "A bird"
```
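The speedup comes from running the members of a `parallel_group` concurrently: the steps are I/O-bound API calls, so threads overlap their waiting time. A rough sketch of the mechanism using only the standard library (`run_step` simulates an API call; the real executor lives inside the pipeline package):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def run_step(prompt: str) -> str:
    """Stand-in for a text_to_image API call (simulated network latency)."""
    time.sleep(0.1)
    return f"image for {prompt!r}"

prompts = ["A cat", "A dog", "A bird"]

start = time.perf_counter()
# Three blocking "calls" run concurrently instead of back-to-back.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_step, prompts))
elapsed = time.perf_counter() - start

print(results)
print(f"took ~{elapsed:.2f}s; sequential would be ~0.30s")
```

Because the work is waiting on the network rather than the CPU, threads (not processes) are enough to get the advertised 2-3x improvement.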
## 💰 Cost Management
### Cost Estimation
Always estimate costs before running pipelines:
```bash
# Estimate cost for a pipeline
ai-content-pipeline estimate-cost --config config.yaml
```
### Typical Costs
- **Text-to-Image**: $0.001-0.004 per image
- **Image-to-Image**: $0.01-0.05 per modification
- **Text-to-Video**: $0.08-6.00 per video (model dependent)
- **Avatar Generation**: $0.02-0.05 per video
- **Text-to-Speech**: Varies by usage (ElevenLabs pricing)
- **Video Processing**: $0.05-2.50 per video (model dependent)
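Using the figures above, a back-of-envelope estimate for a text → image → video chain is simply the sum of its per-step costs. A tiny helper along those lines (per-unit numbers copied from the model tables earlier; this is not the package's estimator):

```python
# Per-unit costs (USD) copied from the model tables above.
COSTS = {"flux_schnell": 0.001, "flux_dev": 0.003, "hailuo": 0.08, "veo3": 3.00}

def estimate(steps: list[str]) -> float:
    """Sum per-step costs for a simple linear chain."""
    return round(sum(COSTS[s] for s in steps), 3)

print(estimate(["flux_schnell", "hailuo"]))  # 0.081 -- budget prototype
print(estimate(["flux_dev", "veo3"]))        # 3.003 -- premium output
```

The 37x gap between those two chains is why prototyping on the cheap models first matters.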
### Cost-Conscious Usage
- Use cheaper models for prototyping (`flux_schnell`, `hailuo`)
- Test with small batches before large-scale generation
- Monitor costs with built-in tracking
## 🧪 Testing
```bash
# Quick tests
python tests/run_all_tests.py --quick
```
📋 See [tests/README.md](tests/README.md) for the complete testing guide.
## 💰 Provider Cost Reference
### Estimation
- **FAL AI Video**: ~$0.05-0.10 per video
- **FAL AI Text-to-Video**: ~$0.08 (MiniMax) to $2.50-6.00 (Google Veo 3)
- **FAL AI Avatar**: ~$0.02-0.05 per video
- **FAL AI Images**: ~$0.001-0.01 per image
- **Text-to-Speech**: Varies by usage (ElevenLabs pricing)
### Best Practices
1. Always run `test_setup.py` first (FREE)
2. Use cost estimation in pipeline manager
3. Start with cheaper models for testing
4. Monitor usage through provider dashboards
## 🔄 Development Workflow
### Making Changes
```bash
# Make your changes to the codebase
git add .
git commit -m "Your changes"
git push origin main
```
### Testing Installation
```bash
# Create test environment
python3 -m venv test_env
source test_env/bin/activate
# Install and test
pip install -e .
ai-content-pipeline --help
```
## 📋 Available Commands
### AI Content Pipeline Commands
- `ai-content-pipeline list-models` - List all available models
- `ai-content-pipeline generate-image` - Generate image from text
- `ai-content-pipeline create-video` - Create video from text
- `ai-content-pipeline run-chain` - Run custom YAML pipeline
- `ai-content-pipeline create-examples` - Create example configs
- `aicp` - Shortened alias for all commands
### Individual Package Commands
See [CLAUDE.md](CLAUDE.md) for detailed commands for each package.
## 📚 Documentation
- **[Project Instructions](CLAUDE.md)** - Comprehensive development guide
- **[Documentation](docs/)** - Additional documentation and guides
- **Package READMEs** - Each package has its own README with specific instructions
## 🏗️ Architecture
- **Unified Package Structure** - Single `setup.py` with consolidated dependencies
- **Consolidated Configuration** - Single `.env` file for all services
- **Modular Design** - Each service can be used independently or through the unified pipeline
- **Parallel Execution** - Optional parallel processing for improved performance
- **Cost-Conscious Design** - Built-in cost estimation and management
## 📚 Resources
### 🚀 AI Content Pipeline Resources
- [Pipeline Documentation](packages/core/ai_content_pipeline/docs/README.md)
- [Getting Started Guide](packages/core/ai_content_pipeline/docs/GETTING_STARTED.md)
- [YAML Configuration Reference](packages/core/ai_content_pipeline/docs/YAML_CONFIGURATION.md)
- [Parallel Execution Design](packages/core/ai_content_pipeline/docs/parallel_pipeline_design.md)
### Google Veo Resources
- [Veo API Documentation](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/veo-video-generation)
- [Google GenAI SDK](https://github.com/google/generative-ai-python)
- [Vertex AI Console](https://console.cloud.google.com/vertex-ai)
### FAL AI Resources
- [FAL AI Platform](https://fal.ai/)
- [MiniMax Hailuo Documentation](https://fal.ai/models/fal-ai/minimax-video-01)
- [Kling Video 2.1 Documentation](https://fal.ai/models/fal-ai/kling-video/v2.1/standard/image-to-video/api)
- [FAL AI Avatar Documentation](https://fal.ai/models/fal-ai/avatar-video)
- [ThinkSound API Documentation](https://fal.ai/models/fal-ai/thinksound/api)
- [Topaz Video Upscale Documentation](https://fal.ai/models/fal-ai/topaz/upscale/video/api)
### Text-to-Speech Resources
- [ElevenLabs API Documentation](https://elevenlabs.io/docs/capabilities/text-to-speech)
- [OpenRouter Platform](https://openrouter.ai/)
- [ElevenLabs Voice Library](https://elevenlabs.io/app/speech-synthesis/text-to-speech)
- [Text-to-Dialogue Documentation](https://elevenlabs.io/docs/cookbooks/text-to-dialogue)
- [Package Migration Guide](packages/services/text-to-speech/docs/MIGRATION_GUIDE.md)
### Additional Documentation
- [Project Instructions](CLAUDE.md) - Comprehensive development guide
- [Documentation](docs/) - Additional documentation and guides
- [Package Organization](docs/repository_organization_guide.md) - Package structure guide
## 🤝 Contributing
1. Follow the development patterns in [CLAUDE.md](CLAUDE.md)
2. Add tests for new features
3. Update documentation as needed
4. Test installation in fresh virtual environment
5. Commit with descriptive messages
"upload_time_iso_8601": "2025-07-14T06:27:47.169721Z",
"url": "https://files.pythonhosted.org/packages/96/d7/60fac39a1e4836e19e303ae1992cdc5a2c19e43af1e4dc3552280a9a1980/video_ai_studio-1.0.15.tar.gz",
"yanked": false,
"yanked_reason": null
}
],
"upload_time": "2025-07-14 06:27:47",
"github": true,
"gitlab": false,
"bitbucket": false,
"codeberg": false,
"github_user": "donghaozhang",
"github_project": "veo3-fal-video-ai",
"travis_ci": false,
"coveralls": false,
"github_actions": true,
"requirements": [
{
"name": "python-dotenv",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "requests",
"specs": [
[
">=",
"2.31.0"
]
]
},
{
"name": "typing-extensions",
"specs": [
[
">=",
"4.0.0"
]
]
},
{
"name": "pyyaml",
"specs": [
[
">=",
"6.0"
]
]
},
{
"name": "pathlib2",
"specs": [
[
">=",
"2.3.7"
]
]
},
{
"name": "argparse",
"specs": [
[
">=",
"1.4.0"
]
]
},
{
"name": "fal-client",
"specs": [
[
">=",
"0.4.0"
]
]
},
{
"name": "replicate",
"specs": [
[
">=",
"0.15.0"
]
]
},
{
"name": "openai",
"specs": [
[
">=",
"1.0.0"
],
[
"<",
"2.0.0"
]
]
},
{
"name": "google-cloud-aiplatform",
"specs": [
[
">=",
"1.20.0"
]
]
},
{
"name": "google-cloud-storage",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "google-auth",
"specs": [
[
">=",
"2.0.0"
]
]
},
{
"name": "google-generativeai",
"specs": [
[
">=",
"0.2.0"
]
]
},
{
"name": "elevenlabs",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "Pillow",
"specs": [
[
">=",
"10.0.0"
]
]
},
{
"name": "moviepy",
"specs": [
[
">=",
"1.0.3"
]
]
},
{
"name": "ffmpeg-python",
"specs": [
[
">=",
"0.2.0"
]
]
},
{
"name": "aiohttp",
"specs": [
[
">=",
"3.8.0"
]
]
},
{
"name": "httpx",
"specs": [
[
">=",
"0.25.0"
]
]
},
{
"name": "jupyter",
"specs": [
[
">=",
"1.0.0"
]
]
},
{
"name": "ipython",
"specs": [
[
">=",
"8.0.0"
]
]
},
{
"name": "notebook",
"specs": [
[
">=",
"7.0.0"
]
]
},
{
"name": "pytest",
"specs": [
[
">=",
"7.0.0"
]
]
},
{
"name": "pytest-asyncio",
"specs": [
[
">=",
"0.21.0"
]
]
}
],
"lcname": "video-ai-studio"
}