Refactor high-complexity React components in Dify frontend. Use when `pnpm analyze-component...
npx skills add michaelboeding/skills --skill "video-producer-agent"
Install specific skill from multi-skill repository
# Description
>
# SKILL.md
name: video-producer-agent
description: >
Use this skill to create complete videos with voiceover and music.
Triggers: "create video", "product video", "explainer video", "promo video", "demo video",
"training video", "ad video", "commercial", "marketing video", "video with voiceover",
"video with music", "brand video", "testimonial video"
Orchestrates: script, voiceover, background music, video clips/images, and final assembly.
Video Producer
Create complete videos with voiceover, music, and visuals.
This is an orchestrator skill that combines:
- Script/storyboard generation (Claude)
- Voiceover synthesis (Gemini TTS)
- Background music (Lyria)
- Video clip generation (Veo 3.1) or image animation
- Final assembly (FFmpeg via media-utils)
Workflow
Step 1: Gather Requirements (REQUIRED)
β οΈ DO NOT skip this step. DO NOT run init_project.py until you have ALL answers.
Use interactive questioning β ask ONE question at a time, wait for the response, then ask the next. This creates a collaborative spec-driven process.
Question Flow
β οΈ Use the AskUserQuestion tool for each question below. Do not just print questions in your response β use the tool to create interactive prompts with the options shown.
Q1: Subject
"I'll create that video! First β what's it about?
(e.g., product launch, brand story, tutorial, explainer β or describe your own)"
Wait for response.
Q2: Duration
"How long should the video be?
- 15 seconds (quick hook)
- 30 seconds (standard ad)
- 60 seconds (explainer)
- 2+ minutes (detailed)
- Or specify your own duration"
Wait for response.
Q3: Style
"What visual style?
- Premium/luxury
- Fun/playful
- Corporate/professional
- Dramatic/cinematic
- Minimal/clean
- Or describe your own style"
Wait for response.
Q4: Assets
"Do you have existing images or video clips to use?
- No, generate everything
- Yes, I have images (provide paths)
- Yes, I have video clips (provide paths)"
Wait for response.
Q5: Audio Strategy
"How should we handle audio?
- Custom β I generate voiceover + background music
- Veo native β Use Veo's built-in dialogue/SFX/ambient
- Silent β No audio, add later"
Wait for response.
Q6: Voice (if custom audio)
"What voice tone for the voiceover?
- Professional
- Friendly/warm
- Energetic
- Calm/soothing
- Dramatic
- Or describe your own tone"
Wait for response.
Q7: Music (if custom audio)
"What music vibe?
- Modern electronic
- Cinematic/epic
- Upbeat pop
- Ambient/chill
- Corporate
- Or describe your own style"
Wait for response.
Q8: Format
"What aspect ratio?
- 16:9 (YouTube, web)
- 9:16 (TikTok, Reels, Shorts)
- 1:1 (Instagram feed)"
Wait for response.
Q9: Resolution
"What resolution?
- 720p (faster generation)
- 1080p (standard HD)"
Wait for response.
Q10: Model
"Which Veo model?
veo-3.1β Latest, highest quality (default)veo-3.1-fastβ Faster generation, slightly lower qualityveo-3β Previous generationveo-3-fastβ Previous gen, faster"
Wait for response.
Quick Reference
| Question | Determines |
|---|---|
| Subject | Scene content and prompts |
| Duration | Scene count (Veo clips must be 4, 6, or 8 seconds) |
| Style | Visual prompts and music selection |
| Assets | Generate vs use existing |
| Audio | custom, veo_audio, or silent |
| Voice | TTS voice selection |
| Music | Lyria prompt |
| Format | Aspect ratio for Veo |
| Resolution | 720p or 1080p output quality |
| Model | veo-3.1, veo-3.1-fast, veo-3, veo-3-fast |
Step 2: Initialize Project
Once you have the user's answers, initialize the project with their preferences:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/init_project.py \
--name "Product Launch Video" \
--duration 30 \
--aspect-ratio 16:9 \
--audio-strategy custom \
--scenes 5
Step 3: Configure project.json
Edit project.json with scene prompts, voiceover text, and music style based on user's answers.
Step 4: Assemble the Video
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/assemble.py \
--project ~/my_video_project/
Project Structure
When you initialize a project, this folder structure is created:
my_project/
βββ project.json # Configuration: scenes, voiceover, music, settings
βββ storyboard.md # Planning document for the video
βββ scenes/ # Generated video clips from Veo
β βββ scene1_intro.mp4
β βββ scene2_features.mp4
β βββ scene3_cta.mp4
βββ audio/ # Audio assets
β βββ voiceover.wav # Generated voiceover
β βββ background_music.wav # Generated music
β βββ final_mix.mp3 # Mixed audio track
βββ work/ # Intermediate files (auto-generated)
β βββ silent_scene1.mp4
β βββ video_concatenated.mp4
β βββ ...
βββ output/ # Final deliverables
βββ product_launch_video_final.mp4
Scripts
init_project.py
Initialize a new video project with folder structure and templates.
# Basic project
python3 init_project.py --name "My Video" --duration 30
# Create in specific directory
python3 init_project.py --name "Demo Video" --output ~/Videos/
# Vertical video for social
python3 init_project.py --name "Instagram Reel" --aspect-ratio 9:16 --duration 15
# Use Veo's native audio (no custom voiceover/music)
python3 init_project.py --name "Cinematic Scene" --audio-strategy veo_audio
# More scenes
python3 init_project.py --name "Long Video" --duration 60 --scenes 5
Options:
| Option | Default | Description |
|--------|---------|-------------|
| --name | required | Project name |
| --output | current dir | Parent directory |
| --duration | 30 | Target duration in seconds |
| --aspect-ratio | 16:9 | 16:9, 9:16, 1:1, 4:3 |
| --audio-strategy | custom | custom, veo_audio, silent |
| --scenes | 3 | Number of scene placeholders |
assemble.py
Orchestrate the full video assembly pipeline.
# Full pipeline (generate everything + assemble)
python3 assemble.py --project ~/my_project/
# Skip generation (use existing scene/audio files)
python3 assemble.py --project ~/my_project/ --skip-generation
# Dry run (show what would be done)
python3 assemble.py --project ~/my_project/ --dry-run
Pipeline steps:
1. Generate video scenes (Veo 3.1)
2. Strip audio from scenes (if custom audio)
3. Generate voiceover (Gemini TTS)
4. Generate background music (Lyria)
5. Mix voiceover + music
6. Concatenate video clips
7. Merge audio with video
8. Output final video
project.json Configuration
{
"name": "Product Launch Video",
"duration_target": 30,
"aspect_ratio": "16:9",
"resolution": "720p",
"audio_strategy": "custom",
"scenes": [
{
"id": 1,
"name": "scene1_hero",
"prompt": "Cinematic slow zoom on premium product, dramatic lighting, high-end commercial style",
"duration": 6,
"notes": "Music only, no voiceover"
},
{
"id": 2,
"name": "scene2_features",
"prompt": "Product features demonstration, sleek animations, modern tech aesthetic",
"duration": 8,
"notes": "Voiceover starts here"
},
{
"id": 3,
"name": "scene3_cta",
"prompt": "Product with logo on clean background, call to action moment",
"duration": 6,
"notes": "Music swells, voiceover ends"
}
],
"voiceover": {
"enabled": true,
"text": "Introducing the future of audio. Crystal clear sound. All-day comfort. Experience the difference.",
"voice": "Charon",
"style": "Professional, confident, premium brand voice"
},
"music": {
"enabled": true,
"prompt": "modern electronic, premium, sleek, product showcase, subtle bass",
"duration": 35,
"bpm": 100,
"brightness": 0.6
},
"assembly": {
"transition": "fade",
"transition_duration": 0.5,
"music_volume": 0.3,
"fade_in": 1.0,
"fade_out": 2.0
}
}
Audio Strategies
| Strategy | Description | Use When |
|---|---|---|
custom |
Strip Veo audio, add custom voiceover + music | Most videos |
veo_audio |
Keep Veo's generated audio (dialogue, SFX) | Cinematic scenes, dialogues |
silent |
Strip audio, output silent video | Adding audio later |
Workflow: Creating a Video
Step 1: Initialize Project
python3 init_project.py --name "Wireless Earbuds Promo" --duration 30
Step 2: Plan the Storyboard
Edit storyboard.md to plan your video structure:
## Scene 1: Hero Reveal (0-5s)
- Visual: Earbuds emerging from shadow, premium lighting
- Audio: Music only (dramatic intro)
## Scene 2: Sound Quality (5-12s)
- Visual: Sound waves, person enjoying music
- Audio: Voiceover: "Crystal clear sound. Immersive bass."
## Scene 3: Comfort (12-20s)
- Visual: Close-up of fit, person running
- Audio: Voiceover: "All-day comfort. Secure fit."
## Scene 4: CTA (20-30s)
- Visual: Product + logo
- Audio: Voiceover: "Experience the difference." + music swell
Step 3: Configure project.json
Fill in the scene prompts, voiceover text, and music style based on your storyboard.
Step 4: Assemble
python3 assemble.py --project ~/wireless_earbuds_promo/
Step 5: Review and Iterate
Check the output in output/. If adjustments needed:
# Re-run with existing scenes (just re-mix audio)
python3 assemble.py --project ~/wireless_earbuds_promo/ --skip-generation
What You Can Create
| Type | Example |
|---|---|
| Product video | 30s hero video showcasing a product |
| Explainer video | How-to or feature explanation |
| Promo/ad video | Marketing advertisement |
| Demo video | Product demonstration |
| Training video | Internal training content |
| Testimonial | Customer quote style video |
| Brand video | Company/brand story |
Prerequisites
GOOGLE_API_KEY- For Veo (video), Gemini TTS (voice), Lyria (music)- FFmpeg installed:
brew install ffmpeg
Video Styles & Music Pairings
| Style | Music Prompt | Voice |
|---|---|---|
| Premium/Luxury | "elegant, minimal, ambient, sophisticated" | Charon (informative) |
| Tech/Modern | "electronic, futuristic, clean, innovative" | Kore (firm) |
| Fun/Playful | "upbeat, cheerful, acoustic, positive" | Puck (upbeat) |
| Corporate | "professional, inspiring, orchestral lite" | Orus (firm) |
| Lifestyle | "chill, aspirational, indie, warm" | Aoede (breezy) |
| Dramatic/Cinematic | "epic, orchestral, emotional, building" | Gacrux (mature) |
Common Video Structures
Product Video (30s)
0-5s: Hero shot (music only)
5-15s: Features (voiceover + music)
15-25s: Lifestyle/use case (voiceover + music)
25-30s: Logo + CTA (music fade)
Explainer Video (60s)
0-5s: Hook/problem statement
5-20s: Solution introduction
20-45s: How it works (3 steps)
45-55s: Benefits summary
55-60s: CTA
Testimonial Video (45s)
0-5s: Intro/name card
5-35s: Testimonial quote (multiple scenes)
35-45s: Product shot + logo
Manual Workflow (Without Scripts)
If you prefer to run each step manually:
Generate scenes:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
--batch scenes.json
Generate voiceover:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/voice-generation/scripts/gemini_tts.py \
--text "Your voiceover text..." \
--voice Charon \
--style "Professional, warm"
Generate music:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
--prompt "modern electronic, premium" \
--duration 35
Strip audio from clips:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_strip_audio.py \
-i scene*.mp4
Concatenate videos:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_concat.py \
-i silent_*.mp4 --transition fade -o video.mp4
Mix audio:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_mix.py \
--voice voiceover.wav --music music.wav -o audio.mp3
Merge audio + video:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_audio_merge.py \
--video video.mp4 --audio audio.mp3 -o final.mp4
Input Files You Can Provide
| File Type | How It's Used |
|---|---|
| Product images | Animate with Veo as first frame |
| Logo (PNG) | Overlay on final scene |
| Existing voiceover | Place in audio/voiceover.wav |
| Brand music | Place in audio/background_music.wav |
| Video clips | Place in scenes/ folder |
| Script/copy | Use for voiceover text |
Limitations
- Veo video duration: Max 8 seconds per clip (concatenate for longer)
- Veo 3.1 includes audio: All clips have generated audio (strip if using custom)
- Processing time: Video generation takes 1-3 minutes per clip
- Resolution: Currently 720p or 1080p (1080p for 8s only)
Error Handling
| Error | Solution |
|---|---|
| "GOOGLE_API_KEY not set" | Set up API key per README |
| "FFmpeg not found" | Install: brew install ffmpeg |
| "project.json not found" | Run init_project.py first |
| Video generation timeout | Retry, or use shorter duration |
| Audio/video sync issues | Adjust scene durations |
Example Prompts
Simple:
"Create a 30-second product video for my new coffee maker"
Detailed:
"Create a 45-second product video for our new wireless earbuds. Premium, luxury feel. I have product photos attached. Professional male voiceover. Modern electronic music. 16:9 for YouTube."
With project:
"Initialize a video project called 'App Demo' with 5 scenes, 60 seconds total, vertical format for TikTok"
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents:
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.