Refactor high-complexity React components in Dify frontend. Use when `pnpm analyze-component...
npx skills add aviz85/claude-skills-library --skill "kinetic-video-creator"
Install specific skill from multi-skill repository
# Description
Create professional kinetic typography videos from scratch. Includes speech writing, TTS with emotional dynamics, music generation, and animated text. Use for: promo videos, explainers, social content, inspirational speeches, product launches.
# SKILL.md
name: kinetic-video-creator
description: "Create professional kinetic typography videos from scratch. Includes speech writing, TTS with emotional dynamics, music generation, and animated text. Use for: promo videos, explainers, social content, inspirational speeches, product launches."
argument-hint: [topic] [tone: inspirational/dramatic/energetic/calm]
enhancedBy:
- speech-generator: "TTS with aviz's cloned voice - optimized for Hebrew"
- transcribe: "Word-level timing for animation sync"
- music-generator: "Background music matching emotional arc"
- youtube-uploader: "Optional publishing"
Kinetic Video Creator
Create stunning kinetic typography videos with AI-generated speech, music, and dynamic animations.
Workflow Overview
- Script โ Craft emotionally compelling speech text
- Speech โ Use
/speech-generatorskill for TTS - Transcribe โ Use
/transcribeskill for word timing - Music โ Use
/music-generatorskill for background - Merge โ Combine speech + music
- Animate โ Create kinetic typography in Remotion
- Render โ Produce final video
- Publish โ Use
/youtube-uploaderskill (optional)
Step 1: Craft the Script
Language Selection
Hebrew (Recommended for aviz's voice):
- Use Hebrew emotional directions in brackets
- Add natural Hebrew filler words
- See "Hebrew Script Guidelines" section below
English:
- Use English emotional directions
- See "English Script Guidelines" section below
Hebrew Script Guidelines
aviz's cloned voice is optimized for Hebrew. Use these Hebrew directions:
Hebrew Emotional Directions
| Direction | Effect |
|---|---|
[ื ืฉืืื ืขืืืงื] |
Deep breath, pause |
[ืืืชืืืืืช] |
Enthusiastic |
[ืืจืฆืื ืืช] |
Serious tone |
[ืืขืฆื] |
Sad, emotional |
[ืืฉืงื] |
Quiet, intimate |
[ืืืจ] |
Fast pace |
[ืืื ืืืืืจืืจ] |
Slow and clear |
[ืฉืืื] |
Question tone |
[ืืคืชืขื] |
Surprise |
[ืฆืืืง ืงื] |
Light laugh |
[ืืืื] |
Warm tone |
[ืืืื] |
Powerful, emphatic |
Hebrew Filler Words (for natural flow)
ืืื...- hesitationืืื...- thinkingืืืืื...- likeื ื...- wellืืืืื...- come onืืงืืฆืืจ...- in short...- pause
Hebrew Script Example
[ื ืฉืืื ืขืืืงื] ืืฉ ืจืืข...
[ืืื ืืืืืจืืจ] ืจืืข ืฉืืื ืืฉืชื ื.
[ืืจืฆืื ืืช] ืืื ืฉืื ืืื ืืคืืืื ืืฉืืื ืชืื ืืง.
ืื ืืืื ืืื ืืื ืขื ืืืกื ืืืืืื.
[ืืืชืืืืืช] ืืื ืืื ืฉืื? ืืื ืืื ืกืคืืจืืื ืืฆืืืื!
[ืืืื] ืืื ืชืืื ืืืืื... ืฉืืคืฉืจ ืืืืฉืื ืื ืืืื.
[ืืขืฆื] ืืฉืืืืชื ืื ืืืฉ ืขืฉืจื... ืืื ื ืคืืจ.
[ืืืื] ืืืืื ืืื? ืืคื ืืืฉืืื ืฉืื.
[ืืืื] ืืขืืืจ ืืื ืฉืื ืืืจืื ืืืืฉืื ืืช ืืืืืืืช ืฉืืื.
English Script Guidelines
English Emotional Directions
| Direction | Effect |
|---|---|
[pause] |
Brief pause |
[long pause] |
Extended pause |
[slowly] |
Slower delivery |
[faster] |
Quickened pace |
[whisper] |
Softer, intimate |
[emphatic] |
Strong emphasis |
[building] |
Increasing intensity |
[warm] |
Friendly tone |
[dramatic] |
Theatrical |
[matter-of-fact] |
Conversational |
English Script Template
[HOOK - 5-10 seconds]
[dramatic pause] Opening line that grabs attention.
[slowly, with weight] The provocative statement.
[BUILD - 20-40 seconds]
[building intensity] Establish the context.
[pause for effect] Key insight moment.
[PEAK - 20-30 seconds]
[powerful, emphatic] The main message.
[pause] Let it land.
[RESOLVE - 15-25 seconds]
[warm, inspiring] Paint the vision.
[final beat] Memorable closing.
Step 2: Generate Speech
Use the speech-generator skill:
/speech-generator [path/to/script.txt] -o [path/to/speech.mp3]
Or invoke directly:
cd ~/.claude/skills/speech-generator/scripts
npx ts-node generate_speech.ts -f script.txt -o speech.mp3
Important: The speech-generator uses aviz's cloned voice, which works best with Hebrew text and Hebrew emotional directions.
Step 3: Transcribe for Timing
Use the transcribe skill:
/transcribe [path/to/speech.mp3] --json
Or invoke directly:
cd ~/.claude/skills/transcribe/scripts
npx ts-node transcribe.ts -i speech.mp3 -o transcript.srt --json
Output: transcript_transcript.json with word-level timing data.
Step 4: Generate Background Music
Use the music-generator skill:
/music-generator [composition.json] -o background_music.mp3
Music Composition Template
{
"duration_ms": 75000,
"instrumental": true,
"positive_global_styles": ["cinematic", "inspirational"],
"negative_global_styles": ["aggressive", "chaotic"],
"sections": [
{
"section_name": "Hook - Mysterious",
"duration_ms": 12000,
"positive_local_styles": ["suspenseful", "soft"],
"negative_local_styles": ["loud"],
"lines": []
},
{
"section_name": "Build - Rising",
"duration_ms": 25000,
"positive_local_styles": ["hopeful", "building"],
"negative_local_styles": ["slow"],
"lines": []
},
{
"section_name": "Peak - Triumphant",
"duration_ms": 20000,
"positive_local_styles": ["triumphant", "uplifting"],
"negative_local_styles": ["quiet"],
"lines": []
}
]
}
Step 5: Merge Audio
ffmpeg -y \
-i speech.mp3 \
-i background_music.mp3 \
-filter_complex "[0:a]volume=1.0[speech];[1:a]volume=0.15[music];[speech][music]amix=inputs=2:duration=first[out]" \
-map "[out]" -c:a libmp3lame -q:a 2 \
final_audio.mp3
Step 6: Create Remotion Composition
Project Location
cd /Users/aviz/remotion-assistant
Default Template: SequenceComposition (One Word Per Screen)
Recommended: Use SequenceComposition for maximum impact - displays one word at a time with full-screen typography.
import { SequenceComposition } from '../templates/SequenceComposition';
import transcriptData from '../../projects/[project]/transcript_transcript.json';
const WORD_TIMINGS = transcriptData.words
.filter((w) => w.word.trim() !== '')
.map((w) => ({
word: w.word,
start: w.start,
end: w.end,
}));
export const MyVideo: React.FC = () => {
return (
<SequenceComposition
wordTimings={WORD_TIMINGS}
audioFile="[project]/final_audio.mp3"
baseFontSize={200}
dustEnabled={true}
lightBeamsEnabled={true}
centerGlowEnabled={true}
glowIntensity={1}
anticipationFrames={5}
colorSchemeStart={0}
/>
);
};
Alternative: MultiWordComposition (Word Cloud)
Use for faster-paced content with multiple words on screen:
import { MultiWordComposition } from '../templates/MultiWordComposition';
Hebrew Font Support
For Hebrew text, use Heebo font:
import { loadFont } from '@remotion/google-fonts/Heebo';
const { fontFamily } = loadFont('normal', {
weights: ['400', '600', '700', '900'],
subsets: ['hebrew', 'latin'],
});
Add RTL styling:
style={{
direction: 'rtl',
fontFamily,
}}
Step 7: Render
cd /Users/aviz/remotion-assistant
npx remotion render CompositionName output.mp4
Step 8: Upload (Optional)
Use the youtube-uploader skill:
/youtube-uploader [video.mp4] --title "Title" --description "Description"
Project Structure
remotion-assistant/
โโโ public/[project]/
โ โโโ final_audio.mp3 # Audio for Remotion
โโโ projects/[project]/
โ โโโ speech.txt # Script
โ โโโ speech.mp3 # TTS output
โ โโโ transcript_transcript.json # Word timings
โ โโโ music_composition.json
โ โโโ background_music.mp3
โ โโโ final_audio.mp3 # Merged audio
โ โโโ output.mp4 # Final video
โโโ src/compositions/
โโโ [ProjectName].tsx # Composition
Quick Reference
| Step | Skill/Command |
|---|---|
| Speech | /speech-generator script.txt -o speech.mp3 |
| Transcribe | /transcribe speech.mp3 --json |
| Music | /music-generator composition.json |
| Merge | ffmpeg (see above) |
| Render | npx remotion render Name output.mp4 |
| Upload | /youtube-uploader output.mp4 |
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents:
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.