npx skills add lattifai/omni-captions-skills --skill "omnicaptions-convert"
Install specific skill from multi-skill repository
# Description
Use when converting between caption formats (SRT, VTT, ASS, TTML, Gemini MD, etc.). Supports 30+ caption formats.
# SKILL.md
name: omnicaptions-convert
description: Use when converting between caption formats (SRT, VTT, ASS, TTML, Gemini MD, etc.). Supports 30+ caption formats.
allowed-tools: Bash(omnicaptions:*)
Caption Format Conversion
Convert between 30+ caption/subtitle formats using lattifai-captions.
⚡ YouTube Workflow
# 1. Transcribe YouTube video directly
omnicaptions transcribe "https://youtube.com/watch?v=VIDEO_ID" -o transcript.md
# 2. Convert to any format
omnicaptions convert transcript.md -o output.srt
omnicaptions convert transcript.md -o output.ass
omnicaptions convert transcript.md -o output.vtt
When to Use
- Converting SRT to VTT, ASS, TTML, etc.
- Converting Gemini markdown transcript to standard caption formats
- Converting YouTube VTT (with word-level timestamps) to other formats
- Batch format conversion
When NOT to Use
- Need transcription (use `/omnicaptions:transcribe`)
- Need translation (use `/omnicaptions:translate`)
Setup
pip install https://github.com/lattifai/omni-captions-skills/raw/main/packages/lattifai_captions-0.1.0.tar.gz
pip install https://github.com/lattifai/omni-captions-skills/raw/main/packages/omnicaptions-0.1.0.tar.gz
Quick Reference
| Format | Extension | Read | Write |
|---|---|---|---|
| SRT | .srt | ✓ | ✓ |
| VTT | .vtt | ✓ | ✓ |
| ASS/SSA | .ass | ✓ | ✓ |
| TTML | .ttml | ✓ | ✓ |
| Gemini MD | .md | ✓ | ✓ |
| JSON | .json | ✓ | ✓ |
| TXT | .txt | ✓ | ✓ |
Full list: SRT, VTT, ASS, SSA, TTML, DFXP, SBV, SUB, LRC, JSON, TXT, TSV, Audacity, Audition, FCPXML, EDL, and more.
CLI Usage
# Convert (auto-output to same directory, only changes extension)
omnicaptions convert input.srt -t vtt # → ./input.vtt
omnicaptions convert transcript.md # → ./transcript.srt
# Specify output file or directory
omnicaptions convert input.srt -o output/ # → output/input.srt
omnicaptions convert input.srt -o output.vtt # → output.vtt
# Specify format explicitly
omnicaptions convert input.txt -o out.srt -f txt -t srt
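For batch conversion, one option is to drive the CLI from a short script. The sketch below only builds the command lines, using the `convert` subcommand and `-t` flag shown above. The helper name `build_convert_commands` is hypothetical, and the predicted output path assumes the same-directory, extension-only behavior described in the first comment of this section.

```python
from pathlib import Path


def build_convert_commands(paths, target_format):
    """Return (command, expected_output) pairs, one per input caption file.

    Assumption: `omnicaptions convert <input> -t <format>` writes next to
    the input and only changes the extension, as in the CLI examples.
    """
    jobs = []
    for path in map(Path, paths):
        # Same invocation shape as `omnicaptions convert input.srt -t vtt`.
        command = ["omnicaptions", "convert", str(path), "-t", target_format]
        expected_output = path.with_suffix("." + target_format)
        jobs.append((command, expected_output))
    return jobs
```

Each command list can be handed to `subprocess.run(command, check=True)`. For instance, `build_convert_commands(Path("subs").glob("*.srt"), "vtt")` would cover every SRT file in a hypothetical `subs/` directory.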
ASS Style Presets
When converting to ASS format, use --style to apply preset styles:
omnicaptions convert input.srt -o output.ass --style default # White text, bottom
omnicaptions convert input.srt -o output.ass --style top # White text, top
omnicaptions convert input.srt -o output.ass --style bilingual # White + Yellow (for bilingual)
omnicaptions convert input.srt -o output.ass --style yellow # Yellow text, bottom
| Preset | Position | Line 1 | Line 2 | Use Case |
|---|---|---|---|---|
| `default` | Bottom | White | White | Standard captions |
| `top` | Top | White | White | When bottom is occupied |
| `bilingual` | Bottom | White | Yellow | Bilingual captions (original + translation) |
| `yellow` | Bottom | Yellow | Yellow | High visibility |
Bilingual Example
If your SRT has two-line captions like:
1
00:00:01,000 --> 00:00:03,000
Hello World
你好世界
Use --style bilingual or custom colors:
# Preset: white + yellow
omnicaptions convert bilingual.srt -o output.ass --style bilingual
# Custom colors: green English + yellow Chinese
omnicaptions convert bilingual.srt -o output.ass --line1-color "#00FF00" --line2-color "#FFFF00"
# Mix preset with custom line2 color
omnicaptions convert bilingual.srt -o output.ass --style default --line2-color "#FF6600"
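To make the line 1 / line 2 distinction concrete, here is a minimal sketch of how a two-line SRT cue could be split into an original line and a translation line. The function name `split_bilingual_cue` is hypothetical, and this is not a description of how omnicaptions parses SRT internally; it only illustrates the structure that the bilingual options act on.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3}) --> (\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to milliseconds."""
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def split_bilingual_cue(block):
    """Split one SRT cue (index, timing, text lines) into its parts."""
    lines = block.strip().splitlines()
    if len(lines) < 2:
        raise ValueError("cue needs an index line and a timing line")
    match = TIMING.fullmatch(lines[1].strip())
    if match is None:
        raise ValueError(f"unrecognized timing line: {lines[1]!r}")
    fields = match.groups()
    text = lines[2:]
    return {
        "start_ms": to_ms(*fields[:4]),
        "end_ms": to_ms(*fields[4:]),
        "line1": text[0] if text else "",          # original
        "line2": text[1] if len(text) > 1 else "",  # translation
    }
```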
Custom Color Options
| Option | Description |
|---|---|
| `--line1-color "#RRGGBB"` | First line (original) color |
| `--line2-color "#RRGGBB"` | Second line (translation) color |
Common colors: #FFFFFF (white), #FFFF00 (yellow), #00FF00 (green), #00FFFF (cyan), #FF6600 (orange)
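ASS override tags store colors in blue-green-red order (`&HBBGGRR&`), so a `#RRGGBB` value has to be reordered somewhere before it reaches the output file. The sketch below shows that reordering. `rgb_hex_to_ass` is a hypothetical helper for illustration, not part of the omnicaptions API.

```python
import re


def rgb_hex_to_ass(color):
    """Reorder a "#RRGGBB" string into the ASS "&HBBGGRR&" form."""
    match = re.fullmatch(
        r"#([0-9a-fA-F]{2})([0-9a-fA-F]{2})([0-9a-fA-F]{2})", color
    )
    if match is None:
        raise ValueError(f"expected #RRGGBB, got {color!r}")
    red, green, blue = (part.upper() for part in match.groups())
    # ASS lists the blue byte first and the red byte last.
    return f"&H{blue}{green}{red}&"
```

Under this mapping, orange (`#FF6600`) becomes `&H0066FF&`, while symmetric values such as white are unchanged apart from the prefix and suffix.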
Font Size and Resolution
Font size is auto-calculated based on video resolution. Resolution is detected from (priority order):
1. `--resolution` argument (e.g., `1080p`, `4k`, `1920x1080`)
2. `--video` argument (uses ffprobe to detect)
3. `.meta.json` file (saved by `omnicaptions download`)
4. Default: 1080p
# Auto-detect from .meta.json (saved by download command)
omnicaptions convert abc123.en.srt -o abc123.en.ass --karaoke
# Specify resolution directly
omnicaptions convert input.srt -o output.ass --resolution 4k
omnicaptions convert input.srt -o output.ass --resolution 720p
omnicaptions convert input.srt -o output.ass --resolution 1920x1080
# Detect from video file (uses ffprobe)
omnicaptions convert input.srt -o output.ass --video video.mp4
# Override auto-calculated fontsize
omnicaptions convert input.srt -o output.ass --resolution 4k --fontsize 80
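The priority order described above amounts to a first-match fallback chain. A minimal sketch, assuming each source has already been read into an optional string; `pick_resolution` and its parameter names are hypothetical:

```python
def pick_resolution(cli_resolution=None, video_resolution=None, meta_resolution=None):
    """Return the first available resolution, falling back to 1080p.

    The argument order mirrors the documented priority:
    --resolution, then --video (ffprobe), then .meta.json.
    """
    for candidate in (cli_resolution, video_resolution, meta_resolution):
        if candidate:
            return candidate
    return "1080p"
```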
| Resolution | PlayRes | Auto FontSize |
|---|---|---|
| 480p | 854×480 | 24 |
| 720p | 1280×720 | 32 |
| 1080p | 1920×1080 | 48 (default) |
| 2K | 2560×1440 | 64 |
| 4K | 3840×2160 | 96 |
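The table can be read as a lookup keyed on frame height. The sketch below encodes exactly those five rows and accepts the three spellings shown for `--resolution` (`1080p`, `4k`, `1920x1080`). `auto_fontsize` is a hypothetical helper, the `2k` alias is an assumption taken from the table's row label, and since the behavior for heights outside the table is not documented here, the sketch simply rejects them.

```python
import re

# height -> (PlayRes width, PlayRes height, font size), copied from the table.
PRESETS = {
    480: (854, 480, 24),
    720: (1280, 720, 32),
    1080: (1920, 1080, 48),
    1440: (2560, 1440, 64),
    2160: (3840, 2160, 96),
}
ALIASES = {"2k": 1440, "4k": 2160}  # "2k" is assumed, "4k" appears in the examples


def parse_height(resolution):
    """Extract the frame height from "1080p", "4k", or "1920x1080"."""
    value = resolution.strip().lower()
    if value in ALIASES:
        return ALIASES[value]
    match = re.fullmatch(r"(\d+)p", value)
    if match:
        return int(match.group(1))
    match = re.fullmatch(r"(\d+)x(\d+)", value)
    if match:
        return int(match.group(2))
    raise ValueError(f"unrecognized resolution: {resolution!r}")


def auto_fontsize(resolution="1080p"):
    """Look up the font size listed in the table for this resolution."""
    height = parse_height(resolution)
    if height not in PRESETS:
        raise ValueError(f"no preset for height {height}")
    return PRESETS[height][2]
```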
Karaoke Mode
Generate karaoke subtitles with word-level highlighting. Requires word-level timing (use LaiCut alignment first).
# Basic karaoke (sweep effect - gradual fill)
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke
# Different effects
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke sweep # Gradual fill (default)
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke instant # Instant highlight
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.ass --karaoke outline # Outline then fill
# LRC karaoke (enhanced word timestamps)
omnicaptions convert lyrics_LaiCut.json -o lyrics_LaiCut_karaoke.lrc --karaoke
| Effect | ASS Tag | Description |
|---|---|---|
| `sweep` | `\kf` | Gradual fill from left to right (default) |
| `instant` | `\k` | Instant word highlight |
| `outline` | `\ko` | Outline fills, then text fills |
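In an ASS file, each karaoke tag carries a duration in centiseconds and precedes the text it times. The sketch below builds the text of one dialogue line from word timings using the tags from the table. `karaoke_text` and the `(word, start_ms, end_ms)` input shape are hypothetical, and gaps between words are ignored for brevity, so this is an illustration of the tags rather than the converter's actual output.

```python
# Effect name -> ASS karaoke override tag, as listed in the table.
EFFECT_TAGS = {"sweep": r"\kf", "instant": r"\k", "outline": r"\ko"}


def karaoke_text(words, effect="sweep"):
    """Build karaoke-tagged text from (word, start_ms, end_ms) tuples."""
    tag = EFFECT_TAGS[effect]
    parts = []
    for word, start_ms, end_ms in words:
        # ASS karaoke durations are expressed in centiseconds.
        centiseconds = round((end_ms - start_ms) / 10)
        parts.append(f"{{{tag}{centiseconds}}}{word}")
    return " ".join(parts)
```

For two words lasting 500 ms and 700 ms, the sweep effect yields `{\kf50}Hello {\kf70}World`.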
Karaoke Workflow
# 1. Align with LaiCut (get word-level timing in JSON)
omnicaptions LaiCut audio.mp3 lyrics.txt
# 2. Convert to karaoke ASS
omnicaptions convert lyrics_LaiCut.json -o karaoke.ass --karaoke
# Or combine with style
omnicaptions convert lyrics_LaiCut.json -o karaoke.ass --karaoke --style yellow
Python Usage
from omnicaptions import Caption
# Load any format
cap = Caption.read("input.srt")
# Write to any format
cap.write("output.vtt")
cap.write("output.ass")
cap.write("output.ttml")
Common Mistakes
| Mistake | Fix |
|---|---|
| Format not detected | Use --from / --to flags |
| Missing timestamps | Source format must have timing info |
| Encoding error | Specify encoding="utf-8" |
Related Skills
| Skill | Use When |
|---|---|
| `/omnicaptions:transcribe` | Need transcript from audio/video |
| `/omnicaptions:translate` | Translate with Gemini API |
| `/omnicaptions:translate` | Translate with Claude (no API key) |
| `/omnicaptions:download` | Download video/captions first |
Workflow Examples
# Transcribe → Convert → Translate (with Claude)
/omnicaptions:transcribe video.mp4
/omnicaptions:convert video_GeminiUnd.md -o video.srt
/omnicaptions:translate video.srt -l zh --bilingual