Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech...
Create AI video content with HeyGen: generate avatar videos, translate content, and manage video projects.
Generates images and videos using MuleRouter or MuleRun multimodal APIs. Text-to-Image, Image-to-Image, Text-to-Video, Image-to-Video, video editing (VACE, keyframe interpolation). Use when the...
Use the VLM Run CLI (`vlmrun`) to interact with Orion visual AI agent. Process images, videos, and documents with natural language. Triggers: image understanding/generation, object detection, OCR,...
Fetch and summarize latest videos from priority YouTube channels. Creates notes with transcripts summarized as bullet points. Use to catch up on subscriptions without watching everything. Triggers...
Extract subtitles/transcripts from YouTube videos. Triggers: "youtube transcript", "extract subtitles", "video captions", "视频字幕", "字幕提取", "YouTube转文字", "提取字幕".
Search YouTube videos via Invidious API. Use when the user wants to find, search for, or look up videos, or asks for video recommendations on a topic.
Download videos from YouTube, Bilibili, Twitter, and thousands of other sites using yt-dlp. Use when the user provides a video URL and wants to download it, extract audio (MP3), download...
Create and edit videos using Google's Veo 2 and Veo 3 models. Supports Text-to-Video, Image-to-Video, Inpainting, and Advanced Controls.
Use when the user asks about YouTube video content, wants to know what a video says, needs information from a YouTube URL, or when video transcription would answer their question.
Video/audio/image processing with FFmpeg and ImageMagick. Tools: FFmpeg (video/audio), ImageMagick (images). Capabilities: format conversion, encoding (H.264/H.265/VP9/AV1), streaming (HLS/DASH),...
Best practices for HeyGen, an AI avatar video creation API. Use when creating AI avatar videos, generating talking head videos, or integrating HeyGen with Remotion.
The art and science of creating AI-powered digital presenters, avatars, and synthetic spokespersons. This skill covers HeyGen, Synthesia, D-ID, Tavus, and the emerging landscape of photorealistic...
Extract audio from short videos (Douyin/TikTok) and transcribe it to text with timestamps. Use when the user provides a video URL and needs audio transcription.
Gong API for searching calls, transcripts, and conversation intelligence. Use when working with Gong call recordings, sales conversations, transcripts, meeting data, or conversation analytics....
Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection,...
Integrate with Home Assistant REST and WebSocket APIs. Use when making API calls, managing entity states, calling services, subscribing to events, or setting up authentication. Activates on...
This skill should be used when users want to download audio from YouTube videos as high-quality MP3 files with embedded metadata and thumbnails. Trigger this skill for requests like "download the...
Expert discovery call strategist for B2B sales. Use when preparing for discovery calls, qualifying prospects, asking effective questions, identifying pain points, mapping stakeholders, or...
Upload and manage files using Google Gemini File API via scripts/. Use for uploading images, audio, video, PDFs, and other files for use with Gemini models. Supports file upload, status checking,...