Interactive browser automation via Chrome DevTools Protocol. Use when you need to interact with web pages, test frontends, or when user interaction with a visible browser is required.
VS Code integration for viewing diffs and comparing files. Use when showing file differences to the user.
Google Drive CLI for listing, searching, uploading, downloading, and sharing files and folders.
Fetch transcripts from YouTube videos for summarization and analysis.
Google Calendar CLI for listing calendars, viewing/creating/updating events, and checking availability.
Gmail CLI for searching emails, reading threads, sending messages, managing drafts, and handling labels/attachments.
Web search and content extraction via Brave Search API. Use for searching documentation, facts, or any web content. Lightweight, no browser required.
|
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification,...
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification,...
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification,...
Quick classifier training with automatic model selection, hyperparameter tuning, and comprehensive evaluation metrics.
>
Expert in spatial audio, procedural sound design, game audio middleware, and app UX sound design. Specializes in HRTF/Ambisonics, Wwise/FMOD integration, UI sound design, and adaptive music...
Video/audio/image processing with FFmpeg and ImageMagick. Tools: FFmpeg (video/audio), ImageMagick (images). Capabilities: format conversion, encoding (H.264/H.265/VP9/AV1), streaming (HLS/DASH),...
Expert in 2000s-era music visualization (Milkdrop, AVS, Geiss) and modern WebGL implementations. Specializes in Butterchurn integration, Web Audio API AnalyserNode FFT data, GLSL shaders for...
Process and generate multimedia content using Google Gemini API for better vision capabilities. Capabilities include analyze audio files (transcription with timestamps, summarization, speech...
Process and generate multimedia content using Google Gemini API for better vision capabilities. Capabilities include analyze audio files (transcription with timestamps, summarization, speech...
Video and audio processing with FFmpeg. Use for format conversion, resizing, compression, audio extraction, and preparing assets for Remotion. Triggers include converting GIF to MP4, resizing...
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports...