Build agents that generate creative content including music, memes, podcasts, and multimedia. Covers generative models, content synthesis, style transfer, and creative control. Use when building...
Audio playback using Tone.js including players, transport, scheduling, and loading audio. Use when implementing background music, sound effects, audio synchronization, or timed audio events....
Generate podcast scripts from text content. Use Tone.js and Howler.js for audio mixing. Create intro/outro music, transitions, sound effects.
Control Spotify playback and manage playlists via an MCP server. Use when the user requests playing music, controlling Spotify, creating playlists, searching songs, or managing their Spotify library.
Process and generate multimedia content using the Google Gemini API for better vision capabilities. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech...
Expert in voice synthesis, TTS, voice cloning, podcast production, speech processing, and voice UI design via ElevenLabs integration. Specializes in vocal clarity, loudness standards (LUFS),...
Create shot lists for highlight videos: timestamp key plays, suggest music cues, and plan pacing. Includes platform-specific cuts for TikTok and YouTube.
Binding audio analysis data to visual parameters including smoothing, beat detection responses, and frequency-to-visual mappings. Use when creating audio visualizers, music-reactive animations, or...
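As a rough illustration of the smoothing, beat-detection, and frequency-to-visual mapping ideas named above, here is a minimal TypeScript sketch. Everything in it is an assumption made for illustration: the names `smooth`, `bandLevel`, `BeatDetector`, and `createMapper` are hypothetical, the input is assumed to be byte-valued frequency magnitudes (0 to 255), and the bass/treble split and thresholds are arbitrary starting points rather than recommended values.

```typescript
// Exponential moving average: higher alpha reacts faster, lower alpha is smoother.
function smooth(previous: number, current: number, alpha: number): number {
  return previous + alpha * (current - previous);
}

// Mean magnitude of bins in [startBin, endBin), normalized to 0..1.
// Assumes byte-valued magnitudes (0..255).
function bandLevel(bins: ArrayLike<number>, startBin: number, endBin: number): number {
  const end = Math.min(endBin, bins.length);
  if (end <= startBin) return 0;
  let sum = 0;
  for (let i = startBin; i < end; i++) sum += bins[i];
  return sum / (end - startBin) / 255;
}

// Simple energy-based beat detector (one of many possible approaches):
// a beat fires when the level exceeds the running average by `sensitivity`,
// and at least `cooldownFrames` frames have passed since the last beat.
class BeatDetector {
  private average: number | null = null;
  private framesSinceBeat = Infinity;
  private sensitivity: number;
  private cooldownFrames: number;
  private averageAlpha: number;

  constructor(sensitivity = 1.5, cooldownFrames = 10, averageAlpha = 0.05) {
    this.sensitivity = sensitivity;
    this.cooldownFrames = cooldownFrames;
    this.averageAlpha = averageAlpha;
  }

  update(level: number): boolean {
    if (this.average === null) {
      // First frame only seeds the running average; it never counts as a beat.
      this.average = level;
      return false;
    }
    const isBeat =
      this.framesSinceBeat >= this.cooldownFrames &&
      level > this.average * this.sensitivity;
    this.average = smooth(this.average, level, this.averageAlpha);
    this.framesSinceBeat = isBeat ? 0 : this.framesSinceBeat + 1;
    return isBeat;
  }
}

interface VisualParams {
  scale: number;  // driven by smoothed bass energy
  hue: number;    // degrees, driven by smoothed treble energy
  flash: boolean; // true on frames where a beat is detected
}

// Returns a function that maps one frame of frequency data to visual parameters.
function createMapper(detector: BeatDetector = new BeatDetector()) {
  let bass = 0;
  let treble = 0;
  return (bins: ArrayLike<number>): VisualParams => {
    const split = Math.floor(bins.length / 4); // arbitrary bass/treble boundary
    const rawBass = bandLevel(bins, 0, split);
    bass = smooth(bass, rawBass, 0.3);
    treble = smooth(treble, bandLevel(bins, split, bins.length), 0.3);
    return { scale: 1 + bass, hue: treble * 360, flash: detector.update(rawBass) };
  };
}
```

In a browser, the mapper would typically be called once per animation frame with the array filled by an analyser node; the beat detector is fed the unsmoothed bass level so that smoothing does not blur the onsets it is trying to catch.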
Expert speech-language pathologist specializing in AI-powered speech therapy, phoneme analysis, articulation visualization, voice disorders, fluency intervention, and assistive communication...
Process and generate multimedia content using the Google Gemini API. Capabilities include analyzing audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis...
Multimodal AI processing via Google Gemini API (2M tokens context). Capabilities: audio (transcription, 9.5hr max, summarization, music analysis), images (captioning, OCR, object detection,...
Use when integrating with the Kie.ai API for image, video, or music generation, writing async task-based code with polling, or when the user mentions kie, seedream, veo, suno, runway, kling, hailuo, or flux.
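The async task-plus-polling pattern mentioned above can be sketched generically in TypeScript. This is not the Kie.ai API: no endpoint, field name, or response shape from that service is assumed. `TaskStatus`, `PollOptions`, `pollUntilDone`, and the caller-supplied `checkStatus` and `sleep` functions are all hypothetical names introduced for illustration.

```typescript
// Hypothetical status shape; a real service's response would need to be
// translated into this form by the caller.
type TaskStatus<T> =
  | { state: "pending" }
  | { state: "done"; result: T }
  | { state: "failed"; error: string };

interface PollOptions {
  intervalMs: number;  // delay before the first re-check
  maxAttempts: number; // total number of status checks allowed
  backoff: number;     // multiplier applied to the delay after each wait
  sleep: (ms: number) => Promise<void>; // injected so it can be faked in tests
}

// Repeatedly calls checkStatus until the task is done, fails, or the
// attempt budget is exhausted.
async function pollUntilDone<T>(
  checkStatus: () => Promise<TaskStatus<T>>,
  options: PollOptions,
): Promise<T> {
  let delay = options.intervalMs;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const status = await checkStatus();
    if (status.state === "done") return status.result;
    if (status.state === "failed") {
      throw new Error(`Task failed: ${status.error}`);
    }
    // Only wait if another attempt remains.
    if (attempt < options.maxAttempts) {
      await options.sleep(delay);
      delay *= options.backoff;
    }
  }
  throw new Error(`Task still pending after ${options.maxAttempts} attempts`);
}
```

In real use, `sleep` could be a small timer-based helper such as `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, and `checkStatus` would wrap whatever status request the service actually provides. Injecting both keeps the retry logic independent of any particular API.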
Bridges asset requirements from motion design specs to production-ready assets. Parses specs for required assets, recommends free/paid sources, provides format conversion guidance, generates...