This skill should be used when converting academic papers into promotional and presentation formats including interactive websites (Paper2Web), presentation videos (Paper2Video), and conference...
This skill should be used when converting academic papers into promotional and presentation formats including interactive websites (Paper2Web), presentation videos (Paper2Video), and conference...
Upload and manage files using Google Gemini File API via scripts/. Use for uploading images, audio, video, PDFs, and other files for use with Gemini models. Supports file upload, status checking,...
This skill should be used when converting academic papers into promotional and presentation formats including interactive websites (Paper2Web), presentation videos (Paper2Video), and conference...
Download YouTube videos and HLS streams (m3u8) from platforms like Mux, Vimeo, etc. using yt-dlp and ffmpeg. Use this skill when users request downloading videos, extracting audio, handling...
Build production computer vision pipelines for object detection, tracking, and video analysis. Handles drone footage, wildlife monitoring, and real-time detection. Supports YOLO, Detectron2,...
Access ATXP paid API tools for web search, AI image generation, music creation, video generation, and X/Twitter search. Use when users need real-time web search, AI-generated media (images, music,...
下载 Markdown 文件中的音视频和图片附件并本地化嵌入 Obsidian。支持 YouTube 视频、Patreon 视频、SoundCloud 音频、网络图片,可选下载字幕。Download SoundCloud audio, YouTube videos, Patreon videos, and web images from markdown files and embed...
iOS Simulator automation using AXe CLI for touch gestures, text input, hardware buttons, screenshots, video recording, and accessibility inspection. Use when automating iOS Simulator interactions,...
Creates professional AI image/video prompts with photographer's and cinematographer's eye. Specializes in composition, lighting, color grading, and storytelling. Use when generating AI...
Download videos, audio, playlists, and channels from YouTube and 1000+ websites using yt-dlp. Supports quality selection, format conversion, subtitle download, playlist filtering, metadata...
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis...
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis...
Process and generate multimedia content using Google Gemini API. Capabilities include analyze audio files (transcription with timestamps, summarization, speech understanding, music/sound analysis...
Skip to content github / docs Code Issues 80 Pull requests 35 Discussions Actions Projects 2 Security Insights Merge branch 'main' into 1862-Add-Travis-CI-migration-table ...
Creates realistic ocean and water wave effects for web using SVG filters (feTurbulence, feDisplacementMap), CSS animations, and layering techniques. Use for ocean backgrounds, underwater...
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision...
World-class computer vision skill for image/video processing, object detection, segmentation, and visual AI systems. Expertise in PyTorch, OpenCV, YOLO, SAM, diffusion models, and vision...
The craft of producing creative assets that communicate brand, product, and message effectively. Creative communications bridges strategy and execution—translating brand positioning and marketing...
World-class character and art style consistency for AI-generated images and videos - ensures visual coherence across series, maintains character identity, and provides rigorous QA before...