Upload and manage files using the Google Gemini File API via scripts/. Use for uploading images, audio, video, PDFs, and other files for use with Gemini models. Supports file upload, status checking,...
Use when porting beads or superpowers workflows into Gemini CLI extensions or designing Gemini CLI command prompts that emulate multi-step agent workflows - covers extension layout, GEMINI.md...
Generates images and text via reverse-engineered Gemini Web API. Supports text generation, image generation from prompts, reference images for vision input, and multi-turn conversations. Use when...
Remove Gemini logos, watermarks, or AI-generated image markers using OpenCV inpainting. Use this skill when the user asks to remove a Gemini logo, an AI watermark, or any other logo or watermark from images.
Use the Gemini API (Nano Banana image generation, Veo video, Gemini TTS speech and audio understanding) to deliver end-to-end multimodal media workflows and code templates for "generation + understanding".
Perform a deep, thorough code review using Gemini AI. Use this skill when the user explicitly requests 'gemini review', 'thorough review', 'detailed review', or 'deep review'. For general 'review'...
Build and run Gemini 2.5 Computer Use browser-control agents with Playwright. Use when a user wants to automate web browser tasks via the Gemini Computer Use model, needs an agent loop (screenshot...
Generate images using Google Gemini AI with text prompts and reference images. Use when creating game assets, concept art, UI mockups, promotional images, or any visual content. Supports...
Delegate tasks to the Gemini CLI to conserve Claude's context.
Use Chrome DevTools Protocol to allow the AI to "ask Gemini" or "research with Gemini" directly. This uses the user's logged-in Chrome session, bypassing API limits and leveraging the web...
Enables Claude to interact with Gemini AI chat for quick queries, brainstorming, and alternative AI perspectives.
Enables Claude to create and edit documents collaboratively using Gemini Canvas for visual writing and coding.
Analyze images using Gemini's vision capabilities. Use for image analysis, text extraction from screenshots, and visual content understanding.
Use when the user asks to run the Gemini CLI for any task or when a large context (>200k) is needed. Ideal for Code Review, Plan Review, Multi-file Analysis, and any task that requires large context...
Generate text embeddings using the Gemini Embedding API via scripts/. Use for creating vector representations of text, semantic search, similarity matching, clustering, and RAG applications. Triggers...
Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and...
Generate speech from text using Google Gemini TTS models via scripts/. Use for text-to-speech, audio generation, voice synthesis, multi-speaker conversations, and creating audio content. Supports...
Generate and edit images using the Gemini API (Nano Banana). Use this skill when creating images from text prompts, editing existing images, applying style transfers, generating logos with text,...
Process large volumes of requests using the Gemini Batch API via scripts/. Use for batch processing, bulk text generation, processing JSONL files, async job execution, and cost-efficient high-volume...
Configure or debug LLM blog post generation using Vercel AI SDK and Google Gemini. Use when updating blog generation prompts, fixing AI integration issues, modifying content generation logic, or...