npx skills add akrindev/google-studio-skills --skill "gemini-image"
Install specific skill from multi-skill repository
# SKILL.md
name: gemini-image
description: Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and high-resolution output up to 4K. Triggers on "generate image", "create image", "imagen", "text to image", "AI art", "nano banana".
license: MIT
version: 1.0.0
keywords: image generation, imagen, gemini-3-pro, gemini-2.5, text-to-image, AI art, nano banana, 4K resolution, aspect ratio
Gemini Image Generation
Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.
When to Use This Skill
Use this skill when you need to:
- Create visual content from text descriptions
- Generate multiple image variations
- Create images at specific resolutions (1K, 2K, 4K)
- Produce images for different aspect ratios (social media, banners, etc.)
- Generate photorealistic images or artistic visuals
- Create images with person generation controls
- Batch generate multiple images at once
- Combine with text generation for complete content creation
Available Scripts
scripts/generate_image.py
Purpose: Generate images using the Gemini 3 Pro Image, Gemini 2.5 Flash Image, or Imagen 4 models
When to use:
- Any image generation task
- Multiple image generation (1-4 per request)
- Custom resolution and aspect ratio needs
- Professional asset creation
- Photorealistic or artistic image generation
Key parameters:
| Parameter | Description | Example |
|-----------|-------------|---------|
| prompt | Text description (required) | "A futuristic city at sunset" |
| --model, -m | Model to use | gemini-3-pro-image-preview |
| --output-dir, -o | Output directory for images | images/ |
| --name, -n | Base name for output files | artwork |
| --no-timestamp | Disable auto timestamp | Flag |
| --aspect, -a | Aspect ratio | 16:9 |
| --size, -s | Resolution | 2K or 4K |
| --num | Number of images (1-4) | 4 |
| --person | Person generation policy | allow_adult |
Output: List of saved PNG file paths
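When the script is driven from another Python program, the parameters above map onto an argument list. The following is a minimal sketch of a hypothetical wrapper, not part of the skill itself: `build_command` and `parse_saved_paths` are illustrative names, and the parser assumes the script prints one `Saved: <path>` line per image, as the File Naming section describes.

```python
import sys
from typing import List, Optional


def build_command(
    prompt: str,
    model: Optional[str] = None,
    aspect: Optional[str] = None,
    size: Optional[str] = None,
    num: Optional[int] = None,
    name: Optional[str] = None,
    output_dir: Optional[str] = None,
    no_timestamp: bool = False,
) -> List[str]:
    """Build an argv list for scripts/generate_image.py from the documented flags."""
    cmd = [sys.executable, "scripts/generate_image.py", prompt]
    options = [
        ("--model", model),
        ("--aspect", aspect),
        ("--size", size),
        ("--num", num),
        ("--name", name),
        ("--output-dir", output_dir),
    ]
    # Only pass flags the caller actually set, so script defaults still apply.
    for flag, value in options:
        if value is not None:
            cmd += [flag, str(value)]
    if no_timestamp:
        cmd.append("--no-timestamp")
    return cmd


def parse_saved_paths(stdout: str) -> List[str]:
    """Collect paths from 'Saved: <path>' lines (assumed output format)."""
    prefix = "Saved: "
    return [
        line[len(prefix):].strip()
        for line in stdout.splitlines()
        if line.startswith(prefix)
    ]
```

A caller could pass the list to `subprocess.run(cmd, capture_output=True, text=True, check=True)` and feed the resulting `stdout` to `parse_saved_paths`. Passing a list rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.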
Workflows
Workflow 1: Basic Image Generation
python scripts/generate_image.py "A futuristic city at sunset with flying cars"
- Best for: Quick image generation, prototypes
- Model: `gemini-3-pro-image-preview` (default, highest quality)
- Output: `images/generated_image_YYYYMMDD_HHMMSS.png`
Workflow 2: Social Media (Instagram, Facebook)
python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop
- Best for: Instagram posts, profile pictures
- Aspect: 1:1 (square format)
- Resolution: 2K (2048x2048)
- Output: `images/coffee-shop_YYYYMMDD_HHMMSS.png`
Workflow 3: YouTube Thumbnails (16:9)
python scripts/generate_image.py "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail
- Best for: YouTube, video thumbnails
- Aspect: 16:9 (widescreen)
- Resolution: 2K (2752x1536)
- Output: `images/thumbnail_YYYYMMDD_HHMMSS.png`
Workflow 4: Multiple Variations
python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract
- Best for: A/B testing, design options
- Generates: 4 distinct variations
- Output: `images/abstract_YYYYMMDD_HHMMSS_0.png`, `images/abstract_YYYYMMDD_HHMMSS_1.png`, etc.
Workflow 5: Custom Output Directory
python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum
- Best for: Print materials, high-end assets, organized projects
- Model: `gemini-3-pro-image-preview` only (for 4K)
- Resolution: 4K (5504x3072 for 16:9)
- Directory created automatically if it doesn't exist
Workflow 6: Photorealistic Images (Imagen 4)
python scripts/generate_image.py "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate
- Best for: Realistic photos, product shots
- Model: `imagen-4.0-generate-001` (photorealistic)
- Notes: English prompts only
- Max 4 images per request
Workflow 7: Blog Post Featured Image
python scripts/generate_image.py "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image
- Best for: Blog headers, article images
- Combines well with: gemini-text for blog content generation
Workflow 8: Content Creation Pipeline (Text + Image)
# 1. Generate content (gemini-text skill)
python skills/gemini-text/scripts/generate.py "Write a product description for smart home device"
# 2. Generate product image (this skill)
python scripts/generate_image.py "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product
# 3. Create social media post
- Best for: E-commerce, marketing campaigns
- Combines with: gemini-text, gemini-batch for batch production
Workflow 9: Disable Timestamp
python scripts/generate_image.py "Fixed filename image" --name my-image --no-timestamp
- Best for: When you want complete control over filename
- Output: `images/my-image.png` (no timestamp)
- Use when: Generating files for specific naming schemes or automated pipelines
Parameters Reference
Model Selection
| Model | Nickname | Quality | Max Size | Best For |
|---|---|---|---|---|
| `gemini-3-pro-image-preview` | Nano Banana Pro | Highest | 4K | Professional assets, advanced text rendering |
| `gemini-2.5-flash-image` | Nano Banana | Good | 2K | High-volume, low-latency |
| `imagen-4.0-generate-001` | Imagen 4 | Photorealistic | 2K | Realistic photos, product shots |
Aspect Ratios
| Ratio | Use Case | 1K Size | 2K Size |
|---|---|---|---|
| 1:1 | Instagram, avatars | 1024x1024 | 2048x2048 |
| 16:9 | YouTube, presentations | 1376x768 | 2752x1536 |
| 9:16 | Instagram Stories, TikTok | 768x1376 | 1536x2752 |
| 4:3 | Traditional displays | 1024x768 | 2048x1536 |
| 3:4 | Portrait orientation | 768x1024 | 1536x2048 |
| 21:9 | Ultrawide | - | 5504x2400 |
Note: 4K resolution only available with gemini-3-pro-image-preview
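When a target layout is given in pixels rather than as a ratio, the nearest supported ratio can be picked programmatically. The sketch below is illustrative only: `closest_ratio` is a hypothetical helper, and the list of ratios is simply transcribed from the table above without distinguishing which models accept which ratios.

```python
import math

# Ratios listed in the Aspect Ratios table above.
SUPPORTED_RATIOS = ("1:1", "16:9", "9:16", "4:3", "3:4", "21:9")


def _as_float(ratio: str) -> float:
    """Convert a 'W:H' string into a width/height quotient."""
    w, h = ratio.split(":")
    return int(w) / int(h)


def closest_ratio(width: int, height: int) -> str:
    """Return the listed ratio whose shape is nearest to width x height.

    Distances are compared on a log scale so that landscape and portrait
    mismatches are treated symmetrically.
    """
    target = width / height
    return min(
        SUPPORTED_RATIOS,
        key=lambda r: abs(math.log(_as_float(r) / target)),
    )
```

For example, a 1920x1080 slide maps to `16:9` and a 1080x1920 story maps to `9:16`; the result can then be passed to `--aspect`.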
Resolution Guide
| Size | Use Case | Best Model |
|---|---|---|
| 1K (1024px) | Web thumbnails, previews | Any model |
| 2K (2048px) | Standard web, social media | Any model |
| 4K (4096px) | Print, high-end assets | gemini-3-pro-image-preview only |
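The size/model constraints can be checked before a request is sent, which avoids a failed call. This is a minimal sketch under stated assumptions: `size_supported` is a hypothetical helper, and the limits are copied from the Model Selection table in this document rather than verified against the API.

```python
# Maximum size per model, transcribed from the Model Selection table
# (an assumption about the script's behavior, not an API guarantee).
MAX_SIZE = {
    "gemini-3-pro-image-preview": "4K",
    "gemini-2.5-flash-image": "2K",
    "imagen-4.0-generate-001": "2K",
}
SIZES = ("1K", "2K", "4K")  # ordered from smallest to largest


def size_supported(model: str, size: str) -> bool:
    """Return True if `size` does not exceed the model's listed maximum."""
    if model not in MAX_SIZE:
        raise ValueError(f"Unknown model: {model}")
    if size not in SIZES:
        raise ValueError(f"Unknown size: {size}")
    return SIZES.index(size) <= SIZES.index(MAX_SIZE[model])
```

Raising on unknown names rather than returning `False` keeps a misspelled model distinct from a genuinely unsupported combination.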
Person Generation Policy
| Policy | Description | Restrictions |
|---|---|---|
| `dont_allow` | No people in images | None |
| `allow_adult` | Adults only | Recommended default |
| `allow_all` | All ages | Restricted in EU, UK, CH, MENA |
Output Interpretation
File Naming
- Default format: `{name}_YYYYMMDD_HHMMSS.png` (auto timestamp)
- Single image example: `artwork_20260130_031643.png`
- Multiple images: `{name}_YYYYMMDD_HHMMSS_0.png`, `{name}_YYYYMMDD_HHMMSS_1.png`, etc.
- Without timestamp (`--no-timestamp`): `{name}.png`
- Script prints: "Saved: /path/to/file.png"
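The naming rules can be reproduced in a few lines, which helps when a downstream step needs to predict filenames. This is a sketch of the documented pattern, not the script's actual code: `expected_filenames` is a hypothetical helper, and the behavior for multiple images combined with `--no-timestamp` (keeping the index suffix) is an assumption, since the document does not describe that case.

```python
from datetime import datetime
from typing import List, Optional


def expected_filenames(
    name: str,
    num: int = 1,
    when: Optional[datetime] = None,
    timestamp: bool = True,
) -> List[str]:
    """Predict output filenames following the documented naming pattern."""
    stem = name
    if timestamp:
        when = when or datetime.now()
        stem = f"{name}_{when:%Y%m%d_%H%M%S}"
    if num == 1:
        return [f"{stem}.png"]
    # Assumption: the _0, _1, ... suffix is also used without a timestamp.
    return [f"{stem}_{i}.png" for i in range(num)]
```

Calling `expected_filenames("artwork", when=datetime(2026, 1, 30, 3, 16, 43))` yields the single-image example shown in this section.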
Image Quality
- All images include SynthID watermark for authenticity
- PNG format for lossless quality
- Can be converted to JPEG/WEBP if needed
- 4K images have significantly larger file sizes
Error Messages
- "Model not available": Check model name spelling
- "Unsupported size": Verify size/model combination
- "Aspect ratio error": Use supported ratios for selected model
Common Issues
"google-genai or pillow not installed"
pip install google-genai pillow
"Image generation failed"
- Check prompt length (overly verbose prompts can fail)
- Try simpler, more focused prompts
- Verify model availability in your region
- Check API quota limits
"Unsupported aspect ratio"
- Check if ratio is supported by selected model
- Imagen 4 has fewer ratio options than Gemini
- Use 16:9 or 1:1 for best compatibility
"4K not supported"
- 4K only works with `gemini-3-pro-image-preview`
- Use `--size 2K` for other models
- Try `--model gemini-3-pro-image-preview --size 4K`
"Imagen prompt language error"
- Imagen models support English prompts only
- Use `gemini-3-pro-image-preview` for other languages
- Translate prompt to English for Imagen
File too large for storage
- Use `--size 1K` for smaller files
- Compress images after generation
- Convert PNG to JPEG for web use
Best Practices
Prompt Engineering
- Be specific and descriptive
- Include style descriptors (e.g., "photorealistic", "digital art")
- Mention lighting, mood, and composition
- Use analogies for complex concepts
- Avoid negative prompts (describe what you want, not what to avoid)
Model Selection
- Use `gemini-3-pro-image-preview` for: High quality, text rendering, 4K
- Use `gemini-2.5-flash-image` for: Speed, high volume
- Use `imagen-4.0-generate-001` for: Photorealism, product shots
Performance Optimization
- Generate multiple images at once with `--num`
- Use lower resolution for previews
- Batch requests for high-volume needs (gemini-batch skill)
- Cache results for repeated requests
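One way to cache results for repeated requests is to derive a stable key from the prompt and options and reuse any file already stored under that key. The sketch below shows only the key derivation; `cache_key` is a hypothetical helper and the skill itself does not ship a cache.

```python
import hashlib
import json


def cache_key(prompt: str, **options: str) -> str:
    """Derive a stable hex key from a prompt and its generation options.

    Options are serialized with sorted keys so that argument order does
    not change the key.
    """
    payload = json.dumps({"prompt": prompt, "options": options}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A caller might pass the key to `--name` together with `--no-timestamp`, then skip generation when a file with that name already exists in the output directory.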
Quality Tips
- Use 2K resolution for most web uses
- 4K only when maximum detail is needed
- Combine specific prompts with style guidance
- Test prompts with `--num 1` before generating batches
Cost Management
- Use flash models for cost efficiency
- 4K generation costs significantly more
- Batch multiple requests when possible
- Generate at 1K for testing, 2K/4K for final
Related Skills
- gemini-text: Generate text content alongside images
- gemini-tts: Create audio for image-based content
- gemini-batch: Process multiple image requests efficiently
- gemini-embeddings: Generate image embeddings for similarity search
Quick Reference
# Basic
python scripts/generate_image.py "Your prompt"
# Social media (1:1)
python scripts/generate_image.py "Prompt" --aspect 1:1 --size 2K --name social-post
# YouTube thumbnail (16:9)
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 2K --name thumbnail
# 4K high quality
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 4K --name high-res
# Multiple variations
python scripts/generate_image.py "Prompt" --num 4 --name variations
# Custom directory
python scripts/generate_image.py "Prompt" --output-dir ./my-images/ --name custom
# Photorealistic
python scripts/generate_image.py "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo
# No timestamp
python scripts/generate_image.py "Prompt" --name fixed-name --no-timestamp
Reference
- See `references/` for model documentation (if available)
- Get API key: https://aistudio.google.com/apikey
- Documentation: https://ai.google.dev/gemini-api/docs/image-generation
- SynthID: https://deepmind.google/technologies/synthid/