gemini-image

by @akrindev in AI & LLM

# Install this skill:

npx skills add akrindev/google-studio-skills --skill "gemini-image"

Install specific skill from multi-skill repository

# Description

Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and high-resolution output up to 4K. Triggers on "generate image", "create image", "imagen", "text to image", "AI art", "nano banana".

# SKILL.md

name: gemini-image
description: Generate images using Google Gemini and Imagen models via scripts/. Use for AI image generation, text-to-image, creating visuals from prompts, generating multiple images, custom aspect ratios, and high-resolution output up to 4K. Triggers on "generate image", "create image", "imagen", "text to image", "AI art", "nano banana".
license: MIT
version: 1.0.0
keywords: image generation, imagen, gemini-3-pro, gemini-2.5, text-to-image, AI art, nano banana, 4K resolution, aspect ratio

Gemini Image Generation

Generate high-quality images from text prompts using Google's Gemini and Imagen models through executable scripts.

When to Use This Skill

Use this skill when you need to:
- Create visual content from text descriptions
- Generate multiple image variations
- Create images at specific resolutions (1K, 2K, 4K)
- Produce images for different aspect ratios (social media, banners, etc.)
- Generate photorealistic images or artistic visuals
- Create images with person generation controls
- Batch generate multiple images at once
- Combine with text generation for complete content creation

Available Scripts

scripts/generate_image.py

Purpose: Generate images using Gemini 3 Pro Image, Gemini 2.5 Flash Image, or Imagen 4 models

When to use:
- Any image generation task
- Multiple image generation (1-4 per request)
- Custom resolution and aspect ratio needs
- Professional asset creation
- Photorealistic or artistic image generation

Key parameters:
| Parameter | Description | Example |
|-----------|-------------|---------|
| prompt | Text description (required) | "A futuristic city at sunset" |
| --model, -m | Model to use | gemini-3-pro-image-preview |
| --output-dir, -o | Output directory for images | images/ |
| --name, -n | Base name for output files | artwork |
| --no-timestamp | Disable auto timestamp | Flag |
| --aspect, -a | Aspect ratio | 16:9 |
| --size, -s | Resolution (1K, 2K, or 4K) | 2K |
| --num | Number of images (1-4) | 4 |
| --person | Person generation policy | allow_adult |

Output: List of saved PNG file paths
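
The flags in the table above map naturally onto an `argparse` parser. The following is a minimal sketch of such a parser, not the actual contents of scripts/generate_image.py. The defaults for `--model`, `--output-dir`, and `--name` are inferred from Workflow 1; the `None` defaults for `--aspect`, `--size`, and `--person` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the documented flags (defaults are partly assumed)."""
    parser = argparse.ArgumentParser(description="Generate images from a text prompt")
    parser.add_argument("prompt", help="Text description (required)")
    parser.add_argument("--model", "-m", default="gemini-3-pro-image-preview")
    parser.add_argument("--output-dir", "-o", default="images/")
    parser.add_argument("--name", "-n", default="generated_image")
    parser.add_argument("--no-timestamp", action="store_true")
    # Assumption: no explicit default, so the model decides when the flag is omitted.
    parser.add_argument("--aspect", "-a", default=None)
    parser.add_argument("--size", "-s", choices=["1K", "2K", "4K"], default=None)
    parser.add_argument("--num", type=int, choices=range(1, 5), default=1)
    parser.add_argument(
        "--person",
        choices=["dont_allow", "allow_adult", "allow_all"],
        default=None,
    )
    return parser


args = build_parser().parse_args(
    ["A futuristic city at sunset", "--aspect", "16:9", "--num", "4"]
)
print(args.model, args.aspect, args.num, args.no_timestamp)
```

Restricting `--num`, `--size`, and `--person` with `choices` rejects out-of-range values before any API call is made.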

Workflows

Workflow 1: Basic Image Generation

python scripts/generate_image.py "A futuristic city at sunset with flying cars"

Best for: Quick image generation, prototypes
Model: gemini-3-pro-image-preview (default, highest quality)
Output: images/generated_image_YYYYMMDD_HHMMSS.png

Workflow 2: Social Media (Instagram, Facebook)

python scripts/generate_image.py "Minimalist coffee shop interior" --aspect 1:1 --size 2K --name coffee-shop

Best for: Instagram posts, profile pictures
Aspect: 1:1 (square format)
Resolution: 2K (2048x2048)
Output: images/coffee-shop_YYYYMMDD_HHMMSS.png

Workflow 3: YouTube Thumbnails (16:9)

python scripts/generate_image.py "Tech gadget review thumbnail with vibrant colors" --aspect 16:9 --size 2K --name thumbnail

Best for: YouTube, video thumbnails
Aspect: 16:9 (widescreen)
Resolution: 2K (2752x1536)
Output: images/thumbnail_YYYYMMDD_HHMMSS.png

Workflow 4: Multiple Variations

python scripts/generate_image.py "Abstract geometric patterns in blue and gold" --num 4 --name abstract

Best for: A/B testing, design options
Generates: 4 distinct variations
Output: images/abstract_YYYYMMDD_HHMMSS_0.png, images/abstract_YYYYMMDD_HHMMSS_1.png, etc.

Workflow 5: Custom Output Directory

python scripts/generate_image.py "Detailed architectural rendering of modern museum" --aspect 16:9 --size 4K --output-dir ./professional/ --name museum

Best for: Print materials, high-end assets, organized projects
Model: gemini-3-pro-image-preview only (for 4K)
Resolution: 4K (5504x3072 for 16:9)
Directory created automatically if it doesn't exist

Workflow 6: Photorealistic Images (Imagen 4)

python scripts/generate_image.py "Robot holding a red skateboard in urban setting" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --num 2 --name robot-skate

Best for: Realistic photos, product shots
Model: imagen-4.0-generate-001 (photorealistic)
Notes: English prompts only; max 4 images per request

Workflow 7: Blog Post Featured Image

python scripts/generate_image.py "Serene mountain lake at sunrise with reflections" --aspect 16:9 --size 2K --output-dir ./blog-images/ --name featured-image

Best for: Blog headers, article images
Combines well with: gemini-text for blog content generation

Workflow 8: Content Creation Pipeline (Text + Image)

# 1. Generate content (gemini-text skill)
python skills/gemini-text/scripts/generate.py "Write a product description for smart home device"

# 2. Generate product image (this skill)
python scripts/generate_image.py "Sleek modern smart home device on white background" --aspect 4:3 --size 2K --name product

# 3. Create social media post

Best for: E-commerce, marketing campaigns
Combines with: gemini-text, gemini-batch for batch production
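
When the pipeline is driven from Python rather than a shell, the image step can be wrapped in a small helper. This is a sketch under stated assumptions: `build_command`, `parse_saved_paths`, and `generate` are hypothetical names, and the parser relies on the script printing one "Saved: <path>" line per image, as described under File Naming.

```python
import subprocess
import sys


def build_command(prompt: str, **options: object) -> list[str]:
    """Build the argv list for scripts/generate_image.py from keyword options."""
    command = [sys.executable, "scripts/generate_image.py", prompt]
    for key, value in options.items():
        flag = f"--{key.replace('_', '-')}"
        if value is True:
            command.append(flag)  # boolean switch such as --no-timestamp
        else:
            command += [flag, str(value)]
    return command


def parse_saved_paths(stdout: str) -> list[str]:
    """Collect file paths from the 'Saved: <path>' lines the script prints."""
    prefix = "Saved: "
    return [
        line[len(prefix):].strip()
        for line in stdout.splitlines()
        if line.startswith(prefix)
    ]


def generate(prompt: str, **options: object) -> list[str]:
    """Run the script and return the saved image paths; raises on a non-zero exit."""
    result = subprocess.run(
        build_command(prompt, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_saved_paths(result.stdout)
```

A call such as `generate("Sleek modern smart home device on white background", aspect="4:3", size="2K", name="product")` then returns the paths to attach to the generated product description.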

Workflow 9: Disable Timestamp

python scripts/generate_image.py "Fixed filename image" --name my-image --no-timestamp

Best for: When you want complete control over filename
Output: images/my-image.png (no timestamp)
Use when: Generating files for specific naming schemes or automated pipelines

Parameters Reference

Model Selection

| Model | Nickname | Quality | Max Size | Best For |
|-------|----------|---------|----------|----------|
| `gemini-3-pro-image-preview` | Nano Banana Pro | Highest | 4K | Professional assets, advanced text rendering |
| `gemini-2.5-flash-image` | Nano Banana | Good | 2K | High-volume, low-latency |
| `imagen-4.0-generate-001` | Imagen 4 | Photorealistic | 2K | Realistic photos, product shots |

Aspect Ratios

| Ratio | Use Case | 1K Size | 2K Size |
|-------|----------|---------|---------|
| 1:1 | Instagram, avatars | 1024x1024 | 2048x2048 |
| 16:9 | YouTube, presentations | 1376x768 | 2752x1536 |
| 9:16 | Instagram Stories, TikTok | 768x1376 | 1536x2752 |
| 4:3 | Traditional displays | 1024x768 | 2048x1536 |
| 3:4 | Portrait orientation | 768x1024 | 1536x2048 |
| 21:9 | Ultrawide | - | 5504x2400 |

Note: 4K resolution only available with gemini-3-pro-image-preview
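
To check a generated file against the sizes listed above, the table can be expressed as a lookup. This is an illustrative sketch: `DIMENSIONS` and `expected_dimensions` are hypothetical names, the values are copied from the table as given, and the 21:9 row is left out because the table lists no 1K size for it.

```python
# Pixel dimensions copied from the aspect ratio table, keyed by (ratio, size).
DIMENSIONS = {
    ("1:1", "1K"): (1024, 1024),
    ("1:1", "2K"): (2048, 2048),
    ("16:9", "1K"): (1376, 768),
    ("16:9", "2K"): (2752, 1536),
    ("9:16", "1K"): (768, 1376),
    ("9:16", "2K"): (1536, 2752),
    ("4:3", "1K"): (1024, 768),
    ("4:3", "2K"): (2048, 1536),
    ("3:4", "1K"): (768, 1024),
    ("3:4", "2K"): (1536, 2048),
}


def expected_dimensions(ratio: str, size: str) -> tuple[int, int]:
    """Return (width, height) for a ratio/size pair listed in the table."""
    try:
        return DIMENSIONS[(ratio, size)]
    except KeyError:
        raise ValueError(f"No table entry for ratio {ratio} at size {size}") from None


width, height = expected_dimensions("16:9", "2K")
print(f"{width}x{height}")
```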

Resolution Guide

| Size | Use Case | Best Model |
|------|----------|------------|
| 1K (1024px) | Web thumbnails, previews | Any model |
| 2K (2048px) | Standard web, social media | Any model |
| 4K (4096px) | Print, high-end assets | gemini-3-pro-image-preview only |
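
The size and model constraints can be checked before a request is sent. The sketch below encodes the "Max Size" column from the Model Selection table; `MAX_SIZE` and `check_size` are hypothetical names, and the real script may validate differently or not at all.

```python
# Maximum size per model, taken from the Model Selection table.
MAX_SIZE = {
    "gemini-3-pro-image-preview": "4K",
    "gemini-2.5-flash-image": "2K",
    "imagen-4.0-generate-001": "2K",
}
SIZE_ORDER = ["1K", "2K", "4K"]  # ascending resolution


def check_size(model: str, size: str) -> None:
    """Raise ValueError if the model is unknown or cannot produce the size."""
    if model not in MAX_SIZE:
        raise ValueError(f"Unknown model: {model}")
    if size not in SIZE_ORDER:
        raise ValueError(f"Unknown size: {size}")
    limit = MAX_SIZE[model]
    if SIZE_ORDER.index(size) > SIZE_ORDER.index(limit):
        raise ValueError(f"{size} is not supported by {model}; maximum is {limit}")


check_size("gemini-3-pro-image-preview", "4K")  # passes silently
```

Calling `check_size("imagen-4.0-generate-001", "4K")` raises a `ValueError`, which is cheaper than discovering the mismatch through a failed API call.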

Person Generation Policy

| Policy | Description | Restrictions |
|--------|-------------|--------------|
| `dont_allow` | No people in images | None |
| `allow_adult` | Adults only | Recommended default |
| `allow_all` | All ages | Restricted in EU, UK, CH, MENA |

Output Interpretation

File Naming

- Default format: {name}_YYYYMMDD_HHMMSS.png (auto timestamp)
- Single image example: artwork_20260130_031643.png
- Multiple images: {name}_YYYYMMDD_HHMMSS_0.png, {name}_YYYYMMDD_HHMMSS_1.png, etc.
- Without timestamp (--no-timestamp): {name}.png
- Script prints: "Saved: /path/to/file.png"
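
The naming rules above can be reproduced to predict output paths, for example when a later pipeline step needs to know where files will land. This is a sketch, not the script's own code: `output_paths` is a hypothetical helper, and applying the index suffix when `--no-timestamp` is combined with several images is an assumption the source does not confirm.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_paths(
    output_dir: str,
    name: str,
    count: int = 1,
    timestamp: bool = True,
    now: Optional[datetime] = None,
) -> list[Path]:
    """Predict output file paths following the documented naming scheme."""
    stem = name
    if timestamp:
        stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
        stem = f"{name}_{stamp}"
    if count == 1:
        return [Path(output_dir) / f"{stem}.png"]
    # Assumption: the _0, _1, ... suffix is also used without a timestamp.
    return [Path(output_dir) / f"{stem}_{index}.png" for index in range(count)]


for path in output_paths("images", "abstract", count=2):
    print(path)
```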

Image Quality

- All images include a SynthID watermark for authenticity
- PNG format for lossless quality
- Can be converted to JPEG or WebP if needed
- 4K images produce significantly larger files

Error Messages

- "Model not available": Check model name spelling
- "Unsupported size": Verify size/model combination
- "Aspect ratio error": Use supported ratios for selected model

Common Issues

"google-genai or pillow not installed"

pip install google-genai pillow

"Image generation failed"

- Check prompt length (overly verbose prompts can fail)
- Try simpler, more focused prompts
- Verify model availability in your region
- Check API quota limits

"Unsupported aspect ratio"

- Check if ratio is supported by selected model
- Imagen 4 has fewer ratio options than Gemini
- Use 16:9 or 1:1 for best compatibility

"4K not supported"

- 4K only works with gemini-3-pro-image-preview
- Use --size 2K for other models
- Try --model gemini-3-pro-image-preview --size 4K

"Imagen prompt language error"

- Imagen models support English prompts only
- Use gemini-3-pro-image-preview for other languages
- Translate prompt to English for Imagen

File too large for storage

- Use --size 1K for smaller files
- Compress images after generation
- Convert PNG to JPEG for web use
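
The PNG-to-JPEG conversion can be done with Pillow, which the skill already requires. A minimal sketch follows; `png_to_jpeg` is a hypothetical helper and the quality value of 85 is an assumed starting point, not a recommendation from the source.

```python
from pathlib import Path

from PIL import Image


def png_to_jpeg(png_path: str, quality: int = 85) -> Path:
    """Write a JPEG copy next to the PNG and return the new path."""
    source = Path(png_path)
    target = source.with_suffix(".jpg")
    with Image.open(source) as image:
        # JPEG has no alpha channel, so convert to RGB before saving.
        image.convert("RGB").save(target, "JPEG", quality=quality, optimize=True)
    return target
```

The original PNG is left in place, so the lossless file remains available if a different quality setting is needed later.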

Best Practices

Prompt Engineering

- Be specific and descriptive
- Include style descriptors (e.g., "photorealistic", "digital art")
- Mention lighting, mood, and composition
- Use analogies for complex concepts
- Avoid negative prompts (describe what you want, not what to avoid)

Model Selection

- Use gemini-3-pro-image-preview for high quality, text rendering, and 4K output
- Use gemini-2.5-flash-image for speed and high volume
- Use imagen-4.0-generate-001 for photorealism and product shots

Performance Optimization

- Generate multiple images at once with --num
- Use lower resolution for previews
- Batch requests for high-volume needs (gemini-batch skill)
- Cache results for repeated requests
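
One way to cache repeated requests is to derive a deterministic file name from the prompt and parameters, then skip generation when that file already exists. The sketch below assumes this approach; `cache_key` and `cached_image` are hypothetical helpers and are not part of the skill.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def cache_key(prompt: str, **params: str) -> str:
    """Stable 16-character key for a prompt plus its generation parameters."""
    payload = json.dumps({"prompt": prompt, **params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]


def cached_image(cache_dir: str, prompt: str, **params: str) -> Optional[Path]:
    """Return the cached PNG for this request if present, otherwise None."""
    path = Path(cache_dir) / f"{cache_key(prompt, **params)}.png"
    return path if path.exists() else None
```

Passing the key as `--name` together with `--no-timestamp` makes the script write `{key}.png`, so a later `cached_image` call with the same prompt and parameters finds the file.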

Quality Tips

- Use 2K resolution for most web uses
- 4K only when maximum detail is needed
- Combine specific prompts with style guidance
- Test prompts with --num 1 before generating batches

Cost Management

- Use the flash model (gemini-2.5-flash-image) for cost efficiency
- 4K generation costs significantly more
- Batch multiple requests when possible
- Generate at 1K for testing, 2K/4K for final output

Related Skills

- gemini-text: Generate text content alongside images
- gemini-tts: Create audio for image-based content
- gemini-batch: Process multiple image requests efficiently
- gemini-embeddings: Generate image embeddings for similarity search

Quick Reference

# Basic
python scripts/generate_image.py "Your prompt"

# Social media (1:1)
python scripts/generate_image.py "Prompt" --aspect 1:1 --size 2K --name social-post

# YouTube thumbnail (16:9)
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 2K --name thumbnail

# 4K high quality
python scripts/generate_image.py "Prompt" --aspect 16:9 --size 4K --name high-res

# Multiple variations
python scripts/generate_image.py "Prompt" --num 4 --name variations

# Custom directory
python scripts/generate_image.py "Prompt" --output-dir ./my-images/ --name custom

# Photorealistic
python scripts/generate_image.py "Prompt" --model imagen-4.0-generate-001 --aspect 16:9 --size 2K --name photo

# No timestamp
python scripts/generate_image.py "Prompt" --name fixed-name --no-timestamp

Reference

See references/ for model documentation (if available)
Get API key: https://aistudio.google.com/apikey
Documentation: https://ai.google.dev/gemini-api/docs/image-generation
SynthID: https://deepmind.google/technologies/synthid/

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

