jimliu

baoyu-image-gen

3,033
348
# Install this skill:
npx skills add JimLiu/baoyu-skills --skill "baoyu-image-gen"

Install specific skill from multi-skill repository

# Description

AI image generation with OpenAI and Google APIs. Supports text-to-image, reference images, aspect ratios, and parallel generation (recommended 4 concurrent subagents). Use when user asks to generate, create, or draw images.

# SKILL.md


name: baoyu-image-gen
description: AI image generation with OpenAI and Google APIs. Supports text-to-image, reference images, aspect ratios, and parallel generation (recommended 4 concurrent subagents). Use when user asks to generate, create, or draw images.


Image Generation (AI SDK)

Official API-based image generation. Supports OpenAI and Google providers.

Script Directory

Agent Execution:
1. SKILL_DIR = this SKILL.md file's directory
2. Script path = ${SKILL_DIR}/scripts/main.ts

Preferences (EXTEND.md)

Use Bash to check EXTEND.md existence (priority order):

# Check project-level first
test -f .baoyu-skills/baoyu-image-gen/EXTEND.md && echo "project"

# Then user-level (cross-platform: $HOME works on macOS/Linux/WSL)
test -f "$HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md" && echo "user"

┌──────────────────────────────────────────────────┬───────────────────┐
│ Path │ Location │
├──────────────────────────────────────────────────┼───────────────────┤
│ .baoyu-skills/baoyu-image-gen/EXTEND.md │ Project directory │
├──────────────────────────────────────────────────┼───────────────────┤
│ $HOME/.baoyu-skills/baoyu-image-gen/EXTEND.md │ User home │
└──────────────────────────────────────────────────┴───────────────────┘

┌───────────┬───────────────────────────────────────────────────────────────────────────┐
│ Result │ Action │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Found │ Read, parse, apply settings │
├───────────┼───────────────────────────────────────────────────────────────────────────┤
│ Not found │ Use defaults │
└───────────┴───────────────────────────────────────────────────────────────────────────┘

EXTEND.md Supports: Default provider | Default quality | Default aspect ratio

Usage

# Basic
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image cat.png

# With aspect ratio
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A landscape" --image out.png --ar 16:9

# High quality
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --quality 2k

# From prompt files
npx -y bun ${SKILL_DIR}/scripts/main.ts --promptfiles system.md content.md --image out.png

# With reference images (Google multimodal only)
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "Make blue" --image out.png --ref source.png

# Specific provider
npx -y bun ${SKILL_DIR}/scripts/main.ts --prompt "A cat" --image out.png --provider openai

Options

Option Description
--prompt <text>, -p Prompt text
--promptfiles <files...> Read prompt from files (concatenated)
--image <path> Output image path (required)
--provider google\|openai Force provider (default: google)
--model <id>, -m Model ID
--ar <ratio> Aspect ratio (e.g., 16:9, 1:1, 4:3)
--size <WxH> Size (e.g., 1024x1024)
--quality normal\|2k Quality preset (default: 2k)
--imageSize 1K\|2K\|4K Image size for Google (default: from quality)
--ref <files...> Reference images (Google multimodal only)
--n <count> Number of images
--json JSON output

Environment Variables

Variable Description
OPENAI_API_KEY OpenAI API key
GOOGLE_API_KEY Google API key
OPENAI_IMAGE_MODEL OpenAI model override
GOOGLE_IMAGE_MODEL Google model override
OPENAI_BASE_URL Custom OpenAI endpoint
GOOGLE_BASE_URL Custom Google endpoint

Load Priority: CLI args > env vars > <cwd>/.baoyu-skills/.env > ~/.baoyu-skills/.env

Provider Selection

  1. --provider specified → use it
  2. Only one API key available → use that provider
  3. Both available → default to Google

Quality Presets

Preset Google imageSize OpenAI Size Use Case
normal 1K 1024px Quick previews
2k (default) 2K 2048px Covers, illustrations, infographics

Google imageSize: Can be overridden with --imageSize 1K|2K|4K

Aspect Ratios

Supported: 1:1, 16:9, 9:16, 4:3, 3:4, 2.35:1

  • Google multimodal: uses imageConfig.aspectRatio
  • Google Imagen: uses aspectRatio parameter
  • OpenAI: maps to closest supported size

Parallel Generation

Supports concurrent image generation via background subagents for batch operations.

Setting Value
Recommended concurrency 4 subagents
Max concurrency 8 subagents
Use case Batch generation (slides, comics, infographics)

Agent Implementation:

# Launch multiple generations in parallel using Task tool
# Each Task runs as background subagent with run_in_background=true
# Collect results via TaskOutput when all complete

Best Practice: When generating 4+ images, spawn background subagents (recommended 4 concurrent) instead of sequential execution.

Error Handling

  • Missing API key → error with setup instructions
  • Generation failure → auto-retry once
  • Invalid aspect ratio → warning, proceed with default
  • Reference images with non-multimodal model → warning, ignore refs

Extension Support

Custom configurations via EXTEND.md. See Preferences section for paths and supported options.

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.