tkoenig

peekaboo

# Install this skill:
npx skills add tkoenig/agent-skills --skill "peekaboo"

This installs only the peekaboo skill from the multi-skill repository tkoenig/agent-skills.

# Description

macOS screen capture, UI automation, and AI vision. Use for screenshots, clicking, typing, window management, and automating any macOS app.

# SKILL.md


name: peekaboo
description: macOS screen capture, UI automation, and AI vision. Use for screenshots, clicking, typing, window management, and automating any macOS app.


Peekaboo

macOS CLI for screen capture, UI automation, and AI-powered vision analysis. Works with any application.

Install

brew install steipete/tap/peekaboo

Requires the Screen Recording and Accessibility permissions, both granted under System Settings > Privacy & Security.
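Because both the install and the permissions have to be in place before anything else works, a small preflight check can save debugging time. The sketch below uses only `peekaboo list permissions` from this document; the helper name `peekaboo_preflight` is hypothetical and not part of Peekaboo.

```shell
# Preflight sketch: confirm the CLI is on PATH, then print permission status.
# peekaboo_preflight is a hypothetical helper name, not a Peekaboo command.
peekaboo_preflight() {
  if ! command -v peekaboo >/dev/null 2>&1; then
    echo "peekaboo not found; install it with: brew install steipete/tap/peekaboo" >&2
    return 1
  fi
  # Reports the TCC permission status for the CLI.
  peekaboo list permissions
}
```

Call `peekaboo_preflight` at the top of an automation script and stop if it returns a non-zero status.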

Screenshots

# Capture entire screen
peekaboo image --mode screen --path ~/Desktop/screen.png

# Capture specific app window
peekaboo image --mode window --app Safari --path screenshot.png

# Capture by window ID (for apps with multiple windows)
peekaboo list windows --app Safari    # Find window IDs
peekaboo image --mode window --window-id 12345 --path screenshot.png

# Retina resolution (2x)
peekaboo image --mode screen --retina --path screenshot.png
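When captures are taken repeatedly, a fixed `--path` overwrites the previous file. A minimal sketch of a wrapper that adds a timestamp to the filename follows; `capture_window` is a hypothetical helper name, and the filename layout is an assumption chosen for illustration.

```shell
# capture_window APP [DIR]: save a window screenshot of APP under DIR
# (default: current directory) with a timestamped filename.
# capture_window is a hypothetical helper name, not a Peekaboo command.
capture_window() {
  cw_app=$1
  cw_dir=${2:-.}
  cw_path="$cw_dir/$cw_app-$(date +%Y%m%d-%H%M%S).png"
  peekaboo image --mode window --app "$cw_app" --path "$cw_path" || return 1
  # Print the path so callers can use the file afterwards.
  printf '%s\n' "$cw_path"
}
```

For example, `capture_window Safari ~/Desktop` prints the path of the file it wrote.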

List Apps/Windows

peekaboo list apps                    # All running apps
peekaboo list windows                 # All windows
peekaboo list windows --app Safari    # Windows for specific app
peekaboo list screens                 # Available displays
peekaboo list permissions             # Check TCC permissions

Click

# Click at coordinates
peekaboo click --at 500,300

# Click a UI element by label: capture a snapshot with `see`, then pass its ID to `click`
peekaboo see --app Safari --json | jq -r '.data.snapshot_id'
peekaboo click --on "Submit" --snapshot <snapshot_id>

# Right-click
peekaboo click --at 500,300 --button right

# Double-click
peekaboo click --at 500,300 --clicks 2
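The snapshot-then-click steps above can be chained so the snapshot ID never has to be copied by hand. This is a sketch that assumes `jq` is installed and that the ID lives at `.data.snapshot_id`, as in the example above; `click_label` is a hypothetical helper name.

```shell
# click_label APP LABEL: take a fresh snapshot of APP, then click the element
# whose label matches LABEL. click_label is a hypothetical helper name.
click_label() {
  cl_app=$1
  cl_label=$2
  # The .data.snapshot_id path is the one used in the example above.
  cl_snapshot=$(peekaboo see --app "$cl_app" --json | jq -r '.data.snapshot_id') || return 1
  # jq prints "null" when the field is missing; treat that as a failure.
  if [ -z "$cl_snapshot" ] || [ "$cl_snapshot" = "null" ]; then
    echo "no snapshot id returned for $cl_app" >&2
    return 1
  fi
  peekaboo click --on "$cl_label" --snapshot "$cl_snapshot"
}
```

For example, `click_label Safari "Submit"` takes a new snapshot on every call, which is slower than reusing one but avoids clicking against a stale snapshot.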

Type & Keyboard

# Type text
peekaboo type --text "Hello world"

# Press keys
peekaboo press return
peekaboo press escape
peekaboo press tab

# Keyboard shortcuts
peekaboo hotkey cmd,c              # Copy
peekaboo hotkey cmd,v              # Paste
peekaboo hotkey cmd,shift,t        # Reopen tab

Scroll

peekaboo scroll --direction down --ticks 5
peekaboo scroll --direction up --ticks 3
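Scrolling combines naturally with window capture when a page is longer than the window. The sketch below alternates the two using commands from this document. It assumes the target window is already focused and receives the scroll events, and the 500 ms pause and `page-N.png` filenames are arbitrary choices. `capture_pages` is a hypothetical helper name.

```shell
# capture_pages APP PAGES [TICKS]: screenshot APP's window, scroll down,
# and repeat PAGES times. capture_pages is a hypothetical helper name.
capture_pages() {
  cp_app=$1
  cp_pages=$2
  cp_ticks=${3:-5}
  cp_i=1
  while [ "$cp_i" -le "$cp_pages" ]; do
    peekaboo image --mode window --app "$cp_app" --path "page-$cp_i.png" || return 1
    peekaboo scroll --direction down --ticks "$cp_ticks" || return 1
    # Give the view time to settle before the next capture (assumed delay).
    peekaboo sleep --duration 500 || return 1
    cp_i=$((cp_i + 1))
  done
}
```

For example, `capture_pages Safari 3` writes `page-1.png` through `page-3.png` in the current directory.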

Window Management

peekaboo window list
peekaboo window focus --app Safari
peekaboo window move --app Safari --x 100 --y 100
peekaboo window resize --app Safari --width 1200 --height 800

App Control

peekaboo app launch Safari
peekaboo app quit Safari
peekaboo app switch Safari
peekaboo app list
peekaboo menu list --app Safari       # List menus
peekaboo menu click --app Safari --menu "File" --item "New Window"
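Launching an app and then acting on its window usually needs a short pause in between. A minimal sketch that strings together commands from this document follows; the one-second delay is an assumption, since real launch times vary, and `open_and_focus` is a hypothetical helper name.

```shell
# open_and_focus APP: launch APP, wait briefly, then bring its window forward.
# open_and_focus is a hypothetical helper name, not a Peekaboo command.
open_and_focus() {
  of_app=$1
  peekaboo app launch "$of_app" || return 1
  # Fixed one-second pause; an assumption, adjust for slow-starting apps.
  peekaboo sleep --duration 1000 || return 1
  peekaboo window focus --app "$of_app"
}
```

For example, `open_and_focus Safari` can run before any click or type step that targets Safari.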

AI Vision (requires API key)

# Analyze screenshot with AI
peekaboo image --mode screen --analyze "What's on this screen?"

# The see command captures the UI and annotates its elements
peekaboo see --app Safari --json

Configure AI providers:

export PEEKABOO_AI_PROVIDERS="openai/gpt-4o"
export OPENAI_API_KEY="your-key"
# Or: peekaboo config init
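A missing key tends to surface only when the analysis call fails, so it can help to check the environment first. This sketch assumes the environment-variable setup shown above with the OpenAI provider; if the providers were configured through `peekaboo config init` instead, the check would be stricter than necessary. `analyze_screen` is a hypothetical helper name.

```shell
# analyze_screen PROMPT: verify the AI environment variables from the
# configuration example, then run a screen analysis.
# analyze_screen is a hypothetical helper name, not a Peekaboo command.
analyze_screen() {
  as_prompt=$1
  if [ -z "${PEEKABOO_AI_PROVIDERS:-}" ]; then
    echo "PEEKABOO_AI_PROVIDERS is not set" >&2
    return 1
  fi
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY is not set" >&2
    return 1
  fi
  peekaboo image --mode screen --analyze "$as_prompt"
}
```

For example, `analyze_screen "What's on this screen?"` fails fast with a clear message when either variable is missing.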

Natural Language Agent

# Run multi-step automation via natural language
peekaboo agent "Open Notes and create a new note titled TODO"

Common Patterns

Screenshot of frontmost window

peekaboo image --mode frontmost --path screenshot.png

Click button in dialog

# Label clicks rely on a snapshot from `peekaboo see` (see Click above)
peekaboo click --on "OK"
peekaboo click --on "Cancel"

Fill form field

peekaboo click --at 500,300
peekaboo type --text "my input" --clear

Wait between actions

peekaboo sleep --duration 1000   # 1 second
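The patterns above can be combined into one reusable step. The sketch below clicks a field, waits, replaces its contents, and moves on with Tab. The 200 ms pause and the trailing Tab are assumptions for illustration, and `fill_field` is a hypothetical helper name.

```shell
# fill_field X Y TEXT: click the field at X,Y, replace its contents with TEXT,
# then press Tab to move on. fill_field is a hypothetical helper name.
fill_field() {
  ff_x=$1
  ff_y=$2
  ff_text=$3
  peekaboo click --at "$ff_x,$ff_y" || return 1
  # Short pause so the field has focus before typing (assumed delay).
  peekaboo sleep --duration 200 || return 1
  peekaboo type --text "$ff_text" --clear || return 1
  peekaboo press tab
}
```

For example, `fill_field 500 300 "my input"` reproduces the form-field pattern with a wait in between.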

JSON Output

Add --json or -j for machine-readable output:

peekaboo list windows --app Safari --json
peekaboo see --app Safari --json
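Scripts that consume the JSON output often want it in a file, and want to stop if the command failed or printed nothing. A minimal sketch of that follows; `save_json` is a hypothetical helper name, and the JSON schema itself is not assumed here.

```shell
# save_json FILE ARGS...: run `peekaboo ARGS... --json` and write the output
# to FILE. Fails if the command fails or prints nothing.
# save_json is a hypothetical helper name, not a Peekaboo command.
save_json() {
  sj_out=$1
  shift
  sj_tmp=$(peekaboo "$@" --json) || return 1
  [ -n "$sj_tmp" ] || return 1
  printf '%s\n' "$sj_tmp" > "$sj_out"
}
```

For example, `save_json windows.json list windows --app Safari` leaves the file untouched when the command fails.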
