Use when you have a written implementation plan to execute in a separate session with review checkpoints
npx skills add grahama1970/agent-skills --skill "surf"
Install specific skill from multi-skill repository
# Description
>
# SKILL.md
name: surf
description: >
Unified browser automation for AI agents. Uses surf-cli extension when available
(full features), falls back to CDP (zero-config). Navigate, read with element refs,
click, type, screenshot.
allowed-tools: Bash, Read
triggers:
# CDP/Chrome management
- open browser
- start chrome
- start cdp
- launch chrome
- chrome devtools
- cdp
- puppeteer
- headless chrome
# Browser automation
- click on
- fill form
- take screenshot
- screenshot
- navigate to
- go to url
- automate browser
- browser automation
- read webpage
- scrape page
# Testing
- run ui tests
- smoke tests with browser
- browser tests
- e2e tests
- end to end tests
# Troubleshooting
- check browser
- browser not working
- cdp not connecting
metadata:
short-description: Browser automation (extension preferred, CDP fallback)
cdp-port: 9222
Surf - Browser Automation for AI Agents
Two modes of operation:
1. With surf-cli extension (recommended): Full features, works with your existing browser
2. CDP fallback: Zero-config, but requires starting a separate Chrome instance
If /tmp/surf.sock exists (extension installed), all commands route through surf-cli. Otherwise, commands use CDP.
First-Time Setup
Run the sanity check to verify setup or get installation instructions:
./sanity.sh
If any checks fail, the script provides step-by-step instructions. The agent should run this script and guide the user through any failed steps until all checks pass.
Quick Start
Option A: With Extension (Recommended)
One-time setup (see "Extension Setup" below), then:
surf tab.list # See all browser tabs
surf tab.new "https://example.com"
surf read # Page content with element refs (e1, e2...)
surf click e5 # Click element
surf type "hello" --ref e2 # Type into element
surf snap # Screenshot
Option B: CDP Fallback (Zero-config)
surf cdp start # Starts separate Chrome instance
surf go "https://example.com"
surf read
surf click e5
surf cdp stop
Commands
Navigation & Reading
surf go "https://example.com" # Navigate to URL
surf read # Read page with element refs
surf read --filter all # Include all elements (not just interactive)
surf text # Get raw text content only
Element Interaction
surf click e5 # Click element by ref
surf type "hello" # Type text
surf type "query" --submit # Type and press Enter
surf type "text" --ref e3 # Type into specific element
surf key Enter # Press key (Enter, Tab, Escape, etc.)
Screenshots & Scrolling
surf snap # Screenshot to /tmp
surf snap --output /tmp/page.png # Specify output path
surf snap --full # Full page screenshot
surf scroll down # Scroll down
surf scroll up # Scroll up
surf scroll top # Scroll to top
surf scroll bottom # Scroll to bottom
surf wait 2 # Wait 2 seconds
Element References
surf read returns an accessibility tree with stable element refs:
link "Learn more" [e1] href="https://example.com"
button "Submit" [e2] [cursor=pointer]
textbox "Email" [e3] [cursor=pointer]
heading "Welcome" [e4] [level=1]
Use these refs with other commands:
- surf click e1 - Click the link
- surf type "hello" --ref e3 - Type into the textbox
CDP Management
surf cdp start # Start Chrome with CDP (port 9222)
surf cdp start 9223 # Use custom port
surf cdp status # Show status and connection info
surf cdp env # Output export commands for shell
surf cdp stop # Stop Chrome
For Puppeteer/testing integration:
eval "$(surf cdp env)"
# Now BROWSERLESS_DISCOVERY_URL and BROWSERLESS_WS are set
Environment Variables
| Variable | Default | Description |
|---|---|---|
CDP_PORT |
9222 | Chrome DevTools Protocol port |
CHROME_USER_DATA |
/tmp/chrome-cdp-profile | Chrome profile directory |
Architecture
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β surf skill (run.sh) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β β
β βββββββββββββββββββ΄ββββββββββββββββββ β
β β /tmp/surf.sock exists? β β
β βββββββββββββββββββ¬ββββββββββββββββββ β
β YES β β NO β
β βΌ βΌ β
β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β
β β surf-cli extension β β CDP Controller β β
β β (native/cli.cjs) β β (cdp_controller.py) β β
β βββββββββββββ¬ββββββββββββββ βββββββββββββ¬ββββββββββββββ β
β β β β
β βΌ βΌ β
β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ β
β β Unix Socket β Native β β CDP WebSocket β β
β β Host β Extension β β (port 9222) β β
β βββββββββββββ¬ββββββββββββββ βββββββββββββ¬ββββββββββββββ β
β β β β
β ββββββββββββββ¬ββββββββββββββββ β
β βΌ β
β βββββββββββββββββββ β
β β Chrome β β
β βββββββββββββββββββ β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Example: Automate Google Search
surf cdp start
surf go "https://google.com"
surf read
# Output shows: textbox "Search" [e1] ...
surf type "claude ai" --ref e1
surf key Enter
surf wait 2
surf read
# Shows search results with element refs
surf click e3 # Click first result
surf snap # Screenshot
surf cdp stop
Troubleshooting
| Problem | Solution |
|---|---|
| "Cannot connect to CDP" | Run surf cdp start first |
| Chrome not found | Install Google Chrome or Chromium |
| Port already in use | surf cdp stop then surf cdp start |
| Element not found | Run surf read first to get current refs |
| Page not loading | Check URL is valid, try with https:// |
| Empty read output | Page may still be loading - try surf wait 2 |
Extension Setup (One-time)
Important: Google Chrome blocks --load-extension for security. Manual setup required:
-
Build extension:
bash cd /home/graham/workspace/experiments/surf-cli npm install && npm run build -
Load in Chrome:
chrome://extensionsβ Enable Developer Mode β Load unpacked β selectdist/ -
Copy the Extension ID shown (e.g.,
lgamnnedgnehjplhndkkhojhbifgpcdp) -
Install native host:
bash surf install <extension-id> -
Verify:
surf tab.listshould show your browser tabs
The socket at /tmp/surf.sock enables CLI β extension communication.
Extension vs CDP Comparison
| Feature | Extension | CDP |
|---|---|---|
| Basic navigation | β | β |
| Element interaction | β | β |
| Screenshots | β | β |
| Multi-tab management | β | Limited |
| Use existing browser | β | β |
| Zero setup | β | β |
Recommendation: Set up the extension once for best experience. Use CDP for CI/testing environments where you need a fresh browser.
# Supported AI Coding Agents
This skill is compatible with the SKILL.md standard and works with all major AI coding agents:
Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.