agent-browser

Name: agent-browser
Rating: 5 (1 review)
Author: itechmeat

by @itechmeat in AI & LLM

# Install this skill:

npx skills add itechmeat/llm-code --skill "agent-browser"

Installs a specific skill from a multi-skill repository.

# Description

Headless browser automation CLI for AI agents. Covers commands, refs, sessions, snapshots, cloud providers, profiles. Keywords: agent-browser, browser automation, refs, snapshot.

# SKILL.md

name: agent-browser
description: "Headless browser automation CLI for AI agents. Covers commands, refs, sessions, snapshots, cloud providers, profiles. Keywords: agent-browser, browser automation, refs, snapshot."
version: "0.8.0"
release_date: "2026-01-26"

Agent Browser

Headless browser automation CLI for AI agents. Fast Rust CLI with Node.js fallback.

Works with: Claude Code, Cursor, GitHub Copilot, OpenAI Codex, Google Gemini, opencode.

Quick Navigation

Topic	Reference
Installation	installation.md
Commands	commands.md
Refs	refs.md
Advanced	advanced.md

When to Use

Automating browser tasks in AI agent workflows
Web scraping with AI-friendly output
Testing web applications with LLM agents
Managing multiple browser sessions with isolated auth

Core Concepts

Refs (Element References)

The snapshot command returns an accessibility tree in which each element has a unique ref such as @e1 or @e2. Refs are:

Deterministic - a ref points to the exact element captured in the snapshot
Fast - no DOM re-query is needed
AI-friendly - LLMs can reliably parse and use refs
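
A minimal sketch of collecting the refs from a text snapshot. It assumes each element line carries a marker of the form [ref=eN]; that line format is an assumption not stated in this document, so check the real output of agent-browser snapshot before relying on it. The helper name list_refs is hypothetical.

```shell
# Hypothetical helper: read snapshot text on stdin, print one @ref per line.
# ASSUMPTION: elements are tagged like [ref=e1]; adjust the pattern if the
# actual snapshot output differs.
list_refs() {
  grep -oE '\[ref=e[0-9]+\]' | sed -E 's/^\[ref=(e[0-9]+)\]$/@\1/'
}
```

Under that assumption, agent-browser snapshot | list_refs would print the refs in the order they appear in the tree, ready to pass to click, fill, or get text.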

Architecture

Client-daemon architecture:

Rust CLI - parses commands, communicates with daemon
Node.js Daemon - manages Playwright browser instance

The daemon starts automatically and persists between commands.

Quick Example

# Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot                    # Get accessibility tree with refs
agent-browser click @e2                   # Click by ref from snapshot
agent-browser fill @e3 "user@example.com" # Fill input by ref
agent-browser get text @e1                # Get text by ref
agent-browser screenshot page.png         # Save screenshot
agent-browser close

AI Workflow Pattern

Recommended workflow for AI agents:

# 1. Navigate and get snapshot
agent-browser open example.com
agent-browser snapshot -i --json   # AI parses tree and refs

# 2. AI identifies target refs from snapshot

# 3. Execute actions using refs
agent-browser click @e2
agent-browser fill @e3 "input text"

# 4. Get new snapshot if page changed
agent-browser snapshot -i --json
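
Steps 3 and 4 can be folded into one helper so that every action is followed by a fresh snapshot. This is a sketch rather than part of the CLI: the function name act_then_snapshot is hypothetical, and it uses only the commands and flags shown above.

```shell
# Hypothetical helper: run one agent-browser action, then print a fresh
# snapshot so the agent never reuses refs from before the action.
act_then_snapshot() {
  # Stop early (keeping the failing exit status) if the action fails.
  agent-browser "$@" || return
  agent-browser snapshot -i --json
}
```

For example, act_then_snapshot click @e2 runs the click and then emits the new tree for the agent to parse.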

Headed Mode (Debugging)

agent-browser open example.com --headed

JSON Output

Use --json for machine-readable output:

agent-browser snapshot --json
agent-browser get text @e1 --json
agent-browser is visible @e2 --json

Critical Prohibitions

Do not use CSS/XPath selectors when refs are available (use @e1, @e2, etc.)
Do not forget to close sessions when done
Do not assume element positions without taking a fresh snapshot
Do not use old refs after page navigation or content changes (re-snapshot)
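
One way to avoid leaving a session open is to run the work inside a wrapper that always calls close, even when a step fails. The sketch below is an illustration, not a feature of agent-browser; the name with_browser_session is hypothetical.

```shell
# Hypothetical wrapper: run a command, then always close the browser.
# The subshell keeps the EXIT trap from leaking into the calling shell.
with_browser_session() {
  (
    trap 'agent-browser close' EXIT
    "$@"
    exit $?   # preserve the wrapped command's exit status
  )
}
```

A script could then call with_browser_session ./scrape.sh (scrape.sh being a placeholder for your own script) and rely on close running whether the script succeeds or not.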

Common Commands

# Navigation
agent-browser open <url>
agent-browser back
agent-browser forward
agent-browser reload
agent-browser close

# Interaction
agent-browser click <sel>
agent-browser fill <sel> <text>
agent-browser press <key>
agent-browser hover <sel>
agent-browser select <sel> <val>
agent-browser download <sel> <path>  # v0.7+

# Info
agent-browser get text <sel>
agent-browser get url
agent-browser get title
agent-browser is visible <sel>

# Snapshots & Screenshots
agent-browser snapshot -i --json
agent-browser screenshot [path]

# Supported AI Coding Agents

This skill is compatible with the SKILL.md standard and works with all major AI coding agents:

⚡ Amp 🚀 Antigravity 🤖 Claude Code 🦀 Clawdbot 📝 Codex ▶️ Cursor 🤖 Droid 💎 Gemini CLI 🐙 GitHub Copilot 🪿 Goose 📊 Kilo Code 🔧 Kiro CLI 💻 OpenCode 🦘 Roo Code 🌲 Trae 🏄 Windsurf

Learn more about the SKILL.md standard and how to use these skills with your preferred AI coding agent.
