agent-browser

Use when the agent needs to drive a real browser to QA a deploy, scrape a logged-in page, or verify a UI change end-to-end with screenshots and DOM access.

Install confidence
curl --create-dirs -fsSL https://skillmake.xyz/i/agent-browser -o ~/.claude/skills/agent-browser/SKILL.md
Pinned content: sha:d8658288f25d952b
Generated with: manual
Source: github.com

The file served at /api/marketplace/agent-browser-d8658288/raw matches this hash. Inspect it before installing.

2,378 chars · ~595 tokens
---
name: agent-browser
description: Use when the agent needs to drive a real browser to QA a deploy, scrape a logged-in page, or verify a UI change end-to-end with screenshots and DOM access.
source: https://github.com/vercel-labs/agent-browser
generated: 2026-05-25T02:43:47.293Z
category: tool
audience: engineers
---

## When to use

- Verifying a deploy by navigating the live site and asserting that key pages, forms, and dialogs work
- Taking annotated or full-page screenshots as evidence in a bug report or pull request
- Filling out forms and clicking through a flow where a plain HTTP request will not work
- Reading the DOM, console errors, and network requests of a page the user is debugging

## Key concepts

### Persistent browser session

A long-lived Chromium instance keeps cookies, tabs, and login state between commands, so multi-step flows survive without re-authentication.
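
As a rough illustration of why a persistent session matters, the shell sketch below visits several pages in a row and snapshots each one. It assumes the CLI is on the PATH as `browser` and accepts `goto` and `snapshot -i` as shown in the API reference. `BROWSER_BIN` and `visit_pages` are hypothetical names introduced here, and the sketch also assumes the CLI exits non-zero when a step fails.

```shell
# Assumption: the CLI is invoked as `browser`, as in this skill's examples.
# BROWSER_BIN is a hypothetical override, handy for dry runs.
BROWSER_BIN="${BROWSER_BIN:-browser}"

# Visit each URL in the same browser session and snapshot it.
# Stops at the first step that reports failure.
visit_pages() {
  local url
  for url in "$@"; do
    "$BROWSER_BIN" goto "$url" || return 1
    "$BROWSER_BIN" snapshot -i || return 1
  done
}
```

After logging in once, a call such as `visit_pages https://app.example.com/dashboard https://app.example.com/settings` (hypothetical paths) should reach both pages without a second login, provided the session is still alive.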

### Snapshot-and-act loop

The agent takes a labeled accessibility snapshot to find interactive elements, then acts on them by reference, instead of guessing selectors.

### Annotated screenshots

Screenshots can be overlaid with element labels so a bug report shows exactly which button or input is in the wrong state.

### Headed handoff

When a flow hits a CAPTCHA or an MFA prompt, the skill can open a visible Chromium window for the user to complete the step, then resume control.

## API reference

```
npx skills add vercel-labs/agent-browser
```

Install the agent-browser skill.

```
browser goto <url>
browser snapshot
browser click <ref>
browser fill <ref> <value>
```

Drive the browser through a flow using references from the latest snapshot.

```
browser goto https://app.example.com/login
browser snapshot -i
browser fill @e2 "user@test.com"
browser click @e4
```
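
When the fill-and-click part of a flow is scripted rather than typed step by step, it can help to stop at the first failing command. The following is a minimal sketch, assuming the `browser` CLI exits with a non-zero status on failure (this skill does not confirm that). `BROWSER_BIN` and `fill_and_submit` are hypothetical names, and the refs are passed in as arguments because they are only known after reading a snapshot.

```shell
# Assumption: the CLI is invoked as `browser`, as in the example above.
# BROWSER_BIN is a hypothetical override, handy for dry runs.
BROWSER_BIN="${BROWSER_BIN:-browser}"

# Fill one field and click a submit control, using refs taken from the
# latest snapshot. The click only runs if the fill succeeded.
fill_and_submit() {
  local field_ref="$1" value="$2" submit_ref="$3"
  "$BROWSER_BIN" fill "$field_ref" "$value" &&
    "$BROWSER_BIN" click "$submit_ref"
}
```

With the refs from the example above, the call would be `fill_and_submit @e2 "user@test.com" @e4`, run after `browser snapshot -i` has shown which refs belong to the email field and the submit button.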

## Gotchas

- References from a snapshot are invalidated after navigation, so re-snapshot before the next click
- Some sites block headless browsers entirely; switch to headed mode when you see anti-bot 403s
- By default, login cookies last only for the current session; export browser state if you need to restore it later
- Full-page screenshots can be huge on long pages, so prefer element-scoped screenshots for bug reports
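
The first gotcha can be turned into a habit by pairing every click with a fresh snapshot. This is a minimal sketch that uses only the `click` and `snapshot -i` commands shown in the API reference. `BROWSER_BIN` and `click_then_snapshot` are hypothetical names, and the non-zero exit status on failure is an assumption.

```shell
# Assumption: the CLI is invoked as `browser`, as in this skill's examples.
# BROWSER_BIN is a hypothetical override, handy for dry runs.
BROWSER_BIN="${BROWSER_BIN:-browser}"

# Click a ref, then immediately take a new snapshot so that any refs
# invalidated by a navigation are replaced before the next action.
click_then_snapshot() {
  local ref="$1"
  "$BROWSER_BIN" click "$ref" || return 1
  "$BROWSER_BIN" snapshot -i
}
```

Calling `click_then_snapshot @e4` leaves the newest snapshot as the last output, so the next ref is always chosen from the current page.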

---
Generated by SkillMake from https://github.com/vercel-labs/agent-browser on 2026-05-25T02:43:47.293Z.
Verify against source before relying on details.

File: ~/.claude/skills/agent-browser/SKILL.md