claude code + playwright browser automation
by Ray Svitla
Claude Code can control a web browser. not in the “I’ll describe what you should click” way — it literally launches Chromium, navigates pages, fills forms, clicks buttons, reads DOM content, and takes screenshots. through Playwright’s MCP server, Claude gets actual browser hands.
this changes what you can do with Claude Code in ways that aren’t immediately obvious.
setup
Playwright has an official MCP server. add it to your Claude Code MCP config (for a project-scoped setup that is typically a .mcp.json file in the repo root):
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["-y", "@playwright/mcp@latest"]
    }
  }
}
restart Claude Code. you should see browser tools appear: browser_navigate, browser_click, browser_type, browser_take_screenshot, and a dozen or more others (exact names vary a little between server versions).
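if you already have other MCP servers configured, pasting the block above over your file would drop them. a minimal sketch of merging the entry in instead, assuming the config lives in a JSON file such as .mcp.json (the file name and the add_playwright_server helper are assumptions for illustration, not part of Claude Code):

```python
import json
from pathlib import Path

# The server entry from the config shown above.
PLAYWRIGHT_SERVER = {"command": "npx", "args": ["-y", "@playwright/mcp@latest"]}


def add_playwright_server(config_path, extra_args=()):
    """Add or overwrite the "playwright" entry, leaving other servers untouched."""
    path = Path(config_path)
    # Start from the existing config if there is one, otherwise from scratch.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["playwright"] = {
        "command": PLAYWRIGHT_SERVER["command"],
        "args": [*PLAYWRIGHT_SERVER["args"], *extra_args],
    }
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

calling add_playwright_server(".mcp.json") from the repo root would write the entry. extra_args is there for any server flags you want appended after the package name.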
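first time running, Playwright needs a Chromium binary, and the download normally happens without you doing anything. if you’re behind a corporate proxy, this step might fail silently. check the Playwright cache for the browser binary: ~/.cache/ms-playwright/ on Linux, ~/Library/Caches/ms-playwright/ on macOS, %USERPROFILE%\AppData\Local\ms-playwright\ on Windows. if nothing is there, running npx playwright install chromium fetches it manually.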
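checking the cache by hand gets old quickly. a small sketch that resolves the default cache directory per platform and lists what is installed, assuming the standard Playwright locations and the PLAYWRIGHT_BROWSERS_PATH override (the function names are made up for this example):

```python
import os
import sys
from pathlib import Path


def playwright_cache_dir(platform=sys.platform, env=None, home=None):
    """Return the directory where Playwright keeps downloaded browsers."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    override = env.get("PLAYWRIGHT_BROWSERS_PATH")
    # "0" means "install next to the node package"; that case is not handled here.
    if override and override != "0":
        return Path(override)
    if platform == "darwin":
        return home / "Library" / "Caches" / "ms-playwright"
    if platform.startswith("win"):
        return home / "AppData" / "Local" / "ms-playwright"
    return home / ".cache" / "ms-playwright"


def installed_browsers(cache_dir):
    """List browser folders (named like chromium-<revision>) in the cache."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    return sorted(p.name for p in cache_dir.iterdir() if p.is_dir())
```

an empty list from installed_browsers(playwright_cache_dir()) is a strong hint that the download never completed.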
what Claude can actually do with a browser
end-to-end test writing. describe what a user flow should look like — “user logs in, creates a project, invites a team member, checks the dashboard shows the new project” — and Claude writes Playwright tests that execute that flow. it launches the browser, tries the flow, sees what actually happens, and writes tests that match reality rather than assumptions.
this is the workflow that sold me. writing e2e tests from specs means you test what you think happens. writing e2e tests while watching the actual browser means you test what actually happens. the gap between those two is where bugs hide.
visual debugging. Claude takes screenshots at each step and analyzes them. “the button says ‘Submit’ but nothing happens when I click it” — Claude can see the button, try clicking it, check the console for errors, inspect the network tab for failed requests. actual debugging, not guessing.
scraping and data extraction. need to pull structured data from a website that doesn’t have an API? Claude navigates the site, handles pagination, extracts data, and structures it. not the most elegant approach but effective for one-off data pulls.
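once Claude has worked out the page structure, the pagination loop it ends up with tends to look something like the sketch below. it uses Playwright's Python API; the table selector, the "Next" link name, and the scrape_all_pages helper are assumptions about a hypothetical site, not something the server generates for you:

```python
def scrape_all_pages(page, row_selector="table tbody tr", max_pages=50):
    """Collect cell text from every table row, following "Next" until it is gone.

    `page` is a Playwright Page (sync API). max_pages is a safety cap so a
    misbehaving site cannot loop forever.
    """
    rows = []
    for _ in range(max_pages):
        # Read every row currently on the page as a list of cell strings.
        for row in page.locator(row_selector).all():
            rows.append(row.locator("td").all_inner_texts())
        next_link = page.get_by_role("link", name="Next")
        if next_link.count() == 0:
            break  # last page reached
        next_link.click()
        page.wait_for_load_state()  # waits for the "load" event by default
    return rows
```

this assumes classic page-per-click pagination. on a single-page app the rows may re-render after the load event, so you would wait on a specific element instead.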
form filling and workflow automation. tedious multi-step web forms that don’t have APIs — insurance submissions, government portals, vendor onboarding forms. Claude fills them out step by step. I’m not saying this is its highest use. I’m saying it works and people actually do it.
the test-writing workflow in detail
the workflow that produces the best results:
1. start your dev server locally
2. tell Claude: “navigate to localhost:3000 and explore the app”
3. Claude clicks around, takes screenshots, maps out the UI
4. tell Claude: “write e2e tests for the user registration flow”
5. Claude executes the flow in the browser while writing the test
6. the test reflects what actually happened, not what you hoped would happen
step 3 is crucial. Claude building a mental model of your app by actually using it produces better tests than Claude reading your component code and guessing what the UI looks like.
sharp edges
speed. browser automation is slow. every action involves launching or interacting with a real browser. a Claude session that would take seconds with pure code takes minutes with browser interactions. budget time accordingly.
flakiness. real browsers on real pages have timing issues. elements load asynchronously. animations play. modals appear. Claude handles most of this — it waits for elements, retries clicks — but complex SPAs with heavy JavaScript occasionally confuse it.
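the same timing problems show up in the tests Claude writes. Playwright's locators already wait for elements, but for anything outside the page (a background job, a file on disk) polling for a condition is usually sturdier than a fixed sleep. a minimal sketch of that pattern in plain Python; wait_until is a made-up helper, not a Playwright API:

```python
import time


def wait_until(condition, timeout=5.0, interval=0.1):
    """Call `condition` repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)
```

for example, wait_until(lambda: report_path.exists(), timeout=30) blocks until a hypothetical export file shows up, and fails loudly if it never does.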
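headless vs headed. Playwright’s own test runner defaults to headless (no visible browser window), but the Playwright MCP server opens a visible window by default, so you can watch Claude drive the browser. it’s educational the first time and unnecessary after. add --headless to the MCP args once you no longer need to watch.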
authentication. Claude can log into websites, but storing credentials in your MCP config is a bad idea. use environment variables, or better, pre-authenticate once and have Playwright load the saved session (a storage state file holding cookies and local storage) on startup.
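a sketch of the pre-authentication idea using Playwright's Python API: log in once with credentials read from the environment, then save the session to a file (Playwright calls this storage state). the APP_USER and APP_PASSWORD variable names, the form labels, and the post-login URL are all assumptions for a hypothetical app:

```python
import os


def load_credentials(env=None):
    """Read login credentials from environment variables, failing loudly if unset."""
    env = os.environ if env is None else env
    missing = [name for name in ("APP_USER", "APP_PASSWORD") if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
    return env["APP_USER"], env["APP_PASSWORD"]


def save_login_state(page, login_url, state_path="auth.json", env=None):
    """Log in through the UI once and persist cookies and local storage to a file."""
    user, password = load_credentials(env)
    page.goto(login_url)
    page.get_by_label("Email").fill(user)  # hypothetical field labels
    page.get_by_label("Password").fill(password)
    page.get_by_role("button", name="Log in").click()
    page.wait_for_url("**/dashboard")  # hypothetical post-login route
    page.context.storage_state(path=state_path)
    return state_path
```

later Playwright runs can reuse the session with browser.new_context(storage_state="auth.json"). the MCP server's README also lists a storage state option, though it is worth checking --help for your version. either way, keep the file out of version control: it holds live session cookies.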
when not to use it
if you have an API, use the API. browser automation for things that have programmatic access is like using a forklift to carry a sandwich. it works, but you’re missing the point.
if your tests need to run in CI, write the Playwright tests with Claude’s help but run them through your normal test pipeline. don’t put Claude in your CI loop for browser tests — it’s too slow and too expensive for every commit.
if you need pixel-perfect visual testing, use dedicated visual regression tools (Chromatic, Percy). Claude can identify gross visual issues but isn’t a visual diff engine.
the combination that works
Claude Code with Playwright MCP is best for:
→ initial test suite creation (explore app → write tests → hand off to CI)
→ debugging UI issues where you need both code context and visual context
→ one-off automation tasks that don’t justify building a proper tool
→ prototyping browser workflows before building them properly
it’s a development tool, not a production tool. use it to build things faster, then run those things without Claude in the loop.
what’s the browser workflow you’d automate first if Claude could just drive it for you?