Browser Automation: Playwright vs Puppeteer vs browser-use
Three tools dominate browser automation for AI agents. Each solves different problems.
Quick Comparison
| Tool | Language | Best For | AI-Native |
|---|---|---|---|
| Playwright | JS/TS, Python, Java, .NET | Testing, scraping | No |
| Puppeteer | JavaScript | Chrome automation | No |
| browser-use | Python | AI agents | Yes |
Playwright
Microsoft’s browser automation and testing framework. Drives Chromium, WebKit, and Firefox from one API.
import { chromium } from 'playwright';
const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
// click() and fill() auto-wait for the element to be actionable
await page.click('text=Login');
await page.fill('#email', 'user@example.com');
await page.screenshot({ path: 'screenshot.png' });
await browser.close();
Strengths:
- Cross-browser support built in
- Auto-wait for elements (no manual sleep calls)
- Trace viewer for debugging
- Parallel test execution
Weaknesses:
- No LLM integration out of the box
- Selectors require knowing the page structure
- Heavy for simple tasks
Puppeteer
Google’s Node.js library for driving Chrome over the DevTools Protocol. It popularized headless Chrome automation.
import puppeteer from 'puppeteer';
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
// Evaluate JS in browser context
const title = await page.evaluate(() => document.title);
console.log(title);
await page.pdf({ path: 'page.pdf' });
await browser.close();
Strengths:
- Tight Chrome integration
- PDF generation
- Network interception
- Smaller than Playwright
Weaknesses:
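- Chrome-first: Firefox is supported, but there is no WebKit
- Less automatic waiting than Playwright; the classic API relies on explicit waitForSelector calls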
- Same selector problem as Playwright
browser-use
Python library built for AI agents, with 77k+ GitHub stars. An LLM reads the page (extracted DOM elements, optionally with screenshots for vision models) and decides which action to take next.
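import asyncio
from browser_use import Agent
# Newer browser-use releases ship their own chat model wrappers; check the docs for your version
from langchain_openai import ChatOpenAI

async def main():
    agent = Agent(
        task="Go to google.com and search for 'browser automation'",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    await agent.run()

# A plain script has no top-level await, so start the event loop explicitly
asyncio.run(main())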
Strengths:
- Natural language task descriptions
- Vision-based element detection
- Works without knowing page structure
- Built for agent loops
Weaknesses:
- Slower than direct selectors
- Token costs from vision calls
- Less precise than explicit selectors
When to Use Each
Playwright: You know the page structure. You need reliability and speed. Testing or scraping with predictable selectors.
Puppeteer: Chrome-only is fine. You want minimal dependencies. PDF generation matters.
browser-use: Your agent needs to handle arbitrary websites. You can’t hardcode selectors. Speed matters less than flexibility.
Hybrid Approach
Combine them. Use browser-use for discovery, then extract selectors for Playwright:
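import json
# First: the agent explores the page and reports selectors as JSON in its final answer
agent = Agent(
    task='Find the login form. Reply with JSON: {"email": "<css selector>", "password": "<css selector>"}',
    llm=ChatOpenAI(model="gpt-4o"),
)
history = await agent.run()
selectors = json.loads(history.final_result())  # assumes valid JSON came back; validate before trusting it
# Later: Playwright uses those selectors directly
await page.fill(selectors['email'], email)
await page.fill(selectors['password'], password)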
This gives you AI flexibility during development and deterministic speed in production.
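One way to wire this up is to persist whatever selectors the agent reports and only re-run discovery when a selector is missing from the cache. The sketch below is a minimal, standard-library-only illustration. `SelectorCache`, `get_or_discover`, and the `discover` callback are hypothetical names, not part of browser-use or Playwright. In practice `discover` would wrap an agent run, and the returned selector would be passed to `page.fill` or `page.click`.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict


class SelectorCache:
    """Stores selectors found during discovery so later runs can skip the agent."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self._data: Dict[str, str] = {}
        if path.exists():
            self._data = json.loads(path.read_text())

    def get_or_discover(self, name: str, discover: Callable[[str], str]) -> str:
        """Return the cached selector, calling `discover` only on a cache miss."""
        if name not in self._data:
            self._data[name] = discover(name)
            self.path.write_text(json.dumps(self._data, indent=2))
        return self._data[name]


# Hypothetical stand-in for an agent run; records how often it is called.
calls = []


def fake_discover(name: str) -> str:
    calls.append(name)
    return f"#{name}"


cache_file = Path(tempfile.mkdtemp()) / "selectors.json"
cache = SelectorCache(cache_file)
first = cache.get_or_discover("email", fake_discover)   # cache miss: runs discovery
second = cache.get_or_discover("email", fake_discover)  # cache hit: no discovery

# A fresh instance reads the same file, so later runs never call the agent.
reloaded = SelectorCache(cache_file).get_or_discover("email", fake_discover)
```

The cache makes the expensive step happen once per selector. If a cached selector stops matching after a site redesign, deleting its entry forces rediscovery on the next run.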
What You Can Steal
From Playwright:
- The codegen tool records your actions and generates code
- Trace viewer shows exactly what happened in failed runs
- Auto-waiting logic you can copy for custom tools
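The core idea behind auto-waiting can be approximated with a polling loop: keep checking a condition until it holds or a deadline passes. This is a simplified sketch, not Playwright's actual implementation, which runs several actionability checks before each action. `wait_until` and its parameters are hypothetical names.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0, interval: float = 0.05) -> None:
    """Poll `condition` until it returns True; raise TimeoutError after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Simulated element that becomes "visible" on the third check.
checks = {"count": 0}


def element_visible() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


wait_until(element_visible, timeout=1.0, interval=0.01)

# A condition that never holds hits the deadline instead of hanging forever.
try:
    wait_until(lambda: False, timeout=0.05, interval=0.01)
    timed_out = False
except TimeoutError:
    timed_out = True
```

Using a deadline rather than a fixed number of sleeps keeps the total wait bounded even when each check is slow.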
From Puppeteer:
- DevTools Protocol access for low-level browser control
- Network request interception patterns
- PDF rendering pipeline
From browser-use:
- The agent loop structure (observe → think → act)
- How they chunk pages for vision models
- Retry strategies for flaky web elements
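The observe → think → act loop and a retry wrapper can be sketched without any browser at all. Everything below is illustrative: `run_agent_loop`, `with_retries`, and the toy `FakePage` are hypothetical names and do not reflect browser-use's internals. In a real agent, the observe step would capture the DOM or a screenshot and the think step would call an LLM.

```python
import time
from typing import Callable, List, Optional


def with_retries(action: Callable[[], None], attempts: int = 3, base_delay: float = 0.01) -> None:
    """Run `action`, retrying with exponential backoff when it raises RuntimeError."""
    for attempt in range(attempts):
        try:
            action()
            return
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure
            time.sleep(base_delay * (2 ** attempt))


class FakePage:
    """Toy page whose button click fails once, then succeeds."""

    def __init__(self) -> None:
        self.clicked = False
        self._failures_left = 1

    def click_button(self) -> None:
        if self._failures_left > 0:
            self._failures_left -= 1
            raise RuntimeError("element not ready")
        self.clicked = True


def run_agent_loop(page: FakePage, max_steps: int = 5) -> List[str]:
    """Observe the page, choose the next action, execute it; stop when done."""
    log: List[str] = []
    for _ in range(max_steps):
        # Observe: read the current state of the page.
        state = "done" if page.clicked else "button visible"
        # Think: decide what to do next (an LLM call in a real agent).
        next_action: Optional[str] = None if state == "done" else "click"
        if next_action is None:
            log.append("finish")
            break
        # Act: execute the action, retrying if the element is flaky.
        with_retries(page.click_button)
        log.append(next_action)
    return log


page = FakePage()
steps = run_agent_loop(page)
```

The `max_steps` cap is the important design choice here: it bounds the number of actions (and, in a real agent, the number of model calls) when the goal is never reached.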
Related
- Browser Agents for the theory behind AI browser control
- Parallel Sessions for running multiple browsers simultaneously
Next: Building Browser Agents