Gregor Zunic's Browser Use Framework

Gregor Zunic is a software developer and startup founder based in San Francisco. He co-founded Browser Use with Magnus Muller, creating the most popular open-source library for AI browser automation. The project hit 75,000 GitHub stars within months of launch and raised $17 million from Felicis, Paul Graham, and Y Combinator (W25 batch).

Browser Use converts website interfaces into structured text that AI models can process reliably. Instead of relying on brittle screenshot-based approaches, it parses HTML and visual elements together, letting agents click buttons, fill forms, and navigate sites with human-like precision.

Background

The Pivot Story

In July 2024, Zunic quit his SEO startup despite growing revenue. From his LinkedIn post:

“I woke up every day not ready to grind.”

He was building something that worked financially but felt pointless. The SEO tool was sales-heavy and technically straightforward. He wanted something ambitious.

After quitting, he and Muller experimented with web scraping. They noticed AI agents struggled with browser interactions. Existing tools were fragile and required constant maintenance. They built a prototype in five weeks. It worked better than expected.

Browser Use launched on GitHub in late 2024. Within three months, it became the most-starred open-source web agent project.

How Browser Use Works

The library combines visual analysis with HTML parsing:

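import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI

async def main():
    # The task is plain language; the LLM decides each browser action.
    agent = Agent(
        task="Find the cheapest flight from SF to NYC next Friday",
        llm=ChatOpenAI(model="gpt-4o"),
    )
    # run() is a coroutine, so it has to be awaited inside an event loop.
    return await agent.run()

result = asyncio.run(main())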

The agent observes the page, reasons about what to do, then acts (click, type, scroll). This loop repeats until the task completes or fails.
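To make the observe, reason, act loop concrete, here is a minimal sketch that runs without a browser or an LLM. Everything in it is hypothetical and for illustration only: `FakePage` stands in for a real page, `scripted_policy` stands in for the model call, and `run_agent` is not Browser Use's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # hypothetical element label
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakePage:
    """Stand-in for a browser page with a search box and a submit button."""
    query: str = ""
    submitted: bool = False

    def observe(self) -> str:
        # A real agent would return structured page text here.
        return f"query={self.query!r} submitted={self.submitted}"

    def apply(self, action: Action) -> None:
        if action.kind == "type" and action.target == "search":
            self.query = action.text
        elif action.kind == "click" and action.target == "submit":
            self.submitted = True


def scripted_policy(observation: str) -> Action:
    """Stand-in for the LLM: maps an observation to the next action."""
    if "query=''" in observation:
        return Action("type", "search", "SF to NYC")
    if "submitted=False" in observation:
        return Action("click", "submit")
    return Action("done")


def run_agent(page, decide, max_steps: int = 10):
    """Repeat observe -> decide -> act until done or the step budget runs out."""
    history = []
    for _ in range(max_steps):
        observation = page.observe()
        action = decide(observation)
        history.append(action)
        if action.kind == "done":
            return True, history
        page.apply(action)
    return False, history  # budget exhausted: the task failed


page = FakePage()
completed, history = run_agent(page, scripted_policy)
print(completed, [a.kind for a in history])
```

The step budget is the part worth noticing: without `max_steps`, a confused policy would loop forever instead of failing cleanly.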

Key technical decisions:

| Choice | Rationale |
| --- | --- |
| HTML + vision | Vision alone is unreliable; HTML alone misses dynamic content |
| Self-healing selectors | Adapts when page layouts change |
| LLM-agnostic | Works with GPT-4, Claude, Gemini, or local models |
| Stealth mode | Bypasses bot detection for legitimate automation |
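The HTML half of the first choice can be illustrated with a rough sketch of turning a page into indexed, structured text that a model could refer to ("click element 1"). This uses only Python's standard `html.parser`; the class and function names are made up for this example, it ignores the vision side entirely, and it is not how Browser Use itself extracts elements.

```python
from html.parser import HTMLParser

INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
VOID_TAGS = {"input"}  # tags with no closing tag and no inner text


class InteractiveElementExtractor(HTMLParser):
    """Collects interactive elements in document order."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # element whose inner text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        element = {"tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(element)
        self._open = None if tag in VOID_TAGS else element

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def page_to_text(html: str) -> str:
    """Render interactive elements as one indexed line each."""
    parser = InteractiveElementExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for index, el in enumerate(parser.elements):
        # Prefer visible text, then fall back to placeholder or name.
        label = (
            " ".join(el["text"].split())
            or el["attrs"].get("placeholder")
            or el["attrs"].get("name")
            or ""
        )
        lines.append(f"[{index}] <{el['tag']}> {label}".rstrip())
    return "\n".join(lines)


sample = (
    '<form><input name="q" placeholder="Search flights">'
    '<button type="submit"> Search </button></form>'
    '<a href="/deals">Deals</a>'
)
print(page_to_text(sample))
# [0] <input> Search flights
# [1] <button> Search
# [2] <a> Deals
```

The numeric index gives the model a stable handle for each element, so its reply can name a target without guessing pixel coordinates.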

Open Source Strategy

Browser Use is MIT-licensed and fully open-source. Zunic also offers a cloud version at $30/month for users who don’t want to manage infrastructure.

This mirrors the pattern from other successful AI infrastructure projects: give away the core library, charge for convenience. The open-source version has 75k+ stars and millions of downloads. The cloud version handles authentication, proxies, and scaling.

The library powers Manus, one of the first viral AI agent demos that showed agents completing complex multi-step tasks autonomously.

Technical Philosophy

Zunic’s approach to browser automation:

Combine modalities. Pure vision models hallucinate button locations. Pure HTML parsing misses JavaScript-rendered content. Use both together.

Recover from errors. Agents will click wrong buttons and navigate to dead ends. Build retry logic and state recovery into the core library.

Stay model-agnostic. Today’s best model won’t be tomorrow’s. Abstract the LLM layer so users can swap providers.

Optimize for developers. A basic agent takes only a few lines of Python: describe the task, pick a model, and call run(). Advanced configuration is available but not required.
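The "recover from errors" and "stay model-agnostic" principles fit naturally into one small sketch. The names below (`ChatModel`, `ActionFailed`, `run_step_with_retry`, and the two toy models) are assumptions invented for this example, not part of the Browser Use API; the point is only the shape: the retry loop depends on an interface, not on a provider.

```python
from typing import Callable, Protocol


class ChatModel(Protocol):
    """Any provider works as long as it exposes complete()."""

    def complete(self, prompt: str) -> str: ...


class ActionFailed(Exception):
    """Raised when a browser action does not have the intended effect."""


def run_step_with_retry(
    model: ChatModel,
    perform: Callable[[str], None],
    restore: Callable[[], None],
    prompt: str,
    max_attempts: int = 3,
) -> int:
    """Ask the model for a command, retrying with error feedback on failure.

    Returns the number of attempts that were needed.
    """
    errors = []
    for attempt in range(1, max_attempts + 1):
        feedback = f"\nPrevious errors: {errors}" if errors else ""
        command = model.complete(prompt + feedback)
        try:
            perform(command)
            return attempt
        except ActionFailed as exc:
            errors.append(str(exc))
            restore()  # e.g. navigate back to the last known-good page
    raise ActionFailed(f"gave up after {max_attempts} attempts: {errors}")


class FlakyModel:
    """Toy provider: picks the wrong button until it sees an error."""

    def complete(self, prompt: str) -> str:
        return "click submit" if "Previous errors" in prompt else "click cancel"


class SteadyModel:
    """Toy provider: always picks the right button."""

    def complete(self, prompt: str) -> str:
        return "click submit"


performed, restored = [], []


def perform(command: str) -> None:
    performed.append(command)
    if command != "click submit":
        raise ActionFailed(f"{command!r} led to a dead end")


def restore() -> None:
    restored.append("back")


# Swapping providers requires no change to the retry loop.
print(run_step_with_retry(FlakyModel(), perform, restore, "Submit the form"))
print(run_step_with_retry(SteadyModel(), perform, restore, "Submit the form"))
```

Feeding earlier errors back into the prompt is what lets a second attempt differ from the first; a bare retry would simply repeat the same mistake.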

Key Takeaways

| Principle | Implementation |
| --- | --- |
| Quit when you’re not excited | Left a growing startup to find better work |
| Build fast, validate faster | 5-week prototype became a company |
| Open source builds trust | MIT license, public development |
| Solve infrastructure problems | Let others build agents; you handle the browser |
