Matt Shumer's Browser Agent Vision

Matt Shumer is the co-founder and CEO of OthersideAI, the company behind HyperWrite. He founded the company in 2020 and has since built one of the most-used AI writing assistants. He also maintains popular open-source projects like gpt-prompt-engineer (9.6k GitHub stars) and invests in AI startups through Shumer Capital.

Shumer builds AI agents that click buttons, fill forms, and complete web tasks autonomously. He demonstrated the concept by having an AI navigate to Domino’s and order a pizza. For a personal operating system, the appeal is that you can delegate entire categories of web-based busywork. See Browser Agents for the broader concept.

The Agent Architecture

HyperWrite’s Agent-1 combines:

Component        Function
Language Model   Instructions, reasoning
Computer Vision  Identifying web page elements
Action Model     Deciding what to click, type, scroll
Verification     Confirming action success

You describe a task in natural language, then watch the AI complete it in your browser.
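
The four components can be read as one loop: observe the page, decide on an action, perform it, and verify the result. The following is a minimal sketch of that loop, not HyperWrite's implementation. The `Action` type, the `run_agent` function, and the four callbacks are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str        # "click", "type", or "scroll"
    target: str      # label of the page element to act on
    text: str = ""   # text to enter, used only by "type"


def run_agent(
    task: str,
    observe: Callable[[], List[str]],
    decide: Callable[[str, List[str], List[Action]], Optional[Action]],
    act: Callable[[Action], None],
    verify: Callable[[Action], bool],
    max_steps: int = 20,
) -> List[Action]:
    """Repeat observe -> decide -> act -> verify until decide returns None."""
    history: List[Action] = []
    for _ in range(max_steps):
        elements = observe()                      # computer vision: what is on the page
        action = decide(task, elements, history)  # language model + action model
        if action is None:                        # the model considers the task done
            break
        act(action)
        if not verify(action):                    # verification: did the action succeed?
            raise RuntimeError(f"action failed: {action}")
        history.append(action)
    return history
```

Passing the four stages in as callbacks keeps the loop testable with stubs, and `max_steps` bounds a run where the model never decides it is finished.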

Real-World Use Cases

Gmail Organization

User: "Archive all newsletters older than 30 days"

Agent:
1. Opens Gmail
2. Searches for common newsletter senders
3. Filters by date (older than 30 days)
4. Selects all matching emails
5. Archives them
6. Reports: "Archived 247 newsletters"
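
Steps 2 and 3 could collapse into a single Gmail search string. The sketch below builds one from a sender list using Gmail's `in:`, `from:`, and `older_than:` search operators. The function name and the example addresses are made up for illustration, and the article does not say the agent searches this way.

```python
from typing import List


def newsletter_query(senders: List[str], older_than_days: int = 30) -> str:
    """Build a Gmail search string for inbox mail from the given senders
    that is older than the given number of days."""
    if not senders:
        raise ValueError("at least one sender is required")
    sender_clause = " OR ".join(senders)
    return f"in:inbox from:({sender_clause}) older_than:{older_than_days}d"


query = newsletter_query(["news@example.com", "digest@example.org"])
print(query)
# in:inbox from:(news@example.com OR digest@example.org) older_than:30d
```

Restricting the search to `in:inbox` matters because archiving in Gmail means removing a message from the inbox, so already-archived mail should not be counted.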

Flight Booking Research

User: "Find flights to Tokyo for March 15-22,
       under $1200, prefer direct"

Agent:
1. Opens Google Flights
2. Enters dates and destination
3. Filters by price and stops
4. Captures top 5 options
5. Returns comparison table with prices, times, airlines

LinkedIn Search

User: "Find 10 product managers in SF with AI experience"

Agent:
1. Opens LinkedIn search
2. Applies filters: title, location, keywords
3. Scrolls through results
4. Extracts relevant profiles
5. Returns list with names, current roles, profile links

Agent Trainer: Teaching by Demonstration

Agent Trainer lets the AI learn a new workflow by watching you perform it once.

1. Record yourself doing the task once
2. Agent learns the pattern
3. Agent replicates on new data
4. Human corrects mistakes

Example: Processing expense reports
1. Open email, find attachment
2. Download PDF
3. Open accounting software, click "New Expense"
4. Extract values from PDF into form
5. Submit and archive email

The agent learns the pattern and handles future reports.
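
One way to picture "learn the pattern, replicate on new data" is a recorded step list with placeholders that get filled in for each new report. This is only a sketch of the idea. How Agent Trainer represents a workflow internally is not described here, and `RECORDED_STEPS`, `instantiate`, and the field names are hypothetical.

```python
from typing import Dict, List

# The demonstration, stored once, with the per-report values abstracted out.
RECORDED_STEPS: List[str] = [
    "open email from {sender}",
    "download attachment {filename}",
    'open accounting software and click "New Expense"',
    "enter amount {amount} and vendor {vendor}",
    "submit and archive email",
]


def instantiate(steps: List[str], values: Dict[str, str]) -> List[str]:
    """Fill the recorded pattern with the values for one new report.

    A missing value raises KeyError, which is the point where a human
    would step in and correct the run.
    """
    return [step.format(**values) for step in steps]


plan = instantiate(
    RECORDED_STEPS,
    {"sender": "billing@example.com", "filename": "march.pdf",
     "amount": "42.50", "vendor": "Acme"},
)
```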

Safety Model

Human-in-the-loop: Critical actions (purchases, deletes, sends) require explicit approval before execution.

Sandboxed execution: The agent operates in an isolated browser profile with no access to saved passwords. Sessions are isolated from each other, and actions are recorded in audit logs.

Reversibility checks: Irreversible actions (purchase, delete, send, submit, cancel subscription, close account) require explicit confirmation.
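
The confirmation rule can be sketched as a gate that every action passes through before it runs. The set of action kinds comes from the list above. The function name, the callback signatures, and the return values are assumptions made for illustration, not HyperWrite's API.

```python
from typing import Callable

# Action kinds that cannot be undone once executed.
IRREVERSIBLE = {
    "purchase", "delete", "send", "submit",
    "cancel_subscription", "close_account",
}


def execute(kind: str, run: Callable[[], None], approve: Callable[[str], bool]) -> str:
    """Run an action, asking for approval first if it is irreversible.

    Returns "done" if the action ran, or "blocked" if approval was refused.
    """
    if kind in IRREVERSIBLE and not approve(kind):
        return "blocked"
    run()
    return "done"


log = []
# A reversible action runs without asking; a refused purchase never runs.
first = execute("scroll", lambda: log.append("scroll"), approve=lambda k: False)
second = execute("purchase", lambda: log.append("purchase"), approve=lambda k: False)
```

After this runs, `log` contains only `"scroll"`: the purchase was blocked before its `run` callback was called.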

HyperWrite Chrome Extension

Install the HyperWrite extension from the Chrome Web Store, sign in, and pin it to the toolbar.

Usage: Click icon → type/speak request → approve confirmations → review results

Shortcut: Cmd+Shift+H (Mac) / Ctrl+Shift+H (Windows)

Integration with Personal OS

Browser agents compose with code-based AI:

Workflow example:

  1. Claude Code: “Research competitors for my product”
  2. Claude identifies 5 competitors, needs pricing
  3. Delegates to HyperWrite: “Visit these sites, extract pricing into a table”
  4. HyperWrite navigates, extracts data
  5. Claude Code receives data, continues analysis

Good Candidates for Automation

Task                   Time Saved
Email triage           30 min/day
Social media posting   15 min/day
Expense logging        1 hour/week
Competitor monitoring  2 hours/week
Report generation      1 hour/week
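
The table mixes per-day and per-week figures, so the weekly total depends on how many days the daily tasks recur. The short calculation below adds them up, assuming a five-day work week. That assumption is not stated in the table, and the variable names are illustrative.

```python
WORKDAYS_PER_WEEK = 5  # assumption: daily tasks recur on workdays only

daily_minutes = {"Email triage": 30, "Social media posting": 15}
weekly_hours = {"Expense logging": 1, "Competitor monitoring": 2, "Report generation": 1}


def weekly_hours_saved(workdays: int = WORKDAYS_PER_WEEK) -> float:
    """Convert the daily estimates to hours per week and add the weekly ones."""
    from_daily = sum(daily_minutes.values()) * workdays / 60
    return from_daily + sum(weekly_hours.values())


print(weekly_hours_saved())   # 7.75
print(weekly_hours_saved(7))  # 9.25
```

Under that assumption the table adds up to 7.75 hours per week, or 9.25 hours if the daily tasks also recur on weekends.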

Not Ready Yet

These require human judgment: negotiation, creative writing, relationship-building, complex multi-step research, real-time adaptation.

Building Workflows

Audit one week: What sites do you visit repeatedly? What forms do you fill often? What copy-paste patterns exist?

Document the task:

Task: Submit weekly time report
1. Open timetracking.company.com
2. Click "New Entry", select current week
3. Enter hours per project for each day
4. Add overtime notes
5. Submit and screenshot confirmation

Start small: Read-only research, low-stakes data entry, reversible actions. Graduate to purchases with approval and communications with confirmation.

Define approval levels:
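
As an illustrative sketch, approval levels could be written as a small policy table. The three level names and the category mapping are assumptions based on the "start small" guidance and the "Not Ready Yet" list, not a HyperWrite feature.

```python
from enum import Enum


class Approval(Enum):
    AUTO = "run without asking"
    CONFIRM = "ask before running"
    MANUAL = "leave to a human"


# Hypothetical mapping from task category to approval level.
POLICY = {
    "read-only research": Approval.AUTO,
    "low-stakes data entry": Approval.AUTO,
    "purchase": Approval.CONFIRM,
    "communication": Approval.CONFIRM,
    "negotiation": Approval.MANUAL,
}


def approval_for(category: str) -> Approval:
    """Look up the level for a category; unknown categories default to CONFIRM."""
    return POLICY.get(category, Approval.CONFIRM)
```

Defaulting unknown categories to `CONFIRM` is the conservative choice: a task nobody has classified yet gets a human check instead of running silently.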

Getting Started

Week 1: Install HyperWrite. Try 3 simple tasks (search, data extraction, navigation). Document what works and what fails.

Week 2: Pick one high-frequency task. Document exact steps. Train the agent. Run with supervision for one week.

Week 3: Add one new automated task. Adjust based on failures.

The goal: automate repetitive work so you have energy for interesting work.

