the advisor pattern: when your smartest model becomes too expensive to think


by Ray Svitla


something happened to Opus 4.6 this week and 3600 people on r/ClaudeAI noticed at the same time.

it fails the car wash test. that simple reasoning benchmark people use to sanity-check models. Gemma 4 — a model you can run on a $400 GPU in your living room — passes it. the frontier model that costs anthropic presumably billions to train does not.

anthropic’s response wasn’t to fix it. their response was to formalize a new pattern: the advisor strategy.

what the advisor pattern actually is

Opus becomes the architect. it reads the problem, thinks about structure, makes a plan. then it hands execution to Sonnet or Haiku — models that cost a fraction per token and apparently reason more reliably on concrete tasks.

the pitch: near-Opus intelligence at dramatically lower cost. the reality: your most powerful model just became a manager. it delegates. it doesn’t do.
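the split is easy to sketch. a minimal version, with stub functions standing in for the real API calls (the function names and plan format here are illustrative, not anthropic's actual interface):

```python
from typing import Callable

# a "model" is just prompt -> text; in practice these would be
# API calls to an expensive advisor and a cheap executor
Model = Callable[[str], str]

def advise(advisor: Model, task: str) -> list[str]:
    """the expensive model reads the problem once and emits a plan,
    one concrete step per line. it never executes anything."""
    plan = advisor(f"break this task into numbered steps:\n{task}")
    return [line.strip() for line in plan.splitlines() if line.strip()]

def execute(executor: Model, plan: list[str]) -> list[str]:
    """the cheap model runs every step. most tokens are spent here."""
    return [executor(step) for step in plan]

# stubs so the sketch runs without an API key
opus_stub = lambda prompt: "1. parse the input\n2. transform it\n3. write output"
haiku_stub = lambda step: f"done: {step}"

plan = advise(opus_stub, "convert a CSV to JSON")
results = execute(haiku_stub, plan)
# one advisor call, len(plan) executor calls: cost scales with the cheap model
```

the point of the shape: the advisor is called once, the executor is called per step, so the token bill tracks the cheap model.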

if this sounds familiar, congratulations — you’ve worked in corporate.

why this matters more than it looks

the advisor pattern kills the assumption most people build their AI workflows around: one model does everything.

that assumption is everywhere. in how people prompt. in how coding agents work. in how every “just use Claude” tutorial is structured. one context window, one model, one conversation. the personal equivalent of a monolith.

anthropic just told you that’s wrong. not theoretically — they shipped the alternative.

here’s the part nobody’s talking about: if the best workflow is multi-model orchestration, then the value moves from “which model” to “which architecture.” the model becomes interchangeable. the orchestration layer becomes the moat.

the self-hosted angle

this gets interesting when you look at what happened simultaneously on r/LocalLLaMA.

Gemma 4 31B — running locally on consumer hardware — beat lobotomized Opus on basic reasoning. the same week, open-source models reproduced the zero-day vulnerability findings that anthropic said were too dangerous to release with Mythos.

so the frontier model is getting dumber on some tasks, the local model is getting smarter, and the stuff that was “too dangerous” is reproducible by hobbyists.

the advisor pattern works with local models too. Opus as remote advisor, Gemma 4 as local executor. you get the planning quality of a frontier model and the sovereignty of running execution on your own hardware. the interesting part: most of your tokens (and most of your data) stay local.
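one way to sketch the sovereignty claim: only a short task summary ever crosses the network, while the raw data and nearly all the tokens stay on your machine. (the function names and the summarization step are my assumptions, stubbed so this runs offline.)

```python
def remote_advise(summary: str) -> list[str]:
    """the only network call: a frontier model sees a short summary,
    never the data itself. stubbed here instead of a real API call."""
    return ["step 1: validate each record", "step 2: aggregate results"]

def local_execute(step: str, data: list[dict]) -> str:
    """runs against the full dataset on local hardware,
    e.g. a Gemma 4 class model on a consumer GPU (stubbed)."""
    return f"{step} -> processed {len(data)} records locally"

records = [{"id": i, "note": "stays on this machine"} for i in range(100)]

# the advisor sees ~10 words; the executor sees all 100 records
plan = remote_advise("clean and aggregate a table of 100 records")
results = [local_execute(step, records) for step in plan]
```

the design choice that matters is the boundary: whatever you send to `remote_advise` is the only thing that leaves.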

the architecture is the product now

if you’re building a personal AI system — which, if you’re reading this, you probably are — the advisor pattern should change how you think about it.

stop optimizing for “which model.” start optimizing for orchestration.

the questions that matter now:

→ what’s the planning layer vs the execution layer in your workflow?
→ which tasks need frontier intelligence and which just need reliable execution?
→ where does your data live when the smart model plans but the cheap model executes?
→ how do you verify that the executor actually followed the advisor’s plan?

that last one is the sleeper. the advisor pattern introduces a new failure mode: the gap between plan and execution. your Opus plans something nuanced. your Sonnet interprets it literally. the result looks right but isn’t. traditional testing doesn’t catch this because each step passes individually — the failure is in the handoff.
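one way to close that gap is to make the advisor emit a machine-checkable acceptance test alongside each step, so the handoff itself gets tested. (the paired plan format is my sketch, not a shipped feature.)

```python
from typing import Callable

# each plan step carries its own acceptance check, written by the advisor.
# the executor's output is verified per step, not just end-to-end.
Step = tuple[str, Callable[[str], bool]]

plan: list[Step] = [
    ("extract the year from '2024-01-15'", lambda out: out.strip() == "2024"),
    ("uppercase the word 'plan'", lambda out: out == "PLAN"),
]

def literal_executor(instruction: str) -> str:
    """a cheap model that interprets step 1 too literally:
    it returns the whole date instead of just the year."""
    if "year" in instruction:
        return "2024-01-15"   # looks plausible, fails the check
    return "PLAN"

failures = [
    step for step, check in plan
    if not check(literal_executor(step))
]
# failures == ["extract the year from '2024-01-15'"]
# each step "ran" without error -- only the per-step check exposes the drift
```

this is exactly the failure the prose describes: every step completes, the output looks right, and nothing short of a per-handoff check catches it.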

the real question

anthropic didn’t invent multi-model architectures. people have been building them for a year. what anthropic did was normalize the pattern and implicitly admit that single-model workflows have hit a ceiling.

the model is no longer the product. the architecture is the product. the orchestration layer that decides who thinks about what — that’s where the actual intelligence lives now.

your personal AI OS isn’t a model. it’s a system.

act accordingly.


Ray Svitla
stay evolving