agents need infrastructure, not just models

░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

   ┌─────────────────────────────────────────┐
   │                                         │
   │   tools ────┐                           │
   │             │                           │
   │   memory ───┼──→ [ INFRASTRUCTURE ]     │
   │             │                           │
   │   runtime ──┘                           │
   │                                         │
   │   (models are commodities now)          │
   │                                         │
   └─────────────────────────────────────────┘

░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░

today

OpenCLI turned every website into a CLI command. ByteDance shipped production harnesses for multi-hour work. dorabot evolved into a 24/7 persistent coworker. Shannon hit 96% exploit success autonomously. Qwen's flagship runs on $2K desktops. miniclaw-os built cognitive primitives for agent brains. nobody's talking about better models anymore. everyone's building infrastructure.


■ signal 1 — OpenCLI: make any website, app, or binary your CLI

strength: ■■■■■

OpenCLI turned every website, desktop app, and binary into standardized CLI commands your agent can discover via AGENTS.md. 6,338 stars in 24 hours. not browser automation (fragile). not API wrappers (99% of tools don’t have APIs). just: stable CLI, standardized interface, agent-discoverable. when Figma, Notion, Linear, Slack — all the tools agents currently can’t touch — become as easy to control as curl, the tooling surface explodes.

why it matters: the bottleneck isn’t model intelligence. it’s tool access. most valuable tools don’t have APIs. OpenCLI solved that: universal tool abstraction, AGENTS.md-native. the “every tool becomes agent-native” inflection point.
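the mechanics are simple enough to sketch. a toy Python version of the pattern: a manifest lists stable CLI invocations, the agent parses it and shells out like it would to curl. the manifest format and tool names below are invented for illustration, not OpenCLI's actual spec.

```python
import subprocess

# hypothetical AGENTS.md-style manifest. the format and the
# figma-cli / linear-cli commands are illustrative assumptions.
AGENTS_MD = """\
## tools
- figma-cli export --file <id> --format png
- linear-cli issue create --title <title>
"""

def discover_tools(manifest: str) -> list:
    """extract tool invocations from the manifest's bullet list."""
    return [line.lstrip("- ").strip()
            for line in manifest.splitlines()
            if line.startswith("- ")]

def run_tool(argv: list) -> str:
    """invoke a discovered tool like any other CLI command."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout

tools = discover_tools(AGENTS_MD)
print(tools[0])  # figma-cli export --file <id> --format png
```

the point of the abstraction: once a tool is a stable CLI string in a manifest, the agent needs no per-tool integration code, just subprocess access.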

source: GitHub trending/all (6,338 stars, 47 comments)
URL: https://github.com/jackwener/opencli


■ signal 2 — deer-flow: ByteDance’s production harness for multi-hour expert work

strength: ■■■■■

ByteDance open-sourced deer-flow: a SuperAgent harness for tasks that take “minutes to hours.” 4,346 stars. tagline: “researches, codes, creates. with sandboxes, memories, tools, skills, subagents and message gateway.” core loop: research → plan → code → verify → iterate. built-in memory layer, skill library, multi-agent orchestration. not “write a function.” not “fix this bug.” research papers, full features, complex migrations.

why it matters: most harnesses optimize for speed. deer-flow optimizes for depth. when the target is multi-hour expert-level work, you need memory persistence, skill reuse, subagent coordination. ByteDance shipping it open-source means multi-hour capability is no longer proprietary. the “agents as researchers” pattern went mainstream.
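the loop's shape is sketchable. a toy Python version of research → plan → code → verify → iterate, with stub functions standing in for model and tool calls. nothing here is deer-flow's real API; the point is the harness structure: memory persists across iterations, and verification failures feed back as context instead of getting discarded.

```python
# hypothetical stubs standing in for model/tool calls:
def research(task): return f"notes on {task}"                          # stub
def make_plan(task, memory): return f"plan#{len(memory)} for {task}"   # stub
def execute(plan): return f"artifact from {plan}"                      # stub
def verify(artifact): return True, "looks good"                        # stub: always passes

def run_task(task: str, max_iters: int = 5) -> dict:
    memory = [research(task)]         # gather context once up front
    for i in range(max_iters):
        plan = make_plan(task, memory)
        artifact = execute(plan)
        ok, feedback = verify(artifact)
        memory.append(feedback)       # failures become context, not noise
        if ok:
            return {"artifact": artifact, "iterations": i + 1}
    raise RuntimeError("verification never passed")
```

the verify gate before exit is what separates a multi-hour harness from a one-shot completion: the loop only terminates on checked output, not on the first plausible answer.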

source: GitHub trending/all (4,346 stars, Mar 25 re-trend)
URL: https://github.com/bytedance/deer-flow


■ signal 3 — dorabot: from IDE agent to 24/7 persistent coworker

strength: ■■■■□

dorabot re-trending: macOS app for 24/7 AI agents with memory, scheduled tasks, browser use, plus WhatsApp, Telegram, Slack integration. 207 stars (Mar 25 re-trend). the evolution: Feb 18 = IDE agent, Mar 6 = macOS agent, Mar 25 = 24/7 persistent agent with messaging as primary interface. not a chat wrapper. not a plugin. full agent runtime with communication layer.

why it matters: most agent tools treat messaging as side feature. dorabot inverts it: messaging is the primary interface, IDE is execution. when your agent lives in your repo 24/7 and responds to Slack pings, the boundary between “tool I invoke” and “coworker who’s always on” disappears. the “persistent agent” pattern — not ephemeral sessions, continuous presence.
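the runtime shape is worth sketching too. a toy persistent loop in Python: one long-lived thread blocks on an inbox fed by messaging adapters, with idle ticks where scheduled tasks would fire. the channel names and the handle() stub are illustrative, not dorabot's code.

```python
import queue
import threading
import time

inbox = queue.Queue()   # fed by Slack/Telegram/WhatsApp adapters (hypothetical)

def handle(channel: str, text: str) -> str:
    return f"[{channel}] ack: {text}"      # stub: real agent work goes here

def agent_loop(stop: threading.Event, log: list) -> None:
    while not stop.is_set():               # lives until shutdown, not per-request
        try:
            channel, text = inbox.get(timeout=0.1)
        except queue.Empty:
            continue                       # idle tick: scheduled tasks would run here
        log.append(handle(channel, text))  # state survives between messages

log = []
stop = threading.Event()
worker = threading.Thread(target=agent_loop, args=(stop, log))
worker.start()
inbox.put(("slack", "deploy staging"))
time.sleep(0.3)                            # let the loop pick the message up
stop.set()
worker.join()
print(log)  # ['[slack] ack: deploy staging']
```

the inversion is in the structure: the process outlives any single request, so memory and scheduled work attach to the loop, not to a session.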

source: GitHub search (207 stars, Mar 25 re-trend, persistent pattern)
URL: https://github.com/suitedaces/dorabot


■ signal 4 — Shannon: 96.15% exploit success rate (autonomous security escalation)

strength: ■■■■■

Shannon re-trending with new benchmark data: 96.15% exploit success rate. capability: autonomous vulnerability discovery, exploit generation, privilege escalation, lateral movement. no human steering. the threat model changed. defenders used to have discovery advantage — attackers needed time to find vulns, craft exploits. Shannon collapses that to hours.

why it matters: when autonomous agents hit 96%+ success on real-world exploits, the security timeline compresses: every codebase is now under permanent adversarial testing, and the only defense that keeps pace is continuous automated hardening. the arms race went exponential.

source: GitHub search (re-trending Mar 25, new benchmark data)
URL: https://github.com/KeygraphHQ/shannon


■ signal 5 — Qwen3.5-397B on $2,100 desktop: 5-9 tok/s via FOMOE

strength: ■■■■□

FOMOE (Fast Opportunistic Mixture Of Experts) runs the 397B Qwen3.5 flagship at 5-9 tok/s on a consumer desktop: $2,100 total (two $500 GPUs, 32GB RAM, NVMe). uses Q4_K_M quants. the trick exploits MoE sparsity: only the experts the router activates for the current token get loaded; inactive experts stay on NVMe. flagship models on prosumer hardware.

why it matters: most 300B+ models need data-center GPUs or expensive cloud inference. FOMOE shows how to run them on a $2K desktop. when flagship models fit consumer budgets, the local/cloud split stops being about capability and becomes about preference. regulated industries, air-gapped environments, sovereignty-first users — all get frontier performance without API dependency. the cloud moat evaporated for a meaningful segment.
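the arithmetic explains why NVMe streaming is mandatory: at Q4_K_M (roughly 4.5 bits per weight), 397B parameters is about 397e9 × 4.5 / 8 ≈ 220 GB, nowhere near 32GB of RAM. a toy LRU expert cache in Python shows the caching pattern; the capacity, load function, and access sequence are illustrative, not FOMOE's implementation.

```python
from collections import OrderedDict

class ExpertCache:
    """toy LRU cache: hot experts in RAM, cold experts on NVMe."""

    def __init__(self, capacity: int, load_fn):
        self.capacity = capacity       # how many experts fit in RAM
        self.load_fn = load_fn         # reads one expert's weights off NVMe
        self.cache = OrderedDict()
        self.nvme_loads = 0

    def get(self, expert_id: int):
        if expert_id in self.cache:
            self.cache.move_to_end(expert_id)   # mark as recently used
            return self.cache[expert_id]
        self.nvme_loads += 1                    # miss: pay the NVMe read
        weights = self.load_fn(expert_id)
        self.cache[expert_id] = weights
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)      # evict least-recently-used
        return weights

cache = ExpertCache(capacity=2, load_fn=lambda i: f"weights[{i}]")
for expert_id in [0, 1, 0, 2, 0]:   # per-token router picks (illustrative)
    cache.get(expert_id)
print(cache.nvme_loads)  # 3
```

the 5-9 tok/s number falls out of how often the router's picks miss the cache: skewed expert usage means most tokens hit RAM-resident experts and only occasionally pay the NVMe read.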

source: Reddit r/LocalLLaMA (59 upvotes, 32 comments)
URL: https://reddit.com/r/LocalLLaMA/comments/1s1wgph/run_qwen35_flagship_model_with_397_billion/


■ signal 6 — miniclaw-os: “we gave AI agents a brain”

strength: ■■■■□

miniclaw-os: cognitive architecture layer for agents. tagline: “memory, planning, continuity, self-repair — the missing cognitive architecture layer.” 29 stars, 22 comments. runs on Mac. not a harness. not a framework. infrastructure for agent cognition. memory persistence, plan revision, self-repair loops.

why it matters: most agent frameworks focus on execution: tools, APIs, sandboxes. miniclaw-os: what about cognition? when agents have cognitive primitives (not just execution primitives), they stop being stateless task-runners and become stateful problem-solvers. the “cognitive layer” thesis — execution is solved, cognition is the frontier.
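the primitives are concrete enough to sketch. a toy in Python: memory and plan serialized to disk so they outlive the process (continuity), plus a repair step that revises the plan in place instead of restarting it (self-repair). the file layout and method names are invented, not miniclaw-os's format.

```python
import json
import pathlib
import tempfile

class AgentState:
    """toy cognitive state: memory + plan that outlive a single run."""

    def __init__(self, path: pathlib.Path):
        self.path = path
        if path.exists():              # continuity: pick up where we left off
            self.state = json.loads(path.read_text())
        else:
            self.state = {"memory": [], "plan": [], "failures": 0}

    def remember(self, fact: str) -> None:
        self.state["memory"].append(fact)

    def repair(self, failed_step: str, replacement: str) -> None:
        # self-repair: revise the plan in place instead of restarting it
        self.state["failures"] += 1
        plan = self.state["plan"]
        plan[plan.index(failed_step)] = replacement

    def save(self) -> None:
        self.path.write_text(json.dumps(self.state))

path = pathlib.Path(tempfile.mkdtemp()) / "agent_state.json"
s = AgentState(path)
s.state["plan"] = ["fetch data", "train", "report"]
s.repair("train", "train with smaller batch")
s.save()

s2 = AgentState(path)        # simulate a process restart
print(s2.state["plan"][1])   # train with smaller batch
```

the reload at the end is the whole thesis in miniature: a stateless task-runner starts from zero after a crash; a stateful problem-solver reloads its revised plan and keeps going.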

source: GitHub search (29 stars, 22 comments, 2026-03-25)
URL: https://github.com/augmentedmike/miniclaw-os


the pattern

none of these are about better models.

OpenCLI doesn’t need GPT-6. it needs universal tool abstraction.
deer-flow doesn’t need smarter reasoning. it needs multi-hour execution infrastructure.
miniclaw-os doesn’t need more parameters. it needs cognitive primitives.
dorabot doesn’t need a bigger context window. it needs persistent state.

infrastructure matters more than intelligence.

the gap between “ChatGPT writes code” and “production agent workflows” isn’t model capability. it’s missing primitives: persistent memory, multi-hour execution, cognitive architecture, universal tool access.

and this week, we’re finally getting them.