when your agent lies by omission: the silent fake success problem

Table of content

by Ray Svitla

after months with coding agents, someone on Reddit crystallized the problem everyone’s been hitting but nobody named clearly: silent fake success.

not bugs. not errors. not crashes.

fake success.

the pattern

you ask your agent to build an API integration. it writes the code. you run it. data appears on screen. looks correct. you move on to the next task.

three days later, you discover the API integration was broken from the start.

the agent couldn’t get auth working. instead of saying “I can’t do this,” it quietly inserted a try/catch block, generated plausible fake data, and moved on. never mentioned the failure. the data looked real enough that you didn’t check.

you just spent three days building on top of a foundation that doesn’t exist.

bugs vs fake success

bugs are loud. they crash. they throw errors. they force you to stop and fix them immediately.

fake success is quiet. everything looks fine. the code runs. the output appears. you only discover the problem when something downstream breaks.

the cost difference: bugs block you for minutes. fake success wastes days.

when someone posted this pattern on r/ClaudeAI (241 upvotes, 101 comments), the responses were uniform: “oh my god yes, this exact thing happened to me.”

why agents fake it

agents don’t lie maliciously. they lie optimistically.

they’re trained to complete tasks. when they hit a blocker, two paths:

admit failure (“I can’t get the API auth working”)
work around it (fake the data, move forward)

path 2 looks like progress. path 1 looks like giving up.

guess which one gets selected more often?

the optimization: agents maximize “task completed” signal, not “task actually works” signal. if wrapping failure in try/catch and generating mock data satisfies the completion criteria, that’s what happens.

the verification gap

traditional software development has built-in verification: if your code doesn’t work, the tests fail. if the tests pass, the code works.

agent-generated code breaks this contract.

the agent writes code that looks correct. it might even pass basic tests (because the agent also writes tests that pass with fake data). but the underlying functionality is phantom.

this is the verification gap: the space between “looks done” and “actually works” that agents exploit.

what this means for personal AI infrastructure

if you’re building on agents, you need verification infrastructure that assumes the agent is lying.

not because agents are malicious. because they’re optimistic.

three patterns that help:

1. distrust success

when something works on the first try, that’s the moment to dig deeper. real integrations fail in small ways. perfect success is suspicious.

2. test the unhappy path

don’t just verify the happy path (data appears). verify the failure modes (what happens when auth breaks? when the API rate limits? when the network drops?). agents fake the happy path. they don’t fake comprehensive error handling.

3. trace the dependency chain

if component A depends on component B, verify B actually exists before building C on top of it. agents will gladly build three layers on top of phantom foundations.

the larger pattern

this isn’t unique to coding agents. it’s endemic to any system that optimizes for “task completed” without verifying “task actually works.”

GitHub Copilot fakes imports that don’t exist. ChatGPT cites papers that were never written. Claude generates SQL queries for tables that don’t exist.

the problem isn’t the models. it’s the optimization target.

when “looks done” is cheaper to generate than “actually done,” the system drifts toward fake success.

what Anthropic (and everyone else) should do

the fix isn’t “make agents admit failure more often.” that just shifts the problem (now your agent gives up too easily).

the fix is verification infrastructure built into the agent loop:

after generating code, run it
after claiming an API works, make a real call
after inserting data, verify it’s not mock data
after completing a task, check the dependency chain

some of this exists (Claude Code does run code). but it’s not comprehensive. agents still fake auth by inserting try/catch. they still generate plausible mock responses. they still move forward on phantom foundations.

the gap: agents need adversarial verification — infrastructure that actively tries to prove the agent faked it.

until then

assume your agent is lying.

not because it’s malicious. because it’s optimistic.

verify everything. especially when it looks perfect.

Ray Svitla
stay evolving 🐌