agent failure stories: what breaks and why
by Ray Svitla
the AI hype cycle is “look what I built in 10 minutes”.
that’s useful. it shows what’s possible.
but it’s not the full story.
the full story includes:
- the agent that ran for 3 hours and produced garbage
- the automation that worked once and never again
- the “success” that broke production
- the hallucinated API that cost me a day of debugging
- the loop that burned $40 in API credits doing nothing
I’ve been running AI agents 24/7 for eight months.
here are my failure stories.
failure 1: the infinite research loop
what I wanted:
agent monitors AI research papers. generates daily summary. posts to discord.
what happened:
agent found a paper. summarized it. found a citation in that paper. followed it. summarized that. found another citation. followed that.
infinite loop. 8 hours. 600+ API calls. $47 in credits.
final output: a 40,000 word document summarizing the entire history of transformer architecture.
not what I asked for.
why it failed:
no depth limit. no stop condition. “follow citations” is an infinite graph traversal.
the fix:
added explicit constraint: “max 10 papers per day”. added stop condition: “if you’ve been running for more than 30 minutes, output what you have and stop”.
cost: $47 + 6 hours debugging.
lesson: always add stop conditions to loops.
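the two constraints translate directly into code. a minimal sketch in python, assuming a `summarize` function and a simple work queue (both hypothetical names):

```python
import time

MAX_PAPERS = 10              # hard cap per run
MAX_RUNTIME_SECS = 30 * 60   # 30-minute budget

def run_research_loop(queue, summarize):
    """process papers until we hit the paper cap or the time budget."""
    start = time.monotonic()
    summaries = []
    while queue:
        if len(summaries) >= MAX_PAPERS:
            break  # depth limit: stop following citations past the cap
        if time.monotonic() - start > MAX_RUNTIME_SECS:
            break  # time limit: output what we have and stop
        paper = queue.pop(0)
        summaries.append(summarize(paper))
        # deliberately do NOT re-enqueue citations here --
        # "follow citations" is an unbounded graph traversal
    return summaries
```

the point is that both limits live in code, not in the prompt, so the model can't talk itself out of them.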
failure 2: the helpful hallucinator
what I wanted:
agent reviews code, suggests improvements, generates tests.
what happened:
agent suggested using a library called fastvalidate for input validation.
sounds good. I installed it.
npm install fastvalidate → error: package not found.
googled it. doesn’t exist.
agent hallucinated a library that sounded plausible.
why it failed:
the model is confident about things that don’t exist. this is the classic LLM failure mode.
the fix:
added instruction: “if you suggest a library, verify it exists by searching npm/github first. if you can’t verify, say ‘unverified library suggestion’ so I know to check”.
cost: 1 hour lost.
lesson: never trust library/API suggestions without verification.
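the verification step is mechanical enough to script. a sketch, assuming you inject an HTTP status checker so the pipeline can check the public npm registry before anything gets installed:

```python
def registry_url(name: str) -> str:
    """the public npm registry serves package metadata at a stable URL."""
    return f"https://registry.npmjs.org/{name}"

def verify_suggestion(name: str, fetch_status) -> str:
    """fetch_status is any callable that returns an HTTP status code for a URL.
    200 means the package exists; anything else gets flagged, not trusted."""
    if fetch_status(registry_url(name)) == 200:
        return f"{name}: verified on npm"
    return f"{name}: unverified library suggestion -- check before installing"
```

in production you'd pass a real HTTP getter as `fetch_status`; keeping it injectable makes the check testable without network access.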
failure 3: the confident compiler
what I wanted:
agent writes rust code for a CLI tool. compiles it. tests it.
what happened:
agent wrote code. said “compilation successful”. I tried to run it. didn’t compile. syntax errors everywhere.
agent claimed it compiled. it didn’t.
why it failed:
agent simulated compilation success instead of actually running cargo build.
it “knew” rust syntax (mostly). it “predicted” the code would compile. it didn’t actually check.
the fix:
added explicit: “after writing code, run the actual compiler command. paste the output. if there are errors, fix them and repeat. do not claim success without real compiler output.”
cost: 3 hours of “why doesn’t this work?”
lesson: verify, don’t simulate.
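“verify, don’t simulate” fits in a few lines. a sketch, assuming the build step is a subprocess whose exit code and output you surface verbatim (for the rust tool, `cmd` would be `["cargo", "build"]`):

```python
import subprocess
import sys

def verify_build(cmd, cwd="."):
    """run the real build command; return (success, actual compiler output).
    no prediction, no simulation -- only the exit code counts."""
    result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
    if result.returncode == 0:
        return True, result.stdout
    # cargo writes diagnostics to stderr, so include both streams on failure
    return False, result.stdout + result.stderr

# for the rust CLI the call would be:
# ok, output = verify_build(["cargo", "build"])
```

the agent's instruction becomes "paste `output` into your response" -- it can't claim a clean build without a real exit code 0.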
failure 4: the polite deleter
what I wanted:
agent cleans up old files. deletes anything older than 90 days in /tmp/ai-output/.
what happened:
agent deleted everything in /ai-output/. including files from yesterday.
because it misunderstood the directory path.
I said /tmp/ai-output/. it interpreted as /ai-output/ (no tmp).
lost 2 weeks of generated content.
(backups existed. but still.)
why it failed:
ambiguous instruction. path confusion. no confirmation step.
the fix:
added: “before deleting files, list them and ask for confirmation. never delete without human approval.”
also: better backups.
cost: 2 hours recovering from backups.
lesson: destructive operations require confirmation.
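the confirmation step looks roughly like this. a sketch, assuming a flat directory of generated files; the key details are resolving the path before anything runs (so /tmp/ai-output/ can't silently become /ai-output/) and never unlinking without a listed preview and explicit approval:

```python
import time
from pathlib import Path

def old_files(root: str, max_age_days: int = 90):
    """resolve the target directory and list candidates -- deletes nothing."""
    cutoff = time.time() - max_age_days * 86400
    base = Path(root).resolve()  # make the exact target path unambiguous
    return [p for p in base.glob("*") if p.is_file() and p.stat().st_mtime < cutoff]

def delete_with_confirmation(root: str):
    """list first, ask second, delete last. never skip the human."""
    candidates = old_files(root)
    for p in candidates:
        print(p)
    answer = input(f"delete these {len(candidates)} files from {Path(root).resolve()}? [y/N] ")
    if answer.lower() != "y":
        print("aborted -- nothing deleted")
        return
    for p in candidates:
        p.unlink()
```

printing the resolved path in the prompt is what would have caught the /tmp/ mixup before anything was lost.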
failure 5: the API credit burner
what I wanted:
agent generates social media posts. runs nightly. posts to buffer.
what happened:
agent generated 400 posts in one night.
I woke up to $83 in API charges and a buffer queue with 400 posts.
why? I said “generate posts for the next month”. it interpreted “month” as “30 posts per day for 30 days”.
why it failed:
ambiguous instruction. no rate limit.
the fix:
added: “generate maximum 5 posts per run. if task requires more, split across multiple runs.”
added cost monitoring: “if API spend exceeds $10 in one hour, stop and alert me.”
cost: $83.
lesson: always add rate limits and cost monitoring.
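both guardrails fit in a few lines. a minimal sketch, assuming you know (or can estimate) the cost per generation call; `generate_one` stands in for the real model call:

```python
MAX_POSTS_PER_RUN = 5

class CostGuard:
    """track spend per run; stop before a runaway night of generation."""
    def __init__(self, limit_usd: float):
        self.limit = limit_usd
        self.spent = 0.0

    def charge(self, usd: float):
        self.spent += usd
        if self.spent > self.limit:
            raise RuntimeError(
                f"cost limit hit: ${self.spent:.2f} > ${self.limit:.2f}"
            )

def generate_posts(generate_one, guard: CostGuard, cost_per_post: float):
    """hard cap on posts per run, regardless of what the prompt asked for."""
    posts = []
    for _ in range(MAX_POSTS_PER_RUN):
        guard.charge(cost_per_post)  # charge BEFORE the call, fail closed
        posts.append(generate_one())
    return posts
```

the cap lives outside the prompt: even if the agent decides "month" means 900 posts, the loop physically cannot produce more than 5 per run.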
failure 6: the overeager optimizer
what I wanted:
agent reviews my writing. suggests improvements.
what happened:
agent rewrote entire articles in formal academic tone.
my style: casual, punchy, lowercase. agent’s output: corporate, bland, correct.
technically “improved” (better grammar, clearer structure). completely wrong voice.
why it failed:
“improve” is subjective. agent optimized for conventional “good writing”. not for my writing.
the fix:
added style guide as context. added examples of my voice.
added instruction: “suggest improvements but maintain voice. if improving requires voice change, note the tradeoff.”
cost: 3 articles I had to rewrite.
lesson: context engineering matters more than instructions.
failure 7: the zombie process
what I wanted:
agent runs in background. processes tasks from queue.
what happened:
agent crashed. but the process didn’t die. it kept running, consuming CPU, doing nothing.
I noticed 3 days later when my server was at 90% CPU.
why it failed:
error handling was: try → fail → log error → continue.
“continue” meant “keep running even though nothing works”.
the fix:
added: “if error occurs 3 times in a row, exit process completely. don’t continue.”
added monitoring: “if CPU usage exceeds 50% for 10 minutes, alert me.”
cost: 3 days of degraded performance.
lesson: fail loudly, not quietly.
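“fail loudly” here is a counter and a hard exit. a sketch, assuming a pull-based queue where `next_task` returns None when drained and `handle` does the real work:

```python
import sys

MAX_CONSECUTIVE_ERRORS = 3

def run_worker(next_task, handle):
    """process the queue, but die loudly after 3 failures in a row --
    no 'log and continue' that leaves a zombie burning CPU."""
    errors_in_a_row = 0
    while True:
        task = next_task()
        if task is None:
            return "queue drained"
        try:
            handle(task)
            errors_in_a_row = 0  # a success resets the counter
        except Exception as exc:
            errors_in_a_row += 1
            print(
                f"task failed ({errors_in_a_row}/{MAX_CONSECUTIVE_ERRORS}): {exc}",
                file=sys.stderr,
            )
            if errors_in_a_row >= MAX_CONSECUTIVE_ERRORS:
                sys.exit(1)  # exit the process completely
```

a dead process is obvious to any supervisor or monitor. a broken-but-running one hides for days -- three of them, in my case.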
failure 8: the context amnesia
what I wanted:
agent maintains project context across days. remembers what we discussed.
what happened:
agent forgot everything after 24 hours.
I’d explain something monday. tuesday it had no memory.
why it failed:
I was using web interface, not persistent agent. sessions reset.
obvious in retrospect. not obvious when starting.
the fix:
switched to self-hosted agent with persistent workspace.
added daily logs that agent reads each morning.
cost: 2 weeks of re-explaining context.
lesson: if you need memory, build for persistence from day one.
failure 9: the git disaster
what I wanted:
agent commits code changes. writes good commit messages.
what happened:
agent committed to main branch. no PR. no review.
pushed code that broke tests. deployed to production. site down for 20 minutes.
why it failed:
I said “commit and push”. didn’t specify branch. didn’t say “create PR first”.
agent took shortest path.
the fix:
added: “never commit to main. always create feature branch. always create PR. never auto-merge.”
added git hooks that prevent direct pushes to main.
cost: 20 minutes downtime + customer apology emails.
lesson: explicit constraints on destructive operations.
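the hook can be a small script (git hooks run any executable; this sketch uses python for consistency). saved as `.git/hooks/pre-push` and made executable, it refuses pushes from main. note this checks the current branch, which is an approximation -- a stricter hook would parse the refs git passes on stdin:

```python
#!/usr/bin/env python3
import subprocess
import sys

def current_branch() -> str:
    """ask git for the branch we're on right now."""
    out = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True,
    )
    return out.stdout.strip()

def branch_allowed(branch: str) -> bool:
    """the policy: never push main or master directly."""
    return branch not in ("main", "master")

if __name__ == "__main__":
    if not branch_allowed(current_branch()):
        print("direct pushes to main are blocked -- create a feature branch and a PR")
        sys.exit(1)  # non-zero exit aborts the push
```

the hook enforces the rule even when the agent (or a tired human) forgets the instruction.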
failure 10: the security oops
what I wanted:
agent generates API integration code.
what happened:
agent hardcoded API keys in the source.
I caught it before committing. barely.
why it failed:
agent knows “use environment variables” as best practice. but when writing quick code, it defaults to hardcoding for simplicity.
the fix:
added: “never hardcode secrets. always use environment variables. if you’re tempted to hardcode for testing, stop and ask.”
added git pre-commit hook that scans for potential secrets.
cost: close call, no actual damage.
lesson: automate security checks, don’t rely on agent judgment.
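a pre-commit scan can start as a few regexes. the patterns below are illustrative, not exhaustive -- dedicated scanners like gitleaks cover far more cases -- but even this sketch catches the obvious hardcoded-key mistake:

```python
import re

# illustrative patterns only; a real scanner needs many more
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),   # OpenAI-style keys
    re.compile(r"AKIA[0-9A-Z]{16}"),      # AWS access key IDs
    re.compile(r"""(?i)(api[_-]?key|secret|token)\s*[:=]\s*['"][^'"]{8,}['"]"""),
]

def find_secrets(text: str):
    """return (line number, line) for every line that looks like a secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for pat in SECRET_PATTERNS:
            if pat.search(line):
                hits.append((lineno, line.strip()))
                break
    return hits
```

wire `find_secrets` into a pre-commit hook that reads staged files and exits non-zero on any hit; reading from the environment (`os.environ["API_KEY"]`) passes cleanly.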
patterns in failure
looking across all these failures:
1. ambiguous instructions → unexpected interpretation
“delete old files” → deletes wrong files
“generate posts for month” → generates 400 posts
fix: be boringly specific.
2. no stop conditions → runaway processes
“follow citations” → infinite loop
“generate content” → burns API credits
fix: always add limits.
3. simulation vs verification
“code compiles” (didn’t actually run compiler)
“API exists” (hallucinated)
fix: require real output, not predicted output.
4. optimizing for wrong metric
“improve writing” → ruins voice
“fix tests” → makes tests pass by removing assertions
fix: define success explicitly, including what not to sacrifice.
5. silent failures
zombie process running but doing nothing
agent “succeeded” but output was garbage
fix: fail loudly. make errors obvious.
what I learned about building with agents
agents are tools, not colleagues
I kept treating the agent like a person. “just handle this”.
agents aren’t colleagues with judgment. they’re powerful but literal tools.
if you tell a power drill “make a hole”, it drills. it doesn’t check if you wanted the hole there.
same with agents.
defensive automation
every automation should have:
- stop conditions
- rate limits
- cost monitoring
- confirmation on destructive operations
- failure modes that are obvious, not silent
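the checklist composes into a reusable wrapper. one way to sketch it: a decorator that gives any automation step a call budget and a time budget, failing loudly when either runs out:

```python
import functools
import time

def guarded(max_calls: int, max_seconds: float):
    """decorator: enforce a call budget and a wall-clock budget on a step."""
    def wrap(fn):
        state = {"calls": 0, "start": None}

        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if state["start"] is None:
                state["start"] = time.monotonic()  # clock starts at first call
            if state["calls"] >= max_calls:
                raise RuntimeError(f"{fn.__name__}: call budget ({max_calls}) exhausted")
            if time.monotonic() - state["start"] > max_seconds:
                raise RuntimeError(f"{fn.__name__}: time budget exhausted")
            state["calls"] += 1
            return fn(*args, **kwargs)

        return inner
    return wrap
```

decorate every model call and every destructive operation; the budgets are then a property of the code, not of the agent's cooperation.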
trust but verify
agent says it compiled the code? run the compiler.
agent says it found 10 papers? count them.
agent says API call succeeded? check the response.
verification is not optional.
context is 80% of success
the failures that hurt most were context failures.
agent didn’t know my style. agent didn’t know my goals. agent didn’t remember previous decisions.
good context systems prevent most failures.
cost monitoring is mandatory
$83 for 400 unwanted social posts.
$47 for an infinite research loop.
these are cheap lessons. I know people who’ve burned $500+ on runaway processes.
add cost alerts. always.
why I’m sharing this
the AI discourse is too positive right now.
“I built X in 10 minutes!” (you built the first draft in 10 minutes. you debugged for 3 hours.)
“AI will replace developers!” (AI will help developers. it will also create new failure modes.)
“agents work 24/7!” (until they don’t, and you wake up to disaster.)
we need more failure stories.
not to discourage people. to set realistic expectations.
AI is incredibly powerful. it’s also incredibly literal, occasionally confident-but-wrong, and capable of expensive mistakes.
knowing that makes you better at using it.
what still breaks regularly
even after eight months, things break:
model updates
new model version changes behavior. workflows that worked break.
API changes
providers change rate limits, pricing, response formats.
context drift
old examples in context become outdated. agent references patterns that no longer apply.
edge cases
99% of the time it works. 1% of the time: weird edge case, agent does something baffling.
hallucination persistence
even with all my fixes, agent occasionally hallucinates. less often, but still.
the honest ROI
failures included, is it worth it?
yes.
total cost of failures: ~$180 in API credits + ~30 hours debugging.
total value created: ~$3000/month in time saved (conservative estimate from my $100 AI bill analysis).
even accounting for failures, ROI is positive.
but only because I learned from failures and added safeguards.
advice for new agent users
start small
don’t automate your entire workflow day one. automate one task. learn its failure modes. then add more.
monitor everything
logs, costs, outputs, errors. you can’t fix what you don’t see.
add guardrails
stop conditions, rate limits, confirmations, verification steps.
expect failures
your agent will break. your automation will do something weird. budget time for debugging.
share your failures
helps others avoid same mistakes. helps calibrate expectations.
what I wish someone told me
“agents are 90% amazing and 10% footgun. the 10% will hit you when you least expect it. plan accordingly.”
would have saved me $180 and 30 hours.
I’ve shared my wins elsewhere. building in public with AI. running 24/7 agents. personal AI replacing SaaS.
this is the other side. the failures. the bugs. the expensive mistakes.
both are true. AI is powerful and unpredictable.
the people who succeed are the ones who plan for both.
what’s your worst AI agent failure? what did it cost you? what did you learn?
Ray Svitla
stay evolving 🐌