the AI productivity paradox: more output, more burnout


by Ray Svitla


I ship about 3x more than I did a year ago.

that sounds like a flex. it isn’t. because I also review 3x more code, debug 3x weirder edge cases, make 3x more architectural decisions per day, and feel roughly the same amount of tired at the end of it. maybe more.

the promise was simple: AI handles the boring stuff, you focus on what matters. draft the email, summarize the meeting, scaffold the component. you do the thinking.

here’s what actually happened.

the treadmill hypothesis

Harvard Business Review just published a piece called “AI Doesn’t Reduce Work — It Intensifies It” by Aruna Ranganathan and Xingqi Maggie Ye. the core argument:

AI doesn’t give you time back. it raises the baseline.

when everyone has a tool that writes first drafts in seconds, “writing a first draft” stops being work. it becomes a pre-condition. the actual work shifts to editing, reviewing, deciding — the cognitively expensive stuff that AI handles worst.

you used to spend 30 minutes writing a spec. now Claude writes it in 90 seconds. great. except now your manager expects specs for things that never had specs before. the task didn’t disappear — it metastasized.

the developer version

if you code with AI tools, you’ve felt this already.

the old workflow: think → write → debug → ship. maybe one meaningful PR a day on a complex codebase.

the new workflow: prompt → review → prompt → review → fix the thing the AI broke → prompt → review → ship three PRs before lunch. then do it again after lunch. and again.

Stripe just announced that their internal coding agents (“minions”) merge over 1,000 pull requests per week. humans review. agents write. the engineers didn’t get fewer tasks — they became full-time reviewers of machine-generated code.

there’s a word for this: deskilling. the creative part of the work (writing code) gets automated. what remains is the quality control loop. and QA is exhausting in a way that creation isn’t.

the cognitive load problem

here’s the thing nobody talks about: review is harder than writing.

when you write code, you hold one mental model in your head — yours. when you review code, you have to reconstruct someone else’s mental model. or in this case, something else’s. an AI that may have made reasonable-looking choices based on entirely wrong assumptions.

I’ve caught Claude confidently implementing a feature that would have introduced a security vulnerability. the code was clean, well-tested, properly typed. it just solved the wrong problem. catching that requires deeper understanding than writing it yourself would have.

multiply that by 3x the output and you get… burnout. dressed up as productivity.

the personal AI angle

this is why I keep coming back to the idea that your AI needs to understand your context. not just your codebase — your priorities, your energy patterns, your decision-making history.

an AI that dumps three PRs on your desk at 2pm (your crash window) isn’t helping. an AI that knows you make architectural decisions best before noon and routes complex reviews accordingly — that’s a different tool entirely.

the problem isn’t that AI writes too much code. the problem is that AI writes code without understanding the human cost of reviewing it.

the intensity curve

here’s my rough model of how this plays out:

productivity   ████████████████████████░░░░░░░░
cognitive load ░░░░████████████████████████████
burnout risk   ░░░░░░░░░░██████████████████████
               ──────────────────────────────→
               no AI    early AI    full AI

early AI adoption feels magical. you’re faster, output goes up, cognitive load stays the same because you’re only using AI for simple tasks.

full AI adoption is different. everything that can be automated is automated. what’s left is the hard stuff. all day. every day. no easy tasks to break up the cognitive load.

the old workday had natural rhythm: some tasks were hard, some were easy. the easy ones weren’t waste — they were recovery. AI eliminated the recovery.

what would actually help

throttled output. an AI that ships PRs at a rate matched to your review capacity, not its generation capacity.

energy-aware scheduling. route complex reviews to your peak hours. admin tasks to your crash zone.

context-aware prioritization. the AI should know which PRs actually matter and which are nice-to-have. don’t make the human sort the queue.

honest metrics. stop measuring “PRs merged” and start measuring “decisions made per hour.” that’s the real bottleneck.
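to make the first three ideas concrete, here’s a minimal sketch of what a reviewer-paced queue might look like. everything in it is hypothetical — the class names, the capacity of two reviews per hour, the 9am–noon peak window — none of it comes from a real tool. it just shows the shape: the agent submits as fast as it wants, but PRs come out sorted by priority, capped at the reviewer’s rate, with complex reviews held for peak hours and admin work routed to the crash zone.

```python
# hypothetical sketch of a review-capacity-aware PR queue.
# all names, thresholds, and hours are invented for illustration.
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class PR:
    priority: int                           # 0 = must-review, higher = nice-to-have
    title: str = field(compare=False)
    complexity: str = field(compare=False)  # "complex" or "admin"

class ReviewQueue:
    def __init__(self, max_reviews_per_hour=2, peak_hours=range(9, 12)):
        self.max_per_hour = max_reviews_per_hour  # reviewer capacity, not agent capacity
        self.peak_hours = peak_hours              # when complex reviews should land
        self.heap = []

    def submit(self, pr: PR):
        # the agent can submit at any rate; the heap keeps PRs sorted by
        # priority so the human never has to sort the queue themselves
        heapq.heappush(self.heap, pr)

    def next_batch(self, hour: int):
        """return at most max_per_hour PRs appropriate for this hour."""
        batch, deferred = [], []
        while self.heap and len(batch) < self.max_per_hour:
            pr = heapq.heappop(self.heap)
            # route complex work to peak hours; outside them, defer it
            # and let admin tasks fill the crash zone instead
            if pr.complexity == "complex" and hour not in self.peak_hours:
                deferred.append(pr)
            else:
                batch.append(pr)
        for pr in deferred:
            heapq.heappush(self.heap, pr)
        return batch
```

dropping three PRs into the queue and asking for a 2pm batch returns only the admin task; the two complex reviews wait in the heap until a peak-hour batch is requested. the design choice is the point: throughput is bounded by `max_per_hour` on the consumer side, so generation speed stops setting the pace.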

none of this exists yet. we measure output and celebrate velocity. the cognitive cost is invisible until someone burns out.

the uncomfortable question

the HBR article frames this as a corporate problem — companies pushing AI adoption without considering worker impact.

but I notice the same pattern in my own solo work. nobody’s pushing me. I push myself. the AI makes it possible to ship more, so I ship more, because shipping feels productive. the treadmill is self-imposed.

maybe the real skill of the AI era isn’t prompt engineering. it’s knowing when to stop prompting.

