Andrej Karpathy's AI-First Development

Andrej Karpathy is a computer scientist who was a founding member of OpenAI in 2015, then led Tesla’s Autopilot computer vision team as Senior Director of AI from 2017 to 2022. He did his PhD at Stanford under Fei-Fei Li, where he co-created the influential CS231n deep learning course. Now he runs Eureka Labs, an AI education startup, and posts technical content on his YouTube channel.

Karpathy coined “vibe coding” and it became a meme almost immediately. But the meme obscures something useful: he actually has two completely different modes. Vibe coding for throwaway stuff. A structured three-layer approach for production.

The distinction matters. Mixing them up is how people end up with AI-generated spaghetti in their main branch.

The software evolution framing

Karpathy laid out this framework in his Software 3.0 talk at YC AI Startup School, building on his earlier Software 2.0 essay. He thinks about code in three generations:

  1. Software 1.0: traditional code, written by humans instruction by instruction.

  2. Software 2.0: neural network weights, produced by curating data and training rather than by writing logic directly.

  3. Software 3.0: natural-language prompts that program large language models.

We’re in 3.0 now. The implications are still shaking out.
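To make the contrast concrete, here is a toy illustration (mine, not from the talk) of the same task, sentiment classification, in each generation. The 2.0 and 3.0 parts are sketched in comments because they depend on a trained model and an LLM API.

```python
# Hypothetical illustration of the three generations (example code, not Karpathy's).

# Software 1.0: logic written by hand, rule by rule.
def sentiment_v1(text: str) -> str:
    positive = {"great", "love", "excellent"}
    negative = {"bad", "hate", "terrible"}
    words = set(text.lower().split())
    score = len(words & positive) - len(words & negative)
    return "positive" if score >= 0 else "negative"

# Software 2.0: the "program" is learned weights; you curate data and train
# instead of writing rules (sketched, since it needs a dataset and a framework):
#     model = train_classifier(labeled_reviews)
#     label = model.predict(text)

# Software 3.0: the program is a natural-language prompt to an LLM (sketched):
#     label = llm.complete(f"Classify this review as positive or negative:\n{text}")

print(sentiment_v1("I love this, it works great"))  # -> "positive"
```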

Vibe coding: when it makes sense

The original tweet made it sound reckless. Voice input, accept everything, don’t read diffs. Just vibes.

Here’s the context that got lost: this is explicitly for throwaway code. Weekend projects. Hackathons. Learning a new API. Stuff you’ll delete in a week anyway.

In that context, it’s not reckless — it’s appropriate. Why carefully review code that will never see production? Ship fast, learn what you need, move on.

Where vibe coding gets you in trouble: team codebases, anything with users, security-sensitive features. Karpathy is clear about this. The meme-fication dropped the nuance.

The three-layer approach for real work

For production, Karpathy uses different tools depending on what he’s doing:

Layer 1 (75% of the time): Tab completion

Just Cursor, completing lines as you type. You’re still driving. The AI handles boilerplate and obvious patterns. This is where most coding happens.
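As a rough illustration (my example, not Karpathy's), this is the kind of work Layer 1 handles: you type the signature and the model fills in the obvious body.

```python
# Hypothetical Layer 1 moment: the dataclass and signature are typed by hand;
# the body of user_to_dict is the sort of boilerplate tab completion fills in.
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str
    email: str


def user_to_dict(user: User) -> dict:
    # Obvious, pattern-shaped code: exactly what completion is good at.
    return {"id": user.id, "name": user.name, "email": user.email}
```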

Layer 2 (20%): Agentic tools

Claude Code or similar for multi-file changes. Describe what you want, AI reads the codebase, proposes implementation. You review and iterate. Good for refactors, feature additions, anything touching multiple files.

Layer 3 (5%): Reasoning models

The heavy stuff. Architecture decisions, algorithms you don’t fully understand, bugs that make no sense. Load all context, ask for approaches — not code. Evaluate options, then implement with Layer 1 or 2.

The actual workflow

Karpathy covers this in his How I use LLMs video. Four steps, and the order matters:

  1. Load context first. All the relevant files, docs, error messages, previous attempts. Don’t make the AI guess what you’re working with.

  2. Describe one concrete thing. “Cache API responses in Redis with a 5-minute TTL,” not “make it faster.” Specificity is the difference between useful output and generic suggestions. (A sketch of what that Redis request might produce follows this list.)

  3. Ask for approaches before code. This is counterintuitive. You want to just get the implementation. But asking “what are the options here?” first surfaces tradeoffs you’d miss.

  4. Small chunks. One change at a time. Add logging. Test it. Refactor. Add the feature. One per iteration. Yes, it feels slow. It’s faster in practice.
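To make step 2 concrete, here is a minimal sketch of what the Redis caching request might produce, assuming redis-py and requests are available; the key scheme and endpoint handling are illustrative, not from the video.

```python
# Minimal sketch: cache JSON API responses in Redis with a 5-minute TTL.
# Hypothetical example code; connection details and key prefix are placeholders.
import json

import redis
import requests

CACHE_TTL_SECONDS = 300  # 5 minutes
cache = redis.Redis(host="localhost", port=6379, db=0)


def fetch_with_cache(url: str) -> dict:
    """Return the JSON body for url, serving from Redis when a fresh copy exists."""
    key = f"api-cache:{url}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    payload = response.json()

    # setex stores the value and expires it after CACHE_TTL_SECONDS.
    cache.setex(key, CACHE_TTL_SECONDS, json.dumps(payload))
    return payload
```

A change this size is also easy to review in one pass, which is the point of step 4.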

What I take from this

The layer system resonated with me. I kept trying to use the same approach for everything — big prompts, let the AI figure it out. Doesn’t work.

Tab completion for most stuff. Agents for multi-file changes. Heavy reasoning only when you’re genuinely stuck on architecture.

The percentages (75/20/5) are rough guides, not rules. But they shifted how I think about which tool to reach for.

See The Three-Layer Workflow for a deeper exploration of this pattern and how to implement it.

The honest trade-off

Vibe coding is fun. It feels productive. You ship in hours instead of days.

But it produces code you don’t understand. For throwaway projects, that’s fine — you’ll delete it anyway. For anything else, you’re building on a foundation you can’t debug.

Know which mode you’re in. Don’t mix them.


Next: Claude Code Setup

Topics: ai-coding workflow cursor claude-code