Addy Osmani's Spec-Driven AI Coding

Addy Osmani spent 14 years at Google leading Chrome Developer Experience and contributing to tools like Lighthouse and Chrome DevTools. In late 2025, he transitioned to Google Cloud AI. His guides on prompt engineering for developers, including “The Prompt Engineering Playbook for Programmers” and “How to Write a Good Spec for AI Agents,” document practical techniques for getting reliable results from coding agents.

Osmani’s core insight: using LLMs for programming is “difficult and unintuitive,” and getting great results requires learning new patterns. The quality of AI output depends on the quality of your input.

Background

GitHub | Twitter | Blog | Substack

The 70% Problem

Osmani frames the challenge with AI coding assistants:

“AI gets us impressively far, but that final 30% requires deep engineering expertise.”

AI can draft features quickly but falters on logic, security, and edge cases. This creates the “vibe coding” trap: rapid prototyping that feels productive but produces fragile code.

| AI Strength | AI Weakness |
| --- | --- |
| Boilerplate generation | Edge case handling |
| Feature scaffolding | Security considerations |
| Documentation drafting | Complex debugging |
| Code translation | Architectural decisions |

The solution: treat AI as a pair programmer that requires clear direction, context, and oversight.

Spec-Driven Development

Osmani’s workflow inverts traditional coding. Start with a spec, not code:

```
# Feature: User Authentication

## Requirements
- Email/password login
- Session management with 24h expiry
- Rate limiting: 5 failed attempts trigger a 15-min lockout

## Constraints
- Use existing auth library (passport.js)
- No changes to user schema
- Must pass existing test suite

## Acceptance Criteria
- [ ] Login returns JWT on success
- [ ] Invalid credentials return 401
- [ ] Rate limit returns 429 after 5 failures
```

The spec serves two purposes: it forces you to think before coding, and it gives the AI agent the context it needs to produce useful output.
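Acceptance criteria like these map almost directly onto tests. A minimal sketch, assuming a jest + supertest setup and a hypothetical Express app exported from `src/app` (none of these names are prescribed by the spec above):

```typescript
// auth.spec.ts — the spec's acceptance criteria expressed as tests.
// Assumes a hypothetical Express app exported from src/app.
import request from "supertest";
import app from "./src/app";

describe("Feature: User Authentication", () => {
  it("returns a JWT on successful login", async () => {
    const res = await request(app)
      .post("/login")
      .send({ email: "user@example.com", password: "correct-password" });
    expect(res.status).toBe(200);
    expect(res.body.token).toEqual(expect.any(String));
  });

  it("returns 401 for invalid credentials", async () => {
    const res = await request(app)
      .post("/login")
      .send({ email: "user@example.com", password: "wrong" });
    expect(res.status).toBe(401);
  });

  it("returns 429 after 5 failed attempts", async () => {
    for (let i = 0; i < 5; i++) {
      await request(app)
        .post("/login")
        .send({ email: "user@example.com", password: "wrong" });
    }
    const res = await request(app)
      .post("/login")
      .send({ email: "user@example.com", password: "wrong" });
    expect(res.status).toBe(429);
  });
});
```

Checking the boxes in the spec then becomes a matter of running the suite rather than arguing about whether the feature is done.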

Writing Specs for AI Agents

From his January 2026 guide:

“Simply throwing a massive spec at an AI agent doesn’t work. Context window limits and the model’s attention budget get in the way.”

Key principles:

| Principle | Implementation |
| --- | --- |
| Start high-level | Let AI draft details, you refine |
| Break tasks apart | Modular prompts, not monoliths |
| Build in self-checks | Include verification steps in the spec |
| Plan in read-only mode | Let AI propose before executing (see the example below) |
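The last two principles pair naturally. An illustrative planning prompt (not drawn verbatim from Osmani's guide) that keeps the agent in read-only mode and builds in self-checks:

```
Read the codebase but do not modify any files yet.

1. List the files you would change and why.
2. Describe the changes in the order you would make them.
3. State how you will verify each step (test, lint, manual check).

Wait for my approval before writing any code.
```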

Structure specs like a professional PRD:

```
## Overview
One paragraph on what and why.

## Goals
- Primary objective
- Secondary objectives

## Non-Goals
- What this feature explicitly won't do

## Technical Approach
- Architecture decisions
- Dependencies
- Integration points

## Testing Strategy
- Unit test requirements
- Integration test scenarios
```

Context Engineering

Osmani distinguishes “prompt engineering” from the broader “context engineering”:

“Context engineering means providing an AI with all the information and tools it needs to successfully complete a task, not just a cleverly worded prompt.”

Context includes:

```
# CLAUDE.md

## Project: E-commerce API

Commands:
- `npm test` - Run tests
- `npm run lint` - Check code style

Conventions:
- Use async/await, not callbacks
- All endpoints return JSON
- Error responses use { error: string, code: number }

Key files:
- src/routes/index.js - Route definitions
- src/middleware/auth.js - Authentication logic
```
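Conventions like the error shape above are cheap to encode so that both humans and agents follow them mechanically. A minimal sketch, assuming Express; the `ApiError` class and file name are illustrative, not part of Osmani's example:

```typescript
// src/middleware/errorHandler.ts — enforces the { error: string, code: number }
// response shape described in CLAUDE.md. Assumes Express.
import type { Request, Response, NextFunction } from "express";

export class ApiError extends Error {
  constructor(message: string, public code: number) {
    super(message);
  }
}

// Express recognizes error middleware by its four-argument signature.
export function errorHandler(
  err: Error,
  _req: Request,
  res: Response,
  _next: NextFunction
) {
  const code = err instanceof ApiError ? err.code : 500;
  // Every error response follows the documented convention.
  res.status(code).json({ error: err.message, code });
}
```

Centralizing the convention in one middleware means an agent only has to throw `ApiError` to stay compliant, rather than remembering the shape at every endpoint.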

The Three-Layer Workflow

Osmani’s workflow aligns with the spec-plan-execute pattern:

| Layer | Model | Output |
| --- | --- | --- |
| Plan | Reasoning model (o1, Claude) | Spec and architecture |
| Implement | Coding agent | Working code |
| Verify | Human + tests | Validated changes |

The verify layer is non-negotiable:

“If your pull request doesn’t contain evidence that it works, you’re not shipping faster. You’re just moving work downstream.”

Managing Multiple Agents

From his January 2026 piece on agent management:

“The highest-leverage developers will look like async-first managers running a small fleet of parallel AI coding agents.”

The skills that make good tech leads translate directly: run agents in isolated environments (worktrees, containers) and review their output as you would a junior developer's PRs.

Practical Prompting Patterns

From “The Prompt Engineering Playbook”:

Be specific about format:

```
Review this code for security issues.

Output format:
- Line number
- Issue description
- Severity (low/medium/high/critical)
- Suggested fix
```

Provide constraints:

```
Refactor this function.

Constraints:
- Maintain existing public API
- No new dependencies
- Keep function under 50 lines
```

Include examples when needed:

```
Convert these function names to camelCase.

Example:
- get_user_data → getUserData
- CREATE_ORDER → createOrder

Convert:
- fetch_all_items
- VALIDATE_INPUT
```
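The few-shot examples pin down exactly what the model should produce. For illustration, a TypeScript sketch (not from the playbook) of the transformation those examples specify:

```typescript
// Mirrors the conversion the few-shot examples define:
// snake_case and SCREAMING_SNAKE_CASE both become camelCase.
function toCamelCase(name: string): string {
  const [head = "", ...rest] = name.toLowerCase().split("_").filter(Boolean);
  return head + rest.map((w) => w[0].toUpperCase() + w.slice(1)).join("");
}

console.log(toCamelCase("fetch_all_items")); // fetchAllItems
console.log(toCamelCase("VALIDATE_INPUT")); // validateInput
```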

Context Buddy Tool

Osmani built Context Buddy, a visual prompt builder implementing Anthropic’s 10-section prompt structure:

1. Task definition
2. Context/background
3. Examples
4. Constraints
5. Output format
6. Reasoning steps
7. Edge cases
8. Verification criteria
9. Fallback behavior
10. Style guidelines

The tool makes explicit what experienced prompt engineers do intuitively.
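Put together, a prompt following this structure might look like the skeleton below. This is an illustrative composition reusing names from earlier in this article, not output from Context Buddy:

```
Task: Refactor src/middleware/auth.js to use async/await.
Context: Express e-commerce API; auth uses passport.js; see CLAUDE.md.
Examples: Existing async handlers in src/routes/index.js show the target style.
Constraints: No new dependencies; keep the public API unchanged.
Output format: A unified diff, followed by a one-paragraph summary.
Reasoning steps: List the affected call sites before proposing changes.
Edge cases: Preserve error propagation for failed authentication.
Verification: `npm test` and `npm run lint` must pass.
Fallback: If a change is risky, flag it and ask instead of guessing.
Style: Follow the conventions in CLAUDE.md.
```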

Key Takeaways

| Principle | Implementation |
| --- | --- |
| Spec before code | Write requirements, constraints, and acceptance criteria first |
| Context is everything | Project files, conventions, and examples matter more than clever prompts |
| Break tasks down | Modular prompts beat monolithic specs |
| Verify always | AI output requires human and automated validation |
| Manage, don't vibe | Treat agents like junior developers needing oversight |
