Harrison Chase's Context Engineering Framework
Harrison Chase built the most widely-used framework for AI agents, but his most valuable contribution isn’t code—it’s a mental model. After three years and 80 million monthly downloads of LangChain, he’s crystallized what actually makes AI systems work: context engineering.
Background
- Started LangChain as an 800-line Python side project in October 2022
- Co-founded LangChain (the company) in February 2023 with Ankush Gola
- Raised $125M at $1.25B valuation in October 2025
- Harvard graduate, previously at Kensho doing ML engineering
- Customers include Rippling, Vanta, Cloudflare, Replit, Harvey
Twitter: @hwchase17 · GitHub: hwchase17 · LangChain Blog
Context Engineering
Chase’s core insight: when agents fail, it’s rarely because the model isn’t smart enough. It’s because you didn’t give it what it needed.
His definition:
“Context engineering is building dynamic systems to provide the right information and tools in the right format such that the LLM can plausibly accomplish the task.”
This breaks down into five components:
| Component | What It Means |
|---|---|
| Dynamic system | Context comes from many sources—developer, user, previous interactions, tool calls, external data |
| Right information | LLMs can’t read minds. Garbage in, garbage out |
| Right tools | The model needs tools to look up info, take actions, or fill gaps |
| Right format | A short descriptive error beats a giant JSON blob |
| Plausibly accomplish | Ask yourself: given this context, could a smart human solve it? |
Why this matters more than “prompt engineering”:
Early developers obsessed over phrasing prompts cleverly. Chase argues that’s backwards—providing complete, structured context matters far more than magic wording. Prompt engineering is just a subset of context engineering.
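To make the "dynamic system" idea concrete, here is a minimal Python sketch (my illustration, not Chase's code) that assembles a model's messages at call time from developer instructions, user data, retrieved documents, prior turns, and tool summaries. The names, product, and message-dict shape are all assumptions.

```python
from typing import TypedDict


class Profile(TypedDict):
    plan: str
    locale: str


def build_context(
    user_message: str,
    profile: Profile,
    retrieved_docs: list[str],
    tool_summaries: list[str],
    history: list[dict],
) -> list[dict]:
    """Assemble context at runtime from several sources: developer
    instructions, user data, retrieved docs, prior turns, tool output."""
    system = (
        "You are a support agent for Acme (hypothetical product).\n"
        f"User plan: {profile['plan']}; locale: {profile['locale']}.\n"
        "Relevant docs:\n" + "\n".join(f"- {d}" for d in retrieved_docs) + "\n"
        "Recent tool results:\n" + "\n".join(f"- {t}" for t in tool_summaries)
    )
    return [
        {"role": "system", "content": system},   # developer + retrieved context
        *history,                                # previous interactions
        {"role": "user", "content": user_message},
    ]


messages = build_context(
    "Why was my invoice charged twice?",
    profile={"plan": "pro", "locale": "en-US"},
    retrieved_docs=["Duplicate charges are usually pending authorizations."],
    tool_summaries=["billing_lookup: 1 settled charge, 1 pending hold"],
    history=[],
)
```

The point is not the specific helpers but that nothing in the final prompt is static: every piece is chosen and formatted per request.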
The Agentic Spectrum
Chase rejects the binary “is this an agent?” question. Instead, he thinks in terms of how agentic a system is:
Workflows — LLMs and tools orchestrated through predefined code paths. Predictable, reliable, you control the flow.
Agents — LLMs dynamically direct their own processes and tool usage. Flexible, but less predictable.
Most production systems combine both. The spectrum:
| Pure Workflow | Pure Agent |
|---|---|
| Deterministic flow | LLM decides everything |
| Easy to debug | Harder to predict |
| Lower capability | Higher capability ceiling |
His advice: “Find the simplest solution possible, and only increase complexity when needed.”
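A rough sketch of the two ends of that spectrum, assuming a generic `call_llm` callable rather than any specific SDK: the workflow fixes the code path and the model fills in each step, while the agent loop lets the model pick tools until it answers or runs out of steps.

```python
# Contrasting sketch. `call_llm` is an assumed stand-in for a chat-completion
# client: given messages it returns a string, and when offered tools it returns
# either {"answer": ...} or {"tool": name, "args": {...}}.

def workflow(ticket: str, call_llm) -> str:
    """Workflow end: a predefined code path; the LLM fills in each step."""
    category = call_llm([{"role": "user", "content": f"Classify this support ticket: {ticket}"}])
    return call_llm([{"role": "user", "content": f"Draft a reply to this '{category}' ticket: {ticket}"}])


def agent(ticket: str, call_llm, tools: dict, max_steps: int = 5) -> str:
    """Agent end: the LLM decides which tool to run next, or answers."""
    messages = [{"role": "user", "content": ticket}]
    for _ in range(max_steps):
        step = call_llm(messages, tools=list(tools))
        if "answer" in step:
            return step["answer"]
        result = tools[step["tool"]](**step["args"])   # run the chosen tool
        messages.append({"role": "tool", "content": str(result)})
    return "Step budget exhausted; hand off to a human."
```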
Tool Stack
| Tool | How He Uses It |
|---|---|
| LangGraph | Event-driven framework for agentic systems—supports workflows, agents, and everything in between |
| LangSmith | Observability and evals—see exactly what context went into each LLM call |
| LangChain | High-level abstractions for getting started quickly |
The key design principle: full control over context. LangGraph exposes everything—no hidden prompts, no magic. You see exactly what goes into the LLM.
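For illustration, a minimal LangGraph graph with the retrieval and model steps stubbed out. This is my sketch, assuming a recent `langgraph` release; its only purpose is to show that the state flowing into each node is explicit, with no hidden prompts.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, START, END


class State(TypedDict):
    question: str
    context: str
    answer: str


def retrieve(state: State) -> dict:
    # Stub: in a real graph this would fetch documents or call a tool.
    return {"context": f"docs relevant to: {state['question']}"}


def respond(state: State) -> dict:
    # Stub: in a real graph this would call an LLM with the assembled context.
    return {"answer": f"Answer based on [{state['context']}]"}


graph = StateGraph(State)
graph.add_node("retrieve", retrieve)
graph.add_node("respond", respond)
graph.add_edge(START, "retrieve")
graph.add_edge("retrieve", "respond")
graph.add_edge("respond", END)

app = graph.compile()
print(app.invoke({"question": "What is context engineering?", "context": "", "answer": ""}))
```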
Debugging Philosophy
When Chase debugs agent failures, he asks:
- Did it have the right information? Check if the model received the context needed to make a good decision
- Was it formatted well? Poor formatting kills performance
- Did it have the right tools? If it needed to look something up, could it?
- Did the model just mess up? Only after ruling out context problems
His claim: context issues, not model limitations, cause most failures, and that becomes even more true as models improve.
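One way to operationalize that checklist (my sketch, not his) is a context-first review over whatever you logged for the failing call, run before touching the model or the prompt wording. The trace field names here are assumptions.

```python
# Hypothetical pre-debug gate over a logged trace of one failing call.
# `trace` holds the exact messages sent, the tools offered, and the output;
# the keys used below are assumed, not from any particular tracing tool.

def context_first_review(trace: dict) -> list[str]:
    findings = []
    if not trace.get("retrieved_context"):
        findings.append("No retrieved context: did it have the right information?")
    if any(len(m.get("content", "")) > 20_000 for m in trace.get("messages", [])):
        findings.append("Very long message: was it formatted well, or a raw dump?")
    if not trace.get("tools"):
        findings.append("No tools offered: could it look anything up?")
    if not findings:
        findings.append("Context looks plausible; now consider a model mistake.")
    return findings
```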
Why This Matters
Context engineering represents a shift in how we think about AI development. The field spent years focused on model capabilities—making LLMs smarter, faster, cheaper. Chase’s framework redirects attention to the interface between humans and models.
This matters because it’s actionable. You can’t make GPT-4 smarter, but you can control what you feed it. Every developer has the power to improve their agent’s reliability by improving context quality. No waiting for the next model release.
The framework also explains why so many AI products feel half-baked. Teams ship impressive demos, then struggle in production. The demo had perfect context—a controlled example where the developer knew exactly what information the model needed. Production has messy, incomplete, poorly-formatted context. The model didn’t get dumber; the context got worse.
Chase’s “plausibly accomplish” test cuts through the noise. Before blaming the model, ask: given what I sent it, could a smart human solve this? If not, the bug is in your context engineering, not the AI.
What You Can Steal
| Technique | How to Apply |
|---|---|
| Trace every LLM call | Log the exact input/output of every model call. When it fails, check context first |
| The “plausible” test | Before debugging: “Given this context, could a human solve it?” If no, fix context |
| Start with workflows | Build deterministic pipelines first. Add agent flexibility only where needed |
| Format matters | Structure tool outputs for LLM consumption. Short summaries beat raw data |
| Own your prompts | Never use hidden prompts from frameworks. You need to see and control everything |
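The first technique, tracing every LLM call, is easy to hand-roll even before adopting an observability tool. A minimal sketch, where `client` is assumed to be any callable that takes a messages list and returns text (not a specific SDK):

```python
import json
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm.trace")


def traced_call(client, messages: list[dict], **kwargs) -> str:
    """Log the exact input and output of one model call before returning it."""
    start = time.time()
    output = client(messages, **kwargs)
    log.info(json.dumps({
        "messages": messages,                              # the exact context the model saw
        "kwargs": {k: str(v) for k, v in kwargs.items()},  # tools, temperature, etc.
        "output": output,
        "latency_s": round(time.time() - start, 2),
    }))
    return output
```

When a call fails, the log already contains everything needed to run the "plausible" test on it.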
Links
- LangChain
- LangGraph Docs
- LangSmith
- GitHub: @hwchase17
- Twitter: @hwchase17
- The Rise of Context Engineering