# Context Window Management
Every AI conversation has a hidden limit: the context window. Exceed it and your AI forgets, hallucinates, or gives garbage output. Managing context is the difference between a useful session and a frustrating one.
## Why context matters
The context window is your AI’s working memory. Everything goes in:
| What counts | Example size |
|---|---|
| System prompt | 2-5K tokens |
| Your messages | Variable |
| AI responses | Often 2-3x your input |
| File contents | 1K tokens per ~750 words |
| Tool calls and results | Adds up fast |
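To see how these pieces add up, here is a minimal sketch of a usage estimate. It assumes the common rough heuristic of ~4 characters per token for English text; the function names and the 200K window default are illustrative, not part of any real API. Use your provider's tokenizer for accurate counts.

```python
# Rough token estimate using the ~4 characters-per-token heuristic for
# English text. This is an illustration, not a real tokenizer.

def estimate_tokens(text: str) -> int:
    """Approximate token count: ~1 token per 4 characters."""
    return max(1, len(text) // 4)

def context_usage(parts: dict[str, str], window: int = 200_000) -> float:
    """Fraction of the context window consumed by the given parts."""
    total = sum(estimate_tokens(t) for t in parts.values())
    return total / window

usage = context_usage({
    "system_prompt": "x" * 12_000,   # ~3K tokens
    "file_contents": "y" * 40_000,   # ~10K tokens
})
print(f"{usage:.1%}")  # 6.5%
```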
When context fills up:
- AI “forgets” early instructions
- Responses become generic
- Reasoning quality degrades
- Hallucinations increase
## The 60% rule
Never exceed 60% context usage. Quality degrades before you hit the limit.
```
[============================                    ]  60% - Sweet spot
[======================================          ]  80% - Quality dropping
[================================================] 100% - Broken
```
| Context level | What happens |
|---|---|
| Under 40% | Peak performance |
| 40-60% | Still good, monitor |
| 60-80% | Noticeable degradation |
| 80%+ | Start new conversation |
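The table above maps directly to a simple decision function. This is a sketch of those thresholds; the function name and return strings are illustrative, not part of any tool.

```python
# Encode the context-level table as a lookup: given a usage fraction,
# return the recommended action. Thresholds match the table above.

def context_advice(usage: float) -> str:
    """Map a context-usage fraction (0.0-1.0) to the recommended action."""
    if usage < 0.40:
        return "peak performance"
    if usage < 0.60:
        return "still good, monitor"
    if usage < 0.80:
        return "noticeable degradation"
    return "start new conversation"

print(context_advice(0.55))  # still good, monitor
print(context_advice(0.85))  # start new conversation
```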
## Signs of context overflow
Your AI is struggling when:
- It forgets instructions you gave earlier
- Responses become vague or generic
- It repeats itself
- It contradicts previous statements
- Code quality drops
- It stops following your CLAUDE.md rules
## Strategies

### Phase-based work
Split complex work into discrete phases. Clear context between each.
```
Phase 1: Research
├── Explore codebase
├── Save findings to thoughts/research.md
└── End conversation

Phase 2: Plan
├── Read thoughts/research.md
├── Create PLAN.md
└── End conversation

Phase 3: Implement
├── Read PLAN.md
├── Execute tasks
└── End conversation

Phase 4: Validate
├── Run tests
├── Fix issues
└── End conversation
```
Each phase starts fresh with only the context it needs.
### Save to files
Offload context to persistent storage:
```shell
# Create a thoughts directory
mkdir -p thoughts/

# During work, save important context
claude "Save your current understanding to thoughts/project-state.md"

# In a new conversation, load it back
claude "Read thoughts/project-state.md and continue"
```
What to save:
- Key decisions and rationale
- Current progress
- Blockers encountered
- Next steps
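The four items above make a natural file template. Here is a hypothetical snapshot writer that persists them to `thoughts/project-state.md` so a fresh conversation can reload them; the function name and section layout are assumptions, not a standard format.

```python
# Hypothetical state-snapshot writer: saves key decisions, progress,
# blockers, and next steps as a markdown file a new session can read.
from pathlib import Path

def save_state(decisions, progress, blockers, next_steps,
               path="thoughts/project-state.md"):
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    sections = {
        "Key decisions": decisions,
        "Current progress": progress,
        "Blockers": blockers,
        "Next steps": next_steps,
    }
    lines = ["# Project state"]
    for title, items in sections.items():
        lines.append(f"\n## {title}")
        lines.extend(f"- {item}" for item in items)
    p.write_text("\n".join(lines) + "\n")
    return p

save_state(
    decisions=["Auth uses JWT with 24h expiry"],
    progress=["Login endpoint done"],
    blockers=[],
    next_steps=["Add refresh tokens"],
)
```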
### Use Memory MCP
For cross-session persistence:
```shell
# Save important context
claude "Remember: the auth system uses JWT with 24h expiry"

# In any future session
claude "What do you remember about our auth system?"
```
See Building a Memory System for setup.
### Start fresh strategically
Starting a new conversation is not failure. It's a tool.
| Situation | Action |
|---|---|
| Task complete | New conversation |
| Context over 60% | Save state, new conversation |
| AI seems confused | New conversation |
| Switching tasks | New conversation |
## Monitoring context

### Claude Code
```
# Check current usage
/context

# Compact to reduce usage
/compact
```
### API usage
Track token counts in responses:
```json
{
  "usage": {
    "input_tokens": 15234,
    "output_tokens": 892
  }
}
```
Compare against model limits:
| Model | Context window |
|---|---|
| Claude Sonnet | 200K tokens |
| Claude Opus | 200K tokens |
| GPT-4 Turbo | 128K tokens |
60% of 200K = 120K tokens max recommended.
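Putting the `usage` block and the model table together, a sketch of an over-budget check might look like this. The model names in the lookup table are illustrative labels, not official API identifiers.

```python
# Sketch: compare a response's token usage against 60% of the model's
# context window. Limits follow the table above; model keys are assumed.

MODEL_LIMITS = {
    "claude-sonnet": 200_000,
    "claude-opus": 200_000,
}

def over_budget(usage: dict, model: str, threshold: float = 0.60) -> bool:
    """True when cumulative tokens exceed the recommended budget."""
    used = usage["input_tokens"] + usage["output_tokens"]
    return used > threshold * MODEL_LIMITS[model]

response_usage = {"input_tokens": 15_234, "output_tokens": 892}
print(over_budget(response_usage, "claude-sonnet"))  # False: ~16K of a 120K budget
```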
## When to start new
Trigger a new conversation when:
- Context exceeds 60%
- AI forgets earlier instructions
- Response quality drops
- Task is complete
- Switching to unrelated work
- You’ve been going for 30+ messages
Before starting fresh:
- Save current state to file
- Note any unfinished tasks
- Capture key decisions
## The workflow
```
Start task
    ↓
Work (monitor context)
    ↓
Context > 60%? ──Yes──→ Save state to file
    ↓ No                     ↓
Continue             Start new conversation
    ↓                        ↓
Task complete?       Load state from file
    ↓ Yes                    ↓
Save learnings       Continue work
    ↓
End
```
Long conversations feel productive. Fresh conversations are productive.
Next: Building a Memory System