Context Window Management

Every AI conversation has a hidden limit: the context window. Exceed it and your AI forgets, hallucinates, or gives garbage output. Managing context is the difference between a useful session and a frustrating one.

Why context matters

The context window is your AI’s working memory. Everything goes in:

| What counts | Example size |
|---|---|
| System prompt | 2-5K tokens |
| Your messages | Variable |
| AI responses | Often 2-3x your input |
| File contents | ~1K tokens per 750 words |
| Tool calls and results | Adds up fast |
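The file-contents figure above gives a quick rule of thumb: at roughly 1K tokens per 750 words, tokens ≈ words × 4/3. A minimal sketch (the file path is a placeholder):

```shell
#!/bin/sh
# Rough token estimate from word count, using the ~1K tokens per
# 750 words ratio above (i.e. tokens ≈ words * 4/3).
estimate_tokens() {
  words=$(wc -w < "$1")
  echo $(( words * 4 / 3 ))
}
```

This is only an approximation; real tokenizers vary by model and by content (code tokenizes denser than prose).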

When context fills up:

- Earlier instructions and decisions get forgotten
- Hallucinations become more likely
- Output quality drops

The 60% rule

Never exceed 60% context usage. Quality degrades before you hit the limit.

[============================                    ] 60% - Sweet spot
[========================================        ] 80% - Quality dropping
[================================================] 100% - Broken

| Context level | What happens |
|---|---|
| Under 40% | Peak performance |
| 40-60% | Still good, monitor |
| 60-80% | Noticeable degradation |
| 80%+ | Start new conversation |

Signs of context overflow

Your AI is struggling when:

- It forgets instructions you gave earlier in the conversation
- It contradicts its own earlier answers or decisions
- It hallucinates files, functions, or APIs that don't exist
- Responses turn vague, repetitive, or generic

Strategies

Phase-based work

Split complex work into discrete phases. Clear context between each.

Phase 1: Research
├── Explore codebase
├── Save findings to thoughts/research.md
└── End conversation

Phase 2: Plan
├── Read thoughts/research.md
├── Create PLAN.md
└── End conversation

Phase 3: Implement
├── Read PLAN.md
├── Execute tasks
└── End conversation

Phase 4: Validate
├── Run tests
├── Fix issues
└── End conversation

Each phase starts fresh with only the context it needs.

Save to files

Offload context to persistent storage:

# Create a thoughts directory
mkdir -p thoughts/

# During work, save important context
claude "Save your current understanding to thoughts/project-state.md"

# In new conversation, load it back
claude "Read thoughts/project-state.md and continue"

What to save:

- Your current understanding of the project
- Key decisions and the reasoning behind them
- Unfinished tasks and next steps
- Research findings (e.g. thoughts/research.md)

Use Memory MCP

For cross-session persistence:

# Save important context
claude "Remember: the auth system uses JWT with 24h expiry"

# In any future session
claude "What do you remember about our auth system?"

See Building a Memory System for setup.

Start fresh strategically

Starting a new conversation is not failure. It’s a tool.

| Situation | Action |
|---|---|
| Task complete | New conversation |
| Context over 60% | Save state, new conversation |
| AI seems confused | New conversation |
| Switching tasks | New conversation |

Monitoring context

Claude Code

# Check current usage
/context

# Compact to reduce usage
/compact

API usage

Track token counts in responses:

{
  "usage": {
    "input_tokens": 15234,
    "output_tokens": 892
  }
}

Compare against model limits:

| Model | Context window |
|---|---|
| Claude Sonnet | 200K tokens |
| Claude Opus | 200K tokens |
| GPT-4 Turbo | 128K tokens |

60% of 200K = 120K tokens max recommended.
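The same arithmetic can run against the API's usage object. A minimal sketch that pulls `input_tokens` out of the response with `sed` (a real integration would use a JSON parser such as `jq`) and reports usage as a percentage of Claude's 200K window:

```shell
#!/bin/sh
# Report context usage as a percentage of a 200K-token window.
# $1 is the JSON usage object from an API response.
usage_pct() {
  tokens=$(printf '%s' "$1" | \
    sed -n 's/.*"input_tokens":[[:space:]]*\([0-9]*\).*/\1/p')
  echo $(( tokens * 100 / 200000 ))
}
```

Anything this prints above 60 is your signal to save state and start fresh.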

When to start new

Trigger a new conversation when:

- The current task is complete
- Context usage passes 60%
- The AI seems confused or keeps making mistakes
- You’re switching to an unrelated task

Before starting fresh:

  1. Save current state to file
  2. Note any unfinished tasks
  3. Capture key decisions

The workflow

Start task
Work (monitor context)
Context > 60%? ──Yes──→ Save state to file
    ↓ No                      ↓
Continue              Start new conversation
    ↓                         ↓
Task complete?        Load state from file
    ↓ Yes                     ↓
Save learnings        Continue work
End

Long conversations feel productive. Fresh conversations are productive.


Next: Building a Memory System

Topics: memory ai-agents prompting