Paul Gauthier's Terminal-First AI Coding

Paul Gauthier is the creator of Aider, the open-source AI pair programming tool that runs in your terminal. Before Aider, he co-founded Inktomi in 1996, one of the original internet search engine companies that powered HotBot and Yahoo before being acquired. He served as CTO there until 2010.

Aider has 40,000+ GitHub stars and achieved state-of-the-art results on SWE-Bench, the benchmark for real-world coding tasks. But what makes it different from tools like Cursor or Copilot is the philosophy: Aider is a pair programming tool, not an agent.

Background

Gauthier graduated from Dalhousie University in Nova Scotia (BSc ’94) and earned a graduate degree from the University of Washington. He started Aider in April 2023, bringing three decades of systems programming experience to the problem of AI-assisted coding.

Beyond Inktomi, his background includes founding Ludic Labs, serving as a Technical Fellow at Geomagical Labs, and board positions at several tech companies.

Pair Programming, Not Agents

On Aider’s Discord, users often ask about running it as an autonomous agent. Gauthier’s consistent response: “Aider is an AI pair programming tool.”

The distinction matters. Agent-oriented tools like Devin keep working until a task is complete. Aider executes one step, waits for feedback, then proceeds. As in real pair programming, there is a driver and a partner watching the code in real time.

From the SWE-Bench blog post:

“Aider achieved this result mainly through its existing features that focus on static code analysis, reliable LLM code editing, and pragmatic UX for automatically fixing lint errors and test failures.”

Aider intentionally avoids the “long delays, high token costs” that come with extensive autonomous exploration. It does not use RAG or vector search, and it does not let the LLM execute arbitrary code.

The Repository Map

Large codebases break context windows. Aider’s solution is a repository map built with tree-sitter that shows the LLM only what matters.

The algorithm:

  1. Parse every file into an abstract syntax tree
  2. Extract function, class, and type definitions
  3. Build a call graph showing dependencies
  4. Rank files by how often they’re referenced
  5. Select the most important signatures to fit the token budget

The result looks like this:

src/auth/
  login.py: def authenticate(user, password), class Session
  tokens.py: def generate_jwt(payload), def verify_jwt(token)
src/api/
  routes.py: def register_routes(app), class UserEndpoint

The LLM sees class signatures and function definitions from the entire repo without reading every file. That gives it enough context to understand how the code connects without exhausting the context window.
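
A rough sketch of that ranking loop, using Python's built-in ast module in place of tree-sitter (Aider's real implementation uses tree-sitter parsers and a graph-based ranking step; the weighting and names here are purely illustrative):

import ast
from pathlib import Path

def extract_defs(source: str) -> list[str]:
    """Collect top-level function and class signatures from one file."""
    sigs = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            sigs.append(f"def {node.name}({args})")
        elif isinstance(node, ast.ClassDef):
            sigs.append(f"class {node.name}")
    return sigs

def build_repo_map(root: str, budget: int = 50) -> str:
    files = {p: p.read_text() for p in Path(root).rglob("*.py")}
    defs = {p: extract_defs(src) for p, src in files.items()}

    # Rank each file by how often its definitions are referenced in other
    # files, a crude stand-in for Aider's dependency-graph ranking.
    def score(path):
        names = [sig.split()[1].split("(")[0] for sig in defs[path]]
        return sum(other.count(name)
                   for p, other in files.items() if p != path
                   for name in names)

    ranked = sorted(defs, key=score, reverse=True)

    # Emit the highest-ranked signatures until the budget is exhausted.
    lines = []
    for path in ranked:
        for sig in defs[path]:
            if len(lines) >= budget:
                return "\n".join(lines)
            lines.append(f"{path}: {sig}")
    return "\n".join(lines)

print(build_repo_map("src"))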

Edit Formats

Different LLMs need different approaches for reliable editing. Aider supports three strategies:

Format        How It Works                                Best For
Whole File    LLM outputs complete file contents          Small files, simple models
Edit Block    Search/replace blocks with exact matching   GPT-4, Claude
Unified Diff  Standard diff format (-/+ lines)            Latest models like GPT-4o
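
With the edit block format, for instance, the model emits search/replace blocks that Aider matches verbatim against the current file contents. The markers below follow Aider’s documented edit block format; the file and change are the same illustrative login example used for the diff that follows:

src/auth/login.py
<<<<<<< SEARCH
    return create_session(user)
=======
    session = create_session(user)
    log_login(user, session)
    return session
>>>>>>> REPLACE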

Aider benchmarks each LLM extensively to determine which format produces the most reliable edits. The unified diff format uses line numbers and context matching to apply precise changes even in large files.

--- src/auth/login.py
+++ src/auth/login.py
@@ -12,7 +12,9 @@ def authenticate(user, password):
     if not user.is_active:
         raise AuthError("Account disabled")
-    return create_session(user)
+    session = create_session(user)
+    log_login(user, session)
+    return session

Built-in tree-sitter linting catches syntax errors before anything is committed. If tests fail, Aider can automatically retry, feeding the error output back to the LLM as context.
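
Hooking this into a project takes one flag. The --test-cmd and --auto-test options are part of Aider’s CLI; the pytest invocation is just an example:

# Run the test suite after each change; on failure, retry with the output as context
aider --test-cmd "pytest -q" --auto-test

# Or trigger the same checks on demand from inside the chat
/lint
/test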

Context Control

One of Aider’s strengths is explicit file management. You decide exactly what the LLM sees:

# Start in a git repo
aider

# Add files to the chat context
/add src/auth/*.py
/add tests/test_auth.py

# Mark files as read-only (context but not editable)
/read docs/api-spec.md

# Remove files from context
/drop src/unrelated.py

# Ask for changes
> Add rate limiting to the login endpoint

This avoids a common failure mode where AI tools edit files you never intended to change. Read-only context supplies documentation and examples without any risk of modification.

Git Integration

Every change Aider makes gets a git commit with a descriptive message:

feat: Add rate limiting to login endpoint

Adds a RateLimiter class that tracks login attempts per IP.
Limits to 5 attempts per minute with exponential backoff.

If the change is wrong, git diff HEAD~1 shows exactly what happened, and git revert HEAD (or Aider’s /undo command) backs it out. No hunting through file history.
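
In practice the recovery loop is plain git:

git diff HEAD~1     # inspect exactly what the last Aider commit changed
git revert HEAD     # back it out with a new commit (or run /undo inside Aider)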

Aider works with any LLM that has an API. The default is Claude Sonnet, but you can point it at GPT-4, local models via Ollama, or anything OpenAI-compatible.
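
Switching models is a command-line flag. The --model and --openai-api-base options are part of Aider’s CLI; the specific model names below are only examples and change over time:

aider --model sonnet              # Claude Sonnet via Aider's built-in alias
aider --model gpt-4o              # an OpenAI model
aider --model ollama/llama3.1     # a local model served through Ollama

# Any OpenAI-compatible endpoint
aider --model openai/my-model --openai-api-base http://localhost:8000/v1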

Benchmark Results

Aider scored 26.3% on SWE-Bench Lite (state-of-the-art at the time) and 18.9% on the full SWE-Bench. The benchmark submits real GitHub issues to the tool and checks if the solution passes the test suite.

Key details from the benchmark:

The benchmark harness automatically accepted Aider’s suggestions and retried up to six times. Solutions were only counted if the code had no syntax errors and passed pre-existing tests.
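
A minimal sketch of that acceptance rule, assuming Python sources and a pytest suite (illustrative only, not the actual SWE-Bench or Aider harness):

import ast
import subprocess

def candidate_accepted(edited_files, test_cmd=("pytest", "-q")):
    """Count a candidate change only if it introduces no syntax errors
    and the pre-existing test suite still passes."""
    for path in edited_files:
        try:
            with open(path) as f:
                ast.parse(f.read())  # reject patches that no longer parse
        except SyntaxError:
            return False
    return subprocess.run(list(test_cmd)).returncode == 0  # reject patches that break tests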

Getting Started

pip install aider-chat
export ANTHROPIC_API_KEY=your-key-here

# Start in your project
cd your-repo
aider

# Or specify files and a model directly ('sonnet' is Aider's alias for Claude Sonnet)
aider src/auth/*.py --model sonnet

Common commands in the chat:

Command         Action
/add file.py    Add to editable context
/read docs.md   Add as read-only
/drop file.py   Remove from context
/diff           Show last change
/undo           Revert last commit
/run pytest     Execute command, share output

Key Takeaways

Principle                     Implementation
Interactive over autonomous   Step-by-step with human approval
Static analysis over RAG      Tree-sitter repo maps instead of vector search
Explicit context control      /add, /read, /drop commands
Git as safety net             Every change committed automatically
Benchmark-driven development  Extensive testing of edit formats per LLM

Next: Simon Willison’s Workflow

Topics: aider ai-coding open-source workflow