Alex Newman's Persistent Memory for Claude Code

Alex Newman is a Philadelphia-based developer who spent 20 years building web products before turning his attention to AI tooling. His claude-mem plugin has 14.7k GitHub stars, making it one of the most popular Claude Code extensions. It solves a problem every Claude Code user knows: the AI forgets everything when you restart.

The Memory Problem

Every Claude Code session follows the same pattern:

  1. /init - Claude learns your codebase
  2. Work for an hour - context fills up
  3. /clear or restart - everything is gone
  4. Next session - re-explain everything

The CLAUDE.md workaround helps but requires manual maintenance. Newman’s insight: use AI to compress AI’s own observations, then inject relevant context automatically.

How claude-mem Works

The plugin captures everything Claude does during a session, compresses it with AI using the Claude Agent SDK, and injects relevant context back into future sessions.
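The capture, compress, inject loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the plugin's code: the `Observation` and `SessionMemory` classes and the `inject` function are hypothetical names, and simple truncation stands in for the AI compression step so the sketch runs offline.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    tool: str    # hypothetical field: which tool Claude invoked
    detail: str  # hypothetical field: what the tool did or returned


@dataclass
class SessionMemory:
    observations: list[Observation] = field(default_factory=list)

    def capture(self, tool: str, detail: str) -> None:
        """Record one action taken during the session."""
        self.observations.append(Observation(tool, detail))

    def compress(self, max_chars: int = 60) -> list[str]:
        """Stand-in for AI compression: one short line per observation.

        The article says the plugin summarizes with a model; truncation
        is used here only to keep the sketch self-contained.
        """
        return [f"{o.tool}: {o.detail[:max_chars]}" for o in self.observations]


def inject(summaries: list[str]) -> str:
    """Format compressed memory as a context block for the next session."""
    lines = ["Context from previous sessions:"]
    lines += [f"- {s}" for s in summaries]
    return "\n".join(lines)


memory = SessionMemory()
memory.capture("Read", "src/auth.py defines login() and a token refresh helper")
memory.capture("Edit", "Fixed off-by-one error in pagination")
context = inject(memory.compress())
print(context)
```

The point of the shape is that the next session receives a few summary lines rather than the full transcript of the previous one.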

Install from Claude Code:

> /plugin marketplace add thedotmack/claude-mem
> /plugin install claude-mem

After a restart, context from previous sessions appears automatically.

Architecture

Six core components handle the memory lifecycle:

Component         | Purpose
5 Lifecycle Hooks | SessionStart, UserPromptSubmit, PostToolUse, Stop, SessionEnd
Worker Service    | HTTP API on port 37777 with web UI
SQLite Database   | Persistent storage for sessions and observations
Chroma Vector DB  | Hybrid semantic and keyword search
mem-search Skill  | Natural language queries with progressive disclosure
Smart Install     | Dependency checker that caches prerequisites

The web viewer at localhost:37777 shows a real-time memory stream during sessions.
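To make the storage layer concrete, here is a rough sketch of how sessions and observations might be persisted in SQLite. The schema, table names, and helper functions are assumptions made for illustration; the article does not describe the plugin's actual tables. The search helper uses a plain `LIKE` match, which only approximates the keyword half of the hybrid search and says nothing about the Chroma side.

```python
import sqlite3

# Hypothetical schema: the plugin's real tables are not documented here.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id TEXT PRIMARY KEY,
    started_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS observations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL REFERENCES sessions(id),
    hook TEXT NOT NULL,      -- e.g. PostToolUse or Stop
    summary TEXT NOT NULL    -- compressed observation text
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record(conn: sqlite3.Connection, session_id: str, hook: str, summary: str) -> int:
    """Insert one observation and return its row ID."""
    cur = conn.execute(
        "INSERT INTO observations (session_id, hook, summary) VALUES (?, ?, ?)",
        (session_id, hook, summary),
    )
    conn.commit()
    return cur.lastrowid


def keyword_search(conn: sqlite3.Connection, term: str) -> list[tuple[int, str]]:
    """Return (id, summary) pairs whose summary contains the term."""
    rows = conn.execute(
        "SELECT id, summary FROM observations WHERE summary LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()


conn = open_store()
conn.execute(
    "INSERT INTO sessions (id, started_at) VALUES (?, ?)",
    ("s1", "2025-01-01T00:00:00Z"),
)
first = record(conn, "s1", "PostToolUse", "Edited auth module to refresh tokens")
second = record(conn, "s1", "Stop", "Session summary: pagination bug fixed")
matches = keyword_search(conn, "auth")
print(matches)
```

Keying observations by the session ID keeps the store aligned with whatever identifier the hook events carry, so no second session-tracking layer is needed.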

Progressive Disclosure

Newman’s approach to token efficiency uses a three-layer workflow for memory retrieval:

search      →  Compact index with IDs (~50-100 tokens/result)
timeline    →  Chronological context around results
get_details →  Full observation data (~500-1,000 tokens/result)

Filtering before fetching full details cuts token usage by roughly 10x. Claude searches the compact index first, narrows the results to the relevant observations, then retrieves full details only for those.
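The three layers can be sketched as three functions over a small in-memory store. Everything here is illustrative: the `STORE` contents, the function signatures, and the `token_cost` helper are assumptions, and the per-result token figures are midpoints of the ranges quoted above rather than measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    id: int
    title: str  # short label shown in the compact index
    body: str   # full text, fetched only on demand


# Hypothetical data; a real store would live in a database.
STORE = [
    Observation(1, "Set up auth routes", "Added /login and /logout handlers."),
    Observation(2, "Fixed token refresh bug", "Tokens expired early due to a timezone mismatch."),
    Observation(3, "Refactored pagination", "Replaced offset pagination with cursors."),
    Observation(4, "Added auth tests", "Covered login failure and token expiry paths."),
]


def search(query: str) -> list[tuple[int, str]]:
    """Layer 1: compact index of matching IDs and titles only."""
    q = query.lower()
    return [(o.id, o.title) for o in STORE if q in o.title.lower()]


def timeline(obs_id: int, window: int = 1) -> list[tuple[int, str]]:
    """Layer 2: neighbours of one result, in chronological (ID) order."""
    return [(o.id, o.title) for o in STORE if abs(o.id - obs_id) <= window]


def get_details(ids: list[int]) -> list[str]:
    """Layer 3: full bodies, only for IDs that survived filtering."""
    wanted = set(ids)
    return [o.body for o in STORE if o.id in wanted]


def token_cost(results: int, fetched: int,
               index_tokens: int = 75, detail_tokens: int = 750) -> tuple[int, int]:
    """Compare layered retrieval with fetching full details for every result."""
    layered = results * index_tokens + fetched * detail_tokens
    naive = results * detail_tokens
    return layered, naive


hits = search("auth")
nearby = timeline(2)
details = get_details([2])
layered, naive = token_cost(results=20, fetched=2)
print(hits, nearby, details, layered, naive)
```

With these midpoint figures, 20 results with 2 expanded cost 3,000 tokens instead of 15,000. The saving approaches the full 10x only when very few results are expanded, which is the behaviour the layered workflow encourages.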

Privacy Controls

Sensitive content stays out of memory with <private> tags:

<private>
API_KEY=sk-secret-key
</private>

Everything between the tags gets excluded from compression and storage.
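One plausible way to implement this exclusion is a regular expression that removes tagged spans before anything is compressed or stored. This is a sketch of the idea, not the plugin's code; `strip_private` is a hypothetical helper, and it assumes tags are well formed and not nested.

```python
import re

# Non-greedy match so each <private>...</private> pair is removed separately;
# DOTALL lets the match span multiple lines.
PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text: str) -> str:
    """Remove every <private>...</private> span, tags included."""
    return PRIVATE_BLOCK.sub("", text)


sample = (
    "Deploy notes\n"
    "<private>\nAPI_KEY=sk-secret-key\n</private>\n"
    "Run the migration first."
)
cleaned = strip_private(sample)
print(cleaned)
```

Note that an unclosed `<private>` tag does not match this pattern and would pass through untouched, so a stricter implementation might reject or truncate such input instead.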

Configuration

Settings live in ~/.claude-mem/settings.json:

{
  "aiModel": "claude-sonnet-4-20250514",
  "workerPort": 37777,
  "dataDirectory": "~/.claude-mem/data",
  "logLevel": "info",
  "contextInjection": {
    "enabled": true,
    "maxTokens": 4000
  }
}
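A tool reading this file would typically merge it over built-in defaults so that a partial settings file still works. The loader below is a hypothetical sketch of that pattern; the key names come from the example above, but the merge behaviour and the `load_settings` function are assumptions rather than documented plugin behaviour.

```python
import copy
import json
import os
import tempfile
from pathlib import Path

# Defaults mirror the example settings file shown above.
DEFAULTS = {
    "aiModel": "claude-sonnet-4-20250514",
    "workerPort": 37777,
    "dataDirectory": "~/.claude-mem/data",
    "logLevel": "info",
    "contextInjection": {"enabled": True, "maxTokens": 4000},
}


def load_settings(path: Path) -> dict:
    """Merge a user settings file over DEFAULTS (one level of nesting)."""
    settings = copy.deepcopy(DEFAULTS)
    if path.exists():
        user = json.loads(path.read_text())
        for key, value in user.items():
            if isinstance(value, dict) and isinstance(settings.get(key), dict):
                settings[key].update(value)  # merge nested sections
            else:
                settings[key] = value        # override scalar values
    # Expand "~" so the data directory is a usable filesystem path.
    settings["dataDirectory"] = os.path.expanduser(settings["dataDirectory"])
    return settings


# Demo with a temporary file instead of the real ~/.claude-mem/settings.json.
with tempfile.TemporaryDirectory() as tmp:
    settings_file = Path(tmp) / "settings.json"
    settings_file.write_text(
        json.dumps({"workerPort": 40000, "contextInjection": {"maxTokens": 2000}})
    )
    loaded = load_settings(settings_file)

print(loaded["workerPort"], loaded["contextInjection"])
```

Merging nested sections key by key means overriding `maxTokens` alone does not silently drop `enabled`.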

The Complexity Battle

Newman’s v8.1.0 release notes reveal a hard-won lesson about working with AI:

“For three months, Claude’s instinct to add code instead of delete it caused the same bugs to recur. What should have been 5 lines of code became ~1000 lines, 11 useless methods, and 7+ failed fixes. The real achievement: 984 lines of code deleted.”

The problem was Claude building redundant session management on top of session IDs that the hook system already provided. The fix: aggressive simplification.

Key Takeaways

Principle                  | Implementation
Compress, don’t transcribe | AI summaries beat raw logs for context retention
Progressive disclosure     | Layer retrieval to balance completeness with token cost
Let hooks handle state     | Don’t rebuild what the platform already provides
Delete more than you add   | Simplicity beats features for memory systems
Privacy by exclusion       | Explicit tags mark content to skip