the personal AI stack: what you actually need


by Ray Svitla


everyone’s personal AI stack looks like a hoarder’s basement. twelve subscriptions, thirty browser tabs, four chat apps, and a Notion page titled “AI tools to try” that hasn’t been updated since March.

you don’t need most of it. here’s what you actually need.


the minimum viable stack

three layers. that’s it.

layer 1: the agent

you need one AI agent that can read files, write files, run commands, and connect to external tools. this is your primary interface to AI. not a chat window. not a browser extension. an agent that operates on your actual work.

Claude Code is what I use. terminal-native, skill-extensible, MCP-compatible. see the setup guide and the beginner tutorial.

alternatives that work: Cursor if you want an IDE, Aider if you want multi-model terminal coding, Copilot if you want autocomplete as the primary interaction. pick one. use it for everything before adding another.

layer 2: the tools (MCP servers)

your agent needs to interact with the world beyond your filesystem. MCP servers handle this.

the minimum set:

# search the web (these use the MCP reference server packages; exact
# package names may differ from what's current when you read this)
claude mcp add brave-search --env BRAVE_API_KEY=... -- npx -y @modelcontextprotocol/server-brave-search

# read web pages
claude mcp add web-fetch -- uvx mcp-server-fetch

# manage code
claude mcp add github -- npx -y @modelcontextprotocol/server-github

# confirm what's registered
claude mcp list

three servers. web search, web reading, code management. this covers 80% of what most people need an AI to do externally. add more as specific needs arise, not before. see best MCP servers.

layer 3: the memory

AI agents forget everything between sessions. this is the biggest practical problem in daily AI use. you need something that remembers.

options, from simple to complex:

CLAUDE.md files. the simplest memory system. write what matters in a file the agent reads on startup. update it manually. zero infrastructure. surprisingly effective.
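a minimal sketch of what such a file might look like — the headings and contents here are invented for illustration; structure yours around whatever the agent keeps getting wrong:

```markdown
# who I am
backend engineer, mostly Go and Postgres. terse answers preferred.

# conventions
- commit messages: imperative mood, no trailing period
- tests live next to the code they cover

# current focus
migrating the billing service off the legacy queue. don't suggest rewrites.
```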

structured markdown logs. daily notes in ~/memory/YYYY-MM-DD.md. the agent reads recent logs at session start. cheap, portable, human-readable.
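the "read recent logs at session start" step can be sketched in a few lines — the directory layout matches the convention above, but the helper itself is illustrative, not a built-in Claude Code feature:

```python
from datetime import date, timedelta
from pathlib import Path

def recent_logs(memory_dir: Path, days: int = 3) -> str:
    """Collect the last `days` daily notes, oldest first, so an agent
    can be handed them as context at session start."""
    chunks = []
    for offset in range(days - 1, -1, -1):
        day = date.today() - timedelta(days=offset)
        note = memory_dir / f"{day:%Y-%m-%d}.md"
        if note.exists():
            chunks.append(f"## {day:%Y-%m-%d}\n{note.read_text().strip()}")
    return "\n\n".join(chunks)
```

feed the returned string into the session's opening prompt and the agent "remembers" the last few days for free.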

Mem0. a proper memory layer. stores facts, preferences, and context across sessions using vector and graph storage. more infrastructure, better recall. see the Mem0 profile.

start with CLAUDE.md files. upgrade when you hit the limits.


what you can skip

multiple chat interfaces

you don’t need ChatGPT and Claude and Gemini and Perplexity all open simultaneously. pick the one whose agent mode works best for your use case. use the others for specific tasks where they genuinely outperform.

prompt libraries

saved prompts are the bookmarks of 2024. you saved 200 of them. you use 3. the ones you actually reuse should be Claude Code custom commands or skills. the rest are digital clutter.

AI wrapper apps

there’s a new “AI-powered X” every week. AI email client, AI note-taking app, AI calendar, AI todo list. most of them are a thin wrapper around an API call with a subscription fee attached. Claude Code with the right MCP server usually does the same thing without the app.

local LLMs (for most people)

running Llama on your laptop is a fun weekend project. it’s not a productivity tool for most workflows. the models are smaller, slower, and less capable than API-hosted ones. if you have specific privacy requirements or love tinkering, sure. otherwise, it’s a distraction. see the honest take in local LLM runtimes.


how the layers interact

you → Claude Code → MCP servers → external world
         ↑
  CLAUDE.md + memory

you talk to Claude Code. Claude Code reads your instructions (CLAUDE.md) and memory (previous sessions). when it needs external data, it calls MCP servers. results flow back into the conversation.
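the order of operations can be sketched like this — none of these names are real Claude Code internals; an MCP server is modeled as a plain callable, and the "model decides to call a tool" step is passed in explicitly:

```python
from pathlib import Path

def build_context(workdir: Path, memory: list[str]) -> str:
    """Standing instructions (CLAUDE.md) first, then remembered context."""
    parts = []
    claude_md = workdir / "CLAUDE.md"
    if claude_md.exists():
        parts.append(claude_md.read_text())   # instructions
    parts.extend(memory)                      # prior-session memory
    return "\n\n".join(parts)

def run_turn(context: str, user_msg: str, tools: dict, tool_request=None) -> str:
    """One turn: if a tool was requested, its result flows back into
    the transcript before the agent replies."""
    transcript = f"{context}\n\nuser: {user_msg}"
    if tool_request is not None:
        name, args = tool_request
        transcript += f"\n\ntool[{name}]: {tools[name](args)}"
    return transcript
```

the point of the sketch: instructions and memory are loaded once up front, while tool results are appended per turn. that's why CLAUDE.md quality compounds and why tool bloat costs context tokens on every call.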

the quality of this stack depends on:

  1. instruction quality — how well your CLAUDE.md encodes your preferences and project knowledge. see why your CLAUDE.md sucks.
  2. memory quality — what gets remembered and what gets forgotten. see agent memory systems.
  3. tool selection — which MCP servers are loaded and when. see MCP server stacking.

the cost

the realistic monthly cost of this stack:

| component           | cost                    |
|---------------------|-------------------------|
| Claude Pro or Max   | $20-200/mo              |
| Brave Search API    | free (2000 queries/mo)  |
| GitHub              | free (public repos)     |
| Memory (CLAUDE.md)  | free                    |
| Memory (Mem0 cloud) | $0-49/mo                |
| total               | $20-250/mo              |

the low end ($20/mo) is Claude Pro with free-tier everything else. the high end ($250/mo) is Max plan with cloud memory. most people land around $20-60/mo.

compare this to: ChatGPT Plus ($20) + Cursor Pro ($20) + Notion AI ($10) + Perplexity ($20) + Grammarly ($12) + various other subscriptions = $80+/mo for fragmented tools that don’t talk to each other.
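the comparison as plain arithmetic, using the figures quoted above (not live pricing — the high end sums to $249, which rounds to the $250 ceiling):

```python
# the article's figures, USD/month
consolidated_low = {"Claude Pro": 20, "Brave Search API": 0,
                    "GitHub": 0, "CLAUDE.md memory": 0}
consolidated_high = {"Claude Max": 200, "Brave Search API": 0,
                     "GitHub": 0, "Mem0 cloud": 49}
fragmented = {"ChatGPT Plus": 20, "Cursor Pro": 20, "Notion AI": 10,
              "Perplexity": 20, "Grammarly": 12}

print(sum(consolidated_low.values()))   # 20
print(sum(consolidated_high.values()))  # 249
print(sum(fragmented.values()))         # 82
```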


the stack that doesn’t exist yet

what I want and can’t quite have in early 2026:

seamless cross-device memory. start a conversation on my laptop, continue it on my phone, finish it on my desktop. same agent, same memory, same context. we’re not there.

proactive agents. an agent that notices patterns and acts before I ask. “you always review PRs on Monday morning, there are 3 waiting” → without me prompting. pieces exist (proactive agent patterns) but it’s not smooth.

real privacy with cloud capability. I want cloud-grade AI quality with local-grade privacy. hybrid architectures are emerging but they’re complicated and fragile.


the actual advice

start with Claude Code and CLAUDE.md. use it for a week. notice what’s missing. add one MCP server. use it for a week. notice what’s missing. add memory if you need it.

resist the urge to install everything on day one. every tool you add is configuration you maintain, context tokens you spend, and complexity you manage.

the best personal AI stack is the smallest one that does what you need. which is, annoyingly, different for everyone. but the architecture — agent + tools + memory — is universal.

build that architecture with the minimum components. expand only when the gap between what you have and what you need becomes an actual bottleneck, not a theoretical one.

and if you take one thing from this: the three-layer pattern — agent, tools, memory — isn’t tied to any product. Claude Code could be replaced by whatever comes next. MCP servers could be replaced by a different protocol. the architecture remains. invest in understanding the layers, not the specific implementations filling them today.


Ray Svitla · stay evolving