Architecture
35 practitioners working with Architecture:
ACE Framework for Personal AI
Implement David Shapiro's six-layer cognitive architecture to give your Claude Code assistant mission, values, and strategic context
Agent Guardrails: Input/Output Validation for Autonomous Systems
How to implement runtime guardrails that validate agent inputs, filter outputs, and enforce business rules. Covers NeMo Guardrails, layered checking, and production patterns.
Agent Memory Systems
How AI agents implement memory: short-term context, long-term storage, vector retrieval, and the architecture that ties it together.
Agent Observability
How to implement distributed tracing, logging, and monitoring for AI agents using OpenTelemetry and purpose-built tools like Langfuse and Braintrust.
agent-first design
how to structure repositories so AI agents can actually work in them. the code isn't just for humans anymore.
Agentic Design Patterns: ReAct, Reflection, Planning, Tool Use
When to use ReAct loops, self-critique, task decomposition, and tool calling in AI agents. A practical pattern library for building effective agent systems.
agentic loops: observe, plan, act, verify
the core pattern of autonomous agents. simple in theory, messy in practice, and full of ways to fail.
AI agent orchestration
how multiple AI agents coordinate work — supervisor patterns, hierarchical delegation, swarm architectures, and when each makes sense.
AI Memory Compression
Techniques for compressing AI observations into retrievable semantic summaries that fit in context windows
AI stopped being a tool. it became infrastructure
the personal AI OS is no longer a thought experiment. it's a deployment pattern. and the human-in-the-loop is vanishing faster than anyone projected.
Building Your First MCP Server
Create custom MCP servers to extend Claude with your own tools
Claude Code Plugins System
Extend Claude Code with plugins for custom tools, MCP servers, and workflows
context engineering
context engineering is the discipline of crafting optimal context for AI agents — memory, retrieval, compression, and instruction design.
crewai vs langgraph: choosing a multi-agent framework without losing your mind
an actually useful comparison of multi-agent frameworks that doesn't just regurgitate the documentation
Debug Your RAG Pipeline Before Users Notice
Monitor retrieval-augmented generation systems with OpenTelemetry tracing. Find whether bad answers come from retrieval, context, or generation.
Episodic Memory for LLM Agents
Give AI agents memory of specific past events with temporal context. The missing piece between semantic facts and procedural rules in the CoALA framework.
Graph Memory for Personal AI
Knowledge graphs track relationships between people, projects, and time that vector databases miss. Build AI memory that understands context across sessions.
Hybrid Retrieval: When RAG Meets Long Context
Combine RAG retrieval with long-context windows strategically instead of treating them as competing approaches
Hybrid Search: Combining Keyword and Semantic Retrieval
Vector search misses exact matches. Keyword search misses concepts. Hybrid search with reciprocal rank fusion combines both for personal knowledge bases.
Late Chunking: Context-Aware Document Splitting for Better Retrieval
Process entire documents through embedding models before splitting to preserve cross-chunk context that traditional chunking destroys
Local LLM Runtimes: When to Use Ollama vs vLLM
Ollama excels for single-user development with simple setup. vLLM can deliver up to 20x higher throughput for production multi-user deployments. Choose based on your workload.
Malleable Software
Software as clay you reshape, not appliances you consume
MCP Server Composition
Connect your AI agent to multiple MCP servers at once, combining calendar, database, files, and search through one protocol
MCP Server Stacking: One Data Source is Data, Two is Intelligence
Combine multiple MCP servers to unlock cross-domain insights that no single server can provide on its own.
Memory Consolidation and Forgetting
How AI agents consolidate short-term observations into long-term storage using sleep-inspired patterns, plus when and what to forget.
Model Quantization: Running 70B Models on a Laptop
Reduce model precision from 32-bit to 4-bit to run large language models locally. Covers k-quants, GGUF, and choosing the right quantization level.
Multi-Agent Knowledge Management
When a single AI can't handle your personal knowledge management (PKM) needs, specialized agents working together can automate capture, processing, and synthesis.
Running LLMs on Your Hardware with llama.cpp
Build llama.cpp from source, download GGUF models, pick the right quantization, and run a local AI server on Mac, Linux, or Windows.
Self-Evolving Agents
Build AI agents that improve through structured feedback capture, automated evaluation, and continuous retraining loops
stateless agents: why your AI shouldn't remember you
Subagent Patterns: Parallel, Sequential, Background
Three dispatch patterns for delegating work to AI subagents and when to use each one.
The Architecture of a Personal OS
Personal OS architecture: interface, agent, memory, integration, and tool layers. Build your AI system incrementally in 4 weeks.
Tool Routing: How AI Agents Pick Which Function to Call
Modern agents route between dozens of tools using semantic matching, LLM-as-router, hierarchical patterns, and fallback chains. Patterns for scoring, selection, and MCP sampling.
What is a Personal Operating System?
What is a personal operating system? Learn how AI agents like Claude Code manage your tasks, memory, calendar, and decisions autonomously.
what is MCP? the Model Context Protocol explained
what is MCP (Model Context Protocol)? a non-technical explanation of how MCP works, why it matters, and how it connects AI agents to the world.