Architecture

27 articles on Architecture:

ACE Framework for Personal AI
Implement David Shapiro's six-layer cognitive architecture to give your Claude Code assistant mission, values, and strategic context.

Agent Guardrails: Input/Output Validation for Autonomous Systems
How to implement runtime guardrails that validate agent inputs, filter outputs, and enforce business rules. Covers NeMo Guardrails, layered checking, and production patterns.

Agent Memory Systems
How AI agents implement memory: short-term context, long-term storage, vector retrieval, and the architecture that ties it together.

Agent Observability
How to implement distributed tracing, logging, and monitoring for AI agents using OpenTelemetry and purpose-built tools like Langfuse and Braintrust.

Agentic Design Patterns: ReAct, Reflection, Planning, Tool Use
When to use ReAct loops, self-critique, task decomposition, and tool calling in AI agents. A practical pattern library for building effective agent systems.

AI Memory Compression
Techniques for compressing AI observations into retrievable semantic summaries that fit in context windows.

Building Your First MCP Server
Create custom MCP servers to extend Claude with your own tools.

Claude Code Plugins System
Extend Claude Code with plugins for custom tools, MCP servers, and workflows.

Debug Your RAG Pipeline Before Users Notice
Monitor retrieval-augmented generation systems with OpenTelemetry tracing. Find whether bad answers come from retrieval, context, or generation.

Episodic Memory for LLM Agents
Give AI agents memory of specific past events with temporal context. The missing piece between semantic facts and procedural rules in the CoALA framework.

Graph Memory for Personal AI
Knowledge graphs track relationships between people, projects, and time that vector databases miss. Build AI memory that understands context across sessions.

Hybrid Retrieval: When RAG Meets Long Context
Combine RAG retrieval with long-context windows strategically instead of treating them as competing approaches.

Hybrid Search: Combining Keyword and Semantic Retrieval
Vector search misses exact matches. Keyword search misses concepts. Hybrid search with reciprocal rank fusion combines both for personal knowledge bases.

Late Chunking: Context-Aware Document Splitting for Better Retrieval
Process entire documents through embedding models before splitting to preserve cross-chunk context that traditional chunking destroys.

Local LLM Runtimes: When to Use Ollama vs vLLM
Ollama excels for single-user development with simple setup. vLLM delivers 20x higher throughput for production multi-user deployments. Choose based on your workload.

Malleable Software
Software as clay you reshape, not appliances you consume.

MCP Server Composition
Connect your AI agent to multiple MCP servers at once, combining calendar, database, files, and search through one protocol.

MCP Server Stacking: One Data Source is Data, Two is Intelligence
Combine multiple MCP servers to unlock cross-domain insights that no single server can provide on its own.

Memory Consolidation and Forgetting
How AI agents consolidate short-term observations into long-term storage using sleep-inspired patterns, plus when and what to forget.

Model Quantization: Running 70B Models on a Laptop
Reduce model precision from 32-bit to 4-bit to run large language models locally. Covers k-quants, GGUF, and choosing the right quantization level.

Multi-Agent Knowledge Management
When a single AI can't handle your PKM needs, specialized agents working together can automate capture, processing, and synthesis.

Running LLMs on Your Hardware with llama.cpp
Build llama.cpp from source, download GGUF models, pick the right quantization, and run a local AI server on Mac, Linux, or Windows.

Self-Evolving Agents
Build AI agents that improve through structured feedback capture, automated evaluation, and continuous retraining loops.

Subagent Patterns: Parallel, Sequential, Background
Three dispatch patterns for delegating work to AI subagents and when to use each one.

The Architecture of a Personal OS
Personal OS architecture: interface, agent, memory, integration, and tool layers. Build your AI system incrementally in 4 weeks.

Tool Routing: How AI Agents Pick Which Function to Call
Modern agents route between dozens of tools using semantic matching, LLM-as-router, hierarchical patterns, and fallback chains. Patterns for scoring, selection, and MCP sampling.

What is a Personal Operating System?
Learn how AI agents like Claude Code manage your tasks, memory, calendar, and decisions autonomously.