Architecture

35 articles on Architecture:

ACE Framework for Personal AI
    Implement David Shapiro's six-layer cognitive architecture to give your Claude Code assistant mission, values, and strategic context
Agent Guardrails: Input/Output Validation for Autonomous Systems
    How to implement runtime guardrails that validate agent inputs, filter outputs, and enforce business rules. Covers NeMo Guardrails, layered checking, and production patterns.
Agent Memory Systems
    How AI agents implement memory: short-term context, long-term storage, vector retrieval, and the architecture that ties it together.
Agent Observability
    How to implement distributed tracing, logging, and monitoring for AI agents using OpenTelemetry and purpose-built tools like Langfuse and Braintrust.
agent-first design
    how to structure repositories so AI agents can actually work in them. the code isn't just for humans anymore.
Agentic Design Patterns: ReAct, Reflection, Planning, Tool Use
    When to use ReAct loops, self-critique, task decomposition, and tool calling in AI agents. A practical pattern library for building effective agent systems.
agentic loops: observe, plan, act, verify
    the core pattern of autonomous agents. simple in theory, messy in practice, and full of ways to fail.
AI agent orchestration
    how multiple AI agents coordinate work: supervisor patterns, hierarchical delegation, swarm architectures, and when each makes sense.
AI Memory Compression
    Techniques for compressing AI observations into retrievable semantic summaries that fit in context windows
AI stopped being a tool. it became infrastructure
    the personal AI OS is no longer a thought experiment. it's a deployment pattern. and the human-in-the-loop is vanishing faster than anyone projected.
Building Your First MCP Server
    Create custom MCP servers to extend Claude with your own tools
Claude Code Plugins System
    Extend Claude Code with plugins for custom tools, MCP servers, and workflows
context engineering
    context engineering is the discipline of crafting optimal context for AI agents: memory, retrieval, compression, and instruction design.
crewai vs langgraph: choosing a multi-agent framework without losing your mind
    an actually useful comparison of multi-agent frameworks that doesn't just regurgitate the documentation
Debug Your RAG Pipeline Before Users Notice
    Monitor retrieval-augmented generation systems with OpenTelemetry tracing. Find whether bad answers come from retrieval, context, or generation.
Episodic Memory for LLM Agents
    Give AI agents memory of specific past events with temporal context. The missing piece between semantic facts and procedural rules in the CoALA framework.
Graph Memory for Personal AI
    Knowledge graphs track relationships between people, projects, and time that vector databases miss. Build AI memory that understands context across sessions.
Hybrid Retrieval: When RAG Meets Long Context
    Combine RAG retrieval with long-context windows strategically instead of treating them as competing approaches
Hybrid Search: Combining Keyword and Semantic Retrieval
    Vector search misses exact matches. Keyword search misses concepts. Hybrid search with reciprocal rank fusion combines both for personal knowledge bases.
Late Chunking: Context-Aware Document Splitting for Better Retrieval
    Process entire documents through embedding models before splitting to preserve cross-chunk context that traditional chunking destroys
Local LLM Runtimes: When to Use Ollama vs vLLM
    Ollama excels for single-user development with simple setup. vLLM delivers 20x higher throughput for production multi-user deployments. Choose based on your workload.
Malleable Software
    Software as clay you reshape, not appliances you consume
MCP Server Composition
    Connect your AI agent to multiple MCP servers at once, combining calendar, database, files, and search through one protocol
MCP Server Stacking: One Data Source is Data, Two is Intelligence
    Combine multiple MCP servers to unlock cross-domain insights that no single server can provide on its own.
Memory Consolidation and Forgetting
    How AI agents consolidate short-term observations into long-term storage using sleep-inspired patterns, plus when and what to forget.
Model Quantization: Running 70B Models on a Laptop
    Reduce model precision from 32-bit to 4-bit to run large language models locally. Covers k-quants, GGUF, and choosing the right quantization level.
Multi-Agent Knowledge Management
    When a single AI can't handle your PKM needs, specialized agents working together can automate capture, processing, and synthesis.
Running LLMs on Your Hardware with llama.cpp
    Build llama.cpp from source, download GGUF models, pick the right quantization, and run a local AI server on Mac, Linux, or Windows.
Self-Evolving Agents
    Build AI agents that improve through structured feedback capture, automated evaluation, and continuous retraining loops
stateless agents: why your AI shouldn't remember you
Subagent Patterns: Parallel, Sequential, Background
    Three dispatch patterns for delegating work to AI subagents and when to use each one.
The Architecture of a Personal OS
    Personal OS architecture: interface, agent, memory, integration, and tool layers. Build your AI system incrementally in 4 weeks
Tool Routing: How AI Agents Pick Which Function to Call
    Modern agents route between dozens of tools using semantic matching, LLM-as-router, hierarchical patterns, and fallback chains. Patterns for scoring, selection, and MCP sampling.
What is a Personal Operating System?
    What is a personal operating system? Learn how AI agents like Claude Code manage your tasks, memory, calendar, and decisions autonomously
what is MCP? the Model Context Protocol explained
    what is MCP (Model Context Protocol)? a non-technical explanation of how MCP works, why it matters, and how it connects AI agents to the world.
