Eugene Yan's Personal AI Workflow
Eugene Yan is a Principal Applied Scientist at Amazon building recommendation systems and AI-powered products at scale. But outside work, he builds personal AI tools — and that’s where it gets interesting. He’s shipped Obsidian-Copilot, news-agents, an AI Coach, and writes extensively about practical LLM patterns.
What makes Eugene’s approach notable: he treats personal tools as learning vehicles. Each prototype explores a different AI pattern — RAG, multi-agent orchestration, evaluation. The byproduct is useful software. The real output is knowledge that applies to his day job.
Obsidian-Copilot: RAG for your notes
The question Eugene wanted to answer: “What would a copilot for writing and thinking look like?”
Obsidian-Copilot uses retrieval-augmented generation on your personal notes. Give it a section header, it drafts paragraphs pulling from relevant notes. Write a daily journal, it helps reflect on the week.
The interesting technical choice: chunking by top-level bullets instead of token length.
# Group each top-level bullet with its sub-bullets into one chunk
chunks, chunk_idx = {}, 0
current_chunk, current_header = [], None

for line in lines:
    if line.startswith('##'):  # Section header: remember it for context
        current_header = line
        continue
    if line.startswith('- '):  # Top-level bullet: close the previous chunk
        if len(current_chunk) >= min_chunk_lines:
            chunks[chunk_idx] = current_chunk
            chunk_idx += 1
        current_chunk = []
        if current_header:
            current_chunk.append(current_header)
    current_chunk.append(line)  # The bullet and its sub-bullets accumulate here

if len(current_chunk) >= min_chunk_lines:  # Keep the final chunk too
    chunks[chunk_idx] = current_chunk
Why? Eugene found token-based chunking “didn’t work very well.” Bullet-based chunks preserve logical units of thought. Each chunk is a top-level bullet plus its sub-bullets — roughly paragraph-length, but semantically complete.
For retrieval, he combines BM25 + semantic search. Another lesson from experimentation: embedding-based retrieval alone isn’t enough. Classical keyword search catches things embeddings miss.
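One common way to merge the two result lists (a sketch of the general technique, not necessarily Eugene's exact implementation) is reciprocal rank fusion, which rewards documents that rank well under either BM25 or embeddings:

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: merge ranked lists of doc IDs.

    Each doc scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked highly by either retriever float to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Toy example: keyword and semantic search disagree on ordering
bm25_hits = ["note-a", "note-b", "note-c"]
semantic_hits = ["note-c", "note-a", "note-d"]
merged = rrf([bm25_hits, semantic_hits])
```

Here `note-a` wins because both retrievers rank it near the top, while `note-d`, found only by semantic search, still makes the merged list — exactly the behavior hybrid retrieval is after.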
The architecture: OpenSearch for BM25, e5-small-v2 for embeddings (384 dimensions, small and fast), FastAPI server, Obsidian plugin for the UI.
News-agents: Multi-agent orchestration
News-agents generates daily news recaps. It’s built on Amazon Q CLI and MCP (Model Context Protocol), but the real lesson is the multi-agent pattern:
Main Agent (coordinator)
├── Read feeds.txt
├── Split feeds into 3 chunks
├── Spawn 3 Sub-Agents (separate tmux panes)
│   ├── Sub-Agent #1 → processes HackerNews, WSJ Tech
│   ├── Sub-Agent #2 → processes TechCrunch, AI News
│   └── Sub-Agent #3 → processes Wired, WSJ Markets
└── Combine into main-summary.md
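Strip away tmux and Amazon Q and the underlying shape is a plain fan-out/fan-in, which can be sketched with asyncio (`summarize_feed` is a hypothetical stand-in for a sub-agent):

```python
import asyncio

async def summarize_feed(feed: str) -> str:
    """Stand-in for a sub-agent: fetch and summarize one feed."""
    await asyncio.sleep(0)  # placeholder for real network I/O
    return f"## {feed}\n(summary of {feed})"

def split(items: list[str], n: int) -> list[list[str]]:
    """Split the feed list into n roughly equal chunks, one per sub-agent."""
    return [items[i::n] for i in range(n)]

async def main() -> str:
    feeds = ["HackerNews", "WSJ Tech", "TechCrunch",
             "AI News", "Wired", "WSJ Markets"]
    # Fan out: each chunk of feeds is processed concurrently, like a tmux pane
    results = await asyncio.gather(
        *[asyncio.gather(*[summarize_feed(f) for f in chunk])
          for chunk in split(feeds, 3)]
    )
    # Fan in: the coordinator combines all sub-summaries into one document
    sections = [s for chunk in results for s in chunk]
    return "\n\n".join(sections)

summary = asyncio.run(main())
```

The coordinator never summarizes anything itself; it only splits work, waits, and merges — the same division of labor as the tree above.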
Each news feed has its own RSS parser. The MCP server exposes them as tools:
@mcp.tool()
async def get_hackernews_stories(
    feed_url: str = DEFAULT_HN_RSS_URL, count: int = 30
) -> str:
    """Get top stories from Hacker News."""
    rss_content = await fetch_hn_rss(feed_url)
    stories = parse_hn_rss(rss_content)
    return "\n---\n".join([format_hn_story(s) for s in stories[:count]])
The sub-agents run in visible tmux panes so you can watch them work. When all finish, the main agent reads individual summaries and synthesizes a cross-source analysis: trending topics, category distribution, overlapping themes.
Practical output: a daily digest categorized by AI/ML, Business, Technology, with statistics like “Total Items: 124, Sources: 6, AI/ML mentions: 31 (25%)”.
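Statistics like those fall out naturally once each item carries a category tag — a minimal sketch (the field names here are my own, not news-agents'):

```python
from collections import Counter

def digest_stats(items: list[dict]) -> str:
    """Summarize a day's digest: totals, source count, per-category share."""
    cats = Counter(item["category"] for item in items)
    sources = {item["source"] for item in items}
    total = len(items)
    lines = [f"Total Items: {total}, Sources: {len(sources)}"]
    for cat, n in cats.most_common():
        lines.append(f"{cat}: {n} ({n * 100 // total}%)")
    return "\n".join(lines)

items = [
    {"source": "HackerNews", "category": "AI/ML"},
    {"source": "Wired", "category": "Technology"},
    {"source": "WSJ Markets", "category": "Business"},
    {"source": "TechCrunch", "category": "AI/ML"},
]
stats = digest_stats(items)
```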
The prototyping philosophy
Eugene maintains a prototyping page with 13+ projects. The common thread: each explores a specific pattern with real utility.
- Obsidian-Copilot → RAG + hybrid retrieval
- News-agents → multi-agent orchestration + MCP
- AlignEval → LLM-as-judge evaluation
- AI Reading Club → AI-powered reading experience
- AI Coach → voice-based LLM interface (you can call it!)
This is deliberate practice for AI engineering. You learn more building something real than reading papers about it.
Patterns for LLM Systems
Eugene’s most influential piece is Patterns for Building LLM-based Systems & Products — 50k+ reads. The seven patterns:
- Evals — measure before you optimize
- RAG — add external knowledge
- Fine-tuning — specialize for tasks
- Caching — reduce latency/cost
- Guardrails — ensure output quality
- Defensive UX — anticipate errors gracefully
- User feedback — build data flywheel
The key insight: these patterns aren’t mutually exclusive. You need evals before you know if fine-tuning helps. You need caching after you’ve validated with RAG. They stack.
He expanded this into Applied LLMs, co-authored with practitioners from Google, Anthropic, Microsoft, and more.
Weekly reflection workflow
Obsidian-Copilot includes a reflection feature. If you journal daily, it can summarize your week:
“Based on your entries, you spent significant time on Project X but expressed frustration about blocked dependencies. Consider: what can you unblock yourself? Your energy seemed highest on Tuesday when you did deep work in the morning.”
This is personal AI at its best — not generic advice, but insights grounded in your actual life.
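Under the hood, grounding a model in your own entries is mostly prompt assembly. A sketch of how a week of journal entries might be packed into one reflection prompt (the template wording is mine, not Obsidian-Copilot's):

```python
def reflection_prompt(entries: dict[str, str]) -> str:
    """Build a weekly-reflection prompt from daily journal entries."""
    days = "\n\n".join(f"### {day}\n{text}" for day, text in entries.items())
    return (
        "Here are my journal entries for the week:\n\n"
        f"{days}\n\n"
        "Summarize what I spent time on, how my energy varied, "
        "and one thing I could unblock next week."
    )

prompt = reflection_prompt({
    "Monday": "Blocked on Project X dependencies.",
    "Tuesday": "Deep work in the morning; great progress.",
})
```

The specificity of the output comes entirely from the entries you feed in — the model is reflecting your week back at you, not generating advice from nothing.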
Resources
- eugeneyan.com — blog with 200+ posts
- Obsidian-Copilot — RAG for your notes
- news-agents — multi-agent news recap
- Applied LLMs — practical lessons from building with LLMs
- LLM Patterns — foundational article
- @eugeneyan — active on Twitter/X