Jerry Liu's Files-First Agent Architecture
Jerry Liu built the framework that made RAG possible for millions of developers. Now he’s arguing that the next evolution of AI agents won’t need complex tool ecosystems at all—just files.
Background
- Created LlamaIndex (originally GPT Index) in November 2022
- Co-founded LlamaIndex Inc., raised significant funding
- Previously worked at Uber on self-driving prediction systems
- MIT graduate in Computer Science
- LlamaIndex: 46.6k GitHub stars, 25M+ monthly package downloads, 1.5k+ contributors
- LlamaCloud serves 300k+ users with document parsing/indexing
GitHub: @jerryjliu · Twitter: @jerryjliu0 · LlamaIndex Blog
The Files-First Thesis
In January 2026, Liu published “Files Are All You Need”—a paradigm shift in how we think about agent architecture. His core argument:
Instead of one agent with hundreds of tools, we’re evolving towards a world where the agent really only has access to a filesystem and ~5-10 tools.
The agent needs just:
- CLI over filesystem
- Code interpreter
- Web fetch
That’s it. This is “just as general, if not more general, than an agent with 100+ MCP tools.”
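The three-tool surface can be sketched in a few lines. This is a minimal illustration, not Liu's or any framework's actual implementation; the function names (`run_shell`, `run_python`, `fetch_url`) are hypothetical wrappers an agent loop might dispatch to.

```python
import subprocess
import sys
import urllib.request

def run_shell(command: str, cwd: str = ".") -> str:
    """CLI over the filesystem: ls, grep, cat, find, etc."""
    result = subprocess.run(command, shell=True, cwd=cwd,
                            capture_output=True, text=True, timeout=30)
    return result.stdout + result.stderr

def run_python(code: str) -> str:
    """Code interpreter: execute a snippet in a fresh subprocess."""
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=30)
    return result.stdout + result.stderr

def fetch_url(url: str) -> str:
    """Web fetch: retrieve a page as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")

# The agent loop simply picks one of these three tools each turn.
TOOLS = {"shell": run_shell, "python": run_python, "fetch": fetch_url}
```

Everything else (reading files, editing them, calling APIs) composes out of these three primitives, which is the sense in which the setup stays fully general.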
Three Ways Files Power Agents
1. Long-Running Memory
Context windows remain the main limitation for complex tasks. Liu notes the “dread every time I see the ‘Context left until auto-compact’ notification pop up on Claude Code.” Once compaction triggers, the agent gets amnesia.
Solutions emerging:
- Claude.md files — static context loaded on instantiation
- Cursor’s searchable history — compacted conversations become searchable files
- Dex Horthy’s Research→Plan→Implement workflow — explicit file checkpoints for human review
The pattern: store context as files, let agents search them.
2. Replacing Naive RAG
Liu is candid about early RAG limitations:
“RAG being terrible was a 2023 problem. All of these techniques only solved subsets of the problem.”
His experiments found:
- Reasoning agents + filesystem tools + semantic search lets agents “traverse context to answer questions of any complexity”
- File search alone outperforms naive semantic search on small-medium collections (ignoring latency)
The key insight: agents can interleave search with Read() operations—scanning across files, scrolling up/down, just like a human.
3. Skills Over MCP
Anthropic’s “Skills” concept points to a future where agents don’t need explicit MCP tools. Skills are just… files.
Benefits:
- Easier to define — copy/paste an API spec into a .md file
- Don’t bloat context — agents read what they need, not the whole tool response
- More flexible — code interpreter lets agents interact with any API
“With skills, the agent can arbitrarily interact with APIs of any service through code execution, even if that service doesn’t have MCP defined.”
The Parsing Problem
Liu’s biggest insight for document-heavy workflows: most files aren’t plain text.
| File Type | Challenge |
|---|---|
| .py, .md, .txt | Direct LLM input ✓ |
| .pdf, .docx, .xlsx | Need parsing first |
| Images | Require vision or OCR |
Coding agents can’t natively read PDFs. This is exactly what LlamaCloud solves—Parse, Extract, Sheets convert non-plaintext documents into agent-ready context.
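The routing implied by the table can be sketched as a simple dispatch on file extension. The category names and the idea of a separate parse step follow the table; the function itself is a hypothetical helper, not a LlamaCloud API.

```python
from pathlib import Path

PLAINTEXT = {".py", ".md", ".txt"}
NEEDS_PARSING = {".pdf", ".docx", ".xlsx"}
NEEDS_VISION = {".png", ".jpg", ".jpeg"}

def route(path: str) -> str:
    """Decide how a file becomes agent-ready context."""
    ext = Path(path).suffix.lower()
    if ext in PLAINTEXT:
        return "direct"   # feed contents straight to the LLM
    if ext in NEEDS_PARSING:
        return "parse"    # e.g. a parsing service -> markdown/text
    if ext in NEEDS_VISION:
        return "vision"   # vision model or OCR
    return "unknown"
```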
LlamaIndex Evolution
- Then (2023): RAG framework focused on indexing and retrieval
- Now (2026): Full-stack document automation platform
| Product | Purpose |
|---|---|
| LlamaParse | Extract text/structure from complex documents |
| LlamaExtract | Structured data extraction |
| LlamaCloud Index | Managed vector + hybrid search |
| Workflows | Control flow for agentic pipelines |
| Semtools | CLI tools for coding agents (semantic search, parsing) |
The thesis: documents are the bottleneck. Solve document→context, and you unlock agents for knowledge work.
Key Technical Ideas
Context as Filesystem
Liu sees filesystems as the “initial proxy for computer use.” Coding agents already operate this way—Claude Code and Cursor are essentially file-manipulating reasoning agents.
Scalable File Search
Current gap: CLI tools (ls, grep) don’t scale to 1k-1m+ documents. Native OS search is weak. The solution will combine semantic indexing with filesystem operations.
The Compaction Problem
Every coding agent user knows the pain: work builds up, context compacts, agent forgets. Liu’s workaround:
- Trigger compaction intentionally (“frequent intentional compaction”)
- Dump context to searchable files before it’s lost
- Let agents retrieve from past sessions
What This Means for Personal AI
Implication 1: Your notes are your agent’s memory
If files are the interface, your existing markdown notes, docs, and text files become natural agent context. No special vector database required.
Implication 2: MCP complexity may be temporary
We might look back at 100-tool MCP setups the way we look at early ORM mappers—necessary for a moment, then superseded by simpler abstractions.
Implication 3: Document parsing is infrastructure
The boring work of converting PDFs and spreadsheets to clean text is becoming as fundamental as databases. LlamaCloud’s growth shows demand.
Practical Takeaways
Structure your workspace — Treat your project files as agent context. Clear folder structure = better agent navigation.
Embrace .md files — They’re the universal format agents understand best. Claude.md, AGENTS.md, research.md, plan.md.
Consider skills over tools — Before writing an MCP server, try just describing the API in a markdown file.
Solve parsing upstream — If your agents struggle with documents, the problem isn’t the agent—it’s the format. Parse first.
Liu’s vision: “Agents with simple file search tools gets us way closer to the goal [of answering questions of any complexity].”
The filesystem isn’t a limitation. It’s the interface.