PageIndex
a document index for vectorless, reasoning-based RAG. instead of chunking + embedding + similarity search, it builds a structured index and lets the LLM reason over it directly.
what it does
replaces the standard RAG pipeline (chunk → embed → vector search → retrieve) with a reasoning-first approach: the LLM reads a structured index, reasons about what is relevant, and retrieves only what it actually needs.
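a minimal sketch of what "read the index, think, retrieve" could look like. the index format, section data, and the `ask_llm` hook are all illustrative assumptions, not PageIndex's actual API — the point is that the query never gets embedded; the whole index goes into the prompt and the model picks.

```python
import json
from typing import Callable

# assumed index shape: section titles + short summaries, like a table of contents.
INDEX = [
    {"id": "1", "title": "Revenue", "summary": "quarterly revenue figures by segment"},
    {"id": "2", "title": "Risk Factors", "summary": "regulatory and market risks"},
    {"id": "3", "title": "Liquidity", "summary": "cash position and credit facilities"},
]

# the underlying document sections the index points at (toy data).
SECTIONS = {
    "1": "Revenue grew 12% year over year, led by cloud services.",
    "2": "New data-privacy regulation may increase compliance costs.",
    "3": "Cash and equivalents totaled $4.2B at quarter end.",
}

def retrieve(query: str, ask_llm: Callable[[str], str]) -> str:
    # no embeddings, no similarity search: show the LLM the whole index
    # and let it reason about which section answers the question.
    prompt = (
        f"Question: {query}\n"
        f"Index: {json.dumps(INDEX)}\n"
        "Reply with the id of the most relevant section."
    )
    section_id = ask_llm(prompt).strip()
    return SECTIONS[section_id]

# offline stand-in for a real LLM call, so the sketch runs as-is;
# it only inspects the question line, mimicking a model's choice.
def fake_llm(prompt: str) -> str:
    question = prompt.splitlines()[0]
    return "3" if "cash" in question.lower() else "1"

print(retrieve("how much cash does the company have?", fake_llm))
# → Cash and equivalents totaled $4.2B at quarter end.
```

in practice `ask_llm` would be a real chat-completion call, and the index would be hierarchical (sections nesting subsections) so the model can drill down over multiple turns instead of picking in one shot.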
why it matters
vectors were a workaround for models that couldn’t handle long context. now that they can, why keep the workaround?
if this works at scale, RAG stops being “find similar chunks” and starts being “read the index, think, retrieve.”
who it’s for
anyone building RAG systems and frustrated with vector similarity returning adjacent-but-wrong results.
the shift
from: “find chunks that look similar”
to: “read the index, understand what you need, get it”
semantic search → semantic reasoning.