PageIndex
a document index for vectorless, reasoning-based RAG. instead of chunking + embedding + similarity search, it builds a structured index and lets the LLM reason over it directly.
what it does
replaces the standard RAG pipeline (chunk → embed → vector search → retrieve) with a reasoning-first approach: the LLM reads a structured index, reasons about what is relevant, and retrieves only what it actually needs.
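a minimal sketch of what "read the index, think, retrieve" could look like. the index format, section data, and the `ask_llm` hook are all illustrative assumptions, not PageIndex's actual API — the point is that the query never gets embedded; the whole index goes into the prompt and the model picks.

```python
import json
from typing import Callable

# assumed index shape: section titles + short summaries, like a table of contents.
INDEX = [
    {"id": "1", "title": "Revenue", "summary": "quarterly revenue figures by segment"},
    {"id": "2", "title": "Risk Factors", "summary": "regulatory and market risks"},
    {"id": "3", "title": "Liquidity", "summary": "cash position and credit facilities"},
]

# the underlying document sections the index points at (toy data).
SECTIONS = {
    "1": "Revenue grew 12% year over year, led by cloud services.",
    "2": "New data-privacy regulation may increase compliance costs.",
    "3": "Cash and equivalents totaled $4.2B at quarter end.",
}

def retrieve(query: str, ask_llm: Callable[[str], str]) -> str:
    # no embeddings, no similarity search: show the LLM the whole index
    # and let it reason about which section answers the question.
    prompt = (
        f"Question: {query}\n"
        f"Index: {json.dumps(INDEX)}\n"
        "Reply with the id of the most relevant section."
    )
    section_id = ask_llm(prompt).strip()
    return SECTIONS[section_id]

# offline stand-in for a real LLM call, so the sketch runs as-is;
# it only inspects the question line, mimicking a model's choice.
def fake_llm(prompt: str) -> str:
    question = prompt.splitlines()[0]
    return "3" if "cash" in question.lower() else "1"

print(retrieve("how much cash does the company have?", fake_llm))
# → Cash and equivalents totaled $4.2B at quarter end.
```

in practice `ask_llm` would be a real chat-completion call, and the index would be hierarchical (sections nesting subsections) so the model can drill down over multiple turns instead of picking in one shot.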
why it matters
vectors were a workaround for models that couldn’t handle long context. now that they can, why keep the workaround?
if this works at scale, RAG stops being “find similar chunks” and starts being “read the index, think, retrieve.”
who it’s for
anyone building RAG systems and frustrated with vector similarity returning adjacent-but-wrong results.
the shift
from: “find chunks that look similar”
to: “read the index, understand what you need, get it”
semantic search → semantic reasoning.