Vector Databases for Personal RAG


Your notes live in folders. Your brain connects them by meaning.

Vector databases bridge that gap. They store embeddings—numeric representations of text—and find similar content fast. For personal RAG (retrieval-augmented generation), you need one. But which?
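To make "similar content" concrete: similarity between embeddings is usually measured with cosine similarity, which all four databases below support. A minimal pure-Python sketch, with toy 3-dimensional vectors standing in for real 384-dimensional embeddings (the names and numbers are illustrative):

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|): near 1.0 = same direction, near 0.0 = unrelated
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings"; real models emit hundreds of dimensions
meeting = [0.9, 0.1, 0.0]
standup = [0.8, 0.2, 0.1]
garden  = [0.0, 0.2, 0.9]

print(cosine_similarity(meeting, standup))  # high: related topics
print(cosine_similarity(meeting, garden))   # low: unrelated
```

A vector database is essentially this computation plus an index that avoids comparing your query against every stored vector.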

I tested four options against real PKM constraints: cost sensitivity, local-first preference, and the ~10K-100K document scale most personal systems hit.

The Comparison

| Database | Type | Free tier | Cost at 50K docs | Best for |
|----------|------|-----------|------------------|----------|
| Chroma | Embedded | N/A | $0 | Prototyping, small PKM |
| Qdrant | Standalone | 1 GB free | ~$9/mo | Production local-first |
| pgvector | Postgres extension | Varies | ~$15/mo | Existing Postgres users |
| Pinecone | Managed cloud | 100K vectors | ~$70/mo | Zero-ops cloud |

Performance Reality Check

Benchmarks from ANN Benchmarks and Vectorview point to the same conclusion: at personal scale (under 100K vectors), all four perform well. The bottleneck is your embedding model, not the database.

Setup Examples

Chroma (Simplest Start)

# pip install chromadb
import chromadb

client = chromadb.PersistentClient(path="./chroma_data")
collection = client.create_collection("notes")

# Add documents
collection.add(
    documents=["Meeting notes from Monday", "Ideas for the garden project"],
    ids=["note_1", "note_2"]
)

# Query
results = collection.query(
    query_texts=["project planning"],
    n_results=5
)

Chroma handles embeddings automatically using sentence-transformers. Your data stays in ./chroma_data. Done.

Qdrant (Production-Ready Local)

# Docker one-liner
docker run -p 6333:6333 -v ./qdrant_data:/qdrant/storage qdrant/qdrant

# pip install qdrant-client sentence-transformers
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams, PointStruct
from sentence_transformers import SentenceTransformer

client = QdrantClient("localhost", port=6333)
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Create collection
client.create_collection(
    collection_name="notes",
    vectors_config=VectorParams(size=384, distance=Distance.COSINE)
)

# Add documents
docs = ["Meeting notes from Monday", "Ideas for the garden project"]
vectors = encoder.encode(docs)

client.upsert(
    collection_name="notes",
    points=[
        PointStruct(id=i, vector=v.tolist(), payload={"text": doc})
        for i, (v, doc) in enumerate(zip(vectors, docs))
    ]
)

# Query
query_vector = encoder.encode("project planning")
hits = client.search(
    collection_name="notes",
    query_vector=query_vector.tolist(),
    limit=5
)

More code, but you get filtering, snapshots, and a web UI at localhost:6333/dashboard.

pgvector (If You Already Use Postgres)

-- Enable extension
CREATE EXTENSION vector;

-- Create table
CREATE TABLE notes (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(384)
);

-- Create index (HNSW for speed)
CREATE INDEX ON notes USING hnsw (embedding vector_cosine_ops);

# pip install psycopg2-binary pgvector sentence-transformers
import psycopg2
from pgvector.psycopg2 import register_vector
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")
conn = psycopg2.connect("postgresql://localhost/mydb")
register_vector(conn)  # register the vector type so embeddings adapt cleanly

# Insert
doc = "Meeting notes from Monday"
embedding = encoder.encode(doc).tolist()
with conn.cursor() as cur:
    cur.execute(
        "INSERT INTO notes (content, embedding) VALUES (%s, %s)",
        (doc, embedding)
    )
conn.commit()

# Query
query_vec = encoder.encode("project planning").tolist()
with conn.cursor() as cur:
    cur.execute("""
        SELECT content, 1 - (embedding <=> %s::vector) as similarity
        FROM notes
        ORDER BY embedding <=> %s::vector
        LIMIT 5
    """, (query_vec, query_vec))
    results = cur.fetchall()

The <=> operator is cosine distance. Use <-> for L2 (Euclidean).
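The two metrics rank differently when vectors vary in magnitude. A quick pure-Python illustration of what the two operators compute (toy vectors, not pgvector itself):

```python
import math

def cosine_distance(a, b):
    # What pgvector's <=> computes: 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (na * nb)

def l2_distance(a, b):
    # What pgvector's <-> computes: Euclidean distance
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [1.0, 0.0]
b = [2.0, 0.0]   # same direction as a, different magnitude
c = [0.0, 1.0]   # orthogonal to a

# Cosine ignores magnitude: a and b count as identical
print(cosine_distance(a, b))  # 0.0
print(cosine_distance(a, c))  # 1.0
# L2 does not: b sits a full unit away from a
print(l2_distance(a, b))      # 1.0
```

For text embeddings, cosine is the usual default, which is why the examples here create cosine indexes.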

Pinecone (Managed Cloud)

# pip install pinecone sentence-transformers ("pinecone-client" is the old package name)
from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

pc = Pinecone(api_key="YOUR_API_KEY")
encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Create index (one-time)
pc.create_index(
    name="notes",
    dimension=384,
    metric="cosine",
    spec={"serverless": {"cloud": "aws", "region": "us-east-1"}}
)

index = pc.Index("notes")

# Upsert
docs = ["Meeting notes from Monday", "Ideas for the garden project"]
vectors = encoder.encode(docs)
index.upsert(vectors=[
    {"id": f"note_{i}", "values": v.tolist(), "metadata": {"text": doc}}
    for i, (v, doc) in enumerate(zip(vectors, docs))
])

# Query
query_vec = encoder.encode("project planning").tolist()
results = index.query(vector=query_vec, top_k=5, include_metadata=True)

Free tier gives you 100K vectors. Beyond that, expect $70+/month.

When to Use Which

Start with Chroma if:

- you want working search in minutes with zero infrastructure
- your collection is small enough that an embedded local store is plenty

Choose Qdrant if:

- you want local-first with production features: filtering, snapshots, a web dashboard
- you expect your system to outgrow a weekend prototype

Stick with pgvector if:

- you already run Postgres and want embeddings next to your relational data
- you want hybrid keyword + vector search in plain SQL

Pay for Pinecone if:

- you want zero operations and accept your notes living in someone else's cloud
- ~$70/month past the free tier fits your budget

For most personal PKM projects, Qdrant or Chroma hits the sweet spot. Both are open source, both run locally, and both scale past what you’ll need.

Hybrid Search Matters

Pure vector search misses exact matches. “PostgreSQL configuration” might return results about “database setup” but miss documents that literally say “PostgreSQL configuration.”

Combine vector similarity with keyword search. Qdrant and Pinecone support this natively. For pgvector, add full-text search:

-- Add tsvector column
ALTER TABLE notes ADD COLUMN tsv tsvector
    GENERATED ALWAYS AS (to_tsvector('english', content)) STORED;

CREATE INDEX ON notes USING gin(tsv);

-- Hybrid query
SELECT content,
       ts_rank(tsv, query) * 0.3 + (1 - (embedding <=> qvec)) * 0.7 as score
FROM notes, to_tsquery('english', 'postgresql & configuration') query
WHERE tsv @@ query
ORDER BY score DESC
LIMIT 5;
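If your database returns keyword and vector scores separately, the same 0.3/0.7 weighting takes a few lines of application code. A minimal sketch, with made-up per-document scores already normalized to 0-1 (the document names are hypothetical):

```python
def hybrid_score(keyword_score, vector_score,
                 keyword_weight=0.3, vector_weight=0.7):
    # Mirrors the SQL above: weighted sum of two normalized scores
    return keyword_weight * keyword_score + vector_weight * vector_score

# Hypothetical scores for the query "postgresql configuration"
docs = {
    "pg_tuning_notes": {"keyword": 0.9, "vector": 0.6},  # exact phrase match
    "database_setup":  {"keyword": 0.0, "vector": 0.8},  # semantic match only
}

ranked = sorted(
    docs.items(),
    key=lambda kv: hybrid_score(kv[1]["keyword"], kv[1]["vector"]),
    reverse=True,
)
print([name for name, _ in ranked])  # ['pg_tuning_notes', 'database_setup']
```

Note how the exact-phrase document wins despite a lower vector score: that is precisely the failure mode pure vector search has.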

See hybrid search for the full implementation.

What You Can Steal

  1. Chroma for weekend projects: pip install chromadb and you’re searching in 5 minutes.

  2. Qdrant Docker one-liner: persistent storage, web dashboard, production-grade filtering.

  3. pgvector HNSW index: don’t use IVFFlat for small datasets; HNSW is faster.

  4. Hybrid scoring formula: keyword_score * 0.3 + vector_score * 0.7 works well for most queries.

  5. Cost ceiling awareness: at personal scale, you should spend $0-15/month, not $70+.

Next: Embedding Models for PKM—which model creates the best vectors for your notes.