Debanjum Singh's Open Personal AI

Debanjum Singh Solanky spent five years at Microsoft scaling personal AI features to 20 million users through Cortana and Viva Insights. In 2021, he started building something for himself: an AI that could search through his org-mode notes with natural language. That side project became Khoj, a Y Combinator-backed open-source AI second brain with 20K+ GitHub stars.

Singh’s thesis: if AI is going to know everything about you, you should be able to verify what it’s doing. That means open source, local-first, and no hidden system prompts shaping your reality.

The Open Source AI Argument

Singh argues that personal AI must be open source. Not for ideological reasons, but practical ones.

From the Khoj blog:

“The core issue is that when a company controls your AI, they control how you perceive reality. Hidden system prompts can shape what information you receive, what perspectives you see, what truth looks like.”

The 2024 Gemini incident made this concrete. Google’s system prompt forced diverse image generation regardless of user prompts, revealing how easily corporations can shape AI outputs without user awareness.

Closed Source AI           | Open Source AI
---------------------------|---------------------------
Hidden system prompts      | Inspect all instructions
Company controls updates   | You control versions
Data goes to cloud         | Can run fully local
Single vendor lock-in      | Swap models freely
No customization           | Fork and modify

Khoj Architecture

Khoj turns your documents into a searchable, conversational knowledge base. The architecture is straightforward:

khoj/
├── indexer/           # Parse documents into chunks
│   ├── markdown.py    # Markdown files
│   ├── org_mode.py    # Emacs org files
│   ├── pdf.py         # PDFs
│   └── notion.py      # Notion pages
├── search/
│   ├── embeddings.py  # Vector embeddings
│   └── retriever.py   # Semantic search
├── chat/
│   ├── intent.py      # Query classification
│   └── response.py    # RAG-based answers
└── clients/
    ├── obsidian/      # Plugin
    ├── emacs/         # Package
    └── web/           # Desktop/browser
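
Every client talks to the same local server over HTTP. A sketch of querying it directly from Python; the /api/search path and parameters are assumptions based on Khoj's docs, not guaranteed to match the current API:

# pip install requests
import requests

# Ask the locally running Khoj server for semantically similar notes
resp = requests.get(
    "http://localhost:42110/api/search",
    params={"q": "notes on plaintext accounting", "n": 5},
)
resp.raise_for_status()
for result in resp.json():
    print(result)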

How search works:

  1. Documents are split into chunks
  2. A bi-encoder embeds each chunk; vectors are stored alongside the text
  3. The natural language query is embedded with the same model
  4. Cosine similarity surfaces the most relevant chunks
  5. An LLM synthesizes an answer from the retrieved context (a minimal sketch follows)
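
A minimal sketch of this retrieval loop with sentence-transformers; the model choice, sample chunks, and query are illustrative, not Khoj's exact pipeline:

# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # a small bi-encoder

# Steps 1-2: chunk documents and precompute embeddings once
chunks = [
    "Khoj indexes org-mode, markdown, and PDF notes.",
    "Beancount is a plaintext double-entry accounting format.",
]
chunk_vecs = encoder.encode(chunks, convert_to_tensor=True)

# Step 3: embed the query with the same model
query_vec = encoder.encode("how do I track expenses?", convert_to_tensor=True)

# Step 4: rank chunks by cosine similarity
hits = util.semantic_search(query_vec, chunk_vecs, top_k=2)[0]
context = "\n".join(chunks[hit["corpus_id"]] for hit in hits)

# Step 5: hand `context` to an LLM for answer synthesis (prompt elided)
print(context)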

# Self-host Khoj locally
pip install khoj
khoj --config ~/khoj-config.yml

# Or use Docker
docker run -p 42110:42110 khoj/khoj

Personal AI Workflows

Singh uses Khoj for his own document management. From his blog post on plaintext accounting:

Bank statement to ledger:

Prompt: "Convert this bank statement PDF to Beancount format.
        Categorize expenses into appropriate accounts."

Khoj extracts transactions from the PDF, categorizes them, and
formats them as valid Beancount entries.

The key benefit is the flexibility of LLMs over traditional parsing: no brittle regex patterns to maintain, and the model adapts to different bank statement formats.
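
A categorized transaction might come back as a Beancount entry like this (account names are illustrative):

2024-03-15 * "Whole Foods Market"
  Expenses:Food:Groceries    84.12 USD
  Assets:Checking           -84.12 USD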

Research deep dive:

Khoj’s Research Mode performs iterative web search (a sketch of the loop follows the steps):

  1. Break query into sub-questions
  2. Search web for each
  3. Synthesize findings with citations
  4. User can verify sources
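
A minimal sketch of that loop, where llm and web_search are hypothetical callables standing in for Khoj's internals:

# `llm` and `web_search` are hypothetical stand-ins, not Khoj's actual interfaces
def research(query: str, llm, web_search) -> str:
    # 1. Break the query into sub-questions
    sub_questions = llm(f"Break this into sub-questions:\n{query}").splitlines()
    findings = []
    for question in sub_questions:
        # 2. Search the web for each sub-question
        results = web_search(question)
        findings.append((question, results))  # keep sources for citations
    # 3-4. Synthesize findings; retained sources let the user verify citations
    return llm(f"Synthesize an answer with citations:\n{findings}")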

Custom agents:

Agent: GRE Tutor
Personality: "You are a GRE prep tutor. Quiz me on vocabulary,
             explain math concepts step by step, and track
             which topics I struggle with."
Knowledge: [GRE vocabulary lists, practice problems]
Tools: [web search, code execution for math]

The Bi-Encoder Search Pattern

Khoj uses bi-encoder models for semantic search. This differs from the cross-encoder approach used by some systems:

Bi-Encoder                        | Cross-Encoder
----------------------------------|---------------------------------
Encode query and docs separately  | Encode query+doc together
Fast (precompute doc embeddings)  | Slow (recompute for each query)
Good for large document sets      | Better accuracy, smaller sets
Scales to 100K+ documents         | Doesn’t scale past ~1K

For personal knowledge bases with thousands of files, bi-encoders are the practical choice. The accuracy tradeoff is acceptable when you can tune chunk size and retrieval count.
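
For contrast, a cross-encoder scores each (query, document) pair in a single forward pass, which is why nothing can be precomputed. A sketch with sentence-transformers; the model choice is illustrative, and this is not claimed to be what Khoj ships:

# pip install sentence-transformers
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
query = "how do I track expenses?"
docs = [
    "Beancount is a plaintext double-entry accounting format.",
    "Khoj indexes org-mode, markdown, and PDF notes.",
]

# One forward pass per (query, doc) pair: accurate, but O(n) work per query
scores = reranker.predict([(query, doc) for doc in docs])
print(docs[scores.argmax()])

A common compromise is bi-encoder retrieval first, then cross-encoder reranking of only the top few hits.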

Model Flexibility

Khoj supports any LLM backend:

# khoj-config.yml
llm:
  provider: openai  # or anthropic, ollama, local
  model: gpt-4o
  # For local:
  # provider: ollama
  # model: llama3.1:8b

Run entirely locally with Ollama for privacy:

ollama pull llama3.1:8b
khoj --llm-provider ollama --llm-model llama3.1:8b

Cloud option for convenience, local option for privacy. Your choice.

Client Ecosystem

Khoj ships clients for where knowledge workers live:

Emacs (Singh’s home base):

(use-package khoj
  :ensure t
  :config
  (setq khoj-server-url "http://127.0.0.1:42110"))

;; M-x khoj to search
;; M-x khoj-chat to converse

Obsidian: Plugin available in Community Plugins store. Search and chat without leaving your vault.

Desktop/Web: Electron app or any browser at http://localhost:42110.

Building Safer AI

From Singh’s AI safety post:

  1. Transparency: Open source means auditable code paths
  2. User control: You decide what data gets indexed
  3. Model choice: Swap providers based on your trust level
  4. Local-first: Sensitive data never leaves your machine
  5. Verifiable citations: RAG answers include source references

Safety isn’t a feature; it’s architecture. When the code is open and the data stays local, the attack surface shrinks.

Key Takeaways

Principle                                  | Implementation
-------------------------------------------|-------------------------------------------
Personal AI should be open source          | Khoj is open source, 20K+ GitHub stars
Search your own data with natural language | Bi-encoder embeddings, semantic retrieval
Trust requires transparency                | No hidden system prompts
Local-first for privacy                    | Self-host with Ollama, no cloud required
Tools should meet you where you work       | Emacs, Obsidian, web, desktop clients
AI safety is architectural                 | Open code, local data, user control

Next: Linus Lee’s Custom AI Tools

Topics: knowledge-management open-source local-first agents