Debanjum Singh's Open Personal AI

Debanjum Singh Solanky spent five years at Microsoft scaling personal AI features to 20 million users through Cortana and Viva Insights. In 2021, he started building something for himself: an AI that could search through his org-mode notes with natural language. That side project became Khoj, a Y Combinator-backed open-source AI second brain with 20K+ GitHub stars.

Singh’s thesis: if AI is going to know everything about you, you should be able to verify what it’s doing. That means open source, local-first, and no hidden system prompts shaping your reality.

The Open Source AI Argument

Singh argues that personal AI must be open source. Not for ideological reasons, but practical ones.

From the Khoj blog:

“The core issue is that when a company controls your AI, they control how you perceive reality. Hidden system prompts can shape what information you receive, what perspectives you see, what truth looks like.”

The 2024 Gemini incident made this concrete. Google’s system prompt forced diverse image generation regardless of user prompts, revealing how easily corporations can shape AI outputs without user awareness.

Closed Source AI           | Open Source AI
---------------------------|---------------------------
Hidden system prompts      | Inspect all instructions
Company controls updates   | You control versions
Data goes to cloud         | Can run fully local
Single vendor lock-in      | Swap models freely
No customization           | Fork and modify

Khoj Architecture

Khoj turns your documents into a searchable, conversational knowledge base. The architecture is straightforward:

khoj/
├── indexer/           # Parse documents into chunks
│   ├── markdown.py    # Markdown files
│   ├── org_mode.py    # Emacs org files
│   ├── pdf.py         # PDFs
│   └── notion.py      # Notion pages
├── search/
│   ├── embeddings.py  # Vector embeddings
│   └── retriever.py   # Semantic search
├── chat/
│   ├── intent.py      # Query classification
│   └── response.py    # RAG-based answers
└── clients/
    ├── obsidian/      # Plugin
    ├── emacs/         # Package
    └── web/           # Desktop/browser
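
Every client talks to the same local server over HTTP. A sketch of querying it directly from Python; the /api/search path and parameters are assumptions based on Khoj's docs, not guaranteed to match the current API:

# pip install requests
import requests

# Ask the locally running Khoj server for semantically similar notes
resp = requests.get(
    "http://localhost:42110/api/search",
    params={"q": "notes on plaintext accounting", "n": 5},
)
resp.raise_for_status()
for result in resp.json():
    print(result)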

How search works:

  1. Documents are split into chunks
  2. A bi-encoder embeds each chunk; vectors are stored alongside the text
  3. The natural language query is embedded with the same model
  4. Cosine similarity surfaces the most relevant chunks
  5. An LLM synthesizes an answer from the retrieved context (a minimal sketch follows)
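
A minimal sketch of this retrieval loop with sentence-transformers; the model choice, sample chunks, and query are illustrative, not Khoj's exact pipeline:

# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # a small bi-encoder

# Steps 1-2: chunk documents and precompute embeddings once
chunks = [
    "Khoj indexes org-mode, markdown, and PDF notes.",
    "Beancount is a plaintext double-entry accounting format.",
]
chunk_vecs = encoder.encode(chunks, convert_to_tensor=True)

# Step 3: embed the query with the same model
query_vec = encoder.encode("how do I track expenses?", convert_to_tensor=True)

# Step 4: rank chunks by cosine similarity
hits = util.semantic_search(query_vec, chunk_vecs, top_k=2)[0]
context = "\n".join(chunks[hit["corpus_id"]] for hit in hits)

# Step 5: hand `context` to an LLM for answer synthesis (prompt elided)
print(context)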

# Self-host Khoj locally
pip install khoj
khoj --config ~/khoj-config.yml

# Or use Docker
docker run -p 42110:42110 khoj/khoj

Personal AI Workflows

Singh uses Khoj for his own document management. From his blog post on plaintext accounting:

Bank statement to ledger:

Prompt: "Convert this bank statement PDF to Beancount format.
        Categorize expenses into appropriate accounts."

Khoj extracts transactions from the PDF, categorizes them, and
formats them as valid Beancount entries.

The key benefit is the flexibility of LLMs over traditional parsing: no brittle regex patterns to maintain, and the model adapts to different bank statement formats.
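
A categorized transaction might come back as a Beancount entry like this (account names are illustrative):

2024-03-15 * "Whole Foods Market"
  Expenses:Food:Groceries    84.12 USD
  Assets:Checking           -84.12 USD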

Research deep dive:

Khoj’s Research Mode performs iterative web search (a sketch of the loop follows the steps):

  1. Break query into sub-questions
  2. Search web for each
  3. Synthesize findings with citations
  4. User can verify sources
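
A minimal sketch of that loop, where llm and web_search are hypothetical callables standing in for Khoj's internals:

# `llm` and `web_search` are hypothetical stand-ins, not Khoj's actual interfaces
def research(query: str, llm, web_search) -> str:
    # 1. Break the query into sub-questions
    sub_questions = llm(f"Break this into sub-questions:\n{query}").splitlines()
    findings = []
    for question in sub_questions:
        # 2. Search the web for each sub-question
        results = web_search(question)
        findings.append((question, results))  # keep sources for citations
    # 3-4. Synthesize findings; retained sources let the user verify citations
    return llm(f"Synthesize an answer with citations:\n{findings}")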

Custom agents:

Agent: GRE Tutor
Personality: "You are a GRE prep tutor. Quiz me on vocabulary,
             explain math concepts step by step, and track
             which topics I struggle with."
Knowledge: [GRE vocabulary lists, practice problems]
Tools: [web search, code execution for math]

The Bi-Encoder Search Pattern

Khoj uses bi-encoder models for semantic search. This differs from the cross-encoder approach used by some systems:

Bi-Encoder                        | Cross-Encoder
----------------------------------|---------------------------------
Encode query and docs separately  | Encode query+doc together
Fast (precompute doc embeddings)  | Slow (recompute for each query)
Good for large document sets      | Better accuracy, smaller sets
Scales to 100K+ documents         | Doesn’t scale past ~1K

For personal knowledge bases with thousands of files, bi-encoders are the practical choice. The accuracy tradeoff is acceptable when you can tune chunk size and retrieval count.
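
For contrast, a cross-encoder scores each (query, document) pair in a single forward pass, which is why nothing can be precomputed. A sketch with sentence-transformers; the model choice is illustrative, and this is not claimed to be what Khoj ships:

# pip install sentence-transformers
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
query = "how do I track expenses?"
docs = [
    "Beancount is a plaintext double-entry accounting format.",
    "Khoj indexes org-mode, markdown, and PDF notes.",
]

# One forward pass per (query, doc) pair: accurate, but O(n) work per query
scores = reranker.predict([(query, doc) for doc in docs])
print(docs[scores.argmax()])

A common compromise is bi-encoder retrieval first, then cross-encoder reranking of only the top few hits.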

Model Flexibility

Khoj supports any LLM backend:

# khoj-config.yml
llm:
  provider: openai  # or anthropic, ollama, local
  model: gpt-4o
  # For local:
  # provider: ollama
  # model: llama3.1:8b

Run entirely locally with Ollama for privacy:

ollama pull llama3.1:8b
khoj --llm-provider ollama --llm-model llama3.1:8b

Cloud option for convenience, local option for privacy. Your choice.

Client Ecosystem

Khoj ships clients for where knowledge workers live:

Emacs (Singh’s home base):

(use-package khoj
  :ensure t
  :config
  (setq khoj-server-url "http://127.0.0.1:42110"))

;; M-x khoj to search
;; M-x khoj-chat to converse

Obsidian: Plugin available in Community Plugins store. Search and chat without leaving your vault.

Desktop/Web: Electron app or any browser at http://localhost:42110.

Building Safer AI

From Singh’s AI safety post:

  1. Transparency: Open source means auditable code paths
  2. User control: You decide what data gets indexed
  3. Model choice: Swap providers based on your trust level
  4. Local-first: Sensitive data never leaves your machine
  5. Verifiable citations: RAG answers include source references

Safety isn’t a feature; it’s architecture. When the code is open and the data stays local, the attack surface shrinks.

Key Takeaways

Principle                                  | Implementation
-------------------------------------------|-------------------------------------------
Personal AI should be open source          | Khoj is open source, 20K+ GitHub stars
Search your own data with natural language | Bi-encoder embeddings, semantic retrieval
Trust requires transparency                | No hidden system prompts
Local-first for privacy                    | Self-host with Ollama, no cloud required
Tools should meet you where you work       | Emacs, Obsidian, web, desktop clients
AI safety is architectural                 | Open code, local data, user control

Next: Linus Lee’s Custom AI Tools

Topics: knowledge-management open-source local-first agents