Build a Personal Knowledge Graph
Your notes app treats knowledge as a pile of documents. Search returns files. But you don’t think in files. You think in connections: this person told me about that tool for this project, which relates to a decision from last year.
A personal knowledge graph makes those connections explicit and queryable.
What a Knowledge Graph Gives You
Regular note search: find documents containing words.
Graph search: traverse relationships.
| Query | Document Search | Graph Query |
|---|---|---|
| “Who told me about Kubernetes?” | Returns all docs mentioning Kubernetes | Returns the person node with an INTRODUCED edge to the Kubernetes concept |
| “What tools does the API project use?” | Finds API project files; you hope the tools are mentioned | Traverses API_Project → USES → tool nodes |
| “How is this concept related to that decision?” | Can’t answer | Shows path between nodes |
The difference is structure. Documents contain text. Graphs contain entities and relationships.
Core Components
A knowledge graph has three parts:
| Component | What It Stores | Examples |
|---|---|---|
| Nodes | Entities with types | Person: Sarah, Project: API Migration, Tool: Kubernetes |
| Edges | Relationships with types | WORKS_ON, MENTIONED, INTRODUCED, DECIDED |
| Properties | Attributes on nodes/edges | created: 2026-01-15, confidence: 0.9, source: meeting_notes |
Every fact becomes a triple: Subject → Predicate → Object. “Sarah mentioned Kubernetes” becomes Sarah → MENTIONED → Kubernetes.
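To make the triple idea concrete, here is a minimal sketch in plain Python. The `Triple` type and the two lookup helpers are hypothetical names, and the sample facts are illustrative rather than taken from real notes.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: subject -> predicate -> object."""
    subject: str
    predicate: str
    obj: str


# Illustrative facts built from the entity and edge types named above.
facts = [
    Triple("Sarah", "MENTIONED", "Kubernetes"),
    Triple("Sarah", "WORKS_ON", "API Migration"),
    Triple("API Migration", "USES", "Kubernetes"),
]


def objects_of(facts, subject, predicate):
    """Everything `subject` points to through `predicate`."""
    return [t.obj for t in facts if t.subject == subject and t.predicate == predicate]


def subjects_of(facts, predicate, obj):
    """Everything that points to `obj` through `predicate`."""
    return [t.subject for t in facts if t.predicate == predicate and t.obj == obj]


print(subjects_of(facts, "MENTIONED", "Kubernetes"))  # ['Sarah']
print(objects_of(facts, "Sarah", "WORKS_ON"))         # ['API Migration']
```

A flat list like this is enough to answer one-hop questions. The storage options later in this guide mostly add indexing, persistence, and multi-hop traversal on top of the same shape.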
Entity Extraction with LLMs
The hard part isn’t storing graphs. It’s getting structured data from unstructured notes. LLMs solve this.
Give Claude a note:
Met with Sarah yesterday to discuss the API migration. She recommended
we look at GraphQL instead of REST for the new service. Alex disagreed
but said he'd review the benchmarks.
Ask for entities and relationships:
Extract entities (people, projects, tools, concepts) and relationships
from this note. Return as JSON triples.
Claude returns:
{
  "entities": [
    {"name": "Sarah", "type": "Person"},
    {"name": "Alex", "type": "Person"},
    {"name": "API Migration", "type": "Project"},
    {"name": "GraphQL", "type": "Tool"},
    {"name": "REST", "type": "Tool"}
  ],
  "relationships": [
    {"from": "Sarah", "relation": "RECOMMENDS", "to": "GraphQL", "context": "for API migration"},
    {"from": "Alex", "relation": "DISAGREES_WITH", "to": "GraphQL"},
    {"from": "Alex", "relation": "WILL_REVIEW", "to": "benchmarks"},
    {"from": "Sarah", "relation": "DISCUSSED", "to": "API Migration"},
    {"from": "Alex", "relation": "DISCUSSED", "to": "API Migration"}
  ]
}
Run this on every note. The graph builds itself.
Option 1: Obsidian Graph View
Obsidian’s built-in graph visualizes links between notes. Each note is a node. Wiki-links ([[like this]]) create edges.
Setup:
- Create notes for entities: People/Sarah.md, Projects/API Migration.md, Tools/GraphQL.md
- In your meeting note, link to them: “Met with [[Sarah]] about the [[API Migration]]”
- Open Graph View (Ctrl/Cmd + G)
Limitations:
- No typed relationships (just “links to”)
- No properties on edges
- Visualization only, not queryable
Good for: Visual exploration. Seeing clusters. Finding orphaned notes.
Not good for: Answering “who recommended GraphQL?” You’d need to search text, not traverse relationships.
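The untyped link graph that Obsidian draws can also be read outside the app, since wiki-links are plain text in the Markdown files. The sketch below pulls link targets out of a note with a regular expression. The function name, the note title, and the sample sentence are made up for illustration, and the pattern only covers the common `[[Target]]`, `[[Target|alias]]`, and `[[Target#heading]]` forms.

```python
import re

# Captures the target of [[Target]], [[Target|alias]] and [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def extract_links(note_name: str, text: str) -> list[tuple[str, str]]:
    """Return (source_note, target_note) pairs, one per distinct target, in order of first use."""
    targets = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        if target and target not in targets:
            targets.append(target)
    return [(note_name, target) for target in targets]


# Hypothetical meeting note.
note = "Met with [[Sarah]] about the [[API Migration]]. [[Sarah|She]] suggested [[GraphQL#Schemas]]."
for edge in extract_links("2026-01-27 Meeting", note):
    print(edge)
```

The result is still only "links to" edges, which is exactly the limitation listed above. Embedded files written as `![[image.png]]` would also match this pattern and may need filtering.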
Option 2: Neo4j for Real Queries
Neo4j is a graph database. You query with Cypher, a language designed for graph traversal.
Setup:
# Docker install
docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password \
  neo4j:latest
Open http://localhost:7474 in your browser.
Create nodes:
CREATE (sarah:Person {name: "Sarah", role: "Engineer"})
CREATE (alex:Person {name: "Alex", role: "Tech Lead"})
CREATE (api:Project {name: "API Migration", status: "active"})
CREATE (graphql:Tool {name: "GraphQL"})
CREATE (rest:Tool {name: "REST"})
Create relationships:
// Two separate statements; the semicolon ends each one
MATCH (s:Person {name: "Sarah"}), (g:Tool {name: "GraphQL"})
CREATE (s)-[:RECOMMENDS {date: date("2026-01-27"), context: "for API migration"}]->(g);

MATCH (s:Person {name: "Sarah"}), (a:Project {name: "API Migration"})
CREATE (s)-[:WORKS_ON]->(a);
Query relationships:
// Who recommends GraphQL?
MATCH (p:Person)-[:RECOMMENDS]->(t:Tool {name: "GraphQL"})
RETURN p.name;

// What does Sarah work on?
MATCH (p:Person {name: "Sarah"})-[:WORKS_ON]->(proj:Project)
RETURN proj.name;

// Two-hop: tools used by people Sarah works with
// (assumes WORKS_WITH and USES relationships exist; the sample data above does not create them)
MATCH (s:Person {name: "Sarah"})-[:WORKS_WITH]->(colleague:Person)-[:USES]->(tool:Tool)
RETURN DISTINCT tool.name;
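When the graph is loaded by a script rather than by hand, `CREATE` adds a duplicate node on every run. `MERGE` matches an existing pattern or creates it. The sketch below builds a parameterized `MERGE` statement from one triple. It is a hypothetical helper, not part of any Neo4j library. Labels and relationship types are interpolated into the query text because, in most Cypher versions, they cannot be passed as ordinary parameters, so the sketch checks them against an allow-list first.

```python
import re

# Relationship types this sketch accepts (an assumption; extend to match your schema).
ALLOWED_RELATIONS = {"RECOMMENDS", "WORKS_ON", "USES", "KNOWS", "INTRODUCED"}
LABEL = re.compile(r"^[A-Z][A-Za-z]*$")


def merge_statement(src_label, src_name, relation, dst_label, dst_name):
    """Build a (query, params) pair that upserts two nodes and one relationship."""
    if relation not in ALLOWED_RELATIONS:
        raise ValueError(f"unknown relationship type: {relation}")
    for label in (src_label, dst_label):
        if not LABEL.match(label):
            raise ValueError(f"invalid label: {label}")
    query = (
        f"MERGE (a:{src_label} {{name: $src}}) "
        f"MERGE (b:{dst_label} {{name: $dst}}) "
        f"MERGE (a)-[:{relation}]->(b)"
    )
    # Names travel as parameters, never as query text.
    return query, {"src": src_name, "dst": dst_name}


query, params = merge_statement("Person", "Sarah", "RECOMMENDS", "Tool", "GraphQL")
print(query)
print(params)
```

The returned pair should be usable with the official Python driver, for example `session.run(query, params)`, though that step is not shown here because it needs a running database.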
Option 3: Custom Python Solution
If Neo4j feels heavy, use NetworkX for in-memory graphs or SQLite with a schema for triples.
Simple NetworkX version:
import json
from typing import Optional

import networkx as nx


class PersonalKG:
    def __init__(self):
        # MultiDiGraph: directed, and allows several typed edges between the same two nodes
        self.graph = nx.MultiDiGraph()

    def add_entity(self, name: str, entity_type: str, **props):
        self.graph.add_node(name, type=entity_type, **props)

    def add_relationship(self, from_entity: str, relation: str, to_entity: str, **props):
        self.graph.add_edge(from_entity, to_entity, relation=relation, **props)

    def query_outgoing(self, entity: str, relation: Optional[str] = None):
        """Find what an entity connects to. Always returns a list of (source, target, data)."""
        edges = list(self.graph.out_edges(entity, data=True))
        if relation:
            edges = [(u, v, d) for u, v, d in edges if d.get('relation') == relation]
        return edges

    def query_incoming(self, entity: str, relation: Optional[str] = None):
        """Find what connects to an entity. Always returns a list of (source, target, data)."""
        edges = list(self.graph.in_edges(entity, data=True))
        if relation:
            edges = [(u, v, d) for u, v, d in edges if d.get('relation') == relation]
        return edges

    def path_between(self, start: str, end: str):
        """Find a shortest path from start to end, following edge direction."""
        # To ignore direction, search self.graph.to_undirected() instead.
        try:
            return nx.shortest_path(self.graph, start, end)
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            # No route, or one of the entities is not in the graph
            return None

    def save(self, path: str):
        data = nx.node_link_data(self.graph)
        with open(path, 'w') as f:
            json.dump(data, f)

    def load(self, path: str):
        with open(path) as f:
            data = json.load(f)
        self.graph = nx.node_link_graph(data)


# Usage
kg = PersonalKG()
kg.add_entity("Sarah", "Person", role="Engineer")
kg.add_entity("GraphQL", "Tool")
kg.add_relationship("Sarah", "RECOMMENDS", "GraphQL", date="2026-01-27")

# Query
print(kg.query_outgoing("Sarah", "RECOMMENDS"))
# [('Sarah', 'GraphQL', {'relation': 'RECOMMENDS', 'date': '2026-01-27'})]
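The SQLite alternative mentioned at the start of this section could look like the sketch below: one table for entities, one for triples, and a self-join for two-hop questions. The table and column names are assumptions, not a standard schema, and the "API Migration USES GraphQL" row is invented so that the join returns something.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the graph
conn.executescript("""
CREATE TABLE entities (
    name TEXT PRIMARY KEY,
    type TEXT NOT NULL
);
CREATE TABLE triples (
    subject   TEXT NOT NULL REFERENCES entities(name),
    predicate TEXT NOT NULL,
    object    TEXT NOT NULL REFERENCES entities(name),
    source    TEXT,
    UNIQUE (subject, predicate, object)
);
-- Speeds up "what points at X?" lookups.
CREATE INDEX idx_triples_object ON triples(object, predicate);
""")

conn.executemany("INSERT OR IGNORE INTO entities VALUES (?, ?)", [
    ("Sarah", "Person"), ("GraphQL", "Tool"), ("API Migration", "Project"),
])
conn.executemany(
    "INSERT OR IGNORE INTO triples (subject, predicate, object, source) VALUES (?, ?, ?, ?)",
    [
        ("Sarah", "RECOMMENDS", "GraphQL", "meeting_notes"),
        ("Sarah", "WORKS_ON", "API Migration", "meeting_notes"),
        ("API Migration", "USES", "GraphQL", "meeting_notes"),  # invented for the join below
    ],
)

# One hop: who recommends GraphQL?
who = conn.execute(
    "SELECT subject FROM triples WHERE predicate = 'RECOMMENDS' AND object = ?",
    ("GraphQL",),
).fetchall()
print(who)  # [('Sarah',)]

# Two hops via a self-join: tools used by projects Sarah works on.
tools = conn.execute("""
    SELECT DISTINCT t2.object
    FROM triples AS t1
    JOIN triples AS t2 ON t2.subject = t1.object
    WHERE t1.subject = ? AND t1.predicate = 'WORKS_ON' AND t2.predicate = 'USES'
""", ("Sarah",)).fetchall()
print(tools)  # [('GraphQL',)]
```

Each extra hop costs another join, which is where a graph database starts to pay off. For one- and two-hop questions over a personal archive, a single file with no server is hard to beat.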
Automated Extraction Pipeline
Connect LLM extraction to your knowledge graph:
import json
from pathlib import Path

from anthropic import Anthropic


def extract_entities_and_relations(note_text: str) -> dict:
    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    prompt = f"""Extract entities and relationships from this note.

Entities: people, projects, tools, concepts, decisions
Relationships: WORKS_ON, RECOMMENDS, MENTIONED, INTRODUCED, DECIDED, DISCUSSED

Return only JSON, with no surrounding text:
{{
  "entities": [{{"name": "...", "type": "..."}}],
  "relationships": [{{"from": "...", "relation": "...", "to": "...", "context": "..."}}]
}}

Note:
{note_text}"""
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )
    # Parse JSON from the response; json.loads raises if the model adds prose or code fences
    return json.loads(response.content[0].text)


def ingest_note(kg: PersonalKG, note_path: Path):
    with open(note_path, encoding="utf-8") as f:
        text = f.read()
    extracted = extract_entities_and_relations(text)
    for entity in extracted["entities"]:
        kg.add_entity(entity["name"], entity["type"])
    for rel in extracted["relationships"]:
        kg.add_relationship(
            rel["from"],
            rel["relation"],
            rel["to"],
            context=rel.get("context", ""),
            source=str(note_path),  # str, not Path, so kg.save() can serialize it as JSON
        )


# Process all notes
kg = PersonalKG()
for note in Path("~/notes").expanduser().glob("**/*.md"):
    ingest_note(kg, note)
kg.save("knowledge_graph.json")
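Model replies are not guaranteed to be bare JSON. They sometimes arrive wrapped in a code fence or with a sentence before and after. The sketch below is one tolerant way to parse such a reply using only the standard library. `parse_extraction` is a hypothetical helper, and it assumes the first `{` in the reply opens the JSON object.

```python
import json


def parse_extraction(reply: str) -> dict:
    """Pull the first JSON object out of a model reply and check its basic shape."""
    start = reply.find("{")
    if start == -1:
        raise ValueError("no JSON object found in reply")
    # raw_decode parses one JSON value beginning at `start` and ignores any trailing text.
    data, _ = json.JSONDecoder().raw_decode(reply, start)
    entities = data.get("entities", [])
    relationships = data.get("relationships", [])
    # Drop relationships that are missing any of the three required fields.
    required = ("from", "relation", "to")
    relationships = [r for r in relationships if all(k in r for k in required)]
    return {"entities": entities, "relationships": relationships}


# A made-up reply with the kind of wrapping that breaks a plain json.loads call.
fence = "`" * 3
reply = (
    "Here is the result:\n"
    f"{fence}json\n"
    '{"entities": [{"name": "Sarah", "type": "Person"}], '
    '"relationships": [{"from": "Sarah", "relation": "RECOMMENDS", "to": "GraphQL"}, '
    '{"from": "Alex", "relation": "DISAGREES_WITH"}]}\n'
    f"{fence}\n"
    "Let me know if you need more."
)
parsed = parse_extraction(reply)
print(parsed["relationships"])
```

Malformed JSON still raises `json.JSONDecodeError`, so a batch run over many notes would want to catch that per note and keep going.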
Schema Design
Define your entity types and relationship types upfront. This keeps the graph consistent.
Entity types for personal knowledge:
| Type | What it represents |
|---|---|
| Person | Contacts, colleagues, authors |
| Project | Work projects, side projects |
| Tool | Software, libraries, services |
| Concept | Ideas, patterns, mental models |
| Decision | Choices with reasoning |
| Event | Meetings, conferences, milestones |
| Resource | Books, articles, videos |
Relationship types:
| Relationship | Meaning |
|---|---|
| WORKS_ON | Person to Project |
| USES | Project to Tool, Person to Tool |
| KNOWS | Person to Person |
| INTRODUCED | Person to Concept/Tool (who told you about it) |
| RECOMMENDS | Person to anything |
| DECIDED | Person/Project to Decision |
| REFERENCES | Any to Resource |
| CONTRADICTS | For tracking conflicting information |
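A schema only keeps the graph consistent if something enforces it. Here is a minimal sketch that encodes the relationship table as a lookup and rejects anything outside it. The `SCHEMA` mapping is one reading of the tables above, with `None` standing in for "any type", and `is_valid` is a hypothetical helper.

```python
# relationship -> allowed (source type, target type) pairs; None matches any type.
SCHEMA = {
    "WORKS_ON":    [("Person", "Project")],
    "USES":        [("Project", "Tool"), ("Person", "Tool")],
    "KNOWS":       [("Person", "Person")],
    "INTRODUCED":  [("Person", "Concept"), ("Person", "Tool")],
    "RECOMMENDS":  [("Person", None)],
    "DECIDED":     [("Person", "Decision"), ("Project", "Decision")],
    "REFERENCES":  [(None, "Resource")],
    "CONTRADICTS": [(None, None)],
}


def is_valid(source_type, relation, target_type):
    """True if the relationship is known and its endpoint types fit the schema."""
    pairs = SCHEMA.get(relation)
    if pairs is None:
        return False  # relationship type not in the schema at all
    return any(
        (s is None or s == source_type) and (t is None or t == target_type)
        for s, t in pairs
    )


print(is_valid("Person", "WORKS_ON", "Project"))     # True
print(is_valid("Tool", "WORKS_ON", "Project"))       # False
print(is_valid("Person", "DISAGREES_WITH", "Tool"))  # False
```

Running a check like this on extracted relationships before they reach the graph catches types the model invents, such as the DISAGREES_WITH and WILL_REVIEW relations in the earlier example output. Whether to drop those or add them to the schema is a judgment call.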
Integration with AI Memory
Use graph queries to retrieve context for AI conversations:
def get_context_for_topic(kg: PersonalKG, topic: str) -> str:
    """Gather related context from the knowledge graph."""
    context_parts = []
    # Direct connections
    outgoing = kg.query_outgoing(topic)
    incoming = kg.query_incoming(topic)
    for _, target, data in outgoing:
        context_parts.append(f"{topic} {data['relation']} {target}")
    for source, _, data in incoming:
        context_parts.append(f"{source} {data['relation']} {topic}")
    return "\n".join(context_parts)


# Use in Claude conversation
topic = "API Migration"
context = get_context_for_topic(kg, topic)
prompt = f"""Context from my knowledge graph:
{context}

Question: What's the status of {topic} and who's involved?"""
This connects to the Graph Memory concept for AI that remembers relationships.
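Direct connections are often too thin to answer "who's involved?". A possible extension is to widen the context to everything within a few hops. The sketch below does a breadth-first walk over plain (subject, relation, object) tuples so that it runs without any graph library. The `context_within` name, the `max_hops` parameter, and the sample triples are all made up for illustration.

```python
from collections import defaultdict, deque


def context_within(triples, topic, max_hops=2):
    """Collect 'A REL B' lines for every edge within max_hops of topic, ignoring direction."""
    touching = defaultdict(list)  # node -> every triple that mentions it
    for s, rel, o in triples:
        touching[s].append((s, rel, o))
        touching[o].append((s, rel, o))

    seen = {topic}
    lines = []
    queue = deque([(topic, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for s, rel, o in touching[node]:
            line = f"{s} {rel} {o}"
            if line not in lines:
                lines.append(line)
            other = o if s == node else s
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return "\n".join(lines)


# Made-up sample data.
triples = [
    ("Sarah", "WORKS_ON", "API Migration"),
    ("Sarah", "RECOMMENDS", "GraphQL"),
    ("Alex", "KNOWS", "Sarah"),
    ("Alex", "WORKS_ON", "Billing Service"),
]
print(context_within(triples, "API Migration", max_hops=2))
```

With `max_hops=2` the output includes Sarah's recommendation and her link to Alex, but not Alex's other project, which is three hops away. To feed it from the NetworkX version, the tuples could be built from `kg.graph.edges(data=True)` as `(u, d["relation"], v)`. Keep the hop limit small, since the context grows quickly on a dense graph.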
What You Can Steal
If you only have 10 minutes: Set up Obsidian and start using [[wiki-links]] for people and projects. The graph view alone provides value.
If you have an afternoon: Run Neo4j in Docker. Manually create 20 nodes and relationships from your recent notes. Write 5 Cypher queries that answer questions you couldn’t answer before.
If you’re building a system: Create the extraction pipeline. Run it on your notes archive. Build a simple CLI that queries the graph before each Claude conversation.
Schema to start with:
Person: name, role, context
Project: name, status, started
Tool: name, category
Concept: name, source
WORKS_ON: Person → Project
USES: Project → Tool
KNOWS: Person → Person
INTRODUCED: Person → Concept (when: date)
Tradeoffs
| Approach | Pros | Cons |
|---|---|---|
| Obsidian | Already using it, visual, simple | No typed relations, not queryable |
| Neo4j | Full graph queries, production-ready | Heavier setup, separate from notes |
| Custom Python | Lightweight, embeddable | You build and maintain it yourself, fewer features |
| Zep/Graphiti | Temporal, AI-native | Dependency, learning curve |
For personal use, start with Obsidian links. Graduate to Neo4j when you need queries like “who introduced me to tools I use in active projects?”
Related
- Graph Memory for Personal AI - How graph memory differs from vector databases
- Personal Search - Combine graph queries with semantic search
- Building a Memory System - Full memory architecture guide