Build a Personal Knowledge Graph


Your notes app treats knowledge as a pile of documents. Search returns files. But you don’t think in files. You think in connections: this person told me about that tool for this project, which relates to a decision from last year.

A personal knowledge graph makes those connections explicit and queryable.

What a Knowledge Graph Gives You

Regular note search: find documents containing words.

Graph search: traverse relationships.

Query	Document Search	Graph Query
“Who told me about Kubernetes?”	Returns all docs mentioning Kubernetes	Returns the person node with an INTRODUCED edge to the Kubernetes concept
“What tools does the API project use?”	Finds API project files; you hope they mention the tools	Traverses API_Project → USES → tool nodes
“How is this concept related to that decision?”	Can’t answer	Shows path between nodes

The difference is structure. Documents contain text. Graphs contain entities and relationships.

Core Components

A knowledge graph has three parts:

Component	What It Stores	Examples
Nodes	Entities with types	Person: Sarah, Project: API Migration, Tool: Kubernetes
Edges	Relationships with types	WORKS_ON, MENTIONED, INTRODUCED, DECIDED
Properties	Attributes on nodes/edges	created: 2026-01-15, confidence: 0.9, source: meeting_notes

Every fact becomes a triple: Subject → Predicate → Object. “Sarah mentioned Kubernetes” becomes Sarah → MENTIONED → Kubernetes.
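
As a rough illustration of the triple idea, here is a minimal sketch that holds facts as plain Python tuples and answers one question by filtering. The `Triple` and `objects_of` names are made up for this example, and the second and third facts are illustrative additions that do not come from any real note.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One fact: subject -> predicate -> object."""
    subject: str
    predicate: str
    obj: str


facts = [
    Triple("Sarah", "MENTIONED", "Kubernetes"),
    # Illustrative extra facts, assumed for the example only
    Triple("Sarah", "WORKS_ON", "API Migration"),
    Triple("API Migration", "USES", "Kubernetes"),
]


def objects_of(facts, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in facts if t.subject == subject and t.predicate == predicate]


print(objects_of(facts, "Sarah", "MENTIONED"))  # ['Kubernetes']
```

A list of tuples is enough to see the shape of the data. The storage options later in the article mostly differ in how they index and traverse these same triples.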

Entity Extraction with LLMs

The hard part isn’t storing graphs. It’s getting structured data out of unstructured notes. LLMs make this practical, though their output still needs checking.

Give Claude a note:

Met with Sarah yesterday to discuss the API migration. She recommended 
we look at GraphQL instead of REST for the new service. Alex disagreed 
but said he'd review the benchmarks.

Ask for entities and relationships:

Extract entities (people, projects, tools, concepts) and relationships
from this note. Return JSON with an "entities" list and a "relationships" list.

Claude returns:

{
  "entities": [
    {"name": "Sarah", "type": "Person"},
    {"name": "Alex", "type": "Person"},
    {"name": "API Migration", "type": "Project"},
    {"name": "GraphQL", "type": "Tool"},
    {"name": "REST", "type": "Tool"}
  ],
  "relationships": [
    {"from": "Sarah", "relation": "RECOMMENDS", "to": "GraphQL", "context": "for API migration"},
    {"from": "Alex", "relation": "DISAGREES_WITH", "to": "GraphQL"},
    {"from": "Alex", "relation": "WILL_REVIEW", "to": "benchmarks"},
    {"from": "Sarah", "relation": "DISCUSSED", "to": "API Migration"},
    {"from": "Alex", "relation": "DISCUSSED", "to": "API Migration"}
  ]
}

Run this on every note. The graph builds itself.
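
Extraction output is worth a quick sanity check before it goes into the graph. In the sample response above, "benchmarks" is the target of a relationship but never appears in the entity list. A minimal sketch of a check for that, assuming the JSON shape shown above (the `find_dangling` name is hypothetical):

```python
def find_dangling(extracted: dict) -> list[str]:
    """Return relationship endpoints that are missing from the entity list."""
    known = {entity["name"] for entity in extracted.get("entities", [])}
    dangling = []
    for rel in extracted.get("relationships", []):
        for endpoint in (rel["from"], rel["to"]):
            if endpoint not in known and endpoint not in dangling:
                dangling.append(endpoint)
    return dangling


# A trimmed copy of the sample response, for illustration
sample = {
    "entities": [
        {"name": "Alex", "type": "Person"},
        {"name": "GraphQL", "type": "Tool"},
    ],
    "relationships": [
        {"from": "Alex", "relation": "DISAGREES_WITH", "to": "GraphQL"},
        {"from": "Alex", "relation": "WILL_REVIEW", "to": "benchmarks"},
    ],
}

print(find_dangling(sample))  # ['benchmarks']
```

What to do with a dangling name is a design choice: drop the relationship, create an untyped node, or ask the model again.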

Option 1: Obsidian Graph View

Obsidian’s built-in graph visualizes links between notes. Each note is a node. Wiki-links ([[like this]]) create edges.

Setup:

Create notes for entities: People/Sarah.md, Projects/API Migration.md, Tools/GraphQL.md
In your meeting note, link to them: “Met with [[Sarah]] about the [[API Migration]]”
Open Graph View (Ctrl/Cmd + G)

Limitations:

No typed relationships (just “links to”)
No properties on edges
Visualization only, not queryable

Good for: Visual exploration. Seeing clusters. Finding orphaned notes.

Not good for: Answering “who recommended GraphQL?” You’d need to search text, not traverse relationships.
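
The links that Graph View draws can also be read outside Obsidian, since they are plain text in the Markdown files. A minimal sketch, assuming the `[[Note]]`, `[[Note|alias]]` and `[[Note#Heading]]` link forms. The function name and the meeting note title are made up for the example, and the edges it produces are untyped, which is the same limitation listed above.

```python
import re

# Captures the note name; ignores an optional #heading and an optional |alias
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def wikilink_edges(note_name: str, text: str) -> list[tuple[str, str]]:
    """Return (source note, linked note) pairs, one per distinct link target."""
    targets = []
    for match in WIKILINK.finditer(text):
        target = match.group(1).strip()
        if target not in targets:
            targets.append(target)
    return [(note_name, target) for target in targets]


text = (
    "Met with [[Sarah]] about the [[API Migration|migration]]. "
    "[[Sarah]] suggested [[GraphQL#Schema]]."
)
for edge in wikilink_edges("2026-01-27 Meeting", text):
    print(edge)
```

Note that the pattern also matches embeds written as `![[...]]`. Filter those out first if they should not count as links.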

Option 2: Neo4j for Real Queries

Neo4j is a graph database. You query with Cypher, a language designed for graph traversal.

Setup:

# Docker install
docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password \
  neo4j:latest

Open http://localhost:7474 in your browser.

Create nodes:

CREATE (sarah:Person {name: "Sarah", role: "Engineer"})
CREATE (alex:Person {name: "Alex", role: "Tech Lead"})
CREATE (api:Project {name: "API Migration", status: "active"})
CREATE (graphql:Tool {name: "GraphQL"})
CREATE (rest:Tool {name: "REST"})

Create relationships:

MATCH (s:Person {name: "Sarah"}), (g:Tool {name: "GraphQL"})
CREATE (s)-[:RECOMMENDS {date: date("2026-01-27"), context: "for API migration"}]->(g);

MATCH (s:Person {name: "Sarah"}), (a:Project {name: "API Migration"})
CREATE (s)-[:WORKS_ON]->(a);

Query relationships:

// Who recommends GraphQL?
MATCH (p:Person)-[:RECOMMENDS]->(t:Tool {name: "GraphQL"})
RETURN p.name;

// What does Sarah work on?
MATCH (p:Person {name: "Sarah"})-[:WORKS_ON]->(proj:Project)
RETURN proj.name;

// Two-hop: tools used by people Sarah works with
// (returns nothing until WORKS_WITH and USES relationships exist)
MATCH (s:Person {name: "Sarah"})-[:WORKS_WITH]->(colleague:Person)-[:USES]->(tool:Tool)
RETURN DISTINCT tool.name;
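
When relationships come from extraction output rather than being typed by hand, the write statements can be generated. The sketch below is one way to do that, not a Neo4j API: `merge_triple_query` and its identifier check are made up for this example. It uses MERGE rather than CREATE so that re-running it does not duplicate nodes. It also validates labels and relationship types before interpolating them, because ordinary Cypher parameters stand in for values, not for labels or relationship types.

```python
import re

# Plain identifiers only; anything else is rejected before it reaches the query text
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def merge_triple_query(subject_label: str, relation: str, object_label: str) -> str:
    """Build a parameterized Cypher statement for one typed relationship."""
    for identifier in (subject_label, relation, object_label):
        if not _IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"unsafe identifier: {identifier!r}")
    return (
        f"MERGE (a:{subject_label} {{name: $subject}}) "
        f"MERGE (b:{object_label} {{name: $object}}) "
        f"MERGE (a)-[r:{relation}]->(b) "
        "SET r.context = $context"
    )


query = merge_triple_query("Person", "RECOMMENDS", "Tool")
params = {"subject": "Sarah", "object": "GraphQL", "context": "for API migration"}
print(query)
print(params)
# With the official neo4j Python driver, the pair would typically be passed
# to a session, e.g. session.run(query, params).
```

Names and context travel as parameters, so quotes or odd characters in note text cannot change the statement.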

Option 3: Custom Python Solution

If Neo4j feels heavy, use NetworkX for in-memory graphs or SQLite with a schema for triples.

Simple NetworkX version:

import json
from typing import Optional

import networkx as nx

class PersonalKG:
    def __init__(self):
        # MultiDiGraph allows several typed edges between the same pair of nodes
        self.graph = nx.MultiDiGraph()

    def add_entity(self, name: str, entity_type: str, **props):
        self.graph.add_node(name, type=entity_type, **props)

    def add_relationship(self, from_entity: str, relation: str, to_entity: str, **props):
        self.graph.add_edge(from_entity, to_entity, relation=relation, **props)

    def query_outgoing(self, entity: str, relation: Optional[str] = None):
        """Find what an entity connects to."""
        if entity not in self.graph:
            return []
        edges = self.graph.out_edges(entity, data=True)
        # Always return a list, whether or not a relation filter is given
        return [(u, v, d) for u, v, d in edges
                if relation is None or d.get('relation') == relation]

    def query_incoming(self, entity: str, relation: Optional[str] = None):
        """Find what connects to an entity."""
        if entity not in self.graph:
            return []
        edges = self.graph.in_edges(entity, data=True)
        return [(u, v, d) for u, v, d in edges
                if relation is None or d.get('relation') == relation]

    def path_between(self, start: str, end: str):
        """Find a connection path between two entities, following edge direction."""
        try:
            return nx.shortest_path(self.graph, start, end)
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            # No directed path exists, or one of the entities is not in the graph
            return None

    def save(self, path: str):
        data = nx.node_link_data(self.graph)
        with open(path, 'w') as f:
            json.dump(data, f)

    def load(self, path: str):
        with open(path) as f:
            data = json.load(f)
        self.graph = nx.node_link_graph(data)

# Usage
kg = PersonalKG()
kg.add_entity("Sarah", "Person", role="Engineer")
kg.add_entity("GraphQL", "Tool")
kg.add_relationship("Sarah", "RECOMMENDS", "GraphQL", date="2026-01-27")

# Query
print(kg.query_outgoing("Sarah", "RECOMMENDS"))
# [('Sarah', 'GraphQL', {'relation': 'RECOMMENDS', 'date': '2026-01-27'})]
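
The SQLite alternative mentioned at the start of this section could look like the sketch below. It is a minimal version under assumptions: the table layout, the function names, and the "API Migration USES GraphQL" fact are illustrative and not a standard schema. Each row is one triple, and a two-hop question becomes a self-join.

```python
import sqlite3


def open_triple_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a one-table triple store."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS triples (
               subject   TEXT NOT NULL,
               predicate TEXT NOT NULL,
               object    TEXT NOT NULL,
               source    TEXT,
               UNIQUE (subject, predicate, object)
           )"""
    )
    # Index for incoming-edge lookups ("who points at this node?")
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_triples_object ON triples (object, predicate)"
    )
    return conn


def add_triple(conn, subject, predicate, obj, source=None):
    """Insert one fact; an exact duplicate is silently skipped."""
    conn.execute(
        "INSERT OR IGNORE INTO triples (subject, predicate, object, source) "
        "VALUES (?, ?, ?, ?)",
        (subject, predicate, obj, source),
    )


conn = open_triple_store()
add_triple(conn, "Sarah", "RECOMMENDS", "GraphQL", source="meeting_notes")
add_triple(conn, "Sarah", "WORKS_ON", "API Migration")
add_triple(conn, "API Migration", "USES", "GraphQL")  # illustrative fact
add_triple(conn, "Sarah", "RECOMMENDS", "GraphQL")    # duplicate, ignored
conn.commit()

# One hop: who recommends GraphQL?
who = [row[0] for row in conn.execute(
    "SELECT subject FROM triples WHERE predicate = 'RECOMMENDS' AND object = ?",
    ("GraphQL",),
)]

# Two hops: tools used by projects Sarah works on
tools = [row[0] for row in conn.execute(
    """SELECT DISTINCT t2.object
       FROM triples AS t1
       JOIN triples AS t2 ON t2.subject = t1.object
       WHERE t1.subject = ? AND t1.predicate = 'WORKS_ON' AND t2.predicate = 'USES'""",
    ("Sarah",),
)]

print(who, tools)  # ['Sarah'] ['GraphQL']
```

Each extra hop needs another join, which is where a graph database or NetworkX becomes more comfortable. In exchange, the whole graph lives in a single file with no server.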

Automated Extraction Pipeline

Connect LLM extraction to your knowledge graph:

import json

from anthropic import Anthropic

def extract_entities_and_relations(note_text: str) -> dict:
    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    prompt = f"""Extract entities and relationships from this note.

Entities: people, projects, tools, concepts, decisions
Relationships: WORKS_ON, RECOMMENDS, MENTIONED, INTRODUCED, DECIDED, DISCUSSED

Return only JSON, with no surrounding text:
{{
  "entities": [{{"name": "...", "type": "..."}}],
  "relationships": [{{"from": "...", "relation": "...", "to": "...", "context": "..."}}]
}}

Note:
{note_text}"""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )

    # The model may wrap the JSON in a code fence or add a sentence around it,
    # so parse only the outermost {...} span.
    raw = response.content[0].text
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model response")
    return json.loads(raw[start:end + 1])

def ingest_note(kg: PersonalKG, note_path: str):
    with open(note_path, encoding="utf-8") as f:
        text = f.read()

    extracted = extract_entities_and_relations(text)

    for entity in extracted["entities"]:
        kg.add_entity(entity["name"], entity["type"])

    for rel in extracted["relationships"]:
        kg.add_relationship(
            rel["from"],
            rel["relation"],
            rel["to"],
            context=rel.get("context", ""),
            source=note_path
        )

# Process all notes
from pathlib import Path
kg = PersonalKG()
for note in Path("~/notes").expanduser().glob("**/*.md"):
    try:
        # Pass a str: Path objects stored as edge properties would break json.dump in save()
        ingest_note(kg, str(note))
    except ValueError as err:
        # Unparseable model output (json.JSONDecodeError is a ValueError); skip this note
        print(f"Skipping {note}: {err}")
kg.save("knowledge_graph.json")

Schema Design

Define your entity types and relationship types upfront. This keeps the graph consistent.

Entity types for personal knowledge:

Type	What it represents
Person	Contacts, colleagues, authors
Project	Work projects, side projects
Tool	Software, libraries, services
Concept	Ideas, patterns, mental models
Decision	Choices with reasoning
Event	Meetings, conferences, milestones
Resource	Books, articles, videos

Relationship types:

Relationship	Meaning
WORKS_ON	Person to Project
USES	Project to Tool, Person to Tool
KNOWS	Person to Person
INTRODUCED	Person to Concept/Tool (who told you about it)
RECOMMENDS	Person to anything
DECIDED	Person/Project to Decision
REFERENCES	Any to Resource
CONTRADICTS	For tracking conflicting information
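
One way to enforce a schema like this is to write it down as data and check each relationship before it is added. The sketch below covers only four rows of the table above. The `SCHEMA` layout and the `check_relationship` name are assumptions made for the example.

```python
from typing import Optional

# relationship type -> allowed (source entity type, target entity type) pairs
SCHEMA = {
    "WORKS_ON": {("Person", "Project")},
    "USES": {("Project", "Tool"), ("Person", "Tool")},
    "KNOWS": {("Person", "Person")},
    "INTRODUCED": {("Person", "Concept"), ("Person", "Tool")},
}


def check_relationship(relation: str, from_type: str, to_type: str) -> Optional[str]:
    """Return None if the relationship fits the schema, else a short reason."""
    allowed = SCHEMA.get(relation)
    if allowed is None:
        return f"unknown relationship type: {relation}"
    if (from_type, to_type) not in allowed:
        return f"{relation} does not connect {from_type} to {to_type}"
    return None


print(check_relationship("WORKS_ON", "Person", "Project"))  # None
print(check_relationship("WORKS_ON", "Tool", "Project"))
print(check_relationship("LIKES", "Person", "Tool"))
```

Rejected relationships are worth logging rather than silently dropping, since a pattern in the rejections often points to a type the schema is missing.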

Integration with AI Memory

Use graph queries to retrieve context for AI conversations:

def get_context_for_topic(kg: PersonalKG, topic: str) -> str:
    """Gather related context from the knowledge graph."""
    context_parts = []

    # Direct connections
    outgoing = kg.query_outgoing(topic)
    incoming = kg.query_incoming(topic)

    for _, target, data in outgoing:
        context_parts.append(f"{topic} {data['relation']} {target}")

    for source, _, data in incoming:
        context_parts.append(f"{source} {data['relation']} {topic}")

    return "\n".join(context_parts)

# Use in Claude conversation
topic = "API Migration"
context = get_context_for_topic(kg, topic)

prompt = f"""Context from my knowledge graph:
{context}

Question: What's the status of {topic} and who's involved?"""

This is the idea behind graph memory: an AI that retrieves relationships rather than just matching text.
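
Direct connections are sometimes too narrow, so it can help to pull in facts a couple of hops away. The sketch below shows that with a breadth-first walk over plain (subject, predicate, object) tuples rather than the PersonalKG class, so that it stays dependency-free. The `neighborhood` function, its `max_hops` parameter, and the sample facts are illustrative.

```python
from collections import deque


def neighborhood(triples, start, max_hops=2):
    """Collect facts within `max_hops` of `start`, ignoring edge direction."""
    # Map each node to the facts that touch it, plus the node on the other end
    adjacent = {}
    for s, p, o in triples:
        adjacent.setdefault(s, []).append((s, p, o, o))
        adjacent.setdefault(o, []).append((s, p, o, s))

    seen = {start}
    facts = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for s, p, o, other in adjacent.get(node, []):
            fact = f"{s} {p} {o}"
            if fact not in facts:
                facts.append(fact)
            if other not in seen:
                seen.add(other)
                queue.append((other, depth + 1))
    return facts


triples = [
    ("Sarah", "WORKS_ON", "API Migration"),
    ("API Migration", "USES", "GraphQL"),
    ("Sarah", "RECOMMENDS", "GraphQL"),
    ("Alex", "USES", "REST"),  # unrelated to the topic, should not appear
]

print("\n".join(neighborhood(triples, "API Migration", max_hops=2)))
```

The hop limit matters: on a well-connected graph, three or four hops can pull in most of the notes, which defeats the purpose of targeted context.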

What You Can Steal

If you only have 10 minutes: Set up Obsidian and start using [[wiki-links]] for people and projects. The graph view alone provides value.

If you have an afternoon: Run Neo4j in Docker. Manually create 20 nodes and relationships from your recent notes. Write 5 Cypher queries that answer questions you couldn’t answer before.

If you’re building a system: Create the extraction pipeline. Run it on your notes archive. Build a simple CLI that queries the graph before each Claude conversation.

Schema to start with:

Person: name, role, context
Project: name, status, started
Tool: name, category
Concept: name, source

WORKS_ON: Person → Project
USES: Project → Tool  
KNOWS: Person → Person
INTRODUCED: Person → Concept (when: date)

Tradeoffs

Approach	Pros	Cons
Obsidian	Already using it, visual, simple	No typed relations, not queryable
Neo4j	Full graph queries, production-ready	Heavier setup, separate from notes
Custom Python	Lightweight, embeddable	You build and maintain it yourself, fewer features
Zep/Graphiti	Temporal, AI-native	Dependency, learning curve

For personal use, start with Obsidian links. Graduate to Neo4j when you need queries like “who introduced me to tools I use in active projects?”

