Build a Personal Knowledge Graph

Your notes app treats knowledge as a pile of documents. Search returns files. But you don’t think in files. You think in connections: this person told me about that tool for this project, which relates to a decision from last year.

A personal knowledge graph makes those connections explicit and queryable.

What a Knowledge Graph Gives You

Regular note search: find documents containing words.

Graph search: traverse relationships.

| Query | Document Search | Graph Query |
|---|---|---|
| “Who told me about Kubernetes?” | Returns all docs mentioning Kubernetes | Returns the person node with an INTRODUCED edge to the Kubernetes concept |
| “What tools does the API project use?” | Finds API project files, hope tools are mentioned | Traverses API_Project → USES → tool nodes |
| “How is this concept related to that decision?” | Can’t answer | Shows path between nodes |

The difference is structure. Documents contain text. Graphs contain entities and relationships.

Core Components

A knowledge graph has three parts:

| Component | What It Stores | Examples |
|---|---|---|
| Nodes | Entities with types | Person: Sarah, Project: API Migration, Tool: Kubernetes |
| Edges | Relationships with types | WORKS_ON, MENTIONED, INTRODUCED, DECIDED |
| Properties | Attributes on nodes/edges | created: 2026-01-15, confidence: 0.9, source: meeting_notes |

Every fact becomes a triple: Subject → Predicate → Object. “Sarah mentioned Kubernetes” becomes Sarah → MENTIONED → Kubernetes.
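
As a rough illustration of the triple model, here is a minimal Python sketch. The `Triple` class and the `match` helper are hypothetical names made up for this example, not part of any library. Leaving a field as `None` turns it into a wildcard, which is all a basic graph lookup needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One fact: subject -> predicate -> object, plus optional properties."""
    subject: str
    predicate: str
    obj: str
    props: tuple = ()  # e.g. (("source", "meeting_notes"),)


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


facts = [
    Triple("Sarah", "MENTIONED", "Kubernetes", (("source", "meeting_notes"),)),
    Triple("Sarah", "WORKS_ON", "API Migration"),
]

# "Who mentioned Kubernetes?" is a pattern with the subject left open.
who = [t.subject for t in match(facts, predicate="MENTIONED", obj="Kubernetes")]
print(who)  # ['Sarah']
```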

Entity Extraction with LLMs

The hard part isn’t storing graphs. It’s getting structured data from unstructured notes. LLMs solve this.

Give Claude a note:

Met with Sarah yesterday to discuss the API migration. She recommended 
we look at GraphQL instead of REST for the new service. Alex disagreed 
but said he'd review the benchmarks.

Ask for entities and relationships:

Extract entities (people, projects, tools, concepts) and relationships
from this note. Return as JSON triples.

Claude returns:

{
  "entities": [
    {"name": "Sarah", "type": "Person"},
    {"name": "Alex", "type": "Person"},
    {"name": "API Migration", "type": "Project"},
    {"name": "GraphQL", "type": "Tool"},
    {"name": "REST", "type": "Tool"}
  ],
  "relationships": [
    {"from": "Sarah", "relation": "RECOMMENDS", "to": "GraphQL", "context": "for API migration"},
    {"from": "Alex", "relation": "DISAGREES_WITH", "to": "GraphQL"},
    {"from": "Alex", "relation": "WILL_REVIEW", "to": "benchmarks"},
    {"from": "Sarah", "relation": "DISCUSSED", "to": "API Migration"},
    {"from": "Alex", "relation": "DISCUSSED", "to": "API Migration"}
  ]
}

Run this on every note and the graph grows on its own. Expect to merge name variants and spot-check the extracted relationships along the way.
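
Extracted JSON is worth checking before it goes into a graph. In the sample output above, "benchmarks" appears as a relationship target but is never listed as an entity. The sketch below catches that kind of gap. `check_extraction` and `ALLOWED_TYPES` are hypothetical names for this example, and the type list is an assumption based on the prompt.

```python
import json

ALLOWED_TYPES = {"Person", "Project", "Tool", "Concept"}  # assumed type list


def check_extraction(raw: str) -> list:
    """Return human-readable problems found in the extractor's JSON."""
    data = json.loads(raw)
    problems = []
    names = set()
    for ent in data.get("entities", []):
        names.add(ent["name"])
        if ent["type"] not in ALLOWED_TYPES:
            problems.append(f"unknown type {ent['type']!r} for {ent['name']!r}")
    for rel in data.get("relationships", []):
        # Both ends of a relationship should be declared entities.
        for end in (rel["from"], rel["to"]):
            if end not in names:
                problems.append(
                    f"{rel['relation']} refers to undeclared entity {end!r}"
                )
    return problems


sample = """
{
  "entities": [
    {"name": "Alex", "type": "Person"},
    {"name": "GraphQL", "type": "Tool"}
  ],
  "relationships": [
    {"from": "Alex", "relation": "DISAGREES_WITH", "to": "GraphQL"},
    {"from": "Alex", "relation": "WILL_REVIEW", "to": "benchmarks"}
  ]
}
"""
print(check_extraction(sample))
# ["WILL_REVIEW refers to undeclared entity 'benchmarks'"]
```

What to do with a flagged item is a design choice: drop the relationship, or add the missing entity with a default type.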

Option 1: Obsidian Graph View

Obsidian’s built-in graph visualizes links between notes. Each note is a node. Wiki-links ([[like this]]) create edges.

Setup:

  1. Create notes for entities: People/Sarah.md, Projects/API Migration.md, Tools/GraphQL.md
  2. In your meeting note, link to them: “Met with [[Sarah]] about the [[API Migration]]”
  3. Open Graph View (Ctrl/Cmd + G)

Strengths and limitations:

Good for: Visual exploration. Seeing clusters. Finding orphaned notes.

Not good for: Answering “who recommended GraphQL?” You’d need to search text, not traverse relationships.
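
To see why, here is roughly the information wiki-links carry, sketched in Python. The regex and the `link_edges` helper are assumptions for illustration, not Obsidian's own parser. Each link yields a (source note, target note) pair and nothing else.

```python
import re

# Matches [[Target]], [[Target|alias]] and [[Target#Heading]]; captures only Target.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def link_edges(note_name: str, text: str) -> list:
    """Return (source_note, target_note) pairs, one per distinct link target."""
    targets = []
    for m in WIKILINK.finditer(text):
        target = m.group(1).strip()
        if target and target not in targets:
            targets.append(target)
    return [(note_name, t) for t in targets]


note = (
    "Met with [[Sarah]] about the [[API Migration]]. "
    "[[Sarah|She]] suggested [[GraphQL#Pros]]."
)
print(link_edges("Meeting note", note))
# [('Meeting note', 'Sarah'), ('Meeting note', 'API Migration'), ('Meeting note', 'GraphQL')]
```

The pairs say that the meeting note touches Sarah and GraphQL. They do not say that Sarah recommended GraphQL, because the edge has no type.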

Option 2: Neo4j for Real Queries

Neo4j is a graph database. You query with Cypher, a language designed for graph traversal.

Setup:

# Docker install
docker run -d \
  --name neo4j \
  -p 7474:7474 -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/your-password \
  neo4j:latest

Open http://localhost:7474 in your browser.

Create nodes:

CREATE (sarah:Person {name: "Sarah", role: "Engineer"})
CREATE (alex:Person {name: "Alex", role: "Tech Lead"})
CREATE (api:Project {name: "API Migration", status: "active"})
CREATE (graphql:Tool {name: "GraphQL"})
CREATE (rest:Tool {name: "REST"})

Create relationships:

MATCH (s:Person {name: "Sarah"}), (g:Tool {name: "GraphQL"})
CREATE (s)-[:RECOMMENDS {date: date("2026-01-27"), context: "for API migration"}]->(g);

MATCH (s:Person {name: "Sarah"}), (a:Project {name: "API Migration"})
CREATE (s)-[:WORKS_ON]->(a);

Query relationships:

// Who recommends GraphQL?
MATCH (p:Person)-[:RECOMMENDS]->(t:Tool {name: "GraphQL"})
RETURN p.name;

// What does Sarah work on?
MATCH (p:Person {name: "Sarah"})-[:WORKS_ON]->(proj:Project)
RETURN proj.name;

// Two-hop: tools used by people Sarah works with
// (returns nothing until WORKS_WITH and USES relationships have been created)
MATCH (s:Person {name: "Sarah"})-[:WORKS_WITH]->(colleague:Person)-[:USES]->(tool:Tool)
RETURN DISTINCT tool.name;
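
If you load triples from a script instead of typing them, you can generate the Cypher. The sketch below builds a parameterized MERGE statement, which avoids duplicate nodes when a note is ingested twice. `merge_statement` is a hypothetical helper, and the call that sends `query` and `params` to Neo4j is left out on purpose. Labels and relationship types generally cannot be passed as ordinary query parameters, so they are validated and spliced into the text.

```python
import re

_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def merge_statement(from_label: str, relation: str, to_label: str) -> str:
    """Build a Cypher MERGE for one triple; entity names travel as parameters."""
    for ident in (from_label, relation, to_label):
        # These are inserted into the query text, so accept plain identifiers only.
        if not _IDENT.fullmatch(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    return (
        f"MERGE (a:{from_label} {{name: $from_name}}) "
        f"MERGE (b:{to_label} {{name: $to_name}}) "
        f"MERGE (a)-[:{relation}]->(b)"
    )


query = merge_statement("Person", "RECOMMENDS", "Tool")
params = {"from_name": "Sarah", "to_name": "GraphQL"}
print(query)
# MERGE (a:Person {name: $from_name}) MERGE (b:Tool {name: $to_name}) MERGE (a)-[:RECOMMENDS]->(b)
```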

Option 3: Custom Python Solution

If Neo4j feels heavy, use NetworkX for in-memory graphs or SQLite with a schema for triples.

Simple NetworkX version:

import networkx as nx
import json

class PersonalKG:
    def __init__(self):
        self.graph = nx.MultiDiGraph()

    def add_entity(self, name: str, entity_type: str, **props):
        self.graph.add_node(name, type=entity_type, **props)

    def add_relationship(self, from_entity: str, relation: str, to_entity: str, **props):
        self.graph.add_edge(from_entity, to_entity, relation=relation, **props)

    def query_outgoing(self, entity: str, relation: str = None):
        """Find what an entity connects to. Returns a list of (source, target, data)."""
        edges = self.graph.out_edges(entity, data=True)
        if relation:
            edges = [(u, v, d) for u, v, d in edges if d.get('relation') == relation]
        return list(edges)

    def query_incoming(self, entity: str, relation: str = None):
        """Find what connects to an entity. Returns a list of (source, target, data)."""
        edges = self.graph.in_edges(entity, data=True)
        if relation:
            edges = [(u, v, d) for u, v, d in edges if d.get('relation') == relation]
        return list(edges)

    def path_between(self, start: str, end: str):
        """Find a connection path from start to end, following edge direction."""
        try:
            return nx.shortest_path(self.graph, start, end)
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            # No directed path exists, or one of the entities is not in the graph
            return None

    def save(self, path: str):
        data = nx.node_link_data(self.graph)
        with open(path, 'w') as f:
            json.dump(data, f)

    def load(self, path: str):
        with open(path) as f:
            data = json.load(f)
        self.graph = nx.node_link_graph(data)

# Usage
kg = PersonalKG()
kg.add_entity("Sarah", "Person", role="Engineer")
kg.add_entity("GraphQL", "Tool")
kg.add_relationship("Sarah", "RECOMMENDS", "GraphQL", date="2026-01-27")

# Query
print(kg.query_outgoing("Sarah", "RECOMMENDS"))
# [('Sarah', 'GraphQL', {'relation': 'RECOMMENDS', 'date': '2026-01-27'})]
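
The SQLite route mentioned above could look something like this. It is a sketch under assumptions: the table and column names are made up for this example, `meeting.md` is a placeholder source, and the "API Migration USES GraphQL" row is illustrative data added so the two-hop query has something to find.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist
conn.executescript("""
CREATE TABLE entities (
    name TEXT PRIMARY KEY,
    type TEXT NOT NULL
);
CREATE TABLE triples (
    subject   TEXT NOT NULL REFERENCES entities(name),
    predicate TEXT NOT NULL,
    object    TEXT NOT NULL REFERENCES entities(name),
    source    TEXT,
    UNIQUE (subject, predicate, object, source)
);
""")

conn.executemany(
    "INSERT INTO entities VALUES (?, ?)",
    [("Sarah", "Person"), ("GraphQL", "Tool"), ("API Migration", "Project")],
)
# INSERT OR IGNORE plus the UNIQUE constraint makes re-ingesting a note harmless.
conn.executemany(
    "INSERT OR IGNORE INTO triples VALUES (?, ?, ?, ?)",
    [
        ("Sarah", "RECOMMENDS", "GraphQL", "meeting.md"),
        ("Sarah", "WORKS_ON", "API Migration", "meeting.md"),
        ("API Migration", "USES", "GraphQL", "meeting.md"),
    ],
)

# Two-hop query as a self-join: tools used by projects Sarah works on.
rows = conn.execute(
    """
    SELECT DISTINCT t2.object
    FROM triples AS t1
    JOIN triples AS t2 ON t2.subject = t1.object
    WHERE t1.subject = ? AND t1.predicate = 'WORKS_ON' AND t2.predicate = 'USES'
    """,
    ("Sarah",),
).fetchall()
print(rows)  # [('GraphQL',)]
```

Each extra hop is another self-join, which is fine for two or three hops and gets awkward beyond that.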

Automated Extraction Pipeline

Connect LLM extraction to your knowledge graph:

from anthropic import Anthropic

def extract_entities_and_relations(note_text: str) -> dict:
    client = Anthropic()

    prompt = f"""Extract entities and relationships from this note.

Entities: people, projects, tools, concepts, decisions
Relationships: WORKS_ON, RECOMMENDS, MENTIONED, INTRODUCED, DECIDED, DISCUSSED

Return JSON:
{{
  "entities": [{{"name": "...", "type": "..."}}],
  "relationships": [{{"from": "...", "relation": "...", "to": "...", "context": "..."}}]
}}

Note:
{note_text}"""

    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1000,
        messages=[{"role": "user", "content": prompt}]
    )

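    # Parse the JSON object out of the reply; the model may wrap it in prose or a code fence
    import json
    text = response.content[0].text
    start, end = text.find("{"), text.rfind("}")
    return json.loads(text[start:end + 1])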

def ingest_note(kg: PersonalKG, note_path: str):
    with open(note_path) as f:
        text = f.read()

    extracted = extract_entities_and_relations(text)

    for entity in extracted["entities"]:
        kg.add_entity(entity["name"], entity["type"])

    for rel in extracted["relationships"]:
        kg.add_relationship(
            rel["from"],
            rel["relation"],
            rel["to"],
            context=rel.get("context", ""),
            source=str(note_path)  # Path objects are not JSON-serializable, so save() needs a string
        )

# Process all notes
from pathlib import Path
kg = PersonalKG()
for note in Path("~/notes").expanduser().glob("**/*.md"):
    ingest_note(kg, note)
kg.save("knowledge_graph.json")
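
One practical wrinkle: nodes are keyed by exact string, so "Sarah" and "sarah" from two different notes become two nodes. A small normalization pass before `add_entity` helps. `canonical_names` below is a hypothetical helper, and the alias table is something you would maintain by hand.

```python
def canonical_names(names, aliases=None):
    """Map each raw name to one canonical spelling.

    Names that differ only in case or whitespace collapse to the first
    spelling seen; `aliases` is a hand-maintained override table.
    """
    aliases = aliases or {}
    seen = {}      # case-folded key -> canonical spelling
    mapping = {}   # raw name -> canonical spelling
    for raw in names:
        name = " ".join(aliases.get(raw, raw).split())
        mapping[raw] = seen.setdefault(name.casefold(), name)
    return mapping


raw = ["Sarah", "sarah ", "GraphQL", "graphql", "K8s"]
print(canonical_names(raw, aliases={"K8s": "Kubernetes"}))
# {'Sarah': 'Sarah', 'sarah ': 'Sarah', 'GraphQL': 'GraphQL', 'graphql': 'GraphQL', 'K8s': 'Kubernetes'}
```

This only handles trivial variants. Deciding that "Sarah" and "Sarah from platform" are the same person still needs a human or another LLM pass.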

Schema Design

Define your entity types and relationship types upfront. This keeps the graph consistent.

Entity types for personal knowledge:

| Type | What it represents |
|---|---|
| Person | Contacts, colleagues, authors |
| Project | Work projects, side projects |
| Tool | Software, libraries, services |
| Concept | Ideas, patterns, mental models |
| Decision | Choices with reasoning |
| Event | Meetings, conferences, milestones |
| Resource | Books, articles, videos |

Relationship types:

| Relationship | Meaning |
|---|---|
| WORKS_ON | Person to Project |
| USES | Project to Tool, Person to Tool |
| KNOWS | Person to Person |
| INTRODUCED | Person to Concept/Tool (who told you about it) |
| RECOMMENDS | Person to anything |
| DECIDED | Person/Project to Decision |
| REFERENCES | Any to Resource |
| CONTRADICTS | For tracking conflicting information |
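
A schema only keeps the graph consistent if something enforces it. Here is one way to encode the relationship list above as data and check each triple against it. The `SCHEMA` layout and `is_valid` are assumptions for this sketch; `None` stands for "any type".

```python
# Allowed (source type, target type) pairs per relationship.
# None means "any type". This encoding is an assumption, not a standard.
SCHEMA = {
    "WORKS_ON":    [("Person", "Project")],
    "USES":        [("Project", "Tool"), ("Person", "Tool")],
    "KNOWS":       [("Person", "Person")],
    "INTRODUCED":  [("Person", "Concept"), ("Person", "Tool")],
    "RECOMMENDS":  [("Person", None)],
    "DECIDED":     [("Person", "Decision"), ("Project", "Decision")],
    "REFERENCES":  [(None, "Resource")],
    "CONTRADICTS": [(None, None)],
}


def is_valid(relation: str, source_type: str, target_type: str) -> bool:
    """True if the relation exists and permits this pair of entity types."""
    for allowed_src, allowed_dst in SCHEMA.get(relation, []):
        if allowed_src in (None, source_type) and allowed_dst in (None, target_type):
            return True
    return False


print(is_valid("WORKS_ON", "Person", "Project"))     # True
print(is_valid("WORKS_ON", "Tool", "Project"))       # False
print(is_valid("WILL_REVIEW", "Person", "Concept"))  # False: not in the schema
```

Rejected triples are useful signal. A relation that keeps getting rejected may belong in the schema.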

Integration with AI Memory

Use graph queries to retrieve context for AI conversations:

def get_context_for_topic(kg: PersonalKG, topic: str) -> str:
    """Gather related context from the knowledge graph."""
    context_parts = []

    # Direct connections
    outgoing = kg.query_outgoing(topic)
    incoming = kg.query_incoming(topic)

    for _, target, data in outgoing:
        context_parts.append(f"{topic} {data['relation']} {target}")

    for source, _, data in incoming:
        context_parts.append(f"{source} {data['relation']} {topic}")

    return "\n".join(context_parts)

# Use in Claude conversation
topic = "API Migration"
context = get_context_for_topic(kg, topic)

prompt = f"""Context from my knowledge graph:
{context}

Question: What's the status of {topic} and who's involved?"""

This connects to the Graph Memory concept for AI that remembers relationships.
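
The helper above looks only one hop out from the topic. For richer context you may want everything within two or three hops. Below is a breadth-first sketch over plain (subject, predicate, object) tuples. `context_within` is a hypothetical name, and it ignores edge direction on purpose so that incoming facts are picked up too.

```python
from collections import deque


def context_within(triples, topic, max_hops=2):
    """Collect facts reachable from `topic` within `max_hops`, ignoring direction."""
    seen_nodes = {topic}
    facts = []
    queue = deque([(topic, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for s, p, o in triples:
            if node not in (s, o):
                continue
            fact = f"{s} {p} {o}"
            if fact not in facts:
                facts.append(fact)
            other = o if s == node else s
            if other not in seen_nodes:
                seen_nodes.add(other)
                queue.append((other, depth + 1))
    return facts


triples = [
    ("Sarah", "WORKS_ON", "API Migration"),
    ("Sarah", "RECOMMENDS", "GraphQL"),
    ("Alex", "DISAGREES_WITH", "GraphQL"),
]
print("\n".join(context_within(triples, "API Migration", max_hops=2)))
# Sarah WORKS_ON API Migration
# Sarah RECOMMENDS GraphQL
```

Raising `max_hops` to 3 also pulls in Alex's disagreement. Keep the limit small, since context grows quickly with each hop.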

What You Can Steal

If you only have 10 minutes: Set up Obsidian and start using [[wiki-links]] for people and projects. The graph view alone provides value.

If you have an afternoon: Run Neo4j in Docker. Manually create 20 nodes and relationships from your recent notes. Write 5 Cypher queries that answer questions you couldn’t answer before.

If you’re building a system: Create the extraction pipeline. Run it on your notes archive. Build a simple CLI that queries the graph before each Claude conversation.

Schema to start with:

Person: name, role, context
Project: name, status, started
Tool: name, category
Concept: name, source

WORKS_ON: Person → Project
USES: Project → Tool  
KNOWS: Person → Person
INTRODUCED: Person → Concept (when: date)

Tradeoffs

| Approach | Pros | Cons |
|---|---|---|
| Obsidian | Already using it, visual, simple | No typed relations, not queryable |
| Neo4j | Full graph queries, production-ready | Heavier setup, separate from notes |
| Custom Python | Lightweight, embeddable | You build and maintain it, fewer features |
| Zep/Graphiti | Temporal, AI-native | Dependency, learning curve |

For personal use, start with Obsidian links. Graduate to Neo4j when you need queries like “who introduced me to tools I use in active projects?”

