Memory Attribution and Provenance

Your AI remembers a fact. Where did it come from? When was it added? Is it still accurate? Without attribution, your AI’s memory is a black box of claims with no way to verify or update them.

The provenance problem

Many AI memory systems store facts with little or no metadata. A bare memory entry might look like this:

{
  "fact": "The auth system uses JWT with 24h expiry"
}

This tells you nothing useful:

Question | This format answers
When was this decided? | No
Who made this decision? | No
Is it still current? | No
Where is it documented? | No
How confident should I be? | No

When your AI retrieves this fact six months later, it presents it with the same authority as something verified yesterday.

Attribution metadata

Every memory entry needs context:

{
  "fact": "The auth system uses JWT with 24h expiry",
  "source": "architecture-decision-003.md",
  "created": "2025-09-14T10:23:00Z",
  "author": "Sarah Chen",
  "confidence": 0.95,
  "last_verified": "2025-12-01T08:00:00Z",
  "supersedes": null
}

Field | Purpose
source | Where the fact originated
created | When it entered the system
author | Who provided it
confidence | How certain we are (0-1)
last_verified | When someone confirmed it
supersedes | Previous fact this replaces
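
In code, this metadata can travel with the fact as a small record type. The sketch below is one possible shape, using a Python dataclass whose fields mirror the JSON above; the later snippets in this piece assume a Memory object roughly like this.

from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

@dataclass
class Memory:
    content: str                              # the fact itself
    source: Optional[str] = None              # where it originated
    author: Optional[str] = None              # who provided it
    confidence: float = 0.5                   # 0-1, see scoring below
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    last_verified: Optional[datetime] = None  # None means never verified
    supersedes: Optional[str] = None          # id of the fact this replaces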

Graph reification

Knowledge graphs use reification to make statements about statements. Instead of storing bare facts, you store facts wrapped in metadata.

TrustGraph’s guide on graph reification explains why this matters for AI agents: you can track fact sources, assign confidence scores, handle conflicting information, model how relationships change over time, and keep audit trails.

Traditional triple:

<AuthSystem> <uses> <JWT>

Reified triple:

<Statement_001> <subject> <AuthSystem>
<Statement_001> <predicate> <uses>
<Statement_001> <object> <JWT>
<Statement_001> <source> <ArchDoc003>
<Statement_001> <confidence> 0.95
<Statement_001> <created> "2025-09-14"

The second approach lets you query provenance directly: “Which facts came from this source?” or “What do we know with confidence below 0.8?”
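
A minimal sketch of what those provenance queries look like in practice, assuming the reified statements are held as plain Python dicts rather than in a triple store:

statements = [
    {
        "id": "Statement_001",
        "subject": "AuthSystem", "predicate": "uses", "object": "JWT",
        "source": "ArchDoc003", "confidence": 0.95, "created": "2025-09-14",
    },
    # ...more reified statements
]

def facts_from_source(source: str) -> list[dict]:
    # "Which facts came from this source?"
    return [s for s in statements if s["source"] == source]

def low_confidence_facts(threshold: float = 0.8) -> list[dict]:
    # "What do we know with confidence below 0.8?"
    return [s for s in statements if s["confidence"] < threshold]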

Confidence scoring

Not all memories deserve equal weight:

Source type | Base confidence | Notes
User explicitly stated | 0.95 | Direct input
Extracted from document | 0.85 | May have parsing errors
Inferred from context | 0.70 | Could be wrong
Imported from external system | 0.60 | Depends on system quality
AI-generated summary | 0.50 | Verify before trusting
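
One way to apply these defaults is a lookup keyed by source type. The keys below match the source_type values used in the SQLite schema later in this piece, and the numbers simply copy the table, so treat them as starting points rather than fixed constants:

BASE_CONFIDENCE = {
    "user_input": 0.95,   # user explicitly stated
    "document":   0.85,   # extracted from a document
    "inference":  0.70,   # inferred from context
    "import":     0.60,   # imported from an external system
    "ai_summary": 0.50,   # AI-generated summary
}

def initial_confidence(source_type: str) -> float:
    # Fall back to a conservative default for unknown source types
    return BASE_CONFIDENCE.get(source_type, 0.5)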

Confidence decays over time. A fact verified yesterday deserves more trust than one from two years ago:

import math
from datetime import datetime, timedelta

def decayed_confidence(original: float, last_verified: datetime) -> float:
    days_old = (datetime.now() - last_verified).days
    # Halve confidence every 180 days without verification
    decay = math.exp(-0.693 * days_old / 180)
    return original * decay

# Example
original_confidence = 0.95
last_check = datetime.now() - timedelta(days=365)
current = decayed_confidence(original_confidence, last_check)
# Returns ~0.23 - time to re-verify

Citation in responses

When your AI uses a memory, it should cite the source. Debanjum Singh’s Khoj implements this through RAG with source references. Every answer links back to the documents that informed it.

Pattern for citation:

def format_memories_with_sources(memories: list[Memory]) -> str:
    # Number each memory so the model can cite it as [1], [2], ...
    return "\n".join(
        f"[{i+1}] {m.content} (source: {m.source})"
        for i, m in enumerate(memories)
    )

def generate_response(query: str, memories: list[Memory]) -> str:
    context = format_memories_with_sources(memories)

    response = llm.complete(f"""
Based on these sources:

{context}

Question: {query}

Answer the question and cite sources using [1], [2], etc.
""")

    # Append source list
    sources = "\n".join([
        f"[{i+1}] {m.source} ({m.created.date()})"
        for i, m in enumerate(memories)
    ])

    return f"{response}\n\nSources:\n{sources}"

Output format:

The auth system uses JWT tokens with a 24-hour expiry [1].
This was chosen over session cookies for stateless scaling [2].

Sources:
[1] architecture-decision-003.md (2025-09-14)
[2] meeting-notes-2025-09-10.md (2025-09-10)

Handling conflicts

Multiple sources may disagree. Your system needs conflict resolution:

def resolve_conflict(memories: list[Memory]) -> Memory:
    # Sort by confidence and recency
    scored = [
        (m, m.confidence * recency_weight(m.last_verified))
        for m in memories
    ]
    scored.sort(key=lambda x: x[1], reverse=True)

    winner = scored[0][0]

    # Log the conflict for human review
    if len(scored) > 1 and scored[0][1] - scored[1][1] < 0.1:
        flag_for_review(memories, reason="close_confidence")

    return winner

When confidence scores are close, flag for human review rather than silently picking one.
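
resolve_conflict above leans on two helpers that aren't shown. One plausible reading, reusing the 180-day half-life from decayed_confidence for recency_weight and keeping flag_for_review as a bare log-and-move-on step; both are sketches, not a fixed API:

import math
from datetime import datetime
from typing import Optional

def recency_weight(last_verified: Optional[datetime], half_life_days: int = 180) -> float:
    # Exponential decay, mirroring decayed_confidence above
    if last_verified is None:
        return 0.25  # arbitrary low weight for never-verified facts
    days_old = (datetime.now() - last_verified).days
    return math.exp(-0.693 * days_old / half_life_days)

def flag_for_review(memories: list[Memory], reason: str) -> None:
    # Minimal version: surface the conflict where a human will see it
    print(f"[review needed] {reason}: " + "; ".join(m.content[:40] for m in memories))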

Practical schema

SQLite schema for attributed memories:

CREATE TABLE memories (
    id TEXT PRIMARY KEY,
    content TEXT NOT NULL,
    source TEXT,
    source_type TEXT CHECK (source_type IN (
        'user_input', 'document', 'inference', 'import', 'ai_summary'
    )),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    author TEXT,
    confidence REAL DEFAULT 0.5,
    last_verified TIMESTAMP,
    superseded_by TEXT REFERENCES memories(id),
    embedding BLOB
);

CREATE INDEX idx_memories_source ON memories(source);
CREATE INDEX idx_memories_confidence ON memories(confidence);
CREATE INDEX idx_memories_verified ON memories(last_verified);
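
Writing an attributed memory is then a single parameterized insert. A sketch using Python's built-in sqlite3 module against the schema above; the 0.85 comes from the document row in the confidence table:

import sqlite3
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect("memories.db")
conn.execute(
    """
    INSERT INTO memories (id, content, source, source_type, author, confidence, last_verified)
    VALUES (?, ?, ?, ?, ?, ?, ?)
    """,
    (
        str(uuid.uuid4()),
        "The auth system uses JWT with 24h expiry",
        "architecture-decision-003.md",
        "document",
        "Sarah Chen",
        0.85,  # base confidence for facts extracted from documents
        datetime.now(timezone.utc).isoformat(),
    ),
)
conn.commit()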

Query low-confidence or stale memories:

-- Memories needing verification
SELECT content, source, confidence, last_verified
FROM memories
WHERE confidence < 0.6
   OR last_verified IS NULL
   OR last_verified < datetime('now', '-90 days')
ORDER BY confidence ASC;

Verification workflows

Build verification into your system:

Periodic review:

def get_stale_memories(days: int = 90) -> list[Memory]:
    cutoff = datetime.now() - timedelta(days=days)
    return db.query("""
        SELECT * FROM memories
        WHERE last_verified < ? OR last_verified IS NULL
        ORDER BY confidence DESC
        LIMIT 20
    """, [cutoff])

# Weekly task: verify top 20 stale memories
for memory in get_stale_memories():
    print(f"Still accurate? {memory.content}")
    print(f"Source: {memory.source}")
    # Human confirms or updates

On-access verification:

def retrieve_with_warning(query: str) -> tuple[list[Memory], list[str]]:
    memories = search(query)
    warnings = []

    stale_cutoff = datetime.now() - timedelta(days=180)
    for m in memories:
        if m.confidence < 0.6:
            warnings.append(f"Low confidence: {m.content[:50]}...")
        # Treat never-verified memories as stale too
        if m.last_verified is None or m.last_verified < stale_cutoff:
            warnings.append(f"Stale (not verified in 6mo): {m.content[:50]}...")

    return memories, warnings

What to track

Metadata | Why it matters
Source document | Enables verification
Creation timestamp | Shows age
Author | Attribution and trust
Confidence score | Guides reliance
Last verified | Freshness indicator
Supersedes link | Tracks updates
Access count | Shows utility
Last accessed | Reveals relevance

Start with source, created, and confidence. Add more as your needs grow.

Building attribution in

Personal search systems need to store document paths alongside embeddings. Memory systems need provenance metadata wrapped around every fact.

Yes, the extra fields add complexity. But the first time your AI confidently states something wrong, you’ll want to know where that belief came from and whether anyone verified it recently.


Next: Context Window Management

Topics: memory ai-agents observability