domain-memory-agent

Knowledge base with TF-IDF semantic search and extractive summarization - no ML dependencies required

View on GitHub
Author Jeremy Longshore
Namespace @jeremylongshore/claude-code-plugins-plus
Category productivity
Version 1.0.0
Stars 1,193
Downloads 3
self.md verified
Table of content

Knowledge base with TF-IDF semantic search and extractive summarization - no ML dependencies required

Installation

npx claude-plugins install @jeremylongshore/claude-code-plugins-plus/domain-memory-agent

Contents

Folders: servers, skills, tests

Files: LICENSE, README.md, package.json, server.json, tsconfig.json, vitest.config.ts

Documentation

Knowledge base with semantic search, document storage, and automatic summarization

A lightweight MCP server for domain-specific knowledge management using TF-IDF semantic search (no external ML dependencies). Perfect for building AI memory systems and RAG applications.

Features

Installation

/plugin install domain-memory-agent@claude-code-plugins-plus

6 MCP Tools

1. store_document

Store documents in knowledge base with automatic indexing.

{
  "title": "Machine Learning Basics",
  "content": "Machine learning is a subset of AI...",
  "tags": ["ai", "ml", "tutorial"],
  "metadata": {
    "author": "John Doe",
    "category": "Technical"
  }
}

Search using TF-IDF relevance scoring.

{
  "query": "machine learning algorithms",
  "limit": 10,
  "tags": ["ai"],
  "minScore": 0.1
}

Returns: Ranked results with relevance scores and excerpts.

3. summarize

Generate extractive summaries (cached).

{
  "documentId": "doc123",
  "maxSentences": 5,
  "regenerate": false
}

4. list_documents

Browse knowledge base with filtering.

{
  "tags": ["ai"],
  "sortBy": "updated",
  "limit": 50,
  "offset": 0
}

5. get_document

Retrieve full document by ID.

{
  "documentId": "doc123"
}

6. delete_document

Remove document and unindex.

{
  "documentId": "doc123"
}

Quick Start

// 1. Store knowledge
store_document({
  title: "API Design Best Practices",
  content: "RESTful APIs should be...",
  tags: ["api", "architecture"]
})

// 2. Search knowledge
semantic_search({
  query: "REST API design patterns",
  limit: 5
})

// 3. Get summary
summarize({
  documentId: "doc123",
  maxSentences: 3
})

How Semantic Search Works

Uses TF-IDF (Term Frequency-Inverse Document Frequency):

  1. Tokenization: Text → lowercase words (filter short words)
  2. Term Frequency: Count word occurrences in each document
  3. Document Frequency: Track how many documents contain each term
  4. IDF: Rare terms get higher scores
  5. TF-IDF Score: Rank documents by relevance

Advantages:

Architecture

In-Memory Storage:
├── documents: Map<id, Document>
├── tfidfIndex:
│   ├── termFrequencies: Map<term, Map<docId, freq>>
│   ├── documentFrequencies: Map<term, count>
│   └── documentLengths: Map<docId, totalTerms>

Note: Data persists during session but clears on restart. Future versions will add persistence.

Use Cases

  1. RAG Systems - Store domain knowledge for AI retrieval
  2. Documentation Search - Index and search project docs
  3. Research Notes - Organize research with semantic search
  4. Customer Support - Build knowledge bases for support agents
  5. Personal Knowledge - Second brain / Zettelkasten system

Performance

Best Practices

  1. Use descriptive titles - Improves search relevance
  2. Tag consistently - Makes filtering effective
  3. Store focused documents - Better than huge files
  4. Cache summaries - Regenerate only when needed
  5. Regular cleanup - Delete outdated documents

Example Workflows

Building a Technical Knowledge Base

# Store API documentation
store_document(title: "REST API Guide", content: "...", tags: ["api", "docs"])

# Store best practices
store_document(title: "Error Handling Patterns", content: "...", tags: ["patterns", "errors"])

# Search when needed
semantic_search(query: "handle API errors", tags: ["api"])

Research Note System

# Store research papers
store_document(title: "Transformer Architecture", content: "...", tags: ["ml", "nlp", "research"])

# Find related research
semantic_search(query: "attention mechanisms", tags: ["ml"])

# Get quick summary
summarize(documentId: "paper123", maxSentences: 5)

License

MIT License - see LICENSE


Made with ️ by Intent Solutions

Source

View on GitHub

Tags: productivity mcpknowledge-basesemantic-searchtfidfsummarizationragdocumentation