Memory Layers
How INSTANT, SELECTION, and DEEP layers work — internals and configuration
the-brain organizes memory across three layers, each processing information at a different timescale.
⚡ Layer 1: INSTANT (Working Memory)
Plugin: plugin-graph-memory
Intercepts prompts and injects real-time biases before they reach the LLM.
Detection Pipeline
The AFTER_RESPONSE handler runs a 6-stage pipeline using language-agnostic structural heuristics (no English regex):
- Correction detection — structural heuristics with dynamic confidence weights
  - Short prompt + long explanatory response (ratio > 2.5:1) → likely correction
  - Very short prompt (< 50 chars) + substantial response (> 100 chars)
  - High token novelty (response introduces new vocabulary not in prompt)
- Preference detection — cross-interaction cluster tracking
  - Short declarative statements (15-150 chars) → preference candidates
  - Token overlap with previously detected preferences → reinforced
  - Repeated vocabulary across 2+ interactions → emerging preference
- Pattern detection — keywords appearing ≥3 times in recent interactions
- Concept node creation — new keywords not matching any existing node
- Node interconnection — bidirectional links between all new nodes
- Weight decay — periodic decay for nodes unmatched for >24h
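The correction heuristics above can be sketched as follows. This is an illustrative sketch, not the plugin's actual code: the function names (`isLikelyCorrection`, `tokenNovelty`) and the 0.6 novelty cutoff are assumptions; the length thresholds and ratio come from this page.

```typescript
// Hypothetical sketch of the structural correction heuristics described above.
function tokenize(text: string): string[] {
  // Unicode-aware: split on any non-letter boundary (works across scripts).
  return text.split(/[^\p{L}]+/u).filter(Boolean);
}

// Fraction of response tokens that never appear in the prompt.
function tokenNovelty(prompt: string, response: string): number {
  const promptTokens = new Set(tokenize(prompt.toLowerCase()));
  const respTokens = tokenize(response.toLowerCase());
  if (respTokens.length === 0) return 0;
  const novel = respTokens.filter((t) => !promptTokens.has(t)).length;
  return novel / respTokens.length;
}

function isLikelyCorrection(prompt: string, response: string): boolean {
  // Very short prompt + substantial response.
  if (prompt.length < 50 && response.length > 100) return true;
  // Long explanatory response relative to the prompt.
  if (response.length / Math.max(prompt.length, 1) > 2.5) return true;
  // High token novelty (cutoff is an assumed value).
  return tokenNovelty(prompt, response) > 0.6;
}
```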
Quality Filters
- Unicode-aware tokenizer — splits on non-letter boundaries, works for all scripts (Latin, Cyrillic, CJK, Arabic, etc.) — no English stop words needed
- Weight decay: every ~10 interactions, nodes >24h old lose 2% weight (floor: 0.05)
- Weight boost: matched nodes gain +0.05 per use
- Correction weights: 0.5-0.85 based on structural heuristic confidence
- Preference weight: always 0.7
- Concept weight: always 0.4
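The decay and boost rules can be sketched like this. The numbers (2% decay, 0.05 floor, +0.05 boost, 24h window) are from this page; the node shape and function names are assumptions, not the plugin-graph-memory API.

```typescript
// Illustrative node shape for the decay/boost rules listed above.
interface MemoryNode {
  weight: number;
  lastMatchedAt: number; // epoch ms
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Run every ~10 interactions: nodes unmatched for >24h lose 2% weight,
// floored at 0.05.
function decay(node: MemoryNode, now: number): void {
  if (now - node.lastMatchedAt > DAY_MS) {
    node.weight = Math.max(0.05, node.weight * 0.98);
  }
}

// On match: +0.05 boost per use (assumed uncapped here; the plugin may clamp).
function boost(node: MemoryNode, now: number): void {
  node.weight += 0.05;
  node.lastMatchedAt = now;
}
```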
Context Injection
BEFORE_PROMPT extracts keywords → searches graph nodes → filters by minWeight →
boosts matched nodes → fetches connected nodes → injects formatted context.
Edge cases: empty keywords → no injection. Connected nodes also respect minWeight.
Metadata from BEFORE_PROMPT (matchedNodeIds, promptKeywords) flows to AFTER_RESPONSE.
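The injection flow and its edge cases can be sketched as below. The graph/node shapes and `buildContext` are illustrative assumptions; only the behavior (minWeight filtering, connected-node filtering, empty-keyword short-circuit, metadata handoff) follows this page.

```typescript
// Minimal sketch of the BEFORE_PROMPT injection flow described above.
interface GraphNode {
  id: string;
  keyword: string;
  weight: number;
  connections: string[]; // ids of linked nodes
}

function buildContext(
  promptKeywords: string[],
  graph: Map<string, GraphNode>,
  minWeight: number
): { context: string; matchedNodeIds: string[] } {
  // Edge case: empty keywords → no injection.
  if (promptKeywords.length === 0) return { context: "", matchedNodeIds: [] };
  const kw = new Set(promptKeywords);
  const matched = [...graph.values()].filter(
    (n) => kw.has(n.keyword) && n.weight >= minWeight
  );
  // Connected nodes also respect minWeight.
  const connected = matched
    .flatMap((n) => n.connections)
    .map((id) => graph.get(id))
    .filter((n): n is GraphNode => !!n && n.weight >= minWeight);
  const all = [...new Set([...matched, ...connected])];
  return {
    context: all.map((n) => n.keyword).join(", "),
    matchedNodeIds: matched.map((n) => n.id), // flows to AFTER_RESPONSE
  };
}
```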
⚖️ Layer 2: SELECTION (The Gatekeeper)
Plugin: plugin-spm-curator
Evaluates every interaction and filters noise from signal using a composite surprise score.
Dual-Mode Architecture
- TF-IDF mode (default, useTfidf: true): wider spread, +93% better discrimination
  - Builds vocabulary from production memories at daemon startup
  - initTfidfFromTexts(texts) → finalizeTfidf() lifecycle
  - Falls back to EMA-Gaussian before vocabulary is locked
- EMA-Gaussian mode (fallback): running mean/variance per feature
  - Uses 6 scalar features: promptLen, responseLen, totalLen, lexicalDiversity, hourOfDay, dayOfWeek
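A running mean/variance surprise score for one scalar feature could look like this. The class name, the smoothing factor, and the [0, 1] normalization are assumptions; the per-feature running statistics and the z-score clamp at 5 follow this page.

```typescript
// Sketch of an EMA-Gaussian surprise score over one scalar feature.
class EmaGaussian {
  private mean = 0;
  private variance = 1;
  private initialized = false;
  constructor(private alpha = 0.05) {} // assumed smoothing factor

  update(x: number): void {
    if (!this.initialized) {
      this.mean = x;
      this.initialized = true;
      return;
    }
    const d = x - this.mean;
    this.mean += this.alpha * d;
    // Exponentially-weighted variance update.
    this.variance = (1 - this.alpha) * (this.variance + this.alpha * d * d);
  }

  // Surprise as a z-score clamped to [0, 5], normalized to [0, 1].
  surprise(x: number): number {
    const z = Math.abs(x - this.mean) / Math.sqrt(Math.max(this.variance, 1e-9));
    return Math.min(z, 5) / 5;
  }
}
```

In the curator, one such tracker would run per feature (promptLen, responseLen, and so on) and the per-feature surprises would be combined into the scalar sub-score.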
Composite Score Formula
composite = scalarWeight × scalarScore + embeddingWeight × embScore + noveltyWeight × noveltyScore

All sub-scores are normalized to [0, 1]. Default weights: scalar=0.35, embedding=0.40, novelty=0.25.
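In code, the formula with the default weights is simply:

```typescript
// The composite surprise score, using the default weights from this page.
// Each sub-score is expected to already be normalized to [0, 1].
function compositeScore(
  scalarScore: number,
  embScore: number,
  noveltyScore: number,
  weights = { scalar: 0.35, embedding: 0.4, novelty: 0.25 }
): number {
  return (
    weights.scalar * scalarScore +
    weights.embedding * embScore +
    weights.novelty * noveltyScore
  );
}
```

Because the default weights sum to 1, the composite also stays in [0, 1], which is what the promotion threshold (default 0.82) is compared against.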
Quality Features
- Duplicate detection: djb2 hash of first 200 chars; Set of 5,000 recent hashes; duplicates get score 0
- N-gram novelty: character n-grams (default n=4) compared against 50,000-cache FIFO
- Z-score clamping: capped at [0, 5] to prevent outlier dominance
- TF-IDF seed: finalizeTfidf(seedTexts) primes the centroid to avoid all scores sitting at 0.5 at startup
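The duplicate filter can be sketched as below. The djb2 hash, 200-char prefix, 5,000-hash window, and score-0 behavior are from this page; the class and its internals are illustrative, not the curator's actual data structures.

```typescript
// djb2: classic string hash (h = h * 33 + c), kept in uint32 range.
function djb2(s: string): number {
  let h = 5381;
  for (let i = 0; i < s.length; i++) {
    h = ((h << 5) + h + s.charCodeAt(i)) >>> 0;
  }
  return h;
}

// Hash the first 200 chars; keep a FIFO window of recent hashes;
// anything already seen is a duplicate (the curator scores it 0).
class DuplicateFilter {
  private seen = new Set<number>();
  private order: number[] = [];
  constructor(private capacity = 5000) {}

  isDuplicate(text: string): boolean {
    const h = djb2(text.slice(0, 200));
    if (this.seen.has(h)) return true;
    this.seen.add(h);
    this.order.push(h);
    if (this.order.length > this.capacity) {
      this.seen.delete(this.order.shift()!); // evict oldest (FIFO)
    }
    return false;
  }
}
```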
Runtime Introspection
// Access via hook
hooks.callHook("spm-curator:getInstance", (instance) => {
instance.setThreshold(0.5); // Dynamic threshold
instance.getStats(); // gaussians, centroidDim, promoteRate, etc.
});

🌌 Layer 3: DEEP (Long-Term)
Plugin: trainer-local-mlx
Overnight LoRA training on curated memories. See MLX Training.
Promotion Flow
Interaction → INSTANT (graph memory)
↓ AFTER_RESPONSE
SPM evaluation → composite surprise score
↓ above threshold (default 0.82)
SELECTION promotion → DEEP
↓ DEEP_CONSOLIDATE (2 AM cron)
MLX LoRA training → model weights

Layer Configuration
All three layers are pluggable — swap built-in plugins with custom implementations.
See Plugin Contracts for the interfaces each layer plugin must implement.
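To give a feel for what a swappable layer looks like, here is a hypothetical plugin shape. This is NOT the actual contract — the real interfaces are defined in the Plugin Contracts page; every name below is an assumption for illustration only.

```typescript
// Hypothetical layer-plugin shape illustrating the idea of swappable layers.
interface Interaction {
  prompt: string;
  response: string;
}

interface LayerPlugin {
  name: string;
  beforePrompt?(prompt: string): string; // e.g. INSTANT context injection
  afterResponse?(interaction: Interaction): void; // e.g. detection or scoring
}

// A no-op custom layer that could stand in for a built-in one.
const passthrough: LayerPlugin = {
  name: "custom-noop-layer",
  beforePrompt: (p) => p,
  afterResponse: () => {},
};
```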