How BrainLayer Works: Persistent Memory for AI Agents
Every Claude Code session produces JSONL transcripts. BrainLayer's 4-stage pipeline turns these into searchable knowledge. Extract parses session files and detects continuation chains. Classify identifies content types (user messages, AI code, stack traces, file reads) with content-aware length thresholds. Chunk splits text using tree-sitter for code and paragraph boundaries for prose, targeting ~2000 chars per chunk. Embed generates 1024-dim vectors with bge-large-en-v1.5 and stores them in SQLite via sqlite-vec.
Extract
JSONL → sessions
Classify
Content-type detection
Chunk
AST-aware splitting
Embed
bge-large 1024-dim
Store
SQLite + sqlite-vec
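The prose side of the Chunk stage can be sketched as a greedy paragraph packer that targets ~2000 characters per chunk. The function name and exact packing rule below are illustrative, not BrainLayer's implementation, and the real pipeline additionally uses tree-sitter AST boundaries for code:

```python
def chunk_paragraphs(text: str, target: int = 2000) -> list[str]:
    """Greedily pack paragraphs into chunks of roughly `target` characters."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Start a new chunk once adding this paragraph would overshoot the target
        if current and len(current) + len(para) + 2 > target:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries (rather than fixed offsets) keeps each chunk a self-contained unit of prose, which matters for embedding quality.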
Vector similarity alone misses exact keyword matches. BrainLayer runs two strategies in parallel: semantic search with 1024-dim embeddings (KNN, 3x oversampling) and FTS5 keyword search for exact hits. Reciprocal Rank Fusion combines both ranked lists with score = 1/(k + rank), where k=60 keeps any single high rank from dominating the final ordering.
# Reciprocal Rank Fusion (k=60)
def rrf(semantic_results, fts_results, n, k=60):
    # semantic_results / fts_results map chunk_id -> 1-based rank
    fused = {}
    for chunk_id in set(semantic_results) | set(fts_results):
        score = 0.0
        if chunk_id in semantic_results:
            score += 1.0 / (k + semantic_results[chunk_id])
        if chunk_id in fts_results:
            score += 1.0 / (k + fts_results[chunk_id])
        fused[chunk_id] = score
    # Sort by fused score (highest first), not by chunk_id
    return sorted(fused, key=fused.get, reverse=True)[:n]
Results appearing in both lists get higher scores.
Raw chunks need structure. A local LLM (GLM-4.7-Flash or Qwen2.5-Coder-14B via MLX on Apple Silicon) enriches each chunk with 10 metadata fields: summary, tags, importance (1-10), intent, primary code symbols, a hypothetical query for HyDE retrieval, epistemic level, version scope, tech debt impact, and external deps. Batches of 50-100 chunks with 5-minute stall detection per chunk.
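The ten enrichment fields could be modeled as a record like the following. The field names and types are assumptions inferred from the list above, not BrainLayer's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ChunkEnrichment:
    # Hypothetical schema for the ten metadata fields named in the text.
    summary: str
    tags: list[str]
    importance: int          # 1-10
    intent: str
    code_symbols: list[str]  # primary code symbols
    hyde_query: str          # hypothetical query for HyDE retrieval
    epistemic_level: str
    version_scope: str
    tech_debt_impact: str
    external_deps: list[str]
```

Structured output like this is exactly the kind of extraction task where a small local model tends to hold its own against larger cloud models.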
Enriching 297K+ chunks at ~$0.01/chunk via a cloud API would cost roughly $2,975. Running GLM-4.7-Flash locally costs $0. Quality is comparable for structured extraction tasks, and no data leaves the machine.
Everything lives in a single .db file: SQLite + sqlite-vec for vectors, FTS5 for keywords. Not a compromise. The database ships with the package, needs zero infrastructure, and handles concurrent access from the daemon, MCP server, and enrichment workers via APSW with a 5-second busy timeout.
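The busy-timeout pattern can be sketched as follows, shown with stdlib sqlite3 for illustration (BrainLayer itself uses APSW):

```python
import sqlite3

def open_db(path: str) -> sqlite3.Connection:
    # timeout=5.0 mirrors the 5-second busy timeout: when another process
    # holds the write lock, this connection retries for up to 5s before
    # raising "database is locked" instead of failing immediately.
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer
    return conn
```

WAL mode plus a busy timeout is what lets the daemon, MCP server, and enrichment workers share one file without a separate database server.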
7 MCP tools expose BrainLayer's full capability to any Claude Code session. 3 core memory tools handle search, persistence, and recall. 4 knowledge graph tools add entity extraction, lookup, updates, and person profiles. Started at 14 specialized tools, consolidated to 7 that cover every use case. Old names still work through backward-compat aliases.
brain_search
Semantic + keyword hybrid search across all indexed chunks
brain_store
Persist decisions, learnings, bugs, and TODOs with auto-tagging
brain_recall
Session context, operational history, and work summaries
brain_digest
Ingest raw content and extract entities, relations, action items
brain_entity
Look up known entities in the knowledge graph with relations
brain_update
Update, archive, or merge existing memory chunks
brain_get_person
Retrieve person profiles with interaction history
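A hypothetical brain_search invocation from an MCP client might carry a payload like this. The argument names are illustrative assumptions, not the tool's documented schema:

```python
# MCP tools/call-style request for the brain_search tool.
# Argument names ("query", "limit") are hypothetical.
request = {
    "name": "brain_search",
    "arguments": {
        "query": "why did we switch to sqlite-vec?",
        "limit": 10,
    },
}
```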