mnem vs MemPalace

MemPalace: “The best-benchmarked open-source AI memory system. And it’s free.” (repo description, MemPalace/mempalace)

mnem: a content-addressed, versioned graph substrate that shares MemPalace’s no-LLM-on-write philosophy and pushes further on identity and history.

At a glance

| | mnem | MemPalace |
|---|---|---|
| License | Apache-2.0 | MIT |
| Stars | small / pre-launch | 49,768 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | embedded (Python + ChromaDB) |
| LLM at ingest | no | no (verbatim store) |
| Content-addressed | yes | no (ChromaDB row IDs) |
| Bitemporal | no | partial (valid_from / valid_to on KG entries) |
| WASM target | yes | no |
| MCP server | yes (18 tools) | yes (29 tools) |
| Hybrid retrieval | yes (vector + sparse + graph) | yes (semantic + hybrid v4 / v5 with keyword + temporal boost) |
| Token-budget retrieval metadata | yes | no |
| 3-way merge | yes | no |
| Reproducible benchmarks in-repo | yes | yes (per-question JSONL committed) |

Feature comparison

| # | Dimension | mnem | MemPalace | Source |
|---|---|---|---|---|
| 1 | Schema | open (any labels / properties) | fixed: wings, rooms, halls, drawers | MemPalace README “What it is” sha 6890948e092b |
| 2 | Storage | redb embedded | ChromaDB + SQLite | MemPalace README + mempalace/backends/base.py |
| 3 | Default embedder | bundled ONNX MiniLM-L6-v2 | ChromaDB default (MiniLM-L6-v2 implied) | MemPalace requirements |
| 4 | LLM at ingest | none | none | MemPalace README |
| 5 | LLM at retrieval | optional rerank | optional hybrid-v4 + LLM rerank tier | MemPalace Benchmarks table |
| 6 | Identity | content CID (BLAKE3 over DAG-CBOR) | ChromaDB row IDs | implementation |
| 7 | History | signed commit DAG | append-only with valid_from / valid_to | MemPalace KG section |
| 8 | Conflict resolution | 3-way merge | manual invalidate tool | MemPalace MCP tool list |
| 9 | Sparse lane | BM25 + SPLADE | hybrid-v4 keyword boost | MemPalace BENCHMARKS.md |
| 10 | Graph lane | first-class (label / prop / adjacency) | KG with timeline + cross-wing tunnels | MemPalace MCP tools |
| 11 | MCP surface | 18 tools | 29 tools | MemPalace README “MCP server” |
| 12 | Plugin scaffolds | mnem mcp + mnem integrate | .claude-plugin/, .codex-plugin/ in repo | MemPalace repo |
| 13 | Bindings | Rust + Python + TS + HTTP + CLI + MCP | Python + MCP | MemPalace README |
| 14 | Hosted product | none | none | n/a |
| 15 | Velocity | maturing 1.0 | 433 commits in first 12 days, 30 contributors (early 2026) | internal notes; verify on repo today |

Benchmarks (where comparable)

MemPalace publishes retrieval R@5 / R@10 numbers on the same family of benchmarks that mnem’s harness covers. We pulled their numbers from benchmarks/BENCHMARKS.md and ran ours on the same datasets and embedder weights:

| Benchmark | Split | Metric | MemPalace | mnem | Delta |
|---|---|---|---|---|---|
| LongMemEval | 500 Q | R@5 session, raw dense | 0.966 | 0.966 | 0 |
| LongMemEval | 500 Q | R@10 session, raw dense | 0.982 | 0.982 | 0 |
| LongMemEval | 500 Q hybrid-v4 | R@5 session | 0.982 | $\color{red}{\textbf{0.976}}$ | -0.006 |
| LoCoMo | 1986 Q | R@5 session, raw dense | 0.508 | $\color{green}{\textbf{0.726}}$ | +0.218 |
| LoCoMo | 1986 Q | R@10 session, raw dense | 0.603 | $\color{green}{\textbf{0.855}}$ | +0.252 |
| ConvoMem | 250 Q | Avg recall | 0.890 | $\color{green}{\textbf{0.976}}$ | +0.086 |
| MemBench | 100 Q (movie) | R@5 | 0.950 | $\color{green}{\textbf{1.000}}$ | +0.050 |
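
For readers unfamiliar with the metric, here is a minimal sketch of session-level R@k. It assumes the "hit if any gold session appears in the top k" convention; the benchmark harnesses may define recall differently (for example, as the fraction of gold sessions retrieved). The function names are illustrative and are not taken from either repo.

```python
from typing import Iterable, Sequence, Set, Tuple


def recall_at_k(ranked_session_ids: Sequence[str],
                gold_session_ids: Set[str],
                k: int) -> float:
    """Return 1.0 if any gold session is among the top-k results, else 0.0.

    Assumption: "any hit" recall. This is one common convention, not
    necessarily the one used by either harness.
    """
    top_k = ranked_session_ids[:k]
    return 1.0 if any(sid in gold_session_ids for sid in top_k) else 0.0


def mean_recall_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]],
                     k: int) -> float:
    """Average per-question recall over (ranking, gold set) pairs."""
    scores = [recall_at_k(ranking, gold, k) for ranking, gold in runs]
    return sum(scores) / len(scores)
```

The Delta column is then just the difference of two such means, e.g. 0.726 - 0.508 = +0.218 for LoCoMo R@5.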

Method: identical MiniLM-L6-v2 ONNX weights, no reranker, no LLM, no lexical lane on the raw-dense rows. The LoCoMo gap comes from mnem’s adapter aggregating user-turn text per session before embedding; MemPalace’s adapter embeds at a finer grain. Mechanism, not magic.
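
The granularity difference can be sketched as two ways of turning the same conversation into documents for the embedder. This is an illustration only: the `Turn` shape and both function names are hypothetical, and "one document per turn" is an assumed stand-in for the finer grain, since the text above does not say exactly what unit MemPalace's adapter embeds.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical turn shape: (session_id, speaker, text)
Turn = Tuple[str, str, str]


def per_turn_documents(turns: List[Turn]) -> List[Tuple[str, str]]:
    """Finer grain (assumed): one document per turn."""
    return [(f"{sid}#{i}", text) for i, (sid, _speaker, text) in enumerate(turns)]


def per_session_documents(turns: List[Turn],
                          speaker: str = "user") -> List[Tuple[str, str]]:
    """Coarser grain: concatenate one speaker's text per session,
    so each session yields a single document to embed."""
    by_session: Dict[str, List[str]] = defaultdict(list)
    for sid, who, text in turns:
        if who == speaker:
            by_session[sid].append(text)
    return [(sid, "\n".join(texts)) for sid, texts in by_session.items()]
```

With session-level documents, a session-level R@k only has to rank one vector per session, which is one plausible reason the two adapters score differently on the same embedder.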

MemPalace tunes hybrid-v4 on dev splits, so the held-out 98.4% they report is the honest figure to compare against.

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM ONNX | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, MiniLM ONNX | 333 ms mean retrieve |
| MemPalace | LongMemEval, raw dense | not headlined; ChromaDB-default latency |

MemPalace does not publish a single mean-latency number; their benchmark tables focus on accuracy.

Architecture differences

MemPalace stores conversation history verbatim in ChromaDB and indexes people / projects as wings, topics as rooms, flows as halls, content as drawers. The retrieval layer is pluggable behind mempalace/backends/base.py. A SQLite-backed knowledge graph adds valid_from / valid_to windows, an invalidate verb, and a timeline view. The MCP server exposes 29 tools including agent diaries and cross-wing tunnels. The product is opinionated: the palace metaphor is the user experience.
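
To make the valid_from / valid_to idea concrete, here is a minimal sketch of a validity-window store on SQLite. It is not MemPalace's schema or API: the `facts` table, its columns other than valid_from / valid_to, and the function names are all hypothetical. The point is only that invalidating a fact closes its window instead of deleting the row, which is what makes a timeline view possible.

```python
import sqlite3


def open_kg() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical facts table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE facts (subject TEXT, predicate TEXT, object TEXT,"
        " valid_from TEXT NOT NULL, valid_to TEXT)"
    )
    return db


def add_fact(db, subject, predicate, obj, valid_from):
    # An open-ended fact has valid_to = NULL.
    db.execute("INSERT INTO facts VALUES (?, ?, ?, ?, NULL)",
               (subject, predicate, obj, valid_from))


def invalidate(db, subject, predicate, obj, ended):
    # Close the window rather than delete, so history is preserved.
    db.execute(
        "UPDATE facts SET valid_to = ? WHERE subject = ? AND predicate = ?"
        " AND object = ? AND valid_to IS NULL",
        (ended, subject, predicate, obj),
    )


def facts_at(db, subject, as_of):
    """Facts about `subject` whose window contains the ISO date `as_of`."""
    rows = db.execute(
        "SELECT predicate, object FROM facts WHERE subject = ?"
        " AND valid_from <= ? AND (valid_to IS NULL OR valid_to > ?)"
        " ORDER BY valid_from",
        (subject, as_of, as_of),
    )
    return rows.fetchall()
```

ISO-8601 date strings compare correctly as text, which is why plain `<=` and `>` work here.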

mnem ships no metaphor. Nodes and edges are open-schema; you commit whatever shape your application needs. Identity is a CID over canonical DAG-CBOR + BLAKE3, so identical content collapses to the same node across machines. History is a signed commit DAG with diff / log / branch / 3-way merge. Retrieval is 3-lane RRF (HNSW dense + BM25/SPLADE sparse + graph traversal) with first-class token-budget telemetry on every response. mnem-core is no-tokio / no-fs / no-net and compiles to WASM unchanged.
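
Two of the mechanisms above, sketched in Python under loud assumptions. This is not mnem's code. `content_id` uses sorted-key JSON as a stand-in for canonical DAG-CBOR and BLAKE2b as a stand-in for BLAKE3 (neither DAG-CBOR nor BLAKE3 is in the Python standard library), so its output is not a real CID. `rrf_fuse` is textbook reciprocal rank fusion with the conventional constant k = 60; the constant mnem actually uses is not stated here.

```python
import hashlib
import json
from collections import defaultdict
from typing import Dict, List, Sequence


def content_id(node: dict) -> str:
    """Derive an identifier from content alone.

    Stand-ins (assumptions): canonical JSON instead of DAG-CBOR,
    BLAKE2b instead of BLAKE3. Same content gives the same id on any machine.
    """
    canonical = json.dumps(
        node, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.blake2b(canonical, digest_size=32).hexdigest()


def rrf_fuse(lanes: Sequence[List[str]], k: int = 60) -> List[str]:
    """Reciprocal rank fusion: score(d) = sum over lanes of 1 / (k + rank)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in lanes:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Usage: three lanes (dense, sparse, graph) each return a ranked id list.
fused = rrf_fuse([["a", "b", "c"], ["b", "a", "d"], ["b", "c"]])
```

Because RRF works on ranks rather than raw scores, the lanes do not need comparable score scales, which is what makes it a natural fit for fusing dense, sparse, and graph results.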

Where MemPalace clearly wins

  • Verbatim store with measured 96.6% R@5 on LongMemEval, no API key. Same as mnem on raw dense, and reproducible from their repo.
  • MCP breadth. 29 tools to mnem’s 18. Agent diaries and cross-wing tunnels are original ideas.
  • Plugin scaffolds in-repo. .claude-plugin/ and .codex-plugin/ lower install friction for Claude Code / Codex users.
  • Velocity and community. Hundreds of commits, dozens of contributors, rapid issue response.
  • Reproducibility culture. Per-question JSONL result files committed for every benchmark run.
  • Working temporal KG. valid_from / valid_to / invalidate / timeline shipped today.

Where mnem clearly wins

  • Open schema. No fixed wings/rooms/halls/drawers hierarchy. Use any labels and properties for any domain.
  • Content-addressed identity. Same fact = same CID across machines. Stable citations forever.
  • Real commit DAG. Branch, diff, 3-way merge, signed Ed25519 history. MemPalace stores facts and a timeline; mnem stores commits over a graph.
  • WASM target. Same retrieval logic in browsers, Workers, Lambda. A Python + ChromaDB stack cannot target these.
  • Retrieval-quality lead on LoCoMo. +0.218 R@5 raw dense, same embedder.
  • Token-budget telemetry. tokens_used, candidates_seen, dropped returned on every retrieve.
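
A minimal sketch of what token-budget telemetry on a retrieve response could look like. Only the field names tokens_used, candidates_seen, and dropped come from the list above; everything else is an assumption, including the greedy packing strategy, the whitespace token count standing in for a real tokenizer, and `dropped` being a count rather than a list.

```python
from typing import Dict, List, Tuple


def pack_within_budget(candidates: List[Tuple[str, str]],
                       budget: int) -> Dict[str, object]:
    """Greedily keep ranked (id, text) candidates that fit the token budget.

    Assumptions: whitespace-split token counts, and candidates that do not
    fit are skipped rather than truncated.
    """
    kept: List[str] = []
    tokens_used = 0
    for doc_id, text in candidates:
        cost = len(text.split())  # crude stand-in for a real tokenizer
        if tokens_used + cost > budget:
            continue  # does not fit; counted under "dropped"
        kept.append(doc_id)
        tokens_used += cost
    return {
        "results": kept,
        "tokens_used": tokens_used,
        "candidates_seen": len(candidates),
        "dropped": len(candidates) - len(kept),
    }
```

Returning these counters with every response lets a caller see how much context was spent and how much was left on the floor, without re-tokenizing the results.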

When to pick MemPalace, when to pick mnem

Pick MemPalace if: the wings / rooms / halls / drawers metaphor matches your domain, you want the largest MCP tool surface available, or you specifically want a Claude-Code-paired personal memory appliance with reproducible benchmark numbers today.

Pick mnem if: you want an open-schema substrate, you need content-addressing and a real commit DAG, you are shipping to multiple languages or to the edge / WASM, or you want token-budget telemetry as a first-class response field.

Sources