Introduction

mnem is a knowledge-graph substrate. It stores nodes as content-addressed objects, retrieves them with vector + sparse + graph signals, and exposes the result over CLI, HTTP, and MCP surfaces.

What it does

Content-addressed nodes - every node has a CID; identical content collapses to one node.
Versioned commits - every change is a commit with a parent chain (Git-style for graphs).
Hybrid retrieval - vector (HNSW), sparse (BM25 / SPLADE), and graph traversal in one query.
In-process embedder - bundled ONNX MiniLM-L6-v2 (no Ollama / API keys required).
MCP-native - drop-in memory layer for Claude / Cursor / any MCP client.
WASM target - same core compiles to wasm32 for in-browser use.

What it is not

A vector database (it’s a graph; vectors are one signal among several).
An LLM (mnem holds memory; the LLM uses it).
A finished product. 0.1.0 is the first public cut.

Where to next

Install - single command per platform.
Quickstart - five minutes from zero to retrieve.
Core concepts - what’s a CID, what’s a commit, what’s a label.

Install

mnem ships a single mnem binary plus optional Python and HTTP daemons. Pick the source that matches your platform.

From Cargo (any platform with Rust toolchain)

cargo install --locked mnem-cli
mnem --version

Requires Rust 1.95+ (see rust-toolchain.toml).

From npm (Node.js users)

npm install -g mnem-cli
mnem --version

# or one-shot via npx
npx mnem-cli --version

Downloads the prebuilt native binary for your platform at install time. Node 18+ required. No Rust toolchain needed.

From PyPI (Python users)

pip install mnem-cli
mnem --version

The PyPI package ships the same mnem binary as a manylinux / macOS / Windows wheel.

From a release binary

Download the platform tarball from the latest GitHub release:

curl -L https://github.com/Uranid/mnem/releases/latest/download/mnem-linux-x86_64.tar.gz | tar xz
sudo mv mnem /usr/local/bin/
mnem --version

Replace linux-x86_64 with linux-aarch64, macos-x86_64, or macos-aarch64 as appropriate. The Windows build is published as windows-x86_64.zip, so unpack it with a zip tool instead of piping to tar xz.

After v0.2.0, mnem ships only via Cargo, PyPI, and npm. The Homebrew tap, AUR, Nix, winget, and scoop channels have been dropped in favour of a lean three-channel model (cargo / PyPI / npm). The Cargo channel supports the bundled-embedder, bundled-embedder-cuda, and bundled-embedder-directml feature flags.

macOS / Linux / Windows

# npm (Node 18+, no Rust toolchain needed)
npm install -g mnem-cli

# Cargo (any platform with Rust 1.95+)
cargo install --locked mnem-cli --features bundled-embedder

# or via cargo-binstall (faster, downloads prebuilt)
cargo binstall mnem-cli

# PyPI (Python users)
pip install mnem-cli

Docker

docker run --rm -p 9876:9876 ghcr.io/uranid/mnem:latest http serve

WASM (in-browser)

cargo build --release --target wasm32-unknown-unknown -p mnem-core

See crates/mnem-core/README.md for embedding examples.

Verify

mnem --version
mnem doctor

mnem doctor probes embedder, store, and config - useful first command after install.

Quickstart

Five minutes from zero to retrieve.

1. Install

cargo install --locked mnem-cli

(See Install for other platforms.)

2. Initialise a repo

mkdir my-graph && cd my-graph
mnem init

This creates .mnem/ with default config (in-process MiniLM embedder, redb store).

3. Ingest

mnem ingest README.md
mnem ingest docs/*.md
mnem ingest <(echo '{"text": "the cat sat on the mat", "label": "demo"}') --json

4. Retrieve

mnem retrieve "what does this project do"
mnem retrieve "what is X" --label demo --limit 5

5. Serve over HTTP (optional)

mnem http serve --repo .        # bind 127.0.0.1:9876
curl http://127.0.0.1:9876/v1/retrieve -d '{"text": "what does this do"}'
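
The same call can be made from a script. A minimal Python sketch using only the standard library, assuming the endpoint accepts a JSON body with a text field as in the curl example; the response shape is not documented here, so the sketch returns the parsed JSON untouched. The function names are illustrative, not part of mnem.

```python
import json
import urllib.request

def build_retrieve_request(text, base_url="http://127.0.0.1:9876"):
    """Build a POST request mirroring the curl example."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/retrieve",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def retrieve(text, base_url="http://127.0.0.1:9876", timeout=10.0):
    """Send the request; needs `mnem http serve` running on base_url."""
    request = build_retrieve_request(text, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the previous step running, retrieve("what does this do") returns whatever JSON the endpoint sends back.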

6. Wire into Claude / Cursor (optional)

mnem integrate

Adds an MCP server entry to your client config; subsequent agent turns can call mnem_retrieve and mnem_ingest natively.

Next steps

CLI reference for every flag.
MCP server for agent integrations.
Retrieval tuning for top-K, hybrid, and graph traversal options.

CLI reference

mnem is the single entry point. Subcommands wrap repo operations.

Common subcommands

mnem init [path]                     # create .mnem/ in path (default: cwd)
mnem ingest <file|-> [...]           # add nodes from file or stdin
mnem retrieve <text> [...]           # query (vector + sparse + graph)
mnem mcp                             # start the MCP JSON-RPC server over stdio
mnem mcp --repo ~/notes              # point the MCP server at a specific graph
mnem http serve                      # start the HTTP JSON API (loopback by default)
mnem integrate                       # wire as MCP server in your agent host
mnem doctor                          # probe embedder + store + config

Inspection

mnem stats                # commits, nodes, embeddings, store size
mnem log [-n N]           # commit history
mnem cat-file <cid>       # dump a node by CID
mnem diff <cid> <cid>     # diff two commits
mnem export               # export as CAR archive

Advanced retrieve flags

--limit N                 # number of items to return (default 10); short: -n
--vector-cap N            # candidate pool from vector lane (default 256)
--graph-expand N          # multi-hop expansion budget
--graph-mode <decay|ppr>  # graph scoring: decay (default) or PPR
--rerank <provider:model> # post-rerank with a model
--summarize               # add community summarization layer
--community-filter        # Leiden community filter; drop low-coverage communities
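
For readers unfamiliar with the ppr graph mode: PPR is personalized PageRank, a random walk that restarts at a set of seed nodes. The sketch below is the textbook power iteration, not mnem's implementation; how mnem picks seeds, the restart probability (alpha here), and the iteration count are all assumptions.

```python
def personalized_pagerank(edges, seeds, alpha=0.15, iterations=50):
    """Textbook PPR by power iteration. `seeds` must be non-empty."""
    seeds = set(seeds)
    nodes = set(edges) | {v for outs in edges.values() for v in outs} | seeds
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Every step, a fraction alpha of the walk restarts at the seeds.
        nxt = {n: alpha * restart[n] for n in nodes}
        for u in nodes:
            outs = edges.get(u, [])
            # A node without out-edges hands its mass back to the seeds.
            targets = outs if outs else list(seeds)
            share = (1.0 - alpha) * rank[u] / len(targets)
            for v in targets:
                nxt[v] += share
        rank = nxt
    return rank

# Toy graph: the seed "a" stands in for a vector-lane hit.
edges = {"a": ["b", "c"], "b": ["c"], "z": ["a"]}
rank = personalized_pagerank(edges, seeds=["a"])
ranked = sorted(rank, key=rank.get, reverse=True)
print(ranked)
```

Nodes reachable from the seed pick up score in proportion to how often the walk visits them; "z" only points at the seed, so it stays at zero.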

Ingest flags

--chunker <auto|paragraph|recursive|session>  # chunking strategy (default: auto)
--extractor keybert                            # enable KeyBERT keyphrase extraction
--max-tokens N                                # token budget per chunk (default: 512)
--recursive                                   # ingest a directory recursively

For complete option lists run mnem <subcommand> --help. Long-form documentation for each subcommand lives in guides.

MCP server

mnem implements the Model Context Protocol over stdio. Drop it into any MCP client (Claude Desktop, Cursor, Zed, custom).

Install

mnem integrate              # auto-detect installed hosts and wire everything
mnem integrate claude-code  # wire a specific host

For manual registration in any MCP client:

{
  "mcpServers": {
    "mnem": {
      "command": "mnem",
      "args": ["mcp", "--repo", "/path/to/your-graph"]
    }
  }
}
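
If the client already has other servers registered, the entry has to be merged rather than pasted over the file. A small Python sketch of that merge, assuming the client keeps its servers under a top-level mcpServers key as in the snippet above; where the config file lives differs per client and is not covered here.

```python
import json

def add_mnem_server(config, repo_path):
    """Return a copy of an MCP client config with a "mnem" entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["mnem"] = {"command": "mnem", "args": ["mcp", "--repo", repo_path]}
    merged["mcpServers"] = servers
    return merged

# "other" is a made-up pre-existing entry, kept intact by the merge.
existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
updated = add_mnem_server(existing, "/path/to/your-graph")
print(json.dumps(updated, indent=2))
```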

Tools exposed

Tool	Purpose
`mnem_stats`	Repo overview: op-head, commit count, label list, embedder health
`mnem_schema`	List every node label and edge label in the current commit
`mnem_search`	Exact property-match search with optional outgoing-edge expansion
`mnem_get_node`	Fetch a single node by UUID (full props + content)
`mnem_traverse`	One-hop neighbour walk from a start node via named edge labels
`mnem_list_nodes`	Enumerate nodes at head, optionally filtered by label
`mnem_retrieve`	Hybrid retrieval: vector + sparse + graph, fused via RRF
`mnem_commit`	Add nodes and/or edges as a single commit
`mnem_commit_relation`	Resolve-or-create subject + object + edge in one call
`mnem_resolve_or_create`	Find-or-create a node by a primary-key property
`mnem_recent`	Walk the op-log backwards (last N operations)
`mnem_vector_search`	Cosine nearest-neighbour search over stored embeddings
`mnem_delete_node`	Hard-remove a node from the current head
`mnem_tombstone_node`	Soft-delete (forget) a node; subsequent retrieves exclude it
`mnem_ingest`	Ingest a file or inline text as Doc + Chunk + Entity subgraph
`mnem_global_retrieve`	Semantic search on the global graph (`~/.mnemglobal/.mnem/`) only
`mnem_global_ingest`	Ingest a file or inline text into the global graph
`mnem_global_add`	Write nodes/edges directly to the global graph
`mnem_community_summarize`	Extractive centroid + MMR summarizer over a set of node UUIDs (`summarize` feature)

Notes

The server runs in-process - no separate daemon, no port to manage.
Embedder is bundled (MiniLM-L6-v2, ONNX). No network calls unless you wire one.
Local vs global: mnem_retrieve searches the repo the server is pointed at. mnem_global_retrieve always searches ~/.mnemglobal/.mnem/ regardless of --repo.
For the full field-level schema of each tool, run mnem mcp --list-tools or inspect crates/mnem-mcp/src/tools/descriptions.rs.

Core concepts

Three primitives. Everything else is composed from these.

Node

A node is content + metadata, addressed by its CID (content identifier derived from a hash of canonical bytes). Two nodes with identical content collapse to one CID. Nodes carry:

text - the unit of content (a sentence, a chunk, a fact)
label - string scope; queries can filter to a label
metadata - opaque JSON map for caller-defined tags

The embedding lives in a per-commit sidecar bucket, not on the node, so two nodes with the same text but different embedders share one CID.
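
A toy model of that behaviour in Python. Assumptions: sorted-key JSON and BLAKE2b stand in for the real canonical encoding and hash, and the sketch hashes text, label, and metadata together, which may not match what mnem actually feeds into the CID.

```python
import hashlib
import json

def node_cid(text, label=None, metadata=None):
    """Hash a canonical encoding of the node (stand-in encoding and hash)."""
    canonical = json.dumps(
        {"text": text, "label": label, "metadata": metadata or {}},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.blake2b(canonical, digest_size=32).hexdigest()

nodes = {}     # CID -> node content
sidecars = {}  # embedder id -> {CID -> embedding}, outside the hashed bytes

def put(text, label=None, metadata=None):
    cid = node_cid(text, label, metadata)
    # Identical content maps to the same key, so it is stored once.
    nodes.setdefault(cid, {"text": text, "label": label, "metadata": metadata or {}})
    return cid

a = put("the cat sat on the mat", label="demo")
b = put("the cat sat on the mat", label="demo")
c = put("the cat sat on the hat", label="demo")

# Two embedders attach different vectors to the same CID.
sidecars.setdefault("embedder-a", {})[a] = [0.1, 0.2]
sidecars.setdefault("embedder-b", {})[a] = [0.7, 0.1, 0.3]

print(a == b, a == c, len(nodes))  # True False 2
```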

Commit

A commit is a snapshot of the graph at a point in time. Every ingest, every edit, every tombstone produces a new commit. Commits chain by parent CID; the head commit is the working tree’s “current state”. Older commits are immutable and reachable.
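
The parent chain can be pictured with a few lines of Python. This is a sketch of the idea only: the Commit fields, the JSON encoding, and BLAKE2b are illustrative stand-ins, not mnem's commit format.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Commit:
    parent: Optional[str]    # CID of the parent commit, None for the root
    nodes: Tuple[str, ...]   # node CIDs present in this snapshot
    message: str

commits = {}  # commit CID -> Commit; entries are never rewritten

def commit(parent, nodes, message):
    c = Commit(parent, tuple(sorted(nodes)), message)
    payload = json.dumps(
        {"parent": c.parent, "nodes": list(c.nodes), "message": c.message},
        sort_keys=True,
    ).encode("utf-8")
    cid = hashlib.blake2b(payload, digest_size=32).hexdigest()
    commits[cid] = c
    return cid

def log(head):
    """Walk parent links from head back to the root commit."""
    history = []
    while head is not None:
        history.append(commits[head].message)
        head = commits[head].parent
    return history

root = commit(None, {"n1"}, "ingest README.md")
second = commit(root, {"n1", "n2"}, "ingest docs")
head = commit(second, {"n2"}, "tombstone n1")
print(log(head))
```

The tombstone produces a new head without n1, while the root commit still lists it, which is what "immutable and reachable" means in practice.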

Label

A label is an opt-in namespace string attached to nodes at ingest time. Used for:

per-user / per-conversation isolation in agent memory
bench harness scoping (per-question, per-document)
coarse multi-tenancy

A query without a label sees the whole repo; a query with a label sees only nodes carrying that label.
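
In Python terms the visibility rule is a single filter. The node dicts below are illustrative, not mnem's storage format.

```python
nodes = [
    {"text": "prefers dark mode", "label": "user-42"},
    {"text": "lives in Berlin", "label": "user-7"},
    {"text": "project overview", "label": None},
]

def visible(nodes, label=None):
    """No label: the whole repo. With a label: only nodes carrying it."""
    if label is None:
        return list(nodes)
    return [n for n in nodes if n["label"] == label]

print(len(visible(nodes)))
print([n["text"] for n in visible(nodes, "user-42")])
```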

Retrieval lanes

A retrieve call can fan out across up to three lanes and fuse the results:

Vector - HNSW over the per-commit sidecar embeddings
Sparse - BM25 / SPLADE (optional, feature-gated)
Graph - n-hop traversal over authored edges, optionally PPR-scored

Lanes are configurable. Vector-only is the default and is what the 0.1.0 benchmarks measure.
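
The MCP tool table describes the fusion step as RRF (reciprocal rank fusion). A minimal Python sketch of standard RRF over per-lane rankings follows; k = 60 is the conventional constant, and whether mnem uses that value, or weights lanes, is an assumption.

```python
from collections import defaultdict

def rrf_fuse(lanes, k=60):
    """Fuse ranked lists: each lane adds 1 / (k + rank) to a node's score."""
    scores = defaultdict(float)
    for ranking in lanes.values():
        for rank, node_id in enumerate(ranking, start=1):
            scores[node_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda n: (-scores[n], n))

lanes = {
    "vector": ["a", "b", "c"],
    "sparse": ["b", "d", "a"],
    "graph": ["b", "e"],
}
fused = rrf_fuse(lanes)
print(fused)  # ['b', 'a', 'd', 'e', 'c']
```

RRF only looks at ranks, so lanes with incomparable raw scores (cosine similarity, BM25, graph scores) can be combined without normalisation.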

Configuration

mnem reads config from three sources, in priority order:

Environment variables - MNEM_* (highest precedence)
Per-repo config - <repo>/.mnem/config.toml
User-global config - ~/.mnem/config.toml
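
The precedence rule amounts to a layered merge. A Python sketch, assuming each MNEM_* variable maps onto one config key; the mapping table here is illustrative and covers only three of the documented variables.

```python
# Assumption: illustrative variable-to-key mapping, not mnem's actual table.
ENV_KEYS = {
    "MNEM_EMBED_PROVIDER": ("embed", "provider"),
    "MNEM_EMBED_MODEL": ("embed", "model"),
    "MNEM_EMBED_BASE_URL": ("embed", "base_url"),
}

def resolve_config(user_cfg, repo_cfg, env):
    """Merge layers lowest precedence first: user-global, per-repo, env."""
    merged = {}
    for layer in (user_cfg, repo_cfg):
        for section, values in layer.items():
            merged.setdefault(section, {}).update(values)
    for var, (section, key) in ENV_KEYS.items():
        if var in env:
            merged.setdefault(section, {})[key] = env[var]
    return merged

user_cfg = {"embed": {"provider": "ollama", "model": "bge-large"}, "retrieve": {"top_k": 20}}
repo_cfg = {"embed": {"provider": "onnx", "model": "all-MiniLM-L6-v2"}}
env = {"MNEM_EMBED_PROVIDER": "mock"}

config = resolve_config(user_cfg, repo_cfg, env)
print(config)
```

Here the environment wins for the provider, the repo config wins for the model, and the user-global top_k survives because nothing above it overrides that key.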

Defaults

# .mnem/config.toml
[embed]
provider = "onnx"
model = "all-MiniLM-L6-v2"

[store]
backend = "redb"        # "redb" | "in-memory"

[retrieve]
top_k = 10
vector_cap = 256

Common environment overrides

Variable	Effect
`MNEM_EMBED_PROVIDER`	`onnx` / `ollama` / `openai` / `mock`
`MNEM_EMBED_MODEL`	model name (e.g. `all-MiniLM-L6-v2`)
`MNEM_EMBED_BASE_URL`	for `ollama` / `openai` providers
`MNEM_EMBED_API_KEY_ENV`	name of env var holding the API key
`MNEM_ORT_INTRA_THREADS`	pin ONNX runtime thread count (bench harness)
`MNEM_BENCH`	enable bench-only label scoping
`MNEM_HTTP_ALLOW_NON_LOOPBACK`	allow `mnem http` to bind 0.0.0.0 (Docker)

Provider switching

Embedder, sparse encoder, reranker, and LLM are all configured via provider:model strings - no code change to switch from local ONNX to hosted Cohere.

[embed]
provider = "cohere"
model = "embed-english-v3.0"
api_key_env = "COHERE_API_KEY"

See Embedding providers for the full provider matrix.

Methodology

Every published number ships with the harness, the dataset hash, and the raw artifacts. If you cannot reproduce a number, that is a bug.

Dataset matrix

Dataset	Version	n queries	Source
LongMemEval	`longmemeval_s_cleaned.json`	500	xiaowu0162/longmemeval-cleaned
LoCoMo	`locomo10.json`	1986 (session-level)	snap-research/LoCoMo
ConvoMem	5 cat × 50 items (250)	250	Salesforce/ConvoMem
MemBench simple/roles	100 items	100	import-myself/Membench
MemBench highlevel/movie	100 items	100	import-myself/Membench

Embedder

ONNX MiniLM-L6-v2 (sentence-transformers/all-MiniLM-L6-v2 via Xenova/all-MiniLM-L6-v2), bundled in-process via the onnx-bundled feature. No network calls, no API keys, no per-call model load.

Hardware

Pinned 4 cores per lane (cpuset 0-3 / 4-7 / 8-11 / 12-15), MNEM_ORT_INTRA_THREADS=4, mem cap 3 GiB per lane. Bench host is documented per run in benchmarks/results/.

Scoring

Metric	Definition
R@K	hit if any gold item is in top-K retrieved
avg recall	mean per-item recall (ConvoMem)
Hybrid v4	dense + sparse score boost (mirrors the MemPalace harness helper)
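
The first two rows can be written down directly. A Python sketch, not the code in benchmarks/harness/; it reads "per-item recall" as the fraction of an item's gold set found in the top-K, which is an assumption about the harness.

```python
def hit_at_k(retrieved, gold, k):
    """R@K for one query: 1.0 if any gold item is in the top-K, else 0.0."""
    return 1.0 if any(item in gold for item in retrieved[:k]) else 0.0

def item_recall(retrieved, gold, k):
    """Per-item recall: fraction of the gold set found in the top-K."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(gold & set(retrieved[:k])) / len(gold)

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# (retrieved ranking, gold set) per query; ids are made up.
runs = [
    (["s3", "s1", "s9"], {"s1"}),
    (["s4", "s5", "s6"], {"s2"}),
    (["s7", "s2", "s8"], {"s2", "s7", "s0"}),
]

r_at_2 = mean(hit_at_k(r, g, 2) for r, g in runs)
avg_recall_at_3 = mean(item_recall(r, g, 3) for r, g in runs)
print(round(r_at_2, 3), round(avg_recall_at_3, 3))  # 0.667 0.556
```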

Apples-to-apples pledge

Same dataset version, same query count.
Same scoring code (benchmarks/harness/).
No secret post-filters, no LLM rerank in the headline numbers.
Latency reported alongside recall, not separately.

Reproduce in 1 command

bash benchmarks/harness/run_bench.sh

See Reproduce for the full step-by-step.

Reproduce

End-to-end recipe to regenerate the 0.1.0 benchmark numbers locally.

Prerequisites

Docker 24+ (or podman with compose plugin)
16 cores recommended, 8 cores minimum
16 GiB RAM
Datasets downloaded:

bash benchmarks/harness/download-datasets.sh

One-shot run

bash benchmarks/harness/run_bench.sh

Wall ETA: 30-50 min on a 16-core box. Output: benchmarks/results/<UTC-stamp>/.

What happens

Build Docker image (release, FEATURES=onnx-bundled).
Bring up 4 lanes with cpuset pinning + thread caps.
Run 6 benches (LongMemEval, LoCoMo, ConvoMem, MemBench × 2, Hybrid v4) sequentially across the lanes via a token-bucket dispatcher.
Render RESULTS.md from per-bench JSONs.

Per-bench manual run

docker compose -f benchmarks/harness/compose.yml up -d mnem-bench-1

python benchmarks/harness/adapters/longmemeval_session.py \
    --dataset benchmarks/datasets/longmemeval/longmemeval_s_cleaned.json \
    mnem http serve --bind 127.0.0.1:9876 \
    --limit 500 --top-k 10 \
    --out benchmarks/results/longmemeval-500q.json

docker compose -f benchmarks/harness/compose.yml down

Verify against shipped numbers

python benchmarks/harness/comparison_table.py \
    --results benchmarks/results/<UTC-stamp> \
    --out /tmp/RESULTS.md
diff /tmp/RESULTS.md benchmarks/results/RESULTS.md

If your numbers diverge by more than ±0.01 on recall, open an issue with the host spec and the bench logs.
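
If you would rather check the tolerance programmatically than read a diff, the rule is a few lines of Python. The metric names and the flat dict layout are hypothetical; the per-bench JSON files may be shaped differently.

```python
def diverging(shipped, local, tol=0.01):
    """Return metrics that are missing locally or differ by more than tol."""
    out = {}
    for name, expected in shipped.items():
        got = local.get(name)
        if got is None or abs(got - expected) > tol:
            out[name] = (expected, got)
    return out

# Made-up local numbers for illustration only.
shipped = {"longmemeval_r5": 0.966, "locomo_r5": 0.726}
local = {"longmemeval_r5": 0.964, "locomo_r5": 0.701}
print(diverging(shipped, local))
```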

Run benchmarks locally with `mnem bench`

mnem bench is the 0.1.0 first-class entrypoint for running mnem against published memory benchmarks. It replaces the legacy bash benchmarks/harness/run_bench.sh flow as the default; the Bash harness stays around for reproducing the headline numbers from the project README until 0.2.0 wires the same set of embedders into mnem bench.

Quickstart

# 1. Interactive setup wizard (lists every bench; toggles unshipped
#    options behind [0.2.0] tags so you see what is on the roadmap).
mnem bench

# 2. CI-friendly explicit form.
mnem bench run \
    --benches longmemeval,locomo \
    --with mnem \
    --mode cpu-local \
    --top-k 10 \
    --out ./bench-out \
    --non-interactive

# 3. Cache datasets without running anything (network step isolated
#    so you can pre-warm a CI image).
mnem bench fetch longmemeval         # ~264 MB from HuggingFace
mnem bench fetch locomo              # ~3 MB from snap-research/LoCoMo
mnem bench fetch                     # fetch every shipped bench in one go

# 4. Re-render RESULTS.md from a previous run directory.
mnem bench results ./bench-out

Output layout:

bench-out/
  RESULTS.md             markdown table, one row per (bench, adapter)
  timing.log             per-bench wall-time breakdown
  longmemeval.json       summary
  longmemeval.jsonl      per-question rows
  locomo.json
  locomo.jsonl
  logs/<bench>.log

What ships in 0.1.0

Component	Status	Notes
LongMemEval (per-session)	shipped	R@5 / R@10 over `LmeQs:<qid>` per-question repos.
LoCoMo (session granularity)	shipped	MAX-aggregate dialog scores up to session keys.
mnem cpu-local adapter	shipped	In-process `Repo::open_in_memory` + bag-of-tokens.
ConvoMem	0.2.0	TUI lists; runtime prints “coming 0.2.0” and skips.
MemBench (simple-roles)	0.2.0	Same.
MemBench (highlevel-movie)	0.2.0	Same.
LongMemEval-hybrid-v4	0.2.0	MemPalace v4 hybrid post-filter port.
mem0 adapter	0.2.0	Same.
MempalaceAdapter	0.2.0	Same.
CPU parallel mode	0.2.0	Falls back to `cpu-local` with a stderr note.
Docker compose mode	0.2.0	Same.
ONNX MiniLM / Ollama / OpenAI embedders	0.2.0	Falls back to `bag-of-tokens` with a note.

The bag-of-tokens embedder ships built into mnem-bench. It is deterministic, network-free, and good enough to deliver recall@5 > 0 on the smoke test. It is NOT the embedder we use for the headline R@5 numbers in the project README - those still come from the legacy Bash harness driving Ollama / ONNX MiniLM / OpenAI. 0.2.0 swaps mnem-bench onto the same provider stack so the two harnesses produce identical numbers.

Pre-flight smoke test

cargo run --example smoke -p mnem-bench

Runs a 5-question LongMemEval canary and exits non-zero if recall@5 == 0. Used as the gate for releases of mnem-bench and mnem-cli.

Results

mnem vs MemPalace published numbers. Dense retrieval (vector + top-k); hybrid-v4 row mirrors MemPalace’s harness helper. No LLM rerank.

ONNX MiniLM-L6-v2 (bundled, in-process). 4 cores per lane.

Benchmark	Split	Metric	MP	mnem	Δ vs MP	Latency (ms)
LongMemEval	500 Q (full)	R@5 session	0.966	0.966	±0	711 (retr)
LongMemEval	500 Q (full)	R@10 session	0.982	0.982	±0	711 (retr)
LoCoMo	1986 Q (full)	R@5 session	0.508	$\color{green}{\textbf{0.726}}$	+0.218	333 (retr)
LoCoMo	1986 Q (full)	R@10 session	0.603	$\color{green}{\textbf{0.855}}$	+0.252	333 (retr)
ConvoMem	5 cat × 50 items (250)	avg recall	0.929	$\color{green}{\textbf{0.976}}$	+0.047	398 (retr)
MemBench	simple/roles, 100 items	R@5	0.840	$\color{green}{\textbf{0.960}}$	+0.120	1874 (e2e)
MemBench	highlevel/movie, 100 items	R@5	0.950	$\color{green}{\textbf{1.000}}$	+0.050	491 (e2e)
LongMemEval	500 Q, Hybrid v4	R@5 session	0.982	$\color{red}{\textbf{0.976}}$	-0.006	729 (retr)

(retr) = retrieve-only mean (from summary timing). (e2e) = end-to-end mean (runtime / n) when adapter doesn’t expose phase timing.

Headlines

Matches MemPalace exactly on LongMemEval (0.966 / 0.982).
Beats by +0.218 / +0.252 on LoCoMo session-level retrieval.
Beats by +0.047 on ConvoMem.
Beats by +0.120 / +0.050 on MemBench tasks.
Trails by 0.006 on Hybrid v4 (no LLM rerank).

Raw artifacts

Per-bench JSON + JSONL in benchmarks/results/v0.1.0/. Each artifact carries the question, the gold set, the retrieved top-K, and per-item recall.

Reproduce

See Reproduce. One command:

bash benchmarks/harness/run_bench.sh

Ingest pipeline

mnem ingest is the only path content takes into the graph. The pipeline:

parse -> chunk -> extract -> embed -> commit

Sources

file path (mnem ingest README.md)
glob (mnem ingest 'docs/**/*.md')
stdin (cat data.txt | mnem ingest -)
structured JSON (mnem ingest data.json --json)

Chunking

Default: ~1k-token chunks with sentence-boundary alignment. Override via config:

[ingest]
chunk_size_tokens = 512
chunk_overlap_tokens = 50

Document-aware chunkers exist for code (Tree-sitter) and for Markdown (heading-aware). Auto-detected by file extension.
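
To make the two config knobs concrete, here is a Python sketch of sentence-aligned chunking with overlap. It is not mnem's chunker: whitespace-separated words stand in for tokens, the sentence splitter is a simple regex, and a single sentence longer than the budget is emitted as its own oversized chunk.

```python
import re

def split_sentences(text):
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def n_tokens(sentence):
    return len(sentence.split())  # stand-in tokenizer

def chunk(text, chunk_size_tokens=512, chunk_overlap_tokens=50):
    chunks, current, used = [], [], 0
    for sentence in split_sentences(text):
        size = n_tokens(sentence)
        if current and used + size > chunk_size_tokens:
            chunks.append(" ".join(current))
            # Carry whole trailing sentences forward as overlap.
            carried, carried_tokens = [], 0
            for prev in reversed(current):
                if carried_tokens + n_tokens(prev) > chunk_overlap_tokens:
                    break
                carried.insert(0, prev)
                carried_tokens += n_tokens(prev)
            # Drop the overlap if it would push the next chunk over budget.
            if carried_tokens + size > chunk_size_tokens:
                carried, carried_tokens = [], 0
            current, used = carried, carried_tokens
        current.append(sentence)
        used += size
    if current:
        chunks.append(" ".join(current))
    return chunks

text = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
pieces = chunk(text, chunk_size_tokens=6, chunk_overlap_tokens=3)
for piece in pieces:
    print(piece)
```

Chunks never split a sentence, and each chunk after the first repeats the tail of its predecessor, so a fact that straddles a boundary still appears whole in at least one chunk.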

Extractors

Optional ingest-time enrichment:

Extractor	What it does
`none` (default)	raw text only
`keybert`	KeyBERT keyphrase extraction; phrases stored in node metadata

Enable via flag:

mnem ingest README.md --extractor keybert

Labels

Pass --label <str> to scope the ingested nodes:

mnem ingest user-42-chat.json --label user-42 --json

Subsequent retrieve calls with --label user-42 will see only this scope.

Idempotency

Ingesting the same content twice produces the same CID; the second commit is a no-op (parent points at the same tree). Edit-and-reingest produces a new CID and a child commit.

Embedding providers

mnem decouples embedder from store. Switch providers without re-ingesting.

Built-in providers

Provider	Model	Network?	Notes
`onnx`	`all-MiniLM-L6-v2` (bundled)	no	default; in-process; fastest cold-start
`ollama`	any pulled model	local HTTP	e.g. `bge-large`, `nomic-embed-text`
`openai`	`text-embedding-3-small`/`-large`	yes	needs `OPENAI_API_KEY`
`cohere`	`embed-english-v3.0`	yes	needs `COHERE_API_KEY`
`voyage`	`voyage-3`	yes	needs `VOYAGE_API_KEY`
`mock`	deterministic blake3	no	tests / smoke

Switching

Edit <repo>/.mnem/config.toml:

[embed]
provider = "ollama"
model = "bge-large"
base_url = "http://127.0.0.1:11434"

Or override per-process:

MNEM_EMBED_PROVIDER=ollama MNEM_EMBED_MODEL=bge-large mnem retrieve "..."

After switching, run mnem reindex to regenerate the per-commit embedding sidecar. Node CIDs are unchanged (they don’t carry embeddings); only the sidecar changes.

Sidecar layout

.mnem/
  store.redb              # nodes + commits
  sidecars/
    <embedder-id>/        # one dir per (provider, model) pair
      <commit-cid>.bin    # embedding bucket for that commit

Multiple sidecars co-exist. retrieve picks the sidecar matching the active embedder; if missing, it builds on-demand.
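
The lookup-then-build behaviour can be sketched in Python against the layout above. The embedder id format and the build callback are hypothetical; mnem's real directory naming and bucket encoding are not shown in this document.

```python
import tempfile
from pathlib import Path

def embedder_id(provider, model):
    # Hypothetical id format for the <embedder-id> directory.
    return f"{provider}--{model}".replace("/", "_")

def sidecar_path(repo, provider, model, commit_cid):
    return (Path(repo) / ".mnem" / "sidecars"
            / embedder_id(provider, model) / f"{commit_cid}.bin")

def load_or_build(repo, provider, model, commit_cid, build):
    """Return the sidecar bytes, building the file on first use."""
    path = sidecar_path(repo, provider, model, commit_cid)
    if not path.exists():
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(build())
    return path.read_bytes()

with tempfile.TemporaryDirectory() as repo:
    calls = []

    def build():
        calls.append(1)
        return b"\x00\x01"  # placeholder for an embedding bucket

    load_or_build(repo, "onnx", "all-MiniLM-L6-v2", "commit-cid-1", build)
    load_or_build(repo, "onnx", "all-MiniLM-L6-v2", "commit-cid-1", build)
    load_or_build(repo, "ollama", "bge-large", "commit-cid-1", build)
    build_count = len(calls)
    sidecar_dirs = sorted(
        p.name for p in (Path(repo) / ".mnem" / "sidecars").iterdir()
    )

print(build_count, sidecar_dirs)
```

The second call for the same embedder and commit reuses the file; switching embedder creates a sibling directory and leaves the first one in place.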

Adding a provider

Implement the Embedder trait in mnem-embed-providers/src/<your>.rs, gate it behind a feature flag, and register it in the provider registry. The Embedder trait definition is the contract.

Comparisons

How mnem stacks up against other agent-memory and knowledge-graph systems. Each comparison is honest: where they win, where mnem wins, when to pick which.

mnem is open source (Apache-2.0). Numbers come from public artefacts; where a competitor’s claim is closed-source we say so. Where a benchmark is not directly comparable, we say so rather than fabricate a single-number league table.

Competitor	License	Server / Embedded	LLM at ingest	Bitemporal	Stars	Compare
Graphiti (`getzep/graphiti`)	Apache-2.0	server (Neo4j / Kuzu / FalkorDB / Neptune)	mandatory	yes	25,409	graphiti.md
mem0 (`mem0ai/mem0`)	Apache-2.0	library + cloud	default-on (opt-out)	no	54,113	mem0.md
MemPalace (`MemPalace/mempalace`)	MIT	embedded (Python + ChromaDB)	no	partial	49,768	mempalace.md
Supermemory (`supermemoryai/supermemory`)	MIT (repo) / closed (cloud)	hosted cloud	yes	no	22,218	supermemory.md
Cognee (`topoteretes/cognee`)	Apache-2.0	library + cloud	yes (`cognify`)	no	16,807	cognee.md
Letta (`letta-ai/letta`)	Apache-2.0	server + CLI	yes (agent is the writer)	partial	22,305	letta.md
graphify (`safishamsi/graphify`)	MIT	one-shot CLI	yes (Claude subagents)	no	35,262	graphify.md
mnem	Apache-2.0	embedded + four surfaces	no	no	small / pre-launch	(this repo)

Star counts pulled from the GitHub API on 2026-04-26. License columns reflect the repository SPDX identifier; commercial / hosted layers above some of these projects ship under different terms.

mnem positioning

mnem is the substrate underneath the products in the table: a content-addressed, versioned, hybrid-retrieval graph that runs in-process, ingests without an LLM, and exposes token-budget telemetry on every retrieve. We are not building a memory product; we are building the thing the next memory product is built on.

Reading order

If you have read about agent memory before, the most useful first read is one of:

mnem vs Graphiti if you have been thinking about bitemporal knowledge graphs.
mnem vs mem0 if you have been using the LangChain / LlamaIndex / CrewAI defaults.
mnem vs MemPalace if you care about no-LLM-on-write retrieval and reproducible benchmarks.
mnem vs Supermemory if you have been weighing the closed cloud vs self-host trade-off.
mnem vs Cognee if you have been looking at ECL-pipeline-shaped knowledge engines.
mnem vs Letta if you have been looking at the MemGPT lineage of agent platforms.
mnem vs graphify if you have been using one-shot folder-to-graph extractors.

mnem vs mem0

mem0: “Universal memory layer for AI Agents” (repo description, mem0ai/mem0)
mnem: a content-addressed, versioned graph substrate underneath the memory layer.

At a glance

	mnem	mem0
License	Apache-2.0	Apache-2.0
Stars	small / pre-launch	54,113 (GitHub API, 2026-04-26)
Embedded / Server	embedded	library + optional managed Platform
LLM at ingest	no	yes by default (single-pass ADD-only since v3, Apr 2026); `infer=False` opt-out exists
Content-addressed	yes	no (UUIDs over a vector store)
Bitemporal	no	no (event log, not bitemporal)
WASM target	yes	no (Python + external vector DB)
MCP server	yes	yes (mem0 MCP exists)
Hybrid retrieval	yes (vector + sparse + graph + RRF)	yes (semantic + BM25 + entity matching, fused) since v3
Token-budget retrieval metadata	yes	no
3-way merge	yes	no (event log with add/update/delete)
Reproducible benchmarks in-repo	yes	partial (separate `memory-benchmark` repo)

Feature comparison

#	Dimension	mnem	mem0	Source
1	Data model	open-schema content-addressed nodes + edges	rows in a vector store with `{role, content}` history; `user_id` / `agent_id` / `run_id` scoping	mem0 README “Basic Usage” + docs
2	Default ingest	parse + chunk + statistical extract	LLM (gpt-5-mini default) extracts atomic facts on every `add`	mem0 README “Basic Usage” sha `bd9d27ff509f`
3	LLM requirement	optional	required by default; `infer=False` opts out but loses the “magic”	mem0 v3 README “New Memory Algorithm”
4	Identity	BLAKE3 CID over DAG-CBOR	UUIDs over a vector row	mem0 docs
5	History	signed commit DAG, diff / log / branch / merge	`history` event log of add/update/delete records	mem0 SDK
6	Conflict resolution	3-way merge over graph	“latest LLM extraction wins” before v3; v3 is ADD-only and accumulates	mem0 v3 release notes
7	Vector backends	redb default, pluggable via `Blockstore`	20+ (Qdrant, Chroma, PGVector, Pinecone, Weaviate, etc.)	mem0 docs “Supported Vector Stores”
8	LLM providers	optional, 16 via `mnem-llm-providers`	16+ (OpenAI, Anthropic, Gemini, Groq, Ollama, …)	mem0 docs “Supported LLMs”
9	Embedding model	bundled ONNX MiniLM-L6-v2 in-process	configurable; default OpenAI `text-embedding-3-small`	mem0 README
10	Retrieval lanes	dense (HNSW) + sparse (BM25/SPLADE) + graph + RRF	semantic + BM25 + entity match (v3)	mem0 v3 README
11	Token-budget metadata	first-class on every retrieve	not exposed	mnem CLI / HTTP API
12	Multi-tenancy	repo-per-tenant or scope by node label	hardcoded `user_id` / `agent_id` / `run_id` triple	mem0 SDK
13	Bindings	Rust + Python + HTTP + MCP + CLI	Python + TypeScript + REST + MCP	mem0 README badges
14	Cloud	none yet	“mem0 Platform”: Hobby free, Starter $19, Pro $249, Enterprise	mem0.ai pricing
15	Distribution	pre-launch	YC S24, ~2.6M monthly PyPI downloads	mem0 README badge

Benchmarks (where comparable)

mem0 v3 (April 2026) reports on LoCoMo and LongMemEval as a full pipeline (LLM extract + retrieve + answer). mnem reports retrieval-only (R@K) under an identical embedder, no LLM in the loop.

We have a same-harness, same-embedder reproduction of mem0 with infer=False (LLM extraction off) so the comparison lands on the retrieval layer:

Benchmark	Split	Metric	mem0 (`infer=False`, MiniLM)	mnem	Delta
LongMemEval	500 Q	R@5 session	0.946	$\color{green}{\textbf{0.966}}$	+0.020
LongMemEval	500 Q	R@10 session	0.962	$\color{green}{\textbf{0.982}}$	+0.020
LoCoMo	1986 Q	R@5 session	0.466	$\color{green}{\textbf{0.726}}$	+0.260
LoCoMo	1986 Q	R@10 session	0.676	$\color{green}{\textbf{0.855}}$	+0.179

Adapter notes: infer=False, persistent Memory, per-item user_id scoping. See benchmarks/methodology.md.

mem0’s own v3 numbers (LoCoMo 91.6, LongMemEval 93.4) are full-pipeline end-to-end accuracy, not retrieval R@5; not directly comparable to the table above.

Latency (where measured)

System	Setup	Latency
mnem	LongMemEval 500 Q, MiniLM-L6-v2, embedded redb	711 ms mean retrieve
mnem	LoCoMo 1986 Q, same setup	333 ms mean retrieve
mem0	LongMemEval, v3 single-pass	1.09 s p50 (mem0 README, Apr 2026)
mem0	LoCoMo, v3 single-pass	0.88 s p50 (mem0 README)

mem0 v3 latency includes one LLM retrieval call per query; mnem’s numbers are pure retrieval. Different mechanisms, useful only as an order-of-magnitude check.

Architecture differences

mem0 is a Python (and TS) memory layer designed to drop into LLM apps. The default flow is: mem.add(messages, user_id=...) runs an LLM to extract atomic facts, embeds them into a configured vector store, and returns a UUID per memory. Retrieval (mem.search(...)) does semantic + keyword + entity matching, optionally with a reranker. Multi-tenancy is hardcoded as user_id / agent_id / run_id. mem0 Platform layers a managed cloud, dashboards, and SOC 2 / GDPR on top.

mnem is one layer below: a content-addressed, versioned graph substrate. There is no fixed conversation schema; you commit nodes and edges with whatever labels and properties you need. Identity is a CID over canonical DAG-CBOR + BLAKE3, so the same fact on two machines collapses to the same node. History is a signed commit DAG, not an event log, so old facts remain addressable after newer ones supersede them. The write path runs no LLM by default; ingest is statistical parse + chunk + key extract. Retrieval is 3-lane RRF (HNSW dense + sparse + graph) with token-budget telemetry on every response.

Where mem0 clearly wins

Distribution. ~2.6M monthly PyPI downloads, default memory in LangChain / LlamaIndex / CrewAI / Vercel AI SDK / LiveKit / Pipecat / AWS Bedrock. mem0 is the path of least resistance.
Backend breadth. 20+ vector stores, 16 LLMs, 10 embedders work out of the box.
Managed product. Hobby tier is free; Pro is $249/mo with dashboards, SOC 2, on-prem.
LLM-assisted ingest. mem.add("I met Alice in Berlin") auto-extracts {entity: Alice, city: Berlin} with no upstream modelling effort.
YC + commercial momentum. YC S24, $24M raised, weekly release cadence on v3.

Where mnem clearly wins

No LLM in the write path. Regulated, offline, or cost-sensitive workloads ingest deterministically. mem0 v3 reduced the LLM cost to one call per add but did not eliminate it.
Content-addressed CIDs. Globally stable identity; CID-citations stay reproducible. mem0’s UUIDs are per-instance random.
Versioned history with 3-way merge. Diff / log / branch / merge / signed commits. mem0 ships an event log, not a commit graph.
Embedded + single binary. ~40 MB Docker image, no external vector DB. Runs offline.
WASM target. mnem-core compiles to wasm32; mem0 cannot.
Retrieval-quality lead under identical-embedder conditions. +0.020 R@5 on LongMemEval, +0.260 R@5 on LoCoMo (same MiniLM weights, dense lane only).
Token-budget telemetry. tokens_used / dropped per retrieve.

When to pick mem0, when to pick mnem

Pick mem0 if: you want drop-in agent memory with the broadest LangChain / LlamaIndex / CrewAI footprint, you are happy paying an LLM call per add for “magic” extraction, or you want a managed cloud and dashboards today.

Pick mnem if: you want an embedded substrate with no LLM at ingest, you need content-addressing and a real commit graph, you care about reproducibility and audit, or you are shipping to the edge / WASM / offline.

Sources

mem0 repo, sha bd9d27ff509f on main, 2026-04-26: https://github.com/mem0ai/mem0
mem0 README (“New Memory Algorithm (April 2026)”, “Basic Usage”, “CLI”): https://github.com/mem0ai/mem0/blob/main/README.md
mem0 docs: https://docs.mem0.ai
mem0 evaluation framework: https://github.com/mem0ai/memory-benchmark
mnem benchmarks: /benchmarks/proofs/v0.1.0/
mnem README: /README.md

mnem vs MemPalace

MemPalace: “The best-benchmarked open-source AI memory system. And it’s free.” (repo description, MemPalace/mempalace)
mnem: a content-addressed, versioned graph substrate that shares MemPalace’s no-LLM-on-write philosophy and pushes further on identity and history.

At a glance

	mnem	MemPalace
License	Apache-2.0	MIT
Stars	small / pre-launch	49,768 (GitHub API, 2026-04-26)
Embedded / Server	embedded	embedded (Python + ChromaDB)
LLM at ingest	no	no (verbatim store)
Content-addressed	yes	no (ChromaDB row IDs)
Bitemporal	no	partial (`valid_from` / `valid_to` on KG entries)
WASM target	yes	no
MCP server	yes (18 tools)	yes (29 tools)
Hybrid retrieval	yes (vector + sparse + graph)	yes (semantic + hybrid v4 / v5 with keyword + temporal boost)
Token-budget retrieval metadata	yes	no
3-way merge	yes	no
Reproducible benchmarks in-repo	yes	yes (per-question JSONL committed)

Feature comparison

#	Dimension	mnem	MemPalace	Source
1	Schema	open (any labels / properties)	fixed: wings, rooms, halls, drawers	MemPalace README “What it is” sha `6890948e092b`
2	Storage	redb embedded	ChromaDB + SQLite	MemPalace README + `mempalace/backends/base.py`
3	Default embedder	bundled ONNX MiniLM-L6-v2	ChromaDB default (MiniLM-L6-v2 implied)	MemPalace `requirements`
4	LLM at ingest	none	none	MemPalace README
5	LLM at retrieval	optional rerank	optional hybrid-v4 + LLM rerank tier	MemPalace Benchmarks table
6	Identity	content CID (BLAKE3 over DAG-CBOR)	ChromaDB row IDs	implementation
7	History	signed commit DAG	append-only with `valid_from` / `valid_to`	MemPalace KG section
8	Conflict resolution	3-way merge	manual `invalidate` tool	MemPalace MCP tool list
9	Sparse lane	BM25 + SPLADE	hybrid-v4 keyword boost	MemPalace BENCHMARKS.md
10	Graph lane	first-class (label / prop / adjacency)	KG with timeline + cross-wing tunnels	MemPalace MCP tools
11	MCP surface	18 tools	29 tools	MemPalace README “MCP server”
12	Plugin scaffolds	mnem mcp + `mnem integrate`	`.claude-plugin/`, `.codex-plugin/` in repo	MemPalace repo
13	Bindings	Rust + Python + TS + HTTP + CLI + MCP	Python + MCP	MemPalace README
14	Hosted product	none	none	n/a
15	Velocity	maturing toward 1.0	433 commits in first 12 days, 30 contributors (early 2026)	internal notes; verify on repo today

Benchmarks (where comparable)

MemPalace publishes retrieval R@5 / R@10 numbers in the same family as mnem’s harness. We pulled their numbers from benchmarks/BENCHMARKS.md and ran ours on the same datasets and embedder weights:

Benchmark	Split	Metric	MemPalace	mnem	Delta
LongMemEval	500 Q	R@5 session, raw dense	0.966	0.966	0
LongMemEval	500 Q	R@10 session, raw dense	0.982	0.982	0
LongMemEval	500 Q	R@5 session, hybrid-v4	0.982	0.976	-0.006
LoCoMo	1986 Q	R@5 session, raw dense	0.508	0.726	+0.218
LoCoMo	1986 Q	R@10 session, raw dense	0.603	0.855	+0.252
ConvoMem	250 Q	Avg recall	0.890	0.976	+0.086
MemBench	100 Q (movie)	R@5	0.950	1.000	+0.050

Method: identical MiniLM-L6-v2 ONNX weights, no reranker, no LLM, no lexical lane on the raw-dense rows. The LoCoMo gap comes from mnem’s adapter aggregating user-turn text per session before embedding; MemPalace’s adapter embeds at a finer grain. Mechanism, not magic.
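The granularity difference behind the LoCoMo gap can be illustrated with a minimal sketch. It assumes a conversation is a list of `(session_id, role, text)` tuples; the function names and the tuple layout are hypothetical and do not reflect either project's actual adapter code.

```python
from collections import defaultdict

def session_level_docs(turns):
    """Concatenate user-turn text per session: one document to embed per session."""
    by_session = defaultdict(list)
    for session_id, role, text in turns:
        if role == "user":
            by_session[session_id].append(text)
    return {sid: "\n".join(texts) for sid, texts in by_session.items()}

def turn_level_docs(turns):
    """Finer grain: one document to embed per user turn."""
    return [(sid, text) for sid, role, text in turns if role == "user"]

# Hypothetical sample conversation, for illustration only.
turns = [
    ("s1", "user", "I moved to Lisbon in March."),
    ("s1", "assistant", "Noted."),
    ("s1", "user", "My new office is near the river."),
    ("s2", "user", "Remind me to renew my passport."),
]
session_docs = session_level_docs(turns)
turn_docs = turn_level_docs(turns)
```

With session-level aggregation, a question about session `s1` is matched against one vector that carries both user turns; at turn level the same evidence is spread over two smaller vectors.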

MemPalace’s hybrid-v4 configuration is tuned on dev splits, so the held-out 98.4% they report is the honest figure to compare against.

Latency (where measured)

System	Setup	Latency
mnem	LongMemEval 500 Q, MiniLM ONNX	711 ms mean retrieve
mnem	LoCoMo 1986 Q, MiniLM ONNX	333 ms mean retrieve
MemPalace	LongMemEval, raw dense	not headlined; ChromaDB-default latency

MemPalace does not publish a single mean-latency number; their benchmark tables focus on accuracy.

Architecture differences

MemPalace stores conversation history verbatim in ChromaDB and indexes people / projects as wings, topics as rooms, flows as halls, content as drawers. The retrieval layer is pluggable behind mempalace/backends/base.py. A SQLite-backed knowledge graph adds valid_from / valid_to windows, an invalidate verb, and a timeline view. The MCP server exposes 29 tools including agent diaries and cross-wing tunnels. The product is opinionated: the palace metaphor is the user experience.

mnem ships no metaphor. Nodes and edges are open-schema; you commit whatever shape your application needs. Identity is a CID over canonical DAG-CBOR + BLAKE3, so identical content collapses to the same node across machines. History is a signed commit DAG with diff / log / branch / 3-way merge. Retrieval is 3-lane RRF (HNSW dense + BM25/SPLADE sparse + graph traversal) with first-class token-budget telemetry on every response. mnem-core is no-tokio / no-fs / no-net and compiles to WASM unchanged.
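The fusion step of a 3-lane RRF query can be sketched as below. This is a generic reciprocal rank fusion, not mnem's implementation: the constant `k=60` is the value conventionally used for RRF, and the function name, lane names, and per-lane weights are assumptions made for illustration.

```python
def rrf_fuse(lanes, k=60, weights=None):
    """Reciprocal rank fusion: score(d) = sum over lanes of w / (k + rank), rank from 1."""
    weights = weights or {name: 1.0 for name in lanes}
    scores = {}
    for name, ranked_ids in lanes.items():
        for rank, node_id in enumerate(ranked_ids, start=1):
            scores[node_id] = scores.get(node_id, 0.0) + weights[name] / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda n: (-scores[n], n))

# Hypothetical ranked lists from each lane.
lanes = {
    "dense": ["a", "b", "c"],
    "sparse": ["b", "a", "d"],
    "graph": ["b", "e"],
}
fused = rrf_fuse(lanes)
```

Because RRF works on ranks rather than raw scores, the dense, sparse, and graph lanes do not need comparable score scales.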

Where MemPalace clearly wins

Verbatim store with measured 96.6% R@5 on LongMemEval, no API key. Same as mnem on raw dense, and reproducible from their repo.
MCP breadth. 29 tools to mnem’s 18. Agent diaries and cross-wing tunnels are original ideas.
Plugin scaffolds in-repo. .claude-plugin/ and .codex-plugin/ lower install friction for Claude Code / Codex users.
Velocity and community. Hundreds of commits, dozens of contributors, rapid issue response.
Reproducibility culture. Per-question JSONL result files committed for every benchmark run.
Working temporal KG. valid_from / valid_to / invalidate / timeline shipped today.

Where mnem clearly wins

Open schema. No fixed wings/rooms/halls/drawers hierarchy. Use any labels and properties for any domain.
Content-addressed identity. Same fact = same CID across machines. Stable citations forever.
Real commit DAG. Branch, diff, 3-way merge, signed Ed25519 history. MemPalace stores facts and a timeline; mnem stores commits over a graph.
WASM target. Same retrieval logic in browsers, Workers, Lambda. Python + ChromaDB cannot.
Retrieval-quality lead on LoCoMo. +0.218 R@5 raw dense, same embedder.
Token-budget telemetry. tokens_used, candidates_seen, dropped returned on every retrieve.
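The token-budget telemetry item above can be pictured with a small sketch. Only the field names `tokens_used`, `candidates_seen`, and `dropped` come from the text; their exact semantics, the greedy packing strategy, and the whitespace token counter are assumptions.

```python
def pack_to_budget(candidates, token_budget, count_tokens=lambda text: len(text.split())):
    """Greedily keep ranked candidates that still fit in the token budget."""
    kept, tokens_used, dropped = [], 0, 0
    for text in candidates:
        cost = count_tokens(text)
        if tokens_used + cost <= token_budget:
            kept.append(text)
            tokens_used += cost
        else:
            dropped += 1  # assumed meaning: candidate did not fit in the budget
    return {
        "items": kept,
        "tokens_used": tokens_used,
        "candidates_seen": len(candidates),
        "dropped": dropped,
    }

result = pack_to_budget(
    ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"],
    token_budget=5,
)
```

A caller can use these fields to decide whether to raise the budget or narrow the query instead of guessing how much context was cut.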

When to pick MemPalace, when to pick mnem

Pick MemPalace if: the wings / rooms / halls / drawers metaphor matches your domain, you want the largest MCP tool surface available, or you specifically want a Claude-Code-paired personal memory appliance with reproducible benchmark numbers today.

Pick mnem if: you want an open-schema substrate, you need content-addressing and a real commit DAG, you are shipping to multiple languages or to the edge / WASM, or you want token-budget telemetry as a first-class response field.

Sources

MemPalace repo, sha 6890948e092b on develop, 2026-04-26: https://github.com/MemPalace/mempalace
MemPalace README (license MIT, “What it is”, Benchmarks table): https://github.com/MemPalace/mempalace/blob/develop/README.md
MemPalace benchmarks/BENCHMARKS.md for benchmark provenance
mnem benchmark artefacts: /benchmarks/proofs/v0.1.0/
mnem README + benchmark methodology: /README.md, benchmarks/methodology.md

mnem vs Supermemory

Supermemory: “Memory engine and app that is extremely fast, scalable. The Memory API for the AI era.” (repo description, supermemoryai/supermemory)

mnem: an open-source, embedded, content-addressed knowledge-graph substrate. Self-host or nothing.

At a glance

	mnem	Supermemory
License	Apache-2.0	MIT (repo); cloud product is closed-core
Stars	small / pre-launch	22,218 (GitHub API, 2026-04-26)
Embedded / Server	embedded	hosted cloud; self-host issue (#707) closed without resolution
LLM at ingest	no	yes (Extractors layer; entity / fact extraction)
Content-addressed	yes	no (custom vector graph engine, internals undisclosed)
Bitemporal	no	no
WASM target	yes	n/a (cloud only)
MCP server	yes	yes (`https://mcp.supermemory.ai/mcp`, OAuth + bearer)
Hybrid retrieval	yes	yes (multi-mode search across graph)
Token-budget retrieval metadata	yes	not exposed
3-way merge	yes	no
Reproducible benchmarks in-repo	yes	self-reported; `memorybench` skill is a benchmarking harness vs supermemory

Feature comparison

#	Dimension	mnem	Supermemory	Source
1	Deployment	embedded; single binary	cloud only; self-host requested + closed (issue #707)	internal research
2	Storage	redb embedded	Postgres via Cloudflare Hyperdrive + Cloudflare AI vector embeddings + R2 + KV	internal research
3	Vector engine	HNSW via `mnem-ann`	undisclosed; “custom vector graph engine with ontology-aware edges”	internal research
4	Embedding model	bundled ONNX MiniLM-L6-v2; pluggable	undisclosed (Cloudflare AI)	internal research
5	Identity	content CID	undisclosed	n/a
6	Multi-tenancy	by repo or graph scope	`containerTag` and project scoping	internal research
7	Ingest pipeline	parse + chunk + statistical extract	five stacked layers: User Profiles, Memory Graph, Retrieval, Extractors, Connectors	internal research
8	LLM use	optional, opt-in	yes, in Extractors layer	internal research
9	Connectors	none yet	webhook-driven connectors live (Notion, GDrive, etc.)	supermemory.ai docs
10	Plugin / IDE ecosystem	MCP + `mnem integrate`	12+ integration plugins, dedicated repos	internal research
11	API	local Rust / Python / HTTP / MCP / CLI	REST `api.supermemory.ai/v3` + `/v4`, TS / Python SDKs	supermemory README
12	Pricing	self-host, free	tiered cloud (free / pro / team / enterprise)	supermemory.ai/pricing
13	Funding / brand	self-funded indie	$3M seed, ~$40M valuation, named angels (Jeff Dean, Dane Knecht, Logan Kilpatrick, …)	internal research
14	Founder reach	small	Dhravya Shah, ~51.5k X followers	internal research
15	Self-reported benchmarks	reproducible artefacts in-repo	“#1 on LongMemEval, LoCoMo, ConvoMem”; sub-300 ms recall at 85.4% accuracy	internal research

Benchmarks (where comparable)

Not directly comparable in any apples-to-apples sense. Supermemory’s benchmark numbers are self-reported, the engine is closed, and the evaluation harness is bundled as the memorybench skill that points at supermemory by default. Their headline:

Supermemory: 85.2-85.4% on LongMemEval; sub-300 ms recall; “#1 on LongMemEval, LoCoMo, ConvoMem”.

mnem’s reproducible numbers under ONNX MiniLM-L6-v2, no LLM in the loop:

Benchmark	Split	Metric	mnem
LongMemEval	500 Q	R@5 session	0.966
LongMemEval	500 Q	R@10 session	0.982
LoCoMo	1986 Q	R@5 session	0.726
ConvoMem	250 Q	Avg recall	0.976

Putting 0.852 next to 0.966 looks favorable for mnem, but the metrics are not the same shape: Supermemory’s number is end-to-end QA accuracy; mnem’s is retrieval R@5 with no LLM. Both columns are honest; the column headers are not the same column.
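The difference between the two metric shapes can be made concrete with a small sketch. It assumes a hit-style definition of session recall (a question counts if any gold session is in the top k) and exact-match scoring for QA; both are common conventions used here for illustration, not the definitions either project necessarily applies.

```python
def recall_at_k(ranked_session_ids, gold_session_ids, k):
    """1.0 if any gold session appears in the top-k retrieved sessions, else 0.0."""
    return 1.0 if set(ranked_session_ids[:k]) & set(gold_session_ids) else 0.0

def qa_accuracy(predicted, gold):
    """Fraction of questions whose generated answer matches the gold answer."""
    hits = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predicted, gold))
    return hits / len(gold)

# Two hypothetical questions: retrieval finds the right session both times,
# but the downstream LLM answers only one of them correctly.
retrieval = [
    recall_at_k(["s3", "s1", "s7"], ["s1"], k=5),
    recall_at_k(["s2", "s9"], ["s9"], k=5),
]
mean_recall = sum(retrieval) / len(retrieval)
accuracy = qa_accuracy(["Lisbon", "2021"], ["lisbon", "2022"])
```

Here retrieval recall is perfect while answer accuracy is half, which is why a retrieval R@5 and an end-to-end accuracy should not share a column.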

Latency (where measured)

System	Setup	Latency
mnem	LongMemEval 500 Q, MiniLM ONNX	711 ms mean retrieve
mnem	LoCoMo 1986 Q, MiniLM ONNX	333 ms mean retrieve
Supermemory	self-reported	sub-300 ms recall, “sub-400 ms at scale”

Supermemory’s figures are self-reported from a closed engine on an edge network with an undisclosed embedder. Supermemory cloud is fast at the retrieve hop; mnem runs in your process, so there is no network round-trip, and total end-to-end latency tends to favor mnem for self-hosted users.

Architecture differences

Supermemory is a Cloudflare-native cloud product. The repo is MIT-licensed but the production engine is closed: a “custom vector graph engine with ontology-aware edges” sitting on top of Postgres (Hyperdrive), Cloudflare AI vector embeddings, R2 object storage, and KV. The product is five stacked layers behind one API: User Profiles, Memory Graph, Retrieval, Extractors, Connectors. MCP server is production today at mcp.supermemory.ai/mcp with OAuth or API-key auth. Connectors (Notion, GDrive, etc.) ship as live webhook integrations. The strength is GTM: $3M seed, named angels, 50,000+ self-reported users on the consumer app, integrations with Cluely, Composio, Scira AI.

mnem is the opposite: open-source Apache-2.0, embedded, single-binary, no cloud. The graph substrate is content-addressed (BLAKE3 CIDs over DAG-CBOR), versioned (signed commit DAG with 3-way merge), and runs in-process, one cargo install away. There is no managed offering; hosting is explicitly out of scope for 0.1.0. Where Supermemory wins on distribution and managed operations, mnem wins on substrate guarantees: identity, history, and deterministic retrieval that you can run offline.
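Content-addressed identity can be sketched in a few lines. This is an illustration of the idea only: it uses sorted-key JSON as a stand-in for canonical DAG-CBOR and BLAKE2b from the standard library as a stand-in for BLAKE3, so the digests below are not real mnem CIDs, and `content_id` is a hypothetical name.

```python
import hashlib
import json

def content_id(node):
    """Hash a canonical encoding of the node, so equal content yields an equal ID."""
    # Stand-ins: sorted-key JSON instead of DAG-CBOR, BLAKE2b instead of BLAKE3.
    canonical = json.dumps(node, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.blake2b(canonical.encode("utf-8"), digest_size=32).hexdigest()

# Same content with a different key order gives the same ID; changed content does not.
a = content_id({"label": "Person", "props": {"name": "Ada", "born": 1815}})
b = content_id({"props": {"born": 1815, "name": "Ada"}, "label": "Person"})
c = content_id({"label": "Person", "props": {"name": "Ada", "born": 1816}})
```

The canonical encoding is what makes the ID independent of the machine or the order in which fields were written.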

Where Supermemory clearly wins

Hosted product with live connectors. Notion, GDrive, etc. work out of the box. mnem has none yet.
Distribution and brand. 22k stars, $3M seed, named angels (Jeff Dean, Dane Knecht, Logan Kilpatrick, David Cramer), founder reach ~51.5k X followers.
MCP-native cloud. Drop one URL into a client config and you have agent memory.
IDE plugin ecosystem. 12+ integration plugins live.
Cloudflare edge latency. Sub-300 ms recall claims are plausible given the Workers + Hyperdrive stack.

Where mnem clearly wins

Open-source substrate. Apache-2.0, no vendor lock-in. Self-host on a laptop or a Lambda. Supermemory’s self-host issue (#707) closed without a resolution; the cloud is structural.
No closed engine. mnem’s vector lane (HNSW), sparse lane (BM25 / SPLADE), graph lane, and RRF weights are all configurable and documented. Supermemory’s “custom vector graph engine” is a black box.
Content-addressed identity. Same fact = same CID across machines.
Real commit history. Diff, log, branch, 3-way merge, signed history. Supermemory offers “versioning” only in a looser sense; it is not a commit DAG.
Privacy by default. Nothing leaves your machine unless you opt in.
Reproducible benchmarks. Numbers ship with a runnable harness; Supermemory’s are self-reported.
Token-budget retrieval metadata. First-class on every retrieve.

When to pick Supermemory, when to pick mnem

Pick Supermemory if: you want a managed memory API today with hosted connectors, you trust Cloudflare for storage and inference, you want OAuth-MCP plug-and-play for ChatGPT / Claude / Cursor, or distribution on hosted infrastructure beats substrate control for your use case.

Pick mnem if: you need self-host or air-gapped, you want an open substrate with documented internals, you need content-addressing and a commit DAG, or you are building a product on top of a memory layer rather than consuming one.

Sources

Supermemory repo, sha a41bbeecb395 on main, 2026-04-26: https://github.com/supermemoryai/supermemory
Supermemory README, MCP details: mcp.supermemory.ai/mcp
Supermemory cloud and pricing: https://supermemory.ai
mnem benchmark artefacts: /benchmarks/proofs/v0.1.0/
mnem README + architecture: /README.md

mnem vs Cognee

Cognee: “Knowledge Engine for AI Agent Memory in 6 lines of code” (repo description, topoteretes/cognee)

mnem: a content-addressed, versioned graph substrate that ingests without an LLM.

At a glance

	mnem	Cognee
License	Apache-2.0	Apache-2.0
Stars	small / pre-launch	16,807 (GitHub API, 2026-04-26)
Embedded / Server	embedded	library + Cognee Cloud
LLM at ingest	no	yes (`remember` calls `add` + `cognify` + `improve`)
Content-addressed	yes	no (extracted graph node IDs)
Bitemporal	no	no
WASM target	yes	no
MCP server	yes	yes (Cognee MCP exists; integrates with Claude Code, Hermes)
Hybrid retrieval	yes (vector + sparse + graph + RRF)	yes (auto-routing across graph + vector)
Token-budget retrieval metadata	yes	no
3-way merge	yes	no
Reproducible benchmarks in-repo	yes	partial (research / paper claims; no in-repo harness)

Feature comparison

#	Dimension	mnem	Cognee	Source
1	Storage	redb embedded	Kuzu default + vector DB; pluggable	Cognee README “Deploy Cognee” sha `f4964c31db04`
2	Default flow	`commit` -> CID; no LLM	`remember` runs `add` + `cognify` + `improve` (LLM extraction)	Cognee README Quickstart Step 3
3	LLM requirement	optional	required to configure before Quickstart Step 2	Cognee README “Step 2: Configure the LLM”
4	Identity	BLAKE3 CID over DAG-CBOR	extracted graph-node IDs (LLM-derived)	Cognee internals
5	History	signed commit DAG	none; standard graph state	Cognee docs
6	Conflict resolution	3-way merge	re-run `cognify` to refresh graph	Cognee Quickstart
7	Vector lane	HNSW via `mnem-ann`	configurable vector store	Cognee docs
8	Sparse lane	BM25 + SPLADE	not headlined	Cognee README
9	Graph lane	first-class	first-class	Cognee README “About Cognee”
10	LLM providers	optional	OpenAI, Anthropic, Gemini, Ollama, others	Cognee docs
11	Session memory	open (model your own)	`remember(..., session_id="...")` first-class	Cognee Quickstart
12	Auto-routing retrieval	manual lane configuration	“picks best search strategy automatically”	Cognee README Step 3
13	Bindings	Rust + Python + TS + HTTP + CLI + MCP	Python + CLI + MCP	Cognee README
14	Cloud	none yet	Cognee Cloud (managed)	Cognee README “Connect to Cognee Cloud”
15	Determinism	byte-identical CIDs same input	LLM extraction is non-deterministic	Cognee README Step 3

Benchmarks (where comparable)

Not directly comparable. Cognee publishes research and use-case narratives rather than retrieval R@K artefacts in the repo. Their strength is the ECL pipeline as a finished product (drop a PDF, get a typed knowledge graph), not retrieval-quality benchmarks at the substrate layer.

mnem’s measured retrieval numbers under ONNX MiniLM-L6-v2:

Benchmark	Split	Metric	mnem
LongMemEval	500 Q	R@5 session	0.966
LoCoMo	1986 Q	R@5 session	0.726
ConvoMem	250 Q	Avg recall	0.976
MemBench	100 Q (movie)	R@5	1.000

If you want to compare like-for-like, run Cognee against the same LongMemEval / LoCoMo dumps with infer=False-equivalent (skip cognify’s LLM extraction). We have not published a Cognee adapter; contributions welcome.

Latency (where measured)

System	Setup	Latency
mnem	LongMemEval 500 Q, MiniLM ONNX	711 ms mean retrieve
mnem	LoCoMo 1986 Q, MiniLM ONNX	333 ms mean retrieve
Cognee	depends on LLM provider + vector store	not headlined

Cognee’s retrieval latency depends heavily on whether the auto-router calls a vector store, the graph DB, or LLM rerank. They do not publish a single number.

Architecture differences

Cognee is a Python knowledge-engine library and a managed cloud. The write path is the ECL pipeline: Extract -> Cognify -> Load. You hand it documents, Cognee runs an LLM to extract entities and relationships, embeds them, and writes them into a graph DB (Kuzu by default) plus a vector store. Retrieval auto-routes across vector and graph based on the query. The strength is “drop a PDF and get a typed knowledge graph” with minimal modelling effort.

mnem is the substrate beneath that pattern. There is no required LLM at ingest: parse + chunk + statistical extract -> CID -> commit. The graph shape is whatever your application commits, not whatever the LLM happened to extract that hour. Identity is content-addressed, so the same document on two machines collapses to the same nodes. History is a signed commit DAG with diff / 3-way merge. Retrieval is explicit 3-lane RRF with token-budget telemetry, not auto-routed.
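The write path described above (chunk, derive an ID, commit with a parent pointer) can be sketched as follows. This is a simplified model under stated assumptions: fixed-width chunking stands in for mnem's parse + chunk + statistical extract, BLAKE2b over sorted-key JSON stands in for BLAKE3 over DAG-CBOR, and the `ingest` function and commit layout are hypothetical.

```python
import hashlib
import json

def digest(obj):
    """Deterministic ID for a JSON-serialisable object (stand-in for a real CID)."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()

def chunk(text, size=40):
    """Fixed-width chunking: a pure function of the input text."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def ingest(text, parent=None):
    """Chunk, hash each chunk, then record a commit that points at its parent."""
    node_ids = [digest({"chunk": c}) for c in chunk(text)]
    commit = {"parent": parent, "nodes": node_ids}
    return digest(commit), commit

doc = "mnem stores nodes as content-addressed objects and versions every change."
c1, _ = ingest(doc)
c2, _ = ingest(doc)                                  # replay of the same input
c3, commit3 = ingest(doc + " More.", parent=c1)      # a later change
```

Since no step depends on a model call, replaying the same input reproduces the same commit ID, which is the property an LLM-driven extraction step cannot guarantee.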

Where Cognee clearly wins

Drop-in ingest. Hand it a PDF, conversation, or URL; get a typed knowledge graph. mnem expects you to model what you want stored.
Auto-routing retrieval. The router picks vector vs graph vs hybrid for you. mnem makes you choose.
Multi-LLM-provider support. OpenAI, Anthropic, Gemini, Ollama, others work with minimal config.
Cloud + self-host parity. Cognee Cloud is managed; the OSS library works standalone.
Rich ontology derivation. LLM derives the ontology from the corpus rather than forcing one upfront.
Session memory primitive. session_id is first-class with a background sync to the long-term graph.

Where mnem clearly wins

No LLM in the write path. Deterministic, replayable, fuzz-tested. Cognee’s cognify step is non-deterministic by design.
Content-addressed identity. Same input -> same CIDs across machines. Cognee’s extracted node IDs are extraction-run-dependent.
Real commit DAG. Branch, diff, 3-way merge, signed Ed25519 history. Cognee has graph state, not a commit history.
Embedded, single binary. ~40 MB Docker image. No external graph DB or vector store to operate.
WASM target. mnem-core ships to wasm32 unchanged.
Token-budget retrieval metadata. First-class on every response.

When to pick Cognee, when to pick mnem

Pick Cognee if: you want PDF / URL / document -> typed knowledge graph in 6 lines, you are happy with an LLM at ingest, you want auto-routed retrieval, or you want a managed cloud option.

Pick mnem if: you need deterministic ingest, content-addressed identity, a real commit DAG, embedded / single-binary deployment, or WASM / edge targets.

Sources

Cognee repo, sha f4964c31db04 on main, 2026-04-26: https://github.com/topoteretes/cognee
Cognee README (“About Cognee”, Quickstart Steps 1-3, “Use with AI Agents”, “Deploy Cognee”): https://github.com/topoteretes/cognee/blob/main/README.md
Cognee docs: https://docs.cognee.ai
Cognee Cloud: https://www.cognee.ai
mnem benchmarks: /benchmarks/proofs/v0.1.0/
mnem README: /README.md

mnem vs Letta

Letta: “Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time.” (repo description, letta-ai/letta)

mnem: a content-addressed graph substrate that stores the memory an agent uses, without assuming the agent.

At a glance

	mnem	Letta
License	Apache-2.0	Apache-2.0
Stars	small / pre-launch	22,305 (GitHub API, 2026-04-26)
Embedded / Server	embedded	server (Letta API) + Letta Code CLI
LLM at ingest	no	yes; the agent is the writer
Content-addressed	yes	no (DB row IDs)
Bitemporal	no	partial (Letta tracks message timestamps)
WASM target	yes	no (Python server)
MCP server	yes	yes (Letta supports MCP integrations)
Hybrid retrieval	yes	recall + archival memory; not headlined as hybrid
Token-budget retrieval metadata	yes	not exposed
3-way merge	yes	no
Reproducible benchmarks in-repo	yes	partial (Letta leaderboard external)

Feature comparison

#	Dimension	mnem	Letta	Source
1	Product shape	memory substrate	agent platform (agents + memory + tools + runtime)	Letta README sha `bb52a8900a79`
2	Memory model	open graph of content-addressed nodes + edges	tiered: core blocks (in-context) + recall + archival	MemGPT paper arXiv:2310.08560
3	Who writes memory	the application	the agent itself, via tool calls	Letta docs
4	LLM at ingest	none	yes; agent decides what to write, promote, evict	MemGPT paper
5	Identity	content CID	DB row IDs	Letta SDK
6	History	signed commit DAG	standard DB state with timestamps	Letta SDK
7	Conflict resolution	3-way merge	agent-to-agent messaging	Letta docs
8	Scoping	open (any node label)	`agent_id` first-class	Letta API
9	Vector lane	HNSW via `mnem-ann`	recall / archival via configurable embedder	Letta docs
10	Sparse lane	BM25 + SPLADE	not first-class	Letta docs
11	Graph lane	first-class	not first-class	Letta docs
12	Bindings	Rust + Python + TS + HTTP + CLI + MCP	Python + REST + Letta Code CLI (Node 18+)	Letta README
13	Cloud	none yet	hosted Letta API + free dashboard	https://docs.letta.com
14	Model agnosticism	yes (provider-not-tactic)	“fully model-agnostic; recommends Opus 4.5 / GPT-5.2”	Letta README
15	Headline use-case	agent-memory substrate	“stateful agents that learn and self-improve”	Letta repo description

Benchmarks (where comparable)

Letta publishes a model leaderboard at leaderboard.letta.com ranking LLMs on Letta’s agent benchmarks (multi-turn, tool-use, reasoning). This measures models inside the Letta agent, not retrieval quality of a memory layer. mnem’s benchmarks measure retrieval R@K over corpora, not agent task success.

The two systems are not directly comparable on a single number. Letta’s “how well does this LLM run my agent” answers a different question from mnem’s “how well does the substrate retrieve under a fixed embedder.”

mnem’s retrieval numbers under ONNX MiniLM-L6-v2:

Benchmark	Split	Metric	mnem
LongMemEval	500 Q	R@5 session	0.966
LoCoMo	1986 Q	R@5 session	0.726
ConvoMem	250 Q	Avg recall	0.976

Latency (where measured)

System	Setup	Latency
mnem	LongMemEval 500 Q, MiniLM ONNX	711 ms mean retrieve
mnem	LoCoMo 1986 Q, MiniLM ONNX	333 ms mean retrieve
Letta	varies wildly with model + tool-use depth	not headlined

Letta’s user-perceived latency is dominated by the agent loop, not the memory tier. Different mechanism; not comparable.

Architecture differences

Letta is the platform descended from MemGPT. The headline pattern is tiered memory: core memory blocks held in the LLM’s context window, recall memory (recent conversation history) accessible via tool calls, and archival memory (long-term store) similarly accessed by tool. The agent itself decides what to promote and evict, using the LLM’s own reasoning. Letta ships as a Python framework, a hosted API, and a local CLI (letta via @letta-ai/letta-code). The product optimisation is “give an LLM persistent memory and let it manage the tiers.”

mnem is one layer below that. There is no agent in mnem. mnem is a graph substrate: content-addressed nodes and edges, signed commit history, 3-way merge, hybrid retrieval. If you wanted to build the MemGPT pattern on top of mnem, you would ship: core_blocks as a small ad-hoc graph, recall as an HNSW lane over recent commits, archival as the full graph traversal lane. mnem doesn’t impose any of that; it gives you the storage primitives and lets you choose the agent shape.
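The layering described above can be pictured with a toy model. This sketch does not use mnem's API: an in-memory list stands in for the commit log, substring matching stands in for both the HNSW lane and graph traversal, and the class and method names are hypothetical.

```python
class TieredMemory:
    """Toy layering of core / recall / archival tiers over one append-only log."""

    def __init__(self, recall_window=3):
        self.core_blocks = {}             # small, always-in-context facts
        self.log = []                     # every committed entry, oldest first
        self.recall_window = recall_window

    def commit(self, text):
        self.log.append(text)

    def recall(self, query):
        """Search only the most recent commits (stand-in for a dense lane)."""
        recent = self.log[-self.recall_window:]
        return [t for t in recent if query.lower() in t.lower()]

    def archival(self, query):
        """Search the full history (stand-in for full graph traversal)."""
        return [t for t in self.log if query.lower() in t.lower()]

mem = TieredMemory(recall_window=2)
mem.core_blocks["persona"] = "concise assistant"
for entry in ["user likes tea", "user moved to Oslo", "user likes jazz", "user adopted a cat"]:
    mem.commit(entry)
recent_hits = mem.recall("likes")
all_hits = mem.archival("likes")
```

The tiers here are only views over the same store, which is the point of the paragraph above: the agent policy for promoting and evicting lives in the application, not in the substrate.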

Where Letta clearly wins

The agent is in the box. Drop in Letta, you have an agent with memory and tool-use today. mnem requires you to bring your own agent / framework.
The MemGPT brand and lineage. Anyone reading the agent-memory literature has seen the paper. Letta’s the canonical implementation.
Hosted API + leaderboard. Comparing models on Letta’s harness is one click.
Skills + subagents. Bundled patterns for advanced memory and continual learning.
Multi-agent reconciliation via messaging. Agent-to-agent conversations are first-class.

Where mnem clearly wins

No agent assumed. Letta’s memory belongs to a Letta agent; mnem’s memory belongs to your application. Port your agent framework next year, the data stays.
No LLM in the write path. Letta writes to memory through the agent (LLM tool calls). mnem writes deterministically.
Content-addressed identity. Same fact = same CID across machines.
Real commit DAG. Diff, log, 3-way merge, signed history. Letta has DB state, not commits.
Structural multi-agent merge. Two agents working offline in the same scope reconcile by 3-way graph merge, not by chat messages.
WASM, embedded, single binary. Ship to the edge. Letta is a Python server.
Hybrid 3-lane retrieval with token-budget metadata. Explicit RRF over dense / sparse / graph; tokens_used per response.
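The structural multi-agent merge listed above follows the usual 3-way merge rule: compare each side against the common ancestor and keep whichever side changed. A minimal sketch over flat property maps, assuming values are never `None` (used here to mean "absent"); mnem's real merge operates on a graph, and this function is hypothetical.

```python
def three_way_merge(base, ours, theirs):
    """Merge two property maps against their common ancestor, key by key."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:          # both sides agree (including both deleted)
            value = o
        elif o == b:        # only theirs changed
            value = t
        elif t == b:        # only ours changed
            value = o
        else:               # both changed, differently: report, do not guess
            conflicts.append(key)
            continue
        if value is not None:
            merged[key] = value
    return merged, conflicts

base = {"city": "Paris", "role": "engineer", "team": "infra"}
ours = {"city": "Lisbon", "role": "engineer", "team": "infra"}   # agent A changed city
theirs = {"city": "Paris", "role": "manager"}                    # agent B changed role, removed team
merged, conflicts = three_way_merge(base, ours, theirs)
```

Non-overlapping edits from two offline agents combine without any negotiation; only keys that both sides changed differently are reported as conflicts and left out of the merged result.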

When to pick Letta, when to pick mnem

Pick Letta if: you want the MemGPT pattern in a box, you want a ready-made agent platform with skills and subagents, you want to use Letta’s leaderboard to pick a model, or you are building a single stateful agent rather than a multi-application substrate.

Pick mnem if: you want the memory layer separate from the agent, you need content-addressing and a commit DAG, you are running multiple agent frameworks against the same store, or you need embedded / edge / WASM deployment.

Sources

Letta repo, sha bb52a8900a79 on main, 2026-04-26: https://github.com/letta-ai/letta
Letta README (Letta Code CLI, Letta API, model-agnostic note): https://github.com/letta-ai/letta/blob/main/README.md
Letta docs: https://docs.letta.com
Letta leaderboard: https://leaderboard.letta.com
MemGPT paper, arXiv:2310.08560: https://arxiv.org/abs/2310.08560
mnem README: /README.md

mnem vs graphify

graphify: “AI coding assistant skill (Claude Code, Codex, OpenCode, …). Turn any folder of code, docs, papers, images, or videos into a queryable knowledge graph.” (repo description, safishamsi/graphify)

mnem: a content-addressed graph substrate. graphify builds a graph from a folder; mnem is the graph.

At a glance

	mnem	graphify
License	Apache-2.0	MIT
Stars	small / pre-launch	35,262 (GitHub API, 2026-04-26)
Embedded / Server	embedded	one-shot CLI; outputs static `graph.json` + HTML
LLM at ingest	no	yes (Claude subagents extract concepts + relationships)
Content-addressed	yes	no (NetworkX node IDs)
Bitemporal	no	no
WASM target	yes	no (Python + faster-whisper + Claude API)
MCP server	yes	no native MCP; integrates via `AGENTS.md` / hooks / Supermemory MCP
Hybrid retrieval	yes	partial (Leiden communities; no vector DB)
Token-budget retrieval metadata	yes	no
3-way merge	yes	no (re-runs against SHA256 cache)
Reproducible benchmarks in-repo	yes	no (benchmarking pointer is the `memorybench` skill which targets supermemory)

Feature comparison

#	Dimension	mnem	graphify	Source
1	Product shape	runtime substrate (CLI / HTTP / MCP / Python / Rust)	one-shot CLI skill that emits a static graph artefact	graphify README “How it works” sha `770d7f54c40d`
2	Inputs	text, code, conversations (anything you commit)	code, docs, papers, images, videos (multimodal) via tree-sitter + Whisper + Claude	graphify README
3	Ingest pipeline	parse + chunk + statistical extract -> commit	three passes: AST extract -> Whisper transcribe -> Claude subagents extract concepts	graphify README “How it works”
4	LLM requirement	optional	yes (Claude subagents are central)	graphify README
5	Identity	BLAKE3 CID over DAG-CBOR	NetworkX node IDs	graphify implementation
6	Output	live graph + retrieve API	static `graph.html`, `GRAPH_REPORT.md`, `graph.json`, `cache/`	graphify README directory listing
7	Retrieval	3-lane RRF (vector + sparse + graph)	graph-topology (Leiden communities) and `/graphify query` slash command	graphify README
8	Vector DB	HNSW via `mnem-ann`	none (graph-topology-based clustering, no embeddings)	graphify README “Clustering is graph-topology-based”
9	Re-ingest	commit append	SHA256 cache only re-processes changed files	graphify README directory listing
10	Tags on relations	edge labels	`EXTRACTED` / `INFERRED` / `AMBIGUOUS` confidence tags	graphify README
11	Always-on assistant integration	MCP + `mnem integrate`	platform-specific: Claude Code PreToolUse hook, Cursor `alwaysApply` rule, AGENTS.md	graphify README “Make your assistant always use the graph”
12	Supported AI clients	MCP (Claude Desktop, Cursor, Zed, etc.)	15+ named installers (Claude Code, Codex, Cursor, Aider, Gemini, Copilot CLI, …)	graphify README install table
13	Versioning	signed commit DAG	none; re-run produces a fresh graph	graphify implementation
14	License	Apache-2.0	MIT	repo metadata
15	Cloud	none yet	Supermemory API integration documented in README (“Build with Supermemory”)	graphify README

Benchmarks (where comparable)

Not directly comparable. graphify produces a static knowledge graph artefact for an assistant to read, not a retrieval API benchmarked on LongMemEval / LoCoMo / etc. Their headline number is a token-efficiency claim (“71.5x fewer tokens per query vs reading the raw files”), measured against a different baseline than mnem’s R@K-on-public-corpora methodology.

graphify’s README points at the memorybench skill (npx skills add supermemoryai/memorybench) for benchmarking, but that harness is supermemory-tilted by default; running mnem through it would require an adapter we have not built.

mnem’s measured retrieval numbers under ONNX MiniLM-L6-v2:

Benchmark	Split	Metric	mnem
LongMemEval	500 Q	R@5 session	0.966
LoCoMo	1986 Q	R@5 session	0.726
ConvoMem	250 Q	Avg recall	0.976

Latency (where measured)

System	Setup	Latency
mnem	LongMemEval 500 Q, MiniLM ONNX	711 ms mean retrieve
mnem	LoCoMo 1986 Q, MiniLM ONNX	333 ms mean retrieve
graphify	retrieval is “open `graph.json` and traverse” by an LLM	not measured by them

graphify’s user-perceived latency at query time is “LLM reads GRAPH_REPORT.md then traverses graph.json,” which is a different loop entirely.

Architecture differences

graphify is a one-shot CLI you point at a folder. It runs a deterministic AST pass (tree-sitter, 25 languages), transcribes audio and video with faster-whisper using a domain-aware prompt, then runs Claude subagents in parallel to extract concepts and relationships from docs, papers, images, and transcripts. Results are merged into a NetworkX graph, clustered with Leiden community detection (no embeddings; graph topology is the similarity signal), and exported as graph.html (interactive viewer), GRAPH_REPORT.md (plain-language audit), graph.json (queryable), and a SHA256 cache for incremental re-runs. Coding assistants integrate via platform-specific install commands that wire the graph into rules / hooks / AGENTS.md so the assistant always considers it before searching raw files.
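The incremental re-run idea behind a SHA256 cache can be sketched generically. This is the general technique, not graphify's code: the cache here is a plain dict from path to digest and the files are passed in as bytes, both of which are assumptions, since the text does not describe graphify's actual cache layout.

```python
import hashlib

def changed_files(files, cache):
    """Return paths whose SHA-256 differs from the cached digest, updating the cache."""
    changed = []
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        if cache.get(path) != digest:
            changed.append(path)
            cache[path] = digest
    return changed

cache = {}
first_run = changed_files({"a.py": b"x = 1", "b.md": b"# doc"}, cache)
second_run = changed_files({"a.py": b"x = 2", "b.md": b"# doc"}, cache)
```

Only the files returned by the second call would need to go through the expensive extraction passes again.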

mnem is a runtime substrate, not a one-shot extractor. You commit nodes and edges, then retrieve via a 3-lane fused query (HNSW dense + BM25 / SPLADE sparse + graph traversal) under explicit RRF weights and a token budget. There is no LLM in the write path. Identity is a content CID, history is a signed commit DAG, and the graph evolves by commit + diff + merge rather than by re-extraction. mnem ships embedded; the same Rust core compiles to WASM unchanged.

The two are complementary more than competitive: graphify is a great ingestor for the kind of corpora mnem stores (run graphify on a folder, ingest the resulting graph into mnem). They do not solve the same problem.

Where graphify clearly wins

Multimodal ingest in one command. Code (25 languages via tree-sitter), PDFs, markdown, screenshots, diagrams, whiteboard photos, video, and audio. mnem ingests text and structured commits; you bring your own multimodal extractor.
Coding-assistant integration breadth. 15+ named platforms with install commands (Claude Code, Codex, OpenCode, Copilot CLI, VS Code Copilot Chat, Aider, Cursor, Gemini CLI, OpenClaw, Factory Droid, Trae, Hermes, Kiro, Antigravity).
PreToolUse hooks. Inject “the graph exists, read it first” into Claude Code / Codex / OpenCode tool flows automatically.
Static, portable artefact. graph.html opens in any browser; graph.json ships in a repo; auditors read GRAPH_REPORT.md.
Confidence tags on relations. EXTRACTED / INFERRED / AMBIGUOUS lets a reader filter what was found vs guessed.
No vector DB needed. Leiden over graph topology produces communities without embeddings.
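The confidence tags in the list above are easy to act on once the graph is loaded. The sketch below filters relations by tag; the edge field names (source, target, relation, confidence) and the sample edges are hypothetical and are not taken from graphify's actual graph.json schema.

```python
from typing import Dict, Iterable, List, Set

Edge = Dict[str, str]


def filter_edges(edges: Iterable[Edge], allowed: Set[str]) -> List[Edge]:
    """Keep only edges whose confidence tag is in the allowed set."""
    return [e for e in edges if e.get("confidence") in allowed]


# Hypothetical edges; a real graph.json may use different keys.
edges: List[Edge] = [
    {"source": "auth.py", "target": "login", "relation": "defines", "confidence": "EXTRACTED"},
    {"source": "login", "target": "session", "relation": "creates", "confidence": "INFERRED"},
    {"source": "session", "target": "cache", "relation": "uses", "confidence": "AMBIGUOUS"},
]

found_only = filter_edges(edges, {"EXTRACTED"})
found_or_inferred = filter_edges(edges, {"EXTRACTED", "INFERRED"})
print(len(found_only), len(found_or_inferred))
```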

Where mnem clearly wins

Runtime, not one-shot. mnem keeps serving as your corpus grows; graphify is a re-extract loop.
No LLM in the write path. graphify’s concept extraction is Claude-subagent-driven by design. mnem ingests deterministically.
Content-addressed identity + commit DAG. Stable identity across re-runs and machines; full diff / 3-way merge. graphify regenerates a fresh NetworkX graph.
Hybrid retrieval API. Vector + sparse + graph fused with token-budget metadata. graphify exposes traversal slash-commands but no retrieval API.
Embedded + WASM. Same retrieval logic in Rust, Python, TS, MCP, Workers, Lambda. graphify is a Python CLI.
License. Apache-2.0 vs MIT (both permissive; matters in some corporate review contexts).
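The content-addressed identity and commit DAG points above can be illustrated with a toy store. This is a sketch under stated assumptions, not mnem's object model: it hashes canonical JSON with SHA-256 as a stand-in for DAG-CBOR + CIDs, omits commit signing, and the Store class and commit helper are hypothetical names.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


def content_id(obj: Any) -> str:
    """Hash a canonical encoding so equal content always gets the same id."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


class Store:
    """Toy content-addressed object store (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self.objects: Dict[str, Any] = {}

    def put(self, obj: Any) -> str:
        cid = content_id(obj)
        self.objects.setdefault(cid, obj)  # identical content collapses to one entry
        return cid


def commit(store: Store, parent: Optional[str], node_ids: List[str]) -> str:
    """A commit is itself an object: a parent pointer plus a snapshot of node ids."""
    return store.put({"parent": parent, "nodes": sorted(node_ids)})


store = Store()
a1 = store.put({"label": "note", "text": "hello"})
a2 = store.put({"text": "hello", "label": "note"})  # same content, different key order
b = store.put({"label": "note", "text": "world"})

c1 = commit(store, None, [a1])
c2 = commit(store, c1, [a1, b])

# Diff between two commits is a set difference over node ids.
added = set(store.objects[c2]["nodes"]) - set(store.objects[c1]["nodes"])
print(a1 == a2, len(store.objects), added == {b})
```

Because ids are derived from content, re-running the same ingest on another machine yields the same ids, which is what makes diff and merge between commits well defined.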

When to pick graphify, when to pick mnem

Pick graphify if: you want a folder -> queryable knowledge graph in one command, you need multimodal extraction (video / audio / images), you want an always-on coding-assistant integration, or you need a static artefact you can ship in a repo.

Pick mnem if: you need a runtime memory substrate, you require deterministic ingest, you want content-addressed identity and a real commit DAG, you are building a product (not a personal coding assistant), or you need embedded / WASM deployment.

You can also pair them: graphify as the multimodal extractor, mnem as the runtime substrate that holds the resulting graph and serves queries.

Sources

graphify repo, sha 770d7f54c40d on v5, 2026-04-26: https://github.com/safishamsi/graphify
graphify README (“How it works”, “Install”, “Make your assistant always use the graph”, “Benchmarks”): https://github.com/safishamsi/graphify/blob/v5/README.md
mnem README: /README.md
mnem benchmarks: /benchmarks/proofs/v0.1.0/

Migrating to 0.1.0

0.1.0 is the first public release. There is no prior public version; this document exists for completeness and to establish the upgrade path for future releases.

Summary

If you are arriving at 0.1.0 fresh: skip this page. There is nothing to migrate.

What 0.1.0 establishes

Surface	Stability guarantee
CLI subcommand names	stable; deprecation cycle for breaking changes
HTTP `/v1/*` routes	stable; new routes additive only
MCP tool names + schemas	stable
`<repo>/.mnem/config.toml` keys	stable
Object encoding (DAG-CBOR + CIDs)	stable; bump = new schema version
Internal Rust crate APIs	unstable pre-1.0

What may change between 0.x minor releases

New optional config keys (with defaults).
New retrieval lanes / scoring modes (off-by-default).
New CLI subcommands; existing subcommands keep semantics.
Sidecar formats (re-build required, node CIDs unchanged).

Upgrading from 0.x.y to 0.x.(y+1)

cargo install --locked mnem-cli      # or platform package manager
mnem --version
mnem doctor                          # config sanity check

No data migration is needed within 0.x. If a release changes a sidecar format, rebuild the sidecar; node CIDs are unchanged.

Future migrations

When 1.0 lands, this directory will gain a v0.x-to-v1.0.md doc covering any on-disk schema changes. Until then, treat 0.x as the pre-1.0 line, where patch releases are non-breaking and minor releases may introduce additive changes.
