Introduction

mnem is a knowledge-graph substrate. It stores nodes as content-addressed objects, retrieves them with vector + sparse + graph signals, and exposes the result over CLI, HTTP, and MCP surfaces.

What it does

  • Content-addressed nodes - every node has a CID; identical content collapses to one node.
  • Versioned commits - every change is a commit with a parent chain (Git-style for graphs).
  • Hybrid retrieval - vector (HNSW), sparse (BM25 / SPLADE), and graph traversal in one query.
  • In-process embedder - bundled ONNX MiniLM-L6-v2 (no Ollama / API keys required).
  • MCP-native - drop-in memory layer for Claude / Cursor / any MCP client.
  • WASM target - same core compiles to wasm32 for in-browser use.

What it is not

  • A vector database (it’s a graph; vectors are one signal among several).
  • An LLM (mnem holds memory; the LLM uses it).
  • A finished product. 0.1.0 is the first public cut.

Where to next

  • Install - single command per platform.
  • Quickstart - five minutes from zero to retrieve.
  • Core concepts - what’s a CID, what’s a commit, what’s a label.

Install

mnem ships a single mnem binary plus optional Python and HTTP daemons. Pick the source that matches your platform.

From Cargo (any platform with Rust toolchain)

cargo install --locked mnem-cli
mnem --version

Requires Rust 1.95+ (see rust-toolchain.toml).

From npm (Node.js users)

npm install -g mnem-cli
mnem --version

# or one-shot via npx
npx mnem-cli --version

Downloads the prebuilt native binary for your platform at install time. Node 18+ required. No Rust toolchain needed.

From PyPI (Python users)

pip install mnem-cli
mnem --version

The PyPI package ships the same mnem binary as a manylinux / macOS / Windows wheel.

From a release binary

Download the platform tarball from the latest GitHub release:

curl -L https://github.com/Uranid/mnem/releases/latest/download/mnem-linux-x86_64.tar.gz | tar xz
sudo mv mnem /usr/local/bin/
mnem --version

Replace linux-x86_64 with linux-aarch64 / macos-x86_64 / macos-aarch64 / windows-x86_64.zip as appropriate.

Per-OS package managers

After v0.2.0, mnem ships only via Cargo, PyPI, and npm. The Homebrew tap, AUR, Nix, winget, and scoop channels have been dropped in favour of a lean three-channel model (cargo / PyPI / npm). The Cargo channel supports the bundled-embedder, bundled-embedder-cuda, and bundled-embedder-directml feature flags.

macOS / Linux / Windows

# npm (Node 18+, no Rust toolchain needed)
npm install -g mnem-cli

# Cargo (any platform with Rust 1.95+)
cargo install --locked mnem-cli --features bundled-embedder

# or via cargo-binstall (faster, downloads prebuilt)
cargo binstall mnem-cli

# PyPI (Python users)
pip install mnem-cli

Docker

docker run --rm -p 9876:9876 ghcr.io/uranid/mnem:latest http serve

WASM (in-browser)

cargo build --release --target wasm32-unknown-unknown -p mnem-core

See crates/mnem-core/README.md for embedding examples.

Verify

mnem --version
mnem doctor

mnem doctor probes embedder, store, and config - useful first command after install.

Quickstart

Five minutes from zero to retrieve.

1. Install

cargo install --locked mnem-cli

(See Install for other platforms.)

2. Initialise a repo

mkdir my-graph && cd my-graph
mnem init

This creates .mnem/ with default config (in-process MiniLM embedder, redb store).

3. Ingest

mnem ingest README.md
mnem ingest docs/*.md
mnem ingest <(echo '{"text": "the cat sat on the mat", "label": "demo"}') --json

4. Retrieve

mnem retrieve "what does this project do"
mnem retrieve "what is X" --label demo --top-k 5

5. Serve over HTTP (optional)

mnem http serve --repo .        # bind 127.0.0.1:9876
curl http://127.0.0.1:9876/v1/retrieve -d '{"text": "what does this do"}'
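
The same call can be made from Python with only the standard library. This is a minimal sketch: the endpoint and the "text" field come from the curl example above, while the JSON Content-Type header and the helper names are assumptions. The response shape is not documented here, so the decoded JSON is returned untouched.

```python
import json
import urllib.request


def build_retrieve_request(text, base_url="http://127.0.0.1:9876"):
    """Build (but do not send) a POST to the /v1/retrieve endpoint."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/retrieve",
        data=body,
        # Assumption: the server accepts a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def retrieve(text, base_url="http://127.0.0.1:9876", timeout=10.0):
    """Send the request and return the decoded JSON response as-is."""
    req = build_retrieve_request(text, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With mnem http serve running, retrieve("what does this do") returns whatever the server sends back.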

6. Wire into Claude / Cursor (optional)

mnem integrate

Adds an MCP server entry to your client config; subsequent agent turns can call mnem_retrieve and mnem_ingest natively.

CLI reference

mnem is the single entry point. Subcommands wrap repo operations.

Common subcommands

mnem init [path]                     # create .mnem/ in path (default: cwd)
mnem ingest <file|-> [...]           # add nodes from file or stdin
mnem retrieve <text> [...]           # query (vector + sparse + graph)
mnem mcp                             # start the MCP JSON-RPC server over stdio
mnem mcp --repo ~/notes              # point the MCP server at a specific graph
mnem http serve                      # start the HTTP JSON API (loopback by default)
mnem integrate                       # wire as MCP server in your agent host
mnem doctor                          # probe embedder + store + config

Inspection

mnem stats                # commits, nodes, embeddings, store size
mnem log [-n N]           # commit history
mnem cat-file <cid>       # dump a node by CID
mnem diff <cid> <cid>     # diff two commits
mnem export               # export as CAR archive

Advanced retrieve flags

--limit N                 # number of items to return (default 10); short: -n
--vector-cap N            # candidate pool from vector lane (default 256)
--graph-expand N          # multi-hop expansion budget
--graph-mode <decay|ppr>  # graph scoring: decay (default) or PPR
--rerank <provider:model> # post-rerank with a model
--summarize               # add community summarization layer
--community-filter        # Leiden community filter; drop low-coverage communities

Ingest flags

--chunker <auto|paragraph|recursive|session>  # chunking strategy (default: auto)
--extractor keybert                            # enable KeyBERT keyphrase extraction
--max-tokens N                                # token budget per chunk (default: 512)
--recursive                                   # ingest a directory recursively

For complete option lists run mnem <subcommand> --help. Long-form documentation for each subcommand lives in guides.

MCP server

mnem implements the Model Context Protocol over stdio. Drop it into any MCP client (Claude Desktop, Cursor, Zed, custom).

Install

mnem integrate              # auto-detect installed hosts and wire everything
mnem integrate claude-code  # wire a specific host

For manual registration in any MCP client:

{
  "mcpServers": {
    "mnem": {
      "command": "mnem",
      "args": ["mcp", "--repo", "/path/to/your-graph"]
    }
  }
}
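
For scripted setups, the manual registration above amounts to merging one entry into the client's JSON config. A minimal sketch follows; the config file location depends on the MCP client and is not specified here, so config_file is a hypothetical parameter, and both function names are illustrative.

```python
import json
from pathlib import Path


def register_mnem(config, repo_path):
    """Return a copy of an MCP client config with a mnem server entry added."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["mnem"] = {"command": "mnem", "args": ["mcp", "--repo", repo_path]}
    updated["mcpServers"] = servers
    return updated


def register_in_file(config_file, repo_path):
    """Read a JSON config file (if present), add the entry, and write it back."""
    path = Path(config_file)  # hypothetical: location varies per client
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(register_mnem(config, repo_path), indent=2) + "\n")
```

Existing server entries are preserved; only the "mnem" key is added or overwritten.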

Tools exposed

| Tool | Purpose |
|---|---|
| mnem_stats | Repo overview: op-head, commit count, label list, embedder health |
| mnem_schema | List every node label and edge label in the current commit |
| mnem_search | Exact property-match search with optional outgoing-edge expansion |
| mnem_get_node | Fetch a single node by UUID (full props + content) |
| mnem_traverse | One-hop neighbour walk from a start node via named edge labels |
| mnem_list_nodes | Enumerate nodes at head, optionally filtered by label |
| mnem_retrieve | Hybrid retrieval: vector + sparse + graph, fused via RRF |
| mnem_commit | Add nodes and/or edges as a single commit |
| mnem_commit_relation | Resolve-or-create subject + object + edge in one call |
| mnem_resolve_or_create | Find-or-create a node by a primary-key property |
| mnem_recent | Walk the op-log backwards (last N operations) |
| mnem_vector_search | Cosine nearest-neighbour search over stored embeddings |
| mnem_delete_node | Hard-remove a node from the current head |
| mnem_tombstone_node | Soft-delete (forget) a node; subsequent retrieves exclude it |
| mnem_ingest | Ingest a file or inline text as Doc + Chunk + Entity subgraph |
| mnem_global_retrieve | Semantic search on the global graph (~/.mnemglobal/.mnem/) only |
| mnem_global_ingest | Ingest a file or inline text into the global graph |
| mnem_global_add | Write nodes/edges directly to the global graph |
| mnem_community_summarize | Extractive centroid + MMR summarizer over a set of node UUIDs (summarize feature) |

Notes

  • The server runs in-process — no separate daemon, no port to manage.
  • Embedder is bundled (MiniLM-L6-v2, ONNX). No network calls unless you wire one.
  • Local vs global: mnem_retrieve searches the repo the server is pointed at. mnem_global_retrieve always searches ~/.mnemglobal/.mnem/ regardless of --repo.
  • For the full field-level schema of each tool, run mnem mcp --list-tools or inspect crates/mnem-mcp/src/tools/descriptions.rs.

Core concepts

Three primitives. Everything else is composed from these.

Node

A node is content + metadata, addressed by its CID (content identifier derived from a hash of canonical bytes). Two nodes with identical content collapse to one CID. Nodes carry:

  • text - the unit of content (a sentence, a chunk, a fact)
  • label - string scope; queries can filter to a label
  • metadata - opaque JSON map for caller-defined tags

The embedding lives in a per-commit sidecar bucket, not on the node, so two nodes with the same text but different embedders share one CID.
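
To make the content-addressing idea concrete, here is a minimal sketch. It is not mnem's implementation: canonical JSON and BLAKE2b from the Python standard library stand in for the real canonical encoding and hash, and hashing all three fields (text, label, metadata) is an assumption. The embedding is deliberately left out of the hashed bytes, matching the sidecar design described above.

```python
import hashlib
import json


def canonical_bytes(text, label=None, metadata=None):
    """Serialise node fields deterministically (sorted keys, no whitespace)."""
    node = {"text": text, "label": label, "metadata": metadata or {}}
    encoded = json.dumps(node, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return encoded.encode("utf-8")


def node_id(text, label=None, metadata=None):
    """Stand-in content identifier: hex BLAKE2b digest of the canonical bytes."""
    return hashlib.blake2b(canonical_bytes(text, label, metadata), digest_size=32).hexdigest()
```

Because the bytes are canonical, two callers that build the same node with metadata keys in a different order still get the same identifier.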

Commit

A commit is a snapshot of the graph at a point in time. Every ingest, every edit, every tombstone produces a new commit. Commits chain by parent CID; the head commit is the working tree’s “current state”. Older commits are immutable and reachable.
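
The parent chain can be pictured with a small in-memory model. This is a sketch under stated assumptions: the Commit fields, the dict-backed store, and the BLAKE2b-over-JSON identifier are all illustrative, not mnem's on-disk format.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    parent: Optional[str]  # id of the previous commit, None for the root
    node_ids: tuple        # sorted ids of the nodes in this snapshot

    @property
    def id(self):
        payload = json.dumps(
            {"parent": self.parent, "nodes": list(self.node_ids)},
            sort_keys=True,
            separators=(",", ":"),
        )
        return hashlib.blake2b(payload.encode("utf-8"), digest_size=32).hexdigest()


def commit(store, head, node_ids):
    """Record a new snapshot on top of `head` and return the new head id."""
    c = Commit(parent=head, node_ids=tuple(sorted(set(node_ids))))
    store[c.id] = c
    return c.id


def log(store, head):
    """Walk the parent chain from `head` back to the root."""
    history = []
    while head is not None:
        history.append(head)
        head = store[head].parent
    return history
```

Older commits are never mutated: a new snapshot only adds a new entry that points at its parent.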

Label

A label is an opt-in namespace string attached to nodes at ingest time. Used for:

  • per-user / per-conversation isolation in agent memory
  • bench harness scoping (per-question, per-document)
  • coarse multi-tenancy

A query without a label sees the whole repo; a query with a label sees only nodes carrying that label.

Retrieval lanes

Every retrieve call fans out across three lanes and fuses the results:

  1. Vector - HNSW over the per-commit sidecar embeddings
  2. Sparse - BM25 / SPLADE (optional, feature-gated)
  3. Graph - n-hop traversal over authored edges, optionally PPR-scored

Lanes are configurable. Vector-only is the default and is what the 0.1.0 benchmarks measure.
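
The MCP tool list describes the fusion step as RRF (reciprocal rank fusion). A minimal sketch of that technique over ranked lists of node ids follows; the constant k=60 is the conventional default from the RRF literature, not a value confirmed for mnem, and the tie-break rule is an assumption added for determinism.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids with reciprocal rank fusion.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Sort by score descending; break ties by id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))
```

An id that appears near the top of several lanes outranks one that tops a single lane, which is the point of fusing by rank rather than by raw score.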

Configuration

mnem reads config from three sources, in priority order:

  1. Environment variables - MNEM_* (highest precedence)
  2. Per-repo config - <repo>/.mnem/config.toml
  3. User-global config - ~/.mnem/config.toml
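
The precedence order can be sketched as a layered merge. This is an illustration only: the flat section_key naming (for example embed_provider for [embed] provider) and the mapping from MNEM_* variables to keys are assumptions, not mnem's documented behaviour.

```python
def resolve_config(defaults, user_cfg, repo_cfg, env, prefix="MNEM_"):
    """Merge flat config layers; later layers override earlier ones."""
    merged = dict(defaults)
    merged.update(user_cfg)  # ~/.mnem/config.toml (lowest of the three sources)
    merged.update(repo_cfg)  # <repo>/.mnem/config.toml
    for name, value in env.items():
        if name.startswith(prefix):
            # Hypothetical mapping: MNEM_EMBED_PROVIDER -> embed_provider
            merged[name[len(prefix):].lower()] = value
    return merged
```

Passing os.environ as env gives the per-process override behaviour; unrelated variables are ignored.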

Defaults

# .mnem/config.toml
[embed]
provider = "onnx"
model = "all-MiniLM-L6-v2"

[store]
backend = "redb"        # "redb" | "in-memory"

[retrieve]
top_k = 10
vector_cap = 256

Common environment overrides

| Variable | Effect |
|---|---|
| MNEM_EMBED_PROVIDER | onnx / ollama / openai / mock |
| MNEM_EMBED_MODEL | model name (e.g. all-MiniLM-L6-v2) |
| MNEM_EMBED_BASE_URL | for ollama / openai providers |
| MNEM_EMBED_API_KEY_ENV | name of env var holding the API key |
| MNEM_ORT_INTRA_THREADS | pin ONNX runtime thread count (bench harness) |
| MNEM_BENCH | enable bench-only label scoping |
| MNEM_HTTP_ALLOW_NON_LOOPBACK | allow mnem http to bind 0.0.0.0 (Docker) |

Provider switching

Embedder, sparse encoder, reranker, and LLM are all configured via provider:model strings - no code change to switch from local ONNX to hosted Cohere.

[embed]
provider = "cohere"
model = "embed-english-v3.0"
api_key_env = "COHERE_API_KEY"

See Embedding providers for the full provider matrix.

Methodology

Every published number ships with the harness, the dataset hash, and the raw artifacts. If you cannot reproduce a number, that is a bug.

Dataset matrix

| Dataset | Version | n queries | Source |
|---|---|---|---|
| LongMemEval | longmemeval_s_cleaned.json | 500 | xiaowu0162/longmemeval-cleaned |
| LoCoMo | locomo10.json | 1986 (session-level) | snap-research/LoCoMo |
| ConvoMem | 5 cat × 50 items (250) | 250 | Salesforce/ConvoMem |
| MemBench simple/roles | 100 items | 100 | import-myself/Membench |
| MemBench highlevel/movie | 100 items | 100 | import-myself/Membench |

Embedder

ONNX MiniLM-L6-v2 (sentence-transformers/all-MiniLM-L6-v2 via Xenova/all-MiniLM-L6-v2), bundled in-process via the onnx-bundled feature. No network calls, no API keys, no per-call model load.

Hardware

Pinned 4 cores per lane (cpuset 0-3 / 4-7 / 8-11 / 12-15), MNEM_ORT_INTRA_THREADS=4, mem cap 3 GiB per lane. Bench host is documented per run in benchmarks/results/.

Scoring

| Metric | Definition |
|---|---|
| R@K | hit if any gold item is in top-K retrieved |
| avg recall | mean per-item recall (ConvoMem) |
| Hybrid v4 | dense + sparse score boost (mirrors MP harness helper) |
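
The first two metrics can be written down in a few lines. This is a sketch of the definitions as stated, not the code in benchmarks/harness/; reading "per-item recall" as the fraction of gold ids found in the top-K is an assumption.

```python
def hit_at_k(retrieved, gold, k):
    """R@K for one query: 1.0 if any gold id is in the top-k, else 0.0."""
    return 1.0 if set(retrieved[:k]) & set(gold) else 0.0


def recall_at_k(retrieved, gold, k):
    """Per-item recall: fraction of gold ids found in the top-k."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(set(retrieved[:k]) & gold) / len(gold)


def mean(values):
    """Arithmetic mean; 0.0 for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else 0.0
```

The reported R@K for a benchmark is then mean(hit_at_k(r, g, k) for each query), and the ConvoMem figure is the mean of recall_at_k over items.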

Apples-to-apples pledge

  • Same dataset version, same query count.
  • Same scoring code (benchmarks/harness/).
  • No secret post-filters, no LLM rerank in the headline numbers.
  • Latency reported alongside recall, not separately.

Reproduce in 1 command

bash benchmarks/harness/run_bench.sh

See Reproduce for the full step-by-step.

Reproduce

End-to-end recipe to regenerate the 0.1.0 benchmark numbers locally.

Prerequisites

  • Docker 24+ (or podman with compose plugin)
  • 16 cores recommended, 8 cores minimum
  • 16 GiB RAM
  • Datasets downloaded:
bash benchmarks/harness/download-datasets.sh

One-shot run

bash benchmarks/harness/run_bench.sh

Wall ETA: 30-50 min on a 16-core box. Output: benchmarks/results/<UTC-stamp>/.

What happens

  1. Build Docker image (release, FEATURES=onnx-bundled).
  2. Bring up 4 lanes with cpuset pinning + thread caps.
  3. Run 6 benches (LongMemEval, LoCoMo, ConvoMem, MemBench × 2, Hybrid v4) sequentially across the lanes via a token-bucket dispatcher.
  4. Render RESULTS.md from per-bench JSONs.

Per-bench manual run

docker compose -f benchmarks/harness/compose.yml up -d mnem-bench-1

python benchmarks/harness/adapters/longmemeval_session.py \
    --dataset benchmarks/datasets/longmemeval/longmemeval_s_cleaned.json \
    mnem http serve --bind 127.0.0.1:9876 \
    --limit 500 --top-k 10 \
    --out benchmarks/results/longmemeval-500q.json

docker compose -f benchmarks/harness/compose.yml down

Verify against shipped numbers

python benchmarks/harness/comparison_table.py \
    --results benchmarks/results/<UTC-stamp> \
    --out /tmp/RESULTS.md
diff /tmp/RESULTS.md benchmarks/results/RESULTS.md

If your numbers diverge by more than ±0.01 on recall, open an issue with the host spec and the bench logs.
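
The ±0.01 rule is easy to check programmatically once both sets of numbers are loaded. A minimal sketch, assuming each result set has been reduced to a flat mapping of metric name to recall; the metric names in the usage note are hypothetical, and the layout of the result JSON files is not assumed.

```python
def diverging(shipped, local, tol=0.01):
    """Return metrics whose local value differs from shipped by more than tol."""
    out = {}
    for name, expected in shipped.items():
        actual = local.get(name)
        # A small epsilon keeps float rounding from flagging exact-tolerance values.
        if actual is None or abs(actual - expected) > tol + 1e-12:
            out[name] = (expected, actual)
    return out
```

For example, diverging({"longmemeval_r5": 0.966}, {"longmemeval_r5": 0.970}) returns an empty dict, so that run is within tolerance.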

Run benchmarks locally with mnem bench

mnem bench is the 0.1.0 first-class entrypoint for running mnem against published memory benchmarks. It replaces the legacy bash benchmarks/harness/run_bench.sh flow as the default; the Bash harness stays around for reproducing the headline numbers from the project README until 0.2.0 wires the same set of embedders into mnem bench.

Quickstart

# 1. Interactive setup wizard (lists every bench; toggles unshipped
#    options behind [0.2.0] tags so you see what is on the roadmap).
mnem bench

# 2. CI-friendly explicit form.
mnem bench run \
    --benches longmemeval,locomo \
    --with mnem \
    --mode cpu-local \
    --top-k 10 \
    --out ./bench-out \
    --non-interactive

# 3. Cache datasets without running anything (network step isolated
#    so you can pre-warm a CI image).
mnem bench fetch longmemeval         # ~264 MB from HuggingFace
mnem bench fetch locomo              # ~3 MB from snap-research/LoCoMo
mnem bench fetch                     # fetch every shipped bench in one go

# 4. Re-render RESULTS.md from a previous run directory.
mnem bench results ./bench-out

Output layout:

bench-out/
  RESULTS.md             markdown table, one row per (bench, adapter)
  timing.log             per-bench wall-time breakdown
  longmemeval.json       summary
  longmemeval.jsonl      per-question rows
  locomo.json
  locomo.jsonl
  logs/<bench>.log

What ships in 0.1.0

| Component | Status | Notes |
|---|---|---|
| LongMemEval (per-session) | shipped | R@5 / R@10 over LmeQs:<qid> per-question repos. |
| LoCoMo (session granularity) | shipped | MAX-aggregate dialog scores up to session keys. |
| mnem cpu-local adapter | shipped | In-process Repo::open_in_memory + bag-of-tokens. |
| ConvoMem | 0.2.0 | TUI lists; runtime prints “coming 0.2.0” and skips. |
| MemBench (simple-roles) | 0.2.0 | Same. |
| MemBench (highlevel-movie) | 0.2.0 | Same. |
| LongMemEval-hybrid-v4 | 0.2.0 | MemPalace v4 hybrid post-filter port. |
| mem0 adapter | 0.2.0 | Same. |
| MempalaceAdapter | 0.2.0 | Same. |
| CPU parallel mode | 0.2.0 | Falls back to cpu-local with a stderr note. |
| Docker compose mode | 0.2.0 | Same. |
| ONNX MiniLM / Ollama / OpenAI embedders | 0.2.0 | Falls back to bag-of-tokens with a note. |

The bag-of-tokens embedder ships built into mnem-bench. It is deterministic, network-free, and good enough to deliver recall@5 > 0 on the smoke test. It is NOT the embedder we use for the headline R@5 numbers in the project README - those still come from the legacy Bash harness driving Ollama / ONNX MiniLM / OpenAI. 0.2.0 swaps mnem-bench onto the same provider stack so the two harnesses produce identical numbers.
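
For intuition, a bag-of-tokens embedder can be as small as the sketch below. This is not the mnem-bench implementation: the tokeniser, the 256-bucket dimension, and the BLAKE2b bucket hash are all assumptions chosen to show why such an embedder is deterministic and network-free.

```python
import hashlib
import math
import re


def embed(text, dim=256):
    """Hash each lowercase token into one of `dim` buckets, then L2-normalise."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        vec[int.from_bytes(digest, "big") % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity of two already-normalised vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))
```

Texts that share tokens score higher than texts that do not, which is enough for a non-zero recall canary but carries no semantic signal.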

Pre-flight smoke test

cargo run --example smoke -p mnem-bench

Runs a 5-question LongMemEval canary and exits non-zero if recall@5 == 0. Used as the gate for releases of mnem-bench and mnem-cli.

See also

  • benchmarks/README.md for the legacy Bash harness (still the source of the published headline numbers; sunset after 0.2.0 ports the embedder stack).

Results

mnem vs MemPalace published numbers. Dense retrieval (vector + top-k); hybrid-v4 row mirrors MemPalace’s harness helper. No LLM rerank.

ONNX MiniLM-L6-v2 (bundled, in-process). 4 cores per lane.

| Benchmark | Split | Metric | MP | mnem | Δ vs MP | Latency (ms) |
|---|---|---|---|---|---|---|
| LongMemEval | 500 Q (full) | R@5 session | 0.966 | 0.966 | ±0 | 711 (retr) |
| LongMemEval | 500 Q (full) | R@10 session | 0.982 | 0.982 | ±0 | 711 (retr) |
| LoCoMo | 1986 Q (full) | R@5 session | 0.508 | 0.726 | +0.218 | 333 (retr) |
| LoCoMo | 1986 Q (full) | R@10 session | 0.603 | 0.855 | +0.252 | 333 (retr) |
| ConvoMem | 5 cat × 50 items (250) | avg recall | 0.929 | 0.976 | +0.047 | 398 (retr) |
| MemBench | simple/roles, 100 items | R@5 | 0.840 | 0.960 | +0.120 | 1874 (e2e) |
| MemBench | highlevel/movie, 100 items | R@5 | 0.950 | 1.000 | +0.050 | 491 (e2e) |
| LongMemEval | 500 Q, Hybrid v4 | R@5 session | 0.982 | 0.976 | -0.006 | 729 (retr) |

(retr) = retrieve-only mean (from summary timing). (e2e) = end-to-end mean (runtime / n) when adapter doesn’t expose phase timing.

Headlines

  • Matches MemPalace exactly on LongMemEval (0.966 / 0.982).
  • Beats by +0.218 / +0.252 on LoCoMo session-level retrieval.
  • Beats by +0.047 on ConvoMem.
  • Beats by +0.120 / +0.050 on MemBench tasks.
  • Trails by 0.006 on Hybrid v4 (no LLM rerank).

Raw artifacts

Per-bench JSON + JSONL in benchmarks/results/v0.1.0/. Each artifact carries the question, the gold set, the retrieved top-K, and per-item recall.

Reproduce

See Reproduce. One command:

bash benchmarks/harness/run_bench.sh

Ingest pipeline

mnem ingest is the only path content takes into the graph. The pipeline:

parse -> chunk -> extract -> embed -> commit

Sources

  • file path (mnem ingest README.md)
  • glob (mnem ingest 'docs/**/*.md')
  • stdin (cat data.txt | mnem ingest -)
  • structured JSON (mnem ingest data.json --json)

Chunking

Default: ~1k-token chunks with sentence-boundary alignment. Override via config:

[ingest]
chunk_size_tokens = 512
chunk_overlap_tokens = 50

Document-aware chunkers exist for code (Tree-sitter) and for Markdown (heading-aware). Auto-detected by file extension.
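
The two config knobs describe a sliding window over tokens. The sketch below shows that windowing on an already-tokenised list; it does not model sentence-boundary alignment or the document-aware chunkers, and the function name is illustrative.

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        # Stop once a window reaches the end, so no trailing fragment is emitted.
        if start + size >= len(tokens):
            break
    return chunks
```

As a rough approximation, chunk_tokens(text.split()) treats whitespace-separated words as tokens; a real tokenizer would count differently.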

Extractors

Optional ingest-time enrichment:

| Extractor | What it does |
|---|---|
| none (default) | raw text only |
| keybert | KeyBERT keyphrase extraction; phrases stored in node metadata |

Enable via flag:

mnem ingest README.md --extractor keybert

Labels

Pass --label <str> to scope the ingested nodes:

mnem ingest user-42-chat.json --label user-42 --json

Subsequent retrieve calls with --label user-42 will see only this scope.

Idempotency

Ingesting the same content twice produces the same CID; the second commit is a no-op (parent points at the same tree). Edit-and-reingest produces a new CID and a child commit.

Embedding providers

mnem decouples embedder from store. Switch providers without re-ingesting.

Built-in providers

| Provider | Model | Network? | Notes |
|---|---|---|---|
| onnx | all-MiniLM-L6-v2 (bundled) | no | default; in-process; fastest cold-start |
| ollama | any pulled model | local HTTP | e.g. bge-large, nomic-embed-text |
| openai | text-embedding-3-small/-large | yes | needs OPENAI_API_KEY |
| cohere | embed-english-v3.0 | yes | needs COHERE_API_KEY |
| voyage | voyage-3 | yes | needs VOYAGE_API_KEY |
| mock | deterministic blake3 | no | tests / smoke |

Switching

Edit <repo>/.mnem/config.toml:

[embed]
provider = "ollama"
model = "bge-large"
base_url = "http://127.0.0.1:11434"

Or override per-process:

MNEM_EMBED_PROVIDER=ollama MNEM_EMBED_MODEL=bge-large mnem retrieve "..."

After switching, run mnem reindex to regenerate the per-commit embedding sidecar. Node CIDs are unchanged (they don’t carry embeddings); only the sidecar changes.

Sidecar layout

.mnem/
  store.redb              # nodes + commits
  sidecars/
    <embedder-id>/        # one dir per (provider, model) pair
      <commit-cid>.bin    # embedding bucket for that commit

Multiple sidecars co-exist. retrieve picks the sidecar matching the active embedder; if missing, it builds on-demand.
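
The lookup described above can be sketched as a path computation plus an existence check. The directory layout follows the tree shown in this section; how mnem actually derives <embedder-id> from the (provider, model) pair is not documented here, so embedder_id below is a hypothetical naming scheme.

```python
from pathlib import Path


def embedder_id(provider, model):
    """Hypothetical directory name for a (provider, model) pair."""
    return f"{provider}-{model}".replace("/", "_")


def sidecar_path(repo, provider, model, commit_cid):
    """Path of the embedding bucket for one commit under one embedder."""
    return Path(repo) / ".mnem" / "sidecars" / embedder_id(provider, model) / f"{commit_cid}.bin"


def needs_build(repo, provider, model, commit_cid):
    """True when no sidecar exists yet for the active embedder at this commit."""
    return not sidecar_path(repo, provider, model, commit_cid).exists()
```

Because the path depends on both the embedder and the commit, switching providers never overwrites an existing sidecar.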

Adding a provider

Implement the Embedder trait in mnem-embed-providers/src/<your>.rs, gate behind a feature flag, register in the provider registry. See for the contract.

Comparisons

How mnem stacks up against other agent-memory and knowledge-graph systems. Each comparison is honest: where they win, where mnem wins, when to pick which.

mnem is open source (Apache-2.0). Numbers come from public artefacts; where a competitor’s claim is closed-source we say so. Where a benchmark is not directly comparable, we say so rather than fabricate a single-number league table.

| Competitor | License | Server / Embedded | LLM at ingest | Bitemporal | Stars | Compare |
|---|---|---|---|---|---|---|
| Graphiti (getzep/graphiti) | Apache-2.0 | server (Neo4j / Kuzu / FalkorDB / Neptune) | mandatory | yes | 25,409 | graphiti.md |
| mem0 (mem0ai/mem0) | Apache-2.0 | library + cloud | default-on (opt-out) | no | 54,113 | mem0.md |
| MemPalace (MemPalace/mempalace) | MIT | embedded (Python + ChromaDB) | no | partial | 49,768 | mempalace.md |
| Supermemory (supermemoryai/supermemory) | MIT (repo) / closed (cloud) | hosted cloud | yes | no | 22,218 | supermemory.md |
| Cognee (topoteretes/cognee) | Apache-2.0 | library + cloud | yes (cognify) | no | 16,807 | cognee.md |
| Letta (letta-ai/letta) | Apache-2.0 | server + CLI | yes (agent is the writer) | partial | 22,305 | letta.md |
| graphify (safishamsi/graphify) | MIT | one-shot CLI | yes (Claude subagents) | no | 35,262 | graphify.md |
| mnem | Apache-2.0 | embedded + four surfaces | no | no | small / pre-launch | (this repo) |

Star counts pulled from the GitHub API on 2026-04-26. License columns reflect the repository SPDX identifier; commercial / hosted layers above some of these projects ship under different terms.

mnem positioning

mnem is the substrate underneath the products in the table: a content-addressed, versioned, hybrid-retrieval graph that runs in-process, ingests without an LLM, and exposes token-budget telemetry on every retrieve. We are not building a memory product; we are building the thing the next memory product is built on.

Reading order

If you have read about agent memory before, the most useful first read is one of:

  • mnem vs Graphiti if you have been thinking about bitemporal knowledge graphs.
  • mnem vs mem0 if you have been using the LangChain / LlamaIndex / CrewAI defaults.
  • mnem vs MemPalace if you care about no-LLM-on-write retrieval and reproducible benchmarks.
  • mnem vs Supermemory if you have been weighing the closed cloud vs self-host trade-off.
  • mnem vs Cognee if you have been looking at ECL-pipeline-shaped knowledge engines.
  • mnem vs Letta if you have been looking at the MemGPT lineage of agent platforms.
  • mnem vs graphify if you have been using one-shot folder-to-graph extractors.

mnem vs mem0

mem0: “Universal memory layer for AI Agents” (repo description, mem0ai/mem0)

mnem: a content-addressed, versioned graph substrate underneath the memory layer.

At a glance

| | mnem | mem0 |
|---|---|---|
| License | Apache-2.0 | Apache-2.0 |
| Stars | small / pre-launch | 54,113 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | library + optional managed Platform |
| LLM at ingest | no | yes by default (single-pass ADD-only since v3, Apr 2026); infer=False opt-out exists |
| Content-addressed | yes | no (UUIDs over a vector store) |
| Bitemporal | no | no (event log, not bitemporal) |
| WASM target | yes | no (Python + external vector DB) |
| MCP server | yes | yes (mem0 MCP exists) |
| Hybrid retrieval | yes (vector + sparse + graph + RRF) | yes (semantic + BM25 + entity matching, fused) since v3 |
| Token-budget retrieval metadata | yes | no |
| 3-way merge | yes | no (event log with add/update/delete) |
| Reproducible benchmarks in-repo | yes | partial (separate memory-benchmark repo) |

Feature comparison

| # | Dimension | mnem | mem0 | Source |
|---|---|---|---|---|
| 1 | Data model | open-schema content-addressed nodes + edges | rows in a vector store with {role, content} history; user_id / agent_id / run_id scoping | mem0 README “Basic Usage” + docs |
| 2 | Default ingest | parse + chunk + statistical extract | LLM (gpt-5-mini default) extracts atomic facts on every add | mem0 README “Basic Usage” sha bd9d27ff509f |
| 3 | LLM requirement | optional | required by default; infer=False opts out but loses the “magic” | mem0 v3 README “New Memory Algorithm” |
| 4 | Identity | BLAKE3 CID over DAG-CBOR | UUIDs over a vector row | mem0 docs |
| 5 | History | signed commit DAG, diff / log / branch / merge | history event log of add/update/delete records | mem0 SDK |
| 6 | Conflict resolution | 3-way merge over graph | “latest LLM extraction wins” before v3; v3 is ADD-only and accumulates | mem0 v3 release notes |
| 7 | Vector backends | redb default, pluggable via Blockstore | 20+ (Qdrant, Chroma, PGVector, Pinecone, Weaviate, etc.) | mem0 docs “Supported Vector Stores” |
| 8 | LLM providers | optional, 16 via mnem-llm-providers | 16+ (OpenAI, Anthropic, Gemini, Groq, Ollama, …) | mem0 docs “Supported LLMs” |
| 9 | Embedding model | bundled ONNX MiniLM-L6-v2 in-process | configurable; default OpenAI text-embedding-3-small | mem0 README |
| 10 | Retrieval lanes | dense (HNSW) + sparse (BM25/SPLADE) + graph + RRF | semantic + BM25 + entity match (v3) | mem0 v3 README |
| 11 | Token-budget metadata | first-class on every retrieve | not exposed | mnem CLI / HTTP API |
| 12 | Multi-tenancy | repo-per-tenant or scope by node label | hardcoded user_id / agent_id / run_id triple | mem0 SDK |
| 13 | Bindings | Rust + Python + HTTP + MCP + CLI | Python + TypeScript + REST + MCP | mem0 README badges |
| 14 | Cloud | none yet | “mem0 Platform”: Hobby free, Starter $19, Pro $249, Enterprise | mem0.ai pricing |
| 15 | Distribution | pre-launch | YC S24, ~2.6M monthly PyPI downloads | mem0 README badge |

Benchmarks (where comparable)

mem0 v3 (April 2026) reports on LoCoMo and LongMemEval as a full pipeline (LLM extract + retrieve + answer). mnem reports retrieval-only (R@K) under an identical embedder, no LLM in the loop.

We have a same-harness, same-embedder reproduction of mem0 with infer=False (LLM extraction off) so the comparison lands on the retrieval layer:

| Benchmark | Split | Metric | mem0 (infer=False, MiniLM) | mnem | Delta |
|---|---|---|---|---|---|
| LongMemEval | 500 Q | R@5 session | 0.946 | 0.966 | +0.020 |
| LongMemEval | 500 Q | R@10 session | 0.962 | 0.982 | +0.020 |
| LoCoMo | 1986 Q | R@5 session | 0.466 | 0.726 | +0.260 |
| LoCoMo | 1986 Q | R@10 session | 0.676 | 0.855 | +0.179 |

Adapter notes: infer=False, persistent Memory, per-item user_id scoping. See benchmarks/methodology.md.

mem0’s own v3 numbers (LoCoMo 91.6, LongMemEval 93.4) are full-pipeline end-to-end accuracy, not retrieval R@5; not directly comparable to the table above.

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM-L6-v2, embedded redb | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, same setup | 333 ms mean retrieve |
| mem0 | LongMemEval, v3 single-pass | 1.09 s p50 (mem0 README, Apr 2026) |
| mem0 | LoCoMo, v3 single-pass | 0.88 s p50 (mem0 README) |

mem0 v3 latency includes one LLM retrieval call per query; mnem’s numbers are pure retrieval. Different mechanisms, useful only as an order-of-magnitude check.

Architecture differences

mem0 is a Python (and TS) memory layer designed to drop into LLM apps. The default flow is: mem.add(messages, user_id=...) runs an LLM to extract atomic facts, embeds them into a configured vector store, and returns a UUID per memory. Retrieval (mem.search(...)) does semantic + keyword + entity matching, optionally with a reranker. Multi-tenancy is hardcoded as user_id / agent_id / run_id. mem0 Platform layers a managed cloud, dashboards, and SOC 2 / GDPR on top.

mnem is one layer below: a content-addressed, versioned graph substrate. There is no fixed conversation schema; you commit nodes and edges with whatever labels and properties you need. Identity is a CID over canonical DAG-CBOR + BLAKE3, so the same fact on two machines collapses to the same node. History is a signed commit DAG, not an event log, so old facts remain addressable after newer ones supersede them. The write path runs no LLM by default; ingest is statistical parse + chunk + key extract. Retrieval is 3-lane RRF (HNSW dense + sparse + graph) with token-budget telemetry on every response.

Where mem0 clearly wins

  • Distribution. ~2.6M monthly PyPI downloads, default memory in LangChain / LlamaIndex / CrewAI / Vercel AI SDK / LiveKit / Pipecat / AWS Bedrock. mem0 is the path of least resistance.
  • Backend breadth. 20+ vector stores, 16 LLMs, 10 embedders work out of the box.
  • Managed product. Hobby tier is free; Pro is $249/mo with dashboards, SOC 2, on-prem.
  • LLM-assisted ingest. mem.add("I met Alice in Berlin") auto-extracts {entity: Alice, city: Berlin} with no upstream modelling effort.
  • YC + commercial momentum. YC S24, $24M raised, weekly release cadence on v3.

Where mnem clearly wins

  • No LLM in the write path. Regulated, offline, or cost-sensitive workloads ingest deterministically. mem0 v3 reduced the LLM cost to one call per add but did not eliminate it.
  • Content-addressed CIDs. Globally stable identity; CID-citations stay reproducible. mem0’s UUIDs are per-instance random.
  • Versioned history with 3-way merge. Diff / log / branch / merge / signed commits. mem0 ships an event log, not a commit graph.
  • Embedded + single binary. ~40 MB Docker image, no external vector DB. Runs offline.
  • WASM target. mnem-core compiles to wasm32; mem0 cannot.
  • Retrieval-quality lead under identical-embedder conditions. +0.020 R@5 on LongMemEval, +0.260 R@5 on LoCoMo (same MiniLM weights, dense lane only).
  • Token-budget telemetry. tokens_used / dropped per retrieve.

When to pick mem0, when to pick mnem

Pick mem0 if: you want drop-in agent memory with the broadest LangChain / LlamaIndex / CrewAI footprint, you are happy paying an LLM call per add for “magic” extraction, or you want a managed cloud and dashboards today.

Pick mnem if: you want an embedded substrate with no LLM at ingest, you need content-addressing and a real commit graph, you care about reproducibility and audit, or you are shipping to the edge / WASM / offline.

mnem vs MemPalace

MemPalace: “The best-benchmarked open-source AI memory system. And it’s free.” (repo description, MemPalace/mempalace)

mnem: a content-addressed, versioned graph substrate that shares MemPalace’s no-LLM-on-write philosophy and pushes further on identity and history.

At a glance

| | mnem | MemPalace |
|---|---|---|
| License | Apache-2.0 | MIT |
| Stars | small / pre-launch | 49,768 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | embedded (Python + ChromaDB) |
| LLM at ingest | no | no (verbatim store) |
| Content-addressed | yes | no (ChromaDB row IDs) |
| Bitemporal | no | partial (valid_from / valid_to on KG entries) |
| WASM target | yes | no |
| MCP server | yes (18 tools) | yes (29 tools) |
| Hybrid retrieval | yes (vector + sparse + graph) | yes (semantic + hybrid v4 / v5 with keyword + temporal boost) |
| Token-budget retrieval metadata | yes | no |
| 3-way merge | yes | no |
| Reproducible benchmarks in-repo | yes | yes (per-question JSONL committed) |

Feature comparison

| # | Dimension | mnem | MemPalace | Source |
|---|---|---|---|---|
| 1 | Schema | open (any labels / properties) | fixed: wings, rooms, halls, drawers | MemPalace README “What it is” sha 6890948e092b |
| 2 | Storage | redb embedded | ChromaDB + SQLite | MemPalace README + mempalace/backends/base.py |
| 3 | Default embedder | bundled ONNX MiniLM-L6-v2 | ChromaDB default (MiniLM-L6-v2 implied) | MemPalace requirements |
| 4 | LLM at ingest | none | none | MemPalace README |
| 5 | LLM at retrieval | optional rerank | optional hybrid-v4 + LLM rerank tier | MemPalace Benchmarks table |
| 6 | Identity | content CID (BLAKE3 over DAG-CBOR) | ChromaDB row IDs | implementation |
| 7 | History | signed commit DAG | append-only with valid_from / valid_to | MemPalace KG section |
| 8 | Conflict resolution | 3-way merge | manual invalidate tool | MemPalace MCP tool list |
| 9 | Sparse lane | BM25 + SPLADE | hybrid-v4 keyword boost | MemPalace BENCHMARKS.md |
| 10 | Graph lane | first-class (label / prop / adjacency) | KG with timeline + cross-wing tunnels | MemPalace MCP tools |
| 11 | MCP surface | 18 tools | 29 tools | MemPalace README “MCP server” |
| 12 | Plugin scaffolds | mnem mcp + mnem integrate | .claude-plugin/, .codex-plugin/ in repo | MemPalace repo |
| 13 | Bindings | Rust + Python + TS + HTTP + CLI + MCP | Python + MCP | MemPalace README |
| 14 | Hosted product | none | none | n/a |
| 15 | Velocity | maturing 1.0 | 433 commits in first 12 days, 30 contributors (early 2026) | internal notes; verify on repo today |

Benchmarks (where comparable)

MemPalace publishes retrieval R@5 / R@10 numbers in the same family as mnem’s harness. We pulled their numbers from benchmarks/BENCHMARKS.md and ran ours on the same datasets and embedder weights:

| Benchmark | Split | Metric | MemPalace | mnem | Delta |
|---|---|---|---|---|---|
| LongMemEval | 500 Q | R@5 session, raw dense | 0.966 | 0.966 | 0 |
| LongMemEval | 500 Q | R@10 session, raw dense | 0.982 | 0.982 | 0 |
| LongMemEval | 500 Q hybrid-v4 | R@5 session | 0.982 | 0.976 | -0.006 |
| LoCoMo | 1986 Q | R@5 session, raw dense | 0.508 | 0.726 | +0.218 |
| LoCoMo | 1986 Q | R@10 session, raw dense | 0.603 | 0.855 | +0.252 |
| ConvoMem | 250 Q | Avg recall | 0.890 | 0.976 | +0.086 |
| MemBench | 100 Q (movie) | R@5 | 0.950 | 1.000 | +0.050 |

Method: identical MiniLM-L6-v2 ONNX weights, no reranker, no LLM, no lexical lane on the raw-dense rows. The LoCoMo gap comes from mnem’s adapter aggregating user-turn text per session before embedding; MemPalace’s adapter embeds at a finer grain. Mechanism, not magic.
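The granularity point can be made concrete with a small sketch. It is illustrative only, not the mnem or MemPalace harness: the `(session_id, role, text)` tuple shape, the function names, and the "any gold session in the top k counts as a hit" definition of session-level R@K are all assumptions.

```python
from collections import defaultdict


def aggregate_by_session(turns):
    """Join user-turn text into one document per session.

    `turns` is an iterable of (session_id, role, text) tuples
    (a hypothetical shape chosen for this sketch).
    """
    docs = defaultdict(list)
    for session_id, role, text in turns:
        if role == "user":  # assistant turns are left out of the embedded text
            docs[session_id].append(text)
    return {sid: " ".join(parts) for sid, parts in docs.items()}


def recall_at_k(ranked_sessions, gold_sessions, k):
    """Return 1.0 if any gold session is in the top-k ranking, else 0.0."""
    return 1.0 if any(s in gold_sessions for s in ranked_sessions[:k]) else 0.0


def mean_recall_at_k(results, k):
    """Average recall@k over (ranked_sessions, gold_sessions) pairs."""
    return sum(recall_at_k(r, g, k) for r, g in results) / len(results)
```

Embedding one aggregated document per session means the ranked list already contains sessions; a finer-grained adapter has to map each retrieved chunk back to its session before scoring.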

MemPalace’s hybrid-v4 numbers tune on dev splits; the held-out 98.4% they report is the honest figure to compare against.

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM ONNX | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, MiniLM ONNX | 333 ms mean retrieve |
| MemPalace | LongMemEval, raw dense | not headlined; ChromaDB-default latency |

MemPalace does not publish a single mean-latency number; their benchmark tables focus on accuracy.

Architecture differences

MemPalace stores conversation history verbatim in ChromaDB and indexes people / projects as wings, topics as rooms, flows as halls, content as drawers. The retrieval layer is pluggable behind mempalace/backends/base.py. A SQLite-backed knowledge graph adds valid_from / valid_to windows, an invalidate verb, and a timeline view. The MCP server exposes 29 tools including agent diaries and cross-wing tunnels. The product is opinionated: the palace metaphor is the user experience.

mnem ships no metaphor. Nodes and edges are open-schema; you commit whatever shape your application needs. Identity is a CID over canonical DAG-CBOR + BLAKE3, so identical content collapses to the same node across machines. History is a signed commit DAG with diff / log / branch / 3-way merge. Retrieval is 3-lane RRF (HNSW dense + BM25/SPLADE sparse + graph traversal) with first-class token-budget telemetry on every response. mnem-core is no-tokio / no-fs / no-net and compiles to WASM unchanged.
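Reciprocal rank fusion (RRF) over the three lanes can be sketched in a few lines. This is a minimal illustration, not mnem’s implementation: the function name `rrf_fuse`, the lane names, the constant `k=60` (the value commonly used in the RRF literature), and the per-lane weights are assumptions.

```python
def rrf_fuse(lanes, k=60, weights=None):
    """Fuse ranked lists with reciprocal rank fusion.

    `lanes` maps a lane name to a list of node ids, best first.
    `weights` optionally maps a lane name to a multiplier (default 1.0).
    Returns node ids sorted by fused score, ties broken by id.
    """
    weights = weights or {}
    scores = {}
    for lane, ranking in lanes.items():
        w = weights.get(lane, 1.0)
        for rank, node_id in enumerate(ranking, start=1):
            # Each lane contributes w / (k + rank) for every id it returns.
            scores[node_id] = scores.get(node_id, 0.0) + w / (k + rank)
    return sorted(scores, key=lambda n: (-scores[n], n))


lanes = {
    "dense": ["a", "b", "c"],   # e.g. HNSW neighbours
    "sparse": ["b", "a"],       # e.g. BM25 / SPLADE hits
    "graph": ["b", "d"],        # e.g. traversal results
}
print(rrf_fuse(lanes))
```

Because RRF only looks at ranks, the lanes do not need comparable score scales; a node that appears near the top of several lanes outranks one that tops a single lane.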

Where MemPalace clearly wins

  • Verbatim store with measured 96.6% R@5 on LongMemEval, no API key. Same as mnem on raw dense, and reproducible from their repo.
  • MCP breadth. 29 tools to mnem’s 18. Agent diaries and cross-wing tunnels are original ideas.
  • Plugin scaffolds in-repo. .claude-plugin/ and .codex-plugin/ lower install friction for Claude Code / Codex users.
  • Velocity and community. Hundreds of commits, dozens of contributors, rapid issue response.
  • Reproducibility culture. Per-question JSONL result files committed for every benchmark run.
  • Working temporal KG. valid_from / valid_to / invalidate / timeline shipped today.

Where mnem clearly wins

  • Open schema. No fixed wings/rooms/halls/drawers hierarchy. Use any labels and properties for any domain.
  • Content-addressed identity. Same fact = same CID across machines. Stable citations forever.
  • Real commit DAG. Branch, diff, 3-way merge, signed Ed25519 history. MemPalace stores facts and a timeline; mnem stores commits over a graph.
  • WASM target. Same retrieval logic in browsers, Workers, Lambda. MemPalace’s Python + ChromaDB stack has no WASM target.
  • Retrieval-quality lead on LoCoMo. +0.218 R@5 raw dense, same embedder.
  • Token-budget telemetry. tokens_used, candidates_seen, dropped returned on every retrieve.
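To show what token-budget telemetry can look like, here is a hedged sketch that packs ranked candidates under a budget and reports the three fields named above. Only the field names `tokens_used`, `candidates_seen`, and `dropped` come from this page; the function name, the greedy packing rule, the whitespace token count (a stand-in for a real tokenizer), and the exact meaning of `dropped` are assumptions.

```python
def pack_under_budget(candidates, budget):
    """Greedily keep ranked candidates that fit in a token budget.

    `candidates` is a ranked list of (node_id, text) pairs.
    Returns (kept_ids, telemetry).
    """
    kept = []
    tokens_used = 0
    dropped = 0
    for node_id, text in candidates:
        cost = len(text.split())  # assumption: whitespace tokens, not a real tokenizer
        if tokens_used + cost <= budget:
            kept.append(node_id)
            tokens_used += cost
        else:
            dropped += 1  # candidate was seen but did not fit
    telemetry = {
        "tokens_used": tokens_used,
        "candidates_seen": len(candidates),
        "dropped": dropped,
    }
    return kept, telemetry
```

A caller can use the telemetry to decide whether to raise the budget or tighten the query when `dropped` is high relative to `candidates_seen`.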

When to pick MemPalace, when to pick mnem

Pick MemPalace if: the wings / rooms / halls / drawers metaphor matches your domain, you want the largest MCP tool surface available, or you specifically want a Claude-Code-paired personal memory appliance with reproducible benchmark numbers today.

Pick mnem if: you want an open-schema substrate, you need content-addressing and a real commit DAG, you are shipping to multiple languages or to the edge / WASM, or you want token-budget telemetry as a first-class response field.

Sources

mnem vs Supermemory

Supermemory: “Memory engine and app that is extremely fast, scalable. The Memory API for the AI era.” (repo description, supermemoryai/supermemory)

mnem: an open-source, embedded, content-addressed knowledge-graph substrate. Self-host or nothing.

At a glance

| | mnem | Supermemory |
|---|---|---|
| License | Apache-2.0 | MIT (repo); cloud product is closed-core |
| Stars | small / pre-launch | 22,218 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | hosted cloud; self-host issue (#707) closed without resolution |
| LLM at ingest | no | yes (Extractors layer; entity / fact extraction) |
| Content-addressed | yes | no (custom vector graph engine, internals undisclosed) |
| Bitemporal | no | no |
| WASM target | yes | n/a (cloud only) |
| MCP server | yes | yes (https://mcp.supermemory.ai/mcp, OAuth + bearer) |
| Hybrid retrieval | yes | yes (multi-mode search across graph) |
| Token-budget retrieval metadata | yes | not exposed |
| 3-way merge | yes | no |
| Reproducible benchmarks in-repo | yes | self-reported; memorybench skill is a benchmarking harness vs supermemory |

Feature comparison

| # | Dimension | mnem | Supermemory | Source |
|---|---|---|---|---|
| 1 | Deployment | embedded; single binary | cloud only; self-host requested + closed (issue #707) | internal research |
| 2 | Storage | redb embedded | Postgres via Cloudflare Hyperdrive + Cloudflare AI vector embeddings + R2 + KV | internal research |
| 3 | Vector engine | HNSW via mnem-ann | undisclosed; “custom vector graph engine with ontology-aware edges” | internal research |
| 4 | Embedding model | bundled ONNX MiniLM-L6-v2; pluggable | undisclosed (Cloudflare AI) | internal research |
| 5 | Identity | content CID | undisclosed | n/a |
| 6 | Multi-tenancy | by repo or graph scope | containerTag and project scoping | internal research |
| 7 | Ingest pipeline | parse + chunk + statistical extract | five stacked layers: User Profiles, Memory Graph, Retrieval, Extractors, Connectors | internal research |
| 8 | LLM use | optional, opt-in | yes, in Extractors layer | internal research |
| 9 | Connectors | none yet | webhook-driven connectors live (Notion, GDrive, etc.) | supermemory.ai docs |
| 10 | Plugin / IDE ecosystem | MCP + mnem integrate | 12+ integration plugins, dedicated repos | internal research |
| 11 | API | local Rust / Python / HTTP / MCP / CLI | REST api.supermemory.ai/v3 + /v4, TS / Python SDKs | supermemory README |
| 12 | Pricing | self-host, free | tiered cloud (free / pro / team / enterprise) | supermemory.ai/pricing |
| 13 | Funding / brand | self-funded indie | $3M seed, ~$40M valuation, named angels (Jeff Dean, Dane Knecht, Logan Kilpatrick, …) | internal research |
| 14 | Founder reach | small | Dhravya Shah, ~51.5k X followers | internal research |
| 15 | Self-reported benchmarks | reproducible artefacts in-repo | “#1 on LongMemEval, LoCoMo, ConvoMem”; sub-300 ms recall at 85.4% accuracy | internal research |

Benchmarks (where comparable)

Not directly comparable in any apples-to-apples sense. Supermemory’s benchmark numbers are self-reported, the engine is closed, and the evaluation harness is bundled as the memorybench skill that points at supermemory by default. Their headline:

Supermemory: 85.2-85.4% on LongMemEval; sub-300 ms recall; “#1 on LongMemEval, LoCoMo, ConvoMem”.

mnem’s reproducible numbers under ONNX MiniLM-L6-v2, no LLM in the loop:

| Benchmark | Split | Metric | mnem |
|---|---|---|---|
| LongMemEval | 500 Q | R@5 session | 0.966 |
| LongMemEval | 500 Q | R@10 session | 0.982 |
| LoCoMo | 1986 Q | R@5 session | 0.726 |
| ConvoMem | 250 Q | Avg recall | 0.976 |

Putting 0.852 next to 0.966 looks favorable for mnem, but the metrics are not the same shape: Supermemory’s number is end-to-end QA accuracy; mnem’s is retrieval R@5 with no LLM. Both columns are honest; the column headers are not the same column.

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM ONNX | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, MiniLM ONNX | 333 ms mean retrieve |
| Supermemory | self-reported | sub-300 ms recall, “sub-400 ms at scale” |

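Supermemory’s figures come from a closed engine on an edge network with an undisclosed embedder, so the two sets of numbers were not measured on the same workload. Supermemory cloud is fast at the retrieve hop; mnem runs in your process, so its end-to-end time includes no network round-trip, which tends to favour self-hosted users.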

Architecture differences

Supermemory is a Cloudflare-native cloud product. The repo is MIT-licensed but the production engine is closed: a “custom vector graph engine with ontology-aware edges” sitting on top of Postgres (Hyperdrive), Cloudflare AI vector embeddings, R2 object storage, and KV. The product is five stacked layers behind one API: User Profiles, Memory Graph, Retrieval, Extractors, Connectors. MCP server is production today at mcp.supermemory.ai/mcp with OAuth or API-key auth. Connectors (Notion, GDrive, etc.) ship as live webhook integrations. The strength is GTM: $3M seed, named angels, 50,000+ self-reported users on the consumer app, integrations with Cluely, Composio, Scira AI.

mnem is the opposite: open-source Apache-2.0, embedded, single-binary, no cloud. The graph substrate is content-addressed (BLAKE3 CIDs over DAG-CBOR), versioned (signed commit DAG with 3-way merge), and runs in-process from a cargo install away. There is no managed offering; hosting is explicitly out of scope for 0.1.0. Where Supermemory wins on distribution and managed operations, mnem wins on substrate guarantees: identity, history, and deterministic retrieval that you can run offline.
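The idea behind content-addressed identity can be sketched with the Python standard library. This is a stand-in, not mnem’s encoding: mnem hashes canonical DAG-CBOR with BLAKE3, while the sketch below uses canonical JSON and `hashlib.blake2b` because neither DAG-CBOR nor BLAKE3 ships with Python. The function name `content_id` and the node shape are assumptions, and the resulting hex digests are not real CIDs.

```python
import hashlib
import json


def content_id(node):
    """Derive an identifier from a node's canonical bytes.

    Stand-ins: canonical JSON instead of DAG-CBOR, BLAKE2b instead of BLAKE3.
    """
    canonical = json.dumps(
        node, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.blake2b(canonical, digest_size=32).hexdigest()


a = content_id({"label": "fact", "text": "water boils at 100 C"})
b = content_id({"text": "water boils at 100 C", "label": "fact"})
print(a == b)  # key order does not matter once the encoding is canonical
```

The property that matters is that the identifier depends only on the canonical bytes, so two machines that commit the same content derive the same identifier without coordinating.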

Where Supermemory clearly wins

  • Hosted product with live connectors. Notion, GDrive, etc. work out of the box. mnem has none yet.
  • Distribution and brand. 22k stars, $3M seed, named angels (Jeff Dean, Dane Knecht, Logan Kilpatrick, David Cramer), founder reach ~51.5k X followers.
  • MCP-native cloud. Drop one URL into a client config and you have agent memory.
  • IDE plugin ecosystem. 12+ integration plugins live.
  • Cloudflare edge latency. Sub-300 ms recall claims are plausible given the Workers + Hyperdrive stack.

Where mnem clearly wins

  • Open-source substrate. Apache-2.0, no vendor lock-in. Self-host on a laptop or a Lambda. Supermemory’s self-host issue (#707) closed without a resolution; the cloud is structural.
  • No closed engine. mnem’s vector lane (HNSW), sparse lane (BM25 / SPLADE), graph lane, and RRF weights are all configurable and documented. Supermemory’s “custom vector graph engine” is a black box.
  • Content-addressed identity. Same fact = same CID across machines.
  • Real commit history. Diff, log, branch, 3-way merge, signed history. Supermemory offers a softer notion of “versioning”, not a commit DAG.
  • Privacy by default. Nothing leaves your machine unless you opt in.
  • Reproducible benchmarks. Numbers ship with a runnable harness; Supermemory’s are self-reported.
  • Token-budget retrieval metadata. First-class on every retrieve.

When to pick Supermemory, when to pick mnem

Pick Supermemory if: you want a managed memory API today with hosted connectors, you trust Cloudflare for storage and inference, you want OAuth-MCP plug-and-play for ChatGPT / Claude / Cursor, or distribution on hosted infrastructure beats substrate control for your use case.

Pick mnem if: you need self-host or air-gapped, you want an open substrate with documented internals, you need content-addressing and a commit DAG, or you are building a product on top of a memory layer rather than consuming one.

Sources

mnem vs Cognee

Cognee: “Knowledge Engine for AI Agent Memory in 6 lines of code” (repo description, topoteretes/cognee)

mnem: a content-addressed, versioned graph substrate that ingests without an LLM.

At a glance

| | mnem | Cognee |
|---|---|---|
| License | Apache-2.0 | Apache-2.0 |
| Stars | small / pre-launch | 16,807 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | library + Cognee Cloud |
| LLM at ingest | no | yes (remember calls add + cognify + improve) |
| Content-addressed | yes | no (extracted graph node IDs) |
| Bitemporal | no | no |
| WASM target | yes | no |
| MCP server | yes | yes (Cognee MCP exists; integrates with Claude Code, Hermes) |
| Hybrid retrieval | yes (vector + sparse + graph + RRF) | yes (auto-routing across graph + vector) |
| Token-budget retrieval metadata | yes | no |
| 3-way merge | yes | no |
| Reproducible benchmarks in-repo | yes | partial (research / paper claims; no in-repo harness) |

Feature comparison

| # | Dimension | mnem | Cognee | Source |
|---|---|---|---|---|
| 1 | Storage | redb embedded | Kuzu default + vector DB; pluggable | Cognee README “Deploy Cognee” sha f4964c31db04 |
| 2 | Default flow | commit -> CID; no LLM | remember runs add + cognify + improve (LLM extraction) | Cognee README Quickstart Step 3 |
| 3 | LLM requirement | optional | required to configure before Quickstart Step 2 | Cognee README “Step 2: Configure the LLM” |
| 4 | Identity | BLAKE3 CID over DAG-CBOR | extracted graph-node IDs (LLM-derived) | Cognee internals |
| 5 | History | signed commit DAG | none; standard graph state | Cognee docs |
| 6 | Conflict resolution | 3-way merge | re-run cognify to refresh graph | Cognee Quickstart |
| 7 | Vector lane | HNSW via mnem-ann | configurable vector store | Cognee docs |
| 8 | Sparse lane | BM25 + SPLADE | not headlined | Cognee README |
| 9 | Graph lane | first-class | first-class | Cognee README “About Cognee” |
| 10 | LLM providers | optional | OpenAI, Anthropic, Gemini, Ollama, others | Cognee docs |
| 11 | Session memory | open (model your own) | remember(..., session_id="...") first-class | Cognee Quickstart |
| 12 | Auto-routing retrieval | manual lane configuration | “picks best search strategy automatically” | Cognee README Step 3 |
| 13 | Bindings | Rust + Python + TS + HTTP + CLI + MCP | Python + CLI + MCP | Cognee README |
| 14 | Cloud | none yet | Cognee Cloud (managed) | Cognee README “Connect to Cognee Cloud” |
| 15 | Determinism | byte-identical CIDs same input | LLM extraction is non-deterministic | Cognee README Step 3 |

Benchmarks (where comparable)

Not directly comparable. Cognee publishes research and use-case narratives rather than retrieval R@K artefacts in the repo. Their strength is the ECL pipeline as a finished product (drop a PDF, get a typed knowledge graph), not retrieval-quality benchmarks at the substrate layer.

mnem’s measured retrieval numbers under ONNX MiniLM-L6-v2:

| Benchmark | Split | Metric | mnem |
|---|---|---|---|
| LongMemEval | 500 Q | R@5 session | 0.966 |
| LoCoMo | 1986 Q | R@5 session | 0.726 |
| ConvoMem | 250 Q | Avg recall | 0.976 |
| MemBench | 100 Q (movie) | R@5 | 1.000 |

If you want to compare like-for-like, run Cognee against the same LongMemEval / LoCoMo dumps with infer=False-equivalent (skip cognify’s LLM extraction). We have not published a Cognee adapter; contributions welcome.

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM ONNX | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, MiniLM ONNX | 333 ms mean retrieve |
| Cognee | not headlined; depends on LLM provider + vector store | n/a |

Cognee’s retrieval latency depends heavily on whether the auto-router calls a vector store, the graph DB, or LLM rerank. They do not publish a single number.

Architecture differences

Cognee is a Python knowledge-engine library and a managed cloud. The write path is the ECL pipeline: Extract -> Cognify -> Load. You hand it documents, Cognee runs an LLM to extract entities and relationships, embeds them, and writes them into a graph DB (Kuzu by default) plus a vector store. Retrieval auto-routes across vector and graph based on the query. The strength is “drop a PDF and get a typed knowledge graph” with minimal modelling effort.

mnem is the substrate beneath that pattern. There is no required LLM at ingest: parse + chunk + statistical extract -> CID -> commit. The graph shape is whatever your application commits, not whatever the LLM happened to extract that hour. Identity is content-addressed, so the same document on two machines collapses to the same nodes. History is a signed commit DAG with diff / 3-way merge. Retrieval is explicit 3-lane RRF with token-budget telemetry, not auto-routed.
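A commit with a parent chain can be sketched as a hashed record that names its parents. This is a toy model under stated assumptions, not mnem’s object format: the commit fields, the in-memory `store` dict, and the function names are hypothetical; canonical JSON and BLAKE2b stand in for DAG-CBOR and BLAKE3; signatures are omitted entirely.

```python
import hashlib
import json


def _object_id(obj):
    # Stand-in hashing: canonical JSON + BLAKE2b, not DAG-CBOR + BLAKE3.
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def commit(store, node_ids, parents, message):
    """Write a commit object into `store` and return its id."""
    body = {"nodes": sorted(node_ids), "parents": list(parents), "message": message}
    commit_id = _object_id(body)
    store[commit_id] = body
    return commit_id


def log(store, head):
    """Walk first-parent history from `head` back to the root commit."""
    messages = []
    while head is not None:
        body = store[head]
        messages.append(body["message"])
        head = body["parents"][0] if body["parents"] else None
    return messages


store = {}
c1 = commit(store, ["n1"], [], "init")
c2 = commit(store, ["n1", "n2"], [c1], "add n2")
print(log(store, c2))
```

Because a commit id covers its parent ids, replaying the same inputs in the same order reproduces the same history, which is the property a non-deterministic extraction step cannot offer.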

Where Cognee clearly wins

  • Drop-in ingest. Hand it a PDF, conversation, or URL; get a typed knowledge graph. mnem expects you to model what you want stored.
  • Auto-routing retrieval. The router picks vector vs graph vs hybrid for you. mnem makes you choose.
  • Multi-LLM-provider support. OpenAI, Anthropic, Gemini, Ollama, others work with minimal config.
  • Cloud + self-host parity. Cognee Cloud is managed; the OSS library works standalone.
  • Rich ontology derivation. LLM derives the ontology from the corpus rather than forcing one upfront.
  • Session memory primitive. session_id is first-class with a background sync to the long-term graph.

Where mnem clearly wins

  • No LLM in the write path. Deterministic, replayable, fuzz-tested. Cognee’s cognify step is non-deterministic by design.
  • Content-addressed identity. Same input -> same CIDs across machines. Cognee’s extracted node IDs are extraction-run-dependent.
  • Real commit DAG. Branch, diff, 3-way merge, signed Ed25519 history. Cognee has graph state, not a commit history.
  • Embedded, single binary. ~40 MB Docker image. No external graph DB or vector store to operate.
  • WASM target. mnem-core ships to wasm32 unchanged.
  • Token-budget retrieval metadata. First-class on every response.

When to pick Cognee, when to pick mnem

Pick Cognee if: you want PDF / URL / document -> typed knowledge graph in 6 lines, you are happy with an LLM at ingest, you want auto-routed retrieval, or you want a managed cloud option.

Pick mnem if: you need deterministic ingest, content-addressed identity, a real commit DAG, embedded / single-binary deployment, or WASM / edge targets.

Sources

mnem vs Letta

Letta: “Letta is the platform for building stateful agents: AI with advanced memory that can learn and self-improve over time.” (repo description, letta-ai/letta)

mnem: a content-addressed graph substrate that stores the memory an agent uses, without assuming the agent.

At a glance

| | mnem | Letta |
|---|---|---|
| License | Apache-2.0 | Apache-2.0 |
| Stars | small / pre-launch | 22,305 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | server (Letta API) + Letta Code CLI |
| LLM at ingest | no | yes; the agent is the writer |
| Content-addressed | yes | no (DB row IDs) |
| Bitemporal | no | partial (Letta tracks message timestamps) |
| WASM target | yes | no (Python server) |
| MCP server | yes | yes (Letta supports MCP integrations) |
| Hybrid retrieval | yes | recall + archival memory; not headlined as hybrid |
| Token-budget retrieval metadata | yes | not exposed |
| 3-way merge | yes | no |
| Reproducible benchmarks in-repo | yes | partial (Letta leaderboard external) |

Feature comparison

| # | Dimension | mnem | Letta | Source |
|---|---|---|---|---|
| 1 | Product shape | memory substrate | agent platform (agents + memory + tools + runtime) | Letta README sha bb52a8900a79 |
| 2 | Memory model | open graph of content-addressed nodes + edges | tiered: core blocks (in-context) + recall + archival | MemGPT paper arXiv:2310.08560 |
| 3 | Who writes memory | the application | the agent itself, via tool calls | Letta docs |
| 4 | LLM at ingest | none | yes; agent decides what to write, promote, evict | MemGPT paper |
| 5 | Identity | content CID | DB row IDs | Letta SDK |
| 6 | History | signed commit DAG | standard DB state with timestamps | Letta SDK |
| 7 | Conflict resolution | 3-way merge | agent-to-agent messaging | Letta docs |
| 8 | Scoping | open (any node label) | agent_id first-class | Letta API |
| 9 | Vector lane | HNSW via mnem-ann | recall / archival via configurable embedder | Letta docs |
| 10 | Sparse lane | BM25 + SPLADE | not first-class | Letta docs |
| 11 | Graph lane | first-class | not first-class | Letta docs |
| 12 | Bindings | Rust + Python + TS + HTTP + CLI + MCP | Python + REST + Letta Code CLI (Node 18+) | Letta README |
| 13 | Cloud | none yet | hosted Letta API + free dashboard | https://docs.letta.com |
| 14 | Model agnosticism | yes (provider-not-tactic) | “fully model-agnostic; recommends Opus 4.5 / GPT-5.2” | Letta README |
| 15 | Headline use-case | agent-memory substrate | “stateful agents that learn and self-improve” | Letta repo description |

Benchmarks (where comparable)

Letta publishes a model leaderboard at leaderboard.letta.com ranking LLMs on Letta’s agent benchmarks (multi-turn, tool-use, reasoning). This measures models inside the Letta agent, not retrieval quality of a memory layer. mnem’s benchmarks measure retrieval R@K over corpora, not agent task success.

The two systems are not directly comparable on a single number. Letta’s “how well does this LLM run my agent” answers a different question from mnem’s “how well does the substrate retrieve under a fixed embedder.”

mnem’s retrieval numbers under ONNX MiniLM-L6-v2:

| Benchmark | Split | Metric | mnem |
|---|---|---|---|
| LongMemEval | 500 Q | R@5 session | 0.966 |
| LoCoMo | 1986 Q | R@5 session | 0.726 |
| ConvoMem | 250 Q | Avg recall | 0.976 |

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM ONNX | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, MiniLM ONNX | 333 ms mean retrieve |
| Letta | varies wildly with model + tool-use depth | not headlined |

Letta’s user-perceived latency is dominated by the agent loop, not the memory tier. Different mechanism; not comparable.

Architecture differences

Letta is the platform descended from MemGPT. The headline pattern is tiered memory: core memory blocks held in the LLM’s context window, recall memory (recent conversation history) accessible via tool calls, and archival memory (long-term store) similarly accessed by tool. The agent itself decides what to promote and evict, using the LLM’s own reasoning. Letta ships as a Python framework, a hosted API, and a local CLI (letta via @letta-ai/letta-code). The product optimisation is “give an LLM persistent memory and let it manage the tiers.”

mnem is one layer below that. There is no agent in mnem. mnem is a graph substrate: content-addressed nodes and edges, signed commit history, 3-way merge, hybrid retrieval. If you wanted to build the MemGPT pattern on top of mnem, you would ship: core_blocks as a small ad-hoc graph, recall as an HNSW lane over recent commits, archival as the full graph traversal lane. mnem doesn’t impose any of that; it gives you the storage primitives and lets you choose the agent shape.

Where Letta clearly wins

  • The agent is in the box. Drop in Letta, you have an agent with memory and tool-use today. mnem requires you to bring your own agent / framework.
  • The MemGPT brand and lineage. Anyone reading the agent-memory literature has seen the paper. Letta is the canonical implementation.
  • Hosted API + leaderboard. Comparing models on Letta’s harness is one click.
  • Skills + subagents. Bundled patterns for advanced memory and continual learning.
  • Multi-agent reconciliation via messaging. Agent-to-agent conversations are first-class.

Where mnem clearly wins

  • No agent assumed. Letta’s memory belongs to a Letta agent; mnem’s memory belongs to your application. If you switch agent frameworks next year, the data stays.
  • No LLM in the write path. Letta writes to memory through the agent (LLM tool calls). mnem writes deterministically.
  • Content-addressed identity. Same fact = same CID across machines.
  • Real commit DAG. Diff, log, 3-way merge, signed history. Letta has DB state, not commits.
  • Structural multi-agent merge. Two agents working offline in the same scope reconcile by 3-way graph merge, not by chat messages.
  • WASM, embedded, single binary. Ship to the edge. Letta is a Python server.
  • Hybrid 3-lane retrieval with token-budget metadata. Explicit RRF over dense / sparse / graph; tokens_used per response.
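The structural merge mentioned in the list above follows the usual 3-way rule: compare each side with the common base and keep whichever side changed. The sketch below is a simplified illustration, not mnem’s merge: it treats a graph as a flat mapping from a node key to a content id, and the function name and conflict handling are assumptions.

```python
def three_way_merge(base, ours, theirs):
    """Merge two edited copies of `base`.

    Each argument maps a node key to a content id; a missing key means
    the node is absent (never added, or deleted).
    Returns (merged, conflicts).
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            winner = o          # both sides agree (or both deleted)
        elif o == b:
            winner = t          # only their side changed
        elif t == b:
            winner = o          # only our side changed
        else:
            conflicts.append(key)  # both sides changed differently
            continue
        if winner is not None:
            merged[key] = winner
    return merged, conflicts


base = {"x": "c1", "y": "c2", "z": "c3"}
ours = {"x": "c1b", "y": "c2", "z": "c3", "w": "c4"}   # changed x, added w
theirs = {"x": "c1", "z": "c3t"}                       # deleted y, changed z
print(three_way_merge(base, ours, theirs))
```

With content ids as values, "did this side change?" is a plain equality check, which is why content addressing and 3-way merge pair naturally.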

When to pick Letta, when to pick mnem

Pick Letta if: you want the MemGPT pattern in a box, you want a ready-made agent platform with skills and subagents, you want to use Letta’s leaderboard to pick a model, or you are building a single stateful agent rather than a multi-application substrate.

Pick mnem if: you want the memory layer separate from the agent, you need content-addressing and a commit DAG, you are running multiple agent frameworks against the same store, or you need embedded / edge / WASM deployment.

Sources

mnem vs graphify

graphify: “AI coding assistant skill (Claude Code, Codex, OpenCode, …). Turn any folder of code, docs, papers, images, or videos into a queryable knowledge graph.” (repo description, safishamsi/graphify)

mnem: a content-addressed graph substrate. graphify builds a graph from a folder; mnem is the graph.

At a glance

| | mnem | graphify |
|---|---|---|
| License | Apache-2.0 | MIT |
| Stars | small / pre-launch | 35,262 (GitHub API, 2026-04-26) |
| Embedded / Server | embedded | one-shot CLI; outputs static graph.json + HTML |
| LLM at ingest | no | yes (Claude subagents extract concepts + relationships) |
| Content-addressed | yes | no (NetworkX node IDs) |
| Bitemporal | no | no |
| WASM target | yes | no (Python + faster-whisper + Claude API) |
| MCP server | yes | no native MCP; integrates via AGENTS.md / hooks / Supermemory MCP |
| Hybrid retrieval | yes | partial (Leiden communities; no vector DB) |
| Token-budget retrieval metadata | yes | no |
| 3-way merge | yes | no (re-runs against SHA256 cache) |
| Reproducible benchmarks in-repo | yes | no (benchmarking pointer is the memorybench skill which targets supermemory) |

Feature comparison

| # | Dimension | mnem | graphify | Source |
|---|---|---|---|---|
| 1 | Product shape | runtime substrate (CLI / HTTP / MCP / Python / Rust) | one-shot CLI skill that emits a static graph artefact | graphify README “How it works” sha 770d7f54c40d |
| 2 | Inputs | text, code, conversations (anything you commit) | code, docs, papers, images, videos (multimodal) via tree-sitter + Whisper + Claude | graphify README |
| 3 | Ingest pipeline | parse + chunk + statistical extract -> commit | three passes: AST extract -> Whisper transcribe -> Claude subagents extract concepts | graphify README “How it works” |
| 4 | LLM requirement | optional | yes (Claude subagents are central) | graphify README |
| 5 | Identity | BLAKE3 CID over DAG-CBOR | NetworkX node IDs | graphify implementation |
| 6 | Output | live graph + retrieve API | static graph.html, GRAPH_REPORT.md, graph.json, cache/ | graphify README directory listing |
| 7 | Retrieval | 3-lane RRF (vector + sparse + graph) | graph-topology (Leiden communities) and /graphify query slash command | graphify README |
| 8 | Vector DB | HNSW via mnem-ann | none (graph-topology-based clustering, no embeddings) | graphify README “Clustering is graph-topology-based” |
| 9 | Re-ingest | commit append | SHA256 cache only re-processes changed files | graphify README directory listing |
| 10 | Tags on relations | edge labels | EXTRACTED / INFERRED / AMBIGUOUS confidence tags | graphify README |
| 11 | Always-on assistant integration | MCP + mnem integrate | platform-specific: Claude Code PreToolUse hook, Cursor alwaysApply rule, AGENTS.md | graphify README “Make your assistant always use the graph” |
| 12 | Supported AI clients | MCP (Claude Desktop, Cursor, Zed, etc.) | 15+ named installers (Claude Code, Codex, Cursor, Aider, Gemini, Copilot CLI, …) | graphify README install table |
| 13 | Versioning | signed commit DAG | none; re-run produces a fresh graph | graphify implementation |
| 14 | License | Apache-2.0 | MIT | repo metadata |
| 15 | Cloud | none yet | Supermemory API integration documented in README (“Build with Supermemory”) | graphify README |

Benchmarks (where comparable)

Not directly comparable. graphify produces a static knowledge graph artefact for an assistant to read, not a retrieval API benchmarked on LongMemEval / LoCoMo / etc. Their headline number is a token-efficiency claim (“71.5x fewer tokens per query vs reading the raw files”), measured against a different baseline than mnem’s R@K-on-public-corpora methodology.

graphify’s README points at the memorybench skill (npx skills add supermemoryai/memorybench) for benchmarking, but that harness is supermemory-tilted by default; running mnem through it would require an adapter we have not built.

mnem’s measured retrieval numbers under ONNX MiniLM-L6-v2:

| Benchmark | Split | Metric | mnem |
|---|---|---|---|
| LongMemEval | 500 Q | R@5 session | 0.966 |
| LoCoMo | 1986 Q | R@5 session | 0.726 |
| ConvoMem | 250 Q | Avg recall | 0.976 |

Latency (where measured)

| System | Setup | Latency |
|---|---|---|
| mnem | LongMemEval 500 Q, MiniLM ONNX | 711 ms mean retrieve |
| mnem | LoCoMo 1986 Q, MiniLM ONNX | 333 ms mean retrieve |
| graphify | retrieval is “open graph.json and traverse” by an LLM | not measured by them |

graphify’s user-perceived latency at query time is “LLM reads GRAPH_REPORT.md then traverses graph.json,” which is a different loop entirely.

Architecture differences

graphify is a one-shot CLI you point at a folder. It runs a deterministic AST pass (tree-sitter, 25 languages), transcribes audio and video with faster-whisper using a domain-aware prompt, then runs Claude subagents in parallel to extract concepts and relationships from docs, papers, images, and transcripts. Results are merged into a NetworkX graph, clustered with Leiden community detection (no embeddings; graph topology is the similarity signal), and exported as graph.html (interactive viewer), GRAPH_REPORT.md (plain-language audit), graph.json (queryable), and a SHA256 cache for incremental re-runs. Coding assistants integrate via platform-specific install commands that wire the graph into rules / hooks / AGENTS.md so the assistant always considers it before searching raw files.

mnem is a runtime substrate, not a one-shot extractor. You commit nodes and edges, then retrieve via a 3-lane fused query (HNSW dense + BM25 / SPLADE sparse + graph traversal) under explicit RRF weights and a token budget. There is no LLM in the write path. Identity is a content CID, history is a signed commit DAG, and the graph evolves by commit + diff + merge rather than by re-extraction. mnem ships embedded; the same Rust core compiles to WASM unchanged.

The two are complementary more than competitive: graphify is a great ingestor for the kind of corpora mnem stores (run graphify on a folder, ingest the resulting graph into mnem). They do not solve the same problem.

Where graphify clearly wins

  • Multimodal ingest in one command. Code, PDFs, markdown, screenshots, diagrams, whiteboard photos, video, audio, in 25 languages via tree-sitter. mnem ingests text and structured commits; you bring your own multimodal extractor.
  • Coding-assistant integration breadth. 15+ named platforms with install commands (Claude Code, Codex, OpenCode, Copilot CLI, VS Code Copilot Chat, Aider, Cursor, Gemini CLI, OpenClaw, Factory Droid, Trae, Hermes, Kiro, Antigravity).
  • PreToolUse hooks. Inject “the graph exists, read it first” into Claude Code / Codex / OpenCode tool flows automatically.
  • Static, portable artefact. graph.html opens in any browser; graph.json ships in a repo; auditors read GRAPH_REPORT.md.
  • Confidence tags on relations. EXTRACTED / INFERRED / AMBIGUOUS lets a reader filter what was found vs guessed.
  • No vector DB needed. Leiden over graph topology produces communities without embeddings.

Where mnem clearly wins

  • Runtime, not one-shot. mnem keeps serving as your corpus grows; graphify is a re-extract loop.
  • No LLM in the write path. graphify’s concept extraction is Claude-subagent-driven by design. mnem ingests deterministically.
  • Content-addressed identity + commit DAG. Stable identity across re-runs and machines; full diff / 3-way merge. graphify regenerates a fresh NetworkX graph.
  • Hybrid retrieval API. Vector + sparse + graph fused with token-budget metadata. graphify exposes traversal slash-commands but no retrieval API.
  • Embedded + WASM. Same retrieval logic in Rust, Python, TS, MCP, Workers, Lambda. graphify is a Python CLI.
  • License. Apache-2.0 vs MIT (both permissive; Apache-2.0 adds an explicit patent grant, which matters in some corporate review contexts).

When to pick graphify, when to pick mnem

Pick graphify if: you want a folder -> queryable knowledge graph in one command, you need multimodal extraction (video / audio / images), you want an always-on coding-assistant integration, or you need a static artefact you can ship in a repo.

Pick mnem if: you need a runtime memory substrate, you require deterministic ingest, you want content-addressed identity and a real commit DAG, you are building a product (not a personal coding assistant), or you need embedded / WASM deployment.

You can also pair them: graphify as the multimodal extractor, mnem as the runtime substrate that holds the resulting graph and serves queries.

Sources

Migrating to 0.1.0

0.1.0 is the first public release. There is no prior public version; this document exists for completeness and to establish the upgrade path for later releases.

Summary

If you are arriving at 0.1.0 fresh: skip this page. There is nothing to migrate.

What 0.1.0 establishes

| Surface | Stability guarantee |
|---|---|
| CLI subcommand names | stable; deprecation cycle for breaking changes |
| HTTP /v1/* routes | stable; new routes additive only |
| MCP tool names + schemas | stable |
| <repo>/.mnem/config.toml keys | stable |
| Object encoding (DAG-CBOR + CIDs) | stable; bump = new schema version |
| Internal Rust crate APIs | unstable pre-1.0 |

What may change between 0.x minor releases

  • New optional config keys (with defaults).
  • New retrieval lanes / scoring modes (off-by-default).
  • New CLI subcommands; existing subcommands keep semantics.
  • Sidecar formats (re-build required, node CIDs unchanged).

Upgrading from 0.x.y to 0.x.(y+1)

cargo install --locked mnem-cli      # or platform package manager
mnem --version
mnem doctor                          # config sanity check

No data migration is needed within 0.x. If a release changes a sidecar format, rebuild the sidecar; node CIDs are unchanged.

Future migrations

When 1.0 lands, this directory will gain a v0.x-to-v1.0.md doc covering any on-disk schema changes. Until then, treat 0.x as the pre-1.0 line where patch releases are non-breaking and minors may introduce additive changes.