The Long-Term Memory Layer for AI
Five addressable memory layers — Events, Episodes, Facts, Beliefs, Understanding — with hierarchical scopes and bi-temporal queries. 93.8% on LongMemEval-S. 53 integrations. Production-ready.
No email, no card. 7-day token, full v1 capabilities.
import { V1Client } from "cortexdbai/v1";

// Three lines: no email, no card, working token
const cortex = await V1Client.signup();
await cortex.experience(cortex.actor, {
  text: "Q3 revenue: $2.4M",
  observedAt: new Date().toISOString(),
  idempotencyKey: "q3-001",
});
const pack = await cortex.recall(cortex.actor, {
  query: "Q3 revenue?",
  diagnostics: "none",
});
Architecture
How it works
Captures are appended to an immutable WAL, then asynchronously promoted into derived layers (facts, beliefs, understanding). Recall pulls a stratified pack from all five layers in one round trip — backed by a Raft-replicated RocksDB cluster.
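The capture-then-promote flow can be pictured with a small in-memory sketch. This is not CortexDB's storage engine: `MiniLog`, `capture`, `promote`, and `replay` are hypothetical names, the "extraction" is a trivial `key: value` split standing in for real enrichment, and promotion is called by hand where a real system would run it in the background.

```python
from dataclasses import dataclass, field


@dataclass
class MiniLog:
    """In-memory stand-in for an append-only log plus one derived layer."""
    events: list = field(default_factory=list)  # raw captures, never rewritten
    facts: dict = field(default_factory=dict)   # derived layer, rebuildable
    promoted_upto: int = 0                      # offset of the next event to promote

    def capture(self, text: str) -> int:
        """Append the raw text unchanged and return its offset. No extraction here."""
        self.events.append(text)
        return len(self.events) - 1

    def promote(self) -> int:
        """Derive facts from events not yet processed; returns how many facts were derived."""
        derived = 0
        for text in self.events[self.promoted_upto:]:
            key, sep, value = text.partition(":")
            if sep:  # toy extractor: only "key: value" lines yield a fact
                self.facts[key.strip()] = value.strip()
                derived += 1
        self.promoted_upto = len(self.events)
        return derived

    def replay(self) -> dict:
        """Rebuild the derived layer from the raw events alone."""
        fresh = MiniLog(events=list(self.events))
        fresh.promote()
        return fresh.facts


log = MiniLog()
log.capture("Q3 revenue: $2.4M")
assert log.facts == {}   # the write path did no extraction
log.promote()            # derived layer catches up later
assert log.facts == {"Q3 revenue": "$2.4M"}
assert log.replay() == log.facts
```

Because the raw events are never modified, the derived layer can always be thrown away and rebuilt, which is the property the "reproducible by replay" claim relies on.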
Features
Built for production AI workloads
Everything your AI agents need for reliable, accurate long-term memory. No compromises.
Five Memory Layers
Events (lossless capture) → Episodes (sealed spans) → Facts (bi-temporal triples) → Beliefs (probabilistic claims with confidence) → Understanding (synthesized concepts). Read any layer separately or pull them all as a stratified pack.
Bi-temporal facts
Every derived record carries valid_from / valid_to and recorded_from / recorded_to. Answer 'what did the system believe about Acme on April 15 when looked at from May 1?' Single-axis systems can't.
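To make the two time axes concrete, here is a minimal sketch of a bi-temporal lookup over in-memory rows. The field names follow the `valid_*` / `recorded_*` columns named above; `FactVersion`, `as_of`, and the Acme plan data are hypothetical, and intervals are assumed to be half-open with `None` meaning "still open". Dates are ISO strings, which compare correctly as text when they share one format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FactVersion:
    subject: str
    value: str
    valid_from: str               # when the fact became true in the world
    valid_to: Optional[str]       # None = still true
    recorded_from: str            # when the system started believing this row
    recorded_to: Optional[str]    # None = still the current belief


def _covers(start: str, end: Optional[str], t: str) -> bool:
    """Half-open interval check: start <= t < end, with None as open-ended."""
    return start <= t and (end is None or t < end)


def as_of(rows, subject: str, valid_at: str, recorded_at: str) -> list:
    """Values for `subject` that held at `valid_at`, as the system knew them at `recorded_at`."""
    return [
        r.value
        for r in rows
        if r.subject == subject
        and _covers(r.valid_from, r.valid_to, valid_at)
        and _covers(r.recorded_from, r.recorded_to, recorded_at)
    ]


# On April 10 the system records that Acme is on a trial plan (since April 1).
# On April 20 it learns Acme actually upgraded on April 12, so the old row is
# closed on the recorded axis and two corrected rows are written.
rows = [
    FactVersion("acme", "trial", "2026-04-01", None, "2026-04-10", "2026-04-20"),
    FactVersion("acme", "trial", "2026-04-01", "2026-04-12", "2026-04-20", None),
    FactVersion("acme", "enterprise", "2026-04-12", None, "2026-04-20", None),
]

print(as_of(rows, "acme", valid_at="2026-04-15", recorded_at="2026-05-01"))  # ['enterprise']
print(as_of(rows, "acme", valid_at="2026-04-15", recorded_at="2026-04-18"))  # ['trial']
```

The two queries ask about the same day in the world (April 15) but get different answers depending on when the system is asked, which is the distinction a single time axis cannot express.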
Stratified Recall
POST /v1/recall returns events + episodes + facts + beliefs + understanding plus a narrative context block and full provenance. HNSW vector + Tantivy BM25 fused with RRF; optional Cohere rerank-v3.5 cross-encoder; every result carries citations.
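Reciprocal rank fusion (RRF) is the step that merges the vector and BM25 result lists. A minimal sketch follows, assuming each retriever returns an ordered list of ids; the function name `rrf_fuse`, the ids, and the constant `k=60` (a commonly used default) are illustrative assumptions, not CortexDB's actual implementation or settings.

```python
from collections import defaultdict


def rrf_fuse(rankings, k: int = 60) -> list:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per id, rank starting at 1."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


vector_hits = ["e7", "e2", "e9"]  # hypothetical HNSW result order
bm25_hits = ["e2", "e4", "e7"]    # hypothetical BM25 result order

print(rrf_fuse([vector_hits, bm25_hits]))  # ['e2', 'e7', 'e4', 'e9']
```

RRF uses only ranks, never raw scores, so cosine similarities and BM25 scores do not need to be normalized against each other before fusing. Items that both retrievers return (`e2`, `e7`) rise above items that only one of them found.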
Crash Durable
Captures land in an immutable WAL backed by RocksDB and are replicated across the cluster with Raft. The storage path is designed so that memories survive process crashes and node failures without data loss.
Hierarchical Scopes
org:acme/dept:eng/user:alice — one namespace primitive that subsumes tenants, namespaces, and workspaces. Recall walks up (holistic) or down (descend) the hierarchy. Every scope carries its own policy.
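The walk up and walk down over a scope path can be sketched with plain string handling. This assumes `/`-separated segments as in the example path above; `ancestors`, `is_within`, `descendants`, and the sample scope list are hypothetical helpers for illustration, not part of the SDK.

```python
def ancestors(scope: str) -> list:
    """The scope itself followed by each enclosing scope, most specific first (walking up)."""
    parts = scope.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def is_within(scope: str, root: str) -> bool:
    """True when `scope` is `root` or sits somewhere beneath it."""
    return scope == root or scope.startswith(root + "/")


def descendants(root: str, known_scopes) -> list:
    """Every known scope at or below `root` (walking down)."""
    return [s for s in known_scopes if is_within(s, root)]


known = [
    "org:acme",
    "org:acme/dept:eng",
    "org:acme/dept:eng/user:alice",
    "org:acme/dept:sales/user:bob",
    "org:acme-labs/user:carol",
]

print(ancestors("org:acme/dept:eng/user:alice"))
# ['org:acme/dept:eng/user:alice', 'org:acme/dept:eng', 'org:acme']
print(descendants("org:acme/dept:eng", known))
# ['org:acme/dept:eng', 'org:acme/dept:eng/user:alice']
```

Matching on `root + "/"` rather than a bare prefix keeps `org:acme-labs` from being treated as part of `org:acme`.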
16 Data Connectors
Slack, GitHub, Jira, Notion, Salesforce, and 11 more built-in. Ingest your team's knowledge automatically.
Architecture
Lossless vs. lossy
Other memory systems rewrite your data through an LLM before storing it. CortexDB preserves the original.
CortexDB
Lossless, event-sourced
Raw content stored as immutable events
Async enrichment extracts atomic facts + knowledge graph
4-channel retrieval + neural cross-encoder reranking
Adaptive query planner learns optimal strategy per query type
Others
Lossy, LLM-rewritten
Content rewritten by LLM before storage
LLM on critical write path — slow, unpredictable cost
Single-channel vector search (no reranking)
No irrelevance detection — always returns something
Comparison
How CortexDB is different
Architectural differences that matter at scale.
| Feature | CortexDB | Competition |
|---|---|---|
| Memory model | Five addressable layers (Events / Episodes / Facts / Beliefs / Understanding) | One flat memory bag |
| Data preservation | Lossless — Events are immutable; derived layers reproducible by replay | Lossy (LLM-rewritten on write) |
| Write path | No LLM on write path — async extraction → indexing → consolidation | LLM required on every write |
| Retrieval | Hybrid HNSW + Tantivy BM25 with RRF fusion + optional Cohere rerank-v3.5 | Vector-only |
| Time model | Bi-temporal: valid_* + recorded_* on every derived record | Single time axis |
| Forget | Selective forget + reference-counted GDPR erasure (preview → execute) | Hard delete (or none) |
| Auth | PASETO v4 public tokens + four-tier capability stack (every denial cites tier + capability) | Plaintext API keys |
| Namespacing | Hierarchical scope paths (org:acme/dept:eng/user:alice) | Flat namespaces |
| Crash durability | WAL + RocksDB + cluster Raft replication | Depends on vector DB |
| Benchmark (LongMemEval-S) | 93.8% (server-parity) | Mem0 93.4% · Zep / Graphiti 71.2% |
| Data connectors | 16 built-in (Slack, GitHub, Jira, Salesforce, …) | Few or none |
| Audit | SHA-256-chained, /v1/audit/verify for tamper checks | None / app-side logs |
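The SHA-256-chained audit log in the last row can be illustrated with a short hash-chain sketch. It shows the general technique only: the entry layout, the all-zero genesis value, and the helper names `entry_hash`, `append`, and `verify` are assumptions, and nothing here reflects the actual format checked by `/v1/audit/verify`.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry's predecessor


def entry_hash(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Add an entry whose hash commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list):
    """Return the index of the first inconsistent entry, or None if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return i
        prev = entry["hash"]
    return None


chain = []
append(chain, {"actor": "user:alice", "action": "experience"})
append(chain, {"actor": "user:alice", "action": "recall"})
append(chain, {"actor": "user:alice", "action": "forget"})
assert verify(chain) is None

chain[1]["payload"]["action"] = "nothing-to-see"  # tamper with a middle entry
assert verify(chain) == 1
```

Each entry's hash depends on its predecessor, so editing or dropping any record invalidates that entry and, unless every later hash is recomputed, everything after it.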
Benchmarks
Proven accuracy, not marketing claims
We evaluate against published academic benchmarks and share the results. No cherry-picked demos — real numbers on standardized tests.
LongMemEval-S
ICLR 2025 benchmark for long-term assistant memory. 500 hand-written questions across 6 memory-skill categories.
469 / 500 · server parity
Run on the production server (WAL + RocksDB + HNSW), not a research-only in-memory pipeline. Answerer: Claude Opus 4.6, at $49.69 per run. Run date: May 2, 2026.
LoCoMo
Long Conversation Memory benchmark. 1,540 QA pairs across multi-hop, temporal, open-domain and single-hop categories.
1,339 / 1,540 · cats 1–4
Cat 4 — Single-hop
770 / 841 questions
91.6%
Cat 2 — Temporal
282 / 321 questions
87.9%
Cat 1 — Multi-hop
225 / 282 questions
79.8%
Cat 3 — Open-domain
62 / 96 questions
64.6%
Mem0's self-reported April 2026 LoCoMo score is 91.6%. We trail them on this benchmark by ~5pp while leading on LongMemEval-S — that's the honest current state, and where the next round of recall-side work is focused. Stage `cats1234_typed_tokens_v1` judged 2026-05-03.
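The "~5pp" gap follows directly from the category counts listed above. A short worked calculation, using only the numbers on this page (the variable names are illustrative):

```python
# (correct, total) per LoCoMo category, as reported above
results = {
    "Cat 4 single-hop": (770, 841),
    "Cat 2 temporal": (282, 321),
    "Cat 1 multi-hop": (225, 282),
    "Cat 3 open-domain": (62, 96),
}

correct = sum(c for c, _ in results.values())  # 1339
total = sum(t for _, t in results.values())    # 1540
overall = 100 * correct / total                # about 86.9

for name, (c, t) in results.items():
    print(f"{name}: {100 * c / t:.1f}%")
print(f"overall: {overall:.1f}%")                        # overall: 86.9%
print(f"gap to Mem0's 91.6%: {91.6 - overall:.1f} pp")   # gap to Mem0's 91.6%: 4.7 pp
```

The overall score is a question-weighted total, so the large single-hop category dominates it; an unweighted mean of the four category percentages would come out lower.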
End-to-end answer latency
p50 recall 2.9s · p50 LLM generation 2.5s · p50 total 9.7s. Measured on the LongMemEval 93.8% run.
avg over 500 questions
Developer Experience
First-class SDKs for every stack
Get started in minutes with the Python or TypeScript SDK, or call the REST API directly.
from cortexdb.v1 import V1Client

client = V1Client(
    api_url="https://api-v1.cortexdb.ai",
    actor="user:alice",
    bearer="v4.public...",
)

# Capture an experience
client.experience(
    scope="org:acme/user:alice",
    text="Q3 revenue exceeded $2.4M, up 34% YoY",
    observed_at="2026-05-15T10:42:00Z",
    idempotency_key="q3-update-001",
)

# Recall a stratified pack
pack = client.recall(
    scope="org:acme/user:alice",
    view="holistic",
    query="What was Q3 revenue?",
    include=["beliefs", "facts", "episodes"],
)
print(pack["context_block"])
# => "Q3 revenue exceeded $2.4M, up 34% YoY [1]"
Works with your favorite IDE
Install the CortexDB MCP server and give any AI-powered IDE persistent long-term memory. One command, every conversation remembers.
# Install the MCP server (>=0.3.0 — v1 surface)
pip install 'cortexdb-mcp>=0.3.0'
# Add to Claude Code — anonymous auto-signup on first launch
claude mcp add cortexdb cortexdb-mcp
16 built-in tools
Store, search, forget, list events, walk knowledge-graph entities, and check usage — all from your IDE.
Memories live in the cloud
The MCP server is a lightweight bridge — memories never leave CortexDB. Only a small auth-state file (your token + actor + default scope) is cached locally so you don't sign up on every restart.
Works everywhere MCP does
Any client that speaks Model Context Protocol gets instant access to CortexDB's full memory system.
Connects to everything you use
Drop-in support for 21 agent frameworks, 16 data connectors, 6 orchestration tools, 6 LLM providers, and 4 no-code platforms.
Ready to give your AI agents perfect memory?
Get started in under 5 minutes. Free tier available — no credit card required.