Now available — Start building today

The Long-Term Memory Layer for AI

Five addressable memory layers — Events, Episodes, Facts, Beliefs, Understanding — with hierarchical scopes and bi-temporal queries. 93.8% on LongMemEval-S. 53 integrations. Production-ready.

No email, no card. 7-day token, full v1 capabilities.

quickstart.ts
typescript
import { V1Client } from "cortexdbai/v1";

// Three lines: no email, no card, working token
const cortex = await V1Client.signup();
await cortex.experience(cortex.actor, {
  text: "Q3 revenue: $2.4M",
  observedAt: new Date().toISOString(),
  idempotencyKey: "q3-001",
});
const pack = await cortex.recall(cortex.actor,
  { query: "Q3 revenue?", diagnostics: "none" });

93.8% on LongMemEval-S (server-parity, 469/500)
37+ integrations (frameworks & tools)
16 connectors (Slack, GitHub, Jira, and more)
6 IDE integrations (MCP server)

Architecture

How it works

Captures are appended to an immutable WAL, then asynchronously promoted into derived layers (facts, beliefs, understanding). Recall pulls a stratified pack from all five layers in one round trip — backed by a Raft-replicated RocksDB cluster.
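
The capture-then-derive flow above can be pictured with a toy model. This is a minimal sketch, not CortexDB's implementation: `AppendOnlyLog`, `Event`, and `derive_facts` are hypothetical names, the log lives in memory rather than in RocksDB, and the "key: value" extraction rule is an assumption made only to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One immutable captured record (hypothetical shape)."""
    seq: int
    text: str


class AppendOnlyLog:
    """Toy in-memory stand-in for a write-ahead log: append and read only."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, text: str) -> Event:
        event = Event(seq=len(self._events), text=text)
        self._events.append(event)
        return event

    def replay(self) -> tuple[Event, ...]:
        # Return an immutable snapshot so callers cannot mutate the log.
        return tuple(self._events)


def derive_facts(events: tuple[Event, ...]) -> dict[str, str]:
    """Hypothetical extractor: turn 'key: value' events into facts.

    Later events override earlier ones; the raw events are left untouched,
    so the derived view can always be rebuilt by replaying the log.
    """
    facts: dict[str, str] = {}
    for event in events:
        key, sep, value = event.text.partition(":")
        if sep:
            facts[key.strip()] = value.strip()
    return facts


log = AppendOnlyLog()
log.append("Q3 revenue: $2.4M")
log.append("Q3 revenue: $2.5M (restated)")  # illustrative data only
facts = derive_facts(log.replay())
```

Because the derived dictionary is computed only from the log, throwing it away and replaying yields the same result, which is the property the page describes as "reproducible by replay".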

Cognitive Recall Pipeline: Phase 1 Query → Phase 2 Planner → Phase 3 4-Channel Search → Phase 4 Reranking → Phase 5 Gate → Phase 6 KG Enrich → Recall (~737 ms latency)

Distributed Cluster (Raft consensus): healthy, RF 3; a leader plus Nodes 2 to 5, with horizontal scaling. The cluster handles data replication, Raft consensus, and token routing.

Features

Built for production AI workloads

Everything your AI agents need for reliable, accurate long-term memory. No compromises.

Five Memory Layers

Events (lossless capture) → Episodes (sealed spans) → Facts (bi-temporal triples) → Beliefs (probabilistic claims with confidence) → Understanding (synthesized concepts). Read any layer separately or pull them all as a stratified pack.

Bi-temporal facts

Every derived record carries valid_from / valid_to and recorded_from / recorded_to. Answer 'what did the system believe about Acme on April 15 when looked at from May 1?' Single-axis systems can't.
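
A bi-temporal lookup like the Acme question can be sketched in a few lines. The field names `valid_from`, `valid_to`, `recorded_from`, and `recorded_to` come from the text above; everything else (the `FactVersion` class, the `as_of` helper, the half-open interval convention, and the sample rows) is an assumption for illustration, not the CortexDB API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FactVersion:
    subject: str
    value: str
    valid_from: date
    valid_to: Optional[date]      # None means "still true in the world"
    recorded_from: date
    recorded_to: Optional[date]   # None means "still the system's current record"


def _covers(start: date, end: Optional[date], point: date) -> bool:
    """Half-open interval check (assumed convention): start <= point < end."""
    return start <= point and (end is None or point < end)


def as_of(rows, subject: str, valid_at: date, recorded_at: date):
    """Rows true at `valid_at`, as the system knew them at `recorded_at`."""
    return [
        r for r in rows
        if r.subject == subject
        and _covers(r.valid_from, r.valid_to, valid_at)
        and _covers(r.recorded_from, r.recorded_to, recorded_at)
    ]


# Illustrative history: on April 20 the system learns that Acme
# actually switched plans on April 10, and corrects its records.
rows = [
    FactVersion("Acme", "plan=starter", date(2026, 1, 1), None,
                date(2026, 1, 1), date(2026, 4, 20)),
    FactVersion("Acme", "plan=starter", date(2026, 1, 1), date(2026, 4, 10),
                date(2026, 4, 20), None),
    FactVersion("Acme", "plan=enterprise", date(2026, 4, 10), None,
                date(2026, 4, 20), None),
]

seen_from_may_1 = as_of(rows, "Acme", date(2026, 4, 15), date(2026, 5, 1))
seen_from_apr_16 = as_of(rows, "Acme", date(2026, 4, 15), date(2026, 4, 16))
```

The same valid-time question (April 15) gives different answers depending on the recorded-time vantage point, which is exactly what a single time axis cannot express.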

Stratified Recall

POST /v1/recall returns events + episodes + facts + beliefs + understanding plus a narrative context block and full provenance. HNSW vector + Tantivy BM25 fused with RRF; optional Cohere rerank-v3.5 cross-encoder; every result carries citations.
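
For readers unfamiliar with RRF (reciprocal rank fusion), here is a minimal sketch of how two ranked lists, such as a vector search and a BM25 search, can be merged. The constant `k = 60` is the commonly used default from the RRF literature; the value CortexDB uses is not stated here, and the function name and sample IDs are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank(d)).

    Ranks are 1-based. An ID missing from a list contributes nothing for
    that list. Ties are broken by ID so the output is deterministic.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["e7", "e2", "e9"]   # illustrative vector-search order
bm25_hits = ["e2", "e4", "e7"]     # illustrative BM25 order
fused = rrf_fuse([vector_hits, bm25_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

RRF only needs ranks, not raw scores, so it sidesteps the problem of cosine similarities and BM25 scores living on different scales.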

Crash Durable

Captures are appended to a write-ahead log on RocksDB and replicated across the cluster with Raft, so stored memories are built to survive process crashes and node failures.

Hierarchical Scopes

org:acme/dept:eng/user:alice — one namespace primitive that subsumes tenants, namespaces, and workspaces. Recall walks up (holistic) or down (descend) the hierarchy. Every scope carries its own policy.
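
The scope-path idea can be sketched with plain string handling. The path format `org:acme/dept:eng/user:alice` is taken from the text; the helper names below and the exact semantics of walking up versus down are assumptions for illustration, not the CortexDB API.

```python
def parse_scope(path: str) -> list[tuple[str, str]]:
    """Split 'org:acme/dept:eng' into [('org', 'acme'), ('dept', 'eng')]."""
    segments = []
    for segment in path.split("/"):
        kind, sep, name = segment.partition(":")
        if not sep or not kind or not name:
            raise ValueError(f"malformed scope segment: {segment!r}")
        segments.append((kind, name))
    return segments


def ancestors(path: str) -> list[str]:
    """Scopes visited when walking up: the path itself first, root last."""
    parts = path.split("/")
    return ["/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def is_descendant(candidate: str, root: str) -> bool:
    """True if `candidate` is `root` or lies beneath it in the hierarchy."""
    return candidate == root or candidate.startswith(root + "/")


alice = "org:acme/dept:eng/user:alice"
walk_up = ancestors(alice)
under_acme = is_descendant(alice, "org:acme")
```

Comparing on `root + "/"` rather than a bare prefix keeps `org:acme2` from being treated as a child of `org:acme`.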

16 Data Connectors

Slack, GitHub, Jira, Notion, Salesforce, and 11 more built-in. Ingest your team's knowledge automatically.

Architecture

Lossless vs. lossy

Other memory systems rewrite your data through an LLM before storing it. CortexDB preserves the original.

CortexDB

Lossless, event-sourced

Raw content stored as immutable events

Async enrichment extracts atomic facts + knowledge graph

4-channel retrieval + neural cross-encoder reranking

Adaptive query planner learns optimal strategy per query type

Others

Lossy, LLM-rewritten

Content rewritten by LLM before storage

LLM on critical write path — slow, unpredictable cost

Single-channel vector search (no reranking)

No irrelevance detection — always returns something

Comparison

How CortexDB is different

Architectural differences that matter at scale.

Feature | CortexDB | Competition
Memory model | Five addressable layers (Events / Episodes / Facts / Beliefs / Understanding) | One flat memory bag
Data preservation | Lossless: Events are immutable; derived layers reproducible by replay | Lossy (LLM-rewritten on write)
Write path | No LLM on write path; async extraction → indexing → consolidation | LLM required on every write
Retrieval | Hybrid HNSW + Tantivy BM25 with RRF fusion + optional Cohere rerank-v3.5 | Vector-only
Time model | Bi-temporal: valid_* + recorded_* on every derived record | Single time axis
Forget | Selective forget + reference-counted GDPR erasure (preview → execute) | Hard delete (or none)
Auth | PASETO v4 public tokens + four-tier capability stack (every denial cites tier + capability) | Plaintext API keys
Namespacing | Hierarchical scope paths (org:acme/dept:eng/user:alice) | Flat namespaces
Crash durability | WAL + RocksDB + cluster Raft replication | Depends on vector DB
Benchmark (LongMemEval-S) | 93.8% (server-parity) | Mem0 93.4% · Zep / Graphiti 71.2%
Data connectors | 16 built-in (Slack, GitHub, Jira, Salesforce, …) | Few or none
Audit | SHA-256-chained, /v1/audit/verify for tamper checks | None / app-side logs
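
To make the "SHA-256-chained" audit row concrete, here is a minimal sketch of a hash-chained log and its tamper check. It uses only Python's standard `hashlib` and `json`; the entry layout, the genesis value, and the function names are assumptions, and this is not a description of what `/v1/audit/verify` does internally.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], payload: dict) -> None:
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    })


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every link; any edited payload or broken link fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, {"actor": "user:alice", "action": "recall"})
append_entry(audit_log, {"actor": "user:alice", "action": "forget"})
intact = verify_chain(audit_log)

audit_log[0]["payload"]["action"] = "experience"  # simulate tampering
still_valid = verify_chain(audit_log)
```

Because each hash covers the previous one, changing any earlier entry invalidates every hash after it unless the whole tail is rewritten.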

Benchmarks

Proven accuracy, not marketing claims

We evaluate against published academic benchmarks and share the results. No cherry-picked demos — real numbers on standardized tests.

LongMemEval-S

ICLR 2025 benchmark for long-term assistant memory. 500 hand-written questions across 6 memory-skill categories.

93.8%

469 / 500 · server parity

Supermemory: 98.6%
AgentMemory: 96.2%
OMEGA: 95.4%
Mastra: 94.9%
CortexDB: 93.8%
Mem0: 93.4%
Chronos: 92.6%
Hindsight: 91.4%
Zep / Graphiti: 71.2%

Run on the production server (WAL + RocksDB + HNSW), not a research-only in-memory pipeline. Claude Opus 4.6 answerer, $49.69 / run. May 2, 2026.

LoCoMo

Long Conversation Memory benchmark. 1,540 QA pairs across multi-hop, temporal, open-domain and single-hop categories.

86.9% overall (1,339 / 1,540 · cats 1–4)

Cat 4, Single-hop: 91.6% (770 / 841 questions)

Cat 2, Temporal: 87.9% (282 / 321 questions)

Cat 1, Multi-hop: 79.8% (225 / 282 questions)

Cat 3, Open-domain: 64.6% (62 / 96 questions)

Mem0's self-reported April 2026 LoCoMo score is 91.6%. We trail them on this benchmark by about 5pp (86.9% vs. 91.6%) while leading them on LongMemEval-S (93.8% vs. 93.4%). That's the honest current state, and it is where the next round of recall-side work is focused. Stage `cats1234_typed_tokens_v1` judged 2026-05-03.
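
The LoCoMo percentages can be re-derived from the published counts. The short script below uses only the per-category numbers and the Mem0 figure quoted on this page; the variable names are ours.

```python
# (correct, total) per category, as listed in the LoCoMo breakdown
locomo = {
    "Cat 4 single-hop": (770, 841),
    "Cat 2 temporal": (282, 321),
    "Cat 1 multi-hop": (225, 282),
    "Cat 3 open-domain": (62, 96),
}

# Accuracy per category, in percent, rounded to one decimal place
per_category = {
    name: round(100 * correct / total, 1)
    for name, (correct, total) in locomo.items()
}

total_correct = sum(correct for correct, _ in locomo.values())
total_questions = sum(total for _, total in locomo.values())
overall = round(100 * total_correct / total_questions, 1)

mem0_self_reported = 91.6  # percent, figure quoted on this page
gap_pp = round(mem0_self_reported - overall, 1)
```

The category counts sum to 1,339 of 1,540, the overall score comes out at 86.9%, and the gap to Mem0's self-reported figure is 4.7 percentage points, which the page rounds to about 5pp.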

End-to-end answer latency

p50 recall 2.9s · p50 LLM generation 2.5s · p50 total 9.7s. Measured on the LongMemEval 93.8% run.

$0.12 / question

avg over 500 questions

Developer Experience

First-class SDKs for every stack

Get started in minutes with the Python SDK, the TypeScript SDK, or the REST API.

app.py
python
from cortexdb.v1 import V1Client

client = V1Client(
    api_url="https://api-v1.cortexdb.ai",
    actor="user:alice",
    bearer="v4.public...",
)

# Capture an experience
client.experience(
    scope="org:acme/user:alice",
    text="Q3 revenue exceeded $2.4M, up 34% YoY",
    observed_at="2026-05-15T10:42:00Z",
    idempotency_key="q3-update-001",
)

# Recall a stratified pack
pack = client.recall(
    scope="org:acme/user:alice",
    view="holistic",
    query="What was Q3 revenue?",
    include=["beliefs", "facts", "episodes"],
)

print(pack["context_block"])
# => "Q3 revenue exceeded $2.4M, up 34% YoY [1]"
MCP Server

Works with your favorite IDE

Install the CortexDB MCP server and give any AI-powered IDE persistent long-term memory. One command, every conversation remembers.

terminal
bash
# Install the MCP server (>=0.3.0 — v1 surface)
pip install 'cortexdb-mcp>=0.3.0'

# Add to Claude Code — anonymous auto-signup on first launch
claude mcp add cortexdb cortexdb-mcp

16 built-in tools

Store, search, forget, list events, walk knowledge-graph entities, and check usage — all from your IDE.

Memories live in the cloud

The MCP server is a lightweight bridge — memories never leave CortexDB. Only a small auth-state file (your token + actor + default scope) is cached locally so you don't sign up on every restart.

Works everywhere MCP does

Any client that speaks Model Context Protocol gets instant access to CortexDB's full memory system.

53+ Integrations

Connects to everything you use

Drop-in support for 21 agent frameworks, 16 data connectors, 6 orchestration tools, 6 LLM providers, and 4 no-code platforms.

Ready to give your AI agents perfect memory?

Get started in under 5 minutes. Free tier available — no credit card required.