TL;DR. As of v0.5.1, CortexDB ships as a Docker image and stripped Linux binaries, both free for any use — same engine that powers CortexDB Cloud, now runnable inside your own VPC, your homelab, or your laptop. One command, no signup, no telemetry. Grab it at cortexdb.ai/download.
Why on-prem?
Most teams we talk to have one of three reasons to self-host their AI memory layer:
- Data residency / regulated industry. Customer data can't leave your network, full stop. Memory of customer conversations doesn't get an exception.
- Latency. Agent memory is in the hot path: every tool call, every recall, every belief update. Round-tripping to a managed service at ~50 ms each way is fine for a chatbot, but painful for a coding agent that does 200 recalls per session; run sequentially, that is about 20 seconds of network wait alone.
- Cost predictability. Once your agents are running 24/7, the per-token cost of a managed memory layer becomes the kind of bill that gets put on a Finance ticket.
Cloud is still the right answer for prototyping and for teams who don't want to operate infrastructure. But “must use cloud” was never a forcing function we wanted to keep. So as of v0.5.1, you don't have to.
What you're getting
Two distribution paths, same binary underneath:
- Docker image. Multi-arch (linux/amd64, linux/arm64) on Docker Hub: cortexdb/cortexdb:latest. Pinned tags for production (cortexdb/cortexdb:v0.5.1).
- Linux tarballs. Stripped binary (~80 MB) + bundled assets, docs, and license. x86_64 and aarch64 builds. SHA-256 checksums on every GitHub Release.
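Since every release publishes SHA-256 checksums, it is worth comparing the tarball against the published digest before unpacking it. A minimal sketch using only Python's standard library; the helper names `sha256_of` and `verify_download` are illustrative and not part of any CortexDB tooling, and the expected digest is something you copy from the release page yourself:

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read 1 MiB at a time so an ~80 MB tarball never sits fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: Path, expected_hex: str) -> bool:
    """Compare the file's digest with the published one (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Call `verify_download(Path("cortexdb-0.5.1-linux-amd64.tar.gz"), digest)` with the digest string from the release, and only extract the archive if it returns `True`.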
Both packages include:
- The full CortexDB engine (event sourcing, recall pipeline, all five memory layers)
- Built-in dashboard at localhost:3141
- WordNet synonym map for BM25 query expansion (preloaded)
- docs/GETTING_STARTED.md: five-minute quickstart bundled in
- Full v1 API reference + production deployment guide
macOS and Windows server builds aren't shipped yet — Linux is the production target. Reach out via contact if you have a specific need.
The fastest path: Docker
If you have Docker installed, this is the whole installation:
docker run -d \
--name cortexdb \
-p 3141:3141 \
-v cortexdb-data:/data \
-e OPENAI_API_KEY=$OPENAI_API_KEY \
-e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
cortexdb/cortexdb:latest
Two notes on the env vars: CortexDB calls OpenAI (gpt-4o-mini) for entity extraction and fact enrichment, and Anthropic (Claude) for the /v1/answer endpoint. You can swap these for any OpenAI-compatible endpoint by setting CORTEX_LLM_URL. If you only want to use the lower-level /v1/experience and /v1/recall endpoints, the Anthropic key (used only for /v1/answer) is optional.
Verify it's up:
curl -s http://localhost:3141/v1/admin/health
# {"status":"ok","version":"0.5.1",...}
Then open http://localhost:3141 in a browser for the dashboard.
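For scripts that need to wait for the container before sending traffic, the health check above can be polled until it reports `"status":"ok"`. A minimal sketch with the standard library; the function names, the retry count, and the delay are assumptions chosen for illustration, and only the URL and the `status` field come from the curl example:

```python
import json
import time
import urllib.request

HEALTH_URL = "http://localhost:3141/v1/admin/health"


def fetch(url: str, timeout: float = 2.0) -> bytes:
    """GET the URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def is_healthy(body: bytes) -> bool:
    """True if the body is a JSON object whose "status" field is "ok"."""
    try:
        payload = json.loads(body)
    except ValueError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def wait_until_healthy(url=HEALTH_URL, attempts=30, delay=1.0, fetcher=fetch) -> bool:
    """Poll the health endpoint until it reports ok or attempts run out."""
    for attempt in range(attempts):
        try:
            if is_healthy(fetcher(url)):
                return True
        except OSError:
            # Connection refused or timed out: the server is probably still starting.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

The `fetcher` parameter exists only so the polling logic can be exercised without a running server.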
The tarball path
For environments without Docker (airgapped, bare metal, banks that have opinions about container runtimes):
# x86_64
curl -LO https://cortexdb.ai/download/cortexdb-0.5.1-linux-amd64.tar.gz
tar -xzf cortexdb-0.5.1-linux-amd64.tar.gz
cd cortexdb-0.5.1-linux-amd64
./cortexdb
For systemd, an example unit ships in docs/PRODUCTION_DEPLOYMENT.md. Drop it under /etc/systemd/system/cortexdb.service, set the env file path, and run systemctl enable --now cortexdb.
Your first memory
CortexDB's write surface is one endpoint: /v1/experience. You hand it a scope (a namespace), some content, an observed-at timestamp, and an idempotency key. The server takes care of chunking, embedding, fact extraction, entity resolution, and layer materialization.
curl -X POST http://localhost:3141/v1/experience \
-H 'Content-Type: application/json' \
-d '{
"scope": "ws:demo",
"modality": "observation",
"content": { "kind": "text", "text": "Priya at Acme signed for 200 seats." },
"context": { "observed_at": "'`date -u +%Y-%m-%dT%H:%M:%SZ`'" },
"idempotency_key": "'`uuidgen`'"
}'
Then read it back:
curl -X POST http://localhost:3141/v1/recall \
-H 'Content-Type: application/json' \
-d '{
"scope": "ws:demo",
"query": "Who at Acme is the contact?",
"view": "local"
}'
That's the whole demo. Five layers (Events → Episodes → Facts → Beliefs → Understanding) materialize behind the scenes; you just ask questions and get answers with provenance.
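The same write and recall calls can be sketched in Python without any SDK. This is a minimal illustration that mirrors the two curl payloads field for field; the helper names are made up here, and treating the responses as JSON is an assumption, since the post does not show the response bodies:

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

BASE_URL = "http://localhost:3141"


def build_experience(scope, text, idempotency_key=None):
    """Build a /v1/experience payload shaped like the curl example."""
    observed_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "scope": scope,
        "modality": "observation",
        "content": {"kind": "text", "text": text},
        "context": {"observed_at": observed_at},
        # Pass the same key again when retrying a failed request.
        "idempotency_key": idempotency_key or str(uuid.uuid4()),
    }


def build_recall(scope, query, view="local"):
    """Build a /v1/recall payload shaped like the curl example."""
    return {"scope": scope, "query": query, "view": view}


def post_json(path, payload):
    """POST a JSON payload and decode the response (assumed to be JSON)."""
    request = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

Calling `post_json("/v1/experience", build_experience("ws:demo", "Priya at Acme signed for 200 seats."))` followed by `post_json("/v1/recall", build_recall("ws:demo", "Who at Acme is the contact?"))` reproduces the demo above.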
SDKs
If you'd rather not curl:
pip install cortexdbai
# or
npm i cortexdbai
Both packages ship the same CortexDB client. Point it at http://localhost:3141 and you're done.
What ships in the package
cortexdb # stripped release binary (~80 MB)
assets/ # WordNet synonym map for BM25
docs/
GETTING_STARTED.md
API_DESIGN_V1.md
API_REFERENCE_V1.md
CORTEXDB_FOR_THE_ENTERPRISE.md
PRODUCTION_DEPLOYMENT.md
cortexdb_customer_deck.html
README.md
LICENSE.txt
Everything you need to take it from “downloaded” to “running in production behind a reverse proxy with TLS and a backup policy.” The production guide covers systemd hardening, secrets handling, snapshot strategy, and the operational metrics worth alerting on.
A note on the architecture
For folks curious about what's actually inside the binary:
- RocksDB for the write-ahead log and content store. The WAL is the source of truth — every index is a materialized view.
- Tantivy for BM25 full-text. WordNet expansion preloaded.
- HNSW for vector search. We do the embeddings out-of-process (OpenAI by default, swap with any compatible endpoint).
- Custom graph index for entity relationships and provenance walks.
- In-process derivation pipeline that materializes episodes, beliefs, and concepts asynchronously. The /v1/recall and /v1/answer paths read from materialized views, not raw events.
Everything is a single Rust binary. No external dependencies beyond the LLM endpoints you wire up.
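The "log is the source of truth, every index is a materialized view" idea can be illustrated with a toy model. This is not CortexDB's code: the `Event` and `Store` classes are hypothetical, the view here is updated synchronously rather than by an asynchronous pipeline, and a plain Python list stands in for the RocksDB-backed log.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    seq: int
    scope: str
    text: str


@dataclass
class Store:
    # Append-only log: the only state that must survive a crash.
    log: list = field(default_factory=list)
    # Derived view: scope -> sequence numbers. Safe to drop and rebuild.
    by_scope: dict = field(default_factory=lambda: defaultdict(list))

    def append(self, scope, text):
        """Write to the log first, then update the derived view."""
        event = Event(seq=len(self.log), scope=scope, text=text)
        self.log.append(event)
        self._apply(event)
        return event

    def _apply(self, event):
        self.by_scope[event.scope].append(event.seq)

    def rebuild(self):
        """Recreate the view from scratch by replaying the log."""
        self.by_scope = defaultdict(list)
        for event in self.log:
            self._apply(event)

    def recall(self, scope):
        """Reads go through the view, never by scanning raw events."""
        return [self.log[seq].text for seq in self.by_scope.get(scope, [])]
```

The property this design buys is that losing or corrupting an index is recoverable: clearing `by_scope` and calling `rebuild()` yields the same `recall` results as before.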
What's not in v0.5.1
Honesty about gaps, in case you're evaluating:
- No macOS or Windows server builds yet. Linux only. To be clear, "server" is the operative word: this is about the box that's running your agent backend, not your dev laptop. (Use Docker on macOS/Windows for now: same binary, runs fine.)
- No clustering in this release. Single-node is solid; multi-node Raft + gossip is in the next minor (v0.6). For most teams under ~10M events, single-node is fine.
- No GUI installer. Tarball + systemd or Docker + compose, that's it. We'll add a one-line install script next month.
What's next
Roadmap items most users have been asking about:
- Helm chart + Kubernetes operator (v0.6)
- Cluster mode GA (v0.6)
- macOS native binary (when asked enough — currently 3 customer requests)
- Encryption-at-rest with customer-managed KMS keys (v0.7)
- An OSS eval harness, cortexdb-eval, that runs against any AI memory layer, not just ours
Get started
- Download: cortexdb.ai/download
- Docs: cortexdb.ai/docs
- Discord: discord.gg/cortexdb — best place for setup questions
- GitHub Issues: bug reports + feature requests
- Email: contact
If you're running CortexDB on-prem, we'd love to hear what you're building. Reply on @cortexdbai or drop a note in Discord — we read everything.