Automatically ingest Notion pages, database entries, and comments into CortexDB as episodes.
Notion Connector
The Notion connector continuously syncs pages, database entries, and discussion comments from your Notion workspace into CortexDB. Pages and database rows are stored as document episodes; comments become comment episodes. All content includes property metadata, authorship, and parent-child relationships.
Setup
1. Create a Notion Integration
- Go to notion.so/my-integrations
- Click New integration
- Name the integration (e.g. CortexDB Sync)
- Select the workspace you want to connect
- Under Capabilities, enable:
  - Read content
  - Read comments
  - Read user information (optional, for richer actor metadata)
- Click Submit and copy the Internal Integration Token
- Share each database or page with the integration: open the ... menu on that database or page, then choose Connections > Connect to > your integration
2. Configure the Connector
# Required
CORTEX_NOTION_TOKEN=ntn_your_integration_token
# Optional: scope to specific databases and pages
CORTEX_NOTION_DATABASES=db-id-1,db-id-2
CORTEX_NOTION_PAGES=page-id-1,page-id-2
# CortexDB target
CORTEX_NOTION_TENANT_ID=my-app
CORTEX_NOTION_NAMESPACE=notion
3. Start the Connector
# As part of CortexDB
docker run -d \
  -e CORTEX_NOTION_TOKEN=ntn_... \
  -e CORTEX_NOTION_DATABASES="db-id-1,db-id-2" \
  -e CORTEX_NOTION_TENANT_ID=my-app \
  cortexdb/cortexdb:latest \
  --enable-connector notion
# As a standalone process
cortexdb-connector notion \
  --token ntn_... \
  --databases "db-id-1,db-id-2" \
  --tenant-id my-app
What Gets Ingested
| Notion Event | Episode Type | Content |
|---|---|---|
| Page created/updated | document | Page title + body text excerpt |
| Database entry created/updated | document | Entry title + property summary |
| Comment on a page | comment | Comment text |
| Page in search results | document | Title + body (workspace-wide search) |
Episode Metadata
Each ingested page is stored as an episode with the following shape:
{
  "type": "document",
  "content": "[Sprint Planning] Q1 2026 Roadmap\n\nStatus: In Progress; Priority: High; Owner: Alice",
  "source": "notion",
  "author": "alice",
  "timestamp": "2026-03-15T14:20:00Z",
  "metadata": {
    "page_id": "abc123-def456",
    "database_id": "db-789",
    "url": "https://notion.so/abc123",
    "properties": {
      "Status": { "type": "status", "status": { "name": "In Progress" } },
      "Priority": { "type": "select", "select": { "name": "High" } }
    },
    "last_edited_time": "2026-03-15T14:20:00Z",
    "created_time": "2026-03-01T09:00:00Z"
  }
}
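The content field in the example pairs the page title with a flattened summary of its properties. The sketch below illustrates one way such a summary could be derived from the properties object. It is not the connector's actual implementation: the function names summarize_properties and build_content are hypothetical, and only status and select properties are handled, so a field like Owner in the example would need additional cases.

```python
def summarize_properties(properties: dict) -> str:
    """Flatten Notion-style property objects into 'Name: value' pairs.

    Assumption: each property stores its value under a key named after
    its "type", as in the episode example above.
    """
    parts = []
    for name, prop in properties.items():
        kind = prop.get("type")
        value = prop.get(kind)
        # status and select values carry their label under "name";
        # other property types are skipped in this sketch.
        if isinstance(value, dict) and "name" in value:
            parts.append(f"{name}: {value['name']}")
    return "; ".join(parts)


def build_content(title: str, properties: dict) -> str:
    """Join the title and the property summary with a blank line."""
    return f"{title}\n\n{summarize_properties(properties)}"


example_properties = {
    "Status": {"type": "status", "status": {"name": "In Progress"}},
    "Priority": {"type": "select", "select": {"name": "High"}},
}
summary = summarize_properties(example_properties)
```

With the two properties shown in the metadata example, summary is "Status: In Progress; Priority: High".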
Configuration
| Variable | Default | Description |
|---|---|---|
| CORTEX_NOTION_TOKEN | Required | Notion internal integration token |
| CORTEX_NOTION_DATABASES | None | Comma-separated database IDs to sync |
| CORTEX_NOTION_PAGES | None | Comma-separated page IDs to sync |
| CORTEX_NOTION_TENANT_ID | Required | Target tenant |
| CORTEX_NOTION_NAMESPACE | notion | Target namespace |
| CORTEX_NOTION_BACKFILL_DAYS | 30 | Days of history to backfill on first run |
| CORTEX_NOTION_POLL_INTERVAL_SEC | 60 | Polling interval in seconds |
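To make the defaults in the table concrete, here is a minimal sketch of how these variables could be read and validated in Python. The NotionConnectorConfig class and load_config function are hypothetical names used for illustration only; the connector's real parsing logic may differ.

```python
import os
from dataclasses import dataclass, field
from typing import List, Mapping, Optional


@dataclass
class NotionConnectorConfig:
    token: str
    tenant_id: str
    databases: List[str] = field(default_factory=list)
    pages: List[str] = field(default_factory=list)
    namespace: str = "notion"
    backfill_days: int = 30
    poll_interval_sec: int = 60


def _split_ids(raw: str) -> List[str]:
    """Split a comma-separated ID list, dropping blanks and whitespace."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def load_config(env: Optional[Mapping[str, str]] = None) -> NotionConnectorConfig:
    """Build a config from environment variables, applying the table's defaults."""
    env = os.environ if env is None else env
    required = ("CORTEX_NOTION_TOKEN", "CORTEX_NOTION_TENANT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("missing required settings: " + ", ".join(missing))
    return NotionConnectorConfig(
        token=env["CORTEX_NOTION_TOKEN"],
        tenant_id=env["CORTEX_NOTION_TENANT_ID"],
        databases=_split_ids(env.get("CORTEX_NOTION_DATABASES", "")),
        pages=_split_ids(env.get("CORTEX_NOTION_PAGES", "")),
        namespace=env.get("CORTEX_NOTION_NAMESPACE", "notion"),
        backfill_days=int(env.get("CORTEX_NOTION_BACKFILL_DAYS", "30")),
        poll_interval_sec=int(env.get("CORTEX_NOTION_POLL_INTERVAL_SEC", "60")),
    )
```

Passing a plain dictionary instead of relying on os.environ keeps the loader easy to exercise in isolation, for example load_config({"CORTEX_NOTION_TOKEN": "ntn_...", "CORTEX_NOTION_TENANT_ID": "my-app"}).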
Backfill
On first run, the connector backfills the configured number of days of history. It searches the entire workspace for recently modified pages, queries configured databases, and fetches comments on all discovered pages. Subsequent runs process only content modified since the last sync.
# Backfill 90 days of history
cortexdb-connector notion --backfill-days 90
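The difference between the first run and later runs comes down to which cutoff timestamp is used. The following sketch shows that decision under simple assumptions: sync_cutoff is a hypothetical helper, and last_sync stands in for whatever checkpoint the connector persists between runs.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def sync_cutoff(
    now: datetime,
    last_sync: Optional[datetime],
    backfill_days: int = 30,
) -> datetime:
    """Return the earliest modification time the next sync should cover."""
    if last_sync is None:
        # First run: reach back the configured number of days.
        return now - timedelta(days=backfill_days)
    # Later runs: only content modified since the previous sync.
    return last_sync


now = datetime(2026, 3, 15, tzinfo=timezone.utc)
first_run_cutoff = sync_cutoff(now, last_sync=None, backfill_days=90)
next_run_cutoff = sync_cutoff(now, last_sync=now - timedelta(minutes=1))
```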
The connector uses Notion's last_edited_time filter to efficiently query only modified content. Idempotency keys include the page version timestamp, so re-processing the same page at the same version is safely deduplicated.
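As a rough illustration of both ideas, the sketch below builds a Notion timestamp filter on last_edited_time and derives a deduplication key from the page ID and its version timestamp. The filter shape follows Notion's documented timestamp filter object for database queries. The key format (a SHA-256 over "notion:page_id:last_edited_time") and both function names are assumptions made for this example, not the connector's actual scheme.

```python
import hashlib
from datetime import datetime, timezone


def last_edited_filter(since: datetime) -> dict:
    """Timestamp filter matching pages edited on or after `since`.

    `since` should be timezone-aware; it is normalized to UTC.
    The result is meant to be sent as the "filter" field of a
    database query request body.
    """
    return {
        "timestamp": "last_edited_time",
        "last_edited_time": {
            "on_or_after": since.astimezone(timezone.utc).isoformat()
        },
    }


def idempotency_key(page_id: str, last_edited_time: str) -> str:
    """Hypothetical key: identical page versions hash to the same value."""
    raw = f"notion:{page_id}:{last_edited_time}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


since = datetime(2026, 3, 15, 14, 20, tzinfo=timezone.utc)
query_filter = last_edited_filter(since)
key_v1 = idempotency_key("abc123-def456", "2026-03-15T14:20:00Z")
key_v1_again = idempotency_key("abc123-def456", "2026-03-15T14:20:00Z")
key_v2 = idempotency_key("abc123-def456", "2026-03-16T08:00:00Z")
```

Because the timestamp is part of the key, key_v1 and key_v1_again are identical and can be deduplicated, while an edit to the page produces a new key (key_v2) and therefore a new episode version.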