Use locally hosted Ollama models for CortexDB embedding generation and entity extraction.

Ollama Provider

Run CortexDB's embedding and entity-extraction pipelines entirely on your own hardware using Ollama. No API keys, no data leaving your network.

Overview

Ollama is a local inference engine that runs open-source models on consumer hardware. This integration configures CortexDB to use Ollama for:

  • Embedding generation — convert text to vectors for semantic search
  • Entity extraction — extract entities and relationships from ingested episodes using local chat models (Llama 3, Mistral, etc.)

Installation

pip install cortexdbai[ollama]

Make sure Ollama is running locally:

# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.com/install.sh | sh

# Pull an embedding model
ollama pull nomic-embed-text

# Ollama serves on http://localhost:11434 by default

Configuration

| Environment Variable | Default | Description |
|---|---|---|
| CORTEX_OLLAMA_URL | http://localhost:11434 | Ollama server URL |
| CORTEX_OLLAMA_EMBED_MODEL | nomic-embed-text | Embedding model name |
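These variables can be resolved once at startup with the documented defaults as fallbacks. A minimal sketch (the `ollama_settings` helper is illustrative, not part of the SDK):

```python
import os

def ollama_settings() -> dict:
    """Read Ollama connection settings, falling back to the documented defaults."""
    return {
        "base_url": os.environ.get("CORTEX_OLLAMA_URL", "http://localhost:11434"),
        "embed_model": os.environ.get("CORTEX_OLLAMA_EMBED_MODEL", "nomic-embed-text"),
    }

settings = ollama_settings()
print(settings["base_url"])  # the env value, or http://localhost:11434
```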

Embedding Models

| Model | Dimensions | Description |
|---|---|---|
| nomic-embed-text | 768 | High-quality, general-purpose embeddings |
| mxbai-embed-large | 1024 | Strong performance on retrieval benchmarks |
| all-minilm | 384 | Lightweight and fast |
| snowflake-arctic-embed | 1024 | Optimized for retrieval tasks |
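The dimension column matters when provisioning a vector index: vectors produced by different models are not interchangeable, and a mismatch between ingest-time and query-time models silently breaks similarity search. A small lookup based on the table above (the helper name is illustrative):

```python
# Embedding dimensions from the table above.
EMBED_DIMENSIONS = {
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
    "all-minilm": 384,
    "snowflake-arctic-embed": 1024,
}

def index_dimension(model: str) -> int:
    """Return the vector-index dimension required for a given embedding model."""
    try:
        return EMBED_DIMENSIONS[model]
    except KeyError:
        raise ValueError(f"Unknown embedding model: {model!r}") from None

print(index_dimension("nomic-embed-text"))  # 768
```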

Usage Example

from cortexdb_ollama import OllamaEmbeddingProvider, OllamaConfig

# Use defaults (localhost:11434, nomic-embed-text)
async with OllamaEmbeddingProvider() as provider:
    embedding = await provider.embed_query("What is event sourcing?")
    print(f"Dimension: {provider.dimension}")  # 768

    # Batch embedding
    embeddings = await provider.embed([
        "Event sourcing stores all changes as events.",
        "CQRS separates reads from writes.",
    ])

Custom Configuration

config = OllamaConfig(
    base_url="http://gpu-server:11434",
    embed_model="mxbai-embed-large",
)
provider = OllamaEmbeddingProvider(config)

Self-Hosted Setup

Ollama is self-hosted by design. For production deployments:

# Run on a dedicated GPU server
OLLAMA_HOST=0.0.0.0:11434 ollama serve

# Pre-pull models at deploy time
ollama pull nomic-embed-text
ollama pull llama3

Point CortexDB at your Ollama instance:

export CORTEX_OLLAMA_URL=http://gpu-server:11434
export CORTEX_OLLAMA_EMBED_MODEL=nomic-embed-text

Switching Providers

To switch CortexDB from the default OpenAI embeddings to Ollama, update your configuration:

from cortexdb_ollama import OllamaEmbeddingProvider

# Replace your existing embedding provider
provider = OllamaEmbeddingProvider()

All CortexDB embedding providers implement the same interface (embed, embed_query, dimension, model_name), so switching is a one-line change.
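That shared interface can be expressed as a structural type, which also makes it easy to stub out a provider in tests. A sketch using `typing.Protocol` (the `EmbeddingProvider` protocol and `FakeProvider` stub are illustrative, not part of the SDK):

```python
import asyncio
from typing import Protocol

class EmbeddingProvider(Protocol):
    """Structural sketch of the shared embedding-provider interface."""
    @property
    def dimension(self) -> int: ...
    @property
    def model_name(self) -> str: ...
    async def embed_query(self, text: str) -> list[float]: ...
    async def embed(self, texts: list[str]) -> list[list[float]]: ...

# A stub that satisfies the protocol, handy for unit tests:
class FakeProvider:
    dimension = 3
    model_name = "fake"

    async def embed_query(self, text: str) -> list[float]:
        return [0.0] * self.dimension

    async def embed(self, texts: list[str]) -> list[list[float]]:
        return [[0.0] * self.dimension for _ in texts]

vecs = asyncio.run(FakeProvider().embed(["a", "b"]))
print(len(vecs), len(vecs[0]))  # 2 3
```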

Under the Hood

When using the Ollama provider, the SDK translates your calls into REST API requests against the CortexDB and Ollama endpoints.

Storing a memory (remember)

# SDK: cortex.remember("Event sourcing stores all changes as events.")
curl -X POST https://api.cortexdb.ai/v1/remember \
  -H "Authorization: Bearer $CORTEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "Event sourcing stores all changes as events.",
    "tenant_id": "my-app"
  }'
# Returns: { "event_id": "evt_abc123" }

Retrieving context (recall)

# SDK: result = cortex.recall("What is event sourcing?")
# result.context, result.confidence, result.latency_ms
curl -X POST https://api.cortexdb.ai/v1/recall \
  -H "Authorization: Bearer $CORTEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is event sourcing?",
    "tenant_id": "my-app"
  }'
# Returns: { "context": "...", "confidence": 0.92, "latency_ms": 45 }

Generating embeddings (Ollama)

# The provider calls Ollama's embedding endpoint
curl -X POST http://localhost:11434/api/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text",
    "prompt": "What is event sourcing?"
  }'
# Returns: { "embedding": [0.123, -0.456, ...] }