Use DeepInfra's serverless inference for CortexDB embeddings and entity extraction.

DeepInfra Provider

Access a wide catalog of open-source models through DeepInfra's serverless inference platform with pay-per-token pricing.

Overview

DeepInfra provides serverless and dedicated inference for open-source models. This integration configures CortexDB to use DeepInfra for:

  • Embedding generation — high-quality open-source embedding models
  • Entity extraction — Llama, Mistral, and other chat models for relationship extraction

Installation

pip install cortexdb-deepinfra

Configuration

| Environment Variable | Default | Description |
|---|---|---|
| CORTEX_DEEPINFRA_API_KEY | Required | DeepInfra API key |
| CORTEX_DEEPINFRA_EMBED_MODEL | BAAI/bge-large-en-v1.5 | Embedding model |
| CORTEX_DEEPINFRA_CHAT_MODEL | meta-llama/Meta-Llama-3.1-70B-Instruct | Chat model |
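To make the defaults concrete, here is a minimal sketch of how the three variables might be resolved. The `DeepInfraSettings` class and `load_settings` function are hypothetical names used only for illustration; they are not part of `cortexdb-deepinfra`, which may resolve its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults copied from the configuration table.
DEFAULT_EMBED_MODEL = "BAAI/bge-large-en-v1.5"
DEFAULT_CHAT_MODEL = "meta-llama/Meta-Llama-3.1-70B-Instruct"


@dataclass(frozen=True)
class DeepInfraSettings:
    """Hypothetical settings container (illustration only)."""
    api_key: str
    embed_model: str
    chat_model: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> DeepInfraSettings:
    """Resolve the documented variables, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("CORTEX_DEEPINFRA_API_KEY", "").strip()
    if not api_key:
        # The table marks the API key as required, so fail early.
        raise ValueError("CORTEX_DEEPINFRA_API_KEY is required")
    return DeepInfraSettings(
        api_key=api_key,
        embed_model=env.get("CORTEX_DEEPINFRA_EMBED_MODEL", DEFAULT_EMBED_MODEL),
        chat_model=env.get("CORTEX_DEEPINFRA_CHAT_MODEL", DEFAULT_CHAT_MODEL),
    )
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to exercise in tests without touching the real environment.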

Embedding Models

| Model | Dimensions | Description |
|---|---|---|
| BAAI/bge-large-en-v1.5 | 1024 | High-quality general-purpose English embeddings by BAAI |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Lightweight, fast sentence embeddings |
| intfloat/e5-large-v2 | 1024 | Strong English embedding model |
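The dimension column drives storage cost. As a rough sketch, assuming vectors are stored as uncompressed 32-bit floats (an assumption; CortexDB's actual storage format is not described here), the raw size of an index can be estimated as follows. The `raw_vector_bytes` helper is hypothetical.

```python
# Dimensions copied from the model table.
EMBED_DIMENSIONS = {
    "BAAI/bge-large-en-v1.5": 1024,
    "sentence-transformers/all-MiniLM-L6-v2": 384,
    "intfloat/e5-large-v2": 1024,
}

BYTES_PER_FLOAT32 = 4  # assumption: uncompressed float32 storage


def raw_vector_bytes(model: str, num_vectors: int) -> int:
    """Estimate raw vector storage, ignoring index overhead and metadata."""
    return EMBED_DIMENSIONS[model] * BYTES_PER_FLOAT32 * num_vectors


for model in EMBED_DIMENSIONS:
    gib = raw_vector_bytes(model, 1_000_000) / 2**30
    print(f"{model}: {gib:.2f} GiB per million vectors")
```

Under this assumption, a 384-dimensional model needs 37.5% of the raw vector space of a 1024-dimensional one, which is the main trade-off behind picking the lighter model.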

Usage Example

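import asyncio

from cortexdb_deepinfra import DeepInfraEmbeddingProvider, DeepInfraConfig


async def main() -> None:
    config = DeepInfraConfig(api_key="your-deepinfra-api-key")

    # The provider is an async context manager, so it must be used inside a coroutine.
    async with DeepInfraEmbeddingProvider(config=config) as provider:
        # Embed a single query string.
        embedding = await provider.embed_query("What is event sourcing?")
        print(f"Dimension: {provider.dimension}")  # 1024 for the default model

        # Embed a batch of documents in one call.
        embeddings = await provider.embed([
            "Event sourcing stores all changes as events.",
            "CQRS separates reads from writes.",
        ])


asyncio.run(main())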
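Once a query embedding and document embeddings are available, a common next step is ranking documents by cosine similarity. The sketch below assumes that `embed_query` returns a flat sequence of floats and `embed` returns one such sequence per input; the return types are not documented here, so treat that as an assumption. `cosine_similarity` and `rank_by_similarity` are hypothetical helpers, not part of the package.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs, most similar first."""
    scored = [(i, cosine_similarity(query, doc)) for i, doc in enumerate(documents)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With the usage example's variables, this would be called as `rank_by_similarity(embedding, embeddings)`.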

Using a Different Model

config = DeepInfraConfig(
    api_key="your-deepinfra-api-key",
    embed_model="sentence-transformers/all-MiniLM-L6-v2",
)
provider = DeepInfraEmbeddingProvider(config=config)
# provider.dimension == 384

Switching Providers

To switch CortexDB from the default OpenAI embeddings to DeepInfra:

from cortexdb_deepinfra import DeepInfraEmbeddingProvider

provider = DeepInfraEmbeddingProvider()  # reads the CORTEX_DEEPINFRA_* environment variables

All CortexDB embedding providers implement the same interface (embed, embed_query, dimension, model_name), so switching is a one-line change.
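The shared interface can be pictured as a structural type. The following is only a sketch: the four member names come from the sentence above, but the signatures, the `EmbeddingProvider` protocol, the `FakeProvider` stand-in, and the `index_documents` helper are assumptions made for illustration, not CortexDB's actual definitions.

```python
import asyncio
from typing import List, Protocol


class EmbeddingProvider(Protocol):
    """Assumed shape of the shared provider interface."""

    @property
    def dimension(self) -> int: ...

    @property
    def model_name(self) -> str: ...

    async def embed(self, texts: List[str]) -> List[List[float]]: ...

    async def embed_query(self, text: str) -> List[float]: ...


class FakeProvider:
    """Hypothetical offline stand-in that returns zero vectors."""

    dimension = 4
    model_name = "fake/zero-vector"

    async def embed(self, texts: List[str]) -> List[List[float]]:
        return [[0.0] * self.dimension for _ in texts]

    async def embed_query(self, text: str) -> List[float]:
        return [0.0] * self.dimension


async def index_documents(
    provider: EmbeddingProvider, texts: List[str]
) -> List[List[float]]:
    """Embed texts and verify each vector matches the provider's dimension."""
    vectors = await provider.embed(texts)
    for vector in vectors:
        if len(vector) != provider.dimension:
            raise ValueError(f"{provider.model_name} returned a malformed vector")
    return vectors


vectors = asyncio.run(index_documents(FakeProvider(), ["first", "second"]))
```

Because `index_documents` depends only on the four shared members, any provider with that shape can be passed in. Note that vectors from models with different dimensions are not interchangeable, so switching models generally means re-embedding existing data.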