# DeepInfra Provider

Use DeepInfra's serverless inference for CortexDB embeddings and entity extraction: access a wide catalog of open-source models with pay-per-token pricing.
## Overview
DeepInfra provides serverless and dedicated inference for open-source models. This integration configures CortexDB to use DeepInfra for:
- Embedding generation — high-quality open-source embedding models
- Entity extraction — Llama, Mistral, and other chat models for relationship extraction
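For a sense of what the entity-extraction path involves, here is a minimal sketch of a request payload for DeepInfra's OpenAI-compatible chat endpoint. The `build_extraction_payload` helper, the system prompt wording, and the JSON-output framing are illustrative assumptions, not the library's actual internals:

```python
import json

# Hypothetical helper: build the chat-completion payload an entity-extraction
# call might send. The prompt text here is an assumption for illustration.
def build_extraction_payload(
    text: str,
    model: str = "meta-llama/Meta-Llama-3.1-70B-Instruct",
) -> dict:
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Extract entities and relationships as JSON."},
            {"role": "user", "content": text},
        ],
        "temperature": 0.0,  # deterministic output suits structured extraction
    }

payload = build_extraction_payload("Alice works at Acme Corp.")
print(json.dumps(payload, indent=2))
```

A temperature of 0 is the usual choice for extraction tasks, where you want reproducible structured output rather than creative variation.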
## Installation

```bash
pip install cortexdb-deepinfra
```
## Configuration
| Environment Variable | Default | Description |
|---|---|---|
| CORTEX_DEEPINFRA_API_KEY | Required | DeepInfra API key |
| CORTEX_DEEPINFRA_EMBED_MODEL | BAAI/bge-large-en-v1.5 | Embedding model |
| CORTEX_DEEPINFRA_CHAT_MODEL | meta-llama/Meta-Llama-3.1-70B-Instruct | Chat model |
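A typical shell setup using the variables above might look like this (the model values shown are the documented defaults, so only the API key is strictly required):

```shell
export CORTEX_DEEPINFRA_API_KEY="your-deepinfra-api-key"
export CORTEX_DEEPINFRA_EMBED_MODEL="BAAI/bge-large-en-v1.5"
export CORTEX_DEEPINFRA_CHAT_MODEL="meta-llama/Meta-Llama-3.1-70B-Instruct"
```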
## Embedding Models
| Model | Dimensions | Description |
|---|---|---|
| BAAI/bge-large-en-v1.5 | 1024 | High-quality general-purpose embeddings by BAAI |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Lightweight, fast sentence embeddings |
| intfloat/e5-large-v2 | 1024 | Strong general-purpose English embedding model |
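Vectors from these models are typically compared with cosine similarity. A minimal sketch, using toy 3-dimensional vectors in place of real 1024-dimensional embeddings:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for embed_query / embed outputs
query_vec = [0.1, 0.9, 0.2]
doc_vec = [0.1, 0.8, 0.3]
print(round(cosine_similarity(query_vec, doc_vec), 4))
```

Note that vectors from different models (or different dimensions) are not comparable; re-embed your corpus whenever you change `CORTEX_DEEPINFRA_EMBED_MODEL`.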
## Usage Example

```python
import asyncio

from cortexdb_deepinfra import DeepInfraConfig, DeepInfraEmbeddingProvider

async def main() -> None:
    config = DeepInfraConfig(api_key="your-deepinfra-api-key")
    async with DeepInfraEmbeddingProvider(config=config) as provider:
        # Embed a single query string
        embedding = await provider.embed_query("What is event sourcing?")
        print(f"Dimension: {provider.dimension}")  # 1024

        # Embed a batch of documents in one call
        embeddings = await provider.embed([
            "Event sourcing stores all changes as events.",
            "CQRS separates reads from writes.",
        ])

asyncio.run(main())
```
### Using a Different Model

```python
config = DeepInfraConfig(
    api_key="your-deepinfra-api-key",
    embed_model="sentence-transformers/all-MiniLM-L6-v2",
)
provider = DeepInfraEmbeddingProvider(config=config)
# provider.dimension == 384
```
## Switching Providers
To switch CortexDB from the default OpenAI embeddings to DeepInfra:
```python
from cortexdb_deepinfra import DeepInfraEmbeddingProvider

provider = DeepInfraEmbeddingProvider()  # reads CORTEX_DEEPINFRA_* env vars
```
All CortexDB embedding providers implement the same interface (embed, embed_query, dimension, model_name), so switching is a one-line change.
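That shared interface can be sketched as a `typing.Protocol`. The method names come from the sentence above; the exact signatures (batch `embed` taking a list of strings, both methods async) are assumptions. A stub implementation like `FakeProvider` is handy for testing code that consumes any embedding provider:

```python
import asyncio
from typing import Protocol

# Sketch of the common embedding-provider interface described above.
# Names match the doc; signatures are assumed for illustration.
class EmbeddingProvider(Protocol):
    @property
    def dimension(self) -> int: ...
    @property
    def model_name(self) -> str: ...
    async def embed(self, texts: list[str]) -> list[list[float]]: ...
    async def embed_query(self, text: str) -> list[float]: ...

# Minimal in-memory stub satisfying the protocol, useful in unit tests
class FakeProvider:
    dimension = 4
    model_name = "fake-embed"

    async def embed(self, texts: list[str]) -> list[list[float]]:
        return [[0.0] * self.dimension for _ in texts]

    async def embed_query(self, text: str) -> list[float]:
        return [0.0] * self.dimension

vecs = asyncio.run(FakeProvider().embed(["a", "b"]))
print(len(vecs), len(vecs[0]))  # 2 4
```

Because only these four members are required, swapping DeepInfra in for another provider never touches downstream code that stores or searches the vectors.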