# Fireworks AI Provider
Run CortexDB's embedding and entity-extraction pipelines on Fireworks AI's optimized inference infrastructure.
## Overview
Fireworks AI provides fast, cost-efficient inference for open-source models with production-grade reliability. This integration configures CortexDB to use Fireworks AI for:

- **Embedding generation**: optimized open-source embedding models
- **Entity extraction**: fast chat-model inference for relationship extraction
## Installation

```bash
pip install cortexdbai[fireworks]
```
## Configuration

| Environment Variable | Default | Description |
|---|---|---|
| `CORTEX_FIREWORKS_API_KEY` | Required | Fireworks AI API key |
| `CORTEX_FIREWORKS_EMBED_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Embedding model |
| `CORTEX_FIREWORKS_CHAT_MODEL` | `accounts/fireworks/models/llama-v3p1-70b-instruct` | Chat model |
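For local development, the variables above might be exported in your shell like this (the key value is a placeholder; the model values are the defaults from the table):

```shell
export CORTEX_FIREWORKS_API_KEY="your-fireworks-api-key"
export CORTEX_FIREWORKS_EMBED_MODEL="nomic-ai/nomic-embed-text-v1.5"
export CORTEX_FIREWORKS_CHAT_MODEL="accounts/fireworks/models/llama-v3p1-70b-instruct"
```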
## Embedding Models

| Model | Dimensions | Description |
|---|---|---|
| `nomic-ai/nomic-embed-text-v1.5` | 768 | High-quality, long-context embedding model by Nomic |
| `thenlper/gte-large` | 1024 | General Text Embeddings (GTE) model by Alibaba DAMO Academy |
## Usage Example

```python
from cortexdb_fireworks import FireworksEmbeddingProvider, FireworksConfig

config = FireworksConfig(api_key="your-fireworks-api-key")

async with FireworksEmbeddingProvider(config=config) as provider:
    embedding = await provider.embed_query("What is event sourcing?")
    print(f"Dimension: {provider.dimension}")  # 768

    embeddings = await provider.embed([
        "Event sourcing stores all changes as events.",
        "CQRS separates reads from writes.",
    ])
```
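The vectors returned by `embed` can be compared directly, for example with cosine similarity. A minimal helper in plain Python (no CortexDB dependency; the vectors below are toy values standing in for real embeddings):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for provider.embed(...) output
print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction -> 1.0
```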
### Using a Different Model

```python
config = FireworksConfig(
    api_key="your-fireworks-api-key",
    embed_model="thenlper/gte-large",
)
provider = FireworksEmbeddingProvider(config=config)
# provider.dimension == 1024
```
## Switching Providers

To switch CortexDB from the default OpenAI embeddings to Fireworks AI:

```python
from cortexdb_fireworks import FireworksEmbeddingProvider

provider = FireworksEmbeddingProvider()  # reads CORTEX_FIREWORKS_* env vars
```

All CortexDB embedding providers implement the same interface (`embed`, `embed_query`, `dimension`, `model_name`), so switching is a one-line change.
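That shared interface can be sketched as a Python `Protocol`. This is illustrative only: the member names come from the list above, but the exact signatures are assumptions, not CortexDB's published types.

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class EmbeddingProvider(Protocol):
    """Shape shared by CortexDB embedding providers (illustrative sketch)."""

    dimension: int
    model_name: str

    async def embed(self, texts: list[str]) -> list[list[float]]: ...
    async def embed_query(self, text: str) -> list[float]: ...
```

Any object exposing these four members satisfies the protocol, which is why swapping `OpenAIEmbeddingProvider` for `FireworksEmbeddingProvider` needs no other code changes.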
## Under the Hood
When using the Fireworks AI provider, the SDK translates your calls into REST API requests against the CortexDB and Fireworks AI endpoints.
### Storing a memory (`remember`)

```bash
# SDK: cortex.remember("The payments service was migrated to CockroachDB.")
curl -X POST https://api.cortexdb.ai/v1/remember \
  -H "Authorization: Bearer $CORTEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "The payments service was migrated to CockroachDB.",
    "tenant_id": "my-app"
  }'
# Returns: { "event_id": "evt_abc123" }
```
### Retrieving context (`recall`)

```bash
# SDK: result = cortex.recall("What database does the payments service use?")
# result.context, result.confidence, result.latency_ms
curl -X POST https://api.cortexdb.ai/v1/recall \
  -H "Authorization: Bearer $CORTEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What database does the payments service use?",
    "tenant_id": "my-app"
  }'
# Returns: { "context": "...", "confidence": 0.94, "latency_ms": 41 }
```
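The same recall call can be issued from Python with only the standard library. A minimal sketch mirroring the curl request above (endpoint, headers, and field names are taken from it; error handling is omitted, and the send itself is left commented out):

```python
import json
import os
import urllib.request

def build_recall_request(query: str, tenant_id: str, api_key: str) -> urllib.request.Request:
    """Construct the POST /v1/recall request shown in the curl example."""
    body = json.dumps({"query": query, "tenant_id": tenant_id}).encode()
    return urllib.request.Request(
        "https://api.cortexdb.ai/v1/recall",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_recall_request(
    "What database does the payments service use?",
    "my-app",
    os.environ.get("CORTEX_API_KEY", ""),
)
# urllib.request.urlopen(req) would send it; per the example above, the
# response JSON carries "context", "confidence", and "latency_ms".
```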
### Generating embeddings (Fireworks AI)

```bash
# The provider calls Fireworks AI's embedding endpoint
curl -X POST https://api.fireworks.ai/inference/v1/embeddings \
  -H "Authorization: Bearer $FIREWORKS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-ai/nomic-embed-text-v1.5",
    "input": "What database does the payments service use?"
  }'
# Returns: { "data": [{ "embedding": [0.123, -0.456, ...] }] }
```
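The response shape shown above can be unpacked with a one-line accessor. The sample response here is a toy stand-in with invented values, mirroring only the structure of the real payload:

```python
def extract_embedding(response: dict) -> list[float]:
    """Pull the vector out of a Fireworks /v1/embeddings response body."""
    return response["data"][0]["embedding"]

# Toy response mirroring the shape shown above (values are illustrative)
sample = {"data": [{"embedding": [0.123, -0.456, 0.789]}]}
vector = extract_embedding(sample)
print(len(vector))  # 3 here; 768 for nomic-embed-text-v1.5
```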