Use Together AI's open-source model platform for CortexDB embeddings and entity extraction.
# Together AI Provider
Access a broad catalog of open-source embedding and chat models through Together AI's inference platform.
## Overview
Together AI hosts hundreds of open-source models with serverless and dedicated deployment options. This integration configures CortexDB to use Together AI for:
- Embedding generation — high-quality open-source embedding models
- Entity extraction — Llama, Mistral, and other chat models for relationship extraction
## Installation

```bash
pip install cortexdbai[together]
```
## Configuration
| Environment Variable | Default | Description |
|---|---|---|
| CORTEX_TOGETHER_API_KEY | Required | Together AI API key |
| CORTEX_TOGETHER_EMBED_MODEL | togethercomputer/m2-bert-80M-8k-retrieval | Embedding model |
| CORTEX_TOGETHER_CHAT_MODEL | meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | Chat model for entity extraction |
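With these variables set in the environment, the provider can be constructed without an explicit config object. A minimal sketch (the values are placeholders, set via `os.environ` here only for illustration; the no-argument constructor is shown again under Switching Providers below):

```python
import os

# Placeholder values -- substitute your own key and preferred models
os.environ["CORTEX_TOGETHER_API_KEY"] = "your-together-api-key"
os.environ["CORTEX_TOGETHER_EMBED_MODEL"] = "BAAI/bge-large-en-v1.5"
os.environ["CORTEX_TOGETHER_CHAT_MODEL"] = "meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo"

from cortexdb_together import TogetherEmbeddingProvider

provider = TogetherEmbeddingProvider()  # picks up the CORTEX_TOGETHER_* variables
```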
## Embedding Models
| Model | Dimensions | Description |
|---|---|---|
| togethercomputer/m2-bert-80M-8k-retrieval | 768 | Fast, lightweight retrieval model |
| BAAI/bge-large-en-v1.5 | 1024 | High-quality general-purpose embeddings |
| WhereIsAI/UAE-Large-V1 | 1024 | Strong performance on MTEB benchmarks |
## Usage Example
```python
import asyncio

from cortexdb_together import TogetherEmbeddingProvider, TogetherConfig

async def main() -> None:
    config = TogetherConfig(api_key="your-together-api-key")
    async with TogetherEmbeddingProvider(config=config) as provider:
        embedding = await provider.embed_query("What is event sourcing?")
        print(f"Dimension: {provider.dimension}")  # 768

        # Batch embedding
        embeddings = await provider.embed([
            "Event sourcing stores all changes as events.",
            "CQRS separates reads from writes.",
        ])

asyncio.run(main())
```
### Using a Different Model

```python
config = TogetherConfig(
    api_key="your-together-api-key",
    embed_model="BAAI/bge-large-en-v1.5",
)
provider = TogetherEmbeddingProvider(config=config)
# provider.dimension == 1024
```
## Switching Providers
To switch CortexDB from the default OpenAI embeddings to Together AI:
```python
from cortexdb_together import TogetherEmbeddingProvider

provider = TogetherEmbeddingProvider()  # reads from env vars
```
All CortexDB embedding providers implement the same interface (`embed`, `embed_query`, `dimension`, `model_name`), so switching is a one-line change.
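The shared surface can be pictured as a small structural protocol. This is an illustrative sketch only, not the actual base class shipped by CortexDB; the names and async signatures follow the usage shown above:

```python
from typing import Protocol

class EmbeddingProviderLike(Protocol):
    """Sketch of the interface shared by CortexDB embedding providers."""

    @property
    def dimension(self) -> int: ...       # vector size, e.g. 768 or 1024

    @property
    def model_name(self) -> str: ...      # identifier of the underlying model

    async def embed_query(self, text: str) -> list[float]: ...

    async def embed(self, texts: list[str]) -> list[list[float]]: ...
```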
## Under the Hood
When using the Together AI provider, the SDK translates your calls into REST API requests against the CortexDB and Together AI endpoints.
### Storing a memory (`remember`)

```bash
# SDK: cortex.remember("CQRS separates reads from writes.")
curl -X POST https://api.cortexdb.ai/v1/remember \
  -H "Authorization: Bearer $CORTEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "CQRS separates reads from writes.",
    "tenant_id": "my-app"
  }'
# Returns: { "event_id": "evt_abc123" }
```
### Retrieving context (`recall`)

```bash
# SDK: result = cortex.recall("What is CQRS?")
# result.context, result.confidence, result.latency_ms
curl -X POST https://api.cortexdb.ai/v1/recall \
  -H "Authorization: Bearer $CORTEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is CQRS?",
    "tenant_id": "my-app"
  }'
# Returns: { "context": "...", "confidence": 0.91, "latency_ms": 52 }
```
### Generating embeddings (Together AI)

```bash
# The provider calls Together AI's embedding endpoint
curl -X POST https://api.together.xyz/v1/embeddings \
  -H "Authorization: Bearer $TOGETHER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "togethercomputer/m2-bert-80M-8k-retrieval",
    "input": "What is CQRS?"
  }'
# Returns: { "data": [{ "embedding": [0.123, -0.456, ...] }] }
```