Route /v1/answer + embeddings through Together AI's open-model platform.

Together AI Integration

Together hosts a wide catalog of open-source models behind an OpenAI-compatible API. Point CortexDB's LLM router and embedding provider at it.

Deployment configuration

CORTEX_LLM_URL=https://api.together.xyz/v1
CORTEX_LLM_API_KEY=tgt_...
CORTEX_LLM_MODEL=meta-llama/Llama-3.3-70B-Instruct-Turbo
CORTEX_EMBEDDING_PROVIDER=together
CORTEX_EMBEDDING_MODEL=togethercomputer/m2-bert-80M-8k-retrieval

Per-request override

client.answer(
    scope="org:acme/user:alice",
    question="What did we decide?",
    answer_model="together/Qwen/Qwen2.5-72B-Instruct-Turbo",
)

See also