Use a Cloud Provider
Connect Zeph to Claude, OpenAI, or any OpenAI-compatible API instead of local Ollama.
Claude
ZEPH_CLAUDE_API_KEY=sk-ant-... zeph
Or in config:
[llm]
provider = "claude"
[llm.cloud]
model = "claude-sonnet-4-5-20250929"
max_tokens = 4096
Claude does not provide an embeddings API. To get embeddings, use the orchestrator to combine Claude chat with Ollama embeddings (see Hybrid Setup), or use OpenAI embeddings.
OpenAI
ZEPH_LLM_PROVIDER=openai ZEPH_OPENAI_API_KEY=sk-... zeph
Or in config:
[llm]
provider = "openai"
[llm.openai]
base_url = "https://api.openai.com/v1"
model = "gpt-5.2"
max_tokens = 4096
embedding_model = "text-embedding-3-small"
reasoning_effort = "medium" # optional: "low", "medium", or "high"; applies to reasoning models such as o3
When embedding_model is set, Qdrant subsystems use it automatically for skill matching and semantic memory.
Compatible APIs
Keep provider = "openai" and change base_url under [llm.openai] to point to any OpenAI-compatible endpoint:
# Together AI
base_url = "https://api.together.xyz/v1"
# Groq
base_url = "https://api.groq.com/openai/v1"
# Fireworks
base_url = "https://api.fireworks.ai/inference/v1"
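For context, a complete configuration for one of these endpoints might look like the sketch below. It reuses the keys from the OpenAI section and swaps in the Groq base URL. The model ID is an illustrative assumption and should be replaced with one from your provider's catalog. The sketch also assumes the API key is still supplied through ZEPH_OPENAI_API_KEY, which this page does not confirm for third-party endpoints.

```toml
# Sketch: Zeph's OpenAI provider pointed at an OpenAI-compatible endpoint (Groq).
[llm]
provider = "openai"

[llm.openai]
base_url = "https://api.groq.com/openai/v1"
# Assumption: illustrative model ID; use one listed in your provider's catalog.
model = "llama-3.3-70b-versatile"
max_tokens = 4096
# embedding_model is left out here because not every compatible provider
# serves an embeddings endpoint; add it only if yours does.
```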
Hybrid Setup
Route embeddings to a free local Ollama instance and chat to the paid Claude API:
[llm]
provider = "orchestrator"
[llm.orchestrator]
default = "claude"
embed = "ollama"
[llm.orchestrator.providers.ollama]
provider_type = "ollama"
[llm.orchestrator.providers.claude]
provider_type = "claude"
[llm.orchestrator.routes]
general = ["claude"]
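The same pattern could plausibly pair Claude chat with OpenAI embeddings instead of Ollama. The sketch below is an extrapolation from the config above, not a documented setup: it assumes the orchestrator accepts provider_type = "openai" and picks up embedding_model from the [llm.openai] table. Check both assumptions against the Model Orchestrator reference before relying on them.

```toml
# Sketch: Claude for chat, OpenAI for embeddings (unverified variant).
[llm]
provider = "orchestrator"

[llm.orchestrator]
default = "claude"
embed = "openai"

[llm.orchestrator.providers.claude]
provider_type = "claude"

# Assumption: "openai" is a valid provider_type for the orchestrator.
[llm.orchestrator.providers.openai]
provider_type = "openai"

[llm.orchestrator.routes]
general = ["claude"]

# Assumption: the orchestrator reads OpenAI settings from this table.
[llm.openai]
embedding_model = "text-embedding-3-small"
```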
See Model Orchestrator for task classification and fallback chain options.
Interactive Setup
Run zeph init and select your provider in Step 2. The wizard handles model names, base URLs, and API keys. See Configuration Wizard.