Configuration
Zeph loads config/default.toml at startup and applies environment variable overrides.
The config path can be overridden via CLI argument or environment variable:
# CLI argument (highest priority)
zeph --config /path/to/custom.toml
# Environment variable
ZEPH_CONFIG=/path/to/custom.toml zeph
# Default (fallback)
# config/default.toml
Priority: --config > ZEPH_CONFIG > config/default.toml.
Hot-Reload
Zeph watches the config file for changes and applies runtime-safe fields without restart. The file watcher uses 500ms debounce to avoid redundant reloads.
Reloadable fields (applied immediately):
| Section | Fields |
|---|---|
[security] | redact_secrets |
[timeouts] | llm_seconds, embedding_seconds, a2a_seconds |
[memory] | history_limit, summarization_threshold, context_budget_tokens, compaction_threshold, compaction_preserve_tail, prune_protect_tokens, cross_session_score_threshold |
[memory.semantic] | recall_limit |
[index] | repo_map_ttl_secs, watch |
[agent] | max_tool_iterations |
[skills] | max_active_skills |
Not reloadable (require restart): LLM provider/model, SQLite path, Qdrant URL, Telegram token, MCP servers, A2A config, skill paths.
Check for config reloaded in the log to confirm a successful reload.
Configuration File
[agent]
name = "Zeph"
max_tool_iterations = 10 # Max tool loop iterations per response (default: 10)
[llm]
provider = "ollama"
base_url = "http://localhost:11434"
model = "mistral:7b"
embedding_model = "qwen3-embedding" # Model for text embeddings
[llm.cloud]
model = "claude-sonnet-4-5-20250929"
max_tokens = 4096
# [llm.openai]
# base_url = "https://api.openai.com/v1"
# model = "gpt-5.2"
# max_tokens = 4096
# embedding_model = "text-embedding-3-small"
# reasoning_effort = "medium" # low, medium, high (for reasoning models)
[skills]
paths = ["./skills"]
max_active_skills = 5 # Top-K skills per query via embedding similarity
[memory]
sqlite_path = "./data/zeph.db"
history_limit = 50
summarization_threshold = 100 # Trigger summarization after N messages
context_budget_tokens = 0 # 0 = unlimited (proportional split: 15% summaries, 25% recall, 60% recent)
compaction_threshold = 0.75 # Compact when context usage exceeds this fraction
compaction_preserve_tail = 4 # Keep last N messages during compaction
prune_protect_tokens = 40000 # Protect recent N tokens from tool output pruning
cross_session_score_threshold = 0.35 # Minimum relevance for cross-session results
[memory.semantic]
enabled = false # Enable semantic search via Qdrant
recall_limit = 5 # Number of semantically relevant messages to inject
[tools]
enabled = true
summarize_output = false # LLM-based summarization for long tool outputs
[tools.shell]
timeout = 30
blocked_commands = []
allowed_commands = []
allowed_paths = [] # Directories shell can access (empty = cwd only)
allow_network = true # false blocks curl/wget/nc
confirm_patterns = ["rm ", "git push -f", "git push --force", "drop table", "drop database", "truncate "]
[tools.file]
allowed_paths = [] # Directories file tools can access (empty = cwd only)
# Pattern-based permissions per tool (optional; overrides legacy blocked_commands/confirm_patterns)
# [tools.permissions.bash]
# [[tools.permissions.bash]]
# pattern = "*sudo*"
# action = "deny"
# [[tools.permissions.bash]]
# pattern = "cargo *"
# action = "allow"
# [[tools.permissions.bash]]
# pattern = "*"
# action = "ask"
[tools.scrape]
timeout = 15
max_body_bytes = 1048576 # 1MB
[tools.audit]
enabled = false # Structured JSON audit log for tool executions
destination = "stdout" # "stdout" or file path
[security]
redact_secrets = true # Redact API keys/tokens in LLM responses
[timeouts]
llm_seconds = 120 # LLM chat completion timeout
embedding_seconds = 30 # Embedding generation timeout
a2a_seconds = 30 # A2A remote call timeout
[vault]
backend = "env" # "env" (default) or "age"; CLI --vault overrides this
[a2a]
enabled = false
host = "0.0.0.0"
port = 8080
# public_url = "https://agent.example.com"
# auth_token = "secret"
rate_limit = 60
Shell commands are sandboxed with path restrictions, network control, and destructive command confirmation. See Security for details.
Environment Variables
| Variable | Description |
|---|---|
ZEPH_LLM_PROVIDER | ollama, claude, openai, candle, or orchestrator |
ZEPH_LLM_BASE_URL | Ollama API endpoint |
ZEPH_LLM_MODEL | Model name for Ollama |
ZEPH_LLM_EMBEDDING_MODEL | Embedding model for Ollama (default: qwen3-embedding) |
ZEPH_CLAUDE_API_KEY | Anthropic API key (required for Claude) |
ZEPH_OPENAI_API_KEY | OpenAI API key (required for OpenAI provider) |
ZEPH_TELEGRAM_TOKEN | Telegram bot token (enables Telegram mode) |
ZEPH_SQLITE_PATH | SQLite database path |
ZEPH_QDRANT_URL | Qdrant server URL (default: http://localhost:6334) |
ZEPH_MEMORY_SUMMARIZATION_THRESHOLD | Trigger summarization after N messages (default: 100) |
ZEPH_MEMORY_CONTEXT_BUDGET_TOKENS | Context budget for proportional token allocation (default: 0 = unlimited) |
ZEPH_MEMORY_COMPACTION_THRESHOLD | Compaction trigger threshold as fraction of context budget (default: 0.75) |
ZEPH_MEMORY_COMPACTION_PRESERVE_TAIL | Messages preserved during compaction (default: 4) |
ZEPH_MEMORY_PRUNE_PROTECT_TOKENS | Tokens protected from Tier 1 tool output pruning (default: 40000) |
ZEPH_MEMORY_CROSS_SESSION_SCORE_THRESHOLD | Minimum relevance score for cross-session memory (default: 0.35) |
ZEPH_MEMORY_SEMANTIC_ENABLED | Enable semantic memory with Qdrant (default: false) |
ZEPH_MEMORY_RECALL_LIMIT | Max semantically relevant messages to recall (default: 5) |
ZEPH_SKILLS_MAX_ACTIVE | Max skills per query via embedding match (default: 5) |
ZEPH_AGENT_MAX_TOOL_ITERATIONS | Max tool loop iterations per response (default: 10) |
ZEPH_TOOLS_SUMMARIZE_OUTPUT | Enable LLM-based tool output summarization (default: false) |
ZEPH_TOOLS_TIMEOUT | Shell command timeout in seconds (default: 30) |
ZEPH_TOOLS_SCRAPE_TIMEOUT | Web scrape request timeout in seconds (default: 15) |
ZEPH_TOOLS_SCRAPE_MAX_BODY | Max response body size in bytes (default: 1048576) |
ZEPH_A2A_ENABLED | Enable A2A server (default: false) |
ZEPH_A2A_HOST | A2A server bind address (default: 0.0.0.0) |
ZEPH_A2A_PORT | A2A server port (default: 8080) |
ZEPH_A2A_PUBLIC_URL | Public URL for agent card discovery |
ZEPH_A2A_AUTH_TOKEN | Bearer token for A2A server authentication |
ZEPH_A2A_RATE_LIMIT | Max requests per IP per minute (default: 60) |
ZEPH_A2A_REQUIRE_TLS | Require HTTPS for outbound A2A connections (default: true) |
ZEPH_A2A_SSRF_PROTECTION | Block private/loopback IPs in A2A client (default: true) |
ZEPH_A2A_MAX_BODY_SIZE | Max request body size in bytes (default: 1048576) |
ZEPH_TOOLS_FILE_ALLOWED_PATHS | Comma-separated directories file tools can access (empty = cwd) |
ZEPH_TOOLS_SHELL_ALLOWED_PATHS | Comma-separated directories shell can access (empty = cwd) |
ZEPH_TOOLS_SHELL_ALLOW_NETWORK | Allow network commands from shell (default: true) |
ZEPH_TOOLS_AUDIT_ENABLED | Enable audit logging for tool executions (default: false) |
ZEPH_TOOLS_AUDIT_DESTINATION | Audit log destination: stdout or file path |
ZEPH_SECURITY_REDACT_SECRETS | Redact secrets in LLM responses (default: true) |
ZEPH_TIMEOUT_LLM | LLM call timeout in seconds (default: 120) |
ZEPH_TIMEOUT_EMBEDDING | Embedding generation timeout in seconds (default: 30) |
ZEPH_TIMEOUT_A2A | A2A remote call timeout in seconds (default: 30) |
ZEPH_CONFIG | Path to config file (default: config/default.toml) |
ZEPH_TUI | Enable TUI dashboard: true or 1 (requires tui feature) |