Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Configuration

Zeph loads config/default.toml at startup and applies environment variable overrides.

The config path can be overridden via CLI argument or environment variable:

# CLI argument (highest priority)
zeph --config /path/to/custom.toml

# Environment variable
ZEPH_CONFIG=/path/to/custom.toml zeph

# Default (fallback)
# config/default.toml

Priority: --config > ZEPH_CONFIG > config/default.toml.

Hot-Reload

Zeph watches the config file for changes and applies runtime-safe fields without restart. The file watcher uses 500ms debounce to avoid redundant reloads.

Reloadable fields (applied immediately):

SectionFields
[security]redact_secrets
[timeouts]llm_seconds, embedding_seconds, a2a_seconds
[memory]history_limit, summarization_threshold, context_budget_tokens, compaction_threshold, compaction_preserve_tail, prune_protect_tokens, cross_session_score_threshold
[memory.semantic]recall_limit
[index]repo_map_ttl_secs, watch
[agent]max_tool_iterations
[skills]max_active_skills

Not reloadable (require restart): LLM provider/model, SQLite path, Qdrant URL, Telegram token, MCP servers, A2A config, skill paths.

Check for config reloaded in the log to confirm a successful reload.

Configuration File

[agent]
name = "Zeph"
max_tool_iterations = 10  # Max tool loop iterations per response (default: 10)

[llm]
provider = "ollama"
base_url = "http://localhost:11434"
model = "mistral:7b"
embedding_model = "qwen3-embedding"  # Model for text embeddings

[llm.cloud]
model = "claude-sonnet-4-5-20250929"
max_tokens = 4096

# [llm.openai]
# base_url = "https://api.openai.com/v1"
# model = "gpt-5.2"
# max_tokens = 4096
# embedding_model = "text-embedding-3-small"
# reasoning_effort = "medium"  # low, medium, high (for reasoning models)

[skills]
paths = ["./skills"]
max_active_skills = 5  # Top-K skills per query via embedding similarity

[memory]
sqlite_path = "./data/zeph.db"
history_limit = 50
summarization_threshold = 100  # Trigger summarization after N messages
context_budget_tokens = 0      # 0 = unlimited (proportional split: 15% summaries, 25% recall, 60% recent)
compaction_threshold = 0.75    # Compact when context usage exceeds this fraction
compaction_preserve_tail = 4   # Keep last N messages during compaction
prune_protect_tokens = 40000   # Protect recent N tokens from tool output pruning
cross_session_score_threshold = 0.35  # Minimum relevance for cross-session results

[memory.semantic]
enabled = false               # Enable semantic search via Qdrant
recall_limit = 5              # Number of semantically relevant messages to inject

[tools]
enabled = true
summarize_output = false      # LLM-based summarization for long tool outputs

[tools.shell]
timeout = 30
blocked_commands = []
allowed_commands = []
allowed_paths = []          # Directories shell can access (empty = cwd only)
allow_network = true        # false blocks curl/wget/nc
confirm_patterns = ["rm ", "git push -f", "git push --force", "drop table", "drop database", "truncate "]

[tools.file]
allowed_paths = []          # Directories file tools can access (empty = cwd only)

# Pattern-based permissions per tool (optional; overrides legacy blocked_commands/confirm_patterns)
# [tools.permissions.bash]
# [[tools.permissions.bash]]
# pattern = "*sudo*"
# action = "deny"
# [[tools.permissions.bash]]
# pattern = "cargo *"
# action = "allow"
# [[tools.permissions.bash]]
# pattern = "*"
# action = "ask"

[tools.scrape]
timeout = 15
max_body_bytes = 1048576  # 1MB

[tools.audit]
enabled = false             # Structured JSON audit log for tool executions
destination = "stdout"      # "stdout" or file path

[security]
redact_secrets = true       # Redact API keys/tokens in LLM responses

[timeouts]
llm_seconds = 120           # LLM chat completion timeout
embedding_seconds = 30      # Embedding generation timeout
a2a_seconds = 30            # A2A remote call timeout

[vault]
backend = "env"  # "env" (default) or "age"; CLI --vault overrides this

[a2a]
enabled = false
host = "0.0.0.0"
port = 8080
# public_url = "https://agent.example.com"
# auth_token = "secret"
rate_limit = 60

Shell commands are sandboxed with path restrictions, network control, and destructive command confirmation. See Security for details.

Environment Variables

VariableDescription
ZEPH_LLM_PROVIDERollama, claude, openai, candle, or orchestrator
ZEPH_LLM_BASE_URLOllama API endpoint
ZEPH_LLM_MODELModel name for Ollama
ZEPH_LLM_EMBEDDING_MODELEmbedding model for Ollama (default: qwen3-embedding)
ZEPH_CLAUDE_API_KEYAnthropic API key (required for Claude)
ZEPH_OPENAI_API_KEYOpenAI API key (required for OpenAI provider)
ZEPH_TELEGRAM_TOKENTelegram bot token (enables Telegram mode)
ZEPH_SQLITE_PATHSQLite database path
ZEPH_QDRANT_URLQdrant server URL (default: http://localhost:6334)
ZEPH_MEMORY_SUMMARIZATION_THRESHOLDTrigger summarization after N messages (default: 100)
ZEPH_MEMORY_CONTEXT_BUDGET_TOKENSContext budget for proportional token allocation (default: 0 = unlimited)
ZEPH_MEMORY_COMPACTION_THRESHOLDCompaction trigger threshold as fraction of context budget (default: 0.75)
ZEPH_MEMORY_COMPACTION_PRESERVE_TAILMessages preserved during compaction (default: 4)
ZEPH_MEMORY_PRUNE_PROTECT_TOKENSTokens protected from Tier 1 tool output pruning (default: 40000)
ZEPH_MEMORY_CROSS_SESSION_SCORE_THRESHOLDMinimum relevance score for cross-session memory (default: 0.35)
ZEPH_MEMORY_SEMANTIC_ENABLEDEnable semantic memory with Qdrant (default: false)
ZEPH_MEMORY_RECALL_LIMITMax semantically relevant messages to recall (default: 5)
ZEPH_SKILLS_MAX_ACTIVEMax skills per query via embedding match (default: 5)
ZEPH_AGENT_MAX_TOOL_ITERATIONSMax tool loop iterations per response (default: 10)
ZEPH_TOOLS_SUMMARIZE_OUTPUTEnable LLM-based tool output summarization (default: false)
ZEPH_TOOLS_TIMEOUTShell command timeout in seconds (default: 30)
ZEPH_TOOLS_SCRAPE_TIMEOUTWeb scrape request timeout in seconds (default: 15)
ZEPH_TOOLS_SCRAPE_MAX_BODYMax response body size in bytes (default: 1048576)
ZEPH_A2A_ENABLEDEnable A2A server (default: false)
ZEPH_A2A_HOSTA2A server bind address (default: 0.0.0.0)
ZEPH_A2A_PORTA2A server port (default: 8080)
ZEPH_A2A_PUBLIC_URLPublic URL for agent card discovery
ZEPH_A2A_AUTH_TOKENBearer token for A2A server authentication
ZEPH_A2A_RATE_LIMITMax requests per IP per minute (default: 60)
ZEPH_A2A_REQUIRE_TLSRequire HTTPS for outbound A2A connections (default: true)
ZEPH_A2A_SSRF_PROTECTIONBlock private/loopback IPs in A2A client (default: true)
ZEPH_A2A_MAX_BODY_SIZEMax request body size in bytes (default: 1048576)
ZEPH_TOOLS_FILE_ALLOWED_PATHSComma-separated directories file tools can access (empty = cwd)
ZEPH_TOOLS_SHELL_ALLOWED_PATHSComma-separated directories shell can access (empty = cwd)
ZEPH_TOOLS_SHELL_ALLOW_NETWORKAllow network commands from shell (default: true)
ZEPH_TOOLS_AUDIT_ENABLEDEnable audit logging for tool executions (default: false)
ZEPH_TOOLS_AUDIT_DESTINATIONAudit log destination: stdout or file path
ZEPH_SECURITY_REDACT_SECRETSRedact secrets in LLM responses (default: true)
ZEPH_TIMEOUT_LLMLLM call timeout in seconds (default: 120)
ZEPH_TIMEOUT_EMBEDDINGEmbedding generation timeout in seconds (default: 30)
ZEPH_TIMEOUT_A2AA2A remote call timeout in seconds (default: 30)
ZEPH_CONFIGPath to config file (default: config/default.toml)
ZEPH_TUIEnable TUI dashboard: true or 1 (requires tui feature)