Security

Zeph implements defense-in-depth security for safe AI agent operations in production environments.

Shell Command Filtering

All shell commands from LLM responses pass through a security filter before execution. Commands matching blocked patterns are rejected with detailed error messages.

12 blocked patterns by default:

| Pattern | Risk Category | Examples |
|---|---|---|
| rm -rf /, rm -rf /* | Filesystem destruction | Prevents accidental system wipe |
| sudo, su | Privilege escalation | Blocks unauthorized root access |
| mkfs, fdisk | Filesystem operations | Prevents disk formatting |
| dd if=, dd of= | Low-level disk I/O | Blocks dangerous write operations |
| curl \| bash, wget \| sh | Arbitrary code execution | Prevents remote code injection |
| nc, ncat, netcat | Network backdoors | Blocks reverse shell attempts |
| shutdown, reboot, halt | System control | Prevents service disruption |

Configuration:

[tools.shell]
timeout = 30
blocked_commands = ["custom_pattern"]  # Additional patterns (additive to defaults)
allowed_paths = ["/home/user/workspace"]  # Restrict filesystem access
allow_network = true  # false blocks curl/wget/nc
confirm_patterns = ["rm ", "git push -f"]  # Destructive command patterns

Custom blocked patterns are additive — you cannot weaken default security. Matching is case-insensitive.
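
The additive, case-insensitive behavior described above can be sketched as follows. This is an illustration only: it assumes plain substring matching on a lowercased command, the names `DEFAULT_BLOCKED`, `effective_patterns`, and `find_blocked` are hypothetical, and the default list is a small subset of the table rather than Zeph's actual list.

```rust
/// Illustrative subset of the default patterns from the table above.
/// The real default list and matching rules live in Zeph itself.
const DEFAULT_BLOCKED: &[&str] = &["rm -rf /", "sudo", "mkfs", "dd if=", "shutdown"];

/// Combine defaults with user-supplied patterns. Custom entries are only
/// ever appended, so configuration cannot remove a default.
fn effective_patterns(custom: &[&str]) -> Vec<String> {
    DEFAULT_BLOCKED
        .iter()
        .chain(custom.iter())
        .map(|p| p.to_lowercase())
        .collect()
}

/// Return the first pattern found in the command, compared case-insensitively.
/// Plain substring matching is an assumption of this sketch.
fn find_blocked<'a>(command: &str, patterns: &'a [String]) -> Option<&'a str> {
    let lowered = command.to_lowercase();
    patterns
        .iter()
        .find(|p| lowered.contains(p.as_str()))
        .map(|p| p.as_str())
}
```

Because `effective_patterns` only appends, a custom list can extend the filter but never shrink it. Substring matching is deliberately naive here; short patterns such as `su` or `nc` would need token-aware matching to avoid false positives.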

Shell Sandbox

Commands are validated against a configurable filesystem allowlist before execution:

  • allowed_paths = [] (default) restricts access to the working directory only
  • Paths are canonicalized to prevent traversal attacks (../../etc/passwd)
  • allow_network = false blocks network tools (curl, wget, nc, ncat, netcat)

Destructive Command Confirmation

Commands matching confirm_patterns trigger an interactive confirmation before execution:

  • CLI: y/N prompt on stdin
  • Telegram: inline keyboard with Confirm/Cancel buttons
  • Default patterns: rm, git push -f, git push --force, drop table, drop database, truncate
  • Configurable via tools.shell.confirm_patterns in TOML
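
A rough sketch of the CLI confirmation flow, assuming case-insensitive substring matching for `confirm_patterns` and a prompt where anything other than `y` or `yes` means "no". The function names are hypothetical; the reader and writer are generic so the same code works with stdin/stdout or in-memory buffers.

```rust
use std::io::{self, BufRead, Write};

/// Does the command contain any confirm pattern? Case-insensitive substring
/// matching is an assumption of this sketch.
fn needs_confirmation(command: &str, confirm_patterns: &[&str]) -> bool {
    let lowered = command.to_lowercase();
    confirm_patterns
        .iter()
        .any(|p| lowered.contains(p.to_lowercase().as_str()))
}

/// Interpret a y/N answer. An empty line or anything unexpected counts as "no".
fn is_affirmative(answer: &str) -> bool {
    matches!(answer.trim().to_lowercase().as_str(), "y" | "yes")
}

/// Print a y/N prompt and read one line of input.
fn confirm<R: BufRead, W: Write>(
    command: &str,
    input: &mut R,
    output: &mut W,
) -> io::Result<bool> {
    write!(output, "Run `{command}`? [y/N] ")?;
    output.flush()?;
    let mut answer = String::new();
    input.read_line(&mut answer)?;
    Ok(is_affirmative(&answer))
}
```

In a real CLI the call would pass `io::stdin().lock()` and `io::stdout()`. Defaulting to "no" on empty input matches the capital `N` in the prompt.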

File Executor Sandbox

FileExecutor enforces the same allowed_paths sandbox as the shell executor for all file operations (read, write, edit, glob, grep).

Path validation:

  • All paths are resolved to absolute form and canonicalized before access
  • Non-existing paths (e.g., for write) use ancestor-walk canonicalization: the resolver walks up the path tree to the nearest existing ancestor, canonicalizes it, then re-appends the remaining segments. This prevents symlink and .. traversal on paths that do not yet exist on disk
  • If the resolved path does not fall under any entry in allowed_paths, the operation is rejected with a SandboxViolation error
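
The ancestor-walk idea can be sketched with the standard library alone. This is not Zeph's resolver: `resolve_for_sandbox` and `is_allowed` are hypothetical names, and rejecting `..` in the not-yet-existing part of the path is a conservative simplification chosen for this example.

```rust
use std::io;
use std::path::{Component, Path, PathBuf};

/// Resolve `path` to a canonical absolute path even if it does not exist yet.
fn resolve_for_sandbox(path: &Path) -> io::Result<PathBuf> {
    let absolute = if path.is_absolute() {
        path.to_path_buf()
    } else {
        std::env::current_dir()?.join(path)
    };

    // Nearest ancestor that exists on disk. `symlink_metadata` does not follow
    // links, so a dangling symlink counts as existing and then fails to
    // canonicalize below instead of being silently re-appended.
    let ancestor = absolute
        .ancestors()
        .find(|p| p.symlink_metadata().is_ok())
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no existing ancestor"))?;

    let mut resolved = ancestor.canonicalize()?;
    let remainder = absolute
        .strip_prefix(ancestor)
        .map_err(|e| io::Error::new(io::ErrorKind::InvalidInput, e))?;
    for component in remainder.components() {
        match component {
            Component::Normal(segment) => resolved.push(segment),
            Component::CurDir => {}
            // Conservative choice for this sketch: `..` after a missing
            // segment cannot be checked against the filesystem, so reject it.
            _ => {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidInput,
                    "`..` in the non-existing part of the path",
                ))
            }
        }
    }
    Ok(resolved)
}

/// True if `resolved` lies under at least one canonical allowed root.
fn is_allowed(resolved: &Path, allowed_roots: &[PathBuf]) -> bool {
    allowed_roots.iter().any(|root| resolved.starts_with(root))
}
```

`Path::starts_with` compares whole components, so `/home/user/workspace-evil` does not count as being under `/home/user/workspace`. The allowed roots are assumed to be canonicalized once at startup.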

Glob and grep enforcement:

  • glob results are post-filtered: matched paths outside the sandbox are silently excluded
  • grep validates the search root directory before scanning begins

Configuration is shared with the shell sandbox:

[tools.shell]
allowed_paths = ["/home/user/workspace"]  # Empty = cwd only

Permission Policy

The [tools.permissions] config section provides fine-grained, pattern-based access control for each tool. Rules are evaluated in order (first match wins) using case-insensitive glob patterns against the tool input. See Tool System — Permissions for configuration details.

Key security properties:

  • Tools with all-deny rules are excluded from the LLM system prompt, preventing the model from attempting to use them
  • Legacy blocked_commands and confirm_patterns are auto-migrated to equivalent permission rules when [tools.permissions] is absent
  • Default action when no rule matches is Ask (confirmation required)
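
First-match-wins evaluation with an Ask fallback might look like the sketch below. The `Rule` and `Action` types, the `evaluate` function, and the tiny glob matcher (only `*` is supported) are assumptions made for illustration; the actual rule format is described in the Tool System chapter.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    Allow,
    Ask,
    Deny,
}

struct Rule {
    pattern: &'static str,
    action: Action,
}

/// Minimal case-insensitive glob: `*` matches any run of characters,
/// every other character matches itself.
fn glob_match(pattern: &str, input: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let s: Vec<char> = input.to_lowercase().chars().collect();
    let (mut pi, mut si) = (0, 0);
    // Position of the last `*` and the input index it currently covers up to.
    let mut star: Option<(usize, usize)> = None;
    while si < s.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, si));
            pi += 1;
        } else if pi < p.len() && p[pi] == s[si] {
            pi += 1;
            si += 1;
        } else if let Some((star_pi, star_si)) = star {
            // Let the last `*` absorb one more character and retry.
            pi = star_pi + 1;
            si = star_si + 1;
            star = Some((star_pi, star_si + 1));
        } else {
            return false;
        }
    }
    // Only trailing `*` may remain in the pattern.
    p[pi..].iter().all(|&c| c == '*')
}

/// Rules are checked in order; the first matching rule decides.
/// If nothing matches, the default is `Ask`.
fn evaluate(rules: &[Rule], input: &str) -> Action {
    rules
        .iter()
        .find(|r| glob_match(r.pattern, input))
        .map(|r| r.action)
        .unwrap_or(Action::Ask)
}
```

Ordering matters: a specific `Allow` rule placed before a broad `Deny` rule carves out an exception, while the reverse order would make the exception unreachable.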

Audit Logging

Structured JSON audit log for all tool executions:

[tools.audit]
enabled = true
destination = "./data/audit.jsonl"  # or "stdout"

Each entry includes timestamp, tool name, command, result (success/blocked/error/timeout), and duration in milliseconds.
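
As an illustration of what one JSONL entry could contain, the sketch below builds a line by hand using only the standard library. The field names (`timestamp_ms`, `tool`, `command`, `result`, `duration_ms`) and the helper names are assumptions, not Zeph's actual schema, and a real implementation would more likely use a JSON serialization crate.

```rust
/// The four outcomes listed above.
enum AuditResult {
    Success,
    Blocked,
    Error,
    Timeout,
}

impl AuditResult {
    fn as_str(&self) -> &'static str {
        match self {
            AuditResult::Success => "success",
            AuditResult::Blocked => "blocked",
            AuditResult::Error => "error",
            AuditResult::Timeout => "timeout",
        }
    }
}

/// Escape a string for embedding in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Format one audit entry as a single JSON line (hypothetical field names).
fn audit_line(
    timestamp_ms: u128,
    tool: &str,
    command: &str,
    result: &AuditResult,
    duration_ms: u64,
) -> String {
    format!(
        "{{\"timestamp_ms\":{},\"tool\":\"{}\",\"command\":\"{}\",\"result\":\"{}\",\"duration_ms\":{}}}",
        timestamp_ms,
        json_escape(tool),
        json_escape(command),
        result.as_str(),
        duration_ms
    )
}
```

One entry per line keeps the file appendable and easy to process with line-oriented tools.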

Secret Redaction

LLM responses are scanned for common secret patterns before display:

  • Detected patterns: sk-, AKIA, ghp_, gho_, xoxb-, xoxp-, sk_live_, sk_test_, -----BEGIN
  • Secrets replaced with [REDACTED] preserving original whitespace formatting
  • Enabled by default (security.redact_secrets = true), applied to both streaming and non-streaming responses
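
A simplified sketch of whitespace-preserving redaction follows. It assumes a secret is a whitespace-delimited token that starts with one of the listed markers; `redact_secrets` and `flush_token` are hypothetical names, and multi-line material such as a full PEM block is not handled here.

```rust
/// Markers from the list above.
const SECRET_MARKERS: &[&str] = &[
    "sk-", "AKIA", "ghp_", "gho_", "xoxb-", "xoxp-", "sk_live_", "sk_test_", "-----BEGIN",
];

/// Replace every token that starts with a known marker, copying all
/// whitespace through unchanged so the layout of the text is preserved.
fn redact_secrets(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut token = String::new();
    for c in text.chars() {
        if c.is_whitespace() {
            flush_token(&mut token, &mut out);
            out.push(c);
        } else {
            token.push(c);
        }
    }
    flush_token(&mut token, &mut out);
    out
}

/// Append the pending token (redacted if needed) and reset it.
fn flush_token(token: &mut String, out: &mut String) {
    if token.is_empty() {
        return;
    }
    if SECRET_MARKERS.iter().any(|m| token.starts_with(*m)) {
        out.push_str("[REDACTED]");
    } else {
        out.push_str(token);
    }
    token.clear();
}
```

Matching on the token prefix rather than anywhere in the token avoids redacting ordinary words such as `task-list`, at the cost of missing secrets glued to other text (for example `key=sk-...`).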

Timeout Policies

Configurable per-operation timeouts prevent hung connections:

[timeouts]
llm_seconds = 120       # LLM chat completion
embedding_seconds = 30  # Embedding generation
a2a_seconds = 30        # A2A remote calls
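
To show what a per-operation timeout means in practice, here is a standard-library sketch that runs an operation on a worker thread and gives up waiting after a limit. `run_with_timeout` is a hypothetical helper; Zeph's own enforcement is presumably tied to its async runtime and is not shown here.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `operation` on a worker thread and wait at most `limit` for its result.
/// Returns `None` on timeout. The worker is not cancelled; it simply finishes
/// in the background and its result is dropped.
fn run_with_timeout<T, F>(limit: Duration, operation: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that error.
        let _ = tx.send(operation());
    });
    rx.recv_timeout(limit).ok()
}
```

With the configuration above, an LLM call would be wrapped with `Duration::from_secs(120)` and an embedding or A2A call with `Duration::from_secs(30)`.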

A2A Network Security

  • TLS enforcement: a2a.require_tls = true rejects HTTP endpoints (HTTPS only)
  • SSRF protection: a2a.ssrf_protection = true resolves the target hostname via DNS and rejects addresses in private ranges (RFC 1918, loopback, link-local)
  • Payload limits: a2a.max_body_size caps request body (default: 1 MiB)
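
The address check behind SSRF protection can be sketched as below. The function names are hypothetical, and the IPv6 handling (unique local `fc00::/7`, link-local `fe80::/10`, IPv4-mapped addresses) is an assumption added for completeness; the text above only names RFC 1918, loopback, and link-local.

```rust
use std::net::IpAddr;

/// True if the address is private, loopback, or link-local.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => v4.is_private() || v4.is_loopback() || v4.is_link_local(),
        IpAddr::V6(v6) => {
            // Treat ::ffff:a.b.c.d like the IPv4 address it wraps.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_blocked_ip(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}

/// A hostname is acceptable only if it resolved to at least one address and
/// none of the resolved addresses is blocked.
fn all_addresses_public(resolved: &[IpAddr]) -> bool {
    !resolved.is_empty() && resolved.iter().all(|ip| !is_blocked_ip(*ip))
}
```

Checking every resolved address, not just the first, matters because a hostname can resolve to a mix of public and internal addresses. The connection should then use the addresses that were checked, so a second DNS lookup cannot return a different answer.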

Safe execution model:

  • Commands parsed for blocked patterns, then sandbox-validated, then confirmation-checked
  • Timeout enforcement (default: 30s, configurable)
  • Full errors logged to system, sanitized messages shown to users
  • Audit trail for all tool executions (when enabled)

Container Security

| Security Layer | Implementation | Status |
|---|---|---|
| Base image | Oracle Linux 9 Slim | Production-hardened |
| Vulnerability scanning | Trivy in CI/CD | 0 HIGH/CRITICAL CVEs |
| User privileges | Non-root zeph user (UID 1000) | Enforced |
| Attack surface | Minimal package installation | Distroless-style |

Continuous security:

  • Every release scanned with Trivy before publishing
  • Automated Dependabot PRs for dependency updates
  • cargo-deny checks in CI for license/vulnerability compliance

Code Security

Rust-native memory safety guarantees:

  • Minimal unsafe: One audited unsafe block behind candle feature flag (memory-mapped safetensors loading). Core crates enforce #![deny(unsafe_code)]
  • No panic in production: unwrap() and expect() linted via clippy
  • Secure dependencies: All crates audited with cargo-deny
  • MSRV policy: Rust 1.88+ (Edition 2024) for latest security patches

Reporting Vulnerabilities

Do not open a public issue. Use GitHub Security Advisories to submit a private report.

Include: description, steps to reproduce, potential impact, suggested fix. Expect an initial response within 72 hours.