Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

Unreleased

[0.19.1] - 2026-04-15

Added

TaskSupervisor observability — CPU and wall-time metrics for supervised tasks, visible in Jaeger traces and tokio-console. See Observability & Cost. (The task-metrics feature flag was consolidated as always-on in v0.20.x — no feature flag required.)
TUI task registry panel — new /tasks command displays live table of all supervised tasks (name, state, uptime, restart count). See TUI Dashboard.
Per-chunk code indexing supervision — CodeIndexer now integrates with TaskSupervisor for fine-grained visibility of concurrent embedding tasks. Each chunk operation is registered as a separate task (chunk_file_{N}) in the supervisor registry.
Bootstrap TaskSupervisor migration — 7 memory background loops (eviction, tier promotion, consolidation, forgetting, compression, tree consolidation) migrated to TaskSupervisor with restart policies.

Changed

RuntimeContext consolidation — new RuntimeContext struct carries runtime mode flags (tui_mode, daemon_mode). All subsystem initializers now accept a single RuntimeContext instead of individual bool parameters.

Fixed

CPU/RAM regressions — graph community detection OOM guard, Qdrant upsert timeout (30s), Box::leak eliminated via Arc<str>, file watcher debounce (500ms), TUI log fallback to platform log directory, blocking task capacity limits, concurrent index_project re-entry guard, OTLP circuit breaker, audit log TUI redirect, and safe IndexerConfig defaults.
TUI performance — eliminated per-frame message list clones (20K clones/sec), reduced context assembler pre-allocation for large-context providers.

For full details, see the GitHub release.

[0.19.0] - 2026-04-13

Changed

Slash command handler registry consolidation — /skill, /skills, /feedback, /compact, /mcp, /lsp, /scheduler, /experiment, and /log commands are now dispatched through the CommandHandler registry. This improves consistency with the rest of the command palette and enables better discoverability. The legacy dispatch_slash_command function has been removed. No user-facing behavior changes — all commands work as before.
Compaction internals refactoring — the compact_context function has been restructured using owned types (Vec<Message>, String, AnyProvider) instead of references, eliminating borrows held across .await boundaries. This internal change improves code maintainability and resolves Rust HRTB (higher-ranked trait bound) constraints.

Security

Path traversal hardening in ImageCommand — the /image command now rejects absolute paths (e.g., /etc/passwd) in addition to traversal sequences like ../. This mirrors the equivalent protection added to the CLI channel in v0.18.6. Images must be relative paths within the working directory.
Upgraded rand to 0.10 — fixed RUSTSEC-2025-0097 by upgrading from the unsound rand 0.8.5 to the safe rand 0.10.1. All call sites updated to use the new RngExt::random_range() API.

[0.18.5] - 2026-04-07

Added

Per-provider cost breakdown — CostTracker now accumulates per-provider token counts (input, cache read/write, output) and cost. The /status CLI command and TUI /cost view render a per-provider table sorted by cost. See Observability & Cost.
ASI coherence tracking — per-provider sliding window of response embeddings penalizes Thompson/EMA routing when coherence drops. Enabled via [llm.routing.asi]. See Adaptive Inference.
Unified quality gate — optional post-selection embedding similarity check via [llm.routing] quality_gate. See Adaptive Inference.
Time-based microcompact — stale low-value tool outputs are cleared after an idle gap, at zero LLM cost. Configurable via [memory.microcompact]. See Memory & Context.
autoDream background consolidation — post-session memory consolidation sweep behind a session-count and time gate. Configurable via [memory.autodream]. See Memory & Context.
MagicDocs auto-maintained markdown — files with # MAGIC DOC: header are rewritten after tool-call turns by a background LLM task. Configurable via [magic_docs]. See Memory & Context.
Key facts semantic dedup — near-duplicate key_facts are silently skipped before Qdrant insertion. Configurable via memory.key_facts_dedup_threshold. See Memory & Context.
MCP error codes — McpErrorCode enum with is_retryable() for caller-side retry classification. See Tool System.
Caller identity propagation — ToolCall.caller_id and AuditEntry.policy_match are now populated from the channel layer. See Tool System.
Per-session tool call quota — tools.max_tool_calls_per_session limits tool executions per session. See Tool System.
OAP authorization config — [tools.authorization] TOML section merges rules into PolicyEnforcer at startup. See Policy Enforcer.
Scheduler CLI subcommand — zeph schedule list/add/remove/show for managing jobs outside an agent session. See Scheduler.

Fixed

spawn_asi_update debounce — exactly one embed call per agent turn instead of N concurrent sub-calls.
MagicDocs scan now detects ToolOutput in Role::User messages and ToolResult in the native execution path.
Filter policy-decision language from key_facts at store time — transient enforcement facts ("blocked", "permission denied", etc.) are no longer embedded.
Compaction failure is now surfaced as a user-visible message instead of a silent tracing::warn.
MCP manager shuts down explicitly before runtime exit, killing stdio child processes cleanly.
BPE tokenizer data is cached in a OnceLock, eliminating repeated disk loads on TokenCounter construction.
Unbounded RSS growth in TUI mode: cancel bridge tasks, message buffer, URL set, and scheduler tick storm all addressed.

[0.17.1] - 2026-03-27

Added

Tool error taxonomy — ToolErrorCategory classifies tool failures into 11 categories driving retry, parameter-reformat, and reputation-scoring decisions. ToolErrorFeedback::format_for_llm() replaces opaque error strings with structured [tool_error] blocks. ToolError::Shell carries an explicit category and exit code. See Tool System.
MCP per-server trust levels — [[mcp.servers]] entries accept trust_level (trusted/untrusted/sandboxed) and tool_allowlist. Sandboxed servers expose only explicitly listed tools (fail-closed). Untrusted servers with no allowlist emit a startup warning. See MCP Integration.
Candle-backed classifiers — CandleClassifier runs protectai/deberta-v3-small-prompt-injection-v2 for injection detection. CandlePiiClassifier runs iiiorg/piiranha-v1-detect-personal-information (NER) for PII detection; results are merged with the regex filter. Configured via the new [classifiers] section. Requires classifiers feature. See Local Inference.
SYNAPSE hybrid seed selection — SYNAPSE spreading activation now ranks seed entities by hybrid_score = fts_score * (1 - seed_structural_weight) + structural_score * seed_structural_weight. New config fields: seed_structural_weight (default: 0.4) and seed_community_cap (default: 3).
A-MEM link weight evolution — edges accumulate retrieval_count; composite scoring uses evolved_weight(count, confidence) = confidence * (1 + 0.2 * ln(1 + count)).min(1.0). A background decay task reduces counts over time via link_weight_decay_lambda and link_weight_decay_interval_secs.
Topology-aware orchestration — TopologyClassifier classifies DAG structure (AllParallel, LinearChain, FanOut, FanIn, Hierarchical, Mixed) and selects a dispatch strategy (FullParallel, Sequential, LevelBarrier, Adaptive). LevelBarrier dispatch fires tasks level-by-level for hierarchical plans. Enable with topology_selection = true (requires experiments feature).
Per-task execution_mode — planner annotates tasks with parallel (default) or sequential to hint the scheduler. Missing fields in stored graphs default to parallel for backward compatibility.
PlanVerifier completeness checking — post-task LLM verification produces a structured VerificationResult with gap severity levels (critical/important/minor). replan() injects new TaskNodes for actionable gaps. All failures are fail-open. Configure via verify_provider. See Task Orchestration.
rmcp 1.3 — updated from rmcp 1.2.

[0.15.3] - 2026-03-17

Fixed

ACP config fallback (#1945) — resolve_config_path() now falls back to ~/.config/zeph/config.toml when config/default.toml is absent relative to CWD; resolves ACP stdio/HTTP startup failure when launched from an IDE workspace directory.
TUI filter metrics zero (#1939) — filter metrics (filter_raw_tokens, filter_saved_tokens, filter_applications) no longer show zero in the TUI dashboard during native tool execution. Extracted record_filter_metrics helper and called from all four metric-recording sites.
Graph metrics initialization (#1938) — TUI graph metrics panel now shows correct entity/edge/community counts on startup. App::with_metrics_rx() eagerly reads the initial snapshot; graph extraction now awaits the background task and re-reads counts.
TUI tool start events (#1931) — native tool calls now emit ToolStart events so the TUI shows a spinner and $ command header before tool output arrives.
Graph metrics per-turn update (#1932) — graph memory metrics (entities/edges/communities) now update every turn via per-turn sync_graph_counts() call.

Added

OAuth 2.1 PKCE for MCP (#1930) — McpTransport::OAuth variant with url, scopes, callback_port, client_name. McpManager::with_oauth_credential_store() for credential persistence via VaultCredentialStore. Two-phase connect_all(): stdio/HTTP concurrently, OAuth sequentially. SSRF validation on all OAuth metadata endpoints.
Background code indexing progress (#1923) — IndexProgress struct with files_done, files_total, chunks_created. CLI prints progress to stderr; TUI shows “Indexing codebase… N/M files (X%)” in status bar.
Real behavioral learning (#1913) — LearningEngine now injects inferred user preferences (verbosity, response format, language) into the volatile system prompt block. Preferences learned from corrections via watermark-based incremental scan every 5 turns. Wilson-score confidence threshold gates persistence.
Context compression overrides (#1904) — CLI flags --focus/--no-focus, --sidequest/--no-sidequest, --pruning-strategy <reactive|task_aware|mig> for per-session overrides. --init wizard step added. (task_aware_mig removed in v0.16.1 — was dead code; existing configs fall back to reactive with a warning.)
Orchestration metrics (#1899) — LlmPlanner::plan() and LlmAggregator::aggregate() return token usage; /status command shows Orchestration block when plans executed.
Memory integration tests (#1916) — four #[ignore] tests for session summary → Qdrant roundtrip using testcontainers.

[0.15.2] - 2026-03-16

Added

Per-conversation compression guidelines — the compression_guidelines table gains a conversation_id column (migration 034). Guidelines are now scoped to a specific conversation when one is in scope; the global (NULL) guideline is used as fallback. Configure via [memory.compression_guidelines]; toggle with --compression-guidelines. See Context Engineering.
Session summary on shutdown (#1816) — when no hard compaction fired during a session, the agent generates a lightweight LLM summary at shutdown and stores it in the vector store for cross-session recall. Configurable via memory.shutdown_summary, shutdown_summary_min_messages (default 4), and shutdown_summary_max_messages (default 20). The --init wizard prompts for the toggle; a TUI spinner appears during summarization.
Declarative policy compiler (#1695) — PolicyEnforcer evaluates TOML-based allow/deny rules before any tool executes. Deny-wins semantics; path traversal normalization; tool name normalization. Configure via [tools.policy] with enabled, default_effect, rules, and policy_file. CLI: --policy-file. Slash commands: /policy status, /policy check [--trust-level <level>]. Feature flag: policy-enforcer (included in full). See Policy Enforcer.
Pre-execution action verification (#1630) — pluggable PreExecutionVerifier pipeline runs before any tool executes. Two built-in verifiers: DestructiveCommandVerifier (blocks rm -rf /, dd if=, mkfs, etc. outside configured allowed_paths) and InjectionPatternVerifier (blocks SQL injection, command injection, path traversal; warns on SSRF). Configure via [security.pre_execution_verify]. CLI escape hatch: --no-pre-execution-verify. TUI security panel shows block/warn counters.
LLM guardrail pre-screener (#1651) — GuardrailFilter screens user input (and optionally tool output) through a guard model before it enters agent context. Configurable action (block/warn), fail strategy (closed/open), timeout, and max_input_chars. Enable with --guardrail or [security.guardrail] enabled = true. TUI status bar: GRD:on (green) or GRD:warn (yellow). Slash command: /guardrail for live stats.
Skill content scanner (#1853) — SkillContentScanner scans all loaded skill bodies for injection patterns at startup when [skills.trust] scan_on_load = true (default). Scanner is advisory: findings are WARN-logged and do not downgrade trust or block tools. On-demand: /skill scan TUI command, --scan-skills-on-load CLI flag.
OTLP-compatible debug traces (#1343) — --dump-format trace emits OpenTelemetry-compatible JSON traces with span hierarchy: session → iteration → LLM request / tool call / memory search. Configure endpoint and service name via [debug.traces]. Switch at runtime: /dump-format <json|raw|trace>. --init wizard prompts for format when debug dump is enabled.
TUI: compression guidelines status (#1803) — memory panel shows guidelines version and last update timestamp. /guidelines slash command displays current guidelines text.
Feature use-case bundles (#1831) — six named bundles group related features: desktop (tui + scheduler + compression-guidelines), ide (acp + acp-http + lsp-context), server (gateway + a2a + scheduler + otel), chat (discord + slack), ml (candle + pdf + stt), full (all except ml/hardware). Individual feature flags are unchanged. See Feature Flags.

Changed

Cascade router observability (#1825) — cascade_chat and cascade_chat_stream now emit structured tracing events for provider selection, judge scoring, quality verdict, escalation, and budget exhaustion.
ACP session config centralization (#1812) — AgentSessionConfig::from_config() and Agent::apply_session_config() replace ~25 individually-copied fields in daemon/runner/ACP session bootstrap. Fixes missing orchestration config and server compaction in daemon sessions.
rmcp 0.17 → 1.2 (#1845) — migrated CallToolRequestParams to builder pattern.

Fixed

Scheduler deadlock no longer emits misleading “Plan failed. 0/N tasks failed” — non-terminal tasks are marked Canceled at deadlock time; done message distinguishes deadlock, mixed failure, and normal failure paths (#1879).
MCP tools are now denied for quarantined skills — TrustGateExecutor tracks registered MCP tool IDs and blocks any call in the set (#1876).
Policy tool="shell" / "sh" / "bash" aliases now all match ShellExecutor at rule compile time (#1877).
/policy check no longer leaks process environment variables into trace output (#1873).
PolicyEffect::AllowIf variant removed — it was identical to Allow and generated misleading TOML docs (#1871).
Overflow notice format changed to [full output stored — ID: {uuid} — ...]; read_overflow accepts bare UUIDs and strips the legacy overflow: prefix (#1868).
Session summary timeout attempts plain-text fallback instead of silently returning None; shutdown_summary_timeout_secs (default 10) replaces hardcoded 5 s limit (#1869).
JWT Bearer tokens (Authorization: Bearer <token>, eyJ...) are now redacted before compression_failure_pairs SQLite insert (#1847).
Soft compaction threshold lowered from 0.70 to 0.60; maybe_soft_compact_mid_iteration() fires after per-tool summarization to relieve context pressure without triggering LLM calls (#1828).
Ollama base_url with /v1 suffix no longer causes 404 on embed calls (#1832).
Graph memory: entity embeddings now correctly stored in Qdrant — EntityResolver was built without a provider in extract_and_store() (#1817, #1829).
Debug trace.json written inside per-session subdir, preventing overwrites (#1814).
JIT tool reference injection works after overflow migration to SQLite (#1818).
Policy symlink boundary check: load_policy_file() canonicalizes the path and rejects files outside the process working directory (#1872).

[0.15.1] - 2026-03-15

Fixed

save_compression_guidelines atomic write — the version-number assignment now uses a single INSERT ... SELECT COALESCE(MAX(version), 0) + 1 statement, eliminating the read-then-write TOCTOU race where two concurrent callers could insert duplicate version numbers. Migration 033 adds a UNIQUE(version) constraint to the compression_guidelines table with row-level deduplication for pre-existing corrupt data (closes #1799).

Added

Failure-driven compression guidelines (ACON) — after hard compaction, the agent watches subsequent LLM responses for two-signal context-loss indicators (uncertainty phrase + prior-context reference). Confirmed failure pairs are stored in SQLite (compression_failure_pairs). A background updater wakes periodically, calls the LLM to synthesize updated guidelines from accumulated pairs, sanitizes the output to strip prompt injection, and persists the result. Guidelines are injected into every future compaction prompt via a <compression-guidelines> block. Configure via [memory.compression_guidelines]; disabled by default. See Context Engineering.

[0.15.0] - 2026-03-14

Added

Gemini provider — full Google Gemini API support across 6 phases: basic chat (generateContent), SSE streaming with thinking-part support, native tool use / function calling, vision / multimodal input (inlineData), semantic embeddings (embedContent), and remote model discovery (GET /v1beta/models). Default model: gemini-2.0-flash; extended thinking available with gemini-2.5-pro. Configure with [llm.gemini] and ZEPH_GEMINI_API_KEY. See LLM Providers.
Gemini thinking_level / thinking_budget support — GeminiThinkingConfig with thinking_level (minimal, low, medium, high), thinking_budget (validated -1/0/1–32768), and include_thoughts fields. Applies to Gemini 2.5+ models. Configurable in [llm.gemini] and the --init wizard.
Cascade routing strategy — new strategy = "cascade" for the router provider. Tries providers cheapest-first; escalates only when the response is classified as degenerate (empty, repetitive, incoherent). Heuristic and LLM-judge classifier modes. Configure via [llm.router.cascade] with quality_threshold, max_escalations, classifier_mode, and max_cascade_tokens. See Adaptive Inference.
Claude server-side context compaction — [llm.cloud] server_compaction = true enables the compact-2026-01-12 beta API. Claude manages context on the server side; compaction summaries stream back and are surfaced in the TUI. Graceful fallback to client-side compaction when the beta header is rejected (e.g. on Haiku models). New server_compaction_events metric. Enable with --server-compaction.
Claude 1M extended context window — [llm.cloud] enable_extended_context = true injects the context-1m-2025-08-07 beta header, unlocking 1M token context for Opus 4.6 and Sonnet 4.6. context_window() reports 1,000,000 when active so auto_budget scales correctly. Configurable in --init wizard.
/scheduler list command and list_tasks tool — lists all active scheduled tasks with NAME, KIND, MODE, and NEXT RUN columns. LLM-callable via the list_tasks tool; also available as /scheduler list slash command. See Scheduler.
search_code tool — unified hybrid code search combining tree-sitter structural extraction, Qdrant semantic search, and LSP symbol resolution. Always available (no feature flag). See Tools.
zeph migrate-config — CLI command to add missing config parameters as commented-out blocks and reformat the file. Idempotent; never modifies existing values. See Migrate Config.
ACP readiness probes — /health HTTP endpoint returns 200 OK when ready; stdio transport emits zeph/ready JSON-RPC notification as the first outbound packet.
Request metadata in debug dumps — model, token limit, temperature, exposed tools, and cache breakpoints included in both json and raw dump formats.

Changed

Tiered context compaction (#1338): replaced single compaction_threshold with soft tier (soft_compaction_threshold, default 0.70 — prune tool outputs + apply deferred summaries, no LLM) and hard tier (hard_compaction_threshold, default 0.90 — full LLM summarization). Old compaction_threshold field still accepted via serde alias. deferred_apply_threshold removed — absorbed into soft tier. See Context Engineering.
Async parallel dispatch in DagScheduler — tick() now dispatches all ready tasks simultaneously instead of capping at max_parallel - running. Concurrency enforced by SubAgentManager returning ConcurrencyLimit; tasks revert to Ready and retry on the next tick.
/plan cancel during execution — cancel commands delivered immediately during active plan execution via concurrent channel polling.
DagScheduler exponential backoff — concurrency-limit deferral uses 250ms→500ms→1s→2s→4s (cap 5s) instead of a fixed 250ms sleep.
Single shared QdrantOps instance — all subsystems share one gRPC connection instead of creating independent connections on startup.
zeph-index always-on — the index feature flag is removed; tree-sitter and code intelligence are compiled into every build.
Graph memory chunked edge loading — community detection loads edges in configurable chunks (keyset pagination) instead of loading all edges at once, reducing peak memory on large graphs. Configurable via memory.graph.lpa_edge_chunk_size (default: 10,000).

Security

SEC-001–004 tool execution hardening — randomized hash seeds, jitter-free retry timing, tool name length limits, wall-clock retry budget. See Security.
Shell blocklist unconditional — blocked_commands and DEFAULT_BLOCKED now apply regardless of PermissionPolicy configuration; previously skipped when a policy was attached.

Fixed

Context compaction loop: maybe_compact() now detects when the token budget is too tight to make progress (compactable message count ≤ 1, or compaction produced zero net token reduction, or context remains above threshold after a successful summarization pass) and sets a permanent compaction_exhausted flag. Subsequent calls skip compaction entirely and emit a one-time user-visible warning to increase context_budget_tokens or start a new session (#1727).
Claude server compaction: ContextManagement struct now serializes to the correct API shape (auto_truncate type with nested trigger); the previous shape caused non-functional --server-compaction.
Haiku models: with_server_compaction(true) now emits WARN and keeps the flag disabled (the compact-2026-01-12 beta is not supported for Haiku).
Skill embedding log noise: SkillMatcher::new() no longer emits one WARN per skill when the provider does not support embeddings — all EmbedUnsupported errors are summarised into a single info-level message.
OpenAI / Gemini: tools with no parameters no longer cause 400 Bad Request in strict mode.
Anomaly detector: outcomes now recorded correctly for native tool-use providers (Claude, OpenAI, Gemini).

[0.14.3] - 2026-03-10

See CHANGELOG.md for full release notes.

[0.14.2] - 2026-03-09

See CHANGELOG.md for full release notes.

[0.14.1] - 2026-03-07

See CHANGELOG.md for full release notes.

[0.14.0] - 2026-03-06

See CHANGELOG.md for full release notes.

[0.12.5] - 2026-03-02

See CHANGELOG.md for full release notes.

[0.12.4] - 2026-03-01

Added

list_directory tool in FileExecutor: sorted entries with [dir]/[file]/[symlink] labels; uses lstat to avoid following symlinks (#1053)
create_directory, delete_path, move_path, copy_path tools in FileExecutor: structured file system mutation ops, all paths sandbox-validated; copy_dir_recursive uses lstat to prevent symlink escape (#1054)
fetch tool in WebScrapeExecutor: plain URL-to-text without CSS selector requirement, SSRF protection applied (#1055)
DiagnosticsExecutor with diagnostics tool: runs cargo check or cargo clippy --message-format=json, returns structured error/warning list (file, line, col, severity, message), output capped, graceful degradation if cargo absent (#1056)
list_directory and find_path tools in AcpFileExecutor: run on agent filesystem when IDE advertises fs.readTextFile capability; paths sandbox-validated, glob segments validated against .. traversal, results capped at 1000 (#1059)
ToolFilter: suppresses local FileExecutor tools (read, write, glob) when AcpFileExecutor provides IDE-proxied alternatives (#1059)
check_blocklist() and DEFAULT_BLOCKED_COMMANDS extracted to zeph-tools public API so AcpShellExecutor applies the same blocklist as ShellExecutor (#1050)
ToolPermission enum with per-binary pattern support in persisted TOML ([tools.bash.patterns]); deny patterns route to RejectAlways fast-path without IDE round-trip (#1050)
Self-learning loop (Phase 1–4): FailureKind enum, /skill reject, FeedbackDetector, UserCorrection cross-session recall, Wilson score Bayesian re-ranking, check_trust_transition(), BM25+RRF hybrid search, EMA routing (#1035)

Changed

Renamed FileExecutor tool id glob → find_path to align with Zed IDE native tool surface (#1052)
READONLY_TOOLS allowlist updated to current tool IDs: read, find_path, grep, list_directory, web_scrape, fetch (#1052)
CI: migrated from Dependabot to self-hosted Renovate with MSRV-aware constraintsFiltering: strict and grouped minor/patch automerge (#1048)

Security

ACP permission gate: subshell injection ($(, backtick) blocked before pattern matching; effective_shell_command() checks inner command of bash -c <cmd> against blocklist; extract_command_binary() strips transparent prefixes to prevent allow-always scope expansion (SEC-ACP-C1, SEC-ACP-C2) (#1050)
ACP tool notifications: raw_response is now passed through redact_json before forwarding to claudeCode.toolResponse; prevents secrets from bypassing the redact_secrets pipeline (SEC-ACP-001)

Fixed

ACP: terminal release deferred until after tool_call_update notification is dispatched (#1013)
ACP: tool execution output forwarded via LoopbackEvent::ToolOutput to ACP channel (#1003)
ACP: newlines preserved in tool output for IDE terminal widget (#1034)

[0.12.1] - 2026-02-25

Security

Enforce unsafe_code = "deny" at workspace lint level; audited unsafe blocks (mmap via candle, std::env in tests) annotated with #[allow(unsafe_code)] (#867)
AgeVaultProvider secrets map switched from HashMap to BTreeMap for deterministic JSON key ordering on vault.save() (#876)
WebScrapeExecutor: redirect targets now validated against private/internal IP ranges to prevent SSRF via redirect chains (#871)
Gateway webhook payload: per-field length limits (sender/channel <= 256 bytes, body <= 65536 bytes) and ASCII control-char stripping to prevent prompt injection (#868)
ACP permission cache: null bytes stripped from tool names before cache key construction to prevent key collision (#872)
gateway.max_body_size bounded to 10 MiB (10,485,760 bytes) at config validation to prevent memory exhaustion (#875)
Shell sandbox: <(, >(, <<<, eval added to default confirm_patterns to mitigate process substitution, here-string, and eval bypass vectors (#870)

Performance

ClaudeProvider caches pre-serialized ToolDefinition slices; cache is invalidated only when the tool set changes, eliminating per-call JSON construction overhead (#894)
should_compact() replaced O(N) message scan with direct comparison against cached_prompt_tokens (#880)
EnvironmentContext cached on Agent; only git_branch refreshed on skill reload instead of spawning a full git subprocess per turn (#881)
Doom-loop content hashed in-place by feeding stable message parts directly into the hasher, eliminating the intermediate normalized String allocation (#882)
prune_stale_tool_outputs: count_tokens called once per ToolResult part instead of twice (#883)
Composite covering index (conversation_id, id) on messages table (migration 015) replaces single-column index; eliminates post-filter sort step (#895)
load_history_filtered rewritten as a CTE, replacing the previous double-sort subquery (#896)
remove_tool_responses_middle_out takes ownership of the message Vec instead of cloning; HashSet replaced with Vec::with_capacity for small-N index tracking (#884, #888)
Fast-path parts_json == "[]" check in history load functions skips serde parse on the common empty case (#886)
consolidate_summaries uses String::with_capacity + write! loop instead of collect::<Vec<_>>().join() (#887)
TUI tui_loop() skips terminal.draw() when no events occurred in the 250ms tick, reducing idle CPU usage (#892)

Added

sqlite_pool_size: u32 in MemoryConfig (default 5) — configurable via [memory] sqlite_pool_size (#893)
Background cleanup task for ResponseCache::cleanup_expired() — interval configurable via [memory] response_cache_cleanup_interval_secs (default 3600s) (#891)
schema feature flag in zeph-llm gating schemars dependency and typed output API (#879)

Changed

check_summarization() uses in-memory unsummarized_count counter on MemoryState instead of issuing a COUNT(*) SQL query on every message save (#890)
Removed 4 channel.send_status() calls from persist_message() in zeph-core — SQLite WAL inserts < 1ms do not warrant status reporting (#889)
Default Ollama model changed from mistral:7b to qwen3:8b; "qwen3" and "qwen" added as ChatML template aliases (#897)
src/main.rs split into focused modules: runner.rs, agent_setup.rs, tracing_init.rs, tui_bridge.rs, channel.rs, tests.rs — main.rs reduced to 26 LOC (#839)
zeph-core/src/bootstrap.rs split into submodule directory: config.rs, health.rs, mcp.rs, provider.rs, skills.rs, tests.rs — bootstrap/mod.rs reduced to 278 LOC (#840)
SkillTrustRow.source_kind changed from String to SourceKind enum (Local, Hub, File) with serde DB serialization (#848)
ScheduledTaskConfig.kind changed from String to ScheduledTaskKind enum (#850)
TrustLevel moved to zeph-tools::trust_level; zeph-skills re-exports it, removing the zeph-tools → zeph-skills reverse dependency (#841)
Duplicate ChannelError removed from zeph-channels::error; all channel adapters use zeph_core::channel::ChannelError (#842)
zeph_a2a::types::TaskState replaced in zeph-core with a local SubAgentState enum; zeph-a2a removed from zeph-core dependencies (#843)
zeph-index Qdrant access consolidated through VectorStore trait from zeph-memory; direct qdrant-client dependency removed (#844)
content_hash(data: &[u8]) -> String utility added to zeph-core::hash backed by BLAKE3 (#845)
zeph-core::diff re-export module removed; zeph_core::DiffData is now a direct re-export of zeph_tools::executor::DiffData (#846)
ContextManager, ToolOrchestrator, LearningEngine extracted from Agent into standalone structs with pure delegation (#830, #836, #837, #838)
Secret type wraps inner value in Zeroizing<String>; Clone removed (#865)
AgeVaultProvider secrets and intermediate decrypt/encrypt buffers wrapped in Zeroizing (#866, #874)
A2aServer::serve() and GatewayServer::serve() emit tracing::warn! when auth_token is None (#869, #873)

0.12.0 - 2026-02-24

Added

MessageMetadata struct in zeph-llm with agent_visible, user_visible, compacted_at fields; default is both-visible for backward compat (#M28)
Message.metadata field with #[serde(default)] — existing serialized messages deserialize without change
SQLite migration 013_message_metadata.sql — adds agent_visible, user_visible, compacted_at columns to messages table
save_message_with_metadata() in SqliteStore for saving messages with explicit visibility flags
load_history_filtered() in SqliteStore — SQL-level filtering by agent_visible / user_visible
replace_conversation() in SqliteStore — atomic compaction: marks originals user_only, inserts summary as agent_only
oldest_message_ids() in SqliteStore — returns N oldest message IDs for a conversation
Agent.load_history() now loads only agent_visible=true messages, excluding compacted originals
compact_context() persists compaction atomically via replace_conversation(), falling back to legacy summary storage if DB IDs are unavailable
Multi-session ACP support with configurable max_sessions (default 4) and LRU eviction of idle sessions (#781)
session_idle_timeout_secs config for automatic session cleanup (default 30 min) with background reaper task (#781)
ZEPH_ACP_MAX_SESSIONS and ZEPH_ACP_SESSION_IDLE_TIMEOUT_SECS env overrides (#781)
ACP session persistence to SQLite — acp_sessions and acp_session_events tables with conversation replay on load_session per ACP spec (#782)
SqliteStore methods for ACP session lifecycle: create_acp_session, save_acp_event, load_acp_events, delete_acp_session, acp_session_exists (#782)
TokenCounter in zeph-memory — accurate token counting with tiktoken-rs cl100k_base, replacing chars/4 heuristic (#789)
DashMap-backed token cache (10k cap) for amortized O(1) lookups
OpenAI tool schema token formula for precise context budget allocation
Input size guard (64KB) on token counting to prevent cache pollution from oversized input
Graceful fallback to chars/4 when tiktoken tokenizer is unavailable
Configurable tool response offload — OverflowConfig with threshold (default 50k chars), retention (7 days), optional custom dir (#791)
[tools.overflow] section in config.toml for offload configuration
Security hardening: path canonicalization, symlink-safe cleanup, 0o600 file permissions on Unix
Wire AcpContext (IDE-proxied FS, shell, permissions) through AgentSpawner into agent tool chain via CompositeExecutor — ACP executors take priority with automatic local fallback (#779)
DynExecutor newtype in zeph-tools for object-safe ToolExecutor composition in CompositeExecutor (#779)
cancel_signal: Arc<Notify> on LoopbackHandle for cooperative cancellation between ACP sessions and agent loop (#780)
with_cancel_signal() builder method on Agent to inject external cancellation signal (#780)
zeph-acp crate — ACP (Agent Client Protocol) server for IDE embedding (Zed, JetBrains, Neovim) (#763-#766)
--acp CLI flag to launch Zeph as an ACP stdio server (requires acp feature)
acp feature gate in root Cargo.toml; included in full feature set
ZephAcpAgent implementing SDK Agent trait with session lifecycle (new, prompt, cancel, load)
loopback_event_to_update mapping LoopbackEvent variants to ACP SessionUpdate notifications, with empty chunk filtering
serve_stdio() transport using AgentSideConnection over tokio-compat stdio streams
Stream monitor gated behind ZEPH_ACP_LOG_MESSAGES env var for JSON-RPC traffic debugging
Custom mdBook theme with Zeph brand colors (navy+amber palette from TUI)
Z-letter favicon SVG for documentation site
Sidebar logo via inline data URI
Navy as default documentation theme
AcpConfig struct in zeph-core — enabled, agent_name, agent_version with ZEPH_ACP_* env overrides (#771)
[acp] section in config.toml for configuring ACP server identity
--acp-manifest CLI flag — prints ACP agent manifest JSON to stdout for IDE discovery (#772)
serve_connection<W, R> generic transport function extracted from serve_stdio for testability (#770)
ConnSlot pattern in transport — Rc<RefCell<Option<Rc<AgentSideConnection>>>> populated post-construction so new_session can build ACP adapters (#770)
build_acp_context in ZephAcpAgent — wires AcpFileExecutor, AcpShellExecutor, AcpPermissionGate per session (#770)
AcpServerConfig passed through serve_stdio/serve_connection to configure agent identity from config values (#770)
ACP section in --init wizard — prompts for enabled, agent_name, agent_version (#771)
Integration tests for ACP transport using tokio::io::duplex — initialize_handshake, new_session_and_cancel (#773)
ACP permission persistence to ~/.config/zeph/acp-permissions.toml — AllowAlways/RejectAlways decisions survive restarts (#786)
acp.permission_file config and ZEPH_ACP_PERMISSION_FILE env override for custom permission file path (#786)

Fixed

Permission cache key collision on anonymous tools — uses tool_call_id as fallback when title is absent (#779)

Changed

CI: add CLA check for external contributors via contributor-assistant/github-action

0.11.6 - 2026-02-23

Fixed

Auto-create parent directories for sqlite_path on startup (#756)

Added

autosave_assistant and autosave_min_length config fields in MemoryConfig — assistant responses skip embedding when disabled (#748)
SemanticMemory::save_only() — persist message to SQLite without generating a vector embedding (#748)
ResponseCache in zeph-memory — SQLite-backed LLM response cache with blake3 key hashing and TTL expiry (#750)
response_cache_enabled and response_cache_ttl_secs config fields in LlmConfig (#750)
Background cleanup_expired() task for response cache (runs every 10 minutes) (#750)
ZEPH_MEMORY_AUTOSAVE_ASSISTANT, ZEPH_MEMORY_AUTOSAVE_MIN_LENGTH env overrides (#748)
ZEPH_LLM_RESPONSE_CACHE_ENABLED, ZEPH_LLM_RESPONSE_CACHE_TTL_SECS env overrides (#750)
MemorySnapshot, export_snapshot(), import_snapshot() in zeph-memory/src/snapshot.rs (#749)
zeph memory export <path> and zeph memory import <path> CLI subcommands (#749)
SQLite migration 012_response_cache.sql for the response cache table (#750)
Temporal decay scoring in SemanticMemory::recall() — time-based score attenuation with configurable half-life (#745)
MMR (Maximal Marginal Relevance) re-ranking in SemanticMemory::recall() — post-processing for result diversity (#744)
Compact XML skills prompt format (format_skills_prompt_compact) for low-budget contexts (#747)
SkillPromptMode enum (full/compact/auto) with auto-selection based on context budget (#747)
Adaptive chunked context compaction — parallel chunk summarization via join_all (#746)
with_ranking_options() builder for SemanticMemory to configure temporal decay and MMR
message_timestamps() method on SqliteStore for Unix epoch retrieval via strftime
get_vectors() method on EmbeddingStore for raw vector fetch from SQLite vector_points
SQLite-backed SqliteVectorStore as embedded alternative to Qdrant for zero-dependency vector search (#741)
vector_backend config option to select between qdrant and sqlite vector backends
Credential scrubbing in LLM context pipeline via scrub_content() — redacts secrets and paths before LLM calls (#743)
redact_credentials config option (default: true) to toggle context scrubbing
Filter diagnostics mode: kept_lines tracking in FilterResult for all 9 filter strategies
TUI expand (‘e’) highlights kept lines vs filtered-out lines with dim styling and legend
Markdown table rendering in TUI chat panel — Unicode box-drawing borders, bold headers, column auto-width

Changed

Token estimation uses chars/4 heuristic instead of bytes/3 for better accuracy on multi-byte text (#742)

0.11.5 - 2026-02-22

Added

Declarative TOML-based output filter engine with 9 strategy types: strip_noise, truncate, keep_matching, strip_annotated, test_summary, group_by_rule, git_status, git_diff, dedup
Embedded default-filters.toml with 25 pre-configured rules for CLI tools (cargo, git, docker, npm, pip, make, pytest, go, terraform, kubectl, brew, ls, journalctl, find, grep/rg, curl/wget, du/df/ps, jest/mocha/vitest, eslint/ruff/mypy/pylint)
filters_path option in FilterConfig for user-provided filter rules override
ReDoS protection: RegexBuilder with size_limit, 512-char pattern cap, 1 MiB file size limit
Dedup strategy with configurable normalization patterns and HashMap pre-allocation
NormalizeEntry replacement validation (rejects unescaped $ capture group refs)
Sub-agent orchestration system with A2A protocol integration (#709)
Sub-agent definition format with TOML frontmatter parser (#710)
SubAgentManager with spawn/cancel/collect lifecycle (#711)
Tool filtering (AllowList/DenyList/InheritAll) and skill filtering with glob patterns (#712)
Zero-trust permission model with TTL-based grants and automatic revocation (#713)
In-process A2A channels for orchestrator-to-sub-agent communication
PermissionGrants with audit trail via tracing
Real LLM loop wired into SubAgentManager::spawn() with background tokio task execution (#714)
poll_subagents() on Agent<C> for collecting completed sub-agent results (#714)
shutdown_all() on SubAgentManager for graceful teardown (#714)
SubAgentMetrics in MetricsSnapshot with state, turns, elapsed time (#715)
TUI sub-agents panel (zeph-tui widgets/subagents) with color-coded states (#715)
/agent CLI commands: list, spawn, bg, status, cancel, approve, deny (#716)
Typed AgentCommand enum with parse() for type-safe command dispatch replacing string matching in the agent loop
@agent_name mention syntax for quick sub-agent invocation with disambiguation from @-triggered file references

Changed

Migrated all 6 hardcoded filters (cargo_build, test_output, clippy, git, dir_listing, log_dedup) into the declarative TOML engine

Removed

FilterConfig per-filter config structs (TestFilterConfig, GitFilterConfig, ClippyFilterConfig, CargoBuildFilterConfig, DirListingFilterConfig, LogDedupFilterConfig) — filter params now in TOML strategy fields

0.11.4 - 2026-02-21

Added

validate_skill_references(body, skill_dir) in zeph-skills loader: parses Markdown links targeting references/, scripts/, or assets/ subdirs, warns on missing files and symlink traversal attempts (#689)
sanitize_skill_body(body) in zeph-skills prompt: escapes XML structural tags (<skill, </skill>, <instructions, </instructions>, <available_skills, </available_skills>) to prevent prompt injection (#689)
Body sanitization applied automatically to all non-Trusted skills in format_skills_prompt() (#689)
load_skill_resource(skill_dir, relative_path) public function in zeph-skills::resource for on-demand loading of skill resource files with path traversal protection (#687)
Nested metadata: block support in SKILL.md frontmatter: indented key-value pairs under metadata: are parsed as structured metadata (#686)
Field length validation in SKILL.md loader: description capped at 1024 characters, compatibility capped at 500 characters (#686)
Warning log in load_skill_body() when body exceeds 20,000 bytes (~5000 tokens) per spec recommendation (#686)
Empty value normalization for compatibility and license frontmatter fields: bare compatibility: now produces None instead of Some("") (#686)
SkillManager in zeph-skills — install skills from git URLs or local paths, remove, verify blake3 integrity, list with trust metadata
CLI subcommands: zeph skill {install, remove, list, verify, trust, block, unblock} — runs without agent loop
In-session /skill install <url|path> and /skill remove <name> with hot reload
Managed skills directory at ~/.config/zeph/skills/, auto-appended to skills.paths at bootstrap
Hash re-verification on trust promotion — recomputes blake3 before promoting to trusted/verified, rejects on mismatch
URL scheme allowlist and path traversal validation in SkillManager as defense-in-depth
Blocking I/O wrapped in spawn_blocking for async safety in skill management handlers
custom: HashMap<String, Secret> field in ResolvedSecrets for user-defined vault secrets (#682)
list_keys() method on VaultProvider trait with implementations for Age and Env backends (#682)
requires-secrets field in SKILL.md frontmatter for declaring per-skill secret dependencies (#682)
Gate skill activation on required secrets availability in system prompt builder (#682)
Inject active skill’s secrets as scoped env vars into ShellExecutor at execution time (#682)
Custom secrets step in interactive config wizard (--init) (#682)
crates.io publishing metadata (description, readme, homepage, keywords, categories) for all workspace crates (#702)

Changed

requires-secrets SKILL.md frontmatter field renamed to x-requires-secrets to follow JSON Schema vendor extension convention and avoid future spec collisions — breaking change: update skill frontmatter to use x-requires-secrets; the old requires-secrets form is still parsed with a deprecation warning (#688)
allowed-tools SKILL.md field now uses space-separated values per agentskills.io spec (was comma-separated) — breaking change for skills using comma-delimited allowed-tools (#686)
Skill resource files (references, scripts, assets) are no longer eagerly injected into the system prompt on skill activation; only filenames are listed as available resources — breaking change for skills relying on auto-injected reference content (#687)

0.11.3 - 2026-02-20

Added

LoopbackChannel / LoopbackHandle / LoopbackEvent in zeph-core — headless channel for daemon mode, pairs with a handle that exposes input_tx / output_rx for programmatic agent I/O
ProcessorEvent enum in zeph-a2a server — streaming event type replacing synchronous ProcessResult; TaskProcessor::process now accepts an mpsc::Sender<ProcessorEvent> and returns Result<(), A2aError>
--daemon CLI flag (feature daemon+a2a) — bootstraps a full agent + A2A JSON-RPC server under DaemonSupervisor with PID file lifecycle and Ctrl-C graceful shutdown
--connect <URL> CLI flag (feature tui+a2a) — connects the TUI to a remote daemon via A2A SSE, mapping TaskEvent to AgentEvent in real-time
Command palette daemon commands: daemon:connect, daemon:disconnect, daemon:status
Command palette action commands: app:quit (shortcut q), app:help (shortcut ?), session:new, app:theme
Fuzzy-matching for command palette — character-level gap-penalty scoring replaces substring filter; daemon_command_registry() merged into filter_commands
TuiCommand::ToggleTheme variant in command palette (placeholder — theme switching not yet implemented)
--init wizard daemon step — prompts for A2A server host, port, and auth token; writes config.a2a.*
Snapshot tests for Config::default() TOML serialization (zeph-core), git filter diff/status output, cargo-build filter success/error output, and clippy grouped warnings output — using insta for regression detection
Tests for handle_tool_result covering blocked, cancelled, sandbox violation, empty output, exit-code failure, and success paths (zeph-core agent/tool_execution.rs)
Tests for maybe_redact (redaction enabled/disabled) and last_user_query helper in agent/tool_execution.rs
Tests for handle_skill_command dispatch covering unknown subcommand, missing arguments, and no-memory early-exit paths for stats, versions, activate, approve, and reset subcommands (zeph-core agent/learning.rs)
Tests for record_skill_outcomes noop path when no active skills are present
insta added to workspace dev-dependencies and to zeph-core and zeph-tools crate dev-deps
Embeddable trait and EmbeddingRegistry<T> in zeph-memory — generic Qdrant sync/search extracted from duplicated code in QdrantSkillMatcher and McpToolRegistry (~350 lines removed)
MCP server command allowlist validation — only permitted commands (npx, uvx, node, python3, python, docker, deno, bun) can spawn child processes; configurable via mcp.allowed_commands
MCP env var blocklist — blocks 21 dangerous variables (LD_PRELOAD, DYLD_, NODE_OPTIONS, PYTHONPATH, JAVA_TOOL_OPTIONS, etc.) and BASH_FUNC_ prefix from MCP server processes
Path separator rejection in MCP command validation to prevent symlink-based bypasses

Changed

MessagePart::Image variant now holds Box<ImageData> instead of inline fields, improving semantic grouping of image data
Agent<C, T> simplified to Agent<C> — ToolExecutor generic replaced with Box<dyn ErasedToolExecutor>, reducing monomorphization
Shell command detection rewritten from substring matching to tokenizer-based pipeline with escape normalization, eliminating bypass vectors via backslash insertion, hex/octal escapes, quote splitting, and pipe chains
Shell sandbox path validation now uses std::path::absolute() as fallback when canonicalize() fails on non-existent paths
Blocked command matching extracts basename from absolute paths (/usr/bin/sudo now correctly blocked)
Transparent wrapper commands (env, command, exec, nice, nohup, time, xargs) are skipped to detect the actual command
Default confirm patterns now include $( and backtick subshell expressions
Enable SQLite WAL mode with SYNCHRONOUS=NORMAL for 2-5x write throughput (#639)
Replace O(n*iterations) token scan with cached_prompt_tokens in budget checks (#640)
Defer maybe_redact to stream completion boundary instead of per-chunk (#641)
Replace format_tool_output string allocation with Write-into-buffer (#642)
Change ToolCall.params from HashMap to serde_json::Map, eliminating clone (#643)
Pre-join static system prompt sections into LazyLock (#644)
Replace doom-loop string history with content hash comparison (#645)
Return &’static str from detect_image_mime with case-insensitive matching (#646)
Replace block_on in history persist with fire-and-forget async spawn (#647)
Change LlmProvider::name() from &'static str to &str, eliminating Box::leak memory leak in CompatibleProvider (#633)
Extract rate-limit retry helper send_with_retry() in zeph-llm, deduplicating 3 retry loops (#634)
Extract sse_to_chat_stream() helpers shared by Claude and OpenAI providers (#635)
Replace double AnyProvider::clone() in embed_fn() with single Arc clone (#636)
Add with_client() builder to ClaudeProvider and OpenAiProvider for shared reqwest::Client (#637)
Cache JsonSchema per TypeId in chat_typed to avoid per-call schema generation (#638)
Scrape executor performs post-DNS resolution validation against private/loopback IPs with pinned address client to prevent SSRF via DNS rebinding
Private host detection expanded to block *.localhost, *.internal, *.local domains
A2A error responses sanitized: serde details and method names no longer exposed to clients
Rate limiter rejects new clients with 429 when entry map is at capacity after stale eviction
Secret redaction regex-based pattern matching replaces whitespace tokenizer, detecting secrets in URLs, JSON, and quoted strings
Added hf_, npm_, dckr_pat_ to secret redaction prefixes
A2A client stream errors truncate upstream body to 256 bytes
Add default_client() HTTP helper with standard timeouts and user-agent in zeph-core and zeph-llm (#666)
Replace 5 production Client::new() calls with default_client() for consistent HTTP config (#667)
Decompose agent/mod.rs (2602→459 lines) into tool_execution, message_queue, builder, commands, and utils modules (#648, #649, #650)
Replace anyhow in zeph-core::config with typed ConfigError enum (Io, Parse, Validation, Vault)
Replace anyhow in zeph-tui with typed TuiError enum (Io, Channel); simplify handle_event() return to ()
Sort [workspace.dependencies] alphabetically in root Cargo.toml

Fixed

False positive: “sudoku” no longer matched by “sudo” blocked pattern (word-boundary matching)
PID file creation uses OpenOptions::create_new(true) (O_CREAT|O_EXCL) to prevent TOCTOU symlink attacks

0.11.2 - 2026-02-19

Added

base_url and language fields in [llm.stt] config for OpenAI-compatible local whisper servers (e.g. whisper.cpp)
ZEPH_STT_BASE_URL and ZEPH_STT_LANGUAGE environment variable overrides
Whisper API provider now passes language parameter for accurate non-English transcription
Documentation for whisper.cpp server setup with Metal acceleration on macOS
Per-sub-provider base_url and embedding_model overrides in orchestrator config
Full orchestrator example with cloud + local + STT in default.toml
All previously undocumented config keys in default.toml (agent.auto_update_check, llm.stt, llm.vision_model, skills.disambiguation_threshold, tools.filters.*, tools.permissions, a2a.auth_token, mcp.servers.env)

Fixed

Outdated config keys in default.toml: removed nonexistent repo_id, renamed provider_type to type, corrected candle defaults, fixed observability exporter default
Add wait(true) to Qdrant upsert and delete operations for read-after-write consistency, fixing flaky ingested_chunks_have_correct_payload integration test (#567)
Vault age backend now falls back to default directory for key/path when --vault-key/--vault-path are not provided, matching zeph vault init behavior (#613)

Changed

Whisper STT provider no longer requires OpenAI API key when base_url points to a local server
Orchestrator sub-providers now resolve base_url and embedding_model via fallback chain: per-provider, parent section, global default

0.11.1 - 2026-02-19

Added

Persistent CLI input history with rustyline: arrow key navigation, prefix search, line editing, SQLite-backed persistence across restarts (#604)
Clickable markdown links in TUI via OSC 8 hyperlinks — [text](url) renders as terminal-clickable link with URL sanitization and scheme allowlist (#580)
@-triggered fuzzy file picker in TUI input — type @ to search project files by name/path/extension with real-time filtering (#600)
Command palette in TUI with read-only agent management commands (#599)
Orchestrator provider option in zeph init wizard for multi-model routing setup (#597)
zeph vault CLI subcommands: init (generate age keypair), set (store secret), get (retrieve secret), list (show keys), rm (remove secret) (#598)
Atomic file writes for vault operations with temp+rename strategy (#598)
Default vault directory resolution via XDG_CONFIG_HOME / APPDATA / HOME (#598)
Auto-update check via GitHub Releases API with configurable scheduler task (#588)
auto_update_check config field (default: true) with ZEPH_AUTO_UPDATE_CHECK env override
TaskKind::UpdateCheck variant and UpdateCheckHandler in zeph-scheduler
One-shot update check at startup when scheduler feature is disabled
--init wizard step for auto-update check configuration

Fixed

Restore --vault, --vault-key, --vault-path CLI flags lost during clap migration (#587)

Changed

Refactor AppBuilder::from_env() to AppBuilder::new() with explicit CLI overrides
Eliminate redundant manual std::env::args() parsing in favor of clap
Add ZEPH_VAULT_KEY and ZEPH_VAULT_PATH environment variable support
Init wizard reordered: vault backend selection is now step 1 before LLM provider (#598)
API key and channel token prompts skipped when age vault backend is selected (#598)

0.11.0 - 2026-02-19

Added

Vision (image input) support across Claude, OpenAI, and Ollama providers (#490)
MessagePart::Image content type with base64 serialization
LlmProvider::supports_vision() trait method for runtime capability detection
Claude structured content with AnthropicContentBlock::Image variant
OpenAI array content format with image_url data-URI encoding
Ollama with_images() support with optional vision_model config for dedicated model routing
/image <path> command in CLI and TUI channels
Telegram photo message handling with pre-download size guard
vision_model field in [llm.ollama] config section and --init wizard update
20 MB max image size limit and path traversal protection
Interactive configuration wizard via zeph init subcommand with 5-step setup (LLM provider, memory, channels, secrets backend, config generation)
clap-based CLI argument parsing with --help, --version support
Serialize derive on Config and all nested types for TOML generation
dialoguer dependency for interactive terminal prompts
Structured LLM output via chat_typed<T>() on LlmProvider trait with JSON schema enforcement (#456)
OpenAI/Compatible native response_format: json_schema structured output (#457)
Claude structured output via forced tool use pattern (#458)
Extractor<T> utility for typed data extraction from LLM responses (#459)
TUI test automation infrastructure: EventSource trait abstraction, insta widget snapshot tests, TestBackend integration tests, proptest layout verification, expectrl E2E terminal tests (#542)
CI snapshot regression pipeline with cargo insta test --check (#547)
Pipeline API with composable, type-safe Step trait, Pipeline builder, ParallelStep combinator, and built-in steps (LlmStep, RetrievalStep, ExtractStep, MapStep) (#466, #467, #468)
Structured intent classification for skill disambiguation: when top-2 skill scores are within disambiguation_threshold (default 0.05), agent calls LLM via chat_typed::<IntentClassification>() to select the best-matching skill (#550)
ScoredMatch struct exposing both skill index and cosine similarity score from matcher backends
IntentClassification type (skill_name, confidence, params) with JsonSchema derive for schema-enforced LLM responses
disambiguation_threshold in [skills] config section (default: 0.05) with with_disambiguation_threshold() builder on Agent
DocumentLoader trait with text/markdown file loader in zeph-memory (#469)
Text splitter with configurable chunk size, overlap, and sentence-aware splitting (#470)
PDF document loader, feature-gated behind pdf (#471)
Document ingestion pipeline: load, split, embed, store via Qdrant (#472)
File size guard (50 MiB default) and path canonicalization for document loaders
Audio input support: Attachment/AttachmentKind types, SpeechToText trait, OpenAI Whisper backend behind stt feature flag (#520, #521, #522)
Telegram voice and audio message handling with automatic file download (#524)
STT bootstrap wiring: WhisperProvider created from [llm.stt] config behind stt feature (#529)
Slack audio file upload handling with host validation and size limits (#525)
Local Whisper backend via candle for offline STT with symphonia audio decode and rubato resampling (#523)
Shell-based installation script (install/install.sh) with SHA256 verification, platform detection, and --version flag
Shellcheck lint job in CI pipeline
Per-job permission scoping in release workflow (least privilege)
TUI word-jump and line-jump cursor navigation (#557)
TUI keybinding help popup on ? in normal mode (#533)
TUI clickable hyperlinks via OSC 8 escape sequences (#530)
TUI edit-last-queued for recalling queued messages (#535)
VectorStore trait abstraction in zeph-memory (#554)
Operation-level cancellation for LLM requests and tool executions (#538)

Changed

Consolidate Docker files into docker/ directory (#539)
Typed deserialization for tool call params (#540)
CI: replace oraclelinux base image with debian bookworm-slim (#532)

Fixed

Strip schema metadata and fix doom loop detection for native tool calls (#534)
TUI freezes during fast LLM streaming and parallel tool execution: biased event loop with input priority and agent event batching (#500)
Redundant syntax highlighting and markdown parsing on every TUI frame: per-message render cache with content-hash keying (#501)

0.10.0 - 2026-02-18

Fixed

TUI status spinner not cleared after model warmup completes (#517)
Duplicate tool output rendering for shell-streamed tools in TUI (#516)
send_tool_output not forwarded through AppChannel/AnyChannel enum dispatch (#508)
Tool output and diff not sent atomically in native tool_use path (#498)
Parallel tool_use calls: results processed sequentially for correct ordering (#486)
Native tool_result format not recognized by TUI history loader (#484)
Inline filter stats threshold based on char savings instead of line count (#483)
Token metrics not propagated in native tool_use path (#482)
Filter metrics not appearing in TUI Resources panel when using native tool_use providers (#480)
Output filter matchers not matching compound shell commands like cd /path && cargo test 2>&1 | tail (#481)
Duplicate ToolEvent::Completed emission in shell executor before filtering was applied (#480)
TUI feature gate compilation errors (#435)

Added

GitHub CLI skill with token-saving patterns (#507)
Parallel execution of native tool_use calls with configurable concurrency (#486)
TUI compact/detailed tool output toggle with ‘e’ key binding (#479)
TUI [tui] config section with show_source_labels option to hide [user]/[zeph]/[tool] prefixes (#505)
Syntax-highlighted diff view for write/edit tool output in TUI (#455)
- Diff rendering with green/red backgrounds for added/removed lines
- Word-level change highlighting within modified lines
- Syntax highlighting via tree-sitter
- Compact/expanded toggle with existing ‘e’ key binding
- New dependency: similar 2.7.0
Per-tool inline filter stats in CLI chat: [shell] cargo test (342 lines -> 28 lines, 91.8% filtered) (#449)
Filter metrics in TUI Resources panel: confidence distribution, command hit rate, token savings (#448)
Periodic 250ms tick in TUI event loop for real-time metrics refresh (#447)
Output filter architecture improvements (M26.1): CommandMatcher enum, FilterConfidence, FilterPipeline, SecurityPatterns, per-filter TOML config (#452)
Token savings tracking and metrics for output filtering (#445)
Smart tool output filtering: command-aware filters that compress tool output before context insertion
OutputFilter trait and OutputFilterRegistry with first-match-wins dispatch
sanitize_output() ANSI escape and progress bar stripping (runs on all tool output)
Test output filter: cargo test/nextest failures-only mode (94-99% token savings on green suites)
Git output filter: compact status/diff/log/push compression (80-99% savings)
Clippy output filter: group warnings by lint rule (70-90% savings)
Directory listing filter: hide noise directories (target, node_modules, .git)
Log deduplication filter: normalize timestamps/UUIDs, count repeated patterns (70-85% savings)
[tools.filters] config section with enabled toggle
Skill trust levels: 4-tier model (Trusted, Verified, Quarantined, Blocked) with per-turn enforcement
TrustGateExecutor wrapping tool execution with trust-level permission checks
AnomalyDetector with sliding-window threshold counters for quarantined skill monitoring
blake3 content hashing for skill integrity verification on load and hot-reload
Quarantine prompt wrapping for structural isolation of untrusted skill bodies
Self-learning gate: skills with trust < Verified skip auto-improvement
skill_trust SQLite table with migration 009
CLI commands: /skill trust, /skill block, /skill unblock
[skills.trust] config section (default_level, local_level, hash_mismatch_level)
ProviderKind enum for type-safe provider selection in config
RuntimeConfig struct grouping agent runtime fields
AnyProvider::embed_fn() shared embedding closure helper
Config::validate() with bounds checking for critical config values
sanitize_paths() for stripping absolute paths from error messages
10-second timeout wrapper for embedding API calls
full feature flag enabling all optional features

Changed

Remove P generic from Agent, SemanticMemory, CodeRetriever — provider resolved at construction (#423)
Architecture improvements, performance optimizations, security hardening (M24) (#417)
Extract bootstrap logic from main.rs into zeph-core::bootstrap::AppBuilder (#393): main.rs reduced from 2313 to 978 lines
SecurityConfig and TimeoutConfig gain Clone + Copy
AnyChannel moved from main.rs to zeph-channels crate
Remove 8 lightweight feature gates, make always-on: openai, compatible, orchestrator, router, self-learning, qdrant, vault-age, mcp (#438)
Default features reduced to minimal set (empty after M26)
Skill matcher concurrency reduced from 50 to 20
String::with_capacity in context building loops
CI updated to use --features full

Breaking

LlmConfig.provider changed from String to ProviderKind enum
Default features reduced – users needing a2a, candle, mcp, openai, orchestrator, router, tui must enable explicitly or use --features full
Telegram channel rejects empty allowed_users at startup
Config with extreme values now rejected by Config::validate()

Deprecated

ToolExecutor::execute() string-based dispatch (use execute_tool_call() instead)

Fixed

Closed #410 (clap dropped atty), #411 (rmcp updated quinn-udp), #413 (A2A body limit already present)

0.9.9 - 2026-02-17

Added

zeph-gateway crate: axum HTTP gateway with POST /webhook ingestion, bearer auth (blake3 + ct_eq), per-IP rate limiting, GET /health endpoint, feature-gated (gateway) (#379)
zeph-core::daemon module: component supervisor with health monitoring, PID file management, graceful shutdown, feature-gated (daemon) (#380)
zeph-scheduler crate: cron-based periodic task scheduler with SQLite persistence, built-in tasks (memory_cleanup, skill_refresh, health_check), TaskHandler trait, feature-gated (scheduler) (#381)
New config sections: [gateway], [daemon], [scheduler] in config/default.toml (#367)
New optional feature flags: gateway, daemon, scheduler
Hybrid memory search: FTS5 keyword search combined with Qdrant vector similarity (#372, #373, #374)
SQLite FTS5 virtual table with auto-sync triggers for full-text keyword search
Configurable vector_weight/keyword_weight in [memory.semantic] for hybrid ranking
FTS5-only fallback when Qdrant is unavailable (replaces empty results)
AutonomyLevel enum (ReadOnly/Supervised/Full) for controlling tool access (#370)
autonomy_level config key in [security] section (default: supervised)
Read-only mode restricts agent to file_read, file_glob, file_grep, web_scrape
Full mode allows all tools without confirmation prompts
Documented [telegram].allowed_users allowlist in default config (#371)
OpenTelemetry OTLP trace export with tracing-opentelemetry layer, feature-gated behind otel (#377)
[observability] config section with exporter selection and OTLP endpoint
Instrumentation spans for LLM calls (llm_call) and tool executions (tool_exec)
CostTracker with per-model token pricing and configurable daily budget limits (#378)
[cost] config section with enabled and max_daily_cents options
cost_spent_cents field in MetricsSnapshot for TUI cost display
Discord channel adapter with Gateway v10 WebSocket, slash commands, edit-in-place streaming (#382)
Slack channel adapter with Events API webhook, HMAC-SHA256 signature verification, streaming (#383)
Feature flags: discord and slack (opt-in) in zeph-channels and root crate
DiscordConfig and SlackConfig with token redaction in Debug impls
Slack timestamp replay protection (reject requests >5min old)
Configurable Slack webhook bind address (webhook_host)

0.9.8 - 2026-02-16

Added

Graceful shutdown on Ctrl-C with farewell message and MCP server cleanup (#355)
Cancel-aware LLM streaming via tokio::select on shutdown signal (#358)
McpManager::shutdown_all_shared() with per-client 5s timeout (#356)
Indexer progress logging with file count and per-file stats
Skip code index for providers with native tool_use (#357)
OpenAI prompt caching: parse and report cached token usage (#348)
Syntax highlighting for TUI code blocks via tree-sitter-highlight (#345, #346, #347)
Anthropic prompt caching with structured system content blocks (#337)
Configurable summary provider for tool output summarization via local model (#338)
Aggressive inline pruning of stale tool outputs in tool loops (#339)
Cache usage metrics (cache_read_tokens, cache_creation_tokens) in MetricsSnapshot (#340)
Native tool_use support for Claude provider (Anthropic API format) (#256)
Native function calling support for OpenAI provider (#257)
ToolDefinition, ChatResponse, ToolUseRequest types in zeph-llm (#254)
ToolUse/ToolResult variants in MessagePart for structured tool flow (#255)
Dual-mode agent loop: native structured path alongside legacy text extraction (#258)
Dual system prompt: native tool_use instructions for capable providers, fenced-block instructions for legacy providers

Changed

Consolidate all SQLite migrations into root migrations/ directory (#354)

0.9.7 - 2026-02-15

Performance

Token estimation uses len() / 3 for improved accuracy (#328)
Explicit tokio feature selection replacing broad feature gates (#326)
Concurrent skill embedding for faster startup (#327)
Pre-allocate strings in hot paths to reduce allocations (#329)
Parallel context building via try_join! (#331)
Criterion benchmark suite for core operations (#330)

Security

Path traversal protection in shell sandbox (#325)
Canonical path validation in skill loader (#322)
SSRF protection for MCP server connections (#323)
Remove MySQL/RSA vulnerable transitive dependencies (#324)
Secret redaction patterns for Google and GitLab tokens (#320)
TTL-based eviction for rate limiter entries (#321)

Changed

QdrantOps shared helper trait for Qdrant collection operations (#304)
delegate_provider! macro replacing boilerplate provider delegation (#303)
Remove TuiError in favor of unified error handling (#302)
Generic recv_optional replacing per-channel optional receive logic (#301)

Dependencies

Upgraded rmcp to 0.15, toml to 1.0, uuid to 1.21 (#296)
Cleaned up deny.toml advisory and license configuration (#312)

0.9.6 - 2026-02-15

Changed

BREAKING: ToolDef schema field replaced Vec<ParamDef> with schemars::Schema auto-derived from Rust structs via #[derive(JsonSchema)]
BREAKING: ParamDef and ParamType removed from zeph-tools public API
BREAKING: ToolRegistry::new() replaced with ToolRegistry::from_definitions(); registry no longer hardcodes built-in tools — each executor owns its definitions via tool_definitions()
BREAKING: Channel trait now requires ChannelError enum with typed error handling replacing anyhow::Result
BREAKING: Agent::new() signature changed to accept new field grouping; agent struct refactored into 5 inner structs for improved organization
BREAKING: AgentError enum introduced with 7 typed variants replacing scattered anyhow::Error handling
ToolDef now includes InvocationHint (FencedBlock/ToolCall) so LLM prompt shows exact invocation format per tool
web_scrape tool definition includes all parameters (url, select, extract, limit) auto-derived from ScrapeInstruction
ShellExecutor and WebScrapeExecutor now implement tool_definitions() for single source of truth
Replaced tokio “full” feature with granular features in zeph-core (async-io, macros, rt, sync, time)
Removed anyhow dependency from zeph-channels
Message persistence now uses MessageKind enum instead of is_summary bool for qdrant storage

Added

ChannelError enum with typed variants for channel operation failures
AgentError enum with 7 typed variants for agent operation failures (streaming, persistence, configuration, etc.)
Workspace-level qdrant feature flag for optional semantic memory support
Type aliases consolidated into zeph-llm: EmbedFuture and EmbedFn with typed LlmError
streaming.rs and persistence.rs modules extracted from agent module for improved code organization
MessageKind enum for distinguishing summary and regular messages in storage

Removed

anyhow::Result from Channel trait (replaced with ChannelError)
Direct anyhow::Error usage in agent module (replaced with AgentError)

0.9.5 - 2026-02-14

Added

Pattern-based permission policy with glob matching per tool (allow/ask/deny), first-match-wins evaluation (#248)
Legacy blocked_commands and confirm_patterns auto-migrated to permission rules (#249)
Denied tools excluded from LLM system prompt (#250)
Tool output overflow: full output saved to file when truncated, path notice appended for LLM access (#251)
Stale tool output overflow files cleaned up on startup (>24h TTL) (#252)
ToolRegistry with typed ToolDef definitions for 7 built-in tools (bash, read, edit, write, glob, grep, web_scrape) (#239)
FileExecutor for sandboxed file operations: read, write, edit, glob, grep (#242)
ToolCall struct and execute_tool_call() on ToolExecutor trait for structured tool invocation (#241)
CompositeExecutor routes structured tool calls to correct sub-executor by tool_id (#243)
Tool catalog section in system prompt via ToolRegistry::format_for_prompt() (#244)
Configurable max_tool_iterations (default 10, previously hardcoded 3) via TOML and ZEPH_AGENT_MAX_TOOL_ITERATIONS env var (#245)
Doom-loop detection: breaks agent loop on 3 consecutive identical tool outputs
Context budget check at 80% threshold stops iteration before context overflow
IndexWatcher for incremental code index updates on file changes via notify file watcher (#233)
watch config field in [index] section (default true) to enable/disable file watching
Repo map cache with configurable TTL (repo_map_ttl_secs, default 300s) to avoid per-message filesystem traversal (#231)
Cross-session memory score threshold (cross_session_score_threshold, default 0.35) to filter low-relevance results (#232)
embed_missing() called on startup for embedding backfill when Qdrant available (#261)
AgentTaskProcessor replaces EchoTaskProcessor for real A2A inference (#262)

Changed

ShellExecutor uses PermissionPolicy for all permission checks instead of legacy find_blocked_command/find_confirm_command
Replaced unmaintained dirs-next 2.0 with dirs 6.x
Batch messages retrieval in semantic recall: replaced N+1 query pattern with messages_by_ids() for improved performance

Fixed

Persist MessagePart data to SQLite via remember_with_parts() — pruning state now survives session restarts (#229)
Clear tool output body from memory after Tier 1 pruning to reclaim heap (#230)
TUI uptime display now updates from agent start time instead of always showing 0s (#259)
FileExecutor handle_write now uses canonical path for security (TOCTOU prevention) (#260)
resolve_via_ancestors trailing slash bug on macOS
vault.backend from config now used as default backend; CLI --vault flag overrides config (#263)
A2A error responses sanitized to prevent provider URL leakage

0.9.4 - 2026-02-14

Added

Bounded FIFO message queue (max 10) in agent loop: users can submit messages during inference, queued messages are delivered sequentially when response cycle completes
Channel trait extended with try_recv() (non-blocking poll) and send_queue_count() with default no-op impls
Consecutive user messages within 500ms merge window joined by newline
TUI queue badge [+N queued] in input area, Ctrl+K to clear queue, /clear-queue command
TelegramChannel try_recv() implementation via mpsc
Deferred model warmup in TUI mode: interface renders immediately, Ollama warmup runs in background with status indicator (“warming up model…” → “model ready”), agent loop awaits completion via watch::channel
context_tokens metric in TUI Resources panel showing current prompt estimate (vs cumulative session totals)
unsummarized_message_count in SemanticMemory for precise summarization trigger
count_messages_after in SqliteStore for counting messages beyond a given ID
TUI status indicators for context compaction (“compacting context…”) and summarization (“summarizing…”)
Debug tracing in should_compact() for context budget diagnostics (token estimate, threshold, decision)
Config hot-reload: watch config file for changes via notify_debouncer_mini and apply runtime-safe fields (security, timeouts, memory limits, context budget, compaction, max_active_skills) without restart
ConfigWatcher in zeph-core with 500ms debounced filesystem monitoring
with_config_reload() builder method on Agent for wiring config file watcher
tool_name field in ToolOutput for identifying tool type (bash, mcp, web-scrape) in persisted messages and TUI display
Real-time status events for provider retries and orchestrator fallbacks surfaced as [system] messages across all channels (CLI stderr, TUI chat panel, Telegram)
StatusTx type alias in zeph-llm for emitting status events from providers
Status variant in TUI AgentEvent rendered as System-role messages (DarkGray)
set_status_tx() on AnyProvider, SubProvider, and ModelOrchestrator for propagating status sender through the provider hierarchy
Background forwarding tasks for immediate status delivery (bypasses agent loop for zero-latency display)
TUI: toggle side panels with d key in Normal mode
TUI: input history navigation (Up/Down in Insert mode)
TUI: message separators and accent bars for visual structure
TUI: tool output restored as expandable messages from conversation history
TUI: collapsed tool output preview (3 lines) when restoring history
LlmProvider::context_window() trait method for model context window size detection
Ollama context window auto-detection via /api/show model info endpoint
Context window sizes for Claude (200K) and OpenAI (128K/16K/1M) provider models
auto_budget config field with ZEPH_MEMORY_AUTO_BUDGET env override for automatic context budget from model metadata
inject_summaries() in Agent: injects SQLite conversation summaries into context (newest-first, budget-aware, with deduplication)
Wire zeph-index Code RAG pipeline into agent loop (feature-gated index): CodeRetriever integration, inject_code_rag() in prepare_context(), repo map in system prompt, background project indexing on startup
IndexConfig with [index] TOML section and ZEPH_INDEX_* env overrides (enabled, max_chunks, score_threshold, budget_ratio, repo_map_tokens)
Two-tier context pruning strategy for granular token reclamation before full LLM compaction
- Tier 1: selective ToolOutput part pruning with compacted_at timestamp on pruned parts
- Tier 2: LLM-based compaction fallback when tier 1 is insufficient
- prune_protect_tokens config field for token-based protection zone (shields recent context from pruning)
- tool_output_prunes metric tracking tier 1 pruning operations
- compacted_at field on MessagePart::ToolOutput for pruning audit trail
MessagePart enum (Text, ToolOutput, Recall, CodeContext, Summary) for typed message content with independent lifecycle
Message::from_parts() constructor with to_llm_content() flattening for LLM provider consumption
Message::from_legacy() backward-compatible constructor for simple text messages
SQLite migration 006: parts column for structured message storage (JSON-serialized)
save_message_with_parts() in SqliteStore for persisting typed message parts
inject_semantic_recall, inject_code_context, inject_summaries now create typed MessagePart variants

Changed

index feature enabled by default (Code RAG pipeline active out of the box)
Agent error handler shows specific error context instead of generic message
TUI inline code rendered as blue with dark background glow instead of bright yellow
TUI header uses deep blue background (Rgb(20, 40, 80)) for improved contrast
System prompt includes explicit bash block example and bans invented formats (tool_code, tool_call) for small model compatibility
TUI Resources panel: replaced separate Prompt/Completion/Total with Context (current) and Session (cumulative) metrics
Summarization trigger uses unsummarized message count instead of total, avoiding repeated no-op checks
Empty AgentEvent::Status clears TUI spinner instead of showing blank throbber
Status label cleared after summarization and compaction complete
Default summarization_threshold: 100 → 50 messages
Default compaction_threshold: 0.75 → 0.80
Default compaction_preserve_tail: 4 → 6 messages
Default semantic.enabled: false → true
Default summarize_output: false → true
Default context_budget_tokens: 0 (auto-detect from model)

Fixed

TUI chat line wrapping no longer eats 2 characters on word wrap (accent prefix width accounted for)
TUI activity indicator moved to dedicated layout row (no longer overlaps content)
Memory history loading now retrieves most recent messages instead of oldest
Persisted tool output format includes tool name ([tool output: bash]) for proper display on restore
summarize_output serde deserialization used #[serde(default)] yielding false instead of config default true

0.9.3 - 2026-02-12

Added

New zeph-index crate: AST-based code indexing and semantic retrieval pipeline
- Language detection and grammar registry with feature-gated tree-sitter grammars (Rust, Python, JavaScript, TypeScript, Go, Bash, TOML, JSON, Markdown)
- AST-based chunker with cAST-inspired greedy sibling merge and recursive decomposition (target 600 non-ws chars per chunk)
- Contextualized embedding text generation for improved retrieval quality
- Dual-write storage layer (Qdrant vector search + SQLite metadata) with INT8 scalar quantization
- Incremental indexer with .gitignore-aware file walking and content-hash change detection
- Hybrid retriever with query classification (Semantic/Grep/Hybrid) and budget-aware result packing
- Lightweight repo map generation (tree-sitter signature extraction, budget-constrained output)
code_context slot in BudgetAllocation for code RAG injection into agent context
inject_code_context() method in Agent for transient code chunk injection before semantic recall

0.9.2 - 2026-02-12

Added

Runtime context compaction for long sessions: automatic LLM-based summarization of middle messages when context usage exceeds configurable threshold (default 75%)
with_context_budget() builder method on Agent for wiring context budget and compaction settings
Config fields: compaction_threshold (f32), compaction_preserve_tail (usize) with env var overrides
context_compactions counter in MetricsSnapshot for observability
Context budget integration: ContextBudget::allocate() wired into agent loop via prepare_context() orchestrator
Semantic recall injection: SemanticMemory::recall() results injected as transient system messages with token budget control
Message history trimming: oldest non-system messages evicted when history exceeds budget allocation
Environment context injection: working directory, OS, git branch, and model name in system prompt via <environment> block
Extended BASE_PROMPT with structured Tool Use, Guidelines, and Security sections
Tool output truncation: head+tail split at 30K chars with UTF-8 safe boundaries
Smart tool output summarization: optional LLM-based summarization for outputs exceeding 30K chars, with fallback to truncation on failure (disabled by default via summarize_output config)
Progressive skill loading: matched skills get full body, remaining shown as description-only catalog via <other_skills>
ZEPH.md project config discovery: walk up directory tree, inject into system prompt as <project_context>

0.9.1 - 2026-02-12

Added

Mouse scroll support for TUI chat widget (scroll up/down via mouse wheel)
Splash screen with colored block-letter “ZEPH” banner on TUI startup
Conversation history loading into chat on TUI startup
Model thinking block rendering (<think> tags from Ollama DeepSeek/Qwen models) in distinct darker style
Markdown rendering for all chat messages via pulldown-cmark: bold, italic, strikethrough, headings, code blocks, inline code, lists, blockquotes, horizontal rules
Scrollbar track with proportional thumb indicator in chat widget

Fixed

Chat messages no longer overflow below the viewport when lines wrap
Scroll no longer sticks at top after over-scrolling past content boundary

0.9.0 - 2026-02-12

Added

ratatui-based TUI dashboard with real-time agent metrics (feature-gated tui, opt-in)
TuiChannel as new Channel implementation with bottom-up chat feed, input line, and status bar
MetricsSnapshot and MetricsCollector in zeph-core via tokio::sync::watch for live metrics transport
with_metrics() builder on Agent with instrumentation at 8 collection points: api_calls, latency, prompt/completion tokens, active skills, sqlite message count, qdrant status, summarization count
Side panel widgets (skills, memory, resources) with live data from agent loop
Confirmation modal dialog for destructive command approval in TUI (Y/Enter confirms, N/Escape cancels)
Scroll indicators (▲/▼) in chat widget when content overflows viewport
Responsive layout: side panels hidden on terminals narrower than 80 columns
Multiline input via Shift+Enter in TUI insert mode
Bottom-up chat layout with proper newline handling and per-message visual separation
Panic hook for terminal state restoration on any panic during TUI execution
Unicode-safe char-index cursor tracking for multi-byte input in TUI
--config <path> CLI argument and ZEPH_CONFIG env var to override default config path
OpenAI-compatible LLM provider with chat, streaming, and embeddings support
Feature-gated openai feature (enabled by default)
Support for OpenAI, Together AI, Groq, Fireworks, and any OpenAI-compatible API via configurable base_url
reasoning_effort parameter for OpenAI reasoning models (low/medium/high)
/mcp add <id> <command> [args...] for dynamic stdio MCP server connection at runtime
/mcp add <id> <url> for HTTP transport (remote MCP servers in Docker/cloud)
/mcp list command to show connected servers and tool counts
/mcp remove <id> command to disconnect MCP servers
McpTransport enum: Stdio (child process) and Http (Streamable HTTP) transports
HTTP MCP server config via url field in [[mcp.servers]]
mcp.allowed_commands config for command allowlist (security hardening)
mcp.max_dynamic_servers config to limit concurrent dynamic servers (default 10)
Qdrant registry sync after dynamic MCP add/remove for semantic tool matching

Changed

Docker images now include Node.js, npm, and Python 3 for MCP server runtime
ServerEntry uses McpTransport enum instead of flat command/args/env fields

Fixed

Effective embedding model resolution: Qdrant subsystems now use the correct provider-specific embedding model name when provider is openai or orchestrator routes to OpenAI
Skill watcher no longer loops in Docker containers (overlayfs phantom events)

0.8.2 - 2026-02-10

Changed

Enable all non-platform features by default: orchestrator, self-learning, mcp, vault-age, candle
Features metal and cuda remain opt-in (platform-specific GPU accelerators)
CI clippy uses default features instead of explicit feature list
Docker images now include skill runtime dependencies: curl, wget, git, jq, file, findutils, procps-ng

0.8.1 - 2026-02-10

Added

Shell sandbox: configurable allowed_paths directory allowlist and allow_network toggle blocking curl/wget/nc in ShellExecutor (Issue #91)
Sandbox validation before every shell command execution with path canonicalization
tools.shell.allowed_paths config (empty = working directory only) with ZEPH_TOOLS_SHELL_ALLOWED_PATHS env override
tools.shell.allow_network config (default: true) with ZEPH_TOOLS_SHELL_ALLOW_NETWORK env override
Interactive confirmation for destructive commands (rm, git push -f, DROP TABLE, etc.) with CLI y/N prompt and Telegram inline keyboard (Issue #92)
tools.shell.confirm_patterns config with default destructive command patterns
Channel::confirm() trait method with default auto-confirm for headless/test scenarios
ToolError::ConfirmationRequired and ToolError::SandboxViolation variants
execute_confirmed() method on ToolExecutor for confirmation bypass after user approval
A2A TLS enforcement: reject HTTP endpoints when a2a.require_tls = true (Issue #92)
A2A SSRF protection: block private IP ranges (RFC 1918, loopback, link-local) with DNS resolution (Issue #92)
Configurable A2A server payload size limit via a2a.max_body_size (default: 1 MiB)
Structured JSON audit logging for all tool executions with stdout or file destination (Issue #93)
AuditLogger with AuditEntry (timestamp, tool, command, result, duration) and AuditResult enum
[tools.audit] config section with ZEPH_TOOLS_AUDIT_ENABLED and ZEPH_TOOLS_AUDIT_DESTINATION env overrides
Secret redaction in LLM responses: detect API keys, tokens, passwords, private keys and replace with [REDACTED] (Issue #93)
Whitespace-preserving redact_secrets() scanner with zero-allocation fast path via Cow<str>
[security] config section with redact_secrets toggle (default: true)
Configurable timeout policies for LLM, embedding, and A2A operations (Issue #93)
[timeouts] config section with llm_seconds, embedding_seconds, a2a_seconds
LLM calls wrapped with tokio::time::timeout in agent loop

0.8.0 - 2026-02-10

Added

VaultProvider trait with pluggable secret backends, Secret newtype with redacted debug output, EnvVaultProvider for environment variable secrets (Issue #70)
AgeVaultProvider: age-encrypted JSON vault backend with x25519 identity key decryption (Issue #70)
Config::resolve_secrets(): async secret resolution through vault provider for API keys and tokens
CLI vault args: --vault <backend>, --vault-key <path>, --vault-path <path>
vault-age feature flag on zeph-core and root binary
[vault] config section with backend field (default: env)
docker-compose.vault.yml overlay for containerized age vault deployment
CARGO_FEATURES build arg in Dockerfile.dev for optional feature flags
CandleProvider: local GGUF model inference via candle ML framework with chat templates (Llama3, ChatML, Mistral, Phi3, Raw), token generation with top-k/top-p sampling, and repeat penalty (Issue #125)
CandleProvider embeddings: BERT-based embedding model loaded from HuggingFace Hub with mean pooling and L2 normalization (Issue #126)
ModelOrchestrator: task-aware multi-model routing with keyword-based classification (coding, creative, analysis, translation, summarization, general) and provider fallback chains (Issue #127)
SubProvider enum breaking recursive type cycle between AnyProvider and ModelOrchestrator
Device auto-detection: Metal on macOS, CUDA on Linux with GPU, CPU fallback (Issue #128)
Feature flags: candle, metal, cuda, orchestrator on workspace and zeph-llm crate
CandleConfig, GenerationParams, OrchestratorConfig in zeph-core config
Config examples for candle and orchestrator in config/default.toml
Setup guide sections for candle local inference and model orchestrator
15 new unit tests for orchestrator, chat templates, generation config, and loader
Progressive skill loading: lazy body loading via OnceLock, on-demand resource resolution for scripts/, references/, assets/ directories, extended frontmatter (compatibility, license, metadata, allowed-tools), skill name validation per agentskills.io spec (Issue #115)
SkillMeta/Skill composition pattern: metadata loaded at startup, body deferred until skill activation
SkillRegistry replaces Vec<Skill> in Agent — lazy body access via get_skill()/get_body()
resource.rs module: discover_resources() + load_resource() with path traversal protection via canonicalization
Self-learning skill evolution system: automatic skill improvement through failure detection, self-reflection retry, and LLM-generated version updates (Issue #107)
SkillOutcome enum and SkillMetrics for skill execution outcome tracking (Issue #108)
Agent self-reflection retry on tool failure with 1-retry-per-message budget (Issue #109)
Skill version generation and storage in SQLite with auto-activate and manual approval modes (Issue #110)
Automatic rollback when skill version success rate drops below threshold (Issue #111)
/skill stats, /skill versions, /skill activate, /skill approve, /skill reset commands for version management (Issue #111)
/feedback command for explicit user feedback on skill quality (Issue #112)
LearningConfig with TOML config section [skills.learning] and env var overrides
self-learning feature flag on zeph-skills, zeph-core, and root binary
SQLite migration 005: skill_versions and skill_outcomes tables
Bundled setup-guide skill with configuration reference for all env vars, TOML keys, and operating modes
Bundled skill-audit skill for spec compliance and security review of installed skills
allowed_commands shell config to override default blocklist entries via ZEPH_TOOLS_SHELL_ALLOWED_COMMANDS
QdrantSkillMatcher: persistent skill embeddings in Qdrant with BLAKE3 content-hash delta sync (Issue #104)
SkillMatcherBackend enum dispatching between InMemory and Qdrant skill matching (Issue #105)
qdrant feature flag on zeph-skills crate gating all Qdrant dependencies
Graceful fallback to in-memory matcher when Qdrant is unavailable
Skill matching tracing via tracing::debug! for diagnostics
New zeph-mcp crate: MCP client via rmcp 0.14 with stdio transport (Issue #117)
McpClient and McpManager for multi-server lifecycle management with concurrent connections
McpToolExecutor implementing ToolExecutor for ```mcp block execution (Issue #120)
McpToolRegistry: MCP tool embeddings in Qdrant zeph_mcp_tools collection with BLAKE3 delta sync (Issue #118)
Unified matching: skills + MCP tools injected into system prompt by relevance (Issue #119)
mcp feature flag on root binary and zeph-core gating all MCP functionality
Bundled mcp-generate skill with instructions for MCP-to-skill generation via mcp-execution (Issue #121)
[[mcp.servers]] TOML config section for MCP server connections

Changed

Skill struct refactored: split into SkillMeta (lightweight metadata) + Skill (meta + body), composition pattern
SkillRegistry now uses OnceLock<String> for lazy body caching instead of eager loading
Matcher APIs accept &[&SkillMeta] instead of &[Skill] — embeddings use description only
Agent stores SkillRegistry directly instead of Vec<Skill>
Agent field matcher type changed from Option<SkillMatcher> to Option<SkillMatcherBackend>
Skill matcher creation extracted to create_skill_matcher() in main.rs

Dependencies

Added age 0.11.2 to workspace (optional, behind vault-age feature, default-features = false)
Added candle-core 0.9, candle-nn 0.9, candle-transformers 0.9 to workspace (optional, behind candle feature)
Added hf-hub 0.4 to workspace (HuggingFace model downloads with rustls-tls)
Added tokenizers 0.22 to workspace (BPE tokenization with fancy-regex)
Added blake3 1.8 to workspace
Added rmcp 0.14 to workspace (MCP protocol SDK)

0.7.1 - 2026-02-09

Added

WebScrapeExecutor: safe HTML scraping via scrape-core with CSS selectors, SSRF protection, and HTTPS-only enforcement (Issue #57)
CompositeExecutor<A, B>: generic executor chaining with first-match-wins dispatch
Bundled web-scrape skill with CSS selector examples for structured data extraction
extract_fenced_blocks() shared utility for fenced code block parsing (DRY refactor)
[tools.scrape] config section with timeout and max body size settings

Changed

Agent tool output label from [shell output] to [tool output]
ShellExecutor block extraction now uses shared extract_fenced_blocks()

0.7.0 - 2026-02-08

Added

A2A Server: axum-based HTTP server with JSON-RPC 2.0 routing for message/send, tasks/get, tasks/cancel (Issue #83)
In-memory TaskManager with full task lifecycle: create, get, update status, add artifacts, append history, cancel (Issue #83)
SSE streaming endpoint (/a2a/stream) with JSON-RPC response envelope wrapping per A2A spec (Issue #84)
Bearer token authentication middleware with constant-time comparison via subtle::ConstantTimeEq (Issue #85)
Per-IP rate limiting middleware with configurable 60-second sliding window (Issue #85)
Request body size limit (1 MiB) via tower-http::limit::RequestBodyLimitLayer (Issue #85)
A2aServerConfig with env var overrides: ZEPH_A2A_ENABLED, ZEPH_A2A_HOST, ZEPH_A2A_PORT, ZEPH_A2A_PUBLIC_URL, ZEPH_A2A_AUTH_TOKEN, ZEPH_A2A_RATE_LIMIT
Agent card served at /.well-known/agent.json (public, no auth required)
Graceful shutdown integration via tokio watch channel
Server module gated behind server feature flag on zeph-a2a crate

Changed

Part type refactored from flat struct to tagged enum with kind discriminator (text, file, data) per A2A spec
TaskState::Pending renamed to TaskState::Submitted with explicit per-variant #[serde(rename)] for kebab-case wire format
Added AuthRequired and Unknown variants to TaskState
TaskStatusUpdateEvent and TaskArtifactUpdateEvent gained kind field (status-update, artifact-update)

0.6.0 - 2026-02-08

Added

New zeph-a2a crate: A2A protocol implementation for agent-to-agent communication (Issue #78)
A2A protocol types: Task, TaskState, TaskStatus, Message, Part, Artifact, AgentCard, AgentSkill, AgentCapabilities with full serde camelCase serialization (Issue #79)
JSON-RPC 2.0 envelope types (JsonRpcRequest, JsonRpcResponse, JsonRpcError) with method constants for A2A operations (Issue #79)
AgentCardBuilder for constructing A2A agent cards from runtime config and skills (Issue #79)
AgentRegistry with well-known URI discovery (/.well-known/agent.json), TTL-based caching, and manual registration (Issue #80)
A2aClient with send_message, stream_message (SSE), get_task, cancel_task via JSON-RPC 2.0 (Issue #81)
Bearer token authentication support for all A2A client operations (Issue #81)
SSE streaming via eventsource-stream with TaskEvent enum (StatusUpdate, ArtifactUpdate) (Issue #81)
A2aError enum with variants for HTTP, JSON, JSON-RPC, discovery, and stream errors (Issue #79)
Optional a2a feature flag (enabled by default) to gate A2A functionality
42 new unit tests for protocol types, JSON-RPC envelopes, agent card builder, discovery registry, and client operations

0.5.0 - 2026-02-08

Added

Embedding-based skill matcher: SkillMatcher with cosine similarity selects top-K relevant skills per query instead of injecting all skills into the system prompt (Issue #75)
max_active_skills config field (default: 5) with ZEPH_SKILLS_MAX_ACTIVE env var override
Skill hot-reload: filesystem watcher via notify-debouncer-mini detects SKILL.md changes and re-embeds without restart (Issue #76)
Skill priority: earlier paths in skills.paths take precedence when skills share the same name (Issue #76)
SkillRegistry::reload() and SkillRegistry::into_skills() methods
SQLite skill_usage table tracking per-skill invocation counts and last-used timestamps (Issue #77)
/skills command displaying available skills with usage statistics
Three new bundled skills: git, docker, api-request (Issue #77)
17 new unit tests for matcher, registry priority, reload, and usage tracking

Changed

Agent::new() signature: accepts Vec<Skill>, Option<SkillMatcher>, max_active_skills instead of pre-formatted skills prompt string
format_skills_prompt now generic over Borrow<Skill> to accept both &[Skill] and &[&Skill]
Skill struct derives Clone
Agent generic constraint: P: LlmProvider + Clone + 'static (required for embed_fn closures)
System prompt rebuilt dynamically per user query with only matched skills

Dependencies

Added notify 8.0, notify-debouncer-mini 0.6
zeph-core now depends on zeph-skills
zeph-skills now depends on tokio (sync, rt) and notify

0.4.3 - 2026-02-08

Fixed

Telegram “Bad Request: text must be non-empty” error when LLM returns whitespace-only content. Added is_empty() guard after markdown_to_telegram conversion in both send() and send_or_edit() (Issue #73)

Added

Dockerfile.dev: multi-stage build from source with cargo registry/build cache layers for fast rebuilds
docker-compose.dev.yml: full dev stack (Qdrant + Zeph) with debug tracing (RUST_LOG, RUST_BACKTRACE=1), uses host Ollama via host.docker.internal
docker-compose.deps.yml: Qdrant-only compose for native zeph execution on macOS

0.4.2 - 2026-02-08

Fixed

Telegram MarkdownV2 parsing errors (Issue #69). Replaced manual character-by-character escaping with AST-based event-driven rendering using pulldown-cmark 0.13.0
UTF-8 safe text chunking for messages exceeding Telegram’s 4096-byte limit. Uses str::is_char_boundary() with newline preference to prevent splitting multi-byte characters (emoji, CJK)
Link URL over-escaping. Dedicated escape_url() method only escapes ) and \ per Telegram MarkdownV2 spec, fixing broken URLs like https://example\.com

Added

TelegramRenderer state machine for context-aware escaping: 19 special characters in text, only \ and ` in code blocks
Markdown formatting support: bold, italic, strikethrough, headers, code blocks, links, lists, blockquotes
Comprehensive benchmark suite with criterion: 7 scenario groups measuring latency (2.83µs for 500 chars) and throughput (121-970 MiB/s)
Memory profiling test to measure escaping overhead (3-20% depending on content)
30 markdown unit tests covering formatting, escaping, edge cases, and UTF-8 chunking (99.32% line coverage)

Changed

crates/zeph-channels/src/markdown.rs: Complete rewrite with pulldown-cmark event-driven parser (449 lines)
crates/zeph-channels/src/telegram.rs: Removed has_unclosed_code_block() pre-flight check (no longer needed with AST parsing), integrated UTF-8 safe chunking
Dependencies: Added pulldown-cmark 0.13.0 (MIT) and criterion 0.8.0 (Apache-2.0/MIT) for benchmarking

0.4.1 - 2026-02-08

Fixed

Auto-create Qdrant collection on first use. Previously, the zeph_conversations collection had to be manually created using curl commands. Now, ensure_collection() is called automatically before all Qdrant operations (remember, recall, summarize) to initialize the collection with correct vector dimensions (896 for qwen3-embedding) and Cosine distance metric on first access, similar to SQL migrations.

Changed

Docker Compose: Added environment variables for semantic memory configuration (ZEPH_MEMORY_SEMANTIC_ENABLED, ZEPH_MEMORY_SEMANTIC_RECALL_LIMIT) and Qdrant URL override (ZEPH_QDRANT_URL) to enable full semantic memory stack via .env file

0.4.0 - 2026-02-08

Added

M9 Phase 3: Conversation Summarization and Context Budget (Issue #62)

New SemanticMemory::summarize() method for LLM-based conversation compression
Automatic summarization triggered when message count exceeds threshold
SQLite migration 003_summaries.sql creates dedicated summaries table with CASCADE constraints
SqliteStore::save_summary() stores summary with metadata (first/last message IDs, token estimate)
SqliteStore::load_summaries() retrieves all summaries for a conversation ordered by ID
SqliteStore::load_messages_range() fetches messages after specific ID with limit for batch processing
SqliteStore::count_messages() counts total messages in conversation
SqliteStore::latest_summary_last_message_id() gets last summarized message ID for resumption
ContextBudget struct for proportional token allocation (15% summaries, 25% semantic recall, 60% recent history)
estimate_tokens() helper using chars/4 heuristic (100x faster than tiktoken, ±25% accuracy)
Agent::check_summarization() lazy trigger after persist_message() when threshold exceeded
Batch size = threshold/2 to balance summary quality with LLM call frequency
Configuration: memory.summarization_threshold (default: 100), memory.context_budget_tokens (default: 0 = unlimited)
Environment overrides: ZEPH_MEMORY_SUMMARIZATION_THRESHOLD, ZEPH_MEMORY_CONTEXT_BUDGET_TOKENS
Inline comments in config/default.toml documenting all configuration parameters
26 new unit tests for summarization and context budget (196 total tests, 75.31% coverage)
Architecture Decision Records ADR-016 through ADR-019 for summarization design
Foreign key constraint added to messages.conversation_id with ON DELETE CASCADE

M9 Phase 2: Semantic Memory Integration (Issue #61)

SemanticMemory<P: LlmProvider> orchestrator coordinating SQLite, Qdrant, and LlmProvider
SemanticMemory::remember() saves message to SQLite, generates embedding, stores in Qdrant
SemanticMemory::recall() performs semantic search with query embedding and fetches messages from SQLite
SemanticMemory::has_embedding() checks if message already embedded to prevent duplicates
SemanticMemory::embed_missing() background task to embed old messages (with LIMIT parameter)
Agent<P, C, T> now generic over LlmProvider to support SemanticMemory
Agent::with_memory() replaces SqliteStore with SemanticMemory
Graceful degradation: embedding failures logged but don’t block message save
Qdrant connection failures silently downgrade to SQLite-only mode (no semantic recall)
Generic provider pattern: SemanticMemory<P: LlmProvider> instead of Arc<dyn LlmProvider> for Edition 2024 async trait compatibility
AnyProvider, OllamaProvider, ClaudeProvider now derive/implement Clone for semantic memory integration
Integration test updated for SemanticMemory API (with_memory now takes 5 parameters including recall_limit)
Semantic memory config: memory.semantic.enabled, memory.semantic.recall_limit (default: 5)
18 new tests for semantic memory orchestration (recall, remember, embed_missing, graceful degradation)

M9 Phase 1: Qdrant Integration (Issue #60)

New QdrantStore module in zeph-memory for vector storage and similarity search
QdrantStore::store() persists embeddings to Qdrant and tracks metadata in SQLite
QdrantStore::search() performs cosine similarity search with filtering by conversation_id and role
QdrantStore::has_embedding() checks if message has associated embedding
QdrantStore::ensure_collection() idempotently creates Qdrant collection with 768-dimensional vectors
SQLite migration 002_embeddings_metadata.sql for embedding metadata tracking
embeddings_metadata table with foreign key constraint to messages (ON DELETE CASCADE)
PRAGMA foreign_keys enabled in SqliteStore via SqliteConnectOptions
SearchFilter and SearchResult types for flexible query construction
MemoryConfig.qdrant_url field with ZEPH_QDRANT_URL environment variable override (default: http://localhost:6334)
Docker Compose Qdrant service (qdrant/qdrant:v1.13.6) on ports 6333/6334 with persistent storage
Integration tests for Qdrant operations (ignored by default, require running Qdrant instance)
Unit tests for SQLite metadata operations with 98% coverage
12 new tests total (3 unit + 2 integration for QdrantStore, 1 CASCADE DELETE test for SqliteStore, 3 config tests)

M8: Embeddings support (Issue #54)

LlmProvider trait extended with embed(&str) -> Result<Vec<f32>> for generating text embeddings
LlmProvider trait extended with supports_embeddings() -> bool for capability detection
OllamaProvider implements embeddings via ollama-rs generate_embeddings() API
Default embedding model: qwen3-embedding (configurable via llm.embedding_model)
ZEPH_LLM_EMBEDDING_MODEL environment variable for runtime override
ClaudeProvider::embed() returns descriptive error (Claude API does not support embeddings)
AnyProvider delegates embedding methods to active provider
10 new tests: unit tests for all providers, config tests for defaults/parsing/env override
Integration test for real Ollama embedding generation (ignored by default)
README documentation: model compatibility notes and ollama pull instructions for both LLM and embedding models
Docker Compose configuration: added ZEPH_LLM_EMBEDDING_MODEL environment variable

Changed

BREAKING CHANGES (pre-1.0.0):

SqliteStore::save_message() now returns Result<i64> instead of Result<()> to enable embedding workflow
SqliteStore::new() uses sqlx::migrate!() macro instead of INIT_SQL constant for proper migration management
QdrantStore::store() requires model: &str parameter for multi-model support
Config constant LLM_ENV_KEYS renamed to ENV_KEYS to reflect inclusion of non-LLM variables

Migration:

#![allow(unused)]
fn main() {
// Before:
let _ = store.save_message(conv_id, "user", "hello").await?;

// After:
let message_id = store.save_message(conv_id, "user", "hello").await?;
}

OllamaProvider::new() now accepts embedding_model parameter (breaking change, pre-v1.0)
Config schema: added llm.embedding_model field with serde default for backward compatibility

0.3.0 - 2026-02-07

Added

M7 Phase 1: Tool Execution Framework - zeph-tools crate (Issue #39)

New zeph-tools leaf crate for tool execution abstraction following ADR-014
ToolExecutor trait with native async (Edition 2024 RPITIT): accepts full LLM response, returns Option<ToolOutput>
ShellExecutor implementation with bash block parser and execution (30s timeout via tokio::time::timeout)
ToolOutput struct with summary string and blocks_executed count
ToolError enum with Blocked/Timeout/Execution variants (thiserror)
ToolsConfig and ShellConfig configuration types with serde Deserialize and sensible defaults
Workspace version consolidation: version.workspace = true across all crates
Workspace inter-crate dependency references: zeph-llm.workspace = true pattern for all internal dependencies
22 unit tests with 99.25% line coverage, zero clippy warnings
ADR-014: zeph-tools crate design rationale and architecture decisions

M7 Phase 2: Command safety (Issue #40)

DEFAULT_BLOCKED patterns: 12 dangerous commands (rm -rf /, sudo, mkfs, dd if=, curl, wget, nc, ncat, netcat, shutdown, reboot, halt)
Case-insensitive command filtering via to_lowercase() normalization
Configurable timeout and blocked_commands in TOML via [tools.shell] section
Custom blocked commands additive to defaults (cannot weaken security)
35+ comprehensive unit tests covering exact match, prefix match, multiline, case variations
ToolsConfig integration with core Config struct

M7 Phase 3: Agent integration (Issue #41)

Agent now uses ShellExecutor for all bash command execution with safety checks
SEC-001 CRITICAL vulnerability fixed: unfiltered bash execution removed from agent.rs
Removed 66 lines of duplicate code (extract_bash_blocks, execute_bash, extract_and_execute_bash)
ToolError::Blocked properly handled with user-facing error message
Four integration tests for blocked command behavior and error handling
Performance validation: < 1% overhead for tool executor abstraction
Security audit: all acceptance criteria met, zero vulnerabilities

Security

CRITICAL fix for SEC-001: Shell commands now filtered through ShellExecutor with DEFAULT_BLOCKED patterns (rm -rf /, sudo, mkfs, dd if=, curl, wget, nc, shutdown, reboot, halt). Resolves command injection vulnerability where agent.rs bypassed all security checks via inline bash execution.

Fixed

Shell command timeout now respects config.tools.shell.timeout (was hardcoded 30s in agent.rs)
Removed duplicate bash parsing logic from agent.rs (now centralized in zeph-tools)
Error message pattern leakage: blocked commands now show generic security policy message instead of leaking exact blocked pattern

Changed

BREAKING CHANGES (pre-1.0.0):

Agent::new() signature changed: now requires tool_executor: T as 4th parameter where T: ToolExecutor
Agent struct now generic over three types: Agent<P, C, T> (provider, channel, tool_executor)
Workspace Cargo.toml now defines version = "0.3.0" in [workspace.package] section
All crate manifests use version.workspace = true instead of explicit versions
Inter-crate dependencies now reference workspace definitions (e.g., zeph-llm.workspace = true)

Migration:

#![allow(unused)]
fn main() {
// Before:
let agent = Agent::new(provider, channel, &skills_prompt);

// After:
use zeph_tools::shell::ShellExecutor;
let executor = ShellExecutor::new(&config.tools.shell);
let agent = Agent::new(provider, channel, &skills_prompt, executor);
}

0.2.0 - 2026-02-06

Added

M6 Phase 1: Streaming trait extension (Issue #35)

LlmProvider::chat_stream() method returning Pin<Box<dyn Stream<Item = Result<String>> + Send>>
LlmProvider::supports_streaming() capability query method
Channel::send_chunk() method for incremental response delivery
Channel::flush_chunks() method for buffered chunk flushing
ChatStream type alias for Pin<Box<dyn Stream<Item = anyhow::Result<String>> + Send>>
Streaming infrastructure in zeph-llm and zeph-core (dependencies: futures-core 0.3, tokio-stream 0.1)

M6 Phase 2: Ollama streaming backend (Issue #36)

Native token-by-token streaming for OllamaProvider using ollama-rs streaming API
OllamaProvider::chat_stream() implementation via send_chat_messages_stream()
OllamaProvider::supports_streaming() now returns true
Stream mapping from Result<ChatMessageResponse, ()> to Result<String, anyhow::Error>
Integration tests for streaming happy path and equivalence with non-streaming chat() (ignored by default)
ollama-rs "stream" feature enabled in workspace dependencies

M6 Phase 3: Claude SSE streaming backend (Issue #37)

Native token-by-token streaming for ClaudeProvider using Anthropic Messages API with Server-Sent Events
ClaudeProvider::chat_stream() implementation via SSE event parsing
ClaudeProvider::supports_streaming() now returns true
SSE event parsing via eventsource-stream 0.2.3 library
Stream pipeline: bytes_stream() -> eventsource() -> filter_map(parse_sse_event) -> Box::pin()
Handles SSE events: content_block_delta (text extraction), error (mid-stream errors), metadata events (skipped)
Integration tests for streaming happy path and equivalence with non-streaming chat() (ignored by default)
eventsource-stream dependency added to workspace dependencies
reqwest "stream" feature enabled for bytes_stream() support

M6 Phase 4: Agent streaming integration (Issue #38)

Agent automatically uses streaming when provider.supports_streaming() returns true (ADR-014)
Agent::process_response_streaming() method for stream consumption and chunk accumulation
CliChannel immediate streaming: send_chunk() prints each chunk instantly via print!() + flush()
TelegramChannel batched streaming: debounce at 1 second OR 512 bytes, edit-in-place for progressive updates
Response buffer pre-allocation with String::with_capacity(2048) for performance
Error message sanitization: full errors logged via tracing::error!(), generic messages shown to users
Telegram edit retry logic: recovers from stale message_id (message deleted, permissions lost)
tokio-stream dependency added for StreamExt trait
6 new unit tests for channel streaming behavior

Fixed

M6 Phase 3: Security improvements

Manual Debug implementation for ClaudeProvider to prevent API key leakage in debug output
Error message sanitization: full Claude API errors logged via tracing::error!(), generic messages returned to users

Changed

BREAKING CHANGES (pre-1.0.0):

LlmProvider trait now requires chat_stream() and supports_streaming() implementations (no default implementations per project policy)
Channel trait now requires send_chunk() and flush_chunks() implementations (no default implementations per project policy)
All existing providers (OllamaProvider, ClaudeProvider) updated with fallback implementations (Phase 1 non-streaming: calls chat() and wraps in single-item stream)
All existing channels (CliChannel, TelegramChannel) updated with no-op implementations (Phase 1: streaming not yet wired into agent loop)

0.1.0 - 2026-02-05

Added

M0: Workspace bootstrap

Cargo workspace with 5 crates: zeph-core, zeph-llm, zeph-skills, zeph-memory, zeph-channels
Binary entry point with version display
Default configuration file
Workspace-level dependency management and lints

M1: LLM + CLI agent loop

LlmProvider trait with Message/Role types
Ollama backend using ollama-rs
Config loading from TOML with env var overrides
Interactive CLI agent loop with multi-turn conversation

M2: Skills system

SKILL.md parser with YAML frontmatter and markdown body (zeph-skills)
Skill registry that scans directories for */SKILL.md files
Prompt formatter with XML-like skill injection into system prompt
Bundled skills: web-search, file-ops, system-info
Shell execution: agent extracts bash blocks from LLM responses and runs them
Multi-step execution loop with 3-iteration limit
30-second timeout on shell commands
Context builder that combines base system prompt with skill instructions

M3: Memory + Claude

SQLite conversation persistence with sqlx (zeph-memory)
Conversation history loading and message saving per session
Claude backend via Anthropic Messages API with 429 retry (zeph-llm)
AnyProvider enum dispatch for runtime provider selection
CloudLlmConfig for Claude-specific settings (model, max_tokens)
ZEPH_CLAUDE_API_KEY env var for API authentication
ZEPH_SQLITE_PATH env var override for database location
Provider factory in main.rs selecting Ollama or Claude from config
Memory integration into Agent with optional SqliteStore

M4: Telegram channel

Channel trait abstraction for agent I/O (recv, send, send_typing)
CliChannel implementation reading stdin/stdout via tokio::task::spawn_blocking
TelegramChannel adapter using teloxide with mpsc-based message routing
Telegram user whitelist via telegram.allowed_users config
ZEPH_TELEGRAM_TOKEN env var for Telegram bot activation
Bot commands: /start (welcome), /reset, /skills forwarded as ChannelMessage
AnyChannel enum dispatch for runtime channel selection
zeph-channels crate with teloxide 0.17 dependency
TelegramConfig in config.rs with TOML and env var support

M5: Integration tests + release

Integration test suite: config, skills, memory, and agent end-to-end
MockProvider and MockChannel for agent testing without external dependencies
Graceful shutdown via tokio::sync::watch + tokio::signal (SIGINT/SIGTERM)
Ollama startup health check (warn-only, non-blocking)
README with installation, configuration, usage, and skills documentation
GitHub Actions CI/CD: lint, clippy, test (ubuntu + macos), coverage, security, release
Dependabot for Cargo and GitHub Actions with auto-merge for patch/minor updates
Auto-labeler workflow for PRs by path, title prefix, and size
Release workflow with cross-platform binary builds and checksums
Issue templates (bug report, feature request)
PR template with review checklist
LICENSE (MIT), CONTRIBUTING.md, SECURITY.md

Fixed

Replace vulnerable serde_yml/libyml with manual frontmatter parser (GHSA high + medium)

Changed

Move dependency features from workspace root to individual crate manifests
Update README with badges, architecture overview, and pre-built binaries section
Agent is now generic over both LlmProvider and Channel (Agent<P, C>)
Agent::new() accepts a Channel parameter instead of reading stdin directly
Agent::run() uses channel.recv()/send() instead of direct I/O
Agent calls channel.send_typing() before each LLM request
Agent::run() uses tokio::select! to race channel messages against shutdown signal

Keyboard shortcuts

Zeph Documentation

0.12.0 - 2026-02-24

0.11.6 - 2026-02-23

0.11.5 - 2026-02-22

0.11.4 - 2026-02-21

0.11.3 - 2026-02-20

0.11.2 - 2026-02-19

0.11.1 - 2026-02-19

0.11.0 - 2026-02-19

0.10.0 - 2026-02-18

0.9.9 - 2026-02-17

0.9.8 - 2026-02-16

0.9.7 - 2026-02-15

0.9.6 - 2026-02-15

0.9.5 - 2026-02-14

0.9.4 - 2026-02-14

0.9.3 - 2026-02-12

0.9.2 - 2026-02-12

0.9.1 - 2026-02-12

0.9.0 - 2026-02-12

0.8.2 - 2026-02-10

0.8.1 - 2026-02-10

0.8.0 - 2026-02-10

0.7.1 - 2026-02-09

0.7.0 - 2026-02-08

0.6.0 - 2026-02-08

0.5.0 - 2026-02-08

0.4.3 - 2026-02-08

0.4.2 - 2026-02-08

0.4.1 - 2026-02-08

0.4.0 - 2026-02-08