Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog.

Unreleased

[0.18.5] - 2026-04-07

Added

  • Per-provider cost breakdownCostTracker now accumulates per-provider token counts (input, cache read/write, output) and cost. The /status CLI command and TUI /cost view render a per-provider table sorted by cost. See Observability & Cost.
  • ASI coherence tracking — per-provider sliding window of response embeddings penalizes Thompson/EMA routing when coherence drops. Enabled via [llm.routing.asi]. See Adaptive Inference.
  • Unified quality gate — optional post-selection embedding similarity check via [llm.routing] quality_gate. See Adaptive Inference.
  • Time-based microcompact — stale low-value tool outputs are cleared after an idle gap, at zero LLM cost. Configurable via [memory.microcompact]. See Memory & Context.
  • autoDream background consolidation — post-session memory consolidation sweep behind a session-count and time gate. Configurable via [memory.autodream]. See Memory & Context.
  • MagicDocs auto-maintained markdown — files with # MAGIC DOC: header are rewritten after tool-call turns by a background LLM task. Configurable via [magic_docs]. See Memory & Context.
  • Key facts semantic dedup — near-duplicate key_facts are silently skipped before Qdrant insertion. Configurable via memory.key_facts_dedup_threshold. See Memory & Context.
  • MCP error codesMcpErrorCode enum with is_retryable() for caller-side retry classification. See Tool System.
  • Caller identity propagationToolCall.caller_id and AuditEntry.policy_match are now populated from the channel layer. See Tool System.
  • Per-session tool call quotatools.max_tool_calls_per_session limits tool executions per session. See Tool System.
  • OAP authorization config[tools.authorization] TOML section merges rules into PolicyEnforcer at startup. See Policy Enforcer.
  • Scheduler CLI subcommandzeph schedule list/add/remove/show for managing jobs outside an agent session. See Scheduler.

Fixed

  • spawn_asi_update debounce — exactly one embed call per agent turn instead of N concurrent sub-calls.
  • MagicDocs scan now detects ToolOutput in Role::User messages and ToolResult in the native execution path.
  • Filter policy-decision language from key_facts at store time — transient enforcement facts ("blocked", "permission denied", etc.) are no longer embedded.
  • Compaction failure is now surfaced as a user-visible message instead of a silent tracing::warn.
  • MCP manager shuts down explicitly before runtime exit, killing stdio child processes cleanly.
  • BPE tokenizer data is cached in a OnceLock, eliminating repeated disk loads on TokenCounter construction.
  • Unbounded RSS growth in TUI mode: cancel bridge tasks, message buffer, URL set, and scheduler tick storm all addressed.

[0.17.1] - 2026-03-27

Added

  • Tool error taxonomyToolErrorCategory classifies tool failures into 11 categories driving retry, parameter-reformat, and reputation-scoring decisions. ToolErrorFeedback::format_for_llm() replaces opaque error strings with structured [tool_error] blocks. ToolError::Shell carries an explicit category and exit code. See Tool System.
  • MCP per-server trust levels[[mcp.servers]] entries accept trust_level (trusted/untrusted/sandboxed) and tool_allowlist. Sandboxed servers expose only explicitly listed tools (fail-closed). Untrusted servers with no allowlist emit a startup warning. See MCP Integration.
  • Candle-backed classifiersCandleClassifier runs protectai/deberta-v3-small-prompt-injection-v2 for injection detection. CandlePiiClassifier runs iiiorg/piiranha-v1-detect-personal-information (NER) for PII detection; results are merged with the regex filter. Configured via the new [classifiers] section. Requires classifiers feature. See Local Inference.
  • SYNAPSE hybrid seed selection — SYNAPSE spreading activation now ranks seed entities by hybrid_score = fts_score * (1 - seed_structural_weight) + structural_score * seed_structural_weight. New config fields: seed_structural_weight (default: 0.4) and seed_community_cap (default: 3).
  • A-MEM link weight evolution — edges accumulate retrieval_count; composite scoring uses evolved_weight(count, confidence) = confidence * (1 + 0.2 * ln(1 + count)).min(1.0). A background decay task reduces counts over time via link_weight_decay_lambda and link_weight_decay_interval_secs.
  • Topology-aware orchestrationTopologyClassifier classifies DAG structure (AllParallel, LinearChain, FanOut, FanIn, Hierarchical, Mixed) and selects a dispatch strategy (FullParallel, Sequential, LevelBarrier, Adaptive). LevelBarrier dispatch fires tasks level-by-level for hierarchical plans. Enable with topology_selection = true (requires experiments feature).
  • Per-task execution_mode — planner annotates tasks with parallel (default) or sequential to hint the scheduler. Missing fields in stored graphs default to parallel for backward compatibility.
  • PlanVerifier completeness checking — post-task LLM verification produces a structured VerificationResult with gap severity levels (critical/important/minor). replan() injects new TaskNodes for actionable gaps. All failures are fail-open. Configure via verify_provider. See Task Orchestration.
  • rmcp 1.3 — updated from rmcp 1.2.

[0.15.3] - 2026-03-17

Fixed

  • ACP config fallback (#1945) — resolve_config_path() now falls back to ~/.config/zeph/config.toml when config/default.toml is absent relative to CWD; resolves ACP stdio/HTTP startup failure when launched from an IDE workspace directory.
  • TUI filter metrics zero (#1939) — filter metrics (filter_raw_tokens, filter_saved_tokens, filter_applications) no longer show zero in the TUI dashboard during native tool execution. Extracted record_filter_metrics helper and called from all four metric-recording sites.
  • Graph metrics initialization (#1938) — TUI graph metrics panel now shows correct entity/edge/community counts on startup. App::with_metrics_rx() eagerly reads the initial snapshot; graph extraction now awaits the background task and re-reads counts.
  • TUI tool start events (#1931) — native tool calls now emit ToolStart events so the TUI shows a spinner and $ command header before tool output arrives.
  • Graph metrics per-turn update (#1932) — graph memory metrics (entities/edges/communities) now update every turn via per-turn sync_graph_counts() call.

Added

  • OAuth 2.1 PKCE for MCP (#1930) — McpTransport::OAuth variant with url, scopes, callback_port, client_name. McpManager::with_oauth_credential_store() for credential persistence via VaultCredentialStore. Two-phase connect_all(): stdio/HTTP concurrently, OAuth sequentially. SSRF validation on all OAuth metadata endpoints.
  • Background code indexing progress (#1923) — IndexProgress struct with files_done, files_total, chunks_created. CLI prints progress to stderr; TUI shows “Indexing codebase… N/M files (X%)” in status bar.
  • Real behavioral learning (#1913) — LearningEngine now injects inferred user preferences (verbosity, response format, language) into the volatile system prompt block. Preferences learned from corrections via watermark-based incremental scan every 5 turns. Wilson-score confidence threshold gates persistence.
  • Context compression overrides (#1904) — CLI flags --focus/--no-focus, --sidequest/--no-sidequest, --pruning-strategy <reactive|task_aware|mig> for per-session overrides. --init wizard step added. (task_aware_mig removed in v0.16.1 — was dead code; existing configs fall back to reactive with a warning.)
  • Orchestration metrics (#1899) — LlmPlanner::plan() and LlmAggregator::aggregate() return token usage; /status command shows Orchestration block when plans executed.
  • Memory integration tests (#1916) — four #[ignore] tests for session summary → Qdrant roundtrip using testcontainers.

[0.15.2] - 2026-03-16

Added

  • Per-conversation compression guidelines — the compression_guidelines table gains a conversation_id column (migration 034). Guidelines are now scoped to a specific conversation when one is in scope; the global (NULL) guideline is used as fallback. Configure via [memory.compression_guidelines]; toggle with --compression-guidelines. See Context Engineering.
  • Session summary on shutdown (#1816) — when no hard compaction fired during a session, the agent generates a lightweight LLM summary at shutdown and stores it in the vector store for cross-session recall. Configurable via memory.shutdown_summary, shutdown_summary_min_messages (default 4), and shutdown_summary_max_messages (default 20). The --init wizard prompts for the toggle; a TUI spinner appears during summarization.
  • Declarative policy compiler (#1695) — PolicyEnforcer evaluates TOML-based allow/deny rules before any tool executes. Deny-wins semantics; path traversal normalization; tool name normalization. Configure via [tools.policy] with enabled, default_effect, rules, and policy_file. CLI: --policy-file. Slash commands: /policy status, /policy check [--trust-level <level>]. Feature flag: policy-enforcer (included in full). See Policy Enforcer.
  • Pre-execution action verification (#1630) — pluggable PreExecutionVerifier pipeline runs before any tool executes. Two built-in verifiers: DestructiveCommandVerifier (blocks rm -rf /, dd if=, mkfs, etc. outside configured allowed_paths) and InjectionPatternVerifier (blocks SQL injection, command injection, path traversal; warns on SSRF). Configure via [security.pre_execution_verify]. CLI escape hatch: --no-pre-execution-verify. TUI security panel shows block/warn counters.
  • LLM guardrail pre-screener (#1651) — GuardrailFilter screens user input (and optionally tool output) through a guard model before it enters agent context. Configurable action (block/warn), fail strategy (closed/open), timeout, and max_input_chars. Enable with --guardrail or [security.guardrail] enabled = true. TUI status bar: GRD:on (green) or GRD:warn (yellow). Slash command: /guardrail for live stats.
  • Skill content scanner (#1853) — SkillContentScanner scans all loaded skill bodies for injection patterns at startup when [skills.trust] scan_on_load = true (default). Scanner is advisory: findings are WARN-logged and do not downgrade trust or block tools. On-demand: /skill scan TUI command, --scan-skills-on-load CLI flag.
  • OTLP-compatible debug traces (#1343) — --dump-format trace emits OpenTelemetry-compatible JSON traces with span hierarchy: session → iteration → LLM request / tool call / memory search. Configure endpoint and service name via [debug.traces]. Switch at runtime: /dump-format <json|raw|trace>. --init wizard prompts for format when debug dump is enabled.
  • TUI: compression guidelines status (#1803) — memory panel shows guidelines version and last update timestamp. /guidelines slash command displays current guidelines text.
  • Feature use-case bundles (#1831) — six named bundles group related features: desktop (tui + scheduler + compression-guidelines), ide (acp + acp-http + lsp-context), server (gateway + a2a + scheduler + otel), chat (discord + slack), ml (candle + pdf + stt), full (all except ml/hardware). Individual feature flags are unchanged. See Feature Flags.

Changed

  • Cascade router observability (#1825) — cascade_chat and cascade_chat_stream now emit structured tracing events for provider selection, judge scoring, quality verdict, escalation, and budget exhaustion.
  • ACP session config centralization (#1812) — AgentSessionConfig::from_config() and Agent::apply_session_config() replace ~25 individually-copied fields in daemon/runner/ACP session bootstrap. Fixes missing orchestration config and server compaction in daemon sessions.
  • rmcp 0.17 → 1.2 (#1845) — migrated CallToolRequestParams to builder pattern.

Fixed

  • Scheduler deadlock no longer emits misleading “Plan failed. 0/N tasks failed” — non-terminal tasks are marked Canceled at deadlock time; done message distinguishes deadlock, mixed failure, and normal failure paths (#1879).
  • MCP tools are now denied for quarantined skills — TrustGateExecutor tracks registered MCP tool IDs and blocks any call in the set (#1876).
  • Policy tool="shell" / "sh" / "bash" aliases now all match ShellExecutor at rule compile time (#1877).
  • /policy check no longer leaks process environment variables into trace output (#1873).
  • PolicyEffect::AllowIf variant removed — it was identical to Allow and generated misleading TOML docs (#1871).
  • Overflow notice format changed to [full output stored — ID: {uuid} — ...]; read_overflow accepts bare UUIDs and strips the legacy overflow: prefix (#1868).
  • Session summary timeout attempts plain-text fallback instead of silently returning None; shutdown_summary_timeout_secs (default 10) replaces hardcoded 5 s limit (#1869).
  • JWT Bearer tokens (Authorization: Bearer <token>, eyJ...) are now redacted before compression_failure_pairs SQLite insert (#1847).
  • Soft compaction threshold lowered from 0.70 to 0.60; maybe_soft_compact_mid_iteration() fires after per-tool summarization to relieve context pressure without triggering LLM calls (#1828).
  • Ollama base_url with /v1 suffix no longer causes 404 on embed calls (#1832).
  • Graph memory: entity embeddings now correctly stored in Qdrant — EntityResolver was built without a provider in extract_and_store() (#1817, #1829).
  • Debug trace.json written inside per-session subdir, preventing overwrites (#1814).
  • JIT tool reference injection works after overflow migration to SQLite (#1818).
  • Policy symlink boundary check: load_policy_file() canonicalizes the path and rejects files outside the process working directory (#1872).

[0.15.1] - 2026-03-15

Fixed

  • save_compression_guidelines atomic write — the version-number assignment now uses a single INSERT ... SELECT COALESCE(MAX(version), 0) + 1 statement, eliminating the read-then-write TOCTOU race where two concurrent callers could insert duplicate version numbers. Migration 033 adds a UNIQUE(version) constraint to the compression_guidelines table with row-level deduplication for pre-existing corrupt data (closes #1799).

Added

  • Failure-driven compression guidelines (ACON) — after hard compaction, the agent watches subsequent LLM responses for two-signal context-loss indicators (uncertainty phrase + prior-context reference). Confirmed failure pairs are stored in SQLite (compression_failure_pairs). A background updater wakes periodically, calls the LLM to synthesize updated guidelines from accumulated pairs, sanitizes the output to strip prompt injection, and persists the result. Guidelines are injected into every future compaction prompt via a <compression-guidelines> block. Configure via [memory.compression_guidelines]; disabled by default. See Context Engineering.

[0.15.0] - 2026-03-14

Added

  • Gemini provider — full Google Gemini API support across 6 phases: basic chat (generateContent), SSE streaming with thinking-part support, native tool use / function calling, vision / multimodal input (inlineData), semantic embeddings (embedContent), and remote model discovery (GET /v1beta/models). Default model: gemini-2.0-flash; extended thinking available with gemini-2.5-pro. Configure with [llm.gemini] and ZEPH_GEMINI_API_KEY. See LLM Providers.
  • Gemini thinking_level / thinking_budget supportGeminiThinkingConfig with thinking_level (minimal, low, medium, high), thinking_budget (validated -1/0/1–32768), and include_thoughts fields. Applies to Gemini 2.5+ models. Configurable in [llm.gemini] and the --init wizard.
  • Cascade routing strategy — new strategy = "cascade" for the router provider. Tries providers cheapest-first; escalates only when the response is classified as degenerate (empty, repetitive, incoherent). Heuristic and LLM-judge classifier modes. Configure via [llm.router.cascade] with quality_threshold, max_escalations, classifier_mode, and max_cascade_tokens. See Adaptive Inference.
  • Claude server-side context compaction[llm.cloud] server_compaction = true enables the compact-2026-01-12 beta API. Claude manages context on the server side; compaction summaries stream back and are surfaced in the TUI. Graceful fallback to client-side compaction when the beta header is rejected (e.g. on Haiku models). New server_compaction_events metric. Enable with --server-compaction.
  • Claude 1M extended context window[llm.cloud] enable_extended_context = true injects the context-1m-2025-08-07 beta header, unlocking 1M token context for Opus 4.6 and Sonnet 4.6. context_window() reports 1,000,000 when active so auto_budget scales correctly. Configurable in --init wizard.
  • /scheduler list command and list_tasks tool — lists all active scheduled tasks with NAME, KIND, MODE, and NEXT RUN columns. LLM-callable via the list_tasks tool; also available as /scheduler list slash command. See Scheduler.
  • search_code tool — unified hybrid code search combining tree-sitter structural extraction, Qdrant semantic search, and LSP symbol resolution. Always available (no feature flag). See Tools.
  • zeph migrate-config — CLI command to add missing config parameters as commented-out blocks and reformat the file. Idempotent; never modifies existing values. See Migrate Config.
  • ACP readiness probes/health HTTP endpoint returns 200 OK when ready; stdio transport emits zeph/ready JSON-RPC notification as the first outbound packet.
  • Request metadata in debug dumps — model, token limit, temperature, exposed tools, and cache breakpoints included in both json and raw dump formats.

Changed

  • Tiered context compaction (#1338): replaced single compaction_threshold with soft tier (soft_compaction_threshold, default 0.70 — prune tool outputs + apply deferred summaries, no LLM) and hard tier (hard_compaction_threshold, default 0.90 — full LLM summarization). Old compaction_threshold field still accepted via serde alias. deferred_apply_threshold removed — absorbed into soft tier. See Context Engineering.
  • Async parallel dispatch in DagSchedulertick() now dispatches all ready tasks simultaneously instead of capping at max_parallel - running. Concurrency enforced by SubAgentManager returning ConcurrencyLimit; tasks revert to Ready and retry on the next tick.
  • /plan cancel during execution — cancel commands delivered immediately during active plan execution via concurrent channel polling.
  • DagScheduler exponential backoff — concurrency-limit deferral uses 250ms→500ms→1s→2s→4s (cap 5s) instead of a fixed 250ms sleep.
  • Single shared QdrantOps instance — all subsystems share one gRPC connection instead of creating independent connections on startup.
  • zeph-index always-on — the index feature flag is removed; tree-sitter and code intelligence are compiled into every build.
  • Graph memory chunked edge loading — community detection loads edges in configurable chunks (keyset pagination) instead of loading all edges at once, reducing peak memory on large graphs. Configurable via memory.graph.lpa_edge_chunk_size (default: 10,000).

Security

  • SEC-001–004 tool execution hardening — randomized hash seeds, jitter-free retry timing, tool name length limits, wall-clock retry budget. See Security.
  • Shell blocklist unconditionalblocked_commands and DEFAULT_BLOCKED now apply regardless of PermissionPolicy configuration; previously skipped when a policy was attached.

Fixed

  • Context compaction loop: maybe_compact() now detects when the token budget is too tight to make progress (compactable message count ≤ 1, or compaction produced zero net token reduction, or context remains above threshold after a successful summarization pass) and sets a permanent compaction_exhausted flag. Subsequent calls skip compaction entirely and emit a one-time user-visible warning to increase context_budget_tokens or start a new session (#1727).
  • Claude server compaction: ContextManagement struct now serializes to the correct API shape (auto_truncate type with nested trigger); the previous shape caused non-functional --server-compaction.
  • Haiku models: with_server_compaction(true) now emits WARN and keeps the flag disabled (the compact-2026-01-12 beta is not supported for Haiku).
  • Skill embedding log noise: SkillMatcher::new() no longer emits one WARN per skill when the provider does not support embeddings — all EmbedUnsupported errors are summarised into a single info-level message.
  • OpenAI / Gemini: tools with no parameters no longer cause 400 Bad Request in strict mode.
  • Anomaly detector: outcomes now recorded correctly for native tool-use providers (Claude, OpenAI, Gemini).

[0.14.3] - 2026-03-10

See CHANGELOG.md for full release notes.

[0.14.2] - 2026-03-09

See CHANGELOG.md for full release notes.

[0.14.1] - 2026-03-07

See CHANGELOG.md for full release notes.

[0.14.0] - 2026-03-06

See CHANGELOG.md for full release notes.

[0.12.5] - 2026-03-02

See CHANGELOG.md for full release notes.

[0.12.4] - 2026-03-01

Added

  • list_directory tool in FileExecutor: sorted entries with [dir]/[file]/[symlink] labels; uses lstat to avoid following symlinks (#1053)
  • create_directory, delete_path, move_path, copy_path tools in FileExecutor: structured file system mutation ops, all paths sandbox-validated; copy_dir_recursive uses lstat to prevent symlink escape (#1054)
  • fetch tool in WebScrapeExecutor: plain URL-to-text without CSS selector requirement, SSRF protection applied (#1055)
  • DiagnosticsExecutor with diagnostics tool: runs cargo check or cargo clippy --message-format=json, returns structured error/warning list (file, line, col, severity, message), output capped, graceful degradation if cargo absent (#1056)
  • list_directory and find_path tools in AcpFileExecutor: run on agent filesystem when IDE advertises fs.readTextFile capability; paths sandbox-validated, glob segments validated against .. traversal, results capped at 1000 (#1059)
  • ToolFilter: suppresses local FileExecutor tools (read, write, glob) when AcpFileExecutor provides IDE-proxied alternatives (#1059)
  • check_blocklist() and DEFAULT_BLOCKED_COMMANDS extracted to zeph-tools public API so AcpShellExecutor applies the same blocklist as ShellExecutor (#1050)
  • ToolPermission enum with per-binary pattern support in persisted TOML ([tools.bash.patterns]); deny patterns route to RejectAlways fast-path without IDE round-trip (#1050)
  • Self-learning loop (Phase 1–4): FailureKind enum, /skill reject, FeedbackDetector, UserCorrection cross-session recall, Wilson score Bayesian re-ranking, check_trust_transition(), BM25+RRF hybrid search, EMA routing (#1035)

Changed

  • Renamed FileExecutor tool id globfind_path to align with Zed IDE native tool surface (#1052)
  • READONLY_TOOLS allowlist updated to current tool IDs: read, find_path, grep, list_directory, web_scrape, fetch (#1052)
  • CI: migrated from Dependabot to self-hosted Renovate with MSRV-aware constraintsFiltering: strict and grouped minor/patch automerge (#1048)

Security

  • ACP permission gate: subshell injection ($(, backtick) blocked before pattern matching; effective_shell_command() checks inner command of bash -c <cmd> against blocklist; extract_command_binary() strips transparent prefixes to prevent allow-always scope expansion (SEC-ACP-C1, SEC-ACP-C2) (#1050)
  • ACP tool notifications: raw_response is now passed through redact_json before forwarding to claudeCode.toolResponse; prevents secrets from bypassing the redact_secrets pipeline (SEC-ACP-001)

Fixed

  • ACP: terminal release deferred until after tool_call_update notification is dispatched (#1013)
  • ACP: tool execution output forwarded via LoopbackEvent::ToolOutput to ACP channel (#1003)
  • ACP: newlines preserved in tool output for IDE terminal widget (#1034)

[0.12.1] - 2026-02-25

Security

  • Enforce unsafe_code = "deny" at workspace lint level; audited unsafe blocks (mmap via candle, std::env in tests) annotated with #[allow(unsafe_code)] (#867)
  • AgeVaultProvider secrets map switched from HashMap to BTreeMap for deterministic JSON key ordering on vault.save() (#876)
  • WebScrapeExecutor: redirect targets now validated against private/internal IP ranges to prevent SSRF via redirect chains (#871)
  • Gateway webhook payload: per-field length limits (sender/channel <= 256 bytes, body <= 65536 bytes) and ASCII control-char stripping to prevent prompt injection (#868)
  • ACP permission cache: null bytes stripped from tool names before cache key construction to prevent key collision (#872)
  • gateway.max_body_size bounded to 10 MiB (10,485,760 bytes) at config validation to prevent memory exhaustion (#875)
  • Shell sandbox: <(, >(, <<<, eval added to default confirm_patterns to mitigate process substitution, here-string, and eval bypass vectors (#870)

Performance

  • ClaudeProvider caches pre-serialized ToolDefinition slices; cache is invalidated only when the tool set changes, eliminating per-call JSON construction overhead (#894)
  • should_compact() replaced O(N) message scan with direct comparison against cached_prompt_tokens (#880)
  • EnvironmentContext cached on Agent; only git_branch refreshed on skill reload instead of spawning a full git subprocess per turn (#881)
  • Doom-loop content hashed in-place by feeding stable message parts directly into the hasher, eliminating the intermediate normalized String allocation (#882)
  • prune_stale_tool_outputs: count_tokens called once per ToolResult part instead of twice (#883)
  • Composite covering index (conversation_id, id) on messages table (migration 015) replaces single-column index; eliminates post-filter sort step (#895)
  • load_history_filtered rewritten as a CTE, replacing the previous double-sort subquery (#896)
  • remove_tool_responses_middle_out takes ownership of the message Vec instead of cloning; HashSet replaced with Vec::with_capacity for small-N index tracking (#884, #888)
  • Fast-path parts_json == "[]" check in history load functions skips serde parse on the common empty case (#886)
  • consolidate_summaries uses String::with_capacity + write! loop instead of collect::<Vec<_>>().join() (#887)
  • TUI tui_loop() skips terminal.draw() when no events occurred in the 250ms tick, reducing idle CPU usage (#892)

Added

  • sqlite_pool_size: u32 in MemoryConfig (default 5) — configurable via [memory] sqlite_pool_size (#893)
  • Background cleanup task for ResponseCache::cleanup_expired() — interval configurable via [memory] response_cache_cleanup_interval_secs (default 3600s) (#891)
  • schema feature flag in zeph-llm gating schemars dependency and typed output API (#879)

Changed

  • check_summarization() uses in-memory unsummarized_count counter on MemoryState instead of issuing a COUNT(*) SQL query on every message save (#890)
  • Removed 4 channel.send_status() calls from persist_message() in zeph-core — SQLite WAL inserts < 1ms do not warrant status reporting (#889)
  • Default Ollama model changed from mistral:7b to qwen3:8b; "qwen3" and "qwen" added as ChatML template aliases (#897)
  • src/main.rs split into focused modules: runner.rs, agent_setup.rs, tracing_init.rs, tui_bridge.rs, channel.rs, tests.rsmain.rs reduced to 26 LOC (#839)
  • zeph-core/src/bootstrap.rs split into submodule directory: config.rs, health.rs, mcp.rs, provider.rs, skills.rs, tests.rsbootstrap/mod.rs reduced to 278 LOC (#840)
  • SkillTrustRow.source_kind changed from String to SourceKind enum (Local, Hub, File) with serde DB serialization (#848)
  • ScheduledTaskConfig.kind changed from String to ScheduledTaskKind enum (#850)
  • TrustLevel moved to zeph-tools::trust_level; zeph-skills re-exports it, removing the zeph-tools → zeph-skills reverse dependency (#841)
  • Duplicate ChannelError removed from zeph-channels::error; all channel adapters use zeph_core::channel::ChannelError (#842)
  • zeph_a2a::types::TaskState replaced in zeph-core with a local SubAgentState enum; zeph-a2a removed from zeph-core dependencies (#843)
  • zeph-index Qdrant access consolidated through VectorStore trait from zeph-memory; direct qdrant-client dependency removed (#844)
  • content_hash(data: &[u8]) -> String utility added to zeph-core::hash backed by BLAKE3 (#845)
  • zeph-core::diff re-export module removed; zeph_core::DiffData is now a direct re-export of zeph_tools::executor::DiffData (#846)
  • ContextManager, ToolOrchestrator, LearningEngine extracted from Agent into standalone structs with pure delegation (#830, #836, #837, #838)
  • Secret type wraps inner value in Zeroizing<String>; Clone removed (#865)
  • AgeVaultProvider secrets and intermediate decrypt/encrypt buffers wrapped in Zeroizing (#866, #874)
  • A2aServer::serve() and GatewayServer::serve() emit tracing::warn! when auth_token is None (#869, #873)

0.12.0 - 2026-02-24

Added

  • MessageMetadata struct in zeph-llm with agent_visible, user_visible, compacted_at fields; default is both-visible for backward compat (#M28)
  • Message.metadata field with #[serde(default)] — existing serialized messages deserialize without change
  • SQLite migration 013_message_metadata.sql — adds agent_visible, user_visible, compacted_at columns to messages table
  • save_message_with_metadata() in SqliteStore for saving messages with explicit visibility flags
  • load_history_filtered() in SqliteStore — SQL-level filtering by agent_visible / user_visible
  • replace_conversation() in SqliteStore — atomic compaction: marks originals user_only, inserts summary as agent_only
  • oldest_message_ids() in SqliteStore — returns N oldest message IDs for a conversation
  • Agent.load_history() now loads only agent_visible=true messages, excluding compacted originals
  • compact_context() persists compaction atomically via replace_conversation(), falling back to legacy summary storage if DB IDs are unavailable
  • Multi-session ACP support with configurable max_sessions (default 4) and LRU eviction of idle sessions (#781)
  • session_idle_timeout_secs config for automatic session cleanup (default 30 min) with background reaper task (#781)
  • ZEPH_ACP_MAX_SESSIONS and ZEPH_ACP_SESSION_IDLE_TIMEOUT_SECS env overrides (#781)
  • ACP session persistence to SQLiteacp_sessions and acp_session_events tables with conversation replay on load_session per ACP spec (#782)
  • SqliteStore methods for ACP session lifecycle: create_acp_session, save_acp_event, load_acp_events, delete_acp_session, acp_session_exists (#782)
  • TokenCounter in zeph-memory — accurate token counting with tiktoken-rs cl100k_base, replacing chars/4 heuristic (#789)
  • DashMap-backed token cache (10k cap) for amortized O(1) lookups
  • OpenAI tool schema token formula for precise context budget allocation
  • Input size guard (64KB) on token counting to prevent cache pollution from oversized input
  • Graceful fallback to chars/4 when tiktoken tokenizer is unavailable
  • Configurable tool response offload — OverflowConfig with threshold (default 50k chars), retention (7 days), optional custom dir (#791)
  • [tools.overflow] section in config.toml for offload configuration
  • Security hardening: path canonicalization, symlink-safe cleanup, 0o600 file permissions on Unix
  • Wire AcpContext (IDE-proxied FS, shell, permissions) through AgentSpawner into agent tool chain via CompositeExecutor — ACP executors take priority with automatic local fallback (#779)
  • DynExecutor newtype in zeph-tools for object-safe ToolExecutor composition in CompositeExecutor (#779)
  • cancel_signal: Arc<Notify> on LoopbackHandle for cooperative cancellation between ACP sessions and agent loop (#780)
  • with_cancel_signal() builder method on Agent to inject external cancellation signal (#780)
  • zeph-acp crate — ACP (Agent Client Protocol) server for IDE embedding (Zed, JetBrains, Neovim) (#763-#766)
  • --acp CLI flag to launch Zeph as an ACP stdio server (requires acp feature)
  • acp feature gate in root Cargo.toml; included in full feature set
  • ZephAcpAgent implementing SDK Agent trait with session lifecycle (new, prompt, cancel, load)
  • loopback_event_to_update mapping LoopbackEvent variants to ACP SessionUpdate notifications, with empty chunk filtering
  • serve_stdio() transport using AgentSideConnection over tokio-compat stdio streams
  • Stream monitor gated behind ZEPH_ACP_LOG_MESSAGES env var for JSON-RPC traffic debugging
  • Custom mdBook theme with Zeph brand colors (navy+amber palette from TUI)
  • Z-letter favicon SVG for documentation site
  • Sidebar logo via inline data URI
  • Navy as default documentation theme
  • AcpConfig struct in zeph-coreenabled, agent_name, agent_version with ZEPH_ACP_* env overrides (#771)
  • [acp] section in config.toml for configuring ACP server identity
  • --acp-manifest CLI flag — prints ACP agent manifest JSON to stdout for IDE discovery (#772)
  • serve_connection<W, R> generic transport function extracted from serve_stdio for testability (#770)
  • ConnSlot pattern in transport — Rc<RefCell<Option<Rc<AgentSideConnection>>>> populated post-construction so new_session can build ACP adapters (#770)
  • build_acp_context in ZephAcpAgent — wires AcpFileExecutor, AcpShellExecutor, AcpPermissionGate per session (#770)
  • AcpServerConfig passed through serve_stdio/serve_connection to configure agent identity from config values (#770)
  • ACP section in --init wizard — prompts for enabled, agent_name, agent_version (#771)
  • Integration tests for ACP transport using tokio::io::duplexinitialize_handshake, new_session_and_cancel (#773)
  • ACP permission persistence to ~/.config/zeph/acp-permissions.tomlAllowAlways/RejectAlways decisions survive restarts (#786)
  • acp.permission_file config and ZEPH_ACP_PERMISSION_FILE env override for custom permission file path (#786)

Fixed

  • Permission cache key collision on anonymous tools — uses tool_call_id as fallback when title is absent (#779)

Changed

  • CI: add CLA check for external contributors via contributor-assistant/github-action

0.11.6 - 2026-02-23

Fixed

  • Auto-create parent directories for sqlite_path on startup (#756)

Added

  • autosave_assistant and autosave_min_length config fields in MemoryConfig — assistant responses skip embedding when disabled (#748)
  • SemanticMemory::save_only() — persist message to SQLite without generating a vector embedding (#748)
  • ResponseCache in zeph-memory — SQLite-backed LLM response cache with blake3 key hashing and TTL expiry (#750)
  • response_cache_enabled and response_cache_ttl_secs config fields in LlmConfig (#750)
  • Background cleanup_expired() task for response cache (runs every 10 minutes) (#750)
  • ZEPH_MEMORY_AUTOSAVE_ASSISTANT, ZEPH_MEMORY_AUTOSAVE_MIN_LENGTH env overrides (#748)
  • ZEPH_LLM_RESPONSE_CACHE_ENABLED, ZEPH_LLM_RESPONSE_CACHE_TTL_SECS env overrides (#750)
  • MemorySnapshot, export_snapshot(), import_snapshot() in zeph-memory/src/snapshot.rs (#749)
  • zeph memory export <path> and zeph memory import <path> CLI subcommands (#749)
  • SQLite migration 012_response_cache.sql for the response cache table (#750)
  • Temporal decay scoring in SemanticMemory::recall() — time-based score attenuation with configurable half-life (#745)
  • MMR (Maximal Marginal Relevance) re-ranking in SemanticMemory::recall() — post-processing for result diversity (#744)
  • Compact XML skills prompt format (format_skills_prompt_compact) for low-budget contexts (#747)
  • SkillPromptMode enum (full/compact/auto) with auto-selection based on context budget (#747)
  • Adaptive chunked context compaction — parallel chunk summarization via join_all (#746)
  • with_ranking_options() builder for SemanticMemory to configure temporal decay and MMR
  • message_timestamps() method on SqliteStore for Unix epoch retrieval via strftime
  • get_vectors() method on EmbeddingStore for raw vector fetch from SQLite vector_points
  • SQLite-backed SqliteVectorStore as embedded alternative to Qdrant for zero-dependency vector search (#741)
  • vector_backend config option to select between qdrant and sqlite vector backends
  • Credential scrubbing in LLM context pipeline via scrub_content() — redacts secrets and paths before LLM calls (#743)
  • redact_credentials config option (default: true) to toggle context scrubbing
  • Filter diagnostics mode: kept_lines tracking in FilterResult for all 9 filter strategies
  • TUI expand (‘e’) highlights kept lines vs filtered-out lines with dim styling and legend
  • Markdown table rendering in TUI chat panel — Unicode box-drawing borders, bold headers, column auto-width

Changed

  • Token estimation uses chars/4 heuristic instead of bytes/3 for better accuracy on multi-byte text (#742)

0.11.5 - 2026-02-22

Added

  • Declarative TOML-based output filter engine with 9 strategy types: strip_noise, truncate, keep_matching, strip_annotated, test_summary, group_by_rule, git_status, git_diff, dedup
  • Embedded default-filters.toml with 25 pre-configured rules for CLI tools (cargo, git, docker, npm, pip, make, pytest, go, terraform, kubectl, brew, ls, journalctl, find, grep/rg, curl/wget, du/df/ps, jest/mocha/vitest, eslint/ruff/mypy/pylint)
  • filters_path option in FilterConfig for user-provided filter rules override
  • ReDoS protection: RegexBuilder with size_limit, 512-char pattern cap, 1 MiB file size limit
  • Dedup strategy with configurable normalization patterns and HashMap pre-allocation
  • NormalizeEntry replacement validation (rejects unescaped $ capture group refs)
  • Sub-agent orchestration system with A2A protocol integration (#709)
  • Sub-agent definition format with TOML frontmatter parser (#710)
  • SubAgentManager with spawn/cancel/collect lifecycle (#711)
  • Tool filtering (AllowList/DenyList/InheritAll) and skill filtering with glob patterns (#712)
  • Zero-trust permission model with TTL-based grants and automatic revocation (#713)
  • In-process A2A channels for orchestrator-to-sub-agent communication
  • PermissionGrants with audit trail via tracing
  • Real LLM loop wired into SubAgentManager::spawn() with background tokio task execution (#714)
  • poll_subagents() on Agent<C> for collecting completed sub-agent results (#714)
  • shutdown_all() on SubAgentManager for graceful teardown (#714)
  • SubAgentMetrics in MetricsSnapshot with state, turns, elapsed time (#715)
  • TUI sub-agents panel (zeph-tui widgets/subagents) with color-coded states (#715)
  • /agent CLI commands: list, spawn, bg, status, cancel, approve, deny (#716)
  • Typed AgentCommand enum with parse() for type-safe command dispatch replacing string matching in the agent loop
  • @agent_name mention syntax for quick sub-agent invocation with disambiguation from @-triggered file references

Changed

  • Migrated all 6 hardcoded filters (cargo_build, test_output, clippy, git, dir_listing, log_dedup) into the declarative TOML engine

Removed

  • FilterConfig per-filter config structs (TestFilterConfig, GitFilterConfig, ClippyFilterConfig, CargoBuildFilterConfig, DirListingFilterConfig, LogDedupFilterConfig) — filter params now in TOML strategy fields

0.11.4 - 2026-02-21

Added

  • validate_skill_references(body, skill_dir) in zeph-skills loader: parses Markdown links targeting references/, scripts/, or assets/ subdirs, warns on missing files and symlink traversal attempts (#689)
  • sanitize_skill_body(body) in zeph-skills prompt: escapes XML structural tags (<skill, </skill>, <instructions, </instructions>, <available_skills, </available_skills>) to prevent prompt injection (#689)
  • Body sanitization applied automatically to all non-Trusted skills in format_skills_prompt() (#689)
  • load_skill_resource(skill_dir, relative_path) public function in zeph-skills::resource for on-demand loading of skill resource files with path traversal protection (#687)
  • Nested metadata: block support in SKILL.md frontmatter: indented key-value pairs under metadata: are parsed as structured metadata (#686)
  • Field length validation in SKILL.md loader: description capped at 1024 characters, compatibility capped at 500 characters (#686)
  • Warning log in load_skill_body() when body exceeds 20,000 bytes (~5000 tokens) per spec recommendation (#686)
  • Empty value normalization for compatibility and license frontmatter fields: bare compatibility: now produces None instead of Some("") (#686)
  • SkillManager in zeph-skills — install skills from git URLs or local paths, remove, verify blake3 integrity, list with trust metadata
  • CLI subcommands: zeph skill {install, remove, list, verify, trust, block, unblock} — runs without agent loop
  • In-session /skill install <url|path> and /skill remove <name> with hot reload
  • Managed skills directory at ~/.config/zeph/skills/, auto-appended to skills.paths at bootstrap
  • Hash re-verification on trust promotion — recomputes blake3 before promoting to trusted/verified, rejects on mismatch
  • URL scheme allowlist and path traversal validation in SkillManager as defense-in-depth
  • Blocking I/O wrapped in spawn_blocking for async safety in skill management handlers
  • custom: HashMap<String, Secret> field in ResolvedSecrets for user-defined vault secrets (#682)
  • list_keys() method on VaultProvider trait with implementations for Age and Env backends (#682)
  • requires-secrets field in SKILL.md frontmatter for declaring per-skill secret dependencies (#682)
  • Gate skill activation on required secrets availability in system prompt builder (#682)
  • Inject active skill’s secrets as scoped env vars into ShellExecutor at execution time (#682)
  • Custom secrets step in interactive config wizard (--init) (#682)
  • crates.io publishing metadata (description, readme, homepage, keywords, categories) for all workspace crates (#702)

Changed

  • requires-secrets SKILL.md frontmatter field renamed to x-requires-secrets to follow JSON Schema vendor extension convention and avoid future spec collisions — breaking change: update skill frontmatter to use x-requires-secrets; the old requires-secrets form is still parsed with a deprecation warning (#688)
  • allowed-tools SKILL.md field now uses space-separated values per agentskills.io spec (was comma-separated) — breaking change for skills using comma-delimited allowed-tools (#686)
  • Skill resource files (references, scripts, assets) are no longer eagerly injected into the system prompt on skill activation; only filenames are listed as available resources — breaking change for skills relying on auto-injected reference content (#687)

0.11.3 - 2026-02-20

Added

  • LoopbackChannel / LoopbackHandle / LoopbackEvent in zeph-core — headless channel for daemon mode, pairs with a handle that exposes input_tx / output_rx for programmatic agent I/O
  • ProcessorEvent enum in zeph-a2a server — streaming event type replacing synchronous ProcessResult; TaskProcessor::process now accepts an mpsc::Sender<ProcessorEvent> and returns Result<(), A2aError>
  • --daemon CLI flag (feature daemon+a2a) — bootstraps a full agent + A2A JSON-RPC server under DaemonSupervisor with PID file lifecycle and Ctrl-C graceful shutdown
  • --connect <URL> CLI flag (feature tui+a2a) — connects the TUI to a remote daemon via A2A SSE, mapping TaskEvent to AgentEvent in real-time
  • Command palette daemon commands: daemon:connect, daemon:disconnect, daemon:status
  • Command palette action commands: app:quit (shortcut q), app:help (shortcut ?), session:new, app:theme
  • Fuzzy-matching for command palette — character-level gap-penalty scoring replaces substring filter; daemon_command_registry() merged into filter_commands
  • TuiCommand::ToggleTheme variant in command palette (placeholder — theme switching not yet implemented)
  • --init wizard daemon step — prompts for A2A server host, port, and auth token; writes config.a2a.*
  • Snapshot tests for Config::default() TOML serialization (zeph-core), git filter diff/status output, cargo-build filter success/error output, and clippy grouped warnings output — using insta for regression detection
  • Tests for handle_tool_result covering blocked, cancelled, sandbox violation, empty output, exit-code failure, and success paths (zeph-core agent/tool_execution.rs)
  • Tests for maybe_redact (redaction enabled/disabled) and last_user_query helper in agent/tool_execution.rs
  • Tests for handle_skill_command dispatch covering unknown subcommand, missing arguments, and no-memory early-exit paths for stats, versions, activate, approve, and reset subcommands (zeph-core agent/learning.rs)
  • Tests for record_skill_outcomes noop path when no active skills are present
  • insta added to workspace dev-dependencies and to zeph-core and zeph-tools crate dev-deps
  • Embeddable trait and EmbeddingRegistry<T> in zeph-memory — generic Qdrant sync/search extracted from duplicated code in QdrantSkillMatcher and McpToolRegistry (~350 lines removed)
  • MCP server command allowlist validation — only permitted commands (npx, uvx, node, python3, python, docker, deno, bun) can spawn child processes; configurable via mcp.allowed_commands
  • MCP env var blocklist — blocks 21 dangerous variables (LD_PRELOAD, DYLD_, NODE_OPTIONS, PYTHONPATH, JAVA_TOOL_OPTIONS, etc.) and BASH_FUNC_ prefix from MCP server processes
  • Path separator rejection in MCP command validation to prevent symlink-based bypasses

Changed

  • MessagePart::Image variant now holds Box<ImageData> instead of inline fields, improving semantic grouping of image data
  • Agent<C, T> simplified to Agent<C> — ToolExecutor generic replaced with Box<dyn ErasedToolExecutor>, reducing monomorphization
  • Shell command detection rewritten from substring matching to tokenizer-based pipeline with escape normalization, eliminating bypass vectors via backslash insertion, hex/octal escapes, quote splitting, and pipe chains
  • Shell sandbox path validation now uses std::path::absolute() as fallback when canonicalize() fails on non-existent paths
  • Blocked command matching extracts basename from absolute paths (/usr/bin/sudo now correctly blocked)
  • Transparent wrapper commands (env, command, exec, nice, nohup, time, xargs) are skipped to detect the actual command
  • Default confirm patterns now include $( and backtick subshell expressions
  • Enable SQLite WAL mode with SYNCHRONOUS=NORMAL for 2-5x write throughput (#639)
  • Replace O(n*iterations) token scan with cached_prompt_tokens in budget checks (#640)
  • Defer maybe_redact to stream completion boundary instead of per-chunk (#641)
  • Replace format_tool_output string allocation with Write-into-buffer (#642)
  • Change ToolCall.params from HashMap to serde_json::Map, eliminating clone (#643)
  • Pre-join static system prompt sections into LazyLock (#644)
  • Replace doom-loop string history with content hash comparison (#645)
  • Return &’static str from detect_image_mime with case-insensitive matching (#646)
  • Replace block_on in history persist with fire-and-forget async spawn (#647)
  • Change LlmProvider::name() from &'static str to &str, eliminating Box::leak memory leak in CompatibleProvider (#633)
  • Extract rate-limit retry helper send_with_retry() in zeph-llm, deduplicating 3 retry loops (#634)
  • Extract sse_to_chat_stream() helpers shared by Claude and OpenAI providers (#635)
  • Replace double AnyProvider::clone() in embed_fn() with single Arc clone (#636)
  • Add with_client() builder to ClaudeProvider and OpenAiProvider for shared reqwest::Client (#637)
  • Cache JsonSchema per TypeId in chat_typed to avoid per-call schema generation (#638)
  • Scrape executor performs post-DNS resolution validation against private/loopback IPs with pinned address client to prevent SSRF via DNS rebinding
  • Private host detection expanded to block *.localhost, *.internal, *.local domains
  • A2A error responses sanitized: serde details and method names no longer exposed to clients
  • Rate limiter rejects new clients with 429 when entry map is at capacity after stale eviction
  • Secret redaction regex-based pattern matching replaces whitespace tokenizer, detecting secrets in URLs, JSON, and quoted strings
  • Added hf_, npm_, dckr_pat_ to secret redaction prefixes
  • A2A client stream errors truncate upstream body to 256 bytes
  • Add default_client() HTTP helper with standard timeouts and user-agent in zeph-core and zeph-llm (#666)
  • Replace 5 production Client::new() calls with default_client() for consistent HTTP config (#667)
  • Decompose agent/mod.rs (2602→459 lines) into tool_execution, message_queue, builder, commands, and utils modules (#648, #649, #650)
  • Replace anyhow in zeph-core::config with typed ConfigError enum (Io, Parse, Validation, Vault)
  • Replace anyhow in zeph-tui with typed TuiError enum (Io, Channel); simplify handle_event() return to ()
  • Sort [workspace.dependencies] alphabetically in root Cargo.toml

Fixed

  • False positive: “sudoku” no longer matched by “sudo” blocked pattern (word-boundary matching)
  • PID file creation uses OpenOptions::create_new(true) (O_CREAT|O_EXCL) to prevent TOCTOU symlink attacks

0.11.2 - 2026-02-19

Added

  • base_url and language fields in [llm.stt] config for OpenAI-compatible local whisper servers (e.g. whisper.cpp)
  • ZEPH_STT_BASE_URL and ZEPH_STT_LANGUAGE environment variable overrides
  • Whisper API provider now passes language parameter for accurate non-English transcription
  • Documentation for whisper.cpp server setup with Metal acceleration on macOS
  • Per-sub-provider base_url and embedding_model overrides in orchestrator config
  • Full orchestrator example with cloud + local + STT in default.toml
  • All previously undocumented config keys in default.toml (agent.auto_update_check, llm.stt, llm.vision_model, skills.disambiguation_threshold, tools.filters.*, tools.permissions, a2a.auth_token, mcp.servers.env)

Fixed

  • Outdated config keys in default.toml: removed nonexistent repo_id, renamed provider_type to type, corrected candle defaults, fixed observability exporter default
  • Add wait(true) to Qdrant upsert and delete operations for read-after-write consistency, fixing flaky ingested_chunks_have_correct_payload integration test (#567)
  • Vault age backend now falls back to default directory for key/path when --vault-key/--vault-path are not provided, matching zeph vault init behavior (#613)

Changed

  • Whisper STT provider no longer requires OpenAI API key when base_url points to a local server
  • Orchestrator sub-providers now resolve base_url and embedding_model via fallback chain: per-provider, parent section, global default

0.11.1 - 2026-02-19

Added

  • Persistent CLI input history with rustyline: arrow key navigation, prefix search, line editing, SQLite-backed persistence across restarts (#604)
  • Clickable markdown links in TUI via OSC 8 hyperlinks — [text](url) renders as terminal-clickable link with URL sanitization and scheme allowlist (#580)
  • @-triggered fuzzy file picker in TUI input — type @ to search project files by name/path/extension with real-time filtering (#600)
  • Command palette in TUI with read-only agent management commands (#599)
  • Orchestrator provider option in zeph init wizard for multi-model routing setup (#597)
  • zeph vault CLI subcommands: init (generate age keypair), set (store secret), get (retrieve secret), list (show keys), rm (remove secret) (#598)
  • Atomic file writes for vault operations with temp+rename strategy (#598)
  • Default vault directory resolution via XDG_CONFIG_HOME / APPDATA / HOME (#598)
  • Auto-update check via GitHub Releases API with configurable scheduler task (#588)
  • auto_update_check config field (default: true) with ZEPH_AUTO_UPDATE_CHECK env override
  • TaskKind::UpdateCheck variant and UpdateCheckHandler in zeph-scheduler
  • One-shot update check at startup when scheduler feature is disabled
  • --init wizard step for auto-update check configuration

Fixed

  • Restore --vault, --vault-key, --vault-path CLI flags lost during clap migration (#587)

Changed

  • Refactor AppBuilder::from_env() to AppBuilder::new() with explicit CLI overrides
  • Eliminate redundant manual std::env::args() parsing in favor of clap
  • Add ZEPH_VAULT_KEY and ZEPH_VAULT_PATH environment variable support
  • Init wizard reordered: vault backend selection is now step 1 before LLM provider (#598)
  • API key and channel token prompts skipped when age vault backend is selected (#598)

0.11.0 - 2026-02-19

Added

  • Vision (image input) support across Claude, OpenAI, and Ollama providers (#490)
  • MessagePart::Image content type with base64 serialization
  • LlmProvider::supports_vision() trait method for runtime capability detection
  • Claude structured content with AnthropicContentBlock::Image variant
  • OpenAI array content format with image_url data-URI encoding
  • Ollama with_images() support with optional vision_model config for dedicated model routing
  • /image <path> command in CLI and TUI channels
  • Telegram photo message handling with pre-download size guard
  • vision_model field in [llm.ollama] config section and --init wizard update
  • 20 MB max image size limit and path traversal protection
  • Interactive configuration wizard via zeph init subcommand with 5-step setup (LLM provider, memory, channels, secrets backend, config generation)
  • clap-based CLI argument parsing with --help, --version support
  • Serialize derive on Config and all nested types for TOML generation
  • dialoguer dependency for interactive terminal prompts
  • Structured LLM output via chat_typed<T>() on LlmProvider trait with JSON schema enforcement (#456)
  • OpenAI/Compatible native response_format: json_schema structured output (#457)
  • Claude structured output via forced tool use pattern (#458)
  • Extractor<T> utility for typed data extraction from LLM responses (#459)
  • TUI test automation infrastructure: EventSource trait abstraction, insta widget snapshot tests, TestBackend integration tests, proptest layout verification, expectrl E2E terminal tests (#542)
  • CI snapshot regression pipeline with cargo insta test --check (#547)
  • Pipeline API with composable, type-safe Step trait, Pipeline builder, ParallelStep combinator, and built-in steps (LlmStep, RetrievalStep, ExtractStep, MapStep) (#466, #467, #468)
  • Structured intent classification for skill disambiguation: when top-2 skill scores are within disambiguation_threshold (default 0.05), agent calls LLM via chat_typed::<IntentClassification>() to select the best-matching skill (#550)
  • ScoredMatch struct exposing both skill index and cosine similarity score from matcher backends
  • IntentClassification type (skill_name, confidence, params) with JsonSchema derive for schema-enforced LLM responses
  • disambiguation_threshold in [skills] config section (default: 0.05) with with_disambiguation_threshold() builder on Agent
  • DocumentLoader trait with text/markdown file loader in zeph-memory (#469)
  • Text splitter with configurable chunk size, overlap, and sentence-aware splitting (#470)
  • PDF document loader, feature-gated behind pdf (#471)
  • Document ingestion pipeline: load, split, embed, store via Qdrant (#472)
  • File size guard (50 MiB default) and path canonicalization for document loaders
  • Audio input support: Attachment/AttachmentKind types, SpeechToText trait, OpenAI Whisper backend behind stt feature flag (#520, #521, #522)
  • Telegram voice and audio message handling with automatic file download (#524)
  • STT bootstrap wiring: WhisperProvider created from [llm.stt] config behind stt feature (#529)
  • Slack audio file upload handling with host validation and size limits (#525)
  • Local Whisper backend via candle for offline STT with symphonia audio decode and rubato resampling (#523)
  • Shell-based installation script (install/install.sh) with SHA256 verification, platform detection, and --version flag
  • Shellcheck lint job in CI pipeline
  • Per-job permission scoping in release workflow (least privilege)
  • TUI word-jump and line-jump cursor navigation (#557)
  • TUI keybinding help popup on ? in normal mode (#533)
  • TUI clickable hyperlinks via OSC 8 escape sequences (#530)
  • TUI edit-last-queued for recalling queued messages (#535)
  • VectorStore trait abstraction in zeph-memory (#554)
  • Operation-level cancellation for LLM requests and tool executions (#538)

Changed

  • Consolidate Docker files into docker/ directory (#539)
  • Typed deserialization for tool call params (#540)
  • CI: replace oraclelinux base image with debian bookworm-slim (#532)

Fixed

  • Strip schema metadata and fix doom loop detection for native tool calls (#534)
  • TUI freezes during fast LLM streaming and parallel tool execution: biased event loop with input priority and agent event batching (#500)
  • Redundant syntax highlighting and markdown parsing on every TUI frame: per-message render cache with content-hash keying (#501)

0.10.0 - 2026-02-18

Fixed

  • TUI status spinner not cleared after model warmup completes (#517)
  • Duplicate tool output rendering for shell-streamed tools in TUI (#516)
  • send_tool_output not forwarded through AppChannel/AnyChannel enum dispatch (#508)
  • Tool output and diff not sent atomically in native tool_use path (#498)
  • Parallel tool_use calls: results processed sequentially for correct ordering (#486)
  • Native tool_result format not recognized by TUI history loader (#484)
  • Inline filter stats threshold based on char savings instead of line count (#483)
  • Token metrics not propagated in native tool_use path (#482)
  • Filter metrics not appearing in TUI Resources panel when using native tool_use providers (#480)
  • Output filter matchers not matching compound shell commands like cd /path && cargo test 2>&1 | tail (#481)
  • Duplicate ToolEvent::Completed emission in shell executor before filtering was applied (#480)
  • TUI feature gate compilation errors (#435)

Added

  • GitHub CLI skill with token-saving patterns (#507)
  • Parallel execution of native tool_use calls with configurable concurrency (#486)
  • TUI compact/detailed tool output toggle with ‘e’ key binding (#479)
  • TUI [tui] config section with show_source_labels option to hide [user]/[zeph]/[tool] prefixes (#505)
  • Syntax-highlighted diff view for write/edit tool output in TUI (#455)
    • Diff rendering with green/red backgrounds for added/removed lines
    • Word-level change highlighting within modified lines
    • Syntax highlighting via tree-sitter
    • Compact/expanded toggle with existing ‘e’ key binding
    • New dependency: similar 2.7.0
  • Per-tool inline filter stats in CLI chat: [shell] cargo test (342 lines -> 28 lines, 91.8% filtered) (#449)
  • Filter metrics in TUI Resources panel: confidence distribution, command hit rate, token savings (#448)
  • Periodic 250ms tick in TUI event loop for real-time metrics refresh (#447)
  • Output filter architecture improvements (M26.1): CommandMatcher enum, FilterConfidence, FilterPipeline, SecurityPatterns, per-filter TOML config (#452)
  • Token savings tracking and metrics for output filtering (#445)
  • Smart tool output filtering: command-aware filters that compress tool output before context insertion
  • OutputFilter trait and OutputFilterRegistry with first-match-wins dispatch
  • sanitize_output() ANSI escape and progress bar stripping (runs on all tool output)
  • Test output filter: cargo test/nextest failures-only mode (94-99% token savings on green suites)
  • Git output filter: compact status/diff/log/push compression (80-99% savings)
  • Clippy output filter: group warnings by lint rule (70-90% savings)
  • Directory listing filter: hide noise directories (target, node_modules, .git)
  • Log deduplication filter: normalize timestamps/UUIDs, count repeated patterns (70-85% savings)
  • [tools.filters] config section with enabled toggle
  • Skill trust levels: 4-tier model (Trusted, Verified, Quarantined, Blocked) with per-turn enforcement
  • TrustGateExecutor wrapping tool execution with trust-level permission checks
  • AnomalyDetector with sliding-window threshold counters for quarantined skill monitoring
  • blake3 content hashing for skill integrity verification on load and hot-reload
  • Quarantine prompt wrapping for structural isolation of untrusted skill bodies
  • Self-learning gate: skills with trust < Verified skip auto-improvement
  • skill_trust SQLite table with migration 009
  • CLI commands: /skill trust, /skill block, /skill unblock
  • [skills.trust] config section (default_level, local_level, hash_mismatch_level)
  • ProviderKind enum for type-safe provider selection in config
  • RuntimeConfig struct grouping agent runtime fields
  • AnyProvider::embed_fn() shared embedding closure helper
  • Config::validate() with bounds checking for critical config values
  • sanitize_paths() for stripping absolute paths from error messages
  • 10-second timeout wrapper for embedding API calls
  • full feature flag enabling all optional features

Changed

  • Remove P generic from Agent, SemanticMemory, CodeRetriever — provider resolved at construction (#423)
  • Architecture improvements, performance optimizations, security hardening (M24) (#417)
  • Extract bootstrap logic from main.rs into zeph-core::bootstrap::AppBuilder (#393): main.rs reduced from 2313 to 978 lines
  • SecurityConfig and TimeoutConfig gain Clone + Copy
  • AnyChannel moved from main.rs to zeph-channels crate
  • Remove 8 lightweight feature gates, make always-on: openai, compatible, orchestrator, router, self-learning, qdrant, vault-age, mcp (#438)
  • Default features reduced to minimal set (empty after M26)
  • Skill matcher concurrency reduced from 50 to 20
  • String::with_capacity in context building loops
  • CI updated to use --features full

Breaking

  • LlmConfig.provider changed from String to ProviderKind enum
  • Default features reduced – users needing a2a, candle, mcp, openai, orchestrator, router, tui must enable explicitly or use --features full
  • Telegram channel rejects empty allowed_users at startup
  • Config with extreme values now rejected by Config::validate()

Deprecated

  • ToolExecutor::execute() string-based dispatch (use execute_tool_call() instead)

Fixed

  • Closed #410 (clap dropped atty), #411 (rmcp updated quinn-udp), #413 (A2A body limit already present)

0.9.9 - 2026-02-17

Added

  • zeph-gateway crate: axum HTTP gateway with POST /webhook ingestion, bearer auth (blake3 + ct_eq), per-IP rate limiting, GET /health endpoint, feature-gated (gateway) (#379)
  • zeph-core::daemon module: component supervisor with health monitoring, PID file management, graceful shutdown, feature-gated (daemon) (#380)
  • zeph-scheduler crate: cron-based periodic task scheduler with SQLite persistence, built-in tasks (memory_cleanup, skill_refresh, health_check), TaskHandler trait, feature-gated (scheduler) (#381)
  • New config sections: [gateway], [daemon], [scheduler] in config/default.toml (#367)
  • New optional feature flags: gateway, daemon, scheduler
  • Hybrid memory search: FTS5 keyword search combined with Qdrant vector similarity (#372, #373, #374)
  • SQLite FTS5 virtual table with auto-sync triggers for full-text keyword search
  • Configurable vector_weight/keyword_weight in [memory.semantic] for hybrid ranking
  • FTS5-only fallback when Qdrant is unavailable (replaces empty results)
  • AutonomyLevel enum (ReadOnly/Supervised/Full) for controlling tool access (#370)
  • autonomy_level config key in [security] section (default: supervised)
  • Read-only mode restricts agent to file_read, file_glob, file_grep, web_scrape
  • Full mode allows all tools without confirmation prompts
  • Documented [telegram].allowed_users allowlist in default config (#371)
  • OpenTelemetry OTLP trace export with tracing-opentelemetry layer, feature-gated behind otel (#377)
  • [observability] config section with exporter selection and OTLP endpoint
  • Instrumentation spans for LLM calls (llm_call) and tool executions (tool_exec)
  • CostTracker with per-model token pricing and configurable daily budget limits (#378)
  • [cost] config section with enabled and max_daily_cents options
  • cost_spent_cents field in MetricsSnapshot for TUI cost display
  • Discord channel adapter with Gateway v10 WebSocket, slash commands, edit-in-place streaming (#382)
  • Slack channel adapter with Events API webhook, HMAC-SHA256 signature verification, streaming (#383)
  • Feature flags: discord and slack (opt-in) in zeph-channels and root crate
  • DiscordConfig and SlackConfig with token redaction in Debug impls
  • Slack timestamp replay protection (reject requests >5min old)
  • Configurable Slack webhook bind address (webhook_host)

0.9.8 - 2026-02-16

Added

  • Graceful shutdown on Ctrl-C with farewell message and MCP server cleanup (#355)
  • Cancel-aware LLM streaming via tokio::select on shutdown signal (#358)
  • McpManager::shutdown_all_shared() with per-client 5s timeout (#356)
  • Indexer progress logging with file count and per-file stats
  • Skip code index for providers with native tool_use (#357)
  • OpenAI prompt caching: parse and report cached token usage (#348)
  • Syntax highlighting for TUI code blocks via tree-sitter-highlight (#345, #346, #347)
  • Anthropic prompt caching with structured system content blocks (#337)
  • Configurable summary provider for tool output summarization via local model (#338)
  • Aggressive inline pruning of stale tool outputs in tool loops (#339)
  • Cache usage metrics (cache_read_tokens, cache_creation_tokens) in MetricsSnapshot (#340)
  • Native tool_use support for Claude provider (Anthropic API format) (#256)
  • Native function calling support for OpenAI provider (#257)
  • ToolDefinition, ChatResponse, ToolUseRequest types in zeph-llm (#254)
  • ToolUse/ToolResult variants in MessagePart for structured tool flow (#255)
  • Dual-mode agent loop: native structured path alongside legacy text extraction (#258)
  • Dual system prompt: native tool_use instructions for capable providers, fenced-block instructions for legacy providers

Changed

  • Consolidate all SQLite migrations into root migrations/ directory (#354)

0.9.7 - 2026-02-15

Performance

  • Token estimation uses len() / 3 for improved accuracy (#328)
  • Explicit tokio feature selection replacing broad feature gates (#326)
  • Concurrent skill embedding for faster startup (#327)
  • Pre-allocate strings in hot paths to reduce allocations (#329)
  • Parallel context building via try_join! (#331)
  • Criterion benchmark suite for core operations (#330)

Security

  • Path traversal protection in shell sandbox (#325)
  • Canonical path validation in skill loader (#322)
  • SSRF protection for MCP server connections (#323)
  • Remove MySQL/RSA vulnerable transitive dependencies (#324)
  • Secret redaction patterns for Google and GitLab tokens (#320)
  • TTL-based eviction for rate limiter entries (#321)

Changed

  • QdrantOps shared helper trait for Qdrant collection operations (#304)
  • delegate_provider! macro replacing boilerplate provider delegation (#303)
  • Remove TuiError in favor of unified error handling (#302)
  • Generic recv_optional replacing per-channel optional receive logic (#301)

Dependencies

  • Upgraded rmcp to 0.15, toml to 1.0, uuid to 1.21 (#296)
  • Cleaned up deny.toml advisory and license configuration (#312)

0.9.6 - 2026-02-15

Changed

  • BREAKING: ToolDef schema field replaced Vec<ParamDef> with schemars::Schema auto-derived from Rust structs via #[derive(JsonSchema)]
  • BREAKING: ParamDef and ParamType removed from zeph-tools public API
  • BREAKING: ToolRegistry::new() replaced with ToolRegistry::from_definitions(); registry no longer hardcodes built-in tools — each executor owns its definitions via tool_definitions()
  • BREAKING: Channel trait now requires ChannelError enum with typed error handling replacing anyhow::Result
  • BREAKING: Agent::new() signature changed to accept new field grouping; agent struct refactored into 5 inner structs for improved organization
  • BREAKING: AgentError enum introduced with 7 typed variants replacing scattered anyhow::Error handling
  • ToolDef now includes InvocationHint (FencedBlock/ToolCall) so LLM prompt shows exact invocation format per tool
  • web_scrape tool definition includes all parameters (url, select, extract, limit) auto-derived from ScrapeInstruction
  • ShellExecutor and WebScrapeExecutor now implement tool_definitions() for single source of truth
  • Replaced tokio “full” feature with granular features in zeph-core (async-io, macros, rt, sync, time)
  • Removed anyhow dependency from zeph-channels
  • Message persistence now uses MessageKind enum instead of is_summary bool for qdrant storage

Added

  • ChannelError enum with typed variants for channel operation failures
  • AgentError enum with 7 typed variants for agent operation failures (streaming, persistence, configuration, etc.)
  • Workspace-level qdrant feature flag for optional semantic memory support
  • Type aliases consolidated into zeph-llm: EmbedFuture and EmbedFn with typed LlmError
  • streaming.rs and persistence.rs modules extracted from agent module for improved code organization
  • MessageKind enum for distinguishing summary and regular messages in storage

Removed

  • anyhow::Result from Channel trait (replaced with ChannelError)
  • Direct anyhow::Error usage in agent module (replaced with AgentError)

0.9.5 - 2026-02-14

Added

  • Pattern-based permission policy with glob matching per tool (allow/ask/deny), first-match-wins evaluation (#248)
  • Legacy blocked_commands and confirm_patterns auto-migrated to permission rules (#249)
  • Denied tools excluded from LLM system prompt (#250)
  • Tool output overflow: full output saved to file when truncated, path notice appended for LLM access (#251)
  • Stale tool output overflow files cleaned up on startup (>24h TTL) (#252)
  • ToolRegistry with typed ToolDef definitions for 7 built-in tools (bash, read, edit, write, glob, grep, web_scrape) (#239)
  • FileExecutor for sandboxed file operations: read, write, edit, glob, grep (#242)
  • ToolCall struct and execute_tool_call() on ToolExecutor trait for structured tool invocation (#241)
  • CompositeExecutor routes structured tool calls to correct sub-executor by tool_id (#243)
  • Tool catalog section in system prompt via ToolRegistry::format_for_prompt() (#244)
  • Configurable max_tool_iterations (default 10, previously hardcoded 3) via TOML and ZEPH_AGENT_MAX_TOOL_ITERATIONS env var (#245)
  • Doom-loop detection: breaks agent loop on 3 consecutive identical tool outputs
  • Context budget check at 80% threshold stops iteration before context overflow
  • IndexWatcher for incremental code index updates on file changes via notify file watcher (#233)
  • watch config field in [index] section (default true) to enable/disable file watching
  • Repo map cache with configurable TTL (repo_map_ttl_secs, default 300s) to avoid per-message filesystem traversal (#231)
  • Cross-session memory score threshold (cross_session_score_threshold, default 0.35) to filter low-relevance results (#232)
  • embed_missing() called on startup for embedding backfill when Qdrant available (#261)
  • AgentTaskProcessor replaces EchoTaskProcessor for real A2A inference (#262)

Changed

  • ShellExecutor uses PermissionPolicy for all permission checks instead of legacy find_blocked_command/find_confirm_command
  • Replaced unmaintained dirs-next 2.0 with dirs 6.x
  • Batch messages retrieval in semantic recall: replaced N+1 query pattern with messages_by_ids() for improved performance

Fixed

  • Persist MessagePart data to SQLite via remember_with_parts() — pruning state now survives session restarts (#229)
  • Clear tool output body from memory after Tier 1 pruning to reclaim heap (#230)
  • TUI uptime display now updates from agent start time instead of always showing 0s (#259)
  • FileExecutor handle_write now uses canonical path for security (TOCTOU prevention) (#260)
  • resolve_via_ancestors trailing slash bug on macOS
  • vault.backend from config now used as default backend; CLI --vault flag overrides config (#263)
  • A2A error responses sanitized to prevent provider URL leakage

0.9.4 - 2026-02-14

Added

  • Bounded FIFO message queue (max 10) in agent loop: users can submit messages during inference, queued messages are delivered sequentially when response cycle completes
  • Channel trait extended with try_recv() (non-blocking poll) and send_queue_count() with default no-op impls
  • Consecutive user messages within 500ms merge window joined by newline
  • TUI queue badge [+N queued] in input area, Ctrl+K to clear queue, /clear-queue command
  • TelegramChannel try_recv() implementation via mpsc
  • Deferred model warmup in TUI mode: interface renders immediately, Ollama warmup runs in background with status indicator (“warming up model…” → “model ready”), agent loop awaits completion via watch::channel
  • context_tokens metric in TUI Resources panel showing current prompt estimate (vs cumulative session totals)
  • unsummarized_message_count in SemanticMemory for precise summarization trigger
  • count_messages_after in SqliteStore for counting messages beyond a given ID
  • TUI status indicators for context compaction (“compacting context…”) and summarization (“summarizing…”)
  • Debug tracing in should_compact() for context budget diagnostics (token estimate, threshold, decision)
  • Config hot-reload: watch config file for changes via notify_debouncer_mini and apply runtime-safe fields (security, timeouts, memory limits, context budget, compaction, max_active_skills) without restart
  • ConfigWatcher in zeph-core with 500ms debounced filesystem monitoring
  • with_config_reload() builder method on Agent for wiring config file watcher
  • tool_name field in ToolOutput for identifying tool type (bash, mcp, web-scrape) in persisted messages and TUI display
  • Real-time status events for provider retries and orchestrator fallbacks surfaced as [system] messages across all channels (CLI stderr, TUI chat panel, Telegram)
  • StatusTx type alias in zeph-llm for emitting status events from providers
  • Status variant in TUI AgentEvent rendered as System-role messages (DarkGray)
  • set_status_tx() on AnyProvider, SubProvider, and ModelOrchestrator for propagating status sender through the provider hierarchy
  • Background forwarding tasks for immediate status delivery (bypasses agent loop for zero-latency display)
  • TUI: toggle side panels with d key in Normal mode
  • TUI: input history navigation (Up/Down in Insert mode)
  • TUI: message separators and accent bars for visual structure
  • TUI: tool output restored as expandable messages from conversation history
  • TUI: collapsed tool output preview (3 lines) when restoring history
  • LlmProvider::context_window() trait method for model context window size detection
  • Ollama context window auto-detection via /api/show model info endpoint
  • Context window sizes for Claude (200K) and OpenAI (128K/16K/1M) provider models
  • auto_budget config field with ZEPH_MEMORY_AUTO_BUDGET env override for automatic context budget from model metadata
  • inject_summaries() in Agent: injects SQLite conversation summaries into context (newest-first, budget-aware, with deduplication)
  • Wire zeph-index Code RAG pipeline into agent loop (feature-gated index): CodeRetriever integration, inject_code_rag() in prepare_context(), repo map in system prompt, background project indexing on startup
  • IndexConfig with [index] TOML section and ZEPH_INDEX_* env overrides (enabled, max_chunks, score_threshold, budget_ratio, repo_map_tokens)
  • Two-tier context pruning strategy for granular token reclamation before full LLM compaction
    • Tier 1: selective ToolOutput part pruning with compacted_at timestamp on pruned parts
    • Tier 2: LLM-based compaction fallback when tier 1 is insufficient
    • prune_protect_tokens config field for token-based protection zone (shields recent context from pruning)
    • tool_output_prunes metric tracking tier 1 pruning operations
    • compacted_at field on MessagePart::ToolOutput for pruning audit trail
  • MessagePart enum (Text, ToolOutput, Recall, CodeContext, Summary) for typed message content with independent lifecycle
  • Message::from_parts() constructor with to_llm_content() flattening for LLM provider consumption
  • Message::from_legacy() backward-compatible constructor for simple text messages
  • SQLite migration 006: parts column for structured message storage (JSON-serialized)
  • save_message_with_parts() in SqliteStore for persisting typed message parts
  • inject_semantic_recall, inject_code_context, inject_summaries now create typed MessagePart variants

Changed

  • index feature enabled by default (Code RAG pipeline active out of the box)
  • Agent error handler shows specific error context instead of generic message
  • TUI inline code rendered as blue with dark background glow instead of bright yellow
  • TUI header uses deep blue background (Rgb(20, 40, 80)) for improved contrast
  • System prompt includes explicit bash block example and bans invented formats (tool_code, tool_call) for small model compatibility
  • TUI Resources panel: replaced separate Prompt/Completion/Total with Context (current) and Session (cumulative) metrics
  • Summarization trigger uses unsummarized message count instead of total, avoiding repeated no-op checks
  • Empty AgentEvent::Status clears TUI spinner instead of showing blank throbber
  • Status label cleared after summarization and compaction complete
  • Default summarization_threshold: 100 → 50 messages
  • Default compaction_threshold: 0.75 → 0.80
  • Default compaction_preserve_tail: 4 → 6 messages
  • Default semantic.enabled: false → true
  • Default summarize_output: false → true
  • Default context_budget_tokens: 0 (auto-detect from model)

Fixed

  • TUI chat line wrapping no longer eats 2 characters on word wrap (accent prefix width accounted for)
  • TUI activity indicator moved to dedicated layout row (no longer overlaps content)
  • Memory history loading now retrieves most recent messages instead of oldest
  • Persisted tool output format includes tool name ([tool output: bash]) for proper display on restore
  • summarize_output serde deserialization used #[serde(default)] yielding false instead of config default true

0.9.3 - 2026-02-12

Added

  • New zeph-index crate: AST-based code indexing and semantic retrieval pipeline
    • Language detection and grammar registry with feature-gated tree-sitter grammars (Rust, Python, JavaScript, TypeScript, Go, Bash, TOML, JSON, Markdown)
    • AST-based chunker with cAST-inspired greedy sibling merge and recursive decomposition (target 600 non-ws chars per chunk)
    • Contextualized embedding text generation for improved retrieval quality
    • Dual-write storage layer (Qdrant vector search + SQLite metadata) with INT8 scalar quantization
    • Incremental indexer with .gitignore-aware file walking and content-hash change detection
    • Hybrid retriever with query classification (Semantic/Grep/Hybrid) and budget-aware result packing
    • Lightweight repo map generation (tree-sitter signature extraction, budget-constrained output)
  • code_context slot in BudgetAllocation for code RAG injection into agent context
  • inject_code_context() method in Agent for transient code chunk injection before semantic recall

0.9.2 - 2026-02-12

Added

  • Runtime context compaction for long sessions: automatic LLM-based summarization of middle messages when context usage exceeds configurable threshold (default 75%)
  • with_context_budget() builder method on Agent for wiring context budget and compaction settings
  • Config fields: compaction_threshold (f32), compaction_preserve_tail (usize) with env var overrides
  • context_compactions counter in MetricsSnapshot for observability
  • Context budget integration: ContextBudget::allocate() wired into agent loop via prepare_context() orchestrator
  • Semantic recall injection: SemanticMemory::recall() results injected as transient system messages with token budget control
  • Message history trimming: oldest non-system messages evicted when history exceeds budget allocation
  • Environment context injection: working directory, OS, git branch, and model name in system prompt via <environment> block
  • Extended BASE_PROMPT with structured Tool Use, Guidelines, and Security sections
  • Tool output truncation: head+tail split at 30K chars with UTF-8 safe boundaries
  • Smart tool output summarization: optional LLM-based summarization for outputs exceeding 30K chars, with fallback to truncation on failure (disabled by default via summarize_output config)
  • Progressive skill loading: matched skills get full body, remaining shown as description-only catalog via <other_skills>
  • ZEPH.md project config discovery: walk up directory tree, inject into system prompt as <project_context>

0.9.1 - 2026-02-12

Added

  • Mouse scroll support for TUI chat widget (scroll up/down via mouse wheel)
  • Splash screen with colored block-letter “ZEPH” banner on TUI startup
  • Conversation history loading into chat on TUI startup
  • Model thinking block rendering (<think> tags from Ollama DeepSeek/Qwen models) in distinct darker style
  • Markdown rendering for all chat messages via pulldown-cmark: bold, italic, strikethrough, headings, code blocks, inline code, lists, blockquotes, horizontal rules
  • Scrollbar track with proportional thumb indicator in chat widget

Fixed

  • Chat messages no longer overflow below the viewport when lines wrap
  • Scroll no longer sticks at top after over-scrolling past content boundary

0.9.0 - 2026-02-12

Added

  • ratatui-based TUI dashboard with real-time agent metrics (feature-gated tui, opt-in)
  • TuiChannel as new Channel implementation with bottom-up chat feed, input line, and status bar
  • MetricsSnapshot and MetricsCollector in zeph-core via tokio::sync::watch for live metrics transport
  • with_metrics() builder on Agent with instrumentation at 8 collection points: api_calls, latency, prompt/completion tokens, active skills, sqlite message count, qdrant status, summarization count
  • Side panel widgets (skills, memory, resources) with live data from agent loop
  • Confirmation modal dialog for destructive command approval in TUI (Y/Enter confirms, N/Escape cancels)
  • Scroll indicators (▲/▼) in chat widget when content overflows viewport
  • Responsive layout: side panels hidden on terminals narrower than 80 columns
  • Multiline input via Shift+Enter in TUI insert mode
  • Bottom-up chat layout with proper newline handling and per-message visual separation
  • Panic hook for terminal state restoration on any panic during TUI execution
  • Unicode-safe char-index cursor tracking for multi-byte input in TUI
  • --config <path> CLI argument and ZEPH_CONFIG env var to override default config path
  • OpenAI-compatible LLM provider with chat, streaming, and embeddings support
  • Feature-gated openai feature (enabled by default)
  • Support for OpenAI, Together AI, Groq, Fireworks, and any OpenAI-compatible API via configurable base_url
  • reasoning_effort parameter for OpenAI reasoning models (low/medium/high)
  • /mcp add <id> <command> [args...] for dynamic stdio MCP server connection at runtime
  • /mcp add <id> <url> for HTTP transport (remote MCP servers in Docker/cloud)
  • /mcp list command to show connected servers and tool counts
  • /mcp remove <id> command to disconnect MCP servers
  • McpTransport enum: Stdio (child process) and Http (Streamable HTTP) transports
  • HTTP MCP server config via url field in [[mcp.servers]]
  • mcp.allowed_commands config for command allowlist (security hardening)
  • mcp.max_dynamic_servers config to limit concurrent dynamic servers (default 10)
  • Qdrant registry sync after dynamic MCP add/remove for semantic tool matching

Changed

  • Docker images now include Node.js, npm, and Python 3 for MCP server runtime
  • ServerEntry uses McpTransport enum instead of flat command/args/env fields

Fixed

  • Effective embedding model resolution: Qdrant subsystems now use the correct provider-specific embedding model name when provider is openai or orchestrator routes to OpenAI
  • Skill watcher no longer loops in Docker containers (overlayfs phantom events)

0.8.2 - 2026-02-10

Changed

  • Enable all non-platform features by default: orchestrator, self-learning, mcp, vault-age, candle
  • Features metal and cuda remain opt-in (platform-specific GPU accelerators)
  • CI clippy uses default features instead of explicit feature list
  • Docker images now include skill runtime dependencies: curl, wget, git, jq, file, findutils, procps-ng

0.8.1 - 2026-02-10

Added

  • Shell sandbox: configurable allowed_paths directory allowlist and allow_network toggle blocking curl/wget/nc in ShellExecutor (Issue #91)
  • Sandbox validation before every shell command execution with path canonicalization
  • tools.shell.allowed_paths config (empty = working directory only) with ZEPH_TOOLS_SHELL_ALLOWED_PATHS env override
  • tools.shell.allow_network config (default: true) with ZEPH_TOOLS_SHELL_ALLOW_NETWORK env override
  • Interactive confirmation for destructive commands (rm, git push -f, DROP TABLE, etc.) with CLI y/N prompt and Telegram inline keyboard (Issue #92)
  • tools.shell.confirm_patterns config with default destructive command patterns
  • Channel::confirm() trait method with default auto-confirm for headless/test scenarios
  • ToolError::ConfirmationRequired and ToolError::SandboxViolation variants
  • execute_confirmed() method on ToolExecutor for confirmation bypass after user approval
  • A2A TLS enforcement: reject HTTP endpoints when a2a.require_tls = true (Issue #92)
  • A2A SSRF protection: block private IP ranges (RFC 1918, loopback, link-local) with DNS resolution (Issue #92)
  • Configurable A2A server payload size limit via a2a.max_body_size (default: 1 MiB)
  • Structured JSON audit logging for all tool executions with stdout or file destination (Issue #93)
  • AuditLogger with AuditEntry (timestamp, tool, command, result, duration) and AuditResult enum
  • [tools.audit] config section with ZEPH_TOOLS_AUDIT_ENABLED and ZEPH_TOOLS_AUDIT_DESTINATION env overrides
  • Secret redaction in LLM responses: detect API keys, tokens, passwords, private keys and replace with [REDACTED] (Issue #93)
  • Whitespace-preserving redact_secrets() scanner with zero-allocation fast path via Cow<str>
  • [security] config section with redact_secrets toggle (default: true)
  • Configurable timeout policies for LLM, embedding, and A2A operations (Issue #93)
  • [timeouts] config section with llm_seconds, embedding_seconds, a2a_seconds
  • LLM calls wrapped with tokio::time::timeout in agent loop

0.8.0 - 2026-02-10

Added

  • VaultProvider trait with pluggable secret backends, Secret newtype with redacted debug output, EnvVaultProvider for environment variable secrets (Issue #70)
  • AgeVaultProvider: age-encrypted JSON vault backend with x25519 identity key decryption (Issue #70)
  • Config::resolve_secrets(): async secret resolution through vault provider for API keys and tokens
  • CLI vault args: --vault <backend>, --vault-key <path>, --vault-path <path>
  • vault-age feature flag on zeph-core and root binary
  • [vault] config section with backend field (default: env)
  • docker-compose.vault.yml overlay for containerized age vault deployment
  • CARGO_FEATURES build arg in Dockerfile.dev for optional feature flags
  • CandleProvider: local GGUF model inference via candle ML framework with chat templates (Llama3, ChatML, Mistral, Phi3, Raw), token generation with top-k/top-p sampling, and repeat penalty (Issue #125)
  • CandleProvider embeddings: BERT-based embedding model loaded from HuggingFace Hub with mean pooling and L2 normalization (Issue #126)
  • ModelOrchestrator: task-aware multi-model routing with keyword-based classification (coding, creative, analysis, translation, summarization, general) and provider fallback chains (Issue #127)
  • SubProvider enum breaking recursive type cycle between AnyProvider and ModelOrchestrator
  • Device auto-detection: Metal on macOS, CUDA on Linux with GPU, CPU fallback (Issue #128)
  • Feature flags: candle, metal, cuda, orchestrator on workspace and zeph-llm crate
  • CandleConfig, GenerationParams, OrchestratorConfig in zeph-core config
  • Config examples for candle and orchestrator in config/default.toml
  • Setup guide sections for candle local inference and model orchestrator
  • 15 new unit tests for orchestrator, chat templates, generation config, and loader
  • Progressive skill loading: lazy body loading via OnceLock, on-demand resource resolution for scripts/, references/, assets/ directories, extended frontmatter (compatibility, license, metadata, allowed-tools), skill name validation per agentskills.io spec (Issue #115)
  • SkillMeta/Skill composition pattern: metadata loaded at startup, body deferred until skill activation
  • SkillRegistry replaces Vec<Skill> in Agent — lazy body access via get_skill()/get_body()
  • resource.rs module: discover_resources() + load_resource() with path traversal protection via canonicalization
  • Self-learning skill evolution system: automatic skill improvement through failure detection, self-reflection retry, and LLM-generated version updates (Issue #107)
  • SkillOutcome enum and SkillMetrics for skill execution outcome tracking (Issue #108)
  • Agent self-reflection retry on tool failure with 1-retry-per-message budget (Issue #109)
  • Skill version generation and storage in SQLite with auto-activate and manual approval modes (Issue #110)
  • Automatic rollback when skill version success rate drops below threshold (Issue #111)
  • /skill stats, /skill versions, /skill activate, /skill approve, /skill reset commands for version management (Issue #111)
  • /feedback command for explicit user feedback on skill quality (Issue #112)
  • LearningConfig with TOML config section [skills.learning] and env var overrides
  • self-learning feature flag on zeph-skills, zeph-core, and root binary
  • SQLite migration 005: skill_versions and skill_outcomes tables
  • Bundled setup-guide skill with configuration reference for all env vars, TOML keys, and operating modes
  • Bundled skill-audit skill for spec compliance and security review of installed skills
  • allowed_commands shell config to override default blocklist entries via ZEPH_TOOLS_SHELL_ALLOWED_COMMANDS
  • QdrantSkillMatcher: persistent skill embeddings in Qdrant with BLAKE3 content-hash delta sync (Issue #104)
  • SkillMatcherBackend enum dispatching between InMemory and Qdrant skill matching (Issue #105)
  • qdrant feature flag on zeph-skills crate gating all Qdrant dependencies
  • Graceful fallback to in-memory matcher when Qdrant is unavailable
  • Skill matching tracing via tracing::debug! for diagnostics
  • New zeph-mcp crate: MCP client via rmcp 0.14 with stdio transport (Issue #117)
  • McpClient and McpManager for multi-server lifecycle management with concurrent connections
  • McpToolExecutor implementing ToolExecutor for ```mcp block execution (Issue #120)
  • McpToolRegistry: MCP tool embeddings in Qdrant zeph_mcp_tools collection with BLAKE3 delta sync (Issue #118)
  • Unified matching: skills + MCP tools injected into system prompt by relevance (Issue #119)
  • mcp feature flag on root binary and zeph-core gating all MCP functionality
  • Bundled mcp-generate skill with instructions for MCP-to-skill generation via mcp-execution (Issue #121)
  • [[mcp.servers]] TOML config section for MCP server connections

Changed

  • Skill struct refactored: split into SkillMeta (lightweight metadata) + Skill (meta + body), composition pattern
  • SkillRegistry now uses OnceLock<String> for lazy body caching instead of eager loading
  • Matcher APIs accept &[&SkillMeta] instead of &[Skill] — embeddings use description only
  • Agent stores SkillRegistry directly instead of Vec<Skill>
  • Agent field matcher type changed from Option<SkillMatcher> to Option<SkillMatcherBackend>
  • Skill matcher creation extracted to create_skill_matcher() in main.rs

Dependencies

  • Added age 0.11.2 to workspace (optional, behind vault-age feature, default-features = false)
  • Added candle-core 0.9, candle-nn 0.9, candle-transformers 0.9 to workspace (optional, behind candle feature)
  • Added hf-hub 0.4 to workspace (HuggingFace model downloads with rustls-tls)
  • Added tokenizers 0.22 to workspace (BPE tokenization with fancy-regex)
  • Added blake3 1.8 to workspace
  • Added rmcp 0.14 to workspace (MCP protocol SDK)

0.7.1 - 2026-02-09

Added

  • WebScrapeExecutor: safe HTML scraping via scrape-core with CSS selectors, SSRF protection, and HTTPS-only enforcement (Issue #57)
  • CompositeExecutor<A, B>: generic executor chaining with first-match-wins dispatch
  • Bundled web-scrape skill with CSS selector examples for structured data extraction
  • extract_fenced_blocks() shared utility for fenced code block parsing (DRY refactor)
  • [tools.scrape] config section with timeout and max body size settings

Changed

  • Agent tool output label from [shell output] to [tool output]
  • ShellExecutor block extraction now uses shared extract_fenced_blocks()

0.7.0 - 2026-02-08

Added

  • A2A Server: axum-based HTTP server with JSON-RPC 2.0 routing for message/send, tasks/get, tasks/cancel (Issue #83)
  • In-memory TaskManager with full task lifecycle: create, get, update status, add artifacts, append history, cancel (Issue #83)
  • SSE streaming endpoint (/a2a/stream) with JSON-RPC response envelope wrapping per A2A spec (Issue #84)
  • Bearer token authentication middleware with constant-time comparison via subtle::ConstantTimeEq (Issue #85)
  • Per-IP rate limiting middleware with configurable 60-second sliding window (Issue #85)
  • Request body size limit (1 MiB) via tower-http::limit::RequestBodyLimitLayer (Issue #85)
  • A2aServerConfig with env var overrides: ZEPH_A2A_ENABLED, ZEPH_A2A_HOST, ZEPH_A2A_PORT, ZEPH_A2A_PUBLIC_URL, ZEPH_A2A_AUTH_TOKEN, ZEPH_A2A_RATE_LIMIT
  • Agent card served at /.well-known/agent.json (public, no auth required)
  • Graceful shutdown integration via tokio watch channel
  • Server module gated behind server feature flag on zeph-a2a crate

Changed

  • Part type refactored from flat struct to tagged enum with kind discriminator (text, file, data) per A2A spec
  • TaskState::Pending renamed to TaskState::Submitted with explicit per-variant #[serde(rename)] for kebab-case wire format
  • Added AuthRequired and Unknown variants to TaskState
  • TaskStatusUpdateEvent and TaskArtifactUpdateEvent gained kind field (status-update, artifact-update)

0.6.0 - 2026-02-08

Added

  • New zeph-a2a crate: A2A protocol implementation for agent-to-agent communication (Issue #78)
  • A2A protocol types: Task, TaskState, TaskStatus, Message, Part, Artifact, AgentCard, AgentSkill, AgentCapabilities with full serde camelCase serialization (Issue #79)
  • JSON-RPC 2.0 envelope types (JsonRpcRequest, JsonRpcResponse, JsonRpcError) with method constants for A2A operations (Issue #79)
  • AgentCardBuilder for constructing A2A agent cards from runtime config and skills (Issue #79)
  • AgentRegistry with well-known URI discovery (/.well-known/agent.json), TTL-based caching, and manual registration (Issue #80)
  • A2aClient with send_message, stream_message (SSE), get_task, cancel_task via JSON-RPC 2.0 (Issue #81)
  • Bearer token authentication support for all A2A client operations (Issue #81)
  • SSE streaming via eventsource-stream with TaskEvent enum (StatusUpdate, ArtifactUpdate) (Issue #81)
  • A2aError enum with variants for HTTP, JSON, JSON-RPC, discovery, and stream errors (Issue #79)
  • Optional a2a feature flag (enabled by default) to gate A2A functionality
  • 42 new unit tests for protocol types, JSON-RPC envelopes, agent card builder, discovery registry, and client operations

0.5.0 - 2026-02-08

Added

  • Embedding-based skill matcher: SkillMatcher with cosine similarity selects top-K relevant skills per query instead of injecting all skills into the system prompt (Issue #75)
  • max_active_skills config field (default: 5) with ZEPH_SKILLS_MAX_ACTIVE env var override
  • Skill hot-reload: filesystem watcher via notify-debouncer-mini detects SKILL.md changes and re-embeds without restart (Issue #76)
  • Skill priority: earlier paths in skills.paths take precedence when skills share the same name (Issue #76)
  • SkillRegistry::reload() and SkillRegistry::into_skills() methods
  • SQLite skill_usage table tracking per-skill invocation counts and last-used timestamps (Issue #77)
  • /skills command displaying available skills with usage statistics
  • Three new bundled skills: git, docker, api-request (Issue #77)
  • 17 new unit tests for matcher, registry priority, reload, and usage tracking

Changed

  • Agent::new() signature: accepts Vec<Skill>, Option<SkillMatcher>, max_active_skills instead of pre-formatted skills prompt string
  • format_skills_prompt now generic over Borrow<Skill> to accept both &[Skill] and &[&Skill]
  • Skill struct derives Clone
  • Agent generic constraint: P: LlmProvider + Clone + 'static (required for embed_fn closures)
  • System prompt rebuilt dynamically per user query with only matched skills

Dependencies

  • Added notify 8.0, notify-debouncer-mini 0.6
  • zeph-core now depends on zeph-skills
  • zeph-skills now depends on tokio (sync, rt) and notify

0.4.3 - 2026-02-08

Fixed

  • Telegram “Bad Request: text must be non-empty” error when LLM returns whitespace-only content. Added is_empty() guard after markdown_to_telegram conversion in both send() and send_or_edit() (Issue #73)

Added

  • Dockerfile.dev: multi-stage build from source with cargo registry/build cache layers for fast rebuilds
  • docker-compose.dev.yml: full dev stack (Qdrant + Zeph) with debug tracing (RUST_LOG, RUST_BACKTRACE=1), uses host Ollama via host.docker.internal
  • docker-compose.deps.yml: Qdrant-only compose for native zeph execution on macOS

0.4.2 - 2026-02-08

Fixed

  • Telegram MarkdownV2 parsing errors (Issue #69). Replaced manual character-by-character escaping with AST-based event-driven rendering using pulldown-cmark 0.13.0
  • UTF-8 safe text chunking for messages exceeding Telegram’s 4096-byte limit. Uses str::is_char_boundary() with newline preference to prevent splitting multi-byte characters (emoji, CJK)
  • Link URL over-escaping. Dedicated escape_url() method only escapes ) and \ per Telegram MarkdownV2 spec, fixing broken URLs like https://example\.com

Added

  • TelegramRenderer state machine for context-aware escaping: 19 special characters in text, only \ and ` in code blocks
  • Markdown formatting support: bold, italic, strikethrough, headers, code blocks, links, lists, blockquotes
  • Comprehensive benchmark suite with criterion: 7 scenario groups measuring latency (2.83µs for 500 chars) and throughput (121-970 MiB/s)
  • Memory profiling test to measure escaping overhead (3-20% depending on content)
  • 30 markdown unit tests covering formatting, escaping, edge cases, and UTF-8 chunking (99.32% line coverage)

Changed

  • crates/zeph-channels/src/markdown.rs: Complete rewrite with pulldown-cmark event-driven parser (449 lines)
  • crates/zeph-channels/src/telegram.rs: Removed has_unclosed_code_block() pre-flight check (no longer needed with AST parsing), integrated UTF-8 safe chunking
  • Dependencies: Added pulldown-cmark 0.13.0 (MIT) and criterion 0.8.0 (Apache-2.0/MIT) for benchmarking

0.4.1 - 2026-02-08

Fixed

  • Auto-create Qdrant collection on first use. Previously, the zeph_conversations collection had to be manually created using curl commands. Now, ensure_collection() is called automatically before all Qdrant operations (remember, recall, summarize) to initialize the collection with correct vector dimensions (896 for qwen3-embedding) and Cosine distance metric on first access, similar to SQL migrations.

Changed

  • Docker Compose: Added environment variables for semantic memory configuration (ZEPH_MEMORY_SEMANTIC_ENABLED, ZEPH_MEMORY_SEMANTIC_RECALL_LIMIT) and Qdrant URL override (ZEPH_QDRANT_URL) to enable full semantic memory stack via .env file

0.4.0 - 2026-02-08

Added

M9 Phase 3: Conversation Summarization and Context Budget (Issue #62)

  • New SemanticMemory::summarize() method for LLM-based conversation compression
  • Automatic summarization triggered when message count exceeds threshold
  • SQLite migration 003_summaries.sql creates dedicated summaries table with CASCADE constraints
  • SqliteStore::save_summary() stores summary with metadata (first/last message IDs, token estimate)
  • SqliteStore::load_summaries() retrieves all summaries for a conversation ordered by ID
  • SqliteStore::load_messages_range() fetches messages after specific ID with limit for batch processing
  • SqliteStore::count_messages() counts total messages in conversation
  • SqliteStore::latest_summary_last_message_id() gets last summarized message ID for resumption
  • ContextBudget struct for proportional token allocation (15% summaries, 25% semantic recall, 60% recent history)
  • estimate_tokens() helper using chars/4 heuristic (100x faster than tiktoken, ±25% accuracy)
  • Agent::check_summarization() lazy trigger after persist_message() when threshold exceeded
  • Batch size = threshold/2 to balance summary quality with LLM call frequency
  • Configuration: memory.summarization_threshold (default: 100), memory.context_budget_tokens (default: 0 = unlimited)
  • Environment overrides: ZEPH_MEMORY_SUMMARIZATION_THRESHOLD, ZEPH_MEMORY_CONTEXT_BUDGET_TOKENS
  • Inline comments in config/default.toml documenting all configuration parameters
  • 26 new unit tests for summarization and context budget (196 total tests, 75.31% coverage)
  • Architecture Decision Records ADR-016 through ADR-019 for summarization design
  • Foreign key constraint added to messages.conversation_id with ON DELETE CASCADE

M9 Phase 2: Semantic Memory Integration (Issue #61)

  • SemanticMemory<P: LlmProvider> orchestrator coordinating SQLite, Qdrant, and LlmProvider
  • SemanticMemory::remember() saves message to SQLite, generates embedding, stores in Qdrant
  • SemanticMemory::recall() performs semantic search with query embedding and fetches messages from SQLite
  • SemanticMemory::has_embedding() checks if message already embedded to prevent duplicates
  • SemanticMemory::embed_missing() background task to embed old messages (with LIMIT parameter)
  • Agent<P, C, T> now generic over LlmProvider to support SemanticMemory
  • Agent::with_memory() replaces SqliteStore with SemanticMemory
  • Graceful degradation: embedding failures logged but don’t block message save
  • Qdrant connection failures silently downgrade to SQLite-only mode (no semantic recall)
  • Generic provider pattern: SemanticMemory<P: LlmProvider> instead of Arc<dyn LlmProvider> for Edition 2024 async trait compatibility
  • AnyProvider, OllamaProvider, ClaudeProvider now derive/implement Clone for semantic memory integration
  • Integration test updated for SemanticMemory API (with_memory now takes 5 parameters including recall_limit)
  • Semantic memory config: memory.semantic.enabled, memory.semantic.recall_limit (default: 5)
  • 18 new tests for semantic memory orchestration (recall, remember, embed_missing, graceful degradation)

M9 Phase 1: Qdrant Integration (Issue #60)

  • New QdrantStore module in zeph-memory for vector storage and similarity search
  • QdrantStore::store() persists embeddings to Qdrant and tracks metadata in SQLite
  • QdrantStore::search() performs cosine similarity search with filtering by conversation_id and role
  • QdrantStore::has_embedding() checks if message has associated embedding
  • QdrantStore::ensure_collection() idempotently creates Qdrant collection with 768-dimensional vectors
  • SQLite migration 002_embeddings_metadata.sql for embedding metadata tracking
  • embeddings_metadata table with foreign key constraint to messages (ON DELETE CASCADE)
  • PRAGMA foreign_keys enabled in SqliteStore via SqliteConnectOptions
  • SearchFilter and SearchResult types for flexible query construction
  • MemoryConfig.qdrant_url field with ZEPH_QDRANT_URL environment variable override (default: http://localhost:6334)
  • Docker Compose Qdrant service (qdrant/qdrant:v1.13.6) on ports 6333/6334 with persistent storage
  • Integration tests for Qdrant operations (ignored by default, require running Qdrant instance)
  • Unit tests for SQLite metadata operations with 98% coverage
  • 12 new tests total (3 unit + 2 integration for QdrantStore, 1 CASCADE DELETE test for SqliteStore, 3 config tests)

M8: Embeddings support (Issue #54)

  • LlmProvider trait extended with embed(&str) -> Result<Vec<f32>> for generating text embeddings
  • LlmProvider trait extended with supports_embeddings() -> bool for capability detection
  • OllamaProvider implements embeddings via ollama-rs generate_embeddings() API
  • Default embedding model: qwen3-embedding (configurable via llm.embedding_model)
  • ZEPH_LLM_EMBEDDING_MODEL environment variable for runtime override
  • ClaudeProvider::embed() returns descriptive error (Claude API does not support embeddings)
  • AnyProvider delegates embedding methods to active provider
  • 10 new tests: unit tests for all providers, config tests for defaults/parsing/env override
  • Integration test for real Ollama embedding generation (ignored by default)
  • README documentation: model compatibility notes and ollama pull instructions for both LLM and embedding models
  • Docker Compose configuration: added ZEPH_LLM_EMBEDDING_MODEL environment variable

Changed

BREAKING CHANGES (pre-1.0.0):

  • SqliteStore::save_message() now returns Result<i64> instead of Result<()> to enable embedding workflow
  • SqliteStore::new() uses sqlx::migrate!() macro instead of INIT_SQL constant for proper migration management
  • QdrantStore::store() requires model: &str parameter for multi-model support
  • Config constant LLM_ENV_KEYS renamed to ENV_KEYS to reflect inclusion of non-LLM variables

Migration:

#![allow(unused)]
fn main() {
// Before:
let _ = store.save_message(conv_id, "user", "hello").await?;

// After:
let message_id = store.save_message(conv_id, "user", "hello").await?;
}
  • OllamaProvider::new() now accepts embedding_model parameter (breaking change, pre-v1.0)
  • Config schema: added llm.embedding_model field with serde default for backward compatibility

0.3.0 - 2026-02-07

Added

M7 Phase 1: Tool Execution Framework - zeph-tools crate (Issue #39)

  • New zeph-tools leaf crate for tool execution abstraction following ADR-014
  • ToolExecutor trait with native async (Edition 2024 RPITIT): accepts full LLM response, returns Option<ToolOutput>
  • ShellExecutor implementation with bash block parser and execution (30s timeout via tokio::time::timeout)
  • ToolOutput struct with summary string and blocks_executed count
  • ToolError enum with Blocked/Timeout/Execution variants (thiserror)
  • ToolsConfig and ShellConfig configuration types with serde Deserialize and sensible defaults
  • Workspace version consolidation: version.workspace = true across all crates
  • Workspace inter-crate dependency references: zeph-llm.workspace = true pattern for all internal dependencies
  • 22 unit tests with 99.25% line coverage, zero clippy warnings
  • ADR-014: zeph-tools crate design rationale and architecture decisions

M7 Phase 2: Command safety (Issue #40)

  • DEFAULT_BLOCKED patterns: 12 dangerous commands (rm -rf /, sudo, mkfs, dd if=, curl, wget, nc, ncat, netcat, shutdown, reboot, halt)
  • Case-insensitive command filtering via to_lowercase() normalization
  • Configurable timeout and blocked_commands in TOML via [tools.shell] section
  • Custom blocked commands additive to defaults (cannot weaken security)
  • 35+ comprehensive unit tests covering exact match, prefix match, multiline, case variations
  • ToolsConfig integration with core Config struct

M7 Phase 3: Agent integration (Issue #41)

  • Agent now uses ShellExecutor for all bash command execution with safety checks
  • SEC-001 CRITICAL vulnerability fixed: unfiltered bash execution removed from agent.rs
  • Removed 66 lines of duplicate code (extract_bash_blocks, execute_bash, extract_and_execute_bash)
  • ToolError::Blocked properly handled with user-facing error message
  • Four integration tests for blocked command behavior and error handling
  • Performance validation: < 1% overhead for tool executor abstraction
  • Security audit: all acceptance criteria met, zero vulnerabilities

Security

  • CRITICAL fix for SEC-001: Shell commands now filtered through ShellExecutor with DEFAULT_BLOCKED patterns (rm -rf /, sudo, mkfs, dd if=, curl, wget, nc, shutdown, reboot, halt). Resolves command injection vulnerability where agent.rs bypassed all security checks via inline bash execution.

Fixed

  • Shell command timeout now respects config.tools.shell.timeout (was hardcoded 30s in agent.rs)
  • Removed duplicate bash parsing logic from agent.rs (now centralized in zeph-tools)
  • Error message pattern leakage: blocked commands now show generic security policy message instead of leaking exact blocked pattern

Changed

BREAKING CHANGES (pre-1.0.0):

  • Agent::new() signature changed: now requires tool_executor: T as 4th parameter where T: ToolExecutor
  • Agent struct now generic over three types: Agent<P, C, T> (provider, channel, tool_executor)
  • Workspace Cargo.toml now defines version = "0.3.0" in [workspace.package] section
  • All crate manifests use version.workspace = true instead of explicit versions
  • Inter-crate dependencies now reference workspace definitions (e.g., zeph-llm.workspace = true)

Migration:

#![allow(unused)]
fn main() {
// Before:
let agent = Agent::new(provider, channel, &skills_prompt);

// After:
use zeph_tools::shell::ShellExecutor;
let executor = ShellExecutor::new(&config.tools.shell);
let agent = Agent::new(provider, channel, &skills_prompt, executor);
}

0.2.0 - 2026-02-06

Added

M6 Phase 1: Streaming trait extension (Issue #35)

  • LlmProvider::chat_stream() method returning Pin<Box<dyn Stream<Item = Result<String>> + Send>>
  • LlmProvider::supports_streaming() capability query method
  • Channel::send_chunk() method for incremental response delivery
  • Channel::flush_chunks() method for buffered chunk flushing
  • ChatStream type alias for Pin<Box<dyn Stream<Item = anyhow::Result<String>> + Send>>
  • Streaming infrastructure in zeph-llm and zeph-core (dependencies: futures-core 0.3, tokio-stream 0.1)

M6 Phase 2: Ollama streaming backend (Issue #36)

  • Native token-by-token streaming for OllamaProvider using ollama-rs streaming API
  • OllamaProvider::chat_stream() implementation via send_chat_messages_stream()
  • OllamaProvider::supports_streaming() now returns true
  • Stream mapping from Result<ChatMessageResponse, ()> to Result<String, anyhow::Error>
  • Integration tests for streaming happy path and equivalence with non-streaming chat() (ignored by default)
  • ollama-rs "stream" feature enabled in workspace dependencies

M6 Phase 3: Claude SSE streaming backend (Issue #37)

  • Native token-by-token streaming for ClaudeProvider using Anthropic Messages API with Server-Sent Events
  • ClaudeProvider::chat_stream() implementation via SSE event parsing
  • ClaudeProvider::supports_streaming() now returns true
  • SSE event parsing via eventsource-stream 0.2.3 library
  • Stream pipeline: bytes_stream() -> eventsource() -> filter_map(parse_sse_event) -> Box::pin()
  • Handles SSE events: content_block_delta (text extraction), error (mid-stream errors), metadata events (skipped)
  • Integration tests for streaming happy path and equivalence with non-streaming chat() (ignored by default)
  • eventsource-stream dependency added to workspace dependencies
  • reqwest "stream" feature enabled for bytes_stream() support

M6 Phase 4: Agent streaming integration (Issue #38)

  • Agent automatically uses streaming when provider.supports_streaming() returns true (ADR-014)
  • Agent::process_response_streaming() method for stream consumption and chunk accumulation
  • CliChannel immediate streaming: send_chunk() prints each chunk instantly via print!() + flush()
  • TelegramChannel batched streaming: debounce at 1 second OR 512 bytes, edit-in-place for progressive updates
  • Response buffer pre-allocation with String::with_capacity(2048) for performance
  • Error message sanitization: full errors logged via tracing::error!(), generic messages shown to users
  • Telegram edit retry logic: recovers from stale message_id (message deleted, permissions lost)
  • tokio-stream dependency added for StreamExt trait
  • 6 new unit tests for channel streaming behavior

Fixed

M6 Phase 3: Security improvements

  • Manual Debug implementation for ClaudeProvider to prevent API key leakage in debug output
  • Error message sanitization: full Claude API errors logged via tracing::error!(), generic messages returned to users

Changed

BREAKING CHANGES (pre-1.0.0):

  • LlmProvider trait now requires chat_stream() and supports_streaming() implementations (no default implementations per project policy)
  • Channel trait now requires send_chunk() and flush_chunks() implementations (no default implementations per project policy)
  • All existing providers (OllamaProvider, ClaudeProvider) updated with fallback implementations (Phase 1 non-streaming: calls chat() and wraps in single-item stream)
  • All existing channels (CliChannel, TelegramChannel) updated with no-op implementations (Phase 1: streaming not yet wired into agent loop)

0.1.0 - 2026-02-05

Added

M0: Workspace bootstrap

  • Cargo workspace with 5 crates: zeph-core, zeph-llm, zeph-skills, zeph-memory, zeph-channels
  • Binary entry point with version display
  • Default configuration file
  • Workspace-level dependency management and lints

M1: LLM + CLI agent loop

  • LlmProvider trait with Message/Role types
  • Ollama backend using ollama-rs
  • Config loading from TOML with env var overrides
  • Interactive CLI agent loop with multi-turn conversation

M2: Skills system

  • SKILL.md parser with YAML frontmatter and markdown body (zeph-skills)
  • Skill registry that scans directories for */SKILL.md files
  • Prompt formatter with XML-like skill injection into system prompt
  • Bundled skills: web-search, file-ops, system-info
  • Shell execution: agent extracts bash blocks from LLM responses and runs them
  • Multi-step execution loop with 3-iteration limit
  • 30-second timeout on shell commands
  • Context builder that combines base system prompt with skill instructions

M3: Memory + Claude

  • SQLite conversation persistence with sqlx (zeph-memory)
  • Conversation history loading and message saving per session
  • Claude backend via Anthropic Messages API with 429 retry (zeph-llm)
  • AnyProvider enum dispatch for runtime provider selection
  • CloudLlmConfig for Claude-specific settings (model, max_tokens)
  • ZEPH_CLAUDE_API_KEY env var for API authentication
  • ZEPH_SQLITE_PATH env var override for database location
  • Provider factory in main.rs selecting Ollama or Claude from config
  • Memory integration into Agent with optional SqliteStore

M4: Telegram channel

  • Channel trait abstraction for agent I/O (recv, send, send_typing)
  • CliChannel implementation reading stdin/stdout via tokio::task::spawn_blocking
  • TelegramChannel adapter using teloxide with mpsc-based message routing
  • Telegram user whitelist via telegram.allowed_users config
  • ZEPH_TELEGRAM_TOKEN env var for Telegram bot activation
  • Bot commands: /start (welcome), /reset, /skills forwarded as ChannelMessage
  • AnyChannel enum dispatch for runtime channel selection
  • zeph-channels crate with teloxide 0.17 dependency
  • TelegramConfig in config.rs with TOML and env var support

M5: Integration tests + release

  • Integration test suite: config, skills, memory, and agent end-to-end
  • MockProvider and MockChannel for agent testing without external dependencies
  • Graceful shutdown via tokio::sync::watch + tokio::signal (SIGINT/SIGTERM)
  • Ollama startup health check (warn-only, non-blocking)
  • README with installation, configuration, usage, and skills documentation
  • GitHub Actions CI/CD: lint, clippy, test (ubuntu + macos), coverage, security, release
  • Dependabot for Cargo and GitHub Actions with auto-merge for patch/minor updates
  • Auto-labeler workflow for PRs by path, title prefix, and size
  • Release workflow with cross-platform binary builds and checksums
  • Issue templates (bug report, feature request)
  • PR template with review checklist
  • LICENSE (MIT), CONTRIBUTING.md, SECURITY.md

Fixed

  • Replace vulnerable serde_yml/libyml with manual frontmatter parser (GHSA high + medium)

Changed

  • Move dependency features from workspace root to individual crate manifests

  • Update README with badges, architecture overview, and pre-built binaries section

  • Agent is now generic over both LlmProvider and Channel (Agent<P, C>)

  • Agent::new() accepts a Channel parameter instead of reading stdin directly

  • Agent::run() uses channel.recv()/send() instead of direct I/O

  • Agent calls channel.send_typing() before each LLM request

  • Agent::run() uses tokio::select! to race channel messages against shutdown signal