# Changelog

All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog.

## [Unreleased]

## [0.18.5] - 2026-04-07
### Added

- Per-provider cost breakdown — `CostTracker` now accumulates per-provider token counts (input, cache read/write, output) and cost. The `/status` CLI command and TUI `/cost` view render a per-provider table sorted by cost. See Observability & Cost.
- ASI coherence tracking — per-provider sliding window of response embeddings penalizes Thompson/EMA routing when coherence drops. Enabled via `[llm.routing.asi]`. See Adaptive Inference.
- Unified quality gate — optional post-selection embedding similarity check via `[llm.routing] quality_gate`. See Adaptive Inference.
- Time-based microcompact — stale low-value tool outputs are cleared after an idle gap, at zero LLM cost. Configurable via `[memory.microcompact]`. See Memory & Context.
- autoDream background consolidation — post-session memory consolidation sweep behind a session-count and time gate. Configurable via `[memory.autodream]`. See Memory & Context.
- MagicDocs auto-maintained markdown — files with a `# MAGIC DOC:` header are rewritten after tool-call turns by a background LLM task. Configurable via `[magic_docs]`. See Memory & Context.
- Key facts semantic dedup — near-duplicate `key_facts` are silently skipped before Qdrant insertion. Configurable via `memory.key_facts_dedup_threshold`. See Memory & Context.
- MCP error codes — `McpErrorCode` enum with `is_retryable()` for caller-side retry classification. See Tool System.
- Caller identity propagation — `ToolCall.caller_id` and `AuditEntry.policy_match` are now populated from the channel layer. See Tool System.
- Per-session tool call quota — `tools.max_tool_calls_per_session` limits tool executions per session. See Tool System.
- OAP authorization config — `[tools.authorization]` TOML section merges rules into `PolicyEnforcer` at startup. See Policy Enforcer.
- Scheduler CLI subcommand — `zeph schedule list/add/remove/show` for managing jobs outside an agent session. See Scheduler.
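Several of the new toggles above are config-driven. A hypothetical fragment enabling them could look like the following; the section names and the keys called out above come from the release notes, while every `enabled` flag and value shown is an illustrative guess:

```toml
[llm.routing]
quality_gate = true                # value type assumed

[llm.routing.asi]
enabled = true                     # hypothetical key

[memory]
key_facts_dedup_threshold = 0.9    # hypothetical value

[memory.microcompact]
enabled = true                     # hypothetical key

[memory.autodream]
enabled = true                     # hypothetical key

[magic_docs]
enabled = true                     # hypothetical key

[tools]
max_tool_calls_per_session = 200   # hypothetical value
```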
### Fixed

- `spawn_asi_update` debounce — exactly one embed call per agent turn instead of N concurrent sub-calls.
- `MagicDocs` scan now detects `ToolOutput` in `Role::User` messages and `ToolResult` in the native execution path.
- Filter policy-decision language from `key_facts` at store time — transient enforcement facts ("blocked", "permission denied", etc.) are no longer embedded.
- Compaction failure is now surfaced as a user-visible message instead of a silent `tracing::warn`.
- MCP manager shuts down explicitly before runtime exit, killing stdio child processes cleanly.
- BPE tokenizer data is cached in a `OnceLock`, eliminating repeated disk loads on `TokenCounter` construction.
- Unbounded RSS growth in TUI mode: cancel bridge tasks, message buffer, URL set, and scheduler tick storm all addressed.
## [0.17.1] - 2026-03-27

### Added

- Tool error taxonomy — `ToolErrorCategory` classifies tool failures into 11 categories driving retry, parameter-reformat, and reputation-scoring decisions. `ToolErrorFeedback::format_for_llm()` replaces opaque error strings with structured `[tool_error]` blocks. `ToolError::Shell` carries an explicit category and exit code. See Tool System.
- MCP per-server trust levels — `[[mcp.servers]]` entries accept `trust_level` (trusted/untrusted/sandboxed) and `tool_allowlist`. Sandboxed servers expose only explicitly listed tools (fail-closed). Untrusted servers with no allowlist emit a startup warning. See MCP Integration.
- Candle-backed classifiers — `CandleClassifier` runs `protectai/deberta-v3-small-prompt-injection-v2` for injection detection. `CandlePiiClassifier` runs `iiiorg/piiranha-v1-detect-personal-information` (NER) for PII detection; results are merged with the regex filter. Configured via the new `[classifiers]` section. Requires the `classifiers` feature. See Local Inference.
- SYNAPSE hybrid seed selection — SYNAPSE spreading activation now ranks seed entities by `hybrid_score = fts_score * (1 - seed_structural_weight) + structural_score * seed_structural_weight`. New config fields: `seed_structural_weight` (default: 0.4) and `seed_community_cap` (default: 3).
- A-MEM link weight evolution — edges accumulate `retrieval_count`; composite scoring uses `evolved_weight(count, confidence) = confidence * (1 + 0.2 * ln(1 + count)).min(1.0)`. A background decay task reduces counts over time via `link_weight_decay_lambda` and `link_weight_decay_interval_secs`.
- Topology-aware orchestration — `TopologyClassifier` classifies DAG structure (AllParallel, LinearChain, FanOut, FanIn, Hierarchical, Mixed) and selects a dispatch strategy (FullParallel, Sequential, LevelBarrier, Adaptive). `LevelBarrier` dispatch fires tasks level-by-level for hierarchical plans. Enable with `topology_selection = true` (requires the `experiments` feature).
- Per-task `execution_mode` — the planner annotates tasks with `parallel` (default) or `sequential` to hint the scheduler. Missing fields in stored graphs default to `parallel` for backward compatibility.
- `PlanVerifier` completeness checking — post-task LLM verification produces a structured `VerificationResult` with gap severity levels (critical/important/minor). `replan()` injects new `TaskNode`s for actionable gaps. All failures are fail-open. Configure via `verify_provider`. See Task Orchestration.
- rmcp 1.3 — updated from rmcp 1.2.
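The two scoring formulas above translate directly into code. A minimal sketch, using the names from the release notes and reading the trailing `.min(1.0)` as capping the whole product (the cap would be a no-op if applied to the boost factor alone); everything else is illustrative:

```rust
/// SYNAPSE seed ranking: blend FTS relevance with graph-structural score.
/// `seed_structural_weight` defaults to 0.4 per the release notes.
fn hybrid_score(fts_score: f64, structural_score: f64, seed_structural_weight: f64) -> f64 {
    fts_score * (1.0 - seed_structural_weight) + structural_score * seed_structural_weight
}

/// A-MEM link weight evolution: confidence boosted logarithmically by
/// retrieval count, with the product capped at 1.0.
fn evolved_weight(count: u64, confidence: f64) -> f64 {
    (confidence * (1.0 + 0.2 * (1.0 + count as f64).ln())).min(1.0)
}

fn main() {
    // A never-retrieved edge keeps its raw confidence (ln(1) = 0).
    assert!((evolved_weight(0, 0.5) - 0.5).abs() < 1e-9);
    // Retrievals increase the effective weight, but never past 1.0.
    assert!(evolved_weight(10, 0.5) > 0.5);
    assert!(evolved_weight(1_000_000, 0.9) <= 1.0);
    // With structural weight 0, the blend is pure FTS score.
    assert!((hybrid_score(0.8, 0.3, 0.0) - 0.8).abs() < 1e-9);
}
```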
## [0.15.3] - 2026-03-17

### Fixed

- ACP config fallback (#1945) — `resolve_config_path()` now falls back to `~/.config/zeph/config.toml` when `config/default.toml` is absent relative to the CWD; resolves an ACP stdio/HTTP startup failure when launched from an IDE workspace directory.
- TUI filter metrics zero (#1939) — filter metrics (`filter_raw_tokens`, `filter_saved_tokens`, `filter_applications`) no longer show zero in the TUI dashboard during native tool execution. Extracted a `record_filter_metrics` helper and called it from all four metric-recording sites.
- Graph metrics initialization (#1938) — the TUI graph metrics panel now shows correct entity/edge/community counts on startup. `App::with_metrics_rx()` eagerly reads the initial snapshot; graph extraction now awaits the background task and re-reads counts.
- TUI tool start events (#1931) — native tool calls now emit `ToolStart` events so the TUI shows a spinner and `$ command` header before tool output arrives.
- Graph metrics per-turn update (#1932) — graph memory metrics (entities/edges/communities) now update every turn via a per-turn `sync_graph_counts()` call.
### Added

- OAuth 2.1 PKCE for MCP (#1930) — `McpTransport::OAuth` variant with `url`, `scopes`, `callback_port`, `client_name`. `McpManager::with_oauth_credential_store()` for credential persistence via `VaultCredentialStore`. Two-phase `connect_all()`: stdio/HTTP concurrently, OAuth sequentially. SSRF validation on all OAuth metadata endpoints.
- Background code indexing progress (#1923) — `IndexProgress` struct with `files_done`, `files_total`, `chunks_created`. The CLI prints progress to stderr; the TUI shows "Indexing codebase… N/M files (X%)" in the status bar.
- Real behavioral learning (#1913) — `LearningEngine` now injects inferred user preferences (verbosity, response format, language) into the volatile system prompt block. Preferences are learned from corrections via a watermark-based incremental scan every 5 turns. A Wilson-score confidence threshold gates persistence.
- Context compression overrides (#1904) — CLI flags `--focus`/`--no-focus`, `--sidequest`/`--no-sidequest`, `--pruning-strategy <reactive|task_aware|mig>` for per-session overrides. `--init` wizard step added. (`task_aware_mig` removed in v0.16.1 — it was dead code; existing configs fall back to `reactive` with a warning.)
- Orchestration metrics (#1899) — `LlmPlanner::plan()` and `LlmAggregator::aggregate()` return token usage; the `/status` command shows an Orchestration block when plans were executed.
- Memory integration tests (#1916) — four `#[ignore]` tests for the session summary → Qdrant roundtrip using testcontainers.
## [0.15.2] - 2026-03-16

### Added

- Per-conversation compression guidelines — the `compression_guidelines` table gains a `conversation_id` column (migration 034). Guidelines are now scoped to a specific conversation when one is in scope; the global (NULL) guideline is used as fallback. Configure via `[memory.compression_guidelines]`; toggle with `--compression-guidelines`. See Context Engineering.
- Session summary on shutdown (#1816) — when no hard compaction fired during a session, the agent generates a lightweight LLM summary at shutdown and stores it in the vector store for cross-session recall. Configurable via `memory.shutdown_summary`, `shutdown_summary_min_messages` (default 4), and `shutdown_summary_max_messages` (default 20). The `--init` wizard prompts for the toggle; a TUI spinner appears during summarization.
- Declarative policy compiler (#1695) — `PolicyEnforcer` evaluates TOML-based allow/deny rules before any tool executes. Deny-wins semantics; path traversal normalization; tool name normalization. Configure via `[tools.policy]` with `enabled`, `default_effect`, `rules`, and `policy_file`. CLI: `--policy-file`. Slash commands: `/policy status`, `/policy check [--trust-level <level>]`. Feature flag: `policy-enforcer` (included in `full`). See Policy Enforcer.
- Pre-execution action verification (#1630) — pluggable `PreExecutionVerifier` pipeline runs before any tool executes. Two built-in verifiers: `DestructiveCommandVerifier` (blocks `rm -rf /`, `dd if=`, `mkfs`, etc. outside configured `allowed_paths`) and `InjectionPatternVerifier` (blocks SQL injection, command injection, path traversal; warns on SSRF). Configure via `[security.pre_execution_verify]`. CLI escape hatch: `--no-pre-execution-verify`. The TUI security panel shows block/warn counters.
- LLM guardrail pre-screener (#1651) — `GuardrailFilter` screens user input (and optionally tool output) through a guard model before it enters agent context. Configurable action (block/warn), fail strategy (closed/open), timeout, and `max_input_chars`. Enable with `--guardrail` or `[security.guardrail] enabled = true`. TUI status bar: `GRD:on` (green) or `GRD:warn` (yellow). Slash command: `/guardrail` for live stats.
- Skill content scanner (#1853) — `SkillContentScanner` scans all loaded skill bodies for injection patterns at startup when `[skills.trust] scan_on_load = true` (default). The scanner is advisory: findings are `WARN`-logged and do not downgrade trust or block tools. On-demand: `/skill scan` TUI command, `--scan-skills-on-load` CLI flag.
- OTLP-compatible debug traces (#1343) — `--dump-format trace` emits OpenTelemetry-compatible JSON traces with span hierarchy: session → iteration → LLM request / tool call / memory search. Configure endpoint and service name via `[debug.traces]`. Switch at runtime: `/dump-format <json|raw|trace>`. The `--init` wizard prompts for the format when debug dump is enabled.
- TUI: compression guidelines status (#1803) — the memory panel shows the guidelines version and last update timestamp. The `/guidelines` slash command displays the current guidelines text.
- Feature use-case bundles (#1831) — six named bundles group related features: `desktop` (tui + scheduler + compression-guidelines), `ide` (acp + acp-http + lsp-context), `server` (gateway + a2a + scheduler + otel), `chat` (discord + slack), `ml` (candle + pdf + stt), `full` (all except ml/hardware). Individual feature flags are unchanged. See Feature Flags.
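The policy compiler's TOML surface can be sketched under stated assumptions: the four keys (`enabled`, `default_effect`, `rules`, `policy_file`) are named in the notes, while the rule shape and the path shown are illustrative guesses:

```toml
[tools.policy]
enabled = true
default_effect = "allow"        # deny-wins: any matching deny rule overrides
policy_file = "policy.toml"     # hypothetical path; also settable via --policy-file

# Rule shape is an assumption, not the documented schema.
[[tools.policy.rules]]
tool = "shell"
effect = "deny"
```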
### Changed

- Cascade router observability (#1825) — `cascade_chat` and `cascade_chat_stream` now emit structured tracing events for provider selection, judge scoring, quality verdict, escalation, and budget exhaustion.
- ACP session config centralization (#1812) — `AgentSessionConfig::from_config()` and `Agent::apply_session_config()` replace ~25 individually copied fields in daemon/runner/ACP session bootstrap. Fixes missing orchestration config and server compaction in daemon sessions.
- rmcp 0.17 → 1.2 (#1845) — migrated `CallToolRequestParams` to the builder pattern.
### Fixed

- Scheduler deadlock no longer emits a misleading "Plan failed. 0/N tasks failed" — non-terminal tasks are marked `Canceled` at deadlock time; the done message distinguishes deadlock, mixed-failure, and normal-failure paths (#1879).
- MCP tools are now denied for quarantined skills — `TrustGateExecutor` tracks registered MCP tool IDs and blocks any call in the set (#1876).
- Policy `tool="shell"`/`"sh"`/`"bash"` aliases now all match `ShellExecutor` at rule compile time (#1877).
- `/policy check` no longer leaks process environment variables into trace output (#1873).
- `PolicyEffect::AllowIf` variant removed — it was identical to `Allow` and generated misleading TOML docs (#1871).
- Overflow notice format changed to `[full output stored — ID: {uuid} — ...]`; `read_overflow` accepts bare UUIDs and strips the legacy `overflow:` prefix (#1868).
- Session summary timeout attempts a plain-text fallback instead of silently returning `None`; `shutdown_summary_timeout_secs` (default 10) replaces the hardcoded 5 s limit (#1869).
- JWT Bearer tokens (`Authorization: Bearer <token>`, `eyJ...`) are now redacted before the `compression_failure_pairs` SQLite insert (#1847).
- Soft compaction threshold lowered from 0.70 to 0.60; `maybe_soft_compact_mid_iteration()` fires after per-tool summarization to relieve context pressure without triggering LLM calls (#1828).
- Ollama `base_url` with a `/v1` suffix no longer causes 404 on embed calls (#1832).
- Graph memory: entity embeddings are now correctly stored in Qdrant — `EntityResolver` was built without a provider in `extract_and_store()` (#1817, #1829).
- Debug trace.json is written inside a per-session subdirectory, preventing overwrites (#1814).
- JIT tool reference injection works after overflow migration to SQLite (#1818).
- Policy symlink boundary check: `load_policy_file()` canonicalizes the path and rejects files outside the process working directory (#1872).
## [0.15.1] - 2026-03-15

### Fixed

- `save_compression_guidelines` atomic write — the version-number assignment now uses a single `INSERT ... SELECT COALESCE(MAX(version), 0) + 1` statement, eliminating the read-then-write TOCTOU race where two concurrent callers could insert duplicate version numbers. Migration 033 adds a `UNIQUE(version)` constraint to the `compression_guidelines` table with row-level deduplication for pre-existing corrupt data (closes #1799).

### Added

- Failure-driven compression guidelines (ACON) — after hard compaction, the agent watches subsequent LLM responses for two-signal context-loss indicators (uncertainty phrase + prior-context reference). Confirmed failure pairs are stored in SQLite (`compression_failure_pairs`). A background updater wakes periodically, calls the LLM to synthesize updated guidelines from accumulated pairs, sanitizes the output to strip prompt injection, and persists the result. Guidelines are injected into every future compaction prompt via a `<compression-guidelines>` block. Configure via `[memory.compression_guidelines]`; disabled by default. See Context Engineering.
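The two-signal gate above can be sketched as a simple conjunction: a response only counts as a confirmed context-loss failure when it shows both signals at once. The phrase lists below are invented for illustration; the real detector's patterns are not documented here:

```rust
/// Two-signal context-loss detection sketch: require BOTH an uncertainty
/// phrase AND a reference to prior context, so either signal alone does
/// not trigger a failure pair. Phrase lists are hypothetical.
fn is_context_loss(response: &str) -> bool {
    let lower = response.to_lowercase();
    let uncertainty = ["i'm not sure", "i don't recall", "i may have missed"];
    let prior_ref = ["earlier", "previously", "as we discussed"];
    uncertainty.iter().any(|p| lower.contains(p))
        && prior_ref.iter().any(|p| lower.contains(p))
}

fn main() {
    // Both signals present: confirmed failure pair.
    assert!(is_context_loss("I'm not sure what we discussed earlier."));
    // Uncertainty alone is ordinary hedging, not context loss.
    assert!(!is_context_loss("I'm not sure that compiles."));
}
```

Requiring both signals is what keeps ordinary hedging ("I'm not sure this compiles") from polluting the failure-pair table.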
## [0.15.0] - 2026-03-14

### Added

- Gemini provider — full Google Gemini API support across 6 phases: basic chat (`generateContent`), SSE streaming with thinking-part support, native tool use / function calling, vision / multimodal input (`inlineData`), semantic embeddings (`embedContent`), and remote model discovery (`GET /v1beta/models`). Default model: `gemini-2.0-flash`; extended thinking available with `gemini-2.5-pro`. Configure with `[llm.gemini]` and `ZEPH_GEMINI_API_KEY`. See LLM Providers.
- Gemini `thinking_level`/`thinking_budget` support — `GeminiThinkingConfig` with `thinking_level` (`minimal`, `low`, `medium`, `high`), `thinking_budget` (validated -1/0/1–32768), and `include_thoughts` fields. Applies to Gemini 2.5+ models. Configurable in `[llm.gemini]` and the `--init` wizard.
- Cascade routing strategy — new `strategy = "cascade"` for the `router` provider. Tries providers cheapest-first; escalates only when the response is classified as degenerate (empty, repetitive, incoherent). Heuristic and LLM-judge classifier modes. Configure via `[llm.router.cascade]` with `quality_threshold`, `max_escalations`, `classifier_mode`, and `max_cascade_tokens`. See Adaptive Inference.
- Claude server-side context compaction — `[llm.cloud] server_compaction = true` enables the `compact-2026-01-12` beta API. Claude manages context on the server side; compaction summaries stream back and are surfaced in the TUI. Graceful fallback to client-side compaction when the beta header is rejected (e.g. on Haiku models). New `server_compaction_events` metric. Enable with `--server-compaction`.
- Claude 1M extended context window — `[llm.cloud] enable_extended_context = true` injects the `context-1m-2025-08-07` beta header, unlocking a 1M token context for Opus 4.6 and Sonnet 4.6. `context_window()` reports 1,000,000 when active so `auto_budget` scales correctly. Configurable in the `--init` wizard.
- `/scheduler list` command and `list_tasks` tool — lists all active scheduled tasks with NAME, KIND, MODE, and NEXT RUN columns. LLM-callable via the `list_tasks` tool; also available as the `/scheduler list` slash command. See Scheduler.
- `search_code` tool — unified hybrid code search combining tree-sitter structural extraction, Qdrant semantic search, and LSP symbol resolution. Always available (no feature flag). See Tools.
- `zeph migrate-config` — CLI command to add missing config parameters as commented-out blocks and reformat the file. Idempotent; never modifies existing values. See Migrate Config.
- ACP readiness probes — the `/health` HTTP endpoint returns `200 OK` when ready; the stdio transport emits a `zeph/ready` JSON-RPC notification as the first outbound packet.
- Request metadata in debug dumps — model, token limit, temperature, exposed tools, and cache breakpoints are included in both `json` and `raw` dump formats.
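A cascade router setup could be sketched as follows. The four `[llm.router.cascade]` key names come from the notes; the values and the `classifier_mode` string are illustrative guesses:

```toml
[llm.router]
strategy = "cascade"

[llm.router.cascade]
quality_threshold = 0.7         # hypothetical value
max_escalations = 2             # hypothetical value
classifier_mode = "heuristic"   # mode string assumed; an LLM-judge mode also exists
max_cascade_tokens = 8000       # hypothetical value
```

The cheapest-first ordering means a degenerate-response classification, not a fixed schedule, is what drives escalation cost.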
### Changed

- Tiered context compaction (#1338): replaced the single `compaction_threshold` with a soft tier (`soft_compaction_threshold`, default 0.70 — prune tool outputs + apply deferred summaries, no LLM) and a hard tier (`hard_compaction_threshold`, default 0.90 — full LLM summarization). The old `compaction_threshold` field is still accepted via a serde alias. `deferred_apply_threshold` removed — absorbed into the soft tier. See Context Engineering.
- Async parallel dispatch in `DagScheduler` — `tick()` now dispatches all ready tasks simultaneously instead of capping at `max_parallel - running`. Concurrency is enforced by `SubAgentManager` returning `ConcurrencyLimit`; tasks revert to `Ready` and retry on the next tick.
- `/plan cancel` during execution — cancel commands are delivered immediately during active plan execution via concurrent channel polling.
- DagScheduler exponential backoff — concurrency-limit deferral uses 250ms→500ms→1s→2s→4s (cap 5s) instead of a fixed 250ms sleep.
- Single shared `QdrantOps` instance — all subsystems share one gRPC connection instead of creating independent connections on startup.
- `zeph-index` always-on — the `index` feature flag is removed; tree-sitter and code intelligence are compiled into every build.
- Graph memory chunked edge loading — community detection loads edges in configurable chunks (keyset pagination) instead of loading all edges at once, reducing peak memory on large graphs. Configurable via `memory.graph.lpa_edge_chunk_size` (default: 10,000).
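The deferral schedule above (250ms doubling to a 5s cap) is a standard capped exponential backoff. A minimal sketch; the function name and the deferral-counter parameter are illustrative, as the real scheduler keeps this state internally:

```rust
use std::time::Duration;

/// Capped exponential backoff matching the documented schedule:
/// 250ms, 500ms, 1s, 2s, 4s, then capped at 5s.
fn deferral_backoff(consecutive_deferrals: u32) -> Duration {
    let base = Duration::from_millis(250);
    // Clamp the exponent so the shift cannot overflow for large counters.
    let factor = 1u32 << consecutive_deferrals.min(5);
    (base * factor).min(Duration::from_secs(5))
}

fn main() {
    assert_eq!(deferral_backoff(0), Duration::from_millis(250));
    assert_eq!(deferral_backoff(1), Duration::from_millis(500));
    assert_eq!(deferral_backoff(4), Duration::from_secs(4));
    assert_eq!(deferral_backoff(10), Duration::from_secs(5)); // cap
}
```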
### Security

- SEC-001–004 tool execution hardening — randomized hash seeds, jitter-free retry timing, tool name length limits, wall-clock retry budget. See Security.
- Shell blocklist unconditional — `blocked_commands` and `DEFAULT_BLOCKED` now apply regardless of `PermissionPolicy` configuration; previously they were skipped when a policy was attached.
### Fixed

- Context compaction loop: `maybe_compact()` now detects when the token budget is too tight to make progress (compactable message count ≤ 1, or compaction produced zero net token reduction, or context remains above threshold after a successful summarization pass) and sets a permanent `compaction_exhausted` flag. Subsequent calls skip compaction entirely and emit a one-time user-visible warning to increase `context_budget_tokens` or start a new session (#1727).
- Claude server compaction: the `ContextManagement` struct now serializes to the correct API shape (`auto_truncate` type with nested trigger); the previous shape caused a non-functional `--server-compaction`.
- Haiku models: `with_server_compaction(true)` now emits a `WARN` and keeps the flag disabled (the `compact-2026-01-12` beta is not supported for Haiku).
- Skill embedding log noise: `SkillMatcher::new()` no longer emits one `WARN` per skill when the provider does not support embeddings — all `EmbedUnsupported` errors are summarised into a single info-level message.
- OpenAI / Gemini: tools with no parameters no longer cause `400 Bad Request` in strict mode.
- Anomaly detector: outcomes are now recorded correctly for native tool-use providers (Claude, OpenAI, Gemini).
## [0.14.3] - 2026-03-10

See CHANGELOG.md for full release notes.

## [0.14.2] - 2026-03-09

See CHANGELOG.md for full release notes.

## [0.14.1] - 2026-03-07

See CHANGELOG.md for full release notes.

## [0.14.0] - 2026-03-06

See CHANGELOG.md for full release notes.

## [0.12.5] - 2026-03-02

See CHANGELOG.md for full release notes.

## [0.12.4] - 2026-03-01

### Added
- `list_directory` tool in `FileExecutor`: sorted entries with `[dir]`/`[file]`/`[symlink]` labels; uses lstat to avoid following symlinks (#1053)
- `create_directory`, `delete_path`, `move_path`, `copy_path` tools in `FileExecutor`: structured file system mutation ops, all paths sandbox-validated; `copy_dir_recursive` uses lstat to prevent symlink escape (#1054)
- `fetch` tool in `WebScrapeExecutor`: plain URL-to-text without a CSS selector requirement, SSRF protection applied (#1055)
- `DiagnosticsExecutor` with a `diagnostics` tool: runs `cargo check` or `cargo clippy --message-format=json`, returns a structured error/warning list (file, line, col, severity, message), output capped, graceful degradation if cargo is absent (#1056)
- `list_directory` and `find_path` tools in `AcpFileExecutor`: run on the agent filesystem when the IDE advertises the `fs.readTextFile` capability; paths sandbox-validated, glob segments validated against `..` traversal, results capped at 1000 (#1059)
- `ToolFilter`: suppresses local `FileExecutor` tools (`read`, `write`, `glob`) when `AcpFileExecutor` provides IDE-proxied alternatives (#1059)
- `check_blocklist()` and `DEFAULT_BLOCKED_COMMANDS` extracted to the `zeph-tools` public API so `AcpShellExecutor` applies the same blocklist as `ShellExecutor` (#1050)
- `ToolPermission` enum with per-binary pattern support in persisted TOML (`[tools.bash.patterns]`); `deny` patterns route to a `RejectAlways` fast-path without an IDE round-trip (#1050)
- Self-learning loop (Phases 1–4): `FailureKind` enum, `/skill reject`, `FeedbackDetector`, `UserCorrection` cross-session recall, Wilson score Bayesian re-ranking, `check_trust_transition()`, BM25+RRF hybrid search, EMA routing (#1035)
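The Wilson-score re-ranking mentioned above uses the lower bound of the Wilson score interval, which ranks a skill by a confidence-adjusted success rate rather than the raw ratio. A self-contained sketch of the standard formula (the wiring into skill re-ranking is not shown here and the function name is illustrative):

```rust
/// Wilson score interval lower bound for a success rate, z = 1.96
/// (95% confidence). Few trials pull the bound toward 0, so a skill
/// with 9/10 successes ranks below one with 90/100.
fn wilson_lower_bound(successes: u32, trials: u32) -> f64 {
    if trials == 0 {
        return 0.0; // no evidence, nothing to rank on
    }
    let n = trials as f64;
    let p = successes as f64 / n;
    let z = 1.96_f64;
    let z2 = z * z;
    let centre = p + z2 / (2.0 * n);
    let margin = z * (p * (1.0 - p) / n + z2 / (4.0 * n * n)).sqrt();
    ((centre - margin) / (1.0 + z2 / n)).max(0.0)
}

fn main() {
    // More evidence at the same rate yields a higher lower bound.
    assert!(wilson_lower_bound(90, 100) > wilson_lower_bound(9, 10));
    assert_eq!(wilson_lower_bound(0, 0), 0.0);
}
```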
### Changed

- Renamed `FileExecutor` tool id `glob` → `find_path` to align with the Zed IDE native tool surface (#1052)
- `READONLY_TOOLS` allowlist updated to the current tool IDs: `read`, `find_path`, `grep`, `list_directory`, `web_scrape`, `fetch` (#1052)
- CI: migrated from Dependabot to self-hosted Renovate with MSRV-aware `constraintsFiltering: strict` and grouped minor/patch automerge (#1048)
### Security

- ACP permission gate: subshell injection (`$(`, backtick) blocked before pattern matching; `effective_shell_command()` checks the inner command of `bash -c <cmd>` against the blocklist; `extract_command_binary()` strips transparent prefixes to prevent allow-always scope expansion (SEC-ACP-C1, SEC-ACP-C2) (#1050)
- ACP tool notifications: `raw_response` is now passed through `redact_json` before forwarding to `claudeCode.toolResponse`; prevents secrets from bypassing the `redact_secrets` pipeline (SEC-ACP-001)
### Fixed

- ACP: terminal release deferred until after the `tool_call_update` notification is dispatched (#1013)
- ACP: tool execution output forwarded via `LoopbackEvent::ToolOutput` to the ACP channel (#1003)
- ACP: newlines preserved in tool output for the IDE terminal widget (#1034)
## [0.12.1] - 2026-02-25

### Security

- Enforce `unsafe_code = "deny"` at the workspace lint level; audited `unsafe` blocks (mmap via candle, `std::env` in tests) annotated with `#[allow(unsafe_code)]` (#867)
- `AgeVaultProvider` secrets map switched from `HashMap` to `BTreeMap` for deterministic JSON key ordering on `vault.save()` (#876)
- `WebScrapeExecutor`: redirect targets are now validated against private/internal IP ranges to prevent SSRF via redirect chains (#871)
- Gateway webhook payload: per-field length limits (sender/channel <= 256 bytes, body <= 65536 bytes) and ASCII control-char stripping to prevent prompt injection (#868)
- ACP permission cache: null bytes stripped from tool names before cache key construction to prevent key collision (#872)
- `gateway.max_body_size` bounded to 10 MiB (10,485,760 bytes) at config validation to prevent memory exhaustion (#875)
- Shell sandbox: `<(`, `>(`, `<<<`, `eval` added to the default `confirm_patterns` to mitigate process substitution, here-string, and eval bypass vectors (#870)
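The redirect-target validation in #871 boils down to rejecting addresses in private or internal ranges before following a redirect. A minimal sketch using the standard library's address classifiers; the real check is broader (IPv6 unique-local ranges, DNS resolution of the redirect host) and the function name is illustrative:

```rust
use std::net::IpAddr;

/// Reject redirect targets that resolve to private or internal
/// addresses, so a redirect chain cannot be used for SSRF.
fn is_internal(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private()       // 10/8, 172.16/12, 192.168/16
                || v4.is_loopback()    // 127/8
                || v4.is_link_local()  // 169.254/16 (cloud metadata, etc.)
                || v4.is_unspecified() // 0.0.0.0
        }
        // Narrow IPv6 coverage here; the real check covers more ranges.
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}

fn main() {
    assert!(is_internal("10.0.0.1".parse().unwrap()));
    assert!(is_internal("127.0.0.1".parse().unwrap()));
    assert!(is_internal("169.254.1.1".parse().unwrap()));
    assert!(!is_internal("93.184.216.34".parse().unwrap()));
}
```

Validating every hop of the chain, not just the first URL, is the point: an attacker-controlled public host can 302-redirect to an internal address.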
### Performance

- `ClaudeProvider` caches pre-serialized `ToolDefinition` slices; the cache is invalidated only when the tool set changes, eliminating per-call JSON construction overhead (#894)
- `should_compact()` replaced an O(N) message scan with a direct comparison against `cached_prompt_tokens` (#880)
- `EnvironmentContext` cached on `Agent`; only `git_branch` is refreshed on skill reload instead of spawning a full git subprocess per turn (#881)
- Doom-loop content hashed in place by feeding stable message parts directly into the hasher, eliminating the intermediate normalized `String` allocation (#882)
- `prune_stale_tool_outputs`: `count_tokens` called once per `ToolResult` part instead of twice (#883)
- Composite covering index `(conversation_id, id)` on the `messages` table (migration 015) replaces the single-column index; eliminates the post-filter sort step (#895)
- `load_history_filtered` rewritten as a CTE, replacing the previous double-sort subquery (#896)
- `remove_tool_responses_middle_out` takes ownership of the message `Vec` instead of cloning; `HashSet` replaced with `Vec::with_capacity` for small-N index tracking (#884, #888)
- Fast-path `parts_json == "[]"` check in history load functions skips the serde parse in the common empty case (#886)
- `consolidate_summaries` uses `String::with_capacity` + a `write!` loop instead of `collect::<Vec<_>>().join()` (#887)
- TUI `tui_loop()` skips `terminal.draw()` when no events occurred in the 250ms tick, reducing idle CPU usage (#892)
### Added

- `sqlite_pool_size: u32` in `MemoryConfig` (default 5) — configurable via `[memory] sqlite_pool_size` (#893)
- Background cleanup task for `ResponseCache::cleanup_expired()` — interval configurable via `[memory] response_cache_cleanup_interval_secs` (default 3600 s) (#891)
- `schema` feature flag in `zeph-llm` gating the `schemars` dependency and the typed output API (#879)
Changed
check_summarization()uses in-memoryunsummarized_countcounter onMemoryStateinstead of issuing aCOUNT(*)SQL query on every message save (#890)- Removed 4
channel.send_status()calls frompersist_message()inzeph-core— SQLite WAL inserts < 1ms do not warrant status reporting (#889) - Default Ollama model changed from
mistral:7btoqwen3:8b;"qwen3"and"qwen"added asChatMLtemplate aliases (#897) src/main.rssplit into focused modules:runner.rs,agent_setup.rs,tracing_init.rs,tui_bridge.rs,channel.rs,tests.rs—main.rsreduced to 26 LOC (#839)zeph-core/src/bootstrap.rssplit into submodule directory:config.rs,health.rs,mcp.rs,provider.rs,skills.rs,tests.rs—bootstrap/mod.rsreduced to 278 LOC (#840)SkillTrustRow.source_kindchanged fromStringtoSourceKindenum (Local,Hub,File) with serde DB serialization (#848)ScheduledTaskConfig.kindchanged fromStringtoScheduledTaskKindenum (#850)TrustLevelmoved tozeph-tools::trust_level;zeph-skillsre-exports it, removing thezeph-tools → zeph-skillsreverse dependency (#841)- Duplicate
ChannelErrorremoved fromzeph-channels::error; all channel adapters usezeph_core::channel::ChannelError(#842) zeph_a2a::types::TaskStatereplaced inzeph-corewith a localSubAgentStateenum;zeph-a2aremoved fromzeph-coredependencies (#843)zeph-indexQdrant access consolidated throughVectorStoretrait fromzeph-memory; directqdrant-clientdependency removed (#844)content_hash(data: &[u8]) -> Stringutility added tozeph-core::hashbacked by BLAKE3 (#845)zeph-core::diffre-export module removed;zeph_core::DiffDatais now a direct re-export ofzeph_tools::executor::DiffData(#846)ContextManager,ToolOrchestrator,LearningEngineextracted fromAgentinto standalone structs with pure delegation (#830, #836, #837, #838)Secrettype wraps inner value inZeroizing<String>;Cloneremoved (#865)AgeVaultProvidersecrets and intermediate decrypt/encrypt buffers wrapped inZeroizing(#866, #874)A2aServer::serve()andGatewayServer::serve()emittracing::warn!whenauth_tokenisNone(#869, #873)
## [0.12.0] - 2026-02-24

### Added

- `MessageMetadata` struct in `zeph-llm` with `agent_visible`, `user_visible`, `compacted_at` fields; default is both-visible for backward compat (#M28)
- `Message.metadata` field with `#[serde(default)]` — existing serialized messages deserialize without change
- SQLite migration `013_message_metadata.sql` — adds `agent_visible`, `user_visible`, `compacted_at` columns to the `messages` table
- `save_message_with_metadata()` in `SqliteStore` for saving messages with explicit visibility flags
- `load_history_filtered()` in `SqliteStore` — SQL-level filtering by `agent_visible`/`user_visible`
- `replace_conversation()` in `SqliteStore` — atomic compaction: marks originals `user_only`, inserts the summary as `agent_only`
- `oldest_message_ids()` in `SqliteStore` — returns the N oldest message IDs for a conversation
- `Agent.load_history()` now loads only `agent_visible=true` messages, excluding compacted originals
- `compact_context()` persists compaction atomically via `replace_conversation()`, falling back to legacy summary storage if DB IDs are unavailable
- Multi-session ACP support with configurable `max_sessions` (default 4) and LRU eviction of idle sessions (#781)
- `session_idle_timeout_secs` config for automatic session cleanup (default 30 min) with a background reaper task (#781)
- `ZEPH_ACP_MAX_SESSIONS` and `ZEPH_ACP_SESSION_IDLE_TIMEOUT_SECS` env overrides (#781)
- ACP session persistence to SQLite — `acp_sessions` and `acp_session_events` tables with conversation replay on `load_session` per the ACP spec (#782)
- `SqliteStore` methods for the ACP session lifecycle: `create_acp_session`, `save_acp_event`, `load_acp_events`, `delete_acp_session`, `acp_session_exists` (#782)
- `TokenCounter` in `zeph-memory` — accurate token counting with tiktoken-rs `cl100k_base`, replacing the `chars/4` heuristic (#789)
- DashMap-backed token cache (10k cap) for amortized O(1) lookups
- OpenAI tool schema token formula for precise context budget allocation
- Input size guard (64KB) on token counting to prevent cache pollution from oversized input
- Graceful fallback to `chars/4` when the tiktoken tokenizer is unavailable
- Configurable tool response offload — `OverflowConfig` with threshold (default 50k chars), retention (7 days), optional custom dir (#791)
- `[tools.overflow]` section in `config.toml` for offload configuration
- Security hardening: path canonicalization, symlink-safe cleanup, 0o600 file permissions on Unix
- Wire `AcpContext` (IDE-proxied FS, shell, permissions) through `AgentSpawner` into the agent tool chain via `CompositeExecutor` — ACP executors take priority with automatic local fallback (#779)
- `DynExecutor` newtype in `zeph-tools` for object-safe `ToolExecutor` composition in `CompositeExecutor` (#779)
- `cancel_signal: Arc<Notify>` on `LoopbackHandle` for cooperative cancellation between ACP sessions and the agent loop (#780)
- `with_cancel_signal()` builder method on `Agent` to inject an external cancellation signal (#780)
- `zeph-acp` crate — ACP (Agent Client Protocol) server for IDE embedding (Zed, JetBrains, Neovim) (#763-#766)
- `--acp` CLI flag to launch Zeph as an ACP stdio server (requires the `acp` feature)
- `acp` feature gate in the root `Cargo.toml`; included in the `full` feature set
- `ZephAcpAgent` implementing the SDK `Agent` trait with session lifecycle (new, prompt, cancel, load)
- `loopback_event_to_update` mapping `LoopbackEvent` variants to ACP `SessionUpdate` notifications, with empty chunk filtering
- `serve_stdio()` transport using `AgentSideConnection` over tokio-compat stdio streams
- Stream monitor gated behind the `ZEPH_ACP_LOG_MESSAGES` env var for JSON-RPC traffic debugging
- Custom mdBook theme with Zeph brand colors (navy+amber palette from the TUI)
- Z-letter favicon SVG for the documentation site
- Sidebar logo via inline data URI
- Navy as the default documentation theme
- `AcpConfig` struct in `zeph-core` — `enabled`, `agent_name`, `agent_version` with `ZEPH_ACP_*` env overrides (#771)
- `[acp]` section in `config.toml` for configuring ACP server identity
- `--acp-manifest` CLI flag — prints the ACP agent manifest JSON to stdout for IDE discovery (#772)
- `serve_connection<W, R>` generic transport function extracted from `serve_stdio` for testability (#770)
- `ConnSlot` pattern in transport — `Rc<RefCell<Option<Rc<AgentSideConnection>>>>` populated post-construction so `new_session` can build ACP adapters (#770)
- `build_acp_context` in `ZephAcpAgent` — wires `AcpFileExecutor`, `AcpShellExecutor`, `AcpPermissionGate` per session (#770)
- `AcpServerConfig` passed through `serve_stdio`/`serve_connection` to configure agent identity from config values (#770)
- ACP section in the `--init` wizard — prompts for `enabled`, `agent_name`, `agent_version` (#771)
- Integration tests for the ACP transport using `tokio::io::duplex` — `initialize_handshake`, `new_session_and_cancel` (#773)
- ACP permission persistence to `~/.config/zeph/acp-permissions.toml` — `AllowAlways`/`RejectAlways` decisions survive restarts (#786)
- `acp.permission_file` config and `ZEPH_ACP_PERMISSION_FILE` env override for a custom permission file path (#786)
Fixed
- Permission cache key collision on anonymous tools — uses `tool_call_id` as fallback when title is absent (#779)
Changed
- CI: add CLA check for external contributors via `contributor-assistant/github-action`
[0.11.6] - 2026-02-23
Fixed
- Auto-create parent directories for `sqlite_path` on startup (#756)
Added
- `autosave_assistant` and `autosave_min_length` config fields in `MemoryConfig` — assistant responses skip embedding when disabled (#748)
- `SemanticMemory::save_only()` — persist message to SQLite without generating a vector embedding (#748)
- `ResponseCache` in `zeph-memory` — SQLite-backed LLM response cache with blake3 key hashing and TTL expiry (#750)
- `response_cache_enabled` and `response_cache_ttl_secs` config fields in `LlmConfig` (#750)
- Background `cleanup_expired()` task for response cache (runs every 10 minutes) (#750)
- `ZEPH_MEMORY_AUTOSAVE_ASSISTANT`, `ZEPH_MEMORY_AUTOSAVE_MIN_LENGTH` env overrides (#748)
- `ZEPH_LLM_RESPONSE_CACHE_ENABLED`, `ZEPH_LLM_RESPONSE_CACHE_TTL_SECS` env overrides (#750)
- `MemorySnapshot`, `export_snapshot()`, `import_snapshot()` in `zeph-memory/src/snapshot.rs` (#749)
- `zeph memory export <path>` and `zeph memory import <path>` CLI subcommands (#749)
- SQLite migration `012_response_cache.sql` for the response cache table (#750)
- Temporal decay scoring in `SemanticMemory::recall()` — time-based score attenuation with configurable half-life (#745)
- MMR (Maximal Marginal Relevance) re-ranking in `SemanticMemory::recall()` — post-processing for result diversity (#744)
- Compact XML skills prompt format (`format_skills_prompt_compact`) for low-budget contexts (#747)
- `SkillPromptMode` enum (full/compact/auto) with auto-selection based on context budget (#747)
- Adaptive chunked context compaction — parallel chunk summarization via `join_all` (#746)
- `with_ranking_options()` builder for `SemanticMemory` to configure temporal decay and MMR
- `message_timestamps()` method on `SqliteStore` for Unix epoch retrieval via `strftime`
- `get_vectors()` method on `EmbeddingStore` for raw vector fetch from SQLite `vector_points`
- SQLite-backed `SqliteVectorStore` as embedded alternative to Qdrant for zero-dependency vector search (#741)
- `vector_backend` config option to select between `qdrant` and `sqlite` vector backends
- Credential scrubbing in LLM context pipeline via `scrub_content()` — redacts secrets and paths before LLM calls (#743)
- `redact_credentials` config option (default: true) to toggle context scrubbing
- Filter diagnostics mode: `kept_lines` tracking in `FilterResult` for all 9 filter strategies
- TUI expand (`e`) highlights kept lines vs filtered-out lines with dim styling and legend
- Markdown table rendering in TUI chat panel — Unicode box-drawing borders, bold headers, column auto-width
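The temporal decay scoring above can be sketched as half-life attenuation. This is a minimal illustration, not the crate's implementation: the exact attenuation curve used by `SemanticMemory::recall()` and the function name are assumptions.

```rust
/// Half-life score attenuation (sketch): a memory's similarity score is
/// multiplied by 0.5 for every `half_life_secs` of age. A non-positive
/// half-life disables decay. The curve is an assumption, not Zeph's code.
fn decayed_score(similarity: f64, age_secs: f64, half_life_secs: f64) -> f64 {
    if half_life_secs <= 0.0 {
        return similarity; // decay disabled
    }
    similarity * 0.5_f64.powf(age_secs / half_life_secs)
}

fn main() {
    // A memory exactly one half-life old keeps half its score.
    let day = 86_400.0;
    let s = decayed_score(0.9, 7.0 * day, 7.0 * day);
    println!("decayed: {s:.3}");
}
```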
Changed
- Token estimation uses `chars/4` heuristic instead of `bytes/3` for better accuracy on multi-byte text (#742)
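The difference between the two heuristics shows up on multi-byte UTF-8 text, where byte counts inflate the estimate. A minimal sketch, assuming plain integer division (function names are illustrative):

```rust
/// Old heuristic: bytes / 3 — overestimates tokens for multi-byte UTF-8,
/// since each Cyrillic/CJK character contributes 2-4 bytes.
fn estimate_tokens_bytes(text: &str) -> usize {
    text.len() / 3
}

/// New heuristic: chars / 4 — counts Unicode scalar values, not bytes.
fn estimate_tokens_chars(text: &str) -> usize {
    text.chars().count() / 4
}

fn main() {
    let cyrillic = "привет, как дела сегодня"; // 24 chars, 44 bytes
    println!("bytes/3: {}", estimate_tokens_bytes(cyrillic));
    println!("chars/4: {}", estimate_tokens_chars(cyrillic));
}
```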
[0.11.5] - 2026-02-22
Added
- Declarative TOML-based output filter engine with 9 strategy types: `strip_noise`, `truncate`, `keep_matching`, `strip_annotated`, `test_summary`, `group_by_rule`, `git_status`, `git_diff`, `dedup`
- Embedded `default-filters.toml` with 25 pre-configured rules for CLI tools (cargo, git, docker, npm, pip, make, pytest, go, terraform, kubectl, brew, ls, journalctl, find, grep/rg, curl/wget, du/df/ps, jest/mocha/vitest, eslint/ruff/mypy/pylint)
- `filters_path` option in `FilterConfig` for user-provided filter rules override
- ReDoS protection: RegexBuilder with size_limit, 512-char pattern cap, 1 MiB file size limit
- Dedup strategy with configurable normalization patterns and HashMap pre-allocation
- NormalizeEntry replacement validation (rejects unescaped `$` capture group refs)
- Sub-agent orchestration system with A2A protocol integration (#709)
- Sub-agent definition format with TOML frontmatter parser (#710)
- `SubAgentManager` with spawn/cancel/collect lifecycle (#711)
- Tool filtering (AllowList/DenyList/InheritAll) and skill filtering with glob patterns (#712)
- Zero-trust permission model with TTL-based grants and automatic revocation (#713)
- In-process A2A channels for orchestrator-to-sub-agent communication
- `PermissionGrants` with audit trail via tracing
- Real LLM loop wired into `SubAgentManager::spawn()` with background tokio task execution (#714)
- `poll_subagents()` on `Agent<C>` for collecting completed sub-agent results (#714)
- `shutdown_all()` on `SubAgentManager` for graceful teardown (#714)
- `SubAgentMetrics` in `MetricsSnapshot` with state, turns, elapsed time (#715)
- TUI sub-agents panel (`zeph-tui` widgets/subagents) with color-coded states (#715)
- `/agent` CLI commands: `list`, `spawn`, `bg`, `status`, `cancel`, `approve`, `deny` (#716)
- Typed `AgentCommand` enum with `parse()` for type-safe command dispatch replacing string matching in the agent loop
- `@agent_name` mention syntax for quick sub-agent invocation with disambiguation from `@`-triggered file references
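A declarative rule for the filter engine above might look like the following. This is only an illustration of the idea: the table name, field names, and values here are hypothetical, not the engine's actual schema (consult the embedded `default-filters.toml` for the real format).

```toml
# Hypothetical filter rules in the spirit of default-filters.toml.
# All keys below are illustrative assumptions, not the real schema.
[[filter]]
command = "cargo test"
strategy = "test_summary"   # one of the 9 strategy types

[[filter]]
command = "git status"
strategy = "git_status"
```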
Changed
- Migrated all 6 hardcoded filters (cargo_build, test_output, clippy, git, dir_listing, log_dedup) into the declarative TOML engine
Removed
- `FilterConfig` per-filter config structs (`TestFilterConfig`, `GitFilterConfig`, `ClippyFilterConfig`, `CargoBuildFilterConfig`, `DirListingFilterConfig`, `LogDedupFilterConfig`) — filter params now in TOML strategy fields
[0.11.4] - 2026-02-21
Added
- `validate_skill_references(body, skill_dir)` in zeph-skills loader: parses Markdown links targeting `references/`, `scripts/`, or `assets/` subdirs, warns on missing files and symlink traversal attempts (#689)
- `sanitize_skill_body(body)` in zeph-skills prompt: escapes XML structural tags (`<skill`, `</skill>`, `<instructions`, `</instructions>`, `<available_skills`, `</available_skills>`) to prevent prompt injection (#689)
- Body sanitization applied automatically to all non-`Trusted` skills in `format_skills_prompt()` (#689)
- `load_skill_resource(skill_dir, relative_path)` public function in `zeph-skills::resource` for on-demand loading of skill resource files with path traversal protection (#687)
- Nested `metadata:` block support in SKILL.md frontmatter: indented key-value pairs under `metadata:` are parsed as structured metadata (#686)
- Field length validation in SKILL.md loader: `description` capped at 1024 characters, `compatibility` capped at 500 characters (#686)
- Warning log in `load_skill_body()` when body exceeds 20,000 bytes (~5000 tokens) per spec recommendation (#686)
- Empty value normalization for `compatibility` and `license` frontmatter fields: bare `compatibility:` now produces `None` instead of `Some("")` (#686)
- `SkillManager` in zeph-skills — install skills from git URLs or local paths, remove, verify blake3 integrity, list with trust metadata
- CLI subcommands: `zeph skill {install, remove, list, verify, trust, block, unblock}` — runs without agent loop
- In-session `/skill install <url|path>` and `/skill remove <name>` with hot reload
- Managed skills directory at `~/.config/zeph/skills/`, auto-appended to `skills.paths` at bootstrap
- Hash re-verification on trust promotion — recomputes blake3 before promoting to trusted/verified, rejects on mismatch
- URL scheme allowlist and path traversal validation in SkillManager as defense-in-depth
- Blocking I/O wrapped in `spawn_blocking` for async safety in skill management handlers
- `custom: HashMap<String, Secret>` field in `ResolvedSecrets` for user-defined vault secrets (#682)
- `list_keys()` method on `VaultProvider` trait with implementations for Age and Env backends (#682)
- `requires-secrets` field in SKILL.md frontmatter for declaring per-skill secret dependencies (#682)
- Gate skill activation on required secrets availability in system prompt builder (#682)
- Inject active skill's secrets as scoped env vars into `ShellExecutor` at execution time (#682)
- Custom secrets step in interactive config wizard (`--init`) (#682)
- crates.io publishing metadata (description, readme, homepage, keywords, categories) for all workspace crates (#702)
Changed
- `requires-secrets` SKILL.md frontmatter field renamed to `x-requires-secrets` to follow JSON Schema vendor extension convention and avoid future spec collisions — breaking change: update skill frontmatter to use `x-requires-secrets`; the old `requires-secrets` form is still parsed with a deprecation warning (#688)
- `allowed-tools` SKILL.md field now uses space-separated values per agentskills.io spec (was comma-separated) — breaking change for skills using comma-delimited allowed-tools (#686)
- Skill resource files (references, scripts, assets) are no longer eagerly injected into the system prompt on skill activation; only filenames are listed as available resources — breaking change for skills relying on auto-injected reference content (#687)
[0.11.3] - 2026-02-20
Added
- `LoopbackChannel`/`LoopbackHandle`/`LoopbackEvent` in zeph-core — headless channel for daemon mode, pairs with a handle that exposes `input_tx`/`output_rx` for programmatic agent I/O
- `ProcessorEvent` enum in zeph-a2a server — streaming event type replacing synchronous `ProcessResult`; `TaskProcessor::process` now accepts an `mpsc::Sender<ProcessorEvent>` and returns `Result<(), A2aError>`
- `--daemon` CLI flag (feature `daemon`+`a2a`) — bootstraps a full agent + A2A JSON-RPC server under `DaemonSupervisor` with PID file lifecycle and Ctrl-C graceful shutdown
- `--connect <URL>` CLI flag (feature `tui`+`a2a`) — connects the TUI to a remote daemon via A2A SSE, mapping `TaskEvent` to `AgentEvent` in real-time
- Command palette daemon commands: `daemon:connect`, `daemon:disconnect`, `daemon:status`
- Command palette action commands: `app:quit` (shortcut `q`), `app:help` (shortcut `?`), `session:new`, `app:theme`
- Fuzzy-matching for command palette — character-level gap-penalty scoring replaces substring filter; `daemon_command_registry()` merged into `filter_commands`
- `TuiCommand::ToggleTheme` variant in command palette (placeholder — theme switching not yet implemented)
- `--init` wizard daemon step — prompts for A2A server host, port, and auth token; writes `config.a2a.*`
- Snapshot tests for `Config::default()` TOML serialization (zeph-core), git filter diff/status output, cargo-build filter success/error output, and clippy grouped warnings output — using insta for regression detection
- Tests for `handle_tool_result` covering blocked, cancelled, sandbox violation, empty output, exit-code failure, and success paths (zeph-core agent/tool_execution.rs)
- Tests for `maybe_redact` (redaction enabled/disabled) and `last_user_query` helper in agent/tool_execution.rs
- Tests for `handle_skill_command` dispatch covering unknown subcommand, missing arguments, and no-memory early-exit paths for stats, versions, activate, approve, and reset subcommands (zeph-core agent/learning.rs)
- Tests for `record_skill_outcomes` noop path when no active skills are present
- `insta` added to workspace dev-dependencies and to zeph-core and zeph-tools crate dev-deps
- `Embeddable` trait and `EmbeddingRegistry<T>` in zeph-memory — generic Qdrant sync/search extracted from duplicated code in QdrantSkillMatcher and McpToolRegistry (~350 lines removed)
- MCP server command allowlist validation — only permitted commands (npx, uvx, node, python3, python, docker, deno, bun) can spawn child processes; configurable via `mcp.allowed_commands`
- MCP env var blocklist — blocks 21 dangerous variables (LD_PRELOAD, DYLD_, NODE_OPTIONS, PYTHONPATH, JAVA_TOOL_OPTIONS, etc.) and BASH_FUNC_ prefix from MCP server processes
- Path separator rejection in MCP command validation to prevent symlink-based bypasses
Changed
- `MessagePart::Image` variant now holds `Box<ImageData>` instead of inline fields, improving semantic grouping of image data
- `Agent<C, T>` simplified to `Agent<C>` — `ToolExecutor` generic replaced with `Box<dyn ErasedToolExecutor>`, reducing monomorphization
- Shell command detection rewritten from substring matching to a tokenizer-based pipeline with escape normalization, eliminating bypass vectors via backslash insertion, hex/octal escapes, quote splitting, and pipe chains
- Shell sandbox path validation now uses `std::path::absolute()` as fallback when `canonicalize()` fails on non-existent paths
- Blocked command matching extracts basename from absolute paths (`/usr/bin/sudo` now correctly blocked)
- Transparent wrapper commands (`env`, `command`, `exec`, `nice`, `nohup`, `time`, `xargs`) are skipped to detect the actual command
- Default confirm patterns now include `$(` and backtick subshell expressions
- Enable SQLite WAL mode with `SYNCHRONOUS=NORMAL` for 2-5x write throughput (#639)
- Replace O(n·iterations) token scan with `cached_prompt_tokens` in budget checks (#640)
- Defer `maybe_redact` to stream completion boundary instead of per-chunk (#641)
- Replace `format_tool_output` string allocation with write-into-buffer (#642)
- Change `ToolCall.params` from `HashMap` to `serde_json::Map`, eliminating a clone (#643)
- Pre-join static system prompt sections into `LazyLock` (#644)
- Replace doom-loop string history with content hash comparison (#645)
- Return `&'static str` from `detect_image_mime` with case-insensitive matching (#646)
- Replace `block_on` in history persist with fire-and-forget async spawn (#647)
- Change `LlmProvider::name()` from `&'static str` to `&str`, eliminating `Box::leak` memory leak in CompatibleProvider (#633)
- Extract rate-limit retry helper `send_with_retry()` in zeph-llm, deduplicating 3 retry loops (#634)
- Extract `sse_to_chat_stream()` helpers shared by Claude and OpenAI providers (#635)
- Replace double `AnyProvider::clone()` in `embed_fn()` with a single `Arc` clone (#636)
- Add `with_client()` builder to ClaudeProvider and OpenAiProvider for shared `reqwest::Client` (#637)
- Cache `JsonSchema` per `TypeId` in `chat_typed` to avoid per-call schema generation (#638)
- Scrape executor performs post-DNS resolution validation against private/loopback IPs with pinned-address client to prevent SSRF via DNS rebinding
- Private host detection expanded to block `*.localhost`, `*.internal`, `*.local` domains
- A2A error responses sanitized: serde details and method names no longer exposed to clients
- Rate limiter rejects new clients with 429 when entry map is at capacity after stale eviction
- Secret redaction: regex-based pattern matching replaces whitespace tokenizer, detecting secrets in URLs, JSON, and quoted strings
- Added `hf_`, `npm_`, `dckr_pat_` to secret redaction prefixes
- A2A client stream errors truncate upstream body to 256 bytes
- Add `default_client()` HTTP helper with standard timeouts and user-agent in zeph-core and zeph-llm (#666)
- Replace 5 production `Client::new()` calls with `default_client()` for consistent HTTP config (#667)
- Decompose agent/mod.rs (2602→459 lines) into tool_execution, message_queue, builder, commands, and utils modules (#648, #649, #650)
- Replace `anyhow` in `zeph-core::config` with typed `ConfigError` enum (Io, Parse, Validation, Vault)
- Replace `anyhow` in `zeph-tui` with typed `TuiError` enum (Io, Channel); simplify `handle_event()` return to `()`
- Sort `[workspace.dependencies]` alphabetically in root Cargo.toml
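The doom-loop change above (comparing content hashes instead of keeping full response strings) can be sketched as follows. This is an illustration under assumptions: the repeat threshold, hasher, and type names are hypothetical, not Zeph's actual detector.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a response once instead of storing the whole string.
fn content_hash(s: &str) -> u64 {
    let mut h = DefaultHasher::new();
    s.hash(&mut h);
    h.finish()
}

/// Sketch of a doom-loop detector: flags when the same content repeats.
/// The threshold of 3 consecutive repeats is an assumption.
struct DoomLoopDetector {
    last: Option<u64>,
    repeats: usize,
}

impl DoomLoopDetector {
    fn new() -> Self {
        Self { last: None, repeats: 0 }
    }

    /// Returns true once identical content has been seen 3 times in a row.
    fn observe(&mut self, content: &str) -> bool {
        let h = content_hash(content);
        if self.last == Some(h) {
            self.repeats += 1;
        } else {
            self.last = Some(h);
            self.repeats = 1;
        }
        self.repeats >= 3
    }
}

fn main() {
    let mut d = DoomLoopDetector::new();
    assert!(!d.observe("run tests"));
    assert!(!d.observe("run tests"));
    assert!(d.observe("run tests")); // third identical turn → loop
}
```

Storing a `u64` per turn keeps the history O(1) in memory regardless of response size, which is the point of the change.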
Fixed
- False positive: “sudoku” no longer matched by “sudo” blocked pattern (word-boundary matching)
- PID file creation uses `OpenOptions::create_new(true)` (O_CREAT|O_EXCL) to prevent TOCTOU symlink attacks
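The word-boundary fix above ("sudoku" must no longer trip the "sudo" pattern) comes down to matching whole tokens rather than substrings. A minimal sketch of the idea, not the exact matcher:

```rust
/// Word-boundary matching for blocked-command patterns: "sudo" should match
/// "sudo rm" and "/usr/bin/sudo", but not "sudoku". The tokenization rule
/// here (split on non-word characters) is a simplifying assumption.
fn matches_blocked(command: &str, pattern: &str) -> bool {
    command
        .split(|c: char| !(c.is_alphanumeric() || c == '_'))
        .any(|word| word == pattern)
}

fn main() {
    assert!(matches_blocked("sudo rm -rf /", "sudo"));
    assert!(matches_blocked("/usr/bin/sudo ls", "sudo")); // basename case
    assert!(!matches_blocked("solve this sudoku", "sudo")); // no false positive
}
```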
[0.11.2] - 2026-02-19
Added
- `base_url` and `language` fields in `[llm.stt]` config for OpenAI-compatible local whisper servers (e.g. whisper.cpp)
- `ZEPH_STT_BASE_URL` and `ZEPH_STT_LANGUAGE` environment variable overrides
- Whisper API provider now passes `language` parameter for accurate non-English transcription
- Documentation for whisper.cpp server setup with Metal acceleration on macOS
- Per-sub-provider `base_url` and `embedding_model` overrides in orchestrator config
- Full orchestrator example with cloud + local + STT in default.toml
- All previously undocumented config keys in default.toml (`agent.auto_update_check`, `llm.stt`, `llm.vision_model`, `skills.disambiguation_threshold`, `tools.filters.*`, `tools.permissions`, `a2a.auth_token`, `mcp.servers.env`)
Fixed
- Outdated config keys in default.toml: removed nonexistent `repo_id`, renamed `provider_type` to `type`, corrected candle defaults, fixed observability exporter default
- Add `wait(true)` to Qdrant upsert and delete operations for read-after-write consistency, fixing flaky `ingested_chunks_have_correct_payload` integration test (#567)
- Vault age backend now falls back to default directory for key/path when `--vault-key`/`--vault-path` are not provided, matching `zeph vault init` behavior (#613)
Changed
- Whisper STT provider no longer requires OpenAI API key when `base_url` points to a local server
- Orchestrator sub-providers now resolve `base_url` and `embedding_model` via fallback chain: per-provider, parent section, global default
[0.11.1] - 2026-02-19
Added
- Persistent CLI input history with rustyline: arrow key navigation, prefix search, line editing, SQLite-backed persistence across restarts (#604)
- Clickable markdown links in TUI via OSC 8 hyperlinks — `[text](url)` renders as terminal-clickable link with URL sanitization and scheme allowlist (#580)
- `@`-triggered fuzzy file picker in TUI input — type `@` to search project files by name/path/extension with real-time filtering (#600)
- Command palette in TUI with read-only agent management commands (#599)
- Orchestrator provider option in `zeph init` wizard for multi-model routing setup (#597)
- `zeph vault` CLI subcommands: `init` (generate age keypair), `set` (store secret), `get` (retrieve secret), `list` (show keys), `rm` (remove secret) (#598)
- Atomic file writes for vault operations with temp+rename strategy (#598)
- Default vault directory resolution via XDG_CONFIG_HOME / APPDATA / HOME (#598)
- Auto-update check via GitHub Releases API with configurable scheduler task (#588)
- `auto_update_check` config field (default: true) with `ZEPH_AUTO_UPDATE_CHECK` env override
- `TaskKind::UpdateCheck` variant and `UpdateCheckHandler` in zeph-scheduler
- One-shot update check at startup when scheduler feature is disabled
- `--init` wizard step for auto-update check configuration
Fixed
- Restore `--vault`, `--vault-key`, `--vault-path` CLI flags lost during clap migration (#587)
Changed
- Refactor `AppBuilder::from_env()` to `AppBuilder::new()` with explicit CLI overrides
- Eliminate redundant manual `std::env::args()` parsing in favor of clap
- Add `ZEPH_VAULT_KEY` and `ZEPH_VAULT_PATH` environment variable support
- Init wizard reordered: vault backend selection is now step 1 before LLM provider (#598)
- API key and channel token prompts skipped when age vault backend is selected (#598)
[0.11.0] - 2026-02-19
Added
- Vision (image input) support across Claude, OpenAI, and Ollama providers (#490)
- `MessagePart::Image` content type with base64 serialization
- `LlmProvider::supports_vision()` trait method for runtime capability detection
- Claude structured content with `AnthropicContentBlock::Image` variant
- OpenAI array content format with `image_url` data-URI encoding
- Ollama `with_images()` support with optional `vision_model` config for dedicated model routing
- `/image <path>` command in CLI and TUI channels
- Telegram photo message handling with pre-download size guard
- `vision_model` field in `[llm.ollama]` config section and `--init` wizard update
- 20 MB max image size limit and path traversal protection
- Interactive configuration wizard via `zeph init` subcommand with 5-step setup (LLM provider, memory, channels, secrets backend, config generation)
- clap-based CLI argument parsing with `--help`, `--version` support
- `Serialize` derive on `Config` and all nested types for TOML generation
- `dialoguer` dependency for interactive terminal prompts
- Structured LLM output via `chat_typed<T>()` on `LlmProvider` trait with JSON schema enforcement (#456)
- OpenAI/Compatible native `response_format: json_schema` structured output (#457)
- Claude structured output via forced tool use pattern (#458)
- `Extractor<T>` utility for typed data extraction from LLM responses (#459)
- TUI test automation infrastructure: EventSource trait abstraction, insta widget snapshot tests, TestBackend integration tests, proptest layout verification, expectrl E2E terminal tests (#542)
- CI snapshot regression pipeline with `cargo insta test --check` (#547)
- Pipeline API with composable, type-safe `Step` trait, `Pipeline` builder, `ParallelStep` combinator, and built-in steps (`LlmStep`, `RetrievalStep`, `ExtractStep`, `MapStep`) (#466, #467, #468)
- Structured intent classification for skill disambiguation: when top-2 skill scores are within `disambiguation_threshold` (default 0.05), agent calls LLM via `chat_typed::<IntentClassification>()` to select the best-matching skill (#550)
- `ScoredMatch` struct exposing both skill index and cosine similarity score from matcher backends
- `IntentClassification` type (`skill_name`, `confidence`, `params`) with `JsonSchema` derive for schema-enforced LLM responses
- `disambiguation_threshold` in `[skills]` config section (default: 0.05) with `with_disambiguation_threshold()` builder on `Agent`
- DocumentLoader trait with text/markdown file loader in zeph-memory (#469)
- Text splitter with configurable chunk size, overlap, and sentence-aware splitting (#470)
- PDF document loader, feature-gated behind `pdf` (#471)
- Document ingestion pipeline: load, split, embed, store via Qdrant (#472)
- File size guard (50 MiB default) and path canonicalization for document loaders
- Audio input support: `Attachment`/`AttachmentKind` types, `SpeechToText` trait, OpenAI Whisper backend behind `stt` feature flag (#520, #521, #522)
- Telegram voice and audio message handling with automatic file download (#524)
- STT bootstrap wiring: `WhisperProvider` created from `[llm.stt]` config behind `stt` feature (#529)
- Slack audio file upload handling with host validation and size limits (#525)
- Local Whisper backend via candle for offline STT with symphonia audio decode and rubato resampling (#523)
- Shell-based installation script (`install/install.sh`) with SHA256 verification, platform detection, and `--version` flag
- Shellcheck lint job in CI pipeline
- Per-job permission scoping in release workflow (least privilege)
- TUI word-jump and line-jump cursor navigation (#557)
- TUI keybinding help popup on `?` in normal mode (#533)
- TUI clickable hyperlinks via OSC 8 escape sequences (#530)
- TUI edit-last-queued for recalling queued messages (#535)
- VectorStore trait abstraction in zeph-memory (#554)
- Operation-level cancellation for LLM requests and tool executions (#538)
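The chunk-size/overlap splitting above can be sketched as a character-window splitter. This is a simplified illustration, assuming fixed-size windows; the sentence-aware boundary snapping mentioned in the entry is omitted, and the function name is hypothetical.

```rust
/// Sliding-window text splitter (sketch): emits chunks of `chunk_size`
/// characters, each overlapping the previous one by `overlap` characters.
/// Operates on chars, not bytes, so multi-byte UTF-8 is never split.
fn split_text(text: &str, chunk_size: usize, overlap: usize) -> Vec<String> {
    assert!(chunk_size > overlap, "overlap must be smaller than chunk size");
    let chars: Vec<char> = text.chars().collect();
    let step = chunk_size - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + chunk_size).min(chars.len());
        chunks.push(chars[start..end].iter().collect());
        if end == chars.len() {
            break; // last (possibly short) chunk emitted
        }
        start += step;
    }
    chunks
}

fn main() {
    // 10 chars, window 4, overlap 2 → windows advance by 2.
    let chunks = split_text("abcdefghij", 4, 2);
    println!("{chunks:?}");
}
```

The overlap ensures that a sentence falling on a chunk boundary still appears intact in at least one embedded chunk.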
Changed
- Consolidate Docker files into `docker/` directory (#539)
- Typed deserialization for tool call params (#540)
- CI: replace oraclelinux base image with debian bookworm-slim (#532)
Fixed
- Strip schema metadata and fix doom loop detection for native tool calls (#534)
- TUI freezes during fast LLM streaming and parallel tool execution: biased event loop with input priority and agent event batching (#500)
- Redundant syntax highlighting and markdown parsing on every TUI frame: per-message render cache with content-hash keying (#501)
[0.10.0] - 2026-02-18
Fixed
- TUI status spinner not cleared after model warmup completes (#517)
- Duplicate tool output rendering for shell-streamed tools in TUI (#516)
- `send_tool_output` not forwarded through `AppChannel`/`AnyChannel` enum dispatch (#508)
- Tool output and diff not sent atomically in native tool_use path (#498)
- Parallel tool_use calls: results processed sequentially for correct ordering (#486)
- Native `tool_result` format not recognized by TUI history loader (#484)
- Inline filter stats threshold based on char savings instead of line count (#483)
- Token metrics not propagated in native tool_use path (#482)
- Filter metrics not appearing in TUI Resources panel when using native tool_use providers (#480)
- Output filter matchers not matching compound shell commands like `cd /path && cargo test 2>&1 | tail` (#481)
- Duplicate `ToolEvent::Completed` emission in shell executor before filtering was applied (#480)
Added
- GitHub CLI skill with token-saving patterns (#507)
- Parallel execution of native tool_use calls with configurable concurrency (#486)
- TUI compact/detailed tool output toggle with ‘e’ key binding (#479)
- TUI `[tui]` config section with `show_source_labels` option to hide `[user]`/`[zeph]`/`[tool]` prefixes (#505)
- Syntax-highlighted diff view for write/edit tool output in TUI (#455)
- Diff rendering with green/red backgrounds for added/removed lines
- Word-level change highlighting within modified lines
- Syntax highlighting via tree-sitter
- Compact/expanded toggle with existing ‘e’ key binding
- New dependency: `similar` 2.7.0
- Per-tool inline filter stats in CLI chat: `[shell] cargo test (342 lines -> 28 lines, 91.8% filtered)` (#449)
- Filter metrics in TUI Resources panel: confidence distribution, command hit rate, token savings (#448)
- Periodic 250ms tick in TUI event loop for real-time metrics refresh (#447)
- Output filter architecture improvements (M26.1): `CommandMatcher` enum, `FilterConfidence`, `FilterPipeline`, `SecurityPatterns`, per-filter TOML config (#452)
- Token savings tracking and metrics for output filtering (#445)
- Smart tool output filtering: command-aware filters that compress tool output before context insertion
- `OutputFilter` trait and `OutputFilterRegistry` with first-match-wins dispatch
- `sanitize_output()` ANSI escape and progress bar stripping (runs on all tool output)
- Test output filter: cargo test/nextest failures-only mode (94-99% token savings on green suites)
- Git output filter: compact status/diff/log/push compression (80-99% savings)
- Clippy output filter: group warnings by lint rule (70-90% savings)
- Directory listing filter: hide noise directories (target, node_modules, .git)
- Log deduplication filter: normalize timestamps/UUIDs, count repeated patterns (70-85% savings)
- `[tools.filters]` config section with `enabled` toggle
- Skill trust levels: 4-tier model (Trusted, Verified, Quarantined, Blocked) with per-turn enforcement
- `TrustGateExecutor` wrapping tool execution with trust-level permission checks
- `AnomalyDetector` with sliding-window threshold counters for quarantined skill monitoring
- blake3 content hashing for skill integrity verification on load and hot-reload
- Quarantine prompt wrapping for structural isolation of untrusted skill bodies
- Self-learning gate: skills with trust < Verified skip auto-improvement
- `skill_trust` SQLite table with migration 009
- CLI commands: `/skill trust`, `/skill block`, `/skill unblock`
- `[skills.trust]` config section (default_level, local_level, hash_mismatch_level)
- `ProviderKind` enum for type-safe provider selection in config
- `RuntimeConfig` struct grouping agent runtime fields
- `AnyProvider::embed_fn()` shared embedding closure helper
- `Config::validate()` with bounds checking for critical config values
- `sanitize_paths()` for stripping absolute paths from error messages
- 10-second timeout wrapper for embedding API calls
- `full` feature flag enabling all optional features
Changed
- Remove `P` generic from `Agent`, `SemanticMemory`, `CodeRetriever` — provider resolved at construction (#423)
- Extract bootstrap logic from main.rs into `zeph-core::bootstrap::AppBuilder` (#393): main.rs reduced from 2313 to 978 lines
- `SecurityConfig` and `TimeoutConfig` gain `Clone + Copy`
- `AnyChannel` moved from main.rs to zeph-channels crate
- Remove 8 lightweight feature gates, make always-on: openai, compatible, orchestrator, router, self-learning, qdrant, vault-age, mcp (#438)
- Default features reduced to minimal set (empty after M26)
- Skill matcher concurrency reduced from 50 to 20
- `String::with_capacity` in context building loops
- CI updated to use `--features full`
Breaking
- `LlmConfig.provider` changed from `String` to `ProviderKind` enum
- Default features reduced — users needing a2a, candle, mcp, openai, orchestrator, router, tui must enable explicitly or use `--features full`
- Telegram channel rejects empty `allowed_users` at startup
- Config with extreme values now rejected by `Config::validate()`
Deprecated
- `ToolExecutor::execute()` string-based dispatch (use `execute_tool_call()` instead)
Fixed
- Closed #410 (clap dropped atty), #411 (rmcp updated quinn-udp), #413 (A2A body limit already present)
[0.9.9] - 2026-02-17
Added
- `zeph-gateway` crate: axum HTTP gateway with POST /webhook ingestion, bearer auth (blake3 + ct_eq), per-IP rate limiting, GET /health endpoint, feature-gated (`gateway`) (#379)
- `zeph-core::daemon` module: component supervisor with health monitoring, PID file management, graceful shutdown, feature-gated (`daemon`) (#380)
- `zeph-scheduler` crate: cron-based periodic task scheduler with SQLite persistence, built-in tasks (memory_cleanup, skill_refresh, health_check), TaskHandler trait, feature-gated (`scheduler`) (#381)
- New config sections: `[gateway]`, `[daemon]`, `[scheduler]` in config/default.toml (#367)
- New optional feature flags: `gateway`, `daemon`, `scheduler`
- Hybrid memory search: FTS5 keyword search combined with Qdrant vector similarity (#372, #373, #374)
- SQLite FTS5 virtual table with auto-sync triggers for full-text keyword search
- Configurable `vector_weight`/`keyword_weight` in `[memory.semantic]` for hybrid ranking
- FTS5-only fallback when Qdrant is unavailable (replaces empty results)
- `AutonomyLevel` enum (ReadOnly/Supervised/Full) for controlling tool access (#370)
- `autonomy_level` config key in `[security]` section (default: supervised)
- Read-only mode restricts agent to file_read, file_glob, file_grep, web_scrape
- Full mode allows all tools without confirmation prompts
- Documented `[telegram].allowed_users` allowlist in default config (#371)
- OpenTelemetry OTLP trace export with `tracing-opentelemetry` layer, feature-gated behind `otel` (#377)
- `[observability]` config section with exporter selection and OTLP endpoint
- Instrumentation spans for LLM calls (`llm_call`) and tool executions (`tool_exec`)
- `CostTracker` with per-model token pricing and configurable daily budget limits (#378)
- `[cost]` config section with `enabled` and `max_daily_cents` options
- `cost_spent_cents` field in `MetricsSnapshot` for TUI cost display
- Discord channel adapter with Gateway v10 WebSocket, slash commands, edit-in-place streaming (#382)
- Slack channel adapter with Events API webhook, HMAC-SHA256 signature verification, streaming (#383)
- Feature flags: `discord` and `slack` (opt-in) in zeph-channels and root crate
- `DiscordConfig` and `SlackConfig` with token redaction in Debug impls
- Slack timestamp replay protection (reject requests >5min old)
- Configurable Slack webhook bind address (`webhook_host`)
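The hybrid ranking introduced in this release (configurable `vector_weight`/`keyword_weight`) amounts to a weighted fusion of the two score sources. A minimal sketch, assuming both scores are already normalized to [0, 1]; the actual normalization and fusion formula in zeph-memory are assumptions here.

```rust
/// Weighted fusion of vector-similarity and FTS5 keyword scores (sketch).
/// Both inputs are assumed pre-normalized to the [0, 1] range.
fn hybrid_score(vector: f64, keyword: f64, vector_weight: f64, keyword_weight: f64) -> f64 {
    vector * vector_weight + keyword * keyword_weight
}

fn main() {
    // With weights 0.7/0.3, a strong vector match dominates a weak keyword hit.
    let s = hybrid_score(0.9, 0.2, 0.7, 0.3);
    println!("{s:.2}");
}
```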
[0.9.8] - 2026-02-16
Added
- Graceful shutdown on Ctrl-C with farewell message and MCP server cleanup (#355)
- Cancel-aware LLM streaming via tokio::select on shutdown signal (#358)
- `McpManager::shutdown_all_shared()` with per-client 5s timeout (#356)
- Indexer progress logging with file count and per-file stats
- Skip code index for providers with native tool_use (#357)
- OpenAI prompt caching: parse and report cached token usage (#348)
- Syntax highlighting for TUI code blocks via tree-sitter-highlight (#345, #346, #347)
- Anthropic prompt caching with structured system content blocks (#337)
- Configurable summary provider for tool output summarization via local model (#338)
- Aggressive inline pruning of stale tool outputs in tool loops (#339)
- Cache usage metrics (cache_read_tokens, cache_creation_tokens) in MetricsSnapshot (#340)
- Native tool_use support for Claude provider (Anthropic API format) (#256)
- Native function calling support for OpenAI provider (#257)
- `ToolDefinition`, `ChatResponse`, `ToolUseRequest` types in zeph-llm (#254)
- `ToolUse`/`ToolResult` variants in `MessagePart` for structured tool flow (#255)
- Dual-mode agent loop: native structured path alongside legacy text extraction (#258)
- Dual system prompt: native tool_use instructions for capable providers, fenced-block instructions for legacy providers
Changed
- Consolidate all SQLite migrations into root `migrations/` directory (#354)
[0.9.7] - 2026-02-15
Performance
- Token estimation uses `len() / 3` for improved accuracy (#328)
- Concurrent skill embedding for faster startup (#327)
- Pre-allocate strings in hot paths to reduce allocations (#329)
- Parallel context building via `try_join!` (#331)
- Criterion benchmark suite for core operations (#330)
Security
- Path traversal protection in shell sandbox (#325)
- Canonical path validation in skill loader (#322)
- SSRF protection for MCP server connections (#323)
- Remove MySQL/RSA vulnerable transitive dependencies (#324)
- Secret redaction patterns for Google and GitLab tokens (#320)
- TTL-based eviction for rate limiter entries (#321)
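The path-traversal and canonical-path entries above both reduce to the same check: canonicalize first (resolving `..` and symlinks), then test the prefix against the allowed root. A minimal sketch under that assumption — the helper name is hypothetical, not the crate's API:

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Reject any candidate path that, after canonicalization, escapes the
/// sandbox root. Canonicalizing BEFORE the prefix test is the key step:
/// it resolves `..` segments and symlinks that a naive string prefix
/// check would miss.
fn resolve_in_sandbox(root: &Path, candidate: &str) -> Option<PathBuf> {
    let canonical_root = fs::canonicalize(root).ok()?;
    let canonical = fs::canonicalize(root.join(candidate)).ok()?;
    canonical.starts_with(&canonical_root).then_some(canonical)
}

fn main() {
    let root = std::env::temp_dir();
    // A file inside the sandbox resolves…
    fs::write(root.join("probe.txt"), b"ok").unwrap();
    assert!(resolve_in_sandbox(&root, "probe.txt").is_some());
    // …while `..` traversal out of it is rejected (or fails to resolve).
    assert!(resolve_in_sandbox(&root, "../../etc/passwd").is_none());
}
```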
### Changed
- `QdrantOps` shared helper trait for Qdrant collection operations (#304)
- `delegate_provider!` macro replacing boilerplate provider delegation (#303)
- Remove `TuiError` in favor of unified error handling (#302)
- Generic `recv_optional` replacing per-channel optional receive logic (#301)

### Dependencies
- Upgraded rmcp to 0.15, toml to 1.0, uuid to 1.21 (#296)
- Cleaned up deny.toml advisory and license configuration (#312)
## [0.9.6] - 2026-02-15

### Changed
- BREAKING: `ToolDef` schema field replaced `Vec<ParamDef>` with `schemars::Schema` auto-derived from Rust structs via `#[derive(JsonSchema)]`
- BREAKING: `ParamDef` and `ParamType` removed from `zeph-tools` public API
- BREAKING: `ToolRegistry::new()` replaced with `ToolRegistry::from_definitions()`; registry no longer hardcodes built-in tools — each executor owns its definitions via `tool_definitions()`
- BREAKING: `Channel` trait now requires `ChannelError` enum with typed error handling replacing `anyhow::Result`
- BREAKING: `Agent::new()` signature changed to accept new field grouping; agent struct refactored into 5 inner structs for improved organization
- BREAKING: `AgentError` enum introduced with 7 typed variants replacing scattered `anyhow::Error` handling
- `ToolDef` now includes `InvocationHint` (FencedBlock/ToolCall) so LLM prompt shows exact invocation format per tool
- `web_scrape` tool definition includes all parameters (`url`, `select`, `extract`, `limit`) auto-derived from `ScrapeInstruction`
- `ShellExecutor` and `WebScrapeExecutor` now implement `tool_definitions()` for single source of truth
- Replaced `tokio` "full" feature with granular features in zeph-core (async-io, macros, rt, sync, time)
- Removed `anyhow` dependency from zeph-channels
- Message persistence now uses `MessageKind` enum instead of `is_summary` bool for Qdrant storage

### Added
- `ChannelError` enum with typed variants for channel operation failures
- `AgentError` enum with 7 typed variants for agent operation failures (streaming, persistence, configuration, etc.)
- Workspace-level `qdrant` feature flag for optional semantic memory support
- Type aliases consolidated into zeph-llm: `EmbedFuture` and `EmbedFn` with typed `LlmError`
- `streaming.rs` and `persistence.rs` modules extracted from agent module for improved code organization
- `MessageKind` enum for distinguishing summary and regular messages in storage

### Removed
- `anyhow::Result` from Channel trait (replaced with `ChannelError`)
- Direct `anyhow::Error` usage in agent module (replaced with `AgentError`)
## [0.9.5] - 2026-02-14

### Added
- Pattern-based permission policy with glob matching per tool (allow/ask/deny), first-match-wins evaluation (#248)
- Legacy blocked_commands and confirm_patterns auto-migrated to permission rules (#249)
- Denied tools excluded from LLM system prompt (#250)
- Tool output overflow: full output saved to file when truncated, path notice appended for LLM access (#251)
- Stale tool output overflow files cleaned up on startup (>24h TTL) (#252)
- `ToolRegistry` with typed `ToolDef` definitions for 7 built-in tools (bash, read, edit, write, glob, grep, web_scrape) (#239)
- `FileExecutor` for sandboxed file operations: read, write, edit, glob, grep (#242)
- `ToolCall` struct and `execute_tool_call()` on `ToolExecutor` trait for structured tool invocation (#241)
- `CompositeExecutor` routes structured tool calls to correct sub-executor by tool_id (#243)
- Tool catalog section in system prompt via `ToolRegistry::format_for_prompt()` (#244)
- Configurable `max_tool_iterations` (default 10, previously hardcoded 3) via TOML and `ZEPH_AGENT_MAX_TOOL_ITERATIONS` env var (#245)
- Doom-loop detection: breaks agent loop on 3 consecutive identical tool outputs
- Context budget check at 80% threshold stops iteration before context overflow
- `IndexWatcher` for incremental code index updates on file changes via `notify` file watcher (#233)
- `watch` config field in `[index]` section (default `true`) to enable/disable file watching
- Repo map cache with configurable TTL (`repo_map_ttl_secs`, default 300s) to avoid per-message filesystem traversal (#231)
- Cross-session memory score threshold (`cross_session_score_threshold`, default 0.35) to filter low-relevance results (#232)
- `embed_missing()` called on startup for embedding backfill when Qdrant available (#261)
- `AgentTaskProcessor` replaces `EchoTaskProcessor` for real A2A inference (#262)
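The pattern-based permission policy above (#248) evaluates glob rules first-match-wins with an allow/ask/deny verdict per tool invocation. A self-contained sketch with a toy `*`-only glob matcher — rule syntax, names, and the Ask default are assumptions, not the actual `zeph-tools` API:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum Action { Allow, Ask, Deny }

struct Rule { pattern: &'static str, action: Action }

/// Minimal `*`-only glob: `*` matches any (possibly empty) substring.
fn glob_match(pattern: &str, text: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == text,
        Some((prefix, rest)) => text.strip_prefix(prefix).map_or(false, |t| {
            (0..=t.len()).any(|i| t.is_char_boundary(i) && glob_match(rest, &t[i..]))
        }),
    }
}

/// First-match-wins evaluation: the first rule whose pattern matches
/// decides the verdict; unmatched commands fall back to Ask here.
fn evaluate(rules: &[Rule], command: &str) -> Action {
    rules
        .iter()
        .find(|r| glob_match(r.pattern, command))
        .map_or(Action::Ask, |r| r.action)
}

fn main() {
    let rules = [
        Rule { pattern: "git status*", action: Action::Allow },
        Rule { pattern: "rm *", action: Action::Deny },
        Rule { pattern: "git *", action: Action::Ask },
    ];
    assert_eq!(evaluate(&rules, "git status"), Action::Allow);
    assert_eq!(evaluate(&rules, "rm -rf /"), Action::Deny);
    assert_eq!(evaluate(&rules, "git push"), Action::Ask);
    assert_eq!(evaluate(&rules, "ls"), Action::Ask); // no rule → default
}
```

First-match-wins means rule order is significant: a specific allow before a broad deny whitelists that one command, which is why the legacy blocklists migrate into ordered rules (#249).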
### Changed
- ShellExecutor uses PermissionPolicy for all permission checks instead of legacy find_blocked_command/find_confirm_command
- Replaced unmaintained dirs-next 2.0 with dirs 6.x
- Batch message retrieval in semantic recall: replaced N+1 query pattern with `messages_by_ids()` for improved performance

### Fixed
- Persist `MessagePart` data to SQLite via `remember_with_parts()` — pruning state now survives session restarts (#229)
- Clear tool output body from memory after Tier 1 pruning to reclaim heap (#230)
- TUI uptime display now updates from agent start time instead of always showing 0s (#259)
- `FileExecutor` `handle_write` now uses canonical path for security (TOCTOU prevention) (#260)
- `resolve_via_ancestors` trailing slash bug on macOS
- `vault.backend` from config now used as default backend; CLI `--vault` flag overrides config (#263)
- A2A error responses sanitized to prevent provider URL leakage
## [0.9.4] - 2026-02-14

### Added
- Bounded FIFO message queue (max 10) in agent loop: users can submit messages during inference, queued messages are delivered sequentially when response cycle completes
- Channel trait extended with `try_recv()` (non-blocking poll) and `send_queue_count()` with default no-op impls
- Consecutive user messages within 500ms merge window joined by newline
- TUI queue badge `[+N queued]` in input area, `Ctrl+K` to clear queue, `/clear-queue` command
- TelegramChannel `try_recv()` implementation via mpsc
- Deferred model warmup in TUI mode: interface renders immediately, Ollama warmup runs in background with status indicator (“warming up model…” → “model ready”), agent loop awaits completion via `watch::channel`
- `context_tokens` metric in TUI Resources panel showing current prompt estimate (vs cumulative session totals)
- `unsummarized_message_count` in `SemanticMemory` for precise summarization trigger
- `count_messages_after` in `SqliteStore` for counting messages beyond a given ID
- TUI status indicators for context compaction (“compacting context…”) and summarization (“summarizing…”)
- Debug tracing in `should_compact()` for context budget diagnostics (token estimate, threshold, decision)
- Config hot-reload: watch config file for changes via `notify_debouncer_mini` and apply runtime-safe fields (security, timeouts, memory limits, context budget, compaction, max_active_skills) without restart
- `ConfigWatcher` in zeph-core with 500ms debounced filesystem monitoring
- `with_config_reload()` builder method on Agent for wiring config file watcher
- `tool_name` field in `ToolOutput` for identifying tool type (bash, mcp, web-scrape) in persisted messages and TUI display
- Real-time status events for provider retries and orchestrator fallbacks surfaced as `[system]` messages across all channels (CLI stderr, TUI chat panel, Telegram)
- `StatusTx` type alias in `zeph-llm` for emitting status events from providers
- `Status` variant in TUI `AgentEvent` rendered as System-role messages (DarkGray)
- `set_status_tx()` on `AnyProvider`, `SubProvider`, and `ModelOrchestrator` for propagating status sender through the provider hierarchy
- Background forwarding tasks for immediate status delivery (bypasses agent loop for zero-latency display)
- TUI: toggle side panels with `d` key in Normal mode
- TUI: input history navigation (Up/Down in Insert mode)
- TUI: message separators and accent bars for visual structure
- TUI: tool output restored as expandable messages from conversation history
- TUI: collapsed tool output preview (3 lines) when restoring history
- `LlmProvider::context_window()` trait method for model context window size detection
- Ollama context window auto-detection via `/api/show` model info endpoint
- Context window sizes for Claude (200K) and OpenAI (128K/16K/1M) provider models
- `auto_budget` config field with `ZEPH_MEMORY_AUTO_BUDGET` env override for automatic context budget from model metadata
- `inject_summaries()` in Agent: injects SQLite conversation summaries into context (newest-first, budget-aware, with deduplication)
- Wire `zeph-index` Code RAG pipeline into agent loop (feature-gated `index`): `CodeRetriever` integration, `inject_code_rag()` in `prepare_context()`, repo map in system prompt, background project indexing on startup
- `IndexConfig` with `[index]` TOML section and `ZEPH_INDEX_*` env overrides (enabled, max_chunks, score_threshold, budget_ratio, repo_map_tokens)
- Two-tier context pruning strategy for granular token reclamation before full LLM compaction
  - Tier 1: selective `ToolOutput` part pruning with `compacted_at` timestamp on pruned parts
  - Tier 2: LLM-based compaction fallback when tier 1 is insufficient
- `prune_protect_tokens` config field for token-based protection zone (shields recent context from pruning)
- `tool_output_prunes` metric tracking tier 1 pruning operations
- `compacted_at` field on `MessagePart::ToolOutput` for pruning audit trail
- `MessagePart` enum (Text, ToolOutput, Recall, CodeContext, Summary) for typed message content with independent lifecycle
- `Message::from_parts()` constructor with `to_llm_content()` flattening for LLM provider consumption
- `Message::from_legacy()` backward-compatible constructor for simple text messages
- SQLite migration 006: `parts` column for structured message storage (JSON-serialized)
- `save_message_with_parts()` in SqliteStore for persisting typed message parts
- `inject_semantic_recall`, `inject_code_context`, `inject_summaries` now create typed `MessagePart` variants
### Changed
- `index` feature enabled by default (Code RAG pipeline active out of the box)
- Agent error handler shows specific error context instead of generic message
- TUI inline code rendered as blue with dark background glow instead of bright yellow
- TUI header uses deep blue background (`Rgb(20, 40, 80)`) for improved contrast
- System prompt includes explicit `bash` block example and bans invented formats (`tool_code`, `tool_call`) for small model compatibility
- TUI Resources panel: replaced separate Prompt/Completion/Total with Context (current) and Session (cumulative) metrics
- Summarization trigger uses unsummarized message count instead of total, avoiding repeated no-op checks
- Empty `AgentEvent::Status` clears TUI spinner instead of showing blank throbber
- Status label cleared after summarization and compaction complete
- Default `summarization_threshold`: 100 → 50 messages
- Default `compaction_threshold`: 0.75 → 0.80
- Default `compaction_preserve_tail`: 4 → 6 messages
- Default `semantic.enabled`: false → true
- Default `summarize_output`: false → true
- Default `context_budget_tokens`: 0 (auto-detect from model)
### Fixed
- TUI chat line wrapping no longer eats 2 characters on word wrap (accent prefix width accounted for)
- TUI activity indicator moved to dedicated layout row (no longer overlaps content)
- Memory history loading now retrieves most recent messages instead of oldest
- Persisted tool output format includes tool name (`[tool output: bash]`) for proper display on restore
- `summarize_output` serde deserialization used `#[serde(default)]` yielding `false` instead of config default `true`
## [0.9.3] - 2026-02-12

### Added
- New `zeph-index` crate: AST-based code indexing and semantic retrieval pipeline
- Language detection and grammar registry with feature-gated tree-sitter grammars (Rust, Python, JavaScript, TypeScript, Go, Bash, TOML, JSON, Markdown)
- AST-based chunker with cAST-inspired greedy sibling merge and recursive decomposition (target 600 non-ws chars per chunk)
- Contextualized embedding text generation for improved retrieval quality
- Dual-write storage layer (Qdrant vector search + SQLite metadata) with INT8 scalar quantization
- Incremental indexer with .gitignore-aware file walking and content-hash change detection
- Hybrid retriever with query classification (Semantic/Grep/Hybrid) and budget-aware result packing
- Lightweight repo map generation (tree-sitter signature extraction, budget-constrained output)
- `code_context` slot in `BudgetAllocation` for code RAG injection into agent context
- `inject_code_context()` method in Agent for transient code chunk injection before semantic recall

## [0.9.2] - 2026-02-12

### Added
- Runtime context compaction for long sessions: automatic LLM-based summarization of middle messages when context usage exceeds configurable threshold (default 75%)
- `with_context_budget()` builder method on Agent for wiring context budget and compaction settings
- Config fields: `compaction_threshold` (f32), `compaction_preserve_tail` (usize) with env var overrides
- `context_compactions` counter in MetricsSnapshot for observability
- Context budget integration: `ContextBudget::allocate()` wired into agent loop via `prepare_context()` orchestrator
- Semantic recall injection: `SemanticMemory::recall()` results injected as transient system messages with token budget control
- Message history trimming: oldest non-system messages evicted when history exceeds budget allocation
- Environment context injection: working directory, OS, git branch, and model name in system prompt via `<environment>` block
- Extended BASE_PROMPT with structured Tool Use, Guidelines, and Security sections
- Tool output truncation: head+tail split at 30K chars with UTF-8 safe boundaries
- Smart tool output summarization: optional LLM-based summarization for outputs exceeding 30K chars, with fallback to truncation on failure (disabled by default via `summarize_output` config)
- Progressive skill loading: matched skills get full body, remaining shown as description-only catalog via `<other_skills>`
- ZEPH.md project config discovery: walk up directory tree, inject into system prompt as `<project_context>`

## [0.9.1] - 2026-02-12

### Added
- Mouse scroll support for TUI chat widget (scroll up/down via mouse wheel)
- Splash screen with colored block-letter “ZEPH” banner on TUI startup
- Conversation history loading into chat on TUI startup
- Model thinking block rendering (`<think>` tags from Ollama DeepSeek/Qwen models) in distinct darker style
- Markdown rendering for all chat messages via `pulldown-cmark`: bold, italic, strikethrough, headings, code blocks, inline code, lists, blockquotes, horizontal rules
- Scrollbar track with proportional thumb indicator in chat widget

### Fixed
- Chat messages no longer overflow below the viewport when lines wrap
- Scroll no longer sticks at top after over-scrolling past content boundary
## [0.9.0] - 2026-02-12

### Added
- ratatui-based TUI dashboard with real-time agent metrics (feature-gated `tui`, opt-in)
- `TuiChannel` as new `Channel` implementation with bottom-up chat feed, input line, and status bar
- `MetricsSnapshot` and `MetricsCollector` in zeph-core via `tokio::sync::watch` for live metrics transport
- `with_metrics()` builder on Agent with instrumentation at 8 collection points: api_calls, latency, prompt/completion tokens, active skills, sqlite message count, qdrant status, summarization count
- Side panel widgets (skills, memory, resources) with live data from agent loop
- Confirmation modal dialog for destructive command approval in TUI (Y/Enter confirms, N/Escape cancels)
- Scroll indicators (▲/▼) in chat widget when content overflows viewport
- Responsive layout: side panels hidden on terminals narrower than 80 columns
- Multiline input via Shift+Enter in TUI insert mode
- Bottom-up chat layout with proper newline handling and per-message visual separation
- Panic hook for terminal state restoration on any panic during TUI execution
- Unicode-safe char-index cursor tracking for multi-byte input in TUI
- `--config <path>` CLI argument and `ZEPH_CONFIG` env var to override default config path
- OpenAI-compatible LLM provider with chat, streaming, and embeddings support
- Feature-gated `openai` feature (enabled by default)
- Support for OpenAI, Together AI, Groq, Fireworks, and any OpenAI-compatible API via configurable `base_url`
- `reasoning_effort` parameter for OpenAI reasoning models (low/medium/high)
- `/mcp add <id> <command> [args...]` for dynamic stdio MCP server connection at runtime
- `/mcp add <id> <url>` for HTTP transport (remote MCP servers in Docker/cloud)
- `/mcp list` command to show connected servers and tool counts
- `/mcp remove <id>` command to disconnect MCP servers
- `McpTransport` enum: `Stdio` (child process) and `Http` (Streamable HTTP) transports
- HTTP MCP server config via `url` field in `[[mcp.servers]]`
- `mcp.allowed_commands` config for command allowlist (security hardening)
- `mcp.max_dynamic_servers` config to limit concurrent dynamic servers (default 10)
- Qdrant registry sync after dynamic MCP add/remove for semantic tool matching

### Changed
- Docker images now include Node.js, npm, and Python 3 for MCP server runtime
- `ServerEntry` uses `McpTransport` enum instead of flat command/args/env fields

### Fixed
- Effective embedding model resolution: Qdrant subsystems now use the correct provider-specific embedding model name when the provider is `openai` or the orchestrator routes to OpenAI
- Skill watcher no longer loops in Docker containers (overlayfs phantom events)

## [0.8.2] - 2026-02-10

### Changed
- Enable all non-platform features by default: `orchestrator`, `self-learning`, `mcp`, `vault-age`, `candle`
- Features `metal` and `cuda` remain opt-in (platform-specific GPU accelerators)
- CI clippy uses default features instead of explicit feature list
- Docker images now include skill runtime dependencies: `curl`, `wget`, `git`, `jq`, `file`, `findutils`, `procps-ng`
## [0.8.1] - 2026-02-10

### Added
- Shell sandbox: configurable `allowed_paths` directory allowlist and `allow_network` toggle blocking curl/wget/nc in `ShellExecutor` (Issue #91)
- Sandbox validation before every shell command execution with path canonicalization
- `tools.shell.allowed_paths` config (empty = working directory only) with `ZEPH_TOOLS_SHELL_ALLOWED_PATHS` env override
- `tools.shell.allow_network` config (default: true) with `ZEPH_TOOLS_SHELL_ALLOW_NETWORK` env override
- Interactive confirmation for destructive commands (`rm`, `git push -f`, `DROP TABLE`, etc.) with CLI y/N prompt and Telegram inline keyboard (Issue #92)
- `tools.shell.confirm_patterns` config with default destructive command patterns
- `Channel::confirm()` trait method with default auto-confirm for headless/test scenarios
- `ToolError::ConfirmationRequired` and `ToolError::SandboxViolation` variants
- `execute_confirmed()` method on `ToolExecutor` for confirmation bypass after user approval
- A2A TLS enforcement: reject HTTP endpoints when `a2a.require_tls = true` (Issue #92)
- A2A SSRF protection: block private IP ranges (RFC 1918, loopback, link-local) with DNS resolution (Issue #92)
- Configurable A2A server payload size limit via `a2a.max_body_size` (default: 1 MiB)
- Structured JSON audit logging for all tool executions with stdout or file destination (Issue #93)
- `AuditLogger` with `AuditEntry` (timestamp, tool, command, result, duration) and `AuditResult` enum
- `[tools.audit]` config section with `ZEPH_TOOLS_AUDIT_ENABLED` and `ZEPH_TOOLS_AUDIT_DESTINATION` env overrides
- Secret redaction in LLM responses: detect API keys, tokens, passwords, private keys and replace with `[REDACTED]` (Issue #93)
- Whitespace-preserving `redact_secrets()` scanner with zero-allocation fast path via `Cow<str>`
- `[security]` config section with `redact_secrets` toggle (default: true)
- Configurable timeout policies for LLM, embedding, and A2A operations (Issue #93)
- `[timeouts]` config section with `llm_seconds`, `embedding_seconds`, `a2a_seconds`
- LLM calls wrapped with `tokio::time::timeout` in agent loop
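The `redact_secrets()` entry above calls out a `Cow<str>` zero-allocation fast path: the common case (no secret present) returns the input borrowed, and a new string is allocated only when something must actually be replaced. A sketch with a single stand-in `sk-` prefix rule — the real scanner matches many secret formats and preserves whitespace exactly, whereas this toy version normalizes it:

```rust
use std::borrow::Cow;

/// Sketch of the Cow-based fast path. The `sk-` prefix stands in for
/// the real pattern set (API keys, tokens, private keys, …).
fn redact_secrets(input: &str) -> Cow<'_, str> {
    if !input.contains("sk-") {
        return Cow::Borrowed(input); // fast path: nothing to redact, no allocation
    }
    // Slow path: rebuild the string with secrets replaced.
    // (Unlike the real scanner, this collapses runs of whitespace.)
    let redacted: String = input
        .split_whitespace()
        .map(|w| if w.starts_with("sk-") { "[REDACTED]" } else { w })
        .collect::<Vec<_>>()
        .join(" ");
    Cow::Owned(redacted)
}

fn main() {
    // Borrowed variant proves no allocation happened on clean input.
    assert!(matches!(redact_secrets("no secrets here"), Cow::Borrowed(_)));
    assert_eq!(redact_secrets("key is sk-abc123"), "key is [REDACTED]");
}
```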
## [0.8.0] - 2026-02-10

### Added
- `VaultProvider` trait with pluggable secret backends, `Secret` newtype with redacted debug output, `EnvVaultProvider` for environment variable secrets (Issue #70)
- `AgeVaultProvider`: age-encrypted JSON vault backend with x25519 identity key decryption (Issue #70)
- `Config::resolve_secrets()`: async secret resolution through vault provider for API keys and tokens
- CLI vault args: `--vault <backend>`, `--vault-key <path>`, `--vault-path <path>`
- `vault-age` feature flag on `zeph-core` and root binary
- `[vault]` config section with `backend` field (default: `env`)
- `docker-compose.vault.yml` overlay for containerized age vault deployment
- `CARGO_FEATURES` build arg in `Dockerfile.dev` for optional feature flags
- `CandleProvider`: local GGUF model inference via candle ML framework with chat templates (Llama3, ChatML, Mistral, Phi3, Raw), token generation with top-k/top-p sampling, and repeat penalty (Issue #125)
- `CandleProvider` embeddings: BERT-based embedding model loaded from HuggingFace Hub with mean pooling and L2 normalization (Issue #126)
- `ModelOrchestrator`: task-aware multi-model routing with keyword-based classification (coding, creative, analysis, translation, summarization, general) and provider fallback chains (Issue #127)
- `SubProvider` enum breaking recursive type cycle between `AnyProvider` and `ModelOrchestrator`
- Device auto-detection: Metal on macOS, CUDA on Linux with GPU, CPU fallback (Issue #128)
- Feature flags: `candle`, `metal`, `cuda`, `orchestrator` on workspace and zeph-llm crate
- `CandleConfig`, `GenerationParams`, `OrchestratorConfig` in zeph-core config
- Config examples for candle and orchestrator in `config/default.toml`
- Setup guide sections for candle local inference and model orchestrator
- 15 new unit tests for orchestrator, chat templates, generation config, and loader
- Progressive skill loading: lazy body loading via `OnceLock`, on-demand resource resolution for `scripts/`, `references/`, `assets/` directories, extended frontmatter (`compatibility`, `license`, `metadata`, `allowed-tools`), skill name validation per agentskills.io spec (Issue #115)
- `SkillMeta`/`Skill` composition pattern: metadata loaded at startup, body deferred until skill activation
- `SkillRegistry` replaces `Vec<Skill>` in Agent — lazy body access via `get_skill()`/`get_body()`
- `resource.rs` module: `discover_resources()` + `load_resource()` with path traversal protection via canonicalization
- Self-learning skill evolution system: automatic skill improvement through failure detection, self-reflection retry, and LLM-generated version updates (Issue #107)
- `SkillOutcome` enum and `SkillMetrics` for skill execution outcome tracking (Issue #108)
- Agent self-reflection retry on tool failure with 1-retry-per-message budget (Issue #109)
- Skill version generation and storage in SQLite with auto-activate and manual approval modes (Issue #110)
- Automatic rollback when skill version success rate drops below threshold (Issue #111)
- `/skill stats`, `/skill versions`, `/skill activate`, `/skill approve`, `/skill reset` commands for version management (Issue #111)
- `/feedback` command for explicit user feedback on skill quality (Issue #112)
- `LearningConfig` with TOML config section `[skills.learning]` and env var overrides
- `self-learning` feature flag on `zeph-skills`, `zeph-core`, and root binary
- SQLite migration 005: `skill_versions` and `skill_outcomes` tables
- Bundled `setup-guide` skill with configuration reference for all env vars, TOML keys, and operating modes
- Bundled `skill-audit` skill for spec compliance and security review of installed skills
- `allowed_commands` shell config to override default blocklist entries via `ZEPH_TOOLS_SHELL_ALLOWED_COMMANDS`
- `QdrantSkillMatcher`: persistent skill embeddings in Qdrant with BLAKE3 content-hash delta sync (Issue #104)
- `SkillMatcherBackend` enum dispatching between `InMemory` and `Qdrant` skill matching (Issue #105)
- `qdrant` feature flag on `zeph-skills` crate gating all Qdrant dependencies
- Graceful fallback to in-memory matcher when Qdrant is unavailable
- Skill matching tracing via `tracing::debug!` for diagnostics
- New `zeph-mcp` crate: MCP client via rmcp 0.14 with stdio transport (Issue #117)
- `McpClient` and `McpManager` for multi-server lifecycle management with concurrent connections
- `McpToolExecutor` implementing `ToolExecutor` for `` ```mcp `` block execution (Issue #120)
- `McpToolRegistry`: MCP tool embeddings in Qdrant `zeph_mcp_tools` collection with BLAKE3 delta sync (Issue #118)
- Unified matching: skills + MCP tools injected into system prompt by relevance (Issue #119)
- `mcp` feature flag on root binary and zeph-core gating all MCP functionality
- Bundled `mcp-generate` skill with instructions for MCP-to-skill generation via mcp-execution (Issue #121)
- `[[mcp.servers]]` TOML config section for MCP server connections
### Changed
- `Skill` struct refactored: split into `SkillMeta` (lightweight metadata) + `Skill` (meta + body), composition pattern
- `SkillRegistry` now uses `OnceLock<String>` for lazy body caching instead of eager loading
- Matcher APIs accept `&[&SkillMeta]` instead of `&[Skill]` — embeddings use description only
- `Agent` stores `SkillRegistry` directly instead of `Vec<Skill>`
- `Agent` field `matcher` type changed from `Option<SkillMatcher>` to `Option<SkillMatcherBackend>`
- Skill matcher creation extracted to `create_skill_matcher()` in `main.rs`
### Dependencies
- Added `age` 0.11.2 to workspace (optional, behind `vault-age` feature, `default-features = false`)
- Added `candle-core` 0.9, `candle-nn` 0.9, `candle-transformers` 0.9 to workspace (optional, behind `candle` feature)
- Added `hf-hub` 0.4 to workspace (HuggingFace model downloads with rustls-tls)
- Added `tokenizers` 0.22 to workspace (BPE tokenization with fancy-regex)
- Added `blake3` 1.8 to workspace
- Added `rmcp` 0.14 to workspace (MCP protocol SDK)
## [0.7.1] - 2026-02-09

### Added
- `WebScrapeExecutor`: safe HTML scraping via scrape-core with CSS selectors, SSRF protection, and HTTPS-only enforcement (Issue #57)
- `CompositeExecutor<A, B>`: generic executor chaining with first-match-wins dispatch
- Bundled `web-scrape` skill with CSS selector examples for structured data extraction
- `extract_fenced_blocks()` shared utility for fenced code block parsing (DRY refactor)
- `[tools.scrape]` config section with timeout and max body size settings
### Changed
- Agent tool output label from `[shell output]` to `[tool output]`
- `ShellExecutor` block extraction now uses shared `extract_fenced_blocks()`
## [0.7.0] - 2026-02-08

### Added
- A2A Server: axum-based HTTP server with JSON-RPC 2.0 routing for `message/send`, `tasks/get`, `tasks/cancel` (Issue #83)
- In-memory `TaskManager` with full task lifecycle: create, get, update status, add artifacts, append history, cancel (Issue #83)
- SSE streaming endpoint (`/a2a/stream`) with JSON-RPC response envelope wrapping per A2A spec (Issue #84)
- Bearer token authentication middleware with constant-time comparison via `subtle::ConstantTimeEq` (Issue #85)
- Per-IP rate limiting middleware with configurable 60-second sliding window (Issue #85)
- Request body size limit (1 MiB) via `tower-http::limit::RequestBodyLimitLayer` (Issue #85)
- `A2aServerConfig` with env var overrides: `ZEPH_A2A_ENABLED`, `ZEPH_A2A_HOST`, `ZEPH_A2A_PORT`, `ZEPH_A2A_PUBLIC_URL`, `ZEPH_A2A_AUTH_TOKEN`, `ZEPH_A2A_RATE_LIMIT`
- Agent card served at `/.well-known/agent.json` (public, no auth required)
- Graceful shutdown integration via tokio watch channel
- Server module gated behind `server` feature flag on `zeph-a2a` crate
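The auth middleware above compares bearer tokens in constant time so response timing leaks nothing about where the first mismatching byte sits. A hand-rolled XOR-fold illustrates the guarantee `subtle::ConstantTimeEq` provides (the real middleware uses that crate, not this sketch):

```rust
/// Constant-time byte comparison: XOR-accumulate every byte pair so the
/// running time does not depend on where the first mismatch occurs.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // length is not secret here
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // no early exit on mismatch
    }
    diff == 0
}

fn main() {
    assert!(constant_time_eq(b"secret-token", b"secret-token"));
    assert!(!constant_time_eq(b"secret-token", b"Secret-token"));
    assert!(!constant_time_eq(b"short", b"longer-token"));
}
```

A naive `a == b` short-circuits at the first differing byte, which lets an attacker recover a token byte-by-byte from response latency; the XOR-fold visits every byte regardless.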
### Changed
- `Part` type refactored from flat struct to tagged enum with `kind` discriminator (`text`, `file`, `data`) per A2A spec
- `TaskState::Pending` renamed to `TaskState::Submitted` with explicit per-variant `#[serde(rename)]` for kebab-case wire format
- Added `AuthRequired` and `Unknown` variants to `TaskState`
- `TaskStatusUpdateEvent` and `TaskArtifactUpdateEvent` gained `kind` field (`status-update`, `artifact-update`)
## [0.6.0] - 2026-02-08

### Added
- New `zeph-a2a` crate: A2A protocol implementation for agent-to-agent communication (Issue #78)
- A2A protocol types: `Task`, `TaskState`, `TaskStatus`, `Message`, `Part`, `Artifact`, `AgentCard`, `AgentSkill`, `AgentCapabilities` with full serde camelCase serialization (Issue #79)
- JSON-RPC 2.0 envelope types (`JsonRpcRequest`, `JsonRpcResponse`, `JsonRpcError`) with method constants for A2A operations (Issue #79)
- `AgentCardBuilder` for constructing A2A agent cards from runtime config and skills (Issue #79)
- `AgentRegistry` with well-known URI discovery (`/.well-known/agent.json`), TTL-based caching, and manual registration (Issue #80)
- `A2aClient` with `send_message`, `stream_message` (SSE), `get_task`, `cancel_task` via JSON-RPC 2.0 (Issue #81)
- Bearer token authentication support for all A2A client operations (Issue #81)
- SSE streaming via `eventsource-stream` with `TaskEvent` enum (`StatusUpdate`, `ArtifactUpdate`) (Issue #81)
- `A2aError` enum with variants for HTTP, JSON, JSON-RPC, discovery, and stream errors (Issue #79)
- Optional `a2a` feature flag (enabled by default) to gate A2A functionality
- 42 new unit tests for protocol types, JSON-RPC envelopes, agent card builder, discovery registry, and client operations
## [0.5.0] - 2026-02-08

### Added
- Embedding-based skill matcher: `SkillMatcher` with cosine similarity selects top-K relevant skills per query instead of injecting all skills into the system prompt (Issue #75)
- `max_active_skills` config field (default: 5) with `ZEPH_SKILLS_MAX_ACTIVE` env var override
- Skill hot-reload: filesystem watcher via `notify-debouncer-mini` detects SKILL.md changes and re-embeds without restart (Issue #76)
- Skill priority: earlier paths in `skills.paths` take precedence when skills share the same name (Issue #76)
- `SkillRegistry::reload()` and `SkillRegistry::into_skills()` methods
- SQLite `skill_usage` table tracking per-skill invocation counts and last-used timestamps (Issue #77)
- `/skills` command displaying available skills with usage statistics
- Three new bundled skills: `git`, `docker`, `api-request` (Issue #77)
- 17 new unit tests for matcher, registry priority, reload, and usage tracking
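The matcher entry above selects the top-K skills by cosine similarity between the query embedding and each skill's description embedding. A sketch of that selection step with toy 3-dimensional vectors — function names are illustrative, and real embeddings come from the provider's embedding model:

```rust
/// Cosine similarity between two embedding vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Rank skills by similarity to the query embedding and keep the top-K,
/// so only the K most relevant skill bodies reach the system prompt.
fn top_k<'a>(query: &[f32], skills: &[(&'a str, Vec<f32>)], k: usize) -> Vec<&'a str> {
    let mut scored: Vec<_> = skills
        .iter()
        .map(|(name, emb)| (*name, cosine(query, emb)))
        .collect();
    // Descending by score; total_cmp handles f32 ordering safely.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(n, _)| n).collect()
}

fn main() {
    let skills = vec![
        ("git", vec![1.0, 0.0, 0.0]),
        ("docker", vec![0.0, 1.0, 0.0]),
        ("api-request", vec![0.7, 0.7, 0.0]),
    ];
    let query = vec![0.9, 0.1, 0.0];
    assert_eq!(top_k(&query, &skills, 2), vec!["git", "api-request"]);
}
```

Capping at `max_active_skills` (default 5) keeps prompt size bounded no matter how many skills are installed.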
### Changed
- `Agent::new()` signature: accepts `Vec<Skill>`, `Option<SkillMatcher>`, `max_active_skills` instead of pre-formatted skills prompt string
- `format_skills_prompt` now generic over `Borrow<Skill>` to accept both `&[Skill]` and `&[&Skill]`
- `Skill` struct derives `Clone`
- `Agent` generic constraint: `P: LlmProvider + Clone + 'static` (required for embed_fn closures)
- System prompt rebuilt dynamically per user query with only matched skills
### Dependencies
- Added `notify` 8.0, `notify-debouncer-mini` 0.6
- `zeph-core` now depends on `zeph-skills`
- `zeph-skills` now depends on `tokio` (sync, rt) and `notify`
## [0.4.3] - 2026-02-08

### Fixed
- Telegram “Bad Request: text must be non-empty” error when LLM returns whitespace-only content. Added `is_empty()` guard after `markdown_to_telegram` conversion in both `send()` and `send_or_edit()` (Issue #73)
### Added
- `Dockerfile.dev`: multi-stage build from source with cargo registry/build cache layers for fast rebuilds
- `docker-compose.dev.yml`: full dev stack (Qdrant + Zeph) with debug tracing (`RUST_LOG`, `RUST_BACKTRACE=1`), uses host Ollama via `host.docker.internal`
- `docker-compose.deps.yml`: Qdrant-only compose for native zeph execution on macOS
## [0.4.2] - 2026-02-08

### Fixed
- Telegram MarkdownV2 parsing errors (Issue #69). Replaced manual character-by-character escaping with AST-based event-driven rendering using pulldown-cmark 0.13.0
- UTF-8 safe text chunking for messages exceeding Telegram’s 4096-byte limit. Uses `str::is_char_boundary()` with newline preference to prevent splitting multi-byte characters (emoji, CJK)
- Link URL over-escaping. Dedicated `escape_url()` method only escapes `)` and `\` per Telegram MarkdownV2 spec, fixing broken URLs like `https://example\.com`
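The UTF-8 chunking fix above can be sketched as: back a proposed byte cut off to a char boundary, then prefer the last newline inside the window. The function name is illustrative and a small max size stands in for Telegram's 4096-byte limit:

```rust
/// Split `text` into chunks of at most `max_bytes`, never cutting a
/// multi-byte character; prefer breaking at the last newline inside
/// the window when one exists.
fn chunk_utf8(text: &str, max_bytes: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut rest = text;
    while rest.len() > max_bytes {
        // Back off to a char boundary so split_at cannot panic…
        let mut cut = max_bytes;
        while !rest.is_char_boundary(cut) {
            cut -= 1;
        }
        // …then prefer the last newline inside the window, if any.
        if let Some(nl) = rest[..cut].rfind('\n') {
            cut = nl + 1;
        }
        let (head, tail) = rest.split_at(cut);
        chunks.push(head);
        rest = tail;
    }
    chunks.push(rest);
    chunks
}

fn main() {
    // "é" is 2 bytes in UTF-8; a naive byte split at 3 would panic.
    assert_eq!(chunk_utf8("ééé", 3), vec!["é", "é", "é"]);
    // A newline inside the window wins over the raw byte cut.
    assert_eq!(chunk_utf8("ab\ncdef", 5), vec!["ab\n", "cdef"]);
}
```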
### Added
- `TelegramRenderer` state machine for context-aware escaping: 19 special characters in text, only `\` and `` ` `` in code blocks
- Markdown formatting support: bold, italic, strikethrough, headers, code blocks, links, lists, blockquotes
- Comprehensive benchmark suite with criterion: 7 scenario groups measuring latency (2.83µs for 500 chars) and throughput (121-970 MiB/s)
- Memory profiling test to measure escaping overhead (3-20% depending on content)
- 30 markdown unit tests covering formatting, escaping, edge cases, and UTF-8 chunking (99.32% line coverage)
### Changed
- `crates/zeph-channels/src/markdown.rs`: complete rewrite with pulldown-cmark event-driven parser (449 lines)
- `crates/zeph-channels/src/telegram.rs`: removed `has_unclosed_code_block()` pre-flight check (no longer needed with AST parsing), integrated UTF-8 safe chunking
- Dependencies: added pulldown-cmark 0.13.0 (MIT) and criterion 0.8.0 (Apache-2.0/MIT) for benchmarking
## [0.4.1] - 2026-02-08

### Fixed
- Auto-create Qdrant collection on first use. Previously, the `zeph_conversations` collection had to be created manually with curl. Now `ensure_collection()` is called automatically before all Qdrant operations (remember, recall, summarize), initializing the collection with the correct vector dimensions (896 for qwen3-embedding) and Cosine distance metric on first access, similar to SQL migrations.
### Changed
- Docker Compose: added environment variables for semantic memory configuration (`ZEPH_MEMORY_SEMANTIC_ENABLED`, `ZEPH_MEMORY_SEMANTIC_RECALL_LIMIT`) and Qdrant URL override (`ZEPH_QDRANT_URL`) to enable full semantic memory stack via `.env` file
## [0.4.0] - 2026-02-08

### Added

#### M9 Phase 3: Conversation Summarization and Context Budget (Issue #62)
- New `SemanticMemory::summarize()` method for LLM-based conversation compression
- Automatic summarization triggered when message count exceeds threshold
- SQLite migration `003_summaries.sql` creates dedicated summaries table with CASCADE constraints
- `SqliteStore::save_summary()` stores summary with metadata (first/last message IDs, token estimate)
- `SqliteStore::load_summaries()` retrieves all summaries for a conversation ordered by ID
- `SqliteStore::load_messages_range()` fetches messages after specific ID with limit for batch processing
- `SqliteStore::count_messages()` counts total messages in conversation
- `SqliteStore::latest_summary_last_message_id()` gets last summarized message ID for resumption
- `ContextBudget` struct for proportional token allocation (15% summaries, 25% semantic recall, 60% recent history)
- `estimate_tokens()` helper using chars/4 heuristic (100x faster than tiktoken, ±25% accuracy)
- `Agent::check_summarization()` lazy trigger after `persist_message()` when threshold exceeded
- Batch size = threshold/2 to balance summary quality with LLM call frequency
- Configuration: `memory.summarization_threshold` (default: 100), `memory.context_budget_tokens` (default: 0 = unlimited)
- Environment overrides: `ZEPH_MEMORY_SUMMARIZATION_THRESHOLD`, `ZEPH_MEMORY_CONTEXT_BUDGET_TOKENS`
- Inline comments in `config/default.toml` documenting all configuration parameters
- 26 new unit tests for summarization and context budget (196 total tests, 75.31% coverage)
- Architecture Decision Records ADR-016 through ADR-019 for summarization design
- Foreign key constraint added to `messages.conversation_id` with ON DELETE CASCADE
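The chars/4 heuristic and the proportional split are simple enough to sketch. The field names below mirror the bullets above, but the exact struct layout is an assumption:

```rust
/// Rough token count: ~4 characters per token for English-leaning text.
/// Roughly ±25% accuracy, but far cheaper than running a real tokenizer.
fn estimate_tokens(text: &str) -> usize {
    text.chars().count() / 4
}

/// Hypothetical layout mirroring the 15% / 25% / 60% allocation
/// described above (summaries / semantic recall / recent history).
struct ContextBudget {
    summaries: usize,
    semantic_recall: usize,
    recent_history: usize,
}

impl ContextBudget {
    fn new(total_tokens: usize) -> Self {
        Self {
            summaries: total_tokens * 15 / 100,
            semantic_recall: total_tokens * 25 / 100,
            recent_history: total_tokens * 60 / 100,
        }
    }
}
```

With a 1000-token budget this yields 150 / 250 / 600 tokens for the three slices; a budget of 0 would disable the limit, matching the `context_budget_tokens` default.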
#### M9 Phase 2: Semantic Memory Integration (Issue #61)
- `SemanticMemory<P: LlmProvider>` orchestrator coordinating SQLite, Qdrant, and LlmProvider
- `SemanticMemory::remember()` saves message to SQLite, generates embedding, stores in Qdrant
- `SemanticMemory::recall()` performs semantic search with query embedding and fetches messages from SQLite
- `SemanticMemory::has_embedding()` checks if message already embedded to prevent duplicates
- `SemanticMemory::embed_missing()` background task to embed old messages (with LIMIT parameter)
- `Agent<P, C, T>` now generic over LlmProvider to support SemanticMemory
- `Agent::with_memory()` replaces SqliteStore with SemanticMemory
- Graceful degradation: embedding failures logged but don't block message save
- Qdrant connection failures silently downgrade to SQLite-only mode (no semantic recall)
- Generic provider pattern: `SemanticMemory<P: LlmProvider>` instead of `Arc<dyn LlmProvider>` for Edition 2024 async trait compatibility
- `AnyProvider`, `OllamaProvider`, `ClaudeProvider` now derive/implement `Clone` for semantic memory integration
- Integration test updated for SemanticMemory API (`with_memory` now takes 5 parameters including recall_limit)
- Semantic memory config: `memory.semantic.enabled`, `memory.semantic.recall_limit` (default: 5)
- 18 new tests for semantic memory orchestration (recall, remember, embed_missing, graceful degradation)
#### M9 Phase 1: Qdrant Integration (Issue #60)
- New `QdrantStore` module in zeph-memory for vector storage and similarity search
- `QdrantStore::store()` persists embeddings to Qdrant and tracks metadata in SQLite
- `QdrantStore::search()` performs cosine similarity search with filtering by conversation_id and role
- `QdrantStore::has_embedding()` checks if message has associated embedding
- `QdrantStore::ensure_collection()` idempotently creates Qdrant collection with 768-dimensional vectors
- SQLite migration `002_embeddings_metadata.sql` for embedding metadata tracking
- `embeddings_metadata` table with foreign key constraint to messages (ON DELETE CASCADE)
- PRAGMA foreign_keys enabled in SqliteStore via SqliteConnectOptions
- `SearchFilter` and `SearchResult` types for flexible query construction
- `MemoryConfig.qdrant_url` field with `ZEPH_QDRANT_URL` environment variable override (default: http://localhost:6334)
- Docker Compose Qdrant service (qdrant/qdrant:v1.13.6) on ports 6333/6334 with persistent storage
- Integration tests for Qdrant operations (ignored by default, require running Qdrant instance)
- Unit tests for SQLite metadata operations with 98% coverage
- 12 new tests total (3 unit + 2 integration for QdrantStore, 1 CASCADE DELETE test for SqliteStore, 3 config tests)
#### M8: Embeddings support (Issue #54)
- `LlmProvider` trait extended with `embed(&str) -> Result<Vec<f32>>` for generating text embeddings
- `LlmProvider` trait extended with `supports_embeddings() -> bool` for capability detection
- `OllamaProvider` implements embeddings via ollama-rs `generate_embeddings()` API
- Default embedding model: `qwen3-embedding` (configurable via `llm.embedding_model`)
- `ZEPH_LLM_EMBEDDING_MODEL` environment variable for runtime override
- `ClaudeProvider::embed()` returns descriptive error (Claude API does not support embeddings)
- `AnyProvider` delegates embedding methods to active provider
- 10 new tests: unit tests for all providers, config tests for defaults/parsing/env override
- Integration test for real Ollama embedding generation (ignored by default)
- README documentation: model compatibility notes and `ollama pull` instructions for both LLM and embedding models
- Docker Compose configuration: added `ZEPH_LLM_EMBEDDING_MODEL` environment variable
### Changed
**BREAKING CHANGES (pre-1.0.0):**

- `SqliteStore::save_message()` now returns `Result<i64>` instead of `Result<()>` to enable embedding workflow
- `SqliteStore::new()` uses `sqlx::migrate!()` macro instead of INIT_SQL constant for proper migration management
- `QdrantStore::store()` requires `model: &str` parameter for multi-model support
- Config constant `LLM_ENV_KEYS` renamed to `ENV_KEYS` to reflect inclusion of non-LLM variables
Migration:
```rust
// Before:
let _ = store.save_message(conv_id, "user", "hello").await?;

// After:
let message_id = store.save_message(conv_id, "user", "hello").await?;
```
- `OllamaProvider::new()` now accepts `embedding_model` parameter (breaking change, pre-v1.0)
- Config schema: added `llm.embedding_model` field with serde default for backward compatibility
## [0.3.0] - 2026-02-07
### Added
#### M7 Phase 1: Tool Execution Framework - zeph-tools crate (Issue #39)
- New `zeph-tools` leaf crate for tool execution abstraction following ADR-014
- `ToolExecutor` trait with native async (Edition 2024 RPITIT): accepts full LLM response, returns `Option<ToolOutput>`
- `ShellExecutor` implementation with bash block parser and execution (30s timeout via `tokio::time::timeout`)
- `ToolOutput` struct with summary string and blocks_executed count
- `ToolError` enum with Blocked/Timeout/Execution variants (thiserror)
- `ToolsConfig` and `ShellConfig` configuration types with serde Deserialize and sensible defaults
- Workspace version consolidation: `version.workspace = true` across all crates
- Workspace inter-crate dependency references: `zeph-llm.workspace = true` pattern for all internal dependencies
- 22 unit tests with 99.25% line coverage, zero clippy warnings
- ADR-014: zeph-tools crate design rationale and architecture decisions
#### M7 Phase 2: Command safety (Issue #40)
- DEFAULT_BLOCKED patterns: 12 dangerous commands (rm -rf /, sudo, mkfs, dd if=, curl, wget, nc, ncat, netcat, shutdown, reboot, halt)
- Case-insensitive command filtering via to_lowercase() normalization
- Configurable timeout and blocked_commands in TOML via `[tools.shell]` section
- Custom blocked commands additive to defaults (cannot weaken security)
- 35+ comprehensive unit tests covering exact match, prefix match, multiline, case variations
- ToolsConfig integration with core Config struct
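The filtering described above amounts to case-insensitive matching of each script line against a pattern list. A standalone sketch under stated assumptions: the pattern list is abbreviated (the real DEFAULT_BLOCKED has 12 entries), and matching on the line prefix rather than substring is an illustrative choice:

```rust
// Abbreviated stand-in for the DEFAULT_BLOCKED pattern list.
const BLOCKED: &[&str] = &["rm -rf /", "sudo", "mkfs", "dd if=", "curl", "wget", "shutdown"];

/// Return true if any line of the script begins with a blocked pattern.
/// Lowercase normalization makes the check case-insensitive, and the
/// per-line scan covers multiline scripts.
fn is_blocked(script: &str) -> bool {
    script.lines().any(|line| {
        let line = line.trim().to_lowercase();
        BLOCKED.iter().any(|p| line.starts_with(p))
    })
}
```

Because custom patterns are merged into (not substituted for) the defaults, a user config can only tighten this check, never weaken it.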
#### M7 Phase 3: Agent integration (Issue #41)
- Agent now uses `ShellExecutor` for all bash command execution with safety checks
- SEC-001 CRITICAL vulnerability fixed: unfiltered bash execution removed from agent.rs
- Removed 66 lines of duplicate code (extract_bash_blocks, execute_bash, extract_and_execute_bash)
- ToolError::Blocked properly handled with user-facing error message
- Four integration tests for blocked command behavior and error handling
- Performance validation: < 1% overhead for tool executor abstraction
- Security audit: all acceptance criteria met, zero vulnerabilities
### Security
- CRITICAL fix for SEC-001: Shell commands now filtered through ShellExecutor with DEFAULT_BLOCKED patterns (rm -rf /, sudo, mkfs, dd if=, curl, wget, nc, shutdown, reboot, halt). Resolves command injection vulnerability where agent.rs bypassed all security checks via inline bash execution.
### Fixed
- Shell command timeout now respects `config.tools.shell.timeout` (was hardcoded 30s in agent.rs)
- Removed duplicate bash parsing logic from agent.rs (now centralized in zeph-tools)
- Error message pattern leakage: blocked commands now show generic security policy message instead of leaking exact blocked pattern
### Changed
**BREAKING CHANGES (pre-1.0.0):**

- `Agent::new()` signature changed: now requires `tool_executor: T` as 4th parameter where `T: ToolExecutor`
- `Agent` struct now generic over three types: `Agent<P, C, T>` (provider, channel, tool_executor)
- Workspace `Cargo.toml` now defines `version = "0.3.0"` in `[workspace.package]` section
- All crate manifests use `version.workspace = true` instead of explicit versions
- Inter-crate dependencies now reference workspace definitions (e.g., `zeph-llm.workspace = true`)
Migration:
```rust
// Before:
let agent = Agent::new(provider, channel, &skills_prompt);

// After:
use zeph_tools::shell::ShellExecutor;

let executor = ShellExecutor::new(&config.tools.shell);
let agent = Agent::new(provider, channel, &skills_prompt, executor);
```
## [0.2.0] - 2026-02-06
### Added
#### M6 Phase 1: Streaming trait extension (Issue #35)
- `LlmProvider::chat_stream()` method returning `Pin<Box<dyn Stream<Item = Result<String>> + Send>>`
- `LlmProvider::supports_streaming()` capability query method
- `Channel::send_chunk()` method for incremental response delivery
- `Channel::flush_chunks()` method for buffered chunk flushing
- `ChatStream` type alias for `Pin<Box<dyn Stream<Item = anyhow::Result<String>> + Send>>`
- Streaming infrastructure in zeph-llm and zeph-core (dependencies: futures-core 0.3, tokio-stream 0.1)
#### M6 Phase 2: Ollama streaming backend (Issue #36)
- Native token-by-token streaming for `OllamaProvider` using `ollama-rs` streaming API
- `OllamaProvider::chat_stream()` implementation via `send_chat_messages_stream()`
- `OllamaProvider::supports_streaming()` now returns `true`
- Stream mapping from `Result<ChatMessageResponse, ()>` to `Result<String, anyhow::Error>`
- Integration tests for streaming happy path and equivalence with non-streaming `chat()` (ignored by default)
- ollama-rs `"stream"` feature enabled in workspace dependencies
#### M6 Phase 3: Claude SSE streaming backend (Issue #37)
- Native token-by-token streaming for `ClaudeProvider` using Anthropic Messages API with Server-Sent Events
- `ClaudeProvider::chat_stream()` implementation via SSE event parsing
- `ClaudeProvider::supports_streaming()` now returns `true`
- SSE event parsing via `eventsource-stream` 0.2.3 library
- Stream pipeline: `bytes_stream() -> eventsource() -> filter_map(parse_sse_event) -> Box::pin()`
- Handles SSE events: `content_block_delta` (text extraction), `error` (mid-stream errors), metadata events (skipped)
- Integration tests for streaming happy path and equivalence with non-streaming `chat()` (ignored by default)
- eventsource-stream dependency added to workspace dependencies
- reqwest `"stream"` feature enabled for `bytes_stream()` support
#### M6 Phase 4: Agent streaming integration (Issue #38)
- Agent automatically uses streaming when `provider.supports_streaming()` returns true (ADR-014)
- `Agent::process_response_streaming()` method for stream consumption and chunk accumulation
- CliChannel immediate streaming: `send_chunk()` prints each chunk instantly via `print!()` + `flush()`
- TelegramChannel batched streaming: debounce at 1 second OR 512 bytes, edit-in-place for progressive updates
- Response buffer pre-allocation with `String::with_capacity(2048)` for performance
- Error message sanitization: full errors logged via `tracing::error!()`, generic messages shown to users
- Telegram edit retry logic: recovers from stale message_id (message deleted, permissions lost)
- tokio-stream dependency added for `StreamExt` trait
- 6 new unit tests for channel streaming behavior
### Fixed
#### M6 Phase 3: Security improvements
- Manual `Debug` implementation for `ClaudeProvider` to prevent API key leakage in debug output
- Error message sanitization: full Claude API errors logged via `tracing::error!()`, generic messages returned to users
### Changed
**BREAKING CHANGES (pre-1.0.0):**

- `LlmProvider` trait now requires `chat_stream()` and `supports_streaming()` implementations (no default implementations per project policy)
- `Channel` trait now requires `send_chunk()` and `flush_chunks()` implementations (no default implementations per project policy)
- All existing providers (`OllamaProvider`, `ClaudeProvider`) updated with fallback implementations (Phase 1 non-streaming: calls `chat()` and wraps in single-item stream)
- All existing channels (`CliChannel`, `TelegramChannel`) updated with no-op implementations (Phase 1: streaming not yet wired into agent loop)
## [0.1.0] - 2026-02-05
### Added
#### M0: Workspace bootstrap
- Cargo workspace with 5 crates: zeph-core, zeph-llm, zeph-skills, zeph-memory, zeph-channels
- Binary entry point with version display
- Default configuration file
- Workspace-level dependency management and lints
#### M1: LLM + CLI agent loop
- LlmProvider trait with Message/Role types
- Ollama backend using ollama-rs
- Config loading from TOML with env var overrides
- Interactive CLI agent loop with multi-turn conversation
#### M2: Skills system
- SKILL.md parser with YAML frontmatter and markdown body (zeph-skills)
- Skill registry that scans directories for `*/SKILL.md` files
- Prompt formatter with XML-like skill injection into system prompt
- Bundled skills: web-search, file-ops, system-info
- Shell execution: agent extracts `bash` blocks from LLM responses and runs them
- Multi-step execution loop with 3-iteration limit
- 30-second timeout on shell commands
- Context builder that combines base system prompt with skill instructions
#### M3: Memory + Claude
- SQLite conversation persistence with sqlx (zeph-memory)
- Conversation history loading and message saving per session
- Claude backend via Anthropic Messages API with 429 retry (zeph-llm)
- AnyProvider enum dispatch for runtime provider selection
- CloudLlmConfig for Claude-specific settings (model, max_tokens)
- ZEPH_CLAUDE_API_KEY env var for API authentication
- ZEPH_SQLITE_PATH env var override for database location
- Provider factory in main.rs selecting Ollama or Claude from config
- Memory integration into Agent with optional SqliteStore
#### M4: Telegram channel
- Channel trait abstraction for agent I/O (recv, send, send_typing)
- CliChannel implementation reading stdin/stdout via tokio::task::spawn_blocking
- TelegramChannel adapter using teloxide with mpsc-based message routing
- Telegram user whitelist via `telegram.allowed_users` config
- ZEPH_TELEGRAM_TOKEN env var for Telegram bot activation
- Bot commands: /start (welcome), /reset, /skills forwarded as ChannelMessage
- AnyChannel enum dispatch for runtime channel selection
- zeph-channels crate with teloxide 0.17 dependency
- TelegramConfig in config.rs with TOML and env var support
#### M5: Integration tests + release
- Integration test suite: config, skills, memory, and agent end-to-end
- MockProvider and MockChannel for agent testing without external dependencies
- Graceful shutdown via tokio::sync::watch + tokio::signal (SIGINT/SIGTERM)
- Ollama startup health check (warn-only, non-blocking)
- README with installation, configuration, usage, and skills documentation
- GitHub Actions CI/CD: lint, clippy, test (ubuntu + macos), coverage, security, release
- Dependabot for Cargo and GitHub Actions with auto-merge for patch/minor updates
- Auto-labeler workflow for PRs by path, title prefix, and size
- Release workflow with cross-platform binary builds and checksums
- Issue templates (bug report, feature request)
- PR template with review checklist
- LICENSE (MIT), CONTRIBUTING.md, SECURITY.md
### Fixed
- Replace vulnerable `serde_yml`/`libyml` with manual frontmatter parser (GHSA high + medium)
### Changed
- Move dependency features from workspace root to individual crate manifests
- Update README with badges, architecture overview, and pre-built binaries section
- Agent is now generic over both LlmProvider and Channel (`Agent<P, C>`)
- Agent::new() accepts a Channel parameter instead of reading stdin directly
- Agent::run() uses channel.recv()/send() instead of direct I/O
- Agent calls channel.send_typing() before each LLM request
- Agent::run() uses tokio::select! to race channel messages against shutdown signal