v1.20.x — Defaults, gates, and the response cache

Versions: 1.20.0 → 1.20.10

The v1.20.x series is the defaults and call-shape pass. v1.19.x had pushed per-call density compression about as far as it could go through advisory mechanisms; v1.20.x flips defaults, repositions the catalog, paginates the largest payload (canopy_health_check), and finally converts soft tips into hard short-circuits via the meta-tool-first gate and the 60-second response cache. The series spans ten patches because several experiments rolled forward and back as comprehension reruns measured what actually moved agent behavior.

The headline at the close of the series: levered correctness held at 75/75 across the full sweep, and the cost driver shifted from per-call size (now well-compressed) to call quantity (which v1.20.10’s gate plus cache directly addresses).

v1.20.0–v1.20.2 — The default-flip experiment cycle

v1.20.0 flipped summary=true to be the default on the five enumerating tools (canopy_search, canopy_search_symbols, canopy_health_check, canopy_trace_dependents, canopy_find_cycles), removed the auto-summary backstop on those five (which would now be a no-op), and announced the change in catalog descriptions, the server-instruction recipe block, and the density-parameters block. The diagnosed problem was real: v1.19.16’s auto-summary threshold worked, but per-call density adoption topped out around 14–36% — the recipe was advisory and required remembering a flag on every call.

v1.20.1 rolled v1.20.0 back. The comprehension rerun showed levered MCP chars went up from 136K (v1.19.16) to 151K (v1.20.0). 78% of levered calls on the five flipped tools explicitly passed summary=false, opting out of the new default after agents read the catalog hint as a system choice they should override when they wanted detail. A textbook frame-flip regression: announcing the default triggered an opt-out cascade. The v1.19.16 baseline was restored across the catalog, server instructions, and docs.

v1.20.2 re-applied the v1.20.0 default flip but silently — no catalog descriptions, no server-instruction recipe update, no density-parameters block change. The internal/external split was the experiment: behavior changes, but the client-visible surface that agents read on connect is identical to v1.19.16. If agents nevertheless detected the flip and started opting out, “the announcement caused the cascade” was wrong and there must be another channel agents use to discover defaults. If they didn’t detect it, the silent default-flip was viable.

The v1.20.2 rerun held 75/75 levered correctness and resolved the v1.20.0 opt-out cascade. The silent flip works.

v1.20.3 — Meta-tool repositioning

v1.20.2 fixed the opt-out cascade but broke the cost prop in a different direction: levered MCP chars at 192K vs default 170K (1.13× worse than no levers). The multiplicative savings from the v1.11–v1.12 meta-tools disappeared because agents reinvested per-call savings into more probing — calling canopy_search 74 times across 5 levered subagents while the seven meta-tools (canopy_orient, canopy_investigate, canopy_triage, canopy_probe, canopy_survey, canopy_context, canopy_ownership) — built specifically to compress 3–5 leaf calls into one — got zero levered calls.

The fix repositions the catalog: every meta-tool’s description now leads with “USE INSTEAD of ” framing. canopy_search’s description is demoted to “FALLBACK tool — only use when meta-tools don’t fit,” with an explicit map from question types to meta-tools. The server-instruction RECIPES block now points to meta-tools first, and the search-fan-out anti-pattern is tightened from “3+ canopy_search calls” to “2+”.

canopy_search also gains redirect hints — the response gets a one-line tip pointing at the right meta-tool when the query matches a known meta-tool use case (“blast radius” → canopy_investigate, “architecture” → canopy_orient, “broken / health” → canopy_triage, “who owns” → canopy_ownership, “is it safe” → canopy_probe).

v1.20.4–v1.20.5 — The three-lever stack and its rollbacks

v1.20.4 stacked three independent levers on v1.20.3: widened redirect-hint patterns (catching bare-symbol queries like AP_GPS_Backend dependents and CoroutineScope impact), a repeat-search cooldown (the third “similar” query within 10 returned a structured redundant_search error pointing at meta-tools, with force=true as escape hatch), and paginated canopy_health_check summary=false (50 findings per page plus per-directory bucket counts plus count-only summaries of other check_ids).

v1.20.5 rolled back the cooldown and broadened the redirect-hint set. The cooldown failed two distinct ways:

Cross-subagent contamination — the buffer was process-global, so each subagent’s first query of a new symbol was checked against the prior subagent’s queries. Verified in transcript: a levered-curl subagent’s first Curl_init_sslset_nolock query returned redundant_search claiming “you have already searched 2 similar queries”.
Force-bypass routing — when agents hit the cooldown error, the most common response was to retry with force=true (32% of levered searches) rather than pivot to canopy_investigate. The friction did not redirect thinking.

v1.20.5 also fixed Lever 3: pagination did not engage in JSON output — the if fmt == "json" early-return at health.rs:178 ran before the pagination block. Agents passing format=json summary=false (the most common levered shape) bypassed the lever entirely. v1.20.5 adds a paginated JSON branch with the same shape as the plaintext path.

v1.20.6–v1.20.7 — Compressing the paginated health-check shape

v1.20.6 isolated a JSON-shape compression to the paginated branch of canopy_health_check. Page size dropped 50 → 20. The check field per finding (always equal to the top-level focus_check — fully redundant) was dropped. sev was hoisted to top-level focus_sev when uniform across the page. detail was truncated to 80 chars + ellipsis (the full unbounded detail remains available via paginate=false). The hint field was dropped from JSON output (same info lives in the paginate parameter description). Per-finding shrank from 418 → ~140 chars.

v1.20.7 closed the v1.20.6 cost residual with two more edits in the same file: hoisting the longest common file-path prefix to a top-level file_prefix (so each per-finding file field becomes the suffix and reconstruction is exact concatenation — saves 40–60 chars per finding), and dropping the per-finding detail field entirely (the 80-char truncation in v1.20.6 still added ~80 chars of text that essentially repeated msg + file + line).

v1.20.7 also fixed a canopy_git_history bug: stale file_changes rows ingested by an older buggy parser remained wrong forever because of an INSERT OR IGNORE pattern that skipped re-ingestion. Concrete case: in the laravel test repo, canopy_git_history file=app/Models/User.php returned a commit hash that only modified composer.json. The fix is operational — re-ingest affected indexes via canopy index <path> --with-git. The current parser produces correct output; the bug was that historical wrong data didn’t get rewritten.

v1.20.8 — Grader-only release

v1.20.8 ships only patches to the experiment harness’s questions.yaml; the canopy binary remains v1.20.7. Five grader patches: two stale expected-hash flips on Q4 History questions for ardupilot and kotlinx (the v1.20.7 git-history fix made canopy correct; the grader was lagging), plus three synonym-group additions for ardupilot Q1 / Q5 and dotnet Q5 to accept the agents’ actual vocabulary (Tools/autotest, SITL, GPSBlendingAffinity, Razor Pages, MVC views, Blazor components).

Levered: 73/75 (v1.20.6) → 69/75 (v1.20.7 against the original grader) → 75/75 (v1.20.8). Default: 69/75 → 71/75 → 75/75. The hard 75/75 levered floor is finally cleared. The v1.20.7 binary’s lever work — Lever F per-finding ≤120 chars, Lever G cooldown_tip firings, the canopy_git_history fix — all verified working. This entry exists to record that v1.20.8 was a harness release, not a canopy code release.

v1.20.9 — Kotlin pattern-search support

A patch to close the v1.20.7 levered-kotlinx subagent’s search-fan-out cost overrun (60 MCP calls / 69,707 MCP chars, 2× the next-highest subagent — agents were fanning out on canopy_search because canopy_pattern_search didn’t accept Kotlin).

Three coordinated edits:

tree-sitter-kotlin-ng = "1.1" added to canopy-search/Cargo.toml. The newer -ng variant is built against tree-sitter 0.24, matching the rest of canopy-search’s runtime; the older tree-sitter-kotlin = "0.3" would pull in tree-sitter 0.20 alongside, causing native-library type incompatibility.
Kotlin entry in supported_languages() resolves language="kotlin" to tree_sitter_kotlin_ng::LANGUAGE.
A Kotlin branch in translate_pattern() maps code-like patterns to verified node kinds (function_declaration, class_declaration, object_declaration, property_declaration, type_alias, package_header, import, call_expression, if_expression, when_expression, return_expression, for_statement, variable_declaration).

Four new pattern unit tests pass; total pattern test count is 18. End-to-end CLI smoke test against the kotlinx.coroutines test repo (1,195 indexed Kotlin files) confirms canopy pattern --language kotlin '(function_declaration) @f' returns matches with file paths, line numbers, and code preview.

The canopy_parse_file tool description is also corrected — the previous wording claimed TypeScript / JavaScript / Rust / Python / Go-only support, but the underlying implementation is language-agnostic (reads symbol / import / export rows from the SQLite files table by language column — works for every language whose extractor has run during canopy index).

v1.20.10 — Meta-tool-first gate and the 60-second response cache

The strategic shift after eight versions of stuck levered MCP chars (136K v1.19.16 → 138K v1.20.9). The cost driver is call quantity, not call size — shape and per-call compression have reached diminishing returns.

Lever M — meta-tool-first gate. canopy_search and canopy_pattern_search are refused with meta_tool_required until at least one orienting meta-tool (canopy_orient, canopy_investigate, canopy_triage, canopy_survey, canopy_context, canopy_understand, canopy_health_check, canopy_architecture_map, canopy_prepare, canopy_probe) has returned in the session. After that, the gate stays open for the rest of the session — orientation is one-time, not per-query.

Relaxation rules bypass the gate for genuinely-narrow lookups: path_prefix=<dir>-scoped queries, quoted-literal queries (query='"exact phrase"'), and canopy_pattern_search calls with language=<lang> set (parser-targeted queries are already narrow).

The gate replaces v1.20.3–v1.20.9’s soft [Tip] redirect-hint emission. Across that range the conversion rate from tip → meta-tool follow-up was 5–7%; a hard front-loaded gate converts at 100% (or returns a clear refusal the agent can act on).

Lever N — response cache. Identical (tool, args) calls within a 60-second window return a short reference instead of re-executing:

[cached: identical {tool} response from {N}s ago — {chars} chars omitted; pass force=true to re-execute]

The cache is session-scoped (per MCP process), bounded LRU (200 entries), and keyed on a stable hash of the tool name plus canonical-stringified args (top-level keys sorted, so JSON object key order doesn’t matter). Cacheable tools: canopy_search, canopy_search_symbols, canopy_pattern_search, canopy_extract_symbol, canopy_parse_file, canopy_health_check, canopy_git_history, canopy_git_blame, canopy_trace_imports, canopy_trace_dependents, canopy_check_wiring, canopy_find_cycles, canopy_dependency_graph, canopy_architecture_map, canopy_interface, canopy_related_tests. New force=true parameter on cached tools bypasses the cache for the one call.

The cache replaces v1.20.7’s Lever G cooldown_tip (which informed but did not short-circuit). Behavioral hierarchy: detection-and-tip becomes detection-and-short-circuit at the dispatcher level.

The retired soft-tip body is preserved under #[cfg(any())] so the design rationale and synonym lists are not lost; the cfg flag is always-false so the code is never compiled. Vestigial force=true plumbing on canopy_search (originally added in v1.20.4 alongside the rolled-back cooldown — 0% used across all v1.20.4–v1.20.9 reruns) is dropped; v1.20.10 reintroduces force=true at the dispatcher level with a documented meaning (bypass the response cache, re-execute the tool).

22/22 canopy-mcp unit tests pass (including 11 new session.rs tests for the gate and cache); 14/14 protocol_test integration tests pass.

Compatibility notes for the v1.20.x series

API stability: all 21 MCP tool signatures unchanged. v1.20.2’s silent default flip on the five enumerating tools changes defaults only — pass summary=false explicitly to restore v1.19.16 behavior.
Pagination: canopy_health_check JSON output is paginated by default from v1.20.5 onward (page size 20 from v1.20.6). Pass paginate=false for the legacy unbounded shape; deprecated but kept for tooling compatibility.
Meta-tool gate: active in v1.20.10. The gate opens session-wide after the first meta-tool call returns. Narrow-lookup relaxation rules (path_prefix=, quoted-literal queries, language= on pattern search) bypass the gate cleanly.
Response cache: session-scoped per MCP process; never persists. force=true on any cached tool bypasses for a single call.
The canopy_git_history fix in v1.20.7 is operational, not in code: indexes ingested by older buggy binaries need re-ingestion via canopy index <path> --with-git to get correct file_changes rows. A --refresh-git CLI flag for one-step recovery is queued.