v1.14.x — Language expansion and the comprehension benchmark
Versions: 1.14.0 → 1.19.16
The v1.14.x series is the language-coverage and benchmark-validation pass. It triples Canopy’s supported-language surface from the v1.12 baseline (TypeScript, JavaScript, Python, Rust, Go) by shipping three language batches (C/C++/C#, Java/Kotlin/Swift, Ruby/PHP) plus the framework, build, and config files those ecosystems live alongside. It also lands content-based disambiguation for inherently ambiguous extensions (.m, .h, .inc, .ac), a parallel CPU-embedding loop that stops Ollama sitting 75% idle and delivers roughly 4× embedding throughput, and the comprehension benchmark: the first measurement of how AI agents answer real questions on real codebases using only Canopy MCP tools.
v1.13.0 was skipped — the version cut directly from v1.12.0 to v1.14.0 to align the language-batch numbering with the public roadmap.
v1.14.0 — Q2 batch: C, C++, C#, and the build-descriptor family
Regex-based extractors throughout, holding the existing tree-sitter = "0.24" ABI-14 pin. No new grammar crates are added in this release.
- C (.c, .h): functions, forward declarations, structs, unions, enums, typedefs, #define macros. Quoted #include becomes relative imports; system <header> includes are external.
- C++ (.cpp, .cxx, .cc, .hpp, .hxx, .h++): classes, structs, unions, namespaces, templates, using aliases, member and free functions, enum class.
- C# (.cs): classes, interfaces, records, structs, enums, methods, properties, delegates, events, block and file-scoped namespaces. using directives detect System.* / Microsoft.* / NuGet.* framework prefixes.
- CMake (CMakeLists.txt, *.cmake): project, add_executable, add_library, add_custom_target, set, find_package, include.
- Makefile (Makefile, GNUmakefile, *.mk): phony targets, user targets, variable assignments, include.
- MSBuild (.csproj, .sln, .props, .targets): project root, PackageReference / ProjectReference entries, <Target> elements; .sln project lines surface as both symbols and relative imports.
- Razor / Blazor (.razor, .cshtml): @page, @using, @inherits, @implements, @inject, @code { … } block content.
Filename-based dispatch lands so that fixed-name files, most of them extensionless (Makefile, Gemfile, Rakefile, composer.json), and compound-extension files (build.gradle.kts, *.blade.php) resolve to the right extractor.
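The dispatch order described above can be sketched as follows. This is a minimal illustration, not Canopy's actual code: the `Extractor` enum, its variant names, and the `dispatch` function are hypothetical, and the three-step order (exact name, then compound suffix, then plain extension) is an assumption about how such a router is typically built.

```rust
/// Hypothetical extractor identifiers; Canopy's real enum may differ.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Extractor {
    Makefile,
    Gemfile,
    Rakefile,
    Composer,
    Gradle,
    Blade,
    Php,
    Kotlin,
    Unknown,
}

/// Resolve an extractor from a bare file name (no directory part).
fn dispatch(file_name: &str) -> Extractor {
    // 1. Exact, fixed file names (many of them have no extension at all).
    match file_name {
        "Makefile" | "GNUmakefile" => return Extractor::Makefile,
        "Gemfile" => return Extractor::Gemfile,
        "Rakefile" => return Extractor::Rakefile,
        "composer.json" => return Extractor::Composer,
        _ => {}
    }
    // 2. Compound suffixes are checked before the plain extension;
    //    otherwise `build.gradle.kts` would be routed to Kotlin.
    if file_name.ends_with(".gradle.kts") || file_name.ends_with(".gradle") {
        return Extractor::Gradle;
    }
    if file_name.ends_with(".blade.php") {
        return Extractor::Blade;
    }
    // 3. Plain extension fallback.
    match file_name.rsplit_once('.') {
        Some((_, "php")) => Extractor::Php,
        Some((_, "kt")) | Some((_, "kts")) => Extractor::Kotlin,
        _ => Extractor::Unknown,
    }
}
```

With this ordering, `dispatch("build.gradle.kts")` resolves to the Gradle extractor while `dispatch("settings.kts")` still falls through to Kotlin.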
v1.15.0 — Q3 batch: Java, Kotlin, Swift, plus Gradle / Markdown / Shell
- Java (.java): classes, interfaces, enums, records, annotations, methods, constructors, fields. package directives emit module symbols; import edges mark java.* / javax.* / jakarta.* / org.springframework.* / org.junit.* / com.google.* external.
- Kotlin (.kt, .kts): classes (including data, sealed, enum class, annotation class), object declarations, interfaces, top-level and extension functions, val / var properties, const val, typealias, package, aliased import. Android framework prefixes external.
- Swift (.swift): classes, structs, enums, protocols, actors, extensions, functions, let / var, subscripts, type aliases. SwiftUI / UIKit / Foundation imports external.
- Gradle (.gradle, .gradle.kts): plugins, dependency configurations (implementation, api, testImplementation, ksp, kapt, androidTestImplementation, …), tasks.register, well-known DSL blocks (android, dependencies, repositories, plugins, …) as module symbols. A Java repo without Gradle indexing is half-indexed.
- Markdown (.md, .mdx, .markdown): ATX headings up to H6 as module symbols, fenced code blocks as code_block_<lang>, inline and reference-style links as imports. YAML frontmatter is skipped without parsing.
- Shell (.sh, .bash, .zsh): function definitions in both forms, top-level variable assignments (export, readonly), source / . imports, shebang capture via the __shebang__ synthetic symbol.
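As an illustration of the Java import marking in the list above, the sketch below classifies a single import line against the listed prefixes. It uses plain string matching rather than a regex to stay std-only; `classify_java_import` and the constant name are hypothetical and not taken from Canopy's source.

```rust
/// Prefixes the v1.15.0 notes list as external for Java imports.
const JAVA_EXTERNAL_PREFIXES: [&str; 6] = [
    "java.",
    "javax.",
    "jakarta.",
    "org.springframework.",
    "org.junit.",
    "com.google.",
];

/// Parse one source line and return (imported path, is_external),
/// or None when the line is not an import statement.
fn classify_java_import(line: &str) -> Option<(String, bool)> {
    let rest = line.trim().strip_prefix("import ")?.trim_start();
    // `import static a.b.C.d;` still names a dotted path after `static`.
    let rest = rest.strip_prefix("static ").unwrap_or(rest);
    let path = rest.trim().trim_end_matches(';').trim();
    if path.is_empty() {
        return None;
    }
    let external = JAVA_EXTERNAL_PREFIXES
        .iter()
        .any(|prefix| path.starts_with(*prefix));
    Some((path.to_string(), external))
}
```

Because each prefix ends with a dot, a package such as `javafx.scene` is not mistaken for `java.*`.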
v1.16.0 — Q4 batch: Ruby, PHP, and their ecosystems
- Ruby (.rb): classes with base-class extraction, modules, instance and self.-methods, attr_reader / attr_writer / attr_accessor, top-level constants, require / require_relative (external vs relative).
- PHP (.php, .phtml): namespaces, use directives with framework detection (Illuminate / Symfony / Laravel / Psr / Doctrine / Monolog / PHPUnit external), classes / interfaces / traits / enums, functions / methods, const and define().
- Gemfile / Rack (Gemfile, Gemfile.lock, *.gemspec, *.ru): gem entries with version pins, group blocks, Ruby version constant, spec.add_dependency calls, Rack require statements.
- Rakefile (Rakefile, *.rake): task :name => [:deps] with dependency tracking, namespace blocks, desc lines captured as the following task’s doc comment.
- Composer (composer.json, composer.lock): shallow JSON parse emits require / require-dev as external imports, PSR-4 autoload namespaces as module symbols, package name as a module. Falls back to regex extraction when JSON is malformed.
- Blade (*.blade.php): @extends, @include, @yield, @section, @component, @php … @endphp inline blocks.
v1.17.0 — Gap closure: scripts, configs, web, build infrastructure
The philosophy: a code intelligence engine should index code, and any text file that declares names code depends on is code.
Shebang detection for extensionless scripts: #!/bin/sh, /bin/bash, /bin/zsh, /usr/bin/env perl, /usr/bin/python3, /usr/bin/env ruby, /usr/bin/env php, /usr/bin/env node route to the appropriate extractor.
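A minimal sketch of that shebang routing, assuming the extractor is chosen from the interpreter's base name. The function names and the returned extractor labels are hypothetical, and `env` flags such as `-S` are deliberately not handled.

```rust
/// Return the interpreter named by a shebang line, or None when the
/// line is not a shebang.
fn shebang_interpreter(first_line: &str) -> Option<&str> {
    let rest = first_line.strip_prefix("#!")?.trim();
    let mut parts = rest.split_whitespace();
    let program = parts.next()?;
    // Keep only the last path component: `/usr/bin/env` -> `env`.
    let base = program.rsplit('/').next()?;
    if base == "env" {
        // `#!/usr/bin/env perl` names the interpreter as the next word.
        parts.next()
    } else {
        Some(base)
    }
}

/// Map the first line of an extensionless script to an extractor label.
fn route(first_line: &str) -> &'static str {
    match shebang_interpreter(first_line) {
        Some("sh") | Some("bash") | Some("zsh") => "shell",
        Some("perl") => "perl",
        Some("python3") => "python",
        Some("ruby") => "ruby",
        Some("php") => "php",
        Some("node") => "javascript",
        _ => "unknown",
    }
}
```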
Ruby template family: ERB / Erubis, Haml / Hamlit, Slim — render-partial imports, yield :name, content_for :name, scriptlet def/class.
Build + infra: Autotools (*.am, *.m4, *.in, configure.ac, Makefile.am), Dockerfile (Dockerfile, Containerfile, *.dockerfile) with FROM stage names, COPY --from=stage, ARG / ENV / EXPOSE / LABEL.
Config + data: Java .properties, generic YAML, generic JSON (with dependencies-family detection and tsconfig extends), TOML (sections, dependency families, scalar key/values), generic XML, .env and variants (with ${VAR} interpolation producing env:VAR import edges), INI / CFG / CONF.
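The `.env` interpolation rule can be illustrated with a small scanner that turns each `${VAR}` reference into an `env:VAR` edge. This is a sketch under stated assumptions: the function name is hypothetical, edges are de-duplicated in first-seen order, and shell-style forms such as `${VAR:-default}` are not treated specially.

```rust
/// Collect `env:VAR` import edges for every `${VAR}` reference found on
/// the value side of `KEY=VALUE` lines.
fn env_import_edges(source: &str) -> Vec<String> {
    let mut edges: Vec<String> = Vec::new();
    for line in source.lines() {
        let line = line.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let value = match line.split_once('=') {
            Some((_key, value)) => value,
            None => continue,
        };
        let mut rest = value;
        while let Some(start) = rest.find("${") {
            let after = &rest[start + 2..];
            let end = match after.find('}') {
                Some(end) => end,
                None => break, // unterminated reference: stop scanning this line
            };
            let name = &after[..end];
            let edge = format!("env:{}", name);
            if !name.is_empty() && !edges.contains(&edge) {
                edges.push(edge);
            }
            rest = &after[end + 1..];
        }
    }
    edges
}
```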
Web: HTML (<title>, id / class attributes, src / href on script / img / link / iframe / video / audio / source, well-known <meta>), CSS / SCSS / Sass / Less / Stylus (selectors, custom properties, $variables, @import / @use / @forward / @keyframes / @mixin / @function).
Perl (.pl, .pm, .t): package declarations with ::-qualified names, sub, use / require / no with pragma detection, my / our / local top-level variables, Moose has attributes.
The Language enum gains 15 new variants for the v1.17.0 batch.
v1.18.0 — ArduPilot audit and content-based disambiguation
Closes the remaining gaps that a complete inventory of a large embedded / robotics codebase (ArduPilot) surfaced, and introduces content-based disambiguation for inherently ambiguous extensions.
content_classify reads up to 2 KB from each file’s head and picks between candidate languages:
- .m: MATLAB (% comments, function, classdef) vs Objective-C (@interface, @implementation, @property, #import, NSObject).
- .h: C (default) vs C++ (class, namespace, template<, using namespace, std::).
- .inc: PHP (<?php) vs C (#include, #define, #ifdef).
- .ac: Autoconf (AC_INIT, dnl) vs OpenSceneGraph 3D model (binary-adjacent; skipped).
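The marker lists above suggest a simple scoring heuristic. The sketch below is one way to write it and is not Canopy's implementation: the `Lang` enum, the `hits` helper, the tie-break for `.m`, and the trailing space on the `class` and `namespace` markers (to avoid matching inside longer words) are all assumptions.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Lang {
    Matlab,
    ObjectiveC,
    C,
    Cpp,
    Php,
    Autoconf,
    Skipped,
}

/// Only the first 2 KB of each file are inspected.
const HEAD_BYTES: usize = 2048;

/// Count how many of the markers occur anywhere in the head.
fn hits(head: &str, markers: &[&str]) -> usize {
    markers.iter().filter(|m| head.contains(**m)).count()
}

/// Pick a language for an ambiguous extension from the file's head.
fn content_classify(ext: &str, bytes: &[u8]) -> Option<Lang> {
    let end = bytes.len().min(HEAD_BYTES);
    // Lossy decoding tolerates a multi-byte character cut at the 2 KB mark.
    let decoded = String::from_utf8_lossy(&bytes[..end]);
    let head: &str = &decoded;
    match ext {
        "m" => {
            let objc = hits(
                head,
                &["@interface", "@implementation", "@property", "#import", "NSObject"],
            );
            let matlab = hits(head, &["%", "function", "classdef"]);
            // Assumption: ties (including empty files) fall back to MATLAB.
            Some(if objc > matlab { Lang::ObjectiveC } else { Lang::Matlab })
        }
        "h" => {
            let cpp = hits(
                head,
                &["class ", "namespace ", "template<", "using namespace", "std::"],
            );
            // C is the documented default when no C++ marker is present.
            Some(if cpp > 0 { Lang::Cpp } else { Lang::C })
        }
        "inc" => {
            if head.contains("<?php") {
                Some(Lang::Php)
            } else if hits(head, &["#include", "#define", "#ifdef"]) > 0 {
                Some(Lang::C)
            } else {
                None
            }
        }
        "ac" => {
            if hits(head, &["AC_INIT", "dnl"]) > 0 {
                Some(Lang::Autoconf)
            } else {
                // Treated as an OpenSceneGraph model and skipped.
                Some(Lang::Skipped)
            }
        }
        _ => None,
    }
}
```

A presence count is deliberately crude: an Objective-C file containing a `%@` format string still scores one MATLAB hit, so the comparison relies on real Objective-C headers matching several markers at once.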
Eleven new extractors land in the same release: Lua, MATLAB / Octave, Objective-C, Linker scripts (.ld — ENTRY, MEMORY regions, SECTIONS, PROVIDE, INCLUDE), Device Tree Source (.dts, .dtbo, .dtsi — labelled nodes, compatible = "vendor,chip" as external imports), ROS messages (.msg, .srv, .action), IDL (CORBA / DDS / MIDL), Batch (.bat, .cmd), PowerShell, ArduPilot parameter files (.parm, .param, .defaults, .waypoints, .plan), and Assembly (.asm, .s, .S — labels with visibility from .globl / .global, .equ / .set constants, .section modules).
Extension routing also updates: .ino / .inoflag → C++, .urdf / .xacro / .launch → XML, .ioc → Properties, several systemd / lint / editor configs → INI, .jsonc → JSON.
v1.19.0 — Parallel CPU embedding
The pre-v1.19.0 embedding loop was fully sequential: for batch in rows.chunks(32) { embed_batch().await }. On a 32-core machine, Ollama (default OLLAMA_NUM_PARALLEL=4) therefore sat 75%+ idle while Canopy waited on one HTTP roundtrip at a time.
build_embedding_index now detects the Ollama backend and fires batches concurrently via futures::stream::buffer_unordered. The sequential fallback remains for the ORT backend (whose inference session is &mut-only). Concurrency is tunable via CANOPY_EMBED_CONCURRENCY (default 4); set to 1 to restore pre-v1.19.0 behavior.
On the ArduPilot fixture (5,157 files, ~90,000 chunks, 32-core Xeon E5-2699 v4, nomic-embed-text-v1.5 on CPU): roughly 5 hours wall-time on the legacy path drops to ~75 minutes at concurrency 4 — a ~4× speedup. Ollama’s built-in request pool is the cap; raising CANOPY_EMBED_CONCURRENCY beyond OLLAMA_NUM_PARALLEL yields no additional speedup (raise both to push further).
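The bounded-concurrency idea can be sketched without any async runtime. The release uses `futures::stream::buffer_unordered`; the example below swaps in scoped OS threads from the standard library purely so that it is self-contained, and `embed_batch` here is a stand-in that fabricates one-dimensional vectors instead of calling Ollama. `batch_size` must be non-zero.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Stand-in for one embedding roundtrip (hypothetical): returns one fake
/// one-dimensional vector per input row.
fn embed_batch(batch: &[String]) -> Vec<Vec<f32>> {
    batch.iter().map(|row| vec![row.len() as f32]).collect()
}

/// Embed `rows` in batches of `batch_size`, with at most `concurrency`
/// batches in flight. Vectors come back in the original row order.
fn embed_all(rows: &[String], batch_size: usize, concurrency: usize) -> Vec<Vec<f32>> {
    let batches: Vec<&[String]> = rows.chunks(batch_size).collect();
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<Vec<Vec<f32>>>>> = Mutex::new(vec![None; batches.len()]);

    thread::scope(|scope| {
        for _ in 0..concurrency.max(1) {
            scope.spawn(|| loop {
                // Claim the next unprocessed batch index.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= batches.len() {
                    break;
                }
                let vectors = embed_batch(batches[i]);
                let mut guard = results.lock().unwrap();
                guard[i] = Some(vectors);
            });
        }
    });

    // All workers have joined; flatten the per-batch results in batch order.
    results
        .into_inner()
        .unwrap()
        .into_iter()
        .flatten()
        .flatten()
        .collect()
}
```

Calling `embed_all(&rows, 32, 1)` reproduces the sequential behaviour, which mirrors setting CANOPY_EMBED_CONCURRENCY to 1. Results are slotted by batch index, so completion order does not affect the output.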
v1.19.1–v1.19.4 — Test coverage and regression baselines
The v1.19.x patch series spends its first four patches locking down the language work that landed across v1.14.0–v1.18.0:
- v1.19.1: 45 smoke integration tests with real-world fixtures, one per extractor across the v1.14–v1.18 batches. Refinements found via the new fixtures: kotlin.rs::is_kotlin_framework_ns now matches kotlinx.* imports; autotools.rs accepts both bare and parenthesised macro forms. Test count: 213 unit + 45 integration = 258, all passing.
- v1.19.2: per-repo regression baseline so future canopy versions cannot silently regress the 30+ new extractors. Snapshot of (files, symbols, imports, chunks, embeddings) and per-language file counts across 10 indexed test repos. 11,306 files / 150,171 symbols / 159,266 chunks / 246,793 vectors total.
- v1.19.3: content-based disambiguation regression tests (8 fixtures × 8 tests) for .m, .h, .inc, .ac.
- v1.19.4: health-check regression baseline. Future canopy releases cannot introduce new critical findings on the 10 indexed test repos beyond a small ±2 absolute tolerance for the intrinsic non-determinism of cycle detection on tied edges.
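The v1.19.4 tolerance rule amounts to a per-repo comparison against the stored baseline. A minimal sketch, assuming the baseline is a list of (repo, baseline count, current count) tuples; that shape, the function name, and the repo names used with it are hypothetical.

```rust
/// Allowed growth in critical findings per repo (the ±2 tolerance).
const TOLERANCE: u32 = 2;

/// Return the repos whose critical-finding count grew beyond the
/// tolerance. Each tuple is (repo name, baseline count, current count).
fn regressions<'a>(runs: &[(&'a str, u32, u32)]) -> Vec<&'a str> {
    runs.iter()
        .filter(|(_, baseline, current)| *current > baseline.saturating_add(TOLERANCE))
        .map(|(name, _, _)| *name)
        .collect()
}
```

A repo that moves from 40 to 42 critical findings stays within tolerance, while one that moves from 3 to 6 is reported.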
v1.19.5–v1.19.9 — Lever Stress Test v2 and CLI lever parity
The v1.19.5 → v1.19.9 patch sweep validates that Canopy’s lever surface still works efficiently and comprehensively against the 30+ new extractors and 10 new test repos.
- v1.19.5: Lever Stress Test v2 results published. 13 / 13 new-language reachability probes pass. canopy_map density lever (--format json) hits 15.9× reduction on the new corpus (was 10.5× on the v1 corpus). The new corpus is roughly 2× v1 in symbols / embeddings and 5× in language surface.
- v1.19.6: canopy search CLI collapses near-identical hits in config-style languages (ardupilot_params, properties, env, ini) into one condensed entry, eliminating the N-files-of-noise pattern (e.g., ATC_RAT_RLL_P showing up 21 times across ardupilot’s .parm defaults).
- v1.19.7: canopy health --summary and --format json CLI levers. On the ardupilot fixture (7,667 health findings), default text output is 2,049,287 chars; --summary brings it under 600 chars regardless. That is a 4,026× reduction, the largest single token-efficiency win on the new corpus.
- v1.19.8: per-extractor coverage probes (38 probes spanning every extractor batch). 33/38 pass; the 5 misses are tokenizer artifacts in keyword search, not coverage gaps (the underlying language is extracted in every case).
- v1.19.9: free-play “investigate broken imports on ardupilot” scenario validates the compound effect of v1.19.6 + v1.19.7 on a denser corpus. End-to-end flow: 6 tool calls, 3,353 chars total, 10,884× lever-vs-legacy reduction on step 1 alone.
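The v1.19.6 collapse of near-identical config hits can be pictured as a group-by on (language, symbol). The sketch below is an assumption about the general shape only: the `Hit` struct, its field names, the output strings, and the choice to list collapsed groups after the untouched hits are all hypothetical.

```rust
use std::collections::BTreeMap;

/// One search hit; the field names are hypothetical.
struct Hit {
    symbol: String,
    language: String,
    path: String,
}

/// Config-style languages whose repeated hits are condensed.
const CONFIG_LANGS: [&str; 4] = ["ardupilot_params", "properties", "env", "ini"];

/// Pass code hits through unchanged and fold config hits that share a
/// (language, symbol) pair into a single condensed entry.
fn collapse(hits: &[Hit]) -> Vec<String> {
    let mut out = Vec::new();
    let mut groups: BTreeMap<(&str, &str), Vec<&str>> = BTreeMap::new();
    for hit in hits {
        if CONFIG_LANGS.contains(&hit.language.as_str()) {
            groups
                .entry((hit.language.as_str(), hit.symbol.as_str()))
                .or_default()
                .push(hit.path.as_str());
        } else {
            out.push(format!("{} ({})", hit.symbol, hit.path));
        }
    }
    for ((language, symbol), paths) in groups {
        out.push(format!(
            "{} [{}] in {} files, e.g. {}",
            symbol,
            language,
            paths.len(),
            paths[0]
        ));
    }
    out
}
```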
v1.19.10–v1.19.13 — The comprehension benchmark
The v1 lever stress tests measured token efficiency on operator-driven workflows. The v2 comprehension benchmark measures agentic correctness in unguided exploration: an AI agent answers questions about a repo using only Canopy MCP tools, with answers graded against deterministic ground truth.
- v1.19.10: 49-scenario unguided benchmark spanning 10 repos and 7 categories (orient / health / blast / symbol / search / cross-language / free-play). Ground truth captured via canopy_orient, canopy_health_check, canopy_architecture_map, so Canopy MCP grades itself. The MCP-only invariant is enforced in both the prompt and the reference runner.
- v1.19.11: 5-question comprehension rubric (5 categories per repo × 5 repos = 25 graded answers, 75 points max) applied to the v1.19.0+ extractor repos for language diversity (cpp/c/kotlin/csharp/php). Operator-authored canonical answers pass at 75/75, confirming the rubric heuristics correctly award full credit when the answer hits all key facts.
- v1.19.12: restores Q4 History (canopy_git_history against unshallowed test repos) to match the V11/V13 design exactly. Test repos unshallowed via git fetch --unshallow and re-ingested via canopy ingest-git --max-commits 5000. ~125K commits available across 10 repos; total ingest time ~60s.
- v1.19.13: first unguided real-subagent run. 10 subagents × 2 modes (Default with full tool access vs Levered with MCP-only). Default scored 72/75; Levered scored 75/75. Default agents naturally reach for canopy MCP (every Default subagent used canopy 3–15 times even with no instruction); they also lean on Bash for verification (git log, ls, find) as cross-checks rather than substitutes. Levered mode uses 8× less Bash and 3.2× more MCP. The 3-point gap is answer verbosity, not reasoning.
v1.19.14–v1.19.16 — Closing the density-flag adoption gap
The comprehension reruns surfaced a structural issue: agents read the recipes but didn’t apply density flags on each call. The next three patches close that gap from three different angles.
- v1.19.14: strengthens MCP server instructions with an explicit substitution table (grep / rg / ag → canopy_search, find -name → canopy_search path_prefix=, git log -- <file> → canopy_git_history, head / cat / tail / Read on a source file → canopy_extract_symbol, etc.). Frames the index as the source of truth, not a claim to double-check.
- v1.19.15: RECIPES block (8 one-line question-to-tool-sequence recipes), an ANTI-PATTERN callout naming the search-fan-out failure mode (3+ canopy_search calls on the same topic means STOP and switch to canopy_understand, canopy_orient, or canopy_extract_symbol), and the Meta tool category in the catalog framed as “workflow synthesizers — prefer over raw search chains”.
- v1.19.16: CANOPY_AUTO_SUMMARY_THRESHOLD lowered 10,000 → 3,000 chars. The mean canopy_search response was 1,663 chars in v1.19.15 levered runs, well under the old 10K threshold, so verbose output slipped through. Setting the threshold to 3K catches realistic per-call response sizes for canopy_search, canopy_search_symbols, canopy_health_check, and the trace tools without changing parameter defaults. Estimated effect on the comprehension workload: levered MCP chars drop from ~163K to ~51K (−69%).
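The threshold behaviour from v1.19.16 can be sketched as a pure function. Assumptions to note: the environment value is passed in as an argument rather than read with `std::env::var` so the example stays testable, unparsable values fall back to the default, and the comparison is strictly greater-than; none of these details are confirmed by the release notes.

```rust
/// Default from v1.19.16 onward, in characters.
const DEFAULT_THRESHOLD: usize = 3_000;

/// Resolve the effective threshold from the raw environment value.
/// `None` means auto-summary is disabled (the value was 0).
fn threshold(env_value: Option<&str>) -> Option<usize> {
    let chars = env_value
        .and_then(|v| v.trim().parse::<usize>().ok())
        .unwrap_or(DEFAULT_THRESHOLD);
    if chars == 0 {
        None
    } else {
        Some(chars)
    }
}

/// Decide whether a response of `len` chars should be auto-summarised.
fn should_summarise(len: usize, env_value: Option<&str>) -> bool {
    matches!(threshold(env_value), Some(limit) if len > limit)
}
```

Under these assumptions a 5,000-char response is summarised by default, passes through untouched with the value set to 10000, and is never summarised with the value set to 0.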
Compatibility notes for the v1.14.x series
- Additive throughout. Every extractor and parameter is additive. No public interface is removed or renamed across the series. The v1.0.0 API freeze remains in effect.
- Language enum extensions: v1.14.0 adds 7 variants (Q2 batch), v1.15.0 adds 6 (Q3 batch), v1.16.0 adds 6 (Q4 batch), v1.17.0 adds 15 (gap closure), v1.18.0 adds 11 (ArduPilot audit). Total: 45 new variants across the series.
- Embedding concurrency: new CANOPY_EMBED_CONCURRENCY env var (default 4). Set to 1 for the pre-v1.19.0 sequential path.
- Auto-summary threshold: CANOPY_AUTO_SUMMARY_THRESHOLD defaults to 3,000 chars from v1.19.16 onward. Set to 0 to disable; set to 10,000 to restore pre-v1.19.16 behavior.
- Test repos: the comprehension benchmark assumes unshallowed clones with git ingest. Indexes built with git clone --depth 1 will return one-commit Q4 History responses until re-ingested.