v1.14.x — Language expansion and the comprehension benchmark

Versions: 1.14.0 → 1.19.16

The v1.14.x series is the language-coverage and benchmark-validation pass. It triples Canopy’s supported-language surface from the v1.12.0 baseline (TypeScript, JavaScript, Python, Rust, Go) by shipping three language batches (C/C++/C#, Java/Kotlin/Swift, Ruby/PHP) plus the framework, build, and config files those ecosystems live alongside. It also lands content-based disambiguation for inherently ambiguous extensions (.m, .h, .inc, .ac), a parallel CPU-embedding loop that raises Ollama-backed embedding throughput by ~4× (Ollama previously sat 75% idle), and the comprehension benchmark: the first measurement of how AI agents answer real questions on real codebases using only Canopy MCP tools.

v1.13.0 was skipped — the version cut directly from v1.12.0 to v1.14.0 to align the language-batch numbering with the public roadmap.

v1.14.0 — Q2 batch: C, C++, C#, and the build-descriptor family

Regex-based extractors throughout, holding the existing tree-sitter = "0.24" ABI-14 pin. No new grammar crates added in this release.

C (.c, .h): functions, forward declarations, structs, unions, enums, typedefs, #define macros. Quoted #include directives become relative imports; system <header> includes are marked external.
C++ (.cpp, .cxx, .cc, .hpp, .hxx, .h++): classes, structs, unions, namespaces, templates, using aliases, member and free functions, enum class.
C# (.cs): classes, interfaces, records, structs, enums, methods, properties, delegates, events, block and file-scoped namespaces. using directives detect System.* / Microsoft.* / NuGet.* framework prefixes.
CMake (CMakeLists.txt, *.cmake): project, add_executable, add_library, add_custom_target, set, find_package, include.
Makefile (Makefile, GNUmakefile, *.mk): phony targets, user targets, variable assignments, include.
MSBuild (.csproj, .sln, .props, .targets): project root, PackageReference / ProjectReference entries, <Target> elements; .sln project lines surface as both symbols and relative imports.
Razor / Blazor (.razor, .cshtml): @page, @using, @inherits, @implements, @inject, @code { … } block content.
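The C include rule in the list above (quoted includes are relative, angle-bracket includes are external) can be illustrated with a small sketch. This is not Canopy's extractor, which is regex-based; the Import enum and classify_include function are hypothetical names, and the sketch uses plain string handling from the standard library.

```rust
/// Hypothetical import kinds; the real import model is not shown in these notes.
#[derive(Debug, PartialEq, Eq)]
pub enum Import {
    Relative(String),
    External(String),
}

/// Classify one source line as a quoted (relative) or angle-bracket (external) include.
/// Returns None for lines that are not simple #include directives.
pub fn classify_include(line: &str) -> Option<Import> {
    // Allow whitespace before and after the '#', as the preprocessor does.
    let rest = line.trim_start().strip_prefix('#')?.trim_start();
    let rest = rest.strip_prefix("include")?.trim_start();
    if let Some(body) = rest.strip_prefix('"') {
        let end = body.find('"')?;
        Some(Import::Relative(body[..end].to_string()))
    } else if let Some(body) = rest.strip_prefix('<') {
        let end = body.find('>')?;
        Some(Import::External(body[..end].to_string()))
    } else {
        // For example a macro-expanded include, which this sketch ignores.
        None
    }
}
```

Calling classify_include on each line of a .c or .h file yields the two edge kinds; anything that is not a plain include returns None.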

Filename-based dispatch lands so files identified by their full name rather than their extension (Makefile, Gemfile, Rakefile, composer.json) and compound-extension files (build.gradle.kts, *.blade.php) resolve to the right extractor.
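A minimal sketch of how such a dispatch order might look, assuming three tiers: exact file name, then compound suffix, then final extension. The Extractor enum and dispatch function are hypothetical and cover only a handful of the file types named above.

```rust
/// Hypothetical extractor identifiers; the real enum is not shown in these notes.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Extractor {
    Makefile,
    Gemfile,
    Rakefile,
    Composer,
    Gradle,
    Blade,
    Php,
    Json,
    Unknown,
}

/// Compound suffixes that must win over the plain final extension.
const COMPOUND: [(&str, Extractor); 2] = [
    (".gradle.kts", Extractor::Gradle),
    (".blade.php", Extractor::Blade),
];

/// Resolve an extractor from a file name, trying the most specific rule first.
pub fn dispatch(file_name: &str) -> Extractor {
    // 1. Exact file names, with or without an extension.
    match file_name {
        "Makefile" | "GNUmakefile" => return Extractor::Makefile,
        "Gemfile" => return Extractor::Gemfile,
        "Rakefile" => return Extractor::Rakefile,
        "composer.json" => return Extractor::Composer,
        _ => {}
    }
    // 2. Compound suffixes such as build.gradle.kts or welcome.blade.php.
    for &(suffix, extractor) in COMPOUND.iter() {
        if file_name.ends_with(suffix) {
            return extractor;
        }
    }
    // 3. Fall back to the final extension.
    match file_name.rsplit_once('.').map(|(_, ext)| ext) {
        Some("php") => Extractor::Php,
        Some("json") => Extractor::Json,
        Some("gradle") => Extractor::Gradle,
        Some("mk") => Extractor::Makefile,
        _ => Extractor::Unknown,
    }
}
```

The ordering is the point of the sketch: composer.json would otherwise fall through to the generic JSON rule, and welcome.blade.php to the plain PHP rule.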

v1.15.0 — Q3 batch: Java, Kotlin, Swift, plus Gradle / Markdown / Shell

Java (.java): classes, interfaces, enums, records, annotations, methods, constructors, fields. package directives emit module symbols; import edges mark java.* / javax.* / jakarta.* / org.springframework.* / org.junit.* / com.google.* external.
Kotlin (.kt, .kts): classes (including data, sealed, enum class, annotation class), object declarations, interfaces, top-level and extension functions, val / var properties, const val, typealias, package, aliased import. Android framework prefixes external.
Swift (.swift): classes, structs, enums, protocols, actors, extensions, functions, let / var, subscripts, type aliases. SwiftUI / UIKit / Foundation imports external.
Gradle (.gradle, .gradle.kts): plugins, dependency configurations (implementation, api, testImplementation, ksp, kapt, androidTestImplementation, …), tasks.register, well-known DSL blocks (android, dependencies, repositories, plugins, …) as module symbols. A Java repo without Gradle indexing is half-indexed.
Markdown (.md, .mdx, .markdown): ATX headings up to H6 as module symbols, fenced code blocks as code_block_<lang>, inline and reference-style links as imports. YAML frontmatter is skipped without parsing.
Shell (.sh, .bash, .zsh): function definitions in both forms, top-level variable assignments (export, readonly), source / . imports, shebang capture via the __shebang__ synthetic symbol.
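As an illustration of the Java rule above, where imports under well-known prefixes are marked external, the following sketch parses a single import line and flags it. The function name and return shape are hypothetical; only the prefix list comes from these notes.

```rust
/// Prefixes the notes list as external for Java imports.
const EXTERNAL_PREFIXES: [&str; 6] = [
    "java.",
    "javax.",
    "jakarta.",
    "org.springframework.",
    "org.junit.",
    "com.google.",
];

/// Parse `import a.b.C;` or `import static a.b.C.m;` into (path, is_external).
/// Lines with trailing comments or other content are not handled by this sketch.
pub fn parse_java_import(line: &str) -> Option<(String, bool)> {
    let rest = line.trim().strip_prefix("import")?;
    // Require whitespace after the keyword so `importer.run();` is not matched.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let rest = rest.trim_start();
    // Drop an optional `static` modifier.
    let rest = match rest.strip_prefix("static") {
        Some(after) if after.starts_with(char::is_whitespace) => after.trim_start(),
        _ => rest,
    };
    let path = rest.strip_suffix(';')?.trim_end();
    if path.is_empty() {
        return None;
    }
    let external = EXTERNAL_PREFIXES.iter().any(|prefix| path.starts_with(*prefix));
    Some((path.to_string(), external))
}
```

The Kotlin and Swift framework checks described in the same list could presumably follow the same shape with their own prefix tables.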

v1.16.0 — Q4 batch: Ruby, PHP, and their ecosystems

Ruby (.rb): classes with base-class extraction, modules, instance and self.-methods, attr_reader / attr_writer / attr_accessor, top-level constants, require / require_relative (external vs relative).
PHP (.php, .phtml): namespaces, use directives with framework detection (Illuminate / Symfony / Laravel / Psr / Doctrine / Monolog / PHPUnit external), classes / interfaces / traits / enums, functions / methods, const and define().
Gemfile / Rack (Gemfile, Gemfile.lock, *.gemspec, *.ru): gem entries with version pins, group blocks, Ruby version constant, spec.add_dependency calls, Rack require statements.
Rakefile (Rakefile, *.rake): task :name => [:deps] with dependency tracking, namespace blocks, desc lines captured as the following task’s doc comment.
Composer (composer.json, composer.lock): shallow JSON parse emits require / require-dev as external imports, PSR-4 autoload namespaces as module symbols, package name as a module. Falls back to regex extraction when JSON is malformed.
Blade (*.blade.php): @extends, @include, @yield, @section, @component, @php … @endphp inline blocks.
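The Rakefile dependency tracking above (task :name => [:deps]) can be sketched with plain string handling. This is an assumption-laden illustration, not the shipped extractor: parse_rake_task and symbol_name are hypothetical names, and only the hash-rocket form is handled (not the newer `task name: :dep` syntax).

```rust
/// Read a Ruby symbol name (letters, digits, underscore) from the start of `text`.
fn symbol_name(text: &str) -> String {
    text.chars()
        .take_while(|c| c.is_alphanumeric() || *c == '_')
        .collect()
}

/// Parse `task :name` or `task :name => [:dep, ...]` into (name, dependencies).
pub fn parse_rake_task(line: &str) -> Option<(String, Vec<String>)> {
    let rest = line.trim().strip_prefix("task")?;
    // Require whitespace after the keyword so identifiers like `tasks` are not matched.
    if !rest.starts_with(char::is_whitespace) {
        return None;
    }
    let rest = rest.trim_start().strip_prefix(':')?;
    let (name_part, deps_part) = match rest.split_once("=>") {
        Some((name, deps)) => (name, deps),
        None => (rest, ""),
    };
    let name = symbol_name(name_part);
    if name.is_empty() {
        return None;
    }
    // Every `:symbol` after the hash rocket counts as a dependency; trailing
    // `do |t|` blocks contain no colon-prefixed symbols and are ignored.
    let deps = deps_part
        .split(':')
        .skip(1)
        .map(symbol_name)
        .filter(|dep| !dep.is_empty())
        .collect();
    Some((name, deps))
}
```

A preceding desc line could be attached as the doc comment by remembering the last desc seen before each successful parse.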

v1.17.0 — Gap closure: scripts, configs, web, build infrastructure

The philosophy: a code intelligence engine should index code, and any text file that declares names code depends on is code.

Shebang detection for extensionless scripts: #!/bin/sh, /bin/bash, /bin/zsh, /usr/bin/env perl, /usr/bin/python3, /usr/bin/env ruby, /usr/bin/env php, /usr/bin/env node route to the appropriate extractor.
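A minimal sketch of shebang routing, assuming the interpreter is the last path component of the program, or the first non-flag argument when the program is env. The function names and the extractor labels returned by route are hypothetical.

```rust
/// Map a script's first line to an interpreter name such as "bash" or "python3".
pub fn shebang_interpreter(first_line: &str) -> Option<&str> {
    let rest = first_line.strip_prefix("#!")?.trim();
    let mut parts = rest.split_whitespace();
    let program = parts.next()?;
    // Keep only the final path component: /usr/bin/python3 -> python3.
    let base = program.rsplit('/').next()?;
    if base == "env" {
        // `#!/usr/bin/env ruby`: the interpreter is the next argument.
        // Flags such as `-S` are skipped.
        parts.find(|arg| !arg.starts_with('-'))
    } else {
        Some(base)
    }
}

/// Route an interpreter name to a (hypothetical) extractor label.
pub fn route(interpreter: &str) -> Option<&'static str> {
    match interpreter {
        "sh" | "bash" | "zsh" => Some("shell"),
        "perl" => Some("perl"),
        "python3" => Some("python"),
        "ruby" => Some("ruby"),
        "php" => Some("php"),
        "node" => Some("javascript"),
        _ => None,
    }
}
```

Only the first line of an extensionless file needs to be read for this check, so it stays cheap even on large repositories.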

Ruby template family: ERB / Erubis, Haml / Hamlit, Slim — render-partial imports, yield :name, content_for :name, scriptlet def/class.

Build + infra: Autotools (*.am, *.m4, *.in, configure.ac, Makefile.am), Dockerfile (Dockerfile, Containerfile, *.dockerfile) with FROM stage names, COPY --from=stage, ARG / ENV / EXPOSE / LABEL.

Config + data: Java .properties, generic YAML, generic JSON (with dependencies-family detection and tsconfig extends), TOML (sections, dependency families, scalar key/values), generic XML, .env and variants (with ${VAR} interpolation producing env:VAR import edges), INI / CFG / CONF.
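The .env interpolation rule above, where ${VAR} references produce env:VAR import edges, might look like the following. The function name is hypothetical, and the sketch assumes only plain ${NAME} references count; forms such as ${NAME:-default} are skipped.

```rust
/// Collect an `env:VAR` import edge for every `${VAR}` reference in a .env value.
pub fn env_import_edges(value: &str) -> Vec<String> {
    let mut edges = Vec::new();
    let mut rest = value;
    while let Some(start) = rest.find("${") {
        let after = &rest[start + 2..];
        match after.find('}') {
            Some(end) => {
                let name = &after[..end];
                let valid = !name.is_empty()
                    && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
                if valid {
                    edges.push(format!("env:{}", name));
                }
                // Continue scanning after the closing brace.
                rest = &after[end + 1..];
            }
            // Unterminated `${`: nothing more to extract.
            None => break,
        }
    }
    edges
}
```

For a line such as DATABASE_URL=postgres://${DB_USER}@${DB_HOST}/app, the value part yields two edges, one per referenced variable.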

Web: HTML (<title>, id / class attributes, src / href on script / img / link / iframe / video / audio / source, well-known <meta>), CSS / SCSS / Sass / Less / Stylus (selectors, custom properties, $variables, @import / @use / @forward / @keyframes / @mixin / @function).

Perl (.pl, .pm, .t): package declarations with ::-qualified names, sub, use / require / no with pragma detection, my / our / local top-level variables, Moose has attributes.

The Language enum gains 15 new variants for the v1.17.0 batch.

v1.18.0 — ArduPilot audit and content-based disambiguation

Closes the remaining gaps surfaced by a complete inventory of ArduPilot, a large embedded / robotics codebase, and introduces content-based disambiguation for inherently ambiguous extensions.

content_classify reads up to 2 KB from each file’s head and picks between candidate languages:

.m — MATLAB (% comments, function, classdef) vs Objective-C (@interface, @implementation, @property, #import, NSObject).
.h — C (default) vs C++ (class, namespace, template<, using namespace, std::).
.inc — PHP (<?php) vs C (#include, #define, #ifdef).
.ac — Autoconf (AC_INIT, dnl) vs OpenSceneGraph 3D model (binary-adjacent; skipped).
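The marker tables above translate fairly directly into code. The following is a sketch under stated assumptions, not the real content_classify: the Lang enum and the function signature are hypothetical, matching is crude substring search, and the order of checks for .m (Objective-C markers first) is a guess, since the notes do not say how ties are resolved.

```rust
/// Hypothetical language tags for the ambiguous extensions.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Lang {
    C,
    Cpp,
    Matlab,
    ObjectiveC,
    Php,
    Autoconf,
    Skipped,
}

/// Only the first 2 KB of a file are inspected.
const HEAD_BYTES: usize = 2048;

fn has_any(text: &str, markers: &[&str]) -> bool {
    markers.iter().any(|marker| text.contains(*marker))
}

/// Pick a language for an ambiguous extension from the head of the file.
/// Returns None when the extension is not ambiguous or no marker matches.
pub fn content_classify(ext: &str, bytes: &[u8]) -> Option<Lang> {
    let end = bytes.len().min(HEAD_BYTES);
    // Lossy decoding keeps the sketch robust against non-UTF-8 heads.
    let text = String::from_utf8_lossy(&bytes[..end]);
    match ext {
        "m" => {
            let objc = ["@interface", "@implementation", "@property", "#import", "NSObject"];
            if has_any(&text, &objc) {
                Some(Lang::ObjectiveC)
            } else if has_any(&text, &["%", "function", "classdef"]) {
                Some(Lang::Matlab)
            } else {
                None
            }
        }
        "h" => {
            let cpp = ["class", "namespace", "template<", "using namespace", "std::"];
            // C is the default when no C++ marker is present.
            Some(if has_any(&text, &cpp) { Lang::Cpp } else { Lang::C })
        }
        "inc" => {
            if text.contains("<?php") {
                Some(Lang::Php)
            } else if has_any(&text, &["#include", "#define", "#ifdef"]) {
                Some(Lang::C)
            } else {
                None
            }
        }
        "ac" => Some(if has_any(&text, &["AC_INIT", "dnl"]) {
            Lang::Autoconf
        } else {
            Lang::Skipped
        }),
        _ => None,
    }
}
```

A production classifier would likely want token-aware matching, since a bare substring such as class also matches words like classification inside a C comment.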

Eleven new extractors land in the same release: Lua, MATLAB / Octave, Objective-C, Linker scripts (.ld — ENTRY, MEMORY regions, SECTIONS, PROVIDE, INCLUDE), Device Tree Source (.dts, .dtbo, .dtsi — labelled nodes, compatible = "vendor,chip" as external imports), ROS messages (.msg, .srv, .action), IDL (CORBA / DDS / MIDL), Batch (.bat, .cmd), PowerShell, ArduPilot parameter files (.parm, .param, .defaults, .waypoints, .plan), and Assembly (.asm, .s, .S — labels with visibility from .globl / .global, .equ / .set constants, .section modules).

Extension routing also updates: .ino / .inoflag → C++, .urdf / .xacro / .launch → XML, .ioc → Properties, several systemd / lint / editor configs → INI, .jsonc → JSON.

v1.19.0 — Parallel CPU embedding

The pre-v1.19.0 embedding loop was fully sequential — for batch in rows.chunks(32) { embed_batch().await } — so on a 32-core machine Ollama (default OLLAMA_NUM_PARALLEL=4) sat 75%+ idle while Canopy waited on one HTTP roundtrip at a time.

build_embedding_index now detects the Ollama backend and fires batches concurrently via futures::stream::buffer_unordered. The sequential fallback remains for the ORT backend (whose inference session is &mut-only). Concurrency is tunable via CANOPY_EMBED_CONCURRENCY (default 4); set to 1 to restore pre-v1.19.0 behavior.
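The bounded-concurrency idea can be sketched with the standard library alone. Note the substitution: this uses std::thread::scope with a shared work counter as a stand-in for futures::stream::buffer_unordered, so it is not the async code Canopy ships. embed_batch is a hypothetical placeholder for one HTTP round trip, and treating 0 or unparsable values as the default of 4 is an assumption.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Parse a CANOPY_EMBED_CONCURRENCY-style value; invalid or zero falls back to 4.
pub fn parse_concurrency(raw: Option<&str>) -> usize {
    raw.and_then(|value| value.trim().parse::<usize>().ok())
        .filter(|&n| n >= 1)
        .unwrap_or(4)
}

/// Hypothetical stand-in for one embedding round trip: one vector per row.
fn embed_batch(batch: &[String]) -> Vec<Vec<f32>> {
    batch.iter().map(|row| vec![row.len() as f32]).collect()
}

/// Embed `rows` in batches, keeping at most `concurrency` batches in flight.
pub fn embed_all(rows: &[String], batch_size: usize, concurrency: usize) -> Vec<Vec<f32>> {
    let batches: Vec<&[String]> = rows.chunks(batch_size.max(1)).collect();
    let next = AtomicUsize::new(0);
    let slots: Mutex<Vec<Option<Vec<Vec<f32>>>>> = Mutex::new(vec![None; batches.len()]);
    thread::scope(|scope| {
        for _ in 0..concurrency.max(1) {
            scope.spawn(|| loop {
                // Each worker claims the next unprocessed batch index.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= batches.len() {
                    break;
                }
                let vectors = embed_batch(batches[i]);
                let mut guard = slots.lock().unwrap();
                guard[i] = Some(vectors);
            });
        }
    });
    // Results are stored by batch index, so output order matches input order.
    slots
        .into_inner()
        .unwrap()
        .into_iter()
        .flatten()
        .flatten()
        .collect()
}
```

With concurrency set to 1 the sketch degenerates to the sequential loop. A value could be read at startup with parse_concurrency(std::env::var("CANOPY_EMBED_CONCURRENCY").ok().as_deref()). Unlike buffer_unordered, which yields results in completion order, this version writes each batch into its own slot so the caller sees rows in their original order.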

On the ArduPilot fixture (5,157 files, ~90,000 chunks, 32-core Xeon E5-2699 v4, nomic-embed-text-v1.5 on CPU): roughly 5 hours wall-time on the legacy path drops to ~75 minutes at concurrency 4 — a ~4× speedup. Ollama’s built-in request pool is the cap; raising CANOPY_EMBED_CONCURRENCY beyond OLLAMA_NUM_PARALLEL yields no additional speedup (raise both to push further).

v1.19.1–v1.19.4 — Test coverage and regression baselines

The v1.19.x patch series spends its first four patches locking down the language work that landed across v1.14.0–v1.18.0:

v1.19.1 — 45 smoke integration tests with real-world fixtures, one per extractor across the v1.14–v1.18 batches. Refinements found via the new fixtures: kotlin.rs::is_kotlin_framework_ns now matches kotlinx.* imports; autotools.rs accepts both bare and parenthesised macro forms. Test count: 213 unit + 45 integration = 258, all passing.
v1.19.2 — per-repo regression baseline so future Canopy versions cannot silently regress the 30+ new extractors. Snapshot of (files, symbols, imports, chunks, embeddings) and per-language file counts across 10 indexed test repos. 11,306 files / 150,171 symbols / 159,266 chunks / 246,793 vectors total.
v1.19.3 — content-based disambiguation regression tests (8 fixtures × 8 tests) for .m, .h, .inc, .ac.
v1.19.4 — health-check regression baseline. Future Canopy releases cannot introduce new critical findings on the 10 indexed test repos beyond a small ±2 absolute tolerance for the intrinsic non-determinism of cycle detection on tied edges.
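The v1.19.4 tolerance rule can be read as a simple per-repo comparison. The sketch below is one possible interpretation, with hypothetical names and two assumptions: only increases count as regressions, and a repo missing from the baseline is treated as having zero critical findings.

```rust
use std::collections::BTreeMap;

/// Absolute slack allowed for non-deterministic cycle detection.
pub const TOLERANCE: u32 = 2;

/// Return the repos whose critical-finding count exceeds baseline + tolerance.
pub fn regressed_repos(
    baseline: &BTreeMap<String, u32>,
    current: &BTreeMap<String, u32>,
    tolerance: u32,
) -> Vec<String> {
    let mut regressed = Vec::new();
    for (repo, &count) in current {
        // Assumption: a repo absent from the baseline starts at zero findings.
        let allowed = baseline
            .get(repo)
            .copied()
            .unwrap_or(0)
            .saturating_add(tolerance);
        if count > allowed {
            regressed.push(repo.clone());
        }
    }
    regressed
}
```

A regression test built on this would fail whenever the returned list is non-empty.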

v1.19.5–v1.19.9 — Lever Stress Test v2 and CLI lever parity

The v1.19.5 → v1.19.9 patch sweep validates that Canopy’s lever surface still works efficiently and comprehensively against the 30+ new extractors and 10 new test repos.

v1.19.5 — Lever Stress Test v2 results published. 13 / 13 new-language reachability probes pass. canopy_map density lever (--format json) hits 15.9× reduction on the new corpus (was 10.5× on the v1 corpus). The new corpus is roughly 2× v1 in symbols / embeddings and 5× in language surface.
v1.19.6 — canopy search CLI collapses near-identical hits in config-style languages (ardupilot_params, properties, env, ini) into one condensed entry, eliminating the N-files-of-noise pattern (e.g., ATC_RAT_RLL_P showing up 21 times across ardupilot’s .parm defaults).
v1.19.7 — canopy health --summary and --format json CLI levers. On the ardupilot fixture (7,667 health findings), default text output is 2,049,287 chars; --summary brings it under 600 chars regardless of how many findings there are. That is a 4,026× reduction, the largest single token-efficiency win on the new corpus.
v1.19.8 — per-extractor coverage probes (38 probes spanning every extractor batch). 33/38 pass; the 5 misses are tokenizer artifacts in keyword search, not coverage gaps (the underlying language is extracted in every case).
v1.19.9 — free-play “investigate broken imports on ardupilot” scenario validates the compound effect of v1.19.6 + v1.19.7 on a denser corpus. End-to-end flow: 6 tool calls, 3,353 chars total, 10,884× lever-vs-legacy reduction on step 1 alone.
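The v1.19.6 collapse of near-identical config hits can be sketched as a group-by over search results. Everything here is illustrative: the Hit and Condensed types are hypothetical, and "near-identical" is assumed to mean the same symbol name in the same config-style language.

```rust
use std::collections::BTreeMap;

/// Hypothetical search hit.
pub struct Hit {
    pub symbol: String,
    pub language: String,
    pub path: String,
}

/// Hypothetical condensed entry: one row per symbol, with a file count.
#[derive(Debug, PartialEq, Eq)]
pub struct Condensed {
    pub symbol: String,
    pub language: String,
    pub files: usize,
    pub first_path: String,
}

/// Config-style languages named in the notes.
const CONFIG_LANGS: [&str; 4] = ["ardupilot_params", "properties", "env", "ini"];

/// Collapse repeated config hits while leaving code hits untouched.
pub fn collapse(hits: &[Hit]) -> Vec<Condensed> {
    let mut out: Vec<Condensed> = Vec::new();
    // Maps (language, symbol) to the index of its condensed entry in `out`.
    let mut index: BTreeMap<(String, String), usize> = BTreeMap::new();
    for hit in hits {
        let is_config = CONFIG_LANGS.iter().any(|lang| *lang == hit.language);
        if is_config {
            let key = (hit.language.clone(), hit.symbol.clone());
            if let Some(&i) = index.get(&key) {
                out[i].files += 1;
                continue;
            }
            index.insert(key, out.len());
        }
        out.push(Condensed {
            symbol: hit.symbol.clone(),
            language: hit.language.clone(),
            files: 1,
            first_path: hit.path.clone(),
        });
    }
    out
}
```

Under these assumptions, 21 hits for ATC_RAT_RLL_P across .parm files would become one entry with files set to 21, while hits in ordinary code languages keep one entry each.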

v1.19.10–v1.19.13 — The comprehension benchmark

The v1 lever stress tests measured token efficiency on operator-driven workflows. The v2 comprehension benchmark measures agentic correctness in unguided exploration — an AI agent answers questions about a repo using only Canopy MCP tools, with answers graded against deterministic ground truth.

v1.19.10 — 49-scenario unguided benchmark spanning 10 repos and 7 categories (orient / health / blast / symbol / search / cross-language / free-play). Ground truth is captured via canopy_orient, canopy_health_check, and canopy_architecture_map, so Canopy MCP grades itself. The MCP-only invariant is enforced in both the prompt and the reference runner.
v1.19.11 — 5-question comprehension rubric (5 categories per repo × 5 repos = 25 questions, 75 points max) applied to the v1.14.0+ extractor repos for language diversity (cpp/c/kotlin/csharp/php). Operator-authored canonical answers pass at 75/75, confirming the rubric heuristics correctly award full credit when the answer hits all key facts.
v1.19.12 — restores Q4 History (canopy_git_history against unshallowed test repos) to match the V11/V13 design exactly. Test repos unshallowed via git fetch --unshallow and re-ingested via canopy ingest-git --max-commits 5000. ~125K commits available across 10 repos; total ingest time ~60s.
v1.19.13 — first unguided real-subagent run. 10 subagents × 2 modes (Default with full tool access vs Levered with MCP-only). Default scored 72/75; Levered scored 75/75. Default agents naturally reach for Canopy MCP (every Default subagent used Canopy 3–15 times even with no instruction); they also lean on Bash for verification (git log, ls, find) as cross-checks rather than substitutes. Levered mode uses 8× less Bash and 3.2× more MCP. The 3-point gap is answer verbosity, not reasoning.

v1.19.14–v1.19.16 — Closing the density-flag adoption gap

The comprehension reruns surfaced a structural issue: agents read the recipes but didn’t apply density flags on each call. The next three patches close that gap from three different angles.

v1.19.14 — strengthens MCP server instructions with an explicit substitution table (grep/rg/ag → canopy_search, find -name → canopy_search path_prefix=, git log -- <file> → canopy_git_history, head / cat / tail / Read on a source file → canopy_extract_symbol, etc.). Frames the index as the source of truth, not a claim to double-check.
v1.19.15 — RECIPES block (8 one-line question-to-tool-sequence recipes), an ANTI-PATTERN callout naming the search-fan-out failure mode (3+ canopy_search calls on the same topic means STOP and switch to canopy_understand, canopy_orient, or canopy_extract_symbol), and the Meta tool category in the catalog framed as “workflow synthesizers — prefer over raw search chains”.
v1.19.16 — CANOPY_AUTO_SUMMARY_THRESHOLD lowered 10,000 → 3,000 chars. The mean canopy_search response was 1,663 chars in v1.19.15 levered runs — well under the old 10K threshold, so verbose output sneaked through. Setting the threshold to 3K catches realistic per-call response sizes for canopy_search, canopy_search_symbols, canopy_health_check, and the trace tools without changing parameter defaults. Estimated effect on the comprehension workload: levered MCP chars drop from ~163K to ~51K (−69%).
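A sketch of how the threshold check might behave, assuming the comparison is "strictly longer than the threshold" and that an unset or unparsable value falls back to the 3,000-char default. The function names are hypothetical; only the variable name, the default, and the 0-disables rule come from these notes.

```rust
/// Default from v1.19.16 onward.
pub const DEFAULT_THRESHOLD: usize = 3_000;

/// Interpret a CANOPY_AUTO_SUMMARY_THRESHOLD-style value.
/// None means auto-summary is disabled.
pub fn parse_threshold(raw: Option<&str>) -> Option<usize> {
    match raw.and_then(|value| value.trim().parse::<usize>().ok()) {
        // 0 disables auto-summary.
        Some(0) => None,
        Some(n) => Some(n),
        // Unset or unparsable: fall back to the default (assumption).
        None => Some(DEFAULT_THRESHOLD),
    }
}

/// Decide whether a response is long enough to be summarized.
pub fn should_summarize(response: &str, threshold: Option<usize>) -> bool {
    match threshold {
        Some(limit) => response.chars().count() > limit,
        None => false,
    }
}
```

The value could be read with parse_threshold(std::env::var("CANOPY_AUTO_SUMMARY_THRESHOLD").ok().as_deref()). Under this reading, a 1,663-char response still passes through unsummarized at the 3K setting, while responses above 3,000 chars that previously slipped under the 10K limit are now condensed.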

Compatibility notes for the v1.14.x series

Additive throughout. Every extractor and parameter in the series is new; no public interface is removed or renamed. The v1.0.0 API freeze remains in effect.
Language enum extensions: v1.14.0 adds 7 variants (Q2 batch), v1.15.0 adds 6 (Q3 batch), v1.16.0 adds 6 (Q4 batch), v1.17.0 adds 15 (gap closure), v1.18.0 adds 11 (ArduPilot audit). Total: 45 new variants across the series.
Embedding concurrency: new CANOPY_EMBED_CONCURRENCY env var (default 4). Set to 1 for the pre-v1.19.0 sequential path.
Auto-summary threshold: CANOPY_AUTO_SUMMARY_THRESHOLD defaults to 3,000 chars from v1.19.16 onward. Set to 0 to disable; set to 10,000 to restore pre-v1.19.16 behavior.
Test repos: the comprehension benchmark assumes unshallowed clones with git ingest. Indexes built with git clone --depth 1 will return one-commit Q4 History responses until re-ingested.