Canopy is in pre-release. These docs describe the product at its public launch: commands, tool names, and integration examples reflect what you'll see once binaries ship.

Progressive Disclosure

Canopy exposes 21 MCP tools for codebase intelligence. Without parameter tuning, a single architectural survey call can consume 4,472 tokens; the same question answered through the right sequence of density-controlled calls costs 426 tokens — a 10.5x reduction. This guide explains the density-lever system: what parameters exist, which tools support them, how they compose, and which workflow recipes hit the best token-per-insight ratios. It is written for developers using Canopy via CLI or MCP, and for AI-engineering practitioners evaluating Canopy’s token-efficiency story.


The problem with naive full-enumeration calls


Every Canopy tool that traverses a dependency graph, emits health findings, or returns search results has an upper bound on output size determined by the size of the indexed codebase — not by what the caller actually needs. On a real-world monorepo (622 files, 3,752 chunks), naive calls produce:

  • canopy_architecture_map with no parameters: ~2,700 tokens of module listings
  • canopy_trace_dependents on a hub file with no parameters: ~905 tokens of file paths
  • canopy_health_check with no parameters: ~3,913 tokens of finding detail
  • canopy_prepare with default format: ~1,015 tokens including sections the caller will discard

When an AI agent runs four such calls to answer one architectural question, the total lands around 4,472 tokens. That number comes from Experiment 1, measured against the Pith monorepo baseline — not an estimate.

The solution: density levers on every high-output tool


Over 14 experiments across two days, 13 density parameters were added to the tools most likely to produce runaway output. The parameters fall into five categories: summary collapse, concise format, pagination, JSON structured output, and targeted filtering. Every tool keeps its default verbose behavior — passing no parameters produces the legacy full output. Levers only activate when you pass them.

The optimal workflow for the same architectural survey question — using canopy_survey question=health as the entry point, then drilling with summary and filter parameters — costs approximately 390 tokens. The prepare-edit-validate cycle for a safe refactor, fully concise-mode, costs approximately 426 tokens versus 4,015 tokens naive. That is a 9.4x to 10.5x reduction depending on the task.


The four tiers

Make the tier your first decision. Do not start at Detail unless you already know exactly what you need.

| Tier | Purpose | Typical cost | Entry call |
| --- | --- | --- | --- |
| Assessment | “What shape is this repo?” | ~390t | canopy_survey question=health |
| Triage | “What’s broken? Where?” | ~128t | canopy_health_check summary=true format=json |
| Orientation | “Where should I look?” | ~83–150t | canopy_trace_dependents summary=true top_n=5 |
| Detail | “Show me the code” | varies | canopy_prepare / canopy_understand / canopy_search |

Assessment: One call. You learn the hub files, cycle count, acyclic flag, and recommended drill-down areas. This is the right tier to start any session on an unfamiliar codebase.

Triage: One or two calls. You know something is wrong (build failing, lint failing, health check blocker) and you want the category and count of findings before reading individual ones. The summary=true parameter on canopy_health_check collapses thousands of finding lines into a count-by-category table.

Orientation: You know which file or module you care about. You want to understand its blast radius or what it imports before you touch it. Summary mode on trace tools gives you file counts by directory and the top-N hot paths without listing every dependent file.

Detail: You have oriented. You know the file, you know the risk surface, and now you need the actual content. canopy_prepare (concise format), canopy_understand, canopy_search, and canopy_extract_symbol live here. Token cost scales with the content; there is no summary mode that elides code, and the levers just remove the surrounding metadata padding.
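
Put together, a typical session walks down the tiers one call at a time. The sketch below strings together the entry calls and typical costs from the tier table; <file> is a placeholder for whichever file the earlier tiers point you at.

  • Assessment: canopy_survey question=health (~390t)
  • Triage: canopy_health_check summary=true format=json (~128t)
  • Orientation: canopy_trace_dependents file=<file> summary=true top_n=5 (~83–150t)
  • Detail: canopy_prepare file=<file> format=targeted (cost scales with the content returned)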


Summary mode

Summary mode collapses per-item enumeration into counts and top-N buckets. Instead of listing every dependent file path, you get the count of dependents grouped by directory and the five files with the most transitive reach.

Turn on with: summary=true, optionally combined with top_n=N (default 5 when summary is active).

Supported tools:

| Tool | Parameter | Measured reduction |
| --- | --- | --- |
| canopy_trace_imports | summary=true, top_n=N | 2x–419x depending on hub file size |
| canopy_trace_dependents | summary=true, top_n=N | 2x–419x depending on hub file size |
| canopy_find_cycles | summary=true | Collapses cycle listings to count + top participants |
| canopy_health_check | summary=true | Collapses finding detail to count-by-category |
| canopy_search | summary=true, top_n=N | Replaces result list with match-count summary per directory |
| canopy_search_symbols | summary=true | Replaces symbol list with count by kind |
| canopy_git_blame | summary=true | Collapses per-line blame to commit-section summary |

The 419x figure is not a typo. On a hub file that is imported by 419 other files, full enumeration returns 419 paths; summary=true returns 12 lines (directory groups + top_n entries). The token ratio mirrors the file count ratio.

Rule of thumb: Use summary=true any time you are not yet ready to act on per-file detail.
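
For intuition, here is a hypothetical sketch of the shape of a summary-mode response on the 419-dependent hub file described above. The labels, layout, and angle-bracket placeholders are illustrative, not Canopy's actual output format; only the parameters and the count-plus-top-N structure come from this page.

canopy_trace_dependents file=<hub file> summary=true top_n=5

  total dependents: 419
  dependents by directory: <one line per directory group>
  top 5 by transitive reach: <five file paths>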


Concise format

Concise mode drops decorative sections from the composite workflow tools — sections like “why this file matters,” extended git narrative, and coverage explanation prose. You still get the verdict (GO/CAUTION/STOP), the dependent count, and the health finding list; you just do not get the surrounding explanation text.

Turn on with: format="concise". The canopy_prepare tool also supports format="targeted", which further narrows output to only the file’s direct blast radius.

Supported tools:

| Tool | Parameter | Measured reduction |
| --- | --- | --- |
| canopy_prepare | format="concise" | 4.3x (1,015t → 236t) |
| canopy_prepare | format="targeted" | 5.6x (1,015t → 181t) |
| canopy_validate | format="concise" | 4.3x–12.5x depending on finding count |
| canopy_understand | format="concise" | 4.3x on average |

When to use concise vs targeted on canopy_prepare:

  • Use concise when you want the full dependent list but not the explanatory prose.
  • Use targeted when you only need the direct (depth-1) dependents and the GO/CAUTION/STOP verdict. It is the fastest safe pre-check.
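
As a concrete comparison, here is the same prepare call at each format level, using the measured totals from the table above; the file path is the illustrative one used in Recipe 3, not a required value.

  • canopy_prepare file="src/lib/core.ts" (default format: full output, ~1,015t)
  • canopy_prepare file="src/lib/core.ts" format="concise" (verdict plus full dependent list, ~236t)
  • canopy_prepare file="src/lib/core.ts" format="targeted" (verdict plus direct dependents only, ~181t)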

Pagination

Pagination slices large result sets without summarizing them. Unlike summary=true, pagination gives you the actual file paths — just not all of them at once.

Turn on with: limit=N to cap results, offset=N to skip the first N items, directory="path/prefix/" to scope results to one subtree.

Supported tools:

| Tool | Parameter | Measured reduction |
| --- | --- | --- |
| canopy_trace_imports | limit=N, offset=N, directory=X | 5.6x–7.1x on hub files |
| canopy_trace_dependents | limit=N, offset=N, directory=X | 5.6x–7.1x on hub files |
| canopy_search | limit=N | Proportional to limit vs default |
| canopy_search_symbols | limit=N | Proportional to limit vs default |

When to use pagination vs summary:

  • Use summary=true first to understand the shape of a large result set (how many files, which directories dominate).
  • Use directory=X + limit=N to drill into a specific subtree once you know where to look.
  • Pagination and summary compose: summary=true top_n=5 followed by directory="lib/collection/" limit=20 is a documented recipe (see Recipe 3).
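
A sketch of how the pagination parameters combine on a large dependent set, using only the parameters documented above; the page size of 20 and the paths are illustrative choices, not measured recipes.

  • canopy_trace_dependents file="src/lib/core.ts" directory="lib/collection/" limit=20 (first 20 dependents in that subtree)
  • canopy_trace_dependents file="src/lib/core.ts" directory="lib/collection/" limit=20 offset=20 (the next 20)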

JSON structured output

JSON mode swaps Canopy’s human-readable plaintext output for machine-readable JSON. For most tools, the raw token count is similar — text is already compact. The benefit of JSON is parseability: downstream tooling, custom scripts, and LLM function-calling pipelines can extract fields without regex.

Turn on with: format="json".

Supported tools:

| Tool | JSON vs text reduction | Notes |
| --- | --- | --- |
| canopy_architecture_map | 1.49x–1.62x smaller in JSON | Recommended: use JSON for this tool |
| canopy_health_check | 1.3x smaller in JSON | Recommended: use JSON for this tool |
| canopy_trace_imports | Marginally larger in JSON | Text is already compact; prefer text |
| canopy_trace_dependents | Marginally larger in JSON | Text is already compact; prefer text |
| canopy_check_wiring | Neutral | Use JSON only if parsing downstream |
| canopy_find_cycles | Neutral | Use JSON only if parsing downstream |
| canopy_dependency_graph | Neutral | Use JSON only if parsing downstream |

Rule of thumb: Use format=json for canopy_architecture_map and canopy_health_check. Do not use it for trace tools — their plaintext output is already highly compact and JSON wrapping adds overhead.
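
Collected in one place, the call shapes this page itself recommends when JSON output is worth it; each combination appears in a table or recipe elsewhere on this page.

  • canopy_architecture_map format=json
  • canopy_health_check summary=true format=json
  • canopy_health_check check=circular_deps limit=3 format=json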


Targeted filtering

Filter levers narrow the scope of a call before it runs — reducing both compute and output. They are the highest-leverage levers when you already know what you are looking at.

Supported parameters:

| Tool | Parameter | Effect | Measured reduction |
| --- | --- | --- | --- |
| canopy_health_check | check=X | Run only one check type (e.g., check=circular_deps) | 5.1x–29.3x vs full health dump |
| canopy_trace_imports | directory=X | Restrict results to imports under this path prefix | 5.6x–7.1x |
| canopy_trace_dependents | directory=X | Restrict results to dependents under this path prefix | 5.6x–7.1x |
| canopy_search | path_prefix=X | Restrict search results to files under this path | 2x on broad queries |

Valid check values for canopy_health_check:

  • circular_deps
  • dead_exports
  • broken_imports
  • missing_files
  • stale_references
  • secret_scan

When you know the category of problem you are investigating, filter to that check first. A full health dump on the Pith monorepo returns 3,913 tokens of findings. Filtering to check=circular_deps returns ~579 tokens.
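
The check filter also composes with the other levers. Both of the following are valid call shapes per the composition notes later on this page; their token costs are not separately measured here.

  • canopy_health_check check=circular_deps summary=true format=json
  • canopy_health_check check=dead_exports limit=5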


Workflow recipes

These five recipes are derived from the 14 experiments and represent measured, not hypothetical, token budgets. All measurements are against the Pith monorepo (622 files, 3,752 chunks).


Recipe 1: “What shape is this repo?” (~390t total)


Use this at session start or when you are unfamiliar with a codebase.

canopy_survey question=health

What you get: Hub files (highest in-degree), cycle count, acyclic flag, health summary by category, recommended drill areas.

What you do next: Based on the acyclic flag and hub list, decide whether to go to Recipe 2 (something is broken) or Recipe 3 (safe to edit, need blast radius).

Savings vs naive full survey: ~11x (4,472t naive, 390t here).


Recipe 2: “How bad is the health? What type of issues?” (~707t total)


Use this when the survey flags problems or when you are about to do a refactor and want to know the health baseline.

Step 1 — Get the category breakdown (~128t):

canopy_health_check summary=true format=json

You receive a count-by-category table. If circular_deps is non-zero, proceed to step 2. If only dead_exports are present, the repo is structurally safe (dead exports are informational on barrel-export files).
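
As an illustration only, the count-by-category result can be pictured as something like the sketch below. The field names and placeholder values are not the real schema (that lives in the MCP Tools reference); the category keys are simply the six valid check values listed earlier on this page.

  { "circular_deps": <count>, "dead_exports": <count>, "broken_imports": <count>,
    "missing_files": <count>, "stale_references": <count>, "secret_scan": <count> }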

Step 2 — Drill into the category that matters (~579t):

canopy_health_check check=circular_deps limit=3 format=json

You receive the top 3 cycles with their full file chains. This is enough to identify the cycle root and plan a break.

Total: ~707t. Savings: 5.5x vs full health dump (3,913t).


Recipe 3: “Blast radius of this file?” (~279t total)


Use this before modifying any file that might have many dependents.

Step 1 — Summary orientation (~119t):

canopy_trace_dependents file="src/lib/core.ts" summary=true top_n=5

You receive: total dependent count, count grouped by directory, and the 5 files with the largest transitive reach. If the total count is less than 10, stop here — you have the full picture.

Step 2 — Directory drill if count is large (~160t):

canopy_trace_dependents file="src/lib/core.ts" directory="lib/collection/"

You receive the actual dependent file paths restricted to one directory. Repeat with a different directory value for each subtree that matters.

Total: ~279t. Savings: 3.2x vs full enumeration (905t).


Recipe 4: “Safely break a cycle” (~426t total)


Use this when health check reports a circular dependency and you need to resolve it with minimum rework.

Step 1 — Targeted prepare on the file you will modify (~181t):

canopy_prepare file="src/lib/cycleA.ts" format=targeted

You receive: GO/CAUTION/STOP verdict, direct dependents, health findings scoped to this file. No prose. No git narrative.

Step 2 — Edit the file (not a Canopy call — this is your actual code change).

Step 3 — Concise validate (~128t):

canopy_validate paths=["src/lib/cycleA.ts"] format=concise

You receive: PASSED/FAILED verdict, any new health findings introduced, list of dependents at risk.

Step 4 — Summary health check to confirm cycle is gone (~117t):

canopy_health_check check=circular_deps summary=true format=json

Total: ~426t. Savings vs naive prepare-edit-validate: 9.4x (4,015t naive).


Recipe 5: “Semantic search with orientation” (~2,500t total)


Use this when you need to find code by concept rather than by exact symbol name.

Step 1 — Broad search in summary mode (~1,450t):

canopy_search query="auth" summary=true top_n=5 limit=50

You receive: match count per directory, top 5 directories by match density. Use this to identify which directory contains the auth implementation you are looking for.

Step 2 — Drill into that directory (~1,050t):

canopy_search query="auth" path_prefix="src/chosen_dir"

You receive the actual matching code chunks, scoped to the directory identified in step 1.

Total: ~2,500t. Savings: 2.0x vs naive broad search (4,900t).

Note: search is the highest-cost category at any tier because it returns code content, not metadata. The levers here buy a 2x reduction, not a 10x reduction. The larger gains come from graph and health tools.


| Tool | Density parameters supported | Best measured result |
| --- | --- | --- |
| canopy_prepare | format=concise, format=targeted | 5.6x (targeted) |
| canopy_validate | format=concise | 4.3x–12.5x |
| canopy_understand | format=concise | 4.3x |
| canopy_health_check | summary=true, check=X, format=json | 29.3x (single check, scoped) |
| canopy_architecture_map | format=json | 1.62x |
| canopy_trace_imports | summary=true, top_n=N, limit=N, offset=N, directory=X, format=json | 419x (hub file, summary mode) |
| canopy_trace_dependents | summary=true, top_n=N, limit=N, offset=N, directory=X, format=json | 419x (hub file, summary mode) |
| canopy_find_cycles | summary=true, format=json | ~5x |
| canopy_check_wiring | format=json | Neutral (use for parseability only) |
| canopy_dependency_graph | format=json | Neutral (use for parseability only) |
| canopy_search | summary=true, top_n=N, limit=N, path_prefix=X | 2x (broad query, oriented drill) |
| canopy_search_symbols | summary=true, limit=N | ~3x |
| canopy_pattern_search | limit=N | Proportional |
| canopy_git_blame | summary=true | ~4x on large files |
| canopy_git_history | limit=N | Proportional |
| canopy_parse_file | include_body=false (default) | N/A — body=false is already the default |
| canopy_extract_symbol | (none) | N/A — scoped by design |
| canopy_ingest_scip | (none) | N/A |
| canopy_coverage | (none) | N/A |
| canopy_index_status | (none) | N/A |
| canopy_reindex | full=false (default, incremental) | N/A |

Tools marked “N/A” either have no variable-size output (index operations, ingest) or are scoped by design (canopy_extract_symbol always returns one symbol; canopy_parse_file with no body is already compact).


Should I always pass summary=true?

No. Use it to orient before drilling, not as a permanent setting. Summary mode hides per-file detail. If you need to know whether src/lib/parser.ts specifically imports from your modified file, summary=true will not tell you — it only gives you a count. Orient with summary, drill without it.
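
For example, to answer the parser question you would orient first and then drill without summary, roughly like this (the paths echo the examples used elsewhere on this page):

  • canopy_trace_dependents file="src/lib/core.ts" summary=true (orient: counts by directory)
  • canopy_trace_dependents file="src/lib/core.ts" directory="src/lib/" (drill: actual paths, then scan for src/lib/parser.ts)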

When does JSON beat text?

Only on canopy_architecture_map (1.49x–1.62x smaller) and canopy_health_check (1.3x smaller). These tools produce structured records that compress better as JSON. Trace tools produce compact path lists where JSON wrapping adds overhead rather than reducing it. Use format=json for architecture and health; use default text for trace tools.

Is there a “debug mode” that disables all density levers?

No. The reverse is true: passing no parameters produces the full verbose output, which is the default behavior on every tool. The density levers are opt-in. If you want legacy verbosity, omit all density parameters.

How do I know which tool to reach for first?

Follow the four tiers described above. Start at Assessment (canopy_survey question=health). Advance to Triage only if the assessment flags a problem. Advance to Orientation only when you have identified a specific file or module. Advance to Detail only when you know exactly what you need and are ready to act.

Do the levers compose?

Yes, and they are designed to. summary=true top_n=5 format=json is a valid combination. check=circular_deps summary=true format=json is a valid combination. directory="src/lib" limit=20 offset=0 is a valid combination. Parameters from different lever categories do not conflict. The one pairing to be careful with is summary=true plus directory=X on the same trace call when what you actually want is the full enumeration of that subtree: summary mode collapses the listing you are trying to see, so pick one or the other based on what you need.
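
Written out as call shapes, those same combinations look like this; the file and directory values are placeholders, and the tool names are the ones whose tables above list those parameters:

  • canopy_trace_dependents file=<file> summary=true top_n=5 format=json
  • canopy_health_check check=circular_deps summary=true format=json
  • canopy_trace_imports file=<file> directory="src/lib" limit=20 offset=0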

What is canopy_survey — I do not see it in the tool list?

canopy_survey is the Assessment-tier entry point referenced in the recipes. It is a composite tool that internally calls canopy_architecture_map, canopy_health_check summary=true, and canopy_find_cycles summary=true in sequence and returns a structured overview. It is listed as one of the 21 MCP tools and its question parameter filters the output section returned. See the MCP Tools reference for the full parameter schema.
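
Based on that description, the single Assessment-tier call

canopy_survey question=health

stands in for this sequence run back to back:

  • canopy_architecture_map
  • canopy_health_check summary=true
  • canopy_find_cycles summary=true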


  • MCP Tools reference — full parameter schemas, types, and default values for every tool
  • Getting Started — install, index, and connect to your AI agent
  • Tool Catalog — all 21 tools with one-line descriptions, organized by category
  • How Canopy Works — the indexing pipeline and storage architecture behind the tools