Issue #3586150: Add ai_context_site_architecture submodule with MCP resource provider. (!146) · Merge requests · project / ai_context

Summary

Adds ai_context_site_architecture — a submodule that auto-harvests "site behavior contracts" describing what owns each surface on a Drupal site (entity routes, Views, path aliases, pathauto patterns, ECA rules, field definitions, content moderation workflows, module state) and writes them into CCC as drafts for human review before they reach AI agents.

Plus an optional ai_context_site_architecture_mcp sub-submodule that exposes the same contracts as MCP resources at drupal://site/contracts/{contract_id}, so MCP-aware agents (Claude Code, Cursor, etc.) can fetch them on demand instead of relying solely on bulk system-prompt injection.

Cross-link: prior art is ai_best_practices issue #3587321 (static-codebase context-harvester, +93pp accuracy / −62% cost on the project-context eval). This is the live-bootstrap counterpart wired into CCC governance.

Eval evidence

Two complementary access channels were measured on a real Drupal CMS bench (ai-bp-test, 1608 harvested contracts after enabling all scanners).

Channel 1 — CCC system-prompt injection (Q&A)

15-question dataset, single run per condition, same agent + model + judge across runs. Per-kind tag subscription was the unblock — under a single umbrella tag the CCC selector tie-broke on insertion order and surfaced useless system pseudo-routes ahead of site-specific contracts.

Metric	Baseline	Per-kind treatment	Δ
Avg score (1–5)	1.93	4.40	+2.47
Pass rate (≥3.5)	7%	87%	+80pp

13/15 questions improved or held at 5/5 between the umbrella-tag and per-kind treatments; the three follow-up gap questions (publish workflow, contact form, category page) all closed once content_moderation_workflow and module_state contracts were added.

Full dataset + run JSONs: https://gist.github.com/gkastanis/5b2298c87e644b65168a4dfb5074e40a.

Channel 2 — MCP lazy retrieval (coding tasks)

9-task coding eval driven by Claude Code CLI in stdio mode against the new MCP submodule. Snapshot-restore between every invocation so disk state doesn't bias latency. The agent decides if and which contract to fetch — no bulk injection.

The eval was run twice. First a two-arm clean run with the bare resource template (n=3 per condition, 54 calls). Then, after adding the list_site_contracts + get_site_contract tools and a server-level instructions string injected into the initialize response, a three-arm re-run (n=1 per condition, 27 calls).

After L1+L2+L3 (the design that ships in this MR)

Metric	Baseline (no MCP)	Naive (no hint)	Hinted (with hint)
Pass rate (gated: retrieval ∧ token-match)	77.78%	88.89%	100.00%
Token-match only	77.78%	100.00%	100.00%
Retrieval rate	—	88.89%	100.00%
Mean elapsed	144s	92s (−37%)	85s (−41%)
Median elapsed	134s	83s (−38%)	67s (−50%)

Naive on the new architecture matches what hinted achieved on the old one. Pairwise: naive +11.1pp gated / +22.2pp token-match vs baseline; hinted +22.2pp on both vs baseline. The agent now reliably discovers the contracts without an external system-prompt hint.

Original two-arm clean run (preserved for comparison, n=3)

Metric	Baseline (clean)	Hinted (clean, no L1+L2+L3)
Token-match	85.19%	88.89%
Retrieval	—	81.48%
Mean elapsed	146s	114s

The original ran before list_site_contracts/get_site_contract and the server-level instructions existed. The naive arm there was 0% retrieval — Claude Code does not surface resources/templates/list as a tool, so the agent could not discover URI templates without a hint. Adding tool plugins + initialize-time instructions closed that gap.

Pipeline + raw runs: https://gist.github.com/gkastanis/83df7bf3055d3f456e5f01c16a5a70e0.

Architecture (what's in the box)

Scanners (one per ScannerInterface): RouteScanner, ViewsScanner, AliasScanner, PathautoPatternScanner, EcaScanner, FieldDefinitionScanner, ContentModerationScanner, ModuleStateScanner. Each yields contract dicts.
Post-processor pipeline: SurfaceRegistry (conflict detection), EnvelopeBuilder (resource URIs, SHA-256, capability hints), CodeSignalEnricher (regex-scans custom-module hooks/event subscribers and attaches them to relevant contracts).
Writer: CcciItemWriter creates ai_context_item entities with the JSON contract embedded in the body via a marker pair plus a human-readable markdown projection above. DryRunWriter prints to stdout.
Service: ContractRepository round-trips the JSON envelope on read (get($id), list(), findByPath(), surfaceIndex(), resources(), readResource($uri)).
Drush: ai-context:harvest-architecture (with --dry-run, --replace) and ai-context:export-architecture. The export command writes contracts as JSON to <drupal_root>/.ai/docs/site-architecture/ by default (overridable via --output=); intended as the static fallback to MCP — commit to git and reference from AGENTS.md for agents that can't connect to an MCP server.
Optional MCP submodule (ai_context_site_architecture_mcp) exposes contracts on three complementary surfaces so any MCP-aware agent finds them:
- ResourceTemplateProvider plugin under drupal://site/contracts/{contract_id} (URI-based access for clients that prefer resource semantics; self-registers via hook_install to avoid colliding with mcp_server_examples on the shared mcp_server.resource_template_providers config object).
- Two #[Tool] plugins for clients that discover capabilities via tools/list: list_site_contracts(kind?) returns IDs grouped by kind; get_site_contract(contract_id) returns the JSON envelope.
- A ResponseEvent subscriber that injects a one-paragraph instructions string into the initialize response, so agents are told about the tools + URI shape on connect without needing a system-prompt hint.
- All three surfaces gate reads on the existing 'access published ai context' permission.

What's covered (and what isn't)

Scanners walk Drupal's container without filtering origin — contracts are emitted for core, contrib, and custom modules alike, distinguished by owner.is_custom: true|false on every contract. Operators can subscribe per-kind tag (kind:entity_route, kind:view, etc.) to scope what reaches an agent's prompt.

Code-signal enrichment is narrower in v1: CodeSignalEnricher regex-scans /modules/custom/ only and attaches detected hooks / event subscribers / Views alters / route subscribers to the contracts they affect. Contrib and core extension code are not scanned. This is a deliberate v1 scope cut, not a position that contrib code is uninteresting:

Custom code is high-leverage because the model has never seen it; AI agents can already reason about contrib hooks they encountered in training data.
Regex-scanning the full contrib tree adds harvest wall-clock time and contract envelope size; both have downstream costs (selector token budget, MCP read latency).
The eval that landed this configuration measured custom-code generation. Debug-class questions ("why is X happening on this surface?") — where contrib signals would add the most value — are not on the current bench.

Surfacing contrib code signals optionally (config flag, default off, predicated on a debug-class eval bench) is tracked as a follow-up issue.

Conventions followed (and where we extended)

We reuse existing CCC primitives wherever possible:

No new scope plugin. Per the plan, we use the existing Tag scope from ai_context. Per-kind taxonomy terms (kind:entity_route, kind:view, kind:alias, kind:content_moderation_workflow, kind:module_state, etc.) are auto-created in the existing ai_context_tags vocabulary at harvest time so agents can subscribe selectively without anyone defining a new #[AiContextScope] plugin.
No new permission. The MCP submodule's resource provider and the two tools all gate on the existing 'access published ai context' permission shipped by the parent module.
No parallel storage. The harvester writes ai_context_item entities programmatically via AiContextItem::create() + setScopeValues(), the same path manually-authored items use. The CCC admin UI at /admin/ai/context/items lists harvested drafts alongside hand-written ones.
Drush namespace. Commands live under ai-context:* to match the parent module.

Three deliberate extensions, each contained:

JSON envelope embedded in the markdown body via a marker pair (). Editors see clean markdown above the marker; the structured payload below is for ContractRepository::get() round-trips. This is the largest semantic departure from "the body is for prose" — happy to discuss alternatives (config entity? separate table?) if it's an obstacle.
Auto-populated taxonomy terms. Per-kind terms in ai_context_tags are created by the harvester, not by a human editor. The volume is small (≤10 terms across all kinds) and the names are namespaced (kind:*), so collision risk with human- authored tags is minimal.
Nested MCP submodule (ai_context_site_architecture_mcp/). Drupal core's submodules are typically siblings, not nested. We nested because this submodule is a strict dependent of ai_context_site_architecture (only useful when its parent is enabled) and we wanted enable/disable to track. Easy to flatten to a sibling layout if the project prefers.

Testing

49 PHPUnit tests (Unit + Kernel) across both submodules; all green.
- ai_context_site_architecture: 47 tests / 7132 assertions.
- ai_context_site_architecture_mcp: 2 tests / 75 assertions (resource template + site_contracts tools).
End-to-end smoke on ai-bp-test: ddev drush ai-context:harvest-architecture produces 1608 contracts across 9 kinds.
MCP wire-level smoke via stdio: initialize returns the server instructions string; resources/templates/list returns the drupal://site/contracts/{contract_id} template; resources/read for field_definition:node:page returns the full envelope; both tools/call mcp__ai-bp-test__list_site_contracts and mcp__ai-bp-test__get_site_contract return successfully with the expected shapes.

Scope cuts (explicit non-goals — tracked as follow-ups)

Negative contracts. Schema reserves negative: bool; producing them needs a joint design call (TTL/invalidation, lookup primitive) with MCP and agent owners.
UI for managing the harvest job. Drush only in v1; cron is a follow-up.
Layout Builder / Canvas page contracts.
Translation handling. Contracts are in the site default language.

Stability note on `_mcp` submodule

The ai_context_site_architecture_mcp submodule integrates with drupal/mcp_server, which currently only has a 2.x-dev branch — no tagged stable release. To run the full test matrix against it, this MR's composer.json adds drupal/mcp_server: 2.x-dev to require-dev (alongside drupal/eca and drupal/pathauto, also exercised by optional-integration tests) and sets minimum-stability: dev / prefer-stable: true.

Concretely, this means:

The parent module ai_context_site_architecture is independently installable and has no soft or hard dep on mcp_server.
The _mcp sub-submodule tracks mcp_server 2.x-dev. Reviewers should treat it as early integration until mcp_server tags a stable release. If maintainers prefer to land just the parent module now and the submodule once mcp_server stabilizes, happy to split.

Follow-ups (filed alongside this MR)

ai_context: negative contracts (TBD)
ai_context: view ai context items permission default (TBD)
ai_context: optionally surface contrib code signals — config flag, default off, predicated on a debug-class eval bench (TBD).
mcp_server: per-plugin contribution to initialize instructions string (TBD). This MR uses the SDK's generic ResponseEvent as a workaround; an upstream API would be cleaner.
ai_best_practices: the four MCP-aware additions to run-evals.py (--mcp-config switch, stream-json parser, paired baseline/treatment runs, retrieval_check assertion) will land in ai_best_practices itself as the canonical home for the eval harness — direct successor to closed #3583202 ("Explore provider-agnostic eval runner"). This MR's evals/mcp-eval/ ships as a local reproducer; the gist is the public surface.

cc @scottfalconer @kepol — flagged on ai_best_practices #3587321 as relevant prior art.

Edited May 06, 2026 by George Kastanis

Issue #3586150: Add ai_context_site_architecture submodule with MCP resource provider.