SystemPromptSubscriber re-injects full context on every agent loop iteration
>>> [!note] Migrated issue
<!-- Drupal.org comment -->
<!-- Migrated from issue #3582288. -->
Reported by: [alex ua](https://www.drupal.org/user/110386)
Related to !114 !113
>>>
<h3>Problem</h3>
<p><code>SystemPromptSubscriber::onPreSystemPrompt()</code> fires on every <code>BuildSystemPromptEvent</code>, which dispatches on every agent loop iteration. For agents with <code>always_include</code> context items, the full context block is re-appended to the system prompt on every LLM call across all loops.</p>
<p>The system prompt is rebuilt each loop, and the context block is re-injected each time — the same content, at the same position, providing no additional information to the LLM (which already has it from loop 0 in its conversation window). The cost scales with loop count.</p>
<p>The pattern is similar to cache-unaware code that re-fetches on every call despite the result being unchanged. <code>available_on_loop</code> in <code>default_information_tools</code> already solves the equivalent problem for tool outputs — the same principle should apply to ai_context items.</p>
<h3>Measured cost</h3>
<p>All measurements are N=1 on a single demo page (15 components, 8 ai_context items totaling ~86K bytes). We expect directional accuracy but recommend instrumented measurement across diverse operations before committing to an architectural change.</p>
<table>
<tr>
<th>Agent</th>
<th>Typical loops</th>
<th>Context per injection</th>
<th>Wasted tokens (loops 1+)</th>
</tr>
<tr>
<td>canvas_page_builder_agent</td>
<td>3 (measured)</td>
<td>~22K tokens (86K bytes)</td>
<td>~44K</td>
</tr>
<tr>
<td>canvas_template_builder_agent</td>
<td>3-8 (observed)</td>
<td>~22K tokens</td>
<td>44-154K</td>
</tr>
</table>
<p>On a heading edit operation (101K total tokens at baseline, 16.4s latency), stripping ai_context on loops 1+ reduces total cost to 48K tokens — a 52% reduction. This was the largest single optimization we measured across layout scoping, context filtering, and deterministic routing combined.</p>
<p>Context size is configuration-dependent — sites with fewer or smaller ai_context items will see proportionally smaller absolute savings, but the relative reduction from eliminating re-injection remains significant whenever context items are non-trivial.</p>
<h3>Proposed solution</h3>
<p>Add a <code>loop_aware</code> setting to per-agent context configuration in <code>ai_context.agents</code>. When enabled, <code>SystemPromptSubscriber</code> checks the current loop count and skips injection on loop > 0. This follows the same pattern as <code>available_on_loop</code> for tool outputs.</p>
<p><strong>Implementation sketch:</strong></p>
<ol>
<li>Add <code>loop_aware</code> boolean to <code>ai_context.schema.yml</code> per-agent mapping (alongside <code>always_include</code>, <code>excluded_subcontext</code>).</li>
<li>In <code>SystemPromptSubscriber::onAgentStarted()</code>, capture <code>$event->getLoopCount()</code> per agent ID.</li>
<li>In <code>SystemPromptSubscriber::onPreSystemPrompt()</code>, skip injection when the agent's <code>loop_aware</code> config is enabled and the captured loop count is greater than 0.</li>
<li>Default: <code>FALSE</code> (no behavior change for existing sites). Missing key treated as <code>FALSE</code> — no update hook needed.</li>
</ol>
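<p>A minimal sketch of steps 2&ndash;4 inside <code>SystemPromptSubscriber</code>. The event accessors (<code>getAgentId()</code>, <code>getLoopCount()</code>) and the <code>getAgentContextConfig()</code> helper are illustrative assumptions, not the actual ai_agents/ai_context API:</p>
<pre><code>// Illustrative sketch only -- event accessors and the config helper
// are assumptions; adapt to the real ai_agents / ai_context APIs.
protected array $loopCounts = [];

public function onAgentStarted($event): void {
  // Step 2: track the current loop count per agent ID.
  $this->loopCounts[$event->getAgentId()] = $event->getLoopCount();
}

public function onPreSystemPrompt($event): void {
  $agentId = $event->getAgentId();
  $settings = $this->getAgentContextConfig($agentId); // ai_context.agents entry

  // Steps 3-4: a missing 'loop_aware' key reads as FALSE, so existing
  // sites keep the current behavior and no update hook is needed.
  if (!empty($settings['loop_aware']) && ($this->loopCounts[$agentId] ?? 0) > 0) {
    return; // Context is already in the conversation window from loop 0.
  }

  // ... existing context injection logic continues here ...
}
</code></pre>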
<p>Per-agent granularity is intentional: single-loop agents (orchestrators, chatbots) should always get context. Only multi-loop agents (page_builder, template_builder) benefit from skipping.</p>
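<p>The per-agent configuration could then look like this (the structure is illustrative; only <code>loop_aware</code> is new, and the other keys follow the existing per-agent settings named above):</p>
<pre><code>ai_context:
  agents:
    canvas_page_builder_agent:
      always_include: [brand_guidelines]
      excluded_subcontext: []
      loop_aware: true    # multi-loop agent: inject on loop 0 only
    canvas_orchestrator_agent:
      loop_aware: false   # single-loop agent: always inject (default)
</code></pre>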
<p>We have not observed output quality degradation in our testing (brand guidelines and writing tone remained consistent across edited content), but recommend verifying this for diverse agent configurations before enabling by default. The per-agent flag provides a safe rollout path.</p>
<h3>Relationship to existing work</h3>
<ul>
<li>Complementary to <a href="https://www.drupal.org/project/ai_context/issues/3564706">#3564706</a> (Context Scope feature) — Scope filters <em>which</em> items to inject; this filters <em>when</em> to inject them. Even with perfect scope filtering, surviving items are still re-injected every loop without this fix.</li>
<li>Adjacent to <a href="https://www.drupal.org/project/ai_agents/issues/3524351">AI Agents Issue #3524351</a> (tool memory re-injection) — that addresses tool output memory; this addresses context item re-injection. Same underlying pattern: don't repeat data the LLM already has.</li>
<li><code>available_on_loop</code> in <code>default_information_tools</code> is the closest precedent.</li>
</ul>
<h3>Prototype</h3>
<p>Working <code>LoopAwareContextSubscriber</code> in a custom module. Before/after measurements confirm 52% total token reduction on a single heading edit (N=1). The subscriber runs at priority -5, after ai_context's SystemPromptSubscriber (implicit priority 0 via Symfony default).</p>
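<p>For reference, the prototype achieves its ordering through the listener priority in <code>getSubscribedEvents()</code> (the event name constant is an assumption about the ai module's API, and the handler name is the prototype's own):</p>
<pre><code>public static function getSubscribedEvents(): array {
  // Priority -5 runs after ai_context's SystemPromptSubscriber
  // (default priority 0), so the prompt is fully built before we
  // strip the re-injected context block on loops 1+.
  return [
    BuildSystemPromptEvent::class => ['onPostSystemPrompt', -5],
  ];
}
</code></pre>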
<p>Happy to contribute a patch if this direction looks right.</p>
<p><em>Disclaimer: This was written, along with the code, with the assistance of Claude Code Opus</em></p>