AI agent runner resume state (#3585053) · Issues · project / ai

AI agent runner resume state

>>> [!note] Migrated issue   Reported by: [tim bozeman](https://www.drupal.org/user/2241356) Related to !1515 >>> [Tracker] Update Summary: Patch ready for review Short Description: AgentRunner does not inject new user messages when resuming an agent from tempstore Check-in Date: 04/13/2026 Metadata is used by the <a href="https://www.drupalstarforge.ai/" title="AI Tracker">AI Tracker.</a> Docs and additional fields <a href="https://www.drupalstarforge.ai/ai-dashboard/docs" title="AI Issue Tracker Documentation">here</a>. [/Tracker] <h3 id="summary-problem-motivation">Problem/Motivation</h3> When an AI agent is run via <code>AgentRunner::runAsAgent()</code>, the agent may be saved to the private tempstore and resumed on a subsequent user message. This happens in two scenarios: <ol> <li>Verbose/streaming mode: The agent saves state between loop iterations (<code>setLooped(FALSE)</code>), allowing the frontend to poll for progress.</li> <li><code>return_directly</code> tools: When a tool (such as a sub-agent) has <code>return_directly=true</code>, the agent returns the tool output immediately without calling the LLM. The agent is not marked as finished, so its state is persisted to tempstore. On the next user message, the agent should resume with full conversation context.</li> </ol> However, the current restore path simply calls <code>$agent->fromArray($agent_data)</code> without making any adjustments to the restored state. This causes several problems: <ol> <li>User's new message is lost: The agent's internal <code>chatHistory</code> (which includes detailed tool call context) does not include messages sent after the agent was saved. The user's follow-up message is available in the <code>$chat_history</code> parameter but is never injected into the agent's history. The agent continues from its saved state without awareness of the user's follow-up.</li> <li>Loop budget is exhausted: The <code>looped</code> counter is restored at its previous value (e.g. 2 or 3), consuming the agent's max loop budget. Resetting it to 0 gives the agent a fresh budget for the new turn.</li> <li>Stale context tools may re-execute: The <code>context_tools</code> array may contain tools from the previous turn (e.g. from a <code>return_directly</code> tool whose cleanup was skipped due to early return). These tools would be re-executed on the next <code>determineSolvability()</code> call.</li> </ol> Together, these issues prevent multi-turn agent conversations from working correctly when the agent is resumed from tempstore. The <code>return_directly</code> case is especially impactful: orchestration agents that delegate to sub-agents (e.g. a routing chatbot calling a page builder) lose all conversation context between turns, making multi-step workflows like workspace selection impossible. <h4 id="summary-steps-reproduce">Steps to reproduce (required for bugs, but not feature requests)</h4> Scenario A — <code>return_directly</code> sub-agent (primary use case): <ol> <li>Configure an orchestration agent (e.g. a routing chatbot) with a sub-agent tool (e.g. <code>page_builder</code>) that has <code>return_directly=true</code> in the agent's <code>tool_settings</code>.</li> <li>Send a message that triggers the sub-agent (e.g. "Create a page about dolphins").</li> <li>The sub-agent encounters a condition requiring user input (e.g. workspace selection) and returns a question to the user.</li> <li>Send a follow-up message answering the question (e.g. "Use the Animals workspace").</li> <li>Observe that the agent does not incorporate the follow-up message — it starts fresh with no memory of the previous exchange.</li> </ol> Scenario B — verbose/streaming mode: <ol> <li>Configure an assistant that uses an AI agent.</li> <li>Enable verbose mode so the agent saves state between iterations.</li> <li>Send a message that triggers tool usage (the agent will save to tempstore after the first loop).</li> <li>Send a follow-up message in the same thread.</li> <li>Observe that the agent does not incorporate the follow-up message — it either ignores it, re-executes stale tools, or hits the loop limit.</li> </ol> Environment: Drupal AI module 1.3.x, AI Assistant API sub-module, any LLM provider. Testing with Entity Blueprint: The <a href="https://www.drupal.org/project/entity_blueprint">Entity Blueprint</a> contrib module provides a concrete setup for reproducing this. Its <code>entity_blueprint_ai</code> sub-module exposes a <code>page_builder</code> agent with tools like <code>get_blueprint_schema</code> and <code>switch_workspace</code>. Configure a routing chatbot agent with the page builder as a <code>return_directly</code> sub-agent tool, then ask it to create content that requires workspace selection — the multi-turn workspace negotiation will fail without this fix. <h3 id="summary-proposed-resolution">Proposed resolution</h3> Before calling <code>fromArray()</code>, inspect the restored chat history to determine whether there are unresolved tool calls (i.e. <code>tool_use</code> blocks without matching <code>tool_result</code> blocks). This distinguishes two resume scenarios: <ol> <li>Unresolved tool calls (verbose mode mid-loop): The agent stopped after receiving <code>tool_use</code> from the LLM but before executing the tools. In this case, let the agent continue from where it left off — the pending tools will be executed inside <code>determineSolvability()</code>. Do not inject the user message or reset state, as this would create invalid API message ordering (<code>tool_use</code> followed by <code>user</code> text instead of <code>tool_result</code>).</li> <li>All tools resolved (e.g. <code>return_directly</code> completed): The agent has finished processing all tool calls. Inject the user's new message, reset the loop counter, and clear stale context tools to start a fresh turn.</li> </ol> This ensures that: <ul> <li>In verbose mode, the agent continues processing pending tools naturally without injecting messages that would violate LLM API message ordering rules.</li> <li>After <code>return_directly</code> tools complete, the user's latest message is appended to the agent's rich internal history (preserving prior tool call context).</li> <li>The loop counter is reset so the agent has a full budget for the new turn.</li> <li>Stale context tools are cleared to prevent duplicate execution.</li> </ul> Note: This patch works in conjunction with <a href="https://www.drupal.org/project/ai_agents/issues/3585054">a separate fix in the AI Agents module</a> that ensures <code>return_directly</code> tool results include the <code>tool_id</code> (see related issue). Without that fix, the unresolved tool check cannot match tool results to their originating tool calls, and the restored chat history causes LLM API validation errors. <h3 id="summary-remaining-tasks">Remaining tasks</h3> <ul> <li>Review and commit the patch.</li> </ul> [x] AI Assisted Code This code was mainly generated by a human, with AI autocompleting or parts AI generated, but under full human supervision. > Related issue: [Issue #3585054](https://www.drupal.org/node/3585054)

issue