Dispatch AiExceptionEvent and allow subscribers to replace the response (graceful failover)
>>> [!note] Migrated issue <!-- Drupal.org comment --> <!-- Migrated from issue #3585233. --> Reported by: [marcus_johansson](https://www.drupal.org/user/385947) Related to !1514 >>> <p>[Tracker]<br> <strong>Update Summary: </strong>[One-line status update for stakeholders]<br> <strong>Short Description: </strong>Dispatch a new AiExceptionEvent from ProviderProxy::wrapperCall() so subscribers can rewrite the exception message or substitute an OutputInterface, enabling third-party failover without patching core.<br> <strong>Check-in Date: </strong>MM/DD/YYYY<br> [/Tracker]</p> <h3 id="summary-problem-motivation">Problem/Motivation</h3> <p>Projects are shipping the AI module's features and third parties want to customise how failures surface. Two real needs today:</p> <ul> <li><strong>Customise the error message.</strong> When OpenAI reports a budget overrun, the raw exception surfaces as <code>Drupal\ai\Exception\AiRequestErrorException: Error invoking model response: Budget has been exceeded! Current cost: 1.133637, Max budget: 1.0</code> thrown from <code>ProviderProxy::wrapperCall()</code>. The floating-point precision and the raw "Budget has been exceeded" text leak implementation detail to the end user. Downstream modules currently have no extension point to rewrite this before it propagates.</li> <li><strong>React to a failure without taking the caller down with it.</strong> When the primary provider exceeds quota or hits a rate limit, site owners want to transparently fall back to a secondary provider (or a cached response, or a canned apology) without the caller having to catch and retry. 
Right now <code>ProviderProxy::wrapperCall()</code> always rethrows and there is no way for a subscriber to substitute a valid <code>OutputInterface</code>.</li> </ul> <p>Both <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span> (dispatch an exception event) and a graceful-failover mechanism are being built or discussed in parallel. This issue does both in one patch because they share the dispatch site and the same event object.</p> <h3 id="summary-proposed-resolution">Proposed resolution</h3> <p>Three coordinated parts, all landing together because they touch the same three lines of <code>ProviderProxy::wrapperCall()</code> and the same event class.</p> <p><strong>Part A &mdash; Dispatch <code>AiExceptionEvent</code> (from <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span>).</strong> Add a new event class that carries the original exception and an overridable message, preserving the exception class when a subscriber rewrites the text so existing <code>catch (AiQuotaException $e)</code> blocks keep working:</p> <pre><div class="codeblock"><code class="language-php">namespace Drupal\ai\Event;<br><br>use Symfony\Contracts\EventDispatcher\Event;<br><br>final class AiExceptionEvent extends Event {<br><br>&nbsp; private string $message;<br><br>&nbsp; public function __construct(<br>&nbsp;&nbsp;&nbsp; public readonly \Exception $exception,<br>&nbsp; ) {<br>&nbsp;&nbsp;&nbsp; $this-&gt;message = $exception-&gt;getMessage();<br>&nbsp; }<br><br>&nbsp; public function setMessage(string $message): void {<br>&nbsp;&nbsp;&nbsp; $this-&gt;message = $message;<br>&nbsp; 
}<br><br>&nbsp; /**<br>&nbsp;&nbsp; * Returns the exception that should be thrown by the proxy.<br>&nbsp;&nbsp; *<br>&nbsp;&nbsp; * The original exception type is preserved so callers keep their<br>&nbsp;&nbsp; * specific catch blocks (AiQuotaException, AiRateLimitException, ...).<br>&nbsp;&nbsp; * Only the message is swapped if a subscriber rewrote it.<br>&nbsp;&nbsp; */<br>&nbsp; public function getException(): \Exception {<br>&nbsp;&nbsp;&nbsp; if ($this-&gt;message === $this-&gt;exception-&gt;getMessage()) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return $this-&gt;exception;<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp;&nbsp;&nbsp; $class = get_class($this-&gt;exception);<br>&nbsp;&nbsp;&nbsp; return new $class($this-&gt;message, $this-&gt;exception-&gt;getCode(), $this-&gt;exception);<br>&nbsp; }<br>}<p>Dispatch it in <code>ProviderProxy::wrapperCall()</code>, inside the existing <code>try/catch</code> block, for every caught exception type. Dispatch by class (no event-name string), consistent with the MR !846 review discussion:</p> <pre><div class="codeblock"><code class="language-php">catch (\Exception $e) {<br>&nbsp; // Catch \Exception rather than \Throwable: the event constructor is typed<br>&nbsp; // \Exception, so passing an \Error would raise a TypeError.<br>&nbsp; $event = new AiExceptionEvent($e);<br>&nbsp; $this-&gt;eventDispatcher-&gt;dispatch($event);<br>&nbsp; // Handled-response path - see Part B.<br>&nbsp; if ($forced = $event-&gt;getForcedOutputObject()) {<br>&nbsp;&nbsp;&nbsp; return $forced;<br>&nbsp; }<br>&nbsp; throw $event-&gt;getException();<br>}<p>Move the existing per-type <code>$this-&gt;loggerFactory-&gt;get('ai')-&gt;error(...)</code> calls out of <code>ProviderProxy</code> and into an event subscriber on <code>AiExceptionEvent</code> so logging behaviour is extensible too.</p> <p><strong>Part B &mdash; Allow subscribers to replace the response (failover).</strong> Add a forced-output slot on the event, mirroring <code>PreGenerateResponseEvent</code>:</p> <pre><div class="codeblock"><code class="language-php">// On AiExceptionEvent.<br>use Drupal\ai\OperationType\OutputInterface;<br><br>protected 
?OutputInterface $forcedOutputObject = NULL;<br><br>public function getForcedOutputObject(): ?OutputInterface {<br>&nbsp; return $this-&gt;forcedOutputObject;<br>}<br><br>public function setForcedOutputObject(OutputInterface $output): void {<br>&nbsp; $this-&gt;forcedOutputObject = $output;<br>}<p>If a subscriber called <code>setForcedOutputObject()</code>, the proxy returns that output instead of throwing. If no subscriber sets one, behaviour is identical to Part A alone: rethrow with the (possibly rewritten) message. This is the minimum surface that lets a third-party module implement graceful failover - for example, a module that sees an <code>AiQuotaException</code> on <code>openai</code> can call a backup provider via <code>AiProviderPluginManager</code>, build an <code>OutputInterface</code> from the response, and set it on the event; the original caller receives a successful response and never sees an exception.</p> <p><strong>Part C &mdash; Preserve the request context on the event.</strong> For failover to be useful, subscribers need to know which provider/model/input failed. Extend the constructor:</p> <pre><div class="codeblock"><code class="language-php">public function __construct(<br>&nbsp; public readonly \Exception $exception,<br>&nbsp; public readonly ?string $requestThreadId = NULL,<br>&nbsp; public readonly ?string $providerId = NULL,<br>&nbsp; public readonly ?string $operationType = NULL,<br>&nbsp; public readonly ?string $modelId = NULL,<br>&nbsp; public readonly mixed $input = NULL,<br>&nbsp; public readonly array $configuration = [],<br>&nbsp; public readonly array $tags = [],<br>) { ... }<p><code>ProviderProxy::wrapperCall()</code> already has all of these in scope at the point of dispatch (it constructs <code>PreGenerateResponseEvent</code> from them a few lines earlier). 
All new arguments are nullable / defaulted so existing instantiations, including test fixtures, keep working.</p> <p><strong>Example: a failover subscriber in a third-party module.</strong> With Parts A-C in place, a module can implement provider failover without patching AI core:</p> <pre><div class="codeblock"><code class="language-php">namespace Drupal\my_failover\EventSubscriber;<br><br>use Drupal\ai\AiProviderPluginManager;<br>use Drupal\ai\Event\AiExceptionEvent;<br>use Drupal\ai\Exception\AiQuotaException;<br>use Drupal\ai\Exception\AiRateLimitException;<br>use Drupal\ai\OperationType\Chat\ChatInput;<br>use Symfony\Component\EventDispatcher\EventSubscriberInterface;<br><br>final class FailoverSubscriber implements EventSubscriberInterface {<br><br>&nbsp; public function __construct(<br>&nbsp;&nbsp;&nbsp; private readonly AiProviderPluginManager $aiProvider,<br>&nbsp; ) {}<br><br>&nbsp; public static function getSubscribedEvents(): array {<br>&nbsp;&nbsp;&nbsp; return [AiExceptionEvent::class =&gt; 'onException'];<br>&nbsp; }<br><br>&nbsp; public function onException(AiExceptionEvent $event): void {<br>&nbsp;&nbsp;&nbsp; // Only failover on quota/rate-limit, and only for chat.<br>&nbsp;&nbsp;&nbsp; if (!($event-&gt;exception instanceof AiQuotaException<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; || $event-&gt;exception instanceof AiRateLimitException)) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return;<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp;&nbsp;&nbsp; if ($event-&gt;operationType !== 'chat' || !$event-&gt;input instanceof ChatInput) {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return;<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp;&nbsp;&nbsp; // Avoid looping if the backup itself threw.<br>&nbsp;&nbsp;&nbsp; if ($event-&gt;providerId === 'anthropic') {<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return;<br>&nbsp;&nbsp;&nbsp; }<br>&nbsp;&nbsp;&nbsp; $backup = $this-&gt;aiProvider-&gt;createInstance('anthropic');<br>&nbsp;&nbsp;&nbsp; $output = $backup-&gt;chat($event-&gt;input, 
'claude-3-5-sonnet-latest', $event-&gt;tags);<br>&nbsp;&nbsp;&nbsp; $event-&gt;setForcedOutputObject($output);<br>&nbsp; }<br>}<p>The caller of the original <code>chat()</code> call receives the backup provider's response and is unaware anything failed.</p> <p><strong>Backwards compatibility.</strong> Part A only changes the thrown exception when a subscriber rewrites the message; even then, the class is identical, so existing <code>catch (AiQuotaException $e)</code> keeps working. With no subscriber, behaviour is byte-for-byte identical to current code. Part B adds one optional behaviour (forced output) which is a no-op unless opted into. Part C adds constructor arguments that are all optional, so <code>new AiExceptionEvent($e)</code> remains a valid call. No existing callers of <code>ProviderProxy</code> need to change.</p> <p><strong>Why one issue instead of two.</strong> Parts A and B touch the same three lines of <code>ProviderProxy::wrapperCall()</code> and the same event class. Splitting them means landing an event that has a known missing capability (no way to recover gracefully), then landing a follow-up that rewrites what just landed. Landing together also means the tests cover the interaction (subscriber rewrites the message and sets a forced output - the output wins, the message is irrelevant), which isn't exercised if the two land separately.</p> <p><strong>Two MRs: 1.x preserves the original exception class, 2.0.x wraps in <code>AiResponseErrorException</code>.</strong> The rethrow behaviour is deliberately different on the two branches:</p> <ul> <li><strong>1.x MR</strong> - keep the exact current behaviour: the thrown exception is the same class as the original (<code>AiQuotaException</code>, <code>AiRateLimitException</code>, etc.), with only the message swapped when a subscriber rewrites it. 
This keeps every existing <code>catch (AiQuotaException $e)</code> block (and its siblings) in 1.x working unchanged.</li> <li><strong>2.0.x MR</strong> - normalise the rethrow into a single <code>AiResponseErrorException</code> that wraps the original exception as its <code>$previous</code>. Callers that care about the specific cause use <code>$e-&gt;getPrevious()</code> to reach the original <code>AiQuotaException</code> / <code>AiRateLimitException</code> / etc. This lets callers write one <code>catch (AiResponseErrorException $e)</code> for all provider-side failures while still being able to branch on the underlying class when needed. It also means the rewritten-message path no longer has to re-instantiate the original exception class dynamically (<code>new $class(...)</code>, which assumes every exception class accepts the standard constructor arguments): the wrapper is always the same type, and the original is available untouched via <code>getPrevious()</code>. This is a breaking change, which is why it ships in 2.0.x only.</li> </ul> <p>Both MRs share the event class, forced-output slot, and context properties. The only divergence is in <code>ProviderProxy::wrapperCall()</code>'s rethrow statement and in <code>AiExceptionEvent::getException()</code>. 
Tests split accordingly: the 1.x MR asserts the caller sees the original class; the 2.0.x MR asserts the caller sees <code>AiResponseErrorException</code> with <code>getPrevious()</code> returning the original.</p> <p><strong>Remaining tasks:</strong></p> <ul> <li>Implement <code>AiExceptionEvent</code> with message, forced-output, and context properties (Parts A + B + C).</li> <li>Update <code>ProviderProxy::wrapperCall()</code> to dispatch, honour forced output, and rethrow with the event's exception.</li> <li>Move per-type logging from <code>ProviderProxy</code> into <code>Drupal\ai\EventSubscriber\AiExceptionLoggingSubscriber</code> and register it in <code>ai.services.yml</code>.</li> <li>Unit test: subscriber that rewrites the message - assert the caller sees the rewritten text and the original exception class.</li> <li>Unit test: subscriber that sets a forced output - assert the caller receives the output and no exception is thrown.</li> <li>Unit test: subscriber that rewrites the message and sets a forced output - assert the forced output wins and no exception is thrown.</li> <li>Unit test: no subscriber - assert current behaviour is unchanged (exception and message both untouched).</li> <li>Kernel test covering each exception type currently handled in <code>wrapperCall()</code>: <code>AiBadRequestException</code>, <code>AiResponseErrorException</code>, <code>AiMissingFeatureException</code>, <code>AiQuotaException</code>, <code>AiRateLimitException</code>, <code>AiUnsafePromptException</code>, <code>AiRequestErrorException</code>, plus the generic <code>\Exception</code> catch.</li> <li>Update <code>docs/developers/events.md</code> with an example of a failover subscriber.</li> <li>Close <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span> as a duplicate superseded by this issue, and link this issue from MR !846.</li> </ul> <p><strong>API changes:</strong></p> <ul> <li>New public class 
<code>Drupal\ai\Event\AiExceptionEvent</code> with the properties and methods above. Dispatched by class from <code>ProviderProxy::wrapperCall()</code>.</li> <li>Internal refactor: logging messages previously emitted directly from <code>ProviderProxy</code> move into an event subscriber. Text and log channel remain identical, so log parsers are unaffected.</li> <li>No UI change.</li> </ul> <p>This issue supersedes / combines <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span> with the extension needed for third-party modules to implement failover.</p> <h3 id="summary-ai-usage">AI usage (if applicable)</h3> <p>[x] AI Assisted Issue<br> This issue was generated with AI assistance, but was reviewed and refined by the creator.</p> <p>[ ] AI Assisted Code</p> <p>[ ] AI Generated Code</p> <p>[ ] Vibe Coded</p> <p>- <strong>This issue was created with the help of AI</strong></p> </code></div></pre></code></div></pre></code></div></pre></code></div></pre></code></div></pre>
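<p>The precedence rule in the proposed resolution (a forced output wins over a rewritten message; with no forced output, the original exception class is rethrown with the new text) can be sketched without Drupal. This is a minimal illustration under stated assumptions, not the module's code: <code>AiQuotaException</code> and <code>AiExceptionEvent</code> below are reduced stand-ins, <code>wrapper_call()</code> is a hypothetical helper standing in for the catch block, subscribers are plain callables instead of Symfony subscribers, and the forced output is a string rather than an <code>OutputInterface</code>. It needs PHP 8.1 or later.</p>

```php
<?php

declare(strict_types=1);

// Stand-in for Drupal\ai\Exception\AiQuotaException (assumption: it accepts
// the standard \Exception constructor arguments).
class AiQuotaException extends \Exception {}

// Reduced stand-in for the proposed event; the forced output is a plain
// string here instead of an OutputInterface.
final class AiExceptionEvent {

  private string $message;

  private ?string $forcedOutput = NULL;

  public function __construct(public readonly \Exception $exception) {
    $this->message = $exception->getMessage();
  }

  public function setMessage(string $message): void {
    $this->message = $message;
  }

  public function setForcedOutput(string $output): void {
    $this->forcedOutput = $output;
  }

  public function getForcedOutput(): ?string {
    return $this->forcedOutput;
  }

  // 1.x behaviour: same class as the original, message swapped if rewritten.
  public function getException(): \Exception {
    if ($this->message === $this->exception->getMessage()) {
      return $this->exception;
    }
    $class = get_class($this->exception);
    return new $class($this->message, $this->exception->getCode(), $this->exception);
  }

}

// Hypothetical helper mimicking the catch block: call the provider, let the
// subscribers inspect the failure, then return the forced output or rethrow.
function wrapper_call(callable $provider, array $subscribers): string {
  try {
    return $provider();
  }
  catch (\Exception $e) {
    $event = new AiExceptionEvent($e);
    foreach ($subscribers as $subscriber) {
      $subscriber($event);
    }
    $forced = $event->getForcedOutput();
    if ($forced !== NULL) {
      return $forced;
    }
    throw $event->getException();
  }
}

// A provider call that always fails, plus two illustrative subscribers.
$failing = fn(): string => throw new AiQuotaException('Budget has been exceeded! Current cost: 1.133637');
$rewrite = fn(AiExceptionEvent $event) => $event->setMessage('The AI service is temporarily unavailable.');
$failover = fn(AiExceptionEvent $event) => $event->setForcedOutput('Answer from the backup provider');

// Both subscribers registered: the forced output wins, nothing is thrown.
echo wrapper_call($failing, [$rewrite, $failover]), PHP_EOL;

// Only the rewrite subscriber: same exception class, rewritten message.
try {
  wrapper_call($failing, [$rewrite]);
}
catch (AiQuotaException $e) {
  echo get_class($e), ': ', $e->getMessage(), PHP_EOL;
}
```

<p>In this sketch the first call prints the backup answer and the second prints the rewritten message under the original <code>AiQuotaException</code> class, with the raw exception still reachable through <code>getPrevious()</code>. The real proxy would dispatch through the event dispatcher and pass the request context from Part C, which this sketch leaves out.</p>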