Dispatch AiExceptionEvent and allow subscribers to replace the response (graceful failover)
>>> [!note] Migrated issue
<!-- Drupal.org comment -->
<!-- Migrated from issue #3585233. -->
Reported by: [marcus_johansson](https://www.drupal.org/user/385947)
Related to !1514
>>>
<p>[Tracker]<br>
<strong>Update Summary: </strong>[One-line status update for stakeholders]<br>
<strong>Short Description: </strong>Dispatch a new AiExceptionEvent from ProviderProxy::wrapperCall() so subscribers can rewrite the exception message or substitute an OutputInterface, enabling third-party failover without patching core.<br>
<strong>Check-in Date: </strong>MM/DD/YYYY<br>
[/Tracker]</p>
<h3 id="summary-problem-motivation">Problem/Motivation</h3>
<p>Projects now ship features built on the AI module, and third parties want to customise how failures surface. Two concrete needs exist today:</p>
<ul>
<li><strong>Customise the error message.</strong> When OpenAI reports a budget overrun, the raw exception surfaces as <code>Drupal\ai\Exception\AiRequestErrorException: Error invoking model response: Budget has been exceeded! Current cost: 1.133637, Max budget: 1.0</code> thrown from <code>ProviderProxy::wrapperCall()</code>. The floating-point precision and the raw "Budget has been exceeded" text leak implementation detail to the end user. Downstream modules currently have no extension point to rewrite this before it propagates.</li>
<li><strong>React to a failure without taking the caller down with it.</strong> When the primary provider exceeds quota or hits a rate limit, site owners want to transparently fall back to a secondary provider (or a cached response, or a canned apology) without the caller having to catch and retry. Right now <code>ProviderProxy::wrapperCall()</code> always rethrows and there is no way for a subscriber to substitute a valid <code>OutputInterface</code>.</li>
</ul>
<p>Both <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span> (dispatch an exception event) and a graceful-failover mechanism are being built or discussed in parallel. This issue does both in one patch because they share the dispatch site and the same event object.</p>
<h3 id="summary-proposed-resolution">Proposed resolution</h3>
<p>The resolution has three coordinated parts that land together because they touch the same three lines of <code>ProviderProxy::wrapperCall()</code> and the same event class.</p>
<p><strong>Part A — Dispatch <code>AiExceptionEvent</code> (from <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span>).</strong> Add a new event class that carries the original exception and an overridable message, preserving the exception class when a subscriber rewrites the text so existing <code>catch (AiQuotaException $e)</code> blocks keep working:</p>
<pre><div class="codeblock"><code class="language-php">namespace Drupal\ai\Event;<br><br>use Symfony\Contracts\EventDispatcher\Event;<br><br>final class AiExceptionEvent extends Event {<br><br> private string $message;<br><br> public function __construct(<br> public readonly \Exception $exception,<br> ) {<br> $this->message = $exception->getMessage();<br> }<br><br> public function setMessage(string $message): void {<br> $this->message = $message;<br> }<br><br> /**<br> * Returns the exception that should be thrown by the proxy.<br> *<br> * The original exception type is preserved so callers keep their<br> * specific catch blocks (AiQuotaException, AiRateLimitException, ...).<br> * Only the message is swapped if a subscriber rewrote it.<br> */<br> public function getException(): \Exception {<br> if ($this->message === $this->exception->getMessage()) {<br> return $this->exception;<br> }<br> // Assumes the exception class keeps the standard<br> // (message, code, previous) constructor signature.<br> $class = get_class($this->exception);<br> return new $class($this->message, $this->exception->getCode(), $this->exception);<br> }<br>}<p>Dispatch it in <code>ProviderProxy::wrapperCall()</code>, inside the existing <code>try/catch</code> block, for every caught exception type. Dispatch by class (no event-name string), consistent with the MR !846 review discussion:</p>
<pre><div class="codeblock"><code class="language-php">// Catch \Exception, not \Throwable: the event constructor is typed<br>// \Exception, so passing an \Error would raise a TypeError.<br>catch (\Exception $e) {<br> $event = new AiExceptionEvent($e);<br> $this->eventDispatcher->dispatch($event);<br> // Handled-response path - see Part B.<br> if ($forced = $event->getForcedOutputObject()) {<br> return $forced;<br> }<br> throw $event->getException();<br>}<p>Move the existing per-type <code>$this->loggerFactory->get('ai')->error(...)</code> calls out of <code>ProviderProxy</code> and into an event subscriber on <code>AiExceptionEvent</code> so logging behaviour is extensible too.</p>
<p><strong>Part B — Allow subscribers to replace the response (failover).</strong> Add a forced-output slot on the event, mirroring <code>PreGenerateResponseEvent</code>:</p>
<pre><div class="codeblock"><code class="language-php">// On AiExceptionEvent.<br>use Drupal\ai\OperationType\OutputInterface;<br><br>protected ?OutputInterface $forcedOutputObject = NULL;<br><br>public function getForcedOutputObject(): ?OutputInterface {<br> return $this->forcedOutputObject;<br>}<br><br>public function setForcedOutputObject(OutputInterface $output): void {<br> $this->forcedOutputObject = $output;<br>}<p>If a subscriber called <code>setForcedOutputObject()</code>, the proxy returns that output instead of throwing. If no subscriber sets one, behaviour is identical to Part A alone: rethrow with the (possibly rewritten) message. This is the minimum surface that lets a third-party module implement graceful failover - for example, a module that sees an <code>AiQuotaException</code> on <code>openai</code> can call a backup provider via <code>AiProviderPluginManager</code>, build an <code>OutputInterface</code> from the response, and set it on the event; the original caller receives a successful response and never sees an exception.</p>
<p><strong>Part C — Preserve the request context on the event.</strong> For failover to be useful, subscribers need to know which provider/model/input failed. Extend the constructor:</p>
<pre><div class="codeblock"><code class="language-php">public function __construct(<br> public readonly \Exception $exception,<br> public readonly ?string $requestThreadId = NULL,<br> public readonly ?string $providerId = NULL,<br> public readonly ?string $operationType = NULL,<br> public readonly ?string $modelId = NULL,<br> public readonly mixed $input = NULL,<br> public readonly array $configuration = [],<br> public readonly array $tags = [],<br>) { ... }<p><code>ProviderProxy::wrapperCall()</code> already has all of these in scope at the point of dispatch (it constructs <code>PreGenerateResponseEvent</code> from them a few lines earlier). All new arguments are nullable / defaulted so existing instantiations, including test fixtures, keep working.</p>
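<p>The backwards-compatibility claim for the extended constructor can be checked with a small standalone sketch. The class below is a trimmed stand-in for the proposed event (context properties only, no Symfony base class), and the model ID <code>example-model</code> and tag <code>my_module</code> are placeholders, not real values from the module.</p>

```php
<?php
declare(strict_types=1);

// Trimmed stand-in for the proposed event: context properties only.
// The real class would also extend the Symfony Event base class and carry
// the message and forced-output members from Parts A and B.
final class AiExceptionEvent {

  public function __construct(
    public readonly \Exception $exception,
    public readonly ?string $requestThreadId = NULL,
    public readonly ?string $providerId = NULL,
    public readonly ?string $operationType = NULL,
    public readonly ?string $modelId = NULL,
    public readonly mixed $input = NULL,
    public readonly array $configuration = [],
    public readonly array $tags = [],
  ) {}

}

// Part A style call with only the exception: still valid, context is absent.
$legacy = new AiExceptionEvent(new \RuntimeException('boom'));

// Call with context. Named arguments (PHP 8.0+) let the caller skip
// whatever it does not have in scope.
$withContext = new AiExceptionEvent(
  exception: new \RuntimeException('Budget has been exceeded!'),
  providerId: 'openai',
  operationType: 'chat',
  modelId: 'example-model',
  tags: ['my_module'],
);
```

<p>Subscribers should treat every context property as possibly <code>NULL</code> or empty, since older call sites may construct the event without them.</p>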
<p><strong>Example: a failover subscriber in a third-party module.</strong> With Parts A-C in place, a module can implement provider failover without patching AI core:</p>
<pre><div class="codeblock"><code class="language-php">namespace Drupal\my_failover\EventSubscriber;<br><br>use Drupal\ai\AiProviderPluginManager;<br>use Drupal\ai\Event\AiExceptionEvent;<br>use Drupal\ai\Exception\AiQuotaException;<br>use Drupal\ai\Exception\AiRateLimitException;<br>use Drupal\ai\OperationType\Chat\ChatInput;<br>use Symfony\Component\EventDispatcher\EventSubscriberInterface;<br><br>final class FailoverSubscriber implements EventSubscriberInterface {<br><br> public function __construct(<br> private readonly AiProviderPluginManager $aiProvider,<br> ) {}<br><br> public static function getSubscribedEvents(): array {<br> return [AiExceptionEvent::class => 'onException'];<br> }<br><br> public function onException(AiExceptionEvent $event): void {<br> // Only failover on quota/rate-limit, and only for chat.<br> if (!($event->exception instanceof AiQuotaException<br> || $event->exception instanceof AiRateLimitException)) {<br> return;<br> }<br> if ($event->operationType !== 'chat' || !$event->input instanceof ChatInput) {<br> return;<br> }<br> // Avoid looping if the backup itself threw.<br> if ($event->providerId === 'anthropic') {<br> return;<br> }<br> $backup = $this->aiProvider->createInstance('anthropic');<br> $output = $backup->chat($event->input, 'claude-3-5-sonnet-latest', $event->tags);<br> $event->setForcedOutputObject($output);<br> }<br>}<p>The caller of the original <code>chat()</code> call receives the backup provider's response and is unaware anything failed.</p>
<p><strong>Backwards compatibility.</strong> Part A only changes the thrown exception when a subscriber rewrites the message; even then, the class is identical, so existing <code>catch (AiQuotaException $e)</code> keeps working. With no subscriber, behaviour is byte-for-byte identical to current code. Part B adds one optional behaviour (forced output) which is a no-op unless opted into. Part C adds constructor arguments that are all optional, so <code>new AiExceptionEvent($e)</code> remains a valid call. No existing callers of <code>ProviderProxy</code> need to change.</p>
<p><strong>Why one issue instead of two.</strong> Parts A and B touch the same three lines of <code>ProviderProxy::wrapperCall()</code> and the same event class. Splitting them means landing an event that has a known missing capability (no way to recover gracefully), then landing a follow-up that rewrites what just landed. Landing together also means the tests cover the interaction (subscriber rewrites the message and sets a forced output - the output wins, the message is irrelevant), which isn't exercised if the two land separately.</p>
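<p>The interaction described above (message rewritten and forced output set) can be sketched outside Drupal. Everything here is a stand-in written for illustration: <code>CannedOutput</code>, the stub <code>OutputInterface</code> and <code>AiQuotaException</code>, and a plain <code>wrapperCall()</code> function that loops over callables instead of using the real event dispatcher. It assumes the 1.x rethrow semantics.</p>

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins so the sketch runs outside Drupal.
interface OutputInterface {}

final class CannedOutput implements OutputInterface {
  public function __construct(public readonly string $text) {}
}

class AiQuotaException extends \Exception {}

// Trimmed copy of the proposed event (1.x rethrow semantics).
final class AiExceptionEvent {

  private string $message;
  private ?OutputInterface $forcedOutputObject = NULL;

  public function __construct(public readonly \Exception $exception) {
    $this->message = $exception->getMessage();
  }

  public function setMessage(string $message): void {
    $this->message = $message;
  }

  public function getForcedOutputObject(): ?OutputInterface {
    return $this->forcedOutputObject;
  }

  public function setForcedOutputObject(OutputInterface $output): void {
    $this->forcedOutputObject = $output;
  }

  public function getException(): \Exception {
    if ($this->message === $this->exception->getMessage()) {
      return $this->exception;
    }
    $class = get_class($this->exception);
    return new $class($this->message, $this->exception->getCode(), $this->exception);
  }

}

// Mimics the proposed catch block; $subscribers replaces the dispatcher.
function wrapperCall(callable $providerCall, array $subscribers): OutputInterface {
  try {
    return $providerCall();
  }
  catch (\Exception $e) {
    $event = new AiExceptionEvent($e);
    foreach ($subscribers as $subscriber) {
      $subscriber($event);
    }
    if ($forced = $event->getForcedOutputObject()) {
      return $forced;
    }
    throw $event->getException();
  }
}

$failing = function (): OutputInterface {
  throw new AiQuotaException('Budget has been exceeded! Current cost: 1.133637, Max budget: 1.0');
};
$rewrite = function (AiExceptionEvent $event): void {
  $event->setMessage('The AI service is temporarily unavailable.');
};
$failover = function (AiExceptionEvent $event): void {
  $event->setForcedOutputObject(new CannedOutput('Sorry, please try again later.'));
};

// Both subscribers ran: the forced output wins and nothing is thrown.
$result = wrapperCall($failing, [$rewrite, $failover]);

// Message rewrite only: same exception class, new text, original chained.
try {
  wrapperCall($failing, [$rewrite]);
  $caught = NULL;
}
catch (AiQuotaException $e) {
  $caught = $e;
}
```

<p>In the first call the rewritten message never reaches the caller. In the second the caller still catches <code>AiQuotaException</code>, and the original exception stays reachable through <code>getPrevious()</code>.</p>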
<p><strong>Two MRs: 1.x preserves the original exception class, 2.0.x wraps in <code>AiResponseErrorException</code>.</strong> The rethrow behaviour is deliberately different on the two branches:</p>
<ul>
<li><strong>1.x MR</strong> - keep the exact current behaviour: the thrown exception is the same class as the original (<code>AiQuotaException</code>, <code>AiRateLimitException</code>, etc.), with only the message swapped when a subscriber rewrites it. This preserves byte-for-byte compatibility for every site currently using <code>catch (AiQuotaException $e)</code> and friends in 1.x.</li>
<li><strong>2.0.x MR</strong> - normalise the rethrow into a single <code>AiResponseErrorException</code> that wraps the original exception as its <code>$previous</code>. Callers that care about the specific cause use <code>$e->getPrevious()</code> to reach the original <code>AiQuotaException</code> / <code>AiRateLimitException</code> / etc. This lets callers write one <code>catch (AiResponseErrorException $e)</code> for all provider-side failures while still being able to branch on the underlying class when needed. It also means the rewritten-message path no longer has to re-instantiate the original exception class dynamically (<code>new $class(...)</code>): the wrapper is always the same type, and the original is available untouched via <code>getPrevious()</code>. This is a breaking change, which is why it ships in 2.0.x only.</li>
</ul>
<p>Both MRs share the event class, forced-output slot, and context properties. The only divergence is in <code>ProviderProxy::wrapperCall()</code>'s rethrow statement and in <code>AiExceptionEvent::getException()</code>. Tests split accordingly: the 1.x MR asserts the caller sees the original class; the 2.0.x MR asserts the caller sees <code>AiResponseErrorException</code> with <code>getPrevious()</code> returning the original.</p>
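<p>A minimal sketch of the 2.0.x rethrow, using stand-in exception classes rather than the real <code>Drupal\ai\Exception</code> ones. The helper names <code>wrapForRethrow()</code> and <code>describeFailure()</code> are hypothetical and exist only to show the wrapping and the caller-side <code>getPrevious()</code> check.</p>

```php
<?php
declare(strict_types=1);

// Stand-ins for the real Drupal\ai\Exception classes (hypothetical stubs).
class AiQuotaException extends \Exception {}
class AiResponseErrorException extends \Exception {}

/**
 * Sketch of the 2.0.x rethrow: the thrown type is always the wrapper,
 * and the original exception is kept untouched as $previous.
 */
function wrapForRethrow(\Exception $original, ?string $rewrittenMessage = NULL): AiResponseErrorException {
  return new AiResponseErrorException(
    $rewrittenMessage ?? $original->getMessage(),
    $original->getCode(),
    $original,
  );
}

/**
 * Caller side: one catch for all provider failures, branching on the
 * underlying cause only where it matters.
 */
function describeFailure(callable $call): string {
  try {
    $call();
    return 'ok';
  }
  catch (AiResponseErrorException $e) {
    return $e->getPrevious() instanceof AiQuotaException
      ? 'quota: ' . $e->getMessage()
      : 'other: ' . $e->getMessage();
  }
}

$summary = describeFailure(function (): void {
  throw wrapForRethrow(
    new AiQuotaException('Budget has been exceeded!'),
    'The AI service is over budget.',
  );
});
```

<p>A 1.x caller that catches <code>AiQuotaException</code> directly would no longer match here, which is the breaking change the 2.0.x MR accepts.</p>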
<p><strong>Remaining tasks:</strong></p>
<ul>
<li>Implement <code>AiExceptionEvent</code> with message, forced-output, and context properties (Parts A + B + C).</li>
<li>Update <code>ProviderProxy::wrapperCall()</code> to dispatch, honour forced output, and rethrow with the event's exception.</li>
<li>Move per-type logging from <code>ProviderProxy</code> into <code>Drupal\ai\EventSubscriber\AiExceptionLoggingSubscriber</code> and register it in <code>ai.services.yml</code>.</li>
<li>Unit test: subscriber that rewrites the message - assert the caller sees the rewritten text and the original exception class.</li>
<li>Unit test: subscriber that sets a forced output - assert the caller receives the output and no exception is thrown.</li>
<li>Unit test: no subscriber - assert current behaviour is unchanged (exception and message both untouched).</li>
<li>Kernel test covering each exception type currently handled in <code>wrapperCall()</code>: <code>AiBadRequestException</code>, <code>AiResponseErrorException</code>, <code>AiMissingFeatureException</code>, <code>AiQuotaException</code>, <code>AiRateLimitException</code>, <code>AiUnsafePromptException</code>, <code>AiRequestErrorException</code>, plus the generic <code>\Exception</code> catch.</li>
<li>Update <code>docs/developers/events.md</code> with an example of a failover subscriber.</li>
<li>Close <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span> as a duplicate (superseded by this issue), and link this issue from MR !846.</li>
</ul>
<p><strong>API changes:</strong></p>
<ul>
<li>New public class <code>Drupal\ai\Event\AiExceptionEvent</code> with the properties and methods above. Dispatched by class from <code>ProviderProxy::wrapperCall()</code>.</li>
<li>Internal refactor: logging messages previously emitted directly from <code>ProviderProxy</code> move into an event subscriber. Text and log channel remain identical, so log parsers are unaffected.</li>
<li>No UI change.</li>
</ul>
<p>This issue supersedes / combines <span class="drupalorg-gitlab-issue-link project-issue-status-info project-issue-status-13"><a href="https://www.drupal.org/project/ai/issues/3542496" title="Status: Needs work">#3542496: Dispatch AiExceptionEvent when a provider throws an exception</a></span> with the extension needed for third-party modules to implement failover.</p>
<h3 id="summary-ai-usage">AI usage (if applicable)</h3>
<p>[x] AI Assisted Issue<br>
This issue was generated with AI assistance, but was reviewed and refined by the creator.</p>
<p>[ ] AI Assisted Code</p>
<p>[ ] AI Generated Code</p>
<p>[ ] Vibe Coded</p>
<p>- <strong>This issue was created with the help of AI</strong></p>
</code></div></pre></code></div></pre></code></div></pre></code></div></pre></code></div></pre>