ai_observability: streaming OTel spans drop final token usage (span ended on un-consumed iterator)
## Problem
For streaming chat, `ProviderProxy` dispatches `PostGenerateResponseEvent` with the
response whose normalized value is the **un-consumed** `StreamedChatMessageIterator`
(`ProviderProxy.php:352`, guard `:357`). `AiOtelSpansEventSubscriber` routes both
`PostGenerateResponseEvent` and `PostStreamingResponseEvent` to the same handler
(`:63-67`). On the first (`PostGenerateResponseEvent`) it ends the span (`:118-120`) —
before the stream is consumed, so `getTokenUsage()` is empty — and never unsets it
(`:34`, stored at `:93`). When the stream finishes,
`StreamedChatMessageIterator::getIterator()` calls `reconstructChatOutput()` (`:422`)
and `triggerEvent()` (`:423`), the latter dispatching `PostStreamingResponseEvent`
(`:469`); the handler runs a second time and finds the still-stored span (`:113-118`),
but it is already ended, so the re-filled attributes are dropped (per the OTel spec,
mutations after `end()` are no-ops). **Net: streaming spans record no final token usage.**
`AiOtelMetricsEventSubscriber` subscribes to both events too (`:43-47`) but is **not**
affected the same way — it has no stored-span state: it emits a counter per non-NULL
token value read from the event output (`:72-87`), skipping the empty first event and
recording usage on the terminal event. Its only streaming defect is a downstream symptom
of the metadata gap below: the `operation_type` dimension (`:83`) is empty on streaming.
Aggravating factors:
- The terminal `PostStreamingResponseEvent` carries an empty `operationType` **and**
empty `configuration`. `triggerEvent()` reads `$this->operationType` (`:459`) and
`$this->configuration` (`:460`), but neither property is ever populated on the
iterator: `attachStreamMetadata()` (`ProviderProxy.php:96-101`) propagates
provider / model / input / tags / requestThreadId and calls `setProviderConfiguration()`
— which writes a *different* property (`providerConfiguration`, `:325`) the event never
reads — and sets no operation type at all.
- `reconstructChatOutput()` (`:536-560`) does not copy `$this->finishReason`
(iterator-only, getter `:680`) onto the final `ChatMessage`, so the finish reason is
not exposed as a first-class field on the terminal event (it may still survive inside
the merged raw output, provider-dependent — e.g. OpenAI passes chunk `toArray()` through
as raw).
- `ReplayedChatMessageIterator` (`:18`) and `ai_assistant_api`'s
`AssistantStreamIterator` (`:21`) override `getIterator()` without calling
`triggerEvent()`, so they don't dispatch the terminal event themselves. Impact varies:
`ReplayedChatMessageIterator` wraps an inner iterator that may still fire (e.g. via
`PromptJsonDecoder`); `AssistantStreamIterator` has no instantiation in the 1.4.0
checkout. A span-leak guard is warranted regardless.
## Proposed approach
- In the spans subscriber: for streaming outputs (normalized is a
`StreamedChatMessageIteratorInterface`), **finalize on `PostStreamingResponseEvent`,
not the initial event**; unset the stored span after the terminal event.
- Guard against span leaks for stream subclasses that never dispatch the streaming
event.
- Propagate `operationType` **and** `configuration` to the terminal event — either set
`$this->operationType` / `$this->configuration` on the iterator (today
`attachStreamMetadata()` only sets `providerConfiguration`, which `triggerEvent()`
ignores) or have `triggerEvent()` read the populated properties. Provider / model /
input / tags already flow through.
- Preserve the finish reason into the reconstructed output/metadata so consumers can
read it.
## Definition of done
- Streaming spans carry final `input`/`output` token usage.
- `$otelSpans` entries unset after the terminal event (no unbounded growth).
- Streaming `PostStreamingResponseEvent` carries non-empty `operationType` and
`configuration` (so span attrs and the metrics `operation_type` dimension are populated
on the streaming path).
- No span leak for non-dispatching subclasses.
- Finish reason recoverable from the final streaming output.
- Unit test simulating the real streaming order (pre → post-generate w/ iterator →
post-streaming final). Fix `AiOtelMetricsEventSubscriberTest.php:78-79`, which builds a
`PostStreamingResponseEvent` stub but dispatches it under
`PostGenerateResponseEvent::EVENT_NAME` — so the test never asserts the
`PostStreamingResponseEvent` subscription is wired (both events share one handler, so
the mislabel passes silently).
## Reproduction (verified against 1.4.0 + real OTel SDK)
Driving the real subscribers with the actual streaming event order — pre →
`PostGenerateResponseEvent` (un-consumed iterator, empty `TokenUsageDto`) →
`PostStreamingResponseEvent` (reconstructed output, `input:11`/`output:22`), all sharing
one `requestThreadId` — against a real `TracerProvider` + `InMemoryExporter`:
- **Span (buggy order):** exported `token_usage` = `[]`. **Fixed order** (finalize on the
terminal event only) = `{"input":11,"output":22}` → confirms the loss is the early
`end()`, not attribute rejection.
- **Metrics:** counter records `input=11` and `output=22` on the terminal event — **not**
affected (only the `operation_type` dimension is empty).
- **Terminal event metadata:** real `getIterator()`→`triggerEvent()` path emits
`operationType=''` and `configuration=[]` while `providerId`/`modelId` are populated.
## Related
- Umbrella #3586445 · Layer 2 envelope tracker #3586453
- **Prerequisite for** #3586479 (the `gen_ai.*` alignment; streaming token usage + finish_reasons).
---
AI-Generated: Yes (Claude Code; verified against `drupal/ai` 1.4.0 — `ProviderProxy`, `StreamedChatMessageIterator`, `AiOtelSpansEventSubscriber`, `AiOtelMetricsEventSubscriber`; bug reproduced with a real OpenTelemetry SDK in-memory exporter.)
issue