Issue #3558801: Route the OpenAI chat operation through the Responses API instead of Chat Completions.

MR Summary — Convert the OpenAI chat operation to the Responses API (Issue #3558801)

Why this was done

OpenAI's Responses API is the successor to Chat Completions, and OpenAI intends to eventually deprecate Chat Completions. The issue originally proposed exposing Responses as a separate responses operation type (MR !56), but the maintainers (comments #6 and #9) rejected that direction:

- No other provider distinguishes chat from responses: the operation type doesn't change, only OpenAI's endpoint does. Adding a parallel operation type would fragment the abstraction.
- The change should be essentially backwards compatible: every existing chat() consumer (chatbot streaming, assistants, automators, CKEditor, etc.) must keep working unchanged.
- It should open the door to future Responses-only features (internal tools like web/file search and code interpreter, internal memory via previous_response_id/Conversations, vector-store integration, native structured output).

So this MR keeps the single chat operation and its ChatInput/ChatOutput contract intact. It only switches the client call from $client->chat() (Chat Completions) to $client->responses() (Responses) and re-maps every request/response detail to the new shape.
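To make the re-mapping concrete, here is a minimal sketch of how a Chat Completions style message history could be turned into a Responses input array. The function name sketch_build_responses_input() and the plain-array message format are assumptions made for illustration; the module works on ChatInput/ChatMessage objects and also handles multimodal content and the o1/o3 system-role special case, which this sketch omits.

```php
<?php

/**
 * Hypothetical helper, not the module's code: converts Chat Completions
 * style message arrays into Responses API input items.
 */
function sketch_build_responses_input(array $messages): array {
  $input = [];
  foreach ($messages as $message) {
    // Tool results become function_call_output items tied to a call_id.
    if ($message['role'] === 'tool') {
      $input[] = [
        'type' => 'function_call_output',
        'call_id' => $message['tool_call_id'],
        'output' => (string) $message['content'],
      ];
      continue;
    }
    // Text keeps its role and is passed as plain string content.
    if (($message['content'] ?? '') !== '') {
      $input[] = ['role' => $message['role'], 'content' => $message['content']];
    }
    // Prior assistant tool calls become standalone function_call items.
    foreach ($message['tool_calls'] ?? [] as $call) {
      $input[] = [
        'type' => 'function_call',
        'call_id' => $call['id'],
        'name' => $call['function']['name'],
        'arguments' => $call['function']['arguments'],
      ];
    }
  }
  return $input;
}

// Example history: a user question, an assistant tool call, and its result.
$payload = [
  'model' => 'gpt-4o-mini',
  'input' => sketch_build_responses_input([
    ['role' => 'user', 'content' => 'Weather in Oslo?'],
    [
      'role' => 'assistant',
      'content' => NULL,
      'tool_calls' => [[
        'id' => 'call_1',
        'type' => 'function',
        'function' => ['name' => 'get_weather', 'arguments' => '{"city":"Oslo"}'],
      ]],
    ],
    ['role' => 'tool', 'tool_call_id' => 'call_1', 'content' => '{"temp_c":4}'],
  ]),
];
print json_encode($payload, JSON_PRETTY_PRINT) . PHP_EOL;
```

The key difference from Chat Completions is that tool calls and tool results are no longer nested inside role-based messages; each becomes its own typed item in the same flat input list.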

What changed and why

src/Plugin/AiProvider/OpenAiProvider.php — chat() rewritten to target the Responses endpoint. The public signature and behavior are unchanged; only the wire format differs. The logic is split into focused helpers:

- buildResponsesInput(): maps ChatInput messages to the Responses input array. Plain text becomes string content, the system role becomes an input message (with the existing o1/o3 special-casing), prior assistant tool calls become function_call items, and tool results become function_call_output items.
- buildResponsesMessageItem(): maps multimodal messages to typed content parts (input_text / input_image / input_file), replacing the Chat Completions image_url/file shapes.
- prepareResponsesConfiguration(): translates config keys to their Responses equivalents. max_tokens/max_completion_tokens become max_output_tokens, the unsupported frequency_penalty/presence_penalty are dropped, reasoning_effort moves to reasoning.effort, and temperature/top_p are dropped for reasoning models (which reject them).
- renderResponsesTools() + sanitizeResponsesToolSchema(): flatten tool definitions from the nested Chat Completions shape ({type, function:{…}}) to the flat Responses shape ({type, name, description, parameters, strict}), and strip non-standard keys the core tool renderer leaks into each property (a redundant name and a boolean required). Chat Completions tolerated those keys; the Responses API validates strictly and rejects them. Without the sanitizer, real tool calls fail with HTTP 400 and "True is not of type 'array'".
- extractResponsesChatMessage(): parses the Responses output[] array into a ChatMessage (assistant text) plus ToolsFunctionOutput objects built from function_call items.
- setResponsesTokenUsage(): maps the Responses usage shape (input_tokens, output_tokens, output_tokens_details.reasoning_tokens, input_tokens_details.cached_tokens) into TokenUsageDto. The shared core helper reads Chat Completions keys, so a provider-local mapper is used rather than touching the base class.
- Structured output is now sent under text.format instead of response_format.
- Extension seams (documented @todos) mark where internal tools, previous_response_id/Conversations memory, and vector stores plug in later.
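The tool-schema flattening and sanitizing described above is the least obvious step, so here is a rough sketch of the idea. The function names sketch_sanitize_schema() and sketch_flatten_tool() are hypothetical and the real helpers may differ in detail; the sketch only assumes what is stated above, namely that each property may carry a stray name key and a boolean required flag.

```php
<?php

/**
 * Hypothetical sketch: removes the stray "name" key and boolean "required"
 * flag from every property definition, recursing into nested schemas.
 */
function sketch_sanitize_schema(array $node): array {
  foreach ($node['properties'] ?? [] as $key => $property) {
    if (!is_array($property)) {
      continue;
    }
    unset($property['name']);
    // JSON Schema expects "required" to be an array of property names on the
    // parent object, so a boolean flag on the property itself is dropped.
    if (is_bool($property['required'] ?? NULL)) {
      unset($property['required']);
    }
    $node['properties'][$key] = sketch_sanitize_schema($property);
  }
  if (is_array($node['items'] ?? NULL)) {
    $node['items'] = sketch_sanitize_schema($node['items']);
  }
  return $node;
}

/**
 * Hypothetical sketch: flattens {type, function: {...}} into the flat
 * Responses tool shape {type, name, description, parameters, strict}.
 */
function sketch_flatten_tool(array $tool): array {
  $function = $tool['function'];
  return [
    'type' => 'function',
    'name' => $function['name'],
    'description' => $function['description'] ?? '',
    'parameters' => sketch_sanitize_schema($function['parameters']),
    'strict' => $function['strict'] ?? FALSE,
  ];
}

// A nested tool definition with the leaked per-property keys.
$nested_tool = [
  'type' => 'function',
  'function' => [
    'name' => 'get_weather',
    'description' => 'Look up the current weather for a city.',
    'parameters' => [
      'type' => 'object',
      'properties' => [
        'city' => ['type' => 'string', 'name' => 'city', 'required' => TRUE],
      ],
      'required' => ['city'],
    ],
  ],
];
$flat_tool = sketch_flatten_tool($nested_tool);
print json_encode($flat_tool, JSON_PRETTY_PRINT) . PHP_EOL;
```

Note that the object-level required array is left alone; only the boolean flag inside an individual property is removed, since that is the value the validator reports as not being an array.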

definitions/api_defaults.yml — the chat configuration is updated to Responses-compatible params: max_tokens → max_output_tokens, and frequency_penalty/presence_penalty removed (unsupported by Responses). getModelSettings() is updated accordingly. Existing stored configs still work because prepareResponsesConfiguration() translates the old keys at request time.
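As an illustration of why stored configs keep working, the request-time translation could look roughly like the sketch below. sketch_translate_config() is a hypothetical stand-in for prepareResponsesConfiguration(), the $is_reasoning_model flag stands in for whatever model detection the provider performs, and letting an explicit max_output_tokens win over a legacy key is an assumption of this sketch.

```php
<?php

/**
 * Hypothetical sketch of translating legacy chat config keys into
 * Responses-compatible parameters at request time.
 */
function sketch_translate_config(array $config, bool $is_reasoning_model): array {
  // Both legacy token-limit keys map to max_output_tokens. An explicitly
  // set max_output_tokens is kept as-is (assumption of this sketch).
  foreach (['max_tokens', 'max_completion_tokens'] as $legacy_key) {
    if (isset($config[$legacy_key])) {
      $config['max_output_tokens'] ??= $config[$legacy_key];
    }
    unset($config[$legacy_key]);
  }
  // The penalty parameters are not forwarded to the Responses endpoint.
  unset($config['frequency_penalty'], $config['presence_penalty']);
  // reasoning_effort moves under the nested reasoning.effort key.
  if (isset($config['reasoning_effort'])) {
    $config['reasoning']['effort'] = $config['reasoning_effort'];
  }
  unset($config['reasoning_effort']);
  // Reasoning models reject sampling parameters, so drop them.
  if ($is_reasoning_model) {
    unset($config['temperature'], $config['top_p']);
  }
  return $config;
}

// A config stored before the rename, sent to a reasoning model.
$stored = [
  'max_tokens' => 512,
  'frequency_penalty' => 0.5,
  'temperature' => 0.7,
  'reasoning_effort' => 'low',
];
$translated = sketch_translate_config($stored, TRUE);
print json_encode($translated, JSON_PRETTY_PRINT) . PHP_EOL;
```

Doing the translation in code rather than in an update hook means no stored configuration has to be migrated for existing sites.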

What was added and why

- src/OpenAiResponsesStreamIterator.php (new): the old OpenAiChatMessageIterator read the Chat Completions streaming shape (choices[].delta), which no longer applies. The new iterator consumes the Responses event stream (response.output_text.delta, response.function_call_arguments.*, response.completed, refusals) and maps it onto the core streamed-chat contract, so streaming text, tool calls, finish reason, and token usage are preserved. It contains @todo seams for future internal-tool and reasoning events.
- src/OpenAiResponsesToolCall.php (new): a small value object. The core StreamedChatMessageIterator::assembleToolCalls() is private and reconstructs streamed tool calls from objects whose toArray() returns the OpenAI tool_calls delta shape. Since the Responses API streams function calls as a separate item plus argument deltas, this object adapts those fragments back into the shape the core iterator expects.
- Removed: src/OpenAiChatMessageIterator.php (obsolete Chat Completions iterator).
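The adapter idea behind the tool-call value object can be sketched as follows. Everything here is illustrative: SketchToolCallFragment, sketch_fragment_from_event() and sketch_assemble() are hypothetical names, the events are plain arrays rather than client objects, and sketch_assemble() is only a stand-in for the core's private assembleToolCalls(), whose real logic is not shown in this MR. Reusing output_index as the tool-call index is also an assumption of the sketch.

```php
<?php

/**
 * Hypothetical value object: exposes one streamed function-call fragment
 * in the Chat Completions tool_calls delta shape via toArray().
 */
final class SketchToolCallFragment {

  public function __construct(
    private int $index,
    private ?string $callId,
    private ?string $name,
    private string $arguments
  ) {}

  public function toArray(): array {
    return [
      'index' => $this->index,
      'id' => $this->callId,
      'type' => 'function',
      'function' => ['name' => $this->name, 'arguments' => $this->arguments],
    ];
  }

}

/**
 * Maps a Responses stream event (as an array) to a fragment, or NULL.
 */
function sketch_fragment_from_event(array $event): ?SketchToolCallFragment {
  $index = $event['output_index'] ?? 0;
  switch ($event['type'] ?? '') {
    // The item event announces the call: it carries call_id and name.
    case 'response.output_item.added':
      $item = $event['item'] ?? [];
      if (($item['type'] ?? '') !== 'function_call') {
        return NULL;
      }
      return new SketchToolCallFragment($index, $item['call_id'] ?? NULL, $item['name'] ?? NULL, '');

    // Argument deltas carry only a piece of the JSON arguments string.
    case 'response.function_call_arguments.delta':
      return new SketchToolCallFragment($index, NULL, NULL, $event['delta'] ?? '');

    default:
      return NULL;
  }
}

/**
 * Stand-in for the core assembly step: merges fragments by index.
 */
function sketch_assemble(array $fragments): array {
  $calls = [];
  foreach ($fragments as $fragment) {
    $delta = $fragment->toArray();
    $i = $delta['index'];
    $calls[$i]['id'] ??= $delta['id'];
    $calls[$i]['name'] ??= $delta['function']['name'];
    $calls[$i]['arguments'] = ($calls[$i]['arguments'] ?? '') . $delta['function']['arguments'];
  }
  return $calls;
}

$events = [
  ['type' => 'response.output_item.added', 'output_index' => 0, 'item' => ['type' => 'function_call', 'call_id' => 'call_1', 'name' => 'get_weather']],
  ['type' => 'response.function_call_arguments.delta', 'output_index' => 0, 'delta' => '{"city":'],
  ['type' => 'response.function_call_arguments.delta', 'output_index' => 0, 'delta' => '"Oslo"}'],
  ['type' => 'response.completed'],
];
$fragments = array_filter(array_map('sketch_fragment_from_event', $events));
$calls = sketch_assemble($fragments);
print json_encode($calls, JSON_PRETTY_PRINT) . PHP_EOL;
```

The point of the adapter is that the core iterator never needs to learn the Responses event vocabulary; it keeps receiving objects that look like Chat Completions deltas.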

Backwards compatibility & notes

- The chat/ChatInput/ChatOutput contract is unchanged; no consumer changes are required.
- Minor config-surface change: the chat config field is renamed (max_tokens → max_output_tokens) and the two penalty fields are removed from the UI. Stored values under the old keys are still honored via in-code translation.
- Core follow-up (separate issue): the schema sanitizer works around the ai module's ToolsPropertyInput renderer leaking a boolean required/name into each property. The proper fix belongs in drupal/ai; the provider keeps the defensive sanitizer regardless.

Testing

Automated (added with this MR — module previously had no tests)

tests/src/Unit/OpenAiResponsesToolCallTest.php, tests/src/Kernel/OpenAiResponsesChatTest.php, tests/src/Kernel/OpenAiResponsesStreamIteratorTest.php — 12 tests / 75 assertions, no network (the chat test uses a mock HTTP transport with a real OpenAI\Client).

```
# From the Drupal root, with SIMPLETEST_DB set:
vendor/bin/phpunit -c web/core/phpunit.xml.dist web/modules/contrib/ai_provider_openai/tests/

# Standards:
vendor/bin/phpcs --standard=Drupal,DrupalPractice web/modules/contrib/ai_provider_openai/
vendor/bin/phpstan analyse --configuration=web/modules/contrib/ai/phpstan.neon web/modules/contrib/ai_provider_openai/
```

Coverage: request hits /responses; message→input mapping; multimodal parts; flat + sanitized tool schemas; text.format structured output; config translation; reasoning-model handling; tool-history (function_call/function_call_output); non-streamed parsing + token usage; and streamed text/tool/usage reconstruction.

Manual UI verification

Rebuild caches first (drush cr). Since the code has no Chat Completions path left, any working chat proves the Responses endpoint is in use.

- AI API Explorer → Chat Generator (/admin/config/ai/explorers/chat_generator):
  - Basic chat (gpt-4o-mini): normal reply + token usage.
  - Reasoning model (gpt-5.x/o*): correct reply with no "Unsupported parameter: reasoning_effort" / temperature error.
  - Streaming (Streamed on): text paints progressively. Use a multi-sentence prompt, because short single-line output is coalesced by the core URL-safety buffer, which is unrelated to this MR.
  - Structured output (Advanced → JSON Schema): returns valid JSON. (Requires the ai module's JSON-schema editor JS to be built: npm install && npm run build in web/modules/contrib/ai/ui/json-schema-editor.)
  - Function calling (Execute Function Call on): tool call executes with no "Invalid schema for function" error.
  - Vision: upload an image and confirm it's described.
- Provider settings surface: the chat config fields render inside the AI Assistant edit form (/admin/config/ai/ai-assistant, AI Provider section). Confirm "Max Output Tokens" is present and Frequency/Presence Penalty are gone. (The provider settings page itself only holds the API key/moderation.)
- Chatbot streaming + multi-turn: requires a legacy (non-agent) assistant, because the DeepChat block hides the Stream toggle for agent-based assistants. Point a DeepChat block at a plain chat assistant, enable Stream, and confirm progressive streaming and retained context across turns.

Pass criteria: all scenarios succeed. The two MR-specific signals are that reasoning models don't raise "Unsupported parameter" and tool calls don't raise "Invalid schema for function".

Closes #3558801
