Issue #3558801: Route the OpenAI chat operation through the Responses API instead of Chat Completions.

MR Summary — Convert the OpenAI chat operation to the Responses API (Issue #3558801)

Why this was done

OpenAI's Responses API is the successor to Chat Completions, and OpenAI intends to eventually deprecate Chat Completions. The issue originally proposed exposing Responses as a separate responses operation type (MR !56), but the maintainers (comments #6 and #9) rejected that direction:

  • No other provider distinguishes chat vs. responses — the operation type doesn't change, only OpenAI's endpoint does. Adding a parallel operation type would fragment the abstraction.
  • The change should be essentially backwards compatible: every existing chat() consumer (chatbot streaming, assistants, automators, CKEditor, etc.) must keep working unchanged.
  • It should open the door to future Responses-only features (internal tools like web/file search and code interpreter, internal memory via previous_response_id/Conversations, vector-store integration, native structured output).

So this MR keeps the single chat operation and its ChatInput/ChatOutput contract intact, and simply switches the endpoint it calls from client->chat() (Chat Completions) to client->responses() (Responses), re-mapping every request/response detail to the new shape.

What changed and why

src/Plugin/AiProvider/OpenAiProvider.php: chat() is rewritten to target the Responses endpoint. The public signature and behavior are unchanged; only the wire format differs. The logic is split into focused helpers:

  • buildResponsesInput() — maps ChatInput messages to the Responses input array: plain text as string content, the system role as an input message (with the existing o1/o3 special-casing), prior assistant tool calls as function_call items, and tool results as function_call_output items.
  • buildResponsesMessageItem() — maps multimodal messages to typed content parts (input_text / input_image / input_file), replacing the Chat Completions image_url/file shapes.
  • prepareResponsesConfiguration() — translates config keys to Responses equivalents: max_tokens/max_completion_tokens → max_output_tokens, drops the unsupported frequency_penalty/presence_penalty, moves reasoning_effort → reasoning.effort, and drops temperature/top_p for reasoning models (which reject them).
  • renderResponsesTools() + sanitizeResponsesToolSchema() — flatten tool definitions from the nested Chat Completions shape ({type, function:{…}}) to the flat Responses shape ({type, name, description, parameters, strict}), and strip non-standard keys the core tool renderer leaks into each property (a redundant name and a boolean required). Chat Completions tolerated those; the Responses API validates strictly and rejects them. Without the sanitizer, real tool calls fail with HTTP 400 and the message "True is not of type 'array'".
  • extractResponsesChatMessage() — parses the Responses output[] array into a ChatMessage (assistant text) plus ToolsFunctionOutput objects from function_call items.
  • setResponsesTokenUsage() — maps the Responses usage shape (input_tokens, output_tokens, output_tokens_details.reasoning_tokens, input_tokens_details.cached_tokens) into TokenUsageDto (the shared core helper reads Chat Completions keys, so a provider-local mapper is used rather than touching the base class).
  • Structured output is now sent under text.format instead of response_format.
  • Extension seams (documented @todos) mark where internal tools, previous_response_id/Conversations memory, and vector stores plug in later.
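
As a rough illustration of the message mapping described for buildResponsesInput(), the sketch below converts a simplified chat history into Responses input items. It is not the module's code: the function name sketch_build_responses_input() and the plain-array message shape (role, text, tool_calls, tool_call_id) are assumptions made for the example, whereas the real helper works on ChatInput/ChatMessage objects and also handles the system role and the o1/o3 special-casing.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for buildResponsesInput(); not the module's code.
 *
 * Each message is a plain array with the assumed keys 'role', 'text',
 * 'tool_calls' (assistant only) and 'tool_call_id' (tool results only).
 */
function sketch_build_responses_input(array $messages): array {
  $input = [];
  foreach ($messages as $message) {
    $text = $message['text'] ?? '';
    if ($message['role'] === 'tool') {
      // A tool result becomes a function_call_output item tied to call_id.
      $input[] = [
        'type' => 'function_call_output',
        'call_id' => $message['tool_call_id'],
        'output' => $text,
      ];
      continue;
    }
    if ($text !== '') {
      // Plain text is sent as string content on a role-tagged message.
      $input[] = ['role' => $message['role'], 'content' => $text];
    }
    // Prior assistant tool calls are replayed as function_call items, with
    // the arguments serialized as a JSON object string.
    foreach ($message['tool_calls'] ?? [] as $call) {
      $input[] = [
        'type' => 'function_call',
        'call_id' => $call['id'],
        'name' => $call['name'],
        'arguments' => json_encode((object) $call['arguments']),
      ];
    }
  }
  return $input;
}
```

Passing a three-message history (user question, assistant tool call, tool result) yields three items: one message, one function_call, and one function_call_output sharing the same call_id.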

definitions/api_defaults.yml — the chat configuration is updated to Responses-compatible params: max_tokens → max_output_tokens, and frequency_penalty/presence_penalty removed (unsupported by Responses). getModelSettings() is updated accordingly. Existing stored configs still work because prepareResponsesConfiguration() translates the old keys at request time.
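
The request-time translation of stored config keys could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual prepareResponsesConfiguration(): the function name, the $is_reasoning_model flag, and the precedence rule (an explicit max_output_tokens wins over the legacy keys) are all hypothetical.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of the config translation; not the module's code.
 */
function sketch_prepare_responses_configuration(array $config, bool $is_reasoning_model): array {
  // Legacy token-limit keys map to max_output_tokens. Assumption: an
  // explicit max_output_tokens value takes precedence over legacy keys.
  foreach (['max_tokens', 'max_completion_tokens'] as $old_key) {
    if (isset($config[$old_key])) {
      $config['max_output_tokens'] ??= $config[$old_key];
    }
    unset($config[$old_key]);
  }
  // The penalty parameters are dropped as unsupported by Responses.
  unset($config['frequency_penalty'], $config['presence_penalty']);
  // reasoning_effort moves under the nested reasoning.effort key.
  if (isset($config['reasoning_effort'])) {
    $config['reasoning']['effort'] = $config['reasoning_effort'];
  }
  unset($config['reasoning_effort']);
  // Reasoning models reject sampling parameters, so they are removed.
  if ($is_reasoning_model) {
    unset($config['temperature'], $config['top_p']);
  }
  return $config;
}
```

With this shape, a stored config of max_tokens, temperature, frequency_penalty, and reasoning_effort sent to a reasoning model reduces to max_output_tokens plus reasoning.effort, which is why old stored values keep working.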

What was added and why

  • src/OpenAiResponsesStreamIterator.php (new) — the old OpenAiChatMessageIterator read the Chat Completions streaming shape (choices[].delta), which no longer applies. This new iterator consumes the Responses event stream (response.output_text.delta, response.function_call_arguments.*, response.completed, refusals) and maps it onto the core streamed-chat contract, so streaming text, tool calls, finish reason, and token usage are preserved. It contains @todo seams for the future internal-tool/reasoning events.
  • src/OpenAiResponsesToolCall.php (new) — a tiny value object. The core StreamedChatMessageIterator::assembleToolCalls() is private and reconstructs streamed tool calls from objects whose toArray() returns the OpenAI tool_calls delta shape. Since the Responses API streams function calls as a separate item plus argument deltas, this object adapts those fragments back into the shape the core iterator expects.
  • Removed src/OpenAiChatMessageIterator.php (obsolete Chat Completions iterator).
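
To make the streaming change concrete, the sketch below folds a sequence of already-decoded Responses stream events into text, tool calls, and usage. It is only an illustration: the function name and the plain-array event representation are assumptions, and the real OpenAiResponsesStreamIterator yields chunks incrementally through the core streamed-chat contract rather than returning one array.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of folding Responses stream events; not module code.
 *
 * Each event is assumed to be a decoded JSON array with a 'type' key.
 */
function sketch_fold_responses_events(iterable $events): array {
  $result = ['text' => '', 'tool_calls' => [], 'usage' => NULL];
  foreach ($events as $event) {
    switch ($event['type'] ?? '') {
      case 'response.output_text.delta':
        // Text arrives as incremental deltas.
        $result['text'] .= $event['delta'];
        break;

      case 'response.output_item.added':
        // A function call is announced as its own output item first.
        if (($event['item']['type'] ?? '') === 'function_call') {
          $result['tool_calls'][$event['output_index']] = [
            'id' => $event['item']['call_id'],
            'name' => $event['item']['name'],
            'arguments' => '',
          ];
        }
        break;

      case 'response.function_call_arguments.delta':
        // Argument JSON is streamed in fragments for the announced item.
        if (isset($result['tool_calls'][$event['output_index']])) {
          $result['tool_calls'][$event['output_index']]['arguments'] .= $event['delta'];
        }
        break;

      case 'response.completed':
        // Token usage is reported on the final event.
        $result['usage'] = $event['response']['usage'] ?? NULL;
        break;
    }
  }
  return $result;
}
```

The separate "item announced, then argument deltas" pattern is what the OpenAiResponsesToolCall value object has to adapt back into the tool_calls delta shape the core iterator expects.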

Backwards compatibility & notes

  • The chat/ChatInput/ChatOutput contract is unchanged — no consumer changes required.
  • Minor config-surface change: the chat config field is renamed (max_tokens → max_output_tokens) and the two penalty fields are removed from the UI. Stored values under the old keys are still honored via in-code translation.
  • Core follow-up (separate issue): the schema sanitizer works around the ai module's ToolsPropertyInput renderer leaking a boolean required/name into each property. The proper fix belongs in drupal/ai; the provider keeps the defensive sanitizer regardless.
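
For reviewers who want a picture of what the defensive sanitizer and the tool flattening do, here is a minimal sketch. The function names are hypothetical and the recursion into nested properties/items is an assumption about scope; the real renderResponsesTools() and sanitizeResponsesToolSchema() may differ in detail.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch of the schema sanitizer; not the module's code.
 *
 * Removes the per-property 'name' and boolean 'required' keys, leaving a
 * legitimate array-valued 'required' list untouched.
 */
function sketch_sanitize_schema(array $schema): array {
  foreach ($schema['properties'] ?? [] as $key => $property) {
    if (!is_array($property)) {
      continue;
    }
    // 'name' only duplicates the property key.
    unset($property['name']);
    // JSON Schema expects 'required' to be an array on the parent object,
    // so a boolean on the property itself is rejected by strict validation.
    if (is_bool($property['required'] ?? NULL)) {
      unset($property['required']);
    }
    $schema['properties'][$key] = sketch_sanitize_schema($property);
  }
  if (is_array($schema['items'] ?? NULL)) {
    $schema['items'] = sketch_sanitize_schema($schema['items']);
  }
  return $schema;
}

/**
 * Flattens a Chat Completions tool ({type, function:{...}}) to the
 * Responses shape ({type, name, description, parameters, strict}).
 */
function sketch_flatten_tool(array $tool): array {
  $function = $tool['function'];
  return [
    'type' => 'function',
    'name' => $function['name'],
    'description' => $function['description'] ?? '',
    'parameters' => sketch_sanitize_schema($function['parameters'] ?? []),
    'strict' => $function['strict'] ?? FALSE,
  ];
}
```

Keeping the sanitizer in the provider means the request stays valid even if a later drupal/ai release stops emitting the extra keys, since removing absent keys is a no-op.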

Testing

Automated (added with this MR — module previously had no tests)

tests/src/Unit/OpenAiResponsesToolCallTest.php, tests/src/Kernel/OpenAiResponsesChatTest.php, tests/src/Kernel/OpenAiResponsesStreamIteratorTest.php — 12 tests / 75 assertions, no network (the chat test uses a mock HTTP transport with a real OpenAI\Client).

```
# From the Drupal root, with SIMPLETEST_DB set:
vendor/bin/phpunit -c web/core/phpunit.xml.dist web/modules/contrib/ai_provider_openai/tests/

# Standards:
vendor/bin/phpcs --standard=Drupal,DrupalPractice web/modules/contrib/ai_provider_openai/
vendor/bin/phpstan analyse --configuration=web/modules/contrib/ai/phpstan.neon web/modules/contrib/ai_provider_openai/
```

Coverage: request hits /responses; message→input mapping; multimodal parts; flat + sanitized tool schemas; text.format structured output; config translation; reasoning-model handling; tool-history (function_call/function_call_output); non-streamed parsing + token usage; and streamed text/tool/usage reconstruction.

Manual UI verification

Rebuild caches first (drush cr). Since the code has no Chat Completions path left, any working chat proves the Responses endpoint is in use.

  • AI API Explorer → Chat Generator (/admin/config/ai/explorers/chat_generator):
    • Basic chat (gpt-4o-mini): normal reply + token usage.
    • Reasoning model (gpt-5.x/o*): correct reply with no "Unsupported parameter: reasoning_effort" / temperature error.
    • Streaming (Streamed on, use a multi-sentence prompt — short single-line output is coalesced by the core URL-safety buffer, unrelated to this MR): text paints progressively.
    • Structured output (Advanced → JSON Schema): returns valid JSON. (Requires the ai module's JSON-schema editor JS to be built: npm install && npm run build in web/modules/contrib/ai/ui/json-schema-editor.)
    • Function calling (Execute Function Call on): tool call executes with no "Invalid schema for function" error.
    • Vision: upload an image and confirm it's described.
  • Provider settings surface — the chat config fields render inside the AI Assistant edit form (/admin/config/ai/ai-assistant, AI Provider section): confirm "Max Output Tokens" is present and Frequency/Presence Penalty are gone. (The provider settings page itself only holds the API key/moderation.)
  • Chatbot streaming + multi-turn — requires a legacy (non-agent) assistant (the DeepChat block hides the Stream toggle for agent-based assistants); point a DeepChat block at a plain chat assistant, enable Stream, and confirm progressive streaming and retained context across turns.

Pass criteria: all scenarios succeed and — as the two MR-specific signals — reasoning models don't raise "Unsupported parameter", and tool calls don't raise "Invalid schema for function".

Closes #3558801
