Issue #3607960: Add token counting, citations, and context management

Summary

Phase 3 of the native SDK integration meta (#3572138): adds token counting, citations, and context management (compaction + editing). All three features are opt-in via new admin form sections and default off; the GA chat path is behaviorally unchanged when they are disabled (live-verified).

What changed

  • Token counting — public AnthropicProvider::countTokens(ChatInput|array, string): int backed by $client->messages->countTokens(). Provider-native method (drupal/ai 1.4 has no token-count OperationType and no consumer for one — recorded design decision; upstreamable later). Request assembly reuses the shared helpers below; create-only fields (max_tokens, temperature, top_k, top_p) are stripped per the MessageCountTokensParams contract.
  • Citations — CitationsConfigParam(enabled: true) attached to PDF DocumentBlockParams when enabled + model-capable. Response-side, all five TextCitation location variants are normalized by the new src/Citation/CitationNormalizer.php into ChatOutput::getMetadata()['citations'] (wire-shape snake_case keys), on both the sync and streaming paths (streaming accumulates CitationsDelta events and emits on the final usage chunk).
  • Context management — when enabled + model-capable (typed ModelCapabilities->contextManagement flags), chat() routes through $client->beta->messages->create() with the live-verified beta headers: compact-2026-01-12 (compaction; not yet enum'd in SDK 0.32, passed as string) / context-management-2025-06-27 (clear_tool_uses / clear_thinking). Hybrid beta-to-GA posture: the header is admin-overridable (clear the field when Anthropic promotes compaction to GA), and an API rejection of the beta header triggers exactly one retry without context management plus an actionable log warning. Streaming requests fall back to the GA streaming path (scoped decision, documented in code).
  • Shared plumbing — parseAssistantContent() dispatches content blocks by the wire type discriminator so GA Message and beta BetaMessage (fully separate SDK class trees) share one parser. resolveSystemPrompt() + buildSdkTools() deduplicate request assembly between create and count paths. extractSystemPrompt() now strips empty system messages that previously produced an API 400 (regression-tested).
  • Admin form — three peer details sections following the existing prompt-caching pattern; #states-gated dependent controls; config schema constrains the beta-header override (Regex + Length).
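
For reviewers who want the control flow at a glance, here is a minimal sketch of two behaviors described above: stripping the create-only fields before a count request, and the single retry without context management when the beta header is rejected. It is illustrative only and is not the module's code. The function names, the context_management key, the callables standing in for the SDK calls, and the use of RuntimeException to represent an API rejection are all assumptions.

```php
<?php

declare(strict_types=1);

// Hypothetical names throughout; this is a sketch, not the module's code.

/** Create-only fields named in this MR that the count endpoint does not accept. */
const CREATE_ONLY_FIELDS = ['max_tokens', 'temperature', 'top_k', 'top_p'];

/**
 * Returns the assembled request params without the create-only fields.
 */
function strip_create_only_fields(array $params): array {
  return array_diff_key($params, array_flip(CREATE_ONLY_FIELDS));
}

/**
 * Calls the beta path once; on rejection, retries once without context management.
 *
 * $betaCreate and $gaCreate are stand-ins for the SDK calls. Assumption: a
 * rejected beta header surfaces as a RuntimeException.
 */
function create_with_fallback(array $params, callable $betaCreate, callable $gaCreate, array &$log): array {
  try {
    return $betaCreate($params);
  }
  catch (RuntimeException $e) {
    // Record an actionable warning, then drop the feature and retry once.
    $log[] = 'Beta header rejected; retrying without context management: ' . $e->getMessage();
    unset($params['context_management']);
    return $gaCreate($params);
  }
}
```

In this sketch the fallback call sits outside any loop, so a failure on the retry propagates to the caller rather than triggering another attempt.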

How it was built and verified

Built via orchestrated multi-agent workflows against a human-approved architecture, then pushed through a five-layer review gauntlet:

  1. Adversarial review fan-out (correctness / security / standards / regressions) with 3-vote refutation per finding — caught and fixed a real config-source bug (admin toggles were saved to config but read from per-call runtime configuration).
  2. Code-quality audit (SOLID / DRY / security / semantic lint) + 3-persona paper test (happy path / edge case / red team) + fresh-context contribution review — 19 findings, 12 vote-confirmed, all fixed (includes the CitationNormalizer extraction, DRY dedup, cleared-vs-unset beta-header semantics, and mocked beta-path test coverage).
  3. Local gates: 101 unit tests / 208 assertions green (44 new), phpcs (Drupal + DrupalPractice) clean, drupal-rector clean, composer valid.
  4. Live E2E against the Anthropic API (ddev site): countTokens() returned a real count; citations populated metadata['citations'] with the correct page_location + cited_text for a test PDF; compaction request accepted on the beta endpoint (BetaMessage response with contextManagement acknowledged); toggles-off regression confirmed the GA path unchanged; toggle-gate enforcement confirmed (disabled countTokens() refuses).
  5. Browser verification of the admin form: all sections render, both #states levels behave, and the save round-trip persisted all five new config keys.
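
The config-source fix from step 1 and the toggle-gate check from step 4 can be pictured with the following sketch. It assumes a hypothetical token_counting_enabled key and plain arrays in place of Drupal config objects. The point is only that the gate consults the saved admin setting, treats a missing key as off, and refuses when the feature is disabled.

```php
<?php

declare(strict_types=1);

// Hypothetical key and function names; not the module's actual code.

/**
 * Reads a feature toggle from saved config; a missing key means "off".
 */
function feature_enabled(string $key, array $savedConfig): bool {
  return ($savedConfig[$key] ?? false) === true;
}

/**
 * Runs the counter only when the saved admin toggle is on.
 *
 * The toggle comes from $savedConfig, not from per-call runtime options.
 */
function count_tokens_guarded(array $savedConfig, callable $counter): int {
  if (!feature_enabled('token_counting_enabled', $savedConfig)) {
    throw new LogicException('Token counting is disabled in the provider settings.');
  }
  return $counter();
}
```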

Deferred to drupalci: phpstan (local env has an unrelated contrib autoloader break), cspell, PHPUnit concurrent + opt-in variants.

AI-Generated: Yes

Built via multi-agent AI orchestration (Claude Opus 4.8 workers, Claude Fable 5 orchestration) against a human-approved architecture: research, implementation, and tests AI-generated; reviewed through two adversarial review passes (code audit, 3-persona paper test, fresh-context review, refutation voting), live E2E-verified against the Anthropic API, with human maintainer direction throughout. Dependencies, logic, security, and GPL compatibility verified. Full contributor responsibility assumed.

Closes #3607960.
