RestrictToTopic guardrail: add semantic topic matching mode
>>> [!note] Migrated issue
<!-- Drupal.org comment -->
<!-- Migrated from issue #3584977. -->
Reported by: [marcus_johansson](https://www.drupal.org/user/385947)
>>>

<p>[Tracker]<br>
<strong>Update Summary: </strong>[One-line status update for stakeholders]<br>
<strong>Short Description: </strong>Add a semantic matching mode to the RestrictToTopic guardrail so LLM-identified topics are matched against the configured list by meaning rather than exact string equality.<br>
<strong>Check-in Date: </strong>MM/DD/YYYY<br>
[/Tracker]</p>
<h3 id="summary-problem-motivation">Problem/Motivation</h3>
<p>The <code>restrict_to_topic</code> guardrail in <code>src/Plugin/AiGuardrail/RestrictToTopic.php</code> asks an LLM which of the configured topics are present in the input text, then uses an exact-string <code>in_array()</code> check to place each returned topic into the valid or invalid bucket:</p>
<pre>foreach ($topics_present as $topic) {
  if (\in_array($topic, $valid_topics)) { ... }
  elseif (\in_array($topic, $invalid_topics)) { ... }
}</pre>
<p>The LLM does not reliably return exactly the strings from the configured list. It may generalise ("banana" &rarr; "banana fruit"), pluralise, translate, add qualifiers, or return a semantically equivalent synonym.
When that happens, the returned topic matches neither bucket, the guardrail silently drops it, and the user request either passes through unchecked or the <em>valid_topics_missing</em> branch fires incorrectly.</p>
<p>The fixed prompt also instructs the model to "return a valid json list of which topics are present" but does not constrain it to return verbatim strings from the input list, so the drift is expected behaviour rather than a model defect.</p>
<h3 id="summary-proposed-resolution">Proposed resolution</h3>
<ul>
<li>Confirm that this approach makes sense before implementing it.</li>
<li>Add a <code>matching_mode</code> setting to the guardrail configuration form with at least two options: <code>exact</code> (current behaviour, default for backward compatibility) and <code>semantic</code> (new).</li>
<li>In <code>exact</code> mode, keep the existing <code>in_array()</code> logic unchanged.</li>
<li>In <code>semantic</code> mode, change the LLM prompt to force the model to answer using only the exact strings from the configured list (e.g.
"For each topic present in the text, return the closest matching topic from the provided list verbatim, or omit it if nothing matches"), so the returned values can still be bucketed with <code>in_array()</code>.</li> <li>Alternatively, or as a third mode, run a second LLM pass (or an embedding-based similarity check) that maps each returned free-form topic back to the nearest configured topic above a similarity threshold, and bucket on that.</li> <li>Consider adding a configurable similarity threshold for the embedding variant, and surface unmatched topics in the result metadata for debugging.</li> <li>Decide whether semantic mode should require a separate embeddings provider configuration or reuse the existing <code>llm_provider</code> / <code>llm_model</code> chat provider.</li> <li>Add kernel test coverage for both modes, including the "banana" vs "banana fruit" style drift case.</li> </ul> <h3 id="summary-ai-usage">AI usage (if applicable)</h3> <p>[x] AI Assisted Issue<br> This issue was generated with AI assistance, but was reviewed and refined by the creator.</p> <p>[ ] AI Assisted Code</p> <p>[ ] AI Generated Code</p> <p>[ ] Vibe Coded</p> <p>- <strong>This issue was created with the help of AI</strong></p>
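<p>The proposed <code>exact</code> and <code>semantic</code> modes can be illustrated with a minimal, self-contained PHP sketch. It is not the guardrail's actual code: <code>bucket_topics()</code>, <code>topic_similarity()</code>, the <code>unmatched</code> result key and the default threshold of <code>0.5</code> are hypothetical names and values chosen for illustration, and a plain word-overlap (Jaccard) score stands in for the embedding or second-pass LLM comparison that a real semantic mode would presumably use.</p>

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in for an embedding similarity score.
 *
 * Returns the Jaccard overlap of lowercase word tokens (0.0 to 1.0). This is
 * ASCII-oriented and only approximates "meaning"; it is an assumption made
 * for this sketch, not part of the module.
 */
function topic_similarity(string $a, string $b): float {
  $tokens_a = array_unique(preg_split('/\W+/', strtolower($a), -1, PREG_SPLIT_NO_EMPTY));
  $tokens_b = array_unique(preg_split('/\W+/', strtolower($b), -1, PREG_SPLIT_NO_EMPTY));
  if ($tokens_a === [] || $tokens_b === []) {
    return 0.0;
  }
  $shared = count(array_intersect($tokens_a, $tokens_b));
  $total = count(array_unique(array_merge($tokens_a, $tokens_b)));
  return $shared / $total;
}

/**
 * Buckets LLM-returned topics against the configured lists.
 *
 * In 'exact' mode only verbatim matches are bucketed. In 'semantic' mode a
 * non-matching topic is mapped to the most similar configured topic whose
 * score reaches $threshold. Topics that still match nothing are reported
 * under 'unmatched' instead of being silently dropped.
 */
function bucket_topics(array $topics_present, array $valid_topics, array $invalid_topics, string $mode = 'exact', float $threshold = 0.5): array {
  $result = ['valid' => [], 'invalid' => [], 'unmatched' => []];
  $configured = array_merge($valid_topics, $invalid_topics);

  foreach ($topics_present as $topic) {
    // A verbatim match always wins, in either mode.
    $match = in_array($topic, $configured, TRUE) ? $topic : NULL;

    if ($match === NULL && $mode === 'semantic') {
      $best = 0.0;
      foreach ($configured as $candidate) {
        $score = topic_similarity($topic, $candidate);
        if ($score >= $threshold && $score > $best) {
          $best = $score;
          $match = $candidate;
        }
      }
    }

    if ($match === NULL) {
      $result['unmatched'][] = $topic;
    }
    elseif (in_array($match, $valid_topics, TRUE)) {
      $result['valid'][] = $match;
    }
    else {
      $result['invalid'][] = $match;
    }
  }

  return $result;
}
```

<p>With <code>banana</code> configured as a valid topic and <code>politics</code> as an invalid one, calling <code>bucket_topics(['banana fruit', 'politics'], ['banana'], ['politics'], 'semantic')</code> places <code>banana</code> in the valid bucket and <code>politics</code> in the invalid bucket, whereas the same call in <code>exact</code> mode reports <code>banana fruit</code> under <code>unmatched</code>. The prompt-constraining variant described in the proposed resolution would not need the similarity step at all, since the returned strings would already be verbatim.</p>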