Add option for excluding empty context values in embedding strategies
>>> [!note] Migrated issue
<!-- Drupal.org comment -->
<!-- Migrated from issue #3549753. -->
Reported by: [lpeabody](https://www.drupal.org/user/1137356)
>>>
<h3 id="summary-problem-motivation">Problem/Motivation</h3>
<p>I think it would be useful for the embedding strategies maintained by this module to have an option that excludes empty context values from being attached to a chunk. In my view, empty values are of little use to embedding models. This can be treated as an optimization, since it reduces the number of chunks per item. For example, on one of my implementations it would reduce the chunks for a given node from 84 to 50. Taking that as an average and repeating it across a thousand nodes adds up to a significant time saving, because generating embeddings for all of those chunks is slow: it took 2 hours to generate embeddings for 200 nodes, with a main-content token resolution I wasn't willing to sacrifice.</p>
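
<p>As a rough back-of-the-envelope check of the figures above, the following Python sketch extrapolates the indexing time to a thousand nodes. It assumes, purely for illustration, that embedding time scales linearly with the number of chunks and that the 200 nodes in the 2-hour run averaged 84 chunks each; neither assumption has been measured, and the function name is made up for this example. Under those assumptions, a thousand nodes would drop from roughly 10 hours to about 6.</p>

```python
def estimated_hours(nodes, chunks_per_node, seconds_per_chunk):
    """Estimate total embedding time, assuming time is linear in chunk count."""
    return nodes * chunks_per_node * seconds_per_chunk / 3600


# Assumption: the 2-hour run covered 200 nodes at about 84 chunks per node.
seconds_per_chunk = (2 * 3600) / (200 * 84)

# Extrapolate to 1000 nodes with and without empty context values.
hours_before = estimated_hours(1000, 84, seconds_per_chunk)
hours_after = estimated_hours(1000, 50, seconds_per_chunk)
hours_saved = hours_before - hours_after

print(f"before: {hours_before:.2f} h, after: {hours_after:.2f} h, saved: {hours_saved:.2f} h")
```
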
<h4 id="summary-steps-reproduce">Steps to reproduce (required for bugs, but not feature requests)</h4>
<h3 id="summary-proposed-resolution">Proposed resolution</h3>
<p>Add a base option to embedding strategies that ensures contexts with empty values are not attached to chunks. For 1.x, default it to disabled for backward compatibility (BC); for 2.x, consider defaulting it to enabled.</p>
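
<p>To make the proposal concrete, here is a minimal, language-agnostic sketch in Python of how such an option might behave. It is not the module's actual PHP API: the option name <code>exclude_empty_contexts</code>, the helper functions, and the definition of "empty" (<code>None</code>, whitespace-only strings, and empty collections, but not <code>0</code> or <code>False</code>) are all assumptions made for illustration.</p>

```python
from dataclasses import dataclass


@dataclass
class StrategyConfig:
    # Hypothetical option name; disabled by default to mirror the 1.x BC proposal.
    exclude_empty_contexts: bool = False


def is_empty(value):
    """Treat None, whitespace-only strings, and empty collections as empty."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, dict, set)):
        return len(value) == 0
    # Values such as 0 or False carry meaning, so they are kept.
    return False


def filter_contexts(contexts, config):
    """Return the contexts that should be attached to a chunk."""
    if not config.exclude_empty_contexts:
        return dict(contexts)
    return {name: value for name, value in contexts.items() if not is_empty(value)}


def build_chunk(main_text, contexts, config):
    """Prefix the main content with one 'name: value' line per kept context."""
    kept = filter_contexts(contexts, config)
    lines = [f"{name}: {value}" for name, value in kept.items()]
    lines.append(main_text)
    return "\n".join(lines)


contexts = {"title": "Hello", "summary": "", "tags": [], "weight": 0}
print(build_chunk("Body", contexts, StrategyConfig(exclude_empty_contexts=True)))
```

<p>With the option left at its default, every context is attached exactly as before, which is the backward-compatible behavior proposed for 1.x. What counts as "empty" would need to be settled as part of the actual implementation.</p>
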
<h3 id="summary-remaining-tasks">Remaining tasks</h3>
<h3>Optional: Other details as applicable (e.g., User interface changes, API changes, Data model changes)</h3>