Batch embeddings for improved indexing performance
>>> [!note] Migrated issue <!-- Drupal.org comment --> <!-- Migrated from issue #3568648. --> Reported by: [paulsheldrake](https://www.drupal.org/user/1350686) Related to !1126 >>> <p>[Tracker]<br> <strong>Update Summary: </strong>Increase embeddings performance<br> <strong>Short Description: </strong>Support batching for embeddings so multiple chunks can be converted and inserted at once instead of processing each chunk sequentially.<br> <strong>Check-in Date: </strong>MM/DD/YYYY<br> <em>Metadata is used by the <a href="https://www.drupalstarforge.ai/" title="AI Tracker">AI Tracker.</a> Docs and additional fields <a href="https://www.drupalstarforge.ai/ai-dashboard/docs" title="AI Issue Tracker Documentation">here</a>.</em><br> [/Tracker]</p> <h3 id="summary-problem-motivation">Problem/Motivation</h3> <p>Indexing into vector DBs is slow because each chunk is embedded and inserted with its own API call.</p> <h3 id="summary-proposed-resolution">Proposed resolution</h3> <p>Add batching so that more chunks can be processed in one go. This MR updates the underlying classes to support batching; it is then up to the <a href="https://www.drupal.org/project/ai_vdb_provider_milvus/issues/3568651">individual vector DB providers to implement the batching process</a>.</p> <p>On my setup, the results are:</p> <ul> <li>Before: 5-7 seconds per item (individual API calls per chunk)</li> <li>After: 3-4.5 seconds per item (batch API calls)</li> <li>Improvement: ~30-50% faster indexing</li> </ul> <h3>Related patches</h3> <ul> <li><a href="https://www.drupal.org/project/ai_provider_openai/issues/3568659">ai_provider_openai</a></li> <li><a href="https://www.drupal.org/project/ai_vdb_provider_milvus/issues/3568651">ai_vdb_provider_milvus</a></li> <li><a href="https://www.drupal.org/project/ai_provider_voyage/issues/3568661">ai_provider_voyage</a></li> <li><a href="https://www.drupal.org/project/gemini_provider/issues/3589183">gemini_provider</a></li> </ul> <h3 id="summary-ai-usage">AI usage (if applicable)</h3> <p>[x] Vibe Coded<br> This code was generated by an AI and has only been functionally tested.</p>
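<p>To illustrate the batching idea from the proposed resolution, here is a minimal, language-neutral sketch in Python. It is not the code in this MR or any provider's real API: <code>CountingEmbedder</code>, <code>embed_many</code>, <code>index_chunks</code> and <code>insert_many</code> are hypothetical names, and the embedder is a stand-in that only counts how many calls it receives.</p>

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Vector = List[float]


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


class CountingEmbedder:
    """Hypothetical stand-in for an embeddings provider; counts API calls."""

    def __init__(self) -> None:
        self.calls = 0

    def embed_many(self, texts: Sequence[str]) -> List[Vector]:
        # One simulated API round trip, regardless of how many texts it carries.
        self.calls += 1
        # Fake one-dimensional "embedding": the length of the text.
        return [[float(len(text))] for text in texts]


def index_chunks(
    chunks: Sequence[str],
    embedder: CountingEmbedder,
    insert_many: Callable[[List[Tuple[str, Vector]]], None],
    batch_size: int = 1,
) -> None:
    """Embed and insert chunks in groups of `batch_size`.

    With batch_size=1 this degenerates to the sequential behaviour
    (one embedding call and one insert per chunk).
    """
    for group in batched(chunks, batch_size):
        vectors = embedder.embed_many(group)
        insert_many(list(zip(group, vectors)))
```

<p>With ten chunks, <code>batch_size=1</code> makes ten embedding calls, while <code>batch_size=4</code> makes three. The actual speed-up depends on the provider and the vector DB, as the measurements above show.</p>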
issue