LocalAI Provider
>>> [!note] Migrated issue <!-- Drupal.org comment --> <!-- Migrated from issue #3576741. --> Reported by: [marcus_johansson](https://www.drupal.org/user/385947) >>> <p>[Tracker]<br> <strong>Update Summary: </strong>[One-line status update for stakeholders]<br> <strong>Check-in Date: </strong>MM/DD/YYYY<br> <strong>Additional Collaborators: </strong><br> <em>Metadata is used by the <a href="https://www.drupalstarforge.ai/" title="AI Tracker">AI Tracker.</a> Docs and additional fields <a href="https://www.drupalstarforge.ai/ai-dashboard/docs" title="AI Issue Tracker Documentation">here</a>.</em><br> [/Tracker]</p> <h3 id="summary-problem-motivation">Problem/Motivation</h3> <p>We have added some cool new operation types to 1.3.x, including:</p> <ul> <li><a href="https://project.pages.drupalcode.org/ai/1.3.x/developers/call_object_detection/">Object Detection</a></li> <li><a href="https://project.pages.drupalcode.org/ai/1.3.x/developers/call_image_classification/">Image Classification</a></li> <li>ReRank (we need to document this)</li> </ul> <p>These are already pretty powerful in combination with the <a href="https://www.drupal.org/project/ai_validations">AI Validations</a> module, but currently you can only use them through the <a href="https://www.drupal.org/project/ai_provider_huggingface">Huggingface</a> module. Huggingface is cool for demos, but it's too expensive for enterprise inference.</p> <p>To the rescue: <a href="https://localai.io/">https://localai.io/</a>. It fully supports <a href="https://localai.io/features/object-detection/">Object Detection</a> and <a href="https://localai.io/features/reranker/">ReRanking</a>, and possibly Image Classification.</p> <p>It also offers the usual Chat, Embeddings, Text to Speech, and Speech to Text functionality, which we should add as well.</p> <p>Many of these ML models can also run on CPU, meaning that you can set up hosting for this yourself without draining your budget. 
Maybe it's even possible to run it on AWS Lambda or some other serverless hosting.</p> <p>It also supports distributed inference, meaning that you can run a cluster behind a single endpoint. Or, if privacy is not a major concern, you can even run it in federated global mode, sharing resources with other people's machines.</p> <h3 id="summary-proposed-resolution">Proposed resolution</h3> <p>Talk to Jan Kellermann about whether whoever takes this on can become a co-maintainer. Marcus can reach out.<br> Also check how far they have come.<br> Within a sprint, create at least an RC of the provider module, including testing.<br> Create a setup form that takes a host name and an (optional) API key via the Key module.<br> Also add a controller form, separate from the normal form, where you can download and manage models.<br> The most important thing is to cover the operation types where we only have Huggingface or LiteLLM.</p> <h3 id="summary-remaining-tasks">Target date or deadline</h3> <h3 id="summary-remaining-tasks">Remaining tasks</h3> <h3 id="summary-ai-usage">AI usage (if applicable)</h3> <p>[ ] AI Assisted Issue<br> This issue was generated with AI assistance, but was reviewed and refined by the creator.</p> <p>[ ] AI Assisted Code<br> This code was mainly generated by a human, with AI autocompleting or parts AI generated, but under full human supervision.</p> <p>[ ] AI Generated Code<br> This code was mainly generated by an AI with human guidance, and reviewed, tested, and refined by a human.</p> <p>[ ] Vibe Coded<br> This code was generated by an AI and has only been functionally tested.</p>
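<p>As a rough illustration of the two settings named under Proposed resolution (a host name and an optional API key), the sketch below shows how a ReRank request to a LocalAI instance might be assembled. It is not the provider module's implementation. The <code>/v1/rerank</code> path, the payload field names, and the Bearer-token header are assumptions based on LocalAI's reranker documentation and should be verified against the running version. The function name <code>build_rerank_request</code>, the model name, and the host value are hypothetical.</p>

```python
import json


def build_rerank_request(host, query, documents, model, api_key=None, top_n=3):
    """Assemble (but do not send) a rerank request for a LocalAI host.

    Assumptions: the endpoint path, the payload fields, and Bearer
    authentication follow LocalAI's reranker docs; confirm before use.
    """
    # Normalise the host so a trailing slash does not produce "//v1/rerank".
    url = host.rstrip("/") + "/v1/rerank"

    headers = {"Content-Type": "application/json"}
    # The API key is optional, mirroring the proposed setup form.
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"

    payload = {
        "model": model,
        "query": query,
        "documents": documents,
        # Never ask for more results than there are documents.
        "top_n": min(top_n, len(documents)),
    }
    return {"url": url, "headers": headers, "body": json.dumps(payload)}


if __name__ == "__main__":
    request = build_rerank_request(
        host="http://localhost:8080",   # hypothetical local instance
        query="Which node describes our refund policy?",
        documents=["Refunds are issued within 14 days.", "Our office is in Berlin."],
        model="example-reranker",       # hypothetical model name
    )
    print(request["url"])
    print(request["body"])
```

<p>Keeping request construction separate from sending makes this part easy to unit test without a running LocalAI instance. The same pattern would apply to the other operation types.</p>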
issue