Draft: Symfony AI Store integration with AI (no changes to ai_search)
Description
Integration Plan: symfony/ai-store into the Drupal AI Module
Date: 2026-05-27
Scope: Core ai module — ai_search submodule changes are out of scope for this document but are referenced as a downstream consumer.
1. Executive Summary
The Drupal ai module currently ships its own VDB-provider abstraction (AiVdbProviderInterface) and embedding pipeline that are tightly coupled to Drupal's plugin system and the Search API. The symfony/ai-store library (already present in vendor/symfony/ai-store) provides a backend-agnostic, pipeline-oriented abstraction for vectorization, document indexing, and retrieval, together with 25+ ready-made store adapters (Qdrant, Weaviate, Milvus, pgvector, Pinecone, OpenSearch, …).
The goal of this plan is to:
- Bridge Drupal's existing embedding and VDB abstractions into symfony's interfaces — zero breaking changes for existing providers and consumers.
- Centralize vectorization, indexing, and querying behind three new Drupal services so all submodules and contrib modules share one pipeline.
- Enable any symfony/ai-store bridge package (e.g. symfony/qdrant-store) to be plugged in as a Drupal VDB backend with minimal glue code.
2. Current Architecture
2.1 AI Provider / Embedding System
| Component | Path | Role |
|---|---|---|
| AiProviderInterface | src/AiProviderInterface.php | All AI providers implement this; exposes embeddings() |
| EmbeddingsInterface | src/OperationType/Embeddings/EmbeddingsInterface.php | Operation type interface: embeddings(input, model): EmbeddingsOutput |
| EmbeddingsInput | src/OperationType/Embeddings/EmbeddingsInput.php | DTO wrapping the text (or image) to embed |
| EmbeddingsOutput | src/OperationType/Embeddings/EmbeddingsOutput.php | DTO with getNormalized(): array (float vector) |
| AiProviderPluginManager | src/AiProviderPluginManager.php | Drupal plugin manager; discovers providers via #[AiProvider] attribute |
| EmbeddingsTrait | src/Traits/OperationType/EmbeddingsTrait.php | Default maxEmbeddingsInput() / embeddingsVectorSize() for providers |
2.2 Vector Database System
| Component | Path | Role |
|---|---|---|
| AiVdbProviderInterface | src/AiVdbProviderInterface.php | All VDB providers implement this |
| AiVdbProviderClientBase | src/Base/AiVdbProviderClientBase.php | Abstract base class; also implements Search API integration |
| AiVdbProviderPluginManager | src/AiVdbProviderPluginManager.php | Discovers VDB plugins via #[AiVdbProvider] attribute |
| VdbSimilarityMetrics | src/Enum/VdbSimilarityMetrics.php | CosineSimilarity, EuclideanDistance, InnerProduct |
Registered VDB backends: azure_ai_search, milvus, pinecone, postgres (pgvector), sqlite.
2.3 Key AiVdbProviderInterface Operations
```
Collection management : getCollections, createCollection, dropCollection
Data write            : insertIntoCollection, deleteFromCollection, deleteItems, deleteAllItems
Query                 : querySearch(…), vectorSearch(…, QueryInterface $query, …)
Utility               : getVdbIds, getRawEmbeddingFieldName, ping, isSetup
```
2.4 Chunking / Tokenization
- ai.text_chunker → src/Service/TextChunker.php
- ai.tokenizer → src/Utility/Tokenizer.php
- Currently used by the ai_search submodule's embedding strategies; no centralized indexing service exists in the core ai module.
3. symfony/ai-store Overview
3.1 Core Abstractions
| Interface | Role |
|---|---|
| StoreInterface | add(VectorDocument[]), remove(id[]), query(QueryInterface), supports() |
| ManagedStoreInterface | setup(), drop(): optional collection lifecycle |
| IndexerInterface | index($input, $options): high-level entry point |
| RetrieverInterface | retrieve(string $query, $options): iterable<VectorDocument> |
| VectorizerInterface | vectorize(string\|…): turns strings or embeddable documents into Vector / VectorDocument values (signature as sketched in §5.1) |
| TransformerInterface | Mutate a stream of documents (chunking, cleaning, summarizing) |
| FilterInterface | Remove documents from a stream before indexing |
| LoaderInterface | Load documents from a source (file, URL, CSV, RSS, …) |
3.2 Core Value Objects
| Class | Key Fields |
|---|---|
| TextDocument | id, content, Metadata |
| VectorDocument | id, Vector, Metadata, score |
| Metadata | ArrayObject subclass; reserved keys _text, _parent_id, _source, _title, _summary |
| Vector | Float array wrapper |
3.3 Indexing Pipeline
```
DocumentProcessor: Filters → Transformers → Vectorize (batched) → Store::add()
DocumentIndexer  : accepts EmbeddableDocument(s), delegates to DocumentProcessor
SourceIndexer    : accepts source path/URL, uses LoaderInterface, delegates to DocumentProcessor
```
3.4 Retrieval Pipeline
```
Retriever:
  1. Dispatch PreQueryEvent (allow query expansion / modification)
  2. Build query: TextQuery | VectorQuery | HybridQuery (based on store capabilities)
  3. StoreInterface::query()
  4. Dispatch PostQueryEvent (allow reranking / filtering)
  5. Yield VectorDocuments
```
3.5 Available Query Types
| Class | Description |
|---|---|
| VectorQuery | Wraps a Vector for similarity search |
| TextQuery | Wraps a string (or string array, OR logic) for full-text search |
| HybridQuery | Combines Vector + text with configurable semanticRatio (0–1) |
3.6 Available Bridge Packages (installable via Composer)
Qdrant, Weaviate, Milvus, Pinecone, ChromaDB, Elasticsearch, OpenSearch, Meilisearch, ManticoreSearch, pgvector, SQLite, MongoDB, Redis, Supabase, SurrealDB, Neo4j, ClickHouse, MariaDB, Azure AI Search, AWS S3 Vectors, Cloudflare Vectorize, Typesense, Vektor.
4. Gap Analysis and Integration Challenges
4.1 Vectorizer: Different Platform Abstractions
symfony's Vectorizer uses symfony/ai-platform's PlatformInterface::invoke(), while the Drupal AI module uses AiProviderInterface::embeddings(). The two are incompatible at the API level.
Resolution: A DrupalAiVectorizerAdapter bridges Drupal providers → VectorizerInterface without touching the symfony Platform stack.
4.2 Store: The Search API Coupling
AiVdbProviderInterface::vectorSearch() requires a \Drupal\search_api\Query\QueryInterface parameter. This is the main obstacle to wrapping existing VDB providers as StoreInterface.
```php
// Current signature: requires Search API
public function vectorSearch(
  string $collection_name,
  array $vector_input,
  array $output_fields,
  \Drupal\search_api\Query\QueryInterface $query, // <-- problem
  string $filters = '',
  int $limit = 10, int $offset = 0, string $database = 'default',
): array;
```
Resolution: Introduce a new optional interface AiVdbDirectSearchInterface (see §5.2) with a vectorSearchDirect() method that has no Search API dependency. DrupalVdbStoreAdapter uses it when available, and falls back gracefully otherwise. AiVdbProviderClientBase gains a default implementation that delegates to vectorSearch() via a minimal no-op query, which keeps all existing providers working immediately.
4.3 Collection / Namespace Management
StoreInterface has no concept of collections or namespaces. symfony stores are typically scoped to a single collection at construction time. AiVdbProviderInterface manages multiple named collections.
Resolution: DrupalVdbStoreAdapter is constructed with a fixed $collectionName + $database. Collection creation/management remains on AiVdbProviderInterface and is performed by the factory service before handing out the adapter.
4.4 ID Mapping
Drupal entity IDs are composite strings (e.g. entity:node/1:en). symfony's VectorDocument expects int|string IDs. Chunked documents add a suffix (e.g. entity:node/1:en#chunk-2).
Resolution: The DrupalEntityDocumentMapper (§5.5) encodes/decodes Drupal IDs into strings accepted by symfony, and writes the original Drupal ID into Metadata::_source.
4.5 Metadata Conventions
Drupal VDB providers store ad-hoc metadata arrays; symfony uses Metadata with typed accessors and reserved underscore-prefixed keys.
Resolution: The mapper flattens Drupal metadata into Metadata, using _source for the Drupal entity ID, _title for the entity label, and custom namespace keys (drupal_entity_type, drupal_bundle, etc.) for Drupal-specific fields.
4.6 Batch Vectorization
Drupal AI providers are called one item at a time, whereas symfony's Vectorizer checks Capability::INPUT_MULTIPLE to batch inputs when the platform supports it.
Resolution: The adapter defaults to sequential calls and will support batching once providers expose a batch embeddings method.
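The sequential-by-default behaviour with an opt-in batch path could look roughly like the sketch below. SingleEmbedder, BatchEmbedder, embedAll() and CountingEmbedder are hypothetical stand-ins invented for this illustration; they are not part of the ai module or symfony/ai-store, and the real adapter would call AiProviderInterface::embeddings() instead.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-ins for illustration only; neither interface exists in the ai module.
interface SingleEmbedder {
  /** @return float[] */
  public function embed(string $text): array;
}

interface BatchEmbedder extends SingleEmbedder {
  /** @return float[][] One vector per input, in input order. */
  public function embedBatch(array $texts): array;
}

/**
 * Embeds all texts, batching only when the provider supports it.
 *
 * @param string[] $texts
 * @return float[][]
 */
function embedAll(SingleEmbedder $provider, array $texts): array {
  $texts = array_values($texts);
  if ($provider instanceof BatchEmbedder) {
    // One round trip for the whole list.
    return $provider->embedBatch($texts);
  }
  // Sequential fallback: one provider call per text.
  return array_map(fn(string $t): array => $provider->embed($t), $texts);
}

// Toy provider whose "embedding" is [byte length, word count]; it counts calls.
final class CountingEmbedder implements SingleEmbedder {
  public int $calls = 0;

  public function embed(string $text): array {
    $this->calls++;
    return [(float) strlen($text), (float) str_word_count($text)];
  }
}

$provider = new CountingEmbedder();
$vectors = embedAll($provider, ['hello world', 'drupal']);
```

Because the capability check is an instanceof test, providers that later gain a batch method would switch to the faster path without any change to callers.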
4.7 Dependency Declaration
symfony/ai-store is currently in vendor/ but not in composer.json of the ai module.
Resolution: Add it to require (see §6.1).
5. Proposed Architecture
```
┌─────────────────────────────────────────────────────────────────┐
│                  Consumers (contrib / custom)                   │
│  ai_search  │  ai_automators  │  ai_chatbot  │  custom modules  │
└──────┬──────────────┬───────────────┬──────────────┬────────────┘
       │              │               │              │
┌──────▼──────────────▼───────────────▼──────────────▼────────────┐
│                Centralized Drupal Services (new)                │
│ ai.indexing_pipeline   ai.retrieval_pipeline   ai.vectorization │
└──────┬───────────────────────────────────┬──────────────────────┘
       │                                   │
┌──────▼──────────────┐      ┌─────────────▼──────────────────────┐
│ Bridge Layer (new)  │      │ Bridge Layer (new)                 │
│                     │      │                                    │
│ DrupalAiVectorizer  │      │ DrupalVdbStoreAdapter              │
│ Adapter             │      │ (StoreInterface wrapping           │
│ (VectorizerInterface│      │  AiVdbProviderInterface)           │
│  wrapping Drupal    │      │                                    │
│  AI providers)      │      │ SymfonyStoreBridgePlugin           │
└──────┬──────────────┘      │ (AiVdbProviderInterface wrapping   │
       │                     │  any symfony StoreInterface)       │
       │                     └─────────────┬──────────────────────┘
       │                                   │
┌──────▼──────────────┐      ┌─────────────▼──────────────────────┐
│ Existing Drupal     │      │ Existing Drupal VDB Providers      │
│ AI Providers        │      │ (milvus, pinecone, pgvector, …)    │
│ (openai, ollama, …) │      │                                    │
└─────────────────────┘      │ OR symfony Store Bridges           │
                             │ (qdrant-store, weaviate-store, …)  │
                             └────────────────────────────────────┘
```
5.1 DrupalAiVectorizerAdapter
Path: src/Bridge/SymfonyAiStore/DrupalAiVectorizerAdapter.php
Implements: Symfony\AI\Store\Document\VectorizerInterface
Responsibilities:
- Accepts a Drupal AiProviderInterface instance (resolved from AiProviderPluginManager), a provider ID, and a model ID.
- For a string|Stringable input: wraps it in EmbeddingsInput, calls $provider->embeddings(), and returns a Vector built from EmbeddingsOutput::getNormalized().
- For an EmbeddableDocumentInterface input: does the above, then wraps the result as a VectorDocument with the document's ID and metadata.
- For an array input: iterates and processes each element (sequential; batch support is a future enhancement).
- Optionally implements \Psr\Log\LoggerAwareInterface for observability.
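```php
// Sketch, not final code.
use Drupal\ai\AiProviderPluginManager;
use Drupal\ai\OperationType\Embeddings\EmbeddingsInput;
use Psr\Log\LoggerAwareInterface;
use Psr\Log\LoggerInterface;
use Psr\Log\NullLogger;
use Symfony\AI\Store\Document\VectorizerInterface;
// Imports for Vector, VectorDocument and EmbeddableDocumentInterface are omitted;
// take their namespaces from the installed symfony/ai-store version.

final class DrupalAiVectorizerAdapter implements VectorizerInterface, LoggerAwareInterface {
  public function __construct(
    private readonly AiProviderPluginManager $providerManager,
    private readonly string $providerId,
    private readonly string $modelId,
    private LoggerInterface $logger = new NullLogger(),
  ) {}

  // Required by LoggerAwareInterface; the class cannot be loaded without it.
  public function setLogger(LoggerInterface $logger): void {
    $this->logger = $logger;
  }

  public function vectorize(string|\Stringable|EmbeddableDocumentInterface|array $values, array $options = []): Vector|VectorDocument|array {
    if (is_array($values)) {
      // Sequential for now: one provider call per element.
      return array_map(fn($v) => $this->vectorizeSingle($v, $options), $values);
    }
    return $this->vectorizeSingle($values, $options);
  }

  private function vectorizeSingle(mixed $value, array $options): Vector|VectorDocument {
    $text = $value instanceof EmbeddableDocumentInterface ? (string) $value->getContent() : (string) $value;
    $provider = $this->providerManager->createInstance($this->providerId);
    $output = $provider->embeddings(new EmbeddingsInput($text), $this->modelId);
    $vector = new Vector($output->getNormalized());
    // Documents keep their ID and metadata; plain strings yield a bare Vector.
    return $value instanceof EmbeddableDocumentInterface
      ? new VectorDocument($value->getId(), $vector, $value->getMetadata())
      : $vector;
  }
}
```
5.2 New Optional Interface: AiVdbDirectSearchInterface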
Path: src/AiVdbDirectSearchInterface.php
Decouples vector search from Search API. Existing providers are not required to implement this — it is entirely additive.
```php
interface AiVdbDirectSearchInterface {
  /**
   * Perform a vector similarity search without Search API dependency.
   *
   * @param array $vector_input Float array (embedding vector)
   * @param array $output_fields Fields to return in results
   * @param array $filters Simple key=>value filter map
   * @return array Matching records with optional score field
   */
  public function vectorSearchDirect(
    string $collection_name,
    array $vector_input,
    array $output_fields,
    array $filters = [],
    int $limit = 10,
    int $offset = 0,
    string $database = 'default',
  ): array;
}
```
AiVdbProviderClientBase gains a default implementation of vectorSearchDirect() that:
- Converts the $filters array into the string format prepareFilters() already accepts.
- Calls $this->vectorSearch(…, $query = new NullSearchApiQuery()), where NullSearchApiQuery is a minimal internal stub that produces empty conditions.
This means all existing providers that extend AiVdbProviderClientBase get vectorSearchDirect() for free without any changes to their own code.
5.3 DrupalVdbStoreAdapter
Path: src/Bridge/SymfonyAiStore/DrupalVdbStoreAdapter.php
Implements: Symfony\AI\Store\StoreInterface, Symfony\AI\Store\ManagedStoreInterface
Wraps one AiVdbProviderInterface instance scoped to a single collection.
| StoreInterface method | Maps to |
|---|---|
| add(VectorDocument[]) | insertIntoCollection($collectionName, $data): converts VectorDocument to the flat array format providers expect |
| remove(string[]) | deleteFromCollection($collectionName, $ids) |
| query(VectorQuery) | vectorSearchDirect() if provider implements AiVdbDirectSearchInterface, else vectorSearch() with null stub query |
| query(TextQuery) | querySearch() with filter built from text |
| query(HybridQuery) | Executes both vector and text paths, merges results using Reciprocal Rank Fusion (mirrors symfony's CombinedStore) |
| supports($class) | Returns true for VectorQuery; returns true for TextQuery and HybridQuery only if provider's querySearch() is meaningful |
| setup($options) | createCollection(): reads dimension and metric_type from $options |
| drop($options) | dropCollection() |
Key design notes:
- The adapter is collection-scoped: one adapter instance = one collection. The factory service creates adapters for the configured collection.
- Metadata conversion: VectorDocument::getMetadata() is stored as a JSON-encoded extra field; on retrieval it is decoded back.
- Score: raw distance returned by VDB providers is mapped to VectorDocument::withScore().
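The Reciprocal Rank Fusion merge named for query(HybridQuery) can be sketched on plain ID lists as follows. This is a generic illustration of the technique, not the adapter's or symfony's actual code; the function name reciprocalRankFusion() and the constant k = 60 (a commonly used default) are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Merges ranked ID lists with Reciprocal Rank Fusion.
 *
 * @param array<int, string[]> $rankings Each list is ordered best-first.
 * @param int $k Smoothing constant; 60 is a common default.
 * @return array<string, float> Fused scores keyed by ID, best first.
 */
function reciprocalRankFusion(array $rankings, int $k = 60): array {
  $scores = [];
  foreach ($rankings as $ranking) {
    foreach (array_values($ranking) as $position => $id) {
      // Ranks are 1-based, so the top hit contributes 1 / (k + 1).
      $scores[$id] = ($scores[$id] ?? 0.0) + 1.0 / ($k + $position + 1);
    }
  }
  // Highest fused score first; keys (document IDs) are preserved.
  arsort($scores);
  return $scores;
}

// Results of the vector path and the text path for the same query.
$vectorHits = ['node/1/en', 'node/2/en', 'node/3/en'];
$textHits = ['node/2/en', 'node/4/en', 'node/1/en'];
$fused = reciprocalRankFusion([$vectorHits, $textHits]);
```

RRF only needs ranks, so it sidesteps the problem that distance scores from the vector path and relevance scores from the text path are not on a comparable scale.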
5.4 SymfonyStoreBridgePlugin
Path: src/Bridge/SymfonyAiStore/SymfonyStoreBridgePlugin.php
Extends: AiVdbProviderClientBase
Purpose: Base class for Drupal VDB provider plugins that delegate to a symfony StoreInterface.
This enables contrib/custom modules to ship thin Drupal plugins for any symfony store bridge package (e.g. a QdrantVdbProvider plugin that wraps symfony/qdrant-store) with minimal boilerplate.
Responsibilities:
- Implements the full AiVdbProviderInterface contract.
- Stores the wrapped StoreInterface instance (injected or constructed in create()).
- Maps collection management to ManagedStoreInterface::setup() / drop() if available, else no-op.
- Maps insertIntoCollection() → StoreInterface::add().
- Maps deleteFromCollection() → StoreInterface::remove().
- Maps vectorSearch() / vectorSearchDirect() → StoreInterface::query(VectorQuery).
- Implements AiVdbDirectSearchInterface natively.
Concrete bridge plugins extend this class, supply the symfony store instance, and add their own settings form (host, API key, etc.).
5.5 DrupalEntityDocumentMapper
Path: src/Mapper/DrupalEntityDocumentMapper.php
Converts Drupal entities and Search API items to/from TextDocument/VectorDocument.
```
Entity → TextDocument:
  id       = "<entity_type>/<entity_id>/<langcode>"
  content  = concatenated text fields (configurable)
  metadata:
    _source            = "<entity_type>/<entity_id>/<langcode>"
    _title             = entity label
    drupal_entity_type = "node"
    drupal_bundle      = "article"
    drupal_langcode    = "en"
    drupal_url         = absolute URL

VectorDocument → result array:
  id       = parsed back to entity_type/entity_id/langcode
  score    = similarity score
  metadata = passed through
```
Chunked documents append #chunk-N to the ID and set Metadata::KEY_PARENT_ID.
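A minimal sketch of the ID scheme described above, including the #chunk-N suffix. The helper names encodeDocumentId() and decodeDocumentId() are hypothetical, since the mapper's real API is not defined yet, and the sketch assumes that none of the three ID parts contains "/" or "#".

```php
<?php
declare(strict_types=1);

// Hypothetical helpers; the mapper's real API is not defined yet.

/** Builds "<entity_type>/<entity_id>/<langcode>" with an optional "#chunk-N" suffix. */
function encodeDocumentId(string $entityType, string $entityId, string $langcode, ?int $chunk = NULL): string {
  $id = "$entityType/$entityId/$langcode";
  return $chunk === NULL ? $id : "$id#chunk-$chunk";
}

/**
 * Parses an ID produced by encodeDocumentId().
 *
 * @return array{entity_type: string, entity_id: string, langcode: string, chunk: ?int, parent_id: string}
 */
function decodeDocumentId(string $id): array {
  if (!preg_match('~^([^/#]+)/([^/#]+)/([^/#]+)(?:#chunk-(\d+))?$~', $id, $m)) {
    throw new InvalidArgumentException("Unrecognized document ID: $id");
  }
  return [
    'entity_type' => $m[1],
    'entity_id' => $m[2],
    'langcode' => $m[3],
    // The chunk group is absent from $m when the ID has no suffix.
    'chunk' => isset($m[4]) ? (int) $m[4] : NULL,
    // Value a chunk would carry as its parent ID.
    'parent_id' => "$m[1]/$m[2]/$m[3]",
  ];
}

$chunkId = encodeDocumentId('node', '1', 'en', 2);
$parsed = decodeDocumentId($chunkId);
```

Keeping the suffix deterministic means re-indexing the same entity produces the same chunk IDs, so stale chunks can be found and removed by ID.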
5.6 Centralized Drupal Services
Three new pipeline services, plus one supporting factory, centralize logic that is currently scattered across per-module implementations.
ai.vectorization_service → VectorizationService
Path: src/Service/VectorizationService.php
- getVectorizer(string $provider_id, string $model_id): VectorizerInterface
- vectorize(string $text, string $provider_id, string $model_id): array // float[]
- vectorizeBatch(array $texts, string $provider_id, string $model_id): array // float[][]

Internally:
- Creates (or caches) a DrupalAiVectorizerAdapter per provider+model combination.
- Dispatches Drupal events before/after vectorization for observability (ai_observability submodule integration).
- Caches results using the cache.ai backend, keyed by a hash of the input text.
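The hash-keyed caching idea can be illustrated with the sketch below. CachingVectorizer is a hypothetical class, a plain PHP array stands in for the cache.ai backend, the embedding callback replaces the real provider call, and 'example-model' is a placeholder model ID. The $cacheable flag follows the suggestion in the open questions.

```php
<?php
declare(strict_types=1);

/**
 * Toy vectorization cache keyed by provider, model, and a hash of the text.
 * An in-memory array stands in for the cache.ai backend.
 */
final class CachingVectorizer {
  /** @var array<string, float[]> */
  private array $cache = [];

  public function __construct(
    private readonly \Closure $embed,
    private readonly string $providerId,
    private readonly string $modelId,
  ) {}

  /** @return float[] */
  public function vectorize(string $text, bool $cacheable = TRUE): array {
    // Provider and model are part of the key: the same text embeds differently per model.
    $key = $this->providerId . ':' . $this->modelId . ':' . hash('sha256', $text);
    if ($cacheable && isset($this->cache[$key])) {
      return $this->cache[$key];
    }
    $vector = ($this->embed)($text);
    if ($cacheable) {
      $this->cache[$key] = $vector;
    }
    return $vector;
  }
}

$calls = 0;
$vectorizer = new CachingVectorizer(
  function (string $text) use (&$calls): array {
    $calls++;
    return [(float) strlen($text)];
  },
  'openai',
  'example-model',
);
$first = $vectorizer->vectorize('hello');
$second = $vectorizer->vectorize('hello');
```

Passing $cacheable = FALSE skips both the lookup and the write, which suits dynamic content that should not populate the cache.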
ai.indexing_pipeline → IndexingPipelineService
Path: src/Service/IndexingPipelineService.php
- indexEntity(EntityInterface $entity, array $config): void
- indexDocuments(iterable $documents, array $config): void
- indexFromSource(string $source, array $config): void

Internally:
- Builds a symfony DocumentProcessor from config:
  - Vectorizer: DrupalAiVectorizerAdapter (from ai.vectorization_service)
  - Store: DrupalVdbStoreAdapter (wrapping the configured VDB provider)
  - Transformers: at minimum a TextSplitTransformer (replaces current ai.text_chunker for this pipeline)
  - Filters: pluggable via Drupal tagged services
- For entity input, uses DrupalEntityDocumentMapper to produce TextDocument instances.
- Ensures the collection exists before indexing (via AiVdbProviderPluginManager::ensureCollectionExists()).
ai.retrieval_pipeline → RetrievalPipelineService
Path: src/Service/RetrievalPipelineService.php
- retrieve(string $query, array $config, int $limit = 10): array<VectorDocument>
- retrieveAsEntities(string $query, array $config, int $limit = 10): array<EntityInterface>

Internally:
- Builds a symfony Retriever with:
  - VectorizerInterface: DrupalAiVectorizerAdapter
  - StoreInterface: DrupalVdbStoreAdapter
- Dispatches PreQueryEvent / PostQueryEvent, so contrib modules can register rerankers or query expanders via Drupal's event system.
- Supports hybrid queries when the underlying store (via DrupalVdbStoreAdapter) reports supports(HybridQuery::class).
ai.symfony_store_factory → SymfonyStoreFactory
Path: src/Service/SymfonyStoreFactory.php
- createStore(string $vdb_provider_id, array $collection_config): StoreInterface

Looks up the named VDB provider plugin, constructs a DrupalVdbStoreAdapter wrapping it, ensures the collection exists, and returns the adapter. Used internally by IndexingPipelineService and RetrievalPipelineService.
6. Implementation Plan
Phase 1 — Dependency and Scaffold (no functional changes)
- Add symfony/ai-store to composer.json of the ai module (require section, ^0.9).
- Create directory structure:
  - web/modules/contrib/ai/src/Bridge/SymfonyAiStore/
  - web/modules/contrib/ai/src/Mapper/
  - web/modules/contrib/ai/src/Service/ (already exists partially)
- Verify autoloading: the module already uses PSR-4 under Drupal\ai, so new subdirectories are picked up automatically.
Phase 2 — New Optional Interface AiVdbDirectSearchInterface
- Create src/AiVdbDirectSearchInterface.php.
- Add a default implementation of vectorSearchDirect() to AiVdbProviderClientBase:
  - Converts array $filters to a string using the same logic as prepareFilters().
  - Creates a minimal no-op internal query stub (an anonymous or @internal class, since PHP has no inner classes; no public API).
  - Calls $this->vectorSearch(…) with that stub.
- Mark existing VDB provider plugins as implementing AiVdbDirectSearchInterface progressively (optional, no deadline).
Backward compatibility: Zero changes to AiVdbProviderInterface. Existing providers continue to work exactly as before.
Phase 3 — DrupalAiVectorizerAdapter
- Create src/Bridge/SymfonyAiStore/DrupalAiVectorizerAdapter.php.
- Write unit tests in tests/src/Unit/Bridge/DrupalAiVectorizerAdapterTest.php using existing mock provider infrastructure.
- Register as a factory in ai.services.yml:
  ```yaml
  ai.drupal_vectorizer_adapter_factory:
    class: Drupal\ai\Bridge\SymfonyAiStore\DrupalAiVectorizerAdapterFactory
    arguments: ['@ai.provider']
  ```
Phase 4 — DrupalVdbStoreAdapter
- Create src/Bridge/SymfonyAiStore/DrupalVdbStoreAdapter.php.
- Implement add(), remove(), query(), supports(), setup(), drop().
- Write integration tests against the InMemory VDB provider (test infrastructure already exists).
- Register factory:
  ```yaml
  ai.symfony_store_factory:
    class: Drupal\ai\Service\SymfonyStoreFactory
    arguments: ['@ai.vdb_provider']
  ```
Phase 5 — SymfonyStoreBridgePlugin Base Class
- Create src/Bridge/SymfonyAiStore/SymfonyStoreBridgePlugin.php extending AiVdbProviderClientBase.
- Add documentation and a code example showing how to write a concrete plugin for a symfony bridge package (e.g. qdrant).
- No service registration: this is a base class for other plugins.
Phase 6 — DrupalEntityDocumentMapper
- Create src/Mapper/DrupalEntityDocumentMapper.php.
- Define a DrupalEntityDocumentMapperInterface so it can be swapped or decorated.
- Register as a service:
  ```yaml
  ai.entity_document_mapper:
    class: Drupal\ai\Mapper\DrupalEntityDocumentMapper
    arguments: ['@entity_type.manager', '@url_generator']
  ```
Phase 7 — Centralized Services
- Create VectorizationService, IndexingPipelineService, RetrievalPipelineService.
- Inject all dependencies via constructor; use lazy loading where appropriate.
- Add service definitions to ai.services.yml (see §7).
- Dispatch Drupal events at key pipeline steps to integrate with ai_observability and ai_logging.
Phase 8 — Documentation and Migration Guide
- Add docblock-level usage examples to each new class.
- Write a developer-facing guide in docs/symfony-ai-store.md explaining how to:
  - Use ai.indexing_pipeline to index Drupal entities.
  - Use ai.retrieval_pipeline to retrieve by semantic similarity.
  - Write a new VDB backend using SymfonyStoreBridgePlugin.
- Note which ai_search functionality can be migrated to the new centralized services in a future minor release.
7. Service Definitions (additions to ai.services.yml)
```yaml
services:
  # Factory for DrupalAiVectorizerAdapter instances
  ai.drupal_vectorizer_adapter_factory:
    class: Drupal\ai\Bridge\SymfonyAiStore\DrupalAiVectorizerAdapterFactory
    arguments:
      - '@ai.provider'
      - '@logger.channel.ai'

  # Factory for DrupalVdbStoreAdapter instances
  ai.symfony_store_factory:
    class: Drupal\ai\Service\SymfonyStoreFactory
    arguments:
      - '@ai.vdb_provider'
      - '@logger.channel.ai'

  # Entity ↔ TextDocument mapping
  ai.entity_document_mapper:
    class: Drupal\ai\Mapper\DrupalEntityDocumentMapper
    arguments:
      - '@entity_type.manager'
      - '@url_generator'

  # Centralized vectorization
  ai.vectorization_service:
    class: Drupal\ai\Service\VectorizationService
    arguments:
      - '@ai.drupal_vectorizer_adapter_factory'
      - '@cache.ai'
      - '@event_dispatcher'
      - '@logger.channel.ai'

  # Centralized indexing pipeline
  ai.indexing_pipeline:
    class: Drupal\ai\Service\IndexingPipelineService
    arguments:
      - '@ai.vectorization_service'
      - '@ai.symfony_store_factory'
      - '@ai.entity_document_mapper'
      - '@ai.vdb_provider'
      - '@event_dispatcher'
      - '@logger.channel.ai'

  # Centralized retrieval pipeline
  ai.retrieval_pipeline:
    class: Drupal\ai\Service\RetrievalPipelineService
    arguments:
      - '@ai.vectorization_service'
      - '@ai.symfony_store_factory'
      - '@ai.entity_document_mapper'
      - '@event_dispatcher'
      - '@logger.channel.ai'
```
8. New File Structure
```
web/modules/contrib/ai/
├── composer.json  # + symfony/ai-store ^0.9
├── ai.services.yml  # + new service definitions (§7)
└── src/
    ├── AiVdbDirectSearchInterface.php  # NEW: optional, Search-API-free vector search
    ├── Base/
    │   └── AiVdbProviderClientBase.php  # MODIFIED: adds default vectorSearchDirect()
    ├── Bridge/
    │   └── SymfonyAiStore/
    │       ├── DrupalAiVectorizerAdapter.php  # NEW: VectorizerInterface wrapping Drupal AI provider
    │       ├── DrupalAiVectorizerAdapterFactory.php  # NEW: factory service
    │       ├── DrupalVdbStoreAdapter.php  # NEW: StoreInterface wrapping AiVdbProviderInterface
    │       └── SymfonyStoreBridgePlugin.php  # NEW: base class for symfony-store-backed VDB plugins
    ├── Mapper/
    │   ├── DrupalEntityDocumentMapperInterface.php  # NEW: contract for entity-to-document mapping
    │   └── DrupalEntityDocumentMapper.php  # NEW: default implementation
    └── Service/
        ├── SymfonyStoreFactory.php  # NEW: creates scoped DrupalVdbStoreAdapter instances
        ├── VectorizationService.php  # NEW: centralized vectorization with caching
        ├── IndexingPipelineService.php  # NEW: centralized indexing using DocumentProcessor
        └── RetrievalPipelineService.php  # NEW: centralized retrieval using Retriever
```
Files marked MODIFIED receive only additive changes (a new method on the base class), and composer.json and ai.services.yml gain only the additions marked with "+". All other existing files remain unchanged.
9. Backward Compatibility Guarantees
| Artifact | Status |
|---|---|
| AiVdbProviderInterface | Unchanged: no added required methods |
| AiProviderInterface | Unchanged |
| EmbeddingsInterface, EmbeddingsInput, EmbeddingsOutput | Unchanged |
| AiVdbProviderPluginManager | Unchanged |
| AiVdbProviderClientBase | Additive only: one new method with a default implementation |
| Existing VDB provider plugins | No changes required |
| ai_search submodule | No changes required; continues to use the existing path |
| Custom modules using AiVdbProviderInterface | No changes required |
New interfaces (AiVdbDirectSearchInterface, DrupalEntityDocumentMapperInterface) are opt-in. No existing code is forced to implement them.
10. Risk Register
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| symfony/ai-store API changes (marked experimental) | Medium | Medium | Pin to ^0.9; wrap behind Drupal interfaces so symfony changes are isolated to bridge classes |
| Search API QueryInterface unavailable at runtime | Low | Medium | The vectorSearchDirect() path explicitly avoids it; Search API stays a require-dev dependency |
| Performance regression from sequential vectorization | Low | Low | Add batch path in DrupalAiVectorizerAdapter once providers expose batch embeddings; cache layer in VectorizationService covers repeated requests |
| ID collisions in chunked documents | Low | Medium | Use deterministic suffix scheme (<id>#chunk-<index>) and document it; mapper validates uniqueness |
| Metadata overflow in VDB providers that impose field-size limits | Low | Low | Mapper truncates oversized metadata fields with a configurable max length |
11. Open Questions
- Should ai.indexing_pipeline replace ai_search's indexing path immediately, or coexist?
  Recommendation: coexist in the initial release. Mark the ai_search path as "will migrate to ai.indexing_pipeline in a future minor" in a @deprecated notice.
- Should SymfonyStoreBridgePlugin ship in the core ai module or a separate ai_symfony_store submodule?
  A submodule keeps the core module lean and avoids forcing symfony store adapters on all installations. Recommended: modules/ai_symfony_store/.
- Event naming for indexing/retrieval hooks: should they mirror symfony's PreQueryEvent / PostQueryEvent or follow the Drupal AI module's existing event naming conventions?
  Recommendation: follow Drupal conventions for Drupal-dispatched events, but document the symfony event counterparts for bridge consumers.
- How should the ai.indexing_pipeline handle re-indexing (update vs insert)?
  VDB providers differ; recommend exposing an $upsert = true option that calls deleteFromCollection() before insertIntoCollection() when the document already exists, matching current ai_search behavior.
- Caching strategy for VectorizationService: caching by content hash is correct for static content, but dynamic content (e.g. user-generated text) should bypass the cache. Expose a $cacheable flag in the API.
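For the re-indexing question above, the delete-then-insert behaviour behind an $upsert option might look like this sketch. InMemoryCollection and indexRecord() are hypothetical stand-ins written for the illustration; a real implementation would call deleteFromCollection() and insertIntoCollection() on the VDB provider instead.

```php
<?php
declare(strict_types=1);

/** In-memory stand-in for a VDB collection; not the real provider API. */
final class InMemoryCollection {
  /** @var array<string, array> Records keyed by document ID. */
  private array $records = [];

  public function has(string $id): bool {
    return isset($this->records[$id]);
  }

  public function insert(string $id, array $record): void {
    // Mimics a backend that rejects duplicate IDs on plain insert.
    if ($this->has($id)) {
      throw new RuntimeException("Duplicate ID: $id");
    }
    $this->records[$id] = $record;
  }

  public function delete(string $id): void {
    unset($this->records[$id]);
  }

  public function get(string $id): ?array {
    return $this->records[$id] ?? NULL;
  }

  public function count(): int {
    return count($this->records);
  }
}

/** Inserts a record, deleting any existing record with the same ID when $upsert is set. */
function indexRecord(InMemoryCollection $collection, string $id, array $record, bool $upsert = TRUE): void {
  if ($upsert && $collection->has($id)) {
    $collection->delete($id);
  }
  $collection->insert($id, $record);
}

$collection = new InMemoryCollection();
indexRecord($collection, 'node/1/en', ['title' => 'First revision']);
indexRecord($collection, 'node/1/en', ['title' => 'Second revision']);
```

With $upsert left at its default, indexing the same ID twice leaves a single record holding the latest content; with $upsert = FALSE the duplicate surfaces as an exception instead of being silently replaced.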
AI Compliance
Check the one that best describes your usage, or leave all unchecked if AI was not significantly used.
- AI Assisted Code: Mainly written by a human; AI used for autocomplete or partial generation under full human supervision.
- AI Generated Code: Mainly generated by AI, reviewed and approved by a human before this MR was created.
- Vibe Coded: Generated by AI and only functionally reviewed before this MR was created.