[Tool] Create XML reader
>>> [!note] Migrated issue
<!-- Drupal.org comment -->
<!-- Migrated from issue #3554730. -->
Reported by: [marcus_johansson](https://www.drupal.org/user/385947)
>>>

<h3 id="summary-problem-motivation">Problem/Motivation</h3>
<p>You need to be able to read and extract structured data from XML files. Since XML is a well-established format with native PHP support, this can be natively supported and included as part of the core packages.</p>
<p>In terms of naming, the tool could be either "Read XML Data" or "Load XML Data". Following the same logic as CSV, JSON, and YAML, since it is a structured format, "Load XML Data" would be the most consistent choice.</p>
<p>The inputs for this extractor should be:</p>
<ul>
<li>File Source (binary, required). The XML file being loaded. How to load this is TBD. Passing an actual serialized XML string should also be possible.</li>
<li>XPath (string, default null). An optional XPath expression that determines which elements to extract from the XML document. If null, extract from the document root.</li>
<li>Message Format (enum, default xml). The return format for the message. It can also be json or yaml, which is mainly useful for AI or output standardization.</li>
</ul>
<p>The string output is the XML as-is unless a different message format is specified.</p>
<p>If XPath is used, the root starts from the extracted elements, not including the XPath node itself.</p>
<p>As for checkAccess, if the file loading happens before this tool, nothing should really need to be checked. If the file is loaded from a uri, url, fid, etc. in this tool, the checks have to be:</p>
<ol>
<li>fid - check access to the file entity.</li>
<li>uri - if it&rsquo;s managed by Drupal, check access to the file entity, otherwise ensure the file is reachable.</li>
<li>url - check that the file can be reached.</li>
<li>data - no checks needed.</li>
</ol>
<h3 id="summary-proposed-resolution">Proposed resolution</h3>
<ul>
<li>If the tool_utilities submodule does not exist yet, create the metadata for it.</li>
<li>Create the "Load XML Data" tool according to the instructions above.</li>
<li>Implement support for an optional XPath extractor that allows targeted extraction of specific elements or attributes within the XML data.</li>
<li>Write unit tests to ensure the output matches what is expected for the given input, for both full-document and XPath-based extractions.</li>
<li>Add validation to handle malformed XML and ensure consistent conversion to PHP arrays or serialized formats.</li>
<li>Require and use PHP SimpleXML serialization/unserialization utilities.</li>
</ul>
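<p>To make the intended behaviour easier to discuss, here is a minimal, language-neutral sketch in Python. It is not the proposed PHP/SimpleXML implementation. The names <code>load_xml_data</code> and <code>element_to_dict</code>, the <code>@attributes</code>/<code>#text</code> keys in the JSON shape, and the choice to return the matched elements themselves for an XPath are all assumptions made for illustration. It only accepts an XML string (the "data" case, so no access checks), supports the limited XPath subset of Python's standard library, and leaves out the yaml format because the standard library has no YAML encoder.</p>

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an element to a plain dict (hypothetical shape, not a spec)."""
    node = {}
    if element.attrib:
        node["@attributes"] = dict(element.attrib)
    text = (element.text or "").strip()
    if text:
        node["#text"] = text
    # Repeated child tags are collected into lists so nothing is overwritten.
    for child in element:
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node


def load_xml_data(source, xpath=None, message_format="xml"):
    """Parse an XML string and return the selection in the requested format."""
    try:
        root = ET.fromstring(source)
    except ET.ParseError as error:
        # Malformed XML is reported as a validation error instead of leaking
        # the parser exception to the caller.
        raise ValueError(f"Malformed XML: {error}") from error

    # A null XPath means extraction starts at the document root.
    elements = [root] if xpath is None else root.findall(xpath)

    if message_format == "xml":
        return "".join(ET.tostring(el, encoding="unicode") for el in elements)
    if message_format == "json":
        return json.dumps([{el.tag: element_to_dict(el)} for el in elements])
    raise ValueError(f"Unsupported message format: {message_format}")


SAMPLE = (
    '<catalog>'
    '<book id="1"><title>Dune</title></book>'
    '<book id="2"><title>Emma</title></book>'
    '</catalog>'
)

if __name__ == "__main__":
    print(load_xml_data(SAMPLE, "./book/title"))
    print(load_xml_data(SAMPLE, "./book", "json"))
```

<p>In this sketch a malformed document such as <code>&lt;a&gt;&lt;b&gt;&lt;/a&gt;</code> raises a <code>ValueError</code>, which roughly corresponds to the validation step listed in the proposed resolution. The sketch does not settle whether an XPath match should return the matched elements or only their children.</p>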
issue