Using JSON Schema to define "input", "output" and "config"
>>> [!note] Migrated issue <!-- Drupal.org comment --> <!-- Migrated from issue #3554622. --> Reported by: [d34dman](https://www.drupal.org/user/751698) >>>
<h3 id="summary-problem-motivation">Problem/Motivation</h3>
<p><a href="https://json-schema.org/">JSON Schema</a> gives you a portable contract for each &ldquo;thing&rdquo; (task/node) and lets you auto-validate, auto-document, and even auto-generate forms. For an orchestration layer, a clean split into input, output, and config per node is intuitive and scales.</p>
<p>Why JSON Schema fits:</p>
<p>- Contracts at the edges: Each node publishes exactly what it consumes (input) and emits (output), so composition is safe.<br>
- Runtime safety: Validate at design-time (when building a workflow) and at run-time (before/after a node executes).<br>
- Rich metadata: Provides a standardized way to describe title, description, default, and examples.<br>
- UI for free: Based on the metadata and some additional keywords (enum, format, readOnly...), a user interface can be generated automatically (in PHP, JavaScript, or by a third-party service).<br>
- Extensible: Custom vocabulary via x-* annotations (akin to X- headers) for orchestration semantics (or third-party extensions) without breaking validators.</p>
<h3>Drupal integration considerations</h3>
<p>- Config management: Store node config in Drupal config entities; validate with JSON Schema at save time.<br>
- Typed Data: Map input/output to Drupal TypedData definitions for strong typing inside plugins (which keeps the current DX).<br>
- Form API bridge: Build a small adapter that converts JSON Schema annotations to FAPI arrays so editors get forms without bespoke code.
(Some modules may already be able to leverage this and skip defining their config form with the Form API.)</p>
<p>NOTE: This is heavily inspired by work done in the <a href="https://www.drupal.org/project/flowdrop">FlowDrop module</a>, where each "FlowDrop node" type is a Drupal plugin with three methods that return input, output, and config as JSON Schemas, and an execute method with the signature <code>Plugin::execute($input, $config): $output</code>.</p>
<h3>Validation strategy</h3>
<p>NOTE: To keep things simple, we can also start with a very simple subset of JSON Schema that does not use $ref.</p>
<p>1. Design-time: When composing a workflow, resolve all $refs and validate each step&rsquo;s config against its node schema. Also validate that x-bind targets exist (this can be a static check against the referenced nodes&rsquo; output schemas).</p>
<p>2. Pre-exec: Before running a step, resolve bindings into a concrete input object and validate it.</p>
<p>3. Post-exec: Validate the step&rsquo;s output to catch contract violations from buggy implementations.</p>
<p>4. Performance: Cache compiled schemas and reuse them.</p>
<h3>Security</h3>
<p>1. Mark secrets: @TODO (maybe store values in Drupal&rsquo;s Key module or a vault and only inject them at runtime? These also need to be masked in logs.)</p>
<p>2. Use format (e.g., uri, email, uuid) and pattern generously.</p>
<p>3. Handle unevaluatedProperties: @TODO</p>
<p>4. Handle long enums such as an "Entity References List": @TODO</p>
<h3>Pitfalls</h3>
<p>1. Binary/streaming payloads cannot be represented directly with JSON Schema, which only describes JSON values.</p>
<p>2. If you need strong typing in PHP, consider generating PHP DTOs from the schemas (this is being experimented with in FlowDrop, and it adds extra boilerplate that developers have to go through, i.e. bad DX).</p>
<h2>References</h2>
<p>- More about JSON Schema: <a href="https://json-schema.org/">https://json-schema.org/</a></p>
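<p>To make the pre-exec and post-exec steps of the validation strategy above more concrete, here is a minimal sketch in Python (chosen only for brevity; a Drupal implementation would be PHP). Everything in it is an assumption for illustration: the <code>SLUGIFY_NODE</code> definition, the <code>validate</code> and <code>run_node</code> helpers, and the hand-rolled validator, which covers only <code>type</code>, <code>required</code>, <code>properties</code> and <code>enum</code> with no $ref support. It is not FlowDrop's API and not a full JSON Schema implementation.</p>

```python
# Hypothetical node definition: one schema each for input, config and output.
# Only a tiny subset of JSON Schema is used here (no $ref).
SLUGIFY_NODE = {
    "input": {
        "type": "object",
        "required": ["text"],
        "properties": {"text": {"type": "string", "title": "Text"}},
    },
    "config": {
        "type": "object",
        "properties": {
            "separator": {"type": "string", "enum": ["-", "_"], "default": "-"},
        },
    },
    "output": {
        "type": "object",
        "required": ["slug"],
        "properties": {"slug": {"type": "string"}},
    },
}

# JSON Schema type names mapped to the Python types that represent them.
_TYPES = {
    "object": dict, "array": list, "string": str, "boolean": bool,
    "integer": int, "number": (int, float), "null": type(None),
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value is valid."""
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if isinstance(value, bool) and expected in ("integer", "number"):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: value not in enum")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: required property missing")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    return errors


def slugify(node_input, config):
    """Example node body, mirroring the execute($input, $config) signature."""
    separator = config.get("separator", "-")
    return {"slug": separator.join(node_input["text"].lower().split())}


def run_node(node, execute, node_input, config):
    """Validate input and config (pre-exec), run the node, validate output (post-exec)."""
    for label, value in (("input", node_input), ("config", config)):
        errors = validate(value, node[label])
        if errors:
            raise ValueError(f"{label} invalid: {errors}")
    output = execute(node_input, config)
    errors = validate(output, node["output"])
    if errors:
        raise ValueError(f"output invalid: {errors}")
    return output
```

<p>With these definitions, <code>run_node(SLUGIFY_NODE, slugify, {"text": "Hello JSON Schema"}, {"separator": "_"})</code> returns <code>{"slug": "hello_json_schema"}</code>, while an input without <code>text</code> raises a <code>ValueError</code> before the node body runs. In a real system the hand-written <code>validate</code> would presumably be replaced by an existing JSON Schema validator library, with compiled schemas cached as suggested in step 4.</p>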
issue