Skip to content
Snippets Groups Projects

Issue #3162496: Support skipping field mappings where source is missing

Open Issue #3162496: Support skipping field mappings where source is missing
8 open threads

Allow feeds to skip fields or field columns when a source is missing from the feed.

Edited by Eric Smith

Merge request reports

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Eric Smith added 1 commit

    added 1 commit

    • 133bedd3 - Issue #3162496: Fix issue where file and image fields do not partially update...

    Compare with previous version

  • Eric Smith resolved all threads

    resolved all threads

  • Eric Smith added 1 commit

    added 1 commit

    • 8fa75e8e - tsy-cass-d7-upgrade#649: Do not rely on NULL - needs a test but something is...

    Compare with previous version

  • 1328 $skipped_mapping_deltas = [];
    1329 $incomplete_mapping_deltas = [];
    1330
    1331 // Mappers add to existing fields rather than replacing them unless partial
    1332 // updates are allowed. Hence we need to clear target elements of each item
    1333 // before mapping in case we are mapping on a prepopulated item such as an
    1334 // existing node.
    1330 1335 foreach ($mappings as $delta => $mapping) {
    1331 1336 if ($mapping['target'] == 'feeds_item') {
    1332 1337 // Skip feeds item as this field gets default values before mapping.
    1333 1338 continue;
    1334 1339 }
    1335 1340
    1341 if ($missing_source !== static::MISSING_SOURCE_REQUIRE_NONE) {
    1342 $source_is_missing_from_item = function ($source) use ($item) {
    1343 return $item->get($source) === NULL && !array_key_exists($source, $item->toArray());
    • Moving this logic to the item would be cleaner - e.g. having a $item->has($source) method we can call to properly determine having a value vs having a NULL value. Trying to keep number of changes smaller at this stage while fleshing out the idea.

      Edited by Eric Smith
    • Yes, adding a $item->has() method sounds indeed like a better idea.

    • Some Item classes, like SyndicationItem, have class properties defined. Would for these $item->has() always return TRUE if a property on the class is requested? I would think so. For example $item->has('enclosures') would then always return TRUE, because $enclosures is defined in the class - even if no enclosures was defined in the RSS feed.

      Maybe something to think about later? I understood that the feature is focussed on CSV currently.

    • Eric Smith changed this line in version 9 of the diff

      changed this line in version 9 of the diff

    • Please register or sign in to reply
  • 1354 1392 }
    1355 1393
    1356 1394 $value = $item->get($source);
    1357 if (!is_array($value)) {
    1395 if ($value === NULL && !array_key_exists($source, $item->toArray()) && isset($target_values[$delta])) {
  • 293 293 ],
    294 294 ];
    295 295
    296 $form['advanced']['missing_source'] = [
    297 '#type' => 'radios',
    298 '#title' => $this->t('When a mapped source is missing from the feed'),
    299 '#options' => [
    300 ProcessorInterface::MISSING_SOURCE_REQUIRE_NONE => $this->t('Update all mapped fields, treat the missing sources as NULL'),
    301 ProcessorInterface::MISSING_SOURCE_REQUIRE_ALL => $this->t('Update fields where the field mapping has all sources provided'),
    302 ProcessorInterface::MISSING_SOURCE_REQUIRE_ANY => $this->t('Update fields where the field mapping has any source provided'),
    303 ],
    304 '#description' => $this->t("When a field mapping is composed of multiple sources, e.g. for a <em>Text (formatted, long, with summary)</em> field where both the <em>Text</em> and <em>Summary</em> targets are mapped, 'Update fields where the field mapping has all sources provided' will not update the field if either of the source values are missing. Use 'Update fields where the field mapping has any source provided' to merge the existing field column values with supplied field columns from the source.<br><strong>Warning: This may have unexpected consequences for multi value fields.</strong>"),
    • Comment on lines +299 to +304

      I wonder if we could word this better. The first option I understood, that looks like to be current behavior. On the second and third option it took me a while to understand what it meant. My first thought after reading option 2 was that all mapped fields must exist. Later on I understood this is actually about all mapped subfields/properties of a single target.

      The 'this may have unexpected consequences for multivalue fields' sounds like we would need test coverage to resolve those unexpected things.

    • Agree re label - I find it very difficult to convey in words what the two options are. Multiple subfields/properties is also not common for field mappings - so I wasn't sure even what language people would use to describe them.

      Something like

      Update all mapped fields, treat the missing sources as NULL
      Update mapped fields where the field mapping source is provided, excluding partially complete sources for multi target mappings
      Update mapped fields where the field mapping source is provided, including partially complete sources for multi target mappings

      Even with better help text still seems incredibly verbose.

      Maybe splitting it into two different checkboxes (with the second only shown if the first is selected) is be better? Something like

      • Only update fields that are provided in the source
      • If a field mapping has multiple targets and at least one of those sources are missing, treat the field as missing

      Something like might be clearer - either way you are right the help text needs improvements.

      The scary warning about unexpected consequences for multivalue fields - need to add test coverage, but for example if we have a multi cardinality field with 3 existing values for foo - and the field mapping was comprised of multiple targets foo:one, foo:two - what happens if only 2 values are provided for foo:one. I don't even know what I would expected - any outcome feels unexpected :sweat_smile:

      Edited by Eric Smith
    • Eric Smith changed this line in version 13 of the diff

      changed this line in version 13 of the diff

    • After giving this a lot more thought, I've gone a bit nuclear and removed having this a 2 different options in a4bcce19

      The more I thought about it, the less I could see a point. Either the source is missing and we skip it, or we don't. It shouldn't matter if that mapping is for a field or a field property, the user can expect the same outcome.

      This reduces the complexity and I hope makes it easier for the user to understand.

      If there is a use case to bring it back we can revert a4bcce19 and try improve it more, but otherwise it could also be tackled as a follow up.

    • Please register or sign in to reply
  • 1347 if ($number_of_missing_sources > 0) {
    1348 if ($missing_source === static::MISSING_SOURCE_REQUIRE_ANY && $number_of_missing_sources !== count($mapping['map'])) {
    1349 // Track that this field mapping is incomplete so that we can merge
    1350 // the existing field columns with provided columns.
    1351 $incomplete_mapping_deltas[] = $delta;
    1352 }
    1353 else {
    1354 // Track that this field mapping should be skipped completely.
    1355 $skipped_mapping_deltas[] = $delta;
    1356 }
    1357
    1358 // Do not clear the target as it will either be skipped or merged.
    1359 continue;
    1360 }
    1361 }
    1362
    • Comment on lines +1341 to +1362

      I have not studied the above completely yet, but perhaps it would be good to move it to a new method, so we can write a unit test for it.

    • Have done so - its a slightly clunky method returning 3 arrays, but it now has a unit test at least. End of day for me, can be tidied up later

      Refactored further, it now is a method that groups the mappings with a unit test

      Edited by Eric Smith
    • Please register or sign in to reply
  • Eric Smith added 20 commits

    added 20 commits

    Compare with previous version

  • Eric Smith added 1 commit

    added 1 commit

    Compare with previous version

  • Eric Smith changed title from Issue #3162496: Add rough idea for testing update to Issue #3162496: Support skipping field mappings where source is missing

    changed title from Issue #3162496: Add rough idea for testing update to Issue #3162496: Support skipping field mappings where source is missing

  • Eric Smith added 1 commit

    added 1 commit

    Compare with previous version

  • Eric Smith added 1 commit

    added 1 commit

    • e00eed1c - Issue #3162496: Add test to FileTest to cover when reference_by expects a...

    Compare with previous version

  • Eric Smith added 1 commit

    added 1 commit

    • a4bcce19 - Issue #3162496: Reduce complication by removing the 'all' option. Sources are...

    Compare with previous version

  • 1459 protected function groupMappingsByAction(array $mappings, ItemInterface $item): array {
    1460 $skip_missing_source = $this->configuration['skip_missing_source'];
    1461
    1462 $grouped = [
    1463 self::FIELD_MAPPING_IMPORT => [],
    1464 self::FIELD_MAPPING_IMPORT_KEEP_MISSING_PROPERTIES => [],
    1465 self::FIELD_MAPPING_SKIP => [],
    1466 ];
    1467
    1468 // Import everything if we do not require source values to be present.
    1469 if (!$skip_missing_source) {
    1470 $grouped[self::FIELD_MAPPING_IMPORT] = $mappings;
    1471 return $grouped;
    1472 }
    1473
    1474 // @todo: Refactor to `has` method on item.
  • 293 293 ],
    294 294 ];
    295 295
    296 $form['advanced']['skip_missing_source'] = [
    297 '#type' => 'checkbox',
    298 '#title' => $this->t('Only process mappings found in the source'),
    299 '#description' => $this->t('This will avoid wiping existing target data if a source is missing for a mapping. If a mapping is composed of multiple properties this will update only the properties that have a source provided. E.g. for a <em>Text (formatted, long, with summary)</em> field where both the <em>Text</em> and <em>Summary</em> properties are mapped but only the <em>Summary</em> source is provided then the existing value for <em>Text</em> target will be retained.'),
    300 '#default_value' => $this->plugin->getConfiguration('skip_missing'),
  • 293 293 ],
    294 294 ];
    295 295
    296 $form['advanced']['skip_missing_source'] = [
    297 '#type' => 'checkbox',
    298 '#title' => $this->t('Only process mappings found in the source'),
    299 '#description' => $this->t('This will avoid wiping existing target data if a source is missing for a mapping. If a mapping is composed of multiple properties this will update only the properties that have a source provided. E.g. for a <em>Text (formatted, long, with summary)</em> field where both the <em>Text</em> and <em>Summary</em> properties are mapped but only the <em>Summary</em> source is provided then the existing value for <em>Text</em> target will be retained.'),
    300 '#default_value' => $this->plugin->getConfiguration('skip_missing'),
    301 '#parents' => ['processor_configuration', 'skip_missing'],
    302 '#states' => [
    303 'visible' => [
    304 'input[name="processor_configuration[skip_missing_source]"]' => [
  • this decides the field that dictates the visibility of skip_missing_source.

  • Max @max-ar started a thread on the diff
  • 1480
    1481 $grouped = [
    1482 self::FIELD_MAPPING_IMPORT => [],
    1483 self::FIELD_MAPPING_IMPORT_KEEP_MISSING_PROPERTIES => [],
    1484 self::FIELD_MAPPING_SKIP => [],
    1485 ];
    1486
    1487 // Import everything if we do not require source values to be present.
    1488 if (!$skip_missing_source) {
    1489 $grouped[self::FIELD_MAPPING_IMPORT] = $mappings;
    1490 return $grouped;
    1491 }
    1492
    1493 // @todo: Refactor to `has` method on item.
    1494 $source_is_missing_from_item = function ($source) use ($item) {
    1495 return $item->get($source) === NULL && !array_key_exists($source, $item->toArray());
    • Hey guys!

      We were suffering from the linked issue. We were importing with a JSON source and the $source var when dumped looks like;

      [
        "json_key" => []
        "json_key_1" => "value"
        "json_key_2" => 1234
      ]

      I didn't test anything with CSV, but if we have a JSON object we are importing that is like so;

      {
        "property_1": "hello",
        "property_2": "world",
      }

      and we have a mapping set for property_3 in feeds config, the $source variable will look like this;

      [
        "property_1" => "hello"
        "property_2" => "world"
        "property_3" => []
      ]

      The empty list (is rightly) treated as a value that should be imported. Meaning data in the property_3 gets cleared out. However, somewhere upstream instead of setting null on the JSON it is setting [] for source data that is missing, not triggering this check.

      Fortunately the project I am working on treats an empty list from source as a value that should be ignored/not imported. So my fix for this using a JSON source was to change this line to;

      $item->get($source) === NULL || $item->get($source) === [];

      It's a bit rough but it works for our use case.

    • Please register or sign in to reply
  • Eric Smith added 40 commits

    added 40 commits

    Compare with previous version

  • Eric Smith added 26 commits

    added 26 commits

    Compare with previous version

  • Please register or sign in to reply
    Loading