Issue #3162496: Support skipping field mappings where source is missing
Allow feeds to skip fields or field columns when a source is missing from the feed.
Merge request reports
Activity
added 4 commits
- fc70cc04 - Issue #3162496: Add configuration to UI
- 02ddecad - Issue #3162496: Add logic for skipping clear & mapping on empty sources if configured
- 36d51e4d - Issue #3162496: Show current issue when a mapping comprises multiple targets
- 323c76c1 - Issue #3162496: Check that we can distinguish between an empty CSV column and a missing CSV column
Toggle commit listadded 1 commit
- Resolved by Eric Smith
added 1 commit
added 1 commit
1328 $skipped_mapping_deltas = []; 1329 $incomplete_mapping_deltas = []; 1330 1331 // Mappers add to existing fields rather than replacing them unless partial 1332 // updates are allowed. Hence we need to clear target elements of each item 1333 // before mapping in case we are mapping on a prepopulated item such as an 1334 // existing node. 1330 1335 foreach ($mappings as $delta => $mapping) { 1331 1336 if ($mapping['target'] == 'feeds_item') { 1332 1337 // Skip feeds item as this field gets default values before mapping. 1333 1338 continue; 1334 1339 } 1335 1340 1341 if ($missing_source !== static::MISSING_SOURCE_REQUIRE_NONE) { 1342 $source_is_missing_from_item = function ($source) use ($item) { 1343 return $item->get($source) === NULL && !array_key_exists($source, $item->toArray()); Moving this logic to the item would be cleaner - e.g. having a
$item->has($source)
method we can call to properly determine having a value vs having aNULL
value. Trying to keep number of changes smaller at this stage while fleshing out the idea.Edited by Eric SmithSome Item classes, like SyndicationItem, have class properties defined. Would for these
$item->has()
always returnTRUE
if a property on the class is requested? I would think so. For example$item->has('enclosures')
would then always returnTRUE
, because$enclosures
is defined in the class - even if no enclosures was defined in the RSS feed.Maybe something to think about later? I understood that the feature is focussed on CSV currently.
changed this line in version 9 of the diff
1354 1392 } 1355 1393 1356 1394 $value = $item->get($source); 1357 if (!is_array($value)) { 1395 if ($value === NULL && !array_key_exists($source, $item->toArray()) && isset($target_values[$delta])) { Yes. Maybe it is good to handle that in a separate issue first? To reduce the scope of this issue.
Edited by Youri van Koppen
293 293 ], 294 294 ]; 295 295 296 $form['advanced']['missing_source'] = [ 297 '#type' => 'radios', 298 '#title' => $this->t('When a mapped source is missing from the feed'), 299 '#options' => [ 300 ProcessorInterface::MISSING_SOURCE_REQUIRE_NONE => $this->t('Update all mapped fields, treat the missing sources as NULL'), 301 ProcessorInterface::MISSING_SOURCE_REQUIRE_ALL => $this->t('Update fields where the field mapping has all sources provided'), 302 ProcessorInterface::MISSING_SOURCE_REQUIRE_ANY => $this->t('Update fields where the field mapping has any source provided'), 303 ], 304 '#description' => $this->t("When a field mapping is composed of multiple sources, e.g. for a <em>Text (formatted, long, with summary)</em> field where both the <em>Text</em> and <em>Summary</em> targets are mapped, 'Update fields where the field mapping has all sources provided' will not update the field if either of the source values are missing. Use 'Update fields where the field mapping has any source provided' to merge the existing field column values with supplied field columns from the source.<br><strong>Warning: This may have unexpected consequences for multi value fields.</strong>"), - Comment on lines +299 to +304
I wonder if we could word this better. The first option I understood, that looks like to be current behavior. On the second and third option it took me a while to understand what it meant. My first thought after reading option 2 was that all mapped fields must exist. Later on I understood this is actually about all mapped subfields/properties of a single target.
The 'this may have unexpected consequences for multivalue fields' sounds like we would need test coverage to resolve those unexpected things.
Agree re label - I find it very difficult to convey in words what the two options are. Multiple subfields/properties is also not common for field mappings - so I wasn't sure even what language people would use to describe them.
Something like
Update all mapped fields, treat the missing sources as NULL Update mapped fields where the field mapping source is provided, excluding partially complete sources for multi target mappings Update mapped fields where the field mapping source is provided, including partially complete sources for multi target mappings
Even with better help text still seems incredibly verbose.
Maybe splitting it into two different checkboxes (with the second only shown if the first is selected) is be better? Something like
- Only update fields that are provided in the source
- If a field mapping has multiple targets and at least one of those sources are missing, treat the field as missing
Something like might be clearer - either way you are right the help text needs improvements.
The scary warning about unexpected consequences for multivalue fields - need to add test coverage, but for example if we have a multi cardinality field with 3 existing values for foo - and the field mapping was comprised of multiple targets foo:one, foo:two - what happens if only 2 values are provided for foo:one. I don't even know what I would expected - any outcome feels unexpected
Edited by Eric Smithchanged this line in version 13 of the diff
After giving this a lot more thought, I've gone a bit nuclear and removed having this a 2 different options in a4bcce19
The more I thought about it, the less I could see a point. Either the source is missing and we skip it, or we don't. It shouldn't matter if that mapping is for a field or a field property, the user can expect the same outcome.
This reduces the complexity and I hope makes it easier for the user to understand.
If there is a use case to bring it back we can revert a4bcce19 and try improve it more, but otherwise it could also be tackled as a follow up.
1347 if ($number_of_missing_sources > 0) { 1348 if ($missing_source === static::MISSING_SOURCE_REQUIRE_ANY && $number_of_missing_sources !== count($mapping['map'])) { 1349 // Track that this field mapping is incomplete so that we can merge 1350 // the existing field columns with provided columns. 1351 $incomplete_mapping_deltas[] = $delta; 1352 } 1353 else { 1354 // Track that this field mapping should be skipped completely. 1355 $skipped_mapping_deltas[] = $delta; 1356 } 1357 1358 // Do not clear the target as it will either be skipped or merged. 1359 continue; 1360 } 1361 } 1362 Have done so - its a slightly clunky method returning 3 arrays, but it now has a unit test at least. End of day for me, can be tidied up laterRefactored further, it now is a method that groups the mappings with a unit test
Edited by Eric Smith
added 20 commits
-
8fa75e8e...dc354618 - 6 commits from branch
project:8.x-3.x
- dc354618...e1281601 - 4 earlier commits
- d776719c - Issue #3162496: Check that we can distinguish between an empty CSV column and a missing CSV column
- cb91f192 - Issue #3162496: Add additional check for incomplete items.
- b2200cfd - Issue #3162496: Use single configuration option for the multiple field handling
- a1fe0a24 - Issue #3162496: Alernative approach for merging partially supplied fields
- 8fbd75a5 - Issue #3162496: Typo
- 38592208 - Issue #3162496: Update tests with additional context about the files, align...
- a8f9c814 - Issue #3162496: Fix outdated comment
- b654c6b4 - Issue #3162496: Fix issue where file and image fields do not partially update...
- 82ee920b - Issue #3162496: Do not rely on NULL - needs a test but something is happening...
- be1e1de3 - Issue #3162496: Refactor source completion check into a method that can be unit tested
Toggle commit list-
8fa75e8e...dc354618 - 6 commits from branch
added 1 commit
added 1 commit
added 1 commit
added 1 commit
1459 protected function groupMappingsByAction(array $mappings, ItemInterface $item): array { 1460 $skip_missing_source = $this->configuration['skip_missing_source']; 1461 1462 $grouped = [ 1463 self::FIELD_MAPPING_IMPORT => [], 1464 self::FIELD_MAPPING_IMPORT_KEEP_MISSING_PROPERTIES => [], 1465 self::FIELD_MAPPING_SKIP => [], 1466 ]; 1467 1468 // Import everything if we do not require source values to be present. 1469 if (!$skip_missing_source) { 1470 $grouped[self::FIELD_MAPPING_IMPORT] = $mappings; 1471 return $grouped; 1472 } 1473 1474 // @todo: Refactor to `has` method on item. 293 293 ], 294 294 ]; 295 295 296 $form['advanced']['skip_missing_source'] = [ 297 '#type' => 'checkbox', 298 '#title' => $this->t('Only process mappings found in the source'), 299 '#description' => $this->t('This will avoid wiping existing target data if a source is missing for a mapping. If a mapping is composed of multiple properties this will update only the properties that have a source provided. E.g. for a <em>Text (formatted, long, with summary)</em> field where both the <em>Text</em> and <em>Summary</em> properties are mapped but only the <em>Summary</em> source is provided then the existing value for <em>Text</em> target will be retained.'), 300 '#default_value' => $this->plugin->getConfiguration('skip_missing'), changed this line in version 14 of the diff
293 293 ], 294 294 ]; 295 295 296 $form['advanced']['skip_missing_source'] = [ 297 '#type' => 'checkbox', 298 '#title' => $this->t('Only process mappings found in the source'), 299 '#description' => $this->t('This will avoid wiping existing target data if a source is missing for a mapping. If a mapping is composed of multiple properties this will update only the properties that have a source provided. E.g. for a <em>Text (formatted, long, with summary)</em> field where both the <em>Text</em> and <em>Summary</em> properties are mapped but only the <em>Summary</em> source is provided then the existing value for <em>Text</em> target will be retained.'), 300 '#default_value' => $this->plugin->getConfiguration('skip_missing'), 301 '#parents' => ['processor_configuration', 'skip_missing'], 302 '#states' => [ 303 'visible' => [ 304 'input[name="processor_configuration[skip_missing_source]"]' => [ changed this line in version 14 of the diff
1480 1481 $grouped = [ 1482 self::FIELD_MAPPING_IMPORT => [], 1483 self::FIELD_MAPPING_IMPORT_KEEP_MISSING_PROPERTIES => [], 1484 self::FIELD_MAPPING_SKIP => [], 1485 ]; 1486 1487 // Import everything if we do not require source values to be present. 1488 if (!$skip_missing_source) { 1489 $grouped[self::FIELD_MAPPING_IMPORT] = $mappings; 1490 return $grouped; 1491 } 1492 1493 // @todo: Refactor to `has` method on item. 1494 $source_is_missing_from_item = function ($source) use ($item) { 1495 return $item->get($source) === NULL && !array_key_exists($source, $item->toArray()); Hey guys!
We were suffering from the linked issue. We were importing with a JSON source and the
$source
var when dumped looks like;[ "json_key" => [] "json_key_1" => "value" "json_key_2" => 1234 ]
I didn't test anything with CSV, but if we have a JSON object we are importing that is like so;
{ "property_1": "hello", "property_2": "world", }
and we have a mapping set for
property_3
in feeds config, the$source
variable will look like this;[ "property_1" => "hello" "property_2" => "world" "property_3" => [] ]
The empty list (is rightly) treated as a value that should be imported. Meaning data in the
property_3
gets cleared out. However, somewhere upstream instead of settingnull
on the JSON it is setting[]
for source data that is missing, not triggering this check.Fortunately the project I am working on treats an empty list from source as a value that should be ignored/not imported. So my fix for this using a JSON source was to change this line to;
$item->get($source) === NULL || $item->get($source) === [];
It's a bit rough but it works for our use case.
added 40 commits
-
a4bcce19...364322c4 - 22 commits from branch
project:8.x-3.x
- 364322c4...3d6ebb30 - 8 earlier commits
- 861c7393 - Issue #3162496: Typo
- c394eb4e - Issue #3162496: Update tests with additional context about the files, align...
- 832a661e - Issue #3162496: Fix outdated comment
- d2928f02 - Issue #3162496: Fix issue where file and image fields do not partially update...
- 49fcecff - Issue #3162496: Do not rely on NULL - needs a test but something is happening...
- 034cec84 - Issue #3162496: Refactor source completion check into a method that can be unit tested
- 5befd974 - Issue #3162496: Futher refactoring to make test more readable
- 2f61d61c - Issue #3162496: Add test to FileTest to cover when reference_by expects a...
- 999db6d1 - Issue #3162496: Reduce complication by removing the 'all' option. Sources are...
- 14acc947 - Issue #3162496: Implement fixes to the UI configuration form identified by stefanvaduva
Toggle commit list-
a4bcce19...364322c4 - 22 commits from branch
added 26 commits
-
14acc947...eb5147da - 7 commits from branch
project:8.x-3.x
- eb5147da...f996ece0 - 9 earlier commits
- 9cc36845 - Issue #3162496: Update tests with additional context about the files, align...
- 2845b804 - Issue #3162496: Fix outdated comment
- de1c8c62 - Issue #3162496: Fix issue where file and image fields do not partially update...
- 8f906806 - Issue #3162496: Do not rely on NULL - needs a test but something is happening...
- 6084c298 - Issue #3162496: Refactor source completion check into a method that can be unit tested
- 78d68434 - Issue #3162496: Futher refactoring to make test more readable
- 4027c431 - Issue #3162496: Add test to FileTest to cover when reference_by expects a...
- ebc6ac52 - Issue #3162496: Reduce complication by removing the 'all' option. Sources are...
- 360e439f - Issue #3162496: Implement fixes to the UI configuration form identified by stefanvaduva
- 2da64f0a - Issue #3162496: Fix test after constructor changes
Toggle commit list-
14acc947...eb5147da - 7 commits from branch