Issue #3591045: Strip tags before decoding entities in Xls::formatValue()
Summary
When Strip HTML is enabled, Xls::formatValue() previously decoded HTML entities first and stripped tags afterwards. Content containing entity-encoded literal < or > characters (e.g. low pressure, &lt;1 MPa from a formatted text field) was decoded to a real <, and strip_tags() then consumed the rest of the string as if it were an unfinished HTML tag, silently truncating the cell to "low pressure, ".
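The truncation can be reproduced outside Drupal with a minimal sketch. It assumes that Html::decodeEntities() behaves like PHP's html_entity_decode() with ENT_QUOTES | ENT_HTML5 and UTF-8; decodeThenStrip() is a hypothetical name for the old order of operations, not a function from the module.

```php
<?php
// Hypothetical stand-in for the OLD formatValue() order: decode, then strip.
// html_entity_decode() is assumed to approximate Html::decodeEntities().
function decodeThenStrip(string $value): string {
  $value = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');
  return strip_tags($value);
}

$input = '<p>Buffer hydrogen gas holder: low pressure, &lt;1 MPa</p>';

// Decoding first produces a literal "<1", which strip_tags() treats as the
// start of a tag, so the text from that point on is dropped.
var_dump(decodeThenStrip($input));
```

The dumped string should stop right after "low pressure, ", matching the truncation described in the report.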
Fix
Reverse the order of the two operations:
 protected function formatValue($value) {
   if ($this->stripTags) {
-    $value = Html::decodeEntities($value);
-    $value = strip_tags($value);
+    $value = strip_tags($value);
+    $value = Html::decodeEntities($value);
   }

After strip_tags(), entity-encoded brackets remain as &lt; / &gt; (so they do not trigger tag interpretation). Html::decodeEntities() then turns them back into literal characters for the final value.
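The two steps of the new order can be traced on the reported value. This is only a sketch under the assumption that html_entity_decode() with ENT_QUOTES | ENT_HTML5 approximates Html::decodeEntities(); the variable names are illustrative.

```php
<?php
$input = '<p>Buffer hydrogen gas holder: low pressure, &lt;1 MPa</p>';

// Step 1: strip real tags. "&lt;" contains no "<" character, so
// strip_tags() leaves the entity untouched.
$stripped = strip_tags($input);

// Step 2: decode entities (assumed stand-in for Html::decodeEntities()).
$decoded = html_entity_decode($stripped, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Dump the intermediate and the final value.
var_dump($stripped, $decoded);
```

The intermediate value still carries the entity, and only the final value contains the literal bracket.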
Scope
Single change in Xls::formatValue(). The fix transparently covers:
- Xls directly.
- Xlsx extends Xls and does not override formatValue().
- OpenSpoutXlsxEncoder extends Xlsx and does not override it either.
No behaviour change for inputs without entity-encoded brackets (the most common case): real HTML tags are still stripped, and entities like &amp; still decode correctly. The existing testFormatValue() assertions all still pass without modification.
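A quick way to sanity-check the "no behaviour change" claim is to run both orders over inputs that contain no entity-encoded brackets. The sketch below is not part of the patch: both function names and the sample strings are hypothetical, and html_entity_decode() is again assumed to stand in for Html::decodeEntities().

```php
<?php
// Old order (assumed approximation): decode entities, then strip tags.
function decodeThenStrip(string $value): string {
  return strip_tags(html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8'));
}

// New order (assumed approximation): strip tags, then decode entities.
function stripThenDecode(string $value): string {
  return html_entity_decode(strip_tags($value), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Illustrative samples without &lt; or &gt;.
$samples = [
  '<p>Fish &amp; chips</p>',
  '<strong>Total</strong>: 42 &euro;',
  'plain text',
];

foreach ($samples as $sample) {
  // Each line dumps true when both orders agree on the sample.
  var_dump(decodeThenStrip($sample) === stripThenDecode($sample));
}
```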
Test plan
Adds testFormatValueWithEncodedAngleBrackets() covering:
- The exact reported case: <p>Buffer hydrogen gas holder: low pressure, &lt;1 MPa</p> → Buffer hydrogen gas holder: low pressure, <1 MPa.
- &gt; equivalent: <p>5 &gt; 3 in absolute value</p> → 5 > 3 in absolute value.
- Entities adjacent to real tags: <p>If <em>x &lt; y</em> and <em>y &gt; z</em> then x &lt; z.</p> → If x < y and y > z then x < z.
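The three cases can also be checked in plain PHP, without PHPUnit or Drupal. This is a hedged sketch rather than the actual test method: stripThenDecode() is a hypothetical helper, and html_entity_decode() is assumed to approximate Html::decodeEntities().

```php
<?php
// Hypothetical helper mirroring the new order: strip tags, then decode.
function stripThenDecode(string $value): string {
  return html_entity_decode(strip_tags($value), ENT_QUOTES | ENT_HTML5, 'UTF-8');
}

// Input => expected output, taken from the test plan above.
$cases = [
  '<p>Buffer hydrogen gas holder: low pressure, &lt;1 MPa</p>'
    => 'Buffer hydrogen gas holder: low pressure, <1 MPa',
  '<p>5 &gt; 3 in absolute value</p>'
    => '5 > 3 in absolute value',
  '<p>If <em>x &lt; y</em> and <em>y &gt; z</em> then x &lt; z.</p>'
    => 'If x < y and y > z then x < z.',
];

foreach ($cases as $input => $expected) {
  $actual = stripThenDecode($input);
  // Throw instead of using assert(), which may be disabled by configuration.
  if ($actual !== $expected) {
    throw new RuntimeException("Mismatch for input: $input (got: $actual)");
  }
}
echo "all cases match\n";
```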
Locally: 6 tests, 29 assertions, OK. The pre-existing setAccessible() deprecation warning is unrelated.
phpcs --standard=Drupal,DrupalPractice clean.