Issue #571548: Prevent Views generating SQL aliases which are too-long or duplicates
Closes #571548
Use safe SQL aliases.
We limit the length of the original alias (to a default maximum of 60 characters), incorporating a hash of the original value for uniqueness if that value was too long to use verbatim.
This prevents subsequent truncation from creating duplicate aliases in cases where two or more aliases are identical up to the point of truncation. This happens particularly easily with relationships, where alias names may be built from several concatenated identifiers.
Merge request reports
Activity
- Resolved by jweowu
added 1 commit
- 7a1ef666 - Add type-hinting for the $maxlength argument
added 1 commit
- 2e661f89 - fixup! Add type-hinting for the $maxlength argument
marked this merge request as draft from issue/drupal-571548@2e661f89
added 1 commit
- 4a26e63e - Add type-hinting for the $maxlength argument
added 1 commit
- c9ca7d27 - Add type-hinting for the $maxlength argument
- Resolved by jweowu
- Resolved by jweowu
added 1 commit
- efb734a5 - Add mixed type-hinting for the $alias argument
- Resolved by jweowu
- Resolved by jweowu
592 * 593 * @param mixed $alias 594 * The alias to sanitize. 595 * @param int $maxlength 596 * The maximum number of characters permitted in an alias. Defaults to 60. 597 * 598 * @return mixed 599 * The sanitized alias string (if a string argument was supplied; 600 * non-strings are returned unmodified). 601 */ 602 public static function sanitizeAlias(mixed $alias, int $maxlength = 60): mixed { 603 if (!is_string($alias)) { 604 return $alias; 605 } 606 if (strlen($alias) > $maxlength) { 607 $checksum = 'c' . crc32($alias); I think my expectation was also that a function which was designed to return a shorter hash might well be less likely to produce a collision than truncation of longer hashes (which might turn out to have common prefixes). I don't know whether that's actually the case, though (and I imagine the chances are vanishingly small in either case, so I feel like this concern is probably only theoretical).
I didn't see an obvious reason not to go with
crc32
here, but I'll happily defer to someone who understands the maths better than myself.
596 * The maximum number of characters permitted in an alias. Defaults to 60. 597 * 598 * @return mixed 599 * The sanitized alias string (if a string argument was supplied; 600 * non-strings are returned unmodified). 601 */ 602 public static function sanitizeAlias(mixed $alias, int $maxlength = 60): mixed { 603 if (!is_string($alias)) { 604 return $alias; 605 } 606 if (strlen($alias) > $maxlength) { 607 $checksum = 'c' . crc32($alias); 608 $pos = strlen($checksum) + strlen($alias) - $maxlength; 609 $alias = $checksum . substr($alias, $pos); 610 } 611 return strtolower(substr($alias, 0, $maxlength)); In fact the lower-case-ness was me being consistent with the old code, which did this:
// PostgreSQL truncates aliases to 63 characters: // https://www.drupal.org/node/571548. // We limit the length of the original alias up to 60 characters // to get a unique alias later if its have duplicates $alias = strtolower(substr($alias, 0, 60));
So as aliases are forced to lower-case in current releases, I think it's sensible to maintain that.
411 412 while (!empty($this->relationships[$alias])) { 412 413 $alias = $alias_base . '_' . $count++; 413 414 } I'd need to examine the code again, but from memory I think that was pre-existing code which protects against the generation of identical aliases which will be used for different purposes.
The sanitizer doesn't help in that situation (its job is not to return a different alias every time, but rather to return a safe version of what it's given -- and always the same output for a given input), so we probably still need this code.
(Of course if those 'uniquified' aliases are too long, the sanitizer will produce a unique shorter alias for each of them.)
Edited by jweowuYes, that code is to ensure a NEW relationship alias is generated, so it needs to be different to anything already known.
It does look like there's a bug here, though...
The code does work if we call
addRelationship()
twice with the same short alias (edit: make that a short lower-case alias): the sanitizer won't modify that alias, and therefore it will exist as a key in$this->relationships
the second time we call the function, and we'll notice that and add a unique numeric suffix to make it different.If we call
addRelationship()
twice with the same LONG alias (or one with upper-case letters) then we'll end up with a conflict, because when we assign$this->relationships[$alias]
a few lines later we've sanitized$alias
to a shorter value. So the next time we calladdRelationship()
with the original/long would-be-duplicate alias it won't find it in$this->relationships
, and hence it will end up with the same sanitized value as the first call.Edited by jweowuNote that we do still NEED to sanitize the alias AFTER the suffix has been added, because the addition of the suffix might be the thing which pushes the length of the alias over the limit.
If I'm not mis-reading the situation, I suspect we want:
// Make sure $alias isn't already used. $alias_base = $alias; $count = 1; $alias = $this->sanitizeAlias($alias_base); while (!empty($this->relationships[$alias])) { $alias = $this->sanitizeAlias($alias_base . '_' . $count++); }
Could someone please review this bug/fix, in case I'm missing some reason why the code was actually ok? (E.g. if the un-sanitized alias ends up as a key in that array as well as the sanitized version.)
Edited by jweowu
- Resolved by jweowu
573 578 return $alias; 574 579 } 575 580 581 /** 582 * Produce a safe alias value. 583 * 584 * We limit the length of an alias string (to a default maximum of 60 585 * characters), incorporating a hash of the original value for uniqueness 586 * if that value was too long to use verbatim. 587 * 588 * This prevents subsequent truncation from creating duplicate aliases 589 * in cases where two or more aliases are identical up to the point of 590 * truncation. This happens particularly easily with relationships, 591 * where alias names may be built from several concatenated identifiers. 592 * 593 * @param mixed $alias