fix: #3591878 Lock-handling audit: TTL renewal + deadline-loop correctness

Fix for #3591878

Two related fixes to the module's lock-handling, surfaced by the lock-inventory pass that followed #3591871.

Issue A: chain-write lock TTL can expire mid-loop

ChainArchiver::archiveSegment() (via writeNdjson()) and ChainArchiver::restoreLocked() both hold the chain-write lock for a data-dependent duration:

  • archiveSegment writes one NDJSON line per row in the segment range plus hashes the file.
  • restoreLocked re-INSERTs one row per archive line back into the live table.

Multi-million-row segments can exceed the 30s lock TTL. The lock then expires and a concurrent logger write on the same chain can claim it mid-loop. Chain integrity is still protected at the DB level by UNIQUE(chain, previous_hash), but the operator-visible "this critical section ran atomically against the chain head" guarantee degrades silently.

Fix

Mirrors the pattern shipped for the coverage lock in #3591871:

  • New constants CHAIN_WRITE_LOCK_TTL_S (30s) and CHAIN_WRITE_LOCK_RENEW_INTERVAL_S (10s, 1/3 of TTL so two consecutive missed renews still leave the lock alive).
  • New helper renewChainWriteLockIfStale() -- same shape as renewCoverageLockIfStale(). Uses Drupal's same-holder LockBackendInterface::acquire() to extend the semaphore.expire column.
  • Wired into writeNdjson()'s per-row + per-ack loops (signature gains $lock_name + $renewed_at parameters).
  • Wired into restoreLocked()'s per-line replay loop (signature gains $lock_name; $renewed_at is local).
  • The helper throws if the lock was lost to another holder between renews, so the critical section bails rather than continuing without the guarantee.

Issue B: AuditTrailChainWriter::writeChainedRow() deadline short-circuit

Same bug shape that was fixed in ChainArchiver::acquireChainWriteLock() as part of #3591871. The retry loop's

if (!$this->lock->wait($lock_name, 1) && microtime(TRUE) >= $deadline) {

shape short-circuits the deadline check whenever wait() returns TRUE (lock observed still held). Under sustained contention by a foreign holder, the loop spins forever instead of dropping the write and bumping the drop counter after 5s.

Fix

Separates the deadline check from the wait back-off. Identical pattern to the ChainArchiver side.

Test coverage notes

  • Issue A: renewChainWriteLockIfStale() is a copy-paste shape of renewCoverageLockIfStale() which already has happy-path + throw-path unit tests in #3591871. The wiring into archiveSegment() + restoreLocked() is exercised integration-wise by every existing archive + restore test (which all pass against this MR). A dedicated unit test for the renewer-on-stale path would need either time mocking or a 10-second sleep.
  • Issue B: the existing LockContentionDropCounterTest uses an AlwaysContendedLockBackend whose wait() returns FALSE, which exercises the !wait() = TRUE branch where the bug did NOT manifest. A dedicated test for the bug-fix path (wait() returns TRUE) would need a custom stub with bounded loop iterations to keep wall-clock deterministic. The existing test still passes against the fix.

Both test gaps are open as follow-ups if dedicated coverage is preferred.

Verification

  • Full kernel suite: 232 passing, 985 assertions.
  • phpcs clean on touched files (the 2 pre-existing AuditTrailSettingsForm non-capturing-catch errors are out of scope).
  • FF-merges cleanly against 1.x.
Edited by Frank Mably

Merge request reports

Loading