fix: #3591878 Lock-handling audit: TTL renewal + deadline-loop correctness
Fix for #3591878
Two related fixes to the module's lock-handling, surfaced by the lock-inventory pass that followed #3591871.
Issue A: chain-write lock TTL can expire mid-loop
ChainArchiver::archiveSegment() (via writeNdjson()) and ChainArchiver::restoreLocked() both hold the chain-write lock for a data-dependent duration:
archiveSegmentwrites one NDJSON line per row in the segment range plus hashes the file.restoreLockedre-INSERTs one row per archive line back into the live table.
Multi-million-row segments can exceed the 30s lock TTL. The lock then expires and a concurrent logger write on the same chain can claim it mid-loop. Chain integrity is still protected at the DB level by UNIQUE(chain, previous_hash), but the operator-visible "this critical section ran atomically against the chain head" guarantee degrades silently.
Fix
Mirrors the pattern shipped for the coverage lock in #3591871:
- New constants
CHAIN_WRITE_LOCK_TTL_S(30s) andCHAIN_WRITE_LOCK_RENEW_INTERVAL_S(10s, 1/3 of TTL so two consecutive missed renews still leave the lock alive). - New helper
renewChainWriteLockIfStale()-- same shape asrenewCoverageLockIfStale(). Uses Drupal's same-holderLockBackendInterface::acquire()to extend thesemaphore.expirecolumn. - Wired into
writeNdjson()'s per-row + per-ack loops (signature gains$lock_name+$renewed_atparameters). - Wired into
restoreLocked()'s per-line replay loop (signature gains$lock_name;$renewed_atis local). - The helper throws if the lock was lost to another holder between renews, so the critical section bails rather than continuing without the guarantee.
Issue B: AuditTrailChainWriter::writeChainedRow() deadline short-circuit
Same bug shape that was fixed in ChainArchiver::acquireChainWriteLock() as part of #3591871. The retry loop's
if (!$this->lock->wait($lock_name, 1) && microtime(TRUE) >= $deadline) {shape short-circuits the deadline check whenever wait() returns TRUE (lock observed still held). Under sustained contention by a foreign holder, the loop spins forever instead of dropping the write and bumping the drop counter after 5s.
Fix
Separates the deadline check from the wait back-off. Identical pattern to the ChainArchiver side.
Test coverage notes
- Issue A:
renewChainWriteLockIfStale()is a copy-paste shape ofrenewCoverageLockIfStale()which already has happy-path + throw-path unit tests in #3591871. The wiring intoarchiveSegment()+restoreLocked()is exercised integration-wise by every existing archive + restore test (which all pass against this MR). A dedicated unit test for the renewer-on-stale path would need either time mocking or a 10-second sleep. - Issue B: the existing
LockContentionDropCounterTestuses anAlwaysContendedLockBackendwhosewait()returns FALSE, which exercises the!wait() = TRUEbranch where the bug did NOT manifest. A dedicated test for the bug-fix path (wait()returns TRUE) would need a custom stub with bounded loop iterations to keep wall-clock deterministic. The existing test still passes against the fix.
Both test gaps are open as follow-ups if dedicated coverage is preferred.
Verification
- Full kernel suite: 232 passing, 985 assertions.
- phpcs clean on touched files (the 2 pre-existing
AuditTrailSettingsFormnon-capturing-catch errors are out of scope). - FF-merges cleanly against
1.x.