fix: #3591953 Close ack/segment bisection race in createBareSegment + acknowledgeChainReset

Summary

ChainArchiver::createBareSegment() ran its assertNoAcknowledgmentBoundaryConflict guard BEFORE acquiring the chain-write lock. Symmetrically, AuditTrailVerifier::acknowledgeChainReset() took NO chain-write lock at all AND had no segment-bisection guard. The combination opened a race window in either direction.

Closes #3591953.

Problem

Two paths mutate the related ack + segment state without serializing against each other:

  • src/Archive/ChainArchiver.php -- the ack-bisection guard ran above the acquireChainWriteLock call.
  • src/AuditTrailVerifier.php -- acknowledgeChainReset validates + INSERTs without any chain-write lock.

Race trace (createBareSegment side):

  1. Thread A: createBareSegment(chain, F, T). Guard scans audit_trail_acknowledgment, finds no bisecting ack, returns clean. Then blocks on acquireChainWriteLock.
  2. Thread B: acknowledgeChainReset(chain, Fa, Ta) where the new ack bisects [F, T]. INSERTs the ack (no lock held). Returns.
  3. Thread A unblocks, reads head_id + anchors, INSERTs the bare. Bare and ack now mutually bisect; downstream archive of the bare would split the ack across two NDJSON files and anchorsStillValid() would silently fail.

Symmetric race in the other direction (ack side): no guard existed at all, so an ack could land that bisects an existing segment with nothing to refuse it.

Fix

(1) createBareSegment() -- guard moved inside the lock

   public function createBareSegment(string $chain_id, int $from_id, int $to_id): int {
     // ... range validation stays outside the lock ...
-    $this->assertNoAcknowledgmentBoundaryConflict($chain_id, $from_id, $to_id);
     $lock_name = $this->acquireChainWriteLock($chain_id, 'declare a segment');
     try {
+      // Run the ack-bisection guard INSIDE the lock so a
+      // concurrent acknowledgeChainReset() (which now takes
+      // the same lock) cannot slip a bisecting INSERT between
+      // the guard scan and the bare INSERT.
+      $this->assertNoAcknowledgmentBoundaryConflict($chain_id, $from_id, $to_id);
       // ... rest unchanged ...
     }

(2) acknowledgeChainReset() -- take the lock + symmetric guard

  • Acquires audit_trail.write:<chain> via the retry-until-deadline pattern (inlined; will extract to a trait if a third caller appears).
  • Inside the lock: head-id read, anchor resolution, new segment-bisection guard, ack INSERT.
  • Symmetric guard scans audit_trail_segment and refuses an ack whose range bisects any existing segment. Same "fully disjoint OR fully contained" invariant as the existing reverse-direction guard, mirror-image error message.

Lock-name match between createBareSegment and acknowledgeChainReset serializes the two paths against each other, so the in-lock guards see a consistent snapshot.

Tests

Three race-window regression tests added to ChainArchiverTest:

  • testCreateBareSegmentClosesAckBisectionRaceWindow -- new test-only CallbackOnAcquireLockBackend wraps the production lock backend; on the FIRST acquire('audit_trail.write:webdav') it fires a callback that INSERTs a bisecting ack BEFORE returning. Simulates a concurrent caller landing the ack during the chain-write lock acquire window. With the fix, createBareSegment's guard runs INSIDE the lock and catches it. Without the fix, the guard ran above acquire when no ack existed, so the bare INSERT slipped through unchecked.
  • testAcknowledgeChainResetRefusesBisectingSegment -- straightforward pre-existing-segment guard test. Pre-creates a bare at [2, 4], attempts ack at [3, 5] (bisects), asserts throw with "would bisect segment" + no ack landed.
  • testAcknowledgeChainResetClosesSegmentBisectionRaceWindow -- symmetric to the first test. Callback INSERTs a stub segment during the ack's chain-write lock acquire; the new in-lock guard catches the segment and refuses the ack.

Tests fail without the fix (verified)

Per the convention from #3591949: stashed both src/Archive/ChainArchiver.php and src/AuditTrailVerifier.php, re-ran the three tests, observed 3/3 FAIL with bug-shape outcomes:

  • testCreateBareSegmentClosesAckBisectionRaceWindow: bare INSERT succeeded silently; expected throw never fired.
  • testAcknowledgeChainResetRefusesBisectingSegment: ack INSERT succeeded silently; no symmetric guard existed pre-fix.
  • testAcknowledgeChainResetClosesSegmentBisectionRaceWindow: ack INSERT succeeded silently.

Restored the fix, re-ran, observed 3/3 green.

Verification

  • 119/119 across ChainArchiverTest + AuditTrailVerifierTest + PurgeRestoreTest.
  • phpcs clean.
  • Branch rebased onto current 1.x head (post-#3591949 merge). Single commit ahead, fast-forward mergeable.

What is NOT in this MR

  • No public-API signature changes. assertNoAcknowledgmentBoundaryConflict stays private; acknowledgeChainReset keeps the same signature + return.
  • No extraction of the lock-acquire pattern into a shared trait -- two call sites is below the "extract on third" threshold. If a fourth caller appears, the retry-until-deadline loop can be factored out then.
Edited by Frank Mably

Merge request reports

Loading