fix: address review feedback from @coderabbitai[bot] on #3954#3963
fix: address review feedback from @coderabbitai[bot] on #3954#3963thepastaclaw wants to merge 6 commits into
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
✅ Review complete (commit e8f6385) |
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
Verified PR delivers five focused fixes against CodeRabbit feedback (orphan status preservation, event-drain on cancel, clear/start latch, doc corrections). All in-scope changes are correct and well-tested. Agent-flagged 'blocking' concerns about deeper races in clear_shielded vs direct sync_now/sync_wallet and stop-retry-after-orphan-park are pre-existing in the broader shutdown-join branch, not introduced or required to be fixed by this PR's stated scope; they belong to follow-up work.
shutdown() discarded the reap-orphans status (`let (_status, detached) = ...`), so a parked orphan that finished but panicked left `detached == 0` and `all_clean()` reported true — masking the non-clean orphan from the FFI destroy path that uses all_clean() to decide whether to delay freeing the host callback context. Promote the orphan status to a first-class `ShutdownReport.orphan_status` field, and have all_clean() reject any non-clean variant (Detached covers survivors; Panicked / Stopped covers finished-but-bad). Map the field verbatim onto `CoordinatorExitStatus.detached_threads` in the wallet adapter so the FFI sees the real orphan classification instead of always Ok/Detached. Lock the new semantic with a registry test that parks a panicking orphan, runs shutdown, and asserts `detached == 0` + non-clean `orphan_status` + `!all_clean()`. Refresh the platform-wallet adapter unit test for the same case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
If `cancel` won the `tokio::select!` in the wallet-event adapter loop, any WalletEvents the upstream broadcast had already delivered but the select hadn't yet polled were silently dropped — losing persistence work the upstream considered committed. Extract the per-event handling into `dispatch_event` so the live arm and a new `drain_pending_events` helper apply identical projection / store / error handling, and have the cancellation arm drain the receiver via repeated `try_recv` before breaking. Lagged batches match the live-loop policy (log + skip); a closed channel ends the drain. Add three regression tests: the drain dispatches every queued event, an empty receiver is a no-op, and the end-to-end loop persists every event sent into the WalletManager broadcast after `cancel.cancel()` fires. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tics The three coordinators' (identity, platform_address, shielded) and the shared CoordinatorLifecycle's stop() docs claimed an in-flight sync pass "keeps running to completion". That predates the current biased select! in each start() loop, which polls the cancel arm first — so an in-flight sync_now() / coordinator.sync() future is dropped at its next .await whenever cancel races a still-pending pass. The host callback / persister fan-out after that drop point does not run. Rewrite each stop() doc to state that cancel can win at any await, that no "drained" post-condition holds, and that callers needing a real barrier must use quiesce() (or manager-wide shutdown()). No behaviour change — docs only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
clear_shielded held the registry-level quiescing gate across its liveness check + store wipe, but ShieldedSyncManager::start() reopens that gate on entry. So a host racing clear with a concurrent shielded_sync_start could spawn a fresh loop whose first pass landed in the store clear was about to wipe — relying entirely on host-side serialization to avoid the race. Add an internal AtomicBool `clearing` on ShieldedSyncManager. clear_shielded raises it (via an AtomicFlagGuard) BEFORE quiesce, holds it across quiesce -> liveness check -> coord.clear(), and drops it on return. start() checks the latch first and bails as a no-op (with a warn log) when raised, so the registry-level gate clear is holding cannot be defeated by a racing restart. Test: while the latch is held, start() does not spawn the loop; once dropped, start() succeeds. Existing clear_shielded_refuses_while_shielded_orphan_alive still passes, proving the orphan gate continues to work alongside the new latch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…exit Reviewer follow-up to the clear/start exclusion latch: lock in that clear_shielded() lowers the latch on both the orphan-error early return AND the successful path, and clarify in the doc that a start() whose latch load beat the raise still spawns — the subsequent quiesce() + status.is_clean() gate is what keeps the store safe in that narrow race. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…shutdown The inline comment in platform_wallet_manager_destroy still said the status was "just logged and dropped". It hasn't been since the non-clean path was wired to return ErrorShutdownIncomplete — refresh the comment so it matches the actual behaviour the host depends on (the host must NOT free its callback context on that error code). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
be676b5 to
e8f6385
Compare
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
Cumulative review at e8f6385 against base, with latest delta be676b5..e8f6385. All four agents (general, security, rust-quality, FFI) report no in-scope defects. Verified the key delta changes against source: SeqCst handshake in begin_pass/quiesce is correctly applied (Dekker-style StoreLoad across two atomics), clear_shielded_inner now holds the quiescing gate continuously across drain+liveness-check+wipe with a bounded drain, registry start_thread parks the prior under the slot lock and fully rolls back slot config on spawn failure, and FFI shielded_sync_stop gates Success on orphan liveness symmetric with destroy/clear_shielded. Tests are deterministic and non-vacuous. No carried-forward prior findings (prior review was clean) and no new in-scope defects in the latest delta.
|
Closing this because it was created by the automated review-feedback fix workflow, which is now disabled. We no longer want PastaClaw to open helper/fix PRs into other people's PR branches from review comments; review-bot should report/comment/escalate instead. |
Addresses 15 review comment(s) from @coderabbitai[bot] on #3954.
Original PR author: @Claudius-Maginificent
Review comments addressed:
🧩 Analysis chain
🏁 Script ex](#3954 (comment))
**Report aborted event-adapter joins ](#3954 (comment))
**Bound the handler-start wait in this tes](#3954 (comment))
@Claudius-Maginificent, the addition ofCoordinatorThreadStatus::Timeoutas a dedicated variant**Make the
Stopped-path test deterministic](#3954 (comment))🧩 Analysis chain
🏁 Script ex](#3954 (comment))
**Bound the final cleanup quiesce in this ](#3954 (comment))
**Preserve non-clean orphan status in `Shu](#3954 (comment))
**Drain pending events before exiting on cancellation](#3954 (comment))
**Update
stop()docs for cancellab](#3954 (comment))**Enforce clear/start exclusion ins](#3954 (comment))
**Update
stop()docs for cancellat](#3954 (comment))**Update
stop()docs for backgroun](#3954 (comment))🧹 Nitpick comments (1)