
fix(hooks): prevent circuit breaker false positives on deliberate rollbacks#195

Merged
jobordu merged 2 commits into main from nf/quick-45-circuit-breaker-false-positives
Jun 5, 2026


@jobordu jobordu commented Jun 3, 2026


Problem

The circuit breaker hook fired a false positive on DigitalFrontier-infra. A prefer_wallet feature was added in one commit and deliberately removed in the next — a clean one-shot rollback. The algorithm flagged it as oscillation because:

  1. Run-group counting conflates rollback with oscillation — 3 run-groups can arise from a single A→B→A cycle
  2. The reversion check flags ANY net-negative pattern, even a deliberate revert
  3. No commit message intent analysis — the commit said "remove unvalidated wallet-first"
  4. Haiku's prompt lacked a rollback classification category

Changes

1. Minimum cycle gate (min_cycles: 2)

A single A→B→A no longer triggers. Requires at least 2 full oscillation cycles (A→B→A→B→A).
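
The cycle gate described above can be sketched roughly as follows. This is an illustration only, not the hook's actual countOscillationCycles: the names countCyclesSketch and shouldFlagSketch, and the idea of modelling history as an array of state labels, are assumptions made for the example.

```javascript
// Hypothetical sketch: history is modelled as an array of state labels,
// e.g. ['A', 'B', 'A']. The real hook works on git commits, not labels.
function countCyclesSketch(states) {
  // Collapse consecutive duplicates into run-groups: A,A,B,A -> A,B,A.
  const groups = states.filter((s, i) => i === 0 || s !== states[i - 1]);
  // Each later return to the first state completes one full cycle.
  let cycles = 0;
  for (let i = 1; i < groups.length; i++) {
    if (groups[i] === groups[0]) cycles++;
  }
  return cycles;
}

// Flag only when the number of full cycles reaches the configured minimum.
function shouldFlagSketch(states, minCycles = 2) {
  return countCyclesSketch(states) >= minCycles;
}

console.log(shouldFlagSketch(['A', 'B', 'A']));           // single rollback
console.log(shouldFlagSketch(['A', 'B', 'A', 'B', 'A'])); // two full cycles
```

Under these assumptions a single A→B→A yields three run-groups but only one cycle, so the default min_cycles of 2 leaves it unflagged.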

2. Commit message intent signals

New hasRollbackIntent() detects keywords (revert, rollback, remove, undo, backout) on net-negative commits at borderline cycle counts.
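
A minimal sketch of the keyword check, assuming the commit subject and its added/deleted line counts are already available. The regex below is a guess at what ROLLBACK_KEYWORDS might look like, and hasRollbackIntentSketch with its (subject, stats) signature is hypothetical; neither is taken from the PR's code.

```javascript
// Assumed keyword list, taken from the PR description; the real regex may differ.
// Whole-word match only, so inflected forms such as "removed" are not matched.
const ROLLBACK_KEYWORDS_SKETCH = /\b(revert|rollback|remove|undo|backout)\b/i;

// Hypothetical signature: subject is the commit subject line,
// stats is { added, deleted } line counts for that commit.
function hasRollbackIntentSketch(subject, stats) {
  const netNegative = stats.deleted > stats.added;
  return netNegative && ROLLBACK_KEYWORDS_SKETCH.test(subject);
}

console.log(
  hasRollbackIntentSketch('remove unvalidated wallet-first', { added: 1, deleted: 30 })
);
```

The net-negative condition keeps a commit that merely mentions "remove" while adding code from counting as rollback intent.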

3. Diff-level rollback detection

New isCleanRollback() analyzes inverse diff pairs with an asymmetry check — only flags patterns where one commit is heavily additive (+30/-1) and the next is heavily deletive (+1/-30). Small mixed swaps (genuine oscillation) are left alone.
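
The asymmetry check might look something like the sketch below. The function name isCleanRollbackSketch, the { added, deleted } shape of the per-commit stats, and the minRatio threshold of 5 are all assumptions for illustration; the PR does not state the actual threshold.

```javascript
// Hypothetical sketch: each argument is { added, deleted } for one commit.
// minRatio is an assumed threshold, not a value taken from the PR.
function isCleanRollbackSketch(first, second, minRatio = 5) {
  // Math.max(..., 1) avoids treating a zero count as an infinite ratio.
  const heavilyAdditive = first.added >= minRatio * Math.max(first.deleted, 1);
  const heavilyDeletive = second.deleted >= minRatio * Math.max(second.added, 1);
  return heavilyAdditive && heavilyDeletive;
}

// The +30/-1 then +1/-30 pair from the description.
console.log(isCleanRollbackSketch({ added: 30, deleted: 1 }, { added: 1, deleted: 30 }));
// A small mixed swap, which should still be treated as possible oscillation.
console.log(isCleanRollbackSketch({ added: 5, deleted: 4 }, { added: 4, deleted: 5 }));
```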

4. Improved Haiku prompt

Added DELIBERATE_ROLLBACK as a third classification option.
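
For illustration, a response parser that recognises the three verdicts could be sketched as below. parseVerdictSketch is a hypothetical name, and falling back to GENUINE when the reply is unrecognised is an assumption chosen so that an unclear answer never suppresses the breaker; the hook's actual parser may behave differently.

```javascript
// Most specific verdicts first, so a reply naming a rollback wins.
const VERDICTS_SKETCH = ['DELIBERATE_ROLLBACK', 'REFINEMENT', 'GENUINE'];

// Hypothetical parser: returns one of the three verdict strings.
function parseVerdictSketch(responseText) {
  const upper = String(responseText || '').toUpperCase();
  const found = VERDICTS_SKETCH.find((v) => upper.includes(v));
  // Assumed fail-safe: anything unrecognised is treated as genuine oscillation.
  return found || 'GENUINE';
}

console.log(parseVerdictSketch('DELIBERATE_ROLLBACK'));
console.log(parseVerdictSketch('no idea'));
```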

5. New config parameters

  • min_cycles (default: 2) — minimum oscillation cycles before flagging
  • rollback_detection (default: true) — enable intent + diff-level rollback checks
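
A rough sketch of how defaults and validation for these two parameters might work. The function name validateCircuitBreakerSketch, the injectable warn callback, and the rule that min_cycles must be a non-negative integer are assumptions; only the default values and the parameter names come from the PR.

```javascript
// Defaults as stated in the PR description.
const CB_DEFAULTS_SKETCH = { min_cycles: 2, rollback_detection: true };

// Hypothetical validator: invalid values produce a warning and fall back to defaults.
function validateCircuitBreakerSketch(
  cb = {},
  warn = (msg) => process.stderr.write(msg + '\n')
) {
  const out = { ...CB_DEFAULTS_SKETCH, ...cb };
  // Assumed constraint: a non-negative integer (0 would disable the gate).
  if (!Number.isInteger(out.min_cycles) || out.min_cycles < 0) {
    warn('circuit_breaker.min_cycles must be a non-negative integer; using default');
    out.min_cycles = CB_DEFAULTS_SKETCH.min_cycles;
  }
  if (typeof out.rollback_detection !== 'boolean') {
    warn('circuit_breaker.rollback_detection must be a boolean; using default');
    out.rollback_detection = CB_DEFAULTS_SKETCH.rollback_detection;
  }
  return out;
}

console.log(validateCircuitBreakerSketch({ min_cycles: 'two' }, () => {}));
```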

Files changed

| File | What |
| --- | --- |
| hooks/nf-circuit-breaker.js | 4 new functions + detection flow |
| hooks/nf-circuit-breaker.test.js | 13 new tests (50 total) |
| hooks/config-loader.js | New min_cycles + rollback_detection defaults + validation |

Verification

node --test hooks/nf-circuit-breaker.test.js
# 50 tests, 50 passed, 0 failed

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Enhanced oscillation detection that distinguishes intentional rollbacks from problematic patterns.
    • New circuit breaker options: min_cycles (default 2) and rollback_detection (default true) for finer control.
    • Commit-message parsing to identify deliberate rollback intent and suppress notifications when applicable.
  • Bug Fixes

    • Fewer false positives in rollback/oscillation detection and more accurate notification suppression.

…lbacks

Adds three layers of filtering to distinguish deliberate one-shot rollbacks
from genuine oscillation:

1. Minimum cycle gate (min_cycles: 2) — single A→B→A no longer triggers;
   requires at least 2 full oscillation cycles (A→B→A→B→A).
2. Commit message intent signals — detects revert/rollback/remove/undo
   keywords on net-negative commits at borderline cycle counts.
3. Diff-level rollback detection (isCleanRollback) — analyzes inverse
   diff pairs with asymmetry check: only flags patterns where one commit
   is heavily additive and the next heavily deletive.

Also improves Haiku prompt with DELIBERATE_ROLLBACK as a third
classification option.

Prevents the DigitalFrontier-infra false positive where adding then
removing prefer_wallet in two consecutive commits triggered the breaker.

New config: min_cycles (default 2), rollback_detection (default true).
50 tests passing (13 new).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 3, 2026 20:32

coderabbitai Bot commented Jun 3, 2026


Review Change Stack

Walkthrough

This PR extends circuit-breaker oscillation detection from a single reversion check into a configurable multi-pass pipeline that distinguishes true oscillation from deliberate one-shot rollback. Configuration adds cycle and rollback-detection toggles; detection logic gates by cycle count and, at the borderline, suppresses oscillation based on commit-message keywords and diff-level inverse-pair patterns. Haiku consultation adds a DELIBERATE_ROLLBACK verdict. Tests validate end-to-end behavior across single/multi-cycle and rollback-intent scenarios.

Changes

Oscillation Detection with Rollback Intent Suppression

| Layer / File(s) | Summary |
| --- | --- |
| Configuration schema and validation (hooks/config-loader.js) | Circuit breaker config gains min_cycles (default 2) and rollback_detection (default true) fields. Validation enforces type constraints with stderr warnings and fallback defaults. |
| Oscillation analysis helpers (hooks/nf-circuit-breaker.js) | hasReversionInHashes extended with optional pairStatsOut parameter to collect per-pair diff statistics. New helpers: countOscillationCycles (cycle counting), getCommitMessages (hash→subject map), hasRollbackIntent (keyword detection), isCleanRollback (single inverse-pair detection), and ROLLBACK_KEYWORDS regex. Module exports extended to expose these helpers. |
| Multi-pass oscillation detection (hooks/nf-circuit-breaker.js) | detectOscillation redefined to accept options and execute run-group depth check, reversion check, cycle-count gating via minCycles, and, at borderline, commit-intent and diff-level rollback suppression. Collects per-pair stats and distinguishes one-shot rollback from oscillation. |
| Haiku consultation for rollback verdict (hooks/nf-circuit-breaker.js) | Haiku prompt requests single-word GENUINE, REFINEMENT, or DELIBERATE_ROLLBACK verdict. Response parser maps non-GENUINE verdicts to their outputs; suppression treats both REFINEMENT and DELIBERATE_ROLLBACK as false-negatives. |
| PreToolUse detection flow (hooks/nf-circuit-breaker.js) | PreToolUse calls detectOscillation with minCycles and rollbackDetection from config. Treats both REFINEMENT and DELIBERATE_ROLLBACK verdicts as false-negatives (suppressing notifications and recording the verdict). |
| Test coverage for rollback detection (hooks/nf-circuit-breaker.test.js) | Unit tests validate cycle counting, rollback keywords, rollback-intent detection, and clean-rollback patterns. Integration tests exercise hook end-to-end: single-cycle rollback suppression, multi-cycle triggering, rollback-keyword prevention, clean inverse-diff detection, and minCycles option behavior. Asserts Haiku prompt includes DELIBERATE_ROLLBACK and pairStatsOut population. |

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main objective: preventing false positives in the circuit breaker hook when deliberate rollbacks occur. |
| Description check | ✅ Passed | The description covers Problem, Changes, Files changed, and Verification sections. Missing explicit Testing checklist marks, CHANGELOG.md confirmation, and some style/dependency checks from the template. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Copilot AI left a comment


Pull request overview

This PR updates the nf-circuit-breaker hook to reduce false positives when a repo experiences a deliberate add-then-remove rollback, by adding cycle-count gating plus two rollback classifiers (commit-message intent and inverse-diff detection), and extending the Haiku reviewer prompt to recognize rollbacks.

Changes:

  • Add min_cycles gating and optional rollback detection (rollback_detection) to the oscillation decision flow.
  • Add rollback intent detection via commit-message keyword scanning and “clean rollback” detection via inverse diff-pair analysis.
  • Extend Haiku classification to include DELIBERATE_ROLLBACK, plus add tests and config defaults/validation.

Reviewed changes

Copilot reviewed 3 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| hooks/nf-circuit-breaker.js | Adds cycle gating + rollback detection functions; expands Haiku prompt/verdict handling; wires new config options into detection. |
| hooks/dist/nf-circuit-breaker.js | Built artifact mirroring the hook logic updates. |
| hooks/config-loader.js | Adds defaults + validation for circuit_breaker.min_cycles and circuit_breaker.rollback_detection. |
| hooks/dist/config-loader.js | Built artifact mirroring config-loader updates. |
| hooks/nf-circuit-breaker.test.js | Adds unit/integration tests covering cycles, rollback intent, inverse-diff rollback, and Haiku prompt text. |


Comment thread hooks/nf-circuit-breaker.test.js Outdated
Comment thread hooks/nf-circuit-breaker.test.js Outdated
Comment thread hooks/nf-circuit-breaker.test.js Outdated
Comment thread hooks/nf-circuit-breaker.js Outdated

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
hooks/nf-circuit-breaker.js (1)

486-491: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Audit log records incorrect verdict for DELIBERATE_ROLLBACK.

When Haiku returns DELIBERATE_ROLLBACK, appendFalseNegative still records verdict: 'REFINEMENT' because it's hardcoded. This misrepresents the classification in the audit trail.

Proposed fix

Update function signature and call site:

-function appendFalseNegative(statePath, fileSet) {
+function appendFalseNegative(statePath, fileSet, verdict = 'REFINEMENT') {
   try {
     const fnLogPath = statePath.replace('circuit-breaker-state.json', 'circuit-breaker-false-negatives.json');
     let existing = [];
     if (fs.existsSync(fnLogPath)) {
       try {
         existing = JSON.parse(fs.readFileSync(fnLogPath, 'utf8'));
         if (!Array.isArray(existing)) existing = [];
       } catch {
         existing = [];
       }
     }
     existing.push({
       detected_at: new Date().toISOString(),
       file_set: fileSet,
       reviewer: 'haiku',
-      verdict: 'REFINEMENT',
+      verdict,
     });

And at the call site (around line 951):

-          appendFalseNegative(statePath, result.fileSet);
+          appendFalseNegative(statePath, result.fileSet, verdict);

Also applies to: 947-951

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hooks/nf-circuit-breaker.js` around lines 486 - 491, The audit entry creation
in appendFalseNegative currently hardcodes verdict: 'REFINEMENT' causing
mislabeling when Haiku returns 'DELIBERATE_ROLLBACK'; modify the
appendFalseNegative function (and the internal push that creates the audit
object) to accept a verdict parameter and use that value for the verdict field,
keep 'REFINEMENT' as the default if none provided, and update the call sites
(the places invoking appendFalseNegative around the DELIBERATE_ROLLBACK
handling) to pass 'DELIBERATE_ROLLBACK' when appropriate so the audit log
accurately reflects the classification.
🧹 Nitpick comments (1)
hooks/nf-circuit-breaker.test.js (1)

1602-1607: 💤 Low value

Remove dead fileSets block.

This map is never used (the test relies on properFileSets built at Lines 1610-1616), it discards its result by returning [], and hashes[hashes.indexOf(hashes.find(() => true))] is a convoluted no-op that always resolves to hashes[0]. It just fires redundant git diff-tree spawns.

♻️ Proposed cleanup
-    const fileSets = hashes.map(() => {
-      const r = spawnSync('git', ['diff-tree', '--no-commit-id', '-r', '--name-only', '--root', hashes[hashes.indexOf(hashes.find(() => true))]], {
-        cwd: repoDir, encoding: 'utf8', timeout: 5000,
-      });
-      return [];
-    });
-
-    // Actually get proper file sets
     const properFileSets = [];
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hooks/nf-circuit-breaker.test.js` around lines 1602 - 1607, Remove the unused
dead block that builds fileSets: the const fileSets = hashes.map(() => { ...
return []; }); loop is redundant (it always returns [] and calls spawnSync with
the no-op hashes[hashes.indexOf(hashes.find(() => true))]) and should be
deleted; simply remove the entire fileSets assignment and its spawnSync call so
the test relies only on properFileSets (constructed later) and avoids
unnecessary git diff-tree spawns.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hooks/nf-circuit-breaker.test.js`:
- Around line 1344-1353: The current test uses an if (stdout.trim()) guard so
the notStrictEqual(decision, 'deny') never runs (stdout is empty on first-pass),
making the assertion vacuous; change the test to explicitly assert the breaker
state file was not written/is not active for the
runHook(makeWritePayload(repoDir)) case (mirror the negative checks in
CB-TC5/CB-TC20/CB-TC23) and remove the conditional guard — alternatively
unconditionally parse stdout (if present) and assert
parsed.hookSpecificOutput?.permissionDecision !== 'deny'; apply the same fix to
the CB-FP03 test so you assert the state-file inactivity rather than relying on
optional stdout.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 7d7f7107-26d5-4c74-b3da-765772902867

📥 Commits

Reviewing files that changed from the base of the PR and between 15fc181 and ab1cacb.

⛔ Files ignored due to path filters (2)
  • hooks/dist/config-loader.js is excluded by !**/dist/**
  • hooks/dist/nf-circuit-breaker.js is excluded by !**/dist/**
📒 Files selected for processing (3)
  • hooks/config-loader.js
  • hooks/nf-circuit-breaker.js
  • hooks/nf-circuit-breaker.test.js

Comment thread hooks/nf-circuit-breaker.test.js Outdated
- CB-FP01/CB-FP03: assert state file NOT written instead of vacuous stdout guard
- CB-FP03: use clean inverse diff pattern (large add/remove) for proper rollback test
- CB-FP09: remove dead code (discarded fileSets variable)
- appendFalseNegative: accept verdict parameter instead of hard-coding REFINEMENT
- CB-TC18: set min_cycles:0 in project config (test has only 1 cycle)
- Remove commit message intent as standalone gate — keyword alone is insufficient
  for suppressing detection; requires isCleanRollback diff confirmation too

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
hooks/nf-circuit-breaker.js (2)

479-497: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Create .claude before writing the false-negative audit log.

On the first suppressed detection, writeState() has not run yet, so .claude/ may not exist. In that case fs.writeFileSync(fnLogPath, ...) throws ENOENT, and the new verdict audit trail is lost instead of persisted.

Suggested fix
 function appendFalseNegative(statePath, fileSet, verdict) {
   try {
     const fnLogPath = statePath.replace('circuit-breaker-state.json', 'circuit-breaker-false-negatives.json');
+    fs.mkdirSync(path.dirname(fnLogPath), { recursive: true });
     let existing = [];
     if (fs.existsSync(fnLogPath)) {
       try {
         existing = JSON.parse(fs.readFileSync(fnLogPath, 'utf8'));
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hooks/nf-circuit-breaker.js` around lines 479 - 497, appendFalseNegative
currently writes to fnLogPath (derived in function appendFalseNegative) without
ensuring the parent directory (.claude) exists, causing fs.writeFileSync to
throw ENOENT on first suppressed detection; update appendFalseNegative to ensure
the directory for fnLogPath exists before writing (e.g., compute the directory
from fnLogPath and call fs.mkdirSync(dir, { recursive: true }) or equivalent),
then proceed to read/parse existing file and fs.writeFileSync as before so the
false-negative audit log is persisted even before writeState runs.
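
Combining the verdict parameter from the earlier review with the directory creation suggested here, a self-contained version might look like the sketch below. The name appendFalseNegativeSketch, the JSON formatting of the written file, and the return value are assumptions; the diff above shows only part of the real function.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical sketch: returns the log path on success, null on failure.
function appendFalseNegativeSketch(statePath, fileSet, verdict = 'REFINEMENT') {
  try {
    const fnLogPath = statePath.replace(
      'circuit-breaker-state.json',
      'circuit-breaker-false-negatives.json'
    );
    // Ensure the parent directory exists before the first write.
    fs.mkdirSync(path.dirname(fnLogPath), { recursive: true });
    let existing = [];
    if (fs.existsSync(fnLogPath)) {
      try {
        existing = JSON.parse(fs.readFileSync(fnLogPath, 'utf8'));
        if (!Array.isArray(existing)) existing = [];
      } catch {
        existing = []; // corrupt log: start over rather than fail
      }
    }
    existing.push({
      detected_at: new Date().toISOString(),
      file_set: fileSet,
      reviewer: 'haiku',
      verdict,
    });
    fs.writeFileSync(fnLogPath, JSON.stringify(existing, null, 2), 'utf8');
    return fnLogPath;
  } catch {
    return null; // assumed: audit logging must never break the hook
  }
}
```

With recursive: true, mkdirSync does not throw when the directory already exists, so the call is safe on every invocation.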

950-957: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not let Haiku self-waive the breaker on DELIBERATE_ROLLBACK.

detectOscillation() only exempts rollback-shaped patterns at the deterministic borderline, but this branch suppresses any detected loop if the model says DELIBERATE_ROLLBACK. Because that verdict is derived from raw git log/diff text, a crafted commit message or patch can steer the model into bypassing the new minCycles / isCleanRollback() gate. Only honor that verdict when the same deterministic rollback check also passes, or keep it advisory-only. Based on learnings: whole-turn/self-referential scans can become self-waiving prompt injection vectors when protection logic trusts echoed untrusted text.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hooks/nf-circuit-breaker.js` around lines 950 - 957, The Haiku verdict branch
currently treats DELIBERATE_ROLLBACK the same as REFINEMENT and lets the tool
proceed; change it so DELIBERATE_ROLLBACK is only honored if the deterministic
rollback check in detectOscillation/isCleanRollback (and the minCycles
requirement) also pass — otherwise treat DELIBERATE_ROLLBACK as advisory (log it
via appendFalseNegative but do not bypass the circuit breaker or call
process.exit(0)). Update the code around consultHaiku’s handling of verdict to:
when verdict === 'DELIBERATE_ROLLBACK' first verify
detectOscillation(...).isCleanRollback() (or equivalent) and minCycles, only
then allow proceeding; if not, do not suppress the breaker and continue with
normal loop-detection behavior.
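
One way to express the suggested policy is sketched below. decideOnVerdictSketch and the single deterministicRollback flag are hypothetical; the flag stands in for whatever combination of isCleanRollback and minCycles checks the hook would compute, which this comment does not spell out.

```javascript
// Hypothetical sketch of the advisory-only policy.
// Returns { bypass, log }: bypass lets the tool call proceed,
// log records the verdict in the false-negative audit trail.
function decideOnVerdictSketch(verdict, deterministicRollback) {
  if (verdict === 'REFINEMENT') {
    return { bypass: true, log: true };
  }
  if (verdict === 'DELIBERATE_ROLLBACK') {
    // The model's verdict alone never waives the breaker; it must agree
    // with the deterministic check computed from git data.
    return { bypass: deterministicRollback === true, log: true };
  }
  // GENUINE or anything unexpected: the breaker stays in force.
  return { bypass: false, log: false };
}

console.log(decideOnVerdictSketch('DELIBERATE_ROLLBACK', false));
```

Keeping the unconfirmed verdict in the log preserves the audit trail while removing the model's ability to waive the breaker on its own.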
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hooks/nf-circuit-breaker.js`:
- Around line 363-368: The current suppression uses isCleanRollback(gitRoot,
oscillatingHashes, files) which only checks add/delete counts and can
false-positive; update the check so isCleanRollback actually verifies an
inverse-patch match before continuing: compute and compare patch identifiers or
the actual changed line content for the earlier commit(s) referenced by
oscillatingHashes against the current files' diffs (e.g., generate patch IDs or
canonicalized hunks for the prior commit and for the current commit) and only
return true from isCleanRollback when those patch IDs or line-level diffs are
exact inverses; modify isCleanRollback (and its callers) to load the prior patch
content from git using gitRoot + oscillatingHashes and compare to the current
files diff rather than relying solely on add/delete aggregates.
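
A line-level inverse check along these lines could be sketched as follows, operating on unified diff text that the caller has already obtained from git. The names changedLinesSketch and isInversePatchSketch are hypothetical, and comparing sorted line lists is one simplification among several possible approaches, not the method the hook uses.

```javascript
// Collects added and removed content lines from unified diff text.
// Simplification: lines starting with '+++' or '---' are treated as file
// headers, so a changed line whose content begins with '++' or '--' is missed.
function changedLinesSketch(diffText) {
  const added = [];
  const removed = [];
  for (const line of diffText.split('\n')) {
    if (line.startsWith('+++') || line.startsWith('---')) continue;
    if (line.startsWith('+')) added.push(line.slice(1));
    else if (line.startsWith('-')) removed.push(line.slice(1));
  }
  return { added: added.sort(), removed: removed.sort() };
}

// True only when everything the first diff added is removed by the second,
// and everything the first removed is added back by the second.
function isInversePatchSketch(firstDiff, secondDiff) {
  const a = changedLinesSketch(firstDiff);
  const b = changedLinesSketch(secondDiff);
  const same = (p, q) => p.length === q.length && p.every((v, i) => v === q[i]);
  return same(a.added, b.removed) && same(a.removed, b.added);
}
```

Unlike the add/delete aggregates, this rejects a pair where the second commit deletes the same number of lines but different content.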


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 5e6c3b5a-e0b1-4455-b79b-5c9b7537ee16

📥 Commits

Reviewing files that changed from the base of the PR and between ab1cacb and 398e647.

📒 Files selected for processing (2)
  • hooks/nf-circuit-breaker.js
  • hooks/nf-circuit-breaker.test.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • hooks/nf-circuit-breaker.test.js

Comment thread hooks/nf-circuit-breaker.js
@jobordu jobordu merged commit cb63d08 into main Jun 5, 2026
21 checks passed
@jobordu jobordu deleted the nf/quick-45-circuit-breaker-false-positives branch June 5, 2026 07:21