
security: bound scan fan-out with a shared DNS budget + overall deadline #548

Merged: schmug merged 1 commit into main from claude/intelligent-ardinghelli-d709a1 on Jun 15, 2026

schmug (Owner) commented Jun 13, 2026

Summary

Orchestrator-level DoS umbrella for the confirmed finding in the private draft advisory GHSA-f828-8wf8-vqp2.

scan() / scanStreaming() previously awaited Promise.all across all analyzers with only a per-DNS-query 3s timeout — there was no aggregate cap on total DNS queries or total wall-clock per scan, and no AbortController/deadline wrapping the orchestrator. A single request combining a large input surface (many DKIM selectors + a rua/ruf-stuffed _dmarc record on an attacker-controlled domain) could drive a large outbound-DNS burst and long wall-clock — all on one rate-limit token.

This is the umbrella behind the per-analyzer fan-out fixes (#539 capped DKIM selectors + MTA-STS body; a sibling task caps DMARC rua/ruf). Even if a per-analyzer cap regresses, the total is now bounded.

What changed

Two backstops, both threaded through every analyzer into the DNS client:

  1. Shared per-scan DNS-query budget: ScanBudget (src/dns/scan-budget.ts), one capped pool that every queryTxt / queryMx / queryDoh draws from via budget?.consume() before issuing any outbound query. DKIM selector probes and DMARC rua/ruf lookups (and SPF includes, DANE per-MX, etc.) all draw from the same pool, so total outbound DNS cannot scale with attacker input.
  2. Single overall deadline: one AbortController + setTimeout per scan. Each already-settled analyzer is raced against the signal (raceDeadline); on a breach it resolves to its synthetic fallback so the scan returns partial results instead of hanging. The budget also holds the signal, so no new query is issued once the deadline fires.
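A minimal sketch of how such a shared pool might look. The names ScanBudget, consume(), ScanBudgetError, ScanDeadlineError, DnsLookupError and queryTxt come from the description above; the constructor shape, the queriesUsed getter, and the injected resolver parameter are assumptions made for illustration and may not match src/dns/scan-budget.ts.

```typescript
// Sketch only: names follow the PR text, bodies and signatures are assumed.
class DnsLookupError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working if the code is compiled down to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class ScanBudgetError extends DnsLookupError {}
class ScanDeadlineError extends DnsLookupError {}

class ScanBudget {
  private used = 0;
  private readonly maxQueries: number;
  private readonly signal?: AbortSignal;

  constructor(maxQueries: number, signal?: AbortSignal) {
    this.maxQueries = maxQueries;
    this.signal = signal;
  }

  /** Called before every outbound DNS query; throws rather than letting it run. */
  consume(): void {
    if (this.signal?.aborted) {
      throw new ScanDeadlineError("scan deadline exceeded");
    }
    if (this.used >= this.maxQueries) {
      throw new ScanBudgetError(`DNS query budget of ${this.maxQueries} exhausted`);
    }
    this.used += 1;
  }

  /** Hypothetical accessor, handy for tests and for the "partial results" note. */
  get queriesUsed(): number {
    return this.used;
  }
}

// Hypothetical resolver type standing in for the real DNS transport.
type Resolver = (name: string) => Promise<string[]>;

// Assumed shape of a client call: draw from the pool first, then resolve.
async function queryTxt(
  name: string,
  resolve: Resolver,
  budget?: ScanBudget,
): Promise<string[]> {
  budget?.consume();
  return resolve(name);
}
```

Because every analyzer receives the same ScanBudget instance, the number of queries that reach the resolver is bounded by the pool size regardless of how many selectors or report URIs the input contains.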

Both errors (ScanBudgetError, ScanDeadlineError) subclass DnsLookupError, so analyzers that already catch it degrade to a "could not verify" warning rather than a false "not configured"; anything uncaught is still caught by the existing #378 per-analyzer settle wrapper. A breach degrades gracefully (partial results + a note), never throws, and scanStreaming still streams every protocol exactly once (emit-once guard + post-buildScanResult MTA-STS emit preserved).
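The deadline side of this graceful degradation could be sketched as follows. Only the name raceDeadline and the AbortController + setTimeout pairing come from the description; the signature, the AnalyzerTask shape, and scanWithDeadline are hypothetical, and treating a rejection as a fallback is an assumption (the PR says analyzers are already settled before being raced).

```typescript
// Sketch only: raceDeadline's real signature is not shown in the PR.
function raceDeadline<T>(
  work: Promise<T>,
  signal: AbortSignal,
  fallback: () => T,
): Promise<T> {
  return new Promise<T>((resolve) => {
    const onAbort = () => resolve(fallback());
    if (signal.aborted) {
      onAbort();
    } else {
      signal.addEventListener("abort", onAbort, { once: true });
    }
    work.then(
      (value) => {
        signal.removeEventListener("abort", onAbort);
        resolve(value); // no-op if the deadline already won the race
      },
      () => {
        signal.removeEventListener("abort", onAbort);
        resolve(fallback()); // assumption: a rejection also degrades, never throws
      },
    );
  });
}

// Hypothetical task shape: one entry per analyzer.
interface AnalyzerTask<T> {
  run: (signal: AbortSignal) => Promise<T>;
  fallback: () => T;
}

// One controller and one timer per scan; every analyzer races the same signal.
async function scanWithDeadline<T>(
  tasks: AnalyzerTask<T>[],
  deadlineMs: number,
): Promise<T[]> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), deadlineMs);
  try {
    return await Promise.all(
      tasks.map((t) => raceDeadline(t.run(controller.signal), controller.signal, t.fallback)),
    );
  } finally {
    clearTimeout(timer);
  }
}
```

In this sketch an analyzer that finishes in time keeps its real result, while one that is still pending when the timer fires is replaced by its fallback, so the caller always receives one entry per analyzer.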

Defaults: DEFAULT_SCAN_LIMITS = 150 queries / 12s wall-clock. This is generous for a real multi-analyzer scan (DKIM alone probes ~37 common selectors) and inside the ~10–15s target. Both limits are overridable via an optional limits parameter (used by the regression tests; production call sites are unchanged).
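As an illustration of the override, the optional parameter might be merged with the defaults like this. DEFAULT_SCAN_LIMITS and its values (150 queries, 12s) are from the description; the ScanLimits field names and the resolveLimits helper are hypothetical.

```typescript
// Sketch only: field names are assumed, values are the defaults stated in the PR.
interface ScanLimits {
  maxDnsQueries: number;
  deadlineMs: number;
}

const DEFAULT_SCAN_LIMITS: ScanLimits = {
  maxDnsQueries: 150,
  deadlineMs: 12000,
};

// Hypothetical helper: callers that pass nothing get the production defaults,
// tests can tighten one limit without restating the other.
function resolveLimits(limits?: Partial<ScanLimits>): ScanLimits {
  return {
    maxDnsQueries: limits?.maxDnsQueries ?? DEFAULT_SCAN_LIMITS.maxDnsQueries,
    deadlineMs: limits?.deadlineMs ?? DEFAULT_SCAN_LIMITS.deadlineMs,
  };
}
```

A regression test could then call resolveLimits({ deadlineMs: 80 }) to exercise the deadline path while leaving the query cap at its default.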

DnsLookupError was moved to src/dns/errors.ts (re-exported from client.ts for back-compat) so ScanBudget can subclass it without depending on the DNS client, which many tests replace via vi.mock.
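To show why a shared base class matters on the analyzer side, here is a hedged sketch of an analyzer that degrades on any DnsLookupError. The error class names are from the description; the Finding type, checkDmarc, and the injected queryTxt parameter are hypothetical and simplified.

```typescript
// Sketch only: in the PR these classes live in src/dns/errors.ts, with no
// import of the DNS client, so mocking the client does not replace them.
class DnsLookupError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class ScanBudgetError extends DnsLookupError {}

// Hypothetical, simplified result shape.
type Finding = { status: "pass" | "warning" | "fail"; detail: string };

// Hypothetical analyzer: the lookup function is injected for illustration.
async function checkDmarc(
  domain: string,
  queryTxt: (name: string) => Promise<string[]>,
): Promise<Finding> {
  try {
    const records = await queryTxt(`_dmarc.${domain}`);
    const found = records.some((r) => r.toLowerCase().startsWith("v=dmarc1"));
    return found
      ? { status: "pass", detail: "DMARC record found" }
      : { status: "fail", detail: "DMARC not configured" };
  } catch (err) {
    if (err instanceof DnsLookupError) {
      // Budget or deadline breach: the record's absence was never established.
      return { status: "warning", detail: "could not verify DMARC" };
    }
    throw err; // anything else is left to the per-analyzer settle wrapper
  }
}
```

The existing catch for DnsLookupError covers the new budget and deadline errors without any analyzer change, which is what prevents an exhausted budget from being reported as a missing record.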

Test plan (TDD — failing test written first)

  • New test/orchestrator-budget.test.ts mocks the resolver layer (node:dns) and runs the real DKIM/DMARC/SPF analyzers + real DNS client + real budget:
    • Budget: worst-case input (300 selectors + 100 rua URIs → ~439 uncapped queries) caps total resolver queries at the shared pool and still returns a well-formed result.
    • Deadline: against a resolver that answers only after 500ms, an 80ms deadline returns partial results in <400ms instead of hanging.
    • Streaming: same cap applies, and every protocol is emitted exactly once.
  • Gates (all green): npm test: 1302 passing, 0 failing; npm run typecheck clean; npm run lint clean.

Notes

  • Touches src/orchestrator.ts + src/analyzers/**, which are CODEOWNERS-gated; needs a code-owner review. Auto-merge intentionally not enabled.
  • The advisory GHSA-f828-8wf8-vqp2 remains a private draft — not published by this PR.

🤖 Generated with Claude Code

Orchestrator-level DoS umbrella for GHSA-f828-8wf8-vqp2. scan()/scanStreaming()
previously awaited Promise.all across all analyzers with only a per-query 3s DNS
timeout — no aggregate cap on total DNS queries or wall-clock. A single request
combining a large DKIM selector list with a rua/ruf-stuffed _dmarc record on an
attacker-controlled domain could drive a large outbound-DNS burst and long
wall-clock on one rate-limit token.

Adds two backstops, both threaded through every analyzer:
- ScanBudget (src/dns/scan-budget.ts): one shared per-scan DNS-query pool that
  every queryTxt/queryMx/queryDoh draws from; exhaustion throws a DnsLookupError
  subclass so analyzers degrade to "could not verify" instead of crashing.
- Overall deadline: one AbortController + setTimeout; each settled analyzer is
  raced against it, degrading to its synthetic fallback on a breach. The budget
  also holds the signal, so no new query is issued past the deadline.

Defaults (DEFAULT_SCAN_LIMITS): 150 queries / 12s — generous for real
multi-analyzer scans, overridable via an optional `limits` param (tests).
Preserves the #378 per-analyzer settle contract and scanStreaming SSE semantics
(every protocol still streams exactly once). DnsLookupError moved to
src/dns/errors.ts so the budget can subclass it without depending on the DNS
client (which tests mock).

Refs GHSA-f828-8wf8-vqp2. Umbrella over the per-analyzer caps in #539.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: schmug <38227427+schmug@users.noreply.github.com>
@schmug schmug force-pushed the claude/intelligent-ardinghelli-d709a1 branch from 8d7ed3d to 52778df on June 13, 2026 18:36
@schmug schmug merged commit a9bc678 into main Jun 15, 2026
5 checks passed
@schmug schmug deleted the claude/intelligent-ardinghelli-d709a1 branch June 15, 2026 18:12