security: bound scan fan-out with a shared DNS budget + overall deadline by schmug · Pull Request #548 · schmug/dmarc.mx

schmug · 2026-06-13T18:05:43Z

Summary

Orchestrator-level DoS umbrella for the confirmed finding in the private draft advisory GHSA-f828-8wf8-vqp2.

scan() / scanStreaming() previously awaited Promise.all across all analyzers with only a per-DNS-query 3s timeout — there was no aggregate cap on total DNS queries or total wall-clock per scan, and no AbortController/deadline wrapping the orchestrator. A single request combining a large input surface (many DKIM selectors + a rua/ruf-stuffed _dmarc record on an attacker-controlled domain) could drive a large outbound-DNS burst and long wall-clock — all on one rate-limit token.

This is the umbrella behind the per-analyzer fan-out fixes (#539 capped DKIM selectors + MTA-STS body; a sibling task caps DMARC rua/ruf). Even if a per-analyzer cap regresses, the total is now bounded.

What changed

Two backstops, both threaded through every analyzer into the DNS client:

- Shared per-scan DNS-query budget: ScanBudget (src/dns/scan-budget.ts) is one capped pool that every queryTxt / queryMx / queryDoh draws from via budget?.consume() before issuing any outbound query. DKIM selector probes and DMARC rua/ruf lookups (and SPF includes, DANE per-MX, etc.) all draw from the same pool, so total outbound DNS cannot scale with attacker input.
- Single overall deadline: one AbortController + setTimeout per scan. Each analyzer promise, already wrapped by the per-analyzer settle wrapper, is raced against the signal (raceDeadline); on a breach it resolves to its synthetic fallback so the scan returns partial results instead of hanging. The budget also holds the signal, so no new query is issued once the deadline fires.
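As a rough illustration of the shared budget, the sketch below shows one way a capped pool with a consume() gate could be written. Only the names ScanBudget, ScanBudgetError, ScanDeadlineError, DnsLookupError, queryTxt and consume() come from this PR; the constructor shape, the queriesUsed getter, the Resolver type and the queryTxt signature are assumptions made for the example, not the code in src/dns/scan-budget.ts.

```typescript
// Hypothetical sketch. Class names follow the PR description; bodies are assumptions.
class DnsLookupError extends Error {
  constructor(message: string) {
    super(message);
    this.name = new.target.name;
    // Keeps instanceof working even when compiled to an ES5 target.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}
class ScanBudgetError extends DnsLookupError {}
class ScanDeadlineError extends DnsLookupError {}

class ScanBudget {
  private used = 0;
  private readonly maxQueries: number;
  private readonly signal?: AbortSignal;

  constructor(maxQueries: number, signal?: AbortSignal) {
    this.maxQueries = maxQueries;
    this.signal = signal;
  }

  /** Call once before every outbound DNS query; throws instead of querying. */
  consume(): void {
    if (this.signal?.aborted) {
      throw new ScanDeadlineError("scan deadline exceeded");
    }
    if (this.used >= this.maxQueries) {
      throw new ScanBudgetError(`DNS query budget of ${this.maxQueries} exhausted`);
    }
    this.used += 1;
  }

  /** Number of queries charged to the pool so far (assumed helper). */
  get queriesUsed(): number {
    return this.used;
  }
}

// Stand-in for the real DNS client: the resolver is injected so the sketch runs offline.
type Resolver = (name: string) => Promise<string[]>;

async function queryTxt(name: string, resolver: Resolver, budget?: ScanBudget): Promise<string[]> {
  budget?.consume(); // rejects before any network I/O happens
  return resolver(name);
}
```

Because every query helper charges the same ScanBudget instance, the cap applies to the scan as a whole, whichever analyzer issues the lookup.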

Both errors (ScanBudgetError, ScanDeadlineError) subclass DnsLookupError, so analyzers that already catch it degrade to a "could not verify" warning rather than a false "not configured"; anything that escapes an analyzer is still caught by the existing #378 per-analyzer settle wrapper. A breach degrades gracefully (partial results + a note), never throws, and scanStreaming still streams every protocol exactly once (emit-once guard + post-buildScanResult MTA-STS emit preserved).
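The deadline race and its graceful degradation can be sketched roughly as follows. The name raceDeadline comes from this PR; its signature, the AnalyzerResult shape, the runWithDeadline helper and the choice to map a rejection to the fallback are all assumptions for illustration, not the orchestrator's actual code.

```typescript
// Hypothetical result shape; the real analyzers return richer objects.
interface AnalyzerResult {
  protocol: string;
  status: "ok" | "could_not_verify";
  note?: string;
}

interface Analyzer {
  protocol: string;
  run: (signal: AbortSignal) => Promise<AnalyzerResult>;
}

// Resolves with the work's value, or with fallback() if the signal aborts first.
// Never rejects: a rejection from `work` is also mapped to the fallback (assumption).
function raceDeadline<T>(work: Promise<T>, signal: AbortSignal, fallback: () => T): Promise<T> {
  if (signal.aborted) return Promise.resolve(fallback());
  return new Promise<T>((resolve) => {
    const onAbort = () => resolve(fallback());
    signal.addEventListener("abort", onAbort, { once: true });
    work.then(
      (value) => {
        signal.removeEventListener("abort", onAbort);
        resolve(value);
      },
      () => {
        signal.removeEventListener("abort", onAbort);
        resolve(fallback());
      },
    );
  });
}

// One AbortController + setTimeout per scan; slow analyzers degrade, fast ones keep their result.
async function runWithDeadline(analyzers: Analyzer[], deadlineMs: number): Promise<AnalyzerResult[]> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), deadlineMs);
  try {
    return await Promise.all(
      analyzers.map((a) =>
        raceDeadline<AnalyzerResult>(a.run(controller.signal), controller.signal, () => ({
          protocol: a.protocol,
          status: "could_not_verify",
          note: "scan deadline exceeded",
        })),
      ),
    );
  } finally {
    clearTimeout(timer); // do not leave the timer pending after a fast scan
  }
}
```

With this shape the Promise.all can no longer hang on a slow analyzer: once the signal aborts, every outstanding entry resolves to its fallback and the scan returns partial results.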

Defaults: DEFAULT_SCAN_LIMITS = 150 queries / 12s wall-clock. This is generous for a real multi-analyzer scan (DKIM alone probes ~37 common selectors) and inside the ~10–15s target. Both limits are overridable via an optional limits parameter (used by the regression tests; production call sites are unchanged).
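A minimal sketch of how the defaults and the optional override might fit together is shown below. DEFAULT_SCAN_LIMITS and the 150 / 12s values come from this PR; the field names maxDnsQueries and deadlineMs, the ScanLimits interface and the resolveLimits helper are hypothetical.

```typescript
// Hypothetical field names; only the two default values are taken from the PR.
interface ScanLimits {
  maxDnsQueries: number;
  deadlineMs: number;
}

const DEFAULT_SCAN_LIMITS: Readonly<ScanLimits> = Object.freeze({
  maxDnsQueries: 150,
  deadlineMs: 12000,
});

// Merge caller overrides over the defaults and reject nonsensical values.
function resolveLimits(overrides: Partial<ScanLimits> = {}): ScanLimits {
  const merged: ScanLimits = { ...DEFAULT_SCAN_LIMITS, ...overrides };
  for (const [key, value] of Object.entries(merged)) {
    if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
      throw new RangeError(`${key} must be a positive finite number`);
    }
  }
  return merged;
}
```

A regression test could then call resolveLimits({ deadlineMs: 80 }) to shrink only the deadline, while production call sites pass nothing and get the defaults.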

DnsLookupError was moved to src/dns/errors.ts (re-exported from client.ts for back-compat) so ScanBudget can subclass it without depending on the DNS client — which many tests vi.mock.

Test plan (TDD — failing test written first)

New test/orchestrator-budget.test.ts mocks the resolver layer (node:dns) and runs the real DKIM/DMARC/SPF analyzers + real DNS client + real budget:
- Budget: worst-case input (300 selectors + 100 rua URIs → ~439 uncapped queries) caps total resolver queries at the shared pool and still returns a well-formed result.
- Deadline: against a resolver that answers only after 500ms, an 80ms deadline returns partial results in <400ms instead of hanging.
- Streaming: same cap applies, and every protocol is emitted exactly once.
Gates (all green): npm test → 1302 passing, 0 failing; npm run typecheck clean; npm run lint clean.

Notes

Touches src/orchestrator.ts + src/analyzers/** → CODEOWNERS-gated; needs a code-owner review. Auto-merge intentionally not enabled.
The advisory GHSA-f828-8wf8-vqp2 remains a private draft — not published by this PR.

🤖 Generated with Claude Code

Orchestrator-level DoS umbrella for GHSA-f828-8wf8-vqp2.

scan()/scanStreaming() previously awaited Promise.all across all analyzers with only a per-query 3s DNS timeout, with no aggregate cap on total DNS queries or wall-clock. A single request combining a large DKIM selector list with a rua/ruf-stuffed _dmarc record on an attacker-controlled domain could drive a large outbound-DNS burst and long wall-clock on one rate-limit token.

Adds two backstops, both threaded through every analyzer:
- ScanBudget (src/dns/scan-budget.ts): one shared per-scan DNS-query pool that every queryTxt/queryMx/queryDoh draws from; exhaustion throws a DnsLookupError subclass so analyzers degrade to "could not verify" instead of crashing.
- Overall deadline: one AbortController + setTimeout; each settled analyzer is raced against it, degrading to its synthetic fallback on a breach. The budget also holds the signal, so no new query is issued past the deadline.

Defaults (DEFAULT_SCAN_LIMITS): 150 queries / 12s, generous for real multi-analyzer scans, overridable via an optional `limits` param (tests).

Preserves the #378 per-analyzer settle contract and scanStreaming SSE semantics (every protocol still streams exactly once).

DnsLookupError moved to src/dns/errors.ts so the budget can subclass it without depending on the DNS client (which tests mock).

Refs GHSA-f828-8wf8-vqp2. Umbrella over the per-analyzer caps in #539.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: schmug <38227427+schmug@users.noreply.github.com>

schmug force-pushed the claude/intelligent-ardinghelli-d709a1 branch from 8d7ed3d to 52778df Compare June 13, 2026 18:36

schmug merged commit a9bc678 into main Jun 15, 2026
5 checks passed

schmug deleted the claude/intelligent-ardinghelli-d709a1 branch June 15, 2026 18:12

schmug merged 1 commit into main from claude/intelligent-ardinghelli-d709a1
