security: bound scan fan-out with a shared DNS budget + overall deadline#548
Merged
Conversation
Orchestrator-level DoS umbrella for GHSA-f828-8wf8-vqp2. scan()/scanStreaming() previously awaited Promise.all across all analyzers with only a per-query 3s DNS timeout — no aggregate cap on total DNS queries or wall-clock. A single request combining a large DKIM selector list with a rua/ruf-stuffed _dmarc record on an attacker-controlled domain could drive a large outbound-DNS burst and long wall-clock on one rate-limit token. Adds two backstops, both threaded through every analyzer: - ScanBudget (src/dns/scan-budget.ts): one shared per-scan DNS-query pool that every queryTxt/queryMx/queryDoh draws from; exhaustion throws a DnsLookupError subclass so analyzers degrade to "could not verify" instead of crashing. - Overall deadline: one AbortController + setTimeout; each settled analyzer is raced against it, degrading to its synthetic fallback on a breach. The budget also holds the signal, so no new query is issued past the deadline. Defaults (DEFAULT_SCAN_LIMITS): 150 queries / 12s — generous for real multi-analyzer scans, overridable via an optional `limits` param (tests). Preserves the #378 per-analyzer settle contract and scanStreaming SSE semantics (every protocol still streams exactly once). DnsLookupError moved to src/dns/errors.ts so the budget can subclass it without depending on the DNS client (which tests mock). Refs GHSA-f828-8wf8-vqp2. Umbrella over the per-analyzer caps in #539. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: schmug <38227427+schmug@users.noreply.github.com>
8d7ed3d to
52778df
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Orchestrator-level DoS umbrella for the confirmed finding in the private draft advisory GHSA-f828-8wf8-vqp2.
scan()/scanStreaming()previously awaitedPromise.allacross all analyzers with only a per-DNS-query 3s timeout — there was no aggregate cap on total DNS queries or total wall-clock per scan, and noAbortController/deadline wrapping the orchestrator. A single request combining a large input surface (many DKIM selectors + arua/ruf-stuffed_dmarcrecord on an attacker-controlled domain) could drive a large outbound-DNS burst and long wall-clock — all on one rate-limit token.This is the umbrella behind the per-analyzer fan-out fixes (#539 capped DKIM selectors + MTA-STS body; a sibling task caps DMARC
rua/ruf). Even if a per-analyzer cap regresses, the total is now bounded.What changed
Two backstops, both threaded through every analyzer into the DNS client:
ScanBudget(src/dns/scan-budget.ts): one capped pool that everyqueryTxt/queryMx/queryDohdraws from viabudget?.consume()before issuing any outbound query. DKIM selector probes and DMARCrua/ruflookups (and SPF includes, DANE per-MX, etc.) all draw from the same pool, so total outbound DNS cannot scale with attacker input.AbortController+setTimeoutper scan. Each already-settled analyzer is raced against the signal (raceDeadline); on a breach it resolves to its synthetic fallback so the scan returns partial results instead of hanging. The budget also holds the signal, so no new query is issued once the deadline fires.Both errors (
ScanBudgetError,ScanDeadlineError) subclassDnsLookupError, so analyzers that already catch it degrade to a "could not verify" warning rather than a false "not configured"; anything uncaught is still caught by the existing#378per-analyzersettlewrapper. A breach degrades gracefully (partial results + a note), never throws, andscanStreamingstill streams every protocol exactly once (emit-once guard + post-buildScanResultMTA-STS emit preserved).Defaults
DEFAULT_SCAN_LIMITS= 150 queries / 12s wall-clock — generous for a real multi-analyzer scan (DKIM alone probes ~37 common selectors) and inside the ~10–15s target. Both are overridable via an optionallimitsparameter (used by the regression tests; production call sites are unchanged).DnsLookupErrorwas moved tosrc/dns/errors.ts(re-exported fromclient.tsfor back-compat) soScanBudgetcan subclass it without depending on the DNS client — which many testsvi.mock.Test plan (TDD — failing test written first)
test/orchestrator-budget.test.tsmocks the resolver layer (node:dns) and runs the real DKIM/DMARC/SPF analyzers + real DNS client + real budget:ruaURIs → ~439 uncapped queries) caps total resolver queries at the shared pool and still returns a well-formed result.npm test→ 1302 passing, 0 failing;npm run typecheckclean;npm run lintclean.Notes
src/orchestrator.ts+src/analyzers/**→ CODEOWNERS-gated; needs a code-owner review. Auto-merge intentionally not enabled.🤖 Generated with Claude Code