security: atomic Durable Object rate limiter (completes GHSA-v7qc-7qh8-h69g fix)#549
Merged
Conversation
Replace the non-atomic Cache-API rate-limit counter with a per-identity Durable Object. The Cache API has no atomic increment/CAS, so a concurrent burst under one identity could each read the same stale count and write count+1, letting the effective ceiling exceed the configured limit (free 10/60s, pro 60/3600s). The DO's single-threaded RPC serializes the read-modify-write, closing the window across isolates and colos. Completes the partial mitigation from PR #539, which only awaited the Cache-API write to close the intra-isolate window. - src/rate-limit-do.ts: new RateLimiterDO (SQLite-backed, synchronous read-modify-write); one instance per `ip:<x>` / `user:<id>` bucket. - src/rate-limit.ts: checkRateLimit threads an optional DO namespace and routes to it; in-memory path kept as graceful fallback when the binding is absent (self-host, Node test pool). Cache-API path removed. - src/index.ts: re-export the DO class from the entry module; middleware passes c.env.RATE_LIMITER. - wrangler.toml / src/env.ts: RATE_LIMITER binding + v1 sqlite migration. - test/integration/rate-limit-do.test.ts: N concurrent increments reach exactly N with no burst bypass; tiers, headers, isolation, window roll. - vitest.config.ts / test/shims: Node-pool stub for cloudflare:workers. Preserves RateLimitResult fields and X-RateLimit-* headers. Does NOT use the native Rate Limiting binding (no remaining/resetAt; 60s-period cap breaks the pro 3600s tier). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: schmug <38227427+schmug@users.noreply.github.com>
The threat model still described `/mcp` as having "no rate-limit middleware" and being "unmetered", but every scan-triggering route already carries `rateLimitMiddleware` (`/check`, `/api/check`, `/api/bulk-scan`, SSE, `/badge`, `/mcp`, `/api/domain/*`). `/mx/:slug` is a static provider page that never triggers a scan, so it correctly has no limiter. - E2: note /mcp is rate-limited per-IP (anon bucket), bearer still not required. - T9: reframe residual from "unmetered routes" to IP-rotation; list the metered routes; keep partially_mitigated (botnet IP rotation remains). - Remove the resolved "/mcp rate limiting?" open question. - Mark the "apply rateLimitMiddleware to every scan route" mitigation done. No code change; documentation accuracy only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: schmug <38227427+schmug@users.noreply.github.com>
schmug
added a commit
that referenced
this pull request
Jun 13, 2026
…imit write-race fix (superseded by #549) The Cache-API await fix here (GHSA-v7qc-7qh8-h69g) is only a partial mitigation; #549 fully supersedes it with an atomic Durable Object rate limiter and deletes the checkRateLimitCache path this touched. Revert the rate-limit.ts change, the rateLimitMiddleware hunk in index.ts, and the f30 test so this PR scopes cleanly to the per-analyzer caps that nothing else duplicates: DKIM selector count/length (GHSA-6fqp-4vhc-59mf) and MTA-STS policy body size (GHSA-p676-gc7j-96mx). Avoids a same-line conflict with #549's middleware rewrite. Signed-off-by: schmug <38227427+schmug@users.noreply.github.com> Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Completes the fix for the confirmed rate-limiter race condition tracked in the private advisory GHSA-v7qc-7qh8-h69g by replacing the non-atomic Cache-API counter with a per-identity Durable Object atomic counter.
PR #539 shipped a partial mitigation — it
awaits the Cache-API write before returning, closing only the intra-isolate window. This PR implements the complete fix: an atomic primitive that holds across isolates and colos.The problem
checkRateLimitCachewas a non-atomic read-modify-write on the Cloudflare Cache API:cache.match→count++→cache.put. The Cache API offers no atomic increment/CAS, so a concurrent burst under one identity could all read the same stale count and each writecount + 1, letting the effective ceiling exceed the configured limit (free 10/60s, pro 60/3600s).The fix
src/rate-limit-do.ts— newRateLimiterDO(SQLite-backed). One instance per identity bucket (ip:<x>/user:<id>) viagetByName. Theincrement(limit, windowSec)RPC does the whole read-modify-write synchronously, so the DO's single-threaded execution serializes overlapping calls — no interleaving, no lost updates.src/rate-limit.ts—checkRateLimit(identity, config, namespace?)routes to the DO when the binding is present and falls back to the existing in-memory limiter when it isn't (self-host deploys that strip the binding, and the Node test pool). The racy Cache-API path andpendingWriteare removed.src/index.ts— re-exports the DO class from the Worker entry (required for the binding to resolveclass_name); middleware passesc.env.RATE_LIMITER.wrangler.toml/src/env.ts—RATE_LIMITERbinding +v1new_sqlite_classesmigration.vitest.config.ts/test/shims/— Node-pool stub for thecloudflare:workersvirtual module (the Node tests importsrc/index.ts, which now pulls in the DO base class; the Workers pool uses the real module).Why not the native Rate Limiting binding? It returns only
{ success }(noremaining/resetAtfor theX-RateLimit-*headers) and supports only 10s/60s periods, which breaks the pro 3600s tier. A Durable Object is the correct primitive.Testing (TDD — failing concurrency test first)
test/integration/rate-limit-do.test.tsruns in the realworkerdruntime:limitare allowed.X-RateLimit-*headers,resetAtstability within a window,ip:vsuser:bucket isolation, and the window-roll reset branch (viarunInDurableObject).Gates (fresh run):
npm test→ 1307 passed (78 files, incl. 8 new DO tests)npm run typecheck→ pass (exit 0)npm run lint→ pass (161 files, no diagnostics)wrangler deploy --dry-run→ binding recognized asenv.RATE_LIMITER (RateLimiterDO), migration acceptedAcceptance criteria
checkRateLimitcalls for one identity and asserts the stored count reaches N (no burst bypass)wrangler.tomlhas the DO binding + migration; typecheck passes against the generated binding typeNotes
wrangler.toml, rate limiting) — expects code-owner review; auto-merge intentionally not enabled.Refs: GHSA-v7qc-7qh8-h69g · completes #539
Co-Authored-By: Claude Opus 4.8 (1M context) noreply@anthropic.com