Configure rate limits on VirtualMCPServer PR B 2 #5522
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5522 +/- ##
==========================================
+ Coverage 69.70% 69.72% +0.02%
==========================================
Files 645 645
Lines 65598 65627 +29
==========================================
+ Hits 45724 45758 +34
+ Misses 16530 16523 -7
- Partials 3344 3346 +2
@Sanskarzz, for context: this is about Trey's in-flight vMCP interface refactor (epic #5419, RFC THV-0076), which splits vMCP into a domain core (the

Let's park this for now. In the post-refactor vMCP, per-tool rate limiting doesn't need to be HTTP middleware at all; it fits more naturally as a

The reason it fits: the optimizer is becoming the outermost

Roughly (signatures illustrative; align with the post-#5431

// Composition root (cli/serve.go, post-refactor):
var v core.VMCP = core.New(coreCfg) // authz admission seam lives here (#5438) — below the optimizer
v = ratelimit.WrapVMCP(v, limiter) // keys the per-tool bucket on the resolved tool name
v = optimizer.WrapVMCP(v, ...) // outermost: resolves call_tool -> backend tool first
srv, _ := server.Serve(ctx, v, serverCfg)

// pkg/vmcp/ratelimit - a VMCP decorator instead of HTTP middleware:
type rateLimitedVMCP struct {
core.VMCP // embed: every other method passes through unchanged
limiter Limiter
}
func (v *rateLimitedVMCP) CallTool(
ctx context.Context, id *auth.Identity, tool string, args map[string]any, meta vmcp.Meta,
) (*vmcp.CallToolResult, error) {
// `tool` is already the backend tool name — the optimizer decorator wraps
// this one and resolved call_tool -> tool before delegating here.
if err := v.limiter.Allow(ctx, id, tool); err != nil {
return nil, err // typed rate-limit error; transport maps it to the JSON-RPC 429
}
return v.VMCP.CallTool(ctx, id, tool, args, meta)
}

The one bit of prep this needs:

If you'd like to take this on as the

CC @tgrunnagle for awareness; no action needed from you.
Summary
This PR adds optimizer-aware tool-name resolution to vMCP rate limiting.
PR #5276 wired `VirtualMCPServer.spec.config.rateLimiting` into the vMCP runtime for normal `tools/call` requests, where the parsed MCP resource ID is already the backend tool name. When the vMCP optimizer is enabled, clients call the optimizer meta-tool `call_tool`, and the real backend tool name is carried inside `arguments["tool_name"]`. Without this follow-up, per-tool rate limits are evaluated against `call_tool` instead of the backend tool the optimizer is routing to.

Fixes #4552
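To make the mismatch concrete, here is a small sketch that reads a raw `tools/call` JSON-RPC body and returns the name a per-tool bucket should be keyed on. The `toolCall` type and `rateLimitToolName` function are hypothetical names used only for illustration; the real middleware works on an already-parsed MCP request rather than on raw bytes.

```go
package main

import "encoding/json"

// toolCall mirrors only the fields of an MCP tools/call request that matter here.
type toolCall struct {
	Method string `json:"method"`
	Params struct {
		Name      string         `json:"name"`
		Arguments map[string]any `json:"arguments"`
	} `json:"params"`
}

// optimizerMetaTool is the meta-tool clients call when the optimizer is enabled.
const optimizerMetaTool = "call_tool"

// rateLimitToolName returns the tool name a per-tool bucket would be keyed on.
// It returns "" when the body is not a tools/call request.
func rateLimitToolName(body []byte) (string, error) {
	var req toolCall
	if err := json.Unmarshal(body, &req); err != nil {
		return "", err
	}
	if req.Method != "tools/call" {
		return "", nil
	}
	if req.Params.Name == optimizerMetaTool {
		// Use the inner name only when it is a non-empty string.
		if inner, ok := req.Params.Arguments["tool_name"].(string); ok && inner != "" {
			return inner, nil
		}
	}
	return req.Params.Name, nil
}
```

A direct call to `echo` and a `call_tool` request with `"tool_name": "echo"` both resolve to `echo`, so they share one bucket. A `call_tool` request whose `tool_name` is missing or not a string falls back to `call_tool`.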
Type of change
Test plan
- `task test`
- `task test-e2e`
- `task lint-fix`

API Compatibility
`v1beta1` API, OR the `api-break-allowed` label is applied and the migration guidance is described above.

This PR does not change the CRD schema or `v1beta1` API surface.

Changes
- `pkg/vmcp/cli/serve.go`: `passThroughTools` map into the vMCP rate-limit factory.
- `pkg/vmcp/ratelimit/factory/middleware.go`: `call_tool` inner-tool resolution before invoking the shared rate-limit middleware.
- `pkg/vmcp/ratelimit/factory/middleware_test.go`
- `test/e2e/thv-operator/virtualmcp/virtualmcp_rate_limiting_test.go`: `call_tool` rate limiting by inner backend tool name.

Does this introduce a user-facing change?
Yes. When the vMCP optimizer is enabled, per-tool rate limits now apply to the real backend tool name passed through `call_tool`, instead of applying to the optimizer meta-tool name.

Implementation plan
Approved implementation plan
- `passThroughTools` map computed in `pkg/vmcp/cli/serve.go`.
- `pkg/vmcp/ratelimit/factory.NewMiddleware`.
- `tools/call` requests.
- `call_tool` and `arguments["tool_name"]` is a non-empty string, shallow-copy the parsed MCP request and replace only `ResourceID` with the inner backend tool name.
- `pkg/ratelimit` middleware with that temporary parsed request so rate-limit buckets use the backend tool name.
- `call_tool` request.
- `call_tool` invocation for the same inner tool is rate-limited.

Special notes for reviewers
- `pkg/ratelimit`.
- `pkg/ratelimit.NewMiddleware` remains the owner of Redis setup, limiter construction, fail-open behavior, identity extraction, and the JSON-RPC 429 response.
- `call_tool` to the inner backend tool name before the shared rate-limit middleware checks the bucket.
- `call_tool` request.
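The shallow-copy step from the implementation plan can be sketched as a small wrapper around a rate-limit check. This is a minimal illustration under assumptions: `ParsedRequest`, `Checker`, and `withInnerToolResolution` are hypothetical names, not the types used in `pkg/vmcp/ratelimit/factory`, and the `passThroughTools` handling is left out.

```go
package main

// ParsedRequest is a hypothetical stand-in for the parsed MCP request.
type ParsedRequest struct {
	Method     string
	ResourceID string
	Arguments  map[string]any
}

// Checker stands in for the shared rate-limit check; it reports whether the request is allowed.
type Checker func(req *ParsedRequest) bool

// withInnerToolResolution wraps next so that call_tool requests are checked
// against the inner backend tool name. The caller's request is never mutated.
func withInnerToolResolution(next Checker) Checker {
	return func(req *ParsedRequest) bool {
		if req != nil && req.Method == "tools/call" && req.ResourceID == "call_tool" {
			if inner, ok := req.Arguments["tool_name"].(string); ok && inner != "" {
				resolved := *req // shallow copy: the Arguments map is shared, not cloned
				resolved.ResourceID = inner
				return next(&resolved)
			}
		}
		return next(req)
	}
}
```

Copying the struct by value is enough because only `ResourceID` changes. The limiter sees the backend tool name, while the request that continues on still carries `call_tool` as its resource ID.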