net/http/internal/http2: enable SetReuseFrames in server and Transport #2
Closed
ChrisLundquist wants to merge 6 commits into
Mirror DataFrame's existing reuse pattern for WindowUpdateFrame: add
the struct as a sibling field on frameCache, expose a nil-safe
getWindowUpdateFrame, and have parseWindowUpdateFrame populate the
cached struct in place. Reuse is opt-in via Framer.SetReuseFrames,
matching DataFrame's existing API contract; without SetReuseFrames,
parseWindowUpdateFrame returns a freshly-allocated *WindowUpdateFrame
and behavior is identical to before this CL.
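The cache pattern described above can be sketched in a few lines. This is a minimal illustration, not the code from the CL: the lowercase types, the parseWindowUpdate signature, and the error text are assumptions made so the example is self-contained; only the getWindowUpdateFrame name and the nil-safe, populate-in-place behavior come from the description.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// windowUpdateFrame is a stand-in for the real frame type (hypothetical
// layout; only the two scalar fields named in the commit message).
type windowUpdateFrame struct {
	StreamID  uint32
	Increment uint32
}

// frameCache holds one reusable struct per frame type. A nil *frameCache
// models a Framer on which reuse was never enabled.
type frameCache struct {
	windowUpdate windowUpdateFrame
}

// getWindowUpdateFrame is nil-safe: without a cache it allocates a fresh
// frame, with a cache it returns the cached struct.
func (fc *frameCache) getWindowUpdateFrame() *windowUpdateFrame {
	if fc == nil {
		return &windowUpdateFrame{}
	}
	return &fc.windowUpdate
}

// parseWindowUpdate populates the frame in place, assigning every field.
func parseWindowUpdate(fc *frameCache, streamID uint32, payload []byte) (*windowUpdateFrame, error) {
	if len(payload) != 4 {
		return nil, errors.New("WINDOW_UPDATE payload must be 4 bytes")
	}
	// The top bit of the increment is reserved and masked off.
	inc := binary.BigEndian.Uint32(payload) & 0x7fffffff
	f := fc.getWindowUpdateFrame()
	*f = windowUpdateFrame{StreamID: streamID, Increment: inc}
	return f, nil
}

func main() {
	payload := []byte{0x00, 0x00, 0x40, 0x00} // increment of 16384

	var off *frameCache // reuse disabled
	a, _ := parseWindowUpdate(off, 1, payload)
	b, _ := parseWindowUpdate(off, 3, payload)
	fmt.Println("reuse off, same pointer:", a == b)

	on := &frameCache{} // reuse enabled
	c, _ := parseWindowUpdate(on, 1, payload)
	d, _ := parseWindowUpdate(on, 3, payload)
	fmt.Println("reuse on, same pointer:", c == d, "stream:", d.StreamID, "increment:", d.Increment)
}
```

With reuse on, the second parse returns the same pointer and overwrites the first frame's fields, which is why callers must read StreamID and Increment before the next parse.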
When SetReuseFrames is in effect, both in-tree consumers extract the
StreamID and Increment fields synchronously before the next
ReadFrame call:
* On the server, (*serverConn).processWindowUpdate runs on the
serve goroutine. The readFrames goroutine that calls ReadFrame
blocks on res.readMore() until the previous frame has been fully
processed.
* On the client, (*clientConnReadLoop).processWindowUpdate runs
inline within readLoop's frame-handling switch and returns
before the next ReadFrame call.
WindowUpdateFrame parses dominate the heap-allocation profile for
large transfers: the receiver sends one WINDOW_UPDATE per ~16 KiB of
data at both the stream and connection levels.
BenchmarkParseWindowUpdateFrame (added in a later CL in this stack)
measures the cache's contribution on linux/amd64:
variant B/op allocs/op ns/op
Default 16 1 ~95
Reused 0 0 ~77 (-1 alloc, -19%)
Four tests guard the change:
TestReadFrameReusesWindowUpdate: same pointer across consecutive
WINDOW_UPDATE parses when SetReuseFrames is in effect.
TestReadFrameWindowUpdateNoAllocsWhenReused: locks in zero
allocations per parse with SetReuseFrames.
TestReadFrameWindowUpdateOverwrites: defensive — poisons the
cached struct and confirms every field is re-assigned by the
next parse.
TestReadFrameWindowUpdateDistinctWithoutReuse: pre-SetReuseFrames
API contract — distinct pointers per parse, no mutation across
calls.
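A zero-allocation check of the kind described for TestReadFrameWindowUpdateNoAllocsWhenReused could be written with testing.AllocsPerRun. Whether the actual test uses that helper is an assumption, and parseFresh, parseReused, and measureAllocs are hypothetical stand-ins for the real parse path.

```go
package main

import (
	"fmt"
	"testing"
)

// windowUpdateFrame is a stand-in for the real frame type.
type windowUpdateFrame struct {
	StreamID  uint32
	Increment uint32
}

// sink keeps results reachable so the fresh frame must live on the heap.
var sink *windowUpdateFrame

// parseFresh models the default path: one heap allocation per parse.
func parseFresh(id, inc uint32) *windowUpdateFrame {
	return &windowUpdateFrame{StreamID: id, Increment: inc}
}

// parseReused models the cached path: overwrite the struct in place.
func parseReused(cached *windowUpdateFrame, id, inc uint32) *windowUpdateFrame {
	*cached = windowUpdateFrame{StreamID: id, Increment: inc}
	return cached
}

// measureAllocs reports the average allocations per parse for each variant.
func measureAllocs() (fresh, reused float64) {
	var id uint32
	fresh = testing.AllocsPerRun(1000, func() {
		id++
		sink = parseFresh(id, 16384)
	})
	cached := new(windowUpdateFrame) // allocated once, outside the measured loop
	reused = testing.AllocsPerRun(1000, func() {
		id++
		sink = parseReused(cached, id, 16384)
	})
	return fresh, reused
}

func main() {
	fresh, reused := measureAllocs()
	fmt.Printf("fresh: %.0f allocs/parse, reused: %.0f allocs/parse\n", fresh, reused)
}
```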
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change-Id: Ibcef724f7bc2ee2bd1a74cb2577d61e3117a872e
Mirror DataFrame's existing reuse pattern for HeadersFrame: add it
as a sibling field on frameCache, expose a nil-safe getHeadersFrame,
and have parseHeadersFrame populate the cached struct in place.
Reuse is opt-in via Framer.SetReuseFrames, matching DataFrame and
the prior WindowUpdateFrame CL on this branch; without
SetReuseFrames, parseHeadersFrame allocates a fresh *HeadersFrame
and behavior is identical to before this CL.
The explicit `*hf = HeadersFrame{FrameHeader: fh}` reset at the top
of parseHeadersFrame zeros every field so values from a previous
parse (Priority, headerFragBuf, Flags carried by FrameHeader)
cannot leak into the next caller. The later flag-gated assignments
fill Priority only when FlagHeadersPriority is set.
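The effect of the wholesale reset can be seen in a small sketch. The types and the parseHeaders signature below are simplified, hypothetical stand-ins; only the reset-then-conditionally-assign shape is taken from the description. The flag value 0x20 is the HEADERS PRIORITY flag from the HTTP/2 specification.

```go
package main

import "fmt"

const flagPriority = 0x20 // HEADERS frame PRIORITY flag

type frameHeader struct {
	Flags    uint8
	StreamID uint32
}

type priorityParam struct {
	StreamDep uint32
	Weight    uint8
}

// headersFrame is a simplified stand-in for the real HeadersFrame.
type headersFrame struct {
	frameHeader
	Priority priorityParam
	fragment []byte
}

// parseHeaders resets the (possibly cached) struct before filling it, so
// nothing from an earlier parse survives unless it is assigned again.
func parseHeaders(hf *headersFrame, fh frameHeader, prio priorityParam, frag []byte) {
	*hf = headersFrame{frameHeader: fh} // zeroes Priority and fragment
	if fh.Flags&flagPriority != 0 {
		hf.Priority = prio
	}
	hf.fragment = frag
}

func main() {
	cached := &headersFrame{}

	// First frame carries priority information.
	parseHeaders(cached, frameHeader{Flags: flagPriority, StreamID: 1},
		priorityParam{StreamDep: 7, Weight: 42}, []byte("first"))
	fmt.Printf("after parse 1: %+v\n", cached.Priority)

	// Second frame has no PRIORITY flag; the reset clears the stale value.
	parseHeaders(cached, frameHeader{StreamID: 3}, priorityParam{}, []byte("second"))
	fmt.Printf("after parse 2: %+v\n", cached.Priority)
}
```

Without the reset line, the second parse would leave Priority at the first frame's value, which is the leak the defensive test is meant to catch.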
Reuse safety, when SetReuseFrames is in effect:
* In net/http's internal use of this package, the *HeadersFrame
returned by parseHeadersFrame is wrapped into a MetaHeadersFrame
by readMetaFrame within the same ReadFrame call. The
MetaHeadersFrame's embedded *HeadersFrame is then invalidated
(Header().valid = false) before the next ReadFrame.
* The headerFragBuf slice already aliases the framer's read
buffer in both the cached and fresh variants, matching the
existing DataFrame contract.
Every request on every HTTP/2 connection triggers exactly one
parseHeadersFrame call on each side, so when SetReuseFrames is in
effect this allocation is removed from the hottest of hot paths.
BenchmarkParseHeadersFrame (added in a later CL in this stack)
measures the cache's contribution on linux/amd64:
variant B/op allocs/op ns/op
Default 48 1 ~125
Reused 0 0 ~86 (-1 alloc, -31%)
Three tests guard the change:
TestReadFrameReusesHeadersFrame: same *HeadersFrame pointer
across consecutive HEADERS parses when SetReuseFrames is in
effect.
TestReadFrameHeadersOverwrites: defensive — parses a HEADERS
frame with Priority + padding then a HEADERS frame with neither,
asserts Priority and Flags do not bleed.
TestReadFrameHeadersDistinctWithoutReuse: pre-SetReuseFrames API
contract — distinct *HeadersFrame pointers per parse, no
mutation across calls.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change-Id: Ieacc0f4cb14454169d87a533b9ab06e01142f809
Mirror DataFrame's existing reuse pattern for the MetaHeadersFrame
wrapper produced by readMetaFrame: store it as a sibling field on
frameCache, expose a nil-safe getMetaHeadersFrame, and have
readMetaFrame populate the cached struct in place. Reuse is opt-in
via Framer.SetReuseFrames, matching DataFrame, WindowUpdateFrame,
and HeadersFrame on this branch; without SetReuseFrames,
readMetaFrame allocates a fresh *MetaHeadersFrame and behavior is
identical to before this CL.
The explicit `*mh = MetaHeadersFrame{HeadersFrame: hf}` reset at the
top of readMetaFrame ensures Fields, Truncated, and any future-added
field are zeroed so values from a previous (possibly aborted) parse
cannot leak into the next caller.
Reuse safety, when SetReuseFrames is in effect:
* The two consumers ((*serverConn).processHeaders and
(*clientConnReadLoop).processHeaders) extract PseudoValue,
RegularFields, and Truncated synchronously and never retain
the *MetaHeadersFrame past the next ReadFrame.
* The embedded HeadersFrame is the cached one from the prior CL
on this branch; the *MetaHeadersFrame wrapper is one more level
of the same sharing, with the same single-Framer-goroutine
safety.
BenchmarkReadMetaFrame (added in a later CL in this stack) measures
the combined HF+MHF cache contribution on linux/amd64:
variant B/op allocs/op ns/op
Default 472 9 ~1062
Reused 376 7 ~1004 (-2 allocs, -5%)
The -2 allocs reflect the *HeadersFrame (saved by the HF CL) and the
*MetaHeadersFrame (saved here). The remaining 7 allocations come
from HPACK Fields-slice growth and the SetEmitFunc closure, which
are addressed by separate CLs not part of this branch.
Three tests guard the change:
TestReadMetaFrameReusesMetaHeadersFrame: same *MetaHeadersFrame
pointer across consecutive parses when SetReuseFrames is in
effect.
TestReadMetaFrameMetaHeadersOverwrites: defensive — first parse
triggers Truncated, second parse fits; Truncated does not leak.
TestReadMetaFrameDistinctWithoutReuse: pre-SetReuseFrames API
contract — distinct *MetaHeadersFrame pointers per parse.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change-Id: I93e6916075b1110137eada96b8017bc38b700165
Add direct parser microbenchmarks that exercise the
SetReuseFrames-enabled paths so the reuse savings introduced earlier
in this stack are observable per-cache. Each has Default and Reused
subtests so benchstat can attribute the delta to the cache rather
than to other parts of the parse path:
* BenchmarkParseDataFrame (preexisting DataFrame cache)
* BenchmarkParseWindowUpdateFrame (new on this branch)
* BenchmarkParseHeadersFrame (new on this branch)
* BenchmarkReadMetaFrame (new HeadersFrame + MetaHeadersFrame caches)
BenchmarkParseDataFrame is included so the contribution of the
preexisting DataFrame cache can be compared directly against the
caches added in this stack; without it, a reviewer has no reference
point.
Microbench, linux/amd64:
bench Default Reused
ParseDataFrame 48 B, 1 alloc 0 B, 0 alloc
ParseWindowUpdateFrame 16 B, 1 alloc 0 B, 0 alloc
ParseHeadersFrame 48 B, 1 alloc 0 B, 0 alloc
ReadMetaFrame 472 B, 9 allocs 376 B, 7 allocs
Each parse-path cache saves the same shape (1 alloc per parse). The
ReadMetaFrame Reused result reflects both the HeadersFrame cache
(saving 1 alloc) and the MetaHeadersFrame cache (saving 1 alloc). The
remaining 7 allocations come from HPACK Fields-slice growth and the
SetEmitFunc closure; those are out of scope for this branch.
The package's existing end-to-end download benchmarks do not exercise
the SetReuseFrames path because no in-tree caller opts in. Wiring
SetReuseFrames into transport.go/server.go (whether via a test-only
hook or a public knob) is a separate change and is deliberately not
part of this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change-Id: Ic3651e52baae0417e40d2e31bdd5bdee6b4e6ce9
…etention contract
The SetReuseFrames contract is that frames returned by ReadFrame, and
any slice aliasing the Framer's read buffer reachable from that frame,
are only valid until the next ReadFrame call. A retention bug under
reuse manifests as one goroutine reading retained data while the
reader goroutine writes into the same backing memory on the next
ReadFrame.
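The contract can be illustrated without the real Framer. The sketch below uses a hypothetical reusingReader whose next method overwrites one backing buffer on every call, which is the aliasing behavior the contract describes; it runs on a single goroutine, so it shows the overwrite itself rather than a data race.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// reusingReader is a hypothetical stand-in for a Framer with reuse enabled:
// each call to next overwrites the same backing buffer.
type reusingReader struct {
	r   io.Reader
	buf []byte
}

// next reads n bytes into the shared buffer and returns a slice aliasing it.
// The returned slice is only valid until the following call to next.
func (rr *reusingReader) next(n int) ([]byte, error) {
	if cap(rr.buf) < n {
		rr.buf = make([]byte, n)
	}
	rr.buf = rr.buf[:n]
	if _, err := io.ReadFull(rr.r, rr.buf); err != nil {
		return nil, err
	}
	return rr.buf, nil
}

// demo retains one slice in violation of the contract and one copy made in
// accordance with it, then performs the next read.
func demo() (retained, copied string, err error) {
	rr := &reusingReader{r: strings.NewReader("AAAABBBB")}
	first, err := rr.next(4)
	if err != nil {
		return "", "", err
	}
	safe := append([]byte(nil), first...) // copy before the next read
	if _, err := rr.next(4); err != nil {
		return "", "", err
	}
	return string(first), string(safe), nil
}

func main() {
	retained, copied, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println("retained slice now reads:", retained) // overwritten by the second read
	fmt.Println("copied slice still reads:", copied)
}
```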
The existing reuse tests assert pointer/slice-header stability across
ReadFrame calls but do not exercise the concurrent reader/consumer
shape that the server uses (serverConn.readFrames feeding the serve
goroutine over a channel gate) and do not reach the production
process* paths at all. Add three complementary tests covering both
levels.
frame_reuse_race_test.go (package http2):
TestFrameReuseRaceCorrect (positive control)
Models the gated handoff: a reader goroutine calls ReadFrame and
delivers the result over a channel, the consumer fully consumes
the frame's payload (DataFrame.Data, HeadersFrame.HeaderBlockFragment,
MetaHeadersFrame.Fields, WindowUpdateFrame.Increment) and only then
signals the per-frame gate so the reader may proceed. Drives 800
frames across DATA, WINDOW_UPDATE, HEADERS, and HEADERS+CONTINUATION
(the meta-headers path). Under -race must pass; future changes that
eagerly mutate the cached frame or its backing buffer before the
consumer signals done would fail this test.
TestFrameReuseRaceAdversarial (negative control)
Same plumbing, but the consumer deliberately leaks the retained
slice into a sidecar goroutine and closes the gate immediately.
All payloads are size-stable so the framer's readBuf stops growing
and the retained slice keeps aliasing the buffer the next ReadFrame
writes into. Guarded behind H2_REUSE_RACE_NEGATIVE=1 so default
go test skips it; the assertion is the race detector itself. The
DataFrame.Data and HeadersFrame.HeaderBlockFragment cases fire
the detector under -race; the MetaHeadersFrame case is retained
for contract documentation but does not race because hpack
allocates independent strings (see consumeHeaderFieldWork
docstring).
frame_reuse_e2e_test.go (package http2_test):
TestFrameReuseEndToEndStress
Drives concurrent multiplexed requests through the real net/http
HTTP/2 server and Transport (both of which call SetReuseFrames
on their per-connection Framer) and touches every field
reachable from a request/response in ways that would race
against the read loop's next ReadFrame if anything still aliased
the cached frame after the readMore gate (server) or read-loop
iteration (Transport). 8 workers x 25 requests, each exercising
headers, body, and trailers in both directions. Under -race this
is a regression test against future refactors of processHeaders
/ handleResponse / processTrailers / processData that fail to
copy aliased data. The synthetic Framer-pair tests cannot reach
the production process* code paths; this one does.
Without -race, TestFrameReuseRaceCorrect remains a useful integration
test of the gated reader/consumer pattern, and
TestFrameReuseEndToEndStress remains a smoke test of the server +
Transport multiplexed path.
No production-code behavior change; tests only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change-Id: I8a86eb2440618c2f85f3d823ceabcf789cbb596c
Call Framer.SetReuseFrames once on the per-connection Framer in both
serverConn (serveConn) and Transport (newClientConn) so the parsed
*DataFrame, *WindowUpdateFrame, *HeadersFrame, and *MetaHeadersFrame
structs returned by ReadFrame are reused across calls instead of
being heap-allocated each time. SetReuseFrames has shipped as an
opt-in Framer method since CL 34812 (golang/net, 2017) but no in-tree
caller previously opted in, so the cache was effectively dead code
from the standard library's perspective; this CL turns it on.
A new GODEBUG setting, http2reuseframes, is registered as a compat
escape hatch in line with the existing http2client / http2server /
http2debug settings:
- default (or http2reuseframes=1): reuse on, this CL's behavior.
- GODEBUG=http2reuseframes=0: reuse disabled, pre-CL behavior.
If a deployment encounters an issue the static audit and -race
coverage didn't catch, operators can flip the setting without
recompiling.
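The on/off decision can be sketched as a small parser over the GODEBUG string. This is only an illustration of the semantics listed above: the standard library reads GODEBUG settings through its internal godebug package, which is not importable here, and reuseFramesEnabled is a hypothetical helper.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// reuseFramesEnabled reports whether frame reuse should be on for the given
// GODEBUG string. Reuse is on by default and only http2reuseframes=0 turns
// it off. If the key appears more than once, the last occurrence wins.
func reuseFramesEnabled(godebug string) bool {
	enabled := true
	for _, kv := range strings.Split(godebug, ",") {
		key, value, ok := strings.Cut(kv, "=")
		if ok && key == "http2reuseframes" {
			enabled = value != "0"
		}
	}
	return enabled
}

func main() {
	fmt.Println("reuse enabled:", reuseFramesEnabled(os.Getenv("GODEBUG")))
}
```

Running the program as `GODEBUG=http2reuseframes=0 go run .` would report that reuse is disabled.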
Per-frame allocation savings, microbench numbers (linux/amd64, from
the BenchmarkParse* and BenchmarkReadMetaFrame benchmarks added
earlier in this stack):
bench Default Reused
ParseDataFrame 48 B, 1 alloc 0 B, 0 alloc
ParseWindowUpdateFrame 16 B, 1 alloc 0 B, 0 alloc
ParseHeadersFrame 48 B, 1 alloc 0 B, 0 alloc
ReadMetaFrame 472 B, 9 allocs 376 B, 7 allocs
End-to-end allocation reductions on the package's existing read-path
benchmarks (-count=10, benchstat master vs branch):
bench allocs/op delta p
ClientGzip -59.20% 0.000
DownloadFrameSize/16k -46.73% 0.000
DownloadFrameSize/64k -49.00% 0.000
DownloadFrameSize/128k -49.55% 0.000
DownloadFrameSize/256k -49.87% 0.000
DownloadFrameSize/512k -49.98% 0.000
ClientRequestHeaders/0 -8.00% 0.000
ClientResponseHeaders/0 -8.00% 0.000
ClientRequestHeaders/10 -5.41% 0.000
ClientResponseHeaders/10 -3.48% 0.000
geomean (allocs/op) -19.46%
Write-side benchmarks (WriteScheduler*, WriteQueue) are unchanged, as
expected; those paths do not exercise ReadFrame. ClientGzip latency
also improves by 2.74% (p=0.023). The remaining allocations in
ReadMetaFrame come from HPACK Fields-slice growth and the
SetEmitFunc closure, which this CL does not address.
Reuse safety, by frame type:
DataFrame
serverConn.processData and clientConnReadLoop.processData read
Length, StreamID, StreamEnded, and the Data() slice synchronously
before returning. The bytes from Data() flow through
{server,client}Conn.body / cs.bufPipe -> dataBuffer.Write, which
copies into a pool-allocated chunk; no slice retained past
ReadFrame.
WindowUpdateFrame
serverConn.processWindowUpdate and clientConnReadLoop.processWindowUpdate
read only the scalar StreamID and Increment fields. The struct
type has no slice fields, so there is nothing to alias the read
buffer.
HeadersFrame
Both server and Transport set Framer.ReadMetaHeaders during init,
so a bare *HeadersFrame is never delivered to consumer code; the
Framer always returns *MetaHeadersFrame on a HEADERS frame.
readMetaFrame clears MetaHeadersFrame.HeadersFrame.headerFragBuf
and calls invalidate() on the embedded *HeadersFrame before
returning. The aliased frag buf is therefore not exposed past
readMetaFrame.
MetaHeadersFrame
The Fields slice is freshly allocated per parse: readMetaFrame
does *mh = MetaHeadersFrame{HeadersFrame: hf} (Fields zeroed to
nil), then the HPACK emit callback grows it via append, so each
returned MetaHeadersFrame has its own backing array. HPACK
Name/Value strings are independently allocated by the decoder
(hpack.decodeString returns string(u.b) / buf.String(), both of
which copy), so no string aliases the read buffer.
serverConn.processHeaders / processTrailerHeaders and
clientConnReadLoop.processHeaders / handleResponse / processTrailers
iterate Fields synchronously on the read-loop / serve goroutine
and copy strings into a fresh http.Header before returning.
No code stores *MetaHeadersFrame past the dispatch.
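The MetaHeadersFrame argument rests on two Go properties: converting a byte slice to a string copies the bytes, and appending to a nil slice allocates a new backing array. A minimal sketch, with headerField and decodeField as hypothetical stand-ins for the HPACK decoder's output:

```go
package main

import "fmt"

type headerField struct{ Name, Value string }

// decodeField is a hypothetical stand-in for an HPACK decode step. The
// string conversions copy the bytes out of the shared read buffer.
func decodeField(name, value []byte) headerField {
	return headerField{Name: string(name), Value: string(value)}
}

// demo decodes one field per "frame" from the same read buffer, overwriting
// the buffer between parses the way a reusing Framer would.
func demo() (first, second []headerField) {
	readBuf := []byte("hostexample.com")

	// Fields starts out nil, as after the wholesale reset, so append
	// allocates a backing array that belongs to this parse only.
	first = append(first, decodeField(readBuf[:4], readBuf[4:]))

	copy(readBuf, "pathexample.org") // the next frame overwrites the buffer
	second = append(second, decodeField(readBuf[:4], readBuf[4:]))
	return first, second
}

func main() {
	first, second := demo()
	fmt.Printf("first parse:  %+v\n", first)
	fmt.Printf("second parse: %+v\n", second)
}
```

The first parse's field keeps its original name and value even though the buffer it was decoded from has been overwritten.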
Server-side gating: readFrames -> readFrameCh -> serve calls
processFrameFromReader synchronously and only then invokes
readMore(), which unblocks readFrames for the next ReadFrame. So
even though the server consumes frames on a different goroutine
from the one calling ReadFrame, every frame is fully consumed
before the cache is overwritten.
Transport-side gating: clientConnReadLoop.run consumes each frame
synchronously on the read-loop goroutine before the next
ReadFrame, and what escapes to the RoundTrip / response-body
goroutines is either copied (DataFrame -> bufPipe) or composed of
immutable, independently-allocated Go strings (Header maps).
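The server-side gating described above can be modeled with two goroutines, a result channel, and a gate. This is a simplified sketch under stated assumptions: frame, readResult, readFrames, and serve are hypothetical names loosely echoing the description, and the consumer copies the payload into a string before releasing the gate.

```go
package main

import "fmt"

// frame is a stand-in for a cached, reusable frame.
type frame struct{ payload []byte }

// readResult pairs a frame with the function the consumer must call once it
// has finished with that frame.
type readResult struct {
	f        *frame
	readMore func()
}

// readFrames models the reader goroutine: it reuses one frame and one
// buffer, and blocks on the gate until the consumer releases it.
func readFrames(inputs []string, out chan<- readResult) {
	cached := &frame{}
	buf := make([]byte, 0, 64)
	gate := make(chan struct{})
	for _, in := range inputs {
		buf = append(buf[:0], in...) // overwrite the shared buffer
		cached.payload = buf
		out <- readResult{f: cached, readMore: func() { gate <- struct{}{} }}
		<-gate // wait until the consumer is done with the frame
	}
	close(out)
}

// serve models the consumer goroutine: it fully consumes each frame, copying
// what it needs, and only then lets the reader continue.
func serve(inputs []string) []string {
	ch := make(chan readResult)
	go readFrames(inputs, ch)
	var got []string
	for res := range ch {
		got = append(got, string(res.f.payload)) // copy before releasing
		res.readMore()
	}
	return got
}

func main() {
	fmt.Println(serve([]string{"DATA-1", "HEADERS-2", "WINDOW_UPDATE-3"}))
}
```

Calling readMore before copying the payload, or handing res.f.payload to another goroutine, would break the contract in the way the adversarial test is designed to expose.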
While here, defensively clear GoAwayFrame.debugData after copying
it to cc.goAwayDebug in processGoAway. GoAwayFrame is not in
frameCache today so the retained cc.goAway pointer is safe, but
this Transport already stores a *GoAwayFrame across ReadFrame
calls; clearing the only field that aliases the read buffer
prevents a future addition of GoAwayFrame to the reuse cache from
silently turning cc.goAway.DebugData() into a use-after-overwrite.
Verification:
net/http/internal/http2: go test -race -count=30 PASS (8m13s)
net/http/internal/http2: go test -race -count=10 -cpu=1,2,4,8
PASS (6m17s)
net/http: go test -race -count=5 PASS (2m15s)
TestFrameReuseRaceCorrect -race -count=200 PASS (5s)
TestFrameReuseRaceAdversarial under
H2_REUSE_RACE_NEGATIVE=1 -race -count=10 RACE (10/10)
TestFrameReuseEndToEndStress -race PASS
GODEBUG=http2reuseframes=0 -race
(TestFrameReuseEndToEndStress, TestFrameReuseRaceCorrect)
PASS
The adversarial test deliberately violates the reuse contract and
asserts the race detector fires; the other runs validate the
production code paths both with reuse on (the default) and with
reuse disabled via GODEBUG.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change-Id: I9d6d0c2e761b0901314c25f362c99062260b31a7
Superseded by golang#79634. This fork-internal PR was for pre-review
This stack extends Framer.SetReuseFrames to cover *WindowUpdateFrame,
*HeadersFrame, and *MetaHeadersFrame (in addition to the existing
*DataFrame), then calls SetReuseFrames once on the per-connection
Framer in serverConn and Transport.newClientConn. Result: ~50% fewer
per-op allocations on the package's read-path benchmarks
(BenchmarkDownloadFrameSize, BenchmarkClientGzip), with no measurable
change on write-path benchmarks and no API change.
A new GODEBUG=http2reuseframes=0 setting is registered alongside
http2client / http2server / http2debug and reverts to the pre-CL
allocate-per-frame behavior.
Background
Framer.SetReuseFrames has shipped as an opt-in Framer method since
golang/net CL 34812 (commit bb807669, 2017-02-27, fixing golang#18502). It
was added for gRPC-Go, whose Framer profiling showed parseDataFrame
accounting for ~66% of allocations (1.4 GB of 2.2 GB) on a 64-conn /
100-stream / 900K RT/sec benchmark. gRPC-Go calls SetReuseFrames
directly on the Framer it owns.
No in-tree caller in the standard library has ever opted in, so the
cache has been effectively dead code from the stdlib's perspective
for nine years. This CL turns it on.
Prior attempts and reviewer concerns
This is the third attempt to enable frame reuse on the stdlib's
HTTP/2 server/Transport:
* Issue golang/go#68352, "x/net/http2: improve frame decoding
performance by reusing frames" (still open, 2024-07-09): a request
to improve frame decoding performance. neild rejected a user-facing
config option ("There's no need for a user-configurable option here.
If there are performance improvements that can be made to frame
decoding in the HTTP/2 implementation, we can just make them.")
but left the door open for internal enablement.
* golang/net CL 597455 (PR golang/net#216, "x/http/net: enable
SetFrameReuse in server and client connections", 2024-07-10): tried
to enable reuse on both server and Transport unconditionally.
neild's blocking review: "this is not correct. The server read
goroutine (serverConn.readFrames) reads frames and delivers them to
the main server goroutine for processing," and "Changing the HTTP/2
server and/or transport to use the frame cache is likely to
require some significant refactoring." Stale; never landed.
* golang/net CL 676235 (server-only, 2025-05-25): neild Hold+1 on
2025-06-24: "this is not a safe change for the server.
SetReuseFrames causes each frame to be invalidated by the next
call to ReadFrame, but the server processes frames concurrently
with ReadFrame calls." Abandoned 2025-10-24.
The blocker on both prior CLs was correctness, specifically retention
of frame data past the next ReadFrame. This CL addresses that with a
static retention audit, a race-detector test that exercises the gated
handoff, an end-to-end stress test under -race, and a GODEBUG escape
hatch in case the analysis missed something.
Stack layout
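Six commits, each independently reviewable:
1. WindowUpdateFrame cache
2. HeadersFrame cache
3. MetaHeadersFrame cache
4. Parser microbenchmarks (Default vs. Reused)
5. Retention-contract tests (positive control, negative
control, end-to-end stress)
6. Enable SetReuseFrames in server and Transport; add the
http2reuseframes GODEBUG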
Commits 1-3 mirror the existing *DataFrame cache pattern from CL
34812: wholesale "*p = T{...}" reset at the top of each parse
function, slot in frameCache, nil-safe getter (getWindowUpdateFrame,
getHeadersFrame, getMetaHeadersFrame). No behavior change without
SetReuseFrames; the caches are dormant unless opted in.
Safety analysis
Static retention audit across every code path that consumes a
reusable *Frame, plus a separate concurrency audit. Per frame type:
*DataFrame
f.Data() is copied into dataBuffer.Write
(databuffer.go:126 copy(chunk[b.w:], p)) before readMore. The
pool-allocated chunk has independent backing.
*WindowUpdateFrame
Struct has only scalar fields. Safe by construction.
*HeadersFrame
Both server and Transport set Framer.ReadMetaHeaders during init
(server.go:322, transport.go:663), so a bare *HeadersFrame never
reaches consumer code. readMetaFrame (frame.go:1868-1869) clears
headerFragBuf=nil and invalidates the embedded *HeadersFrame
before returning.
*MetaHeadersFrame
readMetaFrame:1772 does "*mh = MetaHeadersFrame{HeadersFrame: hf}",
zeroing Fields and Truncated. Append grows a fresh backing array
per parse. hpack.decodeString (vendor/.../hpack.go:509-520)
returns string(u.b) or buf.String(), both of which copy, so
Name/Value strings are independently allocated and do not alias
readBuf. Server's processHeaders / newWriterAndRequest and
Transport's handleResponse / processTrailers iterate Fields
synchronously and copy into fresh http.Header maps before
returning. No code stores *MetaHeadersFrame past dispatch.
Concurrency:
Server: serverConn.readFrames (server.go:692-711) sends each
parsed frame on readFrameCh then blocks on gate. The serve
goroutine receives, calls processFrameFromReader synchronously,
then res.readMore() (server.go:854), which releases the gate.
The single readMore call site fires only after the synchronous
dispatch returns, so the cache is never overwritten while
consumer code still references it.
Transport: clientConnReadLoop.run (transport.go:2046-2090) reads
frames in a tight loop with no per-frame gate, so each process*
function must fully consume aliased data before returning.
Verified: every byte that escapes the loop is either copied
(DataFrame -> bufPipe) or composed of immutable,
independently-allocated Go strings (Header maps populated from
hpack-decoded strings).
Spawned goroutines: runHandler (server) only sees the fresh
*ServerRequest whose Header/Trailer/URL are independently
allocated. writeFrameAsync, body-close, Shutdown, ping writers
don't touch read-side cached frames.
Defensive cleanup: GoAwayFrame.debugData is cleared after copying it
to cc.goAwayDebug in processGoAway. GoAwayFrame is not in frameCache
today so the retained cc.goAway=f pointer is safe, but the Transport
already stores a *GoAwayFrame across ReadFrame calls; clearing the
only field that aliases the read buffer prevents a future addition of
GoAwayFrame to the reuse cache from silently turning
cc.goAway.DebugData() into a use-after-overwrite.
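The defensive cleanup amounts to copying the debug data into a string and then dropping the slice that aliases the read buffer. A minimal sketch, with goAwayFrame and clientConn as simplified, hypothetical stand-ins for the real types:

```go
package main

import "fmt"

// goAwayFrame is a simplified stand-in; debugData aliases the read buffer.
type goAwayFrame struct {
	LastStreamID uint32
	debugData    []byte
}

// clientConn keeps the frame pointer and a private copy of the debug data.
type clientConn struct {
	goAway      *goAwayFrame
	goAwayDebug string
}

// processGoAway retains the frame but copies the debug data and then clears
// the aliasing slice, so later buffer reuse cannot be observed through it.
func (cc *clientConn) processGoAway(f *goAwayFrame) {
	cc.goAway = f
	cc.goAwayDebug = string(f.debugData) // string conversion copies the bytes
	f.debugData = nil                    // drop the alias to the read buffer
}

func main() {
	readBuf := []byte("too_many_pings")
	cc := &clientConn{}
	cc.processGoAway(&goAwayFrame{LastStreamID: 5, debugData: readBuf})

	copy(readBuf, "XXXXXXXXXXXXXX") // a later read overwrites the buffer

	fmt.Println("saved debug data:", cc.goAwayDebug)
	fmt.Println("frame still aliases buffer:", cc.goAway.debugData != nil)
}
```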
Benchmarks
-count=10 -benchtime=1s -benchmem, linux/amd64 i7-6770HQ @ 2.60GHz,
master vs h2-frame-cache, via benchstat.
Allocations per op (the main win):
Write-side benchmarks (WriteScheduler*, WriteQueue) are unchanged
within noise; those paths don't touch ReadFrame.
Latency / throughput (sec/op):
Per-cache attribution from microbenchmarks added in commit 4:
The residual 7 allocations in ReadMetaFrame/Reused come from HPACK
Fields-slice growth and the SetEmitFunc closure; out of scope for
this stack.
Test plan
Tests added in commit 5 and verified against the enablement commit:
frame_reuse_race_test.go (package http2, synthetic Framer-pair tests):
TestFrameReuseRaceCorrect: positive control. Models
serverConn.readFrames with a reader goroutine, a channel, a
per-frame gate, and a consumer that fully consumes payload
before signalling done. 800 frames across DATA, WINDOW_UPDATE,
HEADERS, HEADERS+CONTINUATION. Must PASS under -race.
TestFrameReuseRaceAdversarial: negative control, gated behind
H2_REUSE_RACE_NEGATIVE=1. Same plumbing, but deliberately leaks
slices into a sidecar goroutine across the next ReadFrame. Must
RACE under -race to verify the detector functions for this
pattern.
frame_reuse_e2e_test.go (package http2_test, real server + Transport):
TestFrameReuseEndToEndStress: drives concurrent multiplexed
requests through the real httptest-based HTTP/2 server and
net/http.Transport, both running with SetReuseFrames enabled.
Handler and client touch every byte of every header name/value
and trailer in both directions, and drain bodies. Regression test
against future refactors of processHeaders / handleResponse /
processTrailers / processData that fail to copy aliased data.
Validation runs (linux/amd64):
GODEBUG escape hatch
Registered alongside the existing http2 GODEBUGs in
src/internal/godebugs/table.go:
Documented in doc/godebug.md under the Go 1.27 section. The intent is
a compat hatch, not a tuning knob: operators can disable without
recompiling if the static analysis and race coverage missed
something. This addresses the safety concerns from prior CLs while
staying consistent with the preference against user-facing
performance knobs (this is purely a safety/compat escape).
Compatibility
parsed frame values, only the per-call heap allocation is
elided.
and SetReuseFrames method remain unreachable from outside the
stdlib.
gating (mirroring http2client / http2server) since there is no
language-visible behavior change to anchor to a go directive
version.
Updates golang#18502
Updates golang#68352