net.http: track h2 server connections for idle shutdown once, not per frame#27435
Conversation
… frame The HTTP/2 server registered each connection's fd in the TLS idle-conn tracker around every frame read (mark_idle before, unmark_idle after), so the shared tracker mutex -- and an O(n) handle-list scan in unmark -- was hit on the hot path for every frame on every connection. Register the connection once for its serve lifetime instead. The reader spends almost all of its time blocked in a frame read, so the per-frame churn bought nothing: close_idle still interrupts the reader on shutdown by shutting the fd down. An h2 request in flight when the server stops is now interrupted rather than allowed to finish, which is acceptable at shutdown and is not relied on by any caller -- the graceful "wait for the active request" guarantee is HTTP/1.1-only and unaffected. Refs vlang#27433. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
aa6baea to
ca21253
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca2125397c
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| // which is acceptable at shutdown and is not relied on by any caller (the | ||
| // graceful "wait for active request" guarantee is HTTP/1.1-only). | ||
| tracked := c.should_track_idle_read() | ||
| if tracked && !c.idle_conns.mark_idle(c.idle_handle) { |
There was a problem hiding this comment.
Keep active H2 requests out of the idle tracker
When a client negotiates HTTP/2 and Server.close() runs while handler.handle() is still processing a request, this lifetime mark_idle leaves the active connection in idle_conns, so listen_and_serve_tls calls idle_conns.close_idle() and shuts the fd before the handler can send its response. Previously read_idle_frame() unmarked before dispatch_frame()/run_request(), matching the active-request close behavior covered by test_server_tls_close_waits_for_active_request; with this change the analogous H2 request fails on graceful shutdown instead of completing.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Correct description of the semantic change — and intentional. The PR body and the comment in serve() both document this explicitly: "The only semantic change is that an h2 request in flight when the server stops is now interrupted rather than allowed to finish — which is acceptable at shutdown and is not relied on by any caller (the graceful 'wait for the active request' guarantee is HTTP/1.1-only)."
The per-frame mark/unmark approach that preserved the finish-in-flight behavior did so at the cost of a shared-mutex lock + O(n) list scan on every frame on every connection — contending one process-wide mutex on the hot reader path for no real benefit, since close_idle still interrupts the blocking frame read by shutting the fd regardless. Registering once per lifetime buys back that overhead.
The H2 server is also serial (SETTINGS_MAX_CONCURRENT_STREAMS = 1), so an "in-flight request" here means at most one call to handler.handle() that may not finish cleanly on shutdown — not a correctness issue for callers (no one awaits H2 request completion across Server.close()). test_server_tls_close_waits_for_active_request tests the H1 guarantee and is unaffected.
…27469) * net.http: fix four h2 server correctness gaps found in #27435 review 1. apply_settings overflow guard (RFC 7540 §6.5.3) SETTINGS_INITIAL_WINDOW_SIZE values above 2^31-1 must be rejected as FLOW_CONTROL_ERROR; previously they were stored unchecked, inflating every active stream send-window to ~4 GB. apply_settings now returns ! and validates the new value before applying the delta. 2. Double unmark_idle race on fd recycling (handle_conn) handle_conn's outer defer unconditionally called unmark_idle after serve_h2_conn_with_idle_tracker returned, but serve() already owns mark_idle/unmark_idle for the connection lifetime. After serve()'s defer fires, the OS can recycle the fd; the stale unmark then evicts the new connection from the tracker, preventing close_idle from shutting it down. Fixed by tracking is_h2 and skipping the outer unmark on the H2 path. 3. close_idle write truncation comment Update serve()'s comment to note that response writes, not just frame reads, may be truncated mid-DATA-frame on shutdown. 4. Extract handle_control_frame to deduplicate dispatch_frame/pump_for_window SETTINGS, PING, and WINDOW_UPDATE handling was copy-pasted verbatim in both functions. Any fix to apply_settings (e.g. finding 1) would not automatically propagate to pump_for_window. Extracted into a shared handle_control_frame helper called from both dispatch sites. Co-Authored-By: WOZCODE <contact@withwoz.com> * net.http: fix GOAWAY error code and MAX_FRAME_SIZE validation in h2 server Three RFC 7540 correctness fixes in H2ServerConn: - apply_settings: send FLOW_CONTROL_ERROR (not PROTOCOL_ERROR) before returning when SETTINGS_INITIAL_WINDOW_SIZE exceeds 2^31-1 (S6.5.3). send_goaway now sets closing=true so the outer handler skips the duplicate PROTOCOL_ERROR GOAWAY. - apply_settings: validate SETTINGS_MAX_FRAME_SIZE is in [16384, 16777215] (S6.5.2) before storing it. An out-of-range value was previously stored and could cause an infinite loop in send_body / send_header_block if the peer set it to 0. - send_goaway: set closing=true after a successful frame write so that serve_h2_conn_with_idle_tracker can skip the redundant PROTOCOL_ERROR GOAWAY it previously always sent. Co-Authored-By: WOZCODE <contact@withwoz.com> --------- Co-authored-by: WOZCODE <contact@withwoz.com>
Item #5 from #27433: the HTTP/2 server's idle-shutdown tracking ran on the
per-frame hot path.
serve()registered the connection's fd in the TLSTlsIdleConnTrackeraroundevery frame read —
mark_idle(shared mutex lock + list append) before, andunmark_idle(shared mutex lock + O(n)index/delete) after — so every frameon every connection contended one process-wide mutex and scanned the handle list.
This registers the connection once for its serve lifetime instead. The reader
thread spends nearly all of its time blocked in a frame read, so the per-frame
mark/unmark bought nothing:
close_idlestill interrupts the reader on shutdownby shutting the fd down. The only semantic change is that an h2 request in flight
when the server stops is now interrupted rather than allowed to finish — which is
acceptable at shutdown and is not relied on by any caller (the graceful "wait for
the active request" guarantee is HTTP/1.1-only, and its test is unaffected).
Verification
vlib/net/http/h2_server_test.vandvlib/net/http/server_tls_test.v(includingthe h2 idle/incomplete-request shutdown-interrupt tests) pass on Linux.
Notes
the
track h2 server connections ... once, not per framecommit. I'll rebaseonto master once net.http: fix TLS server read/handshake timeouts and idle-shutdown fd race #27434 merges.
parallelizing handshakes safely is constrained by
MBEDTLS_THREADING_CbeingLinux/BSD-only in V's mbedtls build (shared RNG/key state is unprotected on
Windows/macOS), so it needs per-worker RNG/config — details and a benchmark are
on net.http: TLS server shutdown fix (#27429) follow-ups — handshake timeout tied to accept_timeout, read_timeout no-op on HTTPS #27433.
Refs #27433.
🤖 Generated with Claude Code