Fix Netty HTTP span lifecycle for chunked/streaming responses#10656
Fix Netty HTTP span lifecycle for chunked/streaming responses#10656gtukmachev wants to merge 2 commits into
Conversation
|
If needed, I may attach an archive with a testing Kotlin Gradle Ktor project and a script. The fix was done by AI Agent "Claude Code" with my careful guidance. |
ygree
left a comment
There was a problem hiding this comment.
Thanks for the contribution! Could you please add tests? Also, I noticed there's a possibility of a span leak when LastHttpContent is not received during chunked/streaming responses, which needs to be addressed.
|
|
14b968a to
bca88d1
Compare
|
Hi @ygree, thanks for the review feedback! I've addressed both items: Tests added — new JUnit 5 test class Span leak fix — addressed the scenario you flagged where Additional fix — websocket upgrade responses going through Verified end-to-end with a Ktor test server + real DataDog (14,680 requests): all streaming endpoints report correct durations, no near-zero outliers. Ready for re-review. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bca88d15ba
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| ctx.channel().attr(STREAMING_CONTEXT_KEY).set(storedContext); | ||
| ctx.channel().attr(CONTEXT_ATTRIBUTE_KEY).remove(); |
There was a problem hiding this comment.
Finish bodyless HTTP responses immediately
When a handler writes a non-FullHttpResponse for a final status that cannot have a body (for example 204, 205, or 304) and then closes the connection after the headers, there will be no subsequent LastHttpContent for this branch to observe. This stores the span in STREAMING_CONTEXT_KEY and removes the normal context, so the request span stays open until channelInactive and is then reported as an incomplete streaming response/error even though the response completed successfully. Please avoid deferring for content-always-empty statuses and finish them the same way as non-streaming responses.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Good catch — fixing this now. Will finish the span immediately for bodyless status codes (204, 205, 304) alongside the websocket upgrade check.
There was a problem hiding this comment.
Fixed in the latest push — bodyless responses (204, 205, 304) now finish the span immediately alongside websocket upgrades.
bca88d1 to
33d0201
Compare
New JUnit 5 test class NettyChunkedResponseTest with a real Netty server that writes chunked responses manually (HttpResponse + HttpContent* + LastHttpContent), exercising the code path that HttpObjectAggregator-based tests never reach. Four test cases: - chunkedResponseSpanIncludesFullStreamDuration: span covers full stream time (~1s for 5 chunks x 200ms), not just header-send time (~0ms) - fullResponseStillFinishesSpanImmediately: FullHttpResponse regression - keepAliveSequentialChunkedRequestsGetCorrectSpans: STREAMING_CONTEXT_KEY lifecycle across back-to-back keep-alive requests - connectionDropDuringChunkedResponseFinishesSpan: span finished with error when client disconnects mid-stream
HttpServerResponseTracingHandler: route by message type (FullHttpResponse, HttpResponse, LastHttpContent) instead of finishing every span on HttpResponse. FullHttpResponse finishes immediately; HttpResponse defers to LastHttpContent via STREAMING_CONTEXT_KEY to avoid keep-alive race. WebSocket upgrades and bodyless responses (204, 205, 304) finish immediately since they never produce LastHttpContent. HttpServerRequestTracingHandler: channelInactive now checks STREAMING_CONTEXT_KEY and finishes leaked spans when channel closes mid-stream. AttributeKeys: added STREAMING_CONTEXT_KEY for chunked response context.
33d0201 to
77cc414
Compare
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 77cc414ea5
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| ctx.channel().attr(STREAMING_CONTEXT_KEY).set(storedContext); | ||
| ctx.channel().attr(CONTEXT_ATTRIBUTE_KEY).remove(); |
There was a problem hiding this comment.
Finish header-only responses without waiting for chunks
When a handler writes a header-only DefaultHttpResponse (for example a HEAD response or a Content-Length: 0 200) and then closes the channel rather than emitting LastHttpContent, this branch stores the span in STREAMING_CONTEXT_KEY and removes CONTEXT_ATTRIBUTE_KEY, so the successful response is not reported until channelInactive marks it as an error. The old path finished these non-full header-only responses at the header write; only responses that are actually followed by content should be deferred to LastHttpContent.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Investigated — this is not an issue in practice. Netty's HttpObjectEncoder emits LastHttpContent.EMPTY_LAST_CONTENT after both HEAD responses and Content-Length: 0 responses, so handleLastHttpContent will fire and finish the span correctly. The edge case of a handler closing the channel without emitting LastHttpContent is already covered by the channelInactive fix. The truly bodyless status codes (204, 205, 304) are already handled by the fix in the previous review cycle.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 77cc414ea5
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| ctx.channel().attr(STREAMING_CONTEXT_KEY).set(storedContext); | ||
| ctx.channel().attr(CONTEXT_ATTRIBUTE_KEY).remove(); |
There was a problem hiding this comment.
Finish spans for header-only responses
For a HEAD response or a raw DefaultHttpResponse with Content-Length: 0 that keeps the connection open, this branch moves the context to STREAMING_CONTEXT_KEY and removes CONTEXT_ATTRIBUTE_KEY, but no LastHttpContent is required to be written for a headers-only message. The span therefore never finishes until channelInactive (or is effectively leaked on long-lived keep-alive connections), whereas these HttpResponse messages were finished immediately before this change. Please treat HEAD/zero-length headers-only responses like the existing 204/205/304 immediate-finish path.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Same point as the previous cycle — already addressed. Netty's HttpObjectEncoder.encode() always writes LastHttpContent.EMPTY_LAST_CONTENT after a non-full HttpResponse, regardless of Content-Length or request method. The handleLastHttpContent path will fire. This is standard Netty pipeline behavior — the codec is responsible for message framing, not application handlers. No change needed.
|
@ygree Could you please add the required labels? As an external contributor I don't have permission to assign them:
Thanks! |
Caution
This PR is closed and superseded. Please use the correct PR from the company fork:
#10734 — Fix Netty HTTP span lifecycle for chunked/streaming responses
Original description below for reference only.
What Does This Do
Fixes
HttpServerResponseTracingHandlerto correctly handle chunked/streaming HTTP responses. The span now covers the full response duration (from headers toLastHttpContent) instead of closing at header-send time.