fix(provider): skip empty reasoning_content to preserve KV cache hits #28352
nilo85 wants to merge 2 commits into
Historical assistant messages without reasoning were getting reasoning_content: "" forwarded to the LLM API, breaking KV cache prefix matching (0% cache hits on 196K token prompts). Only set the interleaved field when reasoningText is non-empty.
Adds two tests:
- empty reasoning part does not set reasoning_content field
- assistant without reasoning parts keeps message unchanged
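The guard described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `Part` and `AssistantMessage` shapes and the function name `applyInterleavedReasoning` are hypothetical, and the choice to still strip an empty reasoning part from `content` is an assumption.

```typescript
// Hypothetical, simplified shapes; the real types in the codebase may differ.
type Part =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string };

interface AssistantMessage {
  role: "assistant";
  content: Part[];
  providerOptions?: { openaiCompatible?: Record<string, unknown> };
}

// Moves reasoning text into an interleaved field (e.g. "reasoning_content"),
// but only when there is reasoning text to forward.
function applyInterleavedReasoning(
  msg: AssistantMessage,
  field: string,
): AssistantMessage {
  const hasReasoningPart = msg.content.some((p) => p.type === "reasoning");

  // No reasoning parts at all: return the very same object, unchanged.
  if (!hasReasoningPart) return msg;

  const reasoningText = msg.content
    .filter((p) => p.type === "reasoning")
    .map((p) => p.text)
    .join("");
  const content = msg.content.filter((p) => p.type !== "reasoning");

  // Guard: an empty string would still be serialized and change the prompt,
  // so the field is left absent (assumption: the empty part is dropped).
  if (reasoningText === "") return { ...msg, content };

  return {
    ...msg,
    content,
    providerOptions: {
      ...msg.providerOptions,
      openaiCompatible: {
        ...msg.providerOptions?.openaiCompatible,
        [field]: reasoningText,
      },
    },
  };
}
```

With this shape, a message that never had reasoning serializes identically on every turn, which is what prefix matching on the server side depends on.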
Thanks for your contribution! This PR doesn't have a linked issue. All PRs must reference an existing issue. See CONTRIBUTING.md for details.
Thanks for updating your PR! It now meets our contributing guidelines. 👍
The following comment was made by an LLM, it may be inaccurate: Potential Duplicate Found: PR #28346: fix(llm): forward reasoning_content in experimental OpenAI Chat assistant messages Why it's related: This PR directly addresses the inverse scenario of the current PR. While #28352 fixes the issue of empty
Add detection for when messages sent to the LLM differ between turns, which breaks KV cache prefix matching. Logs a warning with a formatted diff showing what changed and why. Catches issues like empty reasoning_content being added to messages that previously had none (PR anomalyco#28352), as well as any other message mutations during transformation.
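One way such a check could work is to compare the previous request's messages against the start of the current request and report the first mismatch. The sketch below is an assumption about the approach, not the referenced commit's implementation; `WireMessage`, `PrefixDivergence`, and `findPrefixDivergence` are hypothetical names.

```typescript
// Hypothetical wire format: any JSON-serializable message object.
type WireMessage = Record<string, unknown>;

interface PrefixDivergence {
  index: number;    // position of the first message that changed
  previous: string; // serialized form sent on the earlier turn
  current: string;  // serialized form about to be sent now
}

// Returns the first message where the current request no longer matches the
// previous one, or null when the previous request is an exact prefix.
function findPrefixDivergence(
  previous: WireMessage[],
  current: WireMessage[],
): PrefixDivergence | null {
  for (let i = 0; i < previous.length; i++) {
    const before = JSON.stringify(previous[i]);
    const after = i < current.length ? JSON.stringify(current[i]) : "<missing>";
    // Serialized strings are compared on purpose: an added empty field or a
    // reordered key changes the bytes sent, and therefore the cached prefix.
    if (before !== after) return { index: i, previous: before, current: after };
  }
  return null;
}

// Example: an empty reasoning_content appears on a message that had none.
const earlierTurn: WireMessage[] = [{ role: "assistant", content: "hi" }];
const laterTurn: WireMessage[] = [
  { role: "assistant", content: "hi", reasoning_content: "" },
  { role: "user", content: "next" },
];
const divergence = findPrefixDivergence(earlierTurn, laterTurn);
if (divergence !== null) {
  console.warn(`message ${divergence.index} changed between turns`, divergence);
}
```

Comparing serialized strings rather than doing a deep equality check is deliberate here, since the cache is keyed on what is actually sent.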

Issue for this PR
Closes #19081
Type of change
What does this PR do?
Historical assistant messages without reasoning were getting reasoning_content: "" forwarded to the LLM API. An empty string changes the token stream, breaking KV cache prefix matching; this was observed as 0% cache hits on 196K token prompts in production captures.
The interleaved reasoning block in normalizeMessages always set providerOptions.openaiCompatible[field] to reasoningText, even when empty. The fix adds a guard so the field is only set when there is actual reasoning text. Messages without reasoning now keep the field absent, preserving KV cache prefix matching.
How did you verify your code works?
Screenshots / recordings
N/A — backend-only change, no UI.
Checklist