
fix: include reasoning tokens in overflow detection #28585

Open
zmlgit wants to merge 2 commits into anomalyco:dev from zmlgit:fix/overflow-reasoning-tokens

Conversation

@zmlgit zmlgit commented May 21, 2026

Issue for this PR

Closes #15556

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

`isOverflow()` in overflow.ts computes context usage as `tokens.input + tokens.output + tokens.cache.read + tokens.cache.write`. However, `tokens.output` already has reasoning tokens subtracted by `getUsage()` (session.ts:414, `output: outputTokens - reasoningTokens`). The reasoning tokens are stored separately in `tokens.reasoning` but are never added back into the overflow count.

This causes a systematic under-count equal to the number of reasoning tokens in the response. For reasoning-heavy models (GLM-5, Claude thinking, o1/o3, Gemini), the under-count is severe enough that auto-compaction never triggers, leading to `context_length_exceeded` errors.
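To make the under-count concrete, here is a minimal sketch, assuming the token shape described above. `splitUsage` and `legacyCount` are hypothetical stand-ins for `getUsage()` and the pre-fix sum in `isOverflow()`, not the actual opencode code, and the token numbers are invented for illustration.

```typescript
// Hypothetical token shape, modeled on the fields named in this PR.
interface Tokens {
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

// Stand-in for getUsage(): reasoning is split out of the provider's output count.
function splitUsage(inputTokens: number, outputTokens: number, reasoningTokens: number): Tokens {
  return {
    input: inputTokens,
    output: outputTokens - reasoningTokens,
    reasoning: reasoningTokens,
    cache: { read: 0, write: 0 },
  }
}

// Stand-in for the pre-fix sum: reasoning is never added back.
function legacyCount(t: Tokens): number {
  return t.input + t.output + t.cache.read + t.cache.write
}

// Invented numbers: 30000 output tokens, of which 25000 are reasoning.
const usage = splitUsage(100000, 30000, 25000)
const counted = legacyCount(usage) // 105000
const actual = counted + usage.reasoning // 130000
```

In this example the pre-fix sum misses 25000 tokens, which is exactly `tokens.reasoning`.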

The fix adds `input.tokens.reasoning` to the sum on line 30 of overflow.ts. When `tokens.total` is available (non-zero), the expression short-circuits via `||`, so the change only affects the fallback path.
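The shape of the fixed expression can be sketched roughly as below. This is an illustration under assumptions, not the code in overflow.ts: `contextCount` and `isOverflowSketch` are hypothetical names, and the `limit` and `reserved` parameters and the comparison itself are invented to keep the example self-contained.

```typescript
// Hypothetical token shape; `total` is optional and may be 0 when the provider omits it.
interface Tokens {
  total?: number
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

// A non-zero total wins via ||. Otherwise the parts are summed,
// now including reasoning (the term this PR adds).
function contextCount(t: Tokens): number {
  return t.total || t.input + t.output + t.reasoning + t.cache.read + t.cache.write
}

// Assumed comparison: overflow once the count reaches the usable window.
function isOverflowSketch(t: Tokens, limit: number, reserved: number): boolean {
  return contextCount(t) >= limit - reserved
}

const reasoningHeavy: Tokens = {
  input: 100000,
  output: 5000,
  reasoning: 25000,
  cache: { read: 0, write: 0 },
}
const withTotal: Tokens = { ...reasoningHeavy, total: 140000 }
const plainModel: Tokens = { ...reasoningHeavy, reasoning: 0 }
```

With these invented numbers, `contextCount(reasoningHeavy)` is 130000, so a 128000-token window is flagged as overflowing, whereas a sum without `reasoning` would give 105000 and miss it. `withTotal` returns 140000 without touching the fallback, and `plainModel` behaves the same with or without the new term.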

How did you verify your code works?

  • Confirmed `getUsage()` subtracts reasoning from output at session.ts:414-415
  • Confirmed `isOverflow()` was missing reasoning in the count at overflow.ts:30
  • For non-reasoning models, `tokens.reasoning = 0`, so the addition is a no-op and behavior is unchanged
  • Verified the fix compiles (`pnpm build` in packages/opencode)

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

The `isOverflow()` function computes context usage by summing
`tokens.input + tokens.output + tokens.cache.read + tokens.cache.write`,
but `tokens.output` already has reasoning tokens subtracted by
`getUsage()` (session.ts). The `tokens.reasoning` field is never added
back, causing a systematic under-count proportional to the model's
reasoning output.

For reasoning-heavy models (GLM-5, Claude with thinking, o1/o3, Gemini),
this under-counting is severe enough to prevent auto-compaction from ever
triggering, leading to context_length_exceeded errors.

Fixes anomalyco#15556

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions github-actions bot added the needs:compliance label (this means the issue will auto-close after 2 hours) May 21, 2026
@github-actions
Contributor

The following comment was made by an LLM, it may be inaccurate:

Based on my search, I found one potentially related PR that may be addressing similar issues:

Related PR:

However, these PRs appear to address different aspects of the overflow/compaction system. The current PR (#28585) specifically fixes the missing tokens.reasoning field in the overflow calculation, which is a distinct issue from the general auto-compaction improvements in #25180 or the thinking block handling in #14393.

Conclusion: No duplicate PRs found. PR #28585 appears to be addressing a specific, previously unhandled bug in the overflow detection logic that the related PRs don't directly cover.

@github-actions github-actions bot removed the needs:compliance label (this means the issue will auto-close after 2 hours) May 21, 2026
@github-actions
Contributor

Thanks for updating your PR! It now meets our contributing guidelines. 👍

Reasoning models (GLM-5, Claude thinking, o1/o3, Gemini) can get stuck
in repetitive loops when asked to generate compaction summaries. The
thinking/reasoning output interferes with the structured summary
template, causing the model to produce repeated content instead of a
proper summary.

Disable thinking via agent options override:
- `thinking: { type: "disabled" }` for zhipuai/openai-compatible providers
- `thinkingConfig: { includeThoughts: false }` for Google providers

These options are deep-merged after the base provider options in
request.ts, so they properly override the thinking config set in
ProviderTransform.options() without affecting normal agent requests.

Related anomalyco#15556, anomalyco#16903

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
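
The override order described in the second commit message can be sketched as follows. `deepMerge` is a hypothetical helper written for this example, not the merge utility used in request.ts, and the base option values are invented. Only the two override shapes come from the commit message, and in practice each would apply to its own provider rather than both appearing in one object.

```typescript
type Options = { [key: string]: unknown }

function isPlainObject(v: unknown): v is Options {
  return typeof v === "object" && v !== null && !Array.isArray(v)
}

// Recursively merge `override` into a copy of `base`; later values win on conflicts.
function deepMerge(base: Options, override: Options): Options {
  const out: Options = { ...base }
  for (const [key, value] of Object.entries(override)) {
    const existing = out[key]
    out[key] = isPlainObject(existing) && isPlainObject(value) ? deepMerge(existing, value) : value
  }
  return out
}

// Invented base options standing in for what the provider transform might set.
const baseOptions: Options = {
  temperature: 0.7,
  thinking: { type: "enabled" },
  thinkingConfig: { includeThoughts: true, thinkingBudget: 1024 },
}

// Override shapes quoted from the commit message.
const compactionOverrides: Options = {
  thinking: { type: "disabled" },
  thinkingConfig: { includeThoughts: false },
}

const merged = deepMerge(baseOptions, compactionOverrides)
```

Because the merge is deep and the overrides are applied last, `merged.thinking.type` becomes `"disabled"` and `includeThoughts` becomes `false`, while unrelated keys such as `temperature` and `thinkingBudget` keep their base values and `baseOptions` itself is left unmodified.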

Development

Successfully merging this pull request may close these issues.

Context auto-compression seems ineffective with GLM-5: Request exceeds maximum context length

1 participant