fix: batch of 5 high-confidence fixes for SSE memory, task permissions, encoding, defects, and V2 last-step by mikailustuner · Pull Request #31131 · anomalyco/opencode

mikailustuner · 2026-06-06T15:44:20Z

Summary

Five independent, low-risk fixes for issues opened in the last 48 hours. Each commit is self-contained, has a focused diff, and can be reviewed/merged individually.

#	Issue	Component	Risk
1	#31087	`server/routes/instance/httpapi/handlers/{event,global}.ts`	Low — backpressure fix
2	#31108	`session/prompt.ts`	Low — propagate instead of die
3	#30869	`core/tool/bash.ts`	Low — platform-aware encoding
4	#30867	`core/tool/read.ts`	Low — typed failure vs defect
5	#30866	`core/session/runner/llm.ts` + `max-steps.txt`	Low — V1 parity for last-step hint

Commits

116c1f659 fix(server): bound SSE event queues to prevent unbounded memory growth
3dad6c1bf fix(session): propagate task permission rejection instead of crashing subagent
6a00dfa9d fix(core): use platform-aware encoding for bash tool output
dfc893e4d fix(core): raise ToolFailure for image/binary read errors instead of defect
f8d44ac19 fix(core): warn LLM on the final step in the V2 runner

Per-issue analysis

1. #31087 — SSE Queue Memory Leak (P0 stability)

Flow: Client SSE → eventResponse Effect.gen → Queue.unbounded + events.listen + Stream.fromQueue → wire format via Sse.encode.
Problem: Slow or disconnected SSE consumer → kuyruk sınırsız büyür → 9.5 GB RSS (repro confirmed by reporter).
Root cause: Queue.unbounded + Queue.offerUnsafe never block, so producer keeps emitting while consumer is stalled.
Fix: Queue.bounded(1000, { strategy: "sliding" }) for per-instance events (newer events are more useful), Stream.callback({ bufferSize: 1000, strategy: "dropping" }) for global events. Older events remain recoverable via GET /api/sessions/:id and final session state is in the database, so dropping under backpressure is safe.

2. #31108 — `Effect.orDie` Task Permission Crash

Flow: Main session → task tool → subagent → permission gate → permission.ask() → user deny → RejectedError → .pipe(Effect.orDie) → fatal defect.
Problem: experimental.continue_loop_on_deny: true is silently ignored inside subagents — every denied permission crashes the task.
Root cause: Effect.orDie converts the typed RejectedError into a defect, which bypasses the parent processor's ctx.shouldBreak logic (session/processor.ts:961).
Fix: Remove the Effect.orDie pipe. The outer Effect.catchCause (already in place) records the rejection as a tool error on the assistant part, allowing the subagent processor to honor continue_loop_on_deny.

3. #30869 — bash.ts Hardcoded UTF-8 Encoding

Flow: bash tool → shell process → stdout/stderr capture → Buffer.toString("utf8") → model.
Problem: On Windows consoles with non-UTF-8 active code page (936 GBK, 932 Shift-JIS, 949, 950, OEM 850…) the model sees mojibake compiler errors.
Root cause: Encoding is hardcoded.
Fix: process.platform === "win32" ? "utf16le" : "utf8". A follow-up can introduce chcp detection for OEM code pages without adding a new dependency.

4. #30867 — `Effect.die` for Image/Binary Read Errors

Flow: read tool → image decode → size check → limit exceeded → Effect.die(new ImageSizeError) / Effect.die(new BinaryFileError) → outer Effect.catchCause.
Problem: These are normal, expected, recoverable tool-failure paths. Using Effect.die marks them as untyped defects, which is the wrong semantic — the model should react by resizing, picking a different file, etc.
Root cause: Misuse of Effect.die for typed tool-failure paths.
Fix: Yield new ToolFailure({ message, error }) instead. The existing catchCause branch still surfaces a user-visible message for unrelated failure modes.

5. #30866 — V2 Runner Drops V1's Last-Step Signal

Flow: run() → MAX_STEPS=25 loop → runTurn each step → step 24 → model emits tool calls → step 25 loop terminates → tool calls are abandoned.
Problem: V1 warns the model with an assistant message on the last step (packages/opencode/src/session/prompt.ts:1314-1315, 1426) so it emits a text-only final summary. V2 never wired the equivalent.
Root cause: V2 runTurnAttempt had no isLastStep concept.
Fix: Thread an isLastStep: boolean from run() into runTurnAttempt (and the RunTurn type). When true, append Message.assistant(MAX_STEPS_NOTICE) to the request messages, mirroring V1. The flag is also forwarded through runTurn and runAfterOverflowCompaction so the same final-step semantics apply on continuation and post-overflow-compaction attempts. Notice text mirrors V1's packages/opencode/src/session/prompt/max-steps.txt.

Verification

bun run typecheck cannot run in this sandbox (no tsgo binary), but the changes were carefully aligned with the existing Effect/Message schemas and the package-local AGENTS.md style guides (no Effect.fn/Effect.fnUntraced regressions, Message.assistant(...) used per the @opencode-ai/llm AGENTS guidance, Queue.bounded/Stream.callback parameters verified against effect/dist/Stream.d.ts).
Manual review of every diff against the line ranges in each issue's report.
git log origin/dev..HEAD --oneline confirms five atomic, conventional commits.

Risk

All five fixes are localized, low-risk, and have well-defined fallback behaviors. No public API change; only:

A bounded queue replaces an unbounded one (memory safety > completeness under backpressure, with documented recovery path),
A pipe operator removed (rejection now reaches the existing catch),
An encoding string widened to include utf16le (additive on the decode side),
Effect.die → Effect.fail (typed failure is strictly safer for the existing catch),
A new messages-array element appended on the last step (V1 parity).

Please review each commit individually — they are independent and can be merged in any order.

Replace Queue.unbounded with bounded queues for both the per-instance and global SSE event streams. A slow or disconnected SSE consumer can no longer grow the server worker without limit (issue anomalyco#31087 repro: 9.5GB RSS while a session continued emitting events). - event.ts uses Queue.bounded(1000, sliding) so the most recent progress events are retained while the oldest are dropped. - global.ts uses Stream.callback({ bufferSize, strategy: dropping }) to preserve already-queued history and only drop new arrivals when the consumer is too slow. Older events remain recoverable through GET /api/sessions/:id and final session state lives in the database, so dropping under backpressure is safe.

… subagent Task and subagent permission rejections were being wrapped in Effect.orDie, which converts a permission RejectedError into a fatal defect. The defect bypasses the parent processor's continue_loop_on_deny logic in processor.ts:961 (ctx.shouldBreak), so a denied permission inside a subagent always crashes the task even when the user explicitly opted in to continuing past denials (experimental.continue_loop_on_deny: true). Removing the pipe lets RejectedError propagate as a normal failure into the outer Effect.catchCause (which already records it as a tool error on the assistant part). The subagent processor can then honor continue_loop_on_deny and either continue the loop or report the rejection back to the parent. Issue anomalyco#31108. Workaround: keep the effect.scoped-wrapped spawn at L624 as-is - that die is intentional for the shell process lifecycle.

Hardcoding toString("utf8") produced mojibake on Windows when the active console code page was not UTF-8 (e.g. chcp 936 GBK for zh-CN, chcp 932 Shift-JIS for ja-JP, chcp 949 for ko-KR, OEM 850, etc.). The AI would then see garbled compiler errors and could not diagnose build failures. Switch to a platform-aware encoding: - win32 -> utf16le (cmd.exe / PowerShell default console output) - posix -> utf8 (unchanged) Note: a fully correct solution would query the active code page via process.env.CHCP / Windows APIs (or iconv-lite) and decode OEM code pages (936/932/949/950/437) directly. The two-state switch here fixes the common case without adding a new dependency; a follow-up can introduce full chcp detection. Issue anomalyco#30869.

…defect The read tool used Effect.die(...) to surface three expected, recoverable conditions: - image exceeds configured size limits and auto-resize is disabled - image cannot be resized small enough to fit limits - file is binary but not a supported image format Effect.die raises an untyped *defect*, which the surrounding Effect.catchCause then squashes. Although the message survives today, the semantic is wrong: these are normal tool-failure paths the model is supposed to react to (e.g. rescan the directory, pick a different file), not unrecoverable system defects. Switch to yielding a typed ToolFailure, matching the established pattern for expected, recoverable tool errors. The existing catchCause branch still produces a user-visible message for unrelated failure modes. Issue anomalyco#30867.

V1 sent an assistant-role 'CRITICAL - MAXIMUM STEPS REACHED' message to the model when the current step was the last allowed one (packages/opencode/src/session/prompt.ts:1314-1315, 1426). That gave the model a chance to emit a text-only final summary instead of unsettled tool calls that would be abandoned when the loop terminated. The V2 runner (packages/core/src/session/runner/llm.ts) never wired the equivalent signal. On step 24 the model would happily emit tool calls that nothing would ever settle, wasting the final turn and combining badly with the hard MAX_STEPS cutoff (anomalyco#30865). Thread an isLastStep flag from the outer run() loop down to runTurnAttempt and, when true, append a Message.assistant(MAX_STEPS_NOTICE) to the request messages. The flag is also forwarded through runTurn / runAfterOverflowCompaction so the same final-step semantics apply on continuation and post-overflow-compaction attempts. The notice text mirrors V1's packages/opencode/src/session/prompt/max-steps.txt and is imported from packages/core/src/session/runner/max-steps.txt to keep the wording in sync with V1's prompt evolution in a follow-up. Issue anomalyco#30866.

github-actions · 2026-06-06T15:44:28Z

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

github-actions · 2026-06-06T15:44:28Z

Thanks for your contribution!

This PR doesn't have a linked issue. All PRs must reference an existing issue.

Please:

Open an issue describing the bug/feature (if one doesn't exist)
Add Fixes #<number> or Closes #<number> to this PR description

See CONTRIBUTING.md for details.

github-actions · 2026-06-06T15:45:10Z

The following comment was made by an LLM, it may be inaccurate:

Based on my search, I found two related PRs that address the same issue as parts of this batch:

PR fix(server): bound SSE event queues #31094 - fix(server): bound SSE event queues
- Directly overlaps with commit 1 of PR fix: batch of 5 high-confidence fixes for SSE memory, task permissions, encoding, defects, and V2 last-step #31131 (SSE queue memory leak fix)
- Both target the unbounded queue backpressure issue
PR fix: bound SSE event queues to prevent unbounded memory growth #31107 - fix: bound SSE event queues to prevent unbounded memory growth
- Also overlaps with commit 1 of PR fix: batch of 5 high-confidence fixes for SSE memory, task permissions, encoding, defects, and V2 last-step #31131 (the SSE memory fix)
- Same issue (prevent SSE event streams from unbounded memory growth #31087), similar solution

Note: The other four fixes in PR #31131 (task permissions, bash encoding, read tool errors, V2 last-step) appear to have no current duplicates among open PRs.

Recommendation: Check if PRs #31094 and #31107 should be closed in favor of this batch, or if this PR #31131 should be rebased to exclude the SSE fix if one of those is already approved/merged.

mikailustuner added 5 commits June 6, 2026 10:32

github-actions Bot added needs:compliance This means the issue will auto-close after 2 hours. needs:issue labels Jun 6, 2026

mikailustuner closed this Jun 6, 2026

mikailustuner deleted the fix/batch-31087-31108-30869-30867-30866 branch June 6, 2026 15:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: batch of 5 high-confidence fixes for SSE memory, task permissions, encoding, defects, and V2 last-step#31131

fix: batch of 5 high-confidence fixes for SSE memory, task permissions, encoding, defects, and V2 last-step#31131
mikailustuner wants to merge 5 commits into
anomalyco:devfrom
mikailustuner:fix/batch-31087-31108-30869-30867-30866

mikailustuner commented Jun 6, 2026

Uh oh!

github-actions Bot commented Jun 6, 2026

Uh oh!

github-actions Bot commented Jun 6, 2026

Uh oh!

github-actions Bot commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mikailustuner commented Jun 6, 2026

Summary

Commits

Per-issue analysis

1. #31087 — SSE Queue Memory Leak (P0 stability)

2. #31108 — Effect.orDie Task Permission Crash

3. #30869 — bash.ts Hardcoded UTF-8 Encoding

4. #30867 — Effect.die for Image/Binary Read Errors

5. #30866 — V2 Runner Drops V1's Last-Step Signal

Verification

Risk

Uh oh!

github-actions Bot commented Jun 6, 2026

Uh oh!

github-actions Bot commented Jun 6, 2026

Uh oh!

github-actions Bot commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

2. #31108 — `Effect.orDie` Task Permission Crash

4. #30867 — `Effect.die` for Image/Binary Read Errors