feat(providers/openai): add Responses computer use support #27

ibetitsmike wants to merge 11 commits into `coder_2_33` from …
Conversation
> Authored by Mux on behalf of Mike.
My implementation is at https://github.com/hugodutka/fantasy/tree/hugodutka/openai-computer-use. 🤖 I compared this PR against d139114, and I don't think this is correct as-is. The main bug is in … I also think d139114 has the better core abstraction overall. This PR builds a local computer-use action model and reconstructs the OpenAI payload from that, while d139114 uses the SDK-native types and preserves the raw … There's also a correctness issue in … If you want a concrete direction, I'd start from d139114's shape in …
Summary
Add OpenAI Responses computer-use support to the `openai` provider so downstream consumers (Coder) can drive OpenAI's `computer-use-preview` model through the existing fantasy language-model interface.
Problem
The `openai` provider only exposed text tools through the Responses API. There was no way to declare the Responses-native computer-use tool, parse `computer_call` outputs, or round-trip screenshots and safety acknowledgments back to OpenAI. Coder's computer-use subagent was consequently hardcoded to Anthropic.
Fix
- `providers/openai/computer_use.go` exposes a `ProviderDefinedTool` with id `openai.computer`, typed `ComputerUseToolOptions` (display dimensions + environment), and a local `ComputerUseInput` representation of batched actions.
- `responses_language_model.go`: `toResponsesTools` accepts the new tool and converts it to `responses.ComputerUsePreviewToolParam`. Invalid dimensions fail request preparation instead of warning silently.
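As a rough sketch of the "fail request preparation instead of warning silently" behavior — the types and function below are stand-ins, not the PR's actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the provider's option and SDK param types.
type ComputerUseToolOptions struct {
	DisplayWidth  int
	DisplayHeight int
	Environment   string // e.g. "browser"
}

type ComputerUsePreviewToolParam struct {
	DisplayWidth  int
	DisplayHeight int
	Environment   string
}

// toComputerUseToolParam rejects zero/negative dimensions with an error,
// so a bad tool declaration fails before any request is sent.
func toComputerUseToolParam(opts ComputerUseToolOptions) (ComputerUsePreviewToolParam, error) {
	if opts.DisplayWidth <= 0 || opts.DisplayHeight <= 0 {
		return ComputerUsePreviewToolParam{}, errors.New(
			"computer use tool: display dimensions must be positive")
	}
	return ComputerUsePreviewToolParam{
		DisplayWidth:  opts.DisplayWidth,
		DisplayHeight: opts.DisplayHeight,
		Environment:   opts.Environment,
	}, nil
}

func main() {
	if _, err := toComputerUseToolParam(ComputerUseToolOptions{DisplayWidth: 0, DisplayHeight: 768}); err != nil {
		fmt.Println("rejected:", err)
	}
	p, err := toComputerUseToolParam(ComputerUseToolOptions{DisplayWidth: 1024, DisplayHeight: 768, Environment: "browser"})
	fmt.Println(p.DisplayWidth, p.DisplayHeight, err)
}
```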
- `toResponsesPrompt` maps tool results back to their originating `computer_call` via `OpenAIComputerUseCallMetadata` (call id and pending safety checks). Results without metadata hard-fail so malformed prompts do not reach the API.
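The metadata round-trip might look roughly like the sketch below. The wire-item field names follow the public Responses API shape (`computer_call_output`, `acknowledged_safety_checks`), but the function and its signature are illustrative assumptions, not the PR's implementation:

```go
package main

import (
	"errors"
	"fmt"
)

// OpenAIComputerUseCallMetadata is a stand-in for the metadata the PR
// attaches to tool results: the originating call id plus any safety
// checks the model asked the caller to acknowledge.
type OpenAIComputerUseCallMetadata struct {
	CallID              string
	PendingSafetyChecks []string
}

// toComputerCallOutput builds a Responses-style computer_call_output item.
// A result without metadata hard-fails instead of reaching the API malformed.
func toComputerCallOutput(meta *OpenAIComputerUseCallMetadata, screenshotURL string) (map[string]any, error) {
	if meta == nil || meta.CallID == "" {
		return nil, errors.New("computer_call result is missing call metadata")
	}
	item := map[string]any{
		"type":    "computer_call_output",
		"call_id": meta.CallID,
		"output": map[string]any{
			"type":      "computer_screenshot",
			"image_url": screenshotURL,
		},
	}
	if len(meta.PendingSafetyChecks) > 0 {
		item["acknowledged_safety_checks"] = meta.PendingSafetyChecks
	}
	return item, nil
}

func main() {
	_, err := toComputerCallOutput(nil, "https://example.com/shot.png")
	fmt.Println("missing metadata:", err)

	item, _ := toComputerCallOutput(&OpenAIComputerUseCallMetadata{CallID: "call_123"}, "https://example.com/shot.png")
	fmt.Println(item["type"], item["call_id"])
}
```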
- `IsResponsesModel` and `getResponsesModelConfig` share a narrow allowlist (`computer-use-preview`, `computer-use-preview-2025-03-11`) instead of broad `strings.Contains`.
- … (zero/negative widths and heights), and batched-actions handling.
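The allowlist change is the simplest piece to illustrate. The model names below are the PR's; the map layout and helper name are assumptions:

```go
package main

import "fmt"

// Exact-match allowlist, replacing broad strings.Contains checks that
// could accept arbitrary model names containing the substring.
var computerUseModels = map[string]bool{
	"computer-use-preview":            true,
	"computer-use-preview-2025-03-11": true,
}

func isComputerUseModel(model string) bool {
	return computerUseModels[model]
}

func main() {
	fmt.Println(isComputerUseModel("computer-use-preview"))         // true
	fmt.Println(isComputerUseModel("my-computer-use-preview-fork")) // false: substring matching would wrongly accept this
}
```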
- `providertests/openai_responses_test.go` adds an integration test gated on a VCR cassette (cassette recording is a follow-up; the test is currently skipped with a clear message).
- `store=true` is required for this to work because the Responses API persists computer-use state server-side; Coder enforces that at the model-config boundary.
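A minimal sketch of the kind of model-config check described above — the function name and shape are hypothetical, not Coder's actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// validateComputerUseConfig rejects configs that would send a computer-use
// model without store=true, since the Responses API keeps computer-use
// state server-side and stateless requests cannot resume it.
func validateComputerUseConfig(model string, store bool) error {
	isComputerUse := model == "computer-use-preview" || model == "computer-use-preview-2025-03-11"
	if isComputerUse && !store {
		return errors.New("computer-use models require store=true: the Responses API persists computer-use state server-side")
	}
	return nil
}

func main() {
	fmt.Println(validateComputerUseConfig("computer-use-preview", false))
	fmt.Println(validateComputerUseConfig("computer-use-preview", true))
}
```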