feat(runtime): wire ctx.llm from sandbox credentials in cloud defaults#193
ctx.llm was structurally unwired in every cloud deployment: buildCtx only receives an LlmContext when a caller passes one, createCloudRuntimeDefaults never built one, and no deploy/bootstrap path did either. As a result, ctx.llm.complete() always threw the UNAVAILABLE_LLM stub regardless of persona.useSubscription (which only gates harness-binary credential linking). The gap surfaced when the linear-chat-lead CWD fix advanced execution past getIssue onto the classifyIntent ctx.llm call; granola and hn-monitor share the same dead path.

createCloudRuntimeDefaults now derives an LlmContext from sandbox env credentials, in order:
- ANTHROPIC_API_KEY (x-api-key)
- CLAUDE_CODE_OAUTH_TOKEN (Authorization: Bearer; the claude setup-token cloud#1629 injects)
- OPENAI_API_KEY (chat completions)

The persona's model field steers provider-family selection and supplies the model (provider prefixes stripped); families fall back to claude-opus-4-8 / gpt-5.1. No credential → undefined → buildCtx keeps the existing throwing stub. startRunner forwards subsystems.llm ?? cloudDefaults.llm, mirroring the workflow subsystem wiring.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
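The selection order in this commit message can be sketched as follows. This is a minimal illustration, not the PR's cloud-llm.ts: the names selectCredential, Credential, and Family are hypothetical, and falling back to the first available credential when the preferred family has none is an assumption.

```typescript
// Hypothetical sketch of env-derived credential selection; names are illustrative.
type Family = 'anthropic' | 'openai';

interface Credential {
  family: Family;
  source: 'ANTHROPIC_API_KEY' | 'CLAUDE_CODE_OAUTH_TOKEN' | 'OPENAI_API_KEY';
  value: string;
}

// Priority order as listed in the commit message.
const SOURCES: ReadonlyArray<Pick<Credential, 'family' | 'source'>> = [
  { family: 'anthropic', source: 'ANTHROPIC_API_KEY' },
  { family: 'anthropic', source: 'CLAUDE_CODE_OAUTH_TOKEN' },
  { family: 'openai', source: 'OPENAI_API_KEY' },
];

function selectCredential(
  env: Record<string, string | undefined>,
  preferred: Family | null,
): Credential | undefined {
  const available: Credential[] = [];
  for (const { family, source } of SOURCES) {
    const value = env[source]?.trim();
    if (value) available.push({ family, source, value });
  }
  // Prefer a credential of the persona's family (assumption: otherwise
  // take the first available one in priority order).
  const match = preferred ? available.find((c) => c.family === preferred) : undefined;
  return match ?? available[0];
}
```

With both an Anthropic and an OpenAI key present, a preferred family of openai picks OPENAI_API_KEY, while no preference picks ANTHROPIC_API_KEY. With no credentials the function returns undefined, which matches the fail-soft path where the throwing stub stays in place.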
Walkthrough
This PR adds cloud LLM support to the runtime, enabling deployed personas to query Anthropic or OpenAI models using environment-provided credentials. Core logic selects credentials per provider, resolves model names, and implements request/response flows for both endpoints. Integration points wire the LLM into cloud defaults and runtime context with fallback semantics.
Changes: Cloud LLM Support
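The fallback semantics mentioned in the walkthrough could look roughly like this. It is a sketch under assumptions: LlmContext is a simplified stand-in for the runtime's interface, resolveLlm is a hypothetical name, and the stub's message text is illustrative. In the PR the last step lives in buildCtx rather than in a single helper.

```typescript
// Simplified stand-in for the runtime's LlmContext; the real interface may differ.
interface LlmContext {
  complete(prompt: string): Promise<string>;
}

// Throwing stub used when no LLM is wired (message text is illustrative).
const UNAVAILABLE_LLM: LlmContext = {
  complete: async () => {
    throw new Error('ctx.llm is unavailable: no LLM credential was found');
  },
};

// An explicitly passed subsystem wins, then the env-derived cloud default,
// then the throwing stub.
function resolveLlm(
  subsystems: { llm?: LlmContext },
  cloudDefaults: { llm?: LlmContext },
): LlmContext {
  return subsystems.llm ?? cloudDefaults.llm ?? UNAVAILABLE_LLM;
}
```

Because ?? only skips null and undefined, a caller that passes its own LlmContext is never overridden by the environment.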
Estimated code review effort: 3 (Moderate), ~25 minutes
Code Review
This pull request introduces a default environment-derived LLM context (ctx.llm) for cloud-deployed personas, resolving an issue where cloud personas lacked a working LLM client. It adds credential selection and routing for Anthropic and OpenAI APIs, along with comprehensive unit tests. The review feedback highlights several improvement opportunities, including updating the default OpenAI model to a valid one (e.g., gpt-4o), improving model family detection to support newer o1/o3 models and the harness field, and adding defensive checks using the isRecord helper to prevent potential runtime TypeErrors when parsing API responses.
 */

const DEFAULT_ANTHROPIC_MODEL = 'claude-opus-4-8';
const DEFAULT_OPENAI_MODEL = 'gpt-5.1';
function personaModelFamily(persona: PersonaSpec): LlmProviderFamily | null {
  const model = nonEmpty(persona.model);
  if (!model) return null;
  const normalized = model.toLowerCase();
  if (normalized.startsWith('anthropic/') || normalized.includes('claude')) return 'anthropic';
  if (
    normalized.startsWith('openai/') ||
    normalized.startsWith('openai-codex/') ||
    normalized.includes('gpt-') ||
    normalized.includes('codex')
  ) {
    return 'openai';
  }
  return null;
}
The current personaModelFamily implementation only checks the model string. This has two issues:
- It does not recognize OpenAI's newer o1 and o3 series models (e.g., o1-mini, o3-mini), which do not contain the gpt- substring.
- It ignores the persona.harness field, which is a highly reliable indicator of the provider family (claude maps to anthropic, and codex/opencode map to openai).
We can make this much more robust by checking persona.harness first, and also adding support for o1- and o3- prefixes.
function personaModelFamily(persona: PersonaSpec): LlmProviderFamily | null {
  if (persona.harness === 'claude') return 'anthropic';
  if (persona.harness === 'codex' || persona.harness === 'opencode') return 'openai';
  const model = nonEmpty(persona.model);
  if (!model) return null;
  const normalized = model.toLowerCase();
  if (normalized.startsWith('anthropic/') || normalized.includes('claude')) return 'anthropic';
  if (
    normalized.startsWith('openai/') ||
    normalized.startsWith('openai-codex/') ||
    normalized.includes('gpt-') ||
    normalized.includes('codex') ||
    normalized.startsWith('o1-') ||
    normalized.startsWith('o3-')
  ) {
    return 'openai';
  }
  return null;
}

const content = Array.isArray((payload as { content?: unknown }).content)
  ? ((payload as { content: unknown[] }).content)
  : [];
Defensive programming: payload is typed as unknown and cast directly to an object (payload as { content?: unknown }). If the API returns JSON null, accessing .content throws a TypeError at runtime; other primitives yield undefined rather than throwing, but the cast still hides the mismatch from the type checker. We should use the existing isRecord helper to safely guard this property access.
- const content = Array.isArray((payload as { content?: unknown }).content)
-   ? ((payload as { content: unknown[] }).content)
-   : [];
+ const content = isRecord(payload) && Array.isArray(payload.content)
+   ? payload.content
+   : [];
`ctx.llm: Anthropic response contained no text content (stop_reason=${String(
  (payload as { stop_reason?: unknown }).stop_reason ?? 'unknown'
)})`
Defensive programming: If payload is null (which is valid JSON), casting it to access stop_reason throws a TypeError. We should safely check if payload is a record before accessing stop_reason.
- `ctx.llm: Anthropic response contained no text content (stop_reason=${String(
-   (payload as { stop_reason?: unknown }).stop_reason ?? 'unknown'
- )})`
+ `ctx.llm: Anthropic response contained no text content (stop_reason=${String(
+   (isRecord(payload) ? payload.stop_reason : undefined) ?? 'unknown'
+ )})`
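A guard placed in front of ?? needs some care: && yields false when the guard fails, and ?? only replaces null or undefined, so the 'unknown' fallback would not apply and the message would read stop_reason=false. The sketch below contrasts the two forms. The isRecord definition is an assumed shape (the repository's helper is not shown here), and both function names are hypothetical.

```typescript
// Assumed shape of the isRecord helper; the repository's version is not shown here.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

// Guard with &&: a non-record payload produces false, which ?? does not replace.
function stopReasonWithAnd(payload: unknown): string {
  return String((isRecord(payload) && payload.stop_reason) ?? 'unknown');
}

// Guard with a ternary: a non-record payload produces undefined, so ?? applies.
function stopReasonLabel(payload: unknown): string {
  return String((isRecord(payload) ? payload.stop_reason : undefined) ?? 'unknown');
}
```

For a record payload both forms agree; they differ only when the payload is not an object, which is exactly the case the guard exists for.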
const choices = (payload as { choices?: unknown }).choices;
const first = Array.isArray(choices) ? choices[0] : undefined;
Defensive programming: If payload is null (which is valid JSON), casting it to access choices throws a TypeError. We should safely check if payload is a record before accessing choices.
- const choices = (payload as { choices?: unknown }).choices;
- const first = Array.isArray(choices) ? choices[0] : undefined;
+ const choices = isRecord(payload) ? payload.choices : undefined;
+ const first = Array.isArray(choices) ? choices[0] : undefined;
Review (claude-3): GO with one likely-functional gap on the OAuth leg + two model-string asks
Read the full diff at 53d18ef. The architecture is right: env-derived, fail-soft (no creds → existing throwing stub), exactly-one-auth-header discipline, persona.model steers family with prefix stripping, structured logs that name the credential SOURCE but never the value, 120s timeout + truncated error detail. Wiring through
1. (Likely functional) The CLAUDE_CODE_OAUTH_TOKEN leg probably 401s without the OAuth beta header. Claude Code setup-tokens are accepted by the Messages API only with
2. OpenAI default model
3. Codex-model passthrough:
Side observation for the activation-dependency flag: agreed it's real, and note the failure shape if the OAuth token itself is expired/rotated: Anthropic's authentication_error message is essentially "invalid bearer token", which will surface through
…t, codex-model fallback
- CLAUDE_CODE_OAUTH_TOKEN leg sends anthropic-beta: oauth-2025-04-20 (setup-tokens are rejected by /v1/messages on a bare Bearer); test pins that the header set DIFFERS between the two anthropic sources
- DEFAULT_OPENAI_MODEL gpt-5.1 -> gpt-5.5 (fleet-current)
- gpt-*-codex persona models steer family selection but fall back to the default chat model for /v1/chat/completions (codex models aren't served there); new test
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
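The header difference between the two Anthropic sources described in this commit can be sketched as below. The header names and values come from the PR text; the function name anthropicHeaders is hypothetical, and sending anthropic-version and content-type on both legs is an assumption.

```typescript
// Hypothetical helper; header values are taken from the PR text, the function name is not.
type AnthropicSource = 'ANTHROPIC_API_KEY' | 'CLAUDE_CODE_OAUTH_TOKEN';

function anthropicHeaders(source: AnthropicSource, secret: string): Record<string, string> {
  // Assumption: both legs share the version and content-type headers.
  const base: Record<string, string> = {
    'content-type': 'application/json',
    'anthropic-version': '2023-06-01',
  };
  if (source === 'ANTHROPIC_API_KEY') {
    // API keys travel in x-api-key only, never alongside a Bearer header.
    return { ...base, 'x-api-key': secret };
  }
  // Setup-tokens travel as a Bearer token and carry the OAuth beta header.
  return {
    ...base,
    authorization: `Bearer ${secret}`,
    'anthropic-beta': 'oauth-2025-04-20',
  };
}
```

Returning a fresh object per source keeps the exactly-one-auth-header rule easy to test: the two header sets can be compared key by key.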
All three review findings addressed at
Also +1 on your triage note: an expired/rotated setup-token will now surface as
Re-reviewed the 53d18ef→8de5526 delta; all three findings closed exactly as asked: OAuth leg carries
User description
Root cause this fixes
ctx.llm has been structurally unwired in EVERY cloud deployment since the runtime existed: buildCtx only receives an LlmContext when a caller passes one (runner.ts forwards subsystems.llm only), createCloudRuntimeDefaults never built one, and no deploy/bootstrap path constructs one anywhere; a repo-wide grep finds zero constructors outside type definitions. Result: ctx.llm.complete() always threw the UNAVAILABLE_LLM stub, and its error message's advice (set persona.useSubscription:true and connect a provider) was misleading: useSubscription gates deploy-time harness-binary credential linking, and never flowed into buildCtx at all.
Why it surfaced now: the linear-chat-lead CWD/mount fix (relay-helpers 0.3.33) advanced handler execution past getIssue onto classifyIntent's ctx.llm.complete() (linear/agent.ts:188). granola (agent.ts:95) and hn-monitor (agent.ts:65) share the same dead path, so every ctx.llm-using persona has always been broken in cloud deployment.
Change
- cloud-llm.ts: createDefaultLlm({persona, env, log}) derives an LlmContext from sandbox env credentials, priority order:
  1. ANTHROPIC_API_KEY → Anthropic Messages API via x-api-key + anthropic-version: 2023-06-01
  2. CLAUDE_CODE_OAUTH_TOKEN → same API via Authorization: Bearer (the claude setup-token env that cloud#1629 injects for oauth_token provider credentials), never via x-api-key, and never both headers at once
  3. OPENAI_API_KEY → OpenAI chat completions, bearer
- If persona.model names a family (claude-* / anthropic/... vs gpt-* / openai/... / openai-codex/...), a credential of that family is preferred and the persona's model is used (provider prefix stripped). Family defaults otherwise: claude-opus-4-8 / gpt-5.1
- createCloudRuntimeDefaults exposes llm?; startRunner forwards subsystems.llm ?? cloudDefaults.llm (mirrors the existing workflow-subsystem wiring).
- No credential → undefined → buildCtx keeps the existing throwing stub (unchanged behavior for boxes with no LLM creds).
- Empty-content responses throw instead of returning '' (the granola JSON-parse path would otherwise mis-succeed); 120s timeout via AbortSignal.timeout.
Validation
- npm run build clean (after building persona-kit workspace dep)
- npm test (runtime): 46/46 pass, including 8 new cloud-llm tests covering: no-creds→undefined, x-api-key header shape, Bearer-only for setup-token (no x-api-key leak), OpenAI routing + prefix-strip, family preference with multiple creds, anthropic default with no family, non-2xx throw, empty-content throw.
Rollout chain (cloud side, after merge)
publish runtime → bump in cloud snapshot/agents pins (lockstep rules apply) → redeploy affected personas (linear-chat-lead, granola, hn-monitor). Note: the deployed-persona env for anthropic harnesses currently UNSETS ANTHROPIC_API_KEY (ambient-env scrub in deployment-trigger-delivery) and auths the binary via credential files, so for those personas this fix activates once cloud#1629's CLAUDE_CODE_OAUTH_TOKEN injection (merged today) deploys, or when the workspace uses BYOK env keys. Worth a follow-up discussion on whether the credential-file fallback belongs in the runtime too.
🤖 Generated with Claude Code
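The model-resolution rules in this description (prefix stripping, family defaults, and the codex fallback added in the follow-up commit) might be sketched as follows. All names here are hypothetical. Stripping everything up to the first slash is a simplifying assumption; the actual code may only strip known provider prefixes. The defaults are the ones listed in this description; the follow-up commit bumps the OpenAI default to gpt-5.5.

```typescript
// Hypothetical sketch of model resolution; not the PR's actual implementation.
type Family = 'anthropic' | 'openai';

// Defaults as listed in the PR description.
const DEFAULT_MODEL: Record<Family, string> = {
  anthropic: 'claude-opus-4-8',
  openai: 'gpt-5.1',
};

// Assumption: strip any leading "provider/" segment, e.g. "anthropic/" or "openai-codex/".
function stripProviderPrefix(model: string): string {
  const slash = model.indexOf('/');
  return slash === -1 ? model : model.slice(slash + 1);
}

function resolveModel(
  family: Family,
  personaModel: string | undefined,
  personaFamily: Family | null,
): string {
  const trimmed = personaModel?.trim();
  // Use the persona's model only when it belongs to the family of the chosen credential.
  if (!trimmed || personaFamily !== family) return DEFAULT_MODEL[family];
  const bare = stripProviderPrefix(trimmed);
  // Codex models are not served by /v1/chat/completions, so use the default chat model.
  if (family === 'openai' && bare.toLowerCase().includes('codex')) return DEFAULT_MODEL.openai;
  return bare;
}
```

Keeping family selection and model resolution separate lets a codex persona still steer the request to OpenAI while the model string itself falls back to one the chat completions endpoint serves.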
CodeAnt-AI Description
Use sandbox credentials for ctx.llm in cloud runs
What Changed
ctx.llm when a supported API credential is present in the sandbox, instead of always failing with the unavailable-LLM stub
Impact
✅ Working LLM calls in cloud personas
✅ Fewer runtime failures in chat and classification flows
✅ Clearer errors when an LLM request is rejected or empty