Skip to content

fix: enforce automatic quota-based account rotation#45

Merged
ndycode merged 8 commits into
devfrom
fix/auto-rotate-quota-bootstrap
Mar 6, 2026
Merged

fix: enforce automatic quota-based account rotation#45
ndycode merged 8 commits into
devfrom
fix/auto-rotate-quota-bootstrap

Conversation

@ndycode
Copy link
Copy Markdown
Owner

@ndycode ndycode commented Mar 5, 2026

Summary

  • rotate to the next account immediately on any 429 response (including short retry windows)
  • bootstrap runtime quota-awareness from persisted quota cache so already exhausted or rate-limited accounts are skipped automatically
  • match cached entries by accountId first, then normalized email fallback
  • add a default-on env escape hatch: CODEX_AUTH_QUOTA_CACHE_BOOTSTRAP=0
  • use lazy quota-cache reload with a 30 minute TTL and reset-bounded actionability

Key behavior changes

  • removed same-account short 429 retry in runtime request path
  • applied cached 429 or 100 percent usage entries before account selection and deferral checks
  • only apply cached entries when actionable (future reset or fresh within TTL)

Tests

  • added fetch-path tests for:
    • immediate cross-account rotation on short-window 429
    • quota-cache bootstrap skipping a pre-rate-limited account without manual account switching
    • env escape hatch disabling quota-cache bootstrap
  • updated retry test mocks for new runtime imports

Validation

pm run typecheck

pm run lint

pm test -- test/index.test.ts test/index-retry.test.ts

pm test

note: greptile review for oc-chatgpt-multi-auth. cite files like lib/foo.ts:123. confirm regression tests + windows concurrency/token redaction coverage.

Greptile Summary

this pr bootstraps the runtime quota state from the persisted quota cache at request time, so already-exhausted or rate-limited accounts are skipped before they're even tried — and removes the same-account short-window 429 retry in favour of immediate rotation. the bootstrap logic (lazy disk load, 60s failure cooldown, 30-min TTL, shared in-flight promise) addresses several previously flagged issues including Math.minMath.max, TTL over-blocking, and the dead outer catch.

key points:

  • loadQuotaCacheForBootstrap correctly deduplicates concurrent disk reads via a shared promise and enforces a 60s cooldown on failure — but has no timeout on loadQuotaCache() itself, which on windows with a blocking file lock (not an immediate throw) will stall all concurrent requests indefinitely since they share the in-flight promise
  • getQuotaCacheBootstrapWaitMs correctly returns 0 when reset times are known but elapsed, fixing the TTL over-blocking issue from the prior review
  • buildQuotaScheduleKey normalises provider-prefixed model ids consistently between the bootstrap and runtime deferral paths
  • the attempt counter surfaced in the toast is always 1 after immediate rotation (since resetRateLimitBackoff fires on every success); the label is vestigial and adds no signal
  • all in-flight requests block on a fresh disk read at each 30-min TTL boundary; a stale-while-revalidate pattern would eliminate this latency spike on windows
  • the EPERM/EBUSY windows regression test is a good addition, but the concurrent deduplication path (second caller returns the shared in-flight promise rather than spawning a second disk read) has no vitest coverage

Confidence Score: 3/5

  • safe to merge with caveats — the blocking-lock hang on windows and missing concurrent deduplication test are the outstanding risks.
  • the core logic is sound and correctly addresses previously flagged issues. the main concern is the absence of a timeout on the loadQuotaCache() call in the bootstrap path: on windows a non-throwing blocking lock would cause all concurrent requests to hang indefinitely on the shared promise. the concurrent deduplication path is also untested. these are not regressions from the previous baseline but are new exposure surface introduced by this pr.
  • index.ts — specifically loadQuotaCacheForBootstrap (lines 437–471) and the applyQuotaCacheBootstrapForModel integration point (line 1686)

Important Files Changed

Filename Overview
index.ts adds ~237 lines of quota-cache bootstrap logic: lazy disk load with 60s failure cooldown, TTL deduplication via shared promise, per-account rate-limit preseeding from persisted cache. correctly addresses prior Math.min/max, TTL over-blocking, and dead-catch issues. new risks: no timeout on loadQuotaCache() (windows blocking lock stalls all requests), stale-data penalty at 30-min TTL boundary, vestigial attempt counter in toast.
test/index.test.ts adds bootstrap tests covering: immediate 429 rotation, email-normalized cache skip, provider-prefixed model matching, env escape hatch, EPERM recovery, and 60s cooldown retry. concurrent loadQuotaCacheForBootstrap deduplication (the shared-promise path) is not covered.
test/index-retry.test.ts minimal mock updates: adds loadQuotaCache stub returning empty cache, adds getAccountByIndex stub, adds getQuotaKey to accounts mock. two new tests: normalized scheduler keys for provider-prefixed models, and family-level scheduler keys for null-model requests. no new issues.

Sequence Diagram

sequenceDiagram
    participant R as Request Handler
    participant B as applyQuotaCacheBootstrapForModel
    participant L as loadQuotaCacheForBootstrap
    participant D as loadQuotaCache (disk)
    participant AM as AccountManager

    R->>B: call (accountSnapshots, modelFamily, model, enabled)
    B->>L: loadQuotaCacheForBootstrap()
    alt TTL cache hit
        L-->>B: return quotaCacheBootstrapData (no disk read)
    else failure cooldown active
        L-->>B: return createEmptyQuotaCache()
    else in-flight promise exists
        L-->>B: return quotaCacheBootstrapLoadPromise (shared)
    else cache miss / TTL expired
        L->>D: await loadQuotaCache()
        Note over L,D: ⚠️ no timeout — blocking lock on windows stalls all sharers
        D-->>L: QuotaCacheData
        L->>L: set quotaCacheBootstrapData, loadedAt
        L-->>B: return loaded cache
    end
    loop for each accountSnapshot
        B->>B: getQuotaCacheEntryForCandidate (accountId first, email fallback)
        B->>B: shouldApplyQuotaCacheEntry (status 429 or usedPercent≥100, waitMs>0)
        alt entry actionable
            B->>AM: markRateLimitedWithReason(account, waitMs, family, "quota", model)
            B->>AM: preemptiveQuotaScheduler.update(quotaScheduleKey, snapshot)
        end
    end
    B-->>R: void

    R->>AM: getCurrentOrNextForFamilyHybrid (skips pre-blocked accounts)
    R->>D: fetch (rotated to unblocked account)

    alt 429 response at runtime
        R->>AM: markRateLimitedWithReason (immediate rotation, no same-account retry)
        R->>AM: break → try next account
    end
Loading

Fix All in Codex

Last reviewed commit: 2d204ca

Context used:

  • Rule from dashboard - What: Every code change must explain how it defends against Windows filesystem concurrency bugs and ... (source)

Rotate immediately on 429 responses and bootstrap runtime selection from persisted quota cache (accountId-first, normalized email fallback) with a 30m lazy TTL and env escape hatch.

Co-authored-by: Codex <noreply@openai.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 5, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

walkthrough

added quota-cache bootstrap into the oauth/account snapshot flow, exposed quota types and loader, and changed 429 handling to update quota scheduler and rotate accounts instead of short-retrying. see lib/index.ts:1, lib/quota-cache.js:1, test/index.test.ts:1.

changes

Cohort / File(s) Summary
core index & quota integration
lib/index.ts
added quota bootstrap state, helpers and apply/load hooks; wired quotaCacheBootstrapEnabled into model/account processing; replaced short-threshold 429 retry branch with scheduler updates and account rotation (lib/index.ts:1).
quota modules & types
lib/quota-cache.js, lib/preemptive-quota-scheduler.js
implemented loadQuotaCacheForBootstrap caching/TTL/retry logic and exported loadQuotaCache, QuotaCacheData, QuotaCacheEntry, QuotaSchedulerSnapshot used by index (lib/quota-cache.js:1, lib/preemptive-quota-scheduler.js:1).
accounts utilities
lib/accounts.js
exported getQuotaKey and used quota-key mapping/normalization for bootstrap by-email and by-account-id (lib/accounts.js:1).
tests & mocks
test/index.test.ts, test/index-retry.test.ts
added loadQuotaCache mocks; extended AccountManager mock with getAccountByIndex and getQuotaKey; added tests exercising bootstrap enable/disable, account skipping, rotation on 429, and concurrent bootstrap reads (test/index.test.ts:1, test/index-retry.test.ts:1).
public surface & exports
index.ts
exposed quota loader and types on public API and wired env flag CODEX_AUTH_QUOTA_CACHE_BOOTSTRAP into runtime (index.ts:1).

sequence diagram(s)

sequenceDiagram
    participant oauth as OAuth handler
    participant qc as quota cache
    participant am as account manager
    participant qs as quota scheduler
    participant rl as rate limiter

    oauth->>qc: loadQuotaCacheForBootstrap()
    qc-->>oauth: quota maps (byEmail/byAccountId)
    loop for each family/model
        oauth->>am: getAccountByIndex(i)
        am-->>oauth: account
        oauth->>qs: applyQuotaCacheBootstrapForModel(account, quotaEntry)
        qs-->>qs: schedule/update entries
    end
    oauth->>rl: make request with token
    rl->>qs: query quota state for token
    qs-->>rl: exhausted? (yes/no)
    alt exhausted
        rl->>oauth: rotate account (use next token)
    else not exhausted
        rl-->>oauth: proceed with request
    end
Loading

estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

suggested labels

bug

notes for reviewers

  • missing regression tests: add explicit regression tests for prior 429 retry semantics when CODEX_AUTH_QUOTA_CACHE_BOOTSTRAP is unset. see test/index.test.ts:1.
  • concurrency risks: review caching and concurrent load protection in lib/quota-cache.js:1 and bootstrap integration in lib/index.ts:1 for races and staleness under concurrent requests.
  • windows edge case: validate email normalization and unicode/case folding used as quota key across platforms. see lib/index.ts:1, test/index.test.ts:1.
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix: enforce automatic quota-based account rotation' clearly summarizes the main change: immediate account rotation on 429 responses and quota-cache-based account skipping.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed pr description is comprehensive and covers objectives, test scenarios, and validation steps. structure mostly aligns with template, though validation section is empty.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/auto-rotate-quota-bootstrap

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment thread index.ts
Comment thread index.ts
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@index.ts`:
- Around line 428-437: The call to loadQuotaCacheForBootstrap inside
applyQuotaCacheBootstrapForModel can throw and currently will abort the whole
request; make cache-loading best-effort by wrapping the await
loadQuotaCacheForBootstrap() call in a try/catch, on error log a warning (use
accountManager or existing logger) including the error, set cache to
undefined/null, and continue so account rotation proceeds; ensure subsequent
code that reads `cache` can handle it being absent (add null checks if needed)
so failures from loadQuotaCacheForBootstrap don’t stop the flow.

In `@test/index.test.ts`:
- Around line 1184-1439: Tests for the quota-cache bootstrap are missing
deterministic regression cases for concurrent bootstrap loads, token-refresh
races during account rotation, and Windows-style cache read failures
(EPERM/EBUSY); add vitest tests that (1) simulate concurrent calls to sdk.fetch!
that trigger a single loadQuotaCacheMock resolution to verify only one bootstrap
load occurs, (2) reproduce a token refresh race by mocking
AccountManager.prototype.toAuthDetails to delay/resolve differently across
simultaneous requests and assert correct rotation via
getCurrentOrNextForFamilyHybrid/getAccountByIndex calls, and (3) simulate
loadQuotaCacheMock rejecting with EPERM/EBUSY errors (and transient retries)
while ensuring request flow continues and assertions on headers still pass; use
existing helpers like setupPlugin, loadQuotaCacheMock, and spies on
AccountManager methods to keep tests deterministic with vi.mocked/vi.spyOn and
explicit mockResolvedValueOnce/mockRejectedValueOnce patterns.
- Around line 1267-1350: Add a regression test that exercises the
normalized-email fallback path by making loadQuotaCacheMock return an empty
byAccountId and a byEmail entry keyed by a normalized address, then create
AccountManager snapshot entries where one account.email includes mixed case and
surrounding whitespace; call setupPlugin() and sdk.fetch() as in the existing
"uses quota cache bootstrap..." test and assert the request used the access
token for the account matched via byEmail. Specifically, adjust the existing
test to (1) change loadQuotaCacheMock to include the quota under byEmail
(normalize-case/no-whitespace key) and remove the byAccountId hit, (2) keep the
AccountManager mocked methods used (getAccountsSnapshot,
getCurrentOrNextForFamilyHybrid, getAccountByIndex, toAuthDetails) but make one
account.email contain mixed-case/whitespace, and (3) assert fetch was called
once and headers contain the access token corresponding to the account matched
via byEmail; restore spies as the test currently does.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 8c9caf77-cbfa-4e38-8fdc-10303806624f

📥 Commits

Reviewing files that changed from the base of the PR and between d36b04f and 0deb16f.

📒 Files selected for processing (3)
  • index.ts
  • test/index-retry.test.ts
  • test/index.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/index.test.ts
  • test/index-retry.test.ts
🔇 Additional comments (5)
index.ts (2)

407-425: good in-flight dedupe for cache bootstrap load.

the shared promise + ttl cache at lib/index.ts:407 avoids duplicate loads under concurrent fetches and keeps overhead bounded.


2196-2238: good immediate 429 rotation path.

the rewritten 429 branch in lib/index.ts:2196 now marks quota state and exits to account rotation immediately. this matches the new behavior requirement and removes same-account short-window retry.

test/index-retry.test.ts (1)

38-40: mock surface update looks correct.

these additions keep test/index-retry.test.ts aligned with new runtime dependencies from lib/index.ts (loadQuotaCache, getAccountByIndex, getQuotaKey) and keep the test deterministic.

Also applies to: 64-67, 132-132

test/index.test.ts (2)

200-203: good deterministic quota-cache mock setup.

using a shared loadQuotaCacheMock plus reset in test/index.test.ts:1030 keeps bootstrap tests isolated and repeatable.

Also applies to: 1030-1032


1184-1265: good regression for immediate cross-account 429 rotation.

the assertions in test/index.test.ts:1248-test/index.test.ts:1258 verify the second attempt uses a different account token, which directly protects the new runtime contract.

Comment thread index.ts
Comment thread test/index.test.ts
Comment thread test/index.test.ts Outdated
- block exhausted accounts until the furthest reset window
- make quota bootstrap loading best-effort on cache failures
- add regressions for email fallback, concurrency, refresh races, and EPERM/EBUSY

Co-authored-by: Codex <noreply@openai.com>
@ndycode
Copy link
Copy Markdown
Owner Author

ndycode commented Mar 5, 2026

Addressed all actionable review comments in commit \ 9838a3\.

  • Switched quota bootstrap reset aggregation to \Math.max(...)\ so dual-window exhaustion blocks until the later reset.
  • Made quota bootstrap loading explicitly best-effort (defensive catch in loader + apply path guard).
  • Expanded regressions in \ est/index.test.ts\ for:
    • normalized-email fallback matching
    • concurrent bootstrap load dedupe
    • token-refresh race during rotation
    • EPERM/EBUSY bootstrap-read failures continuing request flow
    • furthest-window bootstrap wait behavior

Validation run on this branch:

  • \
    pm run typecheck\
  • \
    pm run lint\
  • \
    pm test\

All passing.

Comment thread index.ts
Comment thread index.ts
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/index.test.ts`:
- Around line 1541-1543: The test currently polls microtask timing with a loop
awaiting Promise.resolve() (checking loadQuotaCacheMock.mock.calls.length),
which is flaky; replace this with a deterministic barrier: instrument the
production/test helper to expose a deferred Promise (e.g., createDeferred or a
"quotaCacheReady" promise) that the code under test awaits at the exact
transition, then in the test explicitly resolve that deferred to advance state
and assert loadQuotaCacheMock.mock.calls; update both places referencing
loadQuotaCacheMock.mock.calls (the loop at the current location and the similar
block around the other occurrence) to await the exposed barrier instead of
spinning on Promise.resolve(), ensuring deterministic ordering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0e12596f-69df-4a01-ae96-a9333f9b1088

📥 Commits

Reviewing files that changed from the base of the PR and between 0deb16f and f9838a3.

📒 Files selected for processing (2)
  • index.ts
  • test/index.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/index.test.ts
🧬 Code graph analysis (2)
test/index.test.ts (2)
lib/accounts.ts (1)
  • AccountManager (102-795)
lib/utils.ts (1)
  • nowMs (31-33)
index.ts (5)
lib/quota-cache.ts (3)
  • QuotaCacheData (22-25)
  • QuotaCacheEntry (13-20)
  • loadQuotaCache (170-198)
lib/preemptive-quota-scheduler.ts (1)
  • QuotaSchedulerSnapshot (6-11)
lib/prompts/codex.ts (2)
  • ModelFamily (47-52)
  • getModelFamily (94-122)
lib/entitlement-cache.ts (1)
  • resolveEntitlementAccountKey (55-62)
lib/request/rate-limit-backoff.ts (1)
  • getRateLimitBackoff (48-89)
🔇 Additional comments (3)
test/index.test.ts (1)

1277-1392: good coverage expansion for quota bootstrap regressions.

nice additions for normalized-email fallback, furthest reset window handling, windows lock failures (eperm/ebusy), and bootstrap escape-hatch behavior in test/index.test.ts:1277, test/index.test.ts:1394, test/index.test.ts:1727, and test/index.test.ts:1766.

Also applies to: 1394-1515, 1727-1764, 1766-1853

index.ts (2)

309-500: bootstrap runtime path looks robust.

the quota bootstrap flow is clean: ttl/in-flight dedupe, accountid-first then normalized-email fallback matching, actionable wait gating, and scheduler/account application. this lines up well with the cache contracts in lib/quota-cache.ts:12 and lib/quota-cache.ts:169.

Also applies to: 1642-1648


2215-2257: 429 handling now rotates immediately with consistent state updates.

this path now does the right sequence for rotation: derive backoff, update preemptive scheduler, mark account rate-limited, persist, and break to move cross-account. that behavior aligns with lib/request/rate-limit-backoff.ts:47.

Comment thread test/index.test.ts Outdated
- avoid cross-model bootstrap blocking within the same family
- align scheduler keys with the model that produced cached quota state
- make concurrency/race tests deterministic with explicit barriers

Co-authored-by: Codex <noreply@openai.com>
@ndycode
Copy link
Copy Markdown
Owner Author

ndycode commented Mar 5, 2026

Addressed the newly raised review findings in commit a17553b.

  • Bootstrap now only applies cached entries when the entry model matches the requested model (when a requested model is provided), preventing cross-model false blocking.
  • Bootstrap scheduler keys now align with the applied model so snapshot data and key model are consistent.
  • Concurrency/race tests now use deterministic deferred barriers (no microtask polling).
  • Added regression: ignore different-model cache entries within the same family.

Validation:

  • npm run typecheck
  • npm run lint
  • npm test

All passing.

Comment thread index.ts Outdated
- simplify bootstrap apply path to use the loader's best-effort fallback
- keep error handling centralized in loadQuotaCacheForBootstrap

Co-authored-by: Codex <noreply@openai.com>
@ndycode
Copy link
Copy Markdown
Owner Author

ndycode commented Mar 5, 2026

Addressed the most recent review thread in 5088063.

  • Removed the unreachable outer catch in applyQuotaCacheBootstrapForModel.
  • Kept bootstrap failure handling centralized in loadQuotaCacheForBootstrap (best-effort fallback).

Validation:

  • npm run typecheck
  • npm run lint
  • npm test -- test/index.test.ts

Current unresolved active review threads: 0.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@index.ts`:
- Around line 447-459: The strict equality check between entry.model and the
requested model can fail for semantically equivalent IDs with different
prefixes/casing; update the selection logic in the loop that uses requestedModel
and entry.model (after obtaining entry via getQuotaCacheEntryForCandidate and
checking getModelFamily(entryModel) === modelFamily) to compare normalized IDs
instead (apply the same normalization used elsewhere for model ids, e.g.,
lowercase and strip provider prefixes) so requestedModel and entryModel are
normalized before the equality check; then add a regression test in
test/index-retry.test.ts (or the bootstrap test) covering equivalent model-id
variants (different casing and provider prefixes) to ensure pre-selection still
matches.
- Around line 417-420: The current bootstrap caches the result of
loadQuotaCache() (assigning quotaCacheBootstrapData and
quotaCacheBootstrapLoadedAt) even when loadQuotaCache returns empty maps after
read failures, so add a failure-cooldown path: when loadQuotaCache() returns an
empty/error-derived result, do not set the normal 30-minute ttl variables
(quotaCacheBootstrapData / quotaCacheBootstrapLoadedAt) but instead set a
separate short-lived cooldown timestamp/flag (e.g.,
quotaCacheBootstrapFailureAt) and return the empty result; ensure the normal
bootstrap assignment only happens when loadQuotaCache() returns a
non-empty/valid map; add a regression test that simulates a transient read
failure on the first call (loadQuotaCache returns error-derived empty) and a
successful read on the second call within the short cooldown window, asserting
that the second call replaces quotaCacheBootstrapData with the valid data (i.e.,
recovery occurs) and that the failure cooldown prevents long-term caching of the
empty data.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 96aa5ff1-c61a-4e9b-b9dd-f32e54cb3aed

📥 Commits

Reviewing files that changed from the base of the PR and between a17553b and 5088063.

📒 Files selected for processing (1)
  • index.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
🧬 Code graph analysis (1)
index.ts (3)
lib/quota-cache.ts (3)
  • QuotaCacheData (22-25)
  • QuotaCacheEntry (13-20)
  • loadQuotaCache (170-198)
lib/preemptive-quota-scheduler.ts (1)
  • QuotaSchedulerSnapshot (6-11)
lib/prompts/codex.ts (2)
  • ModelFamily (47-52)
  • getModelFamily (94-122)
🔇 Additional comments (2)
index.ts (2)

412-434: good concurrency guard on quota-cache bootstrap loads.

quotaCacheBootstrapLoadPromise dedupes concurrent requests and prevents repeated parallel reads in lib/index.ts:412-434. this is the right pattern for startup/load spikes.


2212-2254: 429 handling now rotates immediately and keeps quota state consistent.

the updated path in lib/index.ts:2212-2254 records scheduler/account rate-limit state and breaks to rotate accounts without same-account short-window retry. this matches the stated behavior change.

Comment thread index.ts
Comment thread index.ts Outdated
Comment thread index.ts
Comment thread index.ts
- avoid over-blocking when quota reset timestamps are known but elapsed
- add failure cooldown with retry interval to prevent load hammering while allowing recovery
- normalize model ids for bootstrap exact matching using provider-prefix stripping + lowercase
- add regression coverage for model-id variants, elapsed resets, and cooldown recovery behavior

Co-authored-by: Codex <noreply@openai.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
test/index.test.ts (1)

2323-2357: ⚠️ Potential issue | 🟠 Major

ebusy windows lock path is not actually covered.

the test title says EPERM/EBUSY, but only an EPERM error is injected. this leaves one windows filesystem edge case unverified in test/index.test.ts:2323-2327.

proposed patch
-it("continues request flow when quota bootstrap cache reads hit EPERM/EBUSY", async () => {
+it.each(["EPERM", "EBUSY"] as const)(
+	"continues request flow when quota bootstrap cache reads hit %s",
+	async (code) => {
 	const fetchHelpers = await import("../lib/request/fetch-helpers.js");
-	const epermError = Object.assign(new Error("locked"), { code: "EPERM" });
-	loadQuotaCacheMock.mockRejectedValueOnce(epermError);
+	const lockError = Object.assign(new Error("locked"), { code });
+	loadQuotaCacheMock.mockRejectedValueOnce(lockError);
 	vi.mocked(fetchHelpers.createCodexHeaders).mockImplementation(
 		(_init, _accountId, accessToken) =>
 			new Headers({ "x-test-access-token": String(accessToken ?? "") }),
 	);
 	// existing assertions unchanged
-});
+});

as per coding guidelines, test/**: tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/index.test.ts` around lines 2323 - 2357, The test currently only injects
an EPERM error; add a second injected EBUSY error to exercise the Windows lock
path by creating ebusyError = Object.assign(new Error("locked"), { code: "EBUSY"
}) and calling loadQuotaCacheMock.mockRejectedValueOnce(ebusyError) in addition
to the existing EPERM injection, then update assertions to reflect the
additional call (e.g., expect(loadQuotaCacheMock).toHaveBeenCalledTimes(2))
while keeping the rest of the flow using setupPlugin and sdk.fetch as-is;
reference loadQuotaCacheMock, ebusyError, epermError, setupPlugin, and sdk.fetch
to locate and modify the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@index.ts`:
- Around line 537-541: Extract a single shared key-builder function (e.g.,
buildQuotaScheduleKey(modelOrFamily, accountKey)) that applies
normalizeQuotaCacheModelId to the model id before concatenating with
resolveEntitlementAccountKey; replace the ad-hoc construction at the bootstrap
write site (currently producing quotaScheduleKey before
preemptiveQuotaScheduler.update), the runtime deferral read site (where model ??
modelFamily is used) and the runtime fallback update site (before
preemptiveQuotaScheduler.update at the later path) to all call this new function
so keys are normalized consistently; then add a regression test in
test/index-retry.test.ts that seeds a bootstrap snapshot using a normalized id
and asserts that a runtime request using provider-prefixed (e.g., openai/gpt-4)
or case-variant ids hits the cached deferral.

In `@test/index.test.ts`:
- Around line 2489-2501: The recovery assertion is too weak: replace the loose
expect(thirdHeaders.get("x-test-access-token")).toBeTruthy() with a
deterministic assertion that the header equals the expected rotated token value
(e.g., the intended account token string) and also assert the fetch mock was
called at least three times; update the surrounding checks on firstHeaders and
secondHeaders to explicitly assert they are not equal to that same expected
token to prove rotation/ recovery occurred (refer to the mocked globalThis.fetch
calls and the variables firstHeaders, secondHeaders, thirdHeaders to locate the
assertions to change).

---

Duplicate comments:
In `@test/index.test.ts`:
- Around line 2323-2357: The test currently only injects an EPERM error; add a
second injected EBUSY error to exercise the Windows lock path by creating
ebusyError = Object.assign(new Error("locked"), { code: "EBUSY" }) and calling
loadQuotaCacheMock.mockRejectedValueOnce(ebusyError) in addition to the existing
EPERM injection, then update assertions to reflect the additional call (e.g.,
expect(loadQuotaCacheMock).toHaveBeenCalledTimes(2)) while keeping the rest of
the flow using setupPlugin and sdk.fetch as-is; reference loadQuotaCacheMock,
ebusyError, epermError, setupPlugin, and sdk.fetch to locate and modify the
test.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a0e89972-b094-4fd5-bd27-b0621c7b3940

📥 Commits

Reviewing files that changed from the base of the PR and between 5088063 and ce0d061.

📒 Files selected for processing (2)
  • index.ts
  • test/index.test.ts
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
test/**

⚙️ CodeRabbit configuration file

tests must stay deterministic and use vitest. demand regression cases that reproduce concurrency bugs, token refresh races, and windows filesystem behavior. reject changes that mock real secrets or skip assertions.

Files:

  • test/index.test.ts
🧬 Code graph analysis (2)
test/index.test.ts (1)
lib/accounts.ts (1)
  • AccountManager (102-795)
index.ts (3)
lib/quota-cache.ts (3)
  • QuotaCacheData (22-25)
  • QuotaCacheEntry (13-20)
  • loadQuotaCache (170-198)
lib/preemptive-quota-scheduler.ts (1)
  • QuotaSchedulerSnapshot (6-11)
lib/request/rate-limit-backoff.ts (1)
  • getRateLimitBackoff (48-89)
🔇 Additional comments (8)
index.ts (4)

374-393: good reset-wait computation and elapsed-reset handling.

this logic correctly avoids ttl fallback blocking when reset timestamps exist but are already elapsed, while still bounding unknown-reset entries by freshness. this is the right behavior for actionable bootstrap gating in lib/index.ts:374-393.


431-482: bootstrap loader concurrency and windows failure behavior look solid.

the in-flight promise dedupe plus cooldown/retry window prevents per-request reload storms and keeps bootstrap best-effort under transient file lock/read failures in lib/index.ts:431-482 (important for windows eperm/ebusy patterns).


1686-1692: bootstrap application point is correct.

applying cache bootstrap before candidate traversal and deferral checks in lib/index.ts:1686-1692 is the right ordering for pre-selection skipping.


2259-2315: 429 path now rotates immediately as intended.

this removes same-account short-window retry behavior and consistently marks/schedules/reroutes on rate-limit responses in lib/index.ts:2259-2315. that matches the pr objective.

test/index.test.ts (4)

200-203: solid bootstrap test harness updates.

good setup work here. the dedicated quota-cache mock, per-test reset, and deferred barrier helper reduce flake risk and make async ordering explicit in test/index.test.ts:200-203, test/index.test.ts:1030-1032, and test/index.test.ts:1066-1074.

Also applies to: 212-212, 338-340, 360-360, 1030-1032, 1066-1074


1194-2107: strong coverage across rotation and bootstrap actionability paths.

these regressions cover the high-risk behavior changes well: immediate cross-account rotation on short-window 429, normalized-email fallback, model-aware matching, elapsed reset handling, and furthest-window blocking in test/index.test.ts:1194-2107.


2109-2321: deterministic concurrency/race validation looks good.

the deferred barriers remove timing flake and directly exercise concurrent bootstrap-load dedupe plus token-refresh race behavior in test/index.test.ts:2109-2321.


2512-2599: escape-hatch regression is correctly pinned.

nice coverage of CODEX_AUTH_QUOTA_CACHE_BOOTSTRAP=0; this confirms bootstrap bypass and normal account selection behavior in test/index.test.ts:2512-2599.

Comment thread index.ts Outdated
Comment thread test/index.test.ts
- use shared quota schedule key builder in bootstrap, runtime deferral, and stream fallback updates
- add regression in index-retry for provider-prefixed model-id deferral matching
- tighten recovery assertion to verify rotation target account deterministically

Co-authored-by: Codex <noreply@openai.com>
Comment thread index.ts Outdated
Comment thread index.ts Outdated
Co-authored-by: Codex <noreply@openai.com>
Comment thread index.ts
Co-authored-by: Codex <noreply@openai.com>
@ndycode ndycode changed the base branch from main to dev March 6, 2026 01:10
@ndycode ndycode merged commit e458ef7 into dev Mar 6, 2026
3 checks passed
@ndycode ndycode deleted the fix/auto-rotate-quota-bootstrap branch March 8, 2026 12:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant