ndycode · May 31, 2026
@@ -73,6 +73,8 @@ These are safe for most operators and frequently used in day-to-day workflows.
 | `CODEX_TUI_GLYPHS=ascii|unicode|auto` | Glyph mode selection |
 | `CODEX_AUTH_FETCH_TIMEOUT_MS=<ms>` | HTTP request timeout override |
 | `CODEX_AUTH_STREAM_STALL_TIMEOUT_MS=<ms>` | Stream stall timeout override |
+| `CODEX_AUTH_MIN_ROTATION_INTERVAL_MS=<ms>` | Minimum time between global account switches (default `60000`). The proxy biases selection toward the last-served account within this window to reduce the rate at which different OAuth tokens appear from the same IP. Set to `0` to disable. |
+| `CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS=<ms>` | Cooldown applied to an account when the upstream or token-refresh endpoint explicitly revokes its OAuth token (default `300000`, 5 minutes). Raise this if accounts continue to be re-invalidated after re-login. |
 
 ---
 
@@ -110,6 +112,13 @@ Keep these enabled for most environments:
 
 The proxy preserves request bodies and streaming responses, replaces outbound auth headers with the selected managed account, and rotates to another account before response bytes are streamed when it sees rate limits, server errors, network failures, or refresh failures. It removes hop-by-hop headers, private account metadata headers, and stale decoded `content-encoding` from client responses. If every account is unavailable, the proxy returns a structured pool-exhaustion error that points to `codex-multi-auth rotation status`.
 
+**Anti-abuse protection.** Rapidly switching OAuth tokens from the same IP can trigger OpenAI's anti-abuse detection and cause accounts to be invalidated in sequence. The proxy includes two mitigations:
+
+- **Token-invalidation detection**: when the upstream or the token-refresh endpoint returns an explicit OAuth revocation message, the proxy returns the error directly to the client instead of rotating to the next account. The affected account receives a 5-minute cooldown (`tokenInvalidationCooldownMs`, default `300000`) instead of the generic 30-second auth-failure cooldown. Configure via `CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS`.
+- **Rotation-rate throttle**: the proxy biases account selection toward the last-served account for a configurable window (`minRotationIntervalMs`, default 60 seconds). Accounts that are rate-limited or cooling down are still skipped in favor of an available account. Configure via `CODEX_AUTH_MIN_ROTATION_INTERVAL_MS`; set it to `0` to disable.
+
+Microsoft/Outlook SSO accounts may be more sensitive to proxy-mediated token use. If an Outlook-linked account is invalidated on every first request through the proxy but works normally on ChatGPT web, the root cause is likely IP or device binding on the Microsoft side. Raising `CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS` and logging the affected account in again typically stops the cascade. If the problem persists, consider keeping the Microsoft account out of rotation, for example by pinning a different account with `codex-multi-auth switch <n>`.
+
 For `codex app` launches that go through the wrapper, the wrapper automatically starts a small internal helper so rotation can keep working if the desktop app launcher detaches. The helper stores only local runtime status, uses the same per-session proxy client key as the CLI path, and exits after an idle timeout.
 
 `codex-multi-auth rotation enable` also binds the packaged desktop app to a persistent localhost router. This backs up the real Codex `config.toml`, writes the `codex-multi-auth-runtime-proxy` provider into the real Codex home, starts the router immediately, and installs a user login startup entry: a Startup `.cmd` on Windows or a LaunchAgent on macOS. The persistent provider is marked as not requiring OpenAI auth and uses a local app-bind client token, so the desktop runtime does not display the selected multi-auth account while codex-multi-auth status and quota views still read the router's last-account telemetry. `codex-multi-auth rotation disable` and `codex-multi-auth rotation unbind-app` stop that router, remove the startup entry, and restore the backed-up Codex config. The official app files are not patched.
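
The rotation-rate throttle documented in this hunk can be pictured as a small pure function. The following is a minimal sketch, assuming a score-boost map keyed by account index. `stickyBoostFor` and `STICKY_BOOST` are illustrative names, not part of the codex-multi-auth API; the value `1000` mirrors the boost used in the proxy change later in this PR.

```typescript
// Hypothetical helper for illustration only; not part of codex-multi-auth.
const STICKY_BOOST = 1000;

function stickyBoostFor(
	lastIndex: number | null,
	lastSwitchAt: number,
	nowMs: number,
	minRotationIntervalMs: number,
): Record<number, number> {
	// A window of 0 disables the throttle; no boost before the first served account.
	if (minRotationIntervalMs <= 0 || lastIndex === null) return {};
	// Outside the window the last-served account competes like any other.
	if (nowMs - lastSwitchAt >= minRotationIntervalMs) return {};
	return { [lastIndex]: STICKY_BOOST };
}

// 30 s after the reference timestamp, account 2 is still preferred.
console.log(stickyBoostFor(2, 1_000, 31_000, 60_000));
// With the throttle disabled, no account is boosted.
console.log(stickyBoostFor(2, 1_000, 31_000, 0));
```

The boost only biases scoring. Whether an unavailable account is skipped is decided elsewhere in the selection logic, which this sketch does not model.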

@@ -76,6 +76,8 @@ The package does not publish a global `codex` binary. `codex-multi-auth ...` is
 | `codex-multi-auth rotation status` says disabled | Stored setting or env override is off | Run `codex-multi-auth rotation enable`, remove `CODEX_MULTI_AUTH_RUNTIME_ROTATION_PROXY=0`, or set `CODEX_MULTI_AUTH_RUNTIME_ROTATION_PROXY=1` for one process |
 | Forwarded Codex session does not show the local provider | Command is help/non-requesting, rotation is disabled, or the official CLI was not launched through the wrapper | Check `where codex-multi-auth-codex`, then run `codex-multi-auth rotation status` |
 | Pool exhausted error from the proxy | Every managed account is unavailable for that model/family | Run `codex-multi-auth rotation status`, then `codex-multi-auth forecast --live` |
+| Accounts progressively lose OAuth tokens while the proxy is active | Rapid account rotation triggers OpenAI's anti-abuse detection, which invalidates tokens in sequence | The proxy detects explicit token-invalidation responses and stops rotating; re-login any invalidated accounts and ensure `minRotationIntervalMs` is at least `60000` (default) |
+| Microsoft/Outlook SSO account gets invalidated on every first request through the proxy | Microsoft OAuth tokens may be invalidated when the proxy presents them from a different IP or device context than where they were issued | The proxy detects invalidation at both the upstream request and the token-refresh stage; if the problem persists, set `CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS=600000` (10 min) and log in again, or keep the Microsoft account out of the rotation pool, for example by pinning another account with `codex-multi-auth switch <n>` |
 | Packaged app still uses normal Codex routing | App bind was not installed or was removed | Run `codex-multi-auth rotation bind-app`, then reopen the app |
 | Codex Desktop history disappears after app bind | Current Codex Desktop builds can filter local threads by the active provider, and app bind switches the real config to `codex-multi-auth-runtime-proxy` | The data is normally still under `~/.codex`; run `codex-multi-auth rotation unbind-app` or `codex-multi-auth rotation disable` to restore the original provider/config before browsing old history |
 | Model speed controls are not visible with rotation | Speed/reasoning controls remain owned by Codex config or CLI flags; the app bind only routes Responses traffic | Set `model_reasoning_effort` in `~/.codex/config.toml` or pass `-c model_reasoning_effort=<level>` for wrapper-launched CLI sessions |
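
For readers triaging the two token-invalidation rows above, here is a rough sketch of how a response body might be classified. The phrase list is copied from `TOKEN_INVALIDATION_PHRASES` in this PR; the function name `looksLikeTokenInvalidation` and the sample bodies are invented for illustration and are not real upstream responses.

```typescript
// Phrases copied from the proxy change in this PR; everything else is illustrative.
const PHRASES = [
	"invalidated oauth token",
	"authentication token has been invalidated",
	"oauth token has been invalidated",
	"token has been invalidated",
] as const;

function looksLikeTokenInvalidation(bodyText: string): boolean {
	// Case-insensitive substring match, as described in the source comment.
	const lower = bodyText.toLowerCase();
	return PHRASES.some((phrase) => lower.includes(phrase));
}

// Invented sample bodies, not captured from a real upstream.
const samples = [
	'{"error":{"message":"Your authentication token has been invalidated."}}',
	'{"error":{"message":"Token expired"}}',
];
for (const body of samples) {
	const action = looksLikeTokenInvalidation(body) ? "stop rotating" : "rotate or refresh";
	console.log(action, body);
}
```

Because matching is by substring, the shortest phrase already covers the two longer "has been invalidated" variants.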

@@ -198,6 +198,8 @@ export const DEFAULT_PLUGIN_CONFIG: PluginConfig = {
 	proactiveRefreshBufferMs: 5 * 60_000,
 	networkErrorCooldownMs: 6_000,
 	serverErrorCooldownMs: 4_000,
+	tokenInvalidationCooldownMs: 5 * 60_000,
+	minRotationIntervalMs: 60_000,
 	storageBackupEnabled: true,
 	preemptiveQuotaEnabled: true,
 	preemptiveQuotaRemainingPercent5h: 5,
@@ -1402,6 +1404,48 @@ export function getServerErrorCooldownMs(pluginConfig: PluginConfig): number {
 	);
 }
 
+/**
+ * Get the cooldown duration in milliseconds to apply when an OAuth token has been
+ * explicitly invalidated by the upstream (distinct from a generic 401).
+ *
+ * A longer default (5 minutes) prevents the cascade where rapid account rotation
+ * causes each successive account's token to be invalidated in turn by OpenAI's
+ * anti-abuse detection.
+ *
+ * @param pluginConfig - Plugin configuration used to resolve the setting
+ * @returns The cooldown in milliseconds (minimum 0, default 300000)
+ */
+export function getTokenInvalidationCooldownMs(pluginConfig: PluginConfig): number {
+	return resolveNumberSetting(
+		"CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS",
+		pluginConfig.tokenInvalidationCooldownMs,
+		5 * 60_000,
+		{ min: 0 },
+	);
+}
+
+/**
+ * Get the minimum time in milliseconds that must elapse between global account
+ * switches across requests. When the last served account is still within this
+ * window and is available, it receives a large selection-score boost so the
+ * proxy stays on it rather than rotating to a fresher idle account.
+ *
+ * Setting this to 0 disables the throttle. Default is 60 seconds, which
+ * reduces the rate at which different OAuth tokens are presented from the same
+ * IP and helps avoid OpenAI's anti-abuse detection (see issue #495).
+ *
+ * @param pluginConfig - Plugin configuration used to resolve the setting
+ * @returns The minimum rotation interval in milliseconds (minimum 0, default 60000)
+ */
+export function getMinRotationIntervalMs(pluginConfig: PluginConfig): number {
+	return resolveNumberSetting(
+		"CODEX_AUTH_MIN_ROTATION_INTERVAL_MS",
+		pluginConfig.minRotationIntervalMs,
+		60_000,
+		{ min: 0 },
+	);
+}
+
 /**
  * Determines whether periodic storage backups are enabled.
  *
@@ -1822,6 +1866,16 @@ const CONFIG_EXPLAIN_ENTRIES: ConfigExplainMeta[] = [
 		envNames: ["CODEX_AUTH_SERVER_ERROR_COOLDOWN_MS"],
 		getValue: getServerErrorCooldownMs,
 	},
+	{
+		key: "tokenInvalidationCooldownMs",
+		envNames: ["CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS"],
+		getValue: getTokenInvalidationCooldownMs,
+	},
+	{
+		key: "minRotationIntervalMs",
+		envNames: ["CODEX_AUTH_MIN_ROTATION_INTERVAL_MS"],
+		getValue: getMinRotationIntervalMs,
+	},
 	{
 		key: "storageBackupEnabled",
 		envNames: ["CODEX_AUTH_STORAGE_BACKUP_ENABLED"],
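
`resolveNumberSetting` itself is not shown in this diff. A minimal sketch of the precedence its call sites suggest (environment variable, then stored config, then default, clamped to `min`) could look like the following. The function name, the explicit `env` parameter, and the handling of malformed values are all assumptions made for illustration.

```typescript
// Hypothetical stand-in for resolveNumberSetting; the real implementation is not in this diff.
function resolveNumberSettingSketch(
	env: Record<string, string | undefined>,
	envName: string,
	configValue: number | undefined,
	defaultValue: number,
	options: { min?: number } = {},
): number {
	const raw = env[envName];
	const fromEnv = raw !== undefined && raw.trim() !== "" ? Number(raw) : Number.NaN;
	// Assumption: a missing or malformed env value falls through to the stored config.
	const candidate = Number.isFinite(fromEnv) ? fromEnv : configValue ?? defaultValue;
	// Clamp to the lower bound when one is given.
	return options.min !== undefined ? Math.max(options.min, candidate) : candidate;
}

const env = { CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS: "600000" };
console.log(
	resolveNumberSettingSketch(env, "CODEX_AUTH_TOKEN_INVALIDATION_COOLDOWN_MS", 300_000, 300_000, { min: 0 }),
);
```

Passing `env` explicitly keeps the sketch testable without touching process state; the real helper may read the environment directly.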

@@ -17,6 +17,8 @@ import {
 	getSessionAffinityMaxEntries,
 	getSessionAffinityTtlMs,
 	getStreamStallTimeoutMs,
+	getMinRotationIntervalMs,
+	getTokenInvalidationCooldownMs,
 	getTokenRefreshSkewMs,
 	loadPluginConfig,
 } from "./config.js";
@@ -116,7 +118,25 @@ interface RuntimeRotationAccountIdentity {
 const DEFAULT_HOST = "127.0.0.1";
 const DEFAULT_QUOTA_REMAINING_THRESHOLD = 10;
 const DEFAULT_AUTH_FAILURE_COOLDOWN_MS = 30_000;
+
 const DEFAULT_MAX_RUNTIME_ACCOUNT_ATTEMPTS = 4;
+
+// Phrases observed in upstream 401 response bodies when OpenAI/Microsoft has
+// explicitly revoked an OAuth token (as opposed to a generic expired-token 401
+// that can be retried after a refresh). Matching is case-insensitive substring.
+// If anti-abuse detection triggers different wording in production, add the new
+// phrase here and record the source provider and date. See issue #495.
+const TOKEN_INVALIDATION_PHRASES = [
+	"invalidated oauth token",
+	"authentication token has been invalidated",
+	"oauth token has been invalidated",
+	"token has been invalidated",
+] as const;
+
+function isTokenInvalidationError(bodyText: string): boolean {
+	const lower = bodyText.toLowerCase();
+	return TOKEN_INVALIDATION_PHRASES.some((phrase) => lower.includes(phrase));
+}
 const MAX_REQUEST_BODY_BYTES = 64 * 1024 * 1024;
 const MAX_THREAD_GOAL_FALLBACKS = 512;
 const HOP_BY_HOP_HEADERS = new Set([
@@ -636,6 +656,21 @@ function buildUpstreamUrl(
 	return upstream.toString();
 }
 
+// Monotonic auth-failure cooldown: only extend, never shorten. Two concurrent
+// requests on the same account can race so that an invalidation path sets a
+// long cooldown (5 min) and a subsequent generic 401 truncates it (30 s).
+// Reading the live coolingDownUntil before writing prevents that race.
+function applyMonotonicAuthCooldown(
+	accountManager: AccountManager,
+	account: ManagedAccount,
+	cooldownMs: number,
+): void {
+	const existing = accountManager.getAccountByIndex(account.index)?.coolingDownUntil ?? 0;
+	if (Date.now() + cooldownMs > existing) {
+		accountManager.markAccountCoolingDown(account, cooldownMs, "auth-failure");
+	}
+}
+
 function hasUsableAccessToken(
 	account: ManagedAccount,
 	now: number,
@@ -699,8 +734,13 @@ async function ensureFreshAccessToken(params: {
 	model: string | null;
 	now: number;
 	tokenRefreshSkewMs: number;
-}): Promise<{ ok: true; accessToken: string; account: ManagedAccount } | { ok: false; retryable: boolean }> {
-	const { accountManager, account, family, model, now, tokenRefreshSkewMs } = params;
+	tokenInvalidationCooldownMs: number;
+}): Promise<
+	| { ok: true; accessToken: string; account: ManagedAccount }
+	| { ok: false; retryable: boolean; invalidated?: boolean }
+> {
+	const { accountManager, account, family, model, now, tokenRefreshSkewMs, tokenInvalidationCooldownMs } =
+		params;
 	if (hasUsableAccessToken(account, now, tokenRefreshSkewMs)) {
 		return { ok: true, accessToken: account.access ?? "", account };
 	}
@@ -709,13 +749,18 @@ async function ensureFreshAccessToken(params: {
 	if (refreshResult.type === "failed") {
 		accountManager.recordFailure(account, family, model);
 		accountManager.incrementAuthFailures(account);
-		accountManager.markAccountCoolingDown(
+		// If the refresh endpoint itself returns an explicit invalidation message
+		// (e.g. Microsoft/Outlook SSO revokes the refresh token server-side), apply
+		// the long cooldown and signal to the caller to stop rotating rather than
+		// presenting other accounts' tokens from the same IP.
+		const invalidated = isTokenInvalidationError(refreshResult.message ?? "");
+		applyMonotonicAuthCooldown(
+			accountManager,
 			account,
-			DEFAULT_AUTH_FAILURE_COOLDOWN_MS,
-			"auth-failure",
+			invalidated ? tokenInvalidationCooldownMs : DEFAULT_AUTH_FAILURE_COOLDOWN_MS,
 		);
 		accountManager.saveToDiskDebounced();
-		return { ok: false, retryable: isTokenRefreshRetryable(refreshResult) };
+		return { ok: false, retryable: isTokenRefreshRetryable(refreshResult), invalidated };
 	}
 
 	const auth: OAuthAuthDetails = {
@@ -869,6 +914,7 @@ export function chooseAccount(params: {
 	policy: RuntimePolicyDecision | null;
 	pinnedIndex: number | null;
 	skipReasons?: Map<number, string>;
+	stickyBoostByAccount?: Record<number, number>;
 }): ManagedAccount | null {
 	const {
 		accountManager,
@@ -881,6 +927,7 @@ export function chooseAccount(params: {
 		policy,
 		pinnedIndex,
 		skipReasons,
+		stickyBoostByAccount,
 	} = params;
 
 	// Manual pin (from `codex-multi-auth switch <n>`) overrides every other
@@ -940,7 +987,10 @@ export function chooseAccount(params: {
 	}
 
 	const selected = accountManager.getCurrentOrNextForFamilyHybrid(family, model, {
-		scoreBoostByAccount: policy?.scoreBoostByAccount,
+		scoreBoostByAccount: {
+			...(policy?.scoreBoostByAccount ?? {}),
+			...(stickyBoostByAccount ?? {}),
+		},
 	});
 	if (
 		selected &&
@@ -1179,6 +1229,10 @@ export async function startRuntimeRotationProxy(
 	const tokenRefreshSkewMs = getTokenRefreshSkewMs(pluginConfig);
 	const networkErrorCooldownMs = getNetworkErrorCooldownMs(pluginConfig);
 	const serverErrorCooldownMs = getServerErrorCooldownMs(pluginConfig);
+	const tokenInvalidationCooldownMs = getTokenInvalidationCooldownMs(pluginConfig);
+	const minRotationIntervalMs = getMinRotationIntervalMs(pluginConfig);
+	let lastGlobalAccountIndex: number | null = null;
+	let lastGlobalSwitchAt = 0;
 	const fetchTimeoutMs = options.fetchTimeoutMs ?? getFetchTimeoutMs(pluginConfig);
 	const streamStallTimeoutMs =
 		options.streamStallTimeoutMs ?? getStreamStallTimeoutMs(pluginConfig);
@@ -1388,6 +1442,12 @@ export async function startRuntimeRotationProxy(
 				attemptedIndexes.size < accountCount &&
 				transientAttempts < transientAttemptLimit
 			) {
+				const rotationStickyBoost: Record<number, number> =
+					minRotationIntervalMs > 0 &&
+					lastGlobalAccountIndex !== null &&
+					now() - lastGlobalSwitchAt < minRotationIntervalMs
+						? { [lastGlobalAccountIndex]: 1000 }
+						: {};
 				const selected = chooseAccount({
 					accountManager,
 					sessionAffinityStore,
@@ -1399,6 +1459,7 @@ export async function startRuntimeRotationProxy(
 					policy: policyDecision,
 					pinnedIndex,
 					skipReasons: accountSkipReasons,
+					stickyBoostByAccount: rotationStickyBoost,
 				});
 				if (!selected) {
 					if (
@@ -1445,10 +1506,32 @@ export async function startRuntimeRotationProxy(
 					model: context.model,
 					now: now(),
 					tokenRefreshSkewMs,
+					tokenInvalidationCooldownMs,
 				});
 				if (!refreshed.ok) {
 					accountManager.refundToken(selected, context.family, context.model);
 					exhaustionReason = "auth-failure";
+					if (refreshed.invalidated) {
+						// Refresh endpoint explicitly revoked the token. Stop cascade:
+						// return auth error to client instead of rotating to the next account.
+						sessionAffinityStore?.forgetSession(context.sessionKey);
+						res.writeHead(HTTP_STATUS.UNAUTHORIZED, { "content-type": "application/json" });
+						res.end(
+							JSON.stringify({
+								error: {
+									message: "OAuth token has been invalidated. Please re-login.",
+									code: "token_invalidated",
+								},
+							}),
+						);
+						await usageRecorder.record({
+							outcome: "failure",
+							statusCode: HTTP_STATUS.UNAUTHORIZED,
+							errorCode: "token_invalidated",
+							account: selected,
+						});
+						return;
+					}
 					if (!refreshed.retryable) continue;
 					transientAttempts += 1;
 					transientExhaustionReason = "auth-failure";
@@ -1643,13 +1726,36 @@ export async function startRuntimeRotationProxy(
 				}
 
 				if (upstream.status === HTTP_STATUS.UNAUTHORIZED) {
-					await readErrorBody(upstream);
+					const bodyText = await readErrorBody(upstream);
 					accountManager.refundToken(refreshed.account, context.family, context.model);
 					accountManager.recordFailure(refreshed.account, context.family, context.model);
-					accountManager.markAccountCoolingDown(
+					if (isTokenInvalidationError(bodyText)) {
+						// The upstream explicitly revoked this OAuth token. Applying a long
+						// cooldown prevents cascade invalidation: rapidly presenting each
+						// account's token from the same IP triggers OpenAI's anti-abuse
+						// detection and invalidates them in sequence. Return the 401 directly
+						// rather than rotating so the client can prompt for re-login.
+						applyMonotonicAuthCooldown(
+							accountManager,
+							refreshed.account,
+							tokenInvalidationCooldownMs,
+						);
+						sessionAffinityStore?.forgetSession(context.sessionKey);
+						accountManager.saveToDiskDebounced();
+						res.writeHead(upstream.status, responseHeadersForClient(upstream.headers));
+						res.end(bodyText);
+						await usageRecorder.record({
+							outcome: "failure",
+							statusCode: upstream.status,
+							errorCode: "token_invalidated",
+							account: refreshed.account,
+						});
+						return;
+					}
+					applyMonotonicAuthCooldown(
+						accountManager,
 						refreshed.account,
 						DEFAULT_AUTH_FAILURE_COOLDOWN_MS,
-						"auth-failure",
 					);
 					accountManager.saveToDiskDebounced();
 					exhaustionReason = "auth-failure";
@@ -1727,6 +1833,10 @@ export async function startRuntimeRotationProxy(
 						refreshed.account.index,
 						now(),
 					);
+					if (refreshed.account.index !== lastGlobalAccountIndex) {
+						lastGlobalAccountIndex = refreshed.account.index;
+						lastGlobalSwitchAt = now(); // stamp only on an actual switch
+					}
 				}
 				await persistRuntimeActiveAccount(
 					accountManager,
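
The "only extend, never shorten" rule behind `applyMonotonicAuthCooldown` can be demonstrated in isolation. This is a minimal sketch, assuming an in-memory map in place of the account manager; `coolingDownUntil` and `applyMonotonicCooldown` are illustrative names, and the clock is passed in so the example is deterministic.

```typescript
// Hypothetical in-memory stand-in for the account manager's cooldown state.
const coolingDownUntil = new Map<number, number>();

function applyMonotonicCooldown(accountIndex: number, cooldownMs: number, nowMs: number): void {
	const existing = coolingDownUntil.get(accountIndex) ?? 0;
	const proposed = nowMs + cooldownMs;
	// Only extend: a shorter cooldown never replaces a longer one.
	if (proposed > existing) coolingDownUntil.set(accountIndex, proposed);
}

// Invalidation path sets a 5-minute cooldown at t = 1000 ms.
applyMonotonicCooldown(0, 300_000, 1_000);
// A generic 401 on the same account arrives at t = 2000 ms with a 30-second cooldown.
applyMonotonicCooldown(0, 30_000, 2_000);
console.log(coolingDownUntil.get(0)); // 301000
```

Without the comparison, the second call would overwrite the deadline with `32000` and the account would return to rotation long before the invalidation cooldown had elapsed.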

@@ -70,6 +70,8 @@ export const PluginConfigSchema = z.object({
 	proactiveRefreshBufferMs: z.number().min(30_000).optional(),
 	networkErrorCooldownMs: z.number().min(0).optional(),
 	serverErrorCooldownMs: z.number().min(0).optional(),
+	tokenInvalidationCooldownMs: z.number().min(0).optional(),
+	minRotationIntervalMs: z.number().min(0).optional(),
 	storageBackupEnabled: z.boolean().optional(),
 	preemptiveQuotaEnabled: z.boolean().optional(),
 	preemptiveQuotaRemainingPercent5h: z.number().min(0).max(100).optional(),

@@ -1175,6 +1175,8 @@ describe("codex manager cli commands", () => {
 			proactiveRefreshBufferMs: 300000,
 			networkErrorCooldownMs: 6000,
 			serverErrorCooldownMs: 4000,
+			tokenInvalidationCooldownMs: 300000,
+			minRotationIntervalMs: 60000,
 			storageBackupEnabled: true,
 			preemptiveQuotaEnabled: true,
 			preemptiveQuotaRemainingPercent5h: 5,