Update the auth-profiles.markauthprofilefailure test suite to match the
new stepped cooldown formula (30s → 1m → 5m cap) introduced in the
first commit. The test was still asserting the old exponential backoff
values (1m → 5m → 25m → 1h cap).
Changes:
- calculateAuthProfileCooldownMs assertions: 60s→30s, 5m→1m, 25m→5m,
1h→5m cap
- 'resets error count when previous cooldown has expired' test: upper
bound lowered from 120s to 60s to match the new 30s base cooldown
- Comments updated to reflect the stepped ladder
Resolves merge-blocker review from @altaywtf.
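For reference, a minimal sketch of the stepped ladder the updated
assertions describe. The function name and the 1-based errorCount
parameter are assumptions for illustration, not the actual signature of
calculateAuthProfileCooldownMs:

```typescript
// Hypothetical helper: maps a 1-based consecutive-failure count onto the
// 30s -> 1m -> 5m ladder, holding at the 5m cap from the third failure on.
function steppedCooldownMs(errorCount: number): number {
  if (errorCount <= 1) {
    return 30_000; // first failure: 30s
  }
  if (errorCount === 2) {
    return 60_000; // second failure: 1m
  }
  return 300_000; // third and later failures: 5m cap
}

// Failure counts 1 through 4 line up with the four updated assertions.
const ladder = [1, 2, 3, 4].map(steppedCooldownMs);
console.log(ladder); // [ 30000, 60000, 300000, 300000 ]
```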
- Add curly braces to single-line if/for bodies in usage.ts and
model-fallback.ts to satisfy the oxlint eslint(curly) rule
- Thread modelId into all 3 isProfileInCooldown calls in
pi-embedded-runner/run.ts (lines 719, 746, 767) so the inner profile
loop respects per-model cooldown scope. Fixes the Codex P1 review
comment: the outer gate could pass model-B while the inner loop
rejected it for lack of model context
- When model A is cooling down and model B also fails, set cooldownModel
to undefined, widening the cooldown to the whole profile so neither
model can bypass it via per-model scope
- Same-model retries preserve the original cooldownModel
- Add 8 new tests for per-model cooldown behavior: model-scoped bypass,
profile-wide cooldown, billing-disable guard, scope-widening, same-model
retry preservation
- Update .some() comment to document intentional design choice for mixed
fallback failure reasons
- Update markAuthProfileCooldown JSDoc to reflect new stepped backoff (30s/1m/5m)
- Merge duplicate isFallbackSummaryError import into single import statement
- Run oxfmt on all changed files to fix formatting CI failure
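The scope-widening and same-model rules above can be sketched roughly as
follows. The CooldownScope shape, the cooldownUntil field, and the
nextCooldownModel helper are hypothetical. Two behaviors are assumed
rather than stated in this change: an expired or absent cooldown is
re-scoped to the failing model, and an already profile-wide cooldown
stays profile-wide.

```typescript
// Hypothetical shape; only cooldownModel is named in this change.
interface CooldownScope {
  cooldownUntil?: number; // epoch ms (assumed field name)
  cooldownModel?: string; // undefined means the cooldown is profile-wide
}

// Decide which model (if any) the next cooldown should be scoped to.
function nextCooldownModel(
  prev: CooldownScope,
  failingModel: string | undefined,
  now: number,
): string | undefined {
  const active = prev.cooldownUntil !== undefined && prev.cooldownUntil > now;
  if (!active) {
    // Assumption: with no live cooldown, scope to the model that just failed.
    return failingModel;
  }
  if (prev.cooldownModel === undefined) {
    // Assumption: a profile-wide cooldown is never narrowed again.
    return undefined;
  }
  // Same-model retry keeps the scope; a different model widens it.
  return prev.cooldownModel === failingModel ? prev.cooldownModel : undefined;
}

const live: CooldownScope = { cooldownUntil: 2_000, cooldownModel: "model-a" };
console.log(nextCooldownModel(live, "model-a", 1_000)); // "model-a"
console.log(nextCooldownModel(live, "model-b", 1_000)); // undefined
```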
Combines ideas from PRs #45113, #31962, and #45763 to address three
cooldown-related issues:
1. Stepped cooldown (30s → 1m → 5m cap) replaces the aggressive
exponential formula (1m → 5m → 25m → 1h) that locked out providers
for far longer than the actual API rate-limit window.
2. Per-model cooldown scoping: rate_limit cooldowns now record which
model triggered them. When a different model on the same auth profile
is requested, the cooldown is bypassed — so one model hitting a 429
no longer blocks all other models on the same provider.
3. FallbackSummaryError with soonest-expiry countdown: when all
candidates are exhausted, the user sees a clear message like
'⚠️ Rate-limited — ready in ~28s' instead of a generic failure.
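A rough sketch of points 2 and 3, under stated assumptions: the
ProfileStats shape, the cooldownUntil field, and both helper names are
illustrative and are not the real isProfileInCooldown or
FallbackSummaryError code. Only the cooldownReason/cooldownModel field
names and the "rate_limit" reason come from this change.

```typescript
// Hypothetical stats shape; cooldownUntil is an assumed field name.
interface ProfileStats {
  cooldownUntil?: number; // epoch ms
  cooldownReason?: string;
  cooldownModel?: string;
}

// Model-aware check: a rate_limit cooldown recorded for one model does not
// block a request for a different model on the same profile.
function isInCooldown(stats: ProfileStats, now: number, modelId?: string): boolean {
  if (stats.cooldownUntil === undefined || stats.cooldownUntil <= now) {
    return false;
  }
  const modelScoped =
    stats.cooldownReason === "rate_limit" && stats.cooldownModel !== undefined;
  if (modelScoped && modelId !== undefined && modelId !== stats.cooldownModel) {
    return false; // different model: bypass the per-model cooldown
  }
  return true; // same model, unknown model, or a profile-wide cooldown
}

// Seconds until the soonest still-pending expiry, rounded up for display.
function secondsUntilSoonestExpiry(expiries: number[], now: number): number | undefined {
  const pending = expiries.filter((t) => t > now);
  if (pending.length === 0) {
    return undefined;
  }
  return Math.ceil((Math.min(...pending) - now) / 1000);
}

const now = 1_000_000;
const stats: ProfileStats = {
  cooldownUntil: now + 28_000,
  cooldownReason: "rate_limit",
  cooldownModel: "model-a",
};
console.log(isInCooldown(stats, now, "model-a")); // true
console.log(isInCooldown(stats, now, "model-b")); // false
console.log(secondsUntilSoonestExpiry([now + 90_000, now + 28_000], now)); // 28
```

In this sketch a cooldown with any reason other than rate_limit is never
bypassed, which is one plausible reading of the billing-disable guard.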
Files changed:
- types.ts: add cooldownReason/cooldownModel to ProfileUsageStats
- usage.ts: stepped formula, model-aware isProfileInCooldown, modelId
threading through computeNextProfileUsageStats/markAuthProfileFailure
- model-fallback.ts: FallbackSummaryError class, model-aware availability
check, soonestCooldownExpiry computation
- pi-embedded-runner/run.ts: thread modelId into failure recording
- agent-runner-execution.ts: buildCopilotCooldownMessage helper, rate-limit
detection branch in error handler
- usage.test.ts: update expected cooldown value (60s → 30s)