fix(SUB-003): subscription rate-limit: persist reset time, skip retries on 429, default max_retries=0 (#476) #477

Open
vybe wants to merge 3 commits into main from feature/476-subscription-rate-limit-fix

vybe (Contributor) commented Apr 23, 2026

Summary

  • Persist rate_limited_until: parse "resets 8pm (America/New_York)" from the 429 body and store it in a new subscription_credentials.rate_limited_until column. is_subscription_rate_limited() checks this authoritative timestamp before falling back to the 2h event-count heuristic. This prevents the slow ping-pong where Trinity considers an exhausted subscription viable again every 2 hours while Anthropic's actual window is 5–8h.
  • Never retry rate-limited executions: _maybe_schedule_retry() now returns early on 429/rate-limit errors. The next cron tick is the correct recovery path; queuing retries during a subscription outage just floods the backlog with guaranteed-to-fail work.
  • Default max_retries from 1 → 0: retries are now opt-in per schedule. Scheduled agents are typically stateful and idempotent; skipping a run is preferable to thundering-herd replay when many tasks fail simultaneously.
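
For reviewers, here is a minimal sketch of the reset-time parsing described in the first bullet. It is not the code in this PR: the function name parse_reset_time, the regex, and the "roll over to tomorrow if the time has already passed" rule are all assumptions made for illustration, and the only message format it handles is the "resets 8pm (America/New_York)" shape quoted above.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional
from zoneinfo import ZoneInfo

# Assumed message shape: "resets 8pm (America/New_York)" or "resets 8:30am (Europe/Berlin)".
_RESET_RE = re.compile(
    r"resets\s+(\d{1,2})(?::(\d{2}))?\s*(am|pm)\s*\(([^)]+)\)",
    re.IGNORECASE,
)


def parse_reset_time(message: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return the next reset instant as a UTC ISO 8601 string, or None if unparseable.

    Hypothetical helper: `now` must be timezone-aware and exists only to make
    the function deterministic in tests.
    """
    match = _RESET_RE.search(message)
    if match is None:
        return None

    hour_text, minute_text, meridiem, tz_name = match.groups()
    hour = int(hour_text)
    minute = int(minute_text or 0)
    if not (1 <= hour <= 12 and 0 <= minute <= 59):
        return None

    # Convert the 12-hour clock to 24-hour: 12am -> 0, 12pm -> 12, 8pm -> 20.
    hour = hour % 12 + (12 if meridiem.lower() == "pm" else 0)

    try:
        tz = ZoneInfo(tz_name)
    except (KeyError, ValueError):
        # Unknown or malformed zone name (ZoneInfoNotFoundError is a KeyError).
        return None

    now_utc = now or datetime.now(timezone.utc)
    local_now = now_utc.astimezone(tz)
    reset = local_now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if reset <= local_now:
        # The wall-clock time already passed today, so assume it means tomorrow.
        reset += timedelta(days=1)

    return reset.astimezone(timezone.utc).isoformat()
```

With now fixed at 2026-04-23 21:00 UTC (17:00 in New York), "resets 8pm (America/New_York)" resolves to 2026-04-24T00:00:00+00:00. Returning None on any parse failure lets the caller fall back to the existing heuristic instead of storing a bad timestamp.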

Changes

  • db/migrations.py: add _migrate_subscription_rate_limited_until (ALTER TABLE adds column, idempotent)
  • db/subscriptions.py: set_subscription_rate_limited_until() setter + updated is_subscription_rate_limited() checks durable timestamp first
  • database.py: expose new setter on facade
  • services/subscription_auto_switch.py: _parse_rate_limit_reset() parser + wired into handle_rate_limit_error()
  • db_models.py + scheduler/models.py + db/schedules.py: max_retries default 1 → 0
  • scheduler/service.py: early return in _maybe_schedule_retry() for rate-limit errors
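
To make the migration and lookup order concrete, the following is a rough sketch of an idempotent column add plus a "durable timestamp first, heuristic second" check. It assumes a SQLite table with an integer id primary key, and it stands in for the 2h event-count heuristic with a caller-supplied recent_event_count and threshold. Those names, the schema, and the function names are illustrative assumptions, not the actual code in db/migrations.py or db/subscriptions.py.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def migrate_rate_limited_until(conn: sqlite3.Connection) -> None:
    """Add rate_limited_until only if it is missing, so reruns are harmless."""
    columns = {row[1] for row in conn.execute("PRAGMA table_info(subscription_credentials)")}
    if "rate_limited_until" not in columns:
        conn.execute("ALTER TABLE subscription_credentials ADD COLUMN rate_limited_until TEXT")
        conn.commit()


def set_rate_limited_until(
    conn: sqlite3.Connection, subscription_id: int, until_iso: Optional[str]
) -> None:
    """Store (or clear, with None) the reset time as a UTC ISO 8601 string."""
    conn.execute(
        "UPDATE subscription_credentials SET rate_limited_until = ? WHERE id = ?",
        (until_iso, subscription_id),
    )
    conn.commit()


def is_rate_limited(
    conn: sqlite3.Connection,
    subscription_id: int,
    now: Optional[datetime] = None,
    recent_event_count: int = 0,  # assumed stand-in for the 2h event-count query
    threshold: int = 1,           # assumed heuristic threshold
) -> bool:
    now = now or datetime.now(timezone.utc)
    row = conn.execute(
        "SELECT rate_limited_until FROM subscription_credentials WHERE id = ?",
        (subscription_id,),
    ).fetchone()

    if row and row[0]:
        # Compare parsed, timezone-aware datetimes rather than raw strings.
        # This sketch assumes stored values always carry a UTC offset.
        if datetime.fromisoformat(row[0]) > now:
            return True
        # The stored window has expired: clear it and fall through.
        set_rate_limited_until(conn, subscription_id, None)

    return recent_event_count >= threshold
```

Checking PRAGMA table_info before ALTER TABLE is one way to get idempotency in SQLite, which has no ADD COLUMN IF NOT EXISTS; catching the duplicate-column error would be another.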

Test Plan

  • Existing subscription tests pass
  • is_subscription_rate_limited() returns True when rate_limited_until is in the future
  • is_subscription_rate_limited() clears expired rate_limited_until and falls back to event count
  • _parse_rate_limit_reset("resets 8pm (America/New_York)") returns a future UTC ISO string
  • _maybe_schedule_retry() returns without scheduling when error contains "429"
  • New schedules default to max_retries=0
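
The last two items of the test plan can be illustrated with a small decision function. This is a sketch under assumptions rather than the logic in scheduler/service.py: the names is_rate_limit_error and should_schedule_retry, the attempt counter, and the exact pattern used to recognise a rate-limit error are all hypothetical.

```python
import re
from typing import Optional

# Assumed detection rule: a standalone "429" or any spelling of "rate limit".
_RATE_LIMIT_RE = re.compile(r"\b429\b|rate[ _-]?limit", re.IGNORECASE)


def is_rate_limit_error(error: Optional[str]) -> bool:
    """Return True when the error text looks like a 429 / rate-limit failure."""
    return bool(error) and _RATE_LIMIT_RE.search(error) is not None


def should_schedule_retry(error: Optional[str], attempt: int, max_retries: int = 0) -> bool:
    """Decide whether a failed execution should be queued for retry.

    `attempt` is the number of retries already made. With the default
    max_retries=0, retries are opt-in and nothing is ever queued.
    """
    if max_retries <= 0:
        return False
    if is_rate_limit_error(error):
        # Leave recovery to the next cron tick instead of queuing doomed work.
        return False
    return attempt < max_retries


# Example: a 429 is never retried, even when the schedule opts in to retries.
print(should_schedule_retry("HTTP 429: resets 8pm (America/New_York)", attempt=0, max_retries=3))
# A non-rate-limit failure is retried only when max_retries is raised above 0.
print(should_schedule_retry("connection reset by peer", attempt=0, max_retries=1))
```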

Closes #476

🤖 Generated with Claude Code

vybe and others added 3 commits April 23, 2026 17:28
…t max_retries=0

Three related fixes for subscription rate-limit handling:

1. Persist authoritative reset timestamp — parse "resets 8pm (America/New_York)"
   from 429 body and store in new subscription_credentials.rate_limited_until column.
   is_subscription_rate_limited() checks this first before falling back to the 2h
   event-count heuristic, preventing ping-pong when Anthropic's window is 5-8h.

2. Never retry rate-limited executions — _maybe_schedule_retry() now returns early
   on 429/rate-limit errors. The next scheduled cron tick is the correct recovery
   path; queuing retries during a subscription outage just floods the backlog.

3. Change default max_retries from 1 to 0 — retries are now opt-in per schedule.
   Scheduled agents are typically stateful and idempotent; skipping a run is
   preferable to thundering-herd replay when many tasks fail simultaneously.

Closes #476

Co-Authored-By: Claude <noreply@anthropic.com>
… retry opt-in

Co-Authored-By: Claude <noreply@anthropic.com>
…, SUB-003)

Co-Authored-By: Claude <noreply@anthropic.com>

vybe (Contributor, Author) commented Apr 28, 2026

Hey @vybe — this PR is targeting main directly. Per our dev workflow, all feature/fix branches should target dev instead (feature/* → dev → main). Could you rebase onto dev and update the base branch? Thanks!

Development

Successfully merging this pull request may close these issues.

bug(SUB-003): rate-limit events never age out due to SQLite string-compare bug; retries amplify outages
