docs: record sub2api weekly-limit cooldown handling
@@ -145,6 +145,7 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm
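- Codex falls back from WebSocket at startup: reproduce with a Codex smoke test through the original entry point, then confirm the account from bounded Sub2API logs. For accounts whose WS handshake returns 5xx, turn off the WSv2 capability in YAML and sync; do not encode inferences about temporary availability as scheduling configuration.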
- Upstream requires a Codex User-Agent: set `upstreamUserAgent` on that profile only, then run `sync --confirm`.
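- No account switch after an upstream capacity/rate-limit/overload/Bad Gateway error, or requests frequently fail first and then recover: first confirm that `codex-pool validate` shows `tempUnschedulable.ok=true`, that the target account has `runtimeEnabled=true`, and that the rule count matches YAML; then check the account and upstream status in `validation.gatewayResponses.evidence.failovers`. On a mismatch, run `codex-pool sync --confirm`; do not hand-patch Sub2API credentials.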
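- Codex reports weekly-limit, `less than 10% of your weekly limit left`, `Run /status for a breakdown`, or similar account-state/soft-quota prompts and asks to switch accounts: put the stable body keywords into the 403 and 429 rules of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that every managed account's runtime 403/429 rules contain these keywords. Sub2API temporary-unschedulable rules match on HTTP status plus body keyword; if the text arrives as successful HTTP 200 content, file a separate response-classification capability issue, because declaring a YAML cooldown rule alone cannot solve it.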
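- Upstream errors trigger repeatedly: the default error cooldown is tiered by severity. Temporary problems can start at 10 minutes, gateway/service-unavailable/overload failures should cool down longer, and authentication/permission/quota/account-state failures use the longest cooldown. Generic wrapper messages such as `Recovered upstream error ...`, `Bad Gateway`, and the Codex-facing `Upstream request failed` should all stay in the YAML cooldown policy. Exact values are authoritative only in YAML; after any change, run `codex-pool sync --confirm` and `codex-pool validate`.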
- Context lost after Codex auto compact: first confirm whether the local `~/.codex/config.toml` has `supports_websockets = true` and `responses_websockets_v2 = true`, then check the WSv2 candidate in `codex-pool validate` and `transport=responses_websockets_v2` in the Sub2API logs.
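- Codex smoke shows reconnect/1013: this is an upstream concurrency/availability problem; handle it separately from HTTP-only compact context loss. Record session/log evidence and link it to the dedicated issue; do not override YAML capacity with runtime hand-patches.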
@@ -32,6 +32,7 @@
- `profiles.entries[].loadFactor` optionally overrides `pool.defaultAccountLoadFactor` for one account and is rendered to Sub2API `load_factor`. Treat it as routing policy: values belong in YAML and `codex-pool validate` output, not code constants, Secrets, or ad-hoc runtime patches.
- Do not change account membership, priority, capacity, load factor, WebSocket mode, or other routing policy from inference alone. Unless the user explicitly asks for a configuration change, first preserve the current YAML, collect provenance and runtime evidence, and write the finding to the relevant issue or runbook before proposing a change.
- `profiles.entries[].tempUnschedulable` may override the pool default for one account. The CLI renders it into Sub2API credentials as `temp_unschedulable_enabled` and `temp_unschedulable_rules`; rules match HTTP status plus response-body keywords and place only that account into a temporary unschedulable cooldown.
- Codex account-state or quota prompts that stop a task and ask the operator to switch accounts belong in `pool.defaultTempUnschedulable`, not in account membership, priority, capacity, load factor, WebSocket mode, or `pool_mode`. Keep stable body phrases such as weekly-limit and `/status` prompts in both the 403 account-state rule and the 429 quota/rate-limit rule, then run `codex-pool sync --confirm` and `codex-pool validate`. The validation evidence must include runtime temporary-unschedulable alignment for each managed account, not only successful group-level `/v1/models` or `/v1/responses` smoke output.
- `profiles.entries[].openaiResponsesWebSocketsV2Mode` is the account-level Responses WebSocket v2 switch for OpenAI-compatible upstreams that require WebSocket transport. Allowed values are `off`, `ctx_pool`, and `passthrough`; omit the field unless that upstream needs it.
- `profiles.entries[].upstreamUserAgent` is an optional account-level upstream request User-Agent override. Use it only for upstreams that require a Codex CLI compatible User-Agent; keep the value YAML-controlled and newline-free.
- `publicExposure` controls the optional FRP bridge from master server to the G14 ClusterIP service.
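
The rule that stable body phrases must appear in both the 403 and the 429 rule of every managed account can be sketched as a small check. This is a minimal illustration, not the CLI's actual validation code: the `ManagedAccountRules` and `RuntimeRule` shapes and the `findKeywordGaps` helper are hypothetical names, and the real `codex-pool validate` output format may differ.

```typescript
// Hypothetical shapes for illustration only; not the real validate output schema.
interface RuntimeRule {
  status: number;      // HTTP status the rule applies to
  keywords: string[];  // response-body keywords configured for that status
}

interface ManagedAccountRules {
  name: string;
  rules: RuntimeRule[];
}

// Returns one "status: keyword" entry for every required keyword that is
// missing from the account's rules for that status. An empty result means
// the account is aligned for the checked statuses.
function findKeywordGaps(
  account: ManagedAccountRules,
  required: string[],
  statuses: number[] = [403, 429],
): string[] {
  const gaps: string[] = [];
  for (const status of statuses) {
    const present = new Set<string>();
    for (const rule of account.rules) {
      if (rule.status !== status) continue;
      for (const keyword of rule.keywords) present.add(keyword);
    }
    for (const keyword of required) {
      if (!present.has(keyword)) gaps.push(`${status}: ${keyword}`);
    }
  }
  return gaps;
}
```

Running such a check per account, rather than relying on group-level smoke output, mirrors the requirement that validation evidence cover each managed account.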
@@ -45,6 +46,8 @@ Do not encode current availability assumptions in long-term reference prose. If
Do not enable Sub2API `pool_mode` for UniDesk-managed Codex accounts. `pool_mode` retries the same selected account path, while UniDesk's desired failover behavior is to mark the failing account temporarily unschedulable and let Sub2API choose another account from the group. `codex-pool validate` reports each managed account's temporary-unschedulable runtime alignment and should be used after `codex-pool sync --confirm`. Generic 502 bodies such as `Recovered upstream error 502`, `Bad Gateway`, and Codex-facing `Upstream request failed` must stay in the YAML cooldown policy so an intermittently bad account is cooled down instead of repeatedly adding latency at the next compact or Responses request. The Codex pool default error cooldown is severity-tiered: temporary signals can start at ten minutes, gateway/service/overload failures should cool down longer, and credential, permission, quota, or account-state failures should use the longest cooldown. Exact current values belong in YAML and runtime validation output.
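
The severity tiering described above can be expressed as an ordering check over cooldown durations. This is only a sketch: the tier names and the `isSeverityTiered` helper are hypothetical, and any numbers passed to it are placeholders, since the exact values belong in YAML and runtime validation output.

```typescript
// Hypothetical tier names, ordered from least to most severe.
type Severity = "temporary" | "gateway" | "accountState";

const SEVERITY_ORDER: Severity[] = ["temporary", "gateway", "accountState"];

// True when each tier's cooldown is strictly longer than the previous tier's,
// i.e. temporary < gateway/service/overload < credential/quota/account-state.
function isSeverityTiered(cooldownMinutes: Record<Severity, number>): boolean {
  return SEVERITY_ORDER.every((tier, i) => {
    if (i === 0) return true;
    return cooldownMinutes[tier] > cooldownMinutes[SEVERITY_ORDER[i - 1]];
  });
}
```

For example, a placeholder policy of 10, 30, and 120 minutes passes the check, while one that gives gateway failures the same cooldown as temporary signals does not.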
Sub2API temporary-unschedulable rules require both an HTTP status match and a response-body keyword match. Do not treat them as a general successful-response content filter. If an upstream returns a quota warning as normal HTTP 200 assistant content, track that as a separate response-classification capability issue instead of claiming the YAML cooldown policy has covered it.
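
The two-part match can be illustrated with a short sketch. It assumes hypothetical `CooldownRule` and `UpstreamReply` shapes and a case-insensitive substring match; Sub2API's actual rule schema and matching details are not shown in this document and may differ.

```typescript
// Hypothetical shapes for illustration; not Sub2API's actual schema.
interface CooldownRule {
  status: number;      // HTTP status the rule applies to
  keywords: string[];  // any one body keyword is enough to match
}

interface UpstreamReply {
  status: number;
  body: string;
}

// A rule fires only when the status matches AND the body contains a keyword.
// Case-insensitive matching is an assumption made for this sketch.
function matchesRule(rule: CooldownRule, reply: UpstreamReply): boolean {
  if (rule.status !== reply.status) return false;
  const body = reply.body.toLowerCase();
  return rule.keywords.some((k) => body.includes(k.toLowerCase()));
}

function shouldCoolDown(rules: CooldownRule[], reply: UpstreamReply): boolean {
  return rules.some((rule) => matchesRule(rule, reply));
}

const exampleRules: CooldownRule[] = [
  { status: 403, keywords: ["weekly limit", "Run /status for a breakdown"] },
  { status: 429, keywords: ["weekly limit", "Run /status for a breakdown"] },
];
```

Under these assumptions, a 429 reply whose body mentions the weekly limit triggers a cooldown, while the same text delivered as HTTP 200 content matches no rule, which is why that case needs a separate response-classification capability.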
The request path is:
1. A client sends an OpenAI-compatible request to the configured consumer base URL, normally master-local `http://127.0.0.1:<frp-port>/v1/...`, with the unified API key.