fix: narrow Sub2API Codex WSv2 candidates

Codex
2026-06-09 09:57:17 +00:00
parent 5e34f2c2f6
commit 979e855caf
3 changed files with 7 additions and 2 deletions
+3
@@ -71,6 +71,8 @@ bun scripts/cli.ts platform-infra sub2api codex-pool validate
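WebSocket v2 is an account capability set, not a scheduling pin. `openaiResponsesWebSocketsV2Mode` only declares that the account can carry the Codex Responses WSv2 chain; `codex-pool validate` must see at least one `webSocketsV2.schedulableEnabled` account, and real availability is still judged by Codex smoke and runtime logs.
+When Codex startup repeatedly shows WebSocket reconnects, HTTPS fallback, or `websocket closed by server before response.completed`, or when Sub2API logs show `openai.websocket_proxy_failed` or upstream WS handshake 5xx, first use runtime evidence to identify the specific account and transport. If only one account's WSv2 handshake fails, prefer changing only that account's `openaiResponsesWebSocketsV2Mode` to `off` in YAML, then run `codex-pool sync --confirm`; do not also change membership, priority, capacity, Secret, or code fallback along the way.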
## Adding an upstream
1. On master, prepare the profile files under `~/.codex/`, for example `config.toml.<profile>` and `auth.json.<profile>`.
@@ -139,6 +141,7 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm
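- FRP unreachable: first check `masterFrps`, `sub2api-frpc`, and the public 401 probe in the `codex-pool expose --confirm` output; when lower-level evidence is needed, use only `trans G14:k3s` for bounded queries.
- default profile recursion: check whether the YAML default entry uses the `*.pre-sub2api` backup files; if necessary, restore the backups and rerun `configure-local --confirm`.
- Upstream needs WebSocket v2: set `openaiResponsesWebSocketsV2Mode: ctx_pool|passthrough` on that profile only and run `sync --confirm`; treat it as a capability candidate, with capacity still taken from `capacity` in YAML or the default value.
+- Codex falls back from WebSocket at startup: reproduce with a Codex smoke through the original entry point, then confirm the account with bounded Sub2API logs; for an account with WS handshake 5xx, turn off its WSv2 capability in YAML and sync, and do not write inferred temporary availability into scheduling config.
- Upstream requires a Codex User-Agent: set `upstreamUserAgent` on that profile only and run `sync --confirm`.
- No account switch after the upstream reports capacity/rate-limit/overload: first confirm that `codex-pool validate` shows `tempUnschedulable.ok=true`, that the target account has `runtimeEnabled=true`, and that the rule count matches YAML; on mismatch, run `codex-pool sync --confirm` rather than hand-patching Sub2API credentials.
- Context lost after Codex auto compact: first confirm that the local `~/.codex/config.toml` has `supports_websockets = true` and `responses_websockets_v2 = true`, then check the WSv2 candidate in `codex-pool validate` and `transport=responses_websockets_v2` in the Sub2API logs.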
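The WebSocket fallback triage in the hunk above amounts to grouping failure evidence by account before touching YAML. The following is a minimal TypeScript sketch of that step, assuming the bounded log output has already been parsed into records. The `LogRecord` shape, the `failingWsAccounts` helper, and the `minFailures` threshold are hypothetical and not part of the Sub2API CLI; only the event name `openai.websocket_proxy_failed` comes from the documentation.

```typescript
// Hypothetical record shape: the real Sub2API log format is not shown in this commit.
interface LogRecord {
  account: string; // account name as reported in the log line
  event: string;   // event identifier, e.g. "openai.websocket_proxy_failed"
}

// Returns the accounts whose WSv2 proxy failures reach `minFailures`, sorted by name.
// Accounts below the threshold are left alone, matching the rule that only
// accounts with repeated, evidenced failures leave the WSv2 capability set.
function failingWsAccounts(records: LogRecord[], minFailures = 2): string[] {
  const counts = new Map<string, number>();
  for (const r of records) {
    if (r.event !== "openai.websocket_proxy_failed") continue;
    counts.set(r.account, (counts.get(r.account) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, n]) => n >= minFailures)
    .map(([account]) => account)
    .sort();
}
```

Only the accounts returned by such a check would be candidates for `openaiResponsesWebSocketsV2Mode: off`; membership, priority, and capacity stay untouched.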
@@ -45,7 +45,7 @@ profiles:
authFile: auth.json.pre-sub2api
fallbackConfigFile: config.toml
fallbackAuthFile: auth.json
-openaiResponsesWebSocketsV2Mode: ctx_pool
+openaiResponsesWebSocketsV2Mode: off
priority: 10
- profile: HY
accountName: unidesk-codex-hy
+3 -1
@@ -36,7 +36,9 @@
- `publicExposure` controls the optional FRP bridge from master server to the G14 ClusterIP service.
- `localCodex` controls how the master server's current `~/.codex` consumer files are backed up and rewritten. Codex consumers using Sub2API must keep `supportsWebSockets` and `responsesWebSocketsV2` enabled so compacted long sessions can continue through the Responses WebSocket v2 response chain instead of falling back to HTTP-only summary context.
-Enable account-level WebSocket v2 only for upstream profiles that have passed a direct Codex WSv2 probe. Treat this as a YAML-declared capability set, not a hard scheduling pin to one profile; `codex-pool validate` must show at least one current `webSocketsV2.schedulableEnabled` account, and runtime smoke remains the availability proof. The same validation reports each managed account's runtime WebSocket v2 mode and whether it matches YAML, so stale `ctx_pool` settings cannot silently keep routing Codex WS sessions to an upstream that closes with `no available account`.
+Enable account-level WebSocket v2 only for upstream profiles that have passed a direct Codex WSv2 probe. Treat this as a YAML-declared capability set, not a hard scheduling pin to one profile; `codex-pool validate` must show at least one current `webSocketsV2.schedulableEnabled` account, and runtime smoke remains the availability proof. The same validation reports each managed account's runtime WebSocket v2 mode and whether it matches YAML, so stale `ctx_pool` settings cannot silently keep routing Codex WS sessions to an upstream that closes with `no available account`, WS handshake 5xx, or before `response.completed`.
+When Codex startup repeatedly reports WebSocket reconnects or HTTPS fallback, preserve membership, priority, capacity, and other routing policy until runtime logs identify the failing account and transport. If bounded Sub2API logs show repeated `openai.websocket_proxy_failed` or upstream WS handshake 5xx for one account, remove only that account from the WSv2 capability set in YAML, run `codex-pool sync --confirm`, and prove the result with Codex smoke plus `codex-pool validate`.
Do not encode current availability assumptions in long-term reference prose. If an account needs a higher concurrency than `pool.defaultAccountCapacity`, make that a deliberate YAML override and verify it with `codex-pool validate`; the reference document should describe the rule, not repeat the current numeric value.
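The validation rule described in this hunk (runtime mode matches YAML, and at least one WSv2 account remains schedulable) can be sketched as a small check. This is an illustration under stated assumptions, not the actual `codex-pool validate` implementation: the `AccountState` and `WsV2Check` types and the `checkWsV2` function are hypothetical, and only the mode values `off`, `ctx_pool`, and `passthrough` are taken from the documentation.

```typescript
// Mode values as they appear in the YAML examples of this commit.
type WsV2Mode = "off" | "ctx_pool" | "passthrough";

// Hypothetical per-account view; the real validate output schema is not shown here.
interface AccountState {
  accountName: string;
  yamlMode: WsV2Mode;          // mode declared in YAML
  runtimeMode: WsV2Mode;       // mode currently applied at runtime
  schedulableEnabled: boolean; // whether the account can currently be scheduled
}

interface WsV2Check {
  ok: boolean;
  mismatched: string[]; // accounts whose runtime mode differs from YAML
  candidates: string[]; // accounts still in the WSv2 capability set
}

// Passes only when no account is stale and at least one WSv2 candidate remains.
function checkWsV2(accounts: AccountState[]): WsV2Check {
  const mismatched = accounts
    .filter((a) => a.runtimeMode !== a.yamlMode)
    .map((a) => a.accountName);
  const candidates = accounts
    .filter((a) => a.yamlMode !== "off" && a.schedulableEnabled)
    .map((a) => a.accountName);
  return { ok: mismatched.length === 0 && candidates.length > 0, mismatched, candidates };
}
```

A stale runtime `ctx_pool` on an account whose YAML now says `off` would show up in `mismatched`, which is the case a follow-up `codex-pool sync --confirm` is meant to clear.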