docs: codify Sub2API upstream capability boundary

Codex
2026-06-10 09:50:47 +00:00
parent 54097cf75f
commit 0b416bd388
3 changed files with 9 additions and 12 deletions
+6 -5
@@ -58,12 +58,12 @@ bun scripts/cli.ts platform-infra sub2api codex-pool validate
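- `pool.apiKeySecretName` / `pool.apiKeySecretKey`: location of the k3s Secret that holds the unified consumer API key; defaults to `platform-infra/sub2api-codex-pool-api-key.API_KEY`.
- `pool.minOwnerBalanceUsd`: minimum balance for the pool key owner; sync/validate tops it up.
- `pool.minOwnerConcurrency`: minimum concurrency for the owner of the unified consumer API key; sync/validate raises it when needed. It keeps the shared key from triggering WS 1013 at the user-concurrency layer; do not mask that error by raising a single provider's capacity.
- `pool.defaultTempUnschedulable`: default account-level temporary-unschedulable rules; when an upstream returns capacity, rate-limit, overload, service unavailable, gateway timeout, stable model-routing errors, or an abnormal authentication state, they let Sub2API cool down that account and switch to another account in the same group.
- `pool.defaultTempUnschedulable`: default account-level temporary-unschedulable rules; declare only error-path capabilities that Sub2API already supports. When an upstream returns capacity, rate-limit, overload, service unavailable, gateway timeout, stable model-routing errors, or an abnormal authentication state, they let Sub2API cool down that account and switch to another account in the same group. Do not use YAML, the UniDesk CLI, k8s hot patches, or a local fork to hack in behavior that Sub2API does not support.
- `profiles.entries`: selects upstream profiles from the master `~/.codex/` and maps them to Sub2API accounts.
- `profiles.entries[].capacity`: optional per-account concurrency override; when omitted, `pool.defaultAccountCapacity` applies. Concrete values are authoritative only in `config/platform-infra/sub2api-codex-pool.yaml`; the skill and the long-term reference describe the rules and do not repeat current values.
- `profiles.entries[].loadFactor`: optional per-account Sub2API `load_factor` override; when omitted, `pool.defaultAccountLoadFactor` applies. Concrete values are authoritative only in `config/platform-infra/sub2api-codex-pool.yaml`; after changing them, run `codex-pool sync --confirm` and `codex-pool validate`.
- Unless the user explicitly asks for a configuration change, do not change account membership, priority, capacity, loadFactor, WebSocket mode, or any other scheduling policy on inference alone; keep the YAML as is, finish tracing provenance/runtime evidence, and write the conclusion back to the relevant issue or runbook before proposing a change.
- `profiles.entries[].tempUnschedulable`: optional per-account override of the temporary-unschedulable rules; `docs/reference/platform-infra.md` is authoritative for field semantics.
- `profiles.entries[].tempUnschedulable`: optional per-account override of the temporary-unschedulable rules; `docs/reference/platform-infra.md` is authoritative for field semantics. Do not declare here any success-body classification, scheduling policy, or account-cooldown behavior that upstream Sub2API does not support.
- `profiles.entries[].openaiResponsesWebSocketsV2Mode`: set only for upstreams that need Responses WebSocket v2; the value is `off`, `ctx_pool`, or `passthrough`.
- `profiles.entries[].upstreamUserAgent`: set only for the few upstreams that require a Codex CLI User-Agent; it must not contain newlines.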
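The fallback and validation rules in the field list above can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/src/platform-infra-sub2api-codex.ts`: the interfaces and the function names `resolveEntry` and `validateEntry` are hypothetical, and only the YAML key names come from the text.

```typescript
// Hypothetical shapes: key names follow the YAML fields described above,
// but this is not the real schema used by the UniDesk scripts.
interface PoolDefaults {
  defaultAccountCapacity: number;
  defaultAccountLoadFactor: number;
}

interface ProfileEntry {
  name: string;
  capacity?: number;
  loadFactor?: number;
  openaiResponsesWebSocketsV2Mode?: string;
  upstreamUserAgent?: string;
}

const WS_V2_MODES: readonly string[] = ["off", "ctx_pool", "passthrough"];

// Resolve per-account values, falling back to the pool defaults when omitted.
function resolveEntry(pool: PoolDefaults, entry: ProfileEntry) {
  return {
    name: entry.name,
    capacity: entry.capacity ?? pool.defaultAccountCapacity,
    loadFactor: entry.loadFactor ?? pool.defaultAccountLoadFactor,
  };
}

// Collect problems instead of throwing so that every issue surfaces at once.
function validateEntry(entry: ProfileEntry): string[] {
  const problems: string[] = [];
  const mode = entry.openaiResponsesWebSocketsV2Mode;
  if (mode !== undefined && !WS_V2_MODES.includes(mode)) {
    problems.push(`${entry.name}: unknown WebSocket v2 mode "${mode}"`);
  }
  const ua = entry.upstreamUserAgent;
  if (ua !== undefined && /[\r\n]/.test(ua)) {
    problems.push(`${entry.name}: upstreamUserAgent must not contain newlines`);
  }
  return problems;
}
```

Any numbers passed in when trying this out are placeholders; current values belong only in the YAML file.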
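@@ -84,7 +84,7 @@ Codex startup repeatedly shows WebSocket reconnect, HTTPS fallback, `websocket cl
5. `codex-pool sync --confirm`
6. `codex-pool validate`
Adding an ordinary upstream is a YAML operation: no CI/CD and no code changes. Modify `scripts/src/platform-infra-sub2api-codex.ts` only when a new reusable schema capability is needed.
Adding an ordinary upstream is a YAML operation: no CI/CD and no code changes. Modify `scripts/src/platform-infra-sub2api-codex.ts` only when UniDesk needs to render or validate a reusable capability that already exists in upstream Sub2API; capabilities that Sub2API itself does not support are not hacked in on the UniDesk side.
## Removing an upstream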
@@ -132,7 +132,7 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm
- `sub2api status`: Deployment/StatefulSet/Service/Secret are visible, and the running image matches YAML.
- `sub2api validate`: basic checks for the app, PostgreSQL, Redis, and the service proxy pass.
- `codex-pool validate`: `GET /v1/models` with the unified key succeeds, and a small `POST /v1/responses` smoke runs with `localCodex.responsesSmokeModel`; owner balance / owner concurrency meet the YAML minimums; capacity, WebSocket v2, and temporary-unschedulable runtime state align with YAML; `validation.gatewayCompactRecent` summarizes the last 6 hours of `/responses/compact` success, failure, failover, final 4xx/5xx, and `context canceled` evidence; `runtimeCapabilities.successBodyReclassification` uses a temporary probe account to check whether the current Sub2API image really reclassifies the YAML 200 success-body rule as account cooling. If the Responses smoke reports `outcome=succeeded-with-failover`, `gatewayCompactRecent.degraded=true`, or the runtime capability reports `outcome=unsupported-runtime-image`, the request has recovered, but an account-level upstream 5xx/compact timeout or a capability gap such as #247 still needs follow-up.
- `codex-pool validate`: `GET /v1/models` with the unified key succeeds, and a small `POST /v1/responses` smoke runs with `localCodex.responsesSmokeModel`; owner balance / owner concurrency meet the YAML minimums; capacity, WebSocket v2, and temporary-unschedulable runtime state align with YAML; `validation.gatewayCompactRecent` summarizes the last 6 hours of `/responses/compact` success, failure, failover, final 4xx/5xx, and `context canceled` evidence. If the Responses smoke reports `outcome=succeeded-with-failover` or `gatewayCompactRecent.degraded=true`, the request has recovered, but an account-level upstream 5xx/compact timeout still needs follow-up.
- If `publicExposure.enabled=true`, confirm that the FRP path is usable; `expose --confirm` uses the 401 returned by a public `/v1/models` request without a key as the gateway reachability probe.
To prove that real model requests work, use a minimal `/v1/responses` request or an equivalent Codex smoke. Do not interpret a group-level `/v1/models` success as proof that every upstream account is healthy.
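As a rough sketch of how the validation signals above might be read together, the function below separates a failed run, a recovered-but-degraded run, and a clean run. The `ValidateSummary` shape is hypothetical: only `succeeded-with-failover` and the `degraded` flag are named in the text, and the flat field layout and the plain `succeeded` outcome are assumptions.

```typescript
// Hypothetical subset of `codex-pool validate` output. Only the signals named
// in the text are modeled; the flat layout is an assumption for illustration.
interface ValidateSummary {
  modelsOk: boolean; // GET /v1/models with the unified key
  responsesSmokeOutcome: string; // assumed "succeeded" or "succeeded-with-failover"
  gatewayCompactRecentDegraded: boolean;
}

type Verdict = "failed" | "needs-follow-up" | "healthy";

function classifyValidation(summary: ValidateSummary): Verdict {
  // /v1/models alone proves the gateway and key, not each upstream account,
  // so the Responses smoke must also have succeeded.
  const smokeOk = summary.responsesSmokeOutcome.startsWith("succeeded");
  if (!summary.modelsOk || !smokeOk) return "failed";

  // The request recovered, but an account-level upstream problem remains.
  if (
    summary.responsesSmokeOutcome === "succeeded-with-failover" ||
    summary.gatewayCompactRecentDegraded
  ) {
    return "needs-follow-up";
  }
  return "healthy";
}
```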
@@ -148,7 +148,7 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm
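- Codex startup falls back from WebSocket: reproduce with a Codex smoke through the original entry point, then confirm the account with bounded Sub2API logs; for accounts with WS handshake 4xx/5xx, `openai.websocket_account_select_failed`, or close-before-`response.completed`, disable the WSv2 capability in YAML and sync. If no WSv2-capable account remains, turn off `localCodex.supportsWebSockets` and `localCodex.responsesWebSocketsV2` together; do not write an inference about temporary availability into scheduling configuration.
- Upstream requires a Codex User-Agent: set `upstreamUserAgent` only on that profile and run `sync --confirm`.
- No account switch, or frequent fail-then-recover, after an upstream reports capacity/rate-limit/overload/Bad Gateway/Gateway Timeout: first confirm that `codex-pool validate` reports `tempUnschedulable.ok=true`, that the target account has `runtimeEnabled=true`, and that the rule count matches YAML; then check the account/upstream status in `validation.gatewayResponses.evidence.failovers`. On a mismatch, run `codex-pool sync --confirm`; do not hand-patch Sub2API credentials.
- Codex reports weekly-limit, `less than 10% of your weekly limit left`, `Run /status for a breakdown`, or similar account-state/soft-quota prompts and an account switch is wanted: put the stable body keywords into the 403 and 429 rules of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that every managed account's runtime 403/429 rules contain these keywords. Sub2API temporary-unschedulable rules match on HTTP status + body keyword; if the text is HTTP 200 success content, the most you can do is write a 200 rule into YAML as the desired classification signal and check whether `runtimeCapabilities.successBodyReclassification` proves that the current image can reclassify it; if the result is `unsupported-runtime-image`, file or update a response-classification capability issue, and do not treat the YAML rule declaration alone as proof that the account can be cooled.
- Codex reports weekly-limit, `less than 10% of your weekly limit left`, `Run /status for a breakdown`, or similar account-state/soft-quota prompts and an account switch is wanted: if the upstream returns them with an error status such as 403/429, put the stable body keywords into the corresponding rules of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that every managed account's runtime rules contain these keywords. If the text is HTTP 200 success content, current Sub2API does not support reclassifying it as account cooling; do not write a YAML 200 rule, do not hot-patch Sub2API, and do not bypass sync; file an upstream capability-gap issue when necessary.
- Upstream 503 response body contains `model_not_found`, `No available channel for model ...`, or similar stable model-routing failure text: put the stable body keywords into the 503 rule of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that the target account's runtime 503 rule contains these keywords; do not mask this error family with changes to account membership, priority, capacity, loadFactor, WebSocket mode, or User-Agent.
- Upstream errors keep recurring: the default error cooldown is tiered by severity; temporary problems can start at 10 minutes, gateway/service-unavailable/overload/model-routing errors should cool down longer, and authentication/permission/quota/account-state errors use the longest cooldown. Stable wrapper text such as `Recovered upstream error ...`, `Bad Gateway`, `Gateway Timeout`, Cloudflare `524`, Codex-facing `Upstream request failed`, `Unknown error`, `context deadline exceeded`, `context canceled`, `model_not_found`, `No available channel for model`, large-context `413`, and `openai_error` should all stay in the corresponding 5xx/413 YAML cooldown policy; in particular, an upstream 524 on the compact path can surface to the client as 502/504 + `Unknown error`. Concrete values are authoritative only in YAML; after changing them, run `codex-pool sync --confirm` and `codex-pool validate`. For the long-term criteria see `docs/reference/platform-infra.md`.
- Context lost after Codex auto compact: first confirm whether YAML `localCodex` declares WSv2 as enabled; if it does, confirm that the local `~/.codex/config.toml` has `supports_websockets = true` and `responses_websockets_v2 = true`, and check the WSv2 candidate in `codex-pool validate` and `transport=responses_websockets_v2` in the Sub2API logs. If YAML currently disables WSv2, troubleshoot HTTP Responses stability instead, and do not treat the old WS criteria as an acceptance requirement.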
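Several of the troubleshooting steps above come down to confirming that the runtime rules contain the keywords declared in YAML. The sketch below illustrates that comparison under stated assumptions: the `Rule` shape and the `missingKeywords` helper are hypothetical, and comparing keywords case-insensitively is an assumption rather than documented Sub2API behavior. The authoritative check remains `codex-pool validate`.

```typescript
// Hypothetical rule shape: statusCode and keywords mirror the YAML keys.
interface Rule {
  statusCode: number;
  keywords: string[];
}

// Returns, per status code, the YAML keywords that the runtime rules lack.
// An empty map means the runtime covers everything declared in YAML.
function missingKeywords(yamlRules: Rule[], runtimeRules: Rule[]): Map<number, string[]> {
  const runtime = new Map<number, Set<string>>();
  for (const rule of runtimeRules) {
    const known = runtime.get(rule.statusCode) ?? new Set<string>();
    for (const keyword of rule.keywords) known.add(keyword.toLowerCase());
    runtime.set(rule.statusCode, known);
  }

  const missing = new Map<number, string[]>();
  for (const rule of yamlRules) {
    const known = runtime.get(rule.statusCode) ?? new Set<string>();
    const gaps = rule.keywords.filter((keyword) => !known.has(keyword.toLowerCase()));
    if (gaps.length > 0) {
      missing.set(rule.statusCode, [...(missing.get(rule.statusCode) ?? []), ...gaps]);
    }
  }
  return missing;
}
```

A non-empty result corresponds to the mismatch case, where the documented remedy is `codex-pool sync --confirm` rather than a manual patch.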
@@ -162,3 +162,4 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm
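- Do not add CPU/memory limits to the Sub2API manifest unless there is a new explicit decision expressed in YAML.
- Do not print full API keys, admin passwords, or Secret plaintext.
- Do not turn ordinary upstream additions or removals into code changes, CI/CD, feature flags, or dual compatibility paths.
- Do not hack Sub2API: if Sub2API itself does not support a capability, do not build it, and do not substitute for an upstream implementation with UniDesk scripts, in-place k8s hot patches, local forks, YAML pseudo-declarations, or hidden fallbacks.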
@@ -11,10 +11,6 @@ pool:
  defaultTempUnschedulable:
    enabled: true
    rules:
      - statusCode: 200
        keywords: [less than 10% of your weekly limit left]
        durationMinutes: 120
        description: Success-body account-state prompts require Sub2API 2xx body reclassification before they can cool accounts.
      - statusCode: 401
        keywords: [unauthorized, invalid api key, invalid_api_key, authentication, recovered upstream error]
        durationMinutes: 120
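The policy behind removing the `statusCode: 200` rule could be enforced with a small lint over the parsed rule list. The sketch below is illustrative only: `TempUnschedulableRule` and `findNonErrorRules` are hypothetical names, and treating 400 to 599 as "the error path" is an assumption about where the boundary lies.

```typescript
// Hypothetical shape mirroring the YAML rule keys shown in this hunk.
interface TempUnschedulableRule {
  statusCode: number;
  keywords: string[];
  durationMinutes: number;
  description?: string;
}

// Flags rules whose status code is outside the assumed 4xx/5xx error range,
// for example a 200 rule that Sub2API cannot act on.
function findNonErrorRules(rules: TempUnschedulableRule[]): TempUnschedulableRule[] {
  return rules.filter((rule) => rule.statusCode < 400 || rule.statusCode > 599);
}
```

Run against the two rules visible in this hunk, such a lint would flag the 200 entry and accept the 401 entry.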
+3 -3
@@ -26,7 +26,7 @@
- `pool.groupName` names the Sub2API group that represents the pool.
- `pool.apiKeySecretName` and `pool.apiKeySecretKey` name the k3s Secret that stores the single consumer API key.
- `pool.minOwnerConcurrency` declares the minimum concurrency for the Sub2API user that owns the unified consumer API key. Keep it high enough to cover the declared account capacity set, so the shared key does not fail WS sessions at the user-concurrency layer. Do not compensate for owner-concurrency 1013 errors by pinning capacity to one provider.
- `pool.defaultTempUnschedulable` declares Sub2API account-level temporary unschedulable rules. Keep 429/overload/capacity, service-unavailable, gateway timeout, and stable model-routing failures in this YAML policy so the scheduler can cool down a failing account and choose another candidate instead of hard-pinning one provider.
- `pool.defaultTempUnschedulable` declares Sub2API account-level temporary unschedulable rules for capabilities that Sub2API itself already supports. Keep 429/overload/capacity, service-unavailable, gateway timeout, and stable model-routing failures in this YAML policy so the scheduler can cool down a failing account and choose another candidate instead of hard-pinning one provider. Do not declare unsupported Sub2API behavior in YAML as a promise that UniDesk code or runtime patches should emulate.
- `profiles.entries` selects local Codex profile files from `~/.codex/` and maps them to Sub2API account names.
- The unsuffixed master `~/.codex/config.toml` and `~/.codex/auth.json` are reserved for the unified Sub2API consumer. `config.toml` must keep `base_url = "https://sub2api.74-48-78-17.nip.io/"`, and `auth.json` must contain the unified pool API key from `pool.apiKeySecretName` / `pool.apiKeySecretKey`. Do not replace these two files with direct upstream account credentials.
- Additional upstream accounts must use suffixed local profile files such as `config.toml.<profile>` and `auth.json.<profile>`, then be declared through `profiles.entries` in `config/platform-infra/sub2api-codex-pool.yaml`.
@@ -50,7 +50,7 @@ Do not encode current availability assumptions in long-term reference prose. If
Do not enable Sub2API `pool_mode` for UniDesk-managed Codex accounts. `pool_mode` retries the same selected account path, while UniDesk's desired failover behavior is to mark the failing account temporarily unschedulable and let Sub2API choose another account from the group. `codex-pool validate` reports each managed account's temporary-unschedulable runtime alignment and should be used after `codex-pool sync --confirm`. Generic 502/503/504 bodies such as `Recovered upstream error 502`, `Bad Gateway`, `Gateway Timeout`, Codex-facing `Upstream request failed`, `Unknown error`, context-deadline/canceled wrappers, and stable `model_not_found` / "no available channel for model" wrappers must stay in the YAML cooldown policy so an intermittently bad account is cooled down instead of repeatedly adding latency at the next compact or Responses request. The Codex pool default error cooldown is severity-tiered: temporary signals can start at ten minutes, gateway/service/overload/model-routing failures should cool down longer, and credential, permission, quota, or account-state failures should use the longest cooldown. Exact current values belong in YAML and runtime validation output.
Sub2API temporary-unschedulable rules require both an HTTP status match and a response-body keyword match. Do not treat them as a general successful-response content filter. If an upstream returns a quota warning as normal HTTP 200 assistant content, keep the same stable phrases in the 403/429 rules and track a separate response-classification capability issue; a YAML 200 rule may document the desired classification signal, but account cooling is only proven when `codex-pool validate` reports `runtimeCapabilities.successBodyReclassification.outcome=supported`. An `unsupported-runtime-image` result is visibility for the capability gap, not proof that successful assistant content currently cools the selected account.
Sub2API temporary-unschedulable rules require both an HTTP status match and a response-body keyword match in the upstream failure/error path. Do not treat them as a general successful-response content filter. If an upstream returns a quota warning or maintenance prompt as normal HTTP 200 assistant content, do not add a YAML 200 cooldown rule, patch Sub2API in place, fork behavior in UniDesk, or bypass `codex-pool sync` to make the pool pretend that account cooling exists. Record the upstream capability gap in an issue when it matters operationally; until upstream Sub2API supports that behavior and `codex-pool validate` proves it, UniDesk should not implement or rely on it.
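The matching semantics described above (status and keyword must both match, and only on the failure path) can be sketched as follows. This illustrates the described behavior and is not Sub2API's implementation: the types and `findCooldownRule` are hypothetical, and case-insensitive substring matching is an assumption.

```typescript
// Hypothetical shapes for illustration only.
interface CooldownRule {
  statusCode: number;
  keywords: string[];
  durationMinutes: number;
}

interface UpstreamResult {
  status: number;
  body: string;
}

// Returns the first rule whose status AND at least one keyword match.
// Successful responses are never inspected, so 200 content cannot cool an account.
function findCooldownRule(rules: CooldownRule[], result: UpstreamResult): CooldownRule | undefined {
  if (result.status < 400) return undefined;
  const body = result.body.toLowerCase();
  return rules.find(
    (rule) =>
      rule.statusCode === result.status &&
      rule.keywords.some((keyword) => body.includes(keyword.toLowerCase())),
  );
}
```

Under this model the same quota phrase triggers a cooldown when it arrives with a 429 and is ignored when it arrives as ordinary 200 assistant content, which is the capability gap the paragraph describes.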
The request path is:
@@ -60,7 +60,7 @@ The request path is:
4. Sub2API validates the unified key and resolves its `group_id`.
5. Accounts listed in `profiles.entries` are bound to the same group via `group_ids`, so Sub2API dispatches through that group using its own account selection semantics.
Adding, removing, exposing, validating, and configuring local Codex consumers are daily operations covered by `$unidesk-sub2api`. The development rule is that ordinary pool membership changes stay YAML-only and do not add code or CI/CD. Code changes are only appropriate when the YAML schema needs a new reusable capability such as account-level WebSocket mode or per-account upstream User-Agent.
Adding, removing, exposing, validating, and configuring local Codex consumers are daily operations covered by `$unidesk-sub2api`. The development rule is that ordinary pool membership changes stay YAML-only and do not add code or CI/CD. Code changes are only appropriate when UniDesk needs to render or validate a Sub2API capability that already exists upstream, such as account-level WebSocket mode or per-account upstream User-Agent. If Sub2API itself does not support a desired behavior, do not hack it in through UniDesk scripts, Kubernetes hotfixes, local forks, or hidden compatibility paths; either leave the behavior unsupported or pursue it upstream as an explicit Sub2API feature.
After `codex-pool configure-local --confirm`, the default `~/.codex/config.toml` / `auth.json` pair must remain the unified Sub2API consumer and must not be reused as an upstream account profile. Keep every upstream source profile in suffixed files such as `config.toml.<profile>` / `auth.json.<profile>` and register it through YAML `profiles.entries`.
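The file-naming convention above (unsuffixed files reserved for the unified consumer, suffixed files for upstream profiles) can be expressed as a small classifier. This is a sketch under the stated convention only; `classifyCodexFile` is a hypothetical helper and not part of the UniDesk CLI.

```typescript
type ProfileFileKind = "unified-consumer" | "upstream-profile" | "other";

interface ClassifiedFile {
  kind: ProfileFileKind;
  profile?: string; // set only for suffixed upstream profile files
}

// Classifies a file name found directly under ~/.codex/.
function classifyCodexFile(fileName: string): ClassifiedFile {
  // The unsuffixed pair is reserved for the unified Sub2API consumer.
  if (fileName === "config.toml" || fileName === "auth.json") {
    return { kind: "unified-consumer" };
  }
  // Suffixed files such as config.toml.<profile> carry upstream credentials.
  const match = /^(?:config\.toml|auth\.json)\.(.+)$/.exec(fileName);
  if (match) {
    return { kind: "upstream-profile", profile: match[1] };
  }
  return { kind: "other" };
}
```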