test: validate Sub2API 2xx reclassification

2026-06-10 08:14:34 +00:00
parent 3f55f2508b
commit 008a7d1361
5 changed files with 362 additions and 14 deletions
@@ -132,7 +132,7 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm

 - `sub2api status`: Deployment/StatefulSet/Service/Secret are visible, and the running image matches YAML.
 - `sub2api validate`: the basic checks for the app, PostgreSQL, Redis, and the service proxy pass.
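-- `codex-pool validate`: `GET /v1/models` succeeds with the unified key, and a small `POST /v1/responses` smoke runs with `localCodex.responsesSmokeModel`; owner balance / owner concurrency meet the YAML minimums; the capacity, WebSocket v2, and temporary-unschedulable runtime state align with YAML; `validation.gatewayCompactRecent` summarizes the last 6 hours of `/responses/compact` success, failure, failover, final 4xx/5xx, and `context canceled` evidence. If the Responses smoke reports `outcome=succeeded-with-failover` or `gatewayCompactRecent.degraded=true`, the request recovered, but account-level upstream 5xx/compact timeouts remain and the affected accounts still need further rate reduction or cooldown according to the evidence.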
+- `codex-pool validate`: `GET /v1/models` succeeds with the unified key, and a small `POST /v1/responses` smoke runs with `localCodex.responsesSmokeModel`; owner balance / owner concurrency meet the YAML minimums; the capacity, WebSocket v2, and temporary-unschedulable runtime state align with YAML; `validation.gatewayCompactRecent` summarizes the last 6 hours of `/responses/compact` success, failure, failover, final 4xx/5xx, and `context canceled` evidence; `runtimeCapabilities.successBodyReclassification` uses a temporary probe account to check whether the current Sub2API image actually reclassifies the YAML 200 success-body rule into an account cooldown. If the Responses smoke reports `outcome=succeeded-with-failover`, `gatewayCompactRecent.degraded=true`, or the runtime capability reports `outcome=unsupported-runtime-image`, the request recovered, but account-level upstream 5xx/compact timeouts or a capability gap such as #247 still need follow-up.
 - If `publicExposure.enabled=true`, confirm that the FRP path works; `expose --confirm` uses the 401 returned by the public `/v1/models` without a key as the gateway reachability probe.

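 To prove that real model requests work, use a minimal `/v1/responses` request or an equivalent Codex smoke. Do not interpret a group-level `/v1/models` success as proof that every upstream account is healthy.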
@@ -148,7 +148,7 @@ bun scripts/cli.ts platform-infra sub2api codex-pool configure-local --confirm
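 - Codex falls back from WebSocket at startup: reproduce with a Codex smoke through the original entry point, then use bounded Sub2API logs to identify the account; for accounts with WS handshake 4xx/5xx, `openai.websocket_account_select_failed`, or close-before-`response.completed`, disable the WSv2 capability in YAML and sync. If no WSv2-capable account remains, turn off `localCodex.supportsWebSockets` and `localCodex.responsesWebSocketsV2` together; do not write an inference from temporary availability into the scheduling configuration.
 - Upstream requires a Codex User-Agent: set `upstreamUserAgent` on that profile only, then run `sync --confirm`.
 - No account switch, or frequent fail-then-recover, after the upstream reports capacity/rate-limit/overload/Bad Gateway/Gateway Timeout: first confirm that `codex-pool validate` shows `tempUnschedulable.ok=true`, that the target account has `runtimeEnabled=true`, and that the rule count matches YAML; then check the account/upstream status in `validation.gatewayResponses.evidence.failovers`. On a mismatch, run `codex-pool sync --confirm`; do not hand-patch Sub2API credentials.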
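-- Codex reports weekly-limit, `less than 10% of your weekly limit left`, `Run /status for a breakdown`, or similar account-state/soft-quota prompts that call for an account switch: put the stable body keywords into the 403 and 429 rules of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that every managed account's runtime 403/429 rules contain these keywords. Sub2API temporary-unschedulable rules match on HTTP status + body keyword; if the text arrives as HTTP 200 success content, the most you can do is write a 200 rule into YAML as the desired classification signal, and you must also file a response-classification capability issue; a YAML rule declaration alone does not prove that the account can be cooled.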
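+- Codex reports weekly-limit, `less than 10% of your weekly limit left`, `Run /status for a breakdown`, or similar account-state/soft-quota prompts that call for an account switch: put the stable body keywords into the 403 and 429 rules of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that every managed account's runtime 403/429 rules contain these keywords. Sub2API temporary-unschedulable rules match on HTTP status + body keyword; if the text arrives as HTTP 200 success content, the most you can do is write a 200 rule into YAML as the desired classification signal and check whether `runtimeCapabilities.successBodyReclassification` proves that the current image can already reclassify it; if the result is `unsupported-runtime-image`, you must file or update the response-classification capability issue; a YAML rule declaration alone does not prove that the account can be cooled.
 - Upstream 503 response body contains `model_not_found`, `No available channel for model ...`, or similar stable model-routing failure text: put the stable body keywords into the 503 rule of `pool.defaultTempUnschedulable`, run `codex-pool sync --confirm`, then use `codex-pool validate` to confirm that the target account's runtime 503 rule contains these keywords; do not mask this error family with changes to account membership, priority, capacity, loadFactor, WebSocket mode, or User-Agent.
 - Upstream errors trigger repeatedly: the default error cooldown is tiered by severity; temporary problems can start at 10 minutes, gateway/service-unavailable/overload/model-routing failures should cool down longer, and authentication/permission/quota/account-state failures use the longest cooldown. Stable wrapper texts such as `Recovered upstream error ...`, `Bad Gateway`, `Gateway Timeout`, Cloudflare `524`, Codex-facing `Upstream request failed`, `Unknown error`, `context deadline exceeded`, `context canceled`, `model_not_found`, `No available channel for model`, large-context `413`, and `openai_error` should all stay in the corresponding 5xx/413 YAML cooldown policy, especially because an upstream 524 on the compact path may finally surface to the client as 502/504 + `Unknown error`. Exact values come from YAML only; after any change, run `codex-pool sync --confirm` and `codex-pool validate`. For long-term criteria, see `docs/reference/platform-infra.md`.
 - Context lost after Codex auto compact: first confirm whether YAML `localCodex` declares WSv2 enabled; if so, confirm that the local `~/.codex/config.toml` has `supports_websockets = true` and `responses_websockets_v2 = true`, and check the WSv2 candidate in `codex-pool validate` and `transport=responses_websockets_v2` in the Sub2API logs. If YAML currently disables WSv2, troubleshoot HTTP Responses stability instead; do not treat the old WS criteria as an acceptance requirement.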
@@ -50,7 +50,7 @@ Do not encode current availability assumptions in long-term reference prose. If

 Do not enable Sub2API `pool_mode` for UniDesk-managed Codex accounts. `pool_mode` retries the same selected account path, while UniDesk's desired failover behavior is to mark the failing account temporarily unschedulable and let Sub2API choose another account from the group. `codex-pool validate` reports each managed account's temporary-unschedulable runtime alignment and should be used after `codex-pool sync --confirm`. Generic 502/503/504 bodies such as `Recovered upstream error 502`, `Bad Gateway`, `Gateway Timeout`, Codex-facing `Upstream request failed`, `Unknown error`, context-deadline/canceled wrappers, and stable `model_not_found` / "no available channel for model" wrappers must stay in the YAML cooldown policy so an intermittently bad account is cooled down instead of repeatedly adding latency at the next compact or Responses request. The Codex pool default error cooldown is severity-tiered: temporary signals can start at ten minutes, gateway/service/overload/model-routing failures should cool down longer, and credential, permission, quota, or account-state failures should use the longest cooldown. Exact current values belong in YAML and runtime validation output.

-Sub2API temporary-unschedulable rules require both an HTTP status match and a response-body keyword match. Do not treat them as a general successful-response content filter. If an upstream returns a quota warning as normal HTTP 200 assistant content, keep the same stable phrases in the 403/429 rules and track a separate response-classification capability issue; a YAML 200 rule may document the desired classification signal, but validation of that rule only proves it is stored, not that successful assistant content currently cools the selected account.
+Sub2API temporary-unschedulable rules require both an HTTP status match and a response-body keyword match. Do not treat them as a general successful-response content filter. If an upstream returns a quota warning as normal HTTP 200 assistant content, keep the same stable phrases in the 403/429 rules and track a separate response-classification capability issue; a YAML 200 rule may document the desired classification signal, but account cooling is only proven when `codex-pool validate` reports `runtimeCapabilities.successBodyReclassification.outcome=supported`. An `unsupported-runtime-image` result only makes the capability gap visible; it is not proof that successful assistant content currently cools the selected account.

 The request path is:

@@ -91,13 +91,16 @@ if (parsed.pool?.defaultTempUnschedulable?.enabled === true) {
  const quota429Keywords = new Set((quota429Rule?.keywords ?? []).map((keyword) => keyword.toLowerCase()));
  const successBody200Keywords = new Set((successBody200Rule?.keywords ?? []).map((keyword) => keyword.toLowerCase()));
  const serviceUnavailable503Keywords = new Set((serviceUnavailable503Rule?.keywords ?? []).map((keyword) => keyword.toLowerCase()));
-  for (const keyword of ["weekly limit", "less than 10% of your weekly limit left", "run /status for a breakdown"]) {
-    assertCondition(accountState403Keywords.has(keyword), "403 temporary-unschedulable rule must catch Codex weekly-limit account-state prompts", { keyword, accountState403Rule });
-    assertCondition(quota429Keywords.has(keyword), "429 temporary-unschedulable rule must catch Codex weekly-limit quota prompts", { keyword, quota429Rule });
+  const accountStatePhrases = ["weekly limit", "less than 10% of your weekly limit left", "run /status for a breakdown"];
+  const successBodyPhrase = "less than 10% of your weekly limit left";
+  for (const accountStatePhrase of accountStatePhrases) {
+    assertCondition(accountState403Keywords.has(accountStatePhrase), "403 temporary-unschedulable rule must catch Codex account-state phrases", { accountStatePhrase, accountState403Rule });
+    assertCondition(quota429Keywords.has(accountStatePhrase), "429 temporary-unschedulable rule must catch Codex account-state phrases", { accountStatePhrase, quota429Rule });
  }
+  assertCondition(successBody200Rule !== undefined, "200 temporary-unschedulable rule must be declared when YAML needs success-body reclassification", rules);
  if (successBody200Rule !== undefined) {
-    assertCondition(successBody200Keywords.has("less than 10% of your weekly limit left"), "200 temporary-unschedulable rule must document the weekly-limit success-body classifier phrase", successBody200Rule);
-    assertCondition(/reclassification/u.test(successBody200Rule.description ?? ""), "200 temporary-unschedulable rule must be documented as a reclassification signal, not a proven cooldown", successBody200Rule);
+    assertCondition(successBody200Keywords.size === 1 && successBody200Keywords.has(successBodyPhrase), "200 temporary-unschedulable rule must use one stable success-body classifier phrase", successBody200Rule);
+    assertCondition(/reclassification/u.test(successBody200Rule.description ?? ""), "200 temporary-unschedulable rule must be documented as a runtime reclassification requirement", successBody200Rule);
  }
  for (const keyword of ["model_not_found", "no available channel for model"]) {
    assertCondition(serviceUnavailable503Keywords.has(keyword), "503 temporary-unschedulable rule must catch upstream model-routing failures", { keyword, serviceUnavailable503Rule });
@@ -29,14 +29,18 @@ assertCondition(!("pool_mode" in credentials), "pool_mode must not be enabled be
 assertCondition(!("api_key" in credentials) && !("base_url" in credentials), "temporary-unschedulable rendering must not include secrets or endpoints", credentials);
 const accountState403Rule = rules.find((rule) => rule.error_code === 403);
 const quota429Rule = rules.find((rule) => rule.error_code === 429);
+const successBody200Rule = rules.find((rule) => rule.error_code === 200);
 const gateway502Rule = rules.find((rule) => rule.error_code === 502);
 const serviceUnavailable503Rule = rules.find((rule) => rule.error_code === 503);
 const gatewayTimeout504Rule = rules.find((rule) => rule.error_code === 504);
 const largeContext413Rule = rules.find((rule) => rule.error_code === 413);
 const cloudflare524Rule = rules.find((rule) => rule.error_code === 524);
-for (const keyword of ["weekly limit", "less than 10% of your weekly limit left", "run /status for a breakdown"]) {
-  assertCondition(accountState403Rule?.keywords?.includes(keyword), "403 rendered rule must preserve Codex weekly-limit account-state keyword", { keyword, accountState403Rule });
-  assertCondition(quota429Rule?.keywords?.includes(keyword), "429 rendered rule must preserve Codex weekly-limit quota keyword", { keyword, quota429Rule });
+const accountStatePhrases = ["weekly limit", "less than 10% of your weekly limit left", "run /status for a breakdown"];
+const successBodyPhrase = "less than 10% of your weekly limit left";
+assertCondition(successBody200Rule?.keywords?.length === 1 && successBody200Rule.keywords.includes(successBodyPhrase), "200 rendered rule must use the single stable success-body account-state phrase", successBody200Rule);
+for (const accountStatePhrase of accountStatePhrases) {
+  assertCondition(accountState403Rule?.keywords?.includes(accountStatePhrase), "403 rendered rule must preserve Codex account-state phrases", { accountStatePhrase, accountState403Rule });
+  assertCondition(quota429Rule?.keywords?.includes(accountStatePhrase), "429 rendered rule must preserve Codex account-state phrases", { accountStatePhrase, quota429Rule });
 }
 for (const keyword of ["model_not_found", "no available channel for model"]) {
  assertCondition(serviceUnavailable503Rule?.keywords?.includes(keyword), "503 rendered rule must catch upstream model-routing failures", { keyword, serviceUnavailable503Rule });
@@ -67,7 +71,7 @@ console.log(JSON.stringify({
  checks: [
    "temporary unschedulable policy renders to Sub2API credential field names",
    "temporary unschedulable rendering follows the input policy without hard-coded policy gates",
-    "Codex weekly-limit prompt keywords render into 403 and 429 cooldown rules",
+    "Codex account-state phrases render into 403 and 429 cooldown rules, and the 200 success-body rule uses one stable phrase",
    "large-context upstream failures render into the 413 cooldown rule",
    "upstream model-routing failures render into the 503 cooldown rule",
    "gateway timeout wrappers render into the 504 cooldown rule",
@@ -2517,6 +2517,343 @@ def summarize_temp_unschedulable_rules(rules):
        "hasDescription": bool(rule.get("description")),
    } for rule in rules]

+def success_body_reclassification_requirement():
+    for name in sorted(EXPECTED_ACCOUNT_TEMP_UNSCHEDULABLE):
+        expected = normalize_temp_unschedulable_credentials(EXPECTED_ACCOUNT_TEMP_UNSCHEDULABLE[name])
+        for rule in expected["rules"]:
+            error_code = rule.get("error_code")
+            keywords = rule.get("keywords") or []
+            if isinstance(error_code, int) and 200 <= error_code < 300 and keywords:
+                return {
+                    "required": True,
+                    "sourceAccountName": name,
+                    "statusCode": error_code,
+                    "keywords": keywords,
+                    "probeKeyword": keywords[0],
+                    "durationMinutes": rule.get("duration_minutes"),
+                }
+    return {
+        "required": False,
+        "sourceAccountName": None,
+        "statusCode": None,
+        "keywords": [],
+        "probeKeyword": None,
+        "durationMinutes": None,
+    }
+
+def delete_probe_resource(token, method, path, label):
+    if not path:
+        return {"label": label, "ok": True, "skipped": True}
+    resp = curl_api(method, path, bearer=token)
+    ok = resp.get("ok") is True or resp.get("httpStatus") in (404, 410)
+    return {
+        "label": label,
+        "ok": ok,
+        "method": method,
+        "path": path,
+        "httpStatus": resp.get("httpStatus"),
+        "transportExitCode": resp.get("transportExitCode"),
+        "bodyPreview": "" if ok else text(resp.get("body", ""), 500),
+        "valuesPrinted": False,
+    }
+
+def launch_success_body_mock_upstream(status_code, body_text):
+    port = 28000 + secrets.randbelow(2000)
+    body_b64 = base64.b64encode(body_text.encode("utf-8")).decode("ascii")
+    script = r'''
+set -eu
+port="$1"
+status="$2"
+body_b64="$3"
+body="$(printf "%s" "$body_b64" | base64 -d)"
+length="$(printf "%s" "$body" | wc -c | tr -d " ")"
+{ printf "HTTP/1.1 %s OK\r\nContent-Type: application/json\r\nContent-Length: %s\r\nConnection: close\r\n\r\n" "$status" "$length"; printf "%s" "$body"; } | nc -l -p "$port" -w 5
+'''
+    proc = subprocess.Popen([
+        "kubectl", "-n", NAMESPACE, "exec", "-i", APP_POD,
+        "--", "sh", "-c", script, "sh", str(port), str(status_code), body_b64,
+    ], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+    time.sleep(0.35)
+    if proc.poll() is None:
+        return port, proc, None
+    stdout, stderr = proc.communicate(timeout=2)
+    return None, None, {
+        "ok": False,
+        "error": "mock-upstream-exited-before-probe",
+        "exitCode": proc.returncode,
+        "stdoutTail": text(stdout, 1000),
+        "stderrTail": text(stderr, 1000),
+    }
+
+def finish_mock_upstream(proc):
+    if proc is None:
+        return None
+    timed_out = False
+    try:
+        stdout, stderr = proc.communicate(timeout=2)
+    except subprocess.TimeoutExpired:
+        timed_out = True
+        proc.kill()
+        stdout, stderr = proc.communicate(timeout=2)
+    return {
+        "exitCode": proc.returncode,
+        "timedOut": timed_out,
+        "stdoutTail": text(stdout, 1000),
+        "stderrTail": text(stderr, 1000),
+    }
+
+def create_success_body_probe_resources(token, base_url, requirement):
+    stamp = str(int(time.time() * 1000))
+    suffix = stamp + "-" + "".join(secrets.choice(string.ascii_lowercase + string.digits) for _ in range(6))
+    group_payload_obj = group_payload()
+    group_payload_obj["name"] = "unidesk-probe-2xx-body-" + suffix
+    group_payload_obj["description"] = "UniDesk validate probe for OpenAI 2xx success-body reclassification."
+    group_id = None
+    account_id = None
+    api_key_id = None
+    try:
+        group = ensure_success(curl_api("POST", "/api/v1/admin/groups", bearer=token, payload=group_payload_obj), "create 2xx success-body probe group")
+        group_id = group.get("id") if isinstance(group, dict) else None
+        if group_id is None:
+            raise RuntimeError("2xx success-body probe group id missing")
+
+        account_payload_obj = {
+            "name": "unidesk-probe-2xx-body-" + suffix,
+            "notes": "Temporary UniDesk validate probe account; safe to delete.",
+            "platform": "openai",
+            "type": "apikey",
+            "credentials": {
+                "api_key": "sk-unidesk-probe-upstream",
+                "base_url": base_url,
+                "temp_unschedulable_enabled": True,
+                "temp_unschedulable_rules": [{
+                    "error_code": requirement["statusCode"],
+                    "keywords": [requirement["probeKeyword"]],
+                    "duration_minutes": 1,
+                    "description": "UniDesk runtime capability probe for OpenAI 2xx success-body reclassification.",
+                }],
+            },
+            "extra": {
+                "openai_responses_mode": "force_responses",
+                "unidesk_probe": "success_body_reclassification",
+            },
+            "concurrency": 1,
+            "priority": 0,
+            "rate_multiplier": 1,
+            "load_factor": 1,
+            "group_ids": [group_id],
+            "confirm_mixed_channel_risk": True,
+        }
+        account = ensure_success(curl_api("POST", "/api/v1/admin/accounts", bearer=token, payload=account_payload_obj), "create 2xx success-body probe account")
+        account_id = account.get("id") if isinstance(account, dict) else None
+        if account_id is None:
+            raise RuntimeError("2xx success-body probe account id missing")
+
+        api_key = "sk-unidesk-probe-" + "".join(secrets.choice(string.ascii_letters + string.digits) for _ in range(36))
+        api_key_obj = ensure_success(curl_api("POST", "/api/v1/keys", bearer=token, payload={
+            "name": "unidesk-probe-2xx-body-" + suffix,
+            "group_id": group_id,
+            "custom_key": api_key,
+            "quota": 0,
+            "rate_limit_5h": 0,
+            "rate_limit_1d": 0,
+            "rate_limit_7d": 0,
+        }), "create 2xx success-body probe API key")
+        api_key_id = api_key_obj.get("id") if isinstance(api_key_obj, dict) else None
+        return {
+            "groupId": group_id,
+            "groupName": group_payload_obj["name"],
+            "accountId": account_id,
+            "accountName": account_payload_obj["name"],
+            "apiKeyId": api_key_id,
+            "apiKey": api_key,
+            "keyPreview": api_key_preview(api_key),
+            "valuesPrinted": False,
+        }
+    except Exception:
+        if api_key_id is not None:
+            delete_probe_resource(token, "DELETE", f"/api/v1/keys/{api_key_id}", "api-key")
+        if account_id is not None:
+            delete_probe_resource(token, "DELETE", f"/api/v1/admin/accounts/{account_id}", "account")
+        if group_id is not None:
+            delete_probe_resource(token, "DELETE", f"/api/v1/admin/groups/{group_id}", "group")
+        raise
+
+def gateway_success_body_probe_request(api_key, request_id):
+    payload = {
+        "model": RESPONSES_SMOKE_MODEL,
+        "input": "Reply exactly: success-body probe",
+        "stream": False,
+        "store": False,
+        "max_output_tokens": 8,
+    }
+    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
+    script = r'''
+set -eu
+token="$1"
+request_id="$2"
+tmp="$(mktemp)"
+trap 'rm -f "$tmp"' EXIT
+cat > "$tmp"
+curl -sS -w '\\n__HTTP_CODE__:%{http_code}' -X POST \
+  -H "Authorization: Bearer $token" \
+  -H 'Content-Type: application/json' \
+  -H "X-Request-ID: $request_id" \
+  -H "OpenAI-Client-Request-ID: $request_id" \
+  --data-binary @"$tmp" \
+  http://127.0.0.1:8080/v1/responses
+'''
+    proc = run([
+        "kubectl", "-n", NAMESPACE, "exec", "-i", APP_POD,
+        "--", "sh", "-c", script, "sh", api_key, request_id,
+    ], body)
+    return parse_curl_output(proc)
+
+def account_temp_unschedulable_probe_state(token, account_id):
+    detail = ensure_success(curl_api("GET", f"/api/v1/admin/accounts/{account_id}", bearer=token), "get 2xx success-body probe account")
+    if not isinstance(detail, dict):
+        return {
+            "accountId": account_id,
+            "status": None,
+            "schedulable": None,
+            "tempUnschedulableUntil": None,
+            "tempUnschedulableReasonPreview": "",
+            "tempUnschedulableSet": False,
+        }
+    until = detail.get("temp_unschedulable_until") or detail.get("tempUnschedulableUntil")
+    reason = detail.get("temp_unschedulable_reason") or detail.get("tempUnschedulableReason") or ""
+    return {
+        "accountId": account_id,
+        "status": detail.get("status"),
+        "schedulable": detail.get("schedulable"),
+        "tempUnschedulableUntil": until,
+        "tempUnschedulableReasonPreview": text(str(reason), 500) if reason else "",
+        "tempUnschedulableSet": until is not None or bool(reason),
+    }
+
+def validate_success_body_reclassification(token):
+    requirement = success_body_reclassification_requirement()
+    if not requirement["required"]:
+        return {
+            "ok": True,
+            "required": False,
+            "capability": "openai-2xx-success-body-temp-unschedulable-failover",
+            "outcome": "not-required-by-yaml",
+            "valuesPrinted": False,
+        }
+
+    resources = None
+    cleanup = []
+    mock_proc = None
+    try:
+        keyword = requirement["probeKeyword"]
+        upstream_body = json.dumps({
+            "id": "resp_unidesk_success_body_probe",
+            "object": "response",
+            "created_at": int(time.time()),
+            "status": "completed",
+            "model": RESPONSES_SMOKE_MODEL,
+            "output": [{
+                "type": "message",
+                "role": "assistant",
+                "content": [{"type": "output_text", "text": keyword}],
+            }],
+            "usage": {"input_tokens": 1, "output_tokens": 1, "total_tokens": 2},
+        }, separators=(",", ":"))
+        port, mock_proc, mock_error = launch_success_body_mock_upstream(requirement["statusCode"], upstream_body)
+        if mock_error is not None:
+            return {
+                "ok": False,
+                "required": True,
+                "capability": "openai-2xx-success-body-temp-unschedulable-failover",
+                "outcome": "probe-infrastructure-failed",
+                "requirement": requirement,
+                "mock": mock_error,
+                "valuesPrinted": False,
+            }
+        resources = create_success_body_probe_resources(token, f"http://127.0.0.1:{port}", requirement)
+        request_id = "unidesk-2xx-body-probe-" + str(int(time.time() * 1000))
+        started = time.time()
+        response = gateway_success_body_probe_request(resources["apiKey"], request_id)
+        mock_result = finish_mock_upstream(mock_proc)
+        mock_proc = None
+        state = account_temp_unschedulable_probe_state(token, resources["accountId"])
+        evidence = request_log_evidence(request_id)
+        supported = response.get("ok") is not True and state.get("tempUnschedulableSet") is True
+        return {
+            "ok": supported,
+            "required": True,
+            "capability": "openai-2xx-success-body-temp-unschedulable-failover",
+            "outcome": "supported" if supported else "unsupported-runtime-image",
+            "requirement": requirement,
+            "probe": {
+                "requestId": request_id,
+                "durationMs": int((time.time() - started) * 1000),
+                "httpStatus": response.get("httpStatus"),
+                "transportExitCode": response.get("transportExitCode"),
+                "responseOk": response.get("ok"),
+                "bodyPreview": text(response.get("body", ""), 800),
+                "stderr": response.get("stderr", ""),
+                "accountState": state,
+                "logEvidence": evidence,
+            },
+            "mock": mock_result,
+            "resources": {
+                "groupId": resources["groupId"],
+                "accountId": resources["accountId"],
+                "apiKeyId": resources["apiKeyId"],
+                "keyPreview": resources["keyPreview"],
+                "valuesPrinted": False,
+            },
+            "cleanup": cleanup,
+            "message": "Sub2API image must reclassify matching 2xx OpenAI bodies into failover/temp-unschedulable before statusCode=200 YAML rules are effective." if not supported else "Matching 2xx OpenAI bodies are reclassified into account failover/temp-unschedulable.",
+            "valuesPrinted": False,
+        }
+    except Exception as exc:
+        return {
+            "ok": False,
+            "required": True,
+            "capability": "openai-2xx-success-body-temp-unschedulable-failover",
+            "outcome": "probe-failed",
+            "requirement": requirement,
+            "error": str(exc),
+            "cleanup": cleanup,
+            "valuesPrinted": False,
+        }
+    finally:
+        if mock_proc is not None:
+            _ = finish_mock_upstream(mock_proc)
+        if resources is not None:
+            cleanup.append(delete_probe_resource(token, "DELETE", f"/api/v1/keys/{resources['apiKeyId']}" if resources.get("apiKeyId") is not None else "", "api-key"))
+            cleanup.append(delete_probe_resource(token, "DELETE", f"/api/v1/admin/accounts/{resources['accountId']}", "account"))
+            cleanup.append(delete_probe_resource(token, "DELETE", f"/api/v1/admin/groups/{resources['groupId']}", "group"))
+
+def validate_runtime_capabilities(token):
+    success_body = validate_success_body_reclassification(token)
+    return {
+        "ok": success_body.get("ok") is True,
+        "runtimeImage": app_pod_runtime_image(),
+        "successBodyReclassification": success_body,
+        "valuesPrinted": False,
+    }
+
+def app_pod_runtime_image():
+    try:
+        pod = kube_json(["-n", NAMESPACE, "get", "pod", APP_POD], f"pod/{APP_POD}")
+        spec_containers = ((pod.get("spec") or {}).get("containers") or []) if isinstance(pod, dict) else []
+        status_containers = ((pod.get("status") or {}).get("containerStatuses") or []) if isinstance(pod, dict) else []
+        spec = next((item for item in spec_containers if item.get("name") == "sub2api"), spec_containers[0] if spec_containers else {})
+        status = next((item for item in status_containers if item.get("name") == "sub2api"), status_containers[0] if status_containers else {})
+        return {
+            "pod": APP_POD,
+            "image": spec.get("image"),
+            "imageID": status.get("imageID"),
+            "ready": status.get("ready"),
+            "restartCount": status.get("restartCount"),
+        }
+    except Exception as exc:
+        return {"pod": APP_POD, "error": str(exc)}
+
 def get_account_detail(token, account):
    account_id = account.get("id") if isinstance(account, dict) else None
    if account_id is None:
@@ -2764,9 +3101,10 @@ def run_sync():
    gateway = validate_gateway(api_key)
    responses_smoke = validate_gateway_responses(api_key)
    compact_evidence = recent_compact_gateway_evidence()
+    runtime_capabilities = validate_runtime_capabilities(token)
    return {
        "ok": gateway["ok"] is True and responses_smoke["ok"] is True and owner_concurrency["ok"] is True and capacity_status["ok"] is True and load_factor_status["ok"] is True and ws_v2_status["ok"] is True and temp_unschedulable_status["ok"] is True,
-        "degraded": bool(responses_smoke.get("degraded")) or bool(compact_evidence.get("degraded")),
+        "degraded": bool(responses_smoke.get("degraded")) or bool(compact_evidence.get("degraded")) or runtime_capabilities.get("ok") is not True,
        "mode": "sync",
        "namespace": NAMESPACE,
        "serviceDns": SERVICE_DNS,
@@ -2800,6 +3138,7 @@ def run_sync():
        },
        "ownerBalance": owner_balance,
        "ownerConcurrency": owner_concurrency,
+        "runtimeCapabilities": runtime_capabilities,
        "validation": {"gatewayModels": gateway, "gatewayResponses": responses_smoke, "gatewayCompactRecent": compact_evidence},
    }

@@ -2821,9 +3160,10 @@ def run_validate():
    gateway = validate_gateway(api_key)
    responses_smoke = validate_gateway_responses(api_key)
    compact_evidence = recent_compact_gateway_evidence()
+    runtime_capabilities = validate_runtime_capabilities(token)
    return {
        "ok": gateway["ok"] is True and responses_smoke["ok"] is True and (owner_concurrency is None or owner_concurrency["ok"] is True) and capacity_status["ok"] is True and load_factor_status["ok"] is True and ws_v2_status["ok"] is True and temp_unschedulable_status["ok"] is True,
-        "degraded": bool(responses_smoke.get("degraded")) or bool(compact_evidence.get("degraded")),
+        "degraded": bool(responses_smoke.get("degraded")) or bool(compact_evidence.get("degraded")) or runtime_capabilities.get("ok") is not True,
        "mode": "validate",
        "namespace": NAMESPACE,
        "serviceDns": SERVICE_DNS,
@@ -2842,6 +3182,7 @@ def run_validate():
        "loadFactor": load_factor_status,
        "webSocketsV2": ws_v2_status,
        "tempUnschedulable": temp_unschedulable_status,
+        "runtimeCapabilities": runtime_capabilities,
        "validation": {"gatewayModels": gateway, "gatewayResponses": responses_smoke, "gatewayCompactRecent": compact_evidence},
    }
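
The `outcome` values that `validate_success_body_reclassification` returns in this commit can be read as a small decision table. The following is a minimal sketch, not part of the commit: `describe_success_body_outcome` and the wording of each next step are hypothetical, and only the outcome strings and the `runtimeCapabilities.successBodyReclassification` path are taken from the diff above.

```python
from typing import Any, Mapping

# Outcome strings copied from validate_success_body_reclassification in this commit.
# The next-step wording is an illustrative assumption, not project policy text.
_NEXT_STEPS = {
    "supported": "matching 2xx bodies cool the account; the YAML 200 rule is effective",
    "not-required-by-yaml": "YAML declares no 2xx keyword rule; nothing to prove",
    "unsupported-runtime-image": "capability gap: file or update the response-classification issue",
    "probe-infrastructure-failed": "mock upstream exited early; rerun before drawing conclusions",
    "probe-failed": "probe raised an error; inspect the error and cleanup fields",
}


def describe_success_body_outcome(report: Mapping[str, Any]) -> str:
    """Hypothetical helper: map a validate report to a one-line next step."""
    capabilities = report.get("runtimeCapabilities") or {}
    probe = capabilities.get("successBodyReclassification") or {}
    outcome = probe.get("outcome")
    return _NEXT_STEPS.get(outcome, f"unrecognized outcome: {outcome!r}")


# Usage with a hand-written report fragment (not real validate output).
example_report = {
    "runtimeCapabilities": {
        "successBodyReclassification": {"outcome": "unsupported-runtime-image"},
    },
}
print(describe_success_body_outcome(example_report))
```

Because the commit folds a failed capability check into `degraded` rather than `ok`, a helper like this is best treated as triage guidance on an otherwise passing validate run, not as a pass/fail gate.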