feat(v3s): add D518 code queue standby pod

2026-05-15 14:56:02 +00:00
parent 00add260e3
commit 9d6be83c52
10 changed files with 500 additions and 43 deletions
@@ -0,0 +1 @@
+@AGENTS.md
@@ -50,11 +50,11 @@ The frontend Docker rollout order is: first run the necessary local checks, for example `b

 ## Health Criteria

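-The minimum bar for a working deployment is: backend-core internal `/health` returns ok; frontend public `/health` returns ok; provider ingress public `/health` returns ok; database passes `pg_isready` inside its container; the Todo Note backend `/api/health` returns `storage=postgres`; `v3sctl-adapter` `/api/control-plane` shows the Kubernetes API service proxy state, D601 active serving healthy, the D518 expected/missing node state and the no-fallback path; Code Queue `/health`, reached through the v3s active Service, returns lightweight readiness, the default model and `queue.storage`; `/api/tasks/overview` returns the PostgreSQL queue overview; Project Manager `/health` returns `storage.primary=postgres` and the project count; backend-core `/api/performance` returns performance metrics; `/api/nodes` lists the `main-server` and `D601` providers with status `online`; `/api/nodes/system-status` contains the corresponding CPU/memory/disk samples; and `/api/nodes/docker-status` shows Docker snapshots for the main server and D601. Until D518 has joined, do not bypass with a direct D601->D518 connection or a NodePort, and do not treat D518 missing as the D601 active Service being unavailable; join acceptance should prove that D518 enters the control plane through a native k3s/k8s agent/proxy/tunnel or an explicit provider maintenance proxy. Before delivery, `bun scripts/cli.ts e2e run` must also be run, with the gate in `docs/reference/e2e.md` as the final verdict.
+The minimum bar for a working deployment is: backend-core internal `/health` returns ok; frontend public `/health` returns ok; provider ingress public `/health` returns ok; database passes `pg_isready` inside its container; the Todo Note backend `/api/health` returns `storage=postgres`; `v3sctl-adapter` `/api/control-plane` shows the `unidesk-v8s` Kubernetes API service proxy state, D601 active serving healthy, D518 standby pod ready, `presentNodeIds=[D601,D518]`, `missingNodeIds=[]` and the no-fallback path; Code Queue `/health`, reached through the v3s active Service, returns lightweight readiness, the default model, `queue.storage` and `egressProxy.connected=true`; `/api/tasks/overview` returns the PostgreSQL queue overview; Project Manager `/health` returns `storage.primary=postgres` and the project count; backend-core `/api/performance` returns performance metrics; `/api/nodes` lists the `main-server`, `D601` and `D518` providers with status `online`; `/api/nodes/system-status` contains the corresponding CPU/memory/disk samples; and `/api/nodes/docker-status` shows Docker snapshots for the main server, D601 and D518. D518 must join the V8S control plane through the K3S agent and run the `code-queue-d518` standby Pod; a direct D601->D518 connection, a NodePort or provider-gateway business HTTP must not be used to bypass the Kubernetes service route. Before delivery, `bun scripts/cli.ts e2e run` must also be run, with the gate in `docs/reference/e2e.md` as the final verdict.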

 ## Code Queue D601 Resource Budget

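-Code Queue has moved from the main server to D601 v3s/k8s, but it must still keep explicit hard memory/swap limits. `CODE_QUEUE_MAX_ACTIVE_QUEUES=0` is the default so that queues run in parallel again, while small hot windows such as `CODE_QUEUE_IN_MEMORY_OUTPUT_RECORDS=10` and `CODE_QUEUE_IN_MEMORY_EVENT_RECORDS=10` stay in place; task history, queue statistics and Trace/output reads must prefer direct reads or aggregation from PostgreSQL, `/health` performs lightweight readiness only, and the full history must not be cached inside the Bun process for performance convenience. Any change that raises the Code Queue hot window, log buffers, the resident footprint of Playwright/Codex child processes or container limits, or that explicitly sets `CODE_QUEUE_MAX_ACTIVE_QUEUES` to a positive number, must state the source of the D601 resource budget in the same task and prove, via D601 `kubectl -n unidesk get deploy,svc,pod`, `kubectl -n unidesk top pod` or equivalent Docker stats, `microservice health code-queue` and the corresponding E2E, that no memory-blowup risk has been reintroduced.
+Code Queue has moved from the main server to D601 v3s/k8s, but it must still keep explicit hard memory/swap limits. `CODE_QUEUE_MAX_ACTIVE_QUEUES=0` is the default so that queues run in parallel again, while small hot windows such as `CODE_QUEUE_IN_MEMORY_OUTPUT_RECORDS=10` and `CODE_QUEUE_IN_MEMORY_EVENT_RECORDS=10` stay in place; task history, queue statistics and Trace/output reads must prefer direct reads or aggregation from PostgreSQL, `/health` performs lightweight readiness only, and the full history must not be cached inside the Bun process for performance convenience. Any change that raises the Code Queue hot window, log buffers, the resident footprint of Playwright/Codex child processes or container limits, or that explicitly sets `CODE_QUEUE_MAX_ACTIVE_QUEUES` to a positive number, must state the source of the D601 resource budget in the same task and prove, via D601 `KUBECONFIG=/home/ubuntu/unidesk-code-queue-deploy/.state/v8s/kubeconfig kubectl -n unidesk get deploy,svc,pod`, `kubectl -n unidesk top pod` or equivalent Docker stats, `microservice health code-queue` and the corresponding E2E, that no memory-blowup risk has been reintroduced.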

 ## Database Connection Budget

@@ -35,7 +35,7 @@ Typical targeted commands:
 - Core API: `docker exec unidesk-backend-core` calls internal `GET /api/overview`, which must report `dbReady: true`, `pgdata.volumeName=unidesk_pgdata_10gb`, a positive PostgreSQL database byte count, and at least one online node; internal `GET /api/performance` must report component request statistics, internal operation statistics, PGDATA usage and Code Queue PostgreSQL storage metadata.
 - Provider self-connection: internal `GET /api/nodes` must contain `main-server` with `status: online`, `labels.providerGatewayVersion` equal to `src/components/provider-gateway/package.json`, `labels.providerGatewayUpgradePolicy: "always-enabled"`, `labels.providerGatewayRestartPolicyOk: true`, `labels.providerGatewayPidModeOk: true`, and `labels.providerGatewayRuntimeGuardOk: true`; internal `GET /api/nodes/system-status` must contain CPU/memory/disk samples plus a non-empty process resource list sorted by memory by default; internal `GET /api/nodes/docker-status` must contain a Docker snapshot for `main-server`; every running `provider-gateway` container visible in Docker snapshots must report `restartPolicy: "always"` and `pidMode: "host"`; public provider ingress `/health` must return ok.
 - Provider remote control: internal `/api/dispatch` must successfully complete a real `provider.upgrade` task in `mode: "plan"` so the upgrade path is validated without recreating the running gateway during E2E.
- User services: internal `/api/microservices` must include `todo-note` and `oa-event-flow` on `main-server`, canonical `filebrowser` on `D518`, plus `v3sctl-adapter`, `code-queue`, `findjob`, `pipeline`, `met-nonlinear`, `claudeqq` and `filebrowser-d601` on `D601` with `public=false`; `/api/microservices/todo-note/health` must report `storage=postgres`, `/api/microservices/todo-note/proxy/api/instances` must expose the migrated Todo Note lists, and a temporary Todo Note list create/add/toggle/undo/delete cycle must succeed through the real provider-gateway proxy; `/api/microservices/oa-event-flow/health`, `/api/microservices/oa-event-flow/proxy/api/diagnostics`, `/api/microservices/oa-event-flow/proxy/api/events`, `/api/microservices/oa-event-flow/proxy/api/events?tags=service:pipeline` and `/api/microservices/oa-event-flow/proxy/api/stats/trace` must prove the independent OA event table, Pipeline bridge and stats center are reachable through UniDesk proxy; `/api/microservices/v3sctl-adapter/health` and `/api/microservices/v3sctl-adapter/proxy/api/control-plane` must expose the D601 v3s/k8s control plane, `kubeApiProxy.mode=kubernetes-api-service-proxy`, D601 active instance `servingHealthy=true`, D518 expected/missing state when D518 has not joined, `status=degraded` for incomplete topology, and `noFallback=true`; `/api/microservices/code-queue/health` must return the active Code Queue backend summary with default model `gpt-5.5`, and `/api/microservices/code-queue/proxy/api/tasks/overview` must return queue state through backend-core -> v3sctl-adapter -> Kubernetes API service proxy -> v3s/k8s Service, not through a `serviceId=code-queue` provider-gateway direct task or `/api/code-queue-direct`; `/api/microservices/filebrowser/health`, `/api/microservices/filebrowser-d601/health` and `/api/microservices/filebrowser/proxy/` must prove File Browser health and WebUI access through UniDesk proxy; `/api/microservices/findjob/health` and `/api/microservices/findjob/proxy/api/summary` must succeed through the real provider-gateway proxy; `/api/microservices/findjob/proxy/api/jobs?__unideskArrayLimit=jobs:5` must return a bounded preview with `_unidesk.arrayLimits` metadata; `/api/microservices/pipeline/health`, `/api/microservices/pipeline/proxy/api/snapshot?__unideskArrayLimit=registry.components:8,runs:3` and `/api/microservices/pipeline/proxy/api/oa-event-flow/diagnostics` must return Pipeline health, registry/run previews and OA event-flow evidence; `/api/microservices/met-nonlinear/health`, `/api/microservices/met-nonlinear/proxy/api/queue`, `/api/microservices/met-nonlinear/proxy/api/projects?root=projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects?root=ex_projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects/config?path=<projectPath>` and `/api/microservices/met-nonlinear/proxy/api/images` must return the D601 TS backend health, queue/GPU policy, full project tree inputs, structured project detail and ready `met-nonlinear-ml:tf26` image status.
+- User services: internal `/api/microservices` must include `todo-note` and `oa-event-flow` on `main-server`, canonical `filebrowser` on `D518`, plus `v3sctl-adapter`, `code-queue`, `findjob`, `pipeline`, `met-nonlinear`, `claudeqq` and `filebrowser-d601` on `D601` with `public=false`; `/api/microservices/todo-note/health` must report `storage=postgres`, `/api/microservices/todo-note/proxy/api/instances` must expose the migrated Todo Note lists, and a temporary Todo Note list create/add/toggle/undo/delete cycle must succeed through the real provider-gateway proxy; `/api/microservices/oa-event-flow/health`, `/api/microservices/oa-event-flow/proxy/api/diagnostics`, `/api/microservices/oa-event-flow/proxy/api/events`, `/api/microservices/oa-event-flow/proxy/api/events?tags=service:pipeline` and `/api/microservices/oa-event-flow/proxy/api/stats/trace` must prove the independent OA event table, Pipeline bridge and stats center are reachable through UniDesk proxy; `/api/microservices/v3sctl-adapter/health` and `/api/microservices/v3sctl-adapter/proxy/api/control-plane` must expose the D601 `unidesk-v8s` control plane, `kubeApiProxy.mode=kubernetes-api-service-proxy`, D601 active instance `servingHealthy=true`, D518 standby instance `healthy=true`, `presentNodeIds=[D601,D518]`, `missingNodeIds=[]`, `status=healthy`, and `noFallback=true`; `/api/microservices/code-queue/health` must return the active Code Queue backend summary with default model `gpt-5.5` and `egressProxy.connected=true`, and `/api/microservices/code-queue/proxy/api/tasks/overview` must return queue state through backend-core -> v3sctl-adapter -> Kubernetes API service proxy -> v3s/k8s Service, not through a `serviceId=code-queue` provider-gateway direct task or `/api/code-queue-direct`; `/api/microservices/filebrowser/health`, `/api/microservices/filebrowser-d601/health` and `/api/microservices/filebrowser/proxy/` must prove File Browser health and WebUI access through UniDesk proxy; `/api/microservices/findjob/health` and `/api/microservices/findjob/proxy/api/summary` must succeed through the real provider-gateway proxy; `/api/microservices/findjob/proxy/api/jobs?__unideskArrayLimit=jobs:5` must return a bounded preview with `_unidesk.arrayLimits` metadata; `/api/microservices/pipeline/health`, `/api/microservices/pipeline/proxy/api/snapshot?__unideskArrayLimit=registry.components:8,runs:3` and `/api/microservices/pipeline/proxy/api/oa-event-flow/diagnostics` must return Pipeline health, registry/run previews and OA event-flow evidence; `/api/microservices/met-nonlinear/health`, `/api/microservices/met-nonlinear/proxy/api/queue`, `/api/microservices/met-nonlinear/proxy/api/projects?root=projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects?root=ex_projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects/config?path=<projectPath>` and `/api/microservices/met-nonlinear/proxy/api/images` must return the D601 TS backend health, queue/GPU policy, full project tree inputs, structured project detail and ready `met-nonlinear-ml:tf26` image status.
 - ClaudeQQ availability: `/api/microservices/claudeqq/health` must only pass when `ready=true`, NapCat HTTP and WebSocket are connected, and `napcat.loginState=logged_in`; `/api/microservices/claudeqq/proxy/api/napcat/login` must show the same logged-in account state and `/api/microservices/claudeqq/proxy/api/events/recent` must prove the backend can read the persistent event cache. A QR-code-only or not-logged-in NapCat state must be treated as unhealthy.
 - Database: the command writes an `unidesk_e2e_markers` row through `docker exec unidesk-database psql`, confirms provider state is stored in PostgreSQL, and checks Todo Note rows exist in `todo_note_instances` using the same named volume.
 - Pipeline OA event flow: `microservice:pipeline-oa-event-flow` must prove both no-audit and monitor-audit runs are driven by OA events end to end. The event stream must show `node-finished` as a neutral fact with `pipeline:{pipelineId}` and `epoch:{runId}` tags, OA policy as the source of downstream/audit decisions, monitor decisions as OA control events, and runner control-result evidence. E2E must fail if delivery still depends on a legacy detail audit policy flag as policy authority, independent legacy audit-request points, a legacy batch completion gate, direct monitor-to-runner calls, or frontend/CLI writes to Pipeline `.state`.
@@ -130,11 +130,12 @@ In the UniDesk context, Baidu Netdisk is managed as a pure backend service: do not expose Baidu
 - Provider: `D601`. The D601 provider-gateway only maintains and accesses the `v3sctl-adapter` local private port `127.0.0.1:4266`; provider-gateway is no longer the direct proxy for `code-queue` business requests.
 - Code reference: `https://github.com/pikasTech/unidesk` plus the `repository.commitId` in the config; the service source lives in `src/components/microservices/v3sctl-adapter` and is a UniDesk-owned control-plane component.
 - Deployment reference: `src/components/microservices/v3sctl-adapter/docker-compose.d601.yml` in the UniDesk repository; the Dockerfile is `src/components/microservices/v3sctl-adapter/Dockerfile` and the container name is `v3sctl-adapter`.
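- v3s implementation: `v3sctl-managed` may currently land on k3s, k8s or an equivalent standard Kubernetes control plane, but it must use canonical Kubernetes objects such as native namespaces, Deployments, Services, readiness/liveness probes and the Kubernetes API service proxy; bare container ports, NodePort, SSH curl, provider-gateway `microservice.http` or direct host addresses must not be disguised as v3s service routes.
+- v3s implementation: the running control plane is currently the `unidesk-v8s` K3S server on D601 plus the K3S agent on D518; `v3sctl-managed` may land on k3s, k8s or an equivalent standard Kubernetes control plane, but it must use canonical Kubernetes objects such as native namespaces, Deployments, Services, readiness/liveness probes and the Kubernetes API service proxy; bare container ports, NodePort, SSH curl, provider-gateway `microservice.http` or direct host addresses must not be disguised as v3s service routes.
+- V8S system components: the `unidesk-v8s` server must disable the non-essential `traefik`, `servicelb` and `metrics-server`, keeping only the API server, CoreDNS and local-path provisioner that the workloads need; CoreDNS and the local-path provisioner are pinned to the D601 control-plane node so that D518 maintenance-tunnel limits do not cause system DNS/readiness flapping.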
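 - manifest: managed-service declarations live in `src/components/microservices/v3sctl-adapter/v3s/*.v3s.json`, and the adapter reads them at startup through `V3SCTL_MANIFEST_PATHS`; the manifest is the authoritative source for D601/D518 instances, the active instance, single writer, expected nodes and health policy. `V3SCTL_SERVICES_JSON` must not carry static HTTP services, must not override a service of the same name and must not act as a hidden fallback; any additional service must also supply a complete `ManagedKubernetesService` manifest.
 - API: `GET /health` only indicates that the adapter control plane itself is available and exposes managed-service serving health as the `managedServicesHealthy` field; `GET /api/control-plane` returns the control plane, manifest, kubectl/v3s snapshot and managed-service state; `GET /api/services` returns the managed-service list; `GET|HEAD /api/services/<id>/health` returns the active serving health of that v3s service; `/api/services/<id>/proxy/*` is the only proxy entry point through which business requests reach the active service.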
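- Proxy path: the only official path by which the adapter reaches managed business services is the Kubernetes API service proxy: `/api/v1/namespaces/<namespace>/services/<service>:<port>/proxy/...`. D601 and D518 are not required to reach each other directly; when D518 joins, native k3s/k8s agent/proxy/tunnel capabilities should be preferred to make the node reachable from the control plane, and if necessary the provider maintenance channel may carry only the setup and diagnosis of control-plane connections, but business requests must not degrade into provider-gateway connecting directly to the Code Queue HTTP port.
- Topology health: `expectedNodeIds` shows the planned nodes. While D518 has not joined, `missingNodeIds=["D518"]` and `status=degraded` must stay visible; as long as the active D601 Service returns healthy through the Kubernetes API service proxy, `servingHealthy=true`, `healthy=true` and `managedServicesHealthy=true` should still hold. Only services that explicitly set `requireAllInstancesHealthy=true` may escalate a missing standby/worker node to overall unhealthy.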
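+- Proxy path: the only official path by which the adapter reaches the active business service is the Kubernetes API service proxy: `/api/v1/namespaces/<namespace>/services/<service>:<port>/proxy/...`. D601 and D518 are not required to reach each other directly; D518 joins the control plane through the K3S agent, and control-plane connections may be established over the node maintenance tunnel, but business requests must not degrade into provider-gateway connecting directly to the Code Queue HTTP port. If a standby/worker node is constrained by kubelet/service-proxy reachability, the manifest may explicitly set `healthMode=pod-ready` as the topology health probe; this reads only Kubernetes Pod readiness, is not a business proxy path and cannot replace the active Service proxy.
+- Topology health: `expectedNodeIds` shows the planned nodes; the current Code Queue target topology must include both D601 and D518, with `presentNodeIds` equal to `["D601","D518"]`, `missingNodeIds=[]`, `topologyComplete=true` and `status=healthy`. A D518 that has not joined is allowed only as an explicit degraded state during migration and must not be hidden as a fallback; only services that explicitly set `requireAllInstancesHealthy=true` may escalate a missing standby/worker node to overall unhealthy.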
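 - Frontend: the `用户服务 / V3S Control` (User Services / V3S Control) React page must communicate only through `/api/microservices/v3sctl-adapter/proxy/api/control-plane`, and shows the control-plane state, manifest, D601/D518 instances, active instance, the Kubernetes API service proxy/no-fallback path and an explicit raw-JSON button; the page must not directly access provider-gateway, D601/D518 business container ports, NodePort or the raw v3s/kubectl API.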
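
The topology health rule in the list above can be sketched as a small pure function. This is only an illustration under stated assumptions: `evaluateTopology`, `TopologyInput`, `readyNodeIds` and the `"unhealthy"` status value are hypothetical names that do not come from the adapter source; only `expectedNodeIds`, `presentNodeIds`, `missingNodeIds`, `topologyComplete`, `servingHealthy`, `requireAllInstancesHealthy` and the `healthy`/`degraded` statuses appear in the text.

```typescript
// Hypothetical sketch; not the adapter's actual implementation.
type TopologyStatus = "healthy" | "degraded" | "unhealthy"; // "unhealthy" is an assumed value

interface TopologyInput {
  expectedNodeIds: string[];            // planned nodes from the manifest
  readyNodeIds: string[];               // assumed input: nodes whose instance reports ready
  servingHealthy: boolean;              // active Service healthy via the Kubernetes API service proxy
  requireAllInstancesHealthy?: boolean; // opt-in strict mode
}

interface TopologyHealth {
  presentNodeIds: string[];
  missingNodeIds: string[];
  topologyComplete: boolean;
  healthy: boolean;
  status: TopologyStatus;
}

function evaluateTopology(input: TopologyInput): TopologyHealth {
  const ready = new Set(input.readyNodeIds);
  const presentNodeIds = input.expectedNodeIds.filter((id) => ready.has(id));
  const missingNodeIds = input.expectedNodeIds.filter((id) => !ready.has(id));
  const topologyComplete = missingNodeIds.length === 0;
  // A missing standby/worker node only makes the service unhealthy in strict mode.
  const healthy = input.servingHealthy && (topologyComplete || !input.requireAllInstancesHealthy);
  const status: TopologyStatus = !healthy ? "unhealthy" : topologyComplete ? "healthy" : "degraded";
  return { presentNodeIds, missingNodeIds, topologyComplete, healthy, status };
}
```

With both D601 and D518 ready the sketch reports `status=healthy` and an empty `missingNodeIds`; with only D601 ready it stays `healthy=true` but reports `status=degraded`, so the missing node remains visible instead of being hidden.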

 ### Code Queue V3S-Managed
@@ -148,7 +149,7 @@ In the UniDesk context, Baidu Netdisk is managed as a pure backend service: do not expose Baidu
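 - Main-service dependency mappings: Code Queue still treats the main PostgreSQL as its authoritative database, so `DATABASE_URL` must point at the restricted port mapping on the main server; `OA_EVENT_FLOW_BASE_URL` must point at the main server's restricted OA Event Flow port mapping; the D601 active instance's `CODE_QUEUE_NOTIFY_CLAUDEQQ_BASE_URL` uses the local ClaudeQQ mapping `http://host.docker.internal:3290` directly. These port mappings serve only controlled node runtimes, must have their sources restricted by a firewall or equivalent policy, and must not become an entry point for browsers or arbitrary public clients.
 - K8s probes and startup maintenance: Kubernetes liveness/startup probes must use the lightweight `/live`, while readiness and user-service health use `/health`; `/health` must not run full task aggregation, history backfill or long-transaction index maintenance, and the historical task overview should be read from PostgreSQL by `/api/tasks/overview`. At startup, queue metadata flush, notification outbox reads, task-table index maintenance and overview warmup may run in the background, but this maintenance must not block the Bun server, the readiness endpoint or the frontend overview; notification-table indexing and bulk OA backfill must not be default startup side effects.
 - MiniMax/OpenCode concurrency: `minimax-m2.7` runs through the OpenCode JSON event port; every Code Queue task must use its own OpenCode XDG data/config/cache/state directories, and concurrent tasks across queues must not share one OpenCode SQLite/WAL state directory, otherwise a concurrent smoke test triggers database lock or initialization errors such as on `PRAGMA journal_mode = WAL`. The MiniMax smoke test used to validate the v3s/k8s path takes "at least 4 tasks, spread across 2 queues, with at least 2 reaching a successful terminal state" as the path acceptance line; remaining failures caused by OpenCode final-reply capture, business task judgment or model rate limiting should be investigated separately as Code Queue execution-reliability issues and must not be used to infer that the v3s proxy path failed.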
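- Default egress proxy: the D601 active Code Queue Pod must by default inject `HTTP_PROXY`, `HTTPS_PROXY` and `ALL_PROXY` into task child processes such as Codex/OpenCode, `git`, `curl` and `npm`; the only upstream today is the egress HTTP CONNECT port `http://host.docker.internal:18789` that D601 provider-gateway publishes on the host loopback, which may bind only to `127.0.0.1` and must not be opened to the public internet. Here provider-gateway acts only as the egress proxy, not as the Code Queue business HTTP proxy; business access still goes only through the Kubernetes API service proxy. A native k3s/k8s egress gateway, service mesh or CNI egress policy is only a direction for later network-layer hardening, and the current delivery state does not introduce a second egress control plane. Remote development/execution containers must not rely on these environment variables alone; at the container network layer they must use a TUN default route and an OUTPUT firewall to force external traffic out through the master TUN egress only.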
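+- Default egress proxy: the D601 active Code Queue Pod must by default inject `HTTP_PROXY`, `HTTPS_PROXY` and `ALL_PROXY` into task child processes such as Codex/OpenCode, `git`, `curl` and `npm`; the only upstream today is the D601 provider-gateway egress HTTP CONNECT proxy, exposed to Pods in the `unidesk` namespace through the Kubernetes `Service d601-provider-egress-proxy`. The EndpointSlice of that Service points at the D601 provider-gateway private Docker network endpoint, and the proxy URL inside the Pod is `http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789`; the provider-gateway host port may still bind only to `127.0.0.1` and must not be opened to the public internet. If the provider-gateway container IP changes, the EndpointSlice must be refreshed accordingly and verified with Code Queue `/health.egressProxy.connected=true`. Here provider-gateway acts only as the egress proxy, not as the Code Queue business HTTP proxy; business access still goes only through the Kubernetes API service proxy. A native k3s/k8s egress gateway, service mesh or CNI egress policy is only a direction for later network-layer hardening, and the current delivery state does not introduce a second egress control plane. Remote development/execution containers must not rely on these environment variables alone; at the container network layer they must use a TUN default route and an OUTPUT firewall to force external traffic out through the master TUN egress only.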
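 - No-fallback discipline for the egress proxy: the Code Queue runtime configuration allows exactly one default egress path, the provider-gateway egress proxy; the code must not also keep implicit branches such as a Code Queue self-built WebSocket proxy, an ad hoc shell proxy, direct public access from D601 or a direct HTTP proxy on the main server. Any new network fallback must first be recorded in this reference document together with state visible in `/health`; otherwise it is treated as a leftover legacy path.
 - Rollout discipline: frontend or backend improvements related to Code Queue must be formally released within the same task and verified against the public frontend or live API; they cannot stop at source code, build artifacts or "release later". When changing Code Queue itself, do not wait for the current Code Queue task to finish, for the queue to go idle or for `0 running` before restarting; perform a build-first replacement through the v3s control plane or the D601/D518 maintenance entry, and use the v3s adapter, the Code Queue live API or the public frontend to prove that tasks and queues remain readable and can continue.
 - Renaming and disaster recovery: the legacy Codex queue service name is allowed only for compatibility diagnostics and as a one-time migration source; when the `code-queue-backend` container's own `/health` is fine but `microservice health code-queue` returns a provider direct-connection error, first assume that backend-core is still loading the old `MICROSERVICES_JSON` or that the adapter manifest has not been refreshed. Refresh `.state/docker-compose.env`, rebuild/replace `backend-core` and `v3sctl-adapter`, then use `microservice list` to verify that `code-queue` shows `runtime.orchestrator=v3sctl`, `backend.proxyMode=v3sctl-adapter-http` and no direct business-container connection summary.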
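
The default egress proxy injection described in the list above could look roughly like the following. This is a sketch under assumptions: `buildEgressEnv` and its parameters are hypothetical names, and the proxy URL is the Service DNS name quoted in the added line of this hunk; the real Code Queue code may assemble the child-process environment differently.

```typescript
// Hypothetical helper; the URL is the in-Pod proxy URL named in the text.
const EGRESS_PROXY_URL = "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789";

// Returns a copy of the base environment with the single default egress path injected.
// noProxyHosts is an assumed parameter holding explicitly internal hosts only.
function buildEgressEnv(base: Record<string, string>, noProxyHosts: string[]): Record<string, string> {
  return {
    ...base,
    HTTP_PROXY: EGRESS_PROXY_URL,
    HTTPS_PROXY: EGRESS_PROXY_URL,
    ALL_PROXY: EGRESS_PROXY_URL,
    NO_PROXY: noProxyHosts.join(","),
  };
}
```

Keeping one constant and one builder mirrors the no-fallback discipline: there is no second branch that could silently pick a different upstream.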
@@ -281,7 +282,7 @@ In the UniDesk context, ClaudeQQ is managed as a message-gateway backend service: do not directly
 - `bun scripts/cli.ts microservice health claudeqq`, `bun scripts/cli.ts microservice proxy claudeqq /api/napcat/login`, `bun scripts/cli.ts microservice proxy claudeqq /api/events/recent` and `bun scripts/cli.ts microservice proxy claudeqq /api/events/subscriptions`: verify the ClaudeQQ backend, NapCat container login, event subscriptions and the private proxy path; message push uses `POST /api/push/text`, and the D601 ports `3290/3000/3001/6099` must not be opened publicly.
 - `bun scripts/cli.ts microservice health todo-note` and `bun scripts/cli.ts microservice proxy todo-note /api/instances`: verify the main-server Todo Note backend, PostgreSQL storage and the local provider-gateway private proxy path.
 - `bun scripts/cli.ts microservice health oa-event-flow`, `bun scripts/cli.ts microservice proxy oa-event-flow /api/diagnostics --raw` and `bun scripts/cli.ts microservice proxy oa-event-flow '/api/events?tags=service:code-queue&limit=20' --raw`: verify the unified OA event flow, event table, tag queries and stats center.
- `bun scripts/cli.ts microservice health v3sctl-adapter` and `bun scripts/cli.ts microservice proxy v3sctl-adapter /api/control-plane --raw`: verify the D601 v3s control-plane adapter, manifest, D601/D518 instance state and the no-fallback runtime path.
+- `bun scripts/cli.ts microservice health v3sctl-adapter` and `bun scripts/cli.ts microservice proxy v3sctl-adapter /api/control-plane --raw`: verify the D601 `unidesk-v8s` control-plane adapter, manifest, D601 active/D518 standby instance state, `presentNodeIds=[D601,D518]`, `missingNodeIds=[]` and the no-fallback runtime path.
 - `bun scripts/cli.ts microservice health code-queue` and `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview`: verify that Code Queue follows the single path backend-core -> v3sctl-adapter -> v3s active service; the output must not contain a provider-gateway `microservice.http` business proxy task with `serviceId=code-queue`; writes, prompt appends, interrupts and readAt/unread state must all be persisted to PostgreSQL by the backend, and the frontend must not fake success state with local storage.
 - `bun scripts/cli.ts microservice health filebrowser`, `bun scripts/cli.ts microservice health filebrowser-d601` and `bun scripts/cli.ts microservice proxy filebrowser / --max-body-bytes 2000`: verify the private proxy paths of the primary File Browser on D518 and the backup File Browser on D601; the browser WebUI must be accessed through `/api/microservices/filebrowser/proxy/` or `/api/microservices/filebrowser-d601/proxy/`, and port `4251` must not be opened publicly.
 - `bun scripts/cli.ts --main-server-ip 74.48.78.17 microservice health findjob`: run the same verification from a compute node or any other host that is not the main server, through the public frontend remote CLI, without needing the main server's SSH key.
@@ -309,7 +310,7 @@ In the UniDesk context, ClaudeQQ is managed as a message-gateway backend service: do not directly
 - Run `bun scripts/cli.ts microservice health met-nonlinear`, `bun scripts/cli.ts microservice proxy met-nonlinear /api/queue`, `bun scripts/cli.ts microservice proxy met-nonlinear '/api/projects?root=projects&limit=20'` and `bun scripts/cli.ts microservice proxy met-nonlinear /api/images` to confirm that the real path goes through backend-core, WebSocket, D601 provider-gateway and the local MET Nonlinear TS backend on D601.
 - Run `bun scripts/cli.ts microservice health claudeqq`, `bun scripts/cli.ts microservice proxy claudeqq /api/napcat/login`, `bun scripts/cli.ts microservice proxy claudeqq /api/events/recent` and `bun scripts/cli.ts microservice proxy claudeqq /api/events/subscriptions` to confirm that the real path goes through backend-core, WebSocket, D601 provider-gateway and the local ClaudeQQ backend on D601; on D601, `curl http://127.0.0.1:3290/health` should show `service=claudeqq`, `pureBackend=true`, `napcat.containerized=true`, NapCat HTTP/WS state, QR-code state and subscription counts.
 - Run `bun scripts/cli.ts microservice health todo-note` and `bun scripts/cli.ts microservice proxy todo-note /api/instances` to confirm that the real path goes through backend-core, WebSocket, main-server provider-gateway and the main server's `todo-note-backend` backend; the output must include the five migrated lists and the PostgreSQL storage health state.
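- Run `bun scripts/cli.ts microservice health v3sctl-adapter`, `bun scripts/cli.ts microservice proxy v3sctl-adapter /api/control-plane --raw`, `bun scripts/cli.ts microservice health code-queue` and `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview` to confirm that the real path goes through backend-core -> v3sctl-adapter -> v3s active service; Code Queue `/health` must still return the business backend's own `queue.storage.primary=postgres`, `queue.storage.postgresReady=true` and `queue.notifications.claudeqq.outbox.storage=postgres`, and must not be replaced by the adapter's aggregated health JSON. It is also required to verify, from inside the active Code Queue Pod, that the main PostgreSQL port mapping, the main OA Event Flow port mapping and the local ClaudeQQ `http://host.docker.internal:3290` are all reachable, and to confirm on the adapter control page that D601 active serving is healthy, that D518 expected/missing is visible and that the whole does not degrade into a hidden fallback. Then submit a small `gpt-5.5` task through the public frontend and confirm that the queue advances serially, output updates in real time, a judge verdict follows completion, and a prompt can be appended or the task interrupted while it runs. Code Queue restart recovery must be an acceptance item: after restarting or rebuilding the active instance while a task is running, the task must recover from PostgreSQL into a state where it can continue, without losing the active task, `promptHistory`, subsequently queued tasks, readAt/unread state or ClaudeQQ notifications already in the outbox. After a migration of the Code Queue service name, table-name prefix or persistence directory, `bun scripts/cli.ts e2e run --only microservice:catalog-code-queue,microservice:code-queue-status,microservice:code-queue-health,microservice:code-queue-tasks` must also be run to prove that the backend-core catalog, the v3s adapter private proxy, the PostgreSQL queue and the task list all point at `code-queue`. Batch acceptance must set `入队份数=5` (enqueue copies) in the public frontend, or use multi-segment prompt separators, to enqueue 5 tasks at once, and confirm that all 5 tasks enter running/judging/succeeded in order rather than only the first one running.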
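+- Run `bun scripts/cli.ts microservice health v3sctl-adapter`, `bun scripts/cli.ts microservice proxy v3sctl-adapter /api/control-plane --raw`, `bun scripts/cli.ts microservice health code-queue` and `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview` to confirm that the real path goes through backend-core -> v3sctl-adapter -> v3s active service; Code Queue `/health` must still return the business backend's own `queue.storage.primary=postgres`, `queue.storage.postgresReady=true`, `queue.notifications.claudeqq.outbox.storage=postgres` and `egressProxy.connected=true`, and must not be replaced by the adapter's aggregated health JSON. It is also required to verify, from inside the active Code Queue Pod, that the main PostgreSQL port mapping, the main OA Event Flow port mapping, the local ClaudeQQ `http://host.docker.internal:3290` and `d601-provider-egress-proxy` are all reachable, and to confirm on the adapter control page that D601 active serving is healthy, that the D518 standby pod is ready, that `missingNodeIds=[]` and that the whole does not degrade into a hidden fallback. Then submit a small `gpt-5.5` task through the public frontend and confirm that the queue advances serially, output updates in real time, a judge verdict follows completion, and a prompt can be appended or the task interrupted while it runs. Code Queue restart recovery must be an acceptance item: after restarting or rebuilding the active instance while a task is running, the task must recover from PostgreSQL into a state where it can continue, without losing the active task, `promptHistory`, subsequently queued tasks, readAt/unread state or ClaudeQQ notifications already in the outbox. After a migration of the Code Queue service name, table-name prefix or persistence directory, `bun scripts/cli.ts e2e run --only microservice:catalog-code-queue,microservice:code-queue-status,microservice:code-queue-health,microservice:code-queue-tasks` must also be run to prove that the backend-core catalog, the v3s adapter private proxy, the PostgreSQL queue and the task list all point at `code-queue`. Batch acceptance must set `入队份数=5` (enqueue copies) in the public frontend, or use multi-segment prompt separators, to enqueue 5 tasks at once, and confirm that all 5 tasks enter running/judging/succeeded in order rather than only the first one running.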
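 - Code Queue memory anti-regression acceptance: for any change to Code Queue persistence, scheduler, output/Trace, health, list/detail queries, log export or container runtime parameters, before delivery you must use `kubectl -n unidesk get deploy,pod,svc,endpoints -o wide`, `kubectl -n unidesk describe deploy/code-queue` or an equivalent Docker inspect on D601 to confirm that the hard memory/swap limits match the budget; run `kubectl -n unidesk top pod` or Docker stats to confirm resident memory, `OOMKilled=false` and that `RestartCount` has not grown abnormally; then run `bun scripts/cli.ts microservice health code-queue` to confirm that `/health` is a lightweight readiness check and exposes PostgreSQL/notification/outbox state. Acceptance must also cover `/api/tasks/overview`, single-task detail and output/transcript queries with historical tasks present, proving that hot-state trimming neither loses historical output nor caches all historical `task_json` in the process again; tasks that involve TypeScript/frontend verification should be able to complete short, memory-heavy commands such as `bun run --cwd src/components/frontend check` within the D601 Code Queue memory/swap budget rather than being repeatedly SIGTERMed by the memory watchdog.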
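 - Code Queue latency anti-regression acceptance: for any change to the Code Queue list, overview, readAt, Trace/summary lazy loading, real-time output/SSE event publishing, frontend request strategy, the backend-core user-service proxy or the frontend Code Queue request path, before delivery you must verify, in a live environment that has historical task data and active output flowing, that `GET /api/tasks/overview`, `POST /api/tasks/<id>/read`, `trace-step` for the selected task and the first screen load of the frontend `/app/code-queue/` all stay under the 1s target. You may run `bun scripts/src/code-queue-perf.ts --json --target-ms 1000` to collect first-screen time, the slowest API and DOM-complete metrics under the public frontend, and supplement backend latency, proxy 502 and memory/CPU evidence with `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview --raw`, curl against the D601 Pod `/health` and `/api/tasks/overview`, the failure/slow-operation records of the performance panels `/api/performance` and `/api/frontend-performance`, and `kubectl -n unidesk top pod` or Docker stats. The acceptance conclusion must also state whether a short-TTL cache was used, how the cache is invalidated by mutation or archive append, whether database indexes/aggregations were hit, whether the output hot path reads only incremental metrics, and whether paginated loading skips selected/active/stats; showing only a single snapshot taken after a cache hit is not enough.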
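 - Run `bun scripts/cli.ts microservice health filebrowser`, `bun scripts/cli.ts microservice health filebrowser-d601` and `bun scripts/cli.ts microservice proxy filebrowser / --max-body-bytes 2000` to confirm that File Browser health returns `status=OK`, that the WebUI HTML contains `File Browser`, and that D518/D601 reach the node-local port `4251` through provider-gateway; then, in `用户服务 / File Browser` (User Services / File Browser) on the public frontend, confirm that D518 is the default target, that screenshots can be exported, that the compact iframe layout no longer has a huge `folder` marker covering file names, and that `/mnt/c` can be browsed.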
@@ -92,11 +92,11 @@ provider ingress is the only provider connection interface allowed to be publicly exposed; currently

 ## Egress Proxy

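-provider-gateway can provide an egress HTTP CONNECT proxy so that node-side execution environments such as Code Queue and the Pipeline runner can reach the outside through the existing provider WebSocket channel. The proxy listens on `0.0.0.0:18789` inside the container by default; node deployments must publish it only on the host loopback as `127.0.0.1:18789->18789/tcp` and must not open a public port. Ordinary Docker execution containers can reach the provider-gateway container by name over the same private Docker network, while v3s/k8s Pods uniformly reach the loopback mapping through `host.docker.internal:18789`. The proxy is only responsible for converting local CONNECT/absolute HTTP requests into `egress_tcp_open`, `egress_tcp_data` and `egress_tcp_close` messages; backend-core establishes the real TCP connection on the main server side and relays the data back, which keeps Codex/Git/NPM from hanging when the local network of a compute node such as D601 is unreachable.
+provider-gateway can provide an egress HTTP CONNECT proxy so that node-side execution environments such as Code Queue and the Pipeline runner can reach the outside through the existing provider WebSocket channel. The proxy listens on `0.0.0.0:18789` inside the container by default; node deployments must publish it only on the host loopback as `127.0.0.1:18789->18789/tcp` and must not open a public port. Ordinary Docker execution containers can reach the provider-gateway container by name over the same private Docker network, while v3s/k8s Pods must reach the same-node provider-gateway private endpoint through an explicit Kubernetes Service/EndpointSlice; for example, D601 Code Queue uses `d601-provider-egress-proxy.unidesk.svc.cluster.local:18789`, and this egress Service must not be treated as a business HTTP entry point. The proxy is only responsible for converting local CONNECT/absolute HTTP requests into `egress_tcp_open`, `egress_tcp_data` and `egress_tcp_close` messages; backend-core establishes the real TCP connection on the main server side and relays the data back, which keeps Codex/Git/NPM from hanging when the local network of a compute node such as D601 is unreachable.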

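 This capability is a provider-gateway channel capability: the `unideskCapabilities` sent in register/heartbeat must include `network.egress-proxy`, and labels must report `providerGatewayEgressProxy*` state. Do not register a separate pseudo provider for an individual user service just to implement an egress proxy; otherwise the node list shows fake providers, and the proxy, statistics and upgrade paths split into multiple channels. The proxy health check uses `GET /__unidesk/egress-proxy/health`, which returns `connected`, `providerId`, `activeTunnels` and the listening port; a business service's own `/health` should surface this result as troubleshooting evidence.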

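-The long-term boundary of the egress proxy is "one unified provider channel, no second control plane". backend-core accepts `egress_tcp_*` messages only on an online provider socket and destroys every corresponding TCP relay when that socket closes; provider-gateway only maintains the mapping between the local HTTP proxy and WebSocket messages, stores no business state, and takes no part in any control plane beyond task dispatch, statistics and node registration. Execution containers, user services and the Pipeline runner may not connect directly to the backend-core provider ingress, nor register themselves with a provider token; when they need egress they may connect only to the private proxy endpoint of the provider-gateway on the same node. The current v3s/k8s Code Queue uses `host.docker.internal:18789`, which is the node loopback egress entry point, not a business HTTP proxy entry point, and cannot replace the Kubernetes API service proxy.
+The long-term boundary of the egress proxy is "one unified provider channel, no second control plane". backend-core accepts `egress_tcp_*` messages only on an online provider socket and destroys every corresponding TCP relay when that socket closes; provider-gateway only maintains the mapping between the local HTTP proxy and WebSocket messages, stores no business state, and takes no part in any control plane beyond task dispatch, statistics and node registration. Execution containers, user services and the Pipeline runner may not connect directly to the backend-core provider ingress, nor register themselves with a provider token; when they need egress they may connect only to the private proxy endpoint of the provider-gateway on the same node. The current v3s/k8s Code Queue connects to the D601 provider-gateway egress endpoint through the `d601-provider-egress-proxy` Kubernetes Service, which is the egress entry point inside the Pod, not a business HTTP proxy entry point, and cannot replace the Kubernetes API service proxy.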

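 Failure semantics must be explicit; silent fallback is not allowed. When the WebSocket from provider-gateway to backend-core is not connected, the local proxy must return 503; execution containers must not automatically bypass to direct public access from D601, an external public proxy or the main server's public HTTP port. `NO_PROXY` is only for clearly internal links such as PostgreSQL, OA Event Flow, ClaudeQQ, the frontend/backend-core internal proxy and provider-gateway health, and external targets such as GitHub, model APIs and the npm registry must not be added to the bypass list. Acceptance must prove that provider-gateway labels, the business service `/health` and `curl -I https://...` inside the execution container all take the same proxy path.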
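
The `NO_PROXY` rule in the paragraph above can be checked mechanically. The following is a minimal sketch under assumptions: `findNoProxyViolations` is a hypothetical helper, and the concrete host suffixes in `EXTERNAL_SUFFIXES` are illustrative stand-ins for "GitHub, model APIs and the npm registry", not a list taken from the repository.

```typescript
// Hypothetical deny list; the concrete suffixes are assumptions for illustration.
const EXTERNAL_SUFFIXES = ["github.com", "registry.npmjs.org", "models.example.com"];

// Returns the NO_PROXY entries that would let external traffic bypass the egress proxy.
function findNoProxyViolations(noProxy: string, externalSuffixes: string[] = EXTERNAL_SUFFIXES): string[] {
  return noProxy
    .split(",")
    // Normalize ".example.com" and "*.example.com" to "example.com".
    .map((entry) => entry.trim().toLowerCase().replace(/^\*?\./, ""))
    .filter((entry) => entry.length > 0)
    .filter(
      (entry) =>
        entry === "*" || // a wildcard bypasses the proxy for everything
        externalSuffixes.some(
          (suffix) =>
            entry === suffix ||
            entry.endsWith("." + suffix) || // subdomain of an external target
            suffix.endsWith("." + entry),   // parent domain that also covers the target
        ),
    );
}
```

A check like this could run at startup or in E2E so that an accidental bypass entry fails loudly instead of acting as a silent fallback.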

@@ -61,6 +61,8 @@ const SERVICE_CHECK_NAMES = [
  "microservice:catalog-todo-note",
  "microservice:catalog-oa-event-flow",
  "microservice:catalog-code-queue",
+  "microservice:v3sctl-adapter-status",
+  "microservice:v3sctl-control-plane",
  "microservice:catalog-filebrowser",
  "microservice:filebrowser-health",
  "microservice:filebrowser-webui",
@@ -1026,6 +1028,8 @@ async function serviceChecks(config: UniDeskConfig, urls: PublicUrls, checks: E2
  const oaEventFlowEvents = dockerCoreJson("/api/microservices/oa-event-flow/proxy/api/events?limit=10");
  const oaEventFlowPipelineEvents = dockerCoreJson("/api/microservices/oa-event-flow/proxy/api/events?tags=service:pipeline&limit=10");
  const oaEventFlowStats = dockerCoreJson("/api/microservices/oa-event-flow/proxy/api/stats/trace?limit=10");
+  const v3sctlStatus = dockerCoreJson("/api/microservices/v3sctl-adapter/status");
+  const v3sctlControlPlane = dockerCoreJson("/api/microservices/v3sctl-adapter/proxy/api/control-plane");
  const codeQueueStatus = dockerCoreJson("/api/microservices/code-queue/status");
  const codeQueueHealth = dockerCoreJson("/api/microservices/code-queue/health");
  const codeQueueTasks = dockerCoreJson("/api/microservices/code-queue/proxy/api/tasks/overview?limit=5&transcriptLimit=1&compact=1&afterSeq=0&preferId=");
@@ -1100,8 +1104,27 @@ async function serviceChecks(config: UniDeskConfig, urls: PublicUrls, checks: E2
  const oaEventFlowEventsBody = (oaEventFlowEvents as { body?: { ok?: boolean; events?: unknown[]; returned?: number } }).body;
  const oaEventFlowPipelineEventsBody = (oaEventFlowPipelineEvents as { body?: { ok?: boolean; events?: Array<{ tags?: unknown[]; sourceId?: string; type?: string; payload?: { runId?: string; pipelineId?: string } }>; returned?: number } }).body;
  const oaEventFlowStatsBody = (oaEventFlowStats as { body?: { ok?: boolean; stats?: unknown[]; returned?: number } }).body;
-  const codeQueueHealthBody = (codeQueueHealth as { body?: { ok?: boolean; queue?: { defaultModel?: string; judgeConfigured?: boolean; modelReasoningEfforts?: Record<string, string> } } }).body;
+  const codeQueueHealthBody = (codeQueueHealth as { body?: { ok?: boolean; egressProxy?: { connected?: boolean }; queue?: { defaultModel?: string; judgeConfigured?: boolean; modelReasoningEfforts?: Record<string, string> } } }).body;
  const codeQueueTasksBody = (codeQueueTasks as { body?: { ok?: boolean; queue?: { defaultModel?: string; modelReasoningEfforts?: Record<string, string> }; tasks?: unknown[] } }).body;
+  const v3sctlControlPlaneBody = (v3sctlControlPlane as { body?: {
+    ok?: boolean;
+    clusterId?: string;
+    noFallback?: boolean;
+    managedServicesHealthy?: boolean;
+    kubeApiProxy?: { mode?: string };
+    services?: Array<{
+      id?: string;
+      status?: string;
+      presentNodeIds?: string[];
+      missingNodeIds?: string[];
+      topologyComplete?: boolean;
+      servingHealthy?: boolean;
+      active?: { id?: string; healthy?: boolean };
+      instances?: Array<{ id?: string; healthy?: boolean; proxyMode?: string }>;
+    }>;
+  } }).body;
+  const v3sctlCodeQueueService = v3sctlControlPlaneBody?.services?.find((service) => service.id === "code-queue");
+  const v3sctlD518Instance = v3sctlCodeQueueService?.instances?.find((instance) => instance.id === "D518");
  const filebrowserHealthBody = (filebrowserHealth as { body?: { status?: string } }).body;
  const filebrowserD601HealthBody = (filebrowserD601Health as { body?: { status?: string } }).body;
  const filebrowserWebuiText = String((filebrowserWebui as { body?: { text?: string } }).body?.text || "");
@@ -1141,6 +1164,35 @@ async function serviceChecks(config: UniDeskConfig, urls: PublicUrls, checks: E2
    && codeQueue.runtime?.orchestrator === "v3sctl"
    && codeQueue.runtime?.container === null,
    { microservices });
+  addSelectedCheck(checks, options, "microservice:v3sctl-adapter-status",
+    (v3sctlStatus as { ok?: boolean; body?: { microservice?: { id?: string; providerId?: string } } }).ok === true
+    && (v3sctlStatus as { body?: { microservice?: { id?: string; providerId?: string } } }).body?.microservice?.id === "v3sctl-adapter"
+    && (v3sctlStatus as { body?: { microservice?: { id?: string; providerId?: string } } }).body?.microservice?.providerId === "D601",
+    v3sctlStatus);
+  addSelectedCheck(checks, options, "microservice:v3sctl-control-plane",
+    (v3sctlControlPlane as { ok?: boolean }).ok === true
+    && v3sctlControlPlaneBody?.ok === true
+    && v3sctlControlPlaneBody.clusterId === "unidesk-v8s"
+    && v3sctlControlPlaneBody.noFallback === true
+    && v3sctlControlPlaneBody.managedServicesHealthy === true
+    && v3sctlControlPlaneBody.kubeApiProxy?.mode === "kubernetes-api-service-proxy"
+    && v3sctlCodeQueueService?.status === "healthy"
+    && v3sctlCodeQueueService?.topologyComplete === true
+    && v3sctlCodeQueueService?.servingHealthy === true
+    && v3sctlCodeQueueService?.active?.id === "D601"
+    && v3sctlCodeQueueService?.active?.healthy === true
+    && (v3sctlCodeQueueService?.presentNodeIds ?? []).includes("D601")
+    && (v3sctlCodeQueueService?.presentNodeIds ?? []).includes("D518")
+    && (v3sctlCodeQueueService?.missingNodeIds ?? []).length === 0
+    && v3sctlD518Instance?.healthy === true
+    && v3sctlD518Instance?.proxyMode === "kubernetes-api-pod-readiness",
+    {
+      ok: (v3sctlControlPlane as { ok?: boolean }).ok,
+      clusterId: v3sctlControlPlaneBody?.clusterId,
+      noFallback: v3sctlControlPlaneBody?.noFallback,
+      kubeApiProxy: v3sctlControlPlaneBody?.kubeApiProxy,
+      service: v3sctlCodeQueueService,
+    });
  addSelectedCheck(checks, options, "microservice:catalog-filebrowser", (microservices as { ok?: boolean }).ok === true
    && filebrowser?.providerId === "D518"
    && filebrowser.backend?.public === false
@@ -1209,7 +1261,7 @@ async function serviceChecks(config: UniDeskConfig, urls: PublicUrls, checks: E2
    });
  addSelectedCheck(checks, options, "microservice:oa-event-flow-stats", (oaEventFlowStats as { ok?: boolean }).ok === true && oaEventFlowStatsBody?.ok === true && Array.isArray(oaEventFlowStatsBody.stats), oaEventFlowStats);
  addSelectedCheck(checks, options, "microservice:code-queue-status", (codeQueueStatus as { ok?: boolean }).ok === true && (codeQueueStatus as { body?: { microservice?: { id?: string; providerId?: string } } }).body?.microservice?.providerId === "D601", codeQueueStatus);
-  addSelectedCheck(checks, options, "microservice:code-queue-health", (codeQueueHealth as { ok?: boolean }).ok === true && codeQueueHealthBody?.ok === true && codeQueueHealthBody.queue?.defaultModel === "gpt-5.5" && codeQueueHealthBody.queue?.modelReasoningEfforts?.["gpt-5.5"] === "xhigh", codeQueueHealth);
+  addSelectedCheck(checks, options, "microservice:code-queue-health", (codeQueueHealth as { ok?: boolean }).ok === true && codeQueueHealthBody?.ok === true && codeQueueHealthBody.egressProxy?.connected === true && codeQueueHealthBody.queue?.defaultModel === "gpt-5.5" && codeQueueHealthBody.queue?.modelReasoningEfforts?.["gpt-5.5"] === "xhigh", codeQueueHealth);
  addSelectedCheck(checks, options, "microservice:code-queue-tasks", (codeQueueTasks as { ok?: boolean }).ok === true && codeQueueTasksBody?.ok === true && Array.isArray(codeQueueTasksBody.tasks) && codeQueueTasksBody.queue?.defaultModel === "gpt-5.5" && codeQueueTasksBody.queue?.modelReasoningEfforts?.["gpt-5.5"] === "xhigh", codeQueueTasks);
  const upgradeDispatch = dockerCoreJson("/api/dispatch", {
    method: "POST",
@@ -17,18 +17,18 @@ services:
      HOST: "0.0.0.0"
      PORT: "4266"
      LOG_FILE: "/var/log/unidesk/v3sctl-adapter.jsonl"
-      V3SCTL_CLUSTER_ID: "${V3SCTL_CLUSTER_ID:-D601}"
+      V3SCTL_CLUSTER_ID: "${V3SCTL_CLUSTER_ID:-unidesk-v8s}"
      V3SCTL_NODE_ID: "${V3SCTL_NODE_ID:-D601}"
      V3SCTL_KUBECTL_ENABLED: "${V3SCTL_KUBECTL_ENABLED:-false}"
      V3SCTL_KUBE_API_PROXY_ENABLED: "${V3SCTL_KUBE_API_PROXY_ENABLED:-true}"
-      V3SCTL_KUBECONFIG_PATH: "/var/lib/unidesk/v3s/kubeconfig"
+      V3SCTL_KUBECONFIG_PATH: "/var/lib/unidesk/v8s/kubeconfig"
      V3SCTL_KUBE_API_CONNECT_HOST: "${V3SCTL_KUBE_API_CONNECT_HOST:-host.docker.internal}"
      V3SCTL_MANIFEST_PATHS: "${V3SCTL_MANIFEST_PATHS:-v3s/code-queue.v3s.json}"
      V3SCTL_SERVICES_JSON: "${V3SCTL_SERVICES_JSON:-[]}"
      UNIDESK_LOG_RETENTION_BYTES: "${UNIDESK_LOG_RETENTION_BYTES:-512MiB}"
    volumes:
      - ${V3SCTL_ADAPTER_LOG_DIR:-../../../../.state/v3sctl-adapter/logs}:/var/log/unidesk
-      - ${V3SCTL_KUBECONFIG_HOST_PATH:-../../../../.state/v3s/kubeconfig}:/var/lib/unidesk/v3s/kubeconfig:ro
+      - ${V3SCTL_KUBECONFIG_HOST_PATH:-../../../../.state/v8s/kubeconfig}:/var/lib/unidesk/v8s/kubeconfig:ro
    extra_hosts:
      - "host.docker.internal:host-gateway"
    networks:
@@ -8,6 +8,7 @@ type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string
 type JsonRecord = Record<string, JsonValue>;

 type InstanceRole = "primary" | "standby" | "worker";
+type EndpointHealthMode = "service-proxy" | "pod-ready";

 interface ManagedEndpoint {
  id: string;
@@ -15,6 +16,7 @@ interface ManagedEndpoint {
  role: InstanceRole;
  baseUrl: string;
  healthPath: string;
+  healthMode: EndpointHealthMode;
 }

 interface ManagedService {
@@ -143,6 +145,11 @@ function normalizeRole(value: string): InstanceRole {
  return "worker";
 }

+function normalizeHealthMode(value: string): EndpointHealthMode {
+  if (value === "service-proxy" || value === "pod-ready") return value;
+  return "service-proxy";
+}
+
 function parseEndpoint(value: unknown, index: number, ownerPath = "endpoint"): ManagedEndpoint {
  const path = `${ownerPath}[${index}]`;
  const item = asRecord(value, path);
@@ -154,6 +161,7 @@ function parseEndpoint(value: unknown, index: number, ownerPath = "endpoint"): M
    role: normalizeRole(optionalStringField(item, "role", id === "D601" ? "primary" : "standby")),
    baseUrl: stringField(item, "baseUrl", path).replace(/\/+$/u, ""),
    healthPath: optionalStringField(item, "healthPath", "/health"),
+    healthMode: normalizeHealthMode(optionalStringField(item, "healthMode", "service-proxy")),
  };
 }

@@ -244,12 +252,12 @@ function readConfig(): RuntimeConfig {
    port: envNumber("PORT", 4266),
    logFile: envString("LOG_FILE", "/var/log/unidesk/v3sctl-adapter.jsonl"),
    manifestPaths: paths,
-    clusterId: envString("V3SCTL_CLUSTER_ID", "D601"),
+    clusterId: envString("V3SCTL_CLUSTER_ID", "unidesk-v8s"),
    nodeId: envString("V3SCTL_NODE_ID", "D601"),
    kubectlEnabled: envBool("V3SCTL_KUBECTL_ENABLED", false),
    kubectlContext: envString("V3SCTL_KUBECTL_CONTEXT", ""),
    kubeApiProxyEnabled: envBool("V3SCTL_KUBE_API_PROXY_ENABLED", true),
-    kubeconfigPath: envString("V3SCTL_KUBECONFIG_PATH", "/var/lib/unidesk/v3s/kubeconfig"),
+    kubeconfigPath: envString("V3SCTL_KUBECONFIG_PATH", "/var/lib/unidesk/v8s/kubeconfig"),
    kubeApiConnectHost: envString("V3SCTL_KUBE_API_CONNECT_HOST", "host.docker.internal"),
    requestTimeoutMs: Math.max(1000, Math.min(120_000, envNumber("V3SCTL_REQUEST_TIMEOUT_MS", 30_000))),
    healthTimeoutMs: Math.max(500, Math.min(30_000, envNumber("V3SCTL_HEALTH_TIMEOUT_MS", 2500))),
@@ -385,6 +393,23 @@ function serviceProxyApiPath(service: ManagedService, targetPath: string): strin
  return `/api/v1/namespaces/${encodeURIComponent(service.namespace)}/services/${encodeURIComponent(`${serviceName}:${servicePort}`)}/proxy${safeTargetPath}`;
 }

+function endpointProxyApiPath(service: ManagedService, endpoint: ManagedEndpoint, targetPath: string): string {
+  const { namespace, serviceRef } = kubernetesEndpointServiceRef(service, endpoint);
+  const safeTargetPath = targetPath.startsWith("/") ? targetPath : `/${targetPath}`;
+  return `/api/v1/namespaces/${encodeURIComponent(namespace)}/services/${encodeURIComponent(serviceRef)}/proxy${safeTargetPath}`;
+}
+
+function kubernetesEndpointServiceRef(service: ManagedService, endpoint: ManagedEndpoint): { namespace: string; serviceRef: string } {
+  const base = new URL(endpoint.baseUrl);
+  if (base.protocol !== "kubernetes:") throw new Error(`endpoint ${endpoint.id} must use kubernetes:// baseUrl`);
+  const namespace = base.hostname || service.namespace;
+  const parts = base.pathname.split("/").filter(Boolean);
+  if (parts.length !== 2 || parts[0] !== "services" || parts[1].length === 0) {
+    throw new Error(`endpoint ${endpoint.id} baseUrl must be kubernetes://<namespace>/services/<service>:<port>`);
+  }
+  return { namespace, serviceRef: parts[1] };
+}
+
 function kubeProxyCurlArgs(client: KubeApiClient, method: string, url: URL, headers: Headers, hasBody: boolean, timeoutMs: number): string[] {
  const args = [
    "-sS",
@@ -431,11 +456,32 @@ async function kubeApiServiceProxyResponse(
  targetPath: string,
  query: string,
  timeoutMs: number,
+): Promise<Response> {
+  return kubeApiProxyResponse(service, req, serviceProxyApiPath(service, targetPath), query, timeoutMs);
+}
+
+async function kubeApiEndpointProxyResponse(
+  service: ManagedService,
+  endpoint: ManagedEndpoint,
+  req: Request,
+  targetPath: string,
+  query: string,
+  timeoutMs: number,
+): Promise<Response> {
+  return kubeApiProxyResponse(service, req, endpointProxyApiPath(service, endpoint, targetPath), query, timeoutMs);
+}
+
+async function kubeApiProxyResponse(
+  service: ManagedService,
+  req: Request,
+  apiPath: string,
+  query: string,
+  timeoutMs: number,
 ): Promise<Response> {
  if (kubeClient === null) {
    return jsonResponse({ ok: false, error: "kubernetes api proxy is not configured", serviceId: service.id, kubeconfigPath: config.kubeconfigPath, noFallback: true }, 502);
  }
-  const upstreamUrl = new URL(serviceProxyApiPath(service, targetPath), kubeClient.serverUrl);
+  const upstreamUrl = new URL(apiPath, kubeClient.serverUrl);
  upstreamUrl.search = query;
  const headers = forwardHeaders(req);
  const bodyText = req.method === "GET" || req.method === "HEAD" ? "" : await req.text();
@@ -455,7 +501,7 @@ async function kubeApiServiceProxyResponse(
    proc.exited,
  ]);
  if (exitCode !== 0) {
-    log("error", "kube_api_proxy_failed", { serviceId: service.id, targetPath, exitCode, stderr: stderr.slice(0, 2000), noFallback: true });
+    log("error", "kube_api_proxy_failed", { serviceId: service.id, apiPath, exitCode, stderr: stderr.slice(0, 2000), noFallback: true });
    return jsonResponse({ ok: false, error: "kubernetes api service proxy failed", serviceId: service.id, detail: stderr.slice(0, 4000), noFallback: true }, 502);
  }
  const parsed = parseCurlHeaderBody(Buffer.from(stdout));
@@ -522,13 +568,27 @@ async function probeEndpoint(endpoint: ManagedEndpoint): Promise<JsonRecord> {

 async function probeKubernetesServiceActive(service: ManagedService): Promise<JsonRecord> {
  const endpoint = activeEndpoint(service);
+  return probeKubernetesEndpoint(service, endpoint, true);
+}
+
+async function probeKubernetesEndpoint(service: ManagedService, endpoint: ManagedEndpoint, active = false): Promise<JsonRecord> {
+  if (!active && endpoint.healthMode === "pod-ready") return await probeKubernetesPodReady(service, endpoint);
  const checkedAt = new Date().toISOString();
-  const response = await kubeApiServiceProxyResponse(
+  const response = active
+    ? await kubeApiServiceProxyResponse(
      service,
      new Request("http://v3sctl-adapter.local/health", { method: "GET", headers: { accept: "application/json" } }),
      endpoint.healthPath,
      "",
      config.healthTimeoutMs,
+    )
+    : await kubeApiEndpointProxyResponse(
+      service,
+      endpoint,
+      new Request("http://v3sctl-adapter.local/health", { method: "GET", headers: { accept: "application/json" } }),
+      endpoint.healthPath,
+      "",
+      config.healthTimeoutMs,
    );
  const contentType = response.headers.get("content-type") ?? "application/octet-stream";
  const bodyText = await response.text();
@@ -544,6 +604,7 @@ async function probeKubernetesServiceActive(service: ManagedService): Promise<Js
    role: endpoint.role,
    baseUrl: endpoint.baseUrl,
    healthPath: endpoint.healthPath,
+    healthMode: endpoint.healthMode,
    proxyMode: "kubernetes-api-service-proxy",
    route: service.route,
    healthy: response.ok,
@@ -555,9 +616,79 @@ async function probeKubernetesServiceActive(service: ManagedService): Promise<Js
  };
 }

+function jsonAtPath(value: unknown, path: string): unknown {
+  return path.split(".").reduce((current, key) => {
+    if (typeof current !== "object" || current === null) return undefined;
+    return (current as Record<string, unknown>)[key];
+  }, value);
+}
+
+function podReady(item: unknown): boolean {
+  const conditions = jsonAtPath(item, "status.conditions");
+  return Array.isArray(conditions) && conditions.some((condition) => {
+    const record = typeof condition === "object" && condition !== null ? condition as Record<string, unknown> : {};
+    return record.type === "Ready" && record.status === "True";
+  });
+}
+
+function podSummary(item: unknown): JsonRecord {
+  const metadata = typeof jsonAtPath(item, "metadata") === "object" && jsonAtPath(item, "metadata") !== null ? jsonAtPath(item, "metadata") as Record<string, unknown> : {};
+  return {
+    name: typeof metadata.name === "string" ? metadata.name : "",
+    nodeName: typeof jsonAtPath(item, "spec.nodeName") === "string" ? jsonAtPath(item, "spec.nodeName") as string : "",
+    phase: typeof jsonAtPath(item, "status.phase") === "string" ? jsonAtPath(item, "status.phase") as string : "",
+    podIP: typeof jsonAtPath(item, "status.podIP") === "string" ? jsonAtPath(item, "status.podIP") as string : "",
+    ready: podReady(item),
+  };
+}
+
+async function probeKubernetesPodReady(service: ManagedService, endpoint: ManagedEndpoint): Promise<JsonRecord> {
+  const checkedAt = new Date().toISOString();
+  const { namespace } = kubernetesEndpointServiceRef(service, endpoint);
+  const labelSelector = new URLSearchParams({
+    labelSelector: `app.kubernetes.io/name=${service.id},unidesk.ai/instance-id=${endpoint.id}`,
+  }).toString();
+  const response = await kubeApiProxyResponse(
+    service,
+    new Request("http://v3sctl-adapter.local/api/pods", { method: "GET", headers: { accept: "application/json" } }),
+    `/api/v1/namespaces/${encodeURIComponent(namespace)}/pods`,
+    `?${labelSelector}`,
+    config.healthTimeoutMs,
+  );
+  const contentType = response.headers.get("content-type") ?? "application/octet-stream";
+  const bodyText = await response.text();
+  let body: JsonValue = bodyText.slice(0, 2000);
+  let pods: JsonRecord[] = [];
+  try {
+    const parsed = JSON.parse(bodyText) as JsonRecord;
+    const items = Array.isArray(parsed.items) ? parsed.items : [];
+    pods = items.map(podSummary);
+    body = { itemCount: items.length, pods };
+  } catch {
+    // Not valid JSON: keep the raw text preview already assigned to `body` above.
+  }
+  const healthy = response.ok && pods.some((pod) => pod.ready === true);
+  return {
+    id: endpoint.id,
+    nodeId: endpoint.nodeId,
+    role: endpoint.role,
+    baseUrl: endpoint.baseUrl,
+    healthPath: endpoint.healthPath,
+    healthMode: endpoint.healthMode,
+    proxyMode: "kubernetes-api-pod-readiness",
+    route: service.route,
+    healthy,
+    status: healthy ? "healthy" : "unhealthy",
+    upstreamStatus: response.status,
+    contentType,
+    checkedAt,
+    body,
+  };
+}
+
 async function serviceStatus(service: ManagedService): Promise<JsonRecord> {
  const instances = isKubernetesServiceRoute(service)
-    ? [await probeKubernetesServiceActive(service)]
+    ? await Promise.all(service.endpoints.map((endpoint) => endpoint.id === service.activeInstanceId ? probeKubernetesServiceActive(service) : probeKubernetesEndpoint(service, endpoint)))
    : [{
      id: service.activeInstanceId,
      nodeId: activeEndpoint(service).nodeId,
@@ -576,7 +707,7 @@ async function serviceStatus(service: ManagedService): Promise<JsonRecord> {
  const activeHealthy = active?.healthy === true;
  const allInstancesHealthy = instances.every((item) => item.healthy === true);
  const expectedNodeIds = service.expectedNodeIds;
-  const presentNodeIds = Array.from(new Set(instances.map((item) => String(item.nodeId))));
+  const presentNodeIds = Array.from(new Set(instances.filter((item) => item.healthy === true).map((item) => String(item.nodeId))));
  const missingNodeIds = expectedNodeIds.filter((nodeId) => !presentNodeIds.includes(nodeId));
  const topologyComplete = missingNodeIds.length === 0;
  const requiredTopologyHealthy = !service.requireAllInstancesHealthy || (topologyComplete && allInstancesHealthy);
@@ -4,7 +4,43 @@ metadata:
  name: unidesk
  labels:
    app.kubernetes.io/part-of: unidesk
-    unidesk.ai/v3s-cluster: unidesk-v3s
+    unidesk.ai/v3s-cluster: unidesk-v8s
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: d601-provider-egress-proxy
+  namespace: unidesk
+  labels:
+    app.kubernetes.io/name: provider-egress-proxy
+    app.kubernetes.io/part-of: unidesk
+    unidesk.ai/provider-id: D601
+spec:
+  type: ClusterIP
+  ports:
+    - name: http
+      port: 18789
+      targetPort: 18789
+      protocol: TCP
+---
+apiVersion: discovery.k8s.io/v1
+kind: EndpointSlice
+metadata:
+  name: d601-provider-egress-proxy
+  namespace: unidesk
+  labels:
+    kubernetes.io/service-name: d601-provider-egress-proxy
+    app.kubernetes.io/name: provider-egress-proxy
+    app.kubernetes.io/part-of: unidesk
+    unidesk.ai/provider-id: D601
+addressType: IPv4
+ports:
+  - name: http
+    protocol: TCP
+    port: 18789
+endpoints:
+  - addresses:
+      - "172.25.0.3"
 ---
 apiVersion: apps/v1
 kind: Deployment
@@ -31,6 +67,8 @@ spec:
        unidesk.ai/instance-id: D601
        unidesk.ai/node-id: D601
    spec:
+      nodeSelector:
+        unidesk.ai/node-id: D601
      terminationGracePeriodSeconds: 30
      containers:
        - name: code-queue
@@ -99,25 +137,25 @@ spec:
            - name: CODE_QUEUE_EGRESS_PROXY_ENABLED
              value: "true"
            - name: CODE_QUEUE_EGRESS_PROXY_URL
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: CODE_QUEUE_EGRESS_PROXY_NO_PROXY
-              value: "localhost,127.0.0.1,::1,host.docker.internal,unidesk-provider-gateway-D601,74.48.78.17,backend-core,oa-event-flow,database"
+              value: "localhost,127.0.0.1,::1,host.docker.internal,d601-provider-egress-proxy,d601-provider-egress-proxy.unidesk,d601-provider-egress-proxy.unidesk.svc,d601-provider-egress-proxy.unidesk.svc.cluster.local,172.25.0.3,unidesk-provider-gateway-D601,74.48.78.17,backend-core,oa-event-flow,database"
            - name: HTTP_PROXY
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: HTTPS_PROXY
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: ALL_PROXY
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: http_proxy
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: https_proxy
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: all_proxy
-              value: "http://host.docker.internal:18789"
+              value: "http://d601-provider-egress-proxy.unidesk.svc.cluster.local:18789"
            - name: NO_PROXY
-              value: "localhost,127.0.0.1,::1,host.docker.internal,unidesk-provider-gateway-D601,74.48.78.17,backend-core,oa-event-flow,database"
+              value: "localhost,127.0.0.1,::1,host.docker.internal,d601-provider-egress-proxy,d601-provider-egress-proxy.unidesk,d601-provider-egress-proxy.unidesk.svc,d601-provider-egress-proxy.unidesk.svc.cluster.local,172.25.0.3,unidesk-provider-gateway-D601,74.48.78.17,backend-core,oa-event-flow,database"
            - name: no_proxy
-              value: "localhost,127.0.0.1,::1,host.docker.internal,unidesk-provider-gateway-D601,74.48.78.17,backend-core,oa-event-flow,database"
+              value: "localhost,127.0.0.1,::1,host.docker.internal,d601-provider-egress-proxy,d601-provider-egress-proxy.unidesk,d601-provider-egress-proxy.unidesk.svc,d601-provider-egress-proxy.unidesk.svc.cluster.local,172.25.0.3,unidesk-provider-gateway-D601,74.48.78.17,backend-core,oa-event-flow,database"
            - name: OA_EVENT_FLOW_BASE_URL
              value: "http://74.48.78.17:4255"
            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_ENABLED
@@ -226,3 +264,228 @@ spec:
    - name: http
      port: 4222
      targetPort: http
+
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: code-queue-d518
+  namespace: unidesk
+  labels:
+    app.kubernetes.io/name: code-queue
+    app.kubernetes.io/part-of: unidesk
+    unidesk.ai/deployment-mode: v3sctl-managed
+    unidesk.ai/instance-id: D518
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app.kubernetes.io/name: code-queue
+      unidesk.ai/instance-id: D518
+  template:
+    metadata:
+      labels:
+        app.kubernetes.io/name: code-queue
+        app.kubernetes.io/part-of: unidesk
+        unidesk.ai/deployment-mode: v3sctl-managed
+        unidesk.ai/instance-id: D518
+        unidesk.ai/node-id: D518
+    spec:
+      nodeSelector:
+        unidesk.ai/node-id: D518
+      terminationGracePeriodSeconds: 30
+      containers:
+        - name: code-queue
+          image: unidesk-code-queue:d601
+          imagePullPolicy: IfNotPresent
+          ports:
+            - name: http
+              containerPort: 4222
+          envFrom:
+            - secretRef:
+                name: code-queue-env
+                optional: true
+          env:
+            - name: HOST
+              value: "0.0.0.0"
+            - name: PORT
+              value: "4222"
+            - name: CODE_QUEUE_INSTANCE_ID
+              value: "D518"
+            - name: CODE_QUEUE_SCHEDULER_ENABLED
+              value: "false"
+            - name: CODE_QUEUE_STARTUP_OA_BACKFILL_ENABLED
+              value: "false"
+            - name: CODE_QUEUE_DATA_DIR
+              value: "/var/lib/unidesk/code-queue"
+            - name: CODE_QUEUE_WORKDIR
+              value: "/workspace"
+            - name: CODE_QUEUE_CODEX_HOME
+              value: "/var/lib/unidesk/code-queue/codex-home"
+            - name: CODE_QUEUE_OPENCODE_XDG_DIR
+              value: "/var/lib/unidesk/code-queue/opencode-xdg"
+            - name: CODE_QUEUE_SOURCE_CODEX_CONFIG
+              value: "/root/.codex/config.toml"
+            - name: CODE_QUEUE_DEFAULT_MODEL
+              value: "gpt-5.5"
+            - name: CODE_QUEUE_MODELS
+              value: "gpt-5.5,gpt-5.4-mini,gpt-5.4,minimax-m2.7"
+            - name: CODE_QUEUE_MODEL_REASONING_EFFORTS
+              value: "gpt-5.5=xhigh"
+            - name: CODE_QUEUE_SANDBOX
+              value: "danger-full-access"
+            - name: CODE_QUEUE_APPROVAL_POLICY
+              value: "never"
+            - name: CODE_QUEUE_MAX_ACTIVE_QUEUES
+              value: "0"
+            - name: CODE_QUEUE_DATABASE_POOL_MAX
+              value: "2"
+            - name: NODE_OPTIONS
+              value: "--max-old-space-size=1024"
+            - name: CODE_QUEUE_IN_MEMORY_OUTPUT_RECORDS
+              value: "10"
+            - name: CODE_QUEUE_IN_MEMORY_EVENT_RECORDS
+              value: "10"
+            - name: CODE_QUEUE_MAIN_PROVIDER_ID
+              value: "D518"
+            - name: CODE_QUEUE_REMOTE_WORKDIR
+              value: "/home/ubuntu"
+            - name: CODE_QUEUE_EXECUTION_PROVIDER_IDS
+              value: "D518"
+            - name: CODE_QUEUE_DEV_CONTAINER_MASTER_HOST
+              value: "74.48.78.17"
+            - name: CODE_QUEUE_DEV_CONTAINER_DEFAULT_PROVIDER_ID
+              value: "D518"
+            - name: CODE_QUEUE_DEV_CONTAINER_WORKDIR
+              value: "/home/ubuntu"
+            - name: CODE_QUEUE_EGRESS_PROXY_ENABLED
+              value: "false"
+            - name: CODE_QUEUE_EGRESS_PROXY_URL
+              value: ""
+            - name: CODE_QUEUE_EGRESS_PROXY_NO_PROXY
+              value: "localhost,127.0.0.1,::1,host.docker.internal,74.48.78.17,backend-core,oa-event-flow,database"
+            - name: HTTP_PROXY
+              value: ""
+            - name: HTTPS_PROXY
+              value: ""
+            - name: ALL_PROXY
+              value: ""
+            - name: http_proxy
+              value: ""
+            - name: https_proxy
+              value: ""
+            - name: all_proxy
+              value: ""
+            - name: NO_PROXY
+              value: "localhost,127.0.0.1,::1,host.docker.internal,74.48.78.17,backend-core,oa-event-flow,database"
+            - name: no_proxy
+              value: "localhost,127.0.0.1,::1,host.docker.internal,74.48.78.17,backend-core,oa-event-flow,database"
+            - name: OA_EVENT_FLOW_BASE_URL
+              value: "http://74.48.78.17:4255"
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_ENABLED
+              value: "false"
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_BASE_URL
+              value: ""
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_TARGET_TYPE
+              value: "private"
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_USER_ID
+              value: "645275593"
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_MAX_RESPONSE_CHARS
+              value: "12000"
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_TIMEOUT_MS
+              value: "15000"
+            - name: CODE_QUEUE_NOTIFY_CLAUDEQQ_SEND_ATTEMPTS
+              value: "3"
+            - name: LOG_FILE
+              value: "/var/log/unidesk/code-queue-d518.jsonl"
+            - name: UNIDESK_LOG_RETENTION_BYTES
+              value: "1GiB"
+          volumeMounts:
+            - name: docker-sock
+              mountPath: /var/run/docker.sock
+            - name: workspace
+              mountPath: /workspace
+            - name: workspace
+              mountPath: /root/unidesk
+            - name: codex-config
+              mountPath: /root/.codex/config.toml
+              readOnly: true
+            - name: ssh-dir
+              mountPath: /root/.ssh
+              readOnly: true
+            - name: logs
+              mountPath: /var/log/unidesk
+            - name: state
+              mountPath: /var/lib/unidesk/code-queue
+          readinessProbe:
+            httpGet:
+              path: /health
+              port: http
+            periodSeconds: 5
+            timeoutSeconds: 3
+            failureThreshold: 20
+          livenessProbe:
+            httpGet:
+              path: /health
+              port: http
+            periodSeconds: 10
+            timeoutSeconds: 3
+            failureThreshold: 6
+          startupProbe:
+            httpGet:
+              path: /health
+              port: http
+            periodSeconds: 5
+            timeoutSeconds: 3
+            failureThreshold: 60
+          resources:
+            requests:
+              cpu: 250m
+              memory: 512Mi
+            limits:
+              memory: 4Gi
+      volumes:
+        - name: docker-sock
+          hostPath:
+            path: /var/run/docker.sock
+            type: Socket
+        - name: workspace
+          hostPath:
+            path: /home/ubuntu/cq-deploy
+            type: Directory
+        - name: codex-config
+          hostPath:
+            path: /home/ubuntu/.codex/config.toml
+            type: File
+        - name: ssh-dir
+          hostPath:
+            path: /home/ubuntu/.ssh
+            type: Directory
+        - name: logs
+          hostPath:
+            path: /home/ubuntu/cq-deploy/.state/code-queue/logs
+            type: DirectoryOrCreate
+        - name: state
+          hostPath:
+            path: /home/ubuntu/cq-deploy/.state/code-queue
+            type: DirectoryOrCreate
+---
+apiVersion: v1
+kind: Service
+metadata:
+  name: code-queue-d518
+  namespace: unidesk
+  labels:
+    app.kubernetes.io/name: code-queue
+    app.kubernetes.io/part-of: unidesk
+    unidesk.ai/deployment-mode: v3sctl-managed
+    unidesk.ai/instance-id: D518
+spec:
+  type: ClusterIP
+  selector:
+    app.kubernetes.io/name: code-queue
+    unidesk.ai/instance-id: D518
+  ports:
+    - name: http
+      port: 4222
+      targetPort: http
@@ -9,8 +9,8 @@
    "adapterServiceId": "v3sctl-adapter",
    "controlPlane": {
      "type": "kubernetes",
-      "cluster": "unidesk-v3s",
-      "context": "kind-unidesk-v3s"
+      "cluster": "unidesk-v8s",
+      "context": "unidesk-v8s"
    },
    "route": {
      "kind": "kubernetes-service",
@@ -29,7 +29,16 @@
        "nodeId": "D601",
        "role": "primary",
        "baseUrl": "kubernetes://unidesk/services/code-queue:4222",
-        "healthPath": "/health"
+        "healthPath": "/health",
+        "healthMode": "service-proxy"
+      },
+      {
+        "id": "D518",
+        "nodeId": "D518",
+        "role": "standby",
+        "baseUrl": "kubernetes://unidesk/services/code-queue-d518:4222",
+        "healthPath": "/health",
+        "healthMode": "pod-ready"
      }
    ],
    "requireAllInstancesHealthy": false