fix(code-queue): add issue 3 gateway diagnostics
This commit is contained in:
@@ -163,6 +163,10 @@ The reconciler selects the executor from `config.json`:
|
||||
- Control bridges that UniDesk needs in order to inspect or repair an orchestrator must stay in this direct class. In particular, `k3sctl-adapter` is a UniDesk-managed bridge to native k3s and must remain outside k3s; Docker packaging on Docker Desktop/WSL must create an explicit host-local bridge, currently an adapter-container SSH local tunnel, to reach `/etc/rancher/k3s/k3s.yaml` and WSL `127.0.0.1:6443`.
|
||||
- `deployment.mode=k3sctl-managed`: the target behavior is to build on the active control target, verify native k3s on the host OS/WSL distro, import the image into native k3s/containerd, apply the existing Kubernetes manifest, stamp the Deployment and wait for rollout. On D601, persistent dev apply is currently allowed only for `backend-core` and `frontend` in `unidesk-dev`; normal production services still cannot use a maintenance-channel direct rollout. The executor must use the native kubeconfig and containerd socket, for example `/etc/rancher/k3s/k3s.yaml` and `/run/k3s/containerd/containerd.sock`; running k3s itself in Docker is forbidden for both control-plane and worker nodes. A `rancher/k3s` image or legacy container may only be used as a temporary artifact source during migration, and any active containerized k3s control plane must be stopped before verification succeeds. The executor must preload a valid `rancher/mirrored-pause:3.6` sandbox image into native k3s containerd through the provider-gateway one-shot egress path, verify its entrypoint is `/pause`, and reject fake or sleep-based replacement images. Code Queue's k3s migration executor must also stop/remove the legacy direct Docker `code-queue-backend` after k3s rollout, so there is never a second scheduler running beside the native k3s scheduler.
|
||||
|
||||
D601 Docker local images are not the source of truth for k3s runtime availability. For Code Queue, the deploy gate must verify `unidesk-code-queue:d601` exists in native k3s containerd after import with `ctr --address /run/k3s/containerd/containerd.sock -n k8s.io images ls`, and it must fail before rollout if the tag is missing. The same gate must verify every production Code Queue Deployment that uses the image (`code-queue`, `code-queue-read`, `code-queue-write`, `d601-provider-egress-proxy`, `d601-tcp-egress-gateway`) still references exactly `unidesk-code-queue:d601`; otherwise kubelet may attempt an external registry pull and leave base gateways in `ImagePullBackOff`.
|
||||
|
||||
Code Queue health and diagnostics must cover its k3s dependencies, not only scheduler HTTP health. `bun scripts/cli.ts microservice diagnostics code-queue` and the `/health` aggregation must mark the service degraded/failing when `d601-provider-egress-proxy` or `d601-tcp-egress-gateway` Deployment availability or Endpoint readiness is missing, when the scheduler reports `storage.lastError` or PostgreSQL route failure through `d601-tcp-egress-gateway.unidesk.svc.cluster.local:15432`, or when stale active/retry_wait reconcile reports recoverable active tasks without a local run.
|
||||
|
||||
Existing service-specific commands such as Code Queue deploy are disabled as direct D601 deploy paths. Their build/import/rollout semantics should converge later into one controlled target-side deployment path instead of keeping parallel implementations.
|
||||
|
||||
Decision Center is a standard `k3sctl-managed` service in this model, but D601 maintenance-channel direct apply must not deploy it. Future controlled CD for Decision Center should build `src/components/microservices/decision-center/Dockerfile` on D601, import `unidesk-decision-center:d601` into native k3s containerd, apply `src/components/microservices/k3sctl-adapter/k3s/decision-center.k8s.yaml`, stamp the Deployment, and verify health through `/api/microservices/decision-center/health`. It must not add a main-server Compose service, NodePort, hostPort, or provider-gateway direct HTTP backend for Decision Center.
|
||||
|
||||
Reference in New Issue
Block a user