Codex 021a9eef01 feat: add artifact publish preflight

2026-05-20 21:49:50 +00:00

Desired Deploy Reconciler

UniDesk deployment is driven by a desired-state manifest. The manifest answers only one question: which service should run which repository commit. Runtime topology, ports, providers, compose files, Kubernetes manifests, health paths and proxy policy remain in config.json and the existing service manifests.

Persistent D601 dev environment rules, including the public dev frontend port, deploy apply --env dev service scope and Rust backend-core build boundary, are owned by docs/reference/dev-environment.md. Release-line governance, CI/CD runtime pinning and the release/v1 transition policy are owned by docs/reference/release-governance.md and GitHub issue #6. This document owns the generic desired-state reconciler and target-side build contract.

Manifest

The root deploy.json is the single desired-state source for both prod and dev. Environment branches such as deploy/dev and deploy/prod are deprecated because they create a second control plane for version intent.

{
  "schemaVersion": 2,
  "environments": {
    "prod": {
      "services": [
        {
          "id": "code-queue",
          "repo": "https://github.com/pikasTech/unidesk",
          "commitId": "0c3cdb4ee06a23361ed511a2da033d67b53d16f4"
        }
      ]
    },
    "dev": {
      "services": [
        {
          "id": "backend-core",
          "repo": "https://github.com/pikasTech/unidesk",
          "commitId": "348c644"
        }
      ]
    }
  }
}

schemaVersion=1 remains accepted only as a local compatibility format. Standard environment commands use schemaVersion=2 and select environments.dev.services or environments.prod.services.

deploy.json service entries must not contain provider IDs, ports, compose service names, Kubernetes namespace, health paths, environment variables, Dockerfile paths or build commands. The deploy reconciler joins each service id with config.json.microservices[] and existing k3s manifests to resolve those details. A service listed in deploy.json but missing from config.json is an error. A service with no Dockerfile source artifact is reported as unsupported rather than silently skipped. commitId may be a unique pushed short SHA or a full SHA; every deploy command resolves it through the remote repository to a full 40-character commit before target-side build or rollout, and fails immediately if the SHA is missing or ambiguous.
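The entry rules above can be checked mechanically before any remote lookup. The following is a minimal sketch, not the reconciler's actual validator: it assumes that `id`, `repo` and `commitId` (the keys shown in the example manifest) are the only permitted keys, and it assumes a 7-character lower bound for short SHAs. The function name `validateServiceEntry` is hypothetical.

```typescript
// Keys a deploy.json service entry may carry (assumption: exactly the keys in the example manifest).
const ALLOWED_SERVICE_KEYS: readonly string[] = ["id", "repo", "commitId"];

// Short or full lowercase hex SHA. The 7-character minimum is an assumption of this sketch.
const COMMIT_ID_PATTERN = /^[0-9a-f]{7,40}$/;

// Returns a list of human-readable problems; an empty list means the entry passed.
function validateServiceEntry(entry: Record<string, unknown>): string[] {
  const errors: string[] = [];

  // Runtime topology (ports, namespaces, Dockerfile paths, ...) must stay in config.json.
  for (const key of Object.keys(entry)) {
    if (ALLOWED_SERVICE_KEYS.indexOf(key) === -1) {
      errors.push(`unexpected key "${key}": runtime detail belongs in config.json`);
    }
  }

  // Every allowed key is required and must be a non-empty string.
  for (const key of ALLOWED_SERVICE_KEYS) {
    const value = entry[key];
    if (typeof value !== "string" || value.length === 0) {
      errors.push(`missing or non-string "${key}"`);
    }
  }

  // Syntax check only; uniqueness is resolved later against the remote repository.
  const commitId = entry["commitId"];
  if (typeof commitId === "string" && commitId.length > 0 && !COMMIT_ID_PATTERN.test(commitId)) {
    errors.push(`commitId "${commitId}" is not a 7-40 character lowercase hex SHA`);
  }

  return errors;
}
```

A syntactically valid short SHA can still be missing or ambiguous on the remote, so this check complements the remote resolution step rather than replacing it.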

The optional non-service execution declaration under environments.dev is intentionally not specified here. The only currently allowed declaration is ci, and its authoritative repo, scriptPath, timeoutMs, short launcher, host fetch boundary and no-CD rules are defined only in docs/reference/dev-ci-runner.md.

Environment mode never reads the local dirty working tree manifest. deploy check --env ..., deploy plan --env ... and deploy apply --env ... fetch origin/master, read origin/master:deploy.json, select environments.<env>, and report the manifest commit/blob, service commit IDs, target namespace, database fingerprint and Provider identity. deploy apply --env dev is currently enabled for persistent D601 dev backend-core target-side rollout and for reviewed artifact consumers frontend, baidu-netdisk, decision-center, mdtodo, claudeqq, dev-only code-queue, project-manager, oa-event-flow, code-queue-mgr, todo-note, findjob, pipeline and met-nonlinear. deploy apply --env prod exposes reviewed registry artifact consumers (backend-core, frontend, baidu-netdisk, decision-center, mdtodo, claudeqq, project-manager, oa-event-flow, todo-note, findjob, pipeline and met-nonlinear), while code-queue must report unsupported, code-queue-mgr remains supervisor-gated and k3sctl-adapter is plan/dry-run only. Production backend-core artifact CD is a separate executor because its build target is D601 CI while its runtime target is the master server. The default user-service delivery policy, including CI build, registry publication, dev validation, production CD and manual acceptance, is documented in docs/reference/user-service-delivery.md.

--commit <full-sha> is allowed only with --env dev|prod --service <id> for reviewed artifact consumers. It overrides the selected service commit for that one artifact consumer while still using the Git-backed environment manifest for target, namespace, repo, deploy ref and guardrails. It is the supported temporary shape for release-line frontend validation and rollback when the artifact was produced from a pushed release/v1 commit but origin/master:deploy.json#environments.<env>.services.frontend.commitId has not been repinned. It must not be used for local-manifest mode, multi-service apply, or target-side source-build services such as dev backend-core.
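As an illustration of the override rules, the sketch below rejects the disallowed shapes before any deploy work starts. It is a sketch under stated assumptions: the `ApplyArgs` shape and the function names are hypothetical, and the check that the service is a reviewed artifact consumer is deliberately left out because that list lives in the CLI.

```typescript
// Hypothetical parsed form of the deploy apply flags.
interface ApplyArgs {
  env?: string;        // value of --env, if given
  file?: string;       // value of --file, if given (local-manifest mode)
  services: string[];  // every --service value
  commit?: string;     // value of --commit, if given
}

const FULL_SHA = /^[0-9a-f]{40}$/;

// The only target-side source-build service named in the text is dev backend-core.
function isTargetSideSourceBuild(env: string, service: string): boolean {
  return env === "dev" && service === "backend-core";
}

// Returns null when the override is acceptable, otherwise the reason it is rejected.
function checkCommitOverride(args: ApplyArgs): string | null {
  if (args.commit === undefined) return null; // no override requested
  if (args.file !== undefined) return "--commit is not allowed in local-manifest mode";
  if (args.env !== "dev" && args.env !== "prod") return "--commit requires --env dev|prod";
  const service = args.services[0];
  if (args.services.length !== 1 || service === undefined) {
    return "--commit requires exactly one --service";
  }
  if (!FULL_SHA.test(args.commit)) return "--commit requires a full 40-character SHA";
  if (isTargetSideSourceBuild(args.env, service)) {
    return "--commit is not allowed for target-side source-build services";
  }
  return null;
}
```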

For services with reviewed production artifact consumers, local-manifest deploy apply --file ... is not a production fallback. The CLI blocks backend-core, frontend, baidu-netdisk, decision-center, mdtodo, claudeqq and other reviewed pull-only consumers before source materialization or Docker build and directs operators to deploy apply --env prod --service <id> --commit <full-sha>. This prevents a dirty worktree, local manifest or target-side source build from bypassing the pull-only artifact CD guardrails. The broader precheck and legacy-path classification live in docs/reference/cicd-standardization.md.

The current implementation has not yet enabled separate stable and integration dev lanes. Future lane names such as dev-v1 and dev-master, or an equivalent nested schema, must be added as explicit deploy.json and CLI semantics before use. A deploy command must print the manifest ref it used and must not infer release/v1 from a local branch, a dirty file, or an undocumented environment alias. For frontend-only release-line validation, the current bridge is an explicit commit override on the reviewed artifact consumer: first publish the artifact with ci publish-user-service --service frontend --commit <release-v1-full-sha>, then run deploy apply --env dev --service frontend --commit <release-v1-full-sha>; production uses the same shape with --env prod only after dev evidence is accepted.

CI/CD server and control-plane services are normal deployable services for versioning purposes: production runtime must be pinned by deploy.json to a known commit. A CLI built from master may orchestrate the pinned server only through backward-compatible APIs and server-reported capabilities; it must not bypass server-side deploy policy when the pinned server does not support a requested operation.

The only D601 direct-service exception in local manifest mode is k3sctl-adapter, because it is the UniDesk-managed control bridge outside the k3s fault domain and owns the Kubernetes service catalog used by the dev public frontend path. Its artifact consumer path is plan/dry-run only and never performs real prod deployment without supervisor confirmation. D601 Code Queue, Decision Center, MDTODO, ClaudeQQ and future k3s-managed workloads remain blocked from maintenance-channel direct deploy.

config.json.microservices[].repository.commitId is retained for catalog compatibility, but deploy.json is the deployment version authority for the reconciler.

Dev CI Runner

Dev desired-state smoke verification is not a deploy executor. Use bun scripts/cli.ts ci run-dev-e2e for the Git-controlled temporary namespace runner described in docs/reference/dev-ci-runner.md; that command must not roll out persistent D601 services.

Persistent dev backend-core and frontend rollout is separate from that smoke runner and is described in docs/reference/dev-environment.md.

D601 Dev Foundation

Phase 2 of the D601 dev environment creates only the isolated namespace and database foundation. The authoritative manifest is src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-foundation.k8s.yaml.

It may create resources only in unidesk-dev:

Namespace unidesk-dev, plus quota and default limits.
Secret unidesk-dev-runtime-secrets as a dev-only template for DB credentials, provider token, auth/session secret, and Code Queue model secret placeholders. The frontend auth/session values are placeholders in this manifest; the controlled dev frontend deploy path syncs them from main-server config.json.auth so dev and production use the same login identity and session signer.
ConfigMap unidesk-dev-runtime-config for dev identity, desired-state source origin/master:deploy.json#environments.dev, provider id D601-dev, Code Queue dev paths, and non-secret runtime defaults. SESSION_TTL_SECONDS follows the same main-server auth config when frontend is deployed.
ConfigMap unidesk-dev-db-guard with an executable guard script that rejects production-looking DATABASE_URL values.
StatefulSet/Service postgres-dev with a 5Gi persistent volume claim and bounded CPU/memory requests/limits.
Job unidesk-dev-db-migrate, which waits for postgres-dev, runs the guard, then prepares backend-core and Code Queue tables in the independent unidesk_dev database.
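The guard's intent can be illustrated with a small allow-list check. This is a sketch, not the script shipped in unidesk-dev-db-guard: the real guard is an executable script in a ConfigMap, and this version assumes the stricter rule that only the postgres-dev host and the unidesk_dev database are acceptable. The function name `isDevDatabaseUrl` is hypothetical.

```typescript
// Accepts a PostgreSQL URL only when it points at the dev StatefulSet and the dev database.
// Assumption: the host is either "postgres-dev" or a DNS name that starts with "postgres-dev.".
function isDevDatabaseUrl(databaseUrl: string): boolean {
  // scheme://[userinfo@]host[:port]/database[?params]
  const match = /^postgres(?:ql)?:\/\/(?:[^@\/]*@)?([^:\/?#]+)(?::\d+)?\/([^?#]+)/.exec(databaseUrl);
  if (match === null) return false; // unparseable values are rejected, not guessed at

  const host = match[1] ?? "";
  const database = match[2] ?? "";

  const hostIsDev = host === "postgres-dev" || host.indexOf("postgres-dev.") === 0;
  return hostIsDev && database === "unidesk_dev";
}
```

An allow-list is used here because rejecting everything that is not provably dev fails closed, whereas a deny-list of production-looking patterns can miss a new production route.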

The manifest must not create, update, or delete production namespace resources, production DB objects, production PVCs, production Deployments/Services/Secrets, or main server Docker Compose services. Static validation is available through bun scripts/cli.ts dev-env validate; Kubernetes client dry-run is bun scripts/cli.ts dev-env validate --kubectl-dry-run. If applying manually during Phase 2, the only allowed apply target is this manifest and the post-check must prove production resources are unchanged, for example by comparing kubectl -n unidesk get deploy,sts,svc,secret,pvc -o name before and after.

Before applying the foundation on a fresh D601 native k3s runtime, run bun scripts/cli.ts dev-env prewarm-images and wait for the returned job to succeed. This imports the foundation images postgres:16-alpine and rancher/mirrored-library-busybox:1.36.1 from Docker into /run/k3s/containerd/containerd.sock; k3s/containerd must not depend on live Docker Hub pulls during rollout. If this step is skipped, postgres-dev or the local-path helper pod can remain ImagePullBackOff, leaving the PVC pending even though the manifest is valid.

Phase 2 guardrails are deliberately limited to the dev manifest and CLI validator. Runtime startup guards for dev backend-core, Code Queue and Code Queue Manager must be reviewed and shipped as a separate change before dev workloads are exposed beyond dry-run or controlled apply.

On D601, dev/prod k3s verification must use the native k3s kubeconfig explicitly: KUBECONFIG=/etc/rancher/k3s/k3s.yaml. The default kubectl context may point at Docker Desktop and is not an acceptable target for UniDesk k3s deploy validation.

D601 Dev Core

Phase 3 introduces the dev backend/frontend manifest at src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-core.k8s.yaml. It may create only backend-core-dev and frontend-dev Deployment/Service objects in unidesk-dev; the persistent rollout contract and Rust build boundary are owned by docs/reference/dev-environment.md.

backend-core-dev must use unidesk-dev-runtime-config and unidesk-dev-runtime-secrets, connect to postgres-dev.../unidesk_dev, expose HTTP on 8080 and provider ingress on 8081, and write logs under /var/log/unidesk-dev. frontend-dev must set CORE_INTERNAL_URL=http://backend-core-dev.unidesk-dev.svc.cluster.local:8080 and must not proxy to production backend-core.

The manifest keeps placeholder image tags and deploy commit values in source control. The controlled deploy apply --env dev --service backend-core path fetches origin/master:deploy.json, materializes the requested source commit on D601, narrows the dev core control manifest to the selected Service/Deployment pair, replaces placeholders with the requested commit and dev image tag, builds on D601, imports the image into native k3s containerd, applies only the unidesk-dev objects and stamps the Deployment. deploy apply --env dev --service frontend uses the same selected dev manifest objects but consumes the existing D601 registry artifact 127.0.0.1:5000/unidesk/frontend:<commit> instead of building frontend source on the target. Decision Center, MDTODO and ClaudeQQ use the same dev namespace but follow the D601 registry artifact consumer path instead of a source build: each verifies the commit-pinned image in D601 registry, imports it into native k3s containerd, applies its dev manifest, stamps the Deployment and verifies live commit/requestedCommit through the Kubernetes API service proxy. project-manager, oa-event-flow, code-queue-mgr, todo-note, findjob, pipeline and met-nonlinear consume existing D601 registry artifacts for direct Docker/Compose validation rather than separate parallel k3s dev instances; code-queue-mgr live prod apply remains supervisor-gated. Client dry-run and static validation remain useful checks before controlled apply:

bun scripts/cli.ts dev-env validate --manifest src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-core.k8s.yaml
KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl apply --dry-run=client --validate=false -f src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-core.k8s.yaml

backend-core and frontend keep their production health payload shape by default. They add environment, namespace, databaseName, serviceId, deployRef and deploy commit metadata only when UNIDESK_ENV=dev or UNIDESK_NAMESPACE=unidesk-dev is set. The frontend shell shows a visible DEV ribbon only under the same dev identity.
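The conditional health metadata can be pictured as follows. This is a minimal sketch: the two environment variable names and the metadata field names come from the paragraph above, while the base payload shape, the `DevMetadata` interface and the function names are assumptions for illustration. Deploy commit metadata is omitted because its exact field layout is not specified here.

```typescript
// The two variables that establish dev identity, per the text.
interface DevIdentityEnv {
  UNIDESK_ENV?: string;
  UNIDESK_NAMESPACE?: string;
}

function isDevIdentity(env: DevIdentityEnv): boolean {
  return env.UNIDESK_ENV === "dev" || env.UNIDESK_NAMESPACE === "unidesk-dev";
}

// Extra fields reported only under dev identity (flat layout is an assumption).
interface DevMetadata {
  environment: string;
  namespace: string;
  databaseName: string;
  serviceId: string;
  deployRef: string;
}

// Returns the production-shaped payload untouched unless dev identity is set.
function healthPayload<T extends object>(base: T, env: DevIdentityEnv, dev: DevMetadata): T | (T & DevMetadata) {
  return isDevIdentity(env) ? { ...base, ...dev } : base;
}
```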

D601 Dev Code Queue

Phase 5 introduces the dev Code Queue execution manifest at src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-code-queue.k8s.yaml. It may create only Code Queue dev execution objects in unidesk-dev: code-queue-scheduler-dev, code-queue-read-dev, code-queue-write-dev and the supporting d601-dev-provider-egress-proxy.

All dev Code Queue components must use unidesk-dev-runtime-config and unidesk-dev-runtime-secrets, connect to postgres-dev.../unidesk_dev, write logs and state under /home/ubuntu/unidesk-dev-code-queue-deploy/state, and expose HTTP on 4222 only as ClusterIP services. The scheduler uses CODE_QUEUE_MAIN_PROVIDER_ID=D601-dev, CODE_QUEUE_WORKDIR=/workspace-dev, CODE_QUEUE_REMOTE_WORKDIR=/home/ubuntu/unidesk-dev-workspace, disables ClaudeQQ notifications by default, and does not use the production d601-tcp-egress-gateway or production PostgreSQL route.

Maintenance-channel direct D601 apply must not deploy dev Code Queue and the old codex deploy compatibility entry remains disabled. Dev Code Queue deployment is allowed only as the D601 registry artifact consumer for deploy apply --env dev --service code-queue or the equivalent artifact-registry deploy-service --env dev --service code-queue: it verifies the existing 127.0.0.1:5000/unidesk/code-queue:<commit> artifact, imports it into native k3s containerd, applies only the unidesk-dev Code Queue manifest, stamps code-queue-scheduler-dev, code-queue-read-dev, code-queue-write-dev and d601-dev-provider-egress-proxy, and verifies the scheduler Service through the Kubernetes API service proxy. deploy apply --env prod --service code-queue and artifact-registry deploy-service --env prod --service code-queue must return explicit unsupported output and must not mutate production Code Queue manifests, Deployments or rollouts. The scheduler has an explicit 5Gi memory limit and must use Recreate rollout strategy so an update does not temporarily require two scheduler replicas under the namespace quota. All dev Code Queue containers must set CPU limits so the namespace LimitRange does not inject a quota-breaking default CPU limit. Live health verification uses the Kubernetes API service proxy for the dev ClusterIP Service, not kubectl exec or debug binaries inside the application image. This dev execution slice proves artifact deployability, health and dev database isolation; wiring the dev frontend stable code-queue route through a dev code-queue-mgr is a separate later phase.

Production code-queue-mgr is a separate main-server Compose sidecar artifact consumer. deploy apply --env prod --service code-queue-mgr --dry-run may plan only the code-queue-mgr Compose service/container and must surface that D601 Code Queue scheduler/runner, queued tasks, interrupts and cancellations are excluded targets. Non-dry-run production apply for this sidecar remains supervisor-gated even when the artifact exists.

CLI

bun scripts/cli.ts deploy check [--file deploy.json] [--service <id>] checks the live runtime against the desired repo and commit without changing the system.

bun scripts/cli.ts deploy plan [--file deploy.json] [--service <id>] prints the same live state plus the intended action: noop, deploy or unsupported.
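The three plan actions reduce to a small decision function. The sketch below is illustrative only: the `LiveState` shape and the function name are hypothetical, and the `force` parameter models the --force flag of deploy apply, which rebuilds even when the live commit matches.

```typescript
type PlanAction = "noop" | "deploy" | "unsupported";

// Hypothetical summary of what deploy check observed for one service.
interface LiveState {
  supported: boolean;        // false when no supported executor or Dockerfile source exists
  liveCommit: string | null; // full SHA reported by the running service, if any
}

// desiredFullSha is the manifest commitId after resolution to 40 characters.
function planAction(desiredFullSha: string, live: LiveState, force: boolean = false): PlanAction {
  if (!live.supported) return "unsupported"; // reported explicitly, never silently skipped
  if (!force && live.liveCommit === desiredFullSha) return "noop";
  return "deploy";
}
```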

bun scripts/cli.ts deploy plan --env dev [--service <id>] reads origin/master:deploy.json#environments.dev and prints a dry-run environment plan without checking or mutating live runtime resources. deploy check --env dev uses the same dry-run environment plan. --env prod is available for parity as a dry-run planning path; it reads origin/master:deploy.json#environments.prod and must not use a dirty local deploy.json.

Environment plan output must be sufficient to review the artifact matrix without running a live apply. Each service item includes deploymentPath, artifactConsumer.consumerKind, artifactConsumer.registryImage, artifactConsumer.noRuntimeSourceBuild, artifactConsumer.dryRunOnly, target, validation and liveApply where relevant. consumerKind=d601-direct-compose means the reviewed consumer touches only the D601 Docker/Compose service and private health path; consumerKind=d601-k3s-managed means the reviewed consumer imports the artifact into native k3s/containerd and verifies through the Kubernetes API service proxy; consumerKind=main-server-compose means the reviewed consumer streams or loads the D601 artifact into the main-server Compose service; consumerKind=d601-dev-target-side-build is reserved for the controlled dev backend-core source-build exception. Artifact consumer plan items must explicitly report noRuntimeSourceBuild=true and list forbidden build/public exposure actions. Blocked or gated services must keep structured dryRunOnly / blockedReason output, for example met-nonlinear runtime-verification-blocked and k3sctl-adapter supervisor-only production apply.
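For reviewers who consume the plan JSON, the fields named above can be modelled as types. This is a sketch, assuming `blockedReason` sits beside `dryRunOnly` inside `artifactConsumer` and that each item carries a `serviceId`; the real payload may nest these differently. The function `reviewPlanItem` is hypothetical.

```typescript
type ConsumerKind =
  | "d601-direct-compose"
  | "d601-k3s-managed"
  | "main-server-compose"
  | "d601-dev-target-side-build";

interface ArtifactConsumerPlan {
  consumerKind: ConsumerKind;
  registryImage: string;
  noRuntimeSourceBuild: boolean;
  dryRunOnly: boolean;
  blockedReason?: string; // placement is an assumption of this sketch
}

interface PlanItem {
  serviceId: string; // assumed field name
  deploymentPath: string;
  artifactConsumer: ArtifactConsumerPlan;
}

// Flags plan items that contradict the documented contract.
function reviewPlanItem(item: PlanItem): string[] {
  const problems: string[] = [];
  const consumer = item.artifactConsumer;

  // Only the controlled dev backend-core exception may build from source at runtime.
  const isSourceBuildException = consumer.consumerKind === "d601-dev-target-side-build";
  if (!isSourceBuildException && !consumer.noRuntimeSourceBuild) {
    problems.push(`${item.serviceId}: artifact consumer must report noRuntimeSourceBuild=true`);
  }

  // Interpretation: a dry-run-only item should say why it is blocked or gated.
  if (consumer.dryRunOnly && (consumer.blockedReason === undefined || consumer.blockedReason === "")) {
    problems.push(`${item.serviceId}: dryRunOnly without blockedReason`);
  }
  return problems;
}
```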

bun scripts/cli.ts deploy apply [--file deploy.json | --env dev|prod] [--service <id>] [--commit <full-sha>] [--dry-run] [--force] starts an asynchronous job only for supported targets. Use bun scripts/cli.ts job status <jobId> --tail-bytes 30000 to observe progress. --dry-run resolves the same plan but does not build or replace runtime objects. --force rebuilds even when the live commit matches. Environment apply is not the dev e2e trigger; use bun scripts/cli.ts ci run-dev-e2e for the Git-controlled temporary namespace smoke flow. --env dev apply is enabled for persistent D601 backend-core target-side rollout and for frontend/baidu-netdisk/decision-center/mdtodo/claudeqq/dev-only code-queue/project-manager/oa-event-flow/code-queue-mgr/todo-note/findjob/pipeline/met-nonlinear artifact consumers. --env prod apply exposes the D601 registry artifact consumer for backend-core, frontend, baidu-netdisk, decision-center, mdtodo, claudeqq, project-manager, oa-event-flow, todo-note, findjob, pipeline and met-nonlinear; code-queue-mgr prod live apply is supervisor-gated and k3sctl-adapter is plan/dry-run only. --commit may override one selected reviewed artifact consumer in either dev or prod, for example deploy apply --env dev --service frontend --commit <release-v1-full-sha>, and the image must already exist as 127.0.0.1:5000/unidesk/<service-id>:<commit>. Unsupported prod services, especially code-queue, return a structured unsupported payload instead of silently falling back to a maintenance-channel source build.

All deploy commands output JSON. Long operations must use .state/jobs/ and bounded log tails; no deploy path may succeed with missing progress output.

Target-Side Build

Target-side build is the standard deployment mode. The controller may run on the main server, but source materialization, compile/build, Docker image creation and deployment normally happen on the target node that will run the service.

Main server services are fetched, built and deployed on the main server.
D601 services are fetched, built and deployed on D601.
D518 services are fetched, built and deployed on D518.
k3s-managed services are built on the active control target and then imported into that target's Kubernetes container runtime.

The reconciler distributes only repository URL, commit ID, Dockerfile path, build context and the existing deployment manifest/compose declaration. It must not distribute large Docker images between hosts as the default path, and it must not accept docker commit images, dirty worktrees or hand-mutated runtime containers as deployment truth.

Each target fetches the remote repository, resolves the requested commit to a full 40-character SHA and exports tracked files with git archive. Build contexts are created from that archive, not from the operator's current working tree. Environment applies such as deploy apply --env dev must not upload Kubernetes manifests or source files from the master server worktree; the target-side materialized commit is the source for the Dockerfile, build context and k3s control manifests. The master server may perform only lightweight CLI orchestration, environment ref reading and remote command dispatch.
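The commit resolution step can be sketched as a prefix match with explicit failure modes. In this illustration `remoteShas` stands in for whatever the remote repository reports; the real path resolves through Git on the target, and the names `Resolution` and `resolveCommit` are hypothetical.

```typescript
type Resolution =
  | { ok: true; fullSha: string }
  | { ok: false; reason: "invalid" | "missing" | "ambiguous" };

// Resolves a short or full SHA against the known remote commits.
// Fails immediately when the SHA is malformed, missing or ambiguous.
function resolveCommit(commitId: string, remoteShas: readonly string[]): Resolution {
  // The 7-character minimum is an assumption of this sketch.
  if (!/^[0-9a-f]{7,40}$/.test(commitId)) return { ok: false, reason: "invalid" };

  const matches = remoteShas.filter((sha) => sha.indexOf(commitId) === 0);
  const first = matches[0];
  if (first === undefined) return { ok: false, reason: "missing" };
  if (matches.length > 1) return { ok: false, reason: "ambiguous" };
  return { ok: true, fullSha: first };
}
```

Returning a tagged result instead of throwing lets the caller emit the structured JSON error that the deploy commands are expected to print.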

Artifact Consumer Exception

Production backend-core and reviewed user-service samples are explicit exceptions to standard target-side build. The runtime target can be the master server Compose stack, but the build target is D601 CI; CD then consumes only commit-pinned images from the D601 artifact registry.

The exception is narrow:

CI on D601 builds src/components/backend-core/Dockerfile from a pushed commit, stamps image labels and publishes 127.0.0.1:5000/unidesk/backend-core:<commit> to the D601 artifact registry.
CD on the master server pulls that existing image through the controlled artifact-registry relay, retags it for the Compose service, recreates only backend-core with --no-build --no-deps --force-recreate, and verifies the running commit.
CD must not run Rust compilation, Docker build, Compose build or server rebuild backend-core.
The legacy artifact-registry deploy-backend-core compatibility entry is deprecated and disabled as a standard entrypoint; use deploy apply --env prod --service backend-core --commit <full-sha> so the common artifact-consumer guardrails execute first.
The pushed Git commit remains the version source of truth. The image registry is a content cache and transfer boundary, not a replacement for deploy.json or Git.
baidu-netdisk is the first main-server direct user-service sample for the same split: CI publishes 127.0.0.1:5000/unidesk/baidu-netdisk:<commit> from src/components/microservices/baidu-netdisk/Dockerfile; dev validation and prod CD both pull that artifact, retag baidu-netdisk, recreate only baidu-netdisk with --no-build --no-deps --force-recreate, and verify image labels plus /health.deploy.commit.
frontend is the UniDesk UI artifact sample: CI publishes 127.0.0.1:5000/unidesk/frontend:<commit> from src/components/frontend/Dockerfile; dev CD imports that artifact into native k3s frontend-dev, prod CD retags it as unidesk-frontend for the master-server Compose service, and both paths verify image labels plus /health.deploy.commit.
findjob and pipeline are D601 direct Docker/Compose artifact consumers: CD runs on D601 through the existing provider-gateway/SSH maintenance bridge, verifies 127.0.0.1:5000/unidesk/<service>:<commit> labels, writes deploy env/labels, and recreates only the target Compose service with --no-build --no-deps --force-recreate.
met-nonlinear has a D601 direct dry-run/plan contract, but live artifact deploy is blocked until the long-running met-nonlinear-ts image contract is separated from the ML image Dockerfile contract or otherwise proves the running container image label matches the requested commit.
k3sctl-adapter exposes only artifact consumer plan/dry-run here because it is an infrastructure control bridge; real prod deployment requires supervisor confirmation outside the standard user-service CD path.
mdtodo and claudeqq are k3s-managed artifact consumers: CI publishes 127.0.0.1:5000/unidesk/<service-id>:<commit>, dev CD lands in unidesk-dev, prod CD lands in unidesk, and both paths verify Deployment metadata plus health through the Kubernetes API service proxy.
code-queue is a dev-only artifact consumer: CI may publish 127.0.0.1:5000/unidesk/code-queue:<commit>, and dev CD may update only unidesk-dev Code Queue execution objects. Production artifact deploy, production rollout and production manifest mutation for code-queue are unsupported.
This exception must not be generalized to other services unless their resource profile and runtime boundary are documented with the same CI-producer/CD-consumer split.

The registry contract is defined in docs/reference/artifact-registry.md; the CI producer rules are defined in docs/reference/ci.md.

Upstream Image Exception

filebrowser and filebrowser-d601 are not source-built UniDesk services and must not be modeled as Dockerfile producers. Their minimal catalog expression is CI.json.artifacts[] entries with kind=upstream-image plus config.json.microservices[].repository.artifactSource:

upstream image: docker.io/filebrowser/filebrowser:v2.63.3;
upstream source revision: ca5e249e3c0c94159c2136a0cd431a424eb18472;
digest pin: required before rollout;
mirror strategy: mirror only after digest verification to 127.0.0.1:5000/upstream/filebrowser/filebrowser;
CD: pull-only by digest or mirror digest, then verify image identity, OCI labels and private proxy health.

If the upstream registry is unavailable during precheck, the service remains pending-network-verification; a locally cached image id is supporting evidence only and not a manifest digest pin.
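The precheck outcome for an upstream image can be expressed as a small gate. This is a sketch under assumptions: only pending-network-verification is a status named in this document, while `ready`, `digest-pin-required` and the input shape are hypothetical labels chosen for illustration.

```typescript
type UpstreamGate = "ready" | "digest-pin-required" | "pending-network-verification";

interface UpstreamPrecheck {
  registryReachable: boolean;    // could the upstream registry be queried during precheck
  manifestDigest: string | null; // pinned manifest digest, e.g. "sha256:<64 hex>"
  localImageId: string | null;   // supporting evidence only, never a digest pin
}

function upstreamImageGate(input: UpstreamPrecheck): UpstreamGate {
  // A cached local image id does not change the outcome when the registry is unreachable.
  if (!input.registryReachable) return "pending-network-verification";

  const digest = input.manifestDigest;
  if (digest === null || !/^sha256:[0-9a-f]{64}$/.test(digest)) return "digest-pin-required";
  return "ready";
}
```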

One-Shot Build Proxy

Target-side source fetches and Docker builds that need external network access use a one-shot proxy scope through provider-gateway WS egress. Provider targets connect only to their node-local provider-gateway egress endpoint, normally http://127.0.0.1:18789; provider-gateway carries the TCP stream over the already-authenticated provider WebSocket to the main server, and the main server opens the final outbound TCP connection. This is the only allowed proxy channel for provider-side deploy source fetches and builds. The deploy path must not mutate host-global proxy settings:

Do not edit /etc/docker/daemon.json.
Do not edit shell profiles or global Docker CLI config.
Do not leave long-lived host HTTP_PROXY, HTTPS_PROXY or ALL_PROXY.
Do not silently fall back to target local direct internet.
Do not create a separate SSH SOCKS proxy, public master proxy port, or direct backend-core/provider-ingress connection for deploy egress.

The standard implementation first probes GitHub through the node-local egress proxy, then runs target-side git clone/git fetch and the Docker build in that scoped environment. It also uses the target Docker daemon's local BuildKit builder so target-side base image and layer caches are reused. Proxy variables are scoped to the current deploy step and passed as matching --build-arg values for Dockerfile RUN steps; they are not written to daemon or shell configuration. Provider targets also use docker buildx build --network host so 127.0.0.1:<proxy-port> inside RUN resolves to the target host's loopback provider-gateway egress proxy. Each deploy must log the proxy channel and probe result, for example target_source_proxy=provider-gateway-ws-egress:http://127.0.0.1:18789, target_build_proxy=provider-gateway-ws-egress:http://127.0.0.1:18789 and target_build_proxy_probe=ok.

Build cache is part of the deployment contract, not an optimization left to Docker defaults. The deploy reconciler must pass inline BuildKit cache metadata (--cache-to type=inline) and import the current target image as cache source when it exists (--cache-from <image>). Dockerfiles that intentionally expose a warm build-base argument, such as Code Queue's CODE_QUEUE_BASE_IMAGE, may use the target-local <image>-build-base image to avoid re-running large apt/npm/Playwright setup layers; this is still target-local build cache and must be logged as target_build_base_image=<image>-build-base. If a service later needs an isolated docker-container builder or a local cache directory backend, it may use one only as a service-specific fallback and must still log proxy resolution, proxy probe result, cache source, cache destination and builder cleanup. The default path must not discard target-local image cache by creating a fresh builder for every deploy.
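The proxy and cache requirements combine into one docker buildx build invocation. The sketch below only assembles the argument list, so nothing is written to daemon or shell configuration. The `BuildRequest` shape and function name are hypothetical; the flags used (--network, --build-arg, --cache-to, --cache-from) are standard buildx build options.

```typescript
// Hypothetical description of one target-side build.
interface BuildRequest {
  image: string;              // target image tag, also used as cache source
  dockerfile: string;         // path inside the materialized commit
  context: string;            // build context directory from git archive
  proxyUrl: string | null;    // node-local provider-gateway egress, null when no proxy is needed
  targetImageExists: boolean; // whether the current target image can seed the cache
}

// Returns the argv to pass to the docker binary for this build step.
function buildxArgs(req: BuildRequest): string[] {
  const args = ["buildx", "build", "--file", req.dockerfile, "--tag", req.image];

  // Inline cache metadata is always exported; the existing image is imported when present.
  args.push("--cache-to", "type=inline");
  if (req.targetImageExists) args.push("--cache-from", req.image);

  if (req.proxyUrl !== null) {
    // Host networking makes 127.0.0.1:<proxy-port> inside RUN reach the target's loopback proxy.
    args.push("--network", "host");
    for (const name of ["HTTP_PROXY", "HTTPS_PROXY"]) {
      args.push("--build-arg", `${name}=${req.proxyUrl}`);
    }
  }

  args.push(req.context);
  return args;
}
```

Keeping the proxy in per-invocation arguments means a failed or aborted deploy cannot leave proxy state behind on the host.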

Main server targets may build without a proxy unless a service explicitly requires one. Provider targets must not bypass provider-gateway WS egress for GitHub, Debian apt, npm, Playwright, model downloads or any other external build dependency.

Deployment Executors

The reconciler selects the executor from config.json:

deployment.mode=unidesk-direct on main-server: the legacy/local manifest executor builds the image on the main server, then uses the fixed UniDesk Compose project and up -d --no-build --no-deps --force-recreate <service>. Reviewed artifact-consumer services such as frontend, baidu-netdisk, project-manager and oa-event-flow use the D601 registry pull-only path for --env dev and --env prod instead.
deployment.mode=internal-sidecar on main-server: use the same main-server target-side source export, Docker build, image label stamping, fixed Compose project replacement and live commit verification as direct Compose services. This class is for private sidecars such as code-queue-mgr; it is still versioned by deploy.json.commitId, not by the operator's current worktree, and prod live apply remains supervisor-gated.
deployment.mode=unidesk-direct on a provider: this executor is disabled for D601 service deployment. The historical behavior dispatched host.ssh to the provider, built on the provider, then used the service's provider-local compose file and project; that shape must not remain a second deployment control plane.
Control bridges that UniDesk needs in order to inspect or repair an orchestrator must stay in this direct class. In particular, k3sctl-adapter is a UniDesk-managed bridge to native k3s and must remain outside k3s; Docker packaging on Docker Desktop/WSL must create an explicit host-local bridge, currently an adapter-container SSH local tunnel, to reach /etc/rancher/k3s/k3s.yaml and WSL 127.0.0.1:6443.
deployment.mode=k3sctl-managed: the target behavior is to build on the active control target unless the service has a reviewed artifact-consumer exception, verify native k3s on the host OS/WSL distro, import the image into native k3s/containerd, apply the existing Kubernetes manifest, stamp the Deployment and wait for rollout. On D601, persistent dev apply is currently allowed for backend-core target-side build plus frontend, decision-center, mdtodo, claudeqq and dev-only code-queue artifact consumption in unidesk-dev; production artifact consumers are limited to reviewed services and exclude Code Queue. Normal production services still cannot use a maintenance-channel direct rollout. The executor must use the native kubeconfig and containerd socket, for example /etc/rancher/k3s/k3s.yaml and /run/k3s/containerd/containerd.sock; running k3s itself in Docker is forbidden for both control-plane and worker nodes. A rancher/k3s image or legacy container may only be used as a temporary artifact source during migration, and any active containerized k3s control plane must be stopped before verification succeeds. The executor must preload a valid rancher/mirrored-pause:3.6 sandbox image into native k3s containerd through the provider-gateway one-shot egress path, verify its entrypoint is /pause, and reject fake or sleep-based replacement images. k3s-managed deploys must use ClusterIP Services and Kubernetes API service proxy health checks; they must not add NodePort, hostPort, public business ports or provider-gateway direct business backends.
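The executor classes above can be summarized as a lookup on deployment mode and target. This is a sketch: the three `deployment.mode` values come from the list, while the `Executor` labels, the `ServiceCatalogEntry` shape and the treatment of a non-main-server internal-sidecar as disabled are assumptions made for illustration.

```typescript
type DeploymentMode = "unidesk-direct" | "internal-sidecar" | "k3sctl-managed";

// Hypothetical labels for the executor classes described above.
type Executor =
  | "main-server-direct-compose"
  | "main-server-internal-sidecar"
  | "control-bridge-direct"
  | "k3s-managed"
  | "disabled";

// Hypothetical projection of a config.json microservice entry.
interface ServiceCatalogEntry {
  id: string;
  mode: DeploymentMode;
  target: string; // "main-server" or a provider id such as "D601"
}

// Control bridges that must stay outside k3s.
const CONTROL_BRIDGES: readonly string[] = ["k3sctl-adapter"];

function selectExecutor(entry: ServiceCatalogEntry): Executor {
  if (entry.mode === "k3sctl-managed") return "k3s-managed";

  if (entry.mode === "internal-sidecar") {
    return entry.target === "main-server" ? "main-server-internal-sidecar" : "disabled";
  }

  // unidesk-direct: allowed on the main server, disabled on providers except control bridges.
  if (entry.target === "main-server") return "main-server-direct-compose";
  return CONTROL_BRIDGES.indexOf(entry.id) !== -1 ? "control-bridge-direct" : "disabled";
}
```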

D601 Docker local images are not the source of truth for k3s runtime availability. For Code Queue, the deploy gate must verify unidesk-code-queue:d601 exists in native k3s containerd after import with ctr --address /run/k3s/containerd/containerd.sock -n k8s.io images ls, and it must fail before rollout if the tag is missing. The same gate must verify every production Code Queue Deployment that uses the image (code-queue, code-queue-read, code-queue-write, d601-provider-egress-proxy, d601-tcp-egress-gateway) still references exactly unidesk-code-queue:d601; otherwise kubelet may attempt an external registry pull and leave base gateways in ImagePullBackOff.
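
The Code Queue image gate above can be sketched as a pure check over two inputs: the text printed by the ctr images ls command and the image each Deployment currently references. This is a minimal sketch, not the actual deploy gate; the function names and reason strings are hypothetical, and accepting the docker.io/library/ prefixed form is an assumption about how containerd may normalise an unqualified tag on import.

```typescript
// Hypothetical helper names and reason strings. Only the image tag and the
// Deployment names come from the text above.
const REQUIRED_IMAGE = "unidesk-code-queue:d601";

interface DeploymentImage {
  deployment: string; // e.g. "code-queue-read"
  image: string;      // image string from the Deployment's container spec
}

// True when the `ctr ... images ls` output lists the tag in its first column.
// Assumption: the tag may appear with a docker.io/library/ prefix.
function imageListed(ctrOutput: string, image: string): boolean {
  const accepted = [image, `docker.io/library/${image}`];
  return ctrOutput
    .split("\n")
    .map((line) => line.trim().split(/\s+/)[0])
    .some((ref) => ref !== undefined && accepted.indexOf(ref) !== -1);
}

// Collects every reason the gate should fail before rollout.
function codeQueueImageGate(
  ctrOutput: string,
  deployments: DeploymentImage[],
): { ok: boolean; reasons: string[] } {
  const reasons: string[] = [];
  if (!imageListed(ctrOutput, REQUIRED_IMAGE)) {
    reasons.push(`missing-containerd-tag:${REQUIRED_IMAGE}`);
  }
  for (const d of deployments) {
    // Any other reference could trigger an external registry pull.
    if (d.image !== REQUIRED_IMAGE) reasons.push(`image-drift:${d.deployment}`);
  }
  return { ok: reasons.length === 0, reasons };
}
```

A caller would feed it the captured command output and the images read from the five Deployments, and stop before rollout whenever ok is false.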

Code Queue health and diagnostics must cover its k3s dependencies, not only scheduler HTTP health. bun scripts/cli.ts microservice diagnostics code-queue and the /health aggregation must mark the service degraded/failing when d601-provider-egress-proxy or d601-tcp-egress-gateway Deployment availability or Endpoint readiness is missing, when the scheduler reports storage.lastError or PostgreSQL route failure through d601-tcp-egress-gateway.unidesk.svc.cluster.local:15432, or when stale active/retry_wait reconcile reports recoverable active tasks without a local run.
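
The degradation rules above amount to a small aggregation over dependency state and scheduler state. The sketch below assumes hypothetical input shapes and reason strings, since the real diagnostics payload is not shown here, and it does not model the distinction between degraded and failing.

```typescript
// Hypothetical shapes; field names are illustrative only.
interface DependencyState {
  deployment: string;
  availableReplicas: number; // Deployment availability
  readyEndpoints: number;    // ready Endpoint addresses
}

interface SchedulerReport {
  storageLastError: string | null;
  postgresRouteOk: boolean;           // route through d601-tcp-egress-gateway
  activeTasksWithoutLocalRun: number; // from the stale active/retry_wait reconcile
}

const REQUIRED_DEPENDENCIES = [
  "d601-provider-egress-proxy",
  "d601-tcp-egress-gateway",
];

// Returns an empty list when Code Queue may report healthy.
function codeQueueDegradedReasons(
  deps: DependencyState[],
  scheduler: SchedulerReport,
): string[] {
  const reasons: string[] = [];
  for (const required of REQUIRED_DEPENDENCIES) {
    const dep = deps.find((d) => d.deployment === required);
    if (dep === undefined || dep.availableReplicas < 1) {
      reasons.push(`deployment-unavailable:${required}`);
    } else if (dep.readyEndpoints < 1) {
      reasons.push(`endpoints-not-ready:${required}`);
    }
  }
  if (scheduler.storageLastError !== null) reasons.push("storage-last-error");
  if (!scheduler.postgresRouteOk) reasons.push("postgres-route-failed");
  if (scheduler.activeTasksWithoutLocalRun > 0) reasons.push("stale-active-tasks");
  return reasons;
}
```

Treating an absent dependency entry the same as an unavailable one keeps a missing Deployment from being read as healthy by omission.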

Existing service-specific commands such as Code Queue deploy are disabled as direct D601 deploy paths. Their build/import/rollout semantics should converge later into one controlled target-side deployment path instead of keeping parallel implementations.

Baidu Netdisk is the main-server unidesk-direct sample for artifact CD and a dependency of the PGDATA-to-Baidu-Netdisk backup path. Controlled dev validation and prod CD use the D601 registry artifact consumer: it verifies unidesk/baidu-netdisk:<commit> exists in the registry, streams the image to the main server through provider-gateway Host SSH, retags it as baidu-netdisk and baidu-netdisk:<commit>, stamps UNIDESK_BAIDU_NETDISK_DEPLOY_* in the canonical Compose env file, recreates only Compose service baidu-netdisk, and verifies container health, image labels, service id, /health.deploy.commit, and /health.auth. Live apply must fail or return degraded, and must not report success, if UNIDESK_BAIDU_NETDISK_CLIENT_ID, UNIDESK_BAIDU_NETDISK_CLIENT_SECRET, or UNIDESK_BAIDU_NETDISK_TOKEN_KEY is absent from the controlled env source, or if /health.auth.configured, clientIdConfigured, clientSecretConfigured, tokenKeyConfigured, or loggedIn is not true after recreate. Dry-run reports only that these secrets and auth fields are required and pending a live check; it must not read or print secret values. The deploy path must not treat server rebuild baidu-netdisk, mutable tags, dirty worktrees, hand-built images, or public 4244 exposure as deployment truth.
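
The secret-presence and auth checks for Baidu Netdisk can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name and failure strings are hypothetical, the caller is assumed to pass only the names of variables it found in the controlled env source, and the auth argument is assumed to be the parsed auth object from /health. No secret value is ever passed in.

```typescript
// Variable and field names below come from the text above; everything else
// (function name, failure strings) is hypothetical.
const REQUIRED_SECRETS = [
  "UNIDESK_BAIDU_NETDISK_CLIENT_ID",
  "UNIDESK_BAIDU_NETDISK_CLIENT_SECRET",
  "UNIDESK_BAIDU_NETDISK_TOKEN_KEY",
];

const REQUIRED_AUTH_FIELDS = [
  "configured",
  "clientIdConfigured",
  "clientSecretConfigured",
  "tokenKeyConfigured",
  "loggedIn",
];

// presentSecretNames holds names only, so nothing here can print a value.
function baiduNetdiskAuthFailures(
  presentSecretNames: string[],
  auth: Record<string, unknown>,
): string[] {
  const failures: string[] = [];
  for (const secret of REQUIRED_SECRETS) {
    if (presentSecretNames.indexOf(secret) === -1) {
      failures.push(`secret-absent:${secret}`);
    }
  }
  for (const field of REQUIRED_AUTH_FIELDS) {
    // Strict comparison: "true", 1 or a missing field all count as failures.
    if (auth[field] !== true) failures.push(`auth-not-true:${field}`);
  }
  return failures;
}
```

Live apply would report success only when the returned list is empty; a dry-run would list the required names and fields without calling the check against real values.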

For PGDATA-to-Baidu-Netdisk incident review, the no-authorization read-only boundary is limited to server status, schedule list, schedule get, schedule runs, microservice status/health baidu-netdisk, microservice proxy baidu-netdisk /api/auth/status --raw, and microservice proxy baidu-netdisk '/api/transfers?limit=20' --raw. These commands may report failureKind=target-stack-not-running when unidesk-backend-core, unidesk-database, or baidu-netdisk-backend is absent, especially when only *.verify-* containers are visible; that state is an infrastructure blocker, not a successful empty backup history. Recovery actions such as restoring non-empty Baidu secrets, server start, server rebuild backend-core, server rebuild baidu-netdisk, deploy apply --env prod --service baidu-netdisk, schedule run, or schedule retry-run can affect production or trigger a real backup and require explicit operator authorization.

Decision Center is a standard k3sctl-managed service in this model, but D601 maintenance-channel direct apply must not deploy it. Controlled CD for Decision Center uses the D601 registry artifact consumer in both dev and prod: it verifies unidesk/decision-center:<commit> exists in the registry, imports unidesk-decision-center:<commit> into native k3s containerd, applies the appropriate Decision Center manifest, stamps the Deployment, and verifies health through /api/microservices/decision-center/health while proving the live and requested commit match. It must not add a main-server Compose service, NodePort, hostPort, or provider-gateway direct HTTP backend for Decision Center.

MDTODO and ClaudeQQ are standard k3sctl-managed artifact consumers in the same model. Dev rollout lands in unidesk-dev using their dev manifests; production rollout lands in unidesk using the production manifests. Both services must pass dev validation before production rollout, must expose deploy metadata in health when practical, and must verify through the Kubernetes API service proxy instead of NodePort, hostPort or provider-gateway direct HTTP.

Code Queue is explicitly narrower. Only --env dev --service code-queue is a supported artifact consumer target, and it may mutate only unidesk-dev Code Queue execution objects. Production Code Queue artifact deploy, production rollout and production manifest mutation are unsupported and must fail visibly.
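
As a rough illustration of this narrower rule, a target check for Code Queue artifact deploys might look like the sketch below. The type, function name and reason strings are hypothetical; only the dev-only scope and the unidesk-dev namespace come from the text above.

```typescript
type DeployEnv = "dev" | "prod";

interface TargetDecision {
  supported: boolean;
  reason: string | null; // hypothetical reason string, null when supported
}

// Decides whether a Code Queue artifact deploy may proceed for the given
// environment and target namespace.
function checkCodeQueueArtifactTarget(
  env: DeployEnv,
  namespace: string,
): TargetDecision {
  if (env !== "dev") {
    // Production artifact deploy, rollout and manifest mutation are unsupported.
    return { supported: false, reason: "code-queue-prod-artifact-deploy-unsupported" };
  }
  if (namespace !== "unidesk-dev") {
    return { supported: false, reason: `namespace-not-allowed:${namespace}` };
  }
  return { supported: true, reason: null };
}
```

Returning a reason instead of silently skipping matches the requirement that unsupported production targets fail visibly.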

Code Queue Production HostPath Guard

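Production Code Queue is still inside the hostPath source transition boundary. The production scheduler/read/write Pods mount D601 /home/ubuntu/cq-deploy as both /app and /root/unidesk, so at startup the Bun process resolves the hostPath repo rather than the source that was COPYed into the image. Even when the Docker build or the unidesk-code-queue:d601 import succeeds, the runtime still fails if /home/ubuntu/cq-deploy is only partially synced. The specific failure class that must be prevented is: index.ts already imports ./runtime-preflight, but /home/ubuntu/cq-deploy/src/components/microservices/code-queue/src/runtime-preflight.ts is missing; this state must be treated as deploy-degraded and must block any scheduler restart or rollout.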

Any deploy or recovery path that still modifies production /home/ubuntu/cq-deploy must run the Code Queue source import guard after source sync and before Kubernetes rollout:

bun scripts/code-queue-source-guard.ts --root /home/ubuntu/cq-deploy
# or from a local controller worktree:
bun scripts/cli.ts deploy guard code-queue-source --root /home/ubuntu/cq-deploy

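The guard must return JSON and, on failure, exit non-zero with degradedReason=source-root-missing or degradedReason=missing-relative-import-target; deploy orchestration must surface that reason and stop before kubectl rollout restart or any Pod deletion that would force the scheduler to re-import the dirty hostPath source. The guard currently covers relative import, export ... from and import(...) targets under src/components/microservices/code-queue/src/**/*.ts, including missing-file failures such as runtime-preflight.ts.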
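
To make the guard's coverage concrete, the sketch below scans one source file for relative import, export ... from and import(...) specifiers and reports any that do not resolve. It is an assumption-laden illustration, not the contents of scripts/code-queue-source-guard.ts: the function names and result shape are hypothetical, the regex scan can also match text inside comments or strings, and resolution tries only the bare path, a .ts suffix and an index.ts file.

```typescript
// Collects relative specifiers ("./x" or "../x") from a TypeScript source.
function relativeImportSpecifiers(source: string): string[] {
  const patterns = [
    /\bfrom\s*["'](\.{1,2}\/[^"']+)["']/g,             // import/export ... from "./x"
    /\bimport\s*\(\s*["'](\.{1,2}\/[^"']+)["']\s*\)/g, // dynamic import("./x")
    /\bimport\s*["'](\.{1,2}\/[^"']+)["']/g,           // side-effect import "./x"
  ];
  const found: string[] = [];
  for (const pattern of patterns) {
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(source)) !== null) {
      const spec = match[1];
      if (spec !== undefined && found.indexOf(spec) === -1) found.push(spec);
    }
  }
  return found;
}

// Resolves a relative specifier against the importing file's directory.
function resolveRelative(fromFile: string, spec: string): string {
  const parts = fromFile.split("/").slice(0, -1);
  for (const segment of spec.split("/")) {
    if (segment === "." || segment === "") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return parts.join("/");
}

// existingFiles: paths relative to the same root as `file`.
function guardFile(file: string, source: string, existingFiles: string[]) {
  const missing = relativeImportSpecifiers(source).filter((spec) => {
    const base = resolveRelative(file, spec);
    const candidates = [base, `${base}.ts`, `${base}/index.ts`];
    return !candidates.some((c) => existingFiles.indexOf(c) !== -1);
  });
  return {
    ok: missing.length === 0,
    degradedReason: missing.length === 0 ? null : "missing-relative-import-target",
    missing,
  };
}
```

A partially synced tree in which index.ts imports ./runtime-preflight but runtime-preflight.ts is absent would produce ok=false with that specifier listed, which is the state that must stop orchestration before any restart.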

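Path ownership must remain explicit. /home/ubuntu/cq-deploy is the production k3s hostPath repo used by src/components/microservices/k3sctl-adapter/k3s/code-queue.k8s.yaml. /home/ubuntu/unidesk-code-queue-deploy is a historical/dev worktree name; it must not be assumed to be the production scheduler source unless the manifest, the deploy code and the documentation are changed together. If the two paths are linked by a symlink during migration, the guard must still run against the path that is actually mounted into /app.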

D601 k3s verification must always set the native kubeconfig:

KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n unidesk get deploy,svc,pod,endpoints

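The default kubectl context on D601 may point to Docker Desktop, kind or another local cluster, so it cannot serve as evidence that UniDesk production Code Queue is ready. The long-term goal is to remove the production hostPath source override entirely, so that Code Queue production converges on commit-pinned artifact/image CD and verifies the live commit like other reviewed artifact consumers.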

CI Separation

Continuous integration is intentionally separate from this deploy reconciler. D601 k3s hosts Tekton CI resources described in docs/reference/ci.md; PipelineRuns may clone, check, run read-only performance gates, create temporary CI-owned namespaces for dev manifest smoke e2e, or publish commit-pinned backend-core/user-service image artifacts to the D601 artifact registry. They must not call deploy apply, codex deploy, kubectl rollout restart for production services, mutate deploy.json, or write production namespaces.

Artifact publish preflight is part of CI, not deploy: artifact-registry status|health and ci publish-user-service --dry-run are the supported read-only checks for registry reachability and user-service publish readiness. These commands must not depend on a coincidentally present local unidesk-database container, and when backend-core/database/provider channels are missing they should return structured infra-blocked instead of a raw container error.

The Code Queue performance gate may create a temporary code-queue-ci-read service and read the main PostgreSQL through the existing d601-tcp-egress-gateway. Because it runs with CODE_QUEUE_SERVICE_ROLE=read, scheduler/backfill/notification disabled and EmptyDir state, it is not deployment truth and does not need a temporary database for the current read-only checks.

Version Stamping And Verification

Every successful deployment must stamp the source version in the runtime:

Docker image labels: unidesk.ai/service-id, unidesk.ai/source-repo, unidesk.ai/source-commit and unidesk.ai/dockerfile.
Runtime env or Kubernetes annotations: UNIDESK_DEPLOY_SERVICE_ID, UNIDESK_DEPLOY_REPO, UNIDESK_DEPLOY_COMMIT and UNIDESK_DEPLOY_REQUESTED_COMMIT.
Service health response should expose deploy.repo and deploy.commit when practical. Existing service-specific health contracts such as Code Queue's deploy.commit remain valid.

The deploy job is not complete until live verification proves the running service matches the requested commit. For Docker services this includes image label inspection on the running container. For k3s services this includes Deployment annotation/env inspection and service health through the Kubernetes API service proxy path for the target ClusterIP Service; production user-service requests continue through the same UniDesk microservice proxy path used by the frontend. A healthy old service must fail verification.
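
The verification rule can be sketched as a comparison between the requested full SHA and every piece of live evidence. The observation shape, function name and failure strings are hypothetical; the evidence sources named in the usage note are the ones listed in this section.

```typescript
// One piece of live evidence, e.g. the unidesk.ai/source-commit image label,
// the UNIDESK_DEPLOY_COMMIT annotation/env, or deploy.commit from health.
interface CommitObservation {
  source: string;
  commit: string | null; // null when the evidence could not be read
}

// Returns an empty list only when all evidence matches the requested commit.
function verifyLiveCommit(
  requested: string,
  observations: CommitObservation[],
): string[] {
  const failures: string[] = [];
  if (!/^[0-9a-f]{40}$/.test(requested)) {
    failures.push("requested-commit-not-full-sha");
  }
  // No evidence at all must not be read as success.
  if (observations.length === 0) failures.push("no-live-evidence");
  for (const o of observations) {
    if (o.commit === null) failures.push(`missing:${o.source}`);
    else if (o.commit !== requested) failures.push(`mismatch:${o.source}`);
  }
  return failures;
}
```

Because the check never consults a health status, a service that is healthy but still running the previous commit produces a mismatch and fails verification, as required above.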

Unsupported Services

Image-only services, such as a service declared directly as docker.io/vendor/image:tag without a Dockerfile source artifact, do not satisfy target-side source-build policy. They must not be silently converted into Dockerfile source builds. If the object is an approved upstream image exception, it must follow the digest-pinned pull-only model documented in docs/reference/cicd-standardization.md; otherwise deploy check, deploy plan and CD entrypoints should report it as unsupported until a reviewed artifact producer or upstream digest consumer exists.
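
One way to picture this classification is the sketch below. It assumes a hypothetical service description with a Dockerfile path, an image reference and an approval flag, and it deliberately leaves out the reviewed artifact producer case; the class names are illustrative rather than taken from deploy check or deploy plan output.

```typescript
// Hypothetical description of how a service gets its image.
interface ServiceSource {
  id: string;
  dockerfile: string | null;          // Dockerfile source artifact, if any
  image: string | null;               // directly declared image reference, if any
  approvedUpstreamException: boolean; // reviewed upstream image exception
}

type DeployClass = "source-build" | "upstream-digest-consumer" | "unsupported";

function classifyService(service: ServiceSource): DeployClass {
  if (service.dockerfile !== null) return "source-build";
  // An approved upstream image must be pinned by digest, not by a mutable tag.
  const digestPinned =
    service.image !== null && /@sha256:[0-9a-f]{64}$/.test(service.image);
  if (service.approvedUpstreamException && digestPinned) {
    return "upstream-digest-consumer";
  }
  // Never silently converted into a Dockerfile source build.
  return "unsupported";
}
```

An approved exception that is still declared by tag, such as docker.io/vendor/image:tag, stays unsupported until it is rewritten as a digest reference.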
