pikasTech-unidesk/docs/reference/ci.md
2026-05-19 15:22:56 +00:00


UniDesk CI On D601 k3s

UniDesk CI is hosted on the D601 native k3s cluster with Tekton Pipelines and Tekton Triggers. It is CI only. CD remains separate from Tekton. No Tekton task may roll out production services. CI/CD runtime-version governance follows docs/reference/release-governance.md and GitHub issue #6. The default user-service release order is owned by docs/reference/user-service-delivery.md.

Components

  • Tekton Pipelines: v1.12.0.
  • Tekton Triggers: v0.34.0.
  • UniDesk CI namespace: unidesk-ci.
  • Manifests: src/components/microservices/k3sctl-adapter/k3s/ci/.
  • CLI entry: bun scripts/cli.ts ci install|status|run|publish-backend-core|publish-user-service|run-dev-e2e|logs.
  • Dev namespace e2e runner: bun scripts/cli.ts ci run-dev-e2e; authoritative runner path, manifest contract and safety boundary are in docs/reference/dev-ci-runner.md.
  • Rust backend-core check/build boundary: CI may run UNIDESK_D601_RUST_CHECK=1 bun scripts/cli.ts check --full --rust on D601; the master server must not compile Rust for backend-core iteration. The authoritative dev environment rule is docs/reference/dev-environment.md.

Pipeline Scope

Each commit CI run performs:

  • git clone and checkout of the requested repository revision.
  • bun install --frozen-lockfile in both the repo root and src/, because bun scripts/cli.ts check compiles all of src/components and needs the component workspace lockfile for the frontend React dependencies.
  • UNIDESK_D601_RUST_CHECK=1 bun scripts/cli.ts check --full --rust, so Rust backend-core is checked only inside the D601 CI execution boundary.
  • Application contract checks such as Code Queue /api/workdirs, using ordinary app fixtures or E2E paths rather than CI/CD infrastructure self-tests.
  • Creation of a temporary code-queue-ci-read Deployment and ClusterIP Service in unidesk-ci.
  • Code Queue read performance checks against the production PostgreSQL through d601-tcp-egress-gateway.
  • Manual dev desired-state smoke for Code Queue via ci run-dev-e2e, using the Git-pinned code-queue service commit from origin/master:deploy.json#environments.dev.

CI/CD bootstrap, repair and upgrade actions are infrastructure operations. They are manually tested and may be promoted directly to production when the infrastructure itself is the target; do not add CI jobs whose purpose is to prove that CI/CD can bootstrap or repair itself.

ci install also prewarms the D601 k3s containerd runtime with the Tekton entrypoint/workingdir helper images, oven/bun:1-debian, alpine/git:2.45.2 and unidesk-code-queue:dev. Missing images are pulled through the node-local provider-gateway WS egress proxy and then imported into native k3s containerd with digests preserved, so PipelineRun pods do not hang on external registry pulls. Sustained pull throughput below 1 MB/s is treated first as provider/main-server network or proxy degradation, not as a Dockerfile or application failure.

Git clone and dependency downloads inside the repo check task use d601-provider-egress-proxy.unidesk.svc.cluster.local:18789; the NO_PROXY list keeps the in-cluster read service and D601 TCP egress gateway on the cluster network.

Private repository source authentication is part of the CI contract and follows docs/reference/devops-hygiene.md. If the repo-check task fails at git clone because credentials are unavailable, treat it as a CI infrastructure/auth gap, not as an application test result.

CI/CD Runtime Governance

CI/CD server and control-plane runtime is production-like infrastructure. Its service version must be pinned by deploy.json and verified through runtime commit metadata; it must not float with the latest master just because the operator's CLI is newer.

The CLI may be run from master if it remains backward compatible with the pinned server version. When the CLI needs a newer server capability, it must detect that through a health or capability response and fail explicitly. It must not replace the missing server capability with raw SSH, direct kubectl, direct SQL, direct production namespace mutation, or another hidden deployment path.

CI/CD services should report their source commit, API/schema capability, supported environments and supported operations. CI diagnostics should include that information when rejecting an operation as unsupported.
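The capability rule above can be illustrated with a minimal sketch. The CapabilityReport shape, its field names and the requireCapability helper are hypothetical and chosen only for illustration; the real health or capability response may look different. The point is that an unsupported operation fails explicitly and carries the server's reported metadata in the diagnostic, rather than falling back to another path.

```typescript
// Hypothetical shape of a CI/CD service capability response.
// Field names are assumptions for illustration, not the real API.
interface CapabilityReport {
  sourceCommit: string;
  schemaVersion: number;
  environments: string[];
  operations: string[];
}

// Throws with a diagnostic that includes the server's reported metadata
// when the requested operation or environment is not supported.
function requireCapability(
  report: CapabilityReport,
  operation: string,
  environment: string,
): void {
  const supported =
    report.operations.includes(operation) &&
    report.environments.includes(environment);
  if (supported) return;
  throw new Error(
    `server at commit ${report.sourceCommit} (schema ${report.schemaVersion}) ` +
      `does not support "${operation}" in "${environment}"; ` +
      `supported operations: ${report.operations.join(", ") || "none"}; ` +
      `supported environments: ${report.environments.join(", ") || "none"}`,
  );
}
```

A caller would run this check before issuing the operation and surface the error message unchanged, so the operator sees which pinned server version rejected the request.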

During a release/v1 stabilization window, CI should continue using the implemented dev desired-state contract rather than adding split-lane infrastructure. The origin/master:deploy.json#environments.dev service pins may point at v1 stabilization commits for validation, but CI must print the manifest commit and service commits it used. Explicit dev-v1 and dev-master support is a later infrastructure change after v1 is stable.

When the broken component is CI/CD itself, use manual smoke checks, runtime health, logs, commit metadata and operator review as the acceptance path. Do not block the repair on a new CI self-test for the CI/CD bootstrap path.

Steps that call the Kubernetes API directly clear inherited proxy variables so service-account HTTPS calls to kubernetes.default.svc do not accidentally use the Code Queue image's Docker Compose proxy defaults. The rollout poll reads the Deployment main resource rather than the /status subresource, keeping CI RBAC limited to the same app/service resources it creates and deletes. The performance probe scans recent Code Queue tasks until it finds one with trace steps, so a newly selected task without persisted step detail does not make the whole gate fail before measuring the trace endpoints.
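As a rough sketch of the proxy-clearing step described above, a helper can drop the conventional proxy variables from an environment map before a service-account call is made. The variable list and the withoutProxyEnv name are assumptions for illustration; the actual pipeline steps may clear a different set or do so in shell.

```typescript
// Conventional proxy variable names; matched case-insensitively because
// tools read both upper-case and lower-case spellings.
const PROXY_VARS = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY", "NO_PROXY"];

// Returns a copy of the environment without any proxy variables, so a
// child process talking to kubernetes.default.svc connects directly.
function withoutProxyEnv(
  env: Record<string, string | undefined>,
): Record<string, string> {
  const blocked = new Set(PROXY_VARS.map((name) => name.toLowerCase()));
  const clean: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (value === undefined) continue;
    if (blocked.has(key.toLowerCase())) continue;
    clean[key] = value;
  }
  return clean;
}
```

The returned map would be passed as the environment of whatever process performs the Kubernetes API call, leaving the parent environment untouched.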

The temporary Code Queue service uses:

  • CODE_QUEUE_SERVICE_ROLE=read.
  • CODE_QUEUE_SCHEDULER_ENABLED=false.
  • CODE_QUEUE_STARTUP_OA_BACKFILL_ENABLED=false.
  • CODE_QUEUE_NOTIFY_CLAUDEQQ_ENABLED=false.
  • CODE_QUEUE_CODEX_SQLITE_LOG_EXPORT_ENABLED=false.
  • D601 k3s d601-provider-egress-proxy for external/OA Event Flow fetches, with d601-tcp-egress-gateway and the CI read service in NO_PROXY.
  • EmptyDir state/log mounts.

This means the CI service can read existing tasks, Trace summaries, Trace steps and Trace step details from the main database, but it must not schedule, mutate, notify, backfill or become deployment truth.
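The read-only settings listed above lend themselves to a simple pre-flight check. The sketch below is illustrative only: the variable names and values come from the list, while the readOnlyViolations helper is hypothetical and not part of the existing CLI.

```typescript
// Expected values for the temporary CI read service, taken from the
// list of settings in this section.
const READ_ONLY_ENV: Record<string, string> = {
  CODE_QUEUE_SERVICE_ROLE: "read",
  CODE_QUEUE_SCHEDULER_ENABLED: "false",
  CODE_QUEUE_STARTUP_OA_BACKFILL_ENABLED: "false",
  CODE_QUEUE_NOTIFY_CLAUDEQQ_ENABLED: "false",
  CODE_QUEUE_CODEX_SQLITE_LOG_EXPORT_ENABLED: "false",
};

// Returns one message per setting that is missing or has a value that
// would let the CI service schedule, notify, backfill or export.
function readOnlyViolations(
  env: Record<string, string | undefined>,
): string[] {
  return Object.entries(READ_ONLY_ENV)
    .filter(([name, expected]) => env[name] !== expected)
    .map(
      ([name, expected]) =>
        `${name} must be "${expected}", got ${JSON.stringify(env[name])}`,
    );
}
```

An empty result means the environment matches the read-only contract; any returned message names the setting that would need to be corrected before the Deployment is created.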

Backend-Core Artifact Publication

backend-core production image creation belongs to a manual D601-side artifact producer action, not to master server CD and not to a CI/CD bootstrap self-test. The purpose is to keep Rust compilation, Docker build cache, dependency downloads and image push on the higher-resource D601 side while leaving production deployment with a small pull/recreate/verify surface.

The CI artifact task must follow these rules:

  • Input revision comes from pushed Git and is resolved to a full 40-character commit. A dirty worktree or unpushed local tree must never be used as the image source.
  • Source fetch for this artifact uses the existing D601 GitHub SSH deploy identity and the node-local provider-gateway WS egress proxy at http://127.0.0.1:18789. D601 prepares a commit-pinned source export under /home/ubuntu/.unidesk/ci/backend-core-artifacts/<commit> before creating the PipelineRun; Tekton consumes that prepared source through a read-only hostPath and must not clone GitHub itself, mount GitHub credentials, use an in-cluster Git mirror, or accept an operator-uploaded source tree.
  • The source checkout, Rust build and Docker build run on D601 CI infrastructure. The master server must not run cargo build, docker compose build backend-core or server rebuild backend-core as part of production backend-core deployment.
  • The image is tagged with the source commit, for example unidesk/backend-core:<commit>, and pushed to the D601 artifact registry as 127.0.0.1:5000/unidesk/backend-core:<commit>.
  • The image must carry at least unidesk.ai/service-id=backend-core, unidesk.ai/source-repo, unidesk.ai/source-commit and unidesk.ai/dockerfile=src/components/backend-core/Dockerfile.
  • Publication must fail if the D601 artifact registry is not healthy. It must not fall back to a third-party registry or a mutable latest tag.
  • CI may output the image ref and digest as deployment input, but it must not restart production Compose services, call production deploy apply, mutate production namespaces, or change deploy.json.
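Two of the rules above, the full 40-character commit and the required image labels, can be sketched as small checks. The helper names are hypothetical, and the lowercase-hex pattern is an assumption based on how Git prints commit ids; the registry address, image name and label keys are taken from the rules.

```typescript
const REGISTRY = "127.0.0.1:5000";

// Assumption: commits are given as 40 lowercase hex characters.
const FULL_SHA = /^[0-9a-f]{40}$/;

// Builds the commit-tagged image ref and rejects anything that is not a
// full commit id, such as a short SHA, a branch name or "latest".
function backendCoreImageRef(commit: string): string {
  if (!FULL_SHA.test(commit)) {
    throw new Error(`expected a full 40-character commit, got "${commit}"`);
  }
  return `${REGISTRY}/unidesk/backend-core:${commit}`;
}

const REQUIRED_LABELS = [
  "unidesk.ai/service-id",
  "unidesk.ai/source-repo",
  "unidesk.ai/source-commit",
  "unidesk.ai/dockerfile",
];

// Returns the required label keys that are absent or empty.
function missingLabels(labels: Record<string, string>): string[] {
  return REQUIRED_LABELS.filter((name) => !labels[name]);
}
```

Rejecting non-commit tags at the point where the ref is built keeps a mutable tag from ever reaching the push step.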

The artifact registry contract and CD consumption path are defined in docs/reference/artifact-registry.md. CI is the producer of the backend-core image artifact; CD is only the consumer.

User-Service Artifact Publication

User-service image creation uses the same CI producer boundary as backend-core, but the service identity and Dockerfile come from the registered config.json.microservices[] entry. The reviewed sample services are baidu-netdisk and decision-center.

The CI user-service artifact task must follow these rules:

  • Inputs are a pushed full 40-character Git commit and a registered service id. Dirty worktrees, operator-uploaded source trees and local-only commits are not valid artifact sources.
  • D601 prepares a commit-pinned source export under /home/ubuntu/.unidesk/ci/user-service-artifacts/<service-id>/<commit> using the existing GitHub SSH deploy identity and node-local provider-gateway WS egress proxy. Tekton consumes that export through a read-only hostPath.
  • The image is tagged only with the source commit and pushed to the D601 registry as 127.0.0.1:5000/unidesk/<service-id>:<commit>. The producer must reject third-party registries and must not publish or consume a mutable latest tag.
  • The image must carry unidesk.ai/service-id, unidesk.ai/source-repo, unidesk.ai/source-commit and unidesk.ai/dockerfile labels.
  • The command output must include the image ref, tag, digest, source commit and service id. The digest ref is suitable as immutable input for later dev/prod deployment work.
  • CI is an artifact producer only. It must not restart production services, call production deploy apply, mutate the production namespace, or change deploy.json.
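A consumer of the publish output might validate it before treating the digest as immutable deployment input. The sketch below assumes a hypothetical PublishResult shape with one field per item the rules require (image ref, tag, digest, source commit, service id); the actual output format of the command is not specified here. The name@sha256:<hex> form is the standard digest reference syntax for container images.

```typescript
// Hypothetical shape of the publish command output; field names are
// assumptions for illustration.
interface PublishResult {
  imageRef: string;
  tag: string;
  digest: string;
  sourceCommit: string;
  serviceId: string;
}

const REGISTRY_PREFIX = "127.0.0.1:5000/unidesk";

// Checks that the result is internally consistent and returns the
// digest-pinned reference for later deployment work.
function immutableDigestRef(result: PublishResult): string {
  if (!/^[0-9a-f]{40}$/.test(result.sourceCommit)) {
    throw new Error("sourceCommit must be a full 40-character commit");
  }
  if (result.tag !== result.sourceCommit) {
    throw new Error("image must be tagged only with the source commit");
  }
  const expectedRef = `${REGISTRY_PREFIX}/${result.serviceId}:${result.sourceCommit}`;
  if (result.imageRef !== expectedRef) {
    throw new Error(`imageRef must be ${expectedRef}, got ${result.imageRef}`);
  }
  if (!/^sha256:[0-9a-f]{64}$/.test(result.digest)) {
    throw new Error("digest must be a sha256 digest");
  }
  return `${REGISTRY_PREFIX}/${result.serviceId}@${result.digest}`;
}
```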

Publish a Baidu Netdisk artifact:

bun scripts/cli.ts ci publish-user-service --service baidu-netdisk --commit <full-sha> --wait-ms 1200000

This command creates the unidesk-user-service-artifact-publish Tekton PipelineRun and pushes 127.0.0.1:5000/unidesk/baidu-netdisk:<commit>.

Publish a Decision Center artifact:

bun scripts/cli.ts ci publish-user-service --service decision-center --commit <full-sha> --wait-ms 1200000

This command creates the unidesk-user-service-artifact-publish Tekton PipelineRun and pushes 127.0.0.1:5000/unidesk/decision-center:<commit>.

Dev Namespace E2E

ci run-dev-e2e is the manual dev desired-state smoke flow. The single authoritative reference for its Git-controlled runner script, short launcher, result directory and no-CD boundary is docs/reference/dev-ci-runner.md.

The current dev namespace e2e is a harness and smoke gate, not a full frontend/backend stack rollout. It does include a controlled Code Queue slice: D601 builds or reuses the environments.dev.services[].id=code-queue commit, imports the image into native k3s containerd, starts temporary PostgreSQL plus Code Queue scheduler/read/write Services in unidesk-ci-e2e-<runId>, and verifies the HTTP API through the Kubernetes API service proxy. The stable frontend/backend path /api/microservices/code-queue/proxy/api/workdirs is covered by the normal UniDesk e2e check microservice:code-queue-workdirs. This remains CI-only and must not deploy persistent unidesk-dev or production resources.

Performance Gate

The initial budgets live in unidesk-ci/unidesk-ci-budgets:

  • Code Queue first overview payload through the temporary read service, used as the service-side first-paint proxy: 10000ms.
  • GET /api/tasks/{id}/trace-summary: 10000ms.
  • GET /api/tasks/{id}/trace-steps: 20000ms diagnostic, reported but not blocking while the existing production TraceView step query is being optimized.
  • GET /api/tasks/{id}/trace-step: 20000ms diagnostic, reported but not blocking while the existing production TraceView step query is being optimized.
  • GET /api/tasks/overview p95 over 10 samples: 20000ms.

These are absolute budgets. Historical relative baselines can be added later by writing metrics to a dedicated CI table or object store; they should not be mixed into production task tables.
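The gate logic implied by these budgets can be sketched as follows. This is an illustration under stated assumptions: the Budget type and function names are hypothetical, and the p95 here uses the nearest-rank method, which the source does not specify. With 10 samples, nearest-rank p95 is the slowest sample.

```typescript
// Hypothetical budget record; "blocking: false" models the diagnostic
// budgets that are reported but do not fail the gate.
interface Budget {
  name: string;
  limitMs: number;
  blocking: boolean;
}

// Nearest-rank 95th percentile (an assumed method, not confirmed by the
// source): the value at position ceil(0.95 * n) in the sorted samples.
function p95(samples: number[]): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil(0.95 * sorted.length);
  return sorted[rank - 1];
}

// Compares one measurement against an absolute budget.
function evaluate(budget: Budget, measuredMs: number): "pass" | "warn" | "fail" {
  if (measuredMs <= budget.limitMs) return "pass";
  return budget.blocking ? "fail" : "warn";
}
```

Under this sketch, a trace-steps measurement above 20000ms yields "warn" and is only reported, while a trace-summary measurement above 10000ms yields "fail" and blocks the run.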

Commands

Install or refresh CI:

bun scripts/cli.ts ci install

Check status:

bun scripts/cli.ts ci status

Run CI manually for a commit:

bun scripts/cli.ts ci run --revision <commit>

Publish a backend-core artifact for production CD:

bun scripts/cli.ts ci publish-backend-core --commit <full-sha> --wait-ms 1200000

This command creates the unidesk-backend-core-artifact-publish Tekton PipelineRun. It is a CI producer action only: it may build and push 127.0.0.1:5000/unidesk/backend-core:<commit>, but it must not recreate the master server container. Production deployment is triggered separately with artifact-registry deploy-backend-core.

Publish a user-service artifact:

bun scripts/cli.ts ci publish-user-service --service baidu-netdisk --commit <full-sha> --wait-ms 1200000
bun scripts/cli.ts ci publish-user-service --service decision-center --commit <full-sha> --wait-ms 1200000

These commands are CI producer actions only. For Baidu Netdisk and Decision Center, each builds and pushes 127.0.0.1:5000/unidesk/<service-id>:<commit> and reports the immutable digest without deploying to production.

Run the dev namespace e2e harness manually:

bun scripts/cli.ts ci run-dev-e2e --wait-ms 600000

Inspect a run:

bun scripts/cli.ts ci logs <runId>

Trigger Boundary

unidesk-ci.triggers.yaml installs the EventListener, TriggerBinding and TriggerTemplate, but the EventListener remains a normal in-cluster Service. Do not expose it through NodePort, LoadBalancer or an unrestricted public ingress. If GitHub or another Git remote needs webhook delivery, add a UniDesk-controlled frontend/backend route with secret verification and then proxy to the EventListener; keep only the documented main-server public entrypoints: production frontend, dev frontend proxy and provider ingress. The dev frontend public port is defined in docs/reference/dev-environment.md.
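For the secret verification mentioned above, one common approach is the HMAC-SHA256 scheme GitHub uses for webhooks, where the X-Hub-Signature-256 header carries "sha256=" followed by the hex HMAC of the raw request body. The sketch below assumes that scheme; the function name is hypothetical and the route that would call it does not exist yet.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies a GitHub-style webhook signature over the raw request body.
// The comparison is constant-time to avoid leaking the expected value.
function isValidWebhookSignature(
  secret: string,
  rawBody: string,
  signatureHeader: string,
): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signatureHeader, "utf8");
  // timingSafeEqual throws on length mismatch, so check length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

A backend route would call this with the unparsed body and reject the request before proxying to the EventListener when it returns false.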