diff --git a/docs/reference/codex-deploy.md b/docs/reference/codex-deploy.md
index 37a2ee51..faac7de8 100644
--- a/docs/reference/codex-deploy.md
+++ b/docs/reference/codex-deploy.md
@@ -19,7 +19,7 @@ bun scripts/cli.ts job status --tail-bytes 30000
 
 The deploy job always runs these steps:
 
-1. In the deploy cache on D601, `git fetch` the remote and use `git archive ` to export tracked files into a one-off export directory.
+1. In the deploy cache on D601, run `git fetch` against the remote through the node-local provider-gateway WS egress proxy, and use `git archive ` to export tracked files into a one-off export directory; D601 must not connect to GitHub directly, and no ad hoc SSH SOCKS proxy, public master proxy, or backend-core/provider-ingress fallback may be created.
 2. Use `rsync --delete` to sync the exported repo to `/home/ubuntu/cq-deploy`, preserving `.state/`, `logs/`, `.git/`, `node_modules/` and `dist/`.
 3. On D601, build `unidesk-code-queue:d601` with the target Docker daemon's local BuildKit builder, reusing the base images, inline cache and Code Queue build-base already present on D601; provider-gateway WS egress is the only allowed build proxy channel. It is injected only as environment variables and build-args for the current build, combined with `--network host` for that build so that `RUN` steps can reach the loopback proxy on the D601 host. It must not pollute the D601 host's Docker/HTTP proxy configuration, and no new SSH SOCKS proxy, public master proxy or direct-connection fallback may be created.
 4. `docker save` the image and import it into k3s containerd: `docker exec -i unidesk-v8s-server ctr -n k8s.io images import -`.
diff --git a/docs/reference/deploy.md b/docs/reference/deploy.md
index 0ee80e35..264e2378 100644
--- a/docs/reference/deploy.md
+++ b/docs/reference/deploy.md
@@ -48,15 +48,15 @@ Each target fetches the remote repository, resolves the requested commit to a fu
 
 ## One-Shot Build Proxy
 
-Target-side Docker builds that need external network access use a one-shot proxy scope through provider-gateway WS egress. Provider targets connect only to their node-local provider-gateway egress endpoint, normally `http://127.0.0.1:18789`; provider-gateway carries the TCP stream over the already-authenticated provider WebSocket to the main server, and the main server opens the final outbound TCP connection. This is the only allowed proxy channel for provider-side deploy builds. The build path must not mutate host-global proxy settings:
+Target-side source fetches and Docker builds that need external network access use a one-shot proxy scope through provider-gateway WS egress. Provider targets connect only to their node-local provider-gateway egress endpoint, normally `http://127.0.0.1:18789`; provider-gateway carries the TCP stream over the already-authenticated provider WebSocket to the main server, and the main server opens the final outbound TCP connection. This is the only allowed proxy channel for provider-side deploy source fetches and builds. The deploy path must not mutate host-global proxy settings:
 
 - Do not edit `/etc/docker/daemon.json`.
 - Do not edit shell profiles or global Docker CLI config.
 - Do not leave long-lived host `HTTP_PROXY`, `HTTPS_PROXY` or `ALL_PROXY`.
 - Do not silently fall back to target local direct internet.
-- Do not create a separate SSH SOCKS proxy, public master proxy port, or direct backend-core/provider-ingress connection for Docker build egress.
+- Do not create a separate SSH SOCKS proxy, public master proxy port, or direct backend-core/provider-ingress connection for deploy egress.
 
-The standard implementation first uses the target Docker daemon's local BuildKit builder so target-side base image and layer caches are reused. Proxy variables are scoped to the current build process and passed as matching `--build-arg` values for Dockerfile `RUN` steps; they are not written to daemon or shell configuration. Provider targets also use `docker buildx build --network host` so `127.0.0.1:` inside `RUN` resolves to the target host's loopback provider-gateway egress proxy. Each deploy build must log the proxy channel and probe result, for example `target_build_proxy=provider-gateway-ws-egress:http://127.0.0.1:18789` and `target_build_proxy_probe=ok`.
+The standard implementation first probes GitHub through the node-local egress proxy, then runs target-side `git clone`/`git fetch` and the Docker build in that scoped environment. It also uses the target Docker daemon's local BuildKit builder so target-side base image and layer caches are reused. Proxy variables are scoped to the current deploy step and passed as matching `--build-arg` values for Dockerfile `RUN` steps; they are not written to daemon or shell configuration. Provider targets also use `docker buildx build --network host` so `127.0.0.1:` inside `RUN` resolves to the target host's loopback provider-gateway egress proxy. Each deploy must log the proxy channel and probe result, for example `target_source_proxy=provider-gateway-ws-egress:http://127.0.0.1:18789`, `target_build_proxy=provider-gateway-ws-egress:http://127.0.0.1:18789` and `target_build_proxy_probe=ok`.
 
 Build cache is part of the deployment contract, not an optimization left to Docker defaults. The deploy reconciler must pass inline BuildKit cache metadata (`--cache-to type=inline`) and import the current target image as cache source when it exists (`--cache-from `). Dockerfiles that intentionally expose a warm build-base argument, such as Code Queue's `CODE_QUEUE_BASE_IMAGE`, may use the target-local `-build-base` image to avoid re-running large apt/npm/Playwright setup layers; this is still target-local build cache and must be logged as `target_build_base_image=-build-base`. If a service later needs an isolated `docker-container` builder or a local cache directory backend, it may use one only as a service-specific fallback and must still log proxy resolution, proxy probe result, cache source, cache destination and builder cleanup. The default path must not discard target-local image cache by creating a fresh builder for every deploy.
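Review note: the build invocation that the deploy.md hunk describes can be sketched as an argument list. This is a minimal TypeScript sketch under stated assumptions, not code from the repository. `buildxArgsSketch` and its option names are hypothetical, and passing `HTTP_PROXY`, `HTTPS_PROXY` and `ALL_PROXY` as the build-args is an assumption drawn from the variables the doc forbids leaving on the host. The flags themselves (`--network host`, `--build-arg`, `--cache-to type=inline`, `--cache-from`) are the ones named in the doc text.

```typescript
// Hypothetical option bag for illustration; none of these names come from the repo.
interface BuildSketchOptions {
  imageTag: string;          // e.g. "unidesk-code-queue:d601"
  proxyUrl: string;          // node-local egress endpoint, e.g. "http://127.0.0.1:18789"
  cacheImageExists: boolean; // whether the current target image is present as a cache source
  contextDir: string;        // Docker build context
}

// Builds the argv for `docker ...` without touching any host-global proxy config.
function buildxArgsSketch(options: BuildSketchOptions): string[] {
  // Assumed set of proxy variables mirrored into build-args for Dockerfile RUN steps.
  const proxyVars = ["HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY"];
  // --network host lets RUN steps reach the proxy on the target host's loopback.
  const args = ["buildx", "build", "--network", "host"];
  for (const name of proxyVars) {
    args.push("--build-arg", `${name}=${options.proxyUrl}`);
  }
  // Inline cache metadata is always written.
  args.push("--cache-to", "type=inline");
  // The current target image is imported as a cache source only when it exists.
  if (options.cacheImageExists) {
    args.push("--cache-from", options.imageTag);
  }
  args.push("-t", options.imageTag, options.contextDir);
  return args;
}

// Usage: print the command line a deploy step could log before running it.
const sketchArgs = buildxArgsSketch({
  imageTag: "unidesk-code-queue:d601",
  proxyUrl: "http://127.0.0.1:18789",
  cacheImageExists: true,
  contextDir: ".",
});
console.log(["docker", ...sketchArgs].join(" "));
```

Returning an argument array rather than a concatenated string keeps the proxy values scoped to one process invocation and avoids shell quoting issues; whether the real reconciler does this is not visible in the diff.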
diff --git a/docs/reference/provider-gateway.md b/docs/reference/provider-gateway.md
index f5489f3d..34ccb9f0 100644
--- a/docs/reference/provider-gateway.md
+++ b/docs/reference/provider-gateway.md
@@ -130,7 +130,7 @@ backend-core can use real WebSocket dispatch to send online providers `provi
 
 ## Manual Upgrade Maintenance
 
-Manual upgrade is only for bootstrapping old nodes to a version that supports always-enabled remote upgrade; once bootstrap is complete, routine rebuilds/upgrades must go back to `provider.upgrade mode=schedule`, and `provider-gateway` must no longer be rebuilt synchronously over SSH passthrough. The node-side maintenance steps are: enter the node-local UniDesk repository and run `git pull --ff-only` to fetch the version already pushed by the main server; confirm that `.state/provider-.env` contains `PROVIDER_SERVER_URL=ws://74.48.78.17:18082/ws/provider`, `PROVIDER_ID=`, `PROVIDER_NAME=`, `PROVIDER_TOKEN`, `PROVIDER_LABELS_JSON`, `PROVIDER_UPGRADE_HOST_PROJECT_ROOT=/home/ubuntu/unidesk`, `PROVIDER_UPGRADE_WORKSPACE_PATH=/workspace`, `PROVIDER_UPGRADE_COMPOSE_FILE`, `PROVIDER_UPGRADE_ENV_FILE`, `PROVIDER_UPGRADE_COMPOSE_PROJECT`, `PROVIDER_UPGRADE_SERVICE=provider-gateway`, `PROVIDER_UPGRADE_RUNNER_IMAGE=unidesk_provider-gateway:`, `DOCKER_SOCKET_PATH=/var/run/docker.sock`, `MONITOR_DISK_PATH=/`, and the heartbeat and reconnect parameters. If an old env file still contains `PROVIDER_UPGRADE_ENABLED`, the new provider-gateway ignores it; long-lived documentation and new deployments must no longer rely on this key.
+Manual upgrade is only for bootstrapping old nodes to a version that supports always-enabled remote upgrade; once bootstrap is complete, routine rebuilds/upgrades must go back to `provider.upgrade mode=schedule`, and `provider-gateway` must no longer be rebuilt synchronously over SSH passthrough. The node-side maintenance steps are: enter the node-local UniDesk repository, make sure GitHub access goes through the node-local provider-gateway WS egress proxy, for example by running `git config --local http.proxy http://127.0.0.1:18789` and `git config --local https.proxy http://127.0.0.1:18789`, and only then run `git pull --ff-only` to fetch the version already pushed by the main server; provider-side Git pulls must not fall back to a direct public connection, SSH SOCKS, or a public master proxy. Then confirm that `.state/provider-.env` contains `PROVIDER_SERVER_URL=ws://74.48.78.17:18082/ws/provider`, `PROVIDER_ID=`, `PROVIDER_NAME=`, `PROVIDER_TOKEN`, `PROVIDER_LABELS_JSON`, `PROVIDER_UPGRADE_HOST_PROJECT_ROOT=/home/ubuntu/unidesk`, `PROVIDER_UPGRADE_WORKSPACE_PATH=/workspace`, `PROVIDER_UPGRADE_COMPOSE_FILE`, `PROVIDER_UPGRADE_ENV_FILE`, `PROVIDER_UPGRADE_COMPOSE_PROJECT`, `PROVIDER_UPGRADE_SERVICE=provider-gateway`, `PROVIDER_UPGRADE_RUNNER_IMAGE=unidesk_provider-gateway:`, `DOCKER_SOCKET_PATH=/var/run/docker.sock`, `MONITOR_DISK_PATH=/`, and the heartbeat and reconnect parameters. If an old env file still contains `PROVIDER_UPGRADE_ENABLED`, the new provider-gateway ignores it; long-lived documentation and new deployments must no longer rely on this key.
 
 If the node already has a dedicated Compose file, prefer a one-time manual rebuild with the node-local Compose: `docker compose --env-file .state/provider-.env -f -p up -d --no-deps --build --force-recreate provider-gateway`. This command must be run from a node-local terminal, the node's own Web terminal, a system scheduled task, or a detached shell; it must not be run synchronously over the SSH passthrough provided by the very UniDesk provider-gateway that is being rebuilt, because stopping the old provider container cuts off the SSH client and can leave the rebuild interrupted with the old container stopped and the new one not yet started. If the node can only be reached through UniDesk, use the detached updater of `provider.upgrade mode=schedule`, or first use node-local `nohup`/systemd to start a rebuild script that does not depend on the current provider container's lifecycle. Older `docker-compose` versions may fail when recreating an existing container because of a `ContainerConfig` compatibility issue; in that case the only allowed recovery is to remove the target provider-gateway container and run `up -d --no-deps provider-gateway` again. Do not run `down -v`, `docker volume rm`, or any command that affects the database named volumes. If the node is currently deployed only via `docker run`, first build the image with `docker build -f src/components/provider-gateway/Dockerfile -t unidesk_provider-gateway: .`, then recreate the container under a fixed container name: use `--restart always --pid host`, mount `/var/run/docker.sock:/var/run/docker.sock`, `/home/ubuntu/unidesk:/workspace:ro`, and the node log directory to `/var/log/unidesk`; if the WSL SSH maintenance bridge is needed, also mount the read-only private key directory to `/run/host-ssh`; and start the container with the same `.state/provider-.env`. For both Compose and `docker run`, the container name and image tag must include the Provider ID so that the Docker status page, process resource table, task history and node-local troubleshooting can be correlated.
 
diff --git a/scripts/src/deploy.ts b/scripts/src/deploy.ts
index 9b88485d..610f92e2 100644
--- a/scripts/src/deploy.ts
+++ b/scripts/src/deploy.ts
@@ -280,6 +280,7 @@ function sourceProxyPrelude(service: UniDeskMicroserviceConfig): string {
     "export HTTP_PROXY=\"$build_proxy\" HTTPS_PROXY=\"$build_proxy\" ALL_PROXY=\"$build_proxy\"",
     "export NO_PROXY=\"localhost,127.0.0.1,::1,host.docker.internal\"",
     "curl -fsSI --max-time 20 -x \"$build_proxy\" https://github.com >/dev/null",
+    "echo target_source_proxy=provider-gateway-ws-egress:$build_proxy",
     "echo target_build_proxy=provider-gateway-ws-egress:$build_proxy",
     "echo target_build_proxy_probe=ok",
   ].join("\n");
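Review note: the deploy.ts hunk shows only the tail of `sourceProxyPrelude`, so the following is a self-contained sketch of how such a shell prelude could be assembled, not the function's actual body. `proxyPreludeSketch`, `shellSingleQuote` and `CHANNEL` are hypothetical names. The `build_proxy=` assignment and the leading `set -e` are assumptions: the diff does not show how `$build_proxy` is set, nor whether a failed `curl` probe aborts the script before `target_build_proxy_probe=ok` is echoed.

```typescript
// Hypothetical constant; the channel label itself is taken from the log lines in the diff.
const CHANNEL = "provider-gateway-ws-egress";

// Quote a value for POSIX sh: wrap in single quotes and escape embedded single quotes.
function shellSingleQuote(value: string): string {
  return `'${value.replace(/'/g, `'\\''`)}'`;
}

// Hypothetical stand-in for the prelude builder; emits one shell command per line.
function proxyPreludeSketch(proxyUrl: string): string {
  return [
    // Assumption: abort on the first failing command so a failed probe never logs "ok".
    "set -e",
    // Assumption: the proxy URL is bound to a shell variable before use.
    `build_proxy=${shellSingleQuote(proxyUrl)}`,
    // The remaining lines mirror the strings visible in the hunk.
    'export HTTP_PROXY="$build_proxy" HTTPS_PROXY="$build_proxy" ALL_PROXY="$build_proxy"',
    'export NO_PROXY="localhost,127.0.0.1,::1,host.docker.internal"',
    'curl -fsSI --max-time 20 -x "$build_proxy" https://github.com >/dev/null',
    `echo target_source_proxy=${CHANNEL}:$build_proxy`,
    `echo target_build_proxy=${CHANNEL}:$build_proxy`,
    "echo target_build_proxy_probe=ok",
  ].join("\n");
}

// Usage: render the prelude for the default node-local egress endpoint.
console.log(proxyPreludeSketch("http://127.0.0.1:18789"));
```

Because the exports live in the generated script rather than in a shell profile or Docker daemon config, the proxy variables last only as long as the deploy step that runs the script, which matches the "one-shot" scope the docs require.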