# G14 Provider Node
G14 is the current HWLAB DEV/PROD source and k3s/GitOps runtime truth, and it remains a UniDesk provider node for staging other infrastructure workloads. Its UniDesk provider id is `G14`; the local UniDesk worktree is `/root/unidesk`, and the native k3s kubeconfig is `/etc/rancher/k3s/k3s.yaml`.
G14's long-lived k3s control bridge is `k3sctl-adapter-g14`, a UniDesk direct service outside the k3s fault domain. It listens on the G14 host loopback port `127.0.0.1:4266` and is registered separately from the D601 `k3sctl-adapter`, so G14 infrastructure services can be built and tested without taking over user services that still run on D601.
For Code Queue and non-HWLAB CI/CD migration preparation, G14 uses native k3s labels `unidesk.ai/node-id=G14` and `unidesk.ai/provider-id=G14`. The G14 Code Queue manifests `src/components/microservices/k3sctl-adapter/k3s/code-queue.g14.k8s.yaml` and `src/components/microservices/k3sctl-adapter/k3s/code-queue.g14.k3s.json` are candidate staging artifacts only until an explicit production cutover is approved. Non-HWLAB production Code Queue, CI/CD and user-service execution must remain on D601 while D601 is carrying those services.
## HWLAB DEV/PROD Runtime
G14 hosts the current HWLAB DEV runtime in the native k3s namespace `hwlab-dev`, and the same node/GitOps line is the HWLAB PROD target unless a newer HWLAB repo rule says otherwise. The canonical G14 HWLAB source workspace is `/root/hwlab` on branch `G14`, with `origin` set to `git@github.com:pikasTech/HWLAB.git`; before any G14 HWLAB work, this fixed workspace must first be fast-forwarded to the latest `origin/G14`. HWLAB source edits, CI/CD/GitOps script changes, render work, manual polling and runtime validation must be based on that updated workspace through UniDesk SSH passthrough. Do not use `/root/HWLAB`, `/home/ubuntu/hwlab`, `/workspace/hwlab`, a D601 workspace or a master-server checkout as persistent HWLAB source truth. G14-local details are mirrored on the node in `/root/docs/hwlab-g14-workspace.md`.
The standard entry forms are:
```bash
trans G14:/root/hwlab script -- 'git fetch origin G14 && git pull --ff-only origin G14 && git status --short --branch && git remote -v'
trans G14 apply-patch < patch.diff
trans G14:k3s kubectl get pods -n hwlab-dev
```
`G14:k3s` is the only supported k3s route form. Do not use `ssh G14 k3s ...`; the first token must locate the distributed target, and the following tokens must be the operation.
If `/root/hwlab` has unrelated local changes when this sync starts, first determine whether they can be quickly merged with the latest `origin/G14`. Merge them immediately when they are mergeable; do not default to stash, discard or a behind worktree. Only when the changes cannot be automatically merged should they be isolated and the operation stopped for human decision. A behind fixed workspace is not a valid basis for precheck, new worktree creation, render, polling, deployment or runtime validation.
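The sync decision above can be sketched as a small classifier. This is a minimal sketch, not part of the UniDesk tooling: `classify_sync` is a hypothetical helper name, and its inputs are assumed to come from `git rev-list --left-right --count origin/G14...HEAD`, which prints the number of commits only on `origin/G14` (behind) followed by the number only on `HEAD` (ahead).

```bash
# Hypothetical helper (not part of UniDesk): classify the fixed workspace
# against origin/G14 from its behind/ahead commit counts.
classify_sync() {
  behind=$1
  ahead=$2
  if [ "$behind" -eq 0 ] && [ "$ahead" -eq 0 ]; then
    echo "current"        # already at origin/G14
  elif [ "$ahead" -eq 0 ]; then
    echo "fast-forward"   # behind only: git pull --ff-only is enough
  elif [ "$behind" -eq 0 ]; then
    echo "ahead"          # local commits only: nothing to pull
  else
    echo "diverged"       # both sides moved: merge now, or stop for a human decision
  fi
}
```

On G14 the counts could be fed in with `set -- $(git rev-list --left-right --count origin/G14...HEAD); classify_sync "$1" "$2"`. Uncommitted changes are not visible in these counts, so `git status --short --branch` is still needed alongside it.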
The G14 HWLAB runtime boundary is:
- Current DEV public endpoints are `http://74.48.78.17:17666/` and `http://74.48.78.17:17667/health/live`. D601 `16666/16667` is legacy/migration evidence only.
- Keep HWLAB Services as `ClusterIP` unless a repo-owned G14 GitOps rule explicitly exposes them. Public exposure should stay in the approved G14 edge/proxy path, not ad hoc NodePort or local port-forward.
- Use a G14-local PostgreSQL instance such as `hwlab-g14-postgres` and a G14-local `hwlab-cloud-api-dev-db` Secret for cloud-api durable runtime tests. Do not copy D601 database credentials.
- Use only G14-local Codex auth material and k8s Secrets authorized for HWLAB on G14; do not copy D601 or production auth material by hand.
- Set `HWLAB_CLOUD_API_PORT=6667` explicitly in the G14 cloud-api Deployment. Kubernetes otherwise injects a `HWLAB_CLOUD_API_PORT=tcp://...` Service environment variable that breaks the Node port parser.
- `HWLAB_PUBLIC_ENDPOINT` and health/live evidence must describe the G14 endpoint, not the old D601 production endpoint.
- Do not run HWLAB repository `check`, Playwright/browser smoke, image builds or other heavy validation on the master server. Run those through G14 `/root/hwlab`, G14 k3s/Tekton, or another explicitly approved external execution plane.
- Manual device-agent experiments for real hardware must be standalone resources in `hwlab-dev` such as `device-agent-71-freq` and must not patch existing HWLAB Deployments, Services, ArgoCD Applications, FRP, CD desired-state or public frontend routing unless a separate HWLAB change authorizes it.
- A D601 Windows `hwlab-gateway` may connect outbound to G14 DEV cloud-api as an external host bridge for Keil/serial/workspace access. That bridge does not make D601 the HWLAB runtime truth; it is only a hardware access provider behind the G14 device-agent/cloud-api path.
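The `HWLAB_CLOUD_API_PORT` rule in the list above follows from Kubernetes injecting Docker-link-style variables of the form `<SERVICE>_PORT=tcp://<ip>:<port>` into Pods. A minimal sketch of a guard that tells the two forms apart is shown below; `is_numeric_port` is a hypothetical helper and the sample address is illustrative only.

```bash
# Hypothetical guard (not from the HWLAB repo): succeed only when the value
# is a plain TCP port number, the form the explicit Deployment env provides.
is_numeric_port() {
  case "$1" in
    '' | *[!0-9]*) return 1 ;;   # empty, or contains a non-digit such as "tcp://"
  esac
  [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}
```

`is_numeric_port 6667` succeeds, while `is_numeric_port "tcp://10.43.0.1:6667"` fails, which is the case the explicit `HWLAB_CLOUD_API_PORT=6667` setting avoids.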
After the G14-local database is provisioned, run the HWLAB migration CLI only against the G14 DEV database with explicit non-production confirmations:
```bash
kubectl -n hwlab-dev exec deploy/hwlab-cloud-api -- \
  node /app/cmd/hwlab-cloud-api/migrate.mjs \
  --apply --confirm-dev --confirmed-non-production
```
Healthy G14 HWLAB runtime means the main Deployments and StatefulSets are Ready, `cloud-api` and `edge-proxy` return `/health/live` with `status=ok`, durable runtime checks pass, and the public G14 DEV endpoints report the expected revision. For a device-agent smoke, health also requires the standalone device-agent Service to answer in-cluster and the D601 Windows gateway session/resource/capability to be visible through G14 cloud-api.
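The `status=ok` part of that health definition can be checked with a one-line filter. This is a sketch under an assumption: the source only says the endpoint returns `status=ok`, so the JSON shape `{"status":"ok", ...}` and the helper name `health_is_ok` are hypothetical.

```bash
# Hypothetical helper: read a /health/live response body on stdin and succeed
# only if it carries status "ok". The JSON shape is an assumption.
health_is_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"ok"'
}
```

A bounded probe would then look like `curl -fsS http://74.48.78.17:17667/health/live | health_is_ok`. The pattern match does not distinguish a top-level `status` from a nested one, so it is a smoke filter rather than a full validation.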
## HWLAB v0.2 Expansion Line
HWLAB `v0.2` is an additive G14 expansion line. It must not rename, delete, repurpose, or mutate the existing `G14` source branch, `G14-gitops` branch, `hwlab-dev` namespace, `hwlab-prod` namespace, DEV public ports `17666/17667`, or PROD public ports `18666/18667`. Existing DEV/PROD CI/CD remains the stability baseline while `v0.2` is introduced beside it.
The fixed `v0.2` source branch is `v0.2`, forked from the current `G14` branch after the G14 long-term reference docs record this decision. The fixed G14 development workspace for that branch is:
```bash
trans G14:/root/hwlab-v02 script -- 'git status --short --branch && git remote -v'
```
`/root/hwlab-v02` is the long-lived `v0.2` development workspace, not a scratch clone or CI/CD source selector. It must track `origin/v0.2`, with `origin` set to `git@github.com:pikasTech/HWLAB.git`; local dirty state, a stale `HEAD`, and an untracked `.worktree/` directory affect only human development. Existing `G14` work continues to use `/root/hwlab`; do not reuse `/root/hwlab` or `/root/hwlab/.worktree/*` as the `v0.2` fixed workspace.
`v0.2` CI/CD source selection is isolated in the dedicated bare repo `G14:/root/hwlab-v02-cicd.git`. UniDesk control-plane commands must fetch `origin/v0.2` into that repo and render from a commit-pinned detached worktree; they must not read the source commit from `/root/hwlab-v02` checkout state.
The fixed `v0.2` runtime namespace is `hwlab-v02`. The intended public FRP allocation is:
- Cloud Web browser entry: `http://74.48.78.17:19666/`.
- API/edge entry and live health: `http://74.48.78.17:19667/health/live`.
Master-side FRP server maintenance for HWLAB public ports is documented in `docs/reference/hwlab.md#hwlab-frp-维护`; keep the detailed allowlist, restart boundary and verification sequence there instead of duplicating another runbook in this G14 node reference.
The `v0.2` CI/CD integration must be additive: add a manual UniDesk trigger, dedicated CI/CD source repo, `devops-infra` git mirror/relay, GitOps desired-state lane, Argo CD Application, namespace resources, artifact catalog, and `deploy.json` environment only when they target `v0.2`/`hwlab-v02` explicitly. Do not add a `v0.2` branch poller or retarget the existing `G14` poller, DEV/PROD Argo Applications, DEV/PROD runtime paths, or existing namespace resources to bootstrap `v0.2`.
The `devops-infra` git mirror/relay remains manual and CLI-controlled, not CronJob-driven. The standard `v0.2` delivery trigger is `bun scripts/cli.ts hwlab g14 monitor-prs --lane v02`: it watches base=`v0.2` PRs, waits for GitHub preflight/CI readiness, auto-merges only ready and non-conflicting PRs, then drives the same controlled CD path and comments pending/blocked/succeeded/failed/timeout state back to the PR. The lower-level `bun scripts/cli.ts hwlab g14 control-plane trigger-current --lane v02 --confirm` remains the manual recovery or diagnosis entry; it must fetch `/root/hwlab-v02-cicd.git`, resolve the current `origin/v0.2` source commit, check the mirror's `localV02` ref before creating the PipelineRun, run one bounded manual `git-mirror sync` Job when the mirror is stale, and only continue after the mirror ref matches the current source commit. Use `hwlab g14 git-mirror sync --confirm` directly only for explicit mirror maintenance or diagnosis.
After a `v0.2` PipelineRun completes, treat runtime rollout and remote GitOps persistence as two separate checks. `hwlab g14 control-plane status --lane v02` is the runtime check: it must show the expected source commit, PipelineRun completed, Argo `Synced/Healthy`, public 19666/19667 probes passing, and Cloud Web asset probes such as `/app.js` readable. `hwlab g14 git-mirror status` is the persistence check: `cache.summary.pendingFlush` must be false and `cache.summary.githubInSync` true before declaring GitOps fully flushed back to GitHub. The PR monitor performs this flush automatically for its own merged PRs and records the result in the PR comment. Manual operators should run `bun scripts/cli.ts hwlab g14 git-mirror flush --confirm` and poll the returned job with `bun scripts/cli.ts job status <jobId> --tail-bytes 12000` only when they used lower-level manual trigger/status paths or when the monitor reports a flush failure; do not replace this with raw `kubectl`, native `git push`, or a long SSH wait.
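The persistence check above reduces to two booleans. A minimal sketch follows; `mirror_is_flushed` is a hypothetical helper, and how the two values are extracted from the status output is left to the caller because the output format is not specified here.

```bash
# Hypothetical helper: decide whether GitOps state is fully flushed back to
# GitHub from the two summary fields named above.
mirror_is_flushed() {
  pending_flush=$1     # value of cache.summary.pendingFlush
  github_in_sync=$2    # value of cache.summary.githubInSync
  [ "$pending_flush" = "false" ] && [ "$github_in_sync" = "true" ]
}
```

If the status command emits JSON (an assumption), the inputs could be read with `jq -r '.cache.summary.pendingFlush'` and `jq -r '.cache.summary.githubInSync'`. Any other combination of values means the flush step is still outstanding.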
When closing an issue against a specific completed `v0.2` PipelineRun, use targeted status instead of the latest-head status if `origin/v0.2` has already advanced through a parallel task:
```bash
bun scripts/cli.ts hwlab g14 control-plane status --lane v02 --pipeline-run hwlab-v02-ci-poll-<short-sha>
bun scripts/cli.ts hwlab g14 control-plane status --lane v02 --source-commit <full-sha>
```
Targeted status must expose `statusTarget.mode` and `targetValidation`. `targetValidation.state=passed` means the requested PipelineRun/source commit reached a succeeded PipelineRun, Argo `Synced/Healthy`, public web/API probes, flushed Git mirror, and matching runtime source commits for the services listed in that run's `planArtifacts.rolloutServices`; services listed in `planArtifacts.reusedServices` remain visible as runtime/provenance evidence but must not be forced to the target source commit. `targetValidation.state=superseded` means the requested PipelineRun succeeded but no longer owns runtime: either it was replaced by a newer succeeded `v0.2` PipelineRun, or latest-only promotion observed that `origin/v0.2` had advanced before GitOps/runtime writeback and closed the historical run as no-op. This is valid closure evidence for the requested run when the newer commit is on the same branch lineage. In both states, `commitAlignment.staleReasons` may still mention later `origin/v0.2` or CI/CD source head movement; that is parallel-head context, not a failure of the requested run. `falseGreenGuard` is a current-runtime guard and should report not-applicable/superseded for such historical targets instead of turning later runtime movement into a false failure. Default status without a target remains strict for the latest source head.
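The closure rule for targeted status can be summarized as a small mapping. This is a sketch only: `closure_decision` and its output labels are hypothetical, and only the `passed` and `superseded` state names come from the text above.

```bash
# Hypothetical helper: map targetValidation.state to a closure decision.
closure_decision() {
  case "$1" in
    passed)     echo "close" ;;                  # requested run owns the validated runtime
    superseded) echo "close-if-same-lineage" ;;  # valid only when the newer commit descends from the target
    *)          echo "keep-open" ;;              # anything else is not closure evidence
  esac
}
```

For the `superseded` case, same-branch lineage can be confirmed with `git merge-base --is-ancestor <target-sha> <newer-sha>`, which exits 0 when the target is an ancestor of the newer commit.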
For HWLAB user-feedback, CLI, Cloud Web, AgentRun, device-pod, public API, or runtime workflow issues, source-level validation is not enough to close the issue. Unit tests, contract tests, `git diff --check`, targeted build checks, PR merge metadata, and source commit rollout evidence are supporting evidence only. The issue may be closed only after the affected user entry or original entry has been exercised against the target runtime. For CLI issues, that means running the relevant `hwlab-cli` or UniDesk-controlled CLI command from the G14 `v0.2` workspace or approved execution plane against the intended lane/URL/namespace and proving the observed behavior, not just proving the helper code compiles. For Cloud Web or public API issues, use the public endpoint or a bounded API/asset smoke that reaches the deployed runtime. For AgentRun or device-pod issues, capture the trace/session/thread/run/job/device evidence that proves the specific continuation or hardware workflow reached the live backend.
For Cloud Web Workbench and Code Agent issues, the closeout validation must use the same dispatch entry as the browser flow, or a CLI command that calls that same Cloud Web/Cloud API dispatcher path. A hand-written `dispatchHwlabAgentRun()` canary, direct AgentRun manager command, or runner job created outside the Web dispatcher is only infrastructure evidence; it cannot prove that the browser path requested the correct `toolCredentials`, `toolAliases`, transient env, conversation/session/thread binding, or runtime lane. If no CLI can exercise the Web-equivalent path, improve the CLI first and keep the issue open until the Web-equivalent CLI or browser trace proves the deployed behavior.
For Cloud Web Workbench Code Agent response or trace-rendering bugs, the minimum Web-equivalent CLI proof is a fresh `hwlab-cli client agent send --wait` against the deployed public Web origin, followed by `hwlab-cli client agent trace <traceId> --render web` against the same origin. The submit proof must show the browser dispatcher family, normally `POST /v1/agent/chat`, result polling through `/v1/agent/chat/result/<traceId>`, `continuation.webEquivalent=true`, `shortConnection=true`, and explicit `sessionId` / `conversationId` / `threadId` binding when those values affect the bug. The result proof must show the final assistant text from `assistantText` or `reply.content`; placeholder status text, result summaries, terminal status messages, and AgentRun completion boilerplate are not acceptable substitutes for the assistant final response.
For persisted final-response display regressions, a fresh turn alone is not enough when the user report identifies an existing conversation, session, or trace. Re-read the original record on the deployed `v0.2` runtime with locked lane env and the correct `projectId`; the default session list project may differ from the affected Workbench project. The minimum proof is `client session list --project-id <projectId> --limit <N> --full`, `client session inspect <conversationId> --full`, and `client agent result <traceId> --full`. Passing evidence must show that list and inspect surface the same latest agent `traceId` as `lastTraceId`, the latest agent text matches the terminal result `reply.content` or equivalent final assistant text, and known fallback text such as `Code Agent 仍在处理,可以继续 steer 或等待 trace 完成。` is absent from list, inspect, and result output. When the repair is lazy-on-read, run the read path again or capture the exposed repair source/updated marker so the evidence proves persisted conversation state was repaired, not merely synthesized for one response. `client agent trace <traceId> --render web` remains required for trace-rendering bugs; for persisted conversation-display bugs it is supporting evidence unless it returns rendered assistant rows from the same original trace.
The `--render web` proof must inspect the rendered body, not only the raw event count. Passing evidence should include `body.render=web`, the shared renderer identity when exposed, `status=completed`, rendered/returned row counts, noise/omitted counts when available, at least one rendered assistant row containing the final assistant text, and an explicit absence check for known non-user boilerplate such as `AgentRun terminal status completed`, `AgentRun result is ready`, and `Code Agent 仍在处理`. If the trace API returns `status=missing`, `sourceEventCount=0`, or no rows for a historical issue trace, treat that trace as expired or unavailable; do not use it as closure evidence. Generate a fresh equivalent turn on the current v0.2 runtime and validate that trace instead.
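The explicit absence check described above can be sketched as a filter over the rendered output. `render_is_clean` is a hypothetical helper; it assumes the rendered body has been captured as text and covers only the boilerplate strings named above, not the positive check for a rendered assistant row.

```bash
# Hypothetical helper: read rendered trace output on stdin and fail if any
# known non-user boilerplate string is present.
render_is_clean() {
  body=$(cat)
  for marker in \
    'AgentRun terminal status completed' \
    'AgentRun result is ready' \
    'Code Agent 仍在处理'
  do
    if printf '%s\n' "$body" | grep -qF -- "$marker"; then
      echo "boilerplate found: $marker" >&2
      return 1
    fi
  done
  return 0
}
```

Piping the output of `hwlab-cli client agent trace <traceId> --render web` into this function gives a pass/fail result plus the offending marker on stderr, which can be pasted into the closeout evidence.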
CLI/Web-equivalent trace evidence does not replace browser UI evidence for visual, layout, copy-to-clipboard, collapsed-panel or removed-control bugs. Those require a bounded browser or DOM smoke against `http://74.48.78.17:19666/` after rollout, with assertions on the deployed page text, DOM state, or control behavior that the user reported. A local bundle smoke can support regression coverage, but the closeout still needs the deployed public endpoint unless the browser entry is unavailable and the issue comment records the blocker. Missing Playwright browser binaries or declared test dependencies are not a valid skip; install the repository-declared runner/browser or use an approved system browser executable and record that choice in the validation evidence.
The closing comment for these issues must be semantic natural language before it lists evidence: state what the user-visible problem was, what changed, where it rolled out, and what original entry was rechecked. It must include the actual command or entry path, target lane or endpoint, relevant trace/session/thread/PipelineRun/run/device ids, and the pass/fail result. If the original entry cannot be verified because rollout has not happened, credentials are unavailable, the target runtime is down, or the required CLI capability is missing, keep the issue open and record the blocker. Do not close the issue on the strength of PR merge, targeted tests, or "will be verified after rollout" wording. If an issue was closed before this real CLI/user-entry validation, reopen it and add a correction comment before continuing.
For HWLAB v0.2 Code Agent context-loss or multi-turn continuity issues, the minimum closeout is a real `hwlab-cli client agent` two-turn E2E from `G14:/root/hwlab-v02` or another approved G14 execution plane with locked runtime namespace/lane env. Submit the first turn, poll its result to completed, submit the second turn with the same explicit `conversationId`/`sessionId`/`threadId`, then capture `trace`/`inspect` evidence. Passing evidence must show the second turn used prior-turn context, and should include context attachment or run reuse labels such as `conversation-context:attached`, `agentrun:run:reused`, `agentrun:runner-job:reused`, plus the relevant run/command ids. Long verification evidence belongs in a separate `gh issue comment create --body-file` comment; lifecycle close comments stay short, as defined in `docs/reference/cli.md`.
`/health/live` revision is owned by `hwlab-cloud-api`; it can legitimately differ from the source commit for a Cloud Web-only change. Do not call that difference a failed Cloud Web rollout when `webAssets.checks.htmlOk`, `webAssets.checks.appJsOk`, CSS probes, Argo health, and `hwlab-cloud-web` Deployment readiness have passed. For Cloud Web behavior changes, the public JS asset probe or a bounded browser/DOM check is stronger evidence than cloud-api `apiRevision`.
Do not turn `v0.2` expansion governance into a stack of broad compatibility gates. The stable control points are branch, dedicated CI/CD source repo, git mirror/relay refs, GitOps branch, namespace, runtime path, Argo Application, FRP ports and generated-output ownership. Legacy DEV/D601/main preflights that block the `v0.2` lane should be removed from that lane, not patched with fallback or legacy modes. Naming, RBAC scope, cleanup policy, resource quota and rollback order are design decisions or runbook entries unless they protect a concrete high-value risk that cannot be enforced by the fixed boundaries above.
## v0.2 Worktree + PR Workflow
`v0.2` source-of-truth changes must enter through a task-scoped worktree on a feature branch and then merge through a PR, not by direct commits to `v0.2`. The generic P2/P3/P4 flow is owned by `$dad-dev`; this section only fixes the G14/v0.2-specific source route, branch and lane:
```bash
trans G14:/root/hwlab-v02 script -- 'git fetch origin v0.2 && git pull --ff-only origin v0.2 && git status --short --branch'
trans G14:/root/hwlab-v02 script -- 'git worktree add .worktree/<task> -b fix/issue<N>-<short-name> origin/v0.2'
```
The fixed repo at `/root/hwlab-v02` is not a scratch area and must not carry parallel worktree state on the `v0.2` branch itself. All worktree branches should follow the `fix/issue<N>-<short-name>` naming so PR titles and merge commits stay scannable. GitHub PR writes, merge, rollout trigger and final original-entry validation follow `$dad-dev` plus the UniDesk CLI control rules in `AGENTS.md`.
### Recovery From a Direct Commit To v0.2
A direct commit on `v0.2` (work that landed with `git commit` on the fixed repo or a checkout that bypassed the worktree) must be moved onto a feature branch and re-merged through a PR before the next `trigger-current` reads it. The recovery is bounded and audit-friendly, but it is also a `git push --force-with-lease` against the protected branch, so it is only acceptable when the direct commit is the only new content on `v0.2` since the last merged PR:
1. Confirm no parallel worktree was in flight and the commit is the only delta. Run `trans G14:/root/hwlab-v02 script -- 'git log origin/v0.2..HEAD'` and the same form with `git log HEAD..origin/v0.2`; together they must show the direct commit as the single fast-forward candidate.
2. Capture the commit identity and patch for the recovery record:
```bash
trans G14:/root/hwlab-v02 script -- 'git show <direct-commit-sha> > /tmp/v0.2-recovery.patch'
```
3. Roll the fixed repo back to the previous merged PR head. Use `git reset --hard <previous-pr-sha>`; this preserves any autostash (e.g. from a parallel `git checkout` snapshot in another worktree) on the stash list and does not touch the other worktree's working tree.
4. In the pre-existing worktree (e.g. `.worktree/<task>` on `fix/issue<N>-<short-name>`) bring the branch up to the previous PR head with `trans G14:/root/hwlab-v02/.worktree/<task> script -- 'git reset --hard <previous-pr-sha>'`, then `git cherry-pick <direct-commit-sha>` to replay the direct commit on the feature branch. If the worktree branch was already a clean clone of `origin/v0.2` at the previous PR head, the reset is a no-op.
5. Push the feature branch and force-push `v0.2` back to the rolled-back head with `--force-with-lease` (refuses to clobber a concurrent push):
```bash
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'git push -u origin fix/issue<N>-<short-name>'
trans G14:/root/hwlab-v02 script -- 'git push --force-with-lease origin v0.2'
```
6. Open the PR through UniDesk CLI, squash-merge, then `git pull --ff-only origin v0.2` to bring the fixed repo back in sync. The previous PR's merge commit will not be in the new PR's history; the new PR's diff equals the original direct commit's diff, so the PR trail still contains the exact same bytes.
7. `bun scripts/cli.ts hwlab g14 control-plane status --lane v02` will read the new merge commit; the previously-staged PipelineRun for the direct commit was created on the v0.2 head and `trigger-current` will delete + recreate it for the post-merge head, so no manual PipelineRun cleanup is required.
The recovery is auditable: the original `git show` patch and the `cherry-pick` SHA both land in the PR diff, so the issue/PR trail still contains the exact same bytes that were first committed directly. This is a one-time corrective action; recurring direct commits on `v0.2` are a workflow regression and must be called out in the relevant issue or PR.
### v0.2 Cloud Web Runtime Layout Validation
Cloud Web layout, status-panel, collapsed-control, and modal issues on `v0.2` need deployed browser evidence. Source checks and control-plane rollout are supporting evidence; they do not prove that the public `19666` page renders the fixed DOM.
Use these surfaces together:
- `trans G14:/root/hwlab-v02/.worktree/<task>/web/hwlab-cloud-web script -- 'bun run check'` for static unit/contract/layout checks and dist freshness.
- `bun scripts/cli.ts hwlab g14 control-plane status --lane v02` for runtime, Argo, public endpoint, and GitOps alignment. If `origin/v0.2` moved through a parallel PR, use `--pipeline-run` or `--source-commit` and treat same-branch supersession as context rather than failure.
- Public API probes for both `/health/live` and `/v1/live-builds`. `/health/live` proves live service health/revision, but Cloud Web build time, image tag/digest, source metadata, and actual runtime commit/revision should be read from `/v1/live-builds`.
- A bounded browser/DOM probe against `http://74.48.78.17:19666/` that asserts the deployed page state relevant to the issue.
Cloud Web frontend regressions still use the two-layer validation rule. Deterministic client behavior, such as scroll-follow state machines, Markdown/HTML escaping, shared renderer output, persisted view mapping and DOM class/attribute decisions, should be reproduced first in source-level unit or contract tests; those tests may mock DOM nodes, API responses or renderer input because they are the fast regression guard. The deployed browser or Web-equivalent CLI layer must not mock the user entry, and should prove only the live integration that unit tests cannot prove: the public bundle is deployed, the real page dispatch path creates the expected DOM state, and the user-visible control behaves on the target lane. Do not move every frontend bug into CLI/browser smoke just because it is user-facing.
Cloud Web message Markdown must go through a single shared React renderer component. Do not maintain a hand-written Markdown parser or a `dangerouslySetInnerHTML` message path for normal chat/workbench messages. The shared renderer's fast tests should cover at least GFM table rendering, inline/fenced code, emphasis/strong text and raw HTML escaping. Browser closeout should assert rendered DOM shape, such as `table`/`code`/`strong` counts and absence of injected `script` nodes or executed script flags, instead of comparing the full rendered HTML string.
For Workbench status/build panels, the minimum DOM proof should check the topbar chip, absence of full status cards in the right sidebar, hidden collapsed lists actually absent from layout, bounded scroll ownership on the right content area, and a details dialog that contains environment image metadata, actual live commit/revision, and source/build-time fields when available.
`/v1/live-builds.latest` is global across services and can legitimately point at `hwlab-cloud-api` when API rolled after Web. Inspect the `hwlab-cloud-web` service row before deciding whether a Web build field is missing or stale.
For `#workspace` or other scroll-owner fixes, closeout evidence should include numeric scroll metrics before and after the interaction: `scrollHeight`, `clientHeight`, `scrollTop`, `distanceFromBottom`, computed `overflowY`, and the page's follow/detached state attribute when one exists. Passing evidence for follow-tail behavior must show that new content keeps the view at bottom while already following, manual upward scroll detaches, and scrolling back to the bottom re-attaches. If the issue is specifically about final assistant response persistence or trace rendering, the browser/CLI proof must wait for the final agent/trace result as described above. If the issue is a frontend-only renderer or scroll-container regression and the same component/path renders user and agent messages, a real `#command-input` submission that creates a long user message is sufficient to exercise the deployed renderer/scroll path; do not block closure on an unrelated slow external model turn.
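The scroll metrics above are related by `distanceFromBottom = scrollHeight - clientHeight - scrollTop`. A minimal sketch of that arithmetic is shown below; the helper names are hypothetical, the values are assumed to be rounded to integers, and the 4 px default tolerance is an illustrative assumption rather than a value from the Cloud Web source.

```bash
# distance_from_bottom <scrollHeight> <clientHeight> <scrollTop>
distance_from_bottom() {
  echo $(( $1 - $2 - $3 ))
}

# is_following <scrollHeight> <clientHeight> <scrollTop> [tolerance_px]
# Succeeds when the view is within the tolerance of the bottom edge.
is_following() {
  tolerance=${4:-4}   # assumed default, not taken from the source
  [ "$(distance_from_bottom "$1" "$2" "$3")" -le "$tolerance" ]
}
```

For example, a container with `scrollHeight=2000`, `clientHeight=600` and `scrollTop=1400` is at the bottom, while the same container at `scrollTop=900` is 500 px away and should report the detached state.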
Generic layout smoke can be used only when it is bounded in the current transport. A Playwright smoke that runs through `trans` and produces no output for longer than the SSH idle timeout, leaves preview/browser processes behind, or never writes an exit/report file is not closure evidence. Run it as an async remote job with explicit report and cleanup, or use a smaller issue-specific DOM probe that emits one JSON result and exits. The stable remote-probe shape is: create a fresh Workbench session through the UI when prior session state may be in a failed state, start the browser script as a target-side job, write a PID/log/result JSON/screenshot on G14, poll those files with short `trans` queries, and cancel any running live turn through the UI before exit when the probe submitted a real prompt. Missing Playwright-managed browser binaries are not a skip; use an approved system browser executable on G14 or install the declared browser dependency, and record the choice. When staging a Node probe outside the repo workspace, make package resolution explicit by running from the workspace or importing packages through the workspace's `node_modules`; do not treat `MODULE_NOT_FOUND` from a `/tmp` script as an application failure.
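The PID/log/result part of that remote-probe shape can be sketched with two small functions. This is a minimal sketch under stated assumptions: `start_probe`, `probe_state` and the file names `pid`, `log` and `exit` are hypothetical, and session creation, screenshots and UI cancellation are outside its scope.

```bash
# start_probe <dir> <command...>: run the command in the background and record
# pid, log and exit code under <dir> so short polls can read them later.
start_probe() {
  dir=$1
  shift
  mkdir -p "$dir"
  rm -f "$dir/exit" "$dir/exit.tmp"
  (
    rc=0
    "$@" >"$dir/log" 2>&1 || rc=$?
    # Write atomically so a poll never sees a half-written exit file.
    echo "$rc" >"$dir/exit.tmp"
    mv "$dir/exit.tmp" "$dir/exit"
  ) &
  echo "$!" >"$dir/pid"
}

# probe_state <dir>: print running, passed, failed, or lost.
probe_state() {
  dir=$1
  if [ ! -f "$dir/exit" ] && [ -f "$dir/pid" ] && kill -0 "$(cat "$dir/pid")" 2>/dev/null; then
    echo "running"
    return 0
  fi
  if [ ! -f "$dir/exit" ]; then
    echo "lost"      # no exit file and no live process: not closure evidence
  elif [ "$(cat "$dir/exit")" = "0" ]; then
    echo "passed"
  else
    echo "failed"
  fi
}
```

Each poll is then a short `trans` query that calls `probe_state` and tails the log. When the job is started from an SSH session it also has to be detached from that session, for example with `nohup` or `setsid`, so it survives the transport closing.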
### v0.2 Cloud Web Button/JS Sync Rule (HWLAB #748)
When a `v0.2` Cloud Web fix removes a button from `index.html` or a field from the `el` literal in `web/hwlab-cloud-web/app.ts`, every `el.<removed-field>.addEventListener(...)` (or `.requestSubmit()` / `.showModal()` / etc.) binding must be removed from the matching `init*` function in the same commit. The static `web:check` does not catch this orphan listener class because the TypeScript build is `Bun.build` transpile-only (no `tsc --noEmit`), and the runtime crash only surfaces as `Cannot read properties of undefined (reading 'addEventListener')` on first init. The minimal closeout checks for the v0.2 lane are:
```bash
# 1. Web assets rebuild and the orphan is gone from the dist
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'cd web/hwlab-cloud-web && bun run build'
trans G14:/root/hwlab-v02/.worktree/<task> script -- "grep -c '<removed-field>' web/hwlab-cloud-web/dist/app.js" # must be 0
trans G14:/root/hwlab-v02/.worktree/<task> script -- "grep -c 'id=\"<removed-id>\"' web/hwlab-cloud-web/index.html" # must be 0
# 2. Live 19666/19667 confirms the deployed bundle is the new build
curl -fsS http://74.48.78.17:19666/ | grep -c '<removed-id>' # must be 0
curl -fsS http://74.48.78.17:19666/app.js | grep -c '<removed-field>' # must be 0
bun scripts/cli.ts hwlab g14 control-plane status --lane v02 # webAssets.checks.appJsOk = true, sourceCommit = merge commit
```
While the PR is open, the author can also run a one-liner to surface any orphan `el.<field>.addEventListener` whose field is not declared in the `el` literal of `app.ts`:
```bash
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'awk "/^const el = /,/^};/" web/hwlab-cloud-web/app.ts | tr -d ",:" | awk "{print \$1}" | grep -E "^[a-zA-Z]" | sort -u > /tmp/el-fields.txt; grep -nEo "el\\.([A-Za-z_$][A-Za-z0-9_$]*)\\.addEventListener" web/hwlab-cloud-web/*.ts | while read m; do field=$(echo "$m" | sed -E "s/.*el\\.([A-Za-z_$][A-Za-z0-9_$]*)\\.addEventListener.*/\\1/"); if ! grep -q "^$field$" /tmp/el-fields.txt; then echo "ORPHAN: el.$field.addEventListener"; fi; done'
```
Document the explicit `grep` / curl evidence in the issue closeout comment. Tightening the `el` literal with proper TypeScript types is tracked separately and must not be done as part of a runtime fix PR.
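The orphan-listener check can also be written as a standalone function that is easier to read and to try against fixture files. This is a sketch, not the repository's tooling: `find_orphan_listeners` is a hypothetical name, and it assumes the `el` literal starts with `const el = {`, ends with `};`, and declares one `name: value,` or shorthand `name,` entry per line.

```bash
# find_orphan_listeners <file-with-el-literal> <files-to-scan...>
# Prints one "ORPHAN: el.<field>.addEventListener" line per binding whose
# field is not declared in the el literal (layout assumptions as stated above).
find_orphan_listeners() {
  decl_file=$1
  shift
  # Collect declared field names from the el literal.
  declared=$(sed -n '/^const el = {/,/^};/p' "$decl_file" \
    | sed -nE 's/^[[:space:]]*([A-Za-z_$][A-Za-z0-9_$]*)[[:space:]]*[:,].*$/\1/p' \
    | sort -u)
  # Collect every el.<field>.addEventListener use and report undeclared fields.
  grep -hoE 'el\.[A-Za-z_$][A-Za-z0-9_$]*\.addEventListener' "$@" \
    | sed -E 's/^el\.([^.]*)\.addEventListener$/\1/' \
    | sort -u \
    | while IFS= read -r field; do
        printf '%s\n' "$declared" | grep -qxF -- "$field" \
          || echo "ORPHAN: el.$field.addEventListener"
      done
}
```

From the worktree it could be called as `find_orphan_listeners web/hwlab-cloud-web/app.ts web/hwlab-cloud-web/*.ts`. Empty output means no orphan binding was found under the stated layout assumptions; it does not cover `.requestSubmit()` or `.showModal()` bindings.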
## Node-Local VPN Proxy
G14 has a node-local VPN/proxy stack for infrastructure bootstrap and recovery downloads:
- Primary mixed HTTP/SOCKS proxy: `127.0.0.1:10808`.
- Backup Hysteria2 HTTP proxy: `127.0.0.1:11809`.
- Backup Hysteria2 SOCKS5 proxy: `127.0.0.1:11808`.
- Operator-only local details remain on G14 under `/root/docs/vpn-proxy-ops.md`; subscription URLs, node credentials and GUI database contents must not be copied into the UniDesk repository.
The G14 host persists this proxy configuration in these local files:
- `/etc/profile.d/unidesk-g14-proxy.sh` exports `HTTP_PROXY`, `HTTPS_PROXY`, `ALL_PROXY`, lowercase aliases and `NO_PROXY` for new login shells. Set `UNIDESK_G14_DISABLE_PROXY=1` before shell startup to opt out.
- `/root/.npmrc` pins npm `proxy`, `https-proxy`, `noproxy` and retry settings for root-side bootstrap commands.
- `/root/.gitconfig` pins root Git HTTP/HTTPS proxy settings.
- `/root/.docker/config.json` pins Docker client proxy settings for commands and build contexts that honor Docker client proxy configuration.
- `/etc/systemd/system/docker.service.d/proxy.conf` pins Docker daemon pull proxy settings. Updating this drop-in requires `systemctl daemon-reload` and a Docker restart before the active daemon sees the new `NO_PROXY`; do not restart Docker while G14 provider-gateway, k3s bootstrap or image builds are in flight unless that interruption is intentional.
The `NO_PROXY` list must include localhost, the main server, private LAN ranges, k3s pod/service CIDRs, Kubernetes service domains and the loopback registry so that k3s, `127.0.0.1:5000`, Kubernetes API access and UniDesk control paths do not route through the VPN proxy.
The primary proxy can be used for G14 target-side image bootstrap when Docker Hub, npm, GitHub or Playwright downloads are unreliable through direct network or provider-gateway WS egress. For Docker build steps that point at the proxy on `127.0.0.1`, build with host networking so the build container can reach the host loopback proxy:
```bash
docker build --network host \
--build-arg HTTP_PROXY=http://127.0.0.1:10808 \
--build-arg HTTPS_PROXY=http://127.0.0.1:10808 \
--build-arg ALL_PROXY=socks5h://127.0.0.1:10808 \
--build-arg http_proxy=http://127.0.0.1:10808 \
--build-arg https_proxy=http://127.0.0.1:10808 \
--build-arg all_proxy=socks5h://127.0.0.1:10808 \
...
```
`127.0.0.1:10808` is a G14 host loopback endpoint. Inside an ordinary k3s Pod, `127.0.0.1` is the Pod network namespace, not the node proxy. Do not set long-lived workload proxy env to `http://127.0.0.1:10808` unless that workload is intentionally `hostNetwork` and the port conflict/DEV-PROD blast radius has been reviewed. Temporary hostNetwork debug Pods may use the node-local proxy only for bounded bootstrap proof or cache prewarm; they must not become GitOps desired state just to make external downloads work.
The backup proxy uses `HTTP_PROXY=http://127.0.0.1:11809`, `HTTPS_PROXY=http://127.0.0.1:11809` and `ALL_PROXY=socks5h://127.0.0.1:11808`.
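As a minimal sketch of how the primary and backup endpoints above map onto the uppercase and lowercase proxy variables, the helper below builds the environment map and the matching `--build-arg` list. The endpoint values come from this section; `proxyEnv`, `dockerBuildArgs` and the `ProxyProfile` type are hypothetical names used only for illustration and are not part of the UniDesk repository.

```typescript
// Hypothetical helper: only the endpoint values are taken from this document.
type ProxyProfile = "primary" | "backup";

const PROXY_ENDPOINTS: Record<ProxyProfile, { http: string; socks: string }> = {
  // Primary mixed HTTP/SOCKS proxy shares one port.
  primary: { http: "http://127.0.0.1:10808", socks: "socks5h://127.0.0.1:10808" },
  // Backup Hysteria2 proxy splits HTTP (11809) and SOCKS5 (11808).
  backup: { http: "http://127.0.0.1:11809", socks: "socks5h://127.0.0.1:11808" },
};

// Returns HTTP_PROXY / HTTPS_PROXY / ALL_PROXY plus their lowercase aliases.
function proxyEnv(profile: ProxyProfile): Record<string, string> {
  const { http, socks } = PROXY_ENDPOINTS[profile];
  const upper: Record<string, string> = {
    HTTP_PROXY: http,
    HTTPS_PROXY: http,
    ALL_PROXY: socks,
  };
  const env: Record<string, string> = { ...upper };
  for (const key of Object.keys(upper)) {
    env[key.toLowerCase()] = upper[key];
  }
  return env;
}

// Flattens the env map into `--build-arg KEY=VALUE` pairs for `docker build`.
function dockerBuildArgs(profile: ProxyProfile): string[] {
  const args: string[] = [];
  const env = proxyEnv(profile);
  for (const key of Object.keys(env)) {
    args.push("--build-arg", `${key}=${env[key]}`);
  }
  return args;
}
```

The result is only meaningful on the G14 host or in a `--network host` build, because the endpoints are host loopback addresses.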
This proxy is not a replacement for UniDesk runtime egress. k3s workloads such as Code Queue must still use the cataloged `g14-provider-egress-proxy` Kubernetes Service and `g14-tcp-egress-gateway` for normal runtime access to PostgreSQL, OA Event Flow and external APIs. The node-local VPN proxy is allowed only for G14 host-side bootstrap, image build, cache prewarm or recovery steps, and those steps should record the proxy choice in issue or deployment evidence.
## v0.2 device-pod cloud-api architecture
`v0.2` device-pod integration is the cloud-api → executor → D601 Windows `device-host-cli.mjs` chain under `internal/cloud/access-control.ts`, `cmd/hwlab-device-pod/main.ts` and the host-side `F:\Work\ConStart\tools\device-host-cli.mjs`. PR #765 (selector cheat sheet + fail-fast) and PR #778 (output.text propagation + evidence selector + read-only sub-action `--reason` exemption) are the two anchor PRs; PR #779 tracks the still-open host-side ops work. Earlier work used raw `MUTATING_INTENTS.has(intent) && !reason` and a single-pass `textOr(output.text, …)` extractor; both are obsolete and must not be re-introduced.
### Intent / sub-action / reason matrix
`DEVICE_JOB_INTENTS` (cloud-api) enumerates the full supported surface; `MUTATING_INTENTS` is the strict subset whose default sub-action is mutating. Only `workspace.build` and `debug.download` carry a structured sub-action (`start` / `status` / `output` / `wait` / `cancel` / `evidence`) and are listed in `DEVICE_JOB_ACTIONABLE_INTENTS`; for those two, `_deviceJobRequiresReason(intent, args, reason)` returns `false` when `reason` is provided OR when `args.action` is in `DEVICE_JOB_READ_ONLY_SUB_ACTIONS`. Any other mutating intent (`workspace.apply-patch`, `workspace.put`, `debug.reset`, `io.uart.write`, `io.uart.jsonrpc`, `io.uart.read-after-launch-flash`, etc.) still always requires a non-empty `reason`. Adding a new actionable mutating intent requires extending both `MUTATING_INTENTS` and `DEVICE_JOB_ACTIONABLE_INTENTS` together; adding a new read-only sub-action requires only the `DEVICE_JOB_READ_ONLY_SUB_ACTIONS` set.
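The reason gate described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code in `internal/cloud/access-control.ts`: the set contents are limited to the intents named in this section, the function name drops the leading underscore to mark it as a stand-in, and the membership of `DEVICE_JOB_READ_ONLY_SUB_ACTIONS` is an assumption (only `status` is confirmed by the closeout checks).

```typescript
// Illustrative subset: only intents named in this section are listed.
const MUTATING_INTENTS = new Set<string>([
  "workspace.build",
  "debug.download",
  "workspace.apply-patch",
  "workspace.put",
  "debug.reset",
  "io.uart.write",
  "io.uart.jsonrpc",
  "io.uart.read-after-launch-flash",
]);

// The two intents that carry a structured sub-action.
const DEVICE_JOB_ACTIONABLE_INTENTS = new Set<string>(["workspace.build", "debug.download"]);

// Assumption: `status` is confirmed read-only; `output` and `wait` are guesses.
const DEVICE_JOB_READ_ONLY_SUB_ACTIONS = new Set<string>(["status", "output", "wait"]);

function deviceJobRequiresReason(
  intent: string,
  args: { action?: string },
  reason: string,
): boolean {
  // Non-mutating intents never need a reason.
  if (!MUTATING_INTENTS.has(intent)) return false;
  // A non-empty reason always satisfies the gate.
  if (reason.trim() !== "") return false;
  // Read-only sub-actions are exempt, but only on actionable intents.
  if (
    DEVICE_JOB_ACTIONABLE_INTENTS.has(intent) &&
    args.action !== undefined &&
    DEVICE_JOB_READ_ONLY_SUB_ACTIONS.has(args.action)
  ) {
    return false;
  }
  return true;
}
```

Under these assumptions, `workspace.build` with `action: "status"` passes without a reason, while `workspace.put` with the same `action` is still rejected because it is not an actionable intent.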
`workspace.evidence` and `debug.evidence` are first-class intents, not sub-actions of `workspace.build`. Code Agent sees `<pod>:workspace:/ build evidence [jobId]` and `<pod>:debug-probe download evidence [jobId]`; cloud-api maps each to a new device-pod executor job, the executor maps it to `deviceHostArgs = ["workspace", "evidence", kind, ...]`, and the host-side `device-host-cli.mjs` dispatches via `if (command === "evidence")` at the top level of `main()` (not nested under `if (command === "build")`). `workspace.evidence kind=build` maps to a `keil-build` job and `debug.evidence kind=download` maps to a `keil-download` job; the `kind` sub-argument must be `build` or `download`, and the optional `jobId` selects a specific past job.
### Output text propagation chain
`body.output.text` flows through three layers in order; each layer tries more fields and only falls back when earlier sources are empty:
1. **host `device-host-cli.mjs`** returns a JSON envelope that already contains `stdout` / `stderr` / `summary` / `logTail` / `buildSummary` for build/download ops; `workspace.ls` / `workspace.cat` / `workspace.rg` are inline and include a JSON body.
2. **executor `cmd/hwlab-device-pod/main.ts` `gatewayDispatchText(result, dispatch)`** walks `result.stdout` → `result.stderr` → `dispatch.stdout` → `dispatch.stderr` → `result.evidence.{text,logTail,summary}` → `dispatch.message` (only when `dispatchStatus === "completed"`) → `dispatch.summary` → `result.summary` → `dispatch.buildSummary` → `result.text` → `JSON.stringify(result)`. The executor stores this as `job.output` and exposes it via `boundedOutput()` which clips at `DEVICE_JOB_OUTPUT_MAX_BYTES` (12000) and drops `executor` / nested `output` when truncated.
3. **cloud-api `executorOutputPayload(body, httpStatus)`** wraps what the executor sent and exposes `body.text` / `body.output` / `body.bytes` / `body.truncation` to the `/v1/device-pods/{id}/jobs/{jobId}/output` endpoint. `text` is `firstString(body?.text, output.text, nestedOutput.text, output.summary, nestedOutput.summary, evidence.text, evidence.logTail, evidence.summary)`; the matched key is recorded by caller convention. `executor` payload stays on the response so callers can read `dispatch.exitCode` / `dispatch.message` / `dispatch.stdout` even when text is empty.
The `evidence.*` and `*summary` lookups exist so a dispatcher that already includes host `logTail` / `buildSummary` becomes visible without a separate `bootsharp` re-run on the Code Agent side. The `summary` lookups also keep error messages (`dispatch.message`) in the response even when `dispatchStatus` is not `completed`; this is the reason `body.error.message` always has something to show for failed host dispatches.
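The cloud-api lookup order in step 3 can be illustrated with a small first-non-empty-string helper. This is a sketch under stated assumptions, not the actual `executorOutputPayload`: it assumes `nestedOutput` is `output.output`, that `evidence` hangs off `output`, and that blank strings count as empty; `asObject` and `resolveText` are hypothetical names.

```typescript
type Json = Record<string, unknown>;

// Returns the first candidate that is a non-blank string, or "" if none match.
function firstString(...candidates: unknown[]): string {
  for (const candidate of candidates) {
    if (typeof candidate === "string" && candidate.trim() !== "") return candidate;
  }
  return "";
}

// Narrows an unknown value to a plain object, defaulting to an empty one.
function asObject(value: unknown): Json {
  return typeof value === "object" && value !== null && !Array.isArray(value)
    ? (value as Json)
    : {};
}

// Assumed layout: body.output, body.output.output and body.output.evidence.
function resolveText(body: Json): string {
  const output = asObject(body.output);
  const nestedOutput = asObject(output.output);
  const evidence = asObject(output.evidence);
  return firstString(
    body.text,
    output.text,
    nestedOutput.text,
    output.summary,
    nestedOutput.summary,
    evidence.text,
    evidence.logTail,
    evidence.summary,
  );
}
```

Because the candidates are evaluated in a fixed order, a populated `nestedOutput.text` wins over `output.summary`, and the `evidence.*` fields only surface when every text and summary field is empty.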
### Cloud-api vs host-side boundary
`/root/hwlab-v02/skills/device-pod-cli/assets/device-host-cli.mjs` is the v0.2-shipped copy of the host-side CLI. The actual hardware host runs a separate `F:\Work\ConStart\tools\device-host-cli.mjs` that is **not** a deployment of the v0.2 repo; it is a D601 ops-side copy that must be synced manually when the v0.2 repo changes host-side behavior. The two-step contract is:
- v0.2 cloud-api / executor changes are valid once `PipelineRun Succeeded` + `git mirror flush` complete; runtime revision is `commit.id` from `/health/live` and source commit can be forced to match via the next `trigger-current`.
- v0.2 host-side `device-host-cli.mjs` changes are NOT visible until someone replaces `F:\Work\ConStart\tools\device-host-cli.mjs` on the D601 Windows host; cloud-api `body.text` will faithfully surface the "unsupported command" JSON error from the stale host binary, which proves the cloud-api propagation chain works but the host side is stale.
A live `workspace.evidence` / `debug.evidence` / `download evidence` selector that returns the host `logTail` end-to-end therefore requires both (a) the v0.2 PR merged and rolled, and (b) the D601 host binary replaced; missing either half is a known gap tracked in #779.
### v0.2 device-pod closeout checks
Device-pod fixes still follow `$dad-dev` and the `## v0.2 Worktree + PR Workflow` route above. The device-pod-specific closeout is the three-layer runtime matrix below; keep these checks because they prove the cloud-api → executor → D601 host chain, while generic PR/CI/CD and worktree mechanics stay in `$dad-dev`.
```bash
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'cd tools && bun test device-pod-cli.test.ts'
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'cd cmd/hwlab-device-pod && bun test main.test.ts'
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'cd internal/cloud && bun test access-control.test.ts'
trans G14:/root/hwlab-v02/.worktree/<task> script -- 'node --check skills/device-pod-cli/assets/device-host-cli.mjs'
```
Treat `access-control.test.ts` workbench failures as pre-existing on the v0.2 base unless the new test list explicitly covers them. After PR merge and `trigger-current --lane v02 --confirm`, the live `http://74.48.78.17:19667/` CLI acceptance check must hit all three layers:
1. `body.output.text` non-empty for at least one happy-path intent (`workspace.ls` / `workspace.cat` are the cheapest ones to verify propagation without needing a real D601 build).
2. `workspace.evidence kind=build` / `kind=download` accepted by cloud-api, dispatched to executor, executor `blocker === null` and `job.reason === ""`.
3. `<mutating intent> action=status` accepted without `--reason` while the same intent with `action=start` is still rejected with `device_job_reason_required`.
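The three checks above can be collected into one pass/fail evaluation when writing the closeout comment. The sketch below is purely illustrative: `CloseoutObservation` is a hypothetical record that an operator fills in by hand from the live responses, and none of its field names exist in the HWLAB code base.

```typescript
// Hypothetical record of what the operator observed on the live lane.
interface CloseoutObservation {
  // body.output.text from a happy-path intent such as workspace.ls.
  happyPathText: string;
  // Executor job fields observed for the evidence intent.
  evidenceJob: { blocker: string | null; reason: string };
  // Whether `<mutating intent> action=status` was accepted without --reason.
  statusWithoutReasonAccepted: boolean;
  // Error code returned for `action=start` without --reason, or null if accepted.
  startWithoutReasonErrorCode: string | null;
}

// Returns one message per failed layer; an empty array means all three passed.
function closeoutFailures(obs: CloseoutObservation): string[] {
  const failures: string[] = [];
  if (obs.happyPathText.trim() === "") {
    failures.push("layer 1: body.output.text is empty");
  }
  if (obs.evidenceJob.blocker !== null || obs.evidenceJob.reason !== "") {
    failures.push("layer 2: evidence job has a blocker or a non-empty reason");
  }
  if (!obs.statusWithoutReasonAccepted) {
    failures.push("layer 3: action=status was rejected without --reason");
  }
  if (obs.startWithoutReasonErrorCode !== "device_job_reason_required") {
    failures.push("layer 3: action=start was not rejected with device_job_reason_required");
  }
  return failures;
}
```

An empty result corresponds to all three layers passing; any returned message names the layer that still needs evidence.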
There is no separate `device-pod` doc; this section is the single authoritative reference for the architecture, and the AGENTS.md index points to it.