docs: add unidesk webdev skill
+3
-32
@@ -22,11 +22,7 @@ CI/CD、GitOps、rollout、artifact 发布、PR 合并后的 runtime lane 滚动
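The G14/D601 v03 bootstrap admin password is part of the HWLAB runtime Secret lifecycle and must converge on the `bootstrapAdmin` declaration in `config/hwlab-node-lanes.yaml` and the controlled `hwlab nodes secret status|ensure --node <node> --lane v03 --name hwlab-v03-bootstrap-admin` CLI. Plaintext may exist only in the Git-ignored, owner-only `.state/secrets/...` sourceRef file; the CLI converts the plaintext locally into a HWLAB-compatible password hash, syncs only `password-hash` to the runtime plane, and discloses in its output only sourceRef, sourceKey, target Secret/key, presence, byte count, fingerprint, mutation and follow-up commands. `secret ensure --force` is reserved for controlled recovery scenarios that explicitly need to re-seed the bootstrap admin hash from the YAML sourceRef and restart Cloud API; the default ensure does not force a re-seed. Do not let manually generated hashes, hand-written k8s Secrets or native `kubectl rollout` settle in as long-term entry points.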
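`hwlab nodes web-probe run --node <node> --lane <lane> [--url <public-origin>]` is the controlled commander entry point for the HWLAB Cloud Web DOM probe. It resolves the target workspace, public URL and bootstrap admin sourceRef from `config/hwlab-node-lanes.yaml`, reads the owner-only plaintext on the UniDesk commander side, and injects it into the target workspace's `scripts/web-live-dom-probe.mjs` only through one-shot stdin/env; stdout discloses only sourceRef, sourceKey, presence, fingerprint, injection method, DOM summary and artifact hash, and never prints the password. When the sourceRef or source file is missing it should return a structured `web_login_secret_missing`, and must not fall back to a historical default password or require copying the secret to the D601/G14 target host. Code Agent Trace real-time acceptance passes `--trace-sample-count <N>` and `--trace-sample-interval-ms <ms>` through to the target helper, which outputs the agent status, trace presence/status, row count, empty label and latest row preview for each sample to prove progressive fetching while the run is in progress; a terminal-state screenshot cannot substitute for this sampling. By default the CLI automatically extends the short-connection `commandTimeout` according to the trace sampling window, terminal wait and page timeout; an explicit `--command-timeout-seconds` may only extend that budget and should not squeeze a 90-180s sampling run back to 60s. When the automatic budget exceeds the short-connection safe window, `run` must switch to helper `start/status` asynchronous polling and mark `transportMode=async-start-status`, job id, poll count, timedOut and the final report in its output. When you need custom Playwright route/intercept, in-flight DOM reads or dedicated screenshots, use `hwlab nodes web-probe script --node <node> --lane <lane> <<'JS' ... JS`; the CLI handles credential resolution from the same sourceRef, establishes `hwlab_session` through `/auth/login`, injects the authenticated `browser/context/page/baseUrl`, and summarizes artifact path/hash; custom scripts must not read or print Web login credentials themselves. The managed login in `web-probe script` first makes short retries against the same-origin `/auth/login`; if it still has no `hwlab_session`, it automatically falls back to the current Cloud Web login form and submits the same credentials through the browser. `probe.auth` outputs only method, origin, loginPath, status, attempts, retryCount, fallbackUsed, fallback, retryable, transientObserved, fingerprint, commanderAction and a redacted errorSummary, and never prints the password, cookie or a copyable session value.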
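The timeout-budget rule for `web-probe run` can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the CLI's actual implementation: the function name `planCommandTimeout`, the field names, the additive formula, the 60-second `SAFE_WINDOW_MS` and the `short-connection` label are all hypothetical; only `async-start-status` is taken from the text.

```typescript
// Hypothetical sketch; every name and constant here is an assumption for illustration.
interface ProbeTiming {
  traceSampleCount: number;        // value of --trace-sample-count
  traceSampleIntervalMs: number;   // value of --trace-sample-interval-ms
  terminalWaitMs: number;          // assumed terminal-wait budget
  pageTimeoutMs: number;           // assumed page timeout
  explicitTimeoutSeconds?: number; // --command-timeout-seconds, may only extend
}

interface TimeoutPlan {
  commandTimeoutMs: number;
  transportMode: "short-connection" | "async-start-status";
}

// Assumed safe window for a single short connection.
const SAFE_WINDOW_MS = 60_000;

function planCommandTimeout(t: ProbeTiming): TimeoutPlan {
  const samplingWindowMs = t.traceSampleCount * t.traceSampleIntervalMs;
  const autoBudgetMs = samplingWindowMs + t.terminalWaitMs + t.pageTimeoutMs;
  const explicitMs = (t.explicitTimeoutSeconds ?? 0) * 1000;
  // An explicit value can lengthen the budget but never shorten it.
  const commandTimeoutMs = Math.max(autoBudgetMs, explicitMs);
  // Past the safe window, fall over to start/status polling.
  const transportMode =
    commandTimeoutMs > SAFE_WINDOW_MS ? "async-start-status" : "short-connection";
  return { commandTimeoutMs, transportMode };
}
```

Taking the maximum of the two budgets is one simple way to make an explicit flag "extend only"; the real CLI may weigh the inputs differently.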
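The default `goto('/workbench')` in `web-probe script` is a stable navigation boundary: it first reuses the current page, retries a bounded number of times on a fresh page after failure, and waits for the base workbench DOM (by default `#workspace` and `#command-input`) to become visible. For explicit control use the injected `gotoStable(target, { selectors, activeSelector, attempts, readinessTimeoutMs })`, `waitWorkbenchReady({ selectors })`, `waitForReady({ selectors })`, `gotoRaw()` and `getPage()`. A stabilization failure must be disclosed with low noise in `probe.readiness`: attempt, stage, selector, whether a `/v1` API request was observed, an API failure summary and the failure screenshot artifact. The classification values are fixed as `browser-load-jitter`, `selector-timeout`, `api-not-sent` and `api-response-failed`, so that "page not ready / request never sent" and "backend response failed" are not lumped into the same selector timeout. The runner does not navigate the same page before the user script runs, so the script can still install `page.route` or a context route first; if a retry switches to a fresh page, subsequent script code should obtain the current page from the return value of `gotoStable()` or from `getPage()`.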
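One way to keep the four `probe.readiness` classification values distinct is sketched below. The four string values come from the text; the `ReadinessObservation` shape, the function name and the order in which the checks are applied are assumptions made for illustration, and the real runner may decide differently.

```typescript
// Classification values as fixed by the text.
type ReadinessFailureKind =
  | "browser-load-jitter"
  | "selector-timeout"
  | "api-not-sent"
  | "api-response-failed";

// Hypothetical observation record; field names are assumptions.
interface ReadinessObservation {
  pageLoaded: boolean;         // navigation produced a loaded document
  apiRequestObserved: boolean; // a /v1 API request was seen
  apiFailure?: string;         // summary of a failed API response, if any
}

// Assumed precedence: load problems first, then backend failures,
// then a request that was never sent, and only then a plain selector timeout.
function classifyReadinessFailure(o: ReadinessObservation): ReadinessFailureKind {
  if (!o.pageLoaded) return "browser-load-jitter";
  if (o.apiFailure !== undefined) return "api-response-failed";
  if (!o.apiRequestObserved) return "api-not-sent";
  return "selector-timeout";
}
```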
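The debugging helpers of `web-probe script` must cover common Workbench probe actions: `fetchJson(path)` requests JSON with same-origin cookies in the logged-in page context and on failure returns `{ ok:false, path, status, error }` instead of swallowing the evidence; `collectText(selector)` returns the selector match count and a text summary; `safeEvaluate(fn, arg)` always applies Playwright's single-argument rule; `screenshotOnError(name, fn)` writes `failure.png` when a user assertion throws; `summarizeWorkspace()` and `summarizeConversation()` output only session/message summaries and never print cookies, passwords or tokens. Workspace/conversation/trace sampling must preserve project authority: summaries output projectId, and trace requests prefer the conversation/workspace projectId; on `trace_project_mismatch` the output includes `requestProjectId`, `requestScoped`, `errorCode` and HTTP status/bodyPreview/degradedReason, and a degraded retry without projectId is allowed, but it must not degenerate into a context-free `Failed to fetch`. Reusable scripts should prefer `--script-file <path>`; use a stdin heredoc only for one-off probes. Playwright `page.evaluate` accepts only one serializable argument; when multiple values are needed, write `page.evaluate(({ a, b }) => ..., { a, b })` or use `safeEvaluate(fn, { a, b })`. When a script throws or returns `{ ok:false }`, the top level of `probe` must retain `failureKind=script-api-misuse|assertion-failed|navigation-failed|auth-failed|browser-failed`, `errorMessage`, `scriptSha256`, `runDir`, `lastUrl` and `lastScreenshot`; the default failure screenshot file name is `failure.png`, and the caller's script source is saved in the same `runDir` for review.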
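The single-argument rule behind `safeEvaluate` can be illustrated with a small wrapper. This is a sketch under assumptions, not the injected helper itself: `PageLike` is a stand-in for the part of Playwright's `Page` used here, the explicit `page` parameter is added so the example is self-contained (the injected helper is described as `safeEvaluate(fn, arg)`), and mapping extra arguments to a `script-api-misuse` error is a guess at how misuse might be reported.

```typescript
// Stand-in for the slice of Playwright's Page used here (an assumption,
// kept minimal so the sketch runs without Playwright installed).
interface PageLike {
  evaluate<R, A>(fn: (arg: A) => R | Promise<R>, arg: A): Promise<R>;
}

// Hypothetical wrapper: forwards exactly one argument and rejects
// any attempt to pass more, instead of silently dropping them.
async function safeEvaluate<R, A>(
  page: PageLike,
  fn: (arg: A) => R | Promise<R>,
  arg: A,
  ...extra: unknown[]
): Promise<R> {
  if (extra.length > 0) {
    throw new Error("script-api-misuse: pass multiple values as one object, e.g. { a, b }");
  }
  return page.evaluate(fn, arg);
}
```

A call then bundles its inputs into one object, for example `safeEvaluate(page, ({ a, b }) => a + b, { a: 2, b: 3 })`.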
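`hwlab nodes web-probe run|script --node <node> --lane <lane>` is the controlled entry point for HWLAB Cloud Web online DOM/Playwright acceptance; the CLI resolves the workspace, public URL and bootstrap admin sourceRef from YAML and outputs only the redacted credential status, artifact path/hash, readiness and failure classification. Web development specifics, fake-server Playwright, fixture redaction, `web-probe script` helpers, screenshots and the Workbench/Performance judgment criteria are all covered in `$unidesk-webdev`; this CLI reference no longer maintains a second operating surface.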
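`hwlab nodes control-plane infra plan|status|apply --node D601 --lane v03` is the YAML-driven entry point for the D601 HWLAB v03 node-local CI/CD and git-mirror prerequisite control plane; the configuration source of truth is `config/hwlab-node-control-plane.yaml`. `plan` is a read-only view of the YAML target and the control-plane objects that would be rendered; `status` is a read-only observation of D601 Tekton, the CI namespace, git-mirror, Argo, the node-local registry and tools image readiness; `apply --dry-run` outputs only a manifest summary; `apply --confirm` converges only the D601 control-plane bootstrap objects and does not trigger a HWLAB runtime rollout, create the PK01 DB, or modify Caddy/FRP. The node-local registry address of the tools image may serve only as an output artifact; the input base image must be declared in YAML as coming from a public registry, and a missing output image should surface in `status.next.blockers` rather than treating an existing node-local image as the input base image.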
@@ -134,34 +130,9 @@ G14/D601 v03 的 bootstrap admin password 是 HWLAB runtime Secret 生命周期
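- `debug health`, `debug ssh-pool <providerId>`, `debug dispatch` and `debug task` go through the real internal core, WebSocket, database, provider, system metrics, Docker status and Host SSH maintenance bridge flows; they are for development debugging only and are not written into the formal acceptance steps of `TEST.md`. `debug ssh-pool` trims output to a single provider's `providerGatewaySshData*` labels, for a low-noise check of whether the tcp-pool is ready, claimed, exhausted or has a lastError.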
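- `e2e run [--only pattern[,pattern...]] [--skip pattern[,pattern...]]` uses the public production frontend/dev frontend/provider ingress URLs derived from publicHost, and verifies through the Docker internal network the core API, PostgreSQL, provider self-connection, system metric curves, Docker status snapshot, provider.upgrade precheck and Playwright frontend pages; it is the automated E2E gate before delivery. The CLI prints a check status summary by default and writes full diagnostics to `resultPath`; day-to-day iteration should prefer `--only` / `--skip` to run the smallest necessary set.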
## Playwright Commander Wrapper
## Web / Playwright
The UniDesk repository ships `scripts/playwright-cli.ts` as the host commander wrapper for manual browser testing. It is a short-lived command with JSON output, not a long-running browser daemon.
- `bun scripts/playwright-cli.ts screenshot <url> [path] [--session id] [--selector css] [--full-page]` opens the URL headless by default, saves a screenshot, writes the storage state to `.state/playwright-cli/sessions/`, and returns compact JSON with `status`, `finalUrl`, `title`, `screenshotPath` and similar fields.
- `bun scripts/playwright-cli.ts open <url> [--session id] [--screenshot path]` performs a single navigation; add `--screenshot` when evidence is needed.
- `bun scripts/playwright-cli.ts eval <url> '<javascript-expression>' [--session id]` evaluates a single expression after navigation and returns a structured value.
- `session-list` and `session-delete [sessionId]` manage storage-state files only. `--session` does not denote a live page and does not retain element refs.
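- Interactive daemon commands such as `click`, `fill`, `snapshot`, `tab-list` and `close` return `unsupported-command` with a suggested next step and are not passed through to `npx playwright`, so `--session=<id>` is no longer treated as an unknown argument by upstream Playwright.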
- Headless is the default, which suits runners without an X server. When headed mode is really needed, use `xvfb-run -a bun scripts/playwright-cli.ts open <url> --headed --screenshot /tmp/page.png`; the Code Queue runner image must include `xvfb-run` and `xauth` for this fallback path.
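The external agent skill `~/.agents/skills/playwright` is a separate source of truth. On the current host it may still be an `npx playwright` passthrough, yet its `SKILL.md` describes richer `--session`, `snapshot` and element-ref operations. Until the external skill distribution is updated, UniDesk/HWLAB commander manual testing should use this repository's wrapper; do not treat the external skill's documentation as evidence that daemon/session capabilities are already available.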
### Playwright Trans Passthrough
Cross-host browser acceptance should prefer `trans <route> playwright`; the standard form is a host route without a workspace, for example:
```bash
trans D601 playwright --local-dir /tmp <<'PW'
playwright-cli screenshot https://example.com "$UNIDESK_PLAYWRIGHT_SCREENSHOT" --full-page
PW
```
The `playwright` operation reads a stdin heredoc and temporarily injects a `playwright-cli` wrapper into `PATH` on the target POSIX host/workload. The wrapper prefers `./scripts/playwright-cli.ts` from the route workspace or a known UniDesk workspace on the target host, then the remote user's `~/.agents/skills/playwright*/scripts/playwright-cli.ts`, and only as a last resort `playwright-cli` on the remote `PATH`. The command sets `UNIDESK_PLAYWRIGHT_REMOTE_DIR` and `UNIDESK_PLAYWRIGHT_SCREENSHOT`, copies `png/jpg/jpeg/webp/pdf` artifacts from the remote run directory back to the local `--local-dir` (default `/tmp`), and returns the local path, remote path, byte count, SHA-256, stdout/stderr tail and transfer verification.
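This entry point only handles short-lived Playwright execution and artifact return; it does not provide a long-running browser daemon. For multi-step interaction, put the steps in the same heredoc; the helper submits a background job on the remote side and polls the manifest over short connections, so that no single SSH passthrough exceeds the 60-second hard limit. Add `--keep-remote` explicitly when remote evidence must be retained.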
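The submit-then-poll pattern described here can be sketched generically. This is a minimal illustration, not the helper's code: `pollJob`, `JobStatus` and `PollResult` are hypothetical names, and `fetchStatus` stands in for whatever short-lived connection reads the remote manifest.

```typescript
// Hypothetical shapes; the real manifest format is not specified here.
interface JobStatus {
  done: boolean;
  report?: string;
}

interface PollResult {
  pollCount: number;
  timedOut: boolean;
  report?: string;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Each fetchStatus() call is meant to be one short connection, so no
// single call has to stay open for the whole duration of the job.
async function pollJob(
  fetchStatus: () => Promise<JobStatus>,
  intervalMs: number,
  maxPolls: number,
): Promise<PollResult> {
  for (let pollCount = 1; pollCount <= maxPolls; pollCount++) {
    const status = await fetchStatus();
    if (status.done) {
      return { pollCount, timedOut: false, report: status.report };
    }
    // Wait between polls, but not after the final attempt.
    if (pollCount < maxPolls) await sleep(intervalMs);
  }
  return { pollCount: maxPolls, timedOut: true };
}
```

Bounding the loop by `maxPolls` keeps the overall wait finite even when the remote job never reports completion.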
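Login state, conversation sends and trace screenshots should explicitly wait, within the same heredoc, for the key HTTP response and a stable selector; do not judge login success only by a broad `input[type=text]` or the page title. For forms such as HWLAB Cloud Web, prefer locating `input[autocomplete="username"]` and `input[type="password"]`, and wait for `/auth/login` to return `authenticated=true` before entering `/workbench`, creating a session, sending messages and taking screenshots. Screenshot comparisons must record the actual URL/lane; for example, `https://hwlab.pikapython.com` for D601 `v0.3` is the PK01 Caddy/FRP domain entry, whereas `http://74.48.78.17:19666/` belongs to the G14 `v0.2`/legacy React workbench reference and cannot serve as an equivalent acceptance entry for D601 `v0.3`.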
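UniDesk/HWLAB Web development, the Playwright wrapper, `trans <route> playwright`, HWLAB `web-probe run|script`, fake-server regression, screenshot artifacts and node/lane original-entry acceptance are all covered in `$unidesk-webdev`. This file keeps only the CLI naming index and does not duplicate the Web testing operating surface, to avoid creating multiple paths and fallbacks.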
## Async Job State
+11
-26
@@ -5,24 +5,16 @@ UniDesk delivery is not complete until the public production frontend, public de
## Required Preconditions
- `config.json` `network.publicHost` must be the externally reachable host name or IP of the main server, not `127.0.0.1`, when validating browser access from outside the server.
- `bunx playwright install chromium` and `bunx playwright install-deps chromium` must have been run on hosts that execute browser E2E tests.
- The Docker stack must be running through `bun scripts/cli.ts server start`, and `bun scripts/cli.ts server status` must report healthy frontend, provider ingress, internal core, database, and provider-gateway containers.
- Browser runner dependencies, Playwright execution shape, screenshots, focused frontend checks and layout assertions are owned by `$unidesk-webdev`; this file only records delivery gate scope.
## Automated E2E Scope
`bun scripts/cli.ts e2e run` validates the following URLs and internal checks derived from `config.json`. The CLI response is intentionally bounded: it prints check names/statuses, screenshot path, counts, and `resultPath`; the full per-check diagnostics are written to `resultPath` under `.state/e2e/` so failures remain inspectable without flooding stdout.
`bun scripts/cli.ts e2e run` validates the public production frontend, optional public dev frontend proxy, public provider ingress, internal core API, PostgreSQL database, provider-gateway self-connection and registered user service paths derived from `config.json`. The CLI response is intentionally bounded: it prints check names/statuses, screenshot path, counts, failed checks and `resultPath`; full diagnostics stay under `.state/e2e/`.
## Selective Execution Rule
E2E must be run in two stages instead of blindly re-running the full suite after every edit.
- First run only the smallest verification set that covers the current change. For example, a Pipeline right-sidebar layout fix should first use focused Playwright or module-scoped checks against Pipeline timeline visibility, height, overflow and interaction, rather than immediately re-running every Todo Note / FindJob / MET Nonlinear path.
- `bun scripts/cli.ts e2e run --only <pattern[,pattern...]>` selects only matching checks. Pattern matching accepts a full check name such as `frontend:pipeline-step-timeline-visible`, a prefix such as `frontend:pipeline` / `frontend`, or `*` wildcards such as `frontend:*`.
- `bun scripts/cli.ts e2e run --skip <pattern[,pattern...]>` removes matching checks from the current selection. `--only` and `--skip` can be combined, for example `bun scripts/cli.ts e2e run --only frontend:* --skip frontend:todo-note-integrated-visible,frontend:findjob-integrated-visible`.
- Targeted execution is real execution rather than output filtering only: when a selection contains only frontend checks, the command skips unrelated network/database/service check groups instead of still running the full suite in the background.
- Only after the targeted check is green should the operator run the full public `bun scripts/cli.ts e2e run` regression gate to ensure the local fix did not break unrelated modules.
- `总高度` (total height), `横向滚动条` (horizontal scrollbar), visibility of key interactions and the exact module being edited are all valid reasons to prefer a targeted Playwright pass before the final full regression.
- The full-suite run remains mandatory before claiming delivery; selective execution is an efficiency rule for iteration, not a replacement for final regression.
Run E2E in two stages. First run the smallest real selection that covers the change, then run the full public `bun scripts/cli.ts e2e run` before claiming delivery. Selection uses `--only <pattern[,pattern...]>` and `--skip <pattern[,pattern...]>`; patterns may be full check names, prefixes or wildcards. Frontend selection and layout acceptance details are in `$unidesk-webdev`.
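The `--only` / `--skip` selection semantics (full check names, prefixes, `*` wildcards) can be sketched as below. This is an illustrative reading of the rule, not the CLI's source: `matchesPattern` and `selectChecks` are hypothetical names, prefix matching is assumed to be a plain string prefix, and the check name `network:public-exposure` in the usage note is invented for the example.

```typescript
// Hypothetical matcher: exact name, plain prefix, or "*" wildcard.
function matchesPattern(checkName: string, pattern: string): boolean {
  if (pattern.includes("*")) {
    // Escape regex metacharacters in each literal piece, then let "*" match anything.
    const source = pattern
      .split("*")
      .map((piece) => piece.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
      .join(".*");
    return new RegExp(`^${source}$`).test(checkName);
  }
  // startsWith covers both the exact-name and the prefix case.
  return checkName.startsWith(pattern);
}

// Apply --only first (empty means "everything"), then remove --skip matches.
function selectChecks(all: string[], only: string[], skip: string[]): string[] {
  return all
    .filter((name) => only.length === 0 || only.some((p) => matchesPattern(name, p)))
    .filter((name) => !skip.some((p) => matchesPattern(name, p)));
}
```

For instance, selecting with `only = ["frontend:*"]` and `skip = ["frontend:todo-note-integrated-visible"]` from a list that also contains `network:public-exposure` would keep only the remaining `frontend:` checks.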
Typical targeted commands:
@@ -31,21 +23,14 @@ Typical targeted commands:
- `bun scripts/cli.ts e2e run --only frontend --skip frontend:todo-note-integrated-visible,frontend:findjob-integrated-visible`
- `bun scripts/cli.ts e2e run --only network,provider-ingress`
- Public exposure: Docker port summary must not show core REST, Code Queue NodePort, or Code Queue public host mappings; the only unrestricted public entries are production frontend, dev frontend proxy and provider ingress. PostgreSQL `15432` and OA Event Flow `4255` may be host-mapped only for controlled Code Queue nodes and must be protected by the `DOCKER-USER` source restrictions generated from `network.restrictedHostAccess`; E2E treats either an unreachable generic probe or a verified restricted rule as passing. Known private user-service ports such as FindJob `3254`, MET Nonlinear `3288`, Todo Note `4211`, legacy Code Queue host ports and File Browser provider port `4251` probes must fail. The dev frontend proxy rule is owned by `docs/reference/dev-environment.md`.
- Core API: `docker exec unidesk-backend-core` calls internal `GET /api/overview`, which must report `dbReady: true`, `pgdata.volumeName=unidesk_pgdata_10gb`, a positive PostgreSQL database byte count, and at least one online node; internal `GET /api/performance` must report component request statistics, internal operation statistics, PGDATA usage and Code Queue PostgreSQL storage metadata.
- Provider self-connection: internal `GET /api/nodes` must contain `main-server` with `status: online`, `labels.providerGatewayVersion` equal to `src/components/provider-gateway/package.json`, `labels.providerGatewayUpgradePolicy: "always-enabled"`, `labels.providerGatewayRestartPolicyOk: true`, `labels.providerGatewayPidModeOk: true`, and `labels.providerGatewayRuntimeGuardOk: true`; internal `GET /api/nodes/system-status` must contain CPU/memory/disk samples plus a non-empty process resource list sorted by `memoryBytes` by default, where `memoryBytes` should use PSS when `/proc/[pid]/smaps_rollup` is available, otherwise `rssBytes - statm.shared` before raw RSS, and must retain `rssBytes` for diagnostics; internal `GET /api/nodes/docker-status` must contain a Docker snapshot for `main-server`; every running `provider-gateway` container visible in Docker snapshots must report `restartPolicy: "always"` and `pidMode: "host"`; public provider ingress `/health` must return ok.
- Provider remote control: internal `/api/dispatch` must successfully complete a real `provider.upgrade` task in `mode: "plan"` so the upgrade path is validated without recreating the running gateway during E2E.
- User services: internal `/api/microservices` must include `todo-note` and `oa-event-flow` on `main-server`, canonical `filebrowser` on `D518`, plus `k3sctl-adapter`, `code-queue`, `findjob`, `pipeline`, `met-nonlinear`, `claudeqq` and `filebrowser-d601` on `D601` with `public=false`; `/api/microservices/todo-note/health` must report `storage=postgres`, `/api/microservices/todo-note/proxy/api/instances` must expose the migrated Todo Note lists, and a temporary Todo Note list create/add/toggle/undo/delete cycle must succeed through the real provider-gateway proxy; `/api/microservices/oa-event-flow/health`, `/api/microservices/oa-event-flow/proxy/api/diagnostics`, `/api/microservices/oa-event-flow/proxy/api/events`, `/api/microservices/oa-event-flow/proxy/api/events?tags=service:pipeline` and `/api/microservices/oa-event-flow/proxy/api/stats/trace` must prove the independent OA event table、Pipeline bridge 和 stats center are reachable through UniDesk proxy; `/api/microservices/k3sctl-adapter/health` and `/api/microservices/k3sctl-adapter/proxy/api/control-plane` must expose the D601 `unidesk-k3s` control plane, `kubeApiProxy.mode=kubernetes-api-service-proxy`, D601 active Code Queue instance `servingHealthy=true`, `presentNodeIds` containing `D601`, `missingNodeIds=[]`, `status=healthy`, and `noFallback=true`; `/api/microservices/code-queue/health` must return the active Code Queue backend summary with default model `gpt-5.5`, `egressProxy.connected=true`, `queue.executionDiagnostics` containing DB active state, scheduler active slots, scheduler heartbeat and Trace/OA progress, and `/api/microservices/code-queue/proxy/api/tasks/overview` must return queue state through backend-core -> k3sctl-adapter -> Kubernetes API service proxy -> k3s/k8s Service, not through a `serviceId=code-queue` provider-gateway direct task or `/api/code-queue-direct`; Code Queue raw prompt observation fields must preserve long prompt tails across create/list/detail/frontend paths, with any 
shortened text exposed only through explicit `*Preview` objects carrying `chars` and `truncated`; `/api/microservices/filebrowser/health`, `/api/microservices/filebrowser-d601/health` and `/api/microservices/filebrowser/proxy/` must prove File Browser health and WebUI access through UniDesk proxy; `/api/microservices/findjob/health` and `/api/microservices/findjob/proxy/api/summary` must succeed through the real provider-gateway proxy; `/api/microservices/findjob/proxy/api/jobs?__unideskArrayLimit=jobs:5` must return a bounded preview with `_unidesk.arrayLimits` metadata; `/api/microservices/pipeline/health`, `/api/microservices/pipeline/proxy/api/snapshot?__unideskArrayLimit=registry.components:8,runs:3` and `/api/microservices/pipeline/proxy/api/oa-event-flow/diagnostics` must return Pipeline health, registry/run previews and OA event-flow evidence; `/api/microservices/met-nonlinear/health`, `/api/microservices/met-nonlinear/proxy/api/queue`, `/api/microservices/met-nonlinear/proxy/api/projects?root=projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects?root=ex_projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects/config?path=<projectPath>` and `/api/microservices/met-nonlinear/proxy/api/images` must return the D601 TS backend health, queue/GPU policy, full project tree inputs, structured project detail and ready `met-nonlinear-ml:tf26` image status. Code Queue liveness fixture checks are first-class E2E selections: `code-queue:active-run-heartbeat-visible`, `code-queue:trace-gap-not-stale`, `code-queue:stale-active-owner-expired`, `code-queue:control-plane-split-brain-diagnostics` and `code-queue:oa-publisher-degraded-visible`.
- ClaudeQQ availability: `/api/microservices/claudeqq/health` must only pass when `ready=true`, NapCat HTTP and WebSocket are connected, and `napcat.loginState=logged_in`; `/api/microservices/claudeqq/proxy/api/napcat/login` must show the same logged-in account state and `/api/microservices/claudeqq/proxy/api/events/recent` must prove the backend can read the persistent event cache. A QR-code-only or not-logged-in NapCat state must be treated as unhealthy.
- Database: the command writes an `unidesk_e2e_markers` row through `docker exec unidesk-database psql`, confirms provider state is stored in PostgreSQL, and checks Todo Note rows exist in `todo_note_instances` using the same named volume.
- Pipeline OA event flow: `microservice:pipeline-oa-event-flow` must prove both no-audit and monitor-audit runs are driven by OA events end to end. The event stream must show `node-finished` as a neutral fact with `pipeline:{pipelineId}` and `epoch:{runId}` tags, OA policy as the source of downstream/audit decisions, monitor decisions as OA control events, and runner control-result evidence. E2E must fail if delivery still depends on a legacy detail audit policy flag as policy authority, independent legacy audit-request points, a legacy batch completion gate, direct monitor-to-runner calls, or frontend/CLI writes to Pipeline `.state`.
- The same Pipeline OA diagnostics must fail on legacy file-transport residuals. Procedure containers, monitor sessions, UI/Gantt DTO builders and CLI fetches must consume prompt/control/stop/display evidence only from the OA event ledger and normalized HTTP read APIs; `control-prompts.jsonl`, `monitor-prompts.jsonl`, `monitor-control`, `control-events.jsonl`, monitor stop files, `.state/pipeline-runs/{runId}/control/commands/`, `PIPELINE_*_APPEND_FILE`, local JSONL append/read helpers, and monitor `/pipeline-state` mounts are forbidden in runtime source.
- Pipeline live Gantt setup: when `frontend:pipeline-gantt-observation-live-running` is selected, E2E first looks for a current Pipeline run that already contains both a `node-long-running-observation` marker and a still-running execution interval. If no such candidate exists, the E2E setup starts the D601 `monitor-management-behavior-test` pipeline through `trans D601 ...` and polls the private backend proxy until the observation candidate exists; the acceptance assertion itself still opens the public frontend with Playwright and verifies the rendered arrows, absence of observation source pseudo-points, target arrow inset, and live flashing running bar through React DOM controls.
- Frontend: Playwright must open the public frontend URL derived from `network.publicHost`, not localhost or a Docker-internal URL; it logs in with the configured account, waits for `核心在线`, asserts that `main-server` and `Main Server Provider` are visible, verifies desktop sidebar collapse and `PGDATA` overview metric, opens `运行总览 / 性能面板` to verify `Bwebui`、组件汇总、最近失败请求、内部操作汇总和最近慢操作, clicks `查看原始JSON` to verify Provider data from the frontend, confirms no raw JSON is visible before that click, opens task history to verify duration and failure diagnostics, opens resource nodes `资源监控` to verify CPU/Memory/Disk curves, the structured process resource table, default memory-desc sorting, sortable CPU column and provider upgrade precheck dispatch, opens `Docker 状态`, switches to `main-server`, and verifies the Docker Desktop-style container view including the database named volume `unidesk_pgdata_10gb`, opens `网关版本` and verifies the provider-gateway version, SSH 透传可用性、远程更新可用性 plus structured remote update records for `provider.upgrade`, then opens `用户服务 / 服务目录`、`用户服务 / Todo Note`、`用户服务 / OA Event Flow`、`用户服务 / k3s Control`、`用户服务 / Code Queue`、`用户服务 / FindJob`、`用户服务 / Pipeline` and `用户服务 / MET Nonlinear` to verify 主 server Todo Note/OA Event Flow、D601 Code Queue、D601 业务服务、仓库引用、私有后端映射、Todo Note 迁移清单和树形任务、OA Event Flow 事件表和 Trace stats 表、k3s 控制面/D601 scheduler/read/write 实例/Kubernetes API service proxy/no-fallback 路径、Code Queue 队列/模型/输出/初始 `Submitted prompt`/终态任务自动加载完整 Trace/追加 prompt/打断控件、FindJob 指标和岗位预览、Pipeline 组件矩阵、MiniMax 限额卡片、结构化 OA 事件流诊断面板、React Flow 控制图、epoch 甘特图、甘特图渲染图导出、monitor 首列排序、长任务观察连线、无观察来源伪点、running node 实时闪动执行条和 OpenCode Trace、MET Nonlinear 项目库/Fork/待启动队列/当前队列/已完成/失败诊断/GPU/镜像都通过 React 控件展示。Playwright 还必须验证 Code Queue 页面所有 API 请求走 `/api/microservices/code-queue/proxy`,不得再出现 `/api/code-queue-direct`;深链接直达路由例如公网 `http://<publicHost>:<frontendPort>/app/pipeline/` 能直接落到 Pipeline 页面,随后切到 `资源节点 / Docker 状态` 时地址栏更新为 `/nodes/docker/`,并且浏览器 history 返回链路仍能回到 
`/app/pipeline/`;还必须直开 `/app/code-queue/` 验证页面存在 `app-shell`、左侧主模块边栏、顶部状态栏、顶部子标签和 `code-queue-page`,防止用户服务 deep link 退化成缺 shell 的 standalone 页面;同时 `态势总览` 这类非用户服务页面应落在自己的模块前缀下,例如 `/ops/status/`。Playwright 必须覆盖默认可见时间按北京时间显示,至少包括顶部 `北京时间` 时钟、任务历史/网关版本更新时间和用户服务刷新时间,不得随浏览器本地时区漂移。Task history and provider upgrade records must not display a real sub-second duration as `0s`; MET Nonlinear running rows must show an ETA derived from backend progress or from `startedAt` plus epoch progress, and queue/completed rows must show training speed as `epoch/h`.
- Frontend dense-layout regression gate: whenever a frontend change touches the Pipeline right sidebar, Trace timeline, detail drawer, Gantt chart coordinates or other high-information-density panels, Playwright acceptance must inspect both `总高度` (total height) and `横向滚动条` (horizontal scrollbar). For Pipeline specifically, the OpenCode Trace session head must carry shared agent/model/session facts and the Trace body must use the same Code Queue `TraceView` styling; Playwright must fail if old `.pipeline-opencode-step`, `.pipeline-opencode-flow`, `.pipeline-step-message-card` or `.pipeline-opencode-part` user-visible styles reappear, if the Trace container introduces an internal horizontal scrollbar, or if `frontend:pipeline-gantt-frontend-y-accuracy` fails to prove the frontend `frontend-y` layout maps ticks, markers and execution bars from timestamps to y coordinates within tolerance.
- OpenCode Trace must use Code Queue Trace styling and must not render the deprecated Pipeline continuous step connector; Playwright should fail if `.pipeline-opencode-flow`, `.pipeline-opencode-step` or any equivalent continuous connector/card returns to the user-visible Trace.
- User service frontend assertions must wait for real backend data, not only the page skeleton. For Todo Note this means the page must show the migrated lists `CONSTAR`、`大论文`、`找工作`、`小论文`、`事务`, support creating a temporary list and task through the frontend, and delete that temporary list afterwards. The temporary list must be selected again by its unique generated name before deletion so E2E never deletes a migrated source list by accident. For FindJob this means the page must show a numeric `岗位总量`, `HEALTH OK`, and a non-empty `PREVIEW` count such as `40/1463 PREVIEW`; for Pipeline this means the page must show `Pipeline v2 工作台`, `Health OK`, a numeric component count, a non-empty React Flow control graph, `控制图`, `Epoch 甘特图`, and after clicking a Gantt execution line it must show `OpenCode Trace` rendered by the shared Code Queue-style Trace component with messages and tool-call groups; for MET Nonlinear this means the page must show `MET Nonlinear 训练编排`, `Health OK`, `Fork Project`, `加入待启动队列`, `启动队列`, `当前队列`, 最大并发设置、task queue and GPU/image panels, and must not show the removed hard-coded `创建10个10轮任务` frontend entry. The MET Nonlinear project library must render `projects/` and `ex_projects/` as a true path tree with folder Project counts; clicking a project row must open a structured detail panel containing `config.json`, `data/ 训练状态`, `模型参数`, `指标` and a parameter count such as `Total Params`; clicking a completed/current/failed job row must open a structured job detail and both the row and detail must show `epoch/h`. Full MET Nonlinear acceptance is driven by public frontend controls: choose a visible source Project, set batch size, epochs and max concurrency in inputs, fork into `projects/unidesk_forks/`, stage the selected forks, start the queue, and verify completed rows plus automatic `metnl-train-*` container removal; loading placeholders like `--` or empty states are not sufficient for E2E success.
- For ClaudeQQ this means the page must show `Health OK`, `NapCat 容器登录`, `NAPCAT HTTP OK`, `NAPCAT WS OK`, logged-in state such as `已登录 logged_in`, event cache, subscriptions and message push controls. A page that only shows a QR code, stale raw JSON, or a running backend without logged-in NapCat is not acceptable.
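The `memoryBytes` fallback order stated in the provider self-connection item of the list above (PSS when `/proc/[pid]/smaps_rollup` is available, otherwise `rssBytes - statm.shared`, otherwise raw RSS, with `rssBytes` always retained) can be sketched as follows. The type and function names, the `memorySource` field and the clamp at zero are assumptions added for illustration.

```typescript
// Hypothetical input sample; optional fields model data that may be unavailable.
interface ProcessSample {
  pid: number;
  rssBytes: number;
  pssBytes?: number;         // present when /proc/[pid]/smaps_rollup was readable
  statmSharedBytes?: number; // shared size from /proc/[pid]/statm, in bytes
}

interface ProcessRow {
  pid: number;
  memoryBytes: number;
  rssBytes: number; // always kept for diagnostics
  memorySource: "pss" | "rss-minus-shared" | "rss";
}

function toRow(s: ProcessSample): ProcessRow {
  if (s.pssBytes !== undefined) {
    return { pid: s.pid, memoryBytes: s.pssBytes, rssBytes: s.rssBytes, memorySource: "pss" };
  }
  if (s.statmSharedBytes !== undefined) {
    // Clamp at zero in case the two readings were taken at slightly different times.
    const memoryBytes = Math.max(0, s.rssBytes - s.statmSharedBytes);
    return { pid: s.pid, memoryBytes, rssBytes: s.rssBytes, memorySource: "rss-minus-shared" };
  }
  return { pid: s.pid, memoryBytes: s.rssBytes, rssBytes: s.rssBytes, memorySource: "rss" };
}

// Default ordering: largest memoryBytes first.
function sortByMemoryDesc(samples: ProcessSample[]): ProcessRow[] {
  return samples.map(toRow).sort((a, b) => b.memoryBytes - a.memoryBytes);
}
```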
Stable non-frontend E2E groups:
- Public exposure: only production frontend, dev frontend proxy and provider ingress are unrestricted public entries; private service ports must remain private or source-restricted.
- Core API and database: internal overview/performance endpoints, PostgreSQL readiness and named volume persistence must pass.
- Provider self-connection and remote control: main-server provider-gateway must be online, versioned, always-enabled, host SSH capable and able to plan `provider.upgrade` without recreating itself.
- User services: registered services must be visible through the real UniDesk proxy path, with health and representative API checks for Todo Note, OA Event Flow, k3s Control, Code Queue, File Browser, FindJob, Pipeline, MET Nonlinear and ClaudeQQ.
- Pipeline/OA event flow and Code Queue liveness selections remain first-class E2E checks; their current exact check names live in the CLI output and source tests, not in this long-term reference.
- Frontend Playwright and dense layout assertions are governed by `$unidesk-webdev`.
## Frontend JSON Rule
@@ -157,7 +157,7 @@ frontend shell 必须把左侧主模块与顶部子标签编译为统一的 URL
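- The OpenCode Trace in the right sidebar must aggregate shared session information (agent, model, session id) into the Trace header and must not repeat it on every step; the Trace body must be rendered uniformly after conversion by the opencode port in `src/components/frontend/src/trace.tsx`, with tool-call folding, summaries, horizontal scrolling and message de-indentation rules identical to Code Queue.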
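- Right sidebar layout must first protect the horizontally readable width: the time goes in the header at the top of the step rather than occupying a separate narrow left column; the default summary must not introduce a horizontal scrollbar inside the right sidebar, nor let narrow-column squeezing stretch the step height excessively.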
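- OpenCode Trace must not use Pipeline's old continuous step decoration line or old step cards; if a real idle time interval exists between adjacent steps, no continuous connector may misrender it as ongoing execution.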
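- Whenever the layout of any high-information-density right sidebar is adjusted, `总高度` (total height) and `横向滚动条` (horizontal scrollbar) must be explicit acceptance metrics, verified by opening the real page with Playwright rather than by reading static code or imagining the result locally.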
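- Whenever the layout of any high-information-density right sidebar is adjusted, `总高度` (total height) and `横向滚动条` (horizontal scrollbar) must be explicit acceptance metrics, completed through the real-page browser acceptance defined in `$unidesk-webdev`; this file does not duplicate Playwright run details.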
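- Run materials may be shown only as structured index rows with counts, status, time and a source summary; the full JSON, JSONL or log tail may be opened only through the explicit `查看原始JSON` button.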
- `Pipeline` rendering and algorithm verification.
- Changes involving monitor review, management behavior or the Gantt chart algorithm must be verified with a combination of the generic Pipeline-side fixtures.
+2
-21
@@ -106,7 +106,7 @@ For persisted final-response display regressions, a fresh turn alone is not enou
The `--render web` proof must inspect the rendered body, not only the raw event count. Passing evidence should include `body.render=web`, the shared renderer identity when exposed, `status=completed`, rendered/returned row counts, noise/omitted counts when available, at least one rendered assistant row containing the final assistant text, and an explicit absence check for known non-user boilerplate such as `AgentRun terminal status completed`, `AgentRun result is ready`, and `Code Agent 仍在处理`. If the trace API returns `status=missing`, `sourceEventCount=0`, or no rows for a historical issue trace, treat that trace as expired or unavailable; do not use it as closure evidence. Generate a fresh equivalent turn on the current v0.2 runtime and validate that trace instead.
CLI/Web-equivalent trace evidence does not replace browser UI evidence for visual, layout, copy-to-clipboard, collapsed-panel or removed-control bugs. Those require a bounded browser or DOM smoke against `http://74.48.78.17:19666/` after rollout, with assertions on the deployed page text, DOM state, or control behavior that the user reported. A local bundle smoke can support regression coverage, but the closeout still needs the deployed public endpoint unless the browser entry is unavailable and the issue comment records the blocker. Missing Playwright browser binaries or declared test dependencies are not a valid skip; install the repository-declared runner/browser or use an approved system browser executable and record that choice in the validation evidence.
CLI/Web-equivalent trace evidence does not replace browser UI evidence for visual, layout, copy-to-clipboard, collapsed-panel or removed-control bugs. G14/v0.2 browser or DOM evidence must still hit the deployed public endpoint for the target lane, but Playwright/web-probe/fake-server/screenshot operation details are owned by `$unidesk-webdev`; this G14 reference only records lane and rollout evidence.
|
||||
|
||||
The closing comment for these issues must be semantic natural language before it lists evidence: state what the user-visible problem was, what changed, where it rolled out, and what original entry was rechecked. It must include the actual command or entry path, target lane or endpoint, relevant trace/session/thread/PipelineRun/run/device ids, and the pass/fail result. If the original entry cannot be verified because rollout has not happened, credentials are unavailable, the target runtime is down, or the required CLI capability is missing, keep the issue open and record the blocker. Do not close the issue on the strength of PR merge, targeted tests, or "will be verified after rollout" wording. If an issue was closed before this real CLI/user-entry validation, reopen it and add a correction comment before continuing.
|
||||
|
||||
@@ -160,26 +160,7 @@ The recovery is auditable: the original `git show` patch and the `cherry-pick` S
|
||||
|
||||
### v0.2 Cloud Web Runtime Layout Validation
|
||||
|
||||
Cloud Web layout, status-panel, collapsed-control, and modal issues on `v0.2` need deployed browser evidence. Source checks and control-plane rollout are supporting evidence; they do not prove that the public `19666` page renders the fixed DOM.
|
||||
|
||||
Use these surfaces together:
|
||||
|
||||
- `trans G14:/root/hwlab-v02/.worktree/<task>/web/hwlab-cloud-web sh -- 'bun run check'` for approved static source/layout checks and dist freshness.
|
||||
- `bun scripts/cli.ts hwlab g14 control-plane status --lane v02` for runtime, Argo, public endpoint, and GitOps alignment. If `origin/v0.2` moved through a parallel PR, use `--pipeline-run` or `--source-commit` and treat same-branch supersession as context rather than failure.
|
||||
- Public API probes for both `/health/live` and `/v1/live-builds`. `/health/live` proves live service health/revision, but Cloud Web build time, image tag/digest, source metadata, and actual runtime commit/revision should be read from `/v1/live-builds`.
|
||||
- A bounded browser/DOM probe against `http://74.48.78.17:19666/` that asserts the deployed page state relevant to the issue.
|
||||
|
||||
Cloud Web frontend regressions still use the two-layer validation rule when approved by the task: deterministic source-level checks can cover scroll-follow state machines, Markdown/HTML escaping, shared renderer output, persisted view mapping and DOM class/attribute decisions; the deployed browser or Web-equivalent CLI layer must not mock the user entry, and should prove only the live integration that source-level checks cannot prove: the public bundle is deployed, the real page dispatch path creates the expected DOM state, and the user-visible control behaves on the target lane. Do not move every frontend bug into CLI/browser smoke just because it is user-facing.
|
||||
|
||||
Cloud Web message Markdown must go through a single shared React renderer component. Do not maintain a hand-written Markdown parser or a `dangerouslySetInnerHTML` message path for normal chat/workbench messages. The shared renderer's fast tests should cover at least GFM table rendering, inline/fenced code, emphasis/strong text and raw HTML escaping. Browser closeout should assert rendered DOM shape, such as `table`/`code`/`strong` counts and absence of injected `script` nodes or executed script flags, instead of comparing the full rendered HTML string.
|
||||
|
||||
For Workbench status/build panels, the minimum DOM proof should check the topbar chip, absence of full status cards in the right sidebar, hidden collapsed lists actually absent from layout, bounded scroll ownership on the right content area, and a details dialog that contains environment image metadata, actual live commit/revision, and source/build-time fields when available.
|
||||
|
||||
`/v1/live-builds.latest` is global across services and can legitimately point at `hwlab-cloud-api` when API rolled after Web. Inspect the `hwlab-cloud-web` service row before deciding whether a Web build field is missing or stale.
|
||||
|
||||
For `#workspace` or other scroll-owner fixes, closeout evidence should include numeric scroll metrics before and after the interaction: `scrollHeight`, `clientHeight`, `scrollTop`, `distanceFromBottom`, computed `overflowY`, and the page's follow/detached state attribute when one exists. Passing evidence for follow-tail behavior must show that new content keeps the view at bottom while already following, manual upward scroll detaches, and scrolling back to the bottom re-attaches. If the issue is specifically about final assistant response persistence or trace rendering, the browser/CLI proof must wait for the final agent/trace result as described above. If the issue is a frontend-only renderer or scroll-container regression and the same component/path renders user and agent messages, a real `#command-input` submission that creates a long user message is sufficient to exercise the deployed renderer/scroll path; do not block closure on an unrelated slow external model turn.
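The follow-tail criteria above can be made concrete with a few lines of arithmetic. The sketch below assumes the conventional definition `distanceFromBottom = scrollHeight - clientHeight - scrollTop` and a hypothetical 4 px tolerance; neither is specified in this document, and the function names are illustrative.

```typescript
// Numeric scroll metrics as read from the scroll owner element.
interface ScrollMetrics {
  scrollHeight: number;
  clientHeight: number;
  scrollTop: number;
}

// Assumed definition: pixels of content remaining below the viewport.
function distanceFromBottom(m: ScrollMetrics): number {
  return Math.max(0, m.scrollHeight - m.clientHeight - m.scrollTop);
}

// Assumed tolerance; the real page may use a different threshold.
function isFollowing(m: ScrollMetrics, tolerancePx: number = 4): boolean {
  return distanceFromBottom(m) <= tolerancePx;
}

const atBottom: ScrollMetrics = { scrollHeight: 2000, clientHeight: 600, scrollTop: 1400 };
const afterAppend: ScrollMetrics = { scrollHeight: 2600, clientHeight: 600, scrollTop: 2000 };
const scrolledUp: ScrollMetrics = { scrollHeight: 2600, clientHeight: 600, scrollTop: 1500 };

console.log(isFollowing(atBottom));          // true: already following
console.log(isFollowing(afterAppend));       // true: new content kept the view at bottom
console.log(distanceFromBottom(scrolledUp)); // 500: manual upward scroll detached
```

Recording these three snapshots, together with the page's own follow/detached attribute, gives the before/after evidence the paragraph asks for.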
|
||||
|
||||
Generic layout smoke can be used only when it is bounded in the current transport. A Playwright smoke that runs through `trans` with no output for the SSH idle timeout, leaves preview/browser processes behind, or never writes an exit/report file is not closure evidence. Run it as an async remote job with explicit report and cleanup, or use a smaller issue-specific DOM probe that emits one JSON result and exits. The stable remote-probe shape is: create a fresh Workbench session through the UI when prior session state may be failed, start the browser script as a target-side job, write a PID/log/result JSON/screenshot on G14, poll those files with short `trans` queries, and cancel any running live turn through the UI before exit when the probe submitted a real prompt. Missing Playwright-managed browser binaries are not a skip; use an approved system browser executable on G14 or install the declared browser dependency, and record the choice. When staging a Node probe outside the repo workspace, make package resolution explicit by running from the workspace or importing packages through the workspace's `node_modules`; do not treat `MODULE_NOT_FOUND` from a `/tmp` script as an application failure.
|
||||
Cloud Web layout, status-panel, collapsed-control, modal, Markdown renderer, scroll-owner and Workbench DOM issues on `v0.2` need deployed browser evidence after rollout; source checks and control-plane status are supporting evidence. Use `$unidesk-webdev` for the actual Playwright/web-probe shape, bounded screenshot artifact rules, layout metrics and no-skip dependency policy. G14/v0.2 closeout still records the lane endpoint, rollout provenance, `/health/live`, `/v1/live-builds` service row and the specific user-visible DOM assertion.
|
||||
|
||||
### v0.2 Cloud Web Button/JS Sync Rule (HWLAB #748)
|
||||
|
||||
|
||||
+3
-47
@@ -82,59 +82,15 @@ AgentRun terminal `failed`、`blocked` 或 `canceled` 也是最终结果,不
|
||||
|
||||
### Workbench Browser Regression Suite
|
||||
|
||||
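Workbench browser regression requirements are governed by the browser regression subsection of the UniDesk OA spec [PJ2026-010401 Web Workbench](../../project-management/PJ2026-01/specs/PJ2026-010401-web-workbench.md); the UniDesk commander side records only the run entry and its boundaries. The dedicated entry in the HWLAB repo lives in `web/hwlab-cloud-web`: `bun run e2e:workbench -- --project=chromium` builds Cloud Web, and `playwright.workbench.config.ts` starts a same-origin fake server and Playwright Chromium. The command must run in the target node/lane workspace selected by the issue/CLI, or in a dedicated worktree of it, for example `/home/ubuntu/workspace/hwlab-v03` for D601 `v0.3`; do not run the browser or repo-level frontend checks locally on the master server.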
|
||||
|
||||
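The suite verifies user-visible Web Workbench behavior: session switching and refresh recovery, SSE/REST event replay, consistency of the `running`/`completed`/`failed`/`canceled` states, Trace readable rows, deep link hydration, and the final state of a delayed session list. The fake server only replays sanitized fixtures and minimal boundary mutations; it must not use access to the live Cloud API, AgentRun, HWPOD, a database, or Kubernetes as a pass condition. Any unmocked `/auth/*`, `/v1/*` or `/health*` request should fail and expose its path.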
|
||||
|
||||
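A user-visible Workbench bug found on the live public origin through `hwlab nodes web-probe run|script` must first be reproduced as an independent failing (red) test in the fake-server Playwright suite above before the source is changed. The issue body should state the corresponding fake-server reproduction requirement, the fixture source, the target viewport or user path, the `bun run e2e:workbench -- --project=chromium` run required after the fix, and the `web-probe` acceptance command that finally returns to the same node/lane public origin. The live web-probe is the P4 original-entry acceptance; it does not replace the locally repeatable Workbench regression cases.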
|
||||
|
||||
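Negative branches that are hard to trigger reliably in the live environment, or that should not be provoked deliberately, such as cross-project detail reads, expired session authority, upstream 5xx, auth rollout transients, and list/window omissions, should be reproduced and asserted deterministically by fake-server fixtures. At closing time, the live `web-probe` only needs to prove the deployed version, the natural path, and the health status on the same node/lane public origin, and the closeout should state that deterministic coverage of the negative branch comes from the fake server. Do not deliberately break the live service to prove 5xx, mid-rollout states, or third-party transients.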
|
||||
|
||||
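Mobile, narrow-screen, session rail, button-visibility, and selector-readiness issues must express visibility assertions in a Playwright viewport: the target entry should be visible, clickable, have a non-zero layout rect, and match the path the user actually takes. Do not assume that a desktop selector (for example a fixed id on the session rail) is a stable readiness signal for every viewport; if mobile has an equivalent collapsed menu or alternative button, both the fake-server cases and the live `web-probe script` should verify that equivalent entry.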
|
||||
|
||||
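Fixture seeds should preferably come from controlled real samples on the target node/lane; the collection script may output only the sanitized workspace/conversation/session/turn/trace shape, stable pseudo ids, sourceRef/presence/fingerprint, and `valuesPrinted=false`. Do not commit or print `HWLAB_API_KEY`, cookies, Authorization headers, DB DSNs, provider tokens, real user identities, or non-public prompts. Screenshots, Playwright traces, and HTML reports are issue/PR closeout evidence, stored by default under `.state/workbench-e2e/` or returned through a controlled remote artifact; they replace neither the OA spec nor the `hwlab nodes web-probe` login/DOM/Trace smoke against the real public runtime.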
|
||||
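Workbench browser regression requirements are governed by the UniDesk OA spec [PJ2026-010401 Web Workbench](../../project-management/PJ2026-01/specs/PJ2026-010401-web-workbench.md); the specifics of Web development, fake-server Playwright, fixture collection and sanitization, mobile assertions, screenshot artifacts, and the live web-probe loop are consolidated in `$unidesk-webdev`. The HWLAB repo keeps only the entry boundary: the dedicated code lives in `web/hwlab-cloud-web`, and commands must run in the target node/lane workspace selected by the issue/CLI or in a dedicated worktree; do not run the browser or repo-level frontend checks locally on the master server.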
|
||||
|
||||
### Web Live DOM Probe Acceptance
|
||||
|
||||
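`scripts/web-live-dom-probe.mjs` is the DOM-level equivalent acceptance helper for the original Cloud Web entry. It must run in the node/lane workspace selected by the issue/CLI and hit the same public origin; for example, D601 `v0.3` uses the script in the target workspace and `https://hwlab.pikapython.com`. Do not run browser smoke locally on the master server, and do not fall back to old ports or another lane. For routine cross-node/lane commander acceptance, prefer UniDesk `bun scripts/cli.ts hwlab nodes web-probe run --node <node> --lane <lane>`, which resolves the workspace, public origin, and Web login sourceRef, then injects the credentials into the target helper as one-shot stdin/env. When you need custom Playwright route/intercept, delayed APIs, in-flight DOM reads, or a dedicated artifact, use `bun scripts/cli.ts hwlab nodes web-probe script --node <node> --lane <lane> <<'JS' ... JS`; with that entry UniDesk first establishes `hwlab_session` through same-origin `/auth/login`, and the script receives only the authenticated `browser/context/page/baseUrl` and the artifact helper. A change that touches only this helper is a no-service delivery: choose a direct commit or a PR according to the target HWLAB repo's `AGENTS.md`, and record `rollout=not-applicable` in the closing evidence.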
|
||||
|
||||
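The `baseUrl` injected by `web-probe script` is the public origin and may carry a trailing `/`. Custom scripts must build API or page URLs with `new URL(path, baseUrl).toString()`; do not concatenate a path that starts with `/` using `` `${baseUrl}${path}` ``, which produces double-slash paths such as `//health` or `//v1/...`. The edge-proxy may treat those as an upstream absolute URL or an abnormal proxy path, causing a 502 or a misjudgment that the real browser page would not see.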
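A short sketch of the difference between the two ways of joining a path onto `baseUrl`. The `buildUrl` helper name is hypothetical, and the literal `baseUrl` value stands in for whatever `web-probe script` injects; only the standard `URL` constructor is relied on.

```typescript
// Hypothetical helper wrapping the recommended construction.
function buildUrl(path: string, baseUrl: string): string {
  return new URL(path, baseUrl).toString();
}

// Stand-in for the injected value; note the trailing slash.
const injectedBaseUrl = "https://hwlab.pikapython.com/";
const livePath = "/health/live";

const joined = buildUrl(livePath, injectedBaseUrl);
const naive = `${injectedBaseUrl}${livePath}`;

console.log(joined); // https://hwlab.pikapython.com/health/live
console.log(naive);  // https://hwlab.pikapython.com//health/live
```

`new URL` resolves an absolute path against the origin, so the result is the same whether or not the base ends with `/`.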
|
||||
|
||||
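`fetchJson` in `web-probe script` depends on the origin of the authenticated page. A custom script must first enter the target public origin through `waitWorkbenchReady`, `gotoStable`, or an equivalent page navigation before calling `fetchJson` to read `/v1/...`; do not issue API requests from `about:blank` first, because a status 0 there only means the browser context has not yet entered a same-origin page. When the Workbench page holds long-lived connections, polling, or SSE, do not use Playwright `networkidle` as the pass condition for reload or switching; use `domcontentloaded` plus an explicit DOM/API condition, such as the final URL, route conversationId, active tab, message card, trace row, or workspace `selectedConversationId`.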
|
||||
|
||||
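A deep link to a historical Workbench session may belong to a non-default project. If opening `/workbench/sessions/<conversationId>` makes the default project return `conversation_project_mismatch` and the page reports that the conversation is not visible, do not immediately conclude that the original issue is invalid; first confirm the session's projectId from the workspace/conversation summary, then run the DOM assertions against the equivalent URL `/workbench/sessions/<conversationId>?projectId=<projectId>`. Closeout evidence should record the actual projectId, finalUrl, the number of terminal agent cards, and the negative text assertions, so that a 404 from the default project does not mask the real page state.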
|
||||
|
||||
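Web login credentials must be resolved from the controlled source of the target node/lane and injected as a one-shot process environment. For example, first confirm the bootstrap admin sourceRef with `bun scripts/cli.ts hwlab nodes secret status|ensure --node <node> --lane <lane> --name hwlab-v03-bootstrap-admin`, then run the controlled `hwlab nodes web-probe run` / `hwlab nodes web-probe script`. When the target host has no owner-only source file, the controlled entry should return `web_login_secret_missing` quickly; do not rely on the script's historical default password, and do not copy credentials to the target host, shell startup files, issues, logs, or Git documents. If the sourceRef/fingerprint is confirmed but public `/auth/login` still rejects the source password, first distinguish API rollout, user-billing fallback, and user-table state; use the controlled `secret ensure --confirm --force` only when the bootstrap admin hash clearly has to be re-seeded from the YAML source.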
|
||||
|
||||
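When troubleshooting false login failures in the probe, look first at `actions`, `dom.authState`, `finalUrl`, `failureDom`, and `dom.requiredSelectors` in the JSON. The new login-page fallback must wait for the real login surface (`#workspace`, the legacy id, or `.login-card input`) before judging the input count; before submitting it must also confirm that the form values have landed in the DOM, for example `actions.login.valuesReady=true`. Calling `count()` at the very instant `authState=login`, or submitting before Vue has updated the input value, can misreport a frontend form-filling timing problem as a credential error. When closing a Workbench login/DOM helper issue, the evidence must include at least the original command, the target URL/lane, the login `selectorMode`, `valuesReady`, `finalUrl`, and the results for key selectors such as `workspace`/`commandInput`.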
|
||||
|
||||
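Trace real-time issues must use interval sampling rather than only a static screenshot or the terminal DOM. `hwlab nodes web-probe run` can collect in-flight samples with `--trace-sample-count` and `--trace-sample-interval-ms`; custom scenarios use `hwlab nodes web-probe script` to read `.message-card[data-role="agent"]`, `.trace-timeline`, `.trace-render-row`, `data-status`, and `.trace-empty` on the same public origin. Sampling must distinguish "加载中" (loading) from "思考中" (thinking): loading means the page is still fetching existing trace data or the trace resource is not ready; thinking means the trace container is mounted and the turn is running, but there is no first readable Code Agent event yet. A terminal message card (completed/failed/blocked/timeout/canceled) must not keep the "思考中" empty state; if the full trace is empty, it should show empty-trace copy consistent with the terminal state, and the readable `textContent` of that article should not contain "思考中". Passing evidence should include `agentStatus`, `tracePresent`, `traceStatus`, `rowCount`, `emptyLabel`, and a preview of the last row across consecutive samples; terminal empty-trace issues should additionally count terminal agent cards, terminalWithThinking, and the terminal trace-empty label.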
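The sampling rules above can be sketched as a classifier over consecutive samples. The field names `agentStatus`, `tracePresent`, `rowCount`, and `emptyLabel` follow the evidence fields listed in the text; the `TraceSample` type, the `streaming` label, and the function names are assumptions made for this illustration, not the probe's actual output schema.

```typescript
// Hypothetical sample shape built from the evidence fields named above.
type TraceSample = {
  agentStatus: string;
  tracePresent: boolean;
  rowCount: number;
  emptyLabel: string | null;
};

type Phase = "loading" | "thinking" | "streaming" | "terminal";

const TERMINAL_STATUSES = new Set(["completed", "failed", "blocked", "timeout", "canceled"]);

// loading: trace container not mounted yet; thinking: mounted, running, no readable row yet.
function classifySample(s: TraceSample): Phase {
  if (TERMINAL_STATUSES.has(s.agentStatus)) return "terminal";
  if (!s.tracePresent) return "loading";
  return s.rowCount === 0 ? "thinking" : "streaming";
}

// Counts terminal cards that still show the "thinking" empty label (should be 0).
function countTerminalWithThinking(samples: TraceSample[]): number {
  return samples.filter(
    (s) => TERMINAL_STATUSES.has(s.agentStatus) && (s.emptyLabel ?? "").includes("思考中"),
  ).length;
}

// Progressive pull: some in-flight sample has more rows than the in-flight sample before it.
function rowCountGrewWhileRunning(samples: TraceSample[]): boolean {
  const running = samples.filter((s) => !TERMINAL_STATUSES.has(s.agentStatus));
  return running.some((s, i) => {
    const prev = i > 0 ? running[i - 1] : undefined;
    return prev !== undefined && s.rowCount > prev.rowCount;
  });
}

const traceSamples: TraceSample[] = [
  { agentStatus: "running", tracePresent: false, rowCount: 0, emptyLabel: "加载中" },
  { agentStatus: "running", tracePresent: true, rowCount: 0, emptyLabel: "思考中" },
  { agentStatus: "running", tracePresent: true, rowCount: 3, emptyLabel: null },
  { agentStatus: "completed", tracePresent: true, rowCount: 7, emptyLabel: null },
];

console.log(traceSamples.map(classifySample));
console.log(countTerminalWithThinking(traceSamples), rowCountGrewWhileRunning(traceSamples));
```

A terminal-only screenshot corresponds to the last sample alone, which is why it cannot show the growth that `rowCountGrewWhileRunning` looks for.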
|
||||
|
||||
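Scripts with random file names, extension-injected scripts, or browser experimental-feature warnings in the browser console must not be attributed directly to HWLAB Cloud Web. On `Permissions policy violation`, `unload is not allowed`, or similar console noise, first use `hwlab nodes web-probe script` in a clean Playwright to record `document.scripts`, the same-origin response headers, and the relevant filtered console output; register it as an HWLAB Web bug only when the script actually comes from the HWLAB public origin or the deployed artifact, or when the response header is clearly set by HWLAB. Random bundle names produced by Edge/Copilot, browser extensions, or user-side injected scripts should be recorded as browser environment noise and do not block Web functional acceptance that has already passed.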
|
||||
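`scripts/web-live-dom-probe.mjs` and UniDesk `hwlab nodes web-probe run|script` are the foundation for DOM acceptance of the original Cloud Web entry. Login sourceRef, the same-origin page helper, URL construction, readiness, Trace sampling, browser-noise classification, and artifact rules are consolidated in `$unidesk-webdev`; this file does not duplicate the usage details, to avoid diverging from the UniDesk WebDev operations surface. A change that touches only this helper is a no-service delivery: choose a direct commit or a PR according to the target HWLAB repo's `AGENTS.md`, and record `rollout=not-applicable` in the closing evidence.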
|
||||
|
||||
### Cloud Web Workbench Prompt Browser Closed Loop
|
||||
|
||||
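Before closing an issue about the Workbench prompt, TraceTimeline, final response, details dialog, or tool-call display, you must have both browser UI evidence and a typed CLI cross-check of the same trace. On the browser side, prefer running `trans <node>:<workspace> playwright --local-dir <local-dir>` in the selected node/lane workspace against the same public origin; do not run the browser locally on the master server, and do not substitute another lane's old port. Inside the heredoc, wait explicitly for `/auth/login`, `#workspace`, `#code-agent-provider-profile`, session selection, `/v1/agent/chat`, and the target trace selector, rather than judging success only by the page title or a broad input count. Screenshot, PDF, or summary artifacts must be returned through `--local-dir` or `trans <node> download`, and the closeout must record the local path, bytes, and SHA-256 verification.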
|
||||
|
||||
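A complete set of Workbench prompt UI evidence should cover: a successful Web session login; a model channel selection that matches the target provider profile; explicitly creating or selecting a session; the prompt being accepted by `/v1/agent/chat` and yielding `traceId/sessionId/conversationId/threadId`; and the user message, Agent message, and final response being visible on the page. If the TraceTimeline starts in the compact/result collapsed state, trigger `回放 Trace` (Replay Trace) in the Web UI and expand the timeline so that tool rows such as `commandExecution` are visible on the page itself. Then, on the same node/lane public origin, cross-check the terminal status, toolCalls, finalResponse, AgentRun run/command/runner ID, and redaction state with `hwlab-cli client agent result <traceId>`, `trace <traceId>`, and `inspect <traceId>` where needed.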
|
||||
|
||||
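For final response UI acceptance of a failed terminal state, the browser evidence must read the user-visible error from the final response body or `.message-text` and align it with `finalResponse.text` from `client agent result` / the turn endpoint for the same trace. If the API returns `error.message` but no `finalResponse.text`, the problem is in the HWLAB API/adapter result mapping; if the API returns `finalResponse.text` but the page still shows an empty final, shows a generic fallback, or shows the error only in a trace row, the problem is in Cloud Web recovery/terminal text selection. Neither kind of issue may be closed on the grounds that the trace timeline renders correctly.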
|
||||
|
||||
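Acceptance for details-dialog and session-recovery issues should also cover the "restore from persisted conversation" path, not just a screenshot of the in-memory state right after a turn completes. If the original conversation reported by the user is not visible to the current acceptance actor but the `result` of the same trace is readable, you may create a temporary conversation visible to the current actor, attach the same real `traceId` and a minimal terminal agent message to its messages, then select that temporary session in the Web UI, open the run details, and wait for the result diagnostics to fill in automatically; delete the temporary conversation after acceptance. The closeout must state that this is a recovery-path verification of the same trace and must not claim to have modified or read the original user conversation.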
|
||||
|
||||
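Before closing an issue about a completed turn, Trace replay, or a session deep link, open `/workbench/sessions/<conversationId>?projectId=<projectId>` directly in a new browser context or an equivalent fresh login. Sending a message and then calling `reload` within the same SPA page only proves that the current in-memory state and same-page cache were not lost immediately; it does not replace a fresh deep link retest. The fresh deep link must restore the user message, terminal agent message, Trace rows, active tab, and the `running=false` terminal state again through conversation detail, turn/trace replay, or an equivalent event replay. If the same-page reload looks fine but under the fresh deep link `/v1/agent/conversations/<conversationId>` or `/v1/agent/turns/<traceId>` returns 404 and the DOM shows `messageCount=0` / `traceRowCount=0`, the issue has not passed and must stay open or be reopened.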
|
||||
|
||||
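Issues about session switching, the session rail, or the Workbench recovery path must verify both the clicked state and the persisted recovery state. The browser evidence should select a non-current session on the same public origin, wait through an idle window long enough to cover asynchronous returns such as `hydrate`, `select-conversation`, and active trace repair, and confirm that the active tab has not reverted; then read `/v1/workbench/workspace?projectId=<projectId>` to confirm that the backend `selectedConversationId` has changed and been persisted; finally refresh the page, wait for the session tabs again, and confirm that the same session is still active. If the currently selected conversation does not appear in the current list window of `/v1/agent/conversations`, the frontend must merge the workspace's selected conversation into the session rail; a refresh that leaves no active tab, a status of "等待 workspace" (waiting for workspace), or a success shown only from in-memory state cannot count as passing evidence. The refreshed session rail list is only a candidate window, not the full truth about the selected session; when the list API temporarily omits the current conversation, the frontend must not clear the current tab, drop the active state, or let the tab flicker and disappear during refresh or deep link recovery. Custom Playwright acceptance should not judge only by tab index or a repeated title; it should pin a unique title, conversationId, or the backend selected id, and record the `select-conversation` response. Health/RPC noise unrelated to session switching should be split into a separate issue and does not block a switching loop that has already passed.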
|
||||
|
||||
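The current list window of the live session rail may contain only empty sessions, or stale conversations that have already expired. When choosing a non-empty, running, or terminal session as an acceptance candidate, first widen the candidate range through `/v1/agent/conversations?projectId=<projectId>&limit=<N>`, then read `/v1/agent/conversations/<conversationId>?projectId=<projectId>` for each candidate; only a session whose detail returns 200 and whose message/trace shape fits the target scenario can serve as positive evidence for switching, reload, or terminal replay. A candidate whose detail returns 404 is itself a stale/list-window negative signal; use it to verify the isolation path, not as a positive reload sample.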
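The candidate-selection rule above can be sketched as a partition over already-fetched detail results. The `CandidateDetail` shape and the `messageCount > 0` criterion are assumptions for illustration; a real probe would fetch each detail from the same public origin and apply whatever message/trace shape the target scenario needs.

```typescript
// Hypothetical summary of one detail read: HTTP status plus a minimal shape signal.
type CandidateDetail = {
  conversationId: string;
  detailStatus: number;
  messageCount: number;
};

// Splits candidates into positive samples (detail 200 with messages) and stale ones (detail 404).
function partitionCandidates(details: CandidateDetail[]) {
  const positive: string[] = [];
  const stale: string[] = [];
  for (const d of details) {
    if (d.detailStatus === 200 && d.messageCount > 0) {
      positive.push(d.conversationId);
    } else if (d.detailStatus === 404) {
      stale.push(d.conversationId);
    }
    // Anything else (empty sessions, other statuses) is neither positive nor stale evidence.
  }
  return { positive, stale };
}

const fetchedDetails: CandidateDetail[] = [
  { conversationId: "conv-a", detailStatus: 200, messageCount: 4 },
  { conversationId: "conv-b", detailStatus: 404, messageCount: 0 },
  { conversationId: "conv-c", detailStatus: 200, messageCount: 0 },
];

console.log(partitionCandidates(fetchedDetails));
// { positive: [ 'conv-a' ], stale: [ 'conv-b' ] }
```

Keeping the stale list separate makes it available for the isolation-path checks instead of silently discarding it.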
|
||||
|
||||
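The authoritative entry for Code Agent turn run state is `/v1/agent/turns/:traceId`, and `traceId` is the single query key. `hwlab-cli client agent result <traceId>`, the Web composer, the cancel button, the session tab animation, and the final message state should all consume `running`, `terminal`, `status`, `finalResponse`, `error`, and AgentRun provenance from the same turn snapshot. `/v1/agent/trace*`, the conversation list, the workspace summary, message status, runnerTrace rows, and currentRequest may serve only as sources of detail, list, or context; they must no longer be used to infer whether a turn is running, completed, or failed, nor act as a fallback that competes with the turn endpoint to overwrite UI state. When the CLI waits for a terminal state it should likewise prefer the `turnUrl` returned by submit or `/v1/agent/turns/:traceId`, and read trace rows only for detail display.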
|
||||
|
||||
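State authority for Code Agent session recovery must rest on the latest message and the trace of the same turn. A top-level conversation `status=idle` only means that no turn is currently in progress; it must not override the message's own `running` / `completed` / `failed`, nor downgrade an agent message to `source`. If the top-level conversation `lastTraceId` has drifted from the trace of the latest user/agent message, the backend summary should correct it to the latest same-turn user/agent trace, and after hydrate/select/retry the frontend should also reattach the trace from the latest active message. When closing an issue about messages lost or stuck after refresh, deep link, or session switching, acceptance must check all of: the `data-status` of the last agent in the DOM, the top-level `lastTraceId` in the conversation API, the `traceId` of the last user/agent message, `runnerTrace.status`, and `running` / `terminal` from `/v1/agent/turns/:traceId`; do not look only at the session rail copy or at whether the trace panel shows placeholder text.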
|
||||
|
||||
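The Workbench session URL is part of the recovery path. Every visible or currently selected session should map to `/workbench/sessions/<conversationId>`, with `/workspace/sessions/<conversationId>` kept only as a compatibility alias. After entering from `/workbench`, creating a session, or clicking the session rail, the address bar should reflect the current active `conversationId`, and copying session information should include a session URL that can be opened directly. When a deep link is opened directly, the Cloud Web static layer must return the SPA `index.html` rather than a 404; the login redirect must preserve the original path; and after login or hydrate, the backend `selectedConversationId` should converge to the id in the URL through `/v1/agent/conversations/<conversationId>` and the workspace select path. When closing a session URL/recovery issue, the browser evidence must include at least the final URL, the workspace API's `selectedConversationId`, the active tab judgment, and one cookie-less direct GET or equivalent probe confirming that the deep link returns `200 text/html`.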
|
||||
|
||||
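The Chinese error on the Cloud Web login page may present an API upstream 502, a mid-rollout state, and a real 401 all as a login failure. On a login failure, look first at `probe.auth.retryCount`, `transientObserved`, `retryable`, `fallbackUsed`, `fingerprint`, and `commanderAction` from `web-probe script`, then probe `/health/live` and the `/auth/login` status on the target public origin and check the API/Web/edge-proxy rollout in the selected namespace; classify it as a credential or user-state problem only when the API is ready and `/auth/login` explicitly returns 401. Once a rollout transient has recovered, rerunning the same short-lived Playwright acceptance is enough; do not record a transient `upstream_unavailable` as a long-term functional defect.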
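The triage order above can be sketched as a small decision function. The `LoginObservation` fields and the verdict labels are hypothetical names for this example; only the rule that a credential or user-state verdict requires a ready API plus an explicit 401 comes from the text, and treating 502 as a transient indicator is an assumption.

```typescript
// Hypothetical summary of what the commander observed for one failed login.
type LoginObservation = {
  apiReady: boolean;          // e.g. /health/live healthy and rollout finished
  loginStatus: number;        // HTTP status returned by /auth/login
  transientObserved: boolean; // mirrors probe.auth.transientObserved
};

type LoginVerdict = "credential-or-user-state" | "rollout-transient" | "inconclusive";

function classifyLoginFailure(o: LoginObservation): LoginVerdict {
  // Only a ready API with an explicit 401 points at credentials or user state.
  if (o.apiReady && o.loginStatus === 401) return "credential-or-user-state";
  // Assumed transient indicators: API not ready, observed transient, or upstream 502.
  if (!o.apiReady || o.transientObserved || o.loginStatus === 502) return "rollout-transient";
  return "inconclusive";
}

console.log(classifyLoginFailure({ apiReady: true, loginStatus: 401, transientObserved: false }));
// credential-or-user-state
console.log(classifyLoginFailure({ apiReady: false, loginStatus: 401, transientObserved: false }));
// rollout-transient
```

A `rollout-transient` verdict maps to rerunning the same short-lived acceptance after recovery rather than filing a functional defect.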
|
||||
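Browser evidence requirements for Workbench prompt, TraceTimeline, final response, details dialog, session switching, deep link, and run-state consistency issues are consolidated in `$unidesk-webdev`; the typed CLI cross-check remains in `$hwlab-code-agent`. Selector, fresh context, turn authority, and login troubleshooting details are no longer duplicated here, to avoid creating multiple paths that diverge from the WebDev operations surface.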
|
||||
|
||||
## HWLAB FRP Maintenance
|
||||
|
||||
|
||||
@@ -431,5 +431,5 @@ ClaudeQQ 的业务源码和持久化数据仍在 D601,但正式运行由 k3s
|
||||
- On D601, debug the business repo and containers with `trans D601 ...` and confirm that `curl http://127.0.0.1:18082/health` and `curl http://127.0.0.1:18082/api/snapshot` work; do not deploy the Pipeline debug service to the main server.
|
||||
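- On D601, debug `~/met_nonlinear` with `trans D601 ...` and confirm that `curl http://127.0.0.1:3288/health` works; final acceptance must return to the public UniDesk frontend and complete project library selection, Fork, adding to the pending-start queue, and starting the queue. Do not deploy the MET Nonlinear backend, Docker build, or training jobs to the main server.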
|
||||
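- On D601, debug `~/.agents/skills/claudeqq` with `trans D601 ...`; the business repo's `docker-compose.unidesk.yml` may be used for local diagnosis, but formal deployment must return to `bun scripts/cli.ts deploy apply --service claudeqq`, the k3s Deployment `claudeqq`, and UniDesk `microservice health/proxy` verification. Do not deploy the ClaudeQQ backend or the NapCat debug service to the main server, and do not treat the diagnostic Compose setup as the formal runtime.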
|
||||
- Run `bun scripts/cli.ts e2e run`, confirm that the user-service checks passed, and confirm that Playwright accesses the public `http://74.48.78.17:18081/`.
|
||||
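- Run `bun scripts/cli.ts e2e run` and confirm that the user-service checks passed; the rules for public browser access, screenshots, and frontend assertions are consolidated in `$unidesk-webdev`.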
|
||||
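- Log in to the public frontend and open `用户服务 / 服务目录`, `用户服务 / Todo Note`, `用户服务 / FindJob`, `用户服务 / Pipeline`, `用户服务 / MET Nonlinear`, and `用户服务 / ClaudeQQ` (the User Services menu; `服务目录` is the service catalog). Confirm that you can see: the main server and D601 providers, repository references, and backend private mappings; Todo Note migration lists and tree tasks; FindJob metrics and job previews; the Pipeline component matrix, React Flow control graph, epoch list, epoch Gantt chart, and run-material index; MET Nonlinear queue/GPU/image/Project config/training history; and the ClaudeQQ NapCat container login QR code/NapCat status/event subscription/message push/recent QQ events. The Todo Note page must be able to create a temporary list, add a task, and delete the temporary list; before deleting, reselect the matching row by the unique temporary list name, and never delete using an unconfirmed current active list. The FindJob page must show real numeric metrics, `HEALTH OK`, and a non-empty job preview. The Pipeline page must show `Pipeline v2 工作台`, `Health OK`, the component count, the epoch Gantt chart, and a structured run-material index. The MET Nonlinear page must show `Health OK`, `Fork Project`, `启动队列`, `当前队列`, the maximum concurrency setting, and the GPU/image panels. The ClaudeQQ page must show `Health OK`, `NapCat 容器登录`, `QQ 事件订阅`, `消息推送`, `事件缓存`, and the private proxy note; it is not enough to stop at the loading skeleton. By default the pages must not show bare JSON, JSONL, or line-by-line logs.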
|
||||
|
||||
@@ -94,7 +94,7 @@ SSH 透传自测是 provider-gateway 部署验收的一部分。目标 Provider
|
||||
|
||||
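User-service backend ports must not be mapped to the public network. provider-gateway may proxy only node-local HTTP addresses or explicit Compose service names on the main server, and business API paths and HTTP methods are further restricted by backend-core `allowedPathPrefixes` and `allowedMethods`. Whether traffic goes through the tunnel or the legacy dispatch, the response must preserve the upstream status, content-type, truncation marker, and proxy-mode diagnostic headers; when the provider WebSocket disconnects, backend-core must explicitly return 504/502 and must not silently fall back to a direct public connection, an old cache, or any business HTTP entry other than provider-gateway.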
|
||||
|
||||
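Automated verification must use Playwright to access the public frontend rather than calling the core API directly inside a container in place of browser acceptance. The standard command is `bun scripts/cli.ts e2e run`; it has Playwright open the public `http://74.48.78.17:18081/`, log in, capture the Provider information and the `查看原始JSON` (view raw JSON) content on the page, and check Provider self-registration, resource metrics, Docker status, and the `provider.upgrade` pre-check. Manual acceptance of an externally added node should reuse the same frontend path: first confirm that the Provider information appears in the node list, then confirm that the resource monitoring and Docker status pages have data for that node, and finally dispatch `echo`, `docker.ps`, or the maintenance-only `host.ssh` probe to that Provider through task scheduling, and review the duration, status, stdout/stderr summary, and failure reason in the task history.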
|
||||
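Automated verification must access the public frontend user entry rather than calling the core API directly inside a container in place of browser acceptance. The standard delivery gate remains `bun scripts/cli.ts e2e run`; the browser execution form, screenshots, assertions, and failure artifacts are consolidated in `$unidesk-webdev`. Manual acceptance of an externally added node should reuse the same frontend path: first confirm that the Provider information appears in the node list, then confirm that the resource monitoring and Docker status pages have data for that node, and finally dispatch `echo`, `docker.ps`, or the maintenance-only `host.ssh` probe to that Provider through task scheduling, and review the duration, status, stdout/stderr summary, and failure reason in the task history.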
|
||||
|
||||
## Provider Ingress
|
||||
|
||||
|
||||