UniDesk E2E Reference
UniDesk delivery is not complete until the public production frontend, public dev frontend proxy when deployed, public provider ingress, internal core API, PostgreSQL database, local provider-gateway self-connection, and frontend Playwright flow pass the relevant end-to-end checks. The canonical automated command is bun scripts/cli.ts e2e run.
Required Preconditions
- `config.json` `network.publicHost` must be the externally reachable host name or IP of the main server, not `127.0.0.1`, when validating browser access from outside the server.
- `bunx playwright install chromium` and `bunx playwright install-deps chromium` must have been run on hosts that execute browser E2E tests.
- The Docker stack must be running through `bun scripts/cli.ts server start`, and `bun scripts/cli.ts server status` must report healthy frontend, provider ingress, internal core, database, and provider-gateway containers.
Automated E2E Scope
bun scripts/cli.ts e2e run validates the following URLs and internal checks derived from config.json. The CLI response is intentionally bounded: it prints check names/statuses, screenshot path, counts, and resultPath; the full per-check diagnostics are written to resultPath under .state/e2e/ so failures remain inspectable without flooding stdout.
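The bounded-output idea can be illustrated with a minimal sketch. The result schema is not documented in this reference, so `CheckResult`, `E2EResult`, `BoundedSummary` and `summarize` are hypothetical names; only `failedChecks`, `resultPath`, the counts and the screenshot path correspond to fields mentioned in the text.

```typescript
// Hypothetical shapes: the real result file schema is an assumption here.
type CheckStatus = "passed" | "failed" | "skipped";

interface CheckResult {
  name: string;
  status: CheckStatus;
  detail?: string; // full diagnostics, kept only in the result file
}

interface E2EResult {
  checks: CheckResult[];
  screenshotPath?: string;
}

interface BoundedSummary {
  checks: { name: string; status: CheckStatus }[];
  failedChecks: string[];
  counts: Record<CheckStatus, number>;
  screenshotPath: string | undefined;
  resultPath: string;
}

// Build the short stdout summary and leave per-check diagnostics out of it.
function summarize(result: E2EResult, resultPath: string): BoundedSummary {
  const counts: Record<CheckStatus, number> = { passed: 0, failed: 0, skipped: 0 };
  for (const check of result.checks) {
    counts[check.status] += 1;
  }
  return {
    checks: result.checks.map(({ name, status }) => ({ name, status })),
    failedChecks: result.checks
      .filter((check) => check.status === "failed")
      .map((check) => check.name),
    counts,
    screenshotPath: result.screenshotPath,
    resultPath,
  };
}
```

An operator investigating a failure would then open the file at `resultPath` to read the `detail` of each failed check instead of relying on stdout.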
Selective Execution Rule
E2E must be run in two stages instead of blindly re-running the full suite after every edit.
- First run only the smallest verification set that covers the current change. For example, a Pipeline right-sidebar layout fix should first use focused Playwright or module-scoped checks against Pipeline timeline visibility, height, overflow and interaction, rather than immediately re-running every Todo Note / FindJob / MET Nonlinear path.
- `bun scripts/cli.ts e2e run --only <pattern[,pattern...]>` selects only matching checks. Pattern matching accepts a full check name such as `frontend:pipeline-step-timeline-visible`, a prefix such as `frontend:pipeline` or `frontend`, or `*` wildcards such as `frontend:*`.
- `bun scripts/cli.ts e2e run --skip <pattern[,pattern...]>` removes matching checks from the current selection.
- `--only` and `--skip` can be combined, for example `bun scripts/cli.ts e2e run --only frontend:* --skip frontend:todo-note-integrated-visible,frontend:findjob-integrated-visible`.
- Targeted execution is real execution, not output filtering: when a selection contains only frontend checks, the command skips unrelated network/database/service check groups instead of still running the full suite in the background.
- Only after the targeted check is green should the operator run the full public `bun scripts/cli.ts e2e run` regression gate to ensure the local fix did not break unrelated modules. Total height, horizontal scrollbars, visibility of key interactions, and the exact module being edited are all valid reasons to prefer a targeted Playwright pass before the final full regression.
- The full-suite run remains mandatory before claiming delivery; selective execution is an efficiency rule for iteration, not a replacement for final regression.
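The pattern rules above can be sketched as a small selection function. This is an illustration under stated assumptions, not the CLI's actual implementation: `patternToRegExp`, `parsePatterns` and `selectChecks` are hypothetical names, and treating every pattern as a prefix match is an assumption drawn from the examples.

```typescript
// Hypothetical helpers: none of these names come from the UniDesk source.

// Turn one pattern into an anchored regular expression.
// "*" matches any run of characters; everything else is literal.
// Assumption: a pattern also matches any check name it is a prefix of.
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}`);
}

// Split a comma-separated option value into trimmed, non-empty patterns.
function parsePatterns(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((item) => item.trim())
    .filter((item) => item.length > 0);
}

// Apply --only first (no --only means "everything"), then remove --skip matches.
function selectChecks(all: string[], only?: string, skip?: string): string[] {
  const onlyPatterns = parsePatterns(only).map(patternToRegExp);
  const skipPatterns = parsePatterns(skip).map(patternToRegExp);
  return all.filter(
    (name) =>
      (onlyPatterns.length === 0 || onlyPatterns.some((re) => re.test(name))) &&
      !skipPatterns.some((re) => re.test(name)),
  );
}
```

Because the selection is computed before anything runs, a caller can derive which check groups are needed from the selected names and skip the rest entirely, which is the behaviour the rule describes.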
Typical targeted commands:
- `bun scripts/cli.ts e2e run --only frontend:pipeline-step-timeline-visible`
- `bun scripts/cli.ts e2e run --only frontend:pipeline`
- `bun scripts/cli.ts e2e run --only frontend --skip frontend:todo-note-integrated-visible,frontend:findjob-integrated-visible`
- `bun scripts/cli.ts e2e run --only network,provider-ingress`
Checks covered by `bun scripts/cli.ts e2e run`:
- Public exposure: the Docker port summary must not show core REST, Code Queue NodePort, or Code Queue public host mappings; the only unrestricted public entries are the production frontend, the dev frontend proxy and provider ingress. PostgreSQL `15432` and OA Event Flow `4255` may be host-mapped only for controlled Code Queue nodes and must be protected by the `DOCKER-USER` source restrictions generated from `network.restrictedHostAccess`; E2E treats either an unreachable generic probe or a verified restricted rule as passing. Probes of known private user-service ports, such as FindJob `3254`, MET Nonlinear `3288`, Todo Note `4211`, legacy Code Queue host ports and File Browser provider port `4251`, must fail. The dev frontend proxy rule is owned by `docs/reference/dev-environment.md`.
- Core API: `docker exec unidesk-backend-core` calls internal `GET /api/overview`, which must report `dbReady: true`, `pgdata.volumeName=unidesk_pgdata_10gb`, a positive PostgreSQL database byte count, and at least one online node; internal `GET /api/performance` must report component request statistics, internal operation statistics, PGDATA usage and Code Queue PostgreSQL storage metadata.
- Provider self-connection: internal `GET /api/nodes` must contain `main-server` with `status: online`, `labels.providerGatewayVersion` equal to the version in `src/components/provider-gateway/package.json`, `labels.providerGatewayUpgradePolicy: "always-enabled"`, `labels.providerGatewayRestartPolicyOk: true`, `labels.providerGatewayPidModeOk: true`, and `labels.providerGatewayRuntimeGuardOk: true`. Internal `GET /api/nodes/system-status` must contain CPU/memory/disk samples plus a non-empty process resource list sorted by `memoryBytes` by default; `memoryBytes` should use PSS when `/proc/[pid]/smaps_rollup` is available, then `rssBytes - statm.shared`, and raw RSS only as the last fallback, and the list must retain `rssBytes` for diagnostics. Internal `GET /api/nodes/docker-status` must contain a Docker snapshot for `main-server`, and every running `provider-gateway` container visible in Docker snapshots must report `restartPolicy: "always"` and `pidMode: "host"`. Public provider ingress `/health` must return ok.
- Provider remote control: internal `/api/dispatch` must successfully complete a real `provider.upgrade` task in `mode: "plan"` so the upgrade path is validated without recreating the running gateway during E2E.
- User services: internal `/api/microservices` must include `todo-note` and `oa-event-flow` on `main-server`, canonical `filebrowser` on `D518`, plus `k3sctl-adapter`, `code-queue`, `findjob`, `pipeline`, `met-nonlinear`, `claudeqq` and `filebrowser-d601` on `D601` with `public=false`.
  - Todo Note: `/api/microservices/todo-note/health` must report `storage=postgres`, `/api/microservices/todo-note/proxy/api/instances` must expose the migrated Todo Note lists, and a temporary Todo Note list create/add/toggle/undo/delete cycle must succeed through the real provider-gateway proxy.
  - OA Event Flow: `/api/microservices/oa-event-flow/health`, `/api/microservices/oa-event-flow/proxy/api/diagnostics`, `/api/microservices/oa-event-flow/proxy/api/events`, `/api/microservices/oa-event-flow/proxy/api/events?tags=service:pipeline` and `/api/microservices/oa-event-flow/proxy/api/stats/trace` must prove that the independent OA event table, the Pipeline bridge and the stats center are reachable through the UniDesk proxy.
  - k3sctl-adapter: `/api/microservices/k3sctl-adapter/health` and `/api/microservices/k3sctl-adapter/proxy/api/control-plane` must expose the D601 `unidesk-k3s` control plane, `kubeApiProxy.mode=kubernetes-api-service-proxy`, D601 active Code Queue instance `servingHealthy=true`, `presentNodeIds` containing `D601`, `missingNodeIds=[]`, `status=healthy`, and `noFallback=true`.
  - Code Queue: `/api/microservices/code-queue/health` must return the active Code Queue backend summary with default model `gpt-5.5` and `egressProxy.connected=true`, and `/api/microservices/code-queue/proxy/api/tasks/overview` must return queue state through backend-core -> k3sctl-adapter -> Kubernetes API service proxy -> k3s/k8s Service, not through a `serviceId=code-queue` provider-gateway direct task or `/api/code-queue-direct`.
  - File Browser: `/api/microservices/filebrowser/health`, `/api/microservices/filebrowser-d601/health` and `/api/microservices/filebrowser/proxy/` must prove File Browser health and WebUI access through the UniDesk proxy.
  - FindJob: `/api/microservices/findjob/health` and `/api/microservices/findjob/proxy/api/summary` must succeed through the real provider-gateway proxy; `/api/microservices/findjob/proxy/api/jobs?__unideskArrayLimit=jobs:5` must return a bounded preview with `_unidesk.arrayLimits` metadata.
  - Pipeline: `/api/microservices/pipeline/health`, `/api/microservices/pipeline/proxy/api/snapshot?__unideskArrayLimit=registry.components:8,runs:3` and `/api/microservices/pipeline/proxy/api/oa-event-flow/diagnostics` must return Pipeline health, registry/run previews and OA event-flow evidence.
  - MET Nonlinear: `/api/microservices/met-nonlinear/health`, `/api/microservices/met-nonlinear/proxy/api/queue`, `/api/microservices/met-nonlinear/proxy/api/projects?root=projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects?root=ex_projects&limit=500`, `/api/microservices/met-nonlinear/proxy/api/projects/config?path=<projectPath>` and `/api/microservices/met-nonlinear/proxy/api/images` must return the D601 TS backend health, queue/GPU policy, full project tree inputs, structured project detail and ready `met-nonlinear-ml:tf26` image status.
- ClaudeQQ availability: `/api/microservices/claudeqq/health` must pass only when `ready=true`, NapCat HTTP and WebSocket are connected, and `napcat.loginState=logged_in`; `/api/microservices/claudeqq/proxy/api/napcat/login` must show the same logged-in account state and `/api/microservices/claudeqq/proxy/api/events/recent` must prove the backend can read the persistent event cache. A QR-code-only or not-logged-in NapCat state must be treated as unhealthy.
- Database: the command writes an `unidesk_e2e_markers` row through `docker exec unidesk-database psql`, confirms provider state is stored in PostgreSQL, and checks that Todo Note rows exist in `todo_note_instances` using the same named volume.
- Pipeline OA event flow: `microservice:pipeline-oa-event-flow` must prove both no-audit and monitor-audit runs are driven by OA events end to end. The event stream must show `node-finished` as a neutral fact with `pipeline:{pipelineId}` and `epoch:{runId}` tags, OA policy as the source of downstream/audit decisions, monitor decisions as OA control events, and runner control-result evidence. E2E must fail if delivery still depends on a legacy detail audit policy flag as policy authority, independent legacy audit-request points, a legacy batch completion gate, direct monitor-to-runner calls, or frontend/CLI writes to Pipeline.state.
- The same Pipeline OA diagnostics must fail on legacy file-transport residuals. Procedure containers, monitor sessions, UI/Gantt DTO builders and CLI fetches must consume prompt/control/stop/display evidence only from the OA event ledger and normalized HTTP read APIs; `control-prompts.jsonl`, `monitor-prompts.jsonl`, `monitor-control`, `control-events.jsonl`, monitor stop files, `.state/pipeline-runs/{runId}/control/commands/`, `PIPELINE_*_APPEND_FILE`, local JSONL append/read helpers, and monitor `/pipeline-state` mounts are forbidden in runtime source.
- Pipeline live Gantt setup: when `frontend:pipeline-gantt-observation-live-running` is selected, E2E first looks for a current Pipeline run that already contains both a `node-long-running-observation` marker and a still-running execution interval. If no such candidate exists, the E2E setup starts the D601 `monitor-management-behavior-test` pipeline through `bun scripts/cli.ts ssh D601 ...` and polls the private backend proxy until the observation candidate exists; the acceptance assertion itself still opens the public frontend with Playwright and verifies the rendered arrows, absence of observation source pseudo-points, target arrow inset, and live flashing running bar through React DOM controls.
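- Frontend: Playwright must open the public frontend URL derived from `network.publicHost`, not localhost or a Docker-internal URL. The flow must cover the following:
  - It logs in with the configured account, waits for `核心在线` (core online), and asserts that `main-server` and `Main Server Provider` are visible.
  - It verifies desktop sidebar collapse and the `PGDATA` overview metric.
  - It opens `运行总览 / 性能面板` (operations overview / performance panel) to verify Bwebui, the component summary, recent failed requests, the internal operation summary and recent slow operations.
  - It clicks `查看原始JSON` (view raw JSON) to verify Provider data from the frontend, and confirms that no raw JSON is visible before that click.
  - It opens task history to verify duration and failure diagnostics.
  - It opens resource nodes `资源监控` (resource monitoring) to verify CPU/Memory/Disk curves, the structured process resource table, default memory-desc sorting, the sortable CPU column and provider upgrade precheck dispatch.
  - It opens `Docker 状态` (Docker status), switches to `main-server`, and verifies the Docker Desktop-style container view including the database named volume `unidesk_pgdata_10gb`.
  - It opens `网关版本` (gateway version) and verifies the provider-gateway version, SSH passthrough availability and remote update availability, plus structured remote update records for `provider.upgrade`.
  - It opens `用户服务 / 服务目录`, `用户服务 / Todo Note`, `用户服务 / OA Event Flow`, `用户服务 / k3s Control`, `用户服务 / Code Queue`, `用户服务 / FindJob`, `用户服务 / Pipeline` and `用户服务 / MET Nonlinear` (the user-service pages, starting with the service catalog) and verifies that all of the following are rendered through React controls: main-server Todo Note/OA Event Flow, D601 Code Queue, D601 business services, repository references and private backend mappings; the Todo Note migrated lists and tree-structured tasks; the OA Event Flow event table and Trace stats table; the k3s control plane, D601 scheduler, read/write instances, Kubernetes API service proxy and no-fallback path; the Code Queue queue, model, output, initial Submitted prompt, automatic loading of the full Trace for terminal-state tasks, append-prompt and interrupt controls; FindJob metrics and job preview; the Pipeline component matrix, MiniMax quota card, structured OA event-flow diagnostics panel, React Flow control graph, epoch Gantt chart, Gantt rendered-image export, monitor first-column ordering, long-running-task observation connectors, absence of observation-source pseudo-points, the live flashing execution bar for running nodes and OpenCode Trace; and the MET Nonlinear project library, Fork, pending-start queue, current queue, completed rows, failure diagnostics, GPU and image panels.
  - It verifies that every API request from the Code Queue page goes through `/api/microservices/code-queue/proxy` and that `/api/code-queue-direct` no longer appears.
  - It verifies deep links: a direct route such as the public `http://<publicHost>:<frontendPort>/app/pipeline/` must land on the Pipeline page; switching to `资源节点 / Docker 状态` (resource nodes / Docker status) must then update the address bar to `/nodes/docker/`, and browser history back navigation must still return to `/app/pipeline/`.
  - It opens `/app/code-queue/` directly and verifies that the page contains `app-shell`, the left main-module sidebar, the top status bar, the top sub-tabs and `code-queue-page`, so that a user-service deep link cannot degrade into a standalone page without the shell. Non-user-service pages such as `态势总览` (situation overview) should live under their own module prefix, for example `/ops/status/`.
  - It verifies that default visible times are displayed in Beijing time, covering at least the top Beijing-time clock, the task-history and gateway-version update times, and the user-service refresh time; these must not drift with the browser's local time zone.
  - Task history and provider upgrade records must not display a real sub-second duration as `0s`; MET Nonlinear running rows must show an ETA derived from backend progress or from `startedAt` plus epoch progress, and queue/completed rows must show training speed as `epoch/h`.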
- Frontend dense-layout regression gate: whenever a frontend change touches the Pipeline right sidebar, Trace timeline, detail drawer, Gantt chart coordinates or other high-information-density panels, Playwright acceptance must inspect both `总高度` (total height) and `横向滚动条` (horizontal scrollbar). For Pipeline specifically, the OpenCode Trace session head must carry shared agent/model/session facts and the Trace body must use the same Code Queue `TraceView` styling; Playwright must fail if the old `.pipeline-opencode-step`, `.pipeline-opencode-flow`, `.pipeline-step-message-card` or `.pipeline-opencode-part` user-visible styles reappear, if the Trace container introduces an internal horizontal scrollbar, or if `frontend:pipeline-gantt-frontend-y-accuracy` fails to prove that the frontend `frontend-y` layout maps ticks, markers and execution bars from timestamps to y coordinates within tolerance.
- OpenCode Trace must use Code Queue Trace styling and must not render the deprecated Pipeline continuous step connector; Playwright should fail if `.pipeline-opencode-flow`, `.pipeline-opencode-step` or any equivalent continuous connector/card returns to the user-visible Trace.
- User service frontend assertions must wait for real backend data, not only the page skeleton; loading placeholders like `--` or empty states are not sufficient for E2E success.
  - Todo Note: the page must show the migrated lists `CONSTAR`, `大论文`, `找工作`, `小论文` and `事务`, support creating a temporary list and task through the frontend, and delete that temporary list afterwards. The temporary list must be selected again by its unique generated name before deletion so E2E never deletes a migrated source list by accident.
  - FindJob: the page must show a numeric `岗位总量` (total jobs), `HEALTH OK`, and a non-empty `PREVIEW` count such as `40/1463 PREVIEW`.
  - Pipeline: the page must show `Pipeline v2 工作台`, `Health OK`, a numeric component count, a non-empty React Flow control graph, `控制图` and `Epoch 甘特图`; after clicking a Gantt execution line it must show `OpenCode Trace` rendered by the shared Code Queue-style Trace component with messages and tool-call groups.
  - MET Nonlinear: the page must show `MET Nonlinear 训练编排`, `Health OK`, `Fork Project`, `加入待启动队列`, `启动队列`, `当前队列`, the max-concurrency setting, task queue and GPU/image panels, and must not show the removed hard-coded `创建10个10轮任务` frontend entry. The project library must render `projects/` and `ex_projects/` as a true path tree with folder Project counts; clicking a project row must open a structured detail panel containing `config.json`, `data/ 训练状态`, `模型参数`, `指标` and a parameter count such as `Total Params`; clicking a completed/current/failed job row must open a structured job detail, and both the row and the detail must show `epoch/h`. Full MET Nonlinear acceptance is driven by public frontend controls: choose a visible source Project, set batch size, epochs and max concurrency in inputs, fork into `projects/unidesk_forks/`, stage the selected forks, start the queue, and verify completed rows plus automatic `metnl-train-*` container removal.
- For ClaudeQQ this means the page must show `Health OK`, `NapCat 容器登录`, `NAPCAT HTTP OK`, `NAPCAT WS OK`, a logged-in state such as `已登录 logged_in`, the event cache, subscriptions and message push controls. A page that only shows a QR code, stale raw JSON, or a running backend without logged-in NapCat is not acceptable.
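The `memoryBytes` fallback order from the provider self-connection check (PSS, then RSS minus shared pages, then raw RSS, with `rssBytes` always retained) can be sketched as follows. Only `memoryBytes` and `rssBytes` are names from this document; `ProcessSample`, `pssBytes`, `statmSharedBytes`, `memorySource`, `toProcessRow` and `buildProcessTable` are hypothetical.

```typescript
// Hypothetical input: one sampled process. Field names other than rssBytes are assumptions.
interface ProcessSample {
  pid: number;
  rssBytes: number;
  pssBytes?: number; // present only when /proc/[pid]/smaps_rollup was readable
  statmSharedBytes?: number; // statm.shared converted from pages to bytes
}

interface ProcessRow {
  pid: number;
  memoryBytes: number;
  rssBytes: number; // always retained for diagnostics
  memorySource: "pss" | "rss-minus-shared" | "rss";
}

// Pick the best available memory estimate in the documented order.
function toProcessRow(sample: ProcessSample): ProcessRow {
  const base = { pid: sample.pid, rssBytes: sample.rssBytes };
  if (sample.pssBytes !== undefined) {
    return { ...base, memoryBytes: sample.pssBytes, memorySource: "pss" };
  }
  if (sample.statmSharedBytes !== undefined) {
    // Clamp at zero in case the two counters were read at slightly different times.
    const estimate = Math.max(0, sample.rssBytes - sample.statmSharedBytes);
    return { ...base, memoryBytes: estimate, memorySource: "rss-minus-shared" };
  }
  return { ...base, memoryBytes: sample.rssBytes, memorySource: "rss" };
}

// Default ordering: largest memoryBytes first.
function buildProcessTable(samples: ProcessSample[]): ProcessRow[] {
  return samples.map(toProcessRow).sort((a, b) => b.memoryBytes - a.memoryBytes);
}
```

Keeping a `memorySource` tag next to the value is one way to let the resource table show which estimate was used; whether the real API exposes such a field is not stated here.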
Frontend JSON Rule
The frontend must render JSON data into React controls by default. Raw JSON is allowed only after an explicit 查看原始JSON user action, and E2E must fail if the initial page exposes raw JSON text or a raw JSON block.
Remote update records in the frontend are covered by the same rule: provider.upgrade task history must be rendered as rows/cards with status, mode, task id, source, duration, policy, outcome summary, and updated time. The page must not expose upgrade plan/result JSON as a log block unless the operator clicks 查看原始JSON.
Provider operation availability is also covered by the structured rendering rule. host.ssh availability must be displayed as badges or equivalent controls derived from capabilities and hostSsh* labels, and remote update availability must be displayed from provider.upgrade capability plus the always-enabled policy; these fields must not require opening raw Provider JSON.
User service pages are covered by the same rule. Todo Note must show lists, task tree, filters, reminder input, movement controls, undo/redo and metrics as controls; OA Event Flow must show health, live tag stream state, event table, tag filter presets and Trace stats table as controls; Code Queue must show queue cards, live transcript, model/cwd/max attempt inputs, judge decision, attempt table, append prompt, interrupt and retry controls; File Browser must show D518 as the default target, D601 as an alternate target, a screenshot export action, and an embedded upstream WebUI frame served through /api/microservices/<id>/proxy/ with compact file rows that do not let material-icon fallback text cover file metadata; FindJob must show metrics, jobs and drafts as cards/tables; Pipeline must show component classes, React Flow graph nodes/edges, run cards, Gantt execution lines and OpenCode Trace timelines as controls; MET Nonlinear must show queue rows, GPU/image cards, a real path tree for the project library, structured project/job detail panels, project config preview, data/ training state, model parameter count, metrics, progress bars, ETA, epoch/h speed and history diagnostics as controls; ClaudeQQ must show NapCat HTTP/WS/login badges, QR/login panel, event cache, subscriptions and message push controls; the full user-service config, summary, snapshot, jobs preview, drafts, OA events and run JSON can only appear after an explicit 查看原始JSON click.
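One way an E2E assertion could detect raw JSON on the initial page is a text heuristic over the visible page text. The sketch below is an assumption about how such a check might look, not the project's actual assertion; `looksLikeRawJson` and the `minPairs` threshold are hypothetical.

```typescript
// Heuristic only: matches `"key": value` pairs that look like serialized JSON.
const JSON_PAIR = /"[^"\\\n]{1,64}"\s*:\s*(?:"|\{|\[|-?\d|true|false|null)/g;

// Returns true when the visible text contains at least `minPairs` JSON-like pairs.
// Rendered controls (badges, tables, cards) normally contain no quoted keys,
// so a small threshold keeps false positives low.
function looksLikeRawJson(visibleText: string, minPairs = 2): boolean {
  const matches = visibleText.match(JSON_PAIR);
  return matches !== null && matches.length >= minPairs;
}
```

In a Playwright test the input could come from something like `await page.locator("body").innerText()`, asserted to be free of raw JSON before the `查看原始JSON` click and allowed to contain it afterwards.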
Public Boundary Rule
The production frontend URL, dev frontend proxy URL and provider ingress URL are the only unrestricted public network interfaces. backend-core REST API remains Docker-internal only; PostgreSQL and OA Event Flow may expose restricted host mappings solely for controlled Code Queue nodes, and E2E must prove those mappings are unreachable to generic clients or protected by explicit source rules.
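The pass/fail logic of the boundary rule can be written as a small decision function. This is a sketch under assumptions: `PortClass`, `PortProbe` and `probePasses` are hypothetical names, and the requirement that public entries be reachable is inferred from their role rather than stated as a probe rule.

```typescript
// Hypothetical classification of a host port for the boundary check.
type PortClass = "public" | "restricted" | "private";

interface PortProbe {
  port: number;
  portClass: PortClass;
  reachableFromGenericClient: boolean;
  restrictedRuleVerified?: boolean; // a DOCKER-USER source rule was confirmed
}

function probePasses(probe: PortProbe): boolean {
  switch (probe.portClass) {
    case "public":
      // Assumption: production frontend, dev frontend proxy and provider ingress must answer.
      return probe.reachableFromGenericClient;
    case "restricted":
      // Either the generic probe cannot connect, or an explicit source rule protects the mapping.
      return !probe.reachableFromGenericClient || probe.restrictedRuleVerified === true;
    case "private":
      // Docker-internal services must never answer a generic client.
      return !probe.reachableFromGenericClient;
  }
}
```

With this shape, PostgreSQL `15432` and OA Event Flow `4255` would be classed as `restricted`, while backend-core REST and private user-service ports would be classed as `private`.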
Database Persistence Rule
The PostgreSQL data volume is the named Docker volume unidesk_pgdata_10gb. CLI server control commands must never use docker compose down -v, docker volume rm, or any equivalent data-volume removal. To validate persistence, insert a marker row into unidesk_e2e_markers, run bun scripts/cli.ts server start or a full stop/start cycle, and verify the marker row still exists.
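The persistence validation is a three-step procedure: insert, restart, re-check. A minimal sketch with injected operations is shown below; `PersistenceOps` and `verifyMarkerSurvivesRestart` are hypothetical names, and the operations are synchronous only to keep the example short (a real version would wrap `docker exec unidesk-database psql` and the CLI restart, and would be asynchronous).

```typescript
// Hypothetical seam: the three operations the procedure needs.
interface PersistenceOps {
  insertMarker(id: string): void; // write a row into unidesk_e2e_markers
  restartStack(): void; // e.g. server start, or a full stop/start cycle
  markerExists(id: string): boolean; // read the row back
}

// Returns true only if the marker is visible both before and after the restart.
function verifyMarkerSurvivesRestart(ops: PersistenceOps, id: string): boolean {
  ops.insertMarker(id);
  if (!ops.markerExists(id)) {
    return false; // the insert itself did not take effect
  }
  ops.restartStack();
  return ops.markerExists(id);
}
```

Checking the marker before the restart as well separates "the write failed" from "the volume was lost", which are different failures to diagnose.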
User Service Restart-Recovery Rule
Any new user service, service migration, or change to a service's Compose/docker run/k8s configuration must prove it can recover after container restart and Docker daemon restart. The delivery evidence must include the service's config.json id/provider/container or Kubernetes Service mapping, restart policy or Deployment replica policy, private port or ClusterIP Service, persistent mounts or PostgreSQL tables, health readiness fields, and at least one post-restart bun scripts/cli.ts microservice health <id> plus a representative microservice proxy check through the real UniDesk path. k3sctl-managed services must prove the proxy path through k3sctl-adapter and Kubernetes API service proxy, not the provider-gateway direct business path.
D601 services have an extra gate because Windows, WSL, Docker Desktop and native k3s are separate supervisors: record the Windows scheduled task or equivalent keepalive, run docker inspect to confirm Docker-managed services such as met-nonlinear-ts have non-empty restart policies and host bind mounts for durable state, and run kubectl -n unidesk get deploy,pod,svc,endpoints claudeqq -o wide for k3s-managed ClaudeQQ. Then verify MET Nonlinear queue/image health and ClaudeQQ logged-in NapCat HTTP/WebSocket state through the real UniDesk proxy after the restart. A service that only becomes running but loses login, queue, token, subscription, data directory or pending work is not restart-recovery complete.
Delivery Gate
Before claiming delivery, run these checks and keep their JSON output or screenshot path available for review:
- `bun scripts/cli.ts check`
- `bun scripts/cli.ts server start`, then `bun scripts/cli.ts job status latest` until `succeeded`
- `bun scripts/cli.ts server status`
- `bun scripts/cli.ts e2e run`, and inspect `failedChecks` or the emitted `resultPath` if any check fails
- a database persistence marker check across at least one CLI-controlled restart
Provider Upgrade Gate
When delivery explicitly includes upgrading or rebuilding a compute-node provider-gateway such as D601 or D518, the automated E2E plan check is not sufficient. The operator must first bootstrap any legacy provider only from a node-local terminal, node-owned web terminal, systemd, scheduled task, or detached shell if it cannot yet schedule upgrades; SSH passthrough carried by the same provider-gateway must not be used for synchronous self-rebuilds. Then run provider.upgrade with mode: "schedule" against that Provider ID, confirm the task succeeds, confirm the sleep-and-validate candidate gateway reconnects in the public frontend, confirm docker inspect reports final restart policy always and PID mode host, record whether systemd/PM2/Windows scheduled task/Docker Desktop autostart is the daemon-level supervisor, and finally verify any required host.ssh capability with bun scripts/cli.ts ssh <PROVIDER_ID> hostname. This schedule check is a node-upgrade gate, not a replacement for the standard public frontend Playwright E2E gate.
External compute nodes should run that schedule check through the remote main-server passthrough form: bun scripts/cli.ts --main-server-ip 74.48.78.17 debug dispatch <PROVIDER_ID> provider.upgrade --mode schedule --wait-ms 15000. The default remote transport logs in to the public frontend and does not require a main server SSH key; this proves the node can validate itself without direct access to backend-core REST or PostgreSQL.
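The `docker inspect` confirmation in the upgrade gate can be automated by parsing the command's JSON output. The sketch below assumes the standard `docker inspect` layout, where each entry carries `HostConfig.RestartPolicy.Name` and `HostConfig.PidMode`; `InspectEntry` and `checkGatewayInspect` are hypothetical names.

```typescript
// Minimal view of one `docker inspect` entry; only the fields used here are declared.
interface InspectEntry {
  Name?: string;
  HostConfig?: {
    RestartPolicy?: { Name?: string };
    PidMode?: string;
  };
}

// Returns a list of problems; an empty list means the gate condition holds.
function checkGatewayInspect(inspectJson: string): string[] {
  const entries = JSON.parse(inspectJson) as InspectEntry[];
  const problems: string[] = [];
  if (entries.length === 0) {
    problems.push("docker inspect returned no containers");
  }
  for (const entry of entries) {
    const name = entry.Name ?? "<unnamed>";
    const restart = entry.HostConfig?.RestartPolicy?.Name ?? "";
    const pidMode = entry.HostConfig?.PidMode ?? "";
    if (restart !== "always") {
      problems.push(`${name}: restart policy is "${restart}", expected "always"`);
    }
    if (pidMode !== "host") {
      problems.push(`${name}: PID mode is "${pidMode}", expected "host"`);
    }
  }
  return problems;
}
```

The input would be the raw stdout of `docker inspect <container>` captured on the upgraded node; the container name to inspect is deployment-specific and is not fixed by this sketch.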