docs: update web sentinel post-task guidance

2026-06-28 10:06:24 +00:00
parent dab32a823e
commit 57ed859e96
2 changed files with 31 additions and 25 deletions
@@ -23,16 +23,16 @@ description: UniDesk monitoring and Web sentinel operations. Use when working on
 ## Quick Commands

 ```bash
-bun scripts/cli.ts web-probe sentinel status --node D601 --lane v03
-bun scripts/cli.ts web-probe sentinel status --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel control-plane status --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel validate --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel dashboard verify --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel dashboard screenshot --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel report --node D601 --lane v03 --sentinel <id> --latest --view summary
-bun scripts/cli.ts web-probe sentinel control-plane trigger-current --node D601 --lane v03 --sentinel <id> --confirm
-trans D601:k3s kubectl -n <namespace> get cronjob -l app.kubernetes.io/component=cadence-scheduler
-trans D601:k3s kubectl -n <namespace> create job --from=cronjob/<quick-verify-cronjob> <manual-job-name>
+bun scripts/cli.ts web-probe sentinel status --node <node> --lane <lane>
+bun scripts/cli.ts web-probe sentinel status --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel control-plane status --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel validate --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel dashboard verify --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel dashboard screenshot --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel report --node <node> --lane <lane> --sentinel <id> --latest --view summary
+bun scripts/cli.ts web-probe sentinel control-plane trigger-current --node <node> --lane <lane> --sentinel <id> --confirm
+trans <node>:k3s kubectl -n <namespace> get cronjob -l app.kubernetes.io/component=cadence-scheduler
+trans <node>:k3s kubectl -n <namespace> create job --from=cronjob/<quick-verify-cronjob> <manual-job-name>
 ```

 For k3s cadence validation, first use the controlled control-plane status/trigger commands, then inspect the rendered CronJob in the target k3s namespace. Manual `kubectl create job --from=cronjob/...` is validation evidence only; persistent cadence changes must be made through YAML/GitOps and redeployed.
@@ -5,43 +5,43 @@
 Primary registry:

 ```bash
-bun scripts/cli.ts web-probe sentinel status --node D601 --lane v03
+bun scripts/cli.ts web-probe sentinel status --node <node> --lane <lane>
 ```

-Known D601/v03 sentinel ids:
+Known sentinel ids vary by node/lane and must come from YAML. Common `v03` examples include:

 - `workbench-dsflash-go-tool-call-10x`
 - `workbench-auth-session-switch-2users`
+- `workbench-fake-echo-session-invariance-10x`
 - `mdtodo-visual-regression`

 Per-sentinel drill-down:

 ```bash
-bun scripts/cli.ts web-probe sentinel status --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel control-plane status --node D601 --lane v03 --sentinel <id>
+bun scripts/cli.ts web-probe sentinel status --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel control-plane status --node <node> --lane <lane> --sentinel <id>
 ```

 Freshness-only check:

 ```bash
-bun scripts/web-probe-sentinel-scheduler.ts run --node D601 --lane v03 --sentinel <id> --stale-multiplier 1 --dry-run
+bun scripts/web-probe-sentinel-scheduler.ts run --node <node> --lane <lane> --sentinel <id> --stale-multiplier 1 --dry-run
 ```

-Host timer installation/status:
+Cadence/runtime validation is k3s-first:

 ```bash
-bun scripts/web-probe-sentinel-scheduler.ts status-systemd --node D601 --lane v03
-bun scripts/web-probe-sentinel-scheduler.ts install-systemd --node D601 --lane v03 --confirm
-bun scripts/web-probe-sentinel-scheduler.ts status-systemd --node D601 --lane v03 --sentinel <id>
+trans <node>:k3s kubectl -n <namespace> get cronjob -l app.kubernetes.io/component=cadence-scheduler
+trans <node>:k3s kubectl -n <namespace> create job --from=cronjob/<quick-verify-cronjob> <manual-job-name>
 ```

-Without `--sentinel`, `status-systemd` and `install-systemd` enumerate every enabled sentinel from the YAML registry and manage independent per-sentinel timers. Use this when a sentinel's latest run is stale: a missing timer is a runtime defect even if `run --dry-run` can enumerate the sentinel and mark it due.
+Host `systemd` timer commands are legacy diagnostics only. Enabled HWLAB Web sentinels must run from the target node/lane k3s CronJob deployed through GitOps. If a sentinel's latest run is stale, first compare the YAML cadence, the latest run age and the rendered CronJob state for that sentinel; a missing or stale CronJob is a runtime defect even if the dry-run scheduler can enumerate the sentinel and mark it due.

 Dashboard render and screenshot verification:

 ```bash
-bun scripts/cli.ts web-probe sentinel dashboard verify --node D601 --lane v03 --sentinel <id>
-bun scripts/cli.ts web-probe sentinel dashboard screenshot --node D601 --lane v03 --sentinel <id>
+bun scripts/cli.ts web-probe sentinel dashboard verify --node <node> --lane <lane> --sentinel <id>
+bun scripts/cli.ts web-probe sentinel dashboard screenshot --node <node> --lane <lane> --sentinel <id>
 ```

 The screenshot command runs through the selected node/lane remote browser and downloads the PNG artifact to the caller's `/tmp` by default. Closeout evidence should cite `localPath`, `sha256`, page HTTP status, selected DOM summary fields and `layout.horizontalOverflow` / `overflowCount`; do not replace this with a local browser screenshot or ad-hoc `web-probe script` when the sentinel command can cover the page.
@@ -53,14 +53,15 @@ Use the freshness-only `--dry-run` scheduler command when the question is only "
 Report drill-down:

 ```bash
-bun scripts/cli.ts web-probe sentinel report --node D601 --lane v03 --sentinel <id> --latest --view summary
-bun scripts/cli.ts web-probe sentinel report --node D601 --lane v03 --sentinel <id> --latest --view findings
-bun scripts/cli.ts web-probe sentinel report --node D601 --lane v03 --sentinel <id> --run <runId> --view trace-frame
+bun scripts/cli.ts web-probe sentinel report --node <node> --lane <lane> --sentinel <id> --latest --view summary
+bun scripts/cli.ts web-probe sentinel report --node <node> --lane <lane> --sentinel <id> --latest --view findings
+bun scripts/cli.ts web-probe sentinel report --node <node> --lane <lane> --sentinel <id> --run <runId> --view trace-frame
 ```

 Public dashboard paths:

 - `https://monitor.pikapython.com/`
+- `https://monitor.pikapython.com/sentinels/d518-workbench-dsflash-go-tool-call-10x/`
 - `https://monitor.pikapython.com/sentinels/workbench-auth-session-switch-2users/`

 Direct API probes for shell/API/render separation:
@@ -79,12 +80,15 @@ Use `web-probe script` with explicit `page.goto("https://monitor.pikapython.com/
 Root registry:

 - `config/hwlab-node-lanes.yaml#lanes.v03.targets.D601.observability.webProbe.sentinels`
+- `config/hwlab-node-lanes.yaml#lanes.v03.targets.D518.observability.webProbe.sentinels`

 Per-sentinel management YAML:

 - `config/hwlab-web-probe-sentinels/d601-v03/workbench-dsflash-go-tool-call-10x.yaml#sentinel`
 - `config/hwlab-web-probe-sentinels/d601-v03/workbench-auth-session-switch-2users.yaml#sentinel`
 - `config/hwlab-web-probe-sentinels/d601-v03/mdtodo-visual-regression.yaml#sentinel`
+- `config/hwlab-web-probe-sentinels/d518-v03/workbench-dsflash-go-tool-call-10x.yaml#sentinel`
+- `config/hwlab-web-probe-sentinels/d518-v03/workbench-fake-echo-session-invariance-10x.yaml#sentinel`

 Typical config refs:

@@ -127,6 +131,8 @@ If `origin/master` advances while rolling out a sentinel, first classify the new

 Source mirror readiness must be proven by the internal mirror object/read probe for the expected commit. A GitHub/source head check alone is not sufficient evidence to skip source sync, because it does not prove the k3s publish job can fetch the object from the node-local mirror.

+If the internal mirror branch is ahead of the expected commit, status may still be ready, but only when the expected object exists and `expected` is an ancestor of the mirror branch tip. Treat that as `mirror-ahead`, not as a source blocker. An exact SHA match is sufficient but not required while master advances in parallel.
+
 Dashboard aggregate counters may include historical runs only when the UI labels that scope explicitly. They must not sit beside a latest-run chart or selected-run check list without a scope label. If trend, run detail and check list disagree, first identify whether each number is a type count, sample count or historical aggregate before changing code.

 For Code Agent multi-round quick-verify, accept the latest run's `turn-summary` / `trace-frame` plus `blockingFindingCount=0` and `controlFindingCount=0`. Analyzer red findings about hydration, API-to-DOM lag or timing drift are investigation evidence unless they coincide with missing durable turns/final responses, failed submit/login/auth, broken continuity, absent report or unavailable user path.
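
The `mirror-ahead` rule added in this change (the expected object exists and `expected` is an ancestor of the mirror branch tip) can be checked with plain git plumbing. The following is a minimal sketch, not the project's actual readiness probe: the function name `classify_mirror`, its arguments, and the `exact` / `blocked` output labels are hypothetical, and it assumes the node-local mirror is reachable as a local git repository path.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): classify mirror readiness
# for an expected commit.
# Usage: classify_mirror <mirror-repo-path> <expected-sha> <mirror-branch>
# Prints one of: exact | mirror-ahead | blocked
classify_mirror() {
  local repo="$1" expected="$2" branch="$3" tip want

  # The expected commit object must be readable from the mirror itself;
  # a GitHub/source head check does not prove this.
  if ! git -C "$repo" cat-file -e "${expected}^{commit}" 2>/dev/null; then
    echo "blocked"
    return 1
  fi

  # Resolve the mirror branch tip; a missing branch is also a blocker.
  tip="$(git -C "$repo" rev-parse --verify --quiet "${branch}^{commit}")" || {
    echo "blocked"
    return 1
  }
  want="$(git -C "$repo" rev-parse "${expected}^{commit}")"

  if [ "$tip" = "$want" ]; then
    # Exact SHA match: sufficient, but not required.
    echo "exact"
  elif git -C "$repo" merge-base --is-ancestor "$want" "$tip"; then
    # The mirror advanced past the expected commit along the same history.
    echo "mirror-ahead"
  else
    # The object exists but is not on the mirror branch's history.
    echo "blocked"
    return 1
  fi
}
```

Under these assumptions, both `exact` and `mirror-ahead` correspond to a ready mirror, while `blocked` covers a missing object, a missing branch, or a diverged history.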