docs: clarify v02 final response closeout

2026-06-04 10:31:06 +00:00
parent c5d85d5409
commit 7ef98fbe08
1 changed file with 2 additions and 0 deletions
@@ -86,6 +86,8 @@ For Cloud Web Workbench and Code Agent issues, the closeout validation must use

 For Cloud Web Workbench Code Agent response or trace-rendering bugs, the minimum Web-equivalent CLI proof is a fresh `hwlab-cli client agent send --wait` against the deployed public Web origin, followed by `hwlab-cli client agent trace <traceId> --render web` against the same origin. The submit proof must show the browser dispatcher family, normally `POST /v1/agent/chat`, result polling through `/v1/agent/chat/result/<traceId>`, `continuation.webEquivalent=true`, `shortConnection=true`, and explicit `sessionId` / `conversationId` / `threadId` binding when those values affect the bug. The result proof must show the final assistant text from `assistantText` or `reply.content`; placeholder status text, result summaries, terminal status messages, and AgentRun completion boilerplate are not acceptable substitutes for the assistant final response.

+For persisted final-response display regressions, a fresh turn alone is not enough when the user report identifies an existing conversation, session, or trace. Re-read the original record on the deployed `v0.2` runtime with locked lane env and the correct `projectId`, because the default session list project may differ from the affected Workbench project. The minimum proof is `client session list --project-id <projectId> --limit <N> --full`, `client session inspect <conversationId> --full`, and `client agent result <traceId> --full`. Passing evidence must show that list and inspect surface the same latest agent `traceId` as `lastTraceId`, that the latest agent text matches the terminal result `reply.content` or equivalent final assistant text, and that known fallback text such as `Code Agent 仍在处理，可以继续 steer 或等待 trace 完成。` is absent from list, inspect, and result output. When the repair is lazy-on-read, run the read path again or capture the exposed repair source/updated marker so the evidence proves that persisted conversation state was repaired, not merely synthesized for one response. `client agent trace <traceId> --render web` remains required for trace-rendering bugs; for persisted conversation-display bugs it is only supporting evidence unless it returns rendered assistant rows from the same original trace.
+
 The `--render web` proof must inspect the rendered body, not only the raw event count. Passing evidence should include `body.render=web`, the shared renderer identity when exposed, `status=completed`, rendered/returned row counts, noise/omitted counts when available, at least one rendered assistant row containing the final assistant text, and an explicit absence check for known non-user boilerplate such as `AgentRun terminal status completed`, `AgentRun result is ready`, and `Code Agent 仍在处理`. If the trace API returns `status=missing`, `sourceEventCount=0`, or no rows for a historical issue trace, treat that trace as expired or unavailable; do not use it as closure evidence. Generate a fresh equivalent turn on the current v0.2 runtime and validate that trace instead.

 CLI/Web-equivalent trace evidence does not replace browser UI evidence for visual, layout, copy-to-clipboard, collapsed-panel, or removed-control bugs. Those require a bounded browser or DOM smoke against `http://74.48.78.17:19666/` after rollout, with assertions on the deployed page text, DOM state, or control behavior that the user reported. A local bundle smoke can support regression coverage, but the closeout still needs the deployed public endpoint unless the browser entry is unavailable and the issue comment records the blocker. Missing Playwright browser binaries or declared test dependencies are not a valid skip; install the repository-declared runner/browser or use an approved system browser executable and record that choice in the validation evidence.
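
The persisted final-response checks added in this change (matching `lastTraceId`, matching final text, and absence of the fallback text) could be automated once the three CLI outputs have been captured and parsed. The following is a minimal sketch only, not the project's tooling. It assumes each output is already available as a Python dict; `lastTraceId`, `reply.content`, and `assistantText` come from the text above, while `latestAgentText` and the function name `check_persisted_final_response` are hypothetical names chosen for illustration and may not match the real output shape.

```python
import json

# Prefix of the known fallback text named in the closeout rules.
FALLBACK_MARKER = "Code Agent 仍在处理"


def check_persisted_final_response(trace_id, list_entry, inspect_record, result):
    """Return a list of failure messages; an empty list means the evidence passes.

    list_entry / inspect_record / result are parsed outputs of the list,
    inspect, and result commands. The `latestAgentText` key is an assumption.
    """
    failures = []
    persisted = (("list", list_entry), ("inspect", inspect_record))

    # List and inspect must both surface the reported trace as lastTraceId.
    for name, record in persisted:
        if record.get("lastTraceId") != trace_id:
            failures.append(f"{name}: lastTraceId does not match {trace_id}")

    # Final assistant text comes from reply.content, else assistantText.
    reply = result.get("reply") or {}
    final_text = reply.get("content") or result.get("assistantText")
    if not final_text:
        failures.append("result: no final assistant text")

    # The persisted latest agent text must equal the terminal result text.
    for name, record in persisted:
        if final_text and record.get("latestAgentText") != final_text:
            failures.append(f"{name}: latest agent text differs from result")

    # The fallback text must be absent from all three outputs.
    for name, record in persisted + (("result", result),):
        dumped = json.dumps(record, ensure_ascii=False)
        if FALLBACK_MARKER in dumped:
            failures.append(f"{name}: fallback text present")

    return failures
```

For a lazy-on-read repair, running the same check against a second capture of the read path would help show that the state was persisted rather than synthesized for one response.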