diff --git a/docs/reference/cli.md b/docs/reference/cli.md index b63fc0c0..7642e180 100644 --- a/docs/reference/cli.md +++ b/docs/reference/cli.md @@ -51,7 +51,7 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI - `codex resume [prompt|--prompt-file path|--prompt-stdin] [--resume-id id] [--dry-run] [--full|--raw]` 对已终态或 awaiting-closeout 的原 Code Queue task 创建后续 turn,优先用于 PR 小修、冲突、rebase、补测和 reviewer feedback,保留原 task、attempt、branch/PR 上下文和 `codexThreadId`/OpenCode session。CLI 会为同一 task/prompt 生成稳定 `resumeId`,也允许显式传入;同一 `resumeId` 加同 prompt 返回 `duplicate_suppressed` 且不重复注入,同一 `resumeId` 加不同 prompt 返回 409 conflict。真实成功只返回 taskId、resumeId/turnId、`deliveryState`、是否复用原 `codexThreadId`、有界 trace confirmation 和 `codex task/detail/trace/output` 后续命令,不回显 prompt 或完整 task state。running/judging task 必须 fail closed 并给出 `disposition=use-steer-for-active-task` 与 `codex steer` 命令,不把 resume 伪装成新 task;不存在 task 返回结构化 not accepted。若 delivery timeout 或 trace 未确认,输出 `deliveryUnconfirmed` 和确认命令,调用方先查 `codex task --trace` 再用同一 `resumeId` 重试。 - `codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/] [--pr-create-dry-run --pr-create-dry-run-head ] [--issue N] [--full|--raw]` 通过稳定 `code-queue` proxy 请求 D601 scheduler `/api/runtime-preflight`,用于 PR 型派单 admission。默认输出是紧凑 commander 视图,显式分出 `schedulerPreflight` 与 `activeRunnerPrCapability`,并附带 `commands` 和 `disclosure`,方便先看 scheduler auth 缺口、再看当前 runner/dev container 的 `gh auth status` 与 `gh pr create --dry-run` 能力;`--full` 或 `--raw` 才展开完整 `preflight`、工具、agent port、Git worktree、GitHub egress、repo/issue/PR 只读探测和观测原文。只报告 `GH_TOKEN`/`GITHUB_TOKEN` 是否存在和来源 key,不打印值。当 auth-broker 配置存在时,`tokenCoverage.source="auth-broker"`、`credentialSource="broker-issued-token"` 且 runner env token 不是成功前提;当仅 env token 存在时,`credentialSource="env-token"` 且 `authBroker.nextAction="use-env-token-until-auth-broker-live"`;两者都缺失时顶层 `ok=false`、`runnerDisposition=infra-blocked`、`degradedReason=auth-broker-needed`,`tokenCoverage.missing` 同时列出 
`GH_TOKEN` 与 `GITHUB_TOKEN`,并输出 `authBroker.source="broker/auth-broker-needed"`、`capability.source="missing-token"`。该 `auth-missing` 的 scope 是 `scheduler-runner-env`,不能简化成“当前 active runner/dev container 不能创建 PR”;默认视图必须带 `scopeBoundary` 和 `activeRunnerPrCapability`。GitHub DNS/API 连接失败应归类为 `failureKind=github-transient`、`degradedReason=github-dns-api-transient`,并带 `retryable=true`、`commanderAction=retry-backoff-or-keep-running-if-heartbeat-fresh` 和有界 `githubTransient.failedProbes`;调用方应重试/退避,且在任务 heartbeat/trace 新鲜时继续监督,不把它当成 auth 缺失或 PR 语义失败。`prCapability` 是 runner-facing 合同摘要,必须包含目标分支、token/auth 来源、`systemGhBinaryRequiredForWrites=false`、UniDesk REST `bun scripts/cli.ts gh` 可用性、push dry-run/PR create dry-run 的 `writesRemote=false`、expected PR handoff、真实 PR 创建需要 commander 授权和 `gh pr merge` 的 `unsupported-command` 边界;系统 `gh` binary 缺失只进入 `tools.systemGhBinary`,不得误判为 UniDesk REST `gh` CLI 不可用。`--remote` 在 runner-like 环境里不再依赖本地 `unidesk-backend-core`、`unidesk-database`、`baidu-netdisk-backend` 容器存在;这些缺失只作为本地观测证据。若远程控制面可达,则继续走远程控制面结果;若远程控制面不可达,则结构化返回 `failureKind=control-plane-missing` / `degradedReason=remote-control-plane-unreachable`,而不是把本地 `backend-core-container-missing` 当作最终阻塞。`--pr-create-dry-run` 不 POST GitHub,只证明 runner 内 PR body 生成、`scripts/cli.ts gh pr create --dry-run` 和 branch 参数形态可用;服务端创建权限仍以 token/auth broker、repo/issue/PR read、push dry-run 和最终授权后的真实 PR 创建结果为准。 - `codex task ` 通过 Code Queue 私有代理按任务 ID 查询结构化审阅摘要;默认只返回任务身份、执行 Provider、工作目录、attempt 计数、原始 prompt、最终 response、最后错误和渐进披露命令,适合指挥官审阅完成未读任务且避免上下文爆炸。`--detail` 仍是有界详细摘要:默认只返回少量 attempt/tool 行、短 prompt/response/stderr/feedback 预览和 omitted/truncated 元数据;需要完整 prompt/response 文本或更多 tool/attempt 细节时再显式加 `--full`、`--tool-limit N`、`--trace` 或 `codex output`。该摘要读取默认由主 server `code-queue-mgr` 从 PostgreSQL 返回,不依赖 D601 `code-queue-read` Service 可用。 -- `codex tasks [--view commander|supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] 
[--limit N] [--before-id id]` 通过同一私有代理输出渐进式披露视图。host commander 轮询应优先使用 `--view commander`:它只返回有界 action map,包含 `activeRunners.count` 及来源/处置、queued/retry_wait 精确计数、terminal-unread 总数和已省略行数、active/stale/heartbeat/final-response blocker 风险、HWLAB#7/#99/#116/#164/#317 与 UniDesk#20/#118 命中、确定性分类和 `codex task/trace/output/read` drill-down 命令,不嵌入完整 prompt、final response、trace、output 或 raw overview。默认 `supervisor` 保持旧低噪声分区视图,只返回 `activeRunning`、`running`、`completedUnread`、`recentCompleted`、`queued`、`activity`、`commanderConcurrency` 和 `executionDiagnostics` 的紧凑行;`activeRunning.count` 是 running+judging 的状态计数,`exact=true` 时来自 queue summary counts,`running.returned` 和 `activeRunning.rowPage.returned` 只是本次返回的紧凑行数。`commanderConcurrency.activeRunnerCount` 是并发策略应使用的 active/running 计数,等于 `activity.effectiveActiveTaskCount`;15 并发策略按 `15 - activeRunnerCount` 计算剩余窗口。`commanderConcurrency.splitBrainDisposition=live-count-as-active` 表示 split-brain 有 fresh heartbeat 证据,应继续监督并计入 active;`interventionRequired=true` 才提示介入。prompt/body 只给短预览和原始字符数,`running`/`completedUnread`/`queued` 默认只返回一个有界小页并通过 section `commands.next` 继续分页,`recentCompleted` 默认限量且不重复 `completedUnread` 未读终态,不嵌入完整 Trace、final response 或全量 overview。`--limit` 在 commander/supervisor 中主要是扫描/分页预算,不是返回几十条肥行的开关;CLI 安全上限是 100,输出会在 `filters.requestedLimit`、`filters.effectiveLimit`、`filters.limitCapped` 和 disclosure 中说明显式请求是否被 capped;底层 overview 拉取预算独立显示在 `source.requestedLimit` / `source.effectiveLimit`,所以 `--limit 260` 应显示 requested=260、effective=100、source requested/effective=200,而不是只露出一个含糊的 `limit`。`--unread` 是 `--unread-only` 的别名,必须只保留未读终态;`--status` 必须真实过滤支持的状态,未知参数或未知状态必须结构化失败。需要更详细当前页任务行时显式使用 `--view full` 或 `--full`,仍受 `--limit` 和 `--before-id` 分页约束。 +- `codex tasks [--view commander|supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] [--limit N] [--before-id id]` 通过同一私有代理输出渐进式披露视图。host commander 轮询应优先使用 `--view commander`:它是低噪声 polling 入口,只返回有界 
action map,包含 `activeRunners.count` 及来源/处置、少量 active item、queued/retry_wait 精确计数、terminal-unread 总数和省略行数、关键风险计数、HWLAB#7/#99/#116/#164/#317 与 UniDesk#20/#118 命中、确定性分类计数和集中式 `codex task/trace/output/read` drill-down 命令。默认 commander 不展开历史 terminal unread item details,也不嵌入 prompt preview、final response preview、trace、output 或 raw overview;terminal unread 详情必须通过 `codex unread`、`codex tasks --unread --view supervisor`、`--view full`、`--full` 或 per-task `codex read ` 获取。默认 `supervisor` 保持旧低噪声分区视图,只返回 `activeRunning`、`running`、`completedUnread`、`recentCompleted`、`queued`、`activity`、`commanderConcurrency` 和 `executionDiagnostics` 的紧凑行;`activeRunning.count` 是 running+judging 的状态计数,`exact=true` 时来自 queue summary counts,`running.returned` 和 `activeRunning.rowPage.returned` 只是本次返回的紧凑行数。`commanderConcurrency.activeRunnerCount` 是并发策略应使用的 active/running 计数,等于 `activity.effectiveActiveTaskCount`;15 并发策略按 `15 - activeRunnerCount` 计算剩余窗口。`commanderConcurrency.splitBrainDisposition=live-count-as-active` 表示 split-brain 有 fresh heartbeat 证据,应继续监督并计入 active;`interventionRequired=true` 才提示介入。`supervisor` prompt/body 只给短预览和原始字符数,`running`/`completedUnread`/`queued` 默认只返回一个有界小页并通过 section `commands.next` 继续分页,`recentCompleted` 默认限量且不重复 `completedUnread` 未读终态,不嵌入完整 Trace、final response 或全量 overview。`--limit` 在 commander/supervisor 中主要是扫描/分页预算,不是返回几十条肥行的开关;CLI 安全上限是 100,输出会在 `filters.requestedLimit`、`filters.effectiveLimit`、`filters.limitCapped` 和 disclosure 中说明显式请求是否被 capped;底层 overview 拉取预算独立显示在 `source.requestedLimit` / `source.effectiveLimit`,所以 `--limit 260` 应显示 requested=260、effective=100、source requested/effective=200,而不是只露出一个含糊的 `limit`。`--unread` 是 `--unread-only` 的别名,必须只保留未读终态;`--status` 必须真实过滤支持的状态,未知参数或未知状态必须结构化失败。需要更详细当前页任务行时显式使用 `--view full` 或 `--full`,仍受 `--limit` 和 `--before-id` 分页约束。 - `codex unread [summary|mark-read] [--queue id] [--repo owner/name] [--issue N] [--status succeeded|failed|canceled[,..]] [--limit N] [--before-id id] [--confirm]` 是完成未读积压的默认低噪声 triage 入口。默认只读返回 
repo/issue/status/queue 计数和最新任务 id 小页,不拉取 per-task summary,不输出 raw prompt、final response、trace 或 output;每行只给 `codex task/detail/trace/output/read` drill-down 命令。批量已读必须使用 `codex unread mark-read ... --confirm`,缺少 `--confirm` 时结构化失败且不 POST `/read`;单任务审阅仍优先 `codex read `。 - `codex task --trace --tail|--from-start|--after-seq N|--before-seq N --limit N` 按页拉取 Code Queue 的逻辑 trace;响应会返回 `nextAfterSeq`、`previousBeforeSeq`、`hasMore`、`hasBefore` 和下一页/上一页命令,默认 `--trace` 取最新一页,且仍以分页 trace 为主;需要完整 prompt/最终 response 时加 `--full`,需要详细 task 摘要时加 `--detail`。 - `codex output --tail|--from-start|--after-seq N|--before-seq N --limit N [--full-text]` 按原始 output seq 分页读取底层记录;当 trace 行提示 `commandOmittedLines`、`bodyOmittedLines` 或 `rawSeqs` 时,用该命令按 seq 补取信息。默认是低噪声 raw-output 摘要:即使传入很大的 `--limit`,非 `--full-text` 也会限制返回行数和单条文本预览,并在 `disclosure.limitCapped`、`requestedLimit`、`effectiveLimit` 和 `commands.fullText` 中说明如何继续展开;显式 `--full-text` 才返回该页全文。 diff --git a/docs/reference/code-queue-supervision.md b/docs/reference/code-queue-supervision.md index 6e212274..703d9cf9 100644 --- a/docs/reference/code-queue-supervision.md +++ b/docs/reference/code-queue-supervision.md @@ -38,6 +38,8 @@ HWLAB 业务目标、验收和实现优先级归 `pikasTech/HWLAB#7`;UniDesk `split-brain live` 且 heartbeat/trace 新鲜时,指挥官必须继续监督,不把它当作服务中断。此类状态的优先动作是继续轮询、继续审阅、继续派单,而不是默认 interrupt 或 cancel。`codex submit` 的默认写入确认也遵守同一口径:如果 queue counts 显示 running/queued 非零,但 default summary 不能枚举 active/queued id 列表,CLI 必须返回 `idsUnavailable=true` / `itemsOmitted=true` 和 `stateDisclosure.idsUnavailableMeaning`,而不是输出看起来像“没有 active/queued 任务”的 `items=[]`。需要 raw drill-down 时使用返回的 `queue.listPreviewPolicy.rawCommand`,即 `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview?limit=30 --raw --full`。 +`codex tasks --view commander --limit N` 是指挥官高频轮询入口,必须保持默认 stdout 严格有界。它优先展示 active runner 精确计数与少量 active item、queued/retry_wait count、terminal unread compact counts、关键 risk counts 和 drill-down commands;历史 terminal unread item details、prompt preview、final 
response preview、trace/output 和 raw overview 不在默认 commander 输出中展开。需要审阅 terminal unread 详情时使用 `bun scripts/cli.ts codex unread --limit N`、`bun scripts/cli.ts codex tasks --unread --view supervisor --limit N`、`bun scripts/cli.ts codex tasks --view full --limit N` 或 per-task `bun scripts/cli.ts codex read `。 + live-read browser audit 只用于观察已部署 UI,不授权写入。未获得显式 live mutation 授权时,审计浏览器只能放行 `GET`、`HEAD` 和 `OPTIONS`;`POST`、`PUT`、`PATCH`、`DELETE` 以及其他可能改变状态的方法必须被拦截并 abort,报告时统一标记为 `audit guard blocked page mutation attempt`,同时记录 method、path、触发的页面动作和已拦截事实。这个证据只能证明页面渲染、只读请求和某个交互会尝试发起写请求;它不能证明 backend outage、写入失败、写入成功、持久化状态变化或 mutating workflow 已验收。需要真实点击、提交、启动、停止、保存、删除、训练或其他 live-mutating acceptance 时,必须先取得针对目标服务、动作和环境的明确授权,并按授权后的验证规则单独记录结果。 每次新派一批任务、接收一批 completed unread 结果,或者发生实质态势变化时,都要同步更新 `#20` 的正文主表;如果当天有滚动简报,则同时更新当日简报 issue 的正文主内容,而不是只在聊天中补上下文。 diff --git a/scripts/code-queue-commander-view-contract-test.ts b/scripts/code-queue-commander-view-contract-test.ts index 5daf5453..662d6f61 100644 --- a/scripts/code-queue-commander-view-contract-test.ts +++ b/scripts/code-queue-commander-view-contract-test.ts @@ -1,6 +1,7 @@ import { codexTasksQueryForTest } from "./src/code-queue"; type JsonRecord = Record; +type RequestRecord = { path: string; method: string }; function assertCondition(condition: unknown, message: string, detail: JsonRecord = {}): void { if (!condition) throw new Error(`${message}: ${JSON.stringify(detail)}`); @@ -73,7 +74,8 @@ function summaryForTask(taskId: string): JsonRecord { }; } -function noisyCommanderFixture(path: string): JsonRecord { +function noisyCommanderFixture(path: string, requests: RequestRecord[] = []): JsonRecord { + requests.push({ path, method: "GET" }); if (path.includes("/summary")) { const taskId = decodeURIComponent(path.split("/api/tasks/")[1]?.split("/")[0] ?? 
"unknown"); return summaryForTask(taskId); @@ -140,12 +142,24 @@ function noisyCommanderFixture(path: string): JsonRecord { } export function runCodeQueueCommanderViewContract(): JsonRecord { - const commander = codexTasksQueryForTest(["--view", "commander", "--limit", "260"], noisyCommanderFixture); - const supervisor = codexTasksQueryForTest(["--view", "supervisor", "--limit", "260"], noisyCommanderFixture); - const full = codexTasksQueryForTest(["--view", "full", "--limit", "260"], noisyCommanderFixture); + const commanderRequests: RequestRecord[] = []; + const commanderLimit8Requests: RequestRecord[] = []; + const fetchCommander = (path: string): JsonRecord => noisyCommanderFixture(path, commanderRequests); + const fetchCommanderLimit8 = (path: string): JsonRecord => noisyCommanderFixture(path, commanderLimit8Requests); + const fetchNoisy = (path: string): JsonRecord => noisyCommanderFixture(path); + const commander = codexTasksQueryForTest(["--view", "commander", "--limit", "260"], fetchCommander); + const supervisor = codexTasksQueryForTest(["--view", "supervisor", "--limit", "260"], fetchNoisy); + const full = codexTasksQueryForTest(["--view", "full", "--limit", "260"], fetchNoisy); + const commanderLimit8 = codexTasksQueryForTest(["--view", "commander", "--limit", "8"], fetchCommanderLimit8); + const fullLimit8 = codexTasksQueryForTest(["--view", "full", "--limit", "8"], fetchNoisy); + const unreadLimit8 = codexTasksQueryForTest(["--unread", "--limit", "8"], fetchNoisy); const commanderBody = JSON.stringify(commander); + const commanderLimit8Body = JSON.stringify(commanderLimit8); + const fullLimit8Body = JSON.stringify(fullLimit8); + const unreadLimit8Body = JSON.stringify(unreadLimit8); const fullBody = JSON.stringify(full); const commanderView = asRecord(asRecord(commander).commander); + const commanderLimit8View = asRecord(asRecord(commanderLimit8).commander); const supervisorView = asRecord(asRecord(supervisor).supervisor); const filters = 
asRecord(commanderView.filters); const activeRunners = asRecord(commanderView.activeRunners); @@ -164,8 +178,14 @@ export function runCodeQueueCommanderViewContract(): JsonRecord { const recentCompletedSection = asRecord(sections.recentCompleted); const recentIds = asArray(recentCompletedSection.items).map((item) => String(asRecord(item).id ?? "")); const terminalIds = asArray(terminalUnreadSection.items).map((item) => String(asRecord(item).id ?? "")); + const activeItems = asArray(activeRunners.items).map(asRecord); const runningRisk = attentionItems.find((item) => item.id === "task-running-risk") ?? {}; - const failedUnread = attentionItems.find((item) => item.id === "task-failed-unread") ?? {}; + const limit8ActiveRunners = asRecord(commanderLimit8View.activeRunners); + const limit8Sections = asRecord(commanderLimit8View.sections); + const limit8TerminalUnread = asRecord(limit8Sections.terminalUnread); + const limit8Commands = asRecord(commanderLimit8View.commands); + const limit8Attention = asRecord(commanderLimit8View.attention); + const limit8AttentionItems = asArray(limit8Attention.items).map(asRecord); assertCondition(commanderBody.length < 30_000, "commander output should stay under the noisy fixture budget", { chars: commanderBody.length }); assertCondition(commanderBody.length < fullBody.length * 0.65, "commander output should stay materially smaller than full output", { commanderChars: commanderBody.length, fullChars: fullBody.length }); @@ -173,7 +193,8 @@ export function runCodeQueueCommanderViewContract(): JsonRecord { assertCondition(activeRunners.count === 14 && activeRunners.exact === true && activeRunners.source === "database-active", "commander view should expose exact active runner count and source/disposition", activeRunners); assertCondition(backlog.queued === 18 && backlog.retryWait === 4 && backlog.total === 22 && backlog.exact === true, "commander view should expose queued/retry_wait exact counts", backlog); 
assertCondition(terminalUnread.total === 8 && terminalUnread.rowsReturned === 3 && terminalUnread.rowsOmitted === 5 && terminalUnread.exact === true, "commander view should expose terminal unread count plus omitted rows", terminalUnread); - assertCondition(attentionCounts.total === 7 && attentionCounts.returned === 7 && attentionCounts.omitted === 0, "commander attention counts should preserve total/returned/omitted", attentionCounts); + assertCondition(activeItems.some((item) => item.id === "task-running-risk") && activeItems.some((item) => item.id === "task-running-watch"), "commander activeRunners should include compact active task items", activeRunners); + assertCondition(attentionCounts.total === 4 && attentionCounts.returned === 4 && attentionCounts.omitted === 0, "commander attention counts should preserve non-terminal attention totals", attentionCounts); assertCondition(highPriorityIssues.present === true && highPriorityIssues.matchedCount === 7, "commander should surface tracked high-priority issues", highPriorityIssues); assertCondition(Number(byCategory["business-user-facing"] ?? 0) >= 1 && Number(byCategory["deployment-artifact"] ?? 0) >= 1 @@ -188,28 +209,47 @@ export function runCodeQueueCommanderViewContract(): JsonRecord { assertCondition(String(commands.rawOverview ?? "").includes("microservice proxy code-queue") && String(commands.rawOverview ?? "").includes("--raw"), "commander should expose raw overview drilldown", commands); assertCondition(String(commands.traceTemplate ?? "").includes("codex task --trace"), "commander should expose trace drilldown template", commands); assertCondition(String(commands.outputTemplate ?? "").includes("codex output "), "commander should expose output drilldown template", commands); - assertCondition(asRecord(runningRisk.commands).show === "bun scripts/cli.ts codex task task-running-risk", "attention row should include task drilldown command", runningRisk); + assertCondition(String(commands.showTemplate ?? 
"").includes("codex task "), "commander should include task drilldown template for attention rows", commands); assertCondition(asArray(runningRisk.riskSignals).includes("stale-recovery-candidate") && asArray(runningRisk.riskSignals).includes("blocked"), "active risk row should expose stale/blocker signals", runningRisk); - assertCondition(asRecord(failedUnread.commands).read === "bun scripts/cli.ts codex read task-failed-unread", "failed unread row should include read command", failedUnread); + assertCondition(!attentionItems.some((item) => item.id === "task-failed-unread"), "default commander attention should not expand terminal unread items", { attentionItems }); assertCondition(!commanderBody.includes("raw-prompt-task-running-risk-20"), "commander output should not dump long raw prompt bodies", { chars: commanderBody.length }); assertCondition(!commanderBody.includes("summary-final-task-running-risk-20"), "commander output should not dump long final response bodies", { chars: commanderBody.length }); + assertCondition(!commanderBody.includes("\"prompt\""), "commander output should not include prompt preview fields by default", { commanderBody }); + assertCondition(!commanderBody.includes("\"last\""), "commander output should not include final-response preview fields by default", { commanderBody }); assertCondition(!recentIds.some((id) => terminalIds.includes(id)), "recentCompleted section must not duplicate terminalUnread rows", { recentIds, terminalIds }); assertCondition(recentIds.length === 3, "recentCompleted commander section should be independently capped", { recentIds }); + assertCondition(terminalUnreadSection.returned === 0 && asArray(terminalUnreadSection.items).length === 0, "default commander terminal unread section should omit item details", terminalUnreadSection); + assertCondition(String(asRecord(terminalUnreadSection.commands).unread ?? 
"").includes("codex unread"), "terminal unread section should point to codex unread drill-down", terminalUnreadSection); assertCondition(asRecord(supervisorView.completedUnread).count === 3 && asRecord(supervisorView.recentCompleted).count === 5, "supervisor view should remain available and keep separate unread/recent sections", supervisorView); + assertCondition(commanderLimit8Body.length < 16_000, "commander --limit 8 output should stay compact for polling", { chars: commanderLimit8Body.length }); + assertCondition(asRecord(commanderLimit8View.filters).requestedLimit === 8, "commander --limit 8 should preserve requested limit disclosure", commanderLimit8View); + assertCondition(asArray(limit8ActiveRunners.items).some((item) => asRecord(item).id === "task-running-risk"), "commander --limit 8 should keep active items", limit8ActiveRunners); + assertCondition(limit8TerminalUnread.returned === 0 && asArray(limit8TerminalUnread.items).length === 0, "commander --limit 8 should not expand terminal unread item details", limit8TerminalUnread); + assertCondition(!limit8AttentionItems.some((item) => String(item.id ?? "").includes("unread")), "commander --limit 8 attention should omit terminal unread rows", { limit8AttentionItems }); + assertCondition(String(limit8Commands.unread ?? "").includes("codex unread"), "commander --limit 8 should keep unread drill-down command", limit8Commands); + assertCondition(String(limit8Commands.full ?? 
"").includes("--view full"), "commander --limit 8 should keep full drill-down command", limit8Commands); + assertCondition(!commanderLimit8Body.includes("RAW_PROMPT_SHOULD_NOT_LEAK") && !commanderLimit8Body.includes("raw-prompt-task-failed-unread"), "commander --limit 8 should not print unread prompt details", { commanderLimit8Body }); + assertCondition(!commanderLimit8Body.includes("summary-final-task-failed-unread"), "commander --limit 8 should not print unread final-response details", { commanderLimit8Body }); + assertCondition(fullLimit8Body.includes("raw-prompt-task-failed-unread") || fullLimit8Body.includes("display-prompt-task-failed-unread"), "--view full should still expose task detail previews", { fullLimit8Body }); + assertCondition(unreadLimit8Body.includes("task-failed-unread") && unreadLimit8Body.includes("readTemplate"), "supervisor unread drill-down should still expose terminal unread task ids", { unreadLimit8Body }); + assertCondition(!commanderLimit8Requests.some((request) => request.path.includes("task-failed-unread") && request.path.includes("/summary")), "default commander --limit 8 should not fetch terminal unread summaries", { commanderLimit8Requests }); return { ok: true, checks: [ "commander view is explicit and bounded", "exact active/queued/retry_wait/terminal-unread counts are preserved", - "attention rows expose stale, heartbeat, terminal-unread and blocker signals", + "attention rows expose active, queued/retry_wait and blocker signals", "high-priority issue refs are surfaced", "deterministic classifier emits requested categories", "drilldown commands are present without prompt/final-response flood", + "commander --limit 8 omits terminal unread details and prompt previews", + "full and unread drill-down paths still expose details", "recent completed does not duplicate terminal unread", "supervisor/full views remain available", ], commanderChars: commanderBody.length, + commanderLimit8Chars: commanderLimit8Body.length, fullChars: 
fullBody.length, }; } diff --git a/scripts/src/code-queue.ts b/scripts/src/code-queue.ts index b9a47a3c..bf733131 100644 --- a/scripts/src/code-queue.ts +++ b/scripts/src/code-queue.ts @@ -32,10 +32,9 @@ const supervisorPromptPreviewChars = 70; const supervisorBodyPreviewChars = 70; const supervisorRecentBodyPreviewChars = 50; const commanderAttentionLimit = 10; +const commanderActiveItemLimit = 8; const commanderSectionReturnedLimit = 5; const commanderRecentCompletedLimit = 3; -const commanderPromptPreviewChars = 96; -const commanderBodyPreviewChars = 120; const commanderIssueTaskPreviewLimit = 4; const commanderConcurrencyTarget = 15; const unreadTriageCountLimit = 12; @@ -442,13 +441,6 @@ interface CommanderAttentionItem { finishedAt?: string | null; unreadTerminal?: boolean; finalResponseAt?: unknown; - prompt: string; - promptChars: number; - promptTruncated?: boolean; - last?: string; - lastAt?: unknown; - lastChars?: number; - lastTruncated?: boolean; commands: { show: string; detail: string; @@ -2281,6 +2273,42 @@ function supervisorExecutionDiagnostics(value: unknown): Record }; } +function commanderExecutionDiagnostics(value: unknown): Record | null { + const diagnostics = supervisorExecutionDiagnostics(value); + if (diagnostics === null) return null; + const listBudget = asRecord(diagnostics.listBudget) ?? {}; + const recovery = asRecord(diagnostics.recovery) ?? {}; + return { + state: diagnostics.state ?? null, + effectiveLiveness: diagnostics.effectiveLiveness ?? null, + recommendedAction: diagnostics.recommendedAction ?? null, + splitBrainLive: diagnostics.splitBrainLive ?? null, + activeHeartbeatCount: diagnostics.activeHeartbeatCount ?? null, + databaseActiveTaskCount: diagnostics.databaseActiveTaskCount ?? null, + schedulerActiveRunSlotCount: diagnostics.schedulerActiveRunSlotCount ?? null, + heartbeatFreshTaskIds: diagnostics.heartbeatFreshTaskIds ?? [], + heartbeatRiskTaskIds: diagnostics.heartbeatRiskTaskIds ?? 
[], + traceGapTaskIds: diagnostics.traceGapTaskIds ?? [], + recovery: { + disposition: recovery.disposition ?? null, + hint: recovery.hint ?? null, + rePollBeforeRecovery: recovery.rePollBeforeRecovery ?? null, + repeatedPollConfirmed: recovery.repeatedPollConfirmed ?? null, + recoveryMutationAllowedByThisSnapshot: recovery.recoveryMutationAllowedByThisSnapshot ?? null, + heartbeatRiskTaskCount: recovery.heartbeatRiskTaskCount ?? null, + staleRecoveryCandidateTaskCount: recovery.staleRecoveryCandidateTaskCount ?? null, + nextPollCommand: recovery.nextPollCommand ?? null, + dryRunReconcileCommand: recovery.dryRunReconcileCommand ?? null, + }, + interpretation: diagnostics.interpretation ?? null, + listBudget: { + idPreviewLimit: listBudget.idPreviewLimit ?? diagnosticsIdPreviewLimit, + truncated: listBudget.truncated ?? false, + rawCommand: listBudget.rawCommand ?? "bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview?limit=30 --raw --full", + }, + }; +} + function compactToolSummary(value: unknown, full: boolean, limit = defaultToolLimit): Record { const record = asRecord(value) ?? {}; const allItems = asArray(record.items); @@ -3945,7 +3973,7 @@ function commanderInfrastructureSignals(rawQueue: Record): Reco source: asString(signal.source) || "code-queue", })); const storageDegraded = health.degraded === true || storage.postgresReady === false || storage.lastError !== null && storage.lastError !== undefined; - return { + const result: Record = { infrastructureBlocker: storageDegraded || boundedSignals.some((signal) => signal.category === "infrastructure-blocker"), status: health.status ?? (storageDegraded ? "degraded" : "ready"), source: "queue.storage.health", @@ -3953,7 +3981,12 @@ function commanderInfrastructureSignals(rawQueue: Record): Reco omittedSignalCount: Math.max(0, rawSignals.length - boundedSignals.length), bounded: true, signals: boundedSignals, - storage: { + actionable: storageDegraded + ? 
"Treat as Code Queue infrastructure-blocker; inspect storage health/logs and wait for bounded dirty-flush retry before duplicating or canceling business tasks." + : "No Code Queue storage infrastructure blocker reported in this overview page.", + }; + if (storageDegraded) { + result.storage = { postgresReady: storage.postgresReady ?? health.postgresReady ?? null, dirtyTaskCount: storage.dirtyTaskCount ?? health.dirtyTaskCount ?? null, dirtyQueueCount: storage.dirtyQueueCount ?? health.dirtyQueueCount ?? null, @@ -3964,11 +3997,9 @@ function commanderInfrastructureSignals(rawQueue: Record): Reco lastClientRotationAt: health.lastClientRotationAt ?? null, lastErrorKind: health.lastErrorKind ?? null, lastErrorTransient: health.lastErrorTransient ?? null, - }, - actionable: storageDegraded - ? "Treat as Code Queue infrastructure-blocker; inspect storage health/logs and wait for bounded dirty-flush retry before duplicating or canceling business tasks." - : "No Code Queue storage infrastructure blocker reported in this overview page.", - }; + }; + } + return result; } function commanderAttentionReasons( @@ -4054,8 +4085,6 @@ function commanderAttentionItem( const status = asString(task.status) || null; const summaryLastAssistant = summary?.lastAssistantMessage ?? task.lastAssistantMessage; const awaitingStatus = finalResponseAwaitingTerminalStatus(status, summaryLastAssistant); - const prompt = supervisorTextSummary(asString(task.displayPrompt ?? task.basePrompt ?? task.prompt), commanderPromptPreviewChars); - const lastMessage = supervisorLastMessage(summaryLastAssistant, commanderBodyPreviewChars); const issues = taskIssueRefs(task, summary); const unreadTerminal = taskUnreadTerminal(task); return { @@ -4076,15 +4105,6 @@ function commanderAttentionItem( attempt: typeof task.currentAttempt === "number" && Number.isFinite(task.currentAttempt) ? task.currentAttempt : null, updatedAt: asString(task.updatedAt) || null, ...(isTerminalTaskStatus(status) ? 
{ finishedAt: asString(task.finishedAt) || null, unreadTerminal } : {}), - prompt: prompt.text, - promptChars: prompt.chars, - ...(prompt.truncated ? { promptTruncated: true } : {}), - ...(lastMessage === null ? {} : { - last: lastMessage.text, - lastAt: lastMessage.at, - lastChars: lastMessage.chars, - ...(lastMessage.truncated ? { lastTruncated: true } : {}), - }), commands: taskDrilldownCommands(taskId, unreadTerminal), }; } @@ -4100,6 +4120,23 @@ function commanderAttentionRank(item: CommanderAttentionItem): number { return severityRank[item.severity] * 10 + actionRank[item.action]; } +function compactCommanderAttentionItem(item: CommanderAttentionItem): Record { + return { + id: item.id, + queue: item.queue, + status: item.status, + ...(item.statusLabel === undefined ? {} : { statusLabel: item.statusLabel }), + severity: item.severity, + action: item.action, + riskSignals: item.riskSignals, + issues: item.issues, + highPriorityIssues: item.highPriorityIssues, + category: item.classification.category, + updatedAt: item.updatedAt, + ...(item.finalResponseAt === undefined ? {} : { finalResponseAt: item.finalResponseAt }), + }; +} + function commanderIdSection(tasks: Record[], summaries: Map>, limit: number, nextCommand: string | null, fullCommand: string): Record { const visible = tasks.slice(0, limit); return { @@ -4111,10 +4148,6 @@ function commanderIdSection(tasks: Record[], summaries: Map visible.length || nextCommand !== null ? 
nextCommand : null, full: fullCommand, - showTemplate: "bun scripts/cli.ts codex task ", - traceTemplate: `bun scripts/cli.ts codex task --trace --tail --limit ${defaultTraceLimit}`, - outputTemplate: `bun scripts/cli.ts codex output --tail --limit ${defaultOutputLimit}`, - readTemplate: "bun scripts/cli.ts codex read ", }, items: visible.map((task) => { const taskId = taskOverviewCandidateKey(task); @@ -4146,9 +4179,8 @@ function commanderClassificationCounts(tasks: Record[], summari return { byCategory, byNoiseClass, - categories: ["business-user-facing", "deployment-artifact", "ci-e2e-evidence", "diagnostics-gate-report", "docs-governance", "infrastructure-blocker", "unknown"], deterministic: true, - sourceFields: ["task prompt previews", "task metadata", "summary lastAssistantMessage preview when fetched"], + disclosure: "counts only; use --view full or codex task for per-task classification evidence", }; } @@ -4198,6 +4230,83 @@ function attentionCounts(items: CommanderAttentionItem[], returnedItems: Command }; } +function activeRunnerItem(task: Record, summary: Record | null, diagnostics: Record): Record { + const taskId = taskOverviewCandidateKey(task); + const status = asString(task.status) || null; + const summaryLastAssistant = summary?.lastAssistantMessage ?? task.lastAssistantMessage; + const awaitingStatus = finalResponseAwaitingTerminalStatus(status, summaryLastAssistant); + const attention = commanderAttentionReasons(task, summary, diagnostics); + const issues = taskIssueRefs(task, summary); + return { + id: taskId, + queue: asString(task.queueId) || null, + status, + ...(awaitingStatus === null ? {} : { + statusLabel: awaitingStatus.label, + closeoutState: awaitingStatus.state, + finalResponseAt: awaitingStatus.finalResponseAt, + }), + severity: attention?.severity ?? null, + riskSignals: attention?.riskSignals ?? 
[],
+    issues,
+    highPriorityIssues: highPriorityIssueRefs(issues),
+    category: commanderTaskClassification(task, summary).category,
+    attempt: typeof task.currentAttempt === "number" && Number.isFinite(task.currentAttempt) ? task.currentAttempt : null,
+    updatedAt: asString(task.updatedAt) || null,
+  };
+}
+
+function commanderActiveRunnerItems(
+  tasks: Record[],
+  summaries: Map>,
+  diagnostics: Record,
+  options: CodexTasksOptions,
+): Record {
+  const visible = tasks.slice(0, commanderActiveItemLimit);
+  const hasMore = tasks.length > visible.length;
+  const runningOptions: CodexTasksOptions = {
+    ...baseTaskListOptions({ ...options, unreadOnly: false, beforeId: undefined }),
+    statusFilter: ["running", "judging"],
+  };
+  return {
+    returned: visible.length,
+    omitted: Math.max(0, tasks.length - visible.length),
+    truncated: hasMore,
+    itemLimit: commanderActiveItemLimit,
+    outputPolicy: "compact active rows only; prompt, final response, trace, and output require drill-down commands",
+    commands: {
+      running: taskListCommandWithView(runningOptions, "supervisor"),
+      full: taskListCommandWithView(runningOptions, "full"),
+    },
+    items: visible.map((task) => activeRunnerItem(task, summaries.get(taskOverviewCandidateKey(task)) ?? null, diagnostics)),
+  };
+}
+
+function commanderTerminalUnreadSection(
+  total: number,
+  unreadRowsOnFetchedPage: Record[],
+  options: CodexTasksOptions,
+): Record {
+  return {
+    count: total,
+    returned: 0,
+    omitted: total,
+    truncated: total > 0,
+    hasMore: total > 0,
+    fetchedRowsOnOverviewPage: unreadRowsOnFetchedPage.length,
+    outputPolicy: "terminal unread task details are intentionally omitted from the default commander poll; use codex unread, supervisor unread, or full drill-down.",
+    commands: {
+      next: total > 0 ? 
`bun scripts/cli.ts codex unread --limit ${Math.min(options.requestedLimit, defaultTasksLimit)}` : null,
+      unread: `bun scripts/cli.ts codex unread --limit ${Math.min(options.requestedLimit, defaultTasksLimit)}`,
+      supervisor: taskListCommandWithView({ ...options, unreadOnly: true }, "supervisor"),
+      full: taskListCommandWithView({ ...options, unreadOnly: true }, "full"),
+      showTemplate: "bun scripts/cli.ts codex task ",
+      readTemplate: "bun scripts/cli.ts codex read ",
+    },
+    items: [],
+  };
+}
+
 function terminalUnreadAggregateCount(taskPage: CodexTasksTaskPage, options: CodexTasksOptions, fallback: number): { total: number; exact: boolean; source: string } {
   const queue = taskPage.queue;
   if (queue !== null && options.queueId === undefined) {
@@ -4242,6 +4351,7 @@ function codexTasksCommanderResult(
   const activeSection = buildSupervisorTaskSection(runningTasks, summaries, taskSectionLimit(options), sectionNextCommand(runningTasks, taskSectionLimit(options), options, nextCommand), fullCommand);
   const activeRunning = supervisorActiveRunningSummary(taskPage, options, activeSection, diagnostics);
   const attentionItems = allTasks
+    .filter((task) => !taskUnreadTerminal(task))
     .map((task) => commanderAttentionItem(task, summaries.get(taskOverviewCandidateKey(task)) ?? 
null, rawDiagnostics))
     .filter((item): item is CommanderAttentionItem => item !== null)
     .sort((left, right) => {
@@ -4276,11 +4386,11 @@ function codexTasksCommanderResult(
     bounded: true,
     disclosure: {
       recommendedFor: "host commander supervision loops",
-      policy: "bounded action map only; no full prompt, final response, trace, output, or raw overview body is included by default",
+      policy: "low-noise polling summary only; no full prompt, final response, terminal unread detail, trace, output, or raw overview body is included by default",
       attentionLimit: commanderAttentionLimit,
+      activeItemLimit: commanderActiveItemLimit,
       sectionReturnedLimit: commanderSectionReturnedLimit,
-      promptPreviewChars: commanderPromptPreviewChars,
-      bodyPreviewChars: commanderBodyPreviewChars,
+      terminalUnreadDetails: "omitted-by-default; use commands.unread, sections.terminalUnread.commands.supervisor, --view full, or codex read ",
       rawOverview: `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview${tasksListQueryString(options)} --raw`,
     },
     activeRunners: {
@@ -4302,6 +4412,7 @@ function codexTasksCommanderResult(
       heartbeatFreshActive: activity.heartbeatFreshActiveTaskCount,
       schedulerLocalActiveQueues: activity.schedulerLocalActiveQueueCount,
       schedulerLocalActiveRunSlots: activity.schedulerLocalActiveRunSlotCount,
+      ...commanderActiveRunnerItems(runningTasks, summaries, rawDiagnostics, options),
     },
     queueBacklog: {
       queued,
@@ -4334,7 +4445,7 @@ function codexTasksCommanderResult(
     highPriorityIssues: commanderHighPriorityIssues(allTasks, summaries),
     classification: commanderClassificationCounts(allTasks, summaries),
     infrastructure,
-    executionDiagnostics: diagnostics,
+    executionDiagnostics: commanderExecutionDiagnostics(rawDiagnostics),
     degraded,
     commands: {
       refresh: taskListCommand(options),
@@ -4353,11 +4464,11 @@ function codexTasksCommanderResult(
     attention: {
       ...attentionCounts(attentionItems, returnedAttention),
       truncated: attentionItems.length > 
returnedAttention.length,
-      items: returnedAttention,
+      items: returnedAttention.map(compactCommanderAttentionItem),
     },
     sections: {
       activeNeedsAttention: commanderIdSection(activeRiskTasks, summaries, commanderSectionReturnedLimit, taskListCommandWithView({ ...options, statusFilter: ["running", "judging"] }, "supervisor"), fullCommand),
-      terminalUnread: commanderIdSection(unreadCompletedTasks, summaries, commanderSectionReturnedLimit, taskListCommandWithView({ ...options, unreadOnly: true }, "supervisor"), fullCommand),
+      terminalUnread: commanderTerminalUnreadSection(terminalUnreadAggregate.total, unreadCompletedTasks, options),
       queuedRetryWait: commanderIdSection(queuedRetryTasks, summaries, commanderSectionReturnedLimit, taskListCommandWithView({ ...options, statusFilter: ["queued", "retry_wait"] }, "supervisor"), fullCommand),
       recentCompleted: commanderIdSection(recentCompletedTasks, summaries, commanderRecentCompletedLimit, nextCommand, fullCommand),
     },
@@ -4527,8 +4638,10 @@ function visibleTaskIdsForOverview(tasks: Record[], options: Co
   const sectionLimit = taskSectionLimit(options);
   if (options.view === "commander") {
     return Array.from(new Set([
-      ...sortRunningWatchTasks(filtered),
-      ...sortCompletedWatchTasks(filtered).filter((task) => taskUnreadTerminal(task)),
+      ...sortRunningWatchTasks(filtered).slice(0, commanderActiveItemLimit),
+      ...sortRunningWatchTasks(filtered)
+        .filter((task) => commanderAttentionReasons(task, null, {}) !== null)
+        .slice(0, commanderAttentionLimit),
       ...sortQueuedWatchTasks(filtered).slice(0, commanderSectionReturnedLimit),
       ...sortCompletedWatchTasks(filtered).filter((task) => !taskUnreadTerminal(task)).slice(0, commanderRecentCompletedLimit),
     ].map((task) => taskOverviewCandidateKey(task))))