feat(code-queue): add commander tasks view

unidesk-code-queue-runner
2026-05-23 10:35:26 +00:00
parent e08bca7ada
commit b3f08c4f44
8 changed files with 839 additions and 38 deletions
+1 -1
@@ -50,7 +50,7 @@ UniDesk 是一个以主 server 为统一入口的分布式工作平台;本文
- `bun scripts/cli.ts ci install/status/run/publish-backend-core/publish-user-service/run-dev-e2e/logs`:在 D601 原生 k3s 上安装和运行 Tekton CI,支持每 commit 检查、Code Queue 只读性能门禁、`CI.json` catalog 驱动的 backend-core 与 user-service commit-pinned 镜像发布和手动触发的 `origin/master:deploy.json#environments.dev` 临时 namespace e2ecatalog/producer/consumer 分工见 `docs/reference/cicd-standardization.md``run-dev-e2e` 的 Git 控制 runner、短 launcher 和 no-CD 边界见 `docs/reference/dev-ci-runner.md`Tekton 规则见 `docs/reference/ci.md`
- `bun scripts/cli.ts codex deploy <commitId>`:旧 Code Queue 兼容部署入口已禁用,原因是它会绕过受控部署边界直连 D601 部署 Code Queue;规则见 `docs/reference/codex-deploy.md`
- `bun scripts/cli.ts codex prompt-lint [prompt|--prompt-file path|--prompt-stdin]` / `codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue <id>]` / `codex pr-preflight [--remote]``prompt-lint` 在派发/steer 前 dry-run 检查 runner prompt 的 DEV 测试授权分级(`read-only`/`live-read`/`live-mutating`)且不回显 prompt`submit --dry-run` 同时给出 MiniMax/GPT/人工路由建议、该 lint 结果和 requested/effective execution mode;真实提交成功只返回写入确认、task id、服务级 runnerPermissions 和后续查看命令,不回显 prompt;`pr-preflight` 只读检查 D601 scheduler/runner 的 GitHub token、egress 和 PR 能力,PR 型派单前必须使用,规则见 `docs/reference/cli.md``docs/reference/code-queue-supervision.md`
- `bun scripts/cli.ts codex task <taskId>`:按 Code Queue 任务 ID 查询默认审阅摘要,只返回原始 prompt、最终 response、最后错误和渐进披露命令;`--detail``codex output` 和 supervisor `--limit` 仍默认有界,完整内容需显式 `--full`/`--full-text`/分页展开;`codex queues [--full] [--limit N] [--page N|--offset N]` 默认分页低噪声输出队列摘要,完整 upstream 只通过 raw command 显式获取。
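- `bun scripts/cli.ts codex task <taskId>`: looks up the default review summary by Code Queue task ID and returns only the original prompt, the final response, the last error, and progressive-disclosure commands. `codex tasks --view commander` is the recommended polling entry point for the host commander; by default it gives a bounded view of the exact active runner count, queued/retry_wait, terminal-unread, active risks, classification, and drill-down commands. `--view supervisor|full`, `codex output`, and `--limit` remain bounded by default; full content requires an explicit `--full`, `--full-text`, or paging. `codex queues [--full] [--limit N] [--page N|--offset N]` pages a low-noise queue summary by default, and the full upstream payload is available only through an explicit raw command.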
- `bun scripts/cli.ts codex unread [--repo owner/name] [--issue N] [--limit N]`:只读汇总完成未读积压并给出 repo/issue/status/queue 计数和 drill-down/read 命令;批量已读必须显式 `codex unread mark-read ... --confirm`,规则见 `docs/reference/cli.md`
- `bun scripts/cli.ts codex judge <taskId> --attempt <n> [--dry-run]`:按指定 task/attempt 用与队列 worker 相同的上下文构建和 MiniMax judge 调用路径单步复现完成判定;`--dry-run` 只输出 prompt/payload 诊断。
- `bun scripts/cli.ts codex steer <taskId> [prompt|--prompt-file path|--prompt-stdin] [--steer-id id] [--dry-run] [--no-retry|--retry-attempts N]` / `codex steer-confirm <taskId> --steer-id <id>`:向运行中的 active turn 注入纠偏提示并用 `steerId` 做幂等/trace 确认;真实输出不回显 prompt,遇到 `deliveryUnconfirmed` 先查确认命令,不重复发送同一纠偏。
+4
@@ -115,6 +115,10 @@
阅读 `AGENTS.md`(本项目 `AGENTS.md` 同时承担 `SKILL.md``scripts/cli.ts` 的解释职责),然后用 cli 手动测试以下内容:运行 `bun scripts/code-queue-cicd-dry-run-contract-test.ts`,确认 Code Queue 标准镜像化 CI/CD dry-run 合同输出 `ok=true`。该测试必须证明 CI 只负责发布 commit-pinned `unidesk/code-queue:<commit>` artifact 与 digest/label summaryDEV CD dry-run 只指向 `unidesk-dev` Code Queue scheduler/read/write/provider-egress-proxyPROD Code Queue 是 structured unsupported 且不暴露 production k3s runtime deploy target,并且输出显式禁止 `codex deploy``server rebuild code-queue`、production namespace/manifest mutation、scheduler/runner restart、任务 interrupt 和 cancel。该测试不得执行真实 deploy/rebuild/restart,不得 interrupt/cancel 运行中任务,也不得执行 prod apply。
## T23E Code Queue Commander View Contract
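Read `AGENTS.md` and `docs/reference/code-queue-supervision.md`, then manually test the following with the cli: run `bun scripts/code-queue-commander-view-contract-test.ts` and confirm that it prints `ok=true`. The test must prove that `bun scripts/cli.ts codex tasks --view commander` is an explicit host commander polling view whose default output is bounded and preserves the exact active runner count with its source/disposition, the queued/retry_wait counts, the terminal-unread total and omitted row count, active stale/heartbeat/final-response blocker risks, hits for HWLAB#7/#99/#116/#164/#317 and UniDesk#20/#118, deterministic task classification, and the `codex task`/`trace`/`output`/`read`/raw overview drill-down commands, and that it does not output the full prompt, the full final response, raw output, or the full trace. The test must also prove that `recentCompleted` does not repeat `completedUnread`, and that `--view supervisor` and `--view full` remain available explicitly. This test must not deploy, roll out, restart the Code Queue backend, interrupt/cancel running tasks, or bulk mark-read.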
## T23A D601 k3s CI Gate
阅读 `AGENTS.md``docs/reference/ci.md`,运行 `bun scripts/cli.ts ci install`,确认 Tekton Pipelines `v1.12.0`、Tekton Triggers `v0.34.0``unidesk-ci` Pipeline/Task/EventListener 已部署到 D601 原生 k3s;随后运行 `bun scripts/cli.ts ci run --revision <已push的commitId> --wait-ms 1200000`,确认 PipelineRun 只执行 clone/check/performance,不调用 `deploy apply``codex deploy`,并确认临时 `code-queue-ci-read` 使用主 PostgreSQL 只读查询 Code Queue 首屏、TraceView summary、TraceView steps 和 step detail 的性能指标。若失败,使用 `bun scripts/cli.ts ci logs <pipelineRun>` 查看 TaskRun 和 Pod 日志;交付说明必须记录性能预算是否通过。
+1 -1
@@ -50,7 +50,7 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI
- `codex resume <taskId> [prompt|--prompt-file path|--prompt-stdin] [--resume-id id] [--dry-run] [--full|--raw]` 对已终态或 awaiting-closeout 的原 Code Queue task 创建后续 turn,优先用于 PR 小修、冲突、rebase、补测和 reviewer feedback,保留原 task、attempt、branch/PR 上下文和 `codexThreadId`/OpenCode session。CLI 会为同一 task/prompt 生成稳定 `resumeId`,也允许显式传入;同一 `resumeId` 加同 prompt 返回 `duplicate_suppressed` 且不重复注入,同一 `resumeId` 加不同 prompt 返回 409 conflict。真实成功只返回 taskId、resumeId/turnId、`deliveryState`、是否复用原 `codexThreadId`、有界 trace confirmation 和 `codex task/detail/trace/output` 后续命令,不回显 prompt 或完整 task state。running/judging task 必须 fail closed 并给出 `disposition=use-steer-for-active-task``codex steer` 命令,不把 resume 伪装成新 task;不存在 task 返回结构化 not accepted。若 delivery timeout 或 trace 未确认,输出 `deliveryUnconfirmed` 和确认命令,调用方先查 `codex task <taskId> --trace` 再用同一 `resumeId` 重试。
- `codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/<name>] [--pr-create-dry-run --pr-create-dry-run-head <head>] [--issue N] [--full|--raw]` 通过稳定 `code-queue` proxy 请求 D601 scheduler `/api/runtime-preflight`,用于 PR 型派单 admission。默认输出是紧凑 commander 视图,显式分出 `schedulerPreflight``activeRunnerPrCapability`,并附带 `commands``disclosure`,方便先看 scheduler auth 缺口、再看当前 runner/dev container 的 `gh auth status``gh pr create --dry-run` 能力;`--full``--raw` 才展开完整 `preflight`、工具、agent port、Git worktree、GitHub egress、repo/issue/PR 只读探测和观测原文。只报告 `GH_TOKEN`/`GITHUB_TOKEN` 是否存在和来源 key,不打印值。当 auth-broker 配置存在时,`tokenCoverage.source="auth-broker"``credentialSource="broker-issued-token"` 且 runner env token 不是成功前提;当仅 env token 存在时,`credentialSource="env-token"``authBroker.nextAction="use-env-token-until-auth-broker-live"`;两者都缺失时顶层 `ok=false``runnerDisposition=infra-blocked``degradedReason=auth-broker-needed``tokenCoverage.missing` 同时列出 `GH_TOKEN``GITHUB_TOKEN`,并输出 `authBroker.source="broker/auth-broker-needed"``capability.source="missing-token"`。该 `auth-missing` 的 scope 是 `scheduler-runner-env`,不能简化成“当前 active runner/dev container 不能创建 PR”;默认视图必须带 `scopeBoundary``activeRunnerPrCapability`。GitHub DNS/API 连接失败应归类为 `failureKind=github-transient``degradedReason=github-dns-api-transient`,并带 `retryable=true``commanderAction=retry-backoff-or-keep-running-if-heartbeat-fresh` 和有界 `githubTransient.failedProbes`;调用方应重试/退避,且在任务 heartbeat/trace 新鲜时继续监督,不把它当成 auth 缺失或 PR 语义失败。`prCapability` 是 runner-facing 合同摘要,必须包含目标分支、token/auth 来源、`systemGhBinaryRequiredForWrites=false`、UniDesk REST `bun scripts/cli.ts gh` 可用性、push dry-run/PR create dry-run 的 `writesRemote=false`、expected PR handoff、真实 PR 创建需要 commander 授权和 `gh pr merge``unsupported-command` 边界;系统 `gh` binary 缺失只进入 `tools.systemGhBinary`,不得误判为 UniDesk REST `gh` CLI 不可用。`--remote` 在 runner-like 环境里不再依赖本地 `unidesk-backend-core``unidesk-database``baidu-netdisk-backend` 容器存在;这些缺失只作为本地观测证据。若远程控制面可达,则继续走远程控制面结果;若远程控制面不可达,则结构化返回 
`failureKind=control-plane-missing` / `degradedReason=remote-control-plane-unreachable`,而不是把本地 `backend-core-container-missing` 当作最终阻塞。`--pr-create-dry-run` 不 POST GitHub,只证明 runner 内 PR body 生成、`scripts/cli.ts gh pr create --dry-run` 和 branch 参数形态可用;服务端创建权限仍以 token/auth broker、repo/issue/PR read、push dry-run 和最终授权后的真实 PR 创建结果为准。
- `codex task <taskId>` 通过 Code Queue 私有代理按任务 ID 查询结构化审阅摘要;默认只返回任务身份、执行 Provider、工作目录、attempt 计数、原始 prompt、最终 response、最后错误和渐进披露命令,适合指挥官审阅完成未读任务且避免上下文爆炸。`--detail` 仍是有界详细摘要:默认只返回少量 attempt/tool 行、短 prompt/response/stderr/feedback 预览和 omitted/truncated 元数据;需要完整 prompt/response 文本或更多 tool/attempt 细节时再显式加 `--full``--tool-limit N``--trace``codex output`。该摘要读取默认由主 server `code-queue-mgr` 从 PostgreSQL 返回,不依赖 D601 `code-queue-read` Service 可用。
- `codex tasks [--view supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] [--limit N] [--before-id id]` 通过同一私有代理输出渐进式披露视图。默认 `supervisor` 是低噪声指挥官视图,只返回 `activeRunning``running``completedUnread``recentCompleted``queued``activity``commanderConcurrency``executionDiagnostics` 的紧凑行;`activeRunning.count` 是 running+judging 的状态计数,`exact=true` 时来自 queue summary counts`running.returned``activeRunning.rowPage.returned` 只是本次返回的紧凑行数。`commanderConcurrency.activeRunnerCount` 是并发策略应使用的 active/running 计数,等于 `activity.effectiveActiveTaskCount`15 并发策略按 `15 - activeRunnerCount` 计算剩余窗口。`commanderConcurrency.splitBrainDisposition=live-count-as-active` 表示 split-brain 有 fresh heartbeat 证据,应继续监督并计入 active;`interventionRequired=true` 才提示介入。prompt/body 只给短预览和原始字符数,`running`/`completedUnread`/`queued` 默认只返回一个有界小页并通过 section `commands.next` 继续分页,`recentCompleted` 默认限量且不重复 `completedUnread` 未读终态,不嵌入完整 Trace、final response 或全量 overview。`--limit` 在 supervisor 中主要是扫描/分页预算,不是返回几十条肥行的开关;CLI 安全上限是 100,输出会在 `filters.requestedLimit``filters.effectiveLimit``filters.limitCapped``disclosure.limitPolicy` 说明显式请求是否被 capped;底层 overview 拉取预算独立显示在 `source.requestedLimit` / `source.effectiveLimit`,所以 `--limit 260` 应显示 requested=260、effective=100、source requested/effective=200,而不是只露出一个含糊的 `limit``--unread``--unread-only` 的别名,必须只保留未读终态;`--status` 必须真实过滤支持的状态,未知参数或未知状态必须结构化失败。需要更详细当前页任务行时显式使用 `--view full``--full`,仍受 `--limit``--before-id` 分页约束。
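- `codex tasks [--view commander|supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] [--limit N] [--before-id id]` outputs progressive-disclosure views through the same private proxy. Host commander polling should prefer `--view commander`: it returns only a bounded action map containing `activeRunners.count` with its source/disposition, exact queued/retry_wait counts, the terminal-unread total and omitted row count, active/stale/heartbeat/final-response blocker risks, hits for HWLAB#7/#99/#116/#164/#317 and UniDesk#20/#118, deterministic classification, and `codex task/trace/output/read` drill-down commands, and it does not embed the full prompt, final response, trace, output, or raw overview. The default `supervisor` view keeps the previous low-noise sectioned layout and returns only compact rows for `activeRunning`, `running`, `completedUnread`, `recentCompleted`, `queued`, `activity`, `commanderConcurrency`, and `executionDiagnostics`. `activeRunning.count` is the running+judging status count and comes from the queue summary counts when `exact=true`; `running.returned` and `activeRunning.rowPage.returned` are only the number of compact rows returned by this call. `commanderConcurrency.activeRunnerCount` is the active/running count that the concurrency policy should use and equals `activity.effectiveActiveTaskCount`; the 15-concurrency policy computes the remaining window as `15 - activeRunnerCount`. `commanderConcurrency.splitBrainDisposition=live-count-as-active` means the split-brain has fresh heartbeat evidence, so supervision should continue and the task counts as active; only `interventionRequired=true` signals intervention. Prompt/body fields carry only a short preview and the original character count. `running`/`completedUnread`/`queued` return a single small bounded page by default and continue paging through the section's `commands.next`. `recentCompleted` is limited by default, does not repeat the unread terminal tasks in `completedUnread`, and does not embed the full Trace, final response, or full overview. In commander/supervisor, `--limit` is mainly a scan/paging budget, not a switch for returning dozens of fat rows. The CLI safety cap is 100, and the output states in `filters.requestedLimit`, `filters.effectiveLimit`, `filters.limitCapped`, and disclosure whether an explicit request was capped. The underlying overview fetch budget is shown separately in `source.requestedLimit` / `source.effectiveLimit`, so `--limit 260` should show requested=260, effective=100, and source requested/effective=200 rather than exposing a single ambiguous `limit`. `--unread` is an alias of `--unread-only` and must keep only unread terminal tasks; `--status` must actually filter on the supported statuses, and unknown arguments or unknown statuses must fail in a structured way. For more detailed task rows on the current page, explicitly use `--view full` or `--full`, which remain subject to `--limit` and `--before-id` paging.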
- `codex unread [summary|mark-read] [--queue id] [--repo owner/name] [--issue N] [--status succeeded|failed|canceled[,..]] [--limit N] [--before-id id] [--confirm]` 是完成未读积压的默认低噪声 triage 入口。默认只读返回 repo/issue/status/queue 计数和最新任务 id 小页,不拉取 per-task summary,不输出 raw prompt、final response、trace 或 output;每行只给 `codex task/detail/trace/output/read` drill-down 命令。批量已读必须使用 `codex unread mark-read ... --confirm`,缺少 `--confirm` 时结构化失败且不 POST `/read`;单任务审阅仍优先 `codex read <taskId>`
- `codex task <taskId> --trace --tail|--from-start|--after-seq N|--before-seq N --limit N` 按页拉取 Code Queue 的逻辑 trace;响应会返回 `nextAfterSeq``previousBeforeSeq``hasMore``hasBefore` 和下一页/上一页命令,默认 `--trace` 取最新一页,且仍以分页 trace 为主;需要完整 prompt/最终 response 时加 `--full`,需要详细 task 摘要时加 `--detail`
- `codex output <taskId> --tail|--from-start|--after-seq N|--before-seq N --limit N [--full-text]` 按原始 output seq 分页读取底层记录;当 trace 行提示 `commandOmittedLines``bodyOmittedLines``rawSeqs` 时,用该命令按 seq 补取信息。默认是低噪声 raw-output 摘要:即使传入很大的 `--limit`,非 `--full-text` 也会限制返回行数和单条文本预览,并在 `disclosure.limitCapped``requestedLimit``effectiveLimit``commands.fullText` 中说明如何继续展开;显式 `--full-text` 才返回该页全文。
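The `--limit` disclosure described for `codex tasks` (an explicit request of 260 reported as requested=260, effective=100, capped) can be sketched as below. This is only an illustration under stated assumptions: `resolveLimitFilters` and `CLI_SAFETY_CAP` are hypothetical names, not the CLI's actual implementation, and the separate `source.requestedLimit` / `source.effectiveLimit` budget is deliberately not modeled because the text does not say how it is derived.

```typescript
// Hypothetical sketch of the documented limit disclosure; not the real CLI code.
const CLI_SAFETY_CAP = 100; // the documented CLI safety cap

interface LimitFilters {
  requestedLimit: number; // what the caller passed via --limit
  effectiveLimit: number; // what the CLI will actually honor
  limitCapped: boolean;   // true when the request exceeded the cap
}

// Clamp an explicit --limit to the cap and report whether capping happened.
function resolveLimitFilters(requestedLimit: number): LimitFilters {
  const effectiveLimit = Math.min(requestedLimit, CLI_SAFETY_CAP);
  return {
    requestedLimit,
    effectiveLimit,
    limitCapped: effectiveLimit < requestedLimit,
  };
}

// Usage: the documented --limit 260 case.
const exampleFilters = resolveLimitFilters(260);
console.log(exampleFilters.requestedLimit, exampleFilters.effectiveLimit, exampleFilters.limitCapped);
```

Keeping the requested and effective values side by side lets a caller see that a cap was applied, which a single `limit` field would hide.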
+6 -3
@@ -265,6 +265,7 @@ replacement runner 只用于方向明显错误、质量不可接受、原 task
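Common entry points:
- `bun scripts/cli.ts codex tasks --view commander --limit N`: the recommended entry point for host commander polling. The output is a bounded action map that must directly show `activeRunners.count`, the count source, the split-brain/heartbeat disposition, exact queued/retry_wait counts, the terminal-unread total and omitted row count, the active risk count, stale/heartbeat/trace gaps, awaiting terminal/judge tasks whose `finalResponse` has appeared while the task is still non-terminal, blocker-like final responses, hits for HWLAB#7/#99/#116/#164/#317 and UniDesk#20/#118, task classification, and next-step drill-down commands. By default it must not output the full prompt, the full final response, raw output, the full trace, or the raw overview; details may only be expanded progressively per task id with `codex task`, `codex task --trace`, `codex output`, `codex read`, or the `rawOverview` command.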
- `bun scripts/cli.ts codex tasks --view supervisor --limit N`:查看默认低噪声监督视图,包括 `activeRunning`、running、完成未读、少量最近完成、queued/runnable、activity、commanderConcurrency、execution diagnostics、任务分类和下一步 drill-down 命令。默认行只保留 task id、队列、短 prompt/body 预览和原始字符数;`--limit` 是扫描/分页预算,不是返回几十条肥行的开关,CLI effective limit 安全上限为 100,输出必须用 `filters.requestedLimit``filters.effectiveLimit``filters.limitCapped``source.requestedLimit``source.effectiveLimit` 区分用户请求、CLI cap 和 overview 源拉取预算;例如 `--limit 260` 应明确显示 requested=260、effective=100、source=200`running.returned` 只是低噪声返回行数。`show/detail/trace/output/full/read` 放在 section template 中,避免每条任务重复刷屏,需要更多内容再按 taskId 展开。刚执行 `codex submit` 后也可以先读 submit 返回的 `submitted.taskStates[]``queue.countContext``queue.activity.effectiveActiveTaskCount``queue.stateDisclosure`;若某个 id preview 有 `idsUnavailable=true`,不要把它当成空队列,按 `queue.listPreviewPolicy.rawCommand` 或本 supervisor 命令继续查。
- `bun scripts/cli.ts codex queues`:查看低噪声队列计数、activity、commanderConcurrency、active task id、完成未读队列、runnable 队列和控制面诊断;需要完整队列行视图时加 `--full`,但 `--full` 仍默认分页,继续用 `--limit N``--page N``--offset N` 渐进展开。summary 和 full 都使用稳定 JSON path `.data.queues.items[]` 读取队列行,并从 `.data.queues.commanderConcurrency``.data.queues.activity``.data.queues.counts``.data.queues.executionDiagnostics` 读取全局活跃计数和执行诊断;完整 upstream 只通过输出中的 raw command 显式获取。
- `bun scripts/cli.ts codex unread --limit N`:查看完成未读审阅积压的默认 triage,按 repo、issue、status 和 queue 汇总,并给出有界最新任务和 drill-down/read 命令;默认不输出 raw prompt、final response、trace 或 output。
@@ -276,7 +277,9 @@ replacement runner 只用于方向明显错误、质量不可接受、原 task
- `bun scripts/cli.ts codex resume <taskId> --prompt-file <path>`:对已终态或 awaiting-closeout 的原 task 追加后续修正 turn,适合 PR 小修、冲突、rebase、补测和 review 修正;running/judging task 改用 `codex steer`
- When the master control-plane state and the D601 scheduler state appear to be split, use the liveness rules in `docs/reference/observability.md` to decide.
默认 supervisor 视图必须保持低噪声。`activeRunning.count` 是指挥官 active running 计数,来源是 queue summary 的 status counts 时 `activeRunning.exact=true`,用于 redline 判断;`activeRunning.rowPage.returned` / `running.returned` 只表示本次返回的紧凑任务行。`activeRunning.redline` 必须写明 `countField`、routine target、burst redline、hard redline、`state``decisionReady`;只有 `decisionReady=true` 时,才能直接用该 count 做红线/补派判断。`running``completedUnread``queued` 即使传入较大的 `--limit`,默认也只返回一个很小的有界页,并通过 section `commands.next` 继续分页;`--limit` 保留为扫描/分页预算和 full view 返回预算,不得让一次 supervisor 调用输出几十条肥行。每个任务行只应带 task id 和必要摘要,`show``detail``trace``output``full``read` 使用 section template 表达,让下一步渐进披露动作明确且不重复;默认不得嵌入完整 queue 列表、完整 final response、raw output 页或完整 trace 行。`recentCompleted` 必须默认限量,且不得重复 `completedUnread` 里的未读终态,避免完成历史把当前 running、阻塞和未读审阅挤出视野;需要完整当前页时显式使用 `--view full``executionDiagnostics` 只能展示有界 task-id/reason 预览、总数、截断标记和 omitted counts;需要全量诊断时使用输出中的 raw command。`commands.read` 只是在人工审阅后的建议命令,listing 命令绝不能自动执行。
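The default commander/supervisor views must stay low-noise. The commander view answers "what needs handling right now", while the supervisor view shows the small sectioned pages and redline details. The commander's `activeRunners.count` is the commander's active runner count, and the supervisor's `activeRunning.count` is the running+judging status count; both must state exact/source, and the number of returned rows must not be treated as the concurrency total. When `activeRunning.count` comes from the queue summary status counts, `activeRunning.exact=true` and the count is used for redline decisions; `activeRunning.rowPage.returned` / `running.returned` only indicate the compact task rows returned by this call. `activeRunning.redline` must spell out `countField`, the routine target, the burst redline, the hard redline, `state`, and `decisionReady`; the count may be used directly for redline/backfill decisions only when `decisionReady=true`. The commander's `attention.items` returns only a bounded set of the tasks most in need of handling, and `attention.total/returned/omitted` must preserve the omitted count; `sections.recentCompleted` must not repeat the unread terminal tasks in `sections.terminalUnread`. Even with a large `--limit`, `running`, `completedUnread`, and `queued` return only a very small bounded page by default and continue paging through the section's `commands.next`; `--limit` is reserved as the scan/paging budget and the full view return budget, and a single commander/supervisor call must not output dozens of fat rows. Each task row should carry only the task id and a necessary summary; `show`, `detail`, `trace`, `output`, `full`, and `read` are expressed through the section template or row commands so that the next progressive-disclosure action is clear and not repeated. By default the view must not embed the full queue list, the full final response, raw output pages, or full trace rows. `recentCompleted` must be limited by default and must not repeat the unread terminal tasks in `completedUnread`, so that completion history does not crowd out current running, blocked, and unread-review tasks; use `--view full` explicitly when the full current page is needed. `executionDiagnostics` may show only bounded task-id/reason previews, totals, truncation markers, and omitted counts; use the raw command in the output for full diagnostics. `commands.read` is only a suggested command for after human review, and the listing command must never execute it automatically.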
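Task classification in the commander view must be a deterministic field that distinguishes at least `business-user-facing`, `deployment-artifact`, `ci-e2e-evidence`, `diagnostics-gate-report`, `docs-governance`, `infrastructure-blocker`, and `unknown`. Classification is used only for supervision priority and noise folding and does not replace task acceptance. When a task has blocker-like language in its final response, is failed/terminal-unread, carries heartbeat/stale risk, has a trace gap, or is awaiting terminal/judge, it must enter attention or the risk counts no matter how low-noise its category is.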
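The escalation rule above (risk signals override a low-noise category) can be sketched as a pure predicate. This is a minimal illustration, not the CLI's implementation: the `CommanderRow` shape, its field names, and `needsAttention` are all hypothetical; only the category names and the list of risk signals come from the text.

```typescript
// Category names as listed in the text; everything else here is hypothetical.
type TaskCategory =
  | "business-user-facing"
  | "deployment-artifact"
  | "ci-e2e-evidence"
  | "diagnostics-gate-report"
  | "docs-governance"
  | "infrastructure-blocker"
  | "unknown";

// Hypothetical compact row; the real commander row shape may differ.
interface CommanderRow {
  id: string;
  category: TaskCategory;
  status: string;                    // raw scheduler status
  unread: boolean;                   // true when the task has not been marked read
  blockerLikeFinalResponse: boolean; // final response contains blocker-like language
  heartbeatOrStaleRisk: boolean;
  traceGap: boolean;
  awaitingTerminalJudge: boolean;
}

const TERMINAL_STATUSES = new Set(["succeeded", "failed", "canceled"]);

// The category is intentionally never consulted: any risk signal escalates the row.
function needsAttention(row: CommanderRow): boolean {
  const terminalUnread = TERMINAL_STATUSES.has(row.status) && row.unread;
  return (
    row.blockerLikeFinalResponse ||
    row.status === "failed" ||
    terminalUnread ||
    row.heartbeatOrStaleRisk ||
    row.traceGap ||
    row.awaitingTerminalJudge
  );
}
```

Because the predicate ignores `category`, a `docs-governance` task with a trace gap is escalated exactly like a business task, which is the behavior the paragraph asks for.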
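The `status` in `codex tasks` is always the raw scheduler/control-plane status and is never rewritten because a worker final response has been seen. If the last assistant text of a non-terminal task comes from `finalResponse`, the CLI additionally shows `statusLabel`, `awaitingTerminalJudge=true`, and `closeoutState=awaiting-terminal-or-judge` or `awaiting-judge`, along with a closeout hint. The commander should read such a row as "the worker has produced its final reply text, but Code Queue is still waiting for the agent terminal event, the scheduler write-back, or the judge result". The task still occupies the active/running supervision window and must not be `read` or accepted as a completed task until `status` reaches `succeeded`, `failed`, or `canceled` and the judge/terminal record can be reviewed.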
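A small sketch of how the closeout fields could be derived without touching the raw `status`. The function name `closeoutView` is hypothetical, and the mapping of `judging` to `awaiting-judge` is an assumption made for illustration; the text only says that one of the two `closeoutState` values is shown.

```typescript
const TERMINAL = new Set(["succeeded", "failed", "canceled"]);

interface CloseoutView {
  status: string; // raw scheduler status, passed through unchanged
  awaitingTerminalJudge: boolean;
  closeoutState: "awaiting-terminal-or-judge" | "awaiting-judge" | null;
}

// lastAssistantSource mirrors the `source` of the last assistant message, e.g. "finalResponse".
function closeoutView(status: string, lastAssistantSource: string | null): CloseoutView {
  const awaiting = !TERMINAL.has(status) && lastAssistantSource === "finalResponse";
  // Assumption: a task already in `judging` is labelled `awaiting-judge`.
  const closeoutState: CloseoutView["closeoutState"] = !awaiting
    ? null
    : status === "judging"
      ? "awaiting-judge"
      : "awaiting-terminal-or-judge";
  return { status, awaitingTerminalJudge: awaiting, closeoutState };
}

// Usage: a running task whose worker already produced its final reply.
const runningWithFinal = closeoutView("running", "finalResponse");
console.log(runningWithFinal.status, runningWithFinal.closeoutState);
```

The returned `status` is always the input value, so a caller cannot mistake an awaiting row for a completed one.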
@@ -288,7 +291,7 @@ replacement runner 只用于方向明显错误、质量不可接受、原 task
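The heartbeat stale threshold for stale-active recovery and for the `/api/scheduler/reconcile?staleMs=...` diagnostic endpoint must be normalized against a safe lower bound: a missing value and any value below the default of 5 minutes are both treated as 5 minutes, oversized values are truncated at the 24-hour upper bound, and the structured response returns `requestedStaleMs*`, `staleMsAdjusted`, `staleMsAdjustmentReason`, `minStaleMs`, and `maxStaleMs`. No `staleMs=0` or too-low threshold may cause a task that still has a fresh scheduler heartbeat to be judged stale/recoverable.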
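The normalization rule above amounts to a clamp with disclosure. The sketch below assumes a hypothetical `normalizeStaleMs` helper; the effective `staleMs` field name, the reason strings, and the choice to report a missing value as adjusted are illustrative assumptions. Only the 5-minute floor, the 24-hour ceiling, and the disclosure field names come from the text.

```typescript
const MIN_STALE_MS = 5 * 60 * 1000;       // 5 minutes: default and lower bound
const MAX_STALE_MS = 24 * 60 * 60 * 1000; // 24 hours: upper bound

interface StaleMsDisclosure {
  requestedStaleMs: number | null;        // null when the caller sent nothing usable
  staleMs: number;                        // effective threshold (field name is hypothetical)
  staleMsAdjusted: boolean;
  staleMsAdjustmentReason: string | null; // reason strings are hypothetical
  minStaleMs: number;
  maxStaleMs: number;
}

function normalizeStaleMs(requested: number | null | undefined): StaleMsDisclosure {
  const requestedStaleMs =
    typeof requested === "number" && Number.isFinite(requested) ? requested : null;
  let staleMs = MIN_STALE_MS;
  let reason: string | null = null;
  if (requestedStaleMs === null) {
    reason = "missing-defaulted-to-min";
  } else if (requestedStaleMs < MIN_STALE_MS) {
    reason = "below-min-raised-to-min";
  } else if (requestedStaleMs > MAX_STALE_MS) {
    staleMs = MAX_STALE_MS;
    reason = "above-max-truncated";
  } else {
    staleMs = requestedStaleMs; // already within the safe range
  }
  return {
    requestedStaleMs,
    staleMs,
    staleMsAdjusted: reason !== null,
    staleMsAdjustmentReason: reason,
    minStaleMs: MIN_STALE_MS,
    maxStaleMs: MAX_STALE_MS,
  };
}
```

With this shape, `staleMs=0` can never produce an effective threshold below five minutes, so a task with a fresh heartbeat is not swept up by an aggressive query parameter.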
`codex queues` 和默认 supervisor 视图的 `activity` / `commanderConcurrency` 是指挥官并发治理的主读数。并发决策固定使用 `commanderConcurrency.activeRunnerCount`,它等于 `activity.effectiveActiveTaskCount`15 并发策略的可补窗口按 `15 - activeRunnerCount` 计算,不能用 `activeQueueIds.length` 或 scheduler-local slot 数替代。`effectiveActiveTaskCount` 表示用于调度判断的有效活跃任务数;`databaseRunningTaskCount` 来自 PostgreSQL 中 `running` 状态计数;`databaseActiveTaskCount` 覆盖 running/judging 等数据库活跃任务;`heartbeatFreshActiveTaskCount` 表示 heartbeat-fresh 的有效 runner 数;`schedulerLocalActiveQueueCount``schedulerLocalActiveRunSlotCount` 只表示当前控制面本地可见 active run slots。`activeQueueIds``activeQueueCount` 是 scheduler-local 字段,可能在 `counts.running>0` 且 heartbeat 新鲜时为 0;看到这种组合时应按 `activity.effectiveActiveTaskCount``activity.heartbeatFreshActiveTaskCount``splitBrainLive` 决策,不得把空 `activeQueueIds` 当作零并发或停摆证据。`commanderConcurrency.splitBrainDisposition=live-count-as-active` 表示 split-brain 仍是 live 且应计入 active runner`interventionRequired=true`、heartbeat risk、stale recovery candidates,或非 `continue-supervision` 的 recommended action 才进入人工介入/恢复判断。
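The `activity` / `commanderConcurrency` fields of `codex queues`, `codex tasks --view commander`, and the default supervisor view are the commander's primary readings for concurrency governance. Concurrency decisions always use `commanderConcurrency.activeRunnerCount` or the commander view's `activeRunners.count`, which equals `activity.effectiveActiveTaskCount`; the backfill window of the 15-concurrency policy is computed as `15 - activeRunnerCount`, and neither `activeQueueIds.length` nor the scheduler-local slot count may be substituted for it. `effectiveActiveTaskCount` is the effective number of active tasks used for scheduling decisions; `databaseRunningTaskCount` comes from the count of `running` status rows in PostgreSQL; `databaseActiveTaskCount` covers database-active tasks such as running/judging; `heartbeatFreshActiveTaskCount` is the number of effective runners with a fresh heartbeat; `schedulerLocalActiveQueueCount` and `schedulerLocalActiveRunSlotCount` only describe the active run slots visible locally to the current control plane. `activeQueueIds` and `activeQueueCount` are scheduler-local fields and may be 0 while `counts.running>0` and heartbeats are fresh; in that combination, decide by `activity.effectiveActiveTaskCount`, `activity.heartbeatFreshActiveTaskCount`, and `splitBrainLive`, and do not treat an empty `activeQueueIds` as evidence of zero concurrency or a stall. `commanderConcurrency.splitBrainDisposition=live-count-as-active` means the split-brain is still live and should count as an active runner; only `interventionRequired=true`, heartbeat risk, stale recovery candidates, or a recommended action other than `continue-supervision` leads to a manual intervention/recovery decision.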
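The backfill window arithmetic can be sketched as follows. The `QueueReadings` shape and the `remainingWindow` helper are hypothetical, and clamping the result at zero is an assumption added for illustration; the text only fixes the formula `15 - activeRunnerCount` and the rule that `activeQueueIds.length` must not be used instead.

```typescript
// Hypothetical subset of the queue readings described above.
interface QueueReadings {
  commanderConcurrency: { activeRunnerCount: number };
  activeQueueIds: string[]; // scheduler-local; may be empty while runners are live
}

// Remaining backfill window under the 15-concurrency policy.
// Deliberately ignores activeQueueIds, which is not a concurrency reading.
function remainingWindow(readings: QueueReadings, policyLimit = 15): number {
  const window = policyLimit - readings.commanderConcurrency.activeRunnerCount;
  return Math.max(0, window); // assumption: never report a negative window
}

// Usage: 12 running + 2 judging tasks, with an empty scheduler-local id list.
const splitBrainReadings: QueueReadings = {
  commanderConcurrency: { activeRunnerCount: 14 },
  activeQueueIds: [],
};
console.log(remainingWindow(splitBrainReadings));
```

In the usage example the empty `activeQueueIds` would suggest zero concurrency, while the governed reading still leaves room for only one more task.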
单次 `provider is not online`、SSH 超时、proxy 超时或 registry 请求失败只能证明“当前观察路径失败”,不能单独升级为 D601 全局离线、CI/CD 全局阻塞或业务任务不可推进。指挥官和 runner 必须用多信号裁决运行面状态,至少区分以下观察面:
@@ -308,7 +311,7 @@ host Codex 指挥官正规化后仍受同一条高风险边界约束。`docs/ref
当多信号裁决显示 provider 服务器、D601 执行面或关键维护桥疑似需要人工检查时,指挥官可以在更新 #24/#40 等记录之外,通过 ClaudeQQ 额外提醒用户检查 provider 服务器状态。提醒只在首次确认、状态恶化、恢复或需要用户介入时发送,不能在每轮轮询中重复轰炸。ClaudeQQ 提醒是 best-effort:若 ClaudeQQ 本身依赖同一条故障 provider/k3sctl 链路而不可达,指挥官应把通知失败的原因写入 #24 或对应 blocker issue,并继续按轮询和恢复规则推进。
在 UniDesk CLI 中,`bun scripts/cli.ts provider triage <providerId>` 是只读多信号裁决入口,适合作为 worker 和指挥官的统一健康判断前置。它必须至少保留这些合同:默认输出只展示裁决、scope、失败/降级/未知信号和有界 evidence 摘要,完整 evidence 必须显式加 `--full``--raw``provider is not online` 这类单路径失败只应落到 `decision=retryable-transient` / `blockingDisposition=runner-local-observation-gap`,不得直接输出 `global-offline`;只有 provider-gateway/SSH/k3s/scheduler 等多个独立关键路径同时失败且缺少健康交叉证据,才允许输出 `decision=global-offline`registry 或单个 service proxy 失败但 heartbeat、SSH 或节点视图仍健康时,应输出 `decision=service-degraded``recommendedCrossChecks` 必须包含 `debug health``debug dispatch <providerId> host.ssh --wait-ms 15000``ssh <providerId> argv true``artifact-registry health --provider-id <providerId>``microservice health k3sctl-adapter``microservice health code-queue``codex tasks --view supervisor --limit 20`
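In the UniDesk CLI, `bun scripts/cli.ts provider triage <providerId>` is the read-only multi-signal adjudication entry point and is suitable as the shared health-check precondition for workers and the commander. It must preserve at least these contracts: the default output shows only the decision, scope, failed/degraded/unknown signals, and a bounded evidence summary, and full evidence requires an explicit `--full` or `--raw`; a single-path failure such as `provider is not online` should only resolve to `decision=retryable-transient` / `blockingDisposition=runner-local-observation-gap` and must not directly output `global-offline`; `decision=global-offline` is allowed only when multiple independent critical paths such as provider-gateway/SSH/k3s/scheduler fail at the same time and no healthy cross-evidence exists; when the registry or a single service proxy fails while heartbeat, SSH, or the node view is still healthy, the output should be `decision=service-degraded`; `recommendedCrossChecks` must include `debug health`, `debug dispatch <providerId> host.ssh --wait-ms 15000`, `ssh <providerId> argv true`, `artifact-registry health --provider-id <providerId>`, `microservice health k3sctl-adapter`, `microservice health code-queue`, and `codex tasks --view commander --limit 20`; use `codex tasks --view supervisor --limit 20` only when the small sectioned pages are needed.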
D601 artifact registry 的 systemd unit inactive 不等于 D601 全局离线。如果 `artifact-registry health``provider triage D601` 同时看到 registry container running、loopback listener healthy、`/v2/` 返回 200,且 provider heartbeat、Host SSH、k3sctl-adapter、Code Queue scheduler 或业务 API 有健康信号,这只能判为 `service-degraded`,不得写成 provider offline、D601 offline 或 CI/CD 全局不可推进。只有这些健康面也同时失败,才进入 `global-offline` 判断。
@@ -0,0 +1,219 @@
import { codexTasksQueryForTest } from "./src/code-queue";

type JsonRecord = Record<string, unknown>;

function assertCondition(condition: unknown, message: string, detail: JsonRecord = {}): void {
  if (!condition) throw new Error(`${message}: ${JSON.stringify(detail)}`);
}

function asRecord(value: unknown): JsonRecord {
  assertCondition(typeof value === "object" && value !== null && !Array.isArray(value), "expected JSON object", { value });
  return value as JsonRecord;
}

function asArray(value: unknown): unknown[] {
  assertCondition(Array.isArray(value), "expected JSON array", { value });
  return value as unknown[];
}

function longText(marker: string, repeat: number): string {
  return Array.from({ length: repeat }, (_, index) => `${marker}-${index} status evidence command output final response prompt body should stay capped`).join("\n");
}

function task(id: string, status: string, updatedAt: string, prompt: string, readAt: string | null = null, finalText = ""): JsonRecord {
  return {
    id,
    queueId: "default",
    status,
    currentAttempt: status === "queued" || status === "retry_wait" ? 0 : 1,
    updatedAt,
    finishedAt: status === "succeeded" || status === "failed" || status === "canceled" ? updatedAt : null,
    readAt,
    prompt: `${prompt}\n${longText(`raw-prompt-${id}`, 80)}`,
    basePrompt: `${prompt}\n${longText(`base-prompt-${id}`, 60)}`,
    displayPrompt: `${prompt}\n${longText(`display-prompt-${id}`, 70)}`,
    lastAssistantMessage: finalText.length === 0 ? null : {
      at: updatedAt,
      seq: 42,
      source: "finalResponse",
      text: `${finalText}\n${longText(`assistant-${id}`, 100)}`,
    },
  };
}

function summaryForTask(taskId: string): JsonRecord {
  const finalText = taskId === "task-running-risk"
    ? "Blocked by provider auth token timeout and cannot proceed without commander authorization."
    : taskId === "task-failed-unread"
      ? "CI failed and final response reports missing e2e evidence."
      : taskId === "task-running-watch"
        ? "Implementation finished but task is still awaiting judge."
        : "Completed with compact evidence.";
  return {
    ok: true,
    status: 200,
    body: {
      ok: true,
      summary: {
        id: taskId,
        queueId: "default",
        status: taskId.includes("running") ? "running" : taskId.includes("failed") ? "failed" : "succeeded",
        currentAttempt: 1,
        maxAttempts: 99,
        prompt: longText(`summary-prompt-${taskId}`, 90),
        basePrompt: longText(`summary-base-${taskId}`, 70),
        lastAssistantMessage: {
          at: "2026-05-22T00:59:00.000Z",
          seq: 120,
          source: "finalResponse",
          text: `${finalText}\n${longText(`summary-final-${taskId}`, 120)}`,
        },
      },
    },
  };
}
function noisyCommanderFixture(path: string): JsonRecord {
if (path.includes("/summary")) {
const taskId = decodeURIComponent(path.split("/api/tasks/")[1]?.split("/")[0] ?? "unknown");
return summaryForTask(taskId);
}
assertCondition(path.startsWith("/api/microservices/code-queue/proxy/api/tasks/overview"), "unexpected path", { path });
return {
ok: true,
status: 200,
body: {
ok: true,
queue: {
counts: {
running: 12,
judging: 2,
queued: 18,
retry_wait: 4,
succeeded: 28,
failed: 3,
canceled: 1,
},
unreadTerminal: 8,
maxActiveQueues: 15,
executionDiagnostics: {
now: "2026-05-22T01:00:00.000Z",
state: "stale-active",
effectiveLiveness: "at-risk",
recommendedAction: "investigate-heartbeat-risk",
databaseActiveTaskCount: 14,
databaseActiveTaskIds: ["task-running-risk", "task-running-watch"],
activeHeartbeatCount: 13,
heartbeatFreshTaskIds: ["task-running-watch"],
heartbeatRiskTaskIds: ["task-running-risk"],
heartbeatExpiredTaskIds: ["task-running-risk"],
heartbeatMissingTaskIds: [],
staleRecoveryCandidateTaskIds: ["task-running-risk"],
traceGapTaskIds: ["task-running-risk", "task-running-watch"],
reasons: [longText("diagnostic-reason", 30), longText("diagnostic-reason-two", 30)],
},
},
pagination: {
limit: 200,
returned: 12,
total: 68,
hasMore: true,
nextBeforeId: "task-oldest-page",
includeActive: true,
},
tasks: [
task("task-running-risk", "running", "2026-05-22T00:00:00.000Z", "HWLAB#7 backend-core provider token blocker for M3 hardware workbench", null, "Blocked by provider auth token timeout."),
task("task-running-watch", "judging", "2026-05-22T00:52:00.000Z", "pikasTech/HWLAB#164 user-facing patch-panel verification", null, "Final response ready while judge is pending."),
task("task-failed-unread", "failed", "2026-05-22T00:50:00.000Z", "UniDesk#20 CI e2e evidence gate for commander view", null, "CI failed and needs read closeout."),
task("task-succeeded-unread", "succeeded", "2026-05-22T00:49:00.000Z", "pikasTech/HWLAB#317 deployment artifact digest publish evidence", null, "Artifact published."),
task("task-canceled-unread", "canceled", "2026-05-22T00:48:00.000Z", "UniDesk#118 diagnostics gate report stale commander loop", null, "Canceled after blocker."),
task("task-queued-priority", "queued", "2026-05-22T00:47:00.000Z", "HWLAB#99 business user-facing dashboard fix waiting for runner"),
task("task-retry-priority", "retry_wait", "2026-05-22T00:46:00.000Z", "HWLAB#116 infrastructure blocker retry_wait due to github transient"),
task("task-recent-read-docs", "succeeded", "2026-05-22T00:45:00.000Z", "docs governance reference update", "2026-05-22T00:45:01.000Z"),
task("task-recent-read-business", "succeeded", "2026-05-22T00:44:00.000Z", "business user-facing workbench UI fix", "2026-05-22T00:44:01.000Z"),
task("task-recent-read-evidence", "succeeded", "2026-05-22T00:43:00.000Z", "ci e2e evidence smoke report", "2026-05-22T00:43:01.000Z"),
task("task-recent-read-artifact", "succeeded", "2026-05-22T00:42:00.000Z", "deployment artifact registry digest", "2026-05-22T00:42:01.000Z"),
task("task-recent-read-diagnostic", "succeeded", "2026-05-22T00:41:00.000Z", "diagnostics gate report", "2026-05-22T00:41:01.000Z"),
],
},
};
}
export function runCodeQueueCommanderViewContract(): JsonRecord {
const commander = codexTasksQueryForTest(["--view", "commander", "--limit", "260"], noisyCommanderFixture);
const supervisor = codexTasksQueryForTest(["--view", "supervisor", "--limit", "260"], noisyCommanderFixture);
const full = codexTasksQueryForTest(["--view", "full", "--limit", "260"], noisyCommanderFixture);
const commanderBody = JSON.stringify(commander);
const fullBody = JSON.stringify(full);
const commanderView = asRecord(asRecord(commander).commander);
const supervisorView = asRecord(asRecord(supervisor).supervisor);
const filters = asRecord(commanderView.filters);
const activeRunners = asRecord(commanderView.activeRunners);
const backlog = asRecord(commanderView.queueBacklog);
const terminalUnread = asRecord(commanderView.terminalUnread);
const riskCounts = asRecord(commanderView.riskCounts);
const attentionCounts = asRecord(riskCounts.attention);
const highPriorityIssues = asRecord(commanderView.highPriorityIssues);
const classification = asRecord(commanderView.classification);
const byCategory = asRecord(classification.byCategory);
const commands = asRecord(commanderView.commands);
const attention = asRecord(commanderView.attention);
const attentionItems = asArray(attention.items).map(asRecord);
const sections = asRecord(commanderView.sections);
const terminalUnreadSection = asRecord(sections.terminalUnread);
const recentCompletedSection = asRecord(sections.recentCompleted);
const recentIds = asArray(recentCompletedSection.items).map((item) => String(asRecord(item).id ?? ""));
const terminalIds = asArray(terminalUnreadSection.items).map((item) => String(asRecord(item).id ?? ""));
const runningRisk = attentionItems.find((item) => item.id === "task-running-risk") ?? {};
const failedUnread = attentionItems.find((item) => item.id === "task-failed-unread") ?? {};
assertCondition(commanderBody.length < 30_000, "commander output should stay under the noisy fixture budget", { chars: commanderBody.length });
assertCondition(commanderBody.length < fullBody.length * 0.65, "commander output should stay materially smaller than full output", { commanderChars: commanderBody.length, fullChars: fullBody.length });
assertCondition(filters.requestedLimit === 260 && filters.effectiveLimit === 100 && filters.limitCapped === true, "commander view should disclose requested/effective limit cap", filters);
assertCondition(activeRunners.count === 14 && activeRunners.exact === true && activeRunners.source === "database-active", "commander view should expose exact active runner count and source/disposition", activeRunners);
assertCondition(backlog.queued === 18 && backlog.retryWait === 4 && backlog.total === 22 && backlog.exact === true, "commander view should expose queued/retry_wait exact counts", backlog);
assertCondition(terminalUnread.total === 8 && terminalUnread.rowsReturned === 3 && terminalUnread.rowsOmitted === 5 && terminalUnread.exact === true, "commander view should expose terminal unread count plus omitted rows", terminalUnread);
assertCondition(attentionCounts.total === 7 && attentionCounts.returned === 7 && attentionCounts.omitted === 0, "commander attention counts should preserve total/returned/omitted", attentionCounts);
assertCondition(highPriorityIssues.present === true && highPriorityIssues.matchedCount === 7, "commander should surface tracked high-priority issues", highPriorityIssues);
assertCondition(Number(byCategory["business-user-facing"] ?? 0) >= 1
&& Number(byCategory["deployment-artifact"] ?? 0) >= 1
&& Number(byCategory["ci-e2e-evidence"] ?? 0) >= 1
&& Number(byCategory["diagnostics-gate-report"] ?? 0) >= 1
&& Number(byCategory["docs-governance"] ?? 0) >= 1
&& Number(byCategory["infrastructure-blocker"] ?? 0) >= 1, "deterministic classifier should cover requested categories", byCategory);
assertCondition(classification.deterministic === true, "classification metadata should be deterministic", classification);
assertCondition(String(commands.refresh ?? "").includes("--view commander"), "commander refresh command should preserve explicit commander view", commands);
assertCondition(String(commands.supervisor ?? "").startsWith("bun scripts/cli.ts codex tasks") && !String(commands.supervisor ?? "").includes("--view commander"), "commander should keep supervisor drilldown command", commands);
assertCondition(String(commands.full ?? "").includes("--view full"), "commander should keep full drilldown command", commands);
assertCondition(String(commands.rawOverview ?? "").includes("microservice proxy code-queue") && String(commands.rawOverview ?? "").includes("--raw"), "commander should expose raw overview drilldown", commands);
assertCondition(String(commands.traceTemplate ?? "").includes("codex task <taskId> --trace"), "commander should expose trace drilldown template", commands);
assertCondition(String(commands.outputTemplate ?? "").includes("codex output <taskId>"), "commander should expose output drilldown template", commands);
assertCondition(asRecord(runningRisk.commands).show === "bun scripts/cli.ts codex task task-running-risk", "attention row should include task drilldown command", runningRisk);
assertCondition(asArray(runningRisk.riskSignals).includes("stale-recovery-candidate") && asArray(runningRisk.riskSignals).includes("blocked"), "active risk row should expose stale/blocker signals", runningRisk);
assertCondition(asRecord(failedUnread.commands).read === "bun scripts/cli.ts codex read task-failed-unread", "failed unread row should include read command", failedUnread);
assertCondition(!commanderBody.includes("raw-prompt-task-running-risk-20"), "commander output should not dump long raw prompt bodies", { chars: commanderBody.length });
assertCondition(!commanderBody.includes("summary-final-task-running-risk-20"), "commander output should not dump long final response bodies", { chars: commanderBody.length });
assertCondition(!recentIds.some((id) => terminalIds.includes(id)), "recentCompleted section must not duplicate terminalUnread rows", { recentIds, terminalIds });
assertCondition(recentIds.length === 3, "recentCompleted commander section should be independently capped", { recentIds });
assertCondition(asRecord(supervisorView.completedUnread).count === 3 && asRecord(supervisorView.recentCompleted).count === 5, "supervisor view should remain available and keep separate unread/recent sections", supervisorView);
return {
ok: true,
checks: [
"commander view is explicit and bounded",
"exact active/queued/retry_wait/terminal-unread counts are preserved",
"attention rows expose stale, heartbeat, terminal-unread and blocker signals",
"high-priority issue refs are surfaced",
"deterministic classifier emits requested categories",
"drilldown commands are present without prompt/final-response flood",
"recent completed does not duplicate terminal unread",
"supervisor/full views remain available",
],
commanderChars: commanderBody.length,
fullChars: fullBody.length,
};
}
if (import.meta.main) {
process.stdout.write(`${JSON.stringify(runCodeQueueCommanderViewContract(), null, 2)}\n`);
}
+4
@@ -44,6 +44,7 @@ const syntaxFiles = [
"scripts/code-queue-gh-auth-redaction-contract-test.ts",
"scripts/microservice-health-output-contract-test.ts",
"scripts/code-queue-supervisor-disclosure-contract-test.ts",
"scripts/code-queue-commander-view-contract-test.ts",
"src/components/frontend/src/index.ts",
"src/components/frontend/src/app.tsx",
"src/components/frontend/src/decision-center.tsx",
@@ -329,6 +330,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
fileItem("scripts/code-queue-submit-routing-contract-test.ts"),
fileItem("scripts/code-queue-gh-auth-redaction-contract-test.ts"),
fileItem("scripts/code-queue-supervisor-disclosure-contract-test.ts"),
fileItem("scripts/code-queue-commander-view-contract-test.ts"),
fileItem("scripts/host-codex-commander-skeleton-contract-test.ts"),
fileItem("scripts/host-codex-commander-no-daemon-smoke-contract-test.ts"),
fileItem("scripts/provider-runner-triage-contract-test.ts"),
@@ -378,6 +380,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
items.push(commandItem("code-queue:submit-routing-contract", ["bun", "scripts/code-queue-submit-routing-contract-test.ts"], 30_000));
items.push(commandItem("code-queue:gh-auth-redaction-contract", ["bun", "scripts/code-queue-gh-auth-redaction-contract-test.ts"], 30_000));
items.push(commandItem("code-queue:supervisor-disclosure-contract", ["bun", "scripts/code-queue-supervisor-disclosure-contract-test.ts"], 30_000));
items.push(commandItem("code-queue:commander-view-contract", ["bun", "scripts/code-queue-commander-view-contract-test.ts"], 30_000));
items.push(commandItem("host-codex-commander:skeleton-contract", ["bun", "scripts/host-codex-commander-skeleton-contract-test.ts"], 30_000));
items.push(commandItem("host-codex-commander:no-daemon-smoke-contract", ["bun", "scripts/host-codex-commander-no-daemon-smoke-contract-test.ts"], 30_000));
items.push(commandItem("provider:runner-triage-contract", ["bun", "scripts/provider-runner-triage-contract-test.ts"], 30_000));
@@ -416,6 +419,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
items.push(skippedItem("code-queue:submit-routing-contract", "Code Queue submit routing contract is opt-in with script checks", "--scripts-typecheck or --full"));
items.push(skippedItem("code-queue:gh-auth-redaction-contract", "Code Queue GitHub auth output redaction contract is opt-in with script checks", "--scripts-typecheck or --full"));
items.push(skippedItem("code-queue:supervisor-disclosure-contract", "Code Queue supervisor disclosure contract is opt-in with script checks", "--scripts-typecheck or --full"));
items.push(skippedItem("code-queue:commander-view-contract", "Code Queue commander view contract is opt-in with script checks", "--scripts-typecheck or --full"));
items.push(skippedItem("host-codex-commander:skeleton-contract", "host Codex commander skeleton contract is opt-in with script checks", "--scripts-typecheck or --full"));
items.push(skippedItem("host-codex-commander:no-daemon-smoke-contract", "host Codex commander no-daemon smoke contract is opt-in with script checks", "--scripts-typecheck or --full"));
items.push(skippedItem("provider:runner-triage-contract", "Provider runner triage contract is opt-in with script checks", "--scripts-typecheck or --full"));
+602 -31
@@ -31,6 +31,12 @@ const supervisorRecentCompletedLimit = 3;
const supervisorPromptPreviewChars = 70;
const supervisorBodyPreviewChars = 70;
const supervisorRecentBodyPreviewChars = 50;
const commanderAttentionLimit = 10;
const commanderSectionReturnedLimit = 5;
const commanderRecentCompletedLimit = 3;
const commanderPromptPreviewChars = 96;
const commanderBodyPreviewChars = 120;
const commanderIssueTaskPreviewLimit = 4;
const unreadTriageCountLimit = 12;
const diagnosticsIdPreviewLimit = 3;
const diagnosticsReasonPreviewLimit = 2;
@@ -303,7 +309,7 @@ interface CodexTasksOptions {
beforeId: string | undefined;
unreadOnly: boolean;
statusFilter: string[] | null;
-view: "supervisor" | "full";
+view: "commander" | "supervisor" | "full";
}
type CodexUnreadAction = "summary" | "mark-read";
@@ -388,6 +394,58 @@ interface CodexTasksSupervisorEntry {
read?: string;
}
type CommanderTaskCategory =
| "business-user-facing"
| "deployment-artifact"
| "ci-e2e-evidence"
| "diagnostics-gate-report"
| "docs-governance"
| "infrastructure-blocker"
| "unknown";
type CommanderAttentionSeverity = "critical" | "high" | "medium";
interface CommanderTaskClassification {
category: CommanderTaskCategory;
labels: string[];
noiseClass: "delivery" | "evidence" | "governance" | "blocker" | "unknown";
reason: string;
}
interface CommanderAttentionItem {
id: string;
queue: string | null;
status: string | null;
statusLabel?: string;
severity: CommanderAttentionSeverity;
action: "read-closeout" | "inspect-active" | "watch-active" | "inspect-blocker";
reasons: string[];
riskSignals: string[];
issues: string[];
highPriorityIssues: string[];
classification: CommanderTaskClassification;
attempt: number | null;
updatedAt: string | null;
finishedAt?: string | null;
unreadTerminal?: boolean;
finalResponseAt?: unknown;
prompt: string;
promptChars: number;
promptTruncated?: boolean;
last?: string;
lastAt?: unknown;
lastChars?: number;
lastTruncated?: boolean;
commands: {
show: string;
detail: string;
trace: string;
output: string;
read?: string;
full: string;
};
}
interface CodexTasksSection<T = CodexTasksEntry> {
count: number;
returned: number;
@@ -2343,7 +2401,7 @@ function parseTasksOptions(args: string[]): CodexTasksOptions {
valueOptions: ["--queue", "--queue-id", "--limit", "--status", "--view", "--before-id", "--beforeId"],
}, "codex tasks");
const viewValue = optionValue(args, ["--view"]) ?? "supervisor";
-if (viewValue !== "supervisor" && viewValue !== "full") throw new Error(`--view must be supervisor or full; got ${viewValue}`);
+if (viewValue !== "commander" && viewValue !== "supervisor" && viewValue !== "full") throw new Error(`--view must be commander, supervisor, or full; got ${viewValue}`);
const statusRaw = optionValue(args, ["--status"]);
const statusFilter = statusRaw === undefined
? null
@@ -2672,7 +2730,89 @@ function taskIssueRefs(task: Record<string, unknown>, summary: Record<string, un
asString(summary?.lastError),
asString(task.lastError),
].join("\n");
-return Array.from(new Set(Array.from(text.matchAll(/#(\d{1,6})\b/gu)).map((match) => `#${match[1]}`))).slice(0, 8);
+return issueRefsFromText(text).slice(0, 8);
}
function taskSearchText(task: Record<string, unknown>, summary: Record<string, unknown> | null): string {
return [
asString(task.id),
asString(task.queueId),
asString(task.status),
asString(task.displayPrompt),
asString(task.basePrompt),
asString(task.prompt),
asString(task.cwd),
asString(task.model),
asString(task.providerId),
asString(task.lastError),
asString(summary?.lastError),
asString(asRecord(summary?.lastAssistantMessage)?.text),
...stringList(task.referenceTaskIds),
].join("\n");
}
function issueRefsFromText(text: string): string[] {
const issues = new Set<string>();
const knownPrefix = (value: string): "HWLAB" | "UniDesk" | null => {
const lower = value.toLowerCase();
if (lower === "hwlab") return "HWLAB";
if (lower === "unidesk") return "UniDesk";
return null;
};
for (const match of text.matchAll(/\b(HWLAB|UniDesk)#(\d{1,6})\b/giu)) {
const prefix = knownPrefix(String(match[1] ?? ""));
if (prefix !== null) issues.add(`${prefix}#${match[2]}`);
}
for (const match of text.matchAll(/\b[A-Za-z0-9_.-]+\/(HWLAB|unidesk)#(\d{1,6})\b/giu)) {
const prefix = knownPrefix(String(match[1] ?? ""));
if (prefix !== null) issues.add(`${prefix}#${match[2]}`);
}
for (const match of text.matchAll(/\bgithub\.com[:/][A-Za-z0-9_.-]+\/(HWLAB|unidesk)(?:\.git)?\/issues\/(\d{1,6})\b/giu)) {
const prefix = knownPrefix(String(match[1] ?? ""));
if (prefix !== null) issues.add(`${prefix}#${match[2]}`);
}
for (const match of text.matchAll(/(?:^|[^A-Za-z0-9_/-])#(\d{1,6})\b/gu)) issues.add(`#${match[1]}`);
for (const match of text.matchAll(/\/issues\/(\d{1,6})\b/gu)) issues.add(`#${match[1]}`);
const issueNumber = (value: string): number => Number(value.match(/#(\d{1,6})\b/u)?.[1] ?? Number.MAX_SAFE_INTEGER);
return Array.from(issues).sort((left, right) => issueNumber(left) - issueNumber(right) || left.localeCompare(right));
}
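A minimal, self-contained sketch of the issue-reference normalization above. It assumes only the `HWLAB` and `UniDesk` prefixes plus bare `#N` references; the `owner/repo#N` and GitHub URL forms handled above are left out for brevity. `extractIssueRefs` is a hypothetical name used for illustration, not a function from this commit.

```typescript
// Hypothetical simplified variant of issueRefsFromText; not the committed implementation.
function extractIssueRefs(text: string): string[] {
  const canonical: Record<string, string> = { hwlab: "HWLAB", unidesk: "UniDesk" };
  const refs = new Set<string>();
  // Prefixed refs such as "hwlab#7" are normalized to the canonical repo spelling.
  for (const match of text.matchAll(/\b(hwlab|unidesk)#(\d{1,6})\b/giu)) {
    const prefix = canonical[(match[1] ?? "").toLowerCase()];
    if (prefix !== undefined) refs.add(`${prefix}#${match[2]}`);
  }
  // Bare refs such as "#20" count only when "#" is not glued to a preceding word or path.
  for (const match of text.matchAll(/(?:^|[^A-Za-z0-9_\/-])#(\d{1,6})\b/gu)) {
    refs.add(`#${match[1]}`);
  }
  // Sort by issue number first, then lexicographically for a stable order.
  const issueNumber = (ref: string): number => Number(ref.slice(ref.indexOf("#") + 1));
  return Array.from(refs).sort((a, b) => issueNumber(a) - issueNumber(b) || a.localeCompare(b));
}

console.log(extractIssueRefs("fix hwlab#7 and UNIDESK#118, see #20"));
// prints [ "HWLAB#7", "#20", "UniDesk#118" ]
```

Using a `Set` keeps a reference that appears in several spellings from being counted twice, and the numeric sort makes the output independent of where each reference appears in the text.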
function highPriorityIssueRefs(issues: string[]): string[] {
const priority = new Set(["HWLAB#7", "HWLAB#99", "HWLAB#116", "HWLAB#164", "HWLAB#317", "UniDesk#20", "UniDesk#118", "#7", "#20", "#99", "#116", "#118", "#164", "#317"]);
return issues.filter((issue) => priority.has(issue));
}
function commanderTaskClassification(task: Record<string, unknown>, summary: Record<string, unknown> | null): CommanderTaskClassification {
const text = taskSearchText(task, summary).toLowerCase();
const matches = (pattern: RegExp): boolean => pattern.test(text);
const labels: string[] = [];
if (matches(/\b(?:provider|gateway|k3s|k3sctl|backend-core|scheduler|runner|runtime|heartbeat|stale|tunnel|proxy|auth|token|secret|postgres|database|db|gh auth|github transient|429|rate limit|blocked|blocker|offline|unreachable|timeout)\b/iu)) labels.push("infrastructure-blocker");
if (matches(/\b(?:deploy|deployment|rollout|artifact|image|digest|registry|publish|release|ci\/cd|cd)\b|digest/iu)) labels.push("deployment-artifact");
if (matches(/\b(?:ci|e2e|playwright|smoke|test|tests|typecheck|syntax|lint)\b/iu)) labels.push("ci-e2e-evidence");
if (matches(/\b(?:diagnostic|diagnostics|gate|report|audit|triage|preflight|observability|summary|brief|board|supervisor|commander|visibility|verification|validate|validation|evidence|proof|check)\b/iu)) labels.push("diagnostics-gate-report");
if (matches(/\b(?:doc|docs|documentation|reference|governance|policy|runbook|ag\.?ents|readme|markdown)\b/iu)) labels.push("docs-governance");
if (matches(/\b(?:fix|bug|repair|implement|feature|ui|frontend|backend|api|workbench|patch-panel|box-?simu|gateway-?simu|m3|hardware|hwlab|user-facing|business|线|仿)\b/iu)) labels.push("business-user-facing");
const ordered: CommanderTaskCategory[] = [
"infrastructure-blocker",
"deployment-artifact",
"business-user-facing",
"docs-governance",
"ci-e2e-evidence",
"diagnostics-gate-report",
];
const category = ordered.find((item) => labels.includes(item)) ?? "unknown";
const noiseClass: CommanderTaskClassification["noiseClass"] =
category === "infrastructure-blocker" ? "blocker"
: category === "business-user-facing" || category === "deployment-artifact" ? "delivery"
: category === "ci-e2e-evidence" || category === "diagnostics-gate-report" ? "evidence"
: category === "docs-governance" ? "governance"
: "unknown";
return {
category,
labels: labels.length > 0 ? labels : ["uncategorized"],
noiseClass,
reason: labels.length > 0 ? `matched ${labels.slice(0, 3).join(", ")}` : "no strong classifier term matched",
};
}
function taskClassification(task: Record<string, unknown>, summary: Record<string, unknown> | null): {
@@ -2681,38 +2821,21 @@ function taskClassification(task: Record<string, unknown>, summary: Record<strin
managementNoise: boolean;
reason: string;
} {
-const text = [
-asString(task.displayPrompt),
-asString(task.basePrompt),
-asString(task.prompt),
-asString(task.lastError),
-asString(summary?.lastError),
-asString(asRecord(summary?.lastAssistantMessage)?.text),
-].join("\n").toLowerCase();
-const matches = (pattern: RegExp): boolean => pattern.test(text);
-const labels: string[] = [];
-if (matches(/\b(?:gate|report|aggregator|runbook|contract|audit|review|brief|evidence|diagnostic|observability|visibility|preflight|smoke)\b/iu)) labels.push("management-or-verification");
-if (matches(/\b(?:deploy|deployment|prod|dev|release|artifact|ci|cd)\b|线/iu)) labels.push("deployment");
-if (matches(/\b(?:fix|bug|repair|implement|feature|ui|frontend|backend|api|database|db|workbench|patch-panel|box-simu|gateway-simu)\b|线|仿/iu)) labels.push("direct-work");
-if (matches(/\b(?:doc|docs|reference|markdown)\b/iu)) labels.push("documentation");
-if (labels.length === 0) labels.push("uncategorized");
-if (labels.includes("management-or-verification") && !labels.includes("direct-work") && !labels.includes("deployment")) {
-return { kind: "management-noise", labels, managementNoise: true, reason: "matched report/gate/review/diagnostic terms without direct implementation or deployment terms" };
+const commander = commanderTaskClassification(task, summary);
+const labels = commander.labels;
+if (commander.category === "deployment-artifact") {
+return { kind: "deployment-fix", labels, managementNoise: commander.noiseClass === "evidence", reason: commander.reason };
 }
-if (labels.includes("deployment")) {
-return { kind: "deployment-fix", labels, managementNoise: labels.includes("management-or-verification") && !labels.includes("direct-work"), reason: "matched deployment or artifact terms" };
+if (commander.category === "business-user-facing" || commander.category === "infrastructure-blocker") {
+return { kind: "direct-progress", labels, managementNoise: false, reason: commander.reason };
 }
-if (labels.includes("direct-work")) {
-return { kind: "direct-progress", labels, managementNoise: false, reason: "matched implementation, user-visible, runtime, or repair terms" };
+if (commander.category === "ci-e2e-evidence" || commander.category === "diagnostics-gate-report") {
+return { kind: "verification", labels, managementNoise: true, reason: commander.reason };
 }
-if (labels.includes("management-or-verification")) {
-return { kind: "verification", labels, managementNoise: true, reason: "matched verification/report terms; keep folded unless it blocks real work" };
+if (commander.category === "docs-governance") {
+return { kind: "documentation", labels, managementNoise: false, reason: commander.reason };
 }
-if (labels.includes("documentation")) {
-return { kind: "documentation", labels, managementNoise: false, reason: "matched documentation terms" };
-}
-return { kind: "unknown", labels, managementNoise: false, reason: "no strong classifier term matched" };
+return { kind: "unknown", labels, managementNoise: false, reason: commander.reason };
}
function supervisorLastMessage(summaryLastAssistant: unknown, maxChars: number): SupervisorMessageSummary | null {
@@ -2970,6 +3093,10 @@ function taskListCommand(options: CodexTasksOptions, extra: string[] = []): stri
return `bun scripts/cli.ts ${args.join(" ")}`;
}
function taskListCommandWithView(options: CodexTasksOptions, view: CodexTasksOptions["view"], extra: string[] = []): string {
return taskListCommand({ ...options, view }, extra);
}
function codexTasksLimitDisclosure(options: CodexTasksOptions): Record<string, unknown> {
return {
requestedLimit: options.requestedLimit,
@@ -3518,6 +3645,439 @@ function codexUnreadTriage(taskArgs: string[], fetcher: CodexResponseFetcher = c
return codexUnreadMutationResult(result, options, results);
}
function taskDrilldownCommands(taskId: string, includeRead: boolean): CommanderAttentionItem["commands"] {
return {
show: `bun scripts/cli.ts codex task ${taskId}`,
detail: `bun scripts/cli.ts codex task ${taskId} --detail`,
trace: `bun scripts/cli.ts codex task ${taskId} --trace --tail --limit ${defaultTraceLimit}`,
output: `bun scripts/cli.ts codex output ${taskId} --tail --limit ${defaultOutputLimit}`,
...(includeRead ? { read: `bun scripts/cli.ts codex read ${taskId}` } : {}),
full: `bun scripts/cli.ts codex task ${taskId} --full`,
};
}
function diagnosticTaskSet(record: Record<string, unknown>, keys: string[]): Set<string> {
return new Set(keys.flatMap((key) => stringList(record[key])));
}
function taskAgeMsAt(task: Record<string, unknown>, nowIsoValue: unknown): number | null {
const nowMs = Date.parse(asString(nowIsoValue));
const updatedMs = Date.parse(asString(task.updatedAt));
if (!Number.isFinite(nowMs) || !Number.isFinite(updatedMs)) return null;
return Math.max(0, nowMs - updatedMs);
}
function blockerLikeFinalResponseSignals(task: Record<string, unknown>, summary: Record<string, unknown> | null): string[] {
const lastAssistant = asRecord(summary?.lastAssistantMessage ?? task.lastAssistantMessage);
const text = [
asString(lastAssistant?.text),
asString(task.lastError),
asString(summary?.lastError),
asString(asRecord(task.lastJudge)?.reason),
asString(asRecord(summary?.lastJudge)?.reason),
].join("\n").toLowerCase();
const signals: string[] = [];
const add = (id: string, pattern: RegExp): void => {
if (pattern.test(text) && !signals.includes(id)) signals.push(id);
};
add("blocked", /\b(blocked|blocker|cannot proceed|can't proceed|stuck|waiting for commander|needs authorization|need authorization|requires approval|permission denied)\b/iu);
add("infra-auth-network", /\b(auth|token|credential|github transient|dns|rate limit|429|timeout|timed out|unreachable|offline|proxy|tunnel|provider unavailable)\b/iu);
add("merge-or-test-failure", /\b(conflict|merge failed|tests? failed|typecheck failed|syntax failed|build failed|ci failed|e2e failed)\b/iu);
add("not-deployed", /\b(not deployed|not rebuilt|not rolled out|deploy skipped|rollout skipped|needs rollout|requires deploy)\b|线/iu);
return signals;
}
function commanderAttentionReasons(
task: Record<string, unknown>,
summary: Record<string, unknown> | null,
diagnostics: Record<string, unknown>,
): {
reasons: string[];
riskSignals: string[];
severity: CommanderAttentionSeverity;
action: CommanderAttentionItem["action"];
} | null {
const taskId = taskOverviewCandidateKey(task);
const status = asString(task.status);
const summaryLastAssistant = summary?.lastAssistantMessage ?? task.lastAssistantMessage;
const awaitingStatus = finalResponseAwaitingTerminalStatus(status || null, summaryLastAssistant);
const unreadTerminal = taskUnreadTerminal(task);
const issues = taskIssueRefs(task, summary);
const priorityIssues = highPriorityIssueRefs(issues);
const heartbeatRiskTaskIds = diagnosticTaskSet(diagnostics, ["heartbeatRiskTaskIds", "heartbeatExpiredTaskIds", "heartbeatMissingTaskIds", "staleRecoveryCandidateTaskIds"]);
const traceGapTaskIds = diagnosticTaskSet(diagnostics, ["traceGapTaskIds"]);
const staleRecoveryTaskIds = diagnosticTaskSet(diagnostics, ["staleRecoveryCandidateTaskIds"]);
const blockerSignals = blockerLikeFinalResponseSignals(task, summary);
const staleAgeMs = taskAgeMsAt(task, diagnostics.now);
const staleActiveByAge = isActiveTaskStatus(status) && staleAgeMs !== null && staleAgeMs >= 45 * 60 * 1000;
const reasons: string[] = [];
const riskSignals: string[] = [];
if (unreadTerminal) {
reasons.push(status === "failed" ? "failed terminal unread; read and close out" : "terminal unread; read and close out");
riskSignals.push(status === "failed" ? "failed-terminal-unread" : "terminal-unread");
}
if (isActiveTaskStatus(status) && heartbeatRiskTaskIds.has(taskId)) {
reasons.push("heartbeat/stale-recovery risk is present for this active task");
riskSignals.push(staleRecoveryTaskIds.has(taskId) ? "stale-recovery-candidate" : "heartbeat-risk");
}
if (isActiveTaskStatus(status) && traceGapTaskIds.has(taskId)) {
reasons.push("trace progress gap is reported for this active task");
riskSignals.push("trace-gap");
}
if (staleActiveByAge) {
reasons.push("active task updatedAt is older than 45 minutes relative to execution diagnostics");
riskSignals.push("updatedAt-stale");
}
if (awaitingStatus !== null) {
reasons.push(awaitingStatus.state === "awaiting-judge" ? "final response is visible while judge is still pending" : "final response is visible while task is still running");
riskSignals.push(awaitingStatus.state);
}
if (isActiveTaskStatus(status) && blockerSignals.length > 0) {
reasons.push("latest assistant/final response has blocker-like language");
riskSignals.push(...blockerSignals);
}
if ((status === "queued" || status === "retry_wait") && priorityIssues.length > 0) {
reasons.push("queued/retry_wait task references a tracked high-priority issue");
riskSignals.push("high-priority-queued");
}
if (isActiveTaskStatus(status) && priorityIssues.length > 0) {
reasons.push("active task references a tracked high-priority issue");
riskSignals.push("high-priority-active");
}
if (reasons.length === 0) return null;
const uniqueRiskSignals = Array.from(new Set(riskSignals));
const severity: CommanderAttentionSeverity = uniqueRiskSignals.some((signal) => signal === "failed-terminal-unread" || signal === "heartbeat-risk" || signal === "stale-recovery-candidate")
? "critical"
: uniqueRiskSignals.some((signal) => signal === "terminal-unread" || signal === "blocked" || signal === "infra-auth-network" || signal === "merge-or-test-failure" || signal === "not-deployed" || signal === "updatedAt-stale")
? "high"
: "medium";
const action: CommanderAttentionItem["action"] = unreadTerminal
? "read-closeout"
: isActiveTaskStatus(status)
? severity === "medium" ? "watch-active" : "inspect-active"
: "inspect-blocker";
return { reasons: Array.from(new Set(reasons)), riskSignals: uniqueRiskSignals, severity, action };
}
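The severity decision at the end of `commanderAttentionReasons` can be read as a two-tier lookup over the collected risk signals. The sketch below restates that mapping in isolation. The signal names are copied from the hunk above, while `severityForSignals` and the two `Set` constants are hypothetical names introduced only for illustration.

```typescript
type SketchSeverity = "critical" | "high" | "medium";

// Signal names come from the diff; grouping them into sets is an illustrative restructuring.
const criticalSignals = new Set(["failed-terminal-unread", "heartbeat-risk", "stale-recovery-candidate"]);
const highSignals = new Set([
  "terminal-unread",
  "blocked",
  "infra-auth-network",
  "merge-or-test-failure",
  "not-deployed",
  "updatedAt-stale",
]);

// The first matching tier wins; anything else (for example "trace-gap") stays "medium".
function severityForSignals(signals: string[]): SketchSeverity {
  if (signals.some((signal) => criticalSignals.has(signal))) return "critical";
  if (signals.some((signal) => highSignals.has(signal))) return "high";
  return "medium";
}

console.log(severityForSignals(["blocked", "stale-recovery-candidate"])); // prints "critical"
console.log(severityForSignals(["trace-gap", "updatedAt-stale"])); // prints "high"
console.log(severityForSignals(["high-priority-queued"])); // prints "medium"
```

Because the critical tier is checked first, a task that is both blocked and a stale-recovery candidate is reported as critical rather than high.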
function commanderAttentionItem(
task: Record<string, unknown>,
summary: Record<string, unknown> | null,
diagnostics: Record<string, unknown>,
): CommanderAttentionItem | null {
const attention = commanderAttentionReasons(task, summary, diagnostics);
if (attention === null) return null;
const taskId = taskOverviewCandidateKey(task);
const status = asString(task.status) || null;
const summaryLastAssistant = summary?.lastAssistantMessage ?? task.lastAssistantMessage;
const awaitingStatus = finalResponseAwaitingTerminalStatus(status, summaryLastAssistant);
const prompt = supervisorTextSummary(asString(task.displayPrompt ?? task.basePrompt ?? task.prompt), commanderPromptPreviewChars);
const lastMessage = supervisorLastMessage(summaryLastAssistant, commanderBodyPreviewChars);
const issues = taskIssueRefs(task, summary);
const unreadTerminal = taskUnreadTerminal(task);
return {
id: taskId,
queue: asString(task.queueId) || null,
status,
...(awaitingStatus === null ? {} : {
statusLabel: awaitingStatus.label,
finalResponseAt: awaitingStatus.finalResponseAt,
}),
severity: attention.severity,
action: attention.action,
reasons: attention.reasons,
riskSignals: attention.riskSignals,
issues,
highPriorityIssues: highPriorityIssueRefs(issues),
classification: commanderTaskClassification(task, summary),
attempt: typeof task.currentAttempt === "number" && Number.isFinite(task.currentAttempt) ? task.currentAttempt : null,
updatedAt: asString(task.updatedAt) || null,
...(isTerminalTaskStatus(status) ? { finishedAt: asString(task.finishedAt) || null, unreadTerminal } : {}),
prompt: prompt.text,
promptChars: prompt.chars,
...(prompt.truncated ? { promptTruncated: true } : {}),
...(lastMessage === null ? {} : {
last: lastMessage.text,
lastAt: lastMessage.at,
lastChars: lastMessage.chars,
...(lastMessage.truncated ? { lastTruncated: true } : {}),
}),
commands: taskDrilldownCommands(taskId, unreadTerminal),
};
}
function commanderAttentionRank(item: CommanderAttentionItem): number {
const severityRank: Record<CommanderAttentionSeverity, number> = { critical: 0, high: 1, medium: 2 };
const actionRank: Record<CommanderAttentionItem["action"], number> = {
"read-closeout": 0,
"inspect-active": 1,
"inspect-blocker": 2,
"watch-active": 3,
};
return severityRank[item.severity] * 10 + actionRank[item.action];
}
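A small sketch of how a rank like the one above orders attention rows. The rank tables mirror `commanderAttentionRank`; the `RankedItem` shape, the sample rows, and the tie-break on `id` are assumptions made for this example, since the sort call that consumes the rank is not part of this excerpt.

```typescript
type RankSeverity = "critical" | "high" | "medium";
type RankAction = "read-closeout" | "inspect-active" | "inspect-blocker" | "watch-active";

// Hypothetical reduced row shape; the real CommanderAttentionItem carries many more fields.
interface RankedItem {
  id: string;
  severity: RankSeverity;
  action: RankAction;
}

const severityRank: Record<RankSeverity, number> = { critical: 0, high: 1, medium: 2 };
const actionRank: Record<RankAction, number> = {
  "read-closeout": 0,
  "inspect-active": 1,
  "inspect-blocker": 2,
  "watch-active": 3,
};

// Severity dominates (steps of 10); action only breaks ties within one severity.
function rank(item: RankedItem): number {
  return severityRank[item.severity] * 10 + actionRank[item.action];
}

const sample: RankedItem[] = [
  { id: "task-c", severity: "medium", action: "watch-active" }, // rank 23
  { id: "task-a", severity: "high", action: "read-closeout" }, // rank 10
  { id: "task-b", severity: "critical", action: "inspect-active" }, // rank 1
];

// Assumed tie-break: equal ranks fall back to id order so the output is deterministic.
const ordered = [...sample].sort((left, right) => rank(left) - rank(right) || left.id.localeCompare(right.id));
console.log(ordered.map((item) => item.id)); // prints [ "task-b", "task-a", "task-c" ]
```

Multiplying the severity rank by 10 works because there are only four actions, so an action rank can never outweigh a severity step.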
function commanderIdSection(tasks: Record<string, unknown>[], summaries: Map<string, Record<string, unknown>>, limit: number, nextCommand: string | null, fullCommand: string): Record<string, unknown> {
const visible = tasks.slice(0, limit);
return {
count: tasks.length,
returned: visible.length,
omitted: Math.max(0, tasks.length - visible.length),
truncated: tasks.length > visible.length,
hasMore: tasks.length > visible.length || nextCommand !== null,
commands: {
next: tasks.length > visible.length || nextCommand !== null ? nextCommand : null,
full: fullCommand,
showTemplate: "bun scripts/cli.ts codex task <taskId>",
traceTemplate: `bun scripts/cli.ts codex task <taskId> --trace --tail --limit ${defaultTraceLimit}`,
outputTemplate: `bun scripts/cli.ts codex output <taskId> --tail --limit ${defaultOutputLimit}`,
readTemplate: "bun scripts/cli.ts codex read <taskId>",
},
items: visible.map((task) => {
const taskId = taskOverviewCandidateKey(task);
const summary = summaries.get(taskId) ?? null;
const issues = taskIssueRefs(task, summary);
return {
id: taskId,
queue: asString(task.queueId) || null,
status: asString(task.status) || null,
issues,
highPriorityIssues: highPriorityIssueRefs(issues),
category: commanderTaskClassification(task, summary).category,
updatedAt: asString(task.updatedAt) || null,
finishedAt: asString(task.finishedAt) || null,
};
}),
};
}
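The `count` / `returned` / `omitted` / `truncated` fields built by `commanderIdSection` follow a simple bounded-section pattern. The sketch below shows that pattern on its own; `boundedSection` and `BoundedSection` are hypothetical names, and the command templates and per-task fields of the real section are omitted.

```typescript
// Hypothetical generic helper illustrating the bounded-section bookkeeping.
interface BoundedSection<T> {
  count: number; // total rows available before capping
  returned: number; // rows actually included
  omitted: number; // rows dropped by the cap
  truncated: boolean; // true when the cap removed at least one row
  items: T[];
}

function boundedSection<T>(all: T[], limit: number): BoundedSection<T> {
  const visible = all.slice(0, Math.max(0, limit));
  return {
    count: all.length,
    returned: visible.length,
    omitted: all.length - visible.length,
    truncated: all.length > visible.length,
    items: visible,
  };
}

const section = boundedSection(["t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"], 3);
console.log(section.count, section.returned, section.omitted, section.truncated);
// prints 8 3 5 true
```

Reporting the omitted count alongside the capped rows lets a caller tell a short list from a truncated one without fetching the full output.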
function commanderClassificationCounts(tasks: Record<string, unknown>[], summaries: Map<string, Record<string, unknown>>): Record<string, unknown> {
const byCategory: Record<string, number> = {};
const byNoiseClass: Record<string, number> = {};
for (const task of tasks) {
const taskId = taskOverviewCandidateKey(task);
const classification = commanderTaskClassification(task, summaries.get(taskId) ?? null);
byCategory[classification.category] = (byCategory[classification.category] ?? 0) + 1;
byNoiseClass[classification.noiseClass] = (byNoiseClass[classification.noiseClass] ?? 0) + 1;
}
return {
byCategory,
byNoiseClass,
categories: ["business-user-facing", "deployment-artifact", "ci-e2e-evidence", "diagnostics-gate-report", "docs-governance", "infrastructure-blocker", "unknown"],
deterministic: true,
sourceFields: ["task prompt previews", "task metadata", "summary lastAssistantMessage preview when fetched"],
};
}
function commanderHighPriorityIssues(tasks: Record<string, unknown>[], summaries: Map<string, Record<string, unknown>>): Record<string, unknown> {
const tracked = ["HWLAB#7", "HWLAB#99", "HWLAB#116", "HWLAB#164", "HWLAB#317", "UniDesk#20", "UniDesk#118"];
const byIssue = new Map<string, string[]>();
for (const task of tasks) {
const taskId = taskOverviewCandidateKey(task);
const issues = highPriorityIssueRefs(taskIssueRefs(task, summaries.get(taskId) ?? null));
for (const issue of issues) {
const existing = byIssue.get(issue) ?? [];
if (!existing.includes(taskId)) existing.push(taskId);
byIssue.set(issue, existing);
}
}
const issueNumber = (value: string): number => Number(value.match(/#(\d{1,6})\b/u)?.[1] ?? Number.MAX_SAFE_INTEGER);
const items = Array.from(byIssue.entries())
.sort(([left], [right]) => issueNumber(left) - issueNumber(right) || left.localeCompare(right))
.map(([issue, taskIds]) => ({
issue,
taskCount: taskIds.length,
returnedTaskIds: taskIds.slice(0, commanderIssueTaskPreviewLimit),
omittedTaskIds: Math.max(0, taskIds.length - commanderIssueTaskPreviewLimit),
}));
return {
tracked,
present: items.length > 0,
matchedCount: items.length,
returned: items.length,
items,
};
}
function attentionCounts(items: CommanderAttentionItem[], returnedItems: CommanderAttentionItem[]): Record<string, unknown> {
const countBy = (source: CommanderAttentionItem[], key: "severity" | "action"): Record<string, number> => source.reduce((counts, item) => {
const value = item[key];
counts[value] = (counts[value] ?? 0) + 1;
return counts;
}, {} as Record<string, number>);
return {
total: items.length,
returned: returnedItems.length,
omitted: Math.max(0, items.length - returnedItems.length),
bySeverity: countBy(items, "severity"),
returnedBySeverity: countBy(returnedItems, "severity"),
byAction: countBy(items, "action"),
};
}
function terminalUnreadAggregateCount(taskPage: CodexTasksTaskPage, options: CodexTasksOptions, fallback: number): { total: number; exact: boolean; source: string } {
const queue = taskPage.queue;
if (queue !== null && options.queueId === undefined) {
const count = positiveCount(queue.unreadTerminal);
if (count !== null) return { total: count, exact: true, source: "queue-summary-unreadTerminal" };
}
if (queue !== null && options.queueId !== undefined) {
for (const item of asArray(queue.queues)) {
const record = asRecord(item);
if (record === null || asString(record.id) !== options.queueId) continue;
const count = positiveCount(record.unreadTerminal);
if (count !== null) return { total: count, exact: true, source: "queue-row-unreadTerminal" };
}
}
return { total: fallback, exact: false, source: "overview-page-fallback" };
}
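// Builds the bounded commander view: active-runner, backlog, and terminal-unread counts, risk signals,
// and drill-down commands. Prompt, response, trace, and output bodies are not included by default.
function codexTasksCommanderResult(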
taskPage: CodexTasksTaskPage,
upstream: { ok: unknown; status: unknown },
options: CodexTasksOptions,
summaries: Map<string, Record<string, unknown>>,
degraded: CodexTasksDegraded | null,
): Record<string, unknown> {
const allTasks = filterTasksForOptions(taskPage.tasks, options);
const statusCounts = statusCountsForOptions(taskPage, options);
const rawQueue = asRecord(taskPage.queue) ?? {};
const rawDiagnostics = asRecord(rawQueue.executionDiagnostics) ?? {};
const diagnostics = supervisorExecutionDiagnostics(rawDiagnostics);
const activity = compactCodeQueueActivity(rawQueue, diagnostics);
const commanderConcurrency = asRecord(activity.commanderConcurrency) ?? {};
const runningTasks = sortRunningWatchTasks(allTasks);
const unreadCompletedTasks = sortCompletedWatchTasks(allTasks).filter((task) => taskUnreadTerminal(task));
const recentCompletedTasks = options.unreadOnly ? [] : sortCompletedWatchTasks(allTasks).filter((task) => !taskUnreadTerminal(task));
const queuedRetryTasks = options.unreadOnly ? [] : sortQueuedWatchTasks(allTasks);
const nextBeforeId = asString(taskPage.pagination.nextBeforeId) || null;
const sourceHasMore = asBoolean(taskPage.pagination.hasMore);
const nextCommand = sourceHasMore && nextBeforeId !== null ? taskListCommand({ ...options, beforeId: nextBeforeId }) : null;
const fullCommand = taskListCommandWithView(options, "full");
const supervisorCommand = taskListCommandWithView(options, "supervisor");
const activeSection = buildSupervisorTaskSection(runningTasks, summaries, taskSectionLimit(options), sectionNextCommand(runningTasks, taskSectionLimit(options), options, nextCommand), fullCommand);
const activeRunning = supervisorActiveRunningSummary(taskPage, options, activeSection, diagnostics);
const attentionItems = allTasks
.map((task) => commanderAttentionItem(task, summaries.get(taskOverviewCandidateKey(task)) ?? null, rawDiagnostics))
.filter((item): item is CommanderAttentionItem => item !== null)
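.sort((left, right) => {
const rankDelta = commanderAttentionRank(left) - commanderAttentionRank(right);
if (rankDelta !== 0) return rankDelta;
// Missing or unparseable timestamps count as oldest, so the comparator stays consistent instead of yielding NaN.
const leftTime = Date.parse(left.updatedAt ?? "");
const rightTime = Date.parse(right.updatedAt ?? "");
const timeDelta = (Number.isFinite(rightTime) ? rightTime : 0) - (Number.isFinite(leftTime) ? leftTime : 0);
if (timeDelta !== 0) return timeDelta;
return left.id.localeCompare(right.id);
});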
const returnedAttention = attentionItems.slice(0, commanderAttentionLimit);
const activeRiskTasks = runningTasks.filter((task) => commanderAttentionReasons(task, summaries.get(taskOverviewCandidateKey(task)) ?? null, rawDiagnostics) !== null);
const queued = statusCounts.counts.queued ?? 0;
const retryWait = statusCounts.counts.retry_wait ?? 0;
const failedUnread = unreadCompletedTasks.filter((task) => asString(task.status) === "failed").length;
const canceledUnread = unreadCompletedTasks.filter((task) => asString(task.status) === "canceled").length;
const succeededUnread = unreadCompletedTasks.filter((task) => asString(task.status) === "succeeded").length;
const terminalUnreadAggregate = terminalUnreadAggregateCount(taskPage, options, unreadCompletedTasks.length);
return {
upstream,
commander: {
filters: {
view: "commander",
queueId: options.queueId ?? null,
requestedLimit: options.requestedLimit,
limit: options.limit,
...codexTasksLimitDisclosure(options),
unreadOnly: options.unreadOnly,
status: options.statusFilter,
beforeId: options.beforeId ?? null,
},
source: { queueId: options.queueId ?? null, ...codexTasksSourceDisclosure(taskPage.pagination) },
bounded: true,
disclosure: {
recommendedFor: "host commander supervision loops",
policy: "bounded action map only; no full prompt, final response, trace, output, or raw overview body is included by default",
attentionLimit: commanderAttentionLimit,
sectionReturnedLimit: commanderSectionReturnedLimit,
promptPreviewChars: commanderPromptPreviewChars,
bodyPreviewChars: commanderBodyPreviewChars,
rawOverview: `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview${tasksListQueryString(options)} --raw`,
},
activeRunners: {
count: asNumber(commanderConcurrency.activeRunnerCount, asNumber(activity.effectiveActiveTaskCount, activeRunning.effectiveActiveRunnerCount as number)),
exact: activeRunning.exact,
countField: commanderConcurrency.activeRunnerCountField ?? "activity.effectiveActiveTaskCount",
source: activity.effectiveActiveSource ?? "unknown",
disposition: commanderConcurrency.splitBrainDisposition ?? activity.splitBrainDisposition ?? null,
interventionRequired: commanderConcurrency.interventionRequired ?? null,
interventionReason: commanderConcurrency.interventionReason ?? null,
statusCounts: {
running: statusCounts.counts.running ?? 0,
judging: statusCounts.counts.judging ?? 0,
source: statusCounts.source,
exact: statusCounts.exact,
},
databaseRunning: activity.databaseRunningTaskCount,
databaseActive: activity.databaseActiveTaskCount,
heartbeatFreshActive: activity.heartbeatFreshActiveTaskCount,
schedulerLocalActiveQueues: activity.schedulerLocalActiveQueueCount,
schedulerLocalActiveRunSlots: activity.schedulerLocalActiveRunSlotCount,
},
queueBacklog: {
queued,
retryWait,
total: queued + retryWait,
exact: statusCounts.exact,
source: statusCounts.source,
},
terminalUnread: {
total: terminalUnreadAggregate.total,
failed: failedUnread,
canceled: canceledUnread,
succeeded: succeededUnread,
rowsReturned: unreadCompletedTasks.length,
rowsOmitted: Math.max(0, terminalUnreadAggregate.total - unreadCompletedTasks.length),
exact: terminalUnreadAggregate.exact,
source: terminalUnreadAggregate.source,
statusBreakdownSource: "fetched-priority-rows",
},
riskCounts: {
attention: attentionCounts(attentionItems, returnedAttention),
activeRisks: activeRiskTasks.length,
heartbeatRiskTaskIds: stringList(rawDiagnostics.heartbeatRiskTaskIds).length,
staleRecoveryCandidateTaskIds: stringList(rawDiagnostics.staleRecoveryCandidateTaskIds).length,
traceGapTaskIds: stringList(rawDiagnostics.traceGapTaskIds).length,
awaitingTerminalOrJudge: attentionItems.filter((item) => item.riskSignals.includes("awaiting-terminal-or-judge") || item.riskSignals.includes("awaiting-judge")).length,
blockerLikeFinalResponse: attentionItems.filter((item) => item.riskSignals.some((signal) => signal === "blocked" || signal === "infra-auth-network" || signal === "merge-or-test-failure" || signal === "not-deployed")).length,
},
highPriorityIssues: commanderHighPriorityIssues(allTasks, summaries),
classification: commanderClassificationCounts(allTasks, summaries),
executionDiagnostics: diagnostics,
degraded,
commands: {
refresh: taskListCommand(options),
supervisor: supervisorCommand,
full: fullCommand,
unread: `bun scripts/cli.ts codex unread --limit ${Math.min(options.requestedLimit, defaultTasksLimit)}`,
running: taskListCommandWithView({ ...baseTaskListOptions({ ...options, unreadOnly: false, beforeId: undefined }), statusFilter: ["running", "judging"] }, "supervisor"),
queued: taskListCommandWithView({ ...baseTaskListOptions({ ...options, unreadOnly: false, beforeId: undefined }), statusFilter: ["queued", "retry_wait"] }, "supervisor"),
rawOverview: `bun scripts/cli.ts microservice proxy code-queue /api/tasks/overview${tasksListQueryString(options)} --raw`,
showTemplate: "bun scripts/cli.ts codex task <taskId>",
detailTemplate: "bun scripts/cli.ts codex task <taskId> --detail",
traceTemplate: `bun scripts/cli.ts codex task <taskId> --trace --tail --limit ${defaultTraceLimit}`,
outputTemplate: `bun scripts/cli.ts codex output <taskId> --tail --limit ${defaultOutputLimit}`,
readTemplate: "bun scripts/cli.ts codex read <taskId>",
},
attention: {
...attentionCounts(attentionItems, returnedAttention),
truncated: attentionItems.length > returnedAttention.length,
items: returnedAttention,
},
sections: {
activeNeedsAttention: commanderIdSection(activeRiskTasks, summaries, commanderSectionReturnedLimit, taskListCommandWithView({ ...options, statusFilter: ["running", "judging"] }, "supervisor"), fullCommand),
terminalUnread: commanderIdSection(unreadCompletedTasks, summaries, commanderSectionReturnedLimit, taskListCommandWithView({ ...options, unreadOnly: true }, "supervisor"), fullCommand),
queuedRetryWait: commanderIdSection(queuedRetryTasks, summaries, commanderSectionReturnedLimit, taskListCommandWithView({ ...options, statusFilter: ["queued", "retry_wait"] }, "supervisor"), fullCommand),
recentCompleted: commanderIdSection(recentCompletedTasks, summaries, commanderRecentCompletedLimit, nextCommand, fullCommand),
},
},
};
}
function codexTasksOverviewResult(
taskPage: CodexTasksTaskPage,
upstream: { ok: unknown; status: unknown },
@@ -3525,6 +4085,7 @@ function codexTasksOverviewResult(
summaries: Map<string, Record<string, unknown>>,
degraded: CodexTasksDegraded | null,
): Record<string, unknown> {
if (options.view === "commander") return codexTasksCommanderResult(taskPage, upstream, options, summaries, degraded);
if (options.view === "full") return codexTasksFullResult(taskPage, upstream, options, summaries, degraded);
const allTasks = filterTasksForOptions(taskPage.tasks, options);
const runningTasks = sortRunningWatchTasks(allTasks);
@@ -3677,6 +4238,16 @@ function visibleTaskIdsForOverview(tasks: Record<string, unknown>[], options: Co
return filtered.slice(0, options.limit).map((task) => taskOverviewCandidateKey(task)).filter((taskId) => taskId.length > 0);
}
const sectionLimit = taskSectionLimit(options);
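// Commander view: keep every running and unread-terminal task, cap queued and recent-completed rows,
// then dedupe the IDs and bound the list at maxTasksLimit.
if (options.view === "commander") {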
return Array.from(new Set([
...sortRunningWatchTasks(filtered),
...sortCompletedWatchTasks(filtered).filter((task) => taskUnreadTerminal(task)),
...sortQueuedWatchTasks(filtered).slice(0, commanderSectionReturnedLimit),
...sortCompletedWatchTasks(filtered).filter((task) => !taskUnreadTerminal(task)).slice(0, commanderRecentCompletedLimit),
].map((task) => taskOverviewCandidateKey(task))))
.filter((taskId) => taskId.length > 0)
.slice(0, maxTasksLimit);
}
return Array.from(new Set([
...sortRunningWatchTasks(filtered).slice(0, sectionLimit),
...sortCompletedWatchTasks(filtered).filter((task) => taskUnreadTerminal(task)).slice(0, sectionLimit),
+2 -2
@@ -57,7 +57,7 @@ export function rootHelp(): unknown {
{ command: "codex skills-sync --dry-run [--full]", description: "Inspect the controlled runner skills hostPath lifecycle contract without copying files, restarting services, reading secrets, or mutating live runner paths." },
{ command: "codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/<name>] [--pr-create-dry-run --pr-create-dry-run-head <head>] [--issue N] [--full|--raw]", description: "Read-only PR admission check with compact commander output by default; use --full or --raw to expand the full runtime preflight, tool, and observation payload." },
{ command: "codex task <taskId> [--detail] [--trace --tail|--from-start|--after-seq N|--before-seq N --limit N] [--full]", description: "Fetch the bounded review view by default; --detail is still capped, while --full/trace/output explicitly expand evidence." },
{ command: "codex tasks [--view supervisor|full] [--queue id] [--status status[,status]] [--unread|--unread-only] [--limit N] [--before-id id]", description: "Show the low-noise supervisor view by default: compact task rows, tiny local sections, activity counts, diagnostics, and drill-down commands; use --view full for detailed rows." },
{ command: "codex tasks [--view commander|supervisor|full] [--queue id] [--status status[,status]] [--unread|--unread-only] [--limit N] [--before-id id]", description: "Show Code Queue task state with progressive disclosure; --view commander is the recommended bounded host-commander loop, supervisor keeps compact sections, and full returns detailed rows." },
{ command: "codex unread [summary|mark-read] [--queue id] [--repo owner/name] [--issue N] [--status succeeded,failed,canceled] [--limit N] [--confirm]", description: "Summarize unread terminal backlog by repo, issue, status and queue without raw prompts; batch mark-read requires the explicit mark-read subcommand plus --confirm." },
{ command: "codex output <taskId> [--tail|--from-start|--after-seq N|--before-seq N --limit N] [--full-text]", description: "Fetch paged raw Code Queue output records; default caps large limits/text previews, --full-text explicitly expands one seq window." },
{ command: "codex read <taskId>", description: "Mark one reviewed terminal task read and return terminal metadata plus final response; prompt/tool logs stay behind drill-down commands." },
@@ -259,7 +259,7 @@ function codexHelp(): unknown {
"cat <<'PROMPT' | bun scripts/cli.ts codex submit --prompt-stdin --queue <id> --dry-run",
"bun scripts/cli.ts codex submit --prompt-file /tmp/code-queue-prompt.md --queue <id> --dry-run",
"bun scripts/cli.ts codex task <taskId> [--detail] [--trace --tail|--from-start|--after-seq N|--before-seq N --limit N] [--full]",
"bun scripts/cli.ts codex tasks [--view supervisor|full] [--queue id] [--status succeeded,running] [--unread|--unread-only] [--limit N] [--before-id id]",
"bun scripts/cli.ts codex tasks [--view commander|supervisor|full] [--queue id] [--status succeeded,running] [--unread|--unread-only] [--limit N] [--before-id id]",
"bun scripts/cli.ts codex unread [--repo owner/name] [--issue N] [--limit N]",
"bun scripts/cli.ts codex unread mark-read [--repo owner/name] [--issue N] [--limit N] --confirm",
"bun scripts/cli.ts codex output <taskId> [--tail|--from-start|--after-seq N|--before-seq N --limit N] [--full-text]",