From 7ea2f77b70bcac0dda54962257adac26b47b27b1 Mon Sep 17 00:00:00 2001
From: Codex
Date: Sat, 23 May 2026 01:56:59 +0000
Subject: [PATCH] fix: bound code queue cli noise

---
 AGENTS.md                                     |   4 +-
 docs/reference/cli.md                         |   8 +-
 docs/reference/code-queue-supervision.md      |   6 +-
 ...code-queue-cli-disclosure-contract-test.ts | 155 ++++++++++++++++++
 scripts/code-queue-cli-steer-test.ts          |   5 +
 ...eue-supervisor-disclosure-contract-test.ts |  12 +-
 scripts/src/check.ts                          |   8 +
 scripts/src/code-queue.ts                     | 135 ++++++++++-----
 scripts/src/help.ts                           |  14 +-
 9 files changed, 286 insertions(+), 61 deletions(-)
 create mode 100644 scripts/code-queue-cli-disclosure-contract-test.ts

diff --git a/AGENTS.md b/AGENTS.md
index 92c9bef0..02950288 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -50,9 +50,9 @@ UniDesk 是一个以主 server 为统一入口的分布式工作平台;本文
 - `bun scripts/cli.ts ci install/status/run/publish-backend-core/publish-user-service/run-dev-e2e/logs`:在 D601 原生 k3s 上安装和运行 Tekton CI,支持每 commit 检查、Code Queue 只读性能门禁、`CI.json` catalog 驱动的 backend-core 与 user-service commit-pinned 镜像发布和手动触发的 `origin/master:deploy.json#environments.dev` 临时 namespace e2e;catalog/producer/consumer 分工见 `docs/reference/cicd-standardization.md`,`run-dev-e2e` 的 Git 控制 runner、短 launcher 和 no-CD 边界见 `docs/reference/dev-ci-runner.md`,Tekton 规则见 `docs/reference/ci.md`。
 - `bun scripts/cli.ts codex deploy `:旧 Code Queue 兼容部署入口已禁用,原因是它会绕过受控部署边界直连 D601 部署 Code Queue;规则见 `docs/reference/codex-deploy.md`。
 - `bun scripts/cli.ts codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue ]` / `codex pr-preflight [--remote]`:前者通过 backend-core 私有代理提交 Code Queue 任务,`--dry-run` 会给出 MiniMax/GPT/人工路由建议但不改写 payload,真实提交成功只返回写入确认、task id 和后续查看命令,不回显 prompt;后者只读检查 D601 scheduler/runner 的 GitHub token、egress 和 PR 能力,PR 型派单前必须使用,规则见 `docs/reference/cli.md` 和 `docs/reference/code-queue-supervision.md`。
-- `bun scripts/cli.ts codex task `:按 Code Queue 任务 ID 查询默认审阅摘要,只返回原始 prompt、最终 response、最后错误和渐进披露命令;需要工具调用、attempt/judge 和详细耗时时显式加 `--detail`;`codex queues [--full] [--limit N] [--page N|--offset N]` 默认分页低噪声输出队列摘要,完整 upstream 只通过 raw command 显式获取。
+- `bun scripts/cli.ts codex task `:按 Code Queue 任务 ID 查询默认审阅摘要,只返回原始 prompt、最终 response、最后错误和渐进披露命令;`--detail`、`codex output` 和 supervisor 大 `--limit` 仍默认有界,完整内容需显式 `--full`/`--full-text`/分页展开;`codex queues [--full] [--limit N] [--page N|--offset N]` 默认分页低噪声输出队列摘要,完整 upstream 只通过 raw command 显式获取。
 - `bun scripts/cli.ts codex judge --attempt [--dry-run]`:按指定 task/attempt 用与队列 worker 相同的上下文构建和 MiniMax judge 调用路径单步复现完成判定;`--dry-run` 只输出 prompt/payload 诊断。
-- `bun scripts/cli.ts codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run]`:通过 Code Queue 私有代理向运行中的 active turn 注入纠偏提示,正式替代底层 `microservice proxy ... /steer` 调用。
+- `bun scripts/cli.ts codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run]`:通过 Code Queue 私有代理向运行中的 active turn 注入纠偏提示,真实成功只确认写入并返回后续查看命令,不回显 prompt 或完整 task state。
 - `bun scripts/cli.ts codex interrupt|cancel `:通过 Code Queue 私有代理中断运行任务或取消 queued/retry_wait 任务,规则见 `docs/reference/cli.md`。
 - `bun scripts/cli.ts server stop`:以异步 job 停止固定 Compose 项目中的全部 UniDesk 服务,停止后用 `server status` 复核。
 - `bun scripts/cli.ts job list [--limit N]` / `bun scripts/cli.ts job status latest [--tail-bytes N]`:分页查询 `.state/jobs/` 中的异步任务状态,状态输出只读日志尾部并保留完整日志路径,job 机制见 `docs/reference/cli.md`。
diff --git a/docs/reference/cli.md b/docs/reference/cli.md
index b95f1909..893b4923 100644
--- a/docs/reference/cli.md
+++ b/docs/reference/cli.md
@@ -46,14 +46,14 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI
 - `codex deploy ` 是旧 Code Queue 兼容部署入口,已禁用以防止维护通道直连 D601 部署 Code Queue;当前 dev 自动化只做 `ci run-dev-e2e` smoke,不提供 Code Queue CD,详细规则见 `docs/reference/codex-deploy.md`。
 - `codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue queueId] [--provider-id id] [--cwd path] [--model model] [--reasoning-effort effort] [--execution-mode mode] [--max-attempts N] [--reference-task-id id] [--dry-run]` 通过 backend-core 私有代理向稳定 `code-queue` 用户服务路径提交任务;prompt 必须且只能来自位置参数、文件或 stdin 之一,`--dry-run` 只返回结构化请求且不实际入队。长 prompt、多行 prompt、含引号/反引号/Markdown 表格/JSON/反斜杠的 prompt 必须优先用 `--prompt-stdin` 或 `--prompt-file`,不要拼进 shell 单个参数;位置参数只适合短单行 smoke prompt。stdin 推荐用 quoted heredoc:`cat <<'PROMPT' | bun scripts/cli.ts codex submit --prompt-stdin --queue --dry-run`,文件路径推荐 `bun scripts/cli.ts codex submit --prompt-file /tmp/code-queue-prompt.md --queue --dry-run`,确认 dry-run 后移除 `--dry-run` 提交同一 payload。dry-run 会额外输出 `routingRecommendation`,包含推荐 route、runner、model、风险信号、prompt 自包含/issue 非唯一来源/prod-secret-DB 禁止/运行态或 release 禁止/证据要求/中等复杂度候选等 guard 状态;同时输出 `policyContract`,固定暴露 GPT-5.5、DeepSeek、MiniMax 的风险分层、并发上限和外部 provider 429 退避处置。该建议只用于指挥官 preflight,不会改写 payload,不改变 runtime admission,也不假设生产 MiniMax 或 DeepSeek 可用。`--dry-run` 必须返回完整 prompt、字符数和 `truncated=false` 用于人工验收;真实提交是写入操作,默认只返回 `accepted=true`、task id、队列、写入保护摘要和后续查看命令,必须标记 `promptOmitted=true` 且不得回显 prompt 或 promptPreview。真实提交会经过本机本地串行化保护和短节流,避免同一指挥端并发 submit 把低内存主机或 `code-queue-mgr` 控制面打抖;返回值会附带低噪声 `submitConcurrencyGuard` 说明本次提交的锁与等待信息。backend-core 默认把提交、队列 CRUD、已读状态、历史摘要和轻量 Trace 读取分流到主 server `code-queue-mgr`,由它写入主 PostgreSQL;D601 scheduler 只轮询并执行已入库任务。
 - `codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/] [--pr-create-dry-run --pr-create-dry-run-head ] [--issue N] [--full]` 通过稳定 `code-queue` proxy 请求 D601 scheduler `/api/runtime-preflight`,用于 PR 型派单 admission。输出会压缩展示 scheduler/runner 的 token 覆盖、Auth Broker source/capability/nextAction、工具、agent port、Git worktree、GitHub egress、repo/issue/PR 只读探测、可选 push dry-run,以及可选 PR body/create dry-run guard;只报告 `GH_TOKEN`/`GITHUB_TOKEN` 是否存在和来源 key,不打印值。当 auth-broker 配置存在时,`tokenCoverage.source="auth-broker"`、`credentialSource="broker-issued-token"` 且 runner env token 不是成功前提;当仅 env token 存在时,`credentialSource="env-token"` 且 `preflight.authBroker.nextAction="use-env-token-until-auth-broker-live"`;两者都缺失时顶层 `ok=false`、`runnerDisposition=infra-blocked`、`degradedReason=auth-broker-needed`,`tokenCoverage.missing` 同时列出 `GH_TOKEN` 与 `GITHUB_TOKEN`,并输出 `preflight.authBroker.source="broker/auth-broker-needed"`、`capability.source="missing-token"`。该 `auth-missing` 的 scope 是 `scheduler-runner-env`,不能简化成“当前 active runner/dev container 不能创建 PR”;输出必须带 `scopeBoundary` 和 `activeRunnerDevContainer`,要求调用方另跑 `bun scripts/cli.ts gh auth status --repo pikasTech/unidesk` 与 PR dry-run 来确认当前 dev container 能力。`preflight.prCapabilityContract` 是 runner-facing 合同摘要,必须包含目标分支、token/auth 来源、`systemGhBinaryRequiredForWrites=false`、UniDesk REST `bun scripts/cli.ts gh` 可用性、push dry-run/PR create dry-run 的 `writesRemote=false`、expected PR handoff、真实 PR 创建需要 commander 授权和 `gh pr merge` 的 `unsupported-command` 边界;系统 `gh` binary 缺失只进入 `tools.systemGhBinary`,不得误判为 UniDesk REST `gh` CLI 不可用。`--remote` 在 runner-like 环境里不再依赖本地 `unidesk-backend-core`、`unidesk-database`、`baidu-netdisk-backend` 容器存在;这些缺失只作为本地观测证据。若远程控制面可达,则继续走远程控制面结果;若远程控制面不可达,则结构化返回 `failureKind=control-plane-missing` / `degradedReason=remote-control-plane-unreachable`,而不是把本地 `backend-core-container-missing` 当作最终阻塞。`--pr-create-dry-run` 不 POST GitHub,只证明 runner 内 PR body 生成、`scripts/cli.ts gh pr create --dry-run` 和 branch 参数形态可用;服务端创建权限仍以 token/auth broker、repo/issue/PR read、push dry-run 和最终授权后的真实 PR 创建结果为准。
-- `codex task ` 通过 Code Queue 私有代理按任务 ID 查询结构化审阅摘要;默认只返回任务身份、执行 Provider、工作目录、attempt 计数、原始 prompt、最终 response、最后错误和渐进披露命令,适合指挥官审阅完成未读任务且避免上下文爆炸。需要旧式详细摘要时显式加 `--detail`;需要完整 prompt/response 文本时加 `--full`;需要工具调用、judge、attempt 全量摘要时使用 `--detail --full --tool-limit N`。该摘要读取默认由主 server `code-queue-mgr` 从 PostgreSQL 返回,不依赖 D601 `code-queue-read` Service 可用。
-- `codex tasks [--view supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] [--limit N] [--before-id id]` 通过同一私有代理输出渐进式披露视图。默认 `supervisor` 是低噪声指挥官视图,只返回 `running`、`completedUnread`、`recentCompleted`、`queued` 和 `executionDiagnostics` 的紧凑行;prompt/body 只给短预览和原始字符数,`running`/`completedUnread`/`queued` 默认只返回一个有界小页并通过 section `commands.next` 继续分页,`recentCompleted` 默认最多返回 5 条且不重复 `completedUnread` 未读终态,不嵌入完整 Trace、final response 或全量 overview。每个条目只保留 task id、队列、状态、issue、分类和短摘要,`show/detail/trace/output/full/read` 放在 section template 中避免重复噪声,并带 `kind` 标记直接推进、部署修复、验证/报告噪声等类别,帮助指挥官按 #131 聚焦真实推进而不是被 Gate/报告/审查任务牵引。`--unread` 是 `--unread-only` 的别名,必须只保留未读终态;`--status` 必须真实过滤支持的状态,未知参数或未知状态必须结构化失败,不能静默忽略。需要更详细当前页任务行时显式使用 `--view full` 或 `--full`,仍受 `--limit` 和 `--before-id` 分页约束。
+- `codex task ` 通过 Code Queue 私有代理按任务 ID 查询结构化审阅摘要;默认只返回任务身份、执行 Provider、工作目录、attempt 计数、原始 prompt、最终 response、最后错误和渐进披露命令,适合指挥官审阅完成未读任务且避免上下文爆炸。`--detail` 仍是有界详细摘要:默认只返回少量 attempt/tool 行、短 prompt/response/stderr/feedback 预览和 omitted/truncated 元数据;需要完整 prompt/response 文本或更多 tool/attempt 细节时再显式加 `--full`、`--tool-limit N`、`--trace` 或 `codex output`。该摘要读取默认由主 server `code-queue-mgr` 从 PostgreSQL 返回,不依赖 D601 `code-queue-read` Service 可用。
+- `codex tasks [--view supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] [--limit N] [--before-id id]` 通过同一私有代理输出渐进式披露视图。默认 `supervisor` 是低噪声指挥官视图,只返回 `running`、`completedUnread`、`recentCompleted`、`queued` 和 `executionDiagnostics` 的紧凑行;prompt/body 只给短预览和原始字符数,`running`/`completedUnread`/`queued` 默认只返回一个很小的有界页并通过 section `commands.next` 继续分页,`recentCompleted` 默认也限量且不重复 `completedUnread` 未读终态,不嵌入完整 Trace、final response 或全量 overview。`--limit` 在 supervisor 中主要是扫描/分页预算,不是返回几十条肥行的开关;需要更详细当前页任务行时显式使用 `--view full` 或 `--full`,仍受 `--limit` 和 `--before-id` 分页约束。
 - `codex task --trace --tail|--from-start|--after-seq N|--before-seq N --limit N` 按页拉取 Code Queue 的逻辑 trace;响应会返回 `nextAfterSeq`、`previousBeforeSeq`、`hasMore`、`hasBefore` 和下一页/上一页命令,默认 `--trace` 取最新一页,且仍以分页 trace 为主;需要完整 prompt/最终 response 时加 `--full`,需要详细 task 摘要时加 `--detail`。
-- `codex output --tail|--from-start|--after-seq N|--before-seq N --limit N [--full-text]` 按原始 output seq 分页读取底层记录;当 trace 行提示 `commandOmittedLines`、`bodyOmittedLines` 或 `rawSeqs` 时,用该命令按 seq 补取完整信息,默认仍有单条文本预览上限,显式 `--full-text` 才返回该页全文。
+- `codex output --tail|--from-start|--after-seq N|--before-seq N --limit N [--full-text]` 按原始 output seq 分页读取底层记录;当 trace 行提示 `commandOmittedLines`、`bodyOmittedLines` 或 `rawSeqs` 时,用该命令按 seq 补取信息。默认是低噪声 raw-output 摘要:即使传入很大的 `--limit`,非 `--full-text` 也会限制返回行数和单条文本预览,并在 `disclosure.limitCapped`、`requestedLimit`、`effectiveLimit` 和 `commands.fullText` 中说明如何继续展开;显式 `--full-text` 才返回该页全文。
 - `codex read ` 在人工审阅后标记单个终态任务已读;列表、overview 和 supervisor 视图只返回这个命令字段,不得自动执行,也不得批量清空未读状态。
 - `codex dev-ready` 查询 Code Queue `/api/dev-ready` 并返回有界 readiness 摘要,包括工具、Docker、Codex config、SSH 和 `devReady.skills`。`devReady.skills` 只暴露 `UNIDESK_SKILLS_PATH`、是否存在、是否只读、skillCount、`cli-spec` 是否可见和修复建议,不输出宿主 auth/token 文件内容。
 - `codex judge --attempt N [--dry-run] [--include-prompt]` 通过 Code Queue 私有代理按指定 attempt 单步复现 judge;这是执行面诊断入口,仍依赖 D601 scheduler/runner 侧的真实 judge builder、MiniMax 调用路径和执行环境。默认会真实调用 MiniMax,`--dry-run` 只返回 prompt/payload 大小、attempt 窗口和重建来源诊断,`--include-prompt` 仅用于本地深度排查。
-- `codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run]` 通过 Code Queue 私有代理向正在运行的 task 注入纠偏提示,正式替代底层 `microservice proxy code-queue /api/tasks//steer` 调用。prompt 必须且只能来自位置参数、文件或 stdin 之一;`--dry-run` 只输出 `method`、`path`、`stableProxyPath`、prompt 字符数、截断预览和 raw proxy 等价命令,不触碰运行中 session,也不得泄露超长 prompt 全文。真实执行复用与 `codex task/tasks/read` 相同的 backend-core stable proxy helper,路径固定为 `/api/microservices/code-queue/proxy/api/tasks//steer`,只能作用于 D601 scheduler 上存在 active steerable turn 的 running task。
+- `codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run]` 通过 Code Queue 私有代理向正在运行的 task 注入纠偏提示,正式替代底层 `microservice proxy code-queue /api/tasks//steer` 调用。prompt 必须且只能来自位置参数、文件或 stdin 之一;`--dry-run` 只输出 `method`、`path`、`stableProxyPath`、prompt 字符数、截断预览和 raw proxy 等价命令,不触碰运行中 session,也不得泄露超长 prompt 全文。真实执行是写入操作,成功只返回 `accepted=true`、task id、prompt 字符数、`promptOmitted=true`、有界 task/queue 确认和后续查看命令,不回显 prompt 或完整 task state;路径固定为 `/api/microservices/code-queue/proxy/api/tasks//steer`,只能作用于 D601 scheduler 上存在 active steerable turn 的 running task。
 - `codex steer` 非 dry-run 失败仍输出 JSON 且退出非零;`.data.diagnostics.reason` 用于 runner 分流,当前包括 `backend-core-unreachable`、`code-queue-microservice-unregistered`、`proxy-unauthorized`、`proxy-404`、`steer-endpoint-404`、`upstream-runtime-rejected`、`stable-proxy-failed` 和 `invalid-proxy-response`。`scope` 区分 `backend-core`、`stable-proxy`、`code-queue-runtime` 或 `unknown`,并带 `status`、`exitCode`、`retryable`、有界 `upstreamBodyPreview` 和推荐交叉验证命令;若任务不在 running/active-turn 状态,通常归类为 `upstream-runtime-rejected`,不得静默成功。
 - `codex interrupt|cancel ` 通过 Code Queue 私有代理请求中断;running/judging 任务会请求 D601 当前 agent run 停止,queued/retry_wait 任务的取消也必须保持与 WebUI 相同代理路径,返回有界 task 摘要和后续查询命令。任何需要接触 active run 的动作仍属于 D601 执行面。
 - Code Queue 多队列 lane 由 `codex` 命令命名空间管理:`queues [--full|--all] [--limit N] [--page N|--offset N]` 列表、`queue create ` 创建、`queue merge --into ` 合并、`move --queue ` 迁移;这些队列管理入口默认由主 server `code-queue-mgr` 直管 PostgreSQL,仍通过稳定 `code-queue` 用户服务代理路径访问。`codex queues` 默认只返回 active/nonempty/unread/runnable queue 摘要、全局 counts 和 execution diagnostics;`--full` 或 `--all` 只切换为完整队列行视图的一页,仍受 `--limit`/`--page`/`--offset` 分页约束,不再默认携带 deprecated full array。summary 和 full 的稳定机读路径都是 `.data.queues.items[]`,全局元数据固定在 `.data.queues.counts`、`.data.queues.executionDiagnostics`、`.data.queues.activeTaskIds` 和 `.data.queues.queuedTaskIds`;需要完整 upstream 时使用输出中的 raw command。旧 full 顶层数组语义已作为 deprecated 兼容信息记录,不再作为 `.data.queues` 主形态。同一个 queue 内部串行执行,不同 queue 之间并行执行。迁移只允许尚未被 scheduler claim 的 `queued`/`retry_wait` 任务,必须满足 `startedAt=null`、`currentAttempt=0` 且没有 active thread/turn;已进入 `running`/`judging` 或已有 claim 标记的任务返回 409,不得被 move/merge 回写成 queued。合并会移动可迁移任务归属并自动删除源 queue 记录,只保留合并后的目标 queue;若 source 或 target queue 存在 active/claimed 任务,合并整体返回 409。合并后的目标 queue 按任务原 `queueEnteredAt`/`createdAt` 时间顺序串行,成功迁移 queued/retry_wait 任务后由 D601 scheduler 轮询推进。
diff --git a/docs/reference/code-queue-supervision.md b/docs/reference/code-queue-supervision.md
index cfdf0f99..3f9c2e3f 100644
--- a/docs/reference/code-queue-supervision.md
+++ b/docs/reference/code-queue-supervision.md
@@ -201,7 +201,7 @@ bun scripts/cli.ts codex pr-preflight --remote --issue
 常用入口:
-- `bun scripts/cli.ts codex tasks --view supervisor --limit N`:查看默认低噪声监督视图,包括 running、完成未读、最多 5 条最近完成、queued/runnable、execution diagnostics、任务分类和下一步 drill-down 命令。默认行只保留 task id、队列、短 prompt/body 预览和原始字符数;`show/detail/trace/output/full/read` 放在 section template 中,避免每条任务重复刷屏,需要更多内容再按 taskId 展开。
+- `bun scripts/cli.ts codex tasks --view supervisor --limit N`:查看默认低噪声监督视图,包括 running、完成未读、少量最近完成、queued/runnable、execution diagnostics、任务分类和下一步 drill-down 命令。默认行只保留 task id、队列、短 prompt/body 预览和原始字符数;`--limit` 是扫描/分页预算,不是返回几十条肥行的开关;`show/detail/trace/output/full/read` 放在 section template 中,避免每条任务重复刷屏,需要更多内容再按 taskId 展开。
 - `bun scripts/cli.ts codex queues`:查看低噪声队列计数、active task id、完成未读队列、runnable 队列和控制面诊断;需要完整队列行视图时加 `--full`,但 `--full` 仍默认分页,继续用 `--limit N`、`--page N` 或 `--offset N` 渐进展开。summary 和 full 都使用稳定 JSON path `.data.queues.items[]` 读取队列行,并从 `.data.queues.counts` 与 `.data.queues.executionDiagnostics` 读取全局计数和执行诊断;完整 upstream 只通过输出中的 raw command 显式获取。
 - `bun scripts/cli.ts codex tasks --unread --limit N`:查看完成未读审阅积压;`--unread` 与 `--unread-only` 等价,不能被静默忽略。
 - `bun scripts/cli.ts codex tasks --status succeeded --unread --limit N`:按具体终态过滤监督结果;不支持的 status filter 必须显式失败,不能扩大为未过滤结果。
@@ -209,11 +209,11 @@ bun scripts/cli.ts codex pr-preflight --remote --issue
 - 当默认审阅摘要不足时,再逐级使用 `bun scripts/cli.ts codex task --detail`、`bun scripts/cli.ts codex task --trace --limit N` 或 `codex output`。
 - 当 master 控制面状态和 D601 scheduler 状态看起来分裂时,使用 `docs/reference/observability.md` 中的活性规则判断。
-默认 supervisor 视图必须保持低噪声。`running`、`completedUnread` 和 `queued` 即使传入较大的 `--limit`,默认也只返回一个有界小页,并通过 section `commands.next` 继续分页;`--limit` 保留给 full view 和后续分页请求,不得让一次 supervisor 调用输出几十条肥行。每个任务行只应带 task id 和必要摘要,`show`、`detail`、`trace`、`output`、`full`、`read` 使用 section template 表达,让下一步渐进披露动作明确且不重复;默认不得嵌入完整 queue 列表、完整 final response、raw output 页或完整 trace 行。`recentCompleted` 必须默认限量,且不得重复 `completedUnread` 里的未读终态,避免完成历史把当前 running、阻塞和未读审阅挤出视野;需要完整当前页时显式使用 `--view full`。`executionDiagnostics` 只能展示有界 task-id/reason 预览、总数、截断标记和 omitted counts;需要全量诊断时使用输出中的 raw command。`commands.read` 只是在人工审阅后的建议命令,listing 命令绝不能自动执行。
+默认 supervisor 视图必须保持低噪声。`running`、`completedUnread` 和 `queued` 即使传入较大的 `--limit`,默认也只返回一个很小的有界页,并通过 section `commands.next` 继续分页;`--limit` 保留为扫描/分页预算和 full view 返回预算,不得让一次 supervisor 调用输出几十条肥行。每个任务行只应带 task id 和必要摘要,`show`、`detail`、`trace`、`output`、`full`、`read` 使用 section template 表达,让下一步渐进披露动作明确且不重复;默认不得嵌入完整 queue 列表、完整 final response、raw output 页或完整 trace 行。`recentCompleted` 必须默认限量,且不得重复 `completedUnread` 里的未读终态,避免完成历史把当前 running、阻塞和未读审阅挤出视野;需要完整当前页时显式使用 `--view full`。`executionDiagnostics` 只能展示有界 task-id/reason 预览、总数、截断标记和 omitted counts;需要全量诊断时使用输出中的 raw command。`commands.read` 只是在人工审阅后的建议命令,listing 命令绝不能自动执行。
 这条规则直接服务 HWLAB #132:指挥官要优先看到真实业务推进、部署修复、阻塞和需要人工审阅的未读结果,Gate/报告/审查/诊断类任务只能作为折叠的分类信号存在,不能在默认输出中用长 prompt/body 抢占上下文。
-完成未读任务的审阅也必须遵循渐进披露。指挥官默认只拉取原始 prompt 和最终 response,用它判断任务是否声称完成、是否有明显越界、是否缺少验收证据;不要默认拉完整 trace、全量 tool summary 或 raw output。只有当 final response 与目标不一致、证据不足、远端 commit 无法验证、任务疑似造假、或需要追溯失败原因时,才继续展开 `--detail`、分页 `--trace`、或按 seq 读取 `codex output`。这条规则的目标是降低上下文压力,同时保留通过多步查询拿到完整证据的能力。
+完成未读任务的审阅也必须遵循渐进披露。指挥官默认只拉取原始 prompt 和最终 response,用它判断任务是否声称完成、是否有明显越界、是否缺少验收证据;不要默认拉完整 trace、全量 tool summary 或 raw output。`codex task --detail` 也是有界摘要,只提供少量 attempt/tool 行和短文本预览;需要完整证据时再继续展开 `--detail --full --tool-limit N`、分页 `--trace`,或按 seq 读取 `codex output`。`codex output` 默认仍会限制返回行数和单条文本预览;只有明确使用 `--full-text` 且选定 seq window 时才读取该页全文。只有当 final response 与目标不一致、证据不足、远端 commit 无法验证、任务疑似造假、或需要追溯失败原因时,才进入这些展开路径。这条规则的目标是降低上下文压力,同时保留通过多步查询拿到完整证据的能力。
 队列诊断中的 `split-brain` 表示控制面/执行面观测分裂,不自动证明任务已经死亡。只要任务 heartbeat 还在刷新、trace 仍在推进,就不能把它判成服务中断或要求立刻 stop;应把它视为 `splitBrainLive=true` 的 live 任务,继续监督并推进 #20 里的已排任务,而不是 interrupt、替换或把 backend 当成已经挂掉。队列摘要应显示 `effectiveLiveness=live`、`splitBrainLive=true` 和 `recommendedAction=continue-supervision`;compact 输出还应在 `executionDiagnostics.liveness` 中重复这些低噪声字段,并突出 `activeHeartbeatCount`、有界 `heartbeatFreshTaskIds`、`databaseActiveTaskCount` 和 `schedulerActiveRunSlotCount`。当 master/control-plane 的 `schedulerActiveRunSlotCount=0` 但 `heartbeatFreshTaskIds` 非空时,active 数应优先按 scheduler heartbeat 摘要解释为 live,而不是按 master 本地 slot 0 解释为执行停摆。只有 heartbeat expired/missing 或满足 stale-recovery 条件时,才应显示 `effectiveLiveness=at-risk` 并进入恢复判断。
diff --git a/scripts/code-queue-cli-disclosure-contract-test.ts b/scripts/code-queue-cli-disclosure-contract-test.ts
new file mode 100644
index 00000000..ee4bc362
--- /dev/null
+++ b/scripts/code-queue-cli-disclosure-contract-test.ts
@@ -0,0 +1,155 @@
+import { codexOutputQuery, codexTaskQuery } from "./src/code-queue";
+
+type JsonRecord = Record;
+
+function assertCondition(condition: unknown, message: string, detail: unknown = {}): void {
+  if (!condition) throw new Error(`${message}: ${JSON.stringify(detail)}`);
+}
+
+function asRecord(value: unknown): JsonRecord {
+  assertCondition(value !== null && typeof value === "object" && !Array.isArray(value), "expected object", { value });
+  return value as JsonRecord;
+}
+
+function asArray(value: unknown): unknown[] {
+  assertCondition(Array.isArray(value), "expected array", { value });
+  return value as unknown[];
+}
+
+function longText(marker: string, repeat = 220): string {
+  return Array.from({ length: repeat }, (_, index) => `${marker}-${String(index + 1).padStart(3, "0")} ${"abcdefghijklmnopqrstuvwxyz0123456789".repeat(3)}`).join("\n");
+}
+
+function detailFixture(path: string): JsonRecord {
+  assertCondition(path.includes("/summary"), "detail fixture should only fetch summary", { path });
+  return {
+    ok: true,
+    status: 200,
+    body: {
+      ok: true,
+      summary: {
+        id: "codex_disclosure_fixture",
+        queueId: "noise",
+        status: "failed",
+        providerId: "D601",
+        model: "gpt-5.5",
+        cwd: "/workspace",
+        prompt: longText("prompt-tail-marker"),
+        basePrompt: longText("base-tail-marker"),
+        transcriptCount: 300,
+        outputCount: 180,
+        eventCount: 40,
+        lastAssistantMessage: {
+          at: "2026-05-23T00:00:00.000Z",
+          seq: 300,
+          source: "assistant",
+          text: longText("assistant-tail-marker"),
+        },
+        toolSummary: {
+          count: 12,
+          returned: 8,
+          limit: 8,
+          truncated: true,
+          items: Array.from({ length: 8 }, (_, index) => ({
+            seq: index + 1,
+            at: "2026-05-23T00:00:00.000Z",
+            kind: "ran",
+            title: `tool-${index + 1}`,
+            status: "ok",
+            commandPreview: longText(`command-tail-marker-${index + 1}`, 20),
+            outputPreview: longText(`tool-output-tail-marker-${index + 1}`, 20),
+            rawSeqs: [index + 1],
+          })),
+        },
+        attempts: Array.from({ length: 8 }, (_, index) => ({
+          index: index + 1,
+          mode: index === 0 ? "initial" : "retry",
+          terminalStatus: "failed",
+          stderrTail: longText(`stderr-tail-marker-${index + 1}`, 20),
+          finalResponse: longText(`attempt-response-tail-marker-${index + 1}`, 30),
+          feedbackPromptPreview: longText(`feedback-tail-marker-${index + 1}`, 20),
+          runnerErrorClassification: { scope: "runner-local", globalBlocker: false },
+        })),
+      },
+    },
+  };
+}
+
+function outputFixture(path: string): JsonRecord {
+  assertCondition(path.includes("/output"), "output fixture should fetch output", { path });
+  assertCondition(path.includes("limit=20"), "default output should cap large requested limit to 20", { path });
+  assertCondition(path.includes("maxTextChars=500"), "default output should cap text preview chars", { path });
+  const output = Array.from({ length: 20 }, (_, index) => ({
+    seq: index + 101,
+    at: "2026-05-23T00:00:00.000Z",
+    channel: index % 2 === 0 ? "command" : "assistant",
+    method: "fixture",
+    text: longText(`raw-output-tail-marker-${index + 1}`, 40),
+  }));
+  return {
+    ok: true,
+    status: 200,
+    body: {
+      ok: true,
+      taskId: "codex_disclosure_fixture",
+      queueId: "noise",
+      status: "running",
+      updatedAt: "2026-05-23T00:00:00.000Z",
+      mode: "tail",
+      limit: 20,
+      total: 240,
+      maxSeq: 1200,
+      afterSeq: 0,
+      nextAfterSeq: 120,
+      previousBeforeSeq: 101,
+      hasMore: true,
+      hasBefore: true,
+      output,
+    },
+  };
+}
+
+export function runCodeQueueCliDisclosureContract(): JsonRecord {
+  const detail = codexTaskQuery("codex_disclosure_fixture", ["--detail"], detailFixture) as JsonRecord;
+  const detailJson = JSON.stringify(detail);
+  const summary = asRecord(detail.summary);
+  const attempts = asRecord(summary.attempts);
+  const attemptRecords = asArray(attempts.attemptRecords);
+  const toolSummary = asRecord(summary.toolSummary);
+  const toolItems = asArray(toolSummary.items);
+
+  assertCondition(attemptRecords.length === 3, "detail should cap attempt records by default", attempts);
+  assertCondition(attempts.attemptRecordCount === 8 && attempts.attemptRecordsTruncated === true, "detail should expose omitted attempt metadata", attempts);
+  assertCondition(detailJson.includes("detailOutputPolicy"), "detail should disclose progressive detail policy", summary);
+  assertCondition(!detailJson.includes("prompt-tail-marker-220"), "detail should not include full prompt tail by default", summary);
+  assertCondition(!detailJson.includes("assistant-tail-marker-220"), "detail should not include full assistant tail by default", summary);
+  assertCondition(!detailJson.includes("attempt-response-tail-marker-1-030"), "detail should not include full attempt response by default", attempts);
+  assertCondition(toolItems.length === 3 && toolSummary.truncated === true, "detail should cap tool summary rows by default", toolSummary);
+  assertCondition(!detailJson.includes("tool-output-tail-marker-1-020"), "detail should compact tool output previews", toolSummary);
+
+  const output = codexOutputQuery("codex_disclosure_fixture", ["--tail", "--limit", "120"], outputFixture) as JsonRecord;
+  const page = asRecord(output.outputPage);
+  const outputRows = asArray(page.output).map(asRecord);
+  const outputJson = JSON.stringify(output);
+  const disclosure = asRecord(page.disclosure);
+  assertCondition(page.requestedLimit === 120 && page.limit === 20 && page.returned === 20, "output should cap large requested limit by default", page);
+  assertCondition(disclosure.limitCapped === true && disclosure.fullText === false, "output should disclose capped default policy", disclosure);
+  assertCondition(outputRows.every((row) => row.textTruncated === true && Number(row.textChars) > String(row.text).length), "output rows should expose bounded text previews", page);
+  assertCondition(!outputJson.includes("raw-output-tail-marker-1-040"), "output should not include full raw text tail by default", page);
+  assertCondition(String(asRecord(page.commands).fullText).includes("--full-text"), "output should provide explicit full-text command", page);
+
+  return {
+    ok: true,
+    checks: [
+      "codex task --detail caps attempts and long text by default",
+      "codex output caps large requested limits by default",
+      "codex output preserves progressive full-text command",
+    ],
+    detailChars: detailJson.length,
+    outputChars: outputJson.length,
+  };
+}
+
+if (import.meta.main) {
+  process.stdout.write(`${JSON.stringify(runCodeQueueCliDisclosureContract(), null, 2)}\n`);
+}
diff --git a/scripts/code-queue-cli-steer-test.ts b/scripts/code-queue-cli-steer-test.ts
index bbbce638..89ee48c2 100644
--- a/scripts/code-queue-cli-steer-test.ts
+++ b/scripts/code-queue-cli-steer-test.ts
@@ -142,6 +142,10 @@ export function runCodeQueueCliSteerContract(): JsonRecord {
   assertCondition(fetchMethod === "POST", "non-dry-run should POST", { fetchMethod });
   assertCondition(fetchPrompt === "send this", "non-dry-run should send raw prompt in body", { fetchPrompt });
   assertCondition(nestedRecord(success, ["steer"]).accepted === true, "successful steer should report accepted=true", success);
+  const successJson = JSON.stringify(success);
+  assertCondition(nestedRecord(success, ["steer"]).promptOmitted === true, "successful steer should mark prompt omitted", success);
+  assertCondition(!successJson.includes("send this"), "successful steer must not echo prompt text", success);
+  assertCondition(!successJson.includes("promptPreview"), "successful steer must not include promptPreview", success);
   assertReason(codexSteerTaskForTest("direct_task", ["p"], () => ({ ok: false, exitCode: 1, stderrTail: "Cannot connect to the Docker daemon" })), "backend-core-unreachable", null);
   assertReason(codexSteerTaskForTest("direct_task", ["p"], () => ({ ok: false, status: 404, body: { ok: false, error: "microservice not found: code-queue" } })), "code-queue-microservice-unregistered", 404);
@@ -164,6 +168,7 @@ export function runCodeQueueCliSteerContract(): JsonRecord {
       "dry-run does not call stable proxy helper",
       "dry-run prompt preview is bounded",
       "non-dry-run uses stable proxy helper",
+      "successful steer confirms write without echoing prompt",
       "steer failure classification is JSON-consumable",
     ],
   };
diff --git a/scripts/code-queue-supervisor-disclosure-contract-test.ts b/scripts/code-queue-supervisor-disclosure-contract-test.ts
index 40fe870a..209ca6bc 100644
--- a/scripts/code-queue-supervisor-disclosure-contract-test.ts
+++ b/scripts/code-queue-supervisor-disclosure-contract-test.ts
@@ -203,8 +203,8 @@ export function runCodeQueueSupervisorDisclosureContract(): JsonRecord {
   assertCondition(supervisorBody.length < fullBody.length * 0.55, "supervisor output should be materially smaller than full output", { supervisorChars: supervisorBody.length, fullChars: fullBody.length });
   assertCondition(supervisorBody.length < 45_000, "supervisor output should remain bounded even with large diagnostics", { supervisorChars: supervisorBody.length });
-  assertCondition(recentItems.length === 5, "recentCompleted should be capped below --limit by default", { returned: recentItems.length });
-  assertCondition(asArray(completedUnread.items).length === 5, "completedUnread should be locally paged and kept separate from recentCompleted", completedUnread);
+  assertCondition(recentItems.length === 3, "recentCompleted should be capped below --limit by default", { returned: recentItems.length });
+  assertCondition(asArray(completedUnread.items).length === 3, "completedUnread should be locally paged and kept separate from recentCompleted", completedUnread);
   assertCondition(recentItems.every((item) => asRecord(item).unreadTerminal === false), "recentCompleted should not duplicate unread terminal tasks", { recentItems });
   assertCondition(diagnostics.databaseActiveTaskIds === undefined, "supervisor diagnostics should not expose verbose databaseActiveTaskIds by default", diagnostics);
   assertCondition(omittedCounts.databaseActiveTaskIds === 77, "diagnostic omitted counts should preserve full visibility metadata", omittedCounts);
@@ -224,12 +224,12 @@ export function runCodeQueueSupervisorDisclosureContract(): JsonRecord {
   assertCondition(asRecord(fullItem.promptPreview).chars !== undefined && fullItem.lastAssistantMessage !== undefined, "full view must retain detailed task row fields", fullItem);
   assertCondition(fullTasks.returned === 15, "full view must not inherit supervisor recentCompleted cap", fullTasks);
   const budget = asRecord(disclosure.outputBudget);
-  assertCondition(budget.recentCompletedReturnedLimit === 5 && budget.sectionReturnedLimit === 5, "supervisor must expose output budget metadata", disclosure);
-  assertCondition(asArray(runningFilteredSection.items).length === 5, "running status filter should be locally paged below --limit", runningFilteredSection);
+  assertCondition(budget.recentCompletedReturnedLimit === 3 && budget.sectionReturnedLimit === 3, "supervisor must expose output budget metadata", disclosure);
+  assertCondition(asArray(runningFilteredSection.items).length === 3, "running status filter should be locally paged below --limit", runningFilteredSection);
   assertCondition(runningFilteredSection.count === 40 && runningFilteredSection.hasMore === true, "running status filter should preserve count and hasMore", runningFilteredSection);
-  assertCondition(String(asRecord(runningFilteredSection.commands).next ?? "").includes("--before-id task-running-05"), "running status filter should provide next page command", runningFilteredSection);
+  assertCondition(String(asRecord(runningFilteredSection.commands).next ?? "").includes("--before-id task-running-03"), "running status filter should provide next page command", runningFilteredSection);
   assertCondition(runningFilteredBody.length < 14_000, "running status filter output should remain bounded", { chars: runningFilteredBody.length });
-  assertCondition(asArray(unreadFilteredSection.items).length <= 5, "unread list should be locally paged below --limit", unreadFilteredSection);
+  assertCondition(asArray(unreadFilteredSection.items).length <= 3, "unread list should be locally paged below --limit", unreadFilteredSection);
   assertCondition(unreadFilteredBody.length < 14_000, "unread output should remain bounded", { chars: unreadFilteredBody.length });
   return {
diff --git a/scripts/src/check.ts b/scripts/src/check.ts
index b7abc679..c8042f68 100644
--- a/scripts/src/check.ts
+++ b/scripts/src/check.ts
@@ -30,6 +30,8 @@ const syntaxFiles = [
   "scripts/host-codex-commander-no-daemon-smoke-contract-test.ts",
   "scripts/host-codex-commander-skeleton-contract-test.ts",
   "scripts/auth-broker-contract-test.ts",
+  "scripts/code-queue-cli-disclosure-contract-test.ts",
+  "scripts/code-queue-cli-steer-test.ts",
   "scripts/code-queue-cli-submit-prompt-contract-test.ts",
   "scripts/code-queue-supervisor-disclosure-contract-test.ts",
   "src/components/frontend/src/index.ts",
@@ -305,6 +307,8 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
     fileItem("scripts/code-queue-trace-summary-contract-test.ts"),
     fileItem("scripts/code-queue-pr-preflight-contract-test.ts"),
     fileItem("scripts/code-queue-runner-skills-contract-test.ts"),
+    fileItem("scripts/code-queue-cli-disclosure-contract-test.ts"),
+    fileItem("scripts/code-queue-cli-steer-test.ts"),
     fileItem("scripts/code-queue-submit-routing-contract-test.ts"),
     fileItem("scripts/code-queue-supervisor-disclosure-contract-test.ts"),
     fileItem("scripts/host-codex-commander-skeleton-contract-test.ts"),
@@ -342,6 +346,8 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
     items.push(commandItem("code-queue:trace-summary-contract", ["bun", "scripts/code-queue-trace-summary-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:pr-preflight-contract", ["bun", "scripts/code-queue-pr-preflight-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:runner-skills-contract", ["bun", "scripts/code-queue-runner-skills-contract-test.ts"], 30_000));
+    items.push(commandItem("code-queue:cli-disclosure-contract", ["bun", "scripts/code-queue-cli-disclosure-contract-test.ts"], 30_000));
+    items.push(commandItem("code-queue:cli-steer-contract", ["bun", "scripts/code-queue-cli-steer-test.ts"], 30_000));
     items.push(commandItem("code-queue:submit-prompt-contract", ["bun", "scripts/code-queue-cli-submit-prompt-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:submit-routing-contract", ["bun", "scripts/code-queue-submit-routing-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:supervisor-disclosure-contract", ["bun", "scripts/code-queue-supervisor-disclosure-contract-test.ts"], 30_000));
@@ -369,6 +375,8 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
     items.push(skippedItem("code-queue:trace-summary-contract", "Code Queue trace summary contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:pr-preflight-contract", "Code Queue PR preflight contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:runner-skills-contract", "Code Queue runner skill availability contract is opt-in with script checks", "--scripts-typecheck or --full"));
+    items.push(skippedItem("code-queue:cli-disclosure-contract", "Code Queue CLI disclosure/noise contract is opt-in with script checks", "--scripts-typecheck or --full"));
+    items.push(skippedItem("code-queue:cli-steer-contract", "Code Queue steer CLI contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:submit-prompt-contract", "Code Queue submit prompt contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:submit-routing-contract", "Code Queue submit routing contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:supervisor-disclosure-contract", "Code Queue supervisor disclosure contract is opt-in with script checks", "--scripts-typecheck or --full"));
diff --git a/scripts/src/code-queue.ts b/scripts/src/code-queue.ts
index 24a5ad8d..6ed22c89 100644
--- a/scripts/src/code-queue.ts
+++ b/scripts/src/code-queue.ts
@@ -5,22 +5,27 @@ import { coreInternalFetch } from "./microservices";
 import { previewJson } from "./preview";
 import { codeAgentPortForModel, codeModelPorts as sharedCodeModelPorts, defaultCodeModels as sharedDefaultCodeModels, opencodeModels as sharedOpencodeModels } from "../../src/components/microservices/code-queue/src/code-agent/common";
-const defaultToolLimit = 8;
+const defaultToolLimit = 3;
 const defaultTraceLimit = 80;
 const maxTraceLimit = 500;
 const defaultOutputLimit = 20;
-const defaultTextPreviewChars = 12_000;
+const defaultOutputPreviewChars = 500;
+const maxOutputPreviewChars = 1200;
 const defaultTasksLimit = 20;
 const defaultQueuesLimit
= 8;
 const maxTasksLimit = 100;
-const supervisorSectionReturnedLimit = 5;
-const supervisorRecentCompletedLimit = 5;
-const supervisorPromptPreviewChars = 90;
-const supervisorBodyPreviewChars = 90;
-const supervisorRecentBodyPreviewChars = 60;
+const supervisorSectionReturnedLimit = 3;
+const supervisorRecentCompletedLimit = 3;
+const supervisorPromptPreviewChars = 70;
+const supervisorBodyPreviewChars = 70;
+const supervisorRecentBodyPreviewChars = 50;
 const diagnosticsIdPreviewLimit = 3;
 const diagnosticsReasonPreviewLimit = 2;
 const steerPromptPreviewChars = 320;
+const detailAttemptReturnedLimit = 3;
+const detailInitialPromptPreviewChars = 1200;
+const detailBasePromptPreviewChars = 800;
+const detailLastAssistantPreviewChars = 1200;
 const minimaxSubmitModel = "minimax-m2.7";
 const deepseekSubmitModel = "deepseek-chat";
 const gptSubmitModel = "gpt-5.5";
@@ -42,6 +47,7 @@ interface CodexTaskOptions {
 }
 interface CodexOutputOptions {
+  requestedLimit: number;
   limit: number;
   mode: "tail" | "after" | "before";
   afterSeq: number;
@@ -654,13 +660,13 @@ function compactText(text: unknown, full: boolean, maxChars: number): Record {
+function compactLastAssistant(value: unknown, full: boolean, maxChars = 4000): Record {
   const record = asRecord(value) ?? {};
   return {
     at: record.at ?? null,
     seq: record.seq ?? null,
     source: record.source ?? "none",
-    ...textView(asString(record.text), full, 4000),
+    ...textView(asString(record.text), full, maxChars),
   };
 }
@@ -1255,9 +1261,11 @@ function supervisorExecutionDiagnostics(value: unknown): Record
   };
 }
-function compactToolSummary(value: unknown, full: boolean): Record {
+function compactToolSummary(value: unknown, full: boolean, limit = defaultToolLimit): Record {
   const record = asRecord(value) ?? {};
-  const items = asArray(record.items).map((item) => {
+  const allItems = asArray(record.items);
+  const sourceItems = full ?
allItems : allItems.slice(0, limit); + const items = sourceItems.map((item) => { const line = asRecord(item) ?? {}; return { seq: line.seq ?? null, @@ -1265,18 +1273,18 @@ function compactToolSummary(value: unknown, full: boolean): Record items.length), items, }; } @@ -1286,8 +1294,11 @@ function compactSummary(summary: unknown, options: CodexTaskOptions, taskId: str const transcriptCount = asNumber(record.transcriptCount, 0); const transcriptMaxSeq = transcriptCount > 0 ? record.transcriptMaxSeq ?? null : null; const initialPrompt = asString(record.initialPrompt ?? record.prompt); - const initialPromptView = textView(initialPrompt, options.full, 3000); - const basePromptView = textView(asString(record.basePrompt), options.full, 2000); + const initialPromptView = textView(initialPrompt, options.full, detailInitialPromptPreviewChars); + const basePromptView = textView(asString(record.basePrompt), options.full, detailBasePromptPreviewChars); + const attemptRecordsSource = asArray(record.attempts); + const attemptRecords = (options.full ? attemptRecordsSource : attemptRecordsSource.slice(0, detailAttemptReturnedLimit)) + .map((attempt) => compactAttemptCycle(attempt, options.full)); return { id: record.id ?? taskId, queueId: record.queueId ?? null, @@ -1304,7 +1315,10 @@ function compactSummary(summary: unknown, options: CodexTaskOptions, taskId: str currentMode: record.currentMode ?? null, judgeFailCount: record.judgeFailCount ?? null, judgeFailRetryLimit: record.judgeFailRetryLimit ?? null, - attemptRecords: asArray(record.attempts).map((attempt) => compactAttemptCycle(attempt, options.full)), + attemptRecordCount: attemptRecordsSource.length, + attemptRecordsReturned: attemptRecords.length, + attemptRecordsTruncated: !options.full && attemptRecordsSource.length > attemptRecords.length, + attemptRecords, }, thread: { codexThreadId: record.codexThreadId ?? 
null, @@ -1322,10 +1336,10 @@ function compactSummary(summary: unknown, options: CodexTaskOptions, taskId: str : { initialPromptPreview: initialPromptView, basePromptPreview: basePromptView }), referenceTaskIds: record.referenceTaskIds ?? [], referenceInjection: record.referenceInjection ?? null, - lastAssistantMessage: compactLastAssistant(record.lastAssistantMessage, options.full), + lastAssistantMessage: compactLastAssistant(record.lastAssistantMessage, options.full, detailLastAssistantPreviewChars), lastJudge: record.lastJudge ?? null, lastError: record.lastError ?? null, - toolSummary: compactToolSummary(record.toolSummary, options.full), + toolSummary: compactToolSummary(record.toolSummary, options.full, options.toolLimit), counts: { transcript: record.transcriptCount ?? null, output: record.outputCount ?? null, @@ -1336,6 +1350,7 @@ function compactSummary(summary: unknown, options: CodexTaskOptions, taskId: str renderer: "shared trace-summary/trace-steps progressive abstraction; CLI and WebUI diverge only at final rendering", total: record.transcriptCount ?? null, maxSeq: transcriptMaxSeq, + detailOutputPolicy: "bounded detail by default; use --full, --trace, --tool-limit, or codex output for progressive disclosure", defaultPage: `bun scripts/cli.ts codex task ${taskId} --trace --limit ${defaultTraceLimit}`, firstPage: `bun scripts/cli.ts codex task ${taskId} --trace --from-start --limit ${defaultTraceLimit}`, nextPageTemplate: `bun scripts/cli.ts codex task ${taskId} --trace --after-seq --limit ${defaultTraceLimit}`, @@ -1449,16 +1464,16 @@ function compactAttemptCycle(value: unknown, full: boolean): Record arg === "--after-seq" || arg === "--afterSeq") ? 
"after" : "tail"; + const fullText = hasFlag(args, "--full-text") || hasFlag(args, "--raw"); + const requestedLimit = positiveIntegerOption(args, ["--limit"], defaultOutputLimit, maxTraceLimit); return { - limit: positiveIntegerOption(args, ["--limit"], defaultOutputLimit, maxTraceLimit), + requestedLimit, + limit: fullText ? requestedLimit : Math.min(requestedLimit, defaultOutputLimit), mode, afterSeq, beforeSeq, - fullText: hasFlag(args, "--full-text") || hasFlag(args, "--raw"), - maxTextChars: positiveIntegerOption(args, ["--max-text-chars"], defaultTextPreviewChars, 500_000), + fullText, + maxTextChars: positiveIntegerOption(args, ["--max-text-chars"], defaultOutputPreviewChars, fullText ? 500_000 : maxOutputPreviewChars), }; } @@ -1706,8 +1724,23 @@ function codexTaskSummary(taskId: string, options: CodexTaskOptions, fetcher: Co return result; } -function compactOutputPage(body: Record, taskId: string, limit: number): Record { - const output = asArray(body.output); +function compactOutputRecord(item: unknown, options: CodexOutputOptions): Record { + const record = asRecord(item) ?? {}; + const text = textView(asString(record.text), options.fullText, options.maxTextChars); + return { + seq: record.seq ?? null, + at: record.at ?? null, + channel: record.channel ?? null, + method: record.method ?? null, + itemId: record.itemId ?? null, + text: text.text, + textChars: text.chars, + ...(text.truncated ? { textTruncated: true, textOmittedChars: text.omittedChars } : {}), + }; +} + +function compactOutputPage(body: Record, taskId: string, options: CodexOutputOptions): Record { + const output = asArray(body.output).map((item) => compactOutputRecord(item, options)); const nextAfterSeq = body.nextAfterSeq ?? null; const previousBeforeSeq = body.previousBeforeSeq ?? null; return { @@ -1716,7 +1749,8 @@ function compactOutputPage(body: Record, taskId: string, limit: status: body.status ?? null, updatedAt: body.updatedAt ?? null, mode: body.mode ?? 
null, - limit, + requestedLimit: options.requestedLimit, + limit: options.limit, returned: output.length, total: body.total ?? null, maxSeq: body.maxSeq ?? null, @@ -1726,13 +1760,21 @@ function compactOutputPage(body: Record, taskId: string, limit: previousBeforeSeq, hasMore: body.hasMore ?? false, hasBefore: body.hasBefore ?? false, + disclosure: { + defaultPolicy: "bounded output rows and bounded text previews; use --full-text with an explicit seq window only when raw text is required", + limitCapped: !options.fullText && options.limit < options.requestedLimit, + fullText: options.fullText, + textPreviewChars: options.maxTextChars, + requestedLimit: options.requestedLimit, + effectiveLimit: options.limit, + }, output, commands: { - next: body.hasMore === true && nextAfterSeq !== null ? `bun scripts/cli.ts codex output ${taskId} --after-seq ${nextAfterSeq} --limit ${limit}` : null, - previous: body.hasBefore === true && previousBeforeSeq !== null ? `bun scripts/cli.ts codex output ${taskId} --before-seq ${previousBeforeSeq} --limit ${limit}` : null, - tail: `bun scripts/cli.ts codex output ${taskId} --tail --limit ${limit}`, - first: `bun scripts/cli.ts codex output ${taskId} --from-start --limit ${limit}`, - fullText: `bun scripts/cli.ts codex output ${taskId} --after-seq --limit ${limit} --full-text`, + next: body.hasMore === true && nextAfterSeq !== null ? `bun scripts/cli.ts codex output ${taskId} --after-seq ${nextAfterSeq} --limit ${options.limit}` : null, + previous: body.hasBefore === true && previousBeforeSeq !== null ? 
`bun scripts/cli.ts codex output ${taskId} --before-seq ${previousBeforeSeq} --limit ${options.limit}` : null, + tail: `bun scripts/cli.ts codex output ${taskId} --tail --limit ${options.limit}`, + first: `bun scripts/cli.ts codex output ${taskId} --from-start --limit ${options.limit}`, + fullText: `bun scripts/cli.ts codex output ${taskId} --after-seq --limit ${Math.min(options.requestedLimit, defaultOutputLimit)} --full-text`, }, }; } @@ -1747,7 +1789,7 @@ function codexTaskOutput(taskId: string, options: CodexOutputOptions, fetcher: C if (options.mode === "after") params.afterSeq = options.afterSeq; if (options.mode === "before") params.beforeSeq = options.beforeSeq; const response = unwrapCodexResponse(fetcher(codeQueueProxyPath(`/api/tasks/${encodeURIComponent(taskId)}/output${queryString(params)}`))); - return { upstream: response.upstream, outputPage: compactOutputPage(response.body, taskId, options.limit) }; + return { upstream: response.upstream, outputPage: compactOutputPage(response.body, taskId, options) }; } function codexTaskJudge(taskId: string, options: CodexJudgeOptions, fetcher: CodexResponseFetcher): unknown { @@ -2389,7 +2431,7 @@ async function codexTaskOutputAsync(taskId: string, options: CodexOutputOptions, if (options.mode === "after") params.afterSeq = options.afterSeq; if (options.mode === "before") params.beforeSeq = options.beforeSeq; const response = unwrapCodexResponse(await fetcher(codeQueueProxyPath(`/api/tasks/${encodeURIComponent(taskId)}/output${queryString(params)}`))); - return { upstream: response.upstream, outputPage: compactOutputPage(response.body, taskId, options.limit) }; + return { upstream: response.upstream, outputPage: compactOutputPage(response.body, taskId, options) }; } async function codexTaskJudgeAsync(taskId: string, options: CodexJudgeOptions, fetcher: AsyncCodexResponseFetcher): Promise { @@ -3882,17 +3924,28 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe }; } return { + ok: 
true, upstream: response.upstream, steer: { accepted: true, - prompt, + taskId, + promptChars: options.prompt.length, + promptOmitted: true, + outputPolicy: { + default: "write-confirmation", + promptEchoed: false, + taskDetailEchoed: false, + reason: "codex steer is a write operation; default output confirms delivery and provides drill-down commands without echoing prompt text or full task state.", + }, }, - task: compactTaskMutationResponse(response.body.task), - queue: compactQueueMutationSummary(response.body.queue), + task: compactSubmitTaskConfirmation(response.body.task), + queue: compactSubmitQueueConfirmation(response.body.queue), commands: { show: `bun scripts/cli.ts codex task ${taskId}`, + detail: `bun scripts/cli.ts codex task ${taskId} --detail`, trace: `bun scripts/cli.ts codex task ${taskId} --trace --tail --limit ${defaultTraceLimit}`, output: `bun scripts/cli.ts codex output ${taskId} --tail --limit ${defaultOutputLimit}`, + supervisor: `bun scripts/cli.ts codex tasks --view supervisor --limit ${defaultTasksLimit}`, }, }; } diff --git a/scripts/src/help.ts b/scripts/src/help.ts index dea7d08a..ce2d0b2e 100644 --- a/scripts/src/help.ts +++ b/scripts/src/help.ts @@ -54,13 +54,13 @@ export function rootHelp(): unknown { { command: "codex deploy [--provider-id D601] [--timeout-ms N]", description: "Disabled legacy Code Queue deploy path; use the dev-only artifact consumer instead." }, { command: "codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue queueId] [--provider-id id] [--cwd path] [--model model] [--execution-mode mode] [--max-attempts N] [--reference-task-id id] [--dry-run]", description: "Submit a Code Queue task through backend-core -> code-queue proxy; --dry-run shows the structured request, while real success only confirms the write and task id." 
}, { command: "codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/] [--pr-create-dry-run --pr-create-dry-run-head ] [--issue N]", description: "Read-only PR admission check against the D601 scheduler/runner token, GitHub egress, repo visibility, optional push dry-run, and PR body/create dry-run guard." }, - { command: "codex task [--detail] [--trace --tail|--from-start|--after-seq N|--before-seq N --limit N] [--full]", description: "Fetch the bounded review view by default: original prompt, final response, and drill-down commands; detail and trace are opt-in." }, - { command: "codex tasks [--view supervisor|full] [--queue id] [--status status[,status]] [--unread|--unread-only] [--limit N] [--before-id id]", description: "Show the low-noise supervisor view by default: compact task rows, capped recent completions, diagnostics, and drill-down commands; use --view full for detailed rows." }, - { command: "codex output [--tail|--from-start|--after-seq N|--before-seq N --limit N] [--full-text]", description: "Fetch paged raw Code Queue output records by seq when a trace row has omitted command/output text." }, + { command: "codex task [--detail] [--trace --tail|--from-start|--after-seq N|--before-seq N --limit N] [--full]", description: "Fetch the bounded review view by default; --detail is still capped, while --full/trace/output explicitly expand evidence." }, + { command: "codex tasks [--view supervisor|full] [--queue id] [--status status[,status]] [--unread|--unread-only] [--limit N] [--before-id id]", description: "Show the low-noise supervisor view by default: compact task rows, tiny local sections, diagnostics, and drill-down commands; use --view full for detailed rows." }, + { command: "codex output [--tail|--from-start|--after-seq N|--before-seq N --limit N] [--full-text]", description: "Fetch paged raw Code Queue output records; default caps large limits/text previews, --full-text explicitly expands one seq window." 
}, { command: "codex read ", description: "Mark one reviewed terminal task read; never run automatically as part of listing." }, { command: "codex dev-ready", description: "Fetch execution-container readiness, including sanitized skill injection status from /api/dev-ready." }, { command: "codex judge --attempt N [--dry-run] [--include-prompt]", description: "Replay one stored Code Queue attempt through the same judge context builder and MiniMax judge call path used by the live queue worker." }, - { command: "codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run]", description: "Push a bounded corrective prompt into a running Code Queue task through the stable private proxy path." }, + { command: "codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run]", description: "Push a corrective prompt into a running Code Queue task; real success only confirms the write and does not echo prompt text." }, { command: "codex interrupt|cancel ", description: "Request interrupt for a running Code Queue task, or cancel a queued/retry_wait task, through the same private proxy." }, { command: "codex (queues [--full|--all] | queue create | queue merge --into | move --queue )", description: "List low-noise queue summaries by default; full queue rows require --full/--all." }, { command: "job list [--limit N] [--include-command]", description: "List async jobs from .state/jobs with a bounded default page." }, @@ -278,7 +278,11 @@ function codexHelp(): unknown { file: "bun scripts/cli.ts codex submit --prompt-file /tmp/code-queue-prompt.md --queue --dry-run", dryRunThenSubmit: "Run with --dry-run first; remove --dry-run to submit exactly the same payload.", }, - description: "Operate Code Queue through the stable backend-core private proxy path. 
Real submit success is a low-noise write confirmation and does not echo prompt text.",
+    disclosure: {
+      defaultPolicy: "low-noise JSON by default; write commands confirm persistence, list/detail/output commands return bounded summaries with drill-down commands",
+      expand: ["codex task --full", "codex task --trace --limit N", "codex output --after-seq N --limit N --full-text", "codex tasks --view full --limit N"],
+    },
+    description: "Operate Code Queue through the stable backend-core private proxy path. Real submit/steer success is a low-noise write confirmation and does not echo prompt text.",
   };
 }
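
Reviewer note: the `codex output` hunks above cap the row limit and the text preview unless `--full-text` is passed. The following is a minimal standalone sketch of that rule, not the repository code. The constant values are taken from the patch, but `clampPositive`, `resolveOutputWindow`, and `fullTextMaxChars` are hypothetical names, and the assumption that out-of-range values are clamped rather than rejected is a guess about what `positiveIntegerOption` does.

```typescript
// Values copied from the patch; everything else in this sketch is illustrative.
const defaultOutputLimit = 20;
const maxTraceLimit = 500;
const defaultOutputPreviewChars = 500;
const maxOutputPreviewChars = 1200;
const fullTextMaxChars = 500_000; // hypothetical name for the 500_000 literal in the patch

interface OutputWindow {
  requestedLimit: number; // what the caller asked for (after the global cap)
  limit: number;          // what is actually sent upstream
  maxTextChars: number;   // per-record text preview budget
  fullText: boolean;
  limitCapped: boolean;   // true when the default policy shrank the request
}

// Assumption: missing or invalid values fall back to the default, large values are clamped.
function clampPositive(value: number | undefined, fallback: number, max: number): number {
  if (value === undefined || !Number.isInteger(value) || value <= 0) return fallback;
  return Math.min(value, max);
}

function resolveOutputWindow(
  limitArg: number | undefined,
  maxTextCharsArg: number | undefined,
  fullText: boolean,
): OutputWindow {
  const requestedLimit = clampPositive(limitArg, defaultOutputLimit, maxTraceLimit);
  // Without --full-text, a large --limit is silently reduced to the default page size.
  const limit = fullText ? requestedLimit : Math.min(requestedLimit, defaultOutputLimit);
  // The preview budget only opens up to the large ceiling when --full-text is set.
  const maxTextChars = clampPositive(
    maxTextCharsArg,
    defaultOutputPreviewChars,
    fullText ? fullTextMaxChars : maxOutputPreviewChars,
  );
  return { requestedLimit, limit, maxTextChars, fullText, limitCapped: !fullText && limit < requestedLimit };
}
```

Under these assumptions, `resolveOutputWindow(200, undefined, false)` keeps `requestedLimit` at 200 but sends `limit` 20 with `limitCapped` set, which matches the `disclosure` block the patch adds to the output page.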