diff --git a/docs/reference/cli.md b/docs/reference/cli.md index 36a8c973..4dfbf6fa 100644 --- a/docs/reference/cli.md +++ b/docs/reference/cli.md @@ -46,6 +46,7 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI - `schedule list|get|runs|run|retry-run|delete|upsert-pgdata-backup` 管理 backend-core 定时任务和运行历史。`schedule list`、`schedule get`、`schedule runs --limit N` 和 `schedule runs --limit N` 是只读观察入口;`schedule run`、`schedule retry-run`、`schedule delete` 和 `schedule upsert-pgdata-backup` 会触发运行或写入配置,生产恢复时必须有明确授权。`schedule runs --limit N` 是全局历史视图,返回 `scope=global` 和 `scheduleId=null`;`schedule runs --limit N` 是指定 schedule 历史视图,返回 `scope=schedule` 和对应 `scheduleId`。CLI 必须拒绝 `schedule runs 50` 这类纯数字位置参数,并提示使用 `schedule runs --limit 50`,避免把空数组误判成“没有历史 run”。`schedule run --wait-ms N` 触发同一 schedule,并且即使 wait 超时也必须返回 `newRunId` 和 `observeCommand`;`schedule retry-run ` 只接受 failed run,从原 run 反查 `scheduleId` 后重触发同一 schedule,并输出 `originalRunId`、`scheduleId`、`newRunId` 和 `observeCommand`。当 backend-core 目标容器缺失或只观察到 verify-only 容器时,schedule/microservice 命令必须以非零退出并返回 `failureKind=target-stack-not-running`、`runnerDisposition=infra-blocked`、`readOnlyCommands` 和 `authorizationRequiredForRecovery`,不得把 Docker 的 `No such container` 当成成功的空历史。 - `codex deploy ` 是旧 Code Queue 兼容部署入口,已禁用以防止维护通道直连 D601 部署 Code Queue;当前 dev 自动化只做 `ci run-dev-e2e` smoke,不提供 Code Queue CD,详细规则见 `docs/reference/codex-deploy.md`。 - `codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue queueId] [--provider-id id] [--cwd path] [--model model] [--reasoning-effort effort] [--execution-mode mode] [--max-attempts N] [--reference-task-id id] [--dry-run]` 通过 backend-core 私有代理向稳定 `code-queue` 用户服务路径提交任务;prompt 必须且只能来自位置参数、文件或 stdin 之一,`--dry-run` 只返回结构化请求且不实际入队。长 prompt、多行 prompt、含引号/反引号/Markdown 表格/JSON/反斜杠的 prompt 必须优先用 `--prompt-stdin` 或 `--prompt-file`,不要拼进 shell 单个参数;位置参数只适合短单行 smoke prompt。stdin 推荐用 quoted heredoc:`cat <<'PROMPT' | bun scripts/cli.ts codex submit --prompt-stdin --queue --dry-run`,文件路径推荐 `bun scripts/cli.ts codex submit --prompt-file /tmp/code-queue-prompt.md --queue --dry-run`,确认 dry-run 后移除 `--dry-run` 提交同一 payload。dry-run 会额外输出 `routingRecommendation`,包含推荐 route、runner、model、风险信号、prompt 自包含/issue 非唯一来源/prod-secret-DB 禁止/运行态或 release 禁止/证据要求/中等复杂度候选等 guard 状态;同时输出 `policyContract`,固定暴露 GPT-5.5、DeepSeek、MiniMax 的风险分层、并发上限和外部 provider 429 退避处置。该建议只用于指挥官 preflight,不会改写 payload,不改变 runtime admission,也不假设生产 MiniMax 或 DeepSeek 可用。`--dry-run` 必须返回完整 prompt、字符数和 `truncated=false` 用于人工验收;真实提交是写入操作,默认只返回 `accepted=true`、task id、队列、写入保护摘要和后续查看命令,必须标记 `promptOmitted=true` 且不得回显 prompt 或 promptPreview。真实提交会经过本机本地串行化保护和短节流,避免同一指挥端并发 submit 把低内存主机或 `code-queue-mgr` 控制面打抖;返回值会附带 `executionMode`、`runnerPermissions` 和低噪声 `submitConcurrencyGuard`,显式说明 requested/effective mode、服务级 runner sandbox/approvalPolicy、锁与等待信息。`--execution-mode` 是 Code Queue runtime placement,不是 Codex sandbox 权限;有效模式是 `default` 和 `windows-native`,`--execution-mode full-access` 等 sandbox-like 值会保留 requested 值并显示 effective `default`,同时提示当前不支持每任务 sandbox override。真实提交的 `queue` 摘要保持低噪声:`submittedTaskIds`、`queuedTaskIds`、`activeTaskIds` 和 `databaseActiveTaskIds` 是带 `items/count/returned/omitted/truncated/source` 的有界预览对象,`queuedTaskIds.items` 必须包含本次新入队的 queued/retry_wait 任务,`countContext` 与 `counts` 是权威计数;当预览被省略或截断时,`listPreviewPolicy` 必须写明 omitted counts 和 raw 查看命令。backend-core 默认把提交、队列 CRUD、已读状态、历史摘要和轻量 Trace 读取分流到主 server `code-queue-mgr`,由它写入主 PostgreSQL;D601 scheduler 只轮询并执行已入库任务。 +- `codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N] [--full|--raw]` 向运行中的 Code Queue 任务发送纠偏 prompt。真实成功只返回低噪声写入确认,不回显 prompt 或完整任务状态;失败默认只返回 `accepted=false`、原因、scope、retryable、attempt 摘要、operator guidance 和 task/read/submit/health drill-down 命令。`upstreamBodyPreview`、request 元数据和 raw upstream failure 必须显式加 `--full` 或 `--raw` 才输出。任务已终态时返回紧凑 `task-already-terminal`、状态、终态状态、更新时间、`retryable=false` 和 `codex task` / `codex read` / `codex submit --reference-task-id ` 后续命令。 - `codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/] [--pr-create-dry-run --pr-create-dry-run-head ] [--issue N] [--full|--raw]` 通过稳定 `code-queue` proxy 请求 D601 scheduler `/api/runtime-preflight`,用于 PR 型派单 admission。默认输出是紧凑 commander 视图,显式分出 `schedulerPreflight` 与 `activeRunnerPrCapability`,并附带 `commands` 和 `disclosure`,方便先看 scheduler auth 缺口、再看当前 runner/dev container 的 `gh auth status` 与 `gh pr create --dry-run` 能力;`--full` 或 `--raw` 才展开完整 `preflight`、工具、agent port、Git worktree、GitHub egress、repo/issue/PR 只读探测和观测原文。只报告 `GH_TOKEN`/`GITHUB_TOKEN` 是否存在和来源 key,不打印值。当 auth-broker 配置存在时,`tokenCoverage.source="auth-broker"`、`credentialSource="broker-issued-token"` 且 runner env token 不是成功前提;当仅 env token 存在时,`credentialSource="env-token"` 且 `authBroker.nextAction="use-env-token-until-auth-broker-live"`;两者都缺失时顶层 `ok=false`、`runnerDisposition=infra-blocked`、`degradedReason=auth-broker-needed`,`tokenCoverage.missing` 同时列出 `GH_TOKEN` 与 `GITHUB_TOKEN`,并输出 `authBroker.source="broker/auth-broker-needed"`、`capability.source="missing-token"`。该 `auth-missing` 的 scope 是 `scheduler-runner-env`,不能简化成“当前 active runner/dev container 不能创建 PR”;默认视图必须带 `scopeBoundary` 和 `activeRunnerPrCapability`。GitHub DNS/API 连接失败应归类为 `failureKind=github-transient`、`degradedReason=github-dns-api-transient`,并带 `retryable=true`、`commanderAction=retry-backoff-or-keep-running-if-heartbeat-fresh` 和有界 `githubTransient.failedProbes`;调用方应重试/退避,且在任务 heartbeat/trace 新鲜时继续监督,不把它当成 auth 缺失或 PR 语义失败。`prCapability` 是 runner-facing 合同摘要,必须包含目标分支、token/auth 来源、`systemGhBinaryRequiredForWrites=false`、UniDesk REST `bun scripts/cli.ts gh` 可用性、push dry-run/PR create dry-run 的 `writesRemote=false`、expected PR handoff、真实 PR 创建需要 commander 授权和 `gh pr merge` 的 `unsupported-command` 边界;系统 `gh` binary 缺失只进入 `tools.systemGhBinary`,不得误判为 UniDesk REST `gh` CLI 不可用。`--remote` 在 runner-like 环境里不再依赖本地 `unidesk-backend-core`、`unidesk-database`、`baidu-netdisk-backend` 容器存在;这些缺失只作为本地观测证据。若远程控制面可达,则继续走远程控制面结果;若远程控制面不可达,则结构化返回 `failureKind=control-plane-missing` / `degradedReason=remote-control-plane-unreachable`,而不是把本地 `backend-core-container-missing` 当作最终阻塞。`--pr-create-dry-run` 不 POST GitHub,只证明 runner 内 PR body 生成、`scripts/cli.ts gh pr create --dry-run` 和 branch 参数形态可用;服务端创建权限仍以 token/auth broker、repo/issue/PR read、push dry-run 和最终授权后的真实 PR 创建结果为准。 - `codex task ` 通过 Code Queue 私有代理按任务 ID 查询结构化审阅摘要;默认只返回任务身份、执行 Provider、工作目录、attempt 计数、原始 prompt、最终 response、最后错误和渐进披露命令,适合指挥官审阅完成未读任务且避免上下文爆炸。`--detail` 仍是有界详细摘要:默认只返回少量 attempt/tool 行、短 prompt/response/stderr/feedback 预览和 omitted/truncated 元数据;需要完整 prompt/response 文本或更多 tool/attempt 细节时再显式加 `--full`、`--tool-limit N`、`--trace` 或 `codex output`。该摘要读取默认由主 server `code-queue-mgr` 从 PostgreSQL 返回,不依赖 D601 `code-queue-read` Service 可用。 - `codex tasks [--view supervisor|full] [--queue id] [--status succeeded|running|queued|failed|canceled|judging|retry_wait[,..]] [--unread|--unread-only] [--limit N] [--before-id id]` 通过同一私有代理输出渐进式披露视图。默认 `supervisor` 是低噪声指挥官视图,只返回 `activeRunning`、`running`、`completedUnread`、`recentCompleted`、`queued`、`activity`、`commanderConcurrency` 和 `executionDiagnostics` 的紧凑行;`activeRunning.count` 是 running+judging 的状态计数,`exact=true` 时来自 queue summary counts,`running.returned` 和 `activeRunning.rowPage.returned` 只是本次返回的紧凑行数。`commanderConcurrency.activeRunnerCount` 是并发策略应使用的 active/running 计数,等于 `activity.effectiveActiveTaskCount`;15 并发策略按 `15 - activeRunnerCount` 计算剩余窗口。`commanderConcurrency.splitBrainDisposition=live-count-as-active` 表示 split-brain 有 fresh heartbeat 证据,应继续监督并计入 active;`interventionRequired=true` 才提示介入。prompt/body 只给短预览和原始字符数,`running`/`completedUnread`/`queued` 默认只返回一个有界小页并通过 section `commands.next` 继续分页,`recentCompleted` 默认限量且不重复 `completedUnread` 未读终态,不嵌入完整 Trace、final response 或全量 overview。`--limit` 在 supervisor 中主要是扫描/分页预算,不是返回几十条肥行的开关;CLI 安全上限是 100,输出会在 `filters.requestedLimit`、`filters.effectiveLimit`、`filters.limitCapped` 和 `disclosure.limitPolicy` 说明显式请求是否被 capped;底层 overview 拉取预算独立显示在 `source.requestedLimit` / `source.effectiveLimit`,所以 `--limit 260` 应显示 requested=260、effective=100、source requested/effective=200,而不是只露出一个含糊的 `limit`。`--unread` 是 `--unread-only` 的别名,必须只保留未读终态;`--status` 必须真实过滤支持的状态,未知参数或未知状态必须结构化失败。需要更详细当前页任务行时显式使用 `--view full` 或 `--full`,仍受 `--limit` 和 `--before-id` 分页约束。 @@ -55,8 +56,8 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI - `codex read ` 在人工审阅后标记单个终态任务已读,并在同一次响应中返回稳定任务身份、执行元数据、终态 attempt 摘要、最后错误或 judge 信息和最终 response,避免标记已读后还要额外 drill-down 才能确认结果。该命令不返回完整 prompt、tool logs 或 feedback prompt,只返回字符数、计数和 `codex task/detail/trace/output` 渐进披露命令;列表、overview 和 supervisor 视图只返回这个命令字段,不得自动执行,也不得批量清空未读状态。 - `codex dev-ready` 查询 Code Queue `/api/dev-ready` 并返回有界 readiness 摘要,包括工具、Docker、Codex config、SSH 和 `devReady.skills`。`devReady.skills` 只暴露 `UNIDESK_SKILLS_PATH`、是否存在、是否只读、skillCount、`cli-spec` 是否可见和修复建议,不输出宿主 auth/token 文件内容。 - `codex judge --attempt N [--dry-run] [--include-prompt]` 通过 Code Queue 私有代理按指定 attempt 单步复现 judge;这是执行面诊断入口,仍依赖 D601 scheduler/runner 侧的真实 judge builder、MiniMax 调用路径和执行环境。默认会真实调用 MiniMax,`--dry-run` 只返回 prompt/payload 大小、attempt 窗口和重建来源诊断,`--include-prompt` 仅用于本地深度排查。 -- `codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N]` 通过 Code Queue 私有代理向正在运行的 task 注入纠偏提示,正式替代底层 `microservice proxy code-queue /api/tasks//steer` 调用。prompt 必须且只能来自位置参数、文件或 stdin 之一;`--dry-run` 只输出 `method`、`path`、`stableProxyPath`、retry policy、prompt 字符数、截断预览和 raw proxy 等价命令,不触碰运行中 session,也不得泄露超长 prompt 全文。真实执行是写入操作,成功只返回 `accepted=true`、task id、prompt 字符数、`promptOmitted=true`、有界 task/queue 确认、attempt summary 和后续查看命令,不回显 prompt 或完整 task state;路径固定为 `/api/microservices/code-queue/proxy/api/tasks//steer`,只能作用于 D601 scheduler 上存在 active steerable turn 的 running task。默认对 `stable-proxy-failed` 和 `backend-core-unreachable` 这类 retryable control-plane failures 做一次有界重试;`--retry-attempts N` 最大为 3,`--retry-delay-ms N` 最大为 5000,`--no-retry` 用于复现单次失败。 -- `codex steer` 非 dry-run 失败仍输出 JSON 且退出非零;`.data.diagnostics.reason` 用于 runner 分流,当前包括 `backend-core-unreachable`、`code-queue-microservice-unregistered`、`proxy-unauthorized`、`proxy-404`、`steer-endpoint-404`、`upstream-runtime-rejected`、`stable-proxy-failed` 和 `invalid-proxy-response`。`scope` 区分 `backend-core`、`stable-proxy`、`code-queue-runtime` 或 `unknown`,并带 `status`、`exitCode`、`retryable`、有界 `upstreamBodyPreview`、`attempts`、`retryPolicy` 和推荐交叉验证命令;若任务不在 running/active-turn 状态,通常归类为 `upstream-runtime-rejected`,不得静默成功。`502 provider HTTP tunnel failed`、`provider-gateway-http-fetch`、`The operation was aborted` 或约 30 秒 tunnel wait abort 会归类为 `stable-proxy-failed`,CLI 会先按 retry policy 重试;如果仍失败,`.data.diagnostics.operatorGuidance.rawProxyEquivalentIsFallback=false` 表示 raw proxy 等价命令走同一条 tunnel,只能用于对照诊断,不应被当作更低噪声 fallback。此时 `.data.steer.deliveryUnconfirmed=true`,指挥官应先看 `codex tasks --view supervisor`、`codex task ` 和 `microservice health code-queue`,再从主 server CLI 或显式 SSH transport 重试同一个 `codex steer`。若 D601 返回的 409 已包含 terminal task state,CLI 默认改为紧凑终态响应:`reason=task-already-terminal`、task status、terminal status、`updatedAt`/`finishedAt`、`retryable=false`,并只给出 `codex task `、`codex read ` 和 `codex submit --prompt-file --reference-task-id ` 后续命令,不回显 steer prompt、完整 request body 或大 task object。 +- `codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N] [--full|--raw]` 通过 Code Queue 私有代理向正在运行的 task 注入纠偏提示,正式替代底层 `microservice proxy code-queue /api/tasks//steer` 调用。prompt 必须且只能来自位置参数、文件或 stdin 之一;`--dry-run` 只输出 `method`、`path`、`stableProxyPath`、retry policy、prompt 字符数、截断预览和 raw proxy 等价命令,不触碰运行中 session,也不得泄露超长 prompt 全文。真实执行是写入操作,成功只返回 `accepted=true`、task id、prompt 字符数、`promptOmitted=true`、有界 task/queue 确认、attempt summary 和后续查看命令,不回显 prompt 或完整 task state;路径固定为 `/api/microservices/code-queue/proxy/api/tasks//steer`,只能作用于 D601 scheduler 上存在 active steerable turn 的 running task。默认对 `stable-proxy-failed` 和 `backend-core-unreachable` 这类 retryable control-plane failures 做一次有界重试;`--retry-attempts N` 最大为 3,`--retry-delay-ms N` 最大为 5000,`--no-retry` 用于复现单次失败。 +- `codex steer` 非 dry-run 失败仍输出 JSON 且退出非零;`.data.diagnostics.reason` 用于 runner 分流,当前包括 `backend-core-unreachable`、`code-queue-microservice-unregistered`、`proxy-unauthorized`、`proxy-404`、`steer-endpoint-404`、`upstream-runtime-rejected`、`stable-proxy-failed` 和 `invalid-proxy-response`。`scope` 区分 `backend-core`、`stable-proxy`、`code-queue-runtime` 或 `unknown`,默认带 `status`、`exitCode`、`retryable`、`attempts`、`retryPolicy`、operator guidance 和推荐交叉验证命令,但不回显 steer prompt、不带 request body、不带 `upstreamBodyPreview`,也不带 raw upstream response;这些详细字段必须显式加 `--full` 或 `--raw` 才展开。若任务不在 running/active-turn 状态,通常归类为 `upstream-runtime-rejected`,不得静默成功。`502 provider HTTP tunnel failed`、`provider-gateway-http-fetch`、`The operation was aborted` 或约 30 秒 tunnel wait abort 会归类为 `stable-proxy-failed`,CLI 会先按 retry policy 重试;如果仍失败,`.data.diagnostics.operatorGuidance.rawProxyEquivalentIsFallback=false` 表示 raw proxy 等价命令走同一条 tunnel,只能用于对照诊断,不应被当作更低噪声 fallback。此时 `.data.steer.deliveryUnconfirmed=true`,指挥官应先看 `codex tasks --view supervisor`、`codex task ` 和 `microservice health code-queue`,再从主 server CLI 或显式 SSH transport 重试同一个 `codex steer`。若 D601 返回的 409 已包含 terminal task state,CLI 默认改为紧凑终态响应:`reason=task-already-terminal`、task status、terminal status、`updatedAt`/`finishedAt`、`retryable=false`,并只给出 `codex task `、`codex read ` 和 `codex submit --prompt-file --reference-task-id ` 后续命令,不回显 steer prompt、完整 request body 或大 task object。 - `codex interrupt|cancel ` 通过 Code Queue 私有代理请求中断;running/judging 任务会请求 D601 当前 agent run 停止,queued/retry_wait 任务的取消也必须保持与 WebUI 相同代理路径,返回有界 task 摘要和后续查询命令。任何需要接触 active run 的动作仍属于 D601 执行面。 - Code Queue 多队列 lane 由 `codex` 命令命名空间管理:`queues [--full|--all] [--limit N] [--page N|--offset N]` 列表、`queue create ` 创建、`queue merge --into ` 合并、`move --queue ` 迁移;这些队列管理入口默认由主 server `code-queue-mgr` 直管 PostgreSQL,仍通过稳定 `code-queue` 用户服务代理路径访问。`codex queues` 默认只返回 active/nonempty/unread/runnable queue 摘要、activity、commanderConcurrency、全局 counts 和 execution diagnostics;`--full` 或 `--all` 只切换为完整队列行视图的一页,仍受 `--limit`/`--page`/`--offset` 分页约束,不再默认携带 deprecated full array。summary 和 full 的稳定机读路径都是 `.data.queues.items[]`,全局元数据固定在 `.data.queues.commanderConcurrency`、`.data.queues.activity`、`.data.queues.counts`、`.data.queues.executionDiagnostics`、`.data.queues.activeTaskIds` 和 `.data.queues.queuedTaskIds`;需要完整 upstream 时使用输出中的 raw command。`commanderConcurrency.activeRunnerCount` / `activity.effectiveActiveTaskCount` 是指挥官并发判断的有效活跃数,`schedulerLocalActiveQueueCount`/`activeQueueIds` 只描述本地 scheduler active-run slots,不能覆盖数据库 running 计数或 heartbeat-fresh runner 计数。旧 full 顶层数组语义已作为 deprecated 兼容信息记录,不再作为 `.data.queues` 主形态。同一个 queue 内部串行执行,不同 queue 之间并行执行。迁移只允许尚未被 scheduler claim 的 `queued`/`retry_wait` 任务,必须满足 `startedAt=null`、`currentAttempt=0` 且没有 active thread/turn;已进入 `running`/`judging` 或已有 claim 标记的任务返回 409,不得被 move/merge 回写成 queued。合并会移动可迁移任务归属并自动删除源 queue 记录,只保留合并后的目标 queue;若 source 或 target queue 存在 active/claimed 任务,合并整体返回 409。合并后的目标 queue 按任务原 `queueEnteredAt`/`createdAt` 时间顺序串行,成功迁移 queued/retry_wait 任务后由 D601 scheduler 轮询推进。 - 所有 `codex` 查询和管理命令必须走与 WebUI 相同的 backend-core 私有代理路径 `/api/microservices/code-queue/proxy/...`;CLI 不得为了提交、移动、中断、取消或队列管理直接调用 D601 内部 Service、数据库、pod curl 或 k3sctl scheduler 子服务。若该路径失败,应先修复 CLI/backend/provider tunnel 链路,而不是绕过控制面。 diff --git a/docs/reference/code-queue-supervision.md b/docs/reference/code-queue-supervision.md index b512a378..df5c615b 100644 --- a/docs/reference/code-queue-supervision.md +++ b/docs/reference/code-queue-supervision.md @@ -337,7 +337,7 @@ D601 artifact registry 的 systemd unit inactive 不等于 D601 全局离线。 只有存在明确理由时才干预。 - 如果任务还在运行且 trace 或 scheduler heartbeat 新鲜,应引导而不是 interrupt。 -- 对运行中任务的引导应优先使用正式 CLI:`bun scripts/cli.ts codex steer --prompt-file `。该命令和 `codex task/tasks/read` 复用同一个 backend-core stable proxy helper;`--dry-run` 会显示 `method/path/stableProxyPath`、retry policy、prompt 摘要和 raw proxy 等价命令但不发送。非 dry-run 默认对 `stable-proxy-failed` 和 `backend-core-unreachable` 做一次有界重试,失败时先看 `.data.diagnostics.reason`、`.data.diagnostics.attempts` 和 `.data.diagnostics.operatorGuidance`:`backend-core-unreachable` 属于本机到 core 的观察路径,`code-queue-microservice-unregistered`/`proxy-unauthorized`/`proxy-404` 属于 stable proxy 配置或权限,`steer-endpoint-404`/`upstream-runtime-rejected` 属于 D601 runtime 或任务状态,`stable-proxy-failed` 多为 provider/k3s/tunnel 链路问题。`502 provider HTTP tunnel failed`、`The operation was aborted`、约 30 秒 provider tunnel wait abort 仍失败时,`.data.steer.deliveryUnconfirmed=true`;指挥官应先用 `codex tasks --view supervisor --limit 20`、`codex task ` 和 `microservice health code-queue` 交叉确认任务活性,再从主 server CLI 或显式 SSH transport 重试同一个 steer。raw proxy 等价命令走同一条 tunnel,`rawProxyEquivalentIsFallback=false`,只能做诊断对照,不应作为正式 fallback。若任务已终态,CLI 返回紧凑 `task-already-terminal` 响应并给出 `bun scripts/cli.ts codex task `、`bun scripts/cli.ts codex read ` 和 `bun scripts/cli.ts codex submit --prompt-file --reference-task-id `;指挥官应提交 follow-up task,而不是继续 steer 终态任务。若正式 CLI 自身不可用,临时通过受控 microservice proxy 调用只能作为现场恢复手段;这类绕行必须记录到指挥简报 issue #24 主体的常驻观察,并创建正式 issue 补齐 CLI 能力,避免长期依赖隐式 API。 +- 对运行中任务的引导应优先使用正式 CLI:`bun scripts/cli.ts codex steer --prompt-file `。该命令和 `codex task/tasks/read` 复用同一个 backend-core stable proxy helper;`--dry-run` 会显示 `method/path/stableProxyPath`、retry policy、prompt 摘要和 raw proxy 等价命令但不发送。非 dry-run 默认对 `stable-proxy-failed` 和 `backend-core-unreachable` 做一次有界重试,失败时先看 `.data.diagnostics.reason`、`.data.diagnostics.attempts` 和 `.data.diagnostics.operatorGuidance`:`backend-core-unreachable` 属于本机到 core 的观察路径,`code-queue-microservice-unregistered`/`proxy-unauthorized`/`proxy-404` 属于 stable proxy 配置或权限,`steer-endpoint-404`/`upstream-runtime-rejected` 属于 D601 runtime 或任务状态,`stable-proxy-failed` 多为 provider/k3s/tunnel 链路问题。默认失败输出必须保持低噪声:不回显 steer prompt、不带 request body、不带 upstream body preview,也不带 raw response;需要上游预览或原始失败对象时显式重跑 `--full` 或 `--raw`。`502 provider HTTP tunnel failed`、`The operation was aborted`、约 30 秒 provider tunnel wait abort 仍失败时,`.data.steer.deliveryUnconfirmed=true`;指挥官应先用 `codex tasks --view supervisor --limit 20`、`codex task ` 和 `microservice health code-queue` 交叉确认任务活性,再从主 server CLI 或显式 SSH transport 重试同一个 steer。raw proxy 等价命令走同一条 tunnel,`rawProxyEquivalentIsFallback=false`,只能做诊断对照,不应作为正式 fallback。若任务已终态,CLI 返回紧凑 `task-already-terminal` 响应并给出 `bun scripts/cli.ts codex task `、`bun scripts/cli.ts codex read ` 和 `bun scripts/cli.ts codex submit --prompt-file --reference-task-id `;指挥官应提交 follow-up task,而不是继续 steer 终态任务。若正式 CLI 自身不可用,临时通过受控 microservice proxy 调用只能作为现场恢复手段;这类绕行必须记录到指挥简报 issue #24 主体的常驻观察,并创建正式 issue 补齐 CLI 能力,避免长期依赖隐式 API。 - 如果任务进入终态但缺少必要验收证据,应使用聚焦 continuation prompt retry 同一任务。 - 如果任务被可复用基础设施缺陷阻塞,应把该缺陷分配给合适的空闲或低风险队列,让原业务任务等待,或在修复后 retry。 - 如果基础设施缺陷影响 Code Queue 控制面可用性,指挥官可以执行恢复队列所需的最小受控部署,然后验证原任务能继续。 diff --git a/scripts/code-queue-cli-steer-test.ts b/scripts/code-queue-cli-steer-test.ts index 77b6e745..88ab2697 100644 --- a/scripts/code-queue-cli-steer-test.ts +++ b/scripts/code-queue-cli-steer-test.ts @@ -199,6 +199,24 @@ export function runCodeQueueCliSteerContract(): JsonRecord { assertCondition(String(exhaustedDiagnostics.message || "").includes("The operation was aborted"), "diagnostics should include provider abort message", exhaustedDiagnostics); assertCondition(nestedRecord(exhaustedDiagnostics, ["operatorGuidance"]).rawProxyEquivalentIsFallback === false, "raw proxy equivalent should be diagnostic, not fallback", exhaustedDiagnostics); assertCondition(String(nestedRecord(exhausted, ["commands"]).rawProxy || "").includes("microservice proxy code-queue /api/tasks/direct_task/steer"), "failure should still expose raw proxy diagnostic command", exhausted); + assertCondition(nestedRecord(exhausted, ["steer"]).promptOmitted === true, "failed steer should omit prompt by default", exhausted); + assertCondition(!("request" in exhausted), "failed steer should omit request by default", exhausted); + assertCondition(!("upstreamBodyPreview" in exhaustedDiagnostics), "failed steer should omit upstream preview by default", exhaustedDiagnostics); + assertCondition(!String(JSON.stringify(exhausted)).includes("provider-gateway-http-fetch"), "failed steer default output should omit upstream body internals", exhausted); + assertCondition(String(nestedRecord(exhausted, ["commands"]).fullDetails || "").includes("--full"), "failed steer should suggest full disclosure command", exhausted); + assertCondition(String(nestedRecord(exhausted, ["commands"]).rawDetails || "").includes("--raw"), "failed steer should suggest raw disclosure command", exhausted); + + const exhaustedFull = codexSteerTaskForTest("direct_task", ["final correction", "--retry-attempts", "1", "--retry-delay-ms", "0", "--full"], () => { + return { ok: false, status: 502, body: abortedTunnelBody }; + }) as JsonRecord; + const exhaustedFullDiagnostics = nestedRecord(exhaustedFull, ["diagnostics"]); + assertCondition("request" in exhaustedFull, "--full failed steer should expose request metadata", exhaustedFull); + assertCondition("upstreamBodyPreview" in exhaustedFullDiagnostics, "--full failed steer should expose bounded upstream preview", exhaustedFullDiagnostics); + + const exhaustedRaw = codexSteerTaskForTest("direct_task", ["final correction", "--retry-attempts", "1", "--retry-delay-ms", "0", "--raw"], () => { + return { ok: false, status: 502, body: abortedTunnelBody }; + }) as JsonRecord; + assertCondition("rawFailure" in exhaustedRaw, "--raw failed steer should expose raw response", exhaustedRaw); const terminalPrompt = `${"do not leak ".repeat(40)}tail-secret-marker`; const terminalRejection = codexSteerTaskForTest("completed_task", [terminalPrompt], () => ({ @@ -232,6 +250,8 @@ export function runCodeQueueCliSteerContract(): JsonRecord { assertCondition(String(terminalCommands.show || "").includes("codex task completed_task"), "terminal rejection should suggest show command", terminalCommands); assertCondition(String(terminalCommands.read || "").includes("codex read completed_task"), "terminal rejection should suggest read command", terminalCommands); assertCondition(String(terminalCommands.followUpSubmit || "").includes("codex submit --prompt-file --reference-task-id completed_task"), "terminal rejection should suggest follow-up submit pattern", terminalCommands); + assertCondition(String(terminalCommands.fullDetails || "").includes("codex steer completed_task --prompt-file --full"), "terminal rejection should suggest explicit full disclosure command", terminalCommands); + assertCondition(String(terminalCommands.rawDetails || "").includes("codex steer completed_task --prompt-file --raw"), "terminal rejection should suggest explicit raw disclosure command", terminalCommands); const terminalJson = JSON.stringify(terminalRejection); assertCondition(!terminalJson.includes("tail-secret-marker"), "terminal rejection must not echo steer prompt", terminalRejection); assertCondition(!terminalJson.includes("hidden task prompt"), "terminal rejection must not echo task prompt", terminalRejection); @@ -239,6 +259,39 @@ export function runCodeQueueCliSteerContract(): JsonRecord { assertCondition(!("request" in terminalRejection), "terminal rejection should omit request preview", terminalRejection); assertCondition(!("diagnostics" in terminalRejection), "terminal rejection should omit bulky diagnostics", terminalRejection); + const terminalFull = codexSteerTaskForTest("completed_task", [terminalPrompt, "--full"], () => ({ + ok: false, + status: 409, + body: { + ok: false, + error: "task does not have an active steerable turn", + task: { + id: "completed_task", + status: "succeeded", + terminalStatus: "completed", + prompt: `${"hidden task prompt ".repeat(60)}tail`, + output: [{ seq: 1, text: "noisy raw task output" }], + }, + }, + })) as JsonRecord; + const fullDiagnostics = nestedRecord(terminalFull, ["diagnostics"]); + assertCondition("upstreamBodyPreview" in fullDiagnostics, "--full should expose bounded upstream preview behind diagnostics", fullDiagnostics); + assertCondition(typeof fullDiagnostics.rawProxyEquivalent === "string", "--full should expose raw proxy equivalent", fullDiagnostics); + assertCondition(!("rawFailure" in terminalFull), "--full should not include raw upstream response", terminalFull); + + const terminalRaw = codexSteerTaskForTest("completed_task", [terminalPrompt, "--raw"], () => ({ + ok: false, + status: 409, + body: { + ok: false, + error: "task does not have an active steerable turn", + task: { id: "completed_task", status: "succeeded", terminalStatus: "completed" }, + }, + })) as JsonRecord; + const rawFailure = nestedRecord(terminalRaw, ["rawFailure"]); + const rawFailureTask = nestedRecord(rawFailure, ["body", "task"]); + assertCondition(rawFailure.status === 409 && rawFailureTask.status === "succeeded", "--raw should expose raw upstream failure only when requested", rawFailure); + return { ok: true, checks: [ @@ -256,6 +309,7 @@ export function runCodeQueueCliSteerContract(): JsonRecord { "steer failure classification is JSON-consumable", "retryable tunnel aborts are retried with bounded diagnostics", "terminal steer rejection is compact and actionable", + "terminal steer rejection full/raw disclosure is explicit", ], }; } diff --git a/scripts/src/code-queue.ts b/scripts/src/code-queue.ts index 2ddfdfad..c4f73451 100644 --- a/scripts/src/code-queue.ts +++ b/scripts/src/code-queue.ts @@ -222,6 +222,8 @@ interface CodexSteerOptions { dryRun: boolean; retryAttempts: number; retryDelayMs: number; + full: boolean; + raw: boolean; } type CodexSteerFailureReason = @@ -258,7 +260,6 @@ interface CodexSteerAttemptSummary { scope: ClassifiedCodexSteerError["scope"] | null; retryable: boolean; message: string | null; - upstreamBodyPreview?: unknown; } interface CompactTaskMutationResponseOptions { @@ -718,6 +719,22 @@ function classifySteerFailure(response: unknown, targetPath: string, stableProxy }; } +function compactSteerFailureDiagnostics(diagnostics: ClassifiedCodexSteerError, includeDetails: boolean): Record { + return { + reason: diagnostics.reason, + scope: diagnostics.scope, + status: diagnostics.status, + exitCode: diagnostics.exitCode, + retryable: diagnostics.retryable, + message: diagnostics.message, + recommendedCrossChecks: diagnostics.recommendedCrossChecks, + ...(includeDetails ? { + upstreamBodyPreview: diagnostics.upstreamBodyPreview, + rawProxyEquivalent: diagnostics.rawProxyEquivalent, + } : {}), + }; +} + function unwrapSteerResponse(response: unknown, targetPath: string, stableProxyPath: string, rawProxyEquivalent: string): { ok: true; upstream: { ok: unknown; status: unknown }; body: Record } | { ok: false; diagnostics: ClassifiedCodexSteerError; rawResponse: unknown } { const record = asRecord(response); const body = responseBody(record); @@ -771,6 +788,41 @@ function compactTerminalSteerRejection(taskId: string, response: unknown): Recor }; } +function attachSteerDisclosure( + result: Record, + disclosure: { full: boolean; raw: boolean }, + response: unknown, + targetPath: string, + stableProxyPath: string, + rawProxyEquivalent: string, +): Record { + if (!disclosure.full && !disclosure.raw) return result; + const record = asRecord(response); + const body = responseBody(record); + result.disclosure = { + defaultPolicy: "compact rejection; upstream body and raw response are only included with explicit --full or --raw", + full: disclosure.full, + raw: disclosure.raw, + defaultOmitted: ["request", "diagnostics.upstreamBodyPreview", "rawFailure"], + }; + if (disclosure.full) { + result.request = { + path: targetPath, + stableProxyPath, + method: "POST", + bodySummary: { + promptOmitted: true, + }, + }; + result.diagnostics = { + upstreamBodyPreview: previewJson(body ?? record, { maxDepth: 4, maxArrayItems: 8, maxObjectKeys: 24, maxStringLength: 600 }), + rawProxyEquivalent, + }; + } + if (disclosure.raw) result.rawFailure = response; + return result; +} + function steerSuccessAttempt(attempt: number, durationMs: number, upstream: { ok: unknown; status: unknown }): CodexSteerAttemptSummary { const status = typeof upstream.status === "number" && Number.isFinite(upstream.status) ? upstream.status : null; return { @@ -797,7 +849,6 @@ function steerFailureAttempt(attempt: number, durationMs: number, diagnostics: C scope: diagnostics.scope, retryable: diagnostics.retryable, message: diagnostics.message, - upstreamBodyPreview: diagnostics.upstreamBodyPreview, }; } @@ -4027,7 +4078,7 @@ function parseSubmitOptions(args: string[]): CodexSubmitOptions { function parseSteerOptions(args: string[]): CodexSteerOptions { assertKnownOptions(args, { - flags: ["--prompt-stdin", "--stdin", "--dry-run", "--no-retry"], + flags: ["--prompt-stdin", "--stdin", "--dry-run", "--no-retry", "--full", "--raw"], valueOptions: ["--prompt-file", "--file", "--retry-attempts", "--retry-delay-ms"], }, "codex steer"); const retryAttempts = hasFlag(args, "--no-retry") @@ -4038,6 +4089,8 @@ function parseSteerOptions(args: string[]): CodexSteerOptions { dryRun: hasFlag(args, "--dry-run"), retryAttempts, retryDelayMs: nonNegativeIntegerOption(args, ["--retry-delay-ms"], defaultSteerRetryDelayMs, maxSteerRetryDelayMs), + full: hasFlag(args, "--full") || hasFlag(args, "--raw"), + raw: hasFlag(args, "--raw"), }; } @@ -5997,7 +6050,7 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe }; } const attempts: CodexSteerAttemptSummary[] = []; - let failedResponse: { ok: false; diagnostics: ClassifiedCodexSteerError } | null = null; + let failedResponse: { ok: false; diagnostics: ClassifiedCodexSteerError; rawResponse: unknown } | null = null; let successfulResponse: { ok: true; upstream: { ok: unknown; status: unknown }; body: Record } | null = null; for (let attempt = 1; attempt <= options.retryAttempts; attempt += 1) { const startedAt = Date.now(); @@ -6011,18 +6064,27 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe attempts.push(steerFailureAttempt(attempt, durationMs, response.diagnostics)); failedResponse = response; const terminalRejection = compactTerminalSteerRejection(taskId, response.rawResponse); - if (terminalRejection !== null) return terminalRejection; + if (terminalRejection !== null) { + terminalRejection.commands = { + ...(asRecord(terminalRejection.commands) ?? {}), + fullDetails: `bun scripts/cli.ts codex steer ${taskId} --prompt-file --full`, + rawDetails: `bun scripts/cli.ts codex steer ${taskId} --prompt-file --raw`, + }; + return attachSteerDisclosure(terminalRejection, options, response.rawResponse, targetPath, stableProxyPath, rawProxyEquivalent); + } if (!shouldRetrySteerFailure(response.diagnostics, attempt, options.retryAttempts)) break; sleepSync(options.retryDelayMs); } if (successfulResponse === null) { const diagnostics = failedResponse?.diagnostics ?? classifySteerFailure(null, targetPath, stableProxyPath, rawProxyEquivalent); const transportDeliveryUnconfirmed = diagnostics.reason === "stable-proxy-failed" || diagnostics.reason === "backend-core-unreachable"; + const compactDiagnostics = compactSteerFailureDiagnostics(diagnostics, options.full); return { ok: false, steer: { accepted: false, - prompt, + promptChars: options.prompt.length, + promptOmitted: true, attempts, deliveryUnconfirmed: transportDeliveryUnconfirmed, retryPolicy: { @@ -6034,9 +6096,14 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe note: "codex steer retried only retryable stable-proxy/backend-core failures; raw microservice proxy uses the same tunnel and is diagnostic, not a lower-noise fallback.", }, }, - request, + ...(options.full ? { + request: { + ...request, + body: { prompt: { chars: options.prompt.length, textOmitted: true } }, + }, + } : {}), diagnostics: { - ...diagnostics, + ...compactDiagnostics, attempts, retryPolicy: { attempted: attempts.length, @@ -6057,10 +6124,19 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe dryRun: `bun scripts/cli.ts codex steer ${taskId} --prompt-file --dry-run`, retry: `bun scripts/cli.ts codex steer ${taskId} --prompt-file `, noRetry: `bun scripts/cli.ts codex steer ${taskId} --prompt-file --no-retry`, + fullDetails: `bun scripts/cli.ts codex steer ${taskId} --prompt-file --full`, + rawDetails: `bun scripts/cli.ts codex steer ${taskId} --prompt-file --raw`, rawProxy: rawProxyEquivalent, tasks: "bun scripts/cli.ts codex tasks --view supervisor --limit 20", health: "bun scripts/cli.ts microservice health code-queue", }, + disclosure: { + defaultPolicy: "compact rejection; request prompt, upstream body preview, and raw response require explicit --full or --raw", + full: options.full, + raw: options.raw, + defaultOmitted: ["request.body.prompt.text", "diagnostics.upstreamBodyPreview", "rawFailure"], + }, + ...(options.raw ? { rawFailure: failedResponse?.rawResponse ?? null } : {}), }; } return { diff --git a/scripts/src/help.ts b/scripts/src/help.ts index 89ff4d22..81d7ae2c 100644 --- a/scripts/src/help.ts +++ b/scripts/src/help.ts @@ -63,7 +63,7 @@ export function rootHelp(): unknown { { command: "codex read ", description: "Mark one reviewed terminal task read and return terminal metadata plus final response; prompt/tool logs stay behind drill-down commands." }, { command: "codex dev-ready", description: "Fetch execution-container readiness, including sanitized skill injection status from /api/dev-ready." }, { command: "codex judge --attempt N [--dry-run] [--include-prompt]", description: "Replay one stored Code Queue attempt through the same judge context builder and MiniMax judge call path used by the live queue worker." }, - { command: "codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N]", description: "Push a corrective prompt into a running Code Queue task; retryable tunnel aborts get bounded diagnostics, terminal-task rejection suggests codex task/read plus codex submit --reference-task-id , and real success does not echo prompt text." }, + { command: "codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N] [--full|--raw]", description: "Push a corrective prompt into a running Code Queue task; retryable tunnel aborts get bounded diagnostics, terminal-task rejection suggests codex task/read plus codex submit --reference-task-id , and upstream/raw rejection details require explicit --full or --raw." }, { command: "codex interrupt|cancel ", description: "Request interrupt for a running Code Queue task, or cancel a queued/retry_wait task, through the same private proxy." }, { command: "codex (queues [--full|--all] | queue create | queue merge --into | move --queue )", description: "List low-noise queue summaries by default, including effective activity counts that distinguish scheduler-local queues, DB running tasks, and heartbeat-fresh runners; full queue rows require --full/--all." }, { command: "job list [--limit N] [--include-command]", description: "List async jobs from .state/jobs with a bounded default page." }, @@ -265,7 +265,7 @@ function codexHelp(): unknown { "bun scripts/cli.ts codex skills-sync --dry-run [--full]", "bun scripts/cli.ts codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/] [--pr-create-dry-run --pr-create-dry-run-head ] [--issue N] [--full|--raw]", "bun scripts/cli.ts codex judge --attempt N [--dry-run] [--include-prompt]", - "bun scripts/cli.ts codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N]", + "bun scripts/cli.ts codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N] [--full|--raw]", "bun scripts/cli.ts codex interrupt|cancel ", "bun scripts/cli.ts codex queues [--full|--all] [--limit N] [--page N|--offset N] | queue create | queue merge --into | move --queue ", ], @@ -323,7 +323,7 @@ function codexHelp(): unknown { embeddedIn: ["codex submit --dry-run", "codex steer --dry-run"], reference: "docs/reference/code-queue-supervision.md#dev-测试授权分级", }, - description: "Operate Code Queue through the stable backend-core private proxy path with bounded activity summaries for queue and supervisor views. Real submit/steer success is a low-noise write confirmation and does not echo prompt text; terminal steer rejection returns compact status plus codex task/read/submit follow-up commands.", + description: "Operate Code Queue through the stable backend-core private proxy path with bounded activity summaries for queue and supervisor views. Real submit/steer success is a low-noise write confirmation and does not echo prompt text; terminal steer rejection returns compact status plus codex task/read/submit follow-up commands, with upstream/raw details behind --full or --raw.", }; }