diff --git a/AGENTS.md b/AGENTS.md index f4fe7e38..89889338 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -49,7 +49,7 @@ UniDesk 是一个以主 server 为统一入口的分布式工作平台;本文 - `bun scripts/cli.ts commander contract|plan --dry-run|smoke --dry-run|approval request --dry-run`:查看 host Codex 指挥官直管微服务 skeleton 的 source/contract、无 daemon smoke 验证计划、.state/commander/ 状态模型、trace summary 聚合和 ClaudeQQ 高风险请示草案;当前只返回 dry-run 计划,不接 live bridge、不接管人工指挥官,不发送消息,规则见 `docs/reference/host-codex-commander.md`。 - `bun scripts/cli.ts ci install/status/run/publish-backend-core/publish-user-service/run-dev-e2e/logs`:在 D601 原生 k3s 上安装和运行 Tekton CI,支持每 commit 检查、Code Queue 只读性能门禁、`CI.json` catalog 驱动的 backend-core 与 user-service commit-pinned 镜像发布和手动触发的 `origin/master:deploy.json#environments.dev` 临时 namespace e2e;catalog/producer/consumer 分工见 `docs/reference/cicd-standardization.md`,`run-dev-e2e` 的 Git 控制 runner、短 launcher 和 no-CD 边界见 `docs/reference/dev-ci-runner.md`,Tekton 规则见 `docs/reference/ci.md`。 - `bun scripts/cli.ts codex deploy `:旧 Code Queue 兼容部署入口已禁用,原因是它会绕过受控部署边界直连 D601 部署 Code Queue;规则见 `docs/reference/codex-deploy.md`。 -- `bun scripts/cli.ts codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue ]` / `codex pr-preflight [--remote]`:前者通过 backend-core 私有代理提交 Code Queue 任务,`--dry-run` 会给出 MiniMax/GPT/人工路由建议但不改写 payload,真实提交成功只返回写入确认、task id 和后续查看命令,不回显 prompt;后者只读检查 D601 scheduler/runner 的 GitHub token、egress 和 PR 能力,PR 型派单前必须使用,规则见 `docs/reference/cli.md` 和 `docs/reference/code-queue-supervision.md`。 +- `bun scripts/cli.ts codex prompt-lint [prompt|--prompt-file path|--prompt-stdin]` / `codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue ]` / `codex pr-preflight [--remote]`:`prompt-lint` 在派发/steer 前 dry-run 检查 runner prompt 的 DEV 测试授权分级(`read-only`/`live-read`/`live-mutating`)且不回显 prompt;`submit --dry-run` 同时给出 MiniMax/GPT/人工路由建议和该 lint 结果但不改写 payload,真实提交成功只返回写入确认、task id 和后续查看命令,不回显 prompt;`pr-preflight` 只读检查 D601 scheduler/runner 的 GitHub token、egress 和 PR 能力,PR 
型派单前必须使用,规则见 `docs/reference/cli.md` 和 `docs/reference/code-queue-supervision.md`。 - `bun scripts/cli.ts codex task `:按 Code Queue 任务 ID 查询默认审阅摘要,只返回原始 prompt、最终 response、最后错误和渐进披露命令;`--detail`、`codex output` 和 supervisor 大 `--limit` 仍默认有界,完整内容需显式 `--full`/`--full-text`/分页展开;`codex queues [--full] [--limit N] [--page N|--offset N]` 默认分页低噪声输出队列摘要,完整 upstream 只通过 raw command 显式获取。 - `bun scripts/cli.ts codex judge --attempt [--dry-run]`:按指定 task/attempt 用与队列 worker 相同的上下文构建和 MiniMax judge 调用路径单步复现完成判定;`--dry-run` 只输出 prompt/payload 诊断。 - `bun scripts/cli.ts codex steer [prompt|--prompt-file path|--prompt-stdin] [--dry-run] [--no-retry|--retry-attempts N]`:通过 Code Queue 私有代理向运行中的 active turn 注入纠偏提示,对 retryable tunnel abort 做有界重试诊断,真实成功只确认写入并返回后续查看命令,不回显 prompt 或完整 task state。 diff --git a/docs/reference/cli.md b/docs/reference/cli.md index 713c7c0c..7350dac9 100644 --- a/docs/reference/cli.md +++ b/docs/reference/cli.md @@ -32,6 +32,7 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI - `artifact-registry plan|render|status|health|install|deploy-backend-core|deploy-service` 管理 D601 host-managed CNCF Distribution registry 的声明、安装、只读检查和 pull-only artifact CD。该 registry 固定为 D601 loopback `127.0.0.1:5000`,由 systemd + Docker Compose 管理,位于 native k3s 故障域外;`deploy-service` 只拉取 CI 已发布的 commit-pinned 镜像、retag/recreate 或导入 native k3s,并做 live commit 验证,不构建 runtime source。`deploy-backend-core` 是 deprecated 兼容名,标准 backend-core prod CD 入口是 `deploy apply --env prod --service backend-core`。长期规则见 `docs/reference/artifact-registry.md`。 - `commander contract|plan --dry-run|smoke --dry-run|approval request --dry-run` 是 host Codex 指挥官直管微服务 skeleton 入口。当前命令返回 `phase=source-contract`、service/API/state/bridge/prompt/trace/#20/#46/ClaudeQQ 审批边界、.state/commander/ 状态模型、dev 无 daemon smoke contract 和 dry-run 计划,服务骨架只提供本地 `/health`、`/api/commander/contract`、状态读写、trace summary 聚合和 approval draft preview,不接 live bridge、不注入 prompt、不发送 ClaudeQQ。`plan`、`smoke` 与 `approval request` 必须带 `--dry-run`;缺少时返回 
`error=dry-run-required`。长期规则见 `docs/reference/host-codex-commander.md`。 - `gh auth status [--repo owner/name]` 探测 GitHub 操作前置条件并输出脱敏 JSON:是否存在 `gh` binary、是否存在 `GH_TOKEN`/`GITHUB_TOKEN` 或可用 `gh auth token` fallback、REST API 是否可达、目标 repo 是否可见、issue 是否可读。degraded reason 必须归类为 `missing-binary`、`missing-token`、`auth-failed`、`github-transient`、`network-proxy-failed`、`permission-denied`、`repo-not-found`、`repo-forbidden`、`issue-not-found`、`pr-not-found`、`scope-insufficient`、`validation-failed`、`invalid-response` 或 `unsupported-command`,不得打印 token;失败对象必须包含 `runnerDisposition=infra-blocked|business-failed`,runner 应优先用该字段分流。`github-transient` 表示 GitHub DNS/API 连接在收到 HTTP 状态前失败,输出应带 `retryable=true` 或等价 commander action;这不是缺 token、认证失败、权限不足或 PR 语义失败。 +- `codex prompt-lint [prompt|--prompt-file path|--prompt-stdin]` 是派发/steer 前的本地 dry-run prompt lint。它只读取 prompt 文本,返回 `dryRun=true`、`mutation=false`、`declaredClass`、`effectiveClass`、`requiredClass`、`dispatchDisposition`、缺失或矛盾项和有界 evidence,不访问 live service、不提交任务、不打印完整 prompt。分级固定为 `read-only`、`live-read`、`live-mutating`;未声明时按 `read-only` 处理。`codex submit --dry-run` 与 `codex steer --dry-run` 会嵌入同一 `promptLint` 结果,帮助指挥官在 dispatch/steer 前发现缺失或矛盾的 live mutation 授权。长期规则见 `docs/reference/code-queue-supervision.md` 的 DEV 测试授权分级。 - `gh issue list [--state open|closed|all] [--limit N] [--repo owner/name] [--json number,title,state,url,updatedAt,createdAt,author,labels]` 通过 GitHub REST 列出 issue,默认 `state=open`、`limit=30`,输出稳定 JSON 且不依赖系统 `gh` binary。`--limit` 会映射到 GitHub `per_page` 并限制返回数量,避免一次拉爆上下文;未知 state 或未知 `--json` 字段必须结构化失败并带 `runnerDisposition=business-failed`。GitHub issues API 可能混入 PR,CLI 会从 `.data.issues` 中过滤 pull request。 - `gh issue read [--repo owner/name] [--json body,title,state,comments] [--raw|--full]` 通过 GitHub REST 读取 issue title/body/state/url 和 comments,默认输出 JSON;`view` 只保留为兼容别名。`owner/repo#number` shorthand 会自动派生 `--repo owner/repo` 和 issue number;若同时提供冲突的显式 `--repo`,CLI 必须结构化失败并给出 `gh issue read --repo owner/repo 
--json body,title,state,comments` 与 shorthand raw 的可执行命令。兼容旧脚本的 `--json body` 和 `--json body,title,state,comments` 字段选择,且正文仍稳定暴露在 `.data.issue.body`,避免调用方因为 JSON 路径变化把空值当成正文。字段白名单是 `body,title,state,comments,number,url,author,createdAt,updatedAt`,未知字段必须结构化失败并带 `runnerDisposition=business-failed`。`--raw` 与 `--full` 只在 read/view 上可用,是显式完整披露别名,会选择完整支持字段集并保持结构化 JSON 输出;默认 list/read 输出仍不得扩散到无界非 JSON 文本。`gh issue create --title --body-file <file> [--label label[,label...]]... [--dry-run]`、`gh issue update <number> --mode replace|append --body-file <file> [--title ...] [--dry-run]`、`gh issue comment create <number> --body-file <file> [--dry-run]`、`gh issue comment delete <commentId> [--dry-run]`、`gh issue close|reopen <number> [--dry-run]` 都走 REST,不依赖 `gh` binary。`--label` 仅用于 `issue create`,支持重复传入和逗号分隔;`--dry-run` 会展示解析后的 labels 与 request plan,正式创建时把 labels 放入 GitHub REST create-issue payload,GitHub 返回不存在 label 等 422 校验失败时 CLI 结构化返回 `validation-failed`,不静默成功。`gh issue delete <number>` 是结构化 `unsupported-command`,因为 GitHub REST 不支持 issue 硬删除;生命周期删除语义请使用 `close`。 - `gh issue update <number> --mode replace|append --body-file <file>` 是正文更新主入口,`edit` 保留为兼容别名。`replace` 用文件正文替换现有 body;`append` 先读取当前 body,再按 UTF-8 文件字节追加,保留真实换行、反引号和 Markdown 表格。更新默认拒绝字面量 `null`、空白正文和过短正文;只有真实需要写短正文时才允许显式加 `--allow-short-body`,返回 JSON 会报告该风险。#20 总看板和指挥简报类 issue 是长期 body-only issue,`--body-profile auto` 会按 issue number 自动启用 #20/#24 legacy guard:#20 必须包含 `## 看板(OPEN)`,#24 legacy 指挥简报必须包含 `## 常驻观察与长期建议`。显式 `--body-profile commander-brief` 不再固定 #24;#24 仍兼容,标题为 `YYYY-MM-DD 指挥简报(北京时间)` 或既有正文首行/关键 heading 表明为每日滚动指挥简报的 issue 也合法,并仍必须包含 `## 常驻观察与长期建议`。对非简报 issue 显式使用 `commander-brief` 会结构化失败为 `profile-issue-mismatch`。`--dry-run` 不 PATCH GitHub,输出新正文长度、SHA、关键标题检查结果、字面量 `\n`、反引号、Markdown 表格和 shell 污染信号;若环境里有 `GH_TOKEN` 或 `GITHUB_TOKEN`,dry-run 还会只读抓取旧正文长度、SHA 和 `updatedAt` 作为更新前对照。正式写入可带 `--expect-updated-at <updated_at>` 或 `--expect-body-sha <sha256>`,CLI 会先读当前 issue,匹配后才 PATCH,防止旧缓存覆盖新正文。 diff --git 
a/docs/reference/code-queue-supervision.md b/docs/reference/code-queue-supervision.md index 12b0f706..5ffb11af 100644 --- a/docs/reference/code-queue-supervision.md +++ b/docs/reference/code-queue-supervision.md @@ -42,6 +42,36 @@ live-read browser audit 只用于观察已部署 UI,不授权写入。未获 每次新派一批任务、接收一批 completed unread 结果,或者发生实质态势变化时,都要同步更新 `#20` 的正文主表;如果当天有滚动简报,则同时更新当日简报 issue 的正文主内容,而不是只在聊天中补上下文。 +## DEV 测试授权分级 + +`DEV` 只说明目标环境,不自动说明允许的写入级别。所有 runner prompt 和 supervisor closeout 都必须把 DEV 验证分成 `read-only`、`live-read` 和 `live-mutating` 三类;如果 prompt 没有显式分类,默认按 `read-only` 处理。 + +| 分级 | 含义 | 常见允许动作 | 禁止动作 | +| --- | --- | --- | --- | +| `read-only` | 不连接或不观察正在运行的 DEV 服务,只验证源码、本地 contract、fixture、mock、dry-run 或静态输出。 | `git diff`、`rg`、类型检查、unit/contract test、CLI `--dry-run`、生成计划或补文档。 | 访问 live service、触发任务、写数据库、部署、重启、rollout、真实硬件或虚拟硬件动作。 | +| `live-read` | 读取正在运行的 DEV 服务、日志、health、status、metrics、Kubernetes 只读对象或只读 API,不改变 live 状态。 | `GET /health`、`GET /status`、只读 proxy、`kubectl get/describe/logs`、只读 CLI status/diagnostics。 | `POST/PUT/PATCH/DELETE`、`kubectl apply/delete/rollout restart`、触发 schedule/job/task、写 issue/PR 之外的 runtime 状态、任何会创建 operation/audit/evidence 的动作。 | +| `live-mutating` | 在 DEV 环境执行会改变 live 状态的命令,即使目标是 smoke、复测或诊断。 | 经 prompt 明确授权的 dev deploy/apply/rollout、trigger/run/retry、task submit/steer、写配置、创建 operation/audit/evidence、HWLAB M3 DO/DI 链路触发。 | 任何未被 prompt 精确列出的 live mutation;生产写入、密钥读取、数据库手工 patch、Code Queue 高风险干预仍按更高安全边界处理。 | + +`DEV smoke`、`M3 smoke`、`live smoke`、`复测`、`验证` 这类词本身不构成 live mutation 授权。只要命令会改变 DEV runtime、触发真实或虚拟设备动作、创建任务/operation/audit/evidence、改变 deployment 或写入服务状态,就必须归入 `live-mutating`。 + +派单 prompt 必须显式写出: + +- `DEV test class`:只能是 `read-only`、`live-read` 或 `live-mutating`。 +- `允许的 live mutation`:若 class 是 `live-mutating`,必须逐项列出允许的命令形态或动作、目标环境/服务/namespace、可接受的状态变化、观察和回滚步骤;若没有授权,写 `none`。 +- `禁止动作`:至少说明 prod mutation、密钥明文、数据库手工 patch、Code Queue backend 重启/重建、运行中任务 interrupt/cancel 是否禁止;未写明的高风险动作一律禁止。 +- `closeout 字段`:runner final response 
必须报告实际执行的 test class、是否发生 live mutation、执行命令摘要、目标环境、证据链接或 ID,以及未覆盖风险。 + +runner 收到未分类或含糊的 prompt 时,只能执行 `read-only` 范围;如果完成任务需要 `live-read` 或 `live-mutating`,必须停在计划和待授权状态,列出拟执行命令、风险和需要指挥官补充的授权,不能自行把“DEV”解释成允许写入。 + +supervisor closeout 不能只看 runner 的成功自述,必须核对 prompt 授权和实际命令级别: + +- `read-only` closeout 应证明没有 live service 写入,证据来自 diff、静态检查、unit/contract test 或 dry-run 输出。 +- `live-read` closeout 应记录读取的 DEV endpoint、service、namespace 或日志范围,并明确没有触发 runtime 状态变化。 +- `live-mutating` closeout 应指出 prompt 中的明确授权、实际变更目标、operation/audit/evidence/task/job ID、回滚或恢复观察,以及 prod 未触碰。 +- 如果 runner 在没有明确 prompt 授权时执行了 live mutation,即使 smoke 结果成功,也不能把任务验收为正常完成;指挥官应先核实 live 状态和 blast radius,再把它记录为治理缺陷或 follow-up,并修正后续 prompt 模板。 + +HWLAB M3 口径使用同一分级:只读报告、fixture、LOCAL/DRY-RUN 和 diagnostics 只能算 `read-only` 或 `live-read`;触发 `res_boxsimu_1:DO1 -> hwlab-patch-panel -> res_boxsimu_2:DI1` 的可信闭环属于 `live-mutating`,必须有 prompt 明确授权并在 closeout 中给出 operation / audit / evidence 关联。 + ## 任务设计 每个 Code Queue task 都必须有清晰且狭窄的 ownership 边界。 @@ -67,6 +97,8 @@ Code Queue 派单模型按成本、可信度和 blast radius 分层:GPT-5.5/Co `codex submit --dry-run` 是派单前的轻量 preflight。它输出 `routingRecommendation`、`policyContract` 和模型注册表,帮助指挥官看到推荐 runner/model、风险信号、缺失的 prompt guard、模型分层、并发上限、`opencodeModels` 和 `modelPorts`;它不会修改真实提交 payload,也不会替代指挥官判断。真实派单是否使用 `--model minimax-m2.7`、`--model deepseek-chat` 或 `--model gpt-5.5` 仍由指挥官显式决定。 +`codex prompt-lint [prompt|--prompt-file path|--prompt-stdin]` 是同一套派单前 guardrail 的本地 dry-run 入口,用于检查 runner prompt 是否声明了 `DEV test class`、是否列出允许的 live mutation、禁止动作和 closeout 字段。它只返回分类、缺失或矛盾项和有界 evidence,不提交任务、不连接 live service、不打印完整 prompt。`codex submit --dry-run` 和 `codex steer --dry-run` 会嵌入同一 `promptLint` 结果;`dispatchDisposition=needs-authorization` 时,指挥官必须补齐授权或把 prompt 降到 `read-only` 范围后再派发/steer。 + 并发治理按模型和风险一起决定。GPT-5.5 常规并发目标是 5 条 lane;当写入范围互不重叠、heartbeat/trace 健康、完成质量稳定时可以短时提高到 10。MiniMax 只承接简单任务时可以提高到 10,但必须保留指挥官审阅和证据核验。DeepSeek 用于中等复杂度任务,默认按约 5 条 lane 观察质量,再根据成功率和 reviewer 
负载逐步调整。并发扩张的前提永远是任务质量和可观测性,而不是模型价格。 模型选择矩阵: diff --git a/docs/reference/host-codex-commander.md b/docs/reference/host-codex-commander.md index 20a87685..961a8e44 100644 --- a/docs/reference/host-codex-commander.md +++ b/docs/reference/host-codex-commander.md @@ -67,6 +67,8 @@ host commander 不直接编辑 HWLAB 业务代码,不以本地热修绕过 HWL `commander smoke --dry-run` 是无 daemon smoke contract。它只输出验证计划,不启动 HTTP daemon、不打开 SSH/PTY/stdio bridge、不发送 ClaudeQQ、不重启服务、不 interrupt/cancel 任务、不部署、不跑全量 check/e2e。 +指挥官派发或审阅任何 `DEV smoke` 时,必须沿用 `docs/reference/code-queue-supervision.md` 的 DEV 测试授权分级:未显式授权的 smoke 默认是 `read-only`;读取 live DEV 状态必须标成 `live-read`;触发 deploy、rollout、task、operation、audit、evidence 或硬件/虚拟硬件链路的 smoke 必须标成 `live-mutating`,并在 prompt 中逐项列出授权命令和 closeout 证据要求。 + 需要验证的 source/contract 面: - health endpoint:用 `createCommanderRequestHandler` 和临时 `RuntimeConfig` 调用 `GET /health`,期望返回 `service=host-codex-commander`、`stateRoot` 和日志文件路径;禁止 `Bun.serve` 和端口监听。 diff --git a/scripts/code-queue-prompt-lint-contract-test.ts b/scripts/code-queue-prompt-lint-contract-test.ts new file mode 100644 index 00000000..691a63f5 --- /dev/null +++ b/scripts/code-queue-prompt-lint-contract-test.ts @@ -0,0 +1,186 @@ +import { spawnSync } from "node:child_process"; +import { mkdtempSync, rmSync, writeFileSync } from "node:fs"; +import { join } from "node:path"; +import { tmpdir } from "node:os"; +import { codexPromptLiveAuthorizationLintForTest } from "./src/code-queue"; + +type JsonRecord = Record<string, unknown>; + +function assertCondition(condition: unknown, message: string, detail: unknown = {}): void { + if (!condition) throw new Error(`${message}: ${JSON.stringify(detail)}`); +} + +function asRecord(value: unknown): JsonRecord { + assertCondition(typeof value === "object" && value !== null && !Array.isArray(value), "expected JSON object", { value }); + return value as JsonRecord; +} + +function nestedRecord(value: unknown, path: string[]): JsonRecord { + let current: unknown = value; + for (const key 
of path) {
+    current = asRecord(current)[key];
+  }
+  return asRecord(current);
+}
+
+function stringArray(value: unknown): string[] {
+  return Array.isArray(value) ? value.map((item) => String(item)) : [];
+}
+
+function runCli(args: string[], stdin?: string): { status: number | null; stdout: string; stderr: string; json: JsonRecord | null } {
+  const result = spawnSync("bun", ["scripts/cli.ts", ...args], {
+    cwd: process.cwd(),
+    input: stdin,
+    encoding: "utf8",
+  });
+  const stdout = String(result.stdout || "");
+  let json: JsonRecord | null = null;
+  try {
+    json = JSON.parse(stdout) as JsonRecord;
+  } catch {
+    json = null;
+  }
+  return {
+    status: result.status,
+    stdout,
+    stderr: String(result.stderr || ""),
+    json,
+  };
+}
+
+const readOnlyPrompt = `
+DEV test class: read-only
+允许的 live mutation: none
+禁止动作: prod mutation, secret values, database patch, Code Queue backend restart, interrupt/cancel.
+closeout fields: report class, live mutation false, commands, target environment, evidence, residual risk.
+
+Task: update docs and run contract tests only.
+Validation: git diff --check and bun scripts/code-queue-prompt-lint-contract-test.ts.
+`;
+
+const unclassifiedM3SmokePrompt = `
+Run the HWLAB M3 smoke on DEV and verify res_boxsimu_1:DO1 -> hwlab-patch-panel -> res_boxsimu_2:DI1.
+Do not touch production.
+`;
+
+const liveReadPrompt = `
+DEV test class: live-read
+允许的 live mutation: none
+禁止动作: no POST/PUT/PATCH/DELETE, no deploy, no task trigger, no prod mutation.
+closeout fields: report class, live mutation false, endpoints read, namespace/service, evidence, residual risk.
+
+Read live DEV health and status only: GET /health and kubectl get pods.
+`;
+
+const liveMutatingPrompt = `
+DEV test class: live-mutating
+允许的 live mutation: run exactly one DEV HWLAB M3 operation for res_boxsimu_1:DO1 -> hwlab-patch-panel -> res_boxsimu_2:DI1, record operation/audit/evidence ids, then observe recovery.
+禁止动作: no prod mutation, no secret values, no database manual patch, no Code Queue backend restart, no unrelated interrupt/cancel.
+closeout fields: report class, live mutation true, command summary, DEV target, operation/audit/evidence ids, rollback observation, residual risk.
+`;
+
+const secretBearingPrompt = `
+DEV test class: live-mutating
+允许的 live mutation: POST /api/run?token=ds-secret-must-not-print once in DEV only.
+禁止动作: no prod mutation, no database patch.
+closeout fields: report class, live mutation true, command summary and evidence id.
+`;
+
+function assertLintShape(lint: JsonRecord): void {
+  assertCondition(lint.dryRun === true, "lint must be dry-run", lint);
+  assertCondition(lint.mutation === false, "lint must be non-mutating", lint);
+  assertCondition(asRecord(lint.policy).printsPromptText === false, "lint policy must not print full prompt", lint);
+  assertCondition(asRecord(lint.promptShape).textEchoed === false, "lint shape must not echo prompt text", lint);
+  assertCondition(Array.isArray(lint.signals), "lint must expose signals", lint);
+  const json = JSON.stringify(lint);
+  assertCondition(!json.includes("ds-secret-must-not-print"), "lint must not print secret marker", lint);
+}
+
+export function runCodeQueuePromptLintContract(): JsonRecord {
+  const readOnly = asRecord(codexPromptLiveAuthorizationLintForTest(readOnlyPrompt));
+  assertLintShape(readOnly);
+  assertCondition(readOnly.ok === true, "well-formed read-only prompt should pass", readOnly);
+  assertCondition(readOnly.declaredClass === "read-only", "read-only prompt should declare read-only", readOnly);
+  assertCondition(readOnly.effectiveClass === "read-only", "read-only effective class mismatch", readOnly);
+  assertCondition(readOnly.requiredClass === "read-only", "read-only required class mismatch", readOnly);
+  assertCondition(readOnly.dispatchDisposition === "ready", "read-only prompt should be dispatch-ready", readOnly);
+
+  const liveRead = 
asRecord(codexPromptLiveAuthorizationLintForTest(liveReadPrompt));
+  assertLintShape(liveRead);
+  assertCondition(liveRead.ok === true, "well-formed live-read prompt should pass", liveRead);
+  assertCondition(liveRead.declaredClass === "live-read", "live-read prompt should declare live-read", liveRead);
+  assertCondition(liveRead.requiredClass === "live-read", "live-read required class mismatch", liveRead);
+  assertCondition(liveRead.dispatchDisposition === "ready", "live-read prompt should be dispatch-ready", liveRead);
+
+  const unclassifiedM3 = asRecord(codexPromptLiveAuthorizationLintForTest(unclassifiedM3SmokePrompt));
+  assertLintShape(unclassifiedM3);
+  assertCondition(unclassifiedM3.ok === false, "unclassified M3 smoke should fail lint", unclassifiedM3);
+  assertCondition(unclassifiedM3.declaredClass === null, "unclassified prompt should have no declared class", unclassifiedM3);
+  assertCondition(unclassifiedM3.effectiveClass === "read-only", "unclassified prompt should default to read-only", unclassifiedM3);
+  assertCondition(unclassifiedM3.requiredClass === "live-mutating", "M3 smoke should require live-mutating", unclassifiedM3);
+  assertCondition(unclassifiedM3.dispatchDisposition === "needs-authorization", "unclassified live mutation should need authorization", unclassifiedM3);
+  assertCondition(stringArray(unclassifiedM3.missingOrContradictory).some((item) => item.includes("missing DEV test class")), "unclassified prompt should report missing class", unclassifiedM3);
+
+  const liveMutating = asRecord(codexPromptLiveAuthorizationLintForTest(liveMutatingPrompt));
+  assertLintShape(liveMutating);
+  assertCondition(liveMutating.ok === true, "well-formed live-mutating prompt should pass", liveMutating);
+  assertCondition(liveMutating.declaredClass === "live-mutating", "live-mutating prompt should declare live-mutating", liveMutating);
+  assertCondition(liveMutating.requiredClass === "live-mutating", "live-mutating required class mismatch", liveMutating);
+  assertCondition(liveMutating.liveMutationAuthorized === true, "live-mutating prompt should be authorized when allowed mutation is enumerated", liveMutating);
+
+  const secretBearing = asRecord(codexPromptLiveAuthorizationLintForTest(secretBearingPrompt));
+  assertLintShape(secretBearing);
+  assertCondition(secretBearing.requiredClass === "live-mutating", "secret-bearing live mutation should still classify", secretBearing);
+  assertCondition(!JSON.stringify(secretBearing).includes("ds-secret-must-not-print"), "prompt lint evidence must redact secret-looking values", secretBearing);
+
+  const tmp = mkdtempSync(join(tmpdir(), "unidesk-code-queue-prompt-lint-"));
+  const promptFile = join(tmp, "prompt.md");
+  writeFileSync(promptFile, liveMutatingPrompt, "utf8");
+  try {
+    const cliLint = runCli(["codex", "prompt-lint", "--prompt-file", promptFile]);
+    assertCondition(cliLint.status === 0 && cliLint.json?.ok === true, "prompt-lint CLI should succeed for authorized live-mutating prompt", cliLint.json ?? { stdout: cliLint.stdout });
+    const lintData = nestedRecord(cliLint.json?.data, []);
+    assertCondition(lintData.dryRun === true && lintData.mutation === false, "prompt-lint CLI should be dry-run and non-mutating", lintData);
+    assertCondition(lintData.declaredClass === "live-mutating", "prompt-lint CLI should classify live-mutating", lintData);
+    assertCondition(!JSON.stringify(lintData).includes("run exactly one DEV HWLAB M3 operation"), "prompt-lint CLI should not echo full prompt text", lintData);
+  } finally {
+    rmSync(tmp, { recursive: true, force: true });
+  }
+
+  const submitDryRun = runCli(["codex", "submit", "--prompt-stdin", "--dry-run"], unclassifiedM3SmokePrompt);
+  assertCondition(submitDryRun.status === 0 && submitDryRun.json?.ok === true, "submit dry-run should still succeed for commander review", submitDryRun.json ?? 
{ stdout: submitDryRun.stdout });
+  const submitPromptLint = nestedRecord(submitDryRun.json?.data, ["promptLint"]);
+  assertCondition(submitPromptLint.dispatchDisposition === "needs-authorization", "submit dry-run should embed prompt lint authorization blocker", submitPromptLint);
+  assertCondition(submitPromptLint.requiredClass === "live-mutating", "submit dry-run lint should require live-mutating", submitPromptLint);
+
+  const steerDryRun = runCli(["codex", "steer", "codex_test_task", "--prompt-stdin", "--dry-run"], unclassifiedM3SmokePrompt);
+  assertCondition(steerDryRun.status === 0 && steerDryRun.json?.ok === true, "steer dry-run should succeed for commander review", steerDryRun.json ?? { stdout: steerDryRun.stdout });
+  const steerPromptLint = nestedRecord(steerDryRun.json?.data, ["promptLint"]);
+  assertCondition(steerPromptLint.dispatchDisposition === "needs-authorization", "steer dry-run should embed prompt lint authorization blocker", steerPromptLint);
+
+  const help = runCli(["codex", "help"]);
+  assertCondition(help.status === 0 && help.json?.ok === true, "codex help should succeed", help.json ?? 
{ stdout: help.stdout });
+  const helpData = nestedRecord(help.json?.data, []);
+  const usage = stringArray(helpData.usage);
+  assertCondition(usage.some((line) => line.includes("codex prompt-lint")), "codex help should list prompt-lint", helpData);
+  const authorizationHelp = nestedRecord(helpData, ["promptLiveAuthorization"]);
+  assertCondition(stringArray(authorizationHelp.classes).includes("live-mutating"), "help should document live-mutating class", authorizationHelp);
+  assertCondition(authorizationHelp.defaultWhenMissing === "read-only", "help should document read-only default", authorizationHelp);
+
+  return {
+    ok: true,
+    checks: [
+      "prompt-lint classifies read-only/live-read/live-mutating prompts",
+      "unclassified HWLAB M3 smoke defaults read-only but requires live-mutating authorization",
+      "prompt-lint evidence redacts secret-looking values",
+      "prompt-lint CLI is dry-run, non-mutating, and does not echo full prompt text",
+      "submit --dry-run embeds prompt live-authorization lint",
+      "steer --dry-run embeds prompt live-authorization lint",
+      "codex help documents prompt-lint and authorization classes",
+    ],
+  };
+}
+
+if (import.meta.main) {
+  process.stdout.write(`${JSON.stringify(runCodeQueuePromptLintContract(), null, 2)}\n`);
+}
diff --git a/scripts/src/check.ts b/scripts/src/check.ts
index d084c647..3efa2430 100644
--- a/scripts/src/check.ts
+++ b/scripts/src/check.ts
@@ -31,6 +31,7 @@ const syntaxFiles = [
   "scripts/host-codex-commander-skeleton-contract-test.ts",
   "scripts/auth-broker-contract-test.ts",
   "scripts/code-queue-cli-disclosure-contract-test.ts",
+  "scripts/code-queue-prompt-lint-contract-test.ts",
   "scripts/code-queue-cli-steer-test.ts",
   "scripts/code-queue-cli-submit-prompt-contract-test.ts",
   "scripts/code-queue-cli-read-terminal-contract-test.ts",
@@ -311,6 +312,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
     fileItem("scripts/code-queue-pr-preflight-contract-test.ts"),
     fileItem("scripts/code-queue-runner-skills-contract-test.ts"),
     fileItem("scripts/code-queue-cli-disclosure-contract-test.ts"),
+    fileItem("scripts/code-queue-prompt-lint-contract-test.ts"),
     fileItem("scripts/code-queue-cli-steer-test.ts"),
     fileItem("scripts/code-queue-cli-read-terminal-contract-test.ts"),
     fileItem("scripts/code-queue-submit-routing-contract-test.ts"),
@@ -353,6 +355,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
     items.push(commandItem("code-queue:pr-preflight-contract", ["bun", "scripts/code-queue-pr-preflight-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:runner-skills-contract", ["bun", "scripts/code-queue-runner-skills-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:cli-disclosure-contract", ["bun", "scripts/code-queue-cli-disclosure-contract-test.ts"], 30_000));
+    items.push(commandItem("code-queue:prompt-lint-contract", ["bun", "scripts/code-queue-prompt-lint-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:cli-steer-contract", ["bun", "scripts/code-queue-cli-steer-test.ts"], 30_000));
     items.push(commandItem("code-queue:read-terminal-contract", ["bun", "scripts/code-queue-cli-read-terminal-contract-test.ts"], 30_000));
     items.push(commandItem("code-queue:submit-prompt-contract", ["bun", "scripts/code-queue-cli-submit-prompt-contract-test.ts"], 30_000));
@@ -385,6 +388,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default
     items.push(skippedItem("code-queue:pr-preflight-contract", "Code Queue PR preflight contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:runner-skills-contract", "Code Queue runner skill availability contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:cli-disclosure-contract", "Code Queue CLI disclosure/noise contract is opt-in with script checks", "--scripts-typecheck or --full"));
+    items.push(skippedItem("code-queue:prompt-lint-contract", "Code Queue prompt live-authorization lint contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:cli-steer-contract", "Code Queue steer CLI contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:read-terminal-contract", "Code Queue terminal read contract is opt-in with script checks", "--scripts-typecheck or --full"));
     items.push(skippedItem("code-queue:submit-prompt-contract", "Code Queue submit prompt contract is opt-in with script checks", "--scripts-typecheck or --full"));
diff --git a/scripts/src/code-queue.ts b/scripts/src/code-queue.ts
index c369a44d..2f180830 100644
--- a/scripts/src/code-queue.ts
+++ b/scripts/src/code-queue.ts
@@ -67,6 +67,71 @@ interface CodexJudgeOptions {
   includePrompt: boolean;
 }
+type LiveTestAuthorizationClass = "read-only" | "live-read" | "live-mutating";
+type PromptLintSeverity = "info" | "warning" | "block";
+type PromptLintDisposition = "ready" | "review" | "needs-authorization";
+
+interface CodexPromptLintOptions {
+  prompt: string;
+}
+
+interface PromptLintSignal {
+  id: string;
+  severity: PromptLintSeverity;
+  matched: boolean;
+  evidence: string[];
+  message: string;
+}
+
+interface PromptLiveAuthorizationLint {
+  ok: boolean;
+  dryRun: true;
+  mutation: false;
+  dispatchDisposition: PromptLintDisposition;
+  declaredClass: LiveTestAuthorizationClass | null;
+  effectiveClass: LiveTestAuthorizationClass;
+  requiredClass: LiveTestAuthorizationClass;
+  defaultedReadOnly: boolean;
+  liveMutationAuthorized: boolean;
+  promptShape: {
+    chars: number;
+    lines: number;
+    textEchoed: false;
+  };
+  requiredPromptFields: {
+    devTestClass: {
+      present: boolean;
+      value: LiveTestAuthorizationClass | null;
+      allowedValues: LiveTestAuthorizationClass[];
+    };
+    allowedLiveMutation: {
+      present: boolean;
+      nonNone: boolean;
+      requiredWhen: "live-mutating";
+    };
+    forbiddenActions: {
+      present: boolean;
+    };
+    closeoutFields: {
+      present: boolean;
+    };
+  };
+  signals: PromptLintSignal[];
+  missingOrContradictory: string[];
+  policy: {
+    defaultWhenUnclassified: "read-only";
+    promptLintOnly: true;
+    accessesLiveService: false;
+    printsPromptText: false;
+    reference: string;
+  };
+  commands: {
+    lintFile: string;
+    submitDryRun: string;
+    steerDryRun: string;
+  };
+}
+
 interface CodexSubmitOptions {
   prompt: string;
   queueId: string | undefined;
@@ -892,6 +957,196 @@ function routeSignal(id: string, severity: SubmitRouteSignalSeverity, evidence:
   return { id, severity, matched: evidence.length > 0, evidence, message };
 }
+function promptLintSignal(id: string, severity: PromptLintSeverity, evidence: string[], message: string): PromptLintSignal {
+  return { id, severity, matched: evidence.length > 0, evidence, message };
+}
+
+function declaredLiveTestClass(prompt: string): LiveTestAuthorizationClass | null {
+  const patterns: Array<[LiveTestAuthorizationClass, RegExp[]]> = [
+    ["live-mutating", [
+      /\bDEV\s+test\s+class\s*[::]\s*`?live-mutating`?/iu,
+      /\blive\s+test\s+class\s*[::]\s*`?live-mutating`?/iu,
+      /\btest\s+class\s*[::]\s*`?live-mutating`?/iu,
+      /\bDEV\s+测试(?:授权)?分级\s*[::]\s*`?live-mutating`?/iu,
+    ]],
+    ["live-read", [
+      /\bDEV\s+test\s+class\s*[::]\s*`?live-read`?/iu,
+      /\blive\s+test\s+class\s*[::]\s*`?live-read`?/iu,
+      /\btest\s+class\s*[::]\s*`?live-read`?/iu,
+      /\bDEV\s+测试(?:授权)?分级\s*[::]\s*`?live-read`?/iu,
+    ]],
+    ["read-only", [
+      /\bDEV\s+test\s+class\s*[::]\s*`?read-only`?/iu,
+      /\blive\s+test\s+class\s*[::]\s*`?read-only`?/iu,
+      /\btest\s+class\s*[::]\s*`?read-only`?/iu,
+      /\bDEV\s+测试(?:授权)?分级\s*[::]\s*`?read-only`?/iu,
+    ]],
+  ];
+  for (const [value, valuePatterns] of patterns) {
+    if (valuePatterns.some((pattern) => pattern.test(prompt))) return value;
+  }
+  return null;
+}
+
+function liveClassRank(value: LiveTestAuthorizationClass): number {
+  if (value === "read-only") return 0;
+  if (value === 
"live-read") return 1; + return 2; +} + +function hasPromptField(prompt: string, patterns: RegExp[]): boolean { + return patterns.some((pattern) => pattern.test(prompt)); +} + +function sanitizePromptLintEvidence(evidence: string[]): string[] { + return evidence.map((item) => item + .replace(/([?&](?:token|api[_-]?key|secret|password|credential)=)[^&\s]+/giu, "$1<redacted>") + .replace(/((?:token|api[_-]?key|secret|password|credential)\s*[:=]\s*)[^\s,;]+/giu, "$1<redacted>") + .replace(/(Bearer\s+)[A-Za-z0-9._~+/-]+=*/giu, "$1<redacted>") + .slice(0, 160)); +} + +function buildPromptLiveAuthorizationLint(prompt: string): PromptLiveAuthorizationLint { + const declaredClass = declaredLiveTestClass(prompt); + const effectiveClass = declaredClass ?? "read-only"; + const allowedClasses: LiveTestAuthorizationClass[] = ["read-only", "live-read", "live-mutating"]; + const liveReadEvidence = sanitizePromptLintEvidence(regexEvidenceWithoutNegatedContext(prompt, [ + /\blive[- ]read\b/giu, + /\blive\s+(?:dev\s+)?(?:service|runtime|endpoint|health|status|logs?|metrics?)\b/giu, + /\bGET\s+\/(?:health|status|live|metrics|api\/diagnostics)\b/gu, + /\bkubectl\s+(?:get|describe|logs)\b/giu, + /\bmicroservice\s+(?:health|status|diagnostics)\b/giu, + /\bdiagnostics\b|\bstatus\b|\bmetrics\b|\blogs?\b/giu, + /只读(?:读取|观察|诊断|状态|日志)/gu, + /读取\s*(?:DEV|live|运行中|服务|日志|状态)/giu, + ])); + const liveMutationEvidence = sanitizePromptLintEvidence(regexEvidenceWithoutNegatedContext(prompt, [ + /\blive-mutating\b/giu, + /\bDEV\s+smoke\b|\blive\s+smoke\b|\bM3\s+smoke\b/giu, + /\bdeploy\s+apply\b|\brollout\s+restart\b|\bkubectl\s+(?:apply|delete|patch|rollout)\b/giu, + /\b(?:POST|PUT|PATCH|DELETE)\s+\/[A-Za-z0-9_./:-]*/gu, + /\bcodex\s+(?:submit|steer|interrupt|cancel)\b/giu, + /\btask\s+(?:submit|steer|retry|trigger)\b/giu, + /\btrigger\s+(?:schedule|job|task|operation|audit|evidence)\b/giu, + /\bschedule\s+(?:run|retry-run|delete)\b/giu, + 
/\b(?:create|write|post|put|patch)\b[^\n。]{0,40}\b(?:operation|audit|evidence)\b/giu, + /\b(?:operation|audit|evidence)\s+(?:id|record|write|create)\b/giu, + /\bDO\d+\b|\bDI\d+\b|\bres_boxsimu_\d+\b|\bhwlab-patch-panel\b/giu, + /触发|写入|部署|重启|重建|回滚|创建(?:任务|operation|audit|evidence)|硬件|虚拟硬件/gu, + ])); + const prodMutationEvidence = sanitizePromptLintEvidence(regexEvidenceWithoutNegatedContext(prompt, [ + /\bprod(?:uction)?\b[^\n。]*(?:deploy|restart|write|mutation|mutating|apply|rollout|delete|patch)\b/giu, + /\b(?:deploy|restart|write|mutation|mutating|apply|rollout|delete|patch)\b[^\n。]*\bprod(?:uction)?\b/giu, + /生产[^\n。]*(?:写入|部署|重启|变更|删除|回滚)/gu, + ])); + const requiredClass = liveMutationEvidence.length > 0 || prodMutationEvidence.length > 0 + ? "live-mutating" + : liveReadEvidence.length > 0 + ? "live-read" + : "read-only"; + const allowedLiveMutationPresent = hasPromptField(prompt, [ + /\ballowed\s+live\s+mutation\s*[::]/iu, + /允许的\s*live\s*mutation\s*[::]/iu, + /允许的(?:现场|实时|运行态)?(?:写入|变更|mutation)\s*[::]/iu, + ]); + const allowedLiveMutationNone = hasPromptField(prompt, [ + /\ballowed\s+live\s+mutation\s*[::]\s*(?:`?none`?|无|なし)(?:\s|$|[。.;,,])/iu, + /允许的\s*live\s*mutation\s*[::]\s*(?:`?none`?|无|なし)(?:\s|$|[。.;,,])/iu, + /允许的(?:现场|实时|运行态)?(?:写入|变更|mutation)\s*[::]\s*(?:`?none`?|无|なし)(?:\s|$|[。.;,,])/iu, + ]); + const forbiddenActionsPresent = hasPromptField(prompt, [ + /\bforbidden\s+actions?\s*[::]/iu, + /禁止动作\s*[::]/u, + /禁止\s*[::]/u, + ]); + const closeoutFieldsPresent = hasPromptField(prompt, [ + /\bcloseout\s+fields?\s*[::]/iu, + /\bfinal\s+response\b[^\n。]*(?:must|include|report)/iu, + /收口字段\s*[::]/u, + /\bfinal\s+response\b[^\n。]*报告/iu, + ]); + const effectiveInsufficient = liveClassRank(effectiveClass) < liveClassRank(requiredClass); + const liveMutationAuthorized = effectiveClass === "live-mutating" && allowedLiveMutationPresent && !allowedLiveMutationNone; + const contradictionEvidence = [ + ...(effectiveClass === "read-only" && 
liveMutationEvidence.length > 0 ? ["declares/read-only but prompt contains live mutation signals"] : []), + ...(effectiveClass === "live-read" && liveMutationEvidence.length > 0 ? ["declares/live-read but prompt contains live mutation signals"] : []), + ...(effectiveClass === "live-mutating" && allowedLiveMutationNone ? ["declares/live-mutating but allowed live mutation is none"] : []), + ...(prodMutationEvidence.length > 0 ? prodMutationEvidence.map((item) => `prod mutation signal: ${item}`) : []), + ]; + const missingOrContradictory = [ + ...(declaredClass === null ? ["missing DEV test class; defaulting to read-only"] : []), + ...(effectiveInsufficient ? [`effective class ${effectiveClass} is below required ${requiredClass}`] : []), + ...(requiredClass === "live-mutating" && !allowedLiveMutationPresent ? ["live-mutating prompt must include allowed live mutation"] : []), + ...(requiredClass === "live-mutating" && allowedLiveMutationNone ? ["live-mutating prompt cannot set allowed live mutation to none"] : []), + ...(!forbiddenActionsPresent ? ["missing forbidden actions"] : []), + ...(!closeoutFieldsPresent ? ["missing closeout fields"] : []), + ...contradictionEvidence, + ]; + const signals = [ + promptLintSignal("declared-dev-test-class", "info", declaredClass === null ? 
[] : [declaredClass], "Prompt explicitly declares DEV test class."), + promptLintSignal("live-read-signal", "warning", liveReadEvidence, "Prompt appears to read live DEV service state, logs, health, status, metrics, or Kubernetes objects."), + promptLintSignal("live-mutation-signal", "block", liveMutationEvidence, "Prompt appears to trigger runtime writes, deployment, task control, operation/audit/evidence creation, or HWLAB DO/DI activity."), + promptLintSignal("prod-mutation-signal", "block", prodMutationEvidence, "Prompt appears to mention production mutation; Code Queue runner prompts must not implicitly authorize this."), + promptLintSignal("allowed-live-mutation-field", requiredClass === "live-mutating" ? "block" : "info", allowedLiveMutationPresent && !allowedLiveMutationNone ? ["present"] : [], "live-mutating prompts must enumerate allowed live mutation commands and target state changes."), + promptLintSignal("forbidden-actions-field", "warning", forbiddenActionsPresent ? ["present"] : [], "Prompt should list forbidden high-risk actions."), + promptLintSignal("closeout-fields-field", "warning", closeoutFieldsPresent ? ["present"] : [], "Prompt should require final closeout fields for class, mutation, commands, targets, evidence, and residual risk."), + ]; + const ok = missingOrContradictory.length === 0; + const dispatchDisposition: PromptLintDisposition = ok + ? "ready" + : requiredClass === "live-mutating" || effectiveInsufficient || contradictionEvidence.length > 0 + ? 
"needs-authorization" + : "review"; + return { + ok, + dryRun: true, + mutation: false, + dispatchDisposition, + declaredClass, + effectiveClass, + requiredClass, + defaultedReadOnly: declaredClass === null, + liveMutationAuthorized, + promptShape: { + chars: prompt.length, + lines: prompt.split(/\r\n|\r|\n/u).length, + textEchoed: false, + }, + requiredPromptFields: { + devTestClass: { + present: declaredClass !== null, + value: declaredClass, + allowedValues: allowedClasses, + }, + allowedLiveMutation: { + present: allowedLiveMutationPresent, + nonNone: allowedLiveMutationPresent && !allowedLiveMutationNone, + requiredWhen: "live-mutating", + }, + forbiddenActions: { + present: forbiddenActionsPresent, + }, + closeoutFields: { + present: closeoutFieldsPresent, + }, + }, + signals, + missingOrContradictory, + policy: { + defaultWhenUnclassified: "read-only", + promptLintOnly: true, + accessesLiveService: false, + printsPromptText: false, + reference: "docs/reference/code-queue-supervision.md#dev-测试授权分级", + }, + commands: { + lintFile: "bun scripts/cli.ts codex prompt-lint --prompt-file <path>", + submitDryRun: "bun scripts/cli.ts codex submit --prompt-file <path> --dry-run", + steerDryRun: "bun scripts/cli.ts codex steer <taskId> --prompt-file <path> --dry-run", + }, + }; +} + function submitPolicyContract(): SubmitRoutingRecommendation["policyContract"] { return { selectionPrinciples: [ @@ -3318,6 +3573,16 @@ function parseSteerOptions(args: string[]): CodexSteerOptions { }; } +function parsePromptLintOptions(args: string[]): CodexPromptLintOptions { + assertKnownOptions(args, { + flags: ["--prompt-stdin", "--stdin"], + valueOptions: ["--prompt-file", "--file"], + }, "codex prompt-lint"); + return { + prompt: promptFromArgs(args, "codex prompt-lint", steerPromptValueOptions), + }; +} + function submitPayload(options: CodexSubmitOptions): Record<string, unknown> { return { prompt: options.prompt, @@ -4986,17 +5251,28 @@ export function 
codexSubmitRoutingRecommendationForTest(prompt: string, model?: }); } +export function codexPromptLiveAuthorizationLintForTest(prompt: string): PromptLiveAuthorizationLint { + return buildPromptLiveAuthorizationLint(prompt); +} + export function codexSubmitModelRegistryForTest(models: string[] = sharedDefaultCodeModels): ReturnType<typeof submitModelRegistry> { return submitModelRegistry(models); } +function codexPromptLintTask(args: string[]): unknown { + const options = parsePromptLintOptions(args); + return buildPromptLiveAuthorizationLint(options.prompt); +} + function codexSubmitTask(args: string[]): unknown { const options = parseSubmitOptions(args); const payload = submitPayload(options); + const promptLint = buildPromptLiveAuthorizationLint(options.prompt); if (options.dryRun) { return { ok: true, dryRun: true, + promptLint, routingRecommendation: submitRoutingRecommendation(options), modelRegistry: submitModelRegistry(), request: { @@ -5027,6 +5303,7 @@ function codexInterruptTask(taskId: string): unknown { function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFetcher = coreInternalFetch): unknown { const options = parseSteerOptions(args); + const promptLint = buildPromptLiveAuthorizationLint(options.prompt); const targetPath = `/api/tasks/${encodeURIComponent(taskId)}/steer`; const stableProxyPath = codeQueueProxyPath(targetPath); const rawProxyEquivalent = codeQueueProxyEquivalentCommand(targetPath, "{\"prompt\":\"...\"}"); @@ -5053,6 +5330,7 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe return { ok: true, dryRun: true, + promptLint, request, commands: { run: `bun scripts/cli.ts codex steer ${taskId} --prompt-file <path>`, @@ -5164,6 +5442,9 @@ function codexSteerTask(taskId: string, args: string[], fetcher: CodexResponseFe export async function runCodeQueueCommand(config: UniDeskConfig, args: string[]): Promise<unknown> { const [action = "task", taskIdArg] = args; + if (action === 
"prompt-lint" || action === "lint-prompt") { + return codexPromptLintTask(args.slice(1)); + } if (action === "submit" || action === "enqueue") { return codexSubmitTask(args.slice(1)); } @@ -5214,5 +5495,5 @@ export async function runCodeQueueCommand(config: UniDeskConfig, args: string[]) const taskId = requireTaskId(taskIdArg, "codex steer"); return codexSteerTask(taskId, args.slice(2)); } - throw new Error("codex command must be one of: submit, enqueue, task, summary, show, tasks, overview, output, judge, read, mark-read, dev-ready, health, skills-sync, pr-preflight, runtime-preflight, queues, queue list, queue create, queue merge, move, steer, interrupt, cancel"); + throw new Error("codex command must be one of: prompt-lint, submit, enqueue, task, summary, show, tasks, overview, output, judge, read, mark-read, dev-ready, health, skills-sync, pr-preflight, runtime-preflight, queues, queue list, queue create, queue merge, move, steer, interrupt, cancel"); } diff --git a/scripts/src/help.ts b/scripts/src/help.ts index a791f64f..4406eb9f 100644 --- a/scripts/src/help.ts +++ b/scripts/src/help.ts @@ -52,7 +52,8 @@ export function rootHelp(): unknown { { command: "schedule list|get|runs|run|retry-run|delete", description: "Manage backend-core scheduled tasks and run history; schedule run <id> supports --wait-ms N and retry-run reuses the failed run's schedule." }, { command: "schedule upsert-pgdata-backup [--time HH:MM] [--remote-base /SERVER_DATA/UNIDESK_PG_DATA]", description: "Create or update the daily PGDATA physical backup task that uploads monthly rotated archives to Baidu Netdisk." }, { command: "codex deploy <commitId> [--provider-id D601] [--timeout-ms N]", description: "Disabled legacy Code Queue deploy path; use the dev-only artifact consumer instead." 
}, - { command: "codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue queueId] [--provider-id id] [--cwd path] [--model model] [--execution-mode mode] [--max-attempts N] [--reference-task-id id] [--dry-run]", description: "Submit a Code Queue task through backend-core -> code-queue proxy; --dry-run shows the structured request, while real success only confirms the write and task id." }, + { command: "codex prompt-lint [prompt|--prompt-file path|--prompt-stdin]", description: "Dry-run lint a runner prompt for DEV test class read-only/live-read/live-mutating authorization without echoing prompt text or touching live services." }, + { command: "codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue queueId] [--provider-id id] [--cwd path] [--model model] [--execution-mode mode] [--max-attempts N] [--reference-task-id id] [--dry-run]", description: "Submit a Code Queue task through backend-core -> code-queue proxy; --dry-run shows the structured request, routing recommendation, and prompt live-authorization lint while real success only confirms the write and task id." }, { command: "codex skills-sync --dry-run [--full]", description: "Inspect the controlled runner skills hostPath lifecycle contract without copying files, restarting services, reading secrets, or mutating live runner paths." }, { command: "codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/<name>] [--pr-create-dry-run --pr-create-dry-run-head <head>] [--issue N] [--full|--raw]", description: "Read-only PR admission check with compact commander output by default; use --full or --raw to expand the full runtime preflight, tool, and observation payload." }, { command: "codex task <taskId> [--detail] [--trace --tail|--from-start|--after-seq N|--before-seq N --limit N] [--full]", description: "Fetch the bounded review view by default; --detail is still capped, while --full/trace/output explicitly expand evidence." 
}, @@ -245,10 +246,11 @@ function scheduleHelp(): unknown { function codexHelp(): unknown { return { - command: "codex deploy|submit|task|tasks|output|read|dev-ready|skills-sync|pr-preflight|judge|steer|interrupt|cancel|queues|queue|move", + command: "codex deploy|prompt-lint|submit|task|tasks|output|read|dev-ready|skills-sync|pr-preflight|judge|steer|interrupt|cancel|queues|queue|move", output: "json", usage: [ "bun scripts/cli.ts codex deploy <commitId> # disabled legacy deployment entry", + "bun scripts/cli.ts codex prompt-lint [prompt|--prompt-file path|--prompt-stdin]", "bun scripts/cli.ts codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue id] [--model model] [--dry-run]", "cat <<'PROMPT' | bun scripts/cli.ts codex submit --prompt-stdin --queue <id> --dry-run", "bun scripts/cli.ts codex submit --prompt-file /tmp/code-queue-prompt.md --queue <id> --dry-run", @@ -276,6 +278,7 @@ function codexHelp(): unknown { disclosure: "Full prompt, tool logs, and feedback prompts are not printed by codex read; use codex task/detail/trace/output for progressive disclosure.", }, examples: { + promptLint: "bun scripts/cli.ts codex prompt-lint --prompt-file /tmp/code-queue-prompt.md", stdin: [ "cat <<'PROMPT' | bun scripts/cli.ts codex submit --prompt-stdin --queue <id> --dry-run", "<multi-line prompt body>", @@ -300,6 +303,13 @@ function codexHelp(): unknown { redline: "data.supervisor.activeRunning.redline names the count field, routine target, burst redline, hard redline, and decisionReady flag.", limitSemantics: "filters.requestedLimit preserves the user input; filters.limit/effectiveLimit shows the capped query budget; section outputBudget/rowPage show returned-row caps.", }, + promptLiveAuthorization: { + classes: ["read-only", "live-read", "live-mutating"], + defaultWhenMissing: "read-only", + command: "bun scripts/cli.ts codex prompt-lint --prompt-file <path>", + embeddedIn: ["codex submit --dry-run", "codex steer --dry-run"], + reference: 
"docs/reference/code-queue-supervision.md#dev-测试授权分级", + }, description: "Operate Code Queue through the stable backend-core private proxy path with bounded activity summaries for queue and supervisor views. Real submit/steer success is a low-noise write confirmation and does not echo prompt text; terminal steer rejection returns compact status plus codex task/read/submit follow-up commands.", }; }