From afe50b9e8f4982bd9564f8eaad431f4fe2d3c4bc Mon Sep 17 00:00:00 2001 From: Codex Date: Thu, 21 May 2026 12:51:53 +0000 Subject: [PATCH] test: add commander no-daemon smoke contract --- AGENTS.md | 2 +- docs/reference/cli.md | 2 +- docs/reference/host-codex-commander.md | 27 ++- ...commander-no-daemon-smoke-contract-test.ts | 170 ++++++++++++++++++ scripts/src/check.ts | 4 + scripts/src/commander.ts | 161 +++++++++++++++++ scripts/src/help.ts | 7 +- 7 files changed, 367 insertions(+), 6 deletions(-) create mode 100644 scripts/host-codex-commander-no-daemon-smoke-contract-test.ts diff --git a/AGENTS.md b/AGENTS.md index 6a8a5486..a8d24a0a 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -44,7 +44,7 @@ UniDesk 是一个以主 server 为统一入口的分布式工作平台;本文 - `bun scripts/cli.ts dev-env validate [--manifest path] [--kubectl-dry-run]` / `dev-env prewarm-images`:离线校验 D601 `unidesk-dev` 生产隔离护栏和 dev workload manifests,或把开发底座基础镜像预热到 D601 原生 k3s containerd,规则见 `docs/reference/deploy.md` 与 `docs/reference/microservices.md`。 - `bun scripts/cli.ts artifact-registry plan|render|status|health|install|deploy-backend-core|deploy-service`:管理 D601 host-managed CNCF Distribution registry,并通过短生命周期 relay 或 D601 pull/import 做 commit-pinned pull-only artifact CD;`deploy-backend-core` 是 deprecated 兼容名,`findjob`/`pipeline` 支持 D601 direct dev/prod,`met-nonlinear` 和 `k3sctl-adapter` 只给受限计划路径,`code-queue` 只支持 dev,规则见 `docs/reference/artifact-registry.md`。 - `bun scripts/cli.ts gh auth status|issue ...|pr list|view|create|comment` / `bun scripts/code-queue-pr-preflight-example.ts`:通过 REST 执行安全 GitHub issue 读写、脱敏 auth/status 诊断、body-file Markdown 写入、当日滚动简报时间线 ClaudeQQ 通知、escape 扫描、只读 cleanup-plan 和 #20 board-audit、PR 创建/评论 dry-run 与 runner PR preflight;`gh pr merge` 当前仍结构化拒绝,规则见 `docs/reference/cli.md` 和 `docs/reference/code-queue-supervision.md`。 -- `bun scripts/cli.ts commander contract|plan --dry-run|approval request --dry-run`:查看 host Codex 指挥官直管微服务 skeleton 的 source/contract、.state/commander/ 状态模型、trace summary 聚合和 ClaudeQQ 高风险请示草案;当前只返回 dry-run 计划,不接 live bridge、不接管人工指挥官,不发送消息,规则见 `docs/reference/host-codex-commander.md`。 +- `bun scripts/cli.ts commander contract|plan --dry-run|smoke --dry-run|approval request --dry-run`:查看 host Codex 指挥官直管微服务 skeleton 的 source/contract、无 daemon smoke 验证计划、.state/commander/ 状态模型、trace summary 聚合和 ClaudeQQ 高风险请示草案;当前只返回 dry-run 计划,不接 live bridge、不接管人工指挥官,不发送消息,规则见 `docs/reference/host-codex-commander.md`。 - `bun scripts/cli.ts ci install/status/run/publish-backend-core/publish-user-service/run-dev-e2e/logs`:在 D601 原生 k3s 上安装和运行 Tekton CI,支持每 commit 检查、Code Queue 只读性能门禁、`CI.json` catalog 驱动的 backend-core 与 user-service commit-pinned 镜像发布和手动触发的 `origin/master:deploy.json#environments.dev` 临时 namespace e2e;catalog/producer/consumer 分工见 `docs/reference/cicd-standardization.md`,`run-dev-e2e` 的 Git 控制 runner、短 launcher 和 no-CD 边界见 `docs/reference/dev-ci-runner.md`,Tekton 规则见 `docs/reference/ci.md`。 - `bun scripts/cli.ts codex deploy `:旧 Code Queue 兼容部署入口已禁用,原因是它会绕过受控部署边界直连 D601 部署 Code Queue;规则见 `docs/reference/codex-deploy.md`。 - `bun scripts/cli.ts codex submit [prompt] [--prompt-file path|--prompt-stdin] [--queue ]` / `codex pr-preflight [--remote]`:前者通过 backend-core 私有代理提交 Code Queue 任务,`--dry-run` 会给出 MiniMax/GPT/人工路由建议但不改写 payload;后者只读检查 D601 scheduler/runner 的 GitHub token、egress 和 PR 能力,PR 型派单前必须使用,规则见 `docs/reference/cli.md` 和 `docs/reference/code-queue-supervision.md`。 diff --git a/docs/reference/cli.md b/docs/reference/cli.md index 68e73d13..3f383640 100644 --- a/docs/reference/cli.md +++ b/docs/reference/cli.md @@ -29,7 +29,7 @@ CLI 可以从 `master` 快速演进,但必须兼容 `deploy.json` 固定的 CI - `dev-env validate [--manifest path] [--kubectl-dry-run]` 离线校验 D601 `unidesk-dev` namespace、dev PostgreSQL 底座和 dev workload manifest。默认检查 `src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-foundation.k8s.yaml`;也可显式校验 `src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-core.k8s.yaml` 或 `src/components/microservices/k3sctl-adapter/k3s/dev/unidesk-dev-code-queue.k8s.yaml`。所有 namespaced 对象必须只落到 `unidesk-dev`,foundation manifest 必须包含 `postgres-dev` StatefulSet/Service、dev secret/config、迁移 Job 和 DB URL guard,core manifest 必须包含 `backend-core-dev`/`frontend-dev` Deployment/Service,Code Queue dev manifest 必须包含 `code-queue-scheduler-dev`、`code-queue-read-dev`、`code-queue-write-dev`、dev provider egress proxy,以及只读挂载宿主 `/home/ubuntu/.agents/skills` 到容器 `/root/.agents/skills` 的 `skills-dir` volume。加 `--kubectl-dry-run` 时额外执行 `kubectl apply --dry-run=client --validate=false -f `,仍不 apply 资源。 - `dev-env prewarm-images [--image image] [--provider-id D601] [--no-pull] [--proxy-url URL] [--pull-timeout-ms N] [--dry-run]` 创建异步 job,通过 UniDesk SSH 维护桥在 D601 上把开发底座依赖镜像从 Docker 缓存导入原生 k3s containerd。默认镜像是 `postgres:16-alpine` 和 `rancher/mirrored-library-busybox:1.36.1`,用于避免 `postgres-dev` 与 local-path helper pod 卡在外部 registry 拉取。该命令固定验证 `/etc/rancher/k3s/k3s.yaml` 指向的 native k3s 上下文,并输出 `dev_env_containerd_image_ready=...` 作为成功判据;它不 apply manifest、不修改生产 `unidesk` namespace。 - `artifact-registry plan|render|status|health|install|deploy-backend-core|deploy-service` 管理 D601 host-managed CNCF Distribution registry 的声明、安装、只读检查和 pull-only artifact CD。该 registry 固定为 D601 loopback `127.0.0.1:5000`,由 systemd + Docker Compose 管理,位于 native k3s 故障域外;`deploy-service` 只拉取 CI 已发布的 commit-pinned 镜像、retag/recreate 或导入 native k3s,并做 live commit 验证,不构建 runtime source。`deploy-backend-core` 是 deprecated 兼容名,标准 backend-core prod CD 入口是 `deploy apply --env prod --service backend-core`。长期规则见 `docs/reference/artifact-registry.md`。 -- `commander contract|plan --dry-run|approval request --dry-run` 是 host Codex 指挥官直管微服务 skeleton 入口。当前命令返回 `phase=source-contract`、service/API/state/bridge/prompt/trace/#20/#46/ClaudeQQ 审批边界、.state/commander/ 状态模型和 dry-run 计划,服务骨架只提供本地 `/health`、`/api/commander/contract`、状态读写、trace summary 聚合和 approval draft preview,不接 live bridge、不注入 prompt、不发送 ClaudeQQ。`plan` 与 `approval request` 必须带 `--dry-run`;缺少时返回 `error=dry-run-required`。长期规则见 `docs/reference/host-codex-commander.md`。 +- `commander contract|plan --dry-run|smoke --dry-run|approval request --dry-run` 是 host Codex 指挥官直管微服务 skeleton 入口。当前命令返回 `phase=source-contract`、service/API/state/bridge/prompt/trace/#20/#46/ClaudeQQ 审批边界、.state/commander/ 状态模型、dev 无 daemon smoke contract 和 dry-run 计划,服务骨架只提供本地 `/health`、`/api/commander/contract`、状态读写、trace summary 聚合和 approval draft preview,不接 live bridge、不注入 prompt、不发送 ClaudeQQ。`plan`、`smoke` 与 `approval request` 必须带 `--dry-run`;缺少时返回 `error=dry-run-required`。长期规则见 `docs/reference/host-codex-commander.md`。 - `gh auth status [--repo owner/name]` 探测 GitHub 操作前置条件并输出脱敏 JSON:是否存在 `gh` binary、是否存在 `GH_TOKEN`/`GITHUB_TOKEN` 或可用 `gh auth token` fallback、REST API 是否可达、目标 repo 是否可见、issue 是否可读。degraded reason 必须归类为 `missing-binary`、`missing-token`、`auth-failed`、`network-proxy-failed`、`permission-denied`、`repo-not-found`、`repo-forbidden`、`issue-not-found`、`pr-not-found`、`scope-insufficient`、`validation-failed`、`invalid-response` 或 `unsupported-command`,不得打印 token;失败对象必须包含 `runnerDisposition=infra-blocked|business-failed`,runner 应优先用该字段分流。 - `gh issue list [--state open|closed|all] [--limit N] [--repo owner/name] [--json number,title,state,url,updatedAt,createdAt,author,labels]` 通过 GitHub REST 列出 issue,默认 `state=open`、`limit=30`,输出稳定 JSON 且不依赖系统 `gh` binary。`--limit` 会映射到 GitHub `per_page` 并限制返回数量,避免一次拉爆上下文;未知 state 或未知 `--json` 字段必须结构化失败并带 `runnerDisposition=business-failed`。GitHub issues API 可能混入 PR,CLI 会从 `.data.issues` 中过滤 pull request。 - `gh issue read [--repo owner/name] [--json body,title,state,comments]` 通过 GitHub REST 读取 issue title/body/state/url 和 comments,默认输出 JSON;`view` 只保留为兼容别名。兼容旧脚本的 `--json body` 和 `--json body,title,state,comments` 字段选择,且正文仍稳定暴露在 `.data.issue.body`,避免调用方因为 JSON 路径变化把空值当成正文。字段白名单是 `body,title,state,comments,number,url,author,createdAt,updatedAt`,未知字段必须结构化失败并带 `runnerDisposition=business-failed`。`gh issue create --title --body-file <file> [--label label[,label...]]... [--dry-run]`、`gh issue update <number> --mode replace|append --body-file <file> [--title ...] [--dry-run]`、`gh issue comment create <number> --body-file <file> [--dry-run]`、`gh issue comment delete <commentId> [--dry-run]`、`gh issue close|reopen <number> [--dry-run]` 都走 REST,不依赖 `gh` binary。`--label` 仅用于 `issue create`,支持重复传入和逗号分隔;`--dry-run` 会展示解析后的 labels 与 request plan,正式创建时把 labels 放入 GitHub REST create-issue payload,GitHub 返回不存在 label 等 422 校验失败时 CLI 结构化返回 `validation-failed`,不静默成功。`gh issue delete <number>` 是结构化 `unsupported-command`,因为 GitHub REST 不支持 issue 硬删除;生命周期删除语义请使用 `close`。 diff --git a/docs/reference/host-codex-commander.md b/docs/reference/host-codex-commander.md index 837ff129..b9171ee6 100644 --- a/docs/reference/host-codex-commander.md +++ b/docs/reference/host-codex-commander.md @@ -14,10 +14,31 @@ ```bash bun scripts/cli.ts commander contract bun scripts/cli.ts commander plan --dry-run [--session-id primary] +bun scripts/cli.ts commander smoke --dry-run [--session-id primary] bun scripts/cli.ts commander approval request --action <action> --dry-run [--reason text] [--task-id id] ``` -`plan` 与 `approval request` 必须显式使用 `--dry-run`,缺失时返回 `error=dry-run-required`。 +`plan`、`smoke` 与 `approval request` 必须显式使用 `--dry-run`,缺失时返回 `error=dry-run-required`。 + +## Dev 验证计划 + +`commander smoke --dry-run` 是无 daemon smoke contract。它只输出验证计划,不启动 HTTP daemon、不打开 SSH/PTY/stdio bridge、不发送 ClaudeQQ、不重启服务、不 interrupt/cancel 任务、不部署、不跑全量 check/e2e。 + +需要验证的 source/contract 面: + +- health endpoint:用 `createCommanderRequestHandler` 和临时 `RuntimeConfig` 调用 `GET /health`,期望返回 `service=host-codex-commander`、`stateRoot` 和日志文件路径;禁止 `Bun.serve` 和端口监听。 +- state file:只在临时目录写读 `sessions/<sessionId>.json`、`events/<sessionId>.jsonl` 和 `approvals/draft.json`,确认 session 状态和 redaction round-trip;禁止触碰真实 `.state/commander/`。 +- trace summary dry-run:只喂 mock JSONL 给 `summarizeCommanderTrace`,确认 `taskId`、`sessionId`、`lastSeq`、`status`、`redactionsApplied` 和有界摘要;禁止读取 live Code Queue trace、标记已读、interrupt 或 cancel。 +- approval draft preview:只运行 `commander approval request --dry-run` 或 `buildCommanderApprovalDraft`,确认 `requiresExplicitUserApproval=true`、`claudeqq.mutation=false`、`sendImplemented=false` 和敏感信息脱敏;禁止 POST ClaudeQQ。 +- SSH bridge boundary:只检查 `commander plan --dry-run` 中 `bridge.mutation=false`、`startPlan.enabled=false` 和 `safetyBoundary.phaseOneMutationAllowed=false`;禁止打开 SSH、PTY 或 stdio bridge。 + +轻量契约测试是: + +```bash +bun scripts/host-codex-commander-no-daemon-smoke-contract-test.ts +``` + +该测试只执行 CLI dry-run 和短命 source-level handler/helper,不启动长期进程。 ## HTTP @@ -60,3 +81,7 @@ trace summary 输入 mock Code Queue trace JSONL 和可选 task summary,输出 ## Approval draft 高风险动作只生成 approval draft JSON / Markdown preview。preview 必须显示 redaction 结果,并明确 `sendImplemented=false`。 + +## 进入真实运行态前 + +启用 daemon、PTY/stdio bridge、SSH bridge 或 ClaudeQQ 发送路径前,必须先获得人工授权。授权必须绑定一个精确 action 和目标 session/task/service,已有 smoke/skeleton contract 通过,风险审查确认不会打印 token、不会直接 patch 数据库、不会绕过 backend 确认策略,并且已有可审计的 approval id、回滚步骤和观测步骤。 diff --git a/scripts/host-codex-commander-no-daemon-smoke-contract-test.ts b/scripts/host-codex-commander-no-daemon-smoke-contract-test.ts new file mode 100644 index 00000000..34c9e51b --- /dev/null +++ b/scripts/host-codex-commander-no-daemon-smoke-contract-test.ts @@ -0,0 +1,170 @@ +import { spawnSync } from "node:child_process"; +import { existsSync, mkdtempSync, readFileSync, rmSync } from "node:fs"; +import { tmpdir } from "node:os"; +import { join } from "node:path"; +import { createCommanderRequestHandler, type RuntimeConfig } from "../src/components/microservices/host-codex-commander/src/index"; +import { commanderHealth, summarizeCommanderTrace } from "../src/components/microservices/host-codex-commander/src/state"; + +type JsonRecord = Record<string, unknown>; + +function assertCondition(condition: unknown, message: string, detail: unknown = {}): void { + if (!condition) throw new Error(`${message}: ${JSON.stringify(detail)}`); +} + +function asRecord(value: unknown, label: string): JsonRecord { + assertCondition(typeof value === "object" && value !== null && !Array.isArray(value), `${label} must be an object`, value); + return value as JsonRecord; +} + +function asRecordArray(value: unknown, label: string): JsonRecord[] { + assertCondition(Array.isArray(value) && value.every((item) => typeof item === "object" && item !== null && !Array.isArray(item)), `${label} must be object array`, value); + return value as JsonRecord[]; +} + +function asStringArray(value: unknown, label: string): string[] { + assertCondition(Array.isArray(value) && value.every((item) => typeof item === "string"), `${label} must be string array`, value); + return value as string[]; +} + +function runCli(args: string[], expectStatus: number): JsonRecord { + const result = spawnSync("bun", ["scripts/cli.ts", ...args], { + cwd: process.cwd(), + encoding: "utf8", + maxBuffer: 4 * 1024 * 1024, + }); + assertCondition(result.status === expectStatus, `status mismatch for ${args.join(" ")}`, { + status: result.status, + stdout: result.stdout.slice(-2000), + stderr: result.stderr.slice(-2000), + }); + assertCondition(result.stdout.trim().length > 0, `command produced no stdout: ${args.join(" ")}`); + return asRecord(JSON.parse(result.stdout) as unknown, "cli envelope"); +} + +function dataOf(envelope: JsonRecord): JsonRecord { + return asRecord(envelope.data, "data"); +} + +async function readJson(response: Response): Promise<JsonRecord> { + return asRecord(await response.json() as unknown, "response body"); +} + +const sessionId = `no-daemon-smoke-contract-${process.pid}`; +const liveSessionPath = join(process.cwd(), ".state", "commander", "sessions", `${sessionId}.json`); +assertCondition(!existsSync(liveSessionPath), "precondition failed: smoke session path should not already exist", liveSessionPath); + +const smoke = dataOf(runCli(["commander", "smoke", "--dry-run", "--session-id", sessionId], 0)); +assertCondition(smoke.ok === true, "smoke must succeed", smoke); +assertCondition(smoke.phase === "source-contract", "smoke must remain source-contract phase", smoke); +assertCondition(smoke.mode === "dry-run", "smoke must report dry-run mode", smoke); +assertCondition(smoke.mutation === false, "smoke must be non-mutating", smoke); + +const noDaemon = asRecord(smoke.noDaemonSmokeContract, "noDaemonSmokeContract"); +for (const flag of [ + "startsDaemon", + "startsPtyBridge", + "startsStdioBridge", + "opensSshBridge", + "sendsClaudeqq", + "restartsServices", + "interruptsTasks", + "cancelsTasks", + "deploys", + "runsFullCheckOrE2e", +]) { + assertCondition(noDaemon[flag] === false, `${flag} must be false`, noDaemon); +} +assertCondition(asStringArray(noDaemon.allowedCommands, "allowedCommands").includes("bun scripts/host-codex-commander-no-daemon-smoke-contract-test.ts"), "smoke should name this lightweight contract", noDaemon); + +const validationPlan = asRecordArray(smoke.validationPlan, "validationPlan"); +const surfaces = validationPlan.map((item) => item.surface); +for (const expected of [ + "health endpoint", + "state file", + "trace summary dry-run", + "approval draft preview", + "SSH bridge boundary", +]) { + assertCondition(surfaces.includes(expected), `missing validation surface ${expected}`, surfaces); +} +for (const item of validationPlan) { + assertCondition(asStringArray(item.expectedEvidence, "expectedEvidence").length > 0, "each validation item must define evidence", item); + assertCondition(asStringArray(item.noRuntimeSideEffects, "noRuntimeSideEffects").length > 0, "each validation item must define no-side-effect boundary", item); +} + +const smokeWithoutDryRun = dataOf(runCli(["commander", "smoke", "--session-id", sessionId], 1)); +assertCondition(smokeWithoutDryRun.error === "dry-run-required", "smoke must require --dry-run", smokeWithoutDryRun); +assertCondition(!existsSync(liveSessionPath), "smoke CLI must not write live commander state", liveSessionPath); + +const tmp = mkdtempSync(join(tmpdir(), "host-codex-commander-smoke-")); +try { + const runtime: RuntimeConfig = { + rootDir: tmp, + host: "127.0.0.1", + port: 4261, + logFile: join(tmp, "logs", "commander.jsonl"), + serviceId: "host-codex-commander", + stateRoot: tmp, + sessionId, + }; + const health = commanderHealth(runtime, "2026-05-21T00:00:00.000Z"); + assertCondition(health.ok === true && health.service === "host-codex-commander", "health helper must expose service metadata", health); + assertCondition(health.stateRoot === tmp, "health helper must use temp state root", health); + + const handler = createCommanderRequestHandler(runtime); + const healthBody = await readJson(await handler(new Request("http://localhost/health"))); + assertCondition(healthBody.ok === true, "short-lived handler health route must succeed without Bun.serve", healthBody); + + const trace = summarizeCommanderTrace({ + taskId: "task-smoke", + sessionId, + traceJsonl: [ + JSON.stringify({ seq: 1, kind: "message", status: "running", summary: "checking token=ghp_1234567890abcdef" }), + JSON.stringify({ seq: 2, kind: "event", status: "attention_required", text: "needs approval" }), + ].join("\n"), + taskSummary: "summary password=secret", + }); + assertCondition(trace.taskId === "task-smoke", "trace summary must preserve task id", trace); + assertCondition(trace.sessionId === sessionId, "trace summary must preserve session id", trace); + assertCondition(trace.lastSeq === 2, "trace summary must compute last seq", trace); + assertCondition(trace.status === "attention_required", "trace summary must derive attention_required status", trace); + assertCondition(trace.redactionsApplied >= 2, "trace summary must redact mock secrets", trace); +} finally { + rmSync(tmp, { recursive: true, force: true }); +} + +const approval = dataOf(runCli([ + "commander", + "approval", + "request", + "--action", + "code-queue-task-cancel", + "--reason", + "token=ghp_1234567890abcdef", + "--dry-run", +], 0)); +const claudeqq = asRecord(approval.claudeqq, "claudeqq"); +assertCondition(claudeqq.mutation === false, "approval preview must not mutate ClaudeQQ", claudeqq); +assertCondition(claudeqq.sendImplemented === false, "approval preview must not implement sending", claudeqq); +assertCondition(!JSON.stringify(approval).includes("ghp_1234567890abcdef"), "approval preview must redact secret-like reason", approval); + +const doc = readFileSync("docs/reference/host-codex-commander.md", "utf8"); +for (const snippet of [ + "commander smoke --dry-run", + "无 daemon smoke contract", + "health endpoint", + "SSH bridge boundary", +]) { + assertCondition(doc.includes(snippet), `reference doc missing snippet: ${snippet}`); +} + +process.stdout.write(`${JSON.stringify({ + ok: true, + checks: [ + "commander smoke --dry-run is non-mutating and dry-run required", + "no-daemon smoke contract forbids daemon, SSH/PTY/stdio bridge, ClaudeQQ send, restart, interrupt, cancel, deploy, and full e2e", + "health endpoint and trace summary are validated through short-lived source-level helpers", + "approval draft preview remains sendImplemented=false and redacted", + "reference doc describes the dev validation surfaces and no-daemon boundary", + ], +}, null, 2)}\n`); diff --git a/scripts/src/check.ts b/scripts/src/check.ts index 2ef9635b..a984f13f 100644 --- a/scripts/src/check.ts +++ b/scripts/src/check.ts @@ -25,6 +25,7 @@ const syntaxFiles = [ "scripts/src/commander.ts", "scripts/src/remote.ts", "scripts/host-codex-commander-contract-test.ts", + "scripts/host-codex-commander-no-daemon-smoke-contract-test.ts", "scripts/host-codex-commander-skeleton-contract-test.ts", "src/components/frontend/src/index.ts", "src/components/frontend/src/app.tsx", @@ -299,6 +300,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default fileItem("scripts/code-queue-pr-preflight-contract-test.ts"), fileItem("scripts/code-queue-submit-routing-contract-test.ts"), fileItem("scripts/host-codex-commander-skeleton-contract-test.ts"), + fileItem("scripts/host-codex-commander-no-daemon-smoke-contract-test.ts"), fileItem("scripts/provider-runner-triage-contract-test.ts"), fileItem("scripts/src/provider-triage.ts"), fileItem("src/components/microservices/code-queue/src/runner-error-classifier.ts"), @@ -326,6 +328,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default items.push(commandItem("code-queue:pr-preflight-contract", ["bun", "scripts/code-queue-pr-preflight-contract-test.ts"], 30_000)); items.push(commandItem("code-queue:submit-routing-contract", ["bun", "scripts/code-queue-submit-routing-contract-test.ts"], 30_000)); items.push(commandItem("host-codex-commander:skeleton-contract", ["bun", "scripts/host-codex-commander-skeleton-contract-test.ts"], 30_000)); + items.push(commandItem("host-codex-commander:no-daemon-smoke-contract", ["bun", "scripts/host-codex-commander-no-daemon-smoke-contract-test.ts"], 30_000)); items.push(commandItem("provider:runner-triage-contract", ["bun", "scripts/provider-runner-triage-contract-test.ts"], 30_000)); items.push(commandItem("deploy:artifact-matrix-contract", ["bun", "scripts/deploy-artifact-matrix-contract-test.ts"], 30_000)); items.push(commandItem("decision-center:desired-state-contract", ["bun", "scripts/decision-center-desired-state-contract-test.ts"], 30_000)); @@ -347,6 +350,7 @@ export function runChecks(config: UniDeskConfig, options: CheckOptions = default items.push(skippedItem("code-queue:pr-preflight-contract", "Code Queue PR preflight contract is opt-in with script checks", "--scripts-typecheck or --full")); items.push(skippedItem("code-queue:submit-routing-contract", "Code Queue submit routing contract is opt-in with script checks", "--scripts-typecheck or --full")); items.push(skippedItem("host-codex-commander:skeleton-contract", "host Codex commander skeleton contract is opt-in with script checks", "--scripts-typecheck or --full")); + items.push(skippedItem("host-codex-commander:no-daemon-smoke-contract", "host Codex commander no-daemon smoke contract is opt-in with script checks", "--scripts-typecheck or --full")); items.push(skippedItem("provider:runner-triage-contract", "Provider runner triage contract is opt-in with script checks", "--scripts-typecheck or --full")); items.push(skippedItem("deploy:artifact-matrix-contract", "deploy artifact matrix contract is opt-in with script checks", "--scripts-typecheck or --full")); items.push(skippedItem("decision-center:desired-state-contract", "Decision Center desired-state drift contract is opt-in with script checks", "--scripts-typecheck or --full")); diff --git a/scripts/src/commander.ts b/scripts/src/commander.ts index c88fdb47..d5eb4d5c 100644 --- a/scripts/src/commander.ts +++ b/scripts/src/commander.ts @@ -50,6 +50,7 @@ function commanderHelp(): Record<string, unknown> { usage: [ "bun scripts/cli.ts commander contract", "bun scripts/cli.ts commander plan --dry-run [--session-id id]", + "bun scripts/cli.ts commander smoke --dry-run [--session-id id]", "bun scripts/cli.ts commander approval request --action <action> --dry-run [--reason text] [--task-id id]", ], highRiskActions, @@ -211,6 +212,165 @@ function commanderPlan(args: string[]): Record<string, unknown> { }; } +function healthEndpointValidation(): Record<string, unknown> { + return { + surface: "health endpoint", + endpoint: "GET /health", + validationMethod: "invoke createCommanderRequestHandler with a temporary RuntimeConfig inside a short-lived contract test", + expectedEvidence: [ + "body.ok=true", + "body.service=host-codex-commander", + "body.stateRoot points at the temp directory", + "body.currentLogFile is reported", + ], + noRuntimeSideEffects: [ + "do not run Bun.serve", + "do not publish or expose a port", + "do not restart any service", + ], + }; +} + +function stateFileValidation(sessionId: string): Record<string, unknown> { + return { + surface: "state file", + storageRoot: ".state/commander/", + validationMethod: "write and read a session record only under a temporary directory owned by the contract test", + files: [ + `sessions/${sessionId}.json`, + `events/${sessionId}.jsonl`, + "approvals/draft.json", + "logs/commander.jsonl", + ], + expectedEvidence: [ + "session state round-trips through writeCommanderSession/readCommanderSession", + "secret-like notes are redacted before persistence", + "temporary state root is deleted after the smoke contract", + ], + noRuntimeSideEffects: [ + "do not touch the live .state/commander directory", + "do not patch database state", + "do not discover or signal live host Codex processes", + ], + }; +} + +function traceSummaryValidation(sessionId: string): Record<string, unknown> { + return { + surface: "trace summary dry-run", + validationMethod: "summarize mock JSONL trace input through summarizeCommanderTrace", + inputPolicy: "bounded mock Code Queue trace JSONL only; no live codex task trace fetch", + expectedEvidence: [ + "taskId and sessionId are preserved", + "lastSeq is computed from mock seq fields", + "status is derived without returning raw transcript", + "secret-like text is redacted", + ], + sampleSessionId: sessionId, + noRuntimeSideEffects: [ + "do not call Code Queue manager", + "do not mark tasks read", + "do not interrupt or cancel tasks", + ], + }; +} + +function approvalDraftValidation(): Record<string, unknown> { + return { + surface: "approval draft preview", + validationMethod: "build a draft preview for one high-risk action with --dry-run", + commandShape: "bun scripts/cli.ts commander approval request --action code-queue-task-interrupt --reason <text> --dry-run", + expectedEvidence: [ + "requiresExplicitUserApproval=true", + "claudeqq.mutation=false", + "claudeqq.sendImplemented=false", + "reason and messageTemplate are redacted", + ], + noRuntimeSideEffects: [ + "do not POST to ClaudeQQ", + "do not notify any QQ user or group", + "do not record an approval as consumed", + ], + }; +} + +function sshBridgeBoundaryValidation(): Record<string, unknown> { + return { + surface: "SSH bridge boundary", + validationMethod: "assert contract-only bridge metadata from commander plan --dry-run", + allowedAtThisStage: [ + "readonly contract description", + "future reviewed maintenance command shape", + "explicit approval precondition text", + ], + noRuntimeSideEffects: [ + "do not open provider SSH session", + "do not open PTY bridge", + "do not open stdio bridge", + "do not inject prompt", + "do not restart services", + "do not interrupt or cancel tasks", + ], + expectedEvidence: [ + "bridge.mutation=false", + "startPlan.enabled=false", + "safetyBoundary.phaseOneMutationAllowed=false", + ], + }; +} + +function commanderSmoke(args: string[]): Record<string, unknown> { + if (!hasFlag(args, "--dry-run")) { + return { + ok: false, + error: "dry-run-required", + message: requiredDryRunMessage, + command: "bun scripts/cli.ts commander smoke --dry-run", + }; + } + const sessionId = optionValue(args, "--session-id") ?? "primary"; + return { + ok: true, + phase: "source-contract", + mode: "dry-run", + mutation: false, + serviceId: "host-codex-commander", + noDaemonSmokeContract: { + startsDaemon: false, + startsPtyBridge: false, + startsStdioBridge: false, + opensSshBridge: false, + sendsClaudeqq: false, + restartsServices: false, + interruptsTasks: false, + cancelsTasks: false, + deploys: false, + runsFullCheckOrE2e: false, + allowedCommands: [ + "bun scripts/cli.ts commander contract", + "bun scripts/cli.ts commander plan --dry-run", + "bun scripts/cli.ts commander smoke --dry-run", + "bun scripts/cli.ts commander approval request --action <action> --dry-run", + "bun scripts/host-codex-commander-no-daemon-smoke-contract-test.ts", + ], + }, + validationPlan: [ + healthEndpointValidation(), + stateFileValidation(sessionId), + traceSummaryValidation(sessionId), + approvalDraftValidation(), + sshBridgeBoundaryValidation(), + ], + manualAuthorizationBeforeLiveRuntime: [ + "operator explicitly names the exact live action and target session/task/service", + "current source-contract smoke and skeleton contract tests are green", + "risk review confirms no token output, no direct database patch, and no backend restart bypass", + "ClaudeQQ approval draft is reviewed, sent by an authorized future path, and matched to an explicit approval id", + "rollback and observation steps are written before enabling any daemon or bridge", + ], + }; +} + function commanderApprovalRequest(args: string[]): Record<string, unknown> { if (!hasFlag(args, "--dry-run")) { return { @@ -276,6 +436,7 @@ export function runCommanderCommand(args: string[]): Record<string, unknown> { if (sub === undefined || isHelpToken(sub)) return commanderHelp(); if (sub === "contract") return commanderContract(); if (sub === "plan") return commanderPlan(args.slice(1)); + if (sub === "smoke") return commanderSmoke(args.slice(1)); if (sub === "approval" && second === "request") return commanderApprovalRequest(args.slice(2)); return { ok: false, diff --git a/scripts/src/help.ts b/scripts/src/help.ts index a6298d0a..224474ca 100644 --- a/scripts/src/help.ts +++ b/scripts/src/help.ts @@ -44,7 +44,7 @@ export function rootHelp(): unknown { { command: "dev-env validate|prewarm-images", description: "Validate D601 unidesk-dev guardrails or prewarm dev foundation images into native k3s containerd through a bounded async job." }, { command: "artifact-registry plan|render|status|health|install|deploy-backend-core|deploy-service", description: "Manage the D601 host-managed CNCF Distribution registry and run pull-only artifact CD for supported services, including D601 direct, k3s-managed, and code-queue dev-only consumers." }, { command: "gh auth|issue|pr", description: "Run safe GitHub issue and PR CRUD/lifecycle operations through REST with body-file update replace/append, comment delete, token diagnostics, hard delete unsupported, and merge blocked." }, - { command: "commander contract|plan --dry-run|approval request --dry-run", description: "Host Codex commander skeleton contract and dry-run preview; exposes local health, state, trace summary, and approval draft helpers without live bridges or message sends." }, + { command: "commander contract|plan --dry-run|smoke --dry-run|approval request --dry-run", description: "Host Codex commander skeleton contract, no-daemon smoke plan, and dry-run preview; exposes local health, state, trace summary, and approval draft helpers without live bridges or message sends." }, { command: "code-agent-sandbox", description: "Independent Code Agent Sandbox service skeleton for adapter, mode, and credential-boundary diagnostics." }, { command: "schedule list|get|runs|run|retry-run|delete", description: "Manage backend-core scheduled tasks and run history; schedule run <id> supports --wait-ms N and retry-run reuses the failed run's schedule." }, { command: "schedule upsert-pgdata-backup [--time HH:MM] [--remote-base /SERVER_DATA/UNIDESK_PG_DATA]", description: "Create or update the daily PGDATA physical backup task that uploads monthly rotated archives to Baidu Netdisk." }, @@ -188,14 +188,15 @@ function providerHelp(): unknown { function commanderHelp(): unknown { return { - command: "commander contract|plan|approval", + command: "commander contract|plan|smoke|approval", output: "json", usage: [ "bun scripts/cli.ts commander contract", "bun scripts/cli.ts commander plan --dry-run [--session-id id]", + "bun scripts/cli.ts commander smoke --dry-run [--session-id id]", "bun scripts/cli.ts commander approval request --action <action> --dry-run [--reason text] [--task-id id]", ], - description: "Inspect the local host Codex commander skeleton contract, dry-run planner, state helpers, trace summary aggregator, and approval draft preview.", + description: "Inspect the local host Codex commander skeleton contract, dry-run planner, no-daemon smoke validation plan, state helpers, trace summary aggregator, and approval draft preview.", boundary: [ "the current skeleton is local-only and never attaches to live bridges", "dry-run commands never open SSH, PTY, or stdio bridges",