chore: integrate v0.1 component branches

2026-05-29 12:04:29 +08:00
parent 1869ad57a4 d236cfee61
commit 5917ef4295
17 changed files with 670 additions and 208 deletions
@@ -108,7 +108,7 @@ POST  /api/v1/commands/:commandId/ack
 | 规格项 | 状态 | 说明 |
 | --- | --- | --- |
 | `agentrun-mgr` 服务规格 | 已定义 | 本文为 v0.1 manager 权威。 |
-| Manager REST API | 已实现骨架 | 已有 run、command、event、runner register、claim、lease heartbeat、status、ack、backends 和 health/readiness 的 HTTP JSON 骨架；仍需后续真实部署验收。 |
+| Manager REST API | 已实现骨架 | 已有 run、command、event、backends、runner register、claim、lease heartbeat、poll、ack、status 和 health/readiness 的 HTTP JSON 骨架；仍需后续真实部署验收。 |
 | Tenant policy boundary | 已定义/未实现 | v0.1 只做最小 schema/allowlist/secretScope 边界。 |
 | Postgres durable adapter | 已实现骨架 | live runtime 通过 `DATABASE_URL` 使用 Postgres durable store；memory store 仅用于显式 self-test/dev。见 [spec-v01-postgres.md](spec-v01-postgres.md)。 |
-| Observability 最小合同 | 已实现骨架 | events append-only、terminal status、failureKind、health/readiness store 状态和 Secret/DSN redaction 已进入 manager 骨架；集中 trace 和部署级观测仍属后续工作。 |
+| Observability 最小合同 | 已实现骨架 | events append-only、terminal status、failureKind、health/readiness store 状态、runner claim/lease/backend events 和 Secret/DSN redaction 已进入 manager 骨架；集中 trace 和部署级观测仍属后续工作。 |
@@ -99,7 +99,7 @@ Runner 日志必须实时 flush 到文件或 pod log，CLI 启动 runner 时必
 | 规格项 | 状态 | 说明 |
 | --- | --- | --- |
 | `agentrun-runner` 服务规格 | 已定义 | 本文为 v0.1 runner 权威。 |
-| Kubernetes Job runner | 未实现 | 需要后续 runtime/GitOps 实现。 |
-| host process runner | 未实现 | 可作为 MVP 手动 dispatch，但仍必须走 manager API。 |
-| claim/lease/report client | 未实现 | 需要后续代码实现。 |
+| Kubernetes Job runner | 部分实现 | 已提供 `runner job --dry-run` Job manifest 渲染骨架，固定使用 `agentrun-v01-runner` ServiceAccount、manager URL、runId/commandId/attemptId、executionPolicy 和 SecretRef 文件投影；尚未执行真实 Kubernetes create/apply。 |
+| host process runner | 部分实现 | `runner start` 和 `src/runner/main.ts` 进入同一套 `runOnce`，可通过 manager register/claim/poll/report 执行自测试。 |
+| claim/lease/report client | 部分实现 | 已拆出 runner manager API client，覆盖 register、claim、lease heartbeat、poll command、ack、append event 和 terminal status；durable store 仍待 Postgres adapter 接入。 |
 | runner redaction | 已定义/未实现 | 需与 backend adapter 共同实现。 |
@@ -72,7 +72,7 @@ bun scripts/agentrun-cli.ts server start|status|stop|logs
 | 规格项 | 状态 | 说明 |
 | --- | --- | --- |
 | AgentRun CLI 规格 | 已定义 | 本文为 v0.1 CLI 权威。 |
-| `scripts/agentrun-cli.ts` | 未实现 | 需要后续代码实现。 |
-| CLI 调 manager REST | 未实现 | 需随 manager API 实现。 |
-| runner start | 未实现 | 需随 runner Job/host process 实现。 |
+| `scripts/agentrun-cli.ts` | 部分实现 | 已提供 run/command/event/backend/server 基础命令和 JSON envelope。 |
+| CLI 调 manager REST | 部分实现 | CLI 通过 `ManagerClient` 调 manager REST；当前自测试使用内存 manager。 |
+| runner start | 部分实现 | `runner start` 可执行 host process runner；`runner job --dry-run` 可渲染 Kubernetes Job JSON，尚不执行 create/apply。 |
 | CLI 测试规格 | 已定义 | 综合联调见 [spec-v01-validation.md](spec-v01-validation.md)。 |
@@ -132,7 +132,7 @@ Secret 创建和轮换不由 source branch 自动生成；source branch 只声
 | 规格项 | 状态 | 说明 |
 | --- | --- | --- |
 | Secret 分发规格 | 已定义 | 本文为 v0.1 provider credential 分发权威。 |
-| Kubernetes SecretRef 注入 | 未实现 | 需要后续 GitOps/runtime 实现。 |
-| Codex auth/config file projection | 未实现 | 需要后续 runner/backend adapter 实现，测试来源为 `~/.codex/auth.json` 和 `~/.codex/config.toml`。 |
-| redaction 最小规则 | 已定义 | 需要后续代码实现和测试。 |
+| Kubernetes SecretRef 注入 | 部分实现 | runner Job dry-run 渲染已按 run `executionPolicy.secretScope.providerCredentials` 生成 Secret volume projection 和 `CODEX_HOME`，但尚未 apply 到集群。 |
+| Codex auth/config file projection | 部分实现 | backend readiness 检查 `auth.json`/`config.toml` 可读性，缺失时返回 `secret-unavailable`；self-test 使用临时文件模拟投影。 |
+| redaction 最小规则 | 部分实现 | event、Job dry-run 输出和 self-test 已验证不打印测试 token；复杂审计仍待后续补齐。 |
 | 外部 secret manager | 未采用 | 如需 Vault/ExternalSecrets/SOPS，后续单独更新规格。 |
@@ -39,6 +39,19 @@ UniDesk or HWLAB client

 Runner inbound API 只允许本地或私有诊断，不作为业务客户端入口。业务客户端只能调用 `agentrun-mgr`。

+## 并行开发边界
+
+`v0.1` 默认会由多个 agent 并行实现 manager、runner、backend、Secret、Postgres、CLI 和 CI/CD。架构必须主动降低并行冲突，而不是把所有组件都写进同一个入口文件、总表或巨型测试文件。
+
+并行开发的长期规则：
+
+- 组件实现按目录分层：manager 写入 `src/mgr/**`，runner 写入 `src/runner/**`，backend 写入 `src/backend/**`，CLI 写入 `scripts/src/**`，部署写入 `deploy/**`。跨组件共享类型和工具只放 `src/common/**`，且变更必须保持小而稳定。
+- 自测试采用可发现 case 模型：`src/selftest/run.ts` 只负责发现和调度，公共 fixture 放 `src/selftest/harness.ts`，组件测试放 `src/selftest/cases/*.ts`。新增组件自测试应优先新增 case 文件，不应反复修改总入口。
+- 长期文档状态归属到对应 spec：组件实现状态优先维护在 `spec-v01-agentrun-mgr.md`、`spec-v01-agentrun-runner.md`、`spec-v01-backend-codex.md` 等组件 spec 中；`spec-v01-services.md` 只保留总览和跨组件边界，避免成为并行改动热点。
+- 对并行冲突的处理优先优化模块边界：如果多个 PR 反复冲突在同一文件或同一段状态表，应拆出可独立追加的模块、case、manifest、helper 或 spec 子项；只有确认边界已经合理后，才做普通冲突解析。
+- 共享 API 合同要先稳定再并行扩展：`RunRecord`、`CommandRecord`、`ExecutionPolicy`、`SecretRef`、`FailureKind` 等跨组件类型改动必须兼顾 manager、runner、backend、CLI、self-test 和 GitOps render，不得为单个组件引入临时字段绕开合同。
+- 过程性合并、冲突解析和临时验证证据进入 PR/issue；长期参考文档只记录稳定的并行开发原则、边界和判断标准。
+
 ## 服务总表

 | 对象 | 类型 | v0.1 处理 | 说明 | 后续单独 spec |
@@ -154,7 +167,7 @@ Manager 负责校验、保存和返回这些字段；runner 只能消费已保
 | Codex backend 规格 | 已定义 | 见 [spec-v01-backend-codex.md](spec-v01-backend-codex.md)。 |
 | AgentRun CLI 规格 | 已定义 | 见 [spec-v01-cli.md](spec-v01-cli.md)。 |
 | Scheduler deferred 规格 | 已定义 | 见 [spec-v01-scheduler.md](spec-v01-scheduler.md)。 |
-| `agentrun-mgr` 实现 | 已实现骨架 | 已有 REST API、Postgres durable store 选择、migration ledger、health/readiness 和 self-test memory 模式骨架；仍需 G14 `agentrun-v01` 真实 Postgres/GitOps 验收。 |
-| `agentrun-runner` 实现 | 未实现 | 需后续代码实现。 |
-| 第一真实 backend | 未实现 | 默认候选 Codex。 |
+| `agentrun-mgr` 实现 | 已实现骨架 | 已有 REST API、Postgres durable store 选择、migration ledger、runner claim/lease/report、health/readiness 和 self-test memory 模式骨架；仍需 G14 `agentrun-v01` 真实 Postgres/GitOps 验收。 |
+| `agentrun-runner` 实现 | 已实现骨架 | 已有 host process runner 与 Kubernetes Job dry-run 渲染骨架，runner 通过 manager API claim/poll/report，不直连 Postgres；真实 Kubernetes Job 联调仍需验收。 |
+| 第一真实 backend | 已实现骨架 | Codex app-server stdio backend 已有协议、失败分类、脱敏和 fake self-test；真实 Codex provider turn 仍需综合联调验收。 |
 | 自动 scheduler | Deferred | 不作为 `v0.1` 第一阶段验收目标。 |
@@ -26,6 +26,8 @@

 自测试应使用 Bun + TypeScript 运行，Codex 相关自测试可以使用 fake app-server JSON-RPC client 模拟 `initialize`、`thread/start`、`thread/resume`、`turn/start`、assistant 输出、协议错误、timeout 和 transport close。

+仓库内自测试入口必须保持可并行扩展：`src/selftest/run.ts` 只负责发现和调度 `src/selftest/cases/*.ts`，公共 fixture 和断言辅助放 `src/selftest/harness.ts`。新增组件自测试优先新增独立 case 文件，避免多个并行任务修改同一个总入口导致反复冲突。若某个 case 需要跨组件验证，可以依赖正式 manager/runner/backend API，但仍应作为独立 case 命名和维护。
+
 自测试输出可以是 Tekton TaskRun result、JUnit、tap、JSON summary 或普通日志，但不得回写 source branch，也不得伪装成综合联调通过。

 ## 综合联调
@@ -3,7 +3,8 @@ import { startManagerServer } from "../../src/mgr/server.js";
 import { MemoryAgentRunStore } from "../../src/mgr/store.js";
 import { ManagerClient } from "../../src/mgr/client.js";
 import { runOnce } from "../../src/runner/run-once.js";
-import type { JsonRecord, JsonValue } from "../../src/common/types.js";
+import { renderRunnerJobDryRun } from "../../src/runner/k8s-job.js";
+import type { JsonRecord, JsonValue, RunRecord } from "../../src/common/types.js";
 import { AgentRunError, errorToJson } from "../../src/common/errors.js";
 import type { RunnerOnceOptions } from "../../src/runner/run-once.js";

@@ -59,9 +60,37 @@ async function dispatch(args: ParsedArgs): Promise<JsonValue> {
    if (codexHome) options.codexHome = codexHome;
    return runOnce(options) as unknown as JsonValue;
  }
+  if (group === "runner" && command === "job") return renderRunnerJob(args);
  throw new AgentRunError("schema-invalid", `unsupported command: ${args.positional.join(" ")}`, { httpStatus: 2 });
 }

+async function renderRunnerJob(args: ParsedArgs): Promise<JsonRecord> {
+  if (args.flags.get("dry-run") !== true) throw new AgentRunError("schema-invalid", "runner job only supports --dry-run in v0.1", { httpStatus: 2 });
+  const runId = flag(args, "run-id", "");
+  const commandId = flag(args, "command-id", "");
+  const image = flag(args, "image", "");
+  if (!runId) throw new AgentRunError("schema-invalid", "runner job requires --run-id", { httpStatus: 2 });
+  if (!commandId) throw new AgentRunError("schema-invalid", "runner job requires --command-id", { httpStatus: 2 });
+  if (!image) throw new AgentRunError("schema-invalid", "runner job requires --image", { httpStatus: 2 });
+  const run = await client(args).get(`/api/v1/runs/${encodeURIComponent(runId)}`) as RunRecord;
+  const options = {
+    run,
+    commandId,
+    image,
+    managerUrl: managerUrl(args),
+    namespace: optionalFlag(args, "namespace") ?? "agentrun-v01",
+  };
+  const attemptId = optionalFlag(args, "attempt-id");
+  const runnerId = optionalFlag(args, "runner-id");
+  const sourceCommit = optionalFlag(args, "source-commit");
+  return renderRunnerJobDryRun({
+    ...options,
+    ...(attemptId ? { attemptId } : {}),
+    ...(runnerId ? { runnerId } : {}),
+    ...(sourceCommit ? { sourceCommit } : {}),
+  });
+}
+
 async function startServer(args: ParsedArgs): Promise<JsonRecord> {
  const port = Number(flag(args, "port", "8080"));
  const host = flag(args, "host", "0.0.0.0");
@@ -126,6 +155,7 @@ function help(): JsonRecord {
      "commands create <runId> --type turn --json-file <payload.json>",
      "commands show <commandId> --run-id <runId>",
      "runner start --run-id <runId>",
+      "runner job --dry-run --run-id <runId> --command-id <commandId> --image <image>",
      "backends list",
      "server start|status",
    ],
@@ -0,0 +1,191 @@
+import { stableHash } from "../common/validation.js";
+import type { BackendProfile, ExecutionPolicy, JsonRecord, JsonValue, RunRecord, SecretRef } from "../common/types.js";
+
+export interface RunnerJobRenderOptions {
+  run: RunRecord;
+  commandId: string;
+  managerUrl: string;
+  image: string;
+  namespace?: string;
+  attemptId?: string;
+  runnerId?: string;
+  sourceCommit?: string;
+  serviceAccountName?: string;
+  imagePullPolicy?: string;
+  backoffLimit?: number;
+  ttlSecondsAfterFinished?: number;
+}
+
+interface CredentialProjection {
+  profile: BackendProfile | string;
+  secretRef: SecretRef;
+  volumeName: string;
+  mountPath: string;
+}
+
+export function renderRunnerJobDryRun(options: RunnerJobRenderOptions): JsonRecord {
+  const render = renderRunnerJobManifest(options);
+  return {
+    dryRun: true,
+    mutation: false,
+    action: "render-kubernetes-job",
+    jobIdentity: {
+      kind: "Job",
+      namespace: render.namespace,
+      name: render.jobName,
+      serviceAccountName: render.serviceAccountName,
+    },
+    runner: {
+      runId: options.run.id,
+      commandId: options.commandId,
+      attemptId: render.attemptId,
+      runnerId: render.runnerId,
+      backendProfile: options.run.backendProfile,
+      managerUrl: options.managerUrl,
+      sourceCommit: render.sourceCommit,
+    },
+    secretRefs: render.secretRefs.map((item) => ({ profile: item.profile, name: item.secretRef.name, namespace: item.secretRef.namespace ?? render.namespace, keys: item.secretRef.keys ?? [], mountPath: item.mountPath, valuesPrinted: false })),
+    pollCommands: {
+      run: `bun scripts/agentrun-cli.ts runs show ${options.run.id} --manager-url ${options.managerUrl}`,
+      events: `bun scripts/agentrun-cli.ts runs events ${options.run.id} --manager-url ${options.managerUrl} --after-seq 0 --limit 100`,
+    },
+    warnings: render.warnings,
+    manifest: render.manifest,
+  };
+}
+
+export function renderRunnerJobManifest(options: RunnerJobRenderOptions): { manifest: JsonRecord; namespace: string; jobName: string; runnerId: string; attemptId: string; sourceCommit: string; serviceAccountName: string; secretRefs: CredentialProjection[]; warnings: string[] } {
+  const namespace = options.namespace ?? "agentrun-v01";
+  const attemptId = options.attemptId ?? `attempt_${Date.now().toString(36)}`;
+  const runnerId = options.runnerId ?? `runner_${shortHash(`${options.run.id}:${attemptId}:${options.commandId}`)}`;
+  const sourceCommit = options.sourceCommit ?? process.env.AGENTRUN_SOURCE_COMMIT ?? "unknown";
+  const serviceAccountName = options.serviceAccountName ?? "agentrun-v01-runner";
+  const jobName = `agentrun-v01-runner-${shortDnsHash(options.run.id, attemptId)}`;
+  const secretRefs = credentialProjections(options.run, namespace);
+  const warnings: string[] = [];
+  if (secretRefs.length === 0) warnings.push("run executionPolicy.secretScope 未声明 provider SecretRef；runner 将按 secret-unavailable 上报，而不会降级直连外部凭据");
+  const env = runnerEnv(options, { namespace, jobName, runnerId, attemptId, sourceCommit, secretRefs });
+  const manifest: JsonRecord = {
+    apiVersion: "batch/v1",
+    kind: "Job",
+    metadata: {
+      name: jobName,
+      namespace,
+      labels: labels(options.run, jobName),
+      annotations: {
+        "agentrun.pikastech.local/run-id": options.run.id,
+        "agentrun.pikastech.local/command-id": options.commandId,
+        "agentrun.pikastech.local/dry-run-render": "true",
+      },
+    },
+    spec: {
+      backoffLimit: options.backoffLimit ?? 0,
+      ttlSecondsAfterFinished: options.ttlSecondsAfterFinished ?? 86_400,
+      template: {
+        metadata: {
+          labels: labels(options.run, jobName),
+          annotations: {
+            "agentrun.pikastech.local/run-id": options.run.id,
+            "agentrun.pikastech.local/command-id": options.commandId,
+          },
+        },
+        spec: {
+          serviceAccountName,
+          automountServiceAccountToken: false,
+          restartPolicy: "Never",
+          containers: [
+            {
+              name: "runner",
+              image: options.image,
+              imagePullPolicy: options.imagePullPolicy ?? "IfNotPresent",
+              command: ["bun", "src/runner/main.ts"],
+              env,
+              volumeMounts: secretRefs.map((item) => ({ name: item.volumeName, mountPath: item.mountPath, readOnly: true })),
+              resources: {
+                requests: { cpu: "250m", memory: "512Mi" },
+                limits: { cpu: "2", memory: "4Gi" },
+              },
+              securityContext: {
+                allowPrivilegeEscalation: false,
+                readOnlyRootFilesystem: false,
+                capabilities: { drop: ["ALL"] },
+              },
+            },
+          ],
+          volumes: secretRefs.map(secretVolume),
+        },
+      },
+    },
+  };
+  return { manifest, namespace, jobName, runnerId, attemptId, sourceCommit, serviceAccountName, secretRefs, warnings };
+}
+
+function runnerEnv(options: RunnerJobRenderOptions, context: { namespace: string; jobName: string; runnerId: string; attemptId: string; sourceCommit: string; secretRefs: CredentialProjection[] }): JsonRecord[] {
+  const codexMount = context.secretRefs.find((item) => item.profile === "codex")?.mountPath ?? "/home/agentrun/.codex";
+  return [
+    { name: "AGENTRUN_MGR_URL", value: options.managerUrl },
+    { name: "AGENTRUN_RUN_ID", value: options.run.id },
+    { name: "AGENTRUN_COMMAND_ID", value: options.commandId },
+    { name: "AGENTRUN_ATTEMPT_ID", value: context.attemptId },
+    { name: "AGENTRUN_RUNNER_ID", value: context.runnerId },
+    { name: "AGENTRUN_BACKEND_PROFILE", value: options.run.backendProfile },
+    { name: "AGENTRUN_EXECUTION_POLICY_JSON", value: JSON.stringify(options.run.executionPolicy) },
+    { name: "AGENTRUN_SOURCE_COMMIT", value: context.sourceCommit },
+    { name: "AGENTRUN_RUNTIME_NAMESPACE", value: context.namespace },
+    { name: "AGENTRUN_K8S_JOB_NAME", value: context.jobName },
+    { name: "AGENTRUN_LOG_PATH", value: "/tmp/agentrun-runner.jsonl" },
+    { name: "HOME", value: "/home/agentrun" },
+    { name: "CODEX_HOME", value: codexMount },
+  ];
+}
+
+function credentialProjections(run: RunRecord, namespace: string): CredentialProjection[] {
+  const policy: ExecutionPolicy = run.executionPolicy;
+  const credentials = policy.secretScope.providerCredentials ?? [];
+  return credentials.map((item, index) => ({
+    profile: item.profile,
+    secretRef: item.secretRef.namespace ? item.secretRef : { ...item.secretRef, namespace },
+    volumeName: sanitizeVolumeName(`${String(item.profile)}-${index}`),
+    mountPath: normalizeMountPath(item.secretRef.mountPath),
+  }));
+}
+
+function secretVolume(item: CredentialProjection): JsonRecord {
+  const secret: JsonRecord = {
+    secretName: item.secretRef.name,
+    defaultMode: 256,
+  };
+  const keys = item.secretRef.keys ?? [];
+  if (keys.length > 0) secret.items = keys.map((key) => ({ key, path: key, mode: 256 }));
+  return { name: item.volumeName, secret };
+}
+
+function normalizeMountPath(value: string | undefined): string {
+  if (!value || value === "~/.codex") return "/home/agentrun/.codex";
+  if (value.startsWith("~/")) return `/home/agentrun/${value.slice(2)}`;
+  return value;
+}
+
+function labels(run: RunRecord, jobName: string): JsonRecord {
+  return {
+    "app.kubernetes.io/name": "agentrun-runner",
+    "app.kubernetes.io/component": "runner",
+    "app.kubernetes.io/part-of": "agentrun",
+    "agentrun.pikastech.local/lane": "v0.1",
+    "agentrun.pikastech.local/run-hash": shortHash(run.id),
+    "job-name": jobName,
+  };
+}
+
+function shortDnsHash(...parts: string[]): string {
+  return shortHash(parts.join(":"));
+}
+
+function shortHash(value: JsonValue): string {
+  return stableHash(value).slice(0, 12);
+}
+
+function sanitizeVolumeName(value: string): string {
+  const sanitized = value.toLowerCase().replace(/[^a-z0-9-]+/gu, "-").replace(/^-+|-+$/gu, "");
+  return sanitized.length > 0 ? sanitized.slice(0, 40) : "credential";
+}
@@ -1,4 +1,6 @@
 import { runOnce, type RunnerOnceOptions } from "./run-once.js";
+import { AgentRunError, errorToJson } from "../common/errors.js";
+import { failureKindFromError } from "./manager-api.js";

 const managerUrl = process.env.AGENTRUN_MGR_URL;
 const runId = process.env.AGENTRUN_RUN_ID;
@@ -11,9 +13,23 @@ const options: RunnerOnceOptions = {
  managerUrl,
  runId,
 };
+if (process.env.AGENTRUN_COMMAND_ID) options.commandId = process.env.AGENTRUN_COMMAND_ID;
+if (process.env.AGENTRUN_ATTEMPT_ID) options.attemptId = process.env.AGENTRUN_ATTEMPT_ID;
 if (process.env.AGENTRUN_RUNNER_ID) options.runnerId = process.env.AGENTRUN_RUNNER_ID;
+if (process.env.AGENTRUN_BACKEND_PROFILE === "codex") options.backendProfile = "codex";
+if (process.env.AGENTRUN_K8S_JOB_NAME) options.placement = "kubernetes-job";
+if (process.env.AGENTRUN_SOURCE_COMMIT) options.sourceCommit = process.env.AGENTRUN_SOURCE_COMMIT;
+if (process.env.AGENTRUN_K8S_JOB_NAME) options.jobName = process.env.AGENTRUN_K8S_JOB_NAME;
+if (process.env.HOSTNAME) options.podName = process.env.HOSTNAME;
+if (process.env.AGENTRUN_LOG_PATH) options.logPath = process.env.AGENTRUN_LOG_PATH;
 if (process.env.AGENTRUN_CODEX_COMMAND) options.codexCommand = process.env.AGENTRUN_CODEX_COMMAND;
 if (process.env.AGENTRUN_CODEX_ARGS) options.codexArgs = JSON.parse(process.env.AGENTRUN_CODEX_ARGS) as string[];
 if (process.env.CODEX_HOME) options.codexHome = process.env.CODEX_HOME;
-const result = await runOnce(options);
-console.log(JSON.stringify({ ok: true, data: result }));
+try {
+  const result = await runOnce(options);
+  console.log(JSON.stringify({ ok: true, data: result }));
+} catch (error) {
+  const failureKind = failureKindFromError(error);
+  console.log(JSON.stringify({ ok: false, failureKind, message: error instanceof Error ? error.message : String(error), error: errorToJson(error) }));
+  process.exit(error instanceof AgentRunError && error.httpStatus >= 1 && error.httpStatus <= 255 ? error.httpStatus : 1);
+}
@@ -0,0 +1,108 @@
+import { ManagerClient } from "../mgr/client.js";
+import { AgentRunError } from "../common/errors.js";
+import type { BackendEvent, BackendProfile, CommandRecord, FailureKind, JsonRecord, RunRecord, RunnerRecord, TerminalStatus } from "../common/types.js";
+
+export interface RunnerRegistrationInput {
+  runId: string;
+  attemptId: string;
+  backendProfile: BackendProfile;
+  placement: "host-process" | "kubernetes-job";
+  sourceCommit: string;
+  runnerId?: string;
+  jobName?: string;
+  podName?: string;
+  logPath?: string;
+}
+
+export interface PollCommandsResult {
+  items: CommandRecord[];
+  selected: CommandRecord | null;
+}
+
+export interface RunnerFailureReport {
+  terminalStatus: TerminalStatus;
+  failureKind: FailureKind;
+  failureMessage: string;
+}
+
+export class RunnerManagerApi {
+  readonly client: ManagerClient;
+
+  constructor(readonly managerUrl: string) {
+    this.client = new ManagerClient(managerUrl);
+  }
+
+  async register(input: RunnerRegistrationInput): Promise<RunnerRecord> {
+    const body: JsonRecord = {
+      runId: input.runId,
+      attemptId: input.attemptId,
+      backendProfile: input.backendProfile,
+      placement: input.placement,
+      sourceCommit: input.sourceCommit,
+    };
+    if (input.runnerId) body.id = input.runnerId;
+    if (input.jobName) body.jobName = input.jobName;
+    if (input.podName) body.podName = input.podName;
+    if (input.logPath) body.logPath = input.logPath;
+    return await this.client.post("/api/v1/runners/register", body) as RunnerRecord;
+  }
+
+  async claim(runId: string, runnerId: string, leaseMs: number): Promise<RunRecord> {
+    return await this.client.post(`/api/v1/runs/${encodeURIComponent(runId)}/claim`, { runnerId, leaseMs }) as RunRecord;
+  }
+
+  async heartbeat(runId: string, runnerId: string, leaseMs: number): Promise<RunRecord> {
+    return await this.client.patch(`/api/v1/runs/${encodeURIComponent(runId)}/lease`, { runnerId, leaseMs }) as RunRecord;
+  }
+
+  async pollCommands(runId: string, options: { afterSeq?: number; limit?: number; commandId?: string }): Promise<PollCommandsResult> {
+    const afterSeq = options.afterSeq ?? 0;
+    const limit = options.limit ?? 20;
+    const response = await this.client.get(`/api/v1/runs/${encodeURIComponent(runId)}/commands?afterSeq=${afterSeq}&limit=${limit}`) as { items?: CommandRecord[] };
+    const items = Array.isArray(response.items) ? response.items : [];
+    const selected = options.commandId ? items.find((item) => item.id === options.commandId && item.state === "pending" && item.type === "turn") ?? null : items.find((item) => item.state === "pending" && item.type === "turn") ?? null;
+    return { items, selected };
+  }
+
+  async ackCommand(commandId: string): Promise<CommandRecord> {
+    return await this.client.post(`/api/v1/commands/${encodeURIComponent(commandId)}/ack`, {}) as CommandRecord;
+  }
+
+  async appendEvent(runId: string, event: BackendEvent): Promise<JsonRecord> {
+    return await this.client.post(`/api/v1/runs/${encodeURIComponent(runId)}/events`, event as unknown as JsonRecord) as JsonRecord;
+  }
+
+  async reportStatus(runId: string, report: { terminalStatus: TerminalStatus; failureKind: FailureKind | null; failureMessage: string | null }): Promise<RunRecord> {
+    return await this.client.patch(`/api/v1/runs/${encodeURIComponent(runId)}/status`, report as unknown as JsonRecord) as RunRecord;
+  }
+
+  async reportFailure(runId: string, report: RunnerFailureReport): Promise<{ reported: boolean; run: RunRecord | null; reportError: string | null }> {
+    try {
+      await this.appendEvent(runId, { type: "error", payload: { failureKind: report.failureKind, message: report.failureMessage, source: "agentrun-runner" } });
+      const run = await this.reportStatus(runId, report);
+      return { reported: true, run, reportError: null };
+    } catch (error) {
+      return { reported: false, run: null, reportError: errorMessage(error) };
+    }
+  }
+}
+
+export function failureKindFromError(error: unknown): FailureKind {
+  if (error instanceof AgentRunError) return error.failureKind;
+  const message = errorMessage(error).toLowerCase();
+  if (message.includes("auth") || message.includes("unauthorized") || message.includes("forbidden")) return "provider-auth-failed";
+  if (message.includes("timeout")) return "backend-timeout";
+  if (message.includes("lease") || message.includes("claim")) return "runner-lease-conflict";
+  return "infra-failed";
+}
+
+export function terminalStatusForFailure(failureKind: FailureKind): TerminalStatus {
+  if (failureKind === "cancelled") return "cancelled";
+  if (failureKind === "secret-unavailable" || failureKind === "tenant-policy-denied" || failureKind === "schema-invalid") return "blocked";
+  return "failed";
+}
+
+export function errorMessage(error: unknown): string {
+  if (error instanceof Error) return error.message;
+  return String(error);
+}
@@ -1,35 +1,57 @@
-import { ManagerClient } from "../mgr/client.js";
+import { RunnerManagerApi, failureKindFromError, terminalStatusForFailure, errorMessage } from "./manager-api.js";
 import { runBackendTurn, type BackendAdapterOptions } from "../backend/adapter.js";
-import type { CommandRecord, JsonRecord, RunRecord, RunnerRecord } from "../common/types.js";
+import type { BackendProfile, JsonRecord, RunRecord, RunnerRecord } from "../common/types.js";

 export interface RunnerOnceOptions extends BackendAdapterOptions {
  managerUrl: string;
  runId: string;
+  commandId?: string;
  runnerId?: string;
  attemptId?: string;
  leaseMs?: number;
+  backendProfile?: BackendProfile;
+  placement?: "host-process" | "kubernetes-job";
+  sourceCommit?: string;
+  jobName?: string;
+  podName?: string;
+  logPath?: string;
 }

 export async function runOnce(options: RunnerOnceOptions): Promise<JsonRecord> {
-  const client = new ManagerClient(options.managerUrl);
-  const runner = await client.post("/api/v1/runners/register", {
-    id: options.runnerId ?? undefined,
+  const api = new RunnerManagerApi(options.managerUrl);
+  const leaseMs = options.leaseMs ?? 60_000;
+  const attemptId = options.attemptId ?? `attempt_${Date.now().toString(36)}`;
+  const runner = await api.register({
    runId: options.runId,
-    attemptId: options.attemptId ?? `attempt_${Date.now().toString(36)}`,
-    backendProfile: "codex",
-    placement: "host-process",
-    sourceCommit: process.env.AGENTRUN_SOURCE_COMMIT ?? "unknown",
-  } as JsonRecord) as unknown as RunnerRecord;
-  const claimed = await client.post(`/api/v1/runs/${options.runId}/claim`, { runnerId: runner.id, leaseMs: options.leaseMs ?? 60_000 }) as unknown as RunRecord;
-  const commandsResponse = await client.get(`/api/v1/runs/${options.runId}/commands?afterSeq=0&limit=20`) as { items?: CommandRecord[] };
-  const command = commandsResponse.items?.find((item) => item.state === "pending" && item.type === "turn");
-  if (!command) {
-    await client.patch(`/api/v1/runs/${options.runId}/status`, { terminalStatus: "blocked", failureKind: "schema-invalid", failureMessage: "no pending turn command" });
-    return { runner, claimed, terminalStatus: "blocked", failureKind: "schema-invalid" };
+    attemptId,
+    backendProfile: options.backendProfile ?? "codex",
+    placement: options.placement ?? "host-process",
+    sourceCommit: options.sourceCommit ?? process.env.AGENTRUN_SOURCE_COMMIT ?? "unknown",
+    ...(options.runnerId ? { runnerId: options.runnerId } : {}),
+    ...(options.jobName ? { jobName: options.jobName } : {}),
+    ...(options.podName ? { podName: options.podName } : {}),
+    ...(options.logPath ? { logPath: options.logPath } : {}),
+  }) as RunnerRecord;
+  let claimed: RunRecord;
+  try {
+    claimed = await api.claim(options.runId, runner.id, leaseMs);
+    await api.heartbeat(options.runId, runner.id, leaseMs);
+  } catch (error) {
+    const failureKind = failureKindFromError(error);
+    if (failureKind !== "runner-lease-conflict") {
+      await api.reportFailure(options.runId, { terminalStatus: terminalStatusForFailure(failureKind), failureKind, failureMessage: errorMessage(error) });
+    }
+    throw error;
  }
-  await client.post(`/api/v1/commands/${command.id}/ack`, {});
+  const commandsResponse = await api.pollCommands(options.runId, { afterSeq: 0, limit: 20, ...(options.commandId ? { commandId: options.commandId } : {}) });
+  const command = commandsResponse.selected;
+  if (!command) {
+    await api.reportStatus(options.runId, { terminalStatus: "blocked", failureKind: "schema-invalid", failureMessage: "no pending turn command" });
+    return { runner, claimed, terminalStatus: "blocked", failureKind: "schema-invalid", polledCommands: commandsResponse.items.length };
+  }
+  await api.ackCommand(command.id);
  const result = await runBackendTurn(claimed, command, options);
-  for (const event of result.events) await client.post(`/api/v1/runs/${options.runId}/events`, event as unknown as JsonRecord);
-  const finalRun = await client.patch(`/api/v1/runs/${options.runId}/status`, { terminalStatus: result.terminalStatus, failureKind: result.failureKind, failureMessage: result.failureMessage }) as unknown as RunRecord;
+  for (const event of result.events) await api.appendEvent(options.runId, event);
+  const finalRun = await api.reportStatus(options.runId, { terminalStatus: result.terminalStatus, failureKind: result.failureKind, failureMessage: result.failureMessage }) as RunRecord;
  return { runner, commandId: command.id, terminalStatus: result.terminalStatus, failureKind: result.failureKind, run: finalRun } as JsonRecord;
 }
@@ -0,0 +1,24 @@
+import assert from "node:assert/strict";
+import { openAgentRunStoreFromEnv } from "../../mgr/store.js";
+import { postgresMigrationContract } from "../../mgr/postgres-store.js";
+import { redactText } from "../../common/redaction.js";
+import { AgentRunError } from "../../common/errors.js";
+import type { SelfTestCase } from "../harness.js";
+
+const selfTest: SelfTestCase = async () => {
+  assert.equal(redactText("Authorization: Bearer abc123"), "Authorization: Bearer REDACTED");
+  assert.equal(redactText('{"token":"test-token-material"}').includes("test-token-material"), false);
+  await assert.rejects(
+    () => openAgentRunStoreFromEnv({}),
+    (error) => error instanceof AgentRunError && error.failureKind === "infra-failed" && error.message.includes("DATABASE_URL is required"),
+  );
+  const postgresContract = postgresMigrationContract();
+  assert.equal(postgresContract.latestMigrationId, "001_v01_initial_durable_store");
+  assert.ok(Array.isArray(postgresContract.requiredTables));
+  assert.ok(postgresContract.requiredTables.includes("agentrun_schema_migrations"));
+  assert.ok(postgresContract.requiredTables.includes("agentrun_runs"));
+  assert.ok(postgresContract.requiredTables.includes("agentrun_events"));
+  return { name: "redaction-postgres", tests: ["redaction", "postgres-store-contract"] };
+};
+
+export default selfTest;
@@ -0,0 +1,23 @@
+import assert from "node:assert/strict";
+import { startManagerServer } from "../../mgr/server.js";
+import { MemoryAgentRunStore } from "../../mgr/store.js";
+import { ManagerClient } from "../../mgr/client.js";
+import type { SelfTestCase } from "../harness.js";
+
+const selfTest: SelfTestCase = async () => {
+  const server = await startManagerServer({ port: 0, host: "127.0.0.1", sourceCommit: "self-test", store: new MemoryAgentRunStore() });
+  try {
+    const client = new ManagerClient(server.baseUrl);
+    const health = await client.get("/health/readiness") as { database?: { adapter?: string; reachable?: boolean; migrationReady?: boolean; failureKind?: string | null }; secretRefs?: { valuesPrinted?: boolean } };
+    assert.equal(health.database?.adapter, "memory-self-test");
+    assert.equal(health.database?.reachable, true);
+    assert.equal(health.database?.migrationReady, true);
+    assert.equal(health.database?.failureKind, null);
+    assert.equal(health.secretRefs?.valuesPrinted, false);
+    return { name: "manager-memory", tests: ["manager-memory-lifecycle"] };
+  } finally {
+    await new Promise<void>((resolve) => server.server.close(() => resolve()));
+  }
+};
+
+export default selfTest;
@@ -0,0 +1,32 @@
+import assert from "node:assert/strict";
+import { startManagerServer } from "../../mgr/server.js";
+import { MemoryAgentRunStore } from "../../mgr/store.js";
+import { ManagerClient } from "../../mgr/client.js";
+import { renderRunnerJobDryRun } from "../../runner/k8s-job.js";
+import type { RunRecord } from "../../common/types.js";
+import { assertNoSecretLeak, createRunWithCommand, type SelfTestCase } from "../harness.js";
+
+const selfTest: SelfTestCase = async (context) => {
+  const server = await startManagerServer({ port: 0, host: "127.0.0.1", sourceCommit: "self-test", store: new MemoryAgentRunStore() });
+  try {
+    const client = new ManagerClient(server.baseUrl);
+    const item = await createRunWithCommand(client, context, "job smoke", "selftest-job-render", 15_000);
+    const rendered = renderRunnerJobDryRun({
+      run: await client.get(`/api/v1/runs/${item.runId}`) as RunRecord,
+      commandId: item.commandId,
+      managerUrl: server.baseUrl,
+      image: "127.0.0.1:5000/agentrun/agentrun-mgr@sha256:1111111111111111111111111111111111111111111111111111111111111111",
+      attemptId: "attempt_selftest",
+      sourceCommit: "self-test",
+    });
+    assert.equal(rendered.dryRun, true);
+    assert.equal(rendered.mutation, false);
+    assert.equal((rendered.jobIdentity as { serviceAccountName?: string }).serviceAccountName, "agentrun-v01-runner");
+    assertNoSecretLeak(rendered);
+    return { name: "runner-k8s-job", tests: ["runner-k8s-job-dry-run"] };
+  } finally {
+    await new Promise<void>((resolve) => server.server.close(() => resolve()));
+  }
+};
+
+export default selfTest;
@@ -0,0 +1,71 @@
+import assert from "node:assert/strict";
+import path from "node:path";
+import os from "node:os";
+import { startManagerServer } from "../../mgr/server.js";
+import { MemoryAgentRunStore } from "../../mgr/store.js";
+import { ManagerClient } from "../../mgr/client.js";
+import { runOnce } from "../../runner/run-once.js";
+import type { FailureKind, JsonRecord, TerminalStatus } from "../../common/types.js";
+import { assertNoSecretLeak, createRunWithCommand, type SelfTestCase, type SelfTestContext } from "../harness.js";
+
+const selfTest: SelfTestCase = async (context) => {
+  const server = await startManagerServer({ port: 0, host: "127.0.0.1", sourceCommit: "self-test", store: new MemoryAgentRunStore() });
+  try {
+    const client = new ManagerClient(server.baseUrl);
+    const happy = await createRunWithCommand(client, context, "hello", "selftest-turn", 15_000);
+    const result = await runOnce({ managerUrl: server.baseUrl, runId: happy.runId, codexCommand: context.fakeCodexCommand, codexArgs: context.fakeCodexArgs, codexHome: context.codexHome, env: { CODEX_HOME: context.codexHome } });
+    assert.equal(result.terminalStatus, "completed");
+    assert.equal(typeof (result.runner as { id?: unknown }).id, "string");
+    const events = await client.get(`/api/v1/runs/${happy.runId}/events?afterSeq=0&limit=100`) as { items?: Array<{ type: string; payload: unknown }> };
+    assert.ok(events.items?.some((event) => event.type === "assistant_message"));
+    assert.ok(events.items?.some((event) => event.type === "backend_status" && JSON.stringify(event.payload).includes("run-claimed")));
+    assertNoSecretLeak(events);
+    const finalRun = await client.get(`/api/v1/runs/${happy.runId}`) as { terminalStatus?: string };
+    assert.equal(finalRun.terminalStatus, "completed");
+
+    await runFailureCase({ client, managerUrl: server.baseUrl, context, mode: "missing-turn-result", expectedStatus: "failed", expectedFailureKind: "backend-response-invalid" });
+    await runFailureCase({ client, managerUrl: server.baseUrl, context, mode: "invalid-json", expectedStatus: "failed", expectedFailureKind: "backend-json-parse-error" });
+    await runFailureCase({ client, managerUrl: server.baseUrl, context, mode: "missing-terminal", expectedStatus: "failed", expectedFailureKind: "backend-timeout", timeoutMs: 500 });
+    await runSpawnFailureCase({ client, managerUrl: server.baseUrl, context });
+
+    return { name: "codex-stdio", tests: ["runner-lease-heartbeat", "codex-stdio-fake-turn", "codex-stdio-missing-turn-result", "codex-stdio-invalid-json", "codex-stdio-timeout", "codex-stdio-spawn-failure"] };
+  } finally {
+    await new Promise<void>((resolve) => server.server.close(() => resolve()));
+  }
+};
+
+async function runFailureCase(options: { client: ManagerClient; managerUrl: string; context: SelfTestContext; mode: string; expectedStatus: TerminalStatus; expectedFailureKind: FailureKind; timeoutMs?: number }): Promise<void> {
+  const item = await createRunWithCommand(options.client, options.context, `failure ${options.mode}`, `selftest-${options.mode}`, options.timeoutMs ?? 3_000);
+  const result = await runOnce({
+    managerUrl: options.managerUrl,
+    runId: item.runId,
+    codexCommand: options.context.fakeCodexCommand,
+    codexArgs: options.context.fakeCodexArgs,
+    codexHome: options.context.codexHome,
+    env: { CODEX_HOME: options.context.codexHome, AGENTRUN_FAKE_CODEX_MODE: options.mode },
+  }) as JsonRecord;
+  assert.equal(result.terminalStatus, options.expectedStatus, options.mode);
+  assert.equal(result.failureKind, options.expectedFailureKind, options.mode);
+  const events = await options.client.get(`/api/v1/runs/${item.runId}/events?afterSeq=0&limit=100`) as { items?: Array<{ type: string; payload: unknown }> };
+  assert.ok(events.items?.some((event) => event.type === "error"), options.mode);
+  assertNoSecretLeak(events);
+}
+
+async function runSpawnFailureCase(options: { client: ManagerClient; managerUrl: string; context: SelfTestContext }): Promise<void> {
+  const item = await createRunWithCommand(options.client, options.context, "failure spawn", "selftest-spawn-failure", 3_000);
+  const result = await runOnce({
+    managerUrl: options.managerUrl,
+    runId: item.runId,
+    codexCommand: path.join(os.tmpdir(), `agentrun-missing-codex-${process.pid}`),
+    codexArgs: [],
+    codexHome: options.context.codexHome,
+    env: { CODEX_HOME: options.context.codexHome },
+  }) as JsonRecord;
+  assert.equal(result.terminalStatus, "failed", "spawn failure");
+  assert.equal(result.failureKind, "backend-spawn-failed", "spawn failure");
+  const events = await options.client.get(`/api/v1/runs/${item.runId}/events?afterSeq=0&limit=100`) as { items?: Array<{ type: string; payload: unknown }> };
+  assert.ok(events.items?.some((event) => event.type === "error"), "spawn failure");
+  assertNoSecretLeak(events);
+}
+
+export default selfTest;
@@ -0,0 +1,82 @@
+import { mkdtemp, mkdir, writeFile, rm } from "node:fs/promises";
+import os from "node:os";
+import path from "node:path";
+import assert from "node:assert/strict";
+import { ManagerClient } from "../mgr/client.js";
+import type { JsonRecord } from "../common/types.js";
+
+export interface SelfTestContext {
+  root: string;
+  tmp: string;
+  codexHome: string;
+  workspace: string;
+  fakeCodexPath: string;
+  fakeCodexCommand: string;
+  fakeCodexArgs: string[];
+  cleanup(): Promise<void>;
+}
+
+export interface SelfTestResult {
+  name?: string;
+  tests?: string[];
+}
+
+export type SelfTestCase = (context: SelfTestContext) => Promise<SelfTestResult> | SelfTestResult;
+
+export async function createSelfTestContext(root: string): Promise<SelfTestContext> {
+  const tmp = await mkdtemp(path.join(os.tmpdir(), "agentrun-selftest-"));
+  const codexHome = path.join(tmp, "codex-home");
+  const workspace = path.join(tmp, "workspace");
+  await mkdir(codexHome, { recursive: true });
+  await mkdir(workspace, { recursive: true });
+  await writeFile(path.join(codexHome, "auth.json"), JSON.stringify({ token: "test-token-material" }));
+  await writeFile(path.join(codexHome, "config.toml"), "model = \"gpt-test\"\n");
+  await writeFile(path.join(workspace, "README.md"), "self-test workspace\n");
+  const fakeCodexPath = path.join(root, "src/selftest/fake-codex-app-server.ts");
+  return {
+    root,
+    tmp,
+    codexHome,
+    workspace,
+    fakeCodexPath,
+    fakeCodexCommand: process.env.AGENTRUN_SELFTEST_CODEX_COMMAND ?? defaultFakeCommand(),
+    fakeCodexArgs: process.env.AGENTRUN_SELFTEST_CODEX_ARGS ? JSON.parse(process.env.AGENTRUN_SELFTEST_CODEX_ARGS) as string[] : defaultFakeArgs(fakeCodexPath),
+    cleanup: () => rm(tmp, { recursive: true, force: true }),
+  };
+}
+
+export async function createRunWithCommand(client: ManagerClient, context: Pick<SelfTestContext, "workspace" | "codexHome">, prompt: string, idempotencyKey: string, timeoutMs: number): Promise<{ runId: string; commandId: string }> {
+  const run = await client.post("/api/v1/runs", {
+    tenantId: "unidesk",
+    projectId: "pikasTech/unidesk",
+    workspaceRef: { kind: "host-path", path: context.workspace },
+    providerId: "G14",
+    backendProfile: "codex",
+    executionPolicy: {
+      sandbox: "workspace-write",
+      approval: "never",
+      timeoutMs,
+      network: "default",
+      secretScope: { allowCredentialEcho: false, providerCredentials: [{ profile: "codex", secretRef: { name: "agentrun-v01-provider-codex", keys: ["auth.json", "config.toml"], mountPath: context.codexHome } }] },
+    },
+    traceSink: null,
+  }) as { id: string };
+  const command = await client.post(`/api/v1/runs/${run.id}/commands`, { type: "turn", payload: { prompt }, idempotencyKey }) as { id: string };
+  const duplicate = await client.post(`/api/v1/runs/${run.id}/commands`, { type: "turn", payload: { prompt }, idempotencyKey }) as { id: string };
+  assert.equal(duplicate.id, command.id);
+  return { runId: run.id, commandId: command.id };
+}
+
+export function assertNoSecretLeak(value: unknown): void {
+  const text = JSON.stringify(value);
+  assert.equal(text.includes("test-token-material"), false);
+  assert.equal(text.includes("Bearer test-token"), false);
+}
+
+function defaultFakeCommand(): string {
+  return process.versions.bun ? process.execPath : "npx";
+}
+
+function defaultFakeArgs(fakePath: string): string[] {
+  return process.versions.bun ? [fakePath] : ["tsx", fakePath];
+}
@@ -1,180 +1,28 @@
-import { mkdtemp, mkdir, writeFile, rm } from "node:fs/promises";
-import os from "node:os";
+import { readdir } from "node:fs/promises";
 import path from "node:path";
-import { fileURLToPath } from "node:url";
-import assert from "node:assert/strict";
-import { startManagerServer } from "../mgr/server.js";
-import { MemoryAgentRunStore, openAgentRunStoreFromEnv } from "../mgr/store.js";
-import { postgresMigrationContract } from "../mgr/postgres-store.js";
-import { ManagerClient } from "../mgr/client.js";
-import { runOnce } from "../runner/run-once.js";
-import { redactText } from "../common/redaction.js";
-import { AgentRunError } from "../common/errors.js";
-import type { FailureKind, JsonRecord, TerminalStatus } from "../common/types.js";
+import { pathToFileURL } from "node:url";
+import { createSelfTestContext, type SelfTestCase, type SelfTestResult } from "./harness.js";

-const root = path.resolve(path.dirname(fileURLToPath(import.meta.url)), "../..");
-const tmp = await mkdtemp(path.join(os.tmpdir(), "agentrun-selftest-"));
+const root = path.resolve(import.meta.dirname, "../..");
+const casesDir = path.join(root, "src/selftest/cases");
+const context = await createSelfTestContext(root);

 try {
-  const codexHome = path.join(tmp, "codex-home");
-  const workspace = path.join(tmp, "workspace");
-  await mkdir(codexHome, { recursive: true });
-  await mkdir(workspace, { recursive: true });
-  await writeFile(path.join(codexHome, "auth.json"), JSON.stringify({ token: "test-token-material" }));
-  await writeFile(path.join(codexHome, "config.toml"), "model = \"gpt-test\"\n");
-  await writeFile(path.join(workspace, "README.md"), "self-test workspace\n");
+  const caseFiles = (await readdir(casesDir))
+    .filter((file) => file.endsWith(".ts") && !file.endsWith(".d.ts"))
+    .sort();
+  const results: SelfTestResult[] = [];

-  assert.equal(redactText("Authorization: Bearer abc123"), "Authorization: Bearer REDACTED");
-  assert.equal(redactText('{"token":"test-token-material"}').includes("test-token-material"), false);
-  await assert.rejects(
-    () => openAgentRunStoreFromEnv({}),
-    (error) => error instanceof AgentRunError && error.failureKind === "infra-failed" && error.message.includes("DATABASE_URL is required"),
-  );
-  const postgresContract = postgresMigrationContract();
-  assert.equal(postgresContract.latestMigrationId, "001_v01_initial_durable_store");
-  assert.ok(Array.isArray(postgresContract.requiredTables));
-  assert.ok(postgresContract.requiredTables.includes("agentrun_schema_migrations"));
-  assert.ok(postgresContract.requiredTables.includes("agentrun_runs"));
-  assert.ok(postgresContract.requiredTables.includes("agentrun_events"));
-
-  const server = await startManagerServer({ port: 0, host: "127.0.0.1", sourceCommit: "self-test", store: new MemoryAgentRunStore() });
-  try {
-    const client = new ManagerClient(server.baseUrl);
-    const health = await client.get("/health/readiness") as { database?: { adapter?: string; reachable?: boolean; migrationReady?: boolean; failureKind?: string | null }; secretRefs?: { valuesPrinted?: boolean } };
-    assert.equal(health.database?.adapter, "memory-self-test");
-    assert.equal(health.database?.reachable, true);
-    assert.equal(health.database?.migrationReady, true);
-    assert.equal(health.database?.failureKind, null);
-    assert.equal(health.secretRefs?.valuesPrinted, false);
-
-    const fakePath = path.join(root, "src/selftest/fake-codex-app-server.ts");
-    const fakeCommand = process.env.AGENTRUN_SELFTEST_CODEX_COMMAND ?? defaultFakeCommand();
-    const fakeArgs = process.env.AGENTRUN_SELFTEST_CODEX_ARGS ? JSON.parse(process.env.AGENTRUN_SELFTEST_CODEX_ARGS) as string[] : defaultFakeArgs(fakePath);
-
-    const happy = await createRunWithCommand(client, workspace, codexHome, "hello", "selftest-turn", 15_000);
-    const result = await runOnce({ managerUrl: server.baseUrl, runId: happy.runId, codexCommand: fakeCommand, codexArgs: fakeArgs, codexHome, env: { CODEX_HOME: codexHome } });
-    assert.equal(result.terminalStatus, "completed");
-    const events = await client.get(`/api/v1/runs/${happy.runId}/events?afterSeq=0&limit=100`) as { items?: Array<{ type: string; payload: unknown }> };
-    assert.ok(events.items?.some((event) => event.type === "assistant_message"));
-    assertNoSecretLeak(events);
-    const finalRun = await client.get(`/api/v1/runs/${happy.runId}`) as { terminalStatus?: string };
-    assert.equal(finalRun.terminalStatus, "completed");
-
-    await runFailureCase({
-      client,
-      managerUrl: server.baseUrl,
-      workspace,
-      codexHome,
-      fakeCommand,
-      fakeArgs,
-      mode: "missing-turn-result",
-      expectedStatus: "failed",
-      expectedFailureKind: "backend-response-invalid",
-    });
-    await runFailureCase({
-      client,
-      managerUrl: server.baseUrl,
-      workspace,
-      codexHome,
-      fakeCommand,
-      fakeArgs,
-      mode: "invalid-json",
-      expectedStatus: "failed",
-      expectedFailureKind: "backend-json-parse-error",
-    });
-    await runFailureCase({
-      client,
-      managerUrl: server.baseUrl,
-      workspace,
-      codexHome,
-      fakeCommand,
-      fakeArgs,
-      mode: "missing-terminal",
-      expectedStatus: "failed",
-      expectedFailureKind: "backend-timeout",
-      timeoutMs: 500,
-    });
-    await runSpawnFailureCase({
-      client,
-      managerUrl: server.baseUrl,
-      workspace,
-      codexHome,
-    });
-
-    console.log(JSON.stringify({ ok: true, tests: ["manager-memory-lifecycle", "codex-stdio-fake-turn", "codex-stdio-missing-turn-result", "codex-stdio-invalid-json", "codex-stdio-timeout", "codex-stdio-spawn-failure", "postgres-store-contract", "redaction"], runId: happy.runId }));
-  } finally {
-    await new Promise<void>((resolve) => server.server.close(() => resolve()));
+  for (const file of caseFiles) {
+    const moduleUrl = pathToFileURL(path.join(casesDir, file)).href;
+    const imported = await import(moduleUrl) as { default?: SelfTestCase; selfTest?: SelfTestCase };
+    const selfTest = imported.default ?? imported.selfTest;
+    if (typeof selfTest !== "function") throw new Error(`self-test case ${file} must export default or selfTest function`);
+    const result = await selfTest(context);
+    results.push({ name: result.name ?? file.replace(/\.ts$/u, ""), tests: result.tests ?? [] });
  }
+
+  console.log(JSON.stringify({ ok: true, cases: results, tests: results.flatMap((result) => result.tests) }));
 } finally {
-  await rm(tmp, { recursive: true, force: true });
-}
-
-function defaultFakeCommand(): string {
-  return process.versions.bun ? process.execPath : "npx";
-}
-
-function defaultFakeArgs(fakePath: string): string[] {
-  return process.versions.bun ? [fakePath] : ["tsx", fakePath];
-}
-
-async function createRunWithCommand(client: ManagerClient, workspace: string, codexHome: string, prompt: string, idempotencyKey: string, timeoutMs: number): Promise<{ runId: string; commandId: string }> {
-  const run = await client.post("/api/v1/runs", {
-    tenantId: "unidesk",
-    projectId: "pikasTech/unidesk",
-    workspaceRef: { kind: "host-path", path: workspace },
-    providerId: "G14",
-    backendProfile: "codex",
-    executionPolicy: {
-      sandbox: "workspace-write",
-      approval: "never",
-      timeoutMs,
-      network: "default",
-      secretScope: { allowCredentialEcho: false, providerCredentials: [{ profile: "codex", secretRef: { name: "agentrun-v01-provider-codex", keys: ["auth.json", "config.toml"], mountPath: codexHome } }] },
-    },
-    traceSink: null,
-  }) as { id: string };
-  const command = await client.post(`/api/v1/runs/${run.id}/commands`, { type: "turn", payload: { prompt }, idempotencyKey }) as { id: string };
-  const duplicate = await client.post(`/api/v1/runs/${run.id}/commands`, { type: "turn", payload: { prompt }, idempotencyKey }) as { id: string };
-  assert.equal(duplicate.id, command.id);
-  return { runId: run.id, commandId: command.id };
-}
-
-async function runFailureCase(options: { client: ManagerClient; managerUrl: string; workspace: string; codexHome: string; fakeCommand: string; fakeArgs: string[]; mode: string; expectedStatus: TerminalStatus; expectedFailureKind: FailureKind; timeoutMs?: number }): Promise<void> {
-  const item = await createRunWithCommand(options.client, options.workspace, options.codexHome, `failure ${options.mode}`, `selftest-${options.mode}`, options.timeoutMs ?? 3_000);
-  const result = await runOnce({
-    managerUrl: options.managerUrl,
-    runId: item.runId,
-    codexCommand: options.fakeCommand,
-    codexArgs: options.fakeArgs,
-    codexHome: options.codexHome,
-    env: { CODEX_HOME: options.codexHome, AGENTRUN_FAKE_CODEX_MODE: options.mode },
-  }) as JsonRecord;
-  assert.equal(result.terminalStatus, options.expectedStatus, options.mode);
-  assert.equal(result.failureKind, options.expectedFailureKind, options.mode);
-  const events = await options.client.get(`/api/v1/runs/${item.runId}/events?afterSeq=0&limit=100`) as { items?: Array<{ type: string; payload: unknown }> };
-  assert.ok(events.items?.some((event) => event.type === "error"), options.mode);
-  assertNoSecretLeak(events);
-}
-
-async function runSpawnFailureCase(options: { client: ManagerClient; managerUrl: string; workspace: string; codexHome: string }): Promise<void> {
-  const item = await createRunWithCommand(options.client, options.workspace, options.codexHome, "failure spawn", "selftest-spawn-failure", 3_000);
-  const result = await runOnce({
-    managerUrl: options.managerUrl,
-    runId: item.runId,
-    codexCommand: path.join(os.tmpdir(), `agentrun-missing-codex-${process.pid}`),
-    codexArgs: [],
-    codexHome: options.codexHome,
-    env: { CODEX_HOME: options.codexHome },
-  }) as JsonRecord;
-  assert.equal(result.terminalStatus, "failed", "spawn failure");
-  assert.equal(result.failureKind, "backend-spawn-failed", "spawn failure");
-  const events = await options.client.get(`/api/v1/runs/${item.runId}/events?afterSeq=0&limit=100`) as { items?: Array<{ type: string; payload: unknown }> };
-  assert.ok(events.items?.some((event) => event.type === "error"), "spawn failure");
-  assertNoSecretLeak(events);
-}
-
-function assertNoSecretLeak(value: unknown): void {
-  const text = JSON.stringify(value);
-  assert.equal(text.includes("test-token-material"), false);
-  assert.equal(text.includes("Bearer test-token"), false);
+  await context.cleanup();
 }