Merge pull request #260 from pikasTech/fix/agentrun-bridge-runner

fix: route AgentRun bridge from runners
2026-06-11 10:00:04 +08:00
parent 23e2a6e3e2 0d0b3e21f3
commit 0ba1d19f5f
5 changed files with 217 additions and 6 deletions
@@ -113,6 +113,8 @@ The AgentRun `v0.1` commander task surface has, per AgentRun issue #105, completed real

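 The UniDesk commander entry point for new tasks is always `bun scripts/cli.ts agentrun queue|sessions`. This entry point is a direct bridge to the official `./scripts/agentrun --manager-url auto` CLI in `/root/agentrun-v01` on G14. For routine task submission, dispatch, turn, and steer, prefer the quoted heredoc/stdin forms `--json-stdin`, `--prompt-stdin`, `--runner-json-stdin`, or `--*-file -`; stdin is piped straight through to the official CLI without first being written to a dump file. `--json-file`, `--prompt-file`, and `--runner-json-file` are only for reviewed, reusable, controlled files; the bridge materializes them into temporary files on G14 before passing them to the official CLI. UniDesk does not implement the AgentRun queue protocol and does not double-write tasks back to the legacy Code Queue.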

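+`agentrun control-plane ...` and `agentrun queue|runs|commands|runner|sessions ...` share the same UniDesk SSH capture bridge. On the main server itself, the local backend-core broker can still be used; AgentRun runners, Artificer, and other environments without local Docker or the `unidesk-backend-core` container automatically switch to the existing frontend `/ws/ssh` WebSocket backend and disclose the basis for that choice in the `bridge.capture.backend`, `reason`, and `localBackendCore` output fields. A local `unidesk-backend-core` container is not an implicit prerequisite for using these AgentRun CLI entry points from a runner environment. If every capture backend is unavailable, the CLI must return `failureKind=bridge-execution-environment` with `capture-backend-unavailable` or `bridge-execution-environment-unavailable` and provide a controlled recovery entry point; run/command/schema errors returned by the official AgentRun CLI itself must not be rewritten as bridge failures.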
+
 When an AgentRun Queue task needs to call a UniDesk maintenance bridge such as `trans` / `unidesk-ssh`, the long-term contract is defined by `docs/reference/spec-v01-runtime-assembly.md` and `docs/reference/spec-v01-secret-distribution.md` in the AgentRun repository: the caller requests the `UNIDESK_SSH_CLIENT_TOKEN` SecretRef through `executionPolicy.secretScope.toolCredentials[].tool=unidesk-ssh`; non-sensitive endpoints are supplied explicitly by the runner-job `transientEnv` or filled in automatically from manager-controlled defaults. When the UniDesk bridge submits a Queue payload, it must not carry the token in the prompt, payload, or `transientEnv`, and must not use the HWLAB runtime Web entry point to impersonate the UniDesk frontend. If the dispatcher has correctly requested `unidesk-ssh` but the trace's `runner-job-created.transientEnv.names` contains none of `UNIDESK_MAIN_SERVER_IP`, `UNIDESK_MAIN_SERVER_HOST`, or `UNIDESK_FRONTEND_URL`, classify it as an AgentRun assembly problem; if the endpoint env is present but the route is denied or times out, troubleshoot the UniDesk frontend/token scope or the provider session.

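 The legacy UniDesk Code Queue keeps only entry points for historical archives, read-only troubleshooting, and stopping leftover legacy tasks. `codex submit/enqueue`, `codex steer`, `codex resume`, `codex queue create/merge`, `codex move`, the legacy Web submission form, legacy queue management, and legacy workdir management must all return a frozen status or be disabled; `codex task/tasks/output/read/unread/queues` can still read history, and `codex interrupt|cancel` is only for stopping leftover legacy tasks. Legacy Code Queue history is not migrated to AgentRun, and no adapter, legacy mode, fallback, or double-write path is provided.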
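
The capture backend selection rule documented in this hunk can be sketched as a pure decision function. This is a minimal sketch, assuming the three inputs have already been computed elsewhere; `CaptureInputs`, `CaptureDecision`, and `decideCaptureBackend` are hypothetical names used for illustration, and only the backend and reason strings are taken from this change.

```typescript
type CaptureBackend = "local-backend-core-broker" | "remote-frontend-websocket";

// Hypothetical input shape: each field is assumed to be detected by the caller.
interface CaptureInputs {
  runnerEnv: boolean;            // running inside an AgentRun runner or similar
  backendCoreContainer: boolean; // local unidesk-backend-core container observed
  remoteHost: string | null;     // main-server host hint, or null if unknown
}

interface CaptureDecision {
  backend: CaptureBackend;
  reason: string;
  remoteHost: string | null;
}

function decideCaptureBackend(inputs: CaptureInputs): CaptureDecision {
  const { runnerEnv, backendCoreContainer, remoteHost } = inputs;
  // Runner environments prefer the frontend WebSocket backend when a host is known.
  if (runnerEnv && remoteHost !== null) {
    return { backend: "remote-frontend-websocket", reason: "runner-environment", remoteHost };
  }
  // No local container, but a host hint exists: also go remote.
  if (!backendCoreContainer && remoteHost !== null) {
    return { backend: "remote-frontend-websocket", reason: "local-backend-core-unavailable", remoteHost };
  }
  // Otherwise stay on the local broker; this includes the no-host-hint case.
  return { backend: "local-backend-core-broker", reason: "main-server-local-backend-core", remoteHost: null };
}
```

Note that without a remote host hint the sketch stays on the local broker even in a runner environment, which is why the documented recovery guidance asks for `UNIDESK_MAIN_SERVER_IP` or `UNIDESK_MAIN_SERVER_HOST` to be provided.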
@@ -94,6 +94,7 @@ CI/CD, GitOps, rollout, artifact publishing, post-PR-merge runtime lane rolling
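 - `schedule list|get|runs|run|retry-run|delete|upsert-pgdata-backup` manages backend-core scheduled jobs and their run history. `schedule list`, `schedule get`, `schedule runs --limit N`, and `schedule runs <scheduleId> --limit N` are read-only observation entry points; `schedule run`, `schedule retry-run`, `schedule delete`, and `schedule upsert-pgdata-backup` trigger runs or write configuration, so production recovery requires explicit authorization. `schedule runs --limit N` is the global history view and returns `scope=global` and `scheduleId=null`; `schedule runs <scheduleId> --limit N` is the per-schedule history view and returns `scope=schedule` with the corresponding `scheduleId`. The CLI must reject purely numeric positional arguments such as `schedule runs 50` and suggest `schedule runs --limit 50`, so that an empty array is not misread as "no historical runs". `schedule run <id> --wait-ms N` triggers that schedule and must return `newRunId` and `observeCommand` even if the wait times out; `schedule retry-run <failedRunId>` accepts only failed runs, looks up the `scheduleId` from the original run, re-triggers the same schedule, and outputs `originalRunId`, `scheduleId`, `newRunId`, and `observeCommand`. When the backend-core target container is missing, or only a verify-only container is observed, schedule/microservice commands must exit non-zero and return `failureKind=target-stack-not-running`, `runnerDisposition=infra-blocked`, `readOnlyCommands`, and `authorizationRequiredForRecovery`; Docker's `No such container` must not be treated as a successful empty history.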
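 - `codex deploy <commitId>` is the legacy Code Queue compatibility deployment entry point; it has been disabled to prevent the maintenance channel from connecting directly to D601 to deploy Code Queue. Current dev automation only runs the `ci run-dev-e2e` smoke test and provides no Code Queue CD; see `docs/reference/codex-deploy.md` for the detailed rules.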
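 - `agentrun queue|sessions` is the current commander entry point for new tasks and AgentRun session control. The UniDesk CLI runs it through the official `./scripts/agentrun --manager-url auto` in `/root/agentrun-v01` on G14: `queue commander` shows the commander queue, `queue submit --json-stdin` creates a new task, `queue dispatch <taskId> --json-stdin` dispatches it, and `queue read/cancel` mark and cancel queue tasks; `sessions trace/output/read/steer/cancel` read and control sessions that have already been created. For routine one-off JSON, prompt, and runner JSON input, prefer quoted heredoc/stdin; `--json-file`, `--prompt-file`, and `--runner-json-file` are only for reviewed, reusable, controlled files. The local bridge passes stdin straight through to the official AgentRun CLI without first writing a dump file; it is not a legacy Code Queue adapter, does not double-write, and does not migrate legacy history.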
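+- `agentrun control-plane ...` and `agentrun queue|runs|commands|runner|sessions ...` share the UniDesk SSH capture bridge. On the main server itself, the local backend-core broker can be used; environments such as Artificer or AgentRun runners that have no local Docker or `unidesk-backend-core` container automatically use the frontend `/ws/ssh` WebSocket backend and disclose the basis for that choice in `bridge.capture.backend`, `reason`, and `localBackendCore`. A local `unidesk-backend-core` is not an implicit prerequisite for runner environments; if the capture backend is unavailable, the error must be classified as `failureKind=bridge-execution-environment` and provide a controlled recovery entry point.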
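 - `codex submit/enqueue`, `codex steer`, `codex resume`, `codex queue create`, `codex queue merge`, `codex move`, the legacy Web submission form, legacy queue management, and legacy workdir management are frozen legacy Code Queue write entry points. The CLI must return `ok=false`, `frozen=true`, `degradedReason=legacy-code-queue-frozen`, and the AgentRun replacement command; legacy server-side API write endpoints must return 410. New tasks, steer, trace/output, read, and cancel go through AgentRun Queue/Sessions.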
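 - The legacy Code Queue keeps only historical archives, read-only troubleshooting, and stopping of leftover tasks. `codex task/tasks/output/read/unread/queues` continue to read legacy PostgreSQL history through the backend-core private proxy; `codex interrupt|cancel <taskId>` is only for stopping tasks left over on the legacy runtime. The legacy `steer-confirm` serves only as a historical trace confirmation query, not as a control entry point for new tasks.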
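 - `codex pr-preflight [--remote] [--push-dry-run --push-dry-run-ref refs/heads/probe/<name>] [--pr-create-dry-run --pr-create-dry-run-head <head>] [--issue N] [--full|--raw]` requests the D601 scheduler's `/api/runtime-preflight` through the stable `code-queue` proxy and is used for admission of PR-type task dispatch. The default output is a compact commander view that explicitly separates `schedulerPreflight` from `activeRunnerPrCapability` and includes `commands` and `disclosure`, so that scheduler auth gaps can be checked first, followed by the `gh auth status` and `gh pr create --dry-run` capability of the current runner/dev container; only `--full` or `--raw` expands the complete `preflight`, tools, agent port, Git worktree, GitHub egress, read-only repo/issue/PR probes, and raw observations. It reports only whether `GH_TOKEN`/`GITHUB_TOKEN` exist and their source keys, never their values. When auth-broker configuration is present, `tokenCoverage.source="auth-broker"` and `credentialSource="broker-issued-token"`, and a runner env token is not a precondition for success; when only an env token is present, `credentialSource="env-token"` and `authBroker.nextAction="use-env-token-until-auth-broker-live"`; when both are missing, the top level reports `ok=false`, `runnerDisposition=infra-blocked`, and `degradedReason=auth-broker-needed`, `tokenCoverage.missing` lists both `GH_TOKEN` and `GITHUB_TOKEN`, and the output includes `authBroker.source="broker/auth-broker-needed"` and `capability.source="missing-token"`. The scope of this `auth-missing` is `scheduler-runner-env`; it must not be simplified to "the current active runner/dev container cannot create PRs", and the default view must carry `scopeBoundary` and `activeRunnerPrCapability`. GitHub DNS/API connection failures should be classified as `failureKind=github-transient` and `degradedReason=github-dns-api-transient`, with `retryable=true`, `commanderAction=retry-backoff-or-keep-running-if-heartbeat-fresh`, and a bounded `githubTransient.failedProbes`; callers should retry with backoff and keep supervising while the task heartbeat/trace is fresh, rather than treating the failure as missing auth or a PR semantic failure. `prCapability` is the runner-facing contract summary and must include the target branch, the token/auth source, `systemGhBinaryRequiredForWrites=false`, availability of the UniDesk REST `bun scripts/cli.ts gh`, `writesRemote=false` for push dry-run and PR create dry-run, the expected PR handoff, the requirement for commander authorization to create a real PR, and the guarded `gh pr merge --dry-run` preflight path; a missing system `gh` binary is recorded only in `tools.systemGhBinary` and must not be misread as the UniDesk REST `gh` CLI being unavailable. In runner-like environments, `--remote` no longer depends on the local `unidesk-backend-core`, `unidesk-database`, or `baidu-netdisk-backend` containers existing; their absence is recorded only as local observation evidence. If the remote control plane is reachable, the result from the remote control plane is used; if it is unreachable, the command returns a structured `failureKind=control-plane-missing` / `degradedReason=remote-control-plane-unreachable` instead of treating the local `backend-core-container-missing` as the final blocker. `--pr-create-dry-run` does not POST to GitHub; it only proves that PR body generation inside the runner, `scripts/cli.ts gh pr create --dry-run`, and the branch argument shape work. Server-side creation permission is still determined by the token/auth broker, repo/issue/PR read, push dry-run, and the result of the real PR creation after final authorization.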
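
The `schedule runs` argument rule in the list above (reject a bare numeric positional, distinguish the global view from the per-schedule view) could be implemented roughly as follows. This is a sketch under assumptions: `parseScheduleRunsArgs` and the `ScheduleRunsQuery` shape are hypothetical and are not taken from the UniDesk CLI source; only the `scope`, `scheduleId`, and `--limit` semantics come from the documented contract.

```typescript
// Hypothetical result shape for illustration only.
type ScheduleRunsQuery =
  | { ok: true; scope: "global" | "schedule"; scheduleId: string | null; limit: number | null }
  | { ok: false; error: string; hint: string };

function parseScheduleRunsArgs(args: string[]): ScheduleRunsQuery {
  let scheduleId: string | null = null;
  let limit: number | null = null;
  for (let i = 0; i < args.length; i += 1) {
    const arg = args[i];
    if (arg === undefined) break;
    if (arg === "--limit") {
      const raw = args[i + 1] ?? "";
      if (!/^\d+$/.test(raw)) {
        return { ok: false, error: "--limit requires a non-negative integer", hint: "schedule runs --limit 50" };
      }
      limit = Number(raw);
      i += 1; // skip the value that was just consumed
    } else if (/^\d+$/.test(arg)) {
      // A bare number is ambiguous: it could be read as a schedule id and
      // silently yield an empty history, so reject it with a corrective hint.
      return { ok: false, error: `numeric positional argument "${arg}" is ambiguous`, hint: `schedule runs --limit ${arg}` };
    } else if (scheduleId === null) {
      scheduleId = arg;
    } else {
      return { ok: false, error: `unexpected argument "${arg}"`, hint: "schedule runs <scheduleId> --limit N" };
    }
  }
  return { ok: true, scope: scheduleId === null ? "global" : "schedule", scheduleId, limit };
}
```

Returning a structured error with a `hint` rather than throwing keeps the failure machine-readable, in line with the structured failure fields used elsewhere in this contract.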
@@ -85,6 +85,21 @@ assertCondition(
  "AgentRun control-plane status must degrade empty runtime JSON snippets instead of failing the whole status probe",
 );

+assertCondition(
+  agentRunSource.includes('type AgentRunBridgeCaptureBackend = "local-backend-core-broker" | "remote-frontend-websocket"')
+    && agentRunSource.includes('reason: "runner-environment"')
+    && agentRunSource.includes('degradedReason: "capture-backend-unavailable"')
+    && agentRunSource.includes('"agentrun-cli-returned-failure"')
+    && agentRunSource.includes('failureKind: "bridge-execution-environment"')
+    && agentRunSource.includes('key === "nextActions"'),
+  "AgentRun CLI bridge must use the remote frontend backend in runner/no-Docker environments and classify bridge failures separately",
+);
+
+assertCondition(
+  !agentRunSource.includes('degradedReason: "agentrun-cli-bridge-failed"'),
+  "AgentRun CLI bridge must not collapse official AgentRun failures into bridge failures",
+);
+
 console.log(JSON.stringify({
  ok: true,
  checks: [
@@ -96,5 +111,7 @@ console.log(JSON.stringify({
    "AgentRun command help presents heredoc/stdin before reusable file fallbacks",
    "global help indexes AgentRun v0.1 entrypoints",
    "AgentRun control-plane status degrades empty runtime JSON snippets",
+    "AgentRun CLI bridge selects remote frontend backend in runner/no-Docker environments",
+    "AgentRun CLI bridge keeps AgentRun failures distinct from bridge failures",
  ],
 }));
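
The contract test in this file checks the bridge source for required and forbidden string fragments. A minimal sketch of that pattern as a reusable helper is shown below; `SourceContract` and `contractViolations` are hypothetical names, and the real test uses its own `assertCondition` helper whose definition is not part of this diff.

```typescript
// Hypothetical helper: lists which source-level contract fragments are violated.
interface SourceContract {
  required: string[];  // fragments that must appear in the source text
  forbidden: string[]; // fragments that must not appear
}

function contractViolations(source: string, contract: SourceContract): string[] {
  const violations: string[] = [];
  for (const fragment of contract.required) {
    if (!source.includes(fragment)) violations.push(`missing: ${fragment}`);
  }
  for (const fragment of contract.forbidden) {
    if (source.includes(fragment)) violations.push(`forbidden: ${fragment}`);
  }
  return violations;
}

// Example contract built from two of the fragments asserted in this diff.
const bridgeContract: SourceContract = {
  required: ['failureKind: "bridge-execution-environment"'],
  forbidden: ['degradedReason: "agentrun-cli-bridge-failed"'],
};
```

String-level checks like this are cheap but brittle: a formatting change in the source can break them without any behavior change, so they work best as a guard alongside behavioral tests.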
@@ -1,6 +1,8 @@
 import { readFileSync } from "node:fs";
+import { spawnSync } from "node:child_process";
 import type { UniDeskConfig } from "./config";
 import { runSshCommandCapture, type SshCaptureResult } from "./ssh";
+import { runRemoteSshCommandCapture } from "./remote";
 import { startJob } from "./jobs";

 const g14SourceRoute = "G14:/root/agentrun-v01";
@@ -1608,9 +1610,9 @@ type AgentRunOfficialCliBridgeGroup = "queue" | "sessions" | "aipod-specs" | "ai
 async function runOfficialAgentRunCli(config: UniDeskConfig, group: AgentRunOfficialCliBridgeGroup, args: string[]): Promise<Record<string, unknown>> {
  const prepared = prepareOfficialAgentRunCliArgs([group, ...args]);
  const command = `agentrun ${prepared.args.join(" ")}`.trim();
-  const bridge = agentRunQueueBridgeMetadata(prepared.materializedFiles, prepared.stdinPayload);
  const script = officialAgentRunCliScript(prepared);
  const result = await capture(config, g14SourceRoute, ["script", "--", script]);
+  const bridge = agentRunQueueBridgeMetadata(prepared.materializedFiles, prepared.stdinPayload, captureBridgeMetadata(result));
  const payload = captureJsonPayload(result);
  if (result.exitCode === 0 && Object.keys(payload).length > 0) {
    return {
@@ -1628,10 +1630,14 @@ async function runOfficialAgentRunCli(config: UniDeskConfig, group: AgentRunOffi
      remote: compactCapture(result, { full: true, stdoutTailChars: 4000, stderrTailChars: 2000 }),
    };
  }
+  const bridgeFailureKind = bridgeExecutionFailureKind(result);
+  const agentrunFailureKind = stringOrNull(payload.failureKind) ?? stringOrNull(record(payload.error).failureKind);
  return {
    ok: false,
    command,
-    degradedReason: "agentrun-cli-bridge-failed",
+    degradedReason: bridgeFailureKind === null ? "agentrun-cli-returned-failure" : bridgeFailureKind.degradedReason,
+    ...(bridgeFailureKind === null ? {} : { failureKind: bridgeFailureKind.failureKind, recoveryActions: bridgeFailureKind.recoveryActions }),
+    ...(bridgeFailureKind === null && agentrunFailureKind !== null ? { failureKind: agentrunFailureKind } : {}),
    bridge,
    agentrun: payload,
    remote: compactCapture(result, { full: true, stdoutTailChars: 8000, stderrTailChars: 4000 }),
@@ -1668,7 +1674,7 @@ function rewriteOfficialCommandFieldValue(value: unknown): unknown {
 }

 function shouldRewriteOfficialCommandField(key: string): boolean {
-  return key === "pollCommands" || key === "drillDownCommands" || key === "recoveryActions" || key === "logPath";
+  return key === "pollCommands" || key === "drillDownCommands" || key === "recoveryActions" || key === "nextActions" || key === "logPath";
 }

 function rewriteOfficialCommandString(value: string): string {
@@ -1804,7 +1810,7 @@ function parentDir(pathValue: string): string {
  return index > 0 ? pathValue.slice(0, index) : ".";
 }

-function agentRunQueueBridgeMetadata(materializedFiles: AgentRunCliMaterializedFile[], stdinPayload: AgentRunCliForwardedStdin | null): Record<string, unknown> {
+function agentRunQueueBridgeMetadata(materializedFiles: AgentRunCliMaterializedFile[], stdinPayload: AgentRunCliForwardedStdin | null, captureBridge: Record<string, unknown> | null): Record<string, unknown> {
  return {
    route: g14SourceRoute,
    sourceWorktree: "/root/agentrun-v01",
@@ -1813,6 +1819,7 @@ function agentRunQueueBridgeMetadata(materializedFiles: AgentRunCliMaterializedF
    officialCli: "./scripts/agentrun",
    mode: "direct-official-cli",
    compatibility: "no-code-queue-adapter-no-double-write",
+    capture: captureBridge,
    stdinForwarded: stdinPayload ? {
      flag: stdinPayload.flag,
      source: stdinPayload.source,
@@ -1827,8 +1834,162 @@ function agentRunQueueBridgeMetadata(materializedFiles: AgentRunCliMaterializedF
  };
 }

-async function capture(config: UniDeskConfig, target: string, args: string[]): Promise<SshCaptureResult> {
-  return await runSshCommandCapture(config, target, args);
+type AgentRunBridgeCaptureBackend = "local-backend-core-broker" | "remote-frontend-websocket";
+
+interface LocalBackendCoreStatus {
+  dockerExecutable: boolean;
+  backendCoreContainer: boolean;
+  error: string | null;
+}
+
+interface AgentRunBridgeCapturePlan {
+  backend: AgentRunBridgeCaptureBackend;
+  route: string;
+  reason: string;
+  remoteHost: string | null;
+  localBackendCore: LocalBackendCoreStatus;
+}
+
+type AgentRunBridgeCaptureResult = SshCaptureResult & { bridgeExecution?: AgentRunBridgeCapturePlan };
+
+let localBackendCoreStatusCache: LocalBackendCoreStatus | null = null;
+
+async function capture(config: UniDeskConfig, target: string, args: string[]): Promise<AgentRunBridgeCaptureResult> {
+  const plan = agentRunBridgeCapturePlan(config, target);
+  if (plan.backend === "remote-frontend-websocket" && plan.remoteHost !== null) {
+    return await captureRemote(config, plan, target, args);
+  }
+  const local = attachBridgeExecution(await runSshCommandCapture(config, target, args), plan);
+  const fallbackHost = agentRunBridgeRemoteHost(config);
+  if (local.exitCode !== 0 && fallbackHost !== null && bridgeExecutionFailureKind(local)?.degradedReason === "capture-backend-unavailable") {
+    const fallbackPlan: AgentRunBridgeCapturePlan = {
+      ...plan,
+      backend: "remote-frontend-websocket",
+      remoteHost: fallbackHost,
+      reason: "local-capture-backend-unavailable",
+    };
+    return await captureRemote(config, fallbackPlan, target, args);
+  }
+  return local;
+}
+
+async function captureRemote(config: UniDeskConfig, plan: AgentRunBridgeCapturePlan, target: string, args: string[]): Promise<AgentRunBridgeCaptureResult> {
+  if (plan.remoteHost === null) return attachBridgeExecution(remoteBridgeCaptureFailure(new Error("remote host is not configured")), plan);
+  try {
+    return attachBridgeExecution(await runRemoteSshCommandCapture(config, plan.remoteHost, target, args), plan);
+  } catch (error) {
+    return attachBridgeExecution(remoteBridgeCaptureFailure(error), plan);
+  }
+}
+
+function remoteBridgeCaptureFailure(error: unknown): SshCaptureResult {
+  const message = error instanceof Error ? `${error.name}: ${error.message}` : String(error);
+  return {
+    exitCode: 255,
+    stdout: "",
+    stderr: `unidesk remote frontend ssh bridge failed: ${message}\n`,
+  };
+}
+
+function attachBridgeExecution(result: SshCaptureResult, plan: AgentRunBridgeCapturePlan): AgentRunBridgeCaptureResult {
+  return { ...result, bridgeExecution: plan };
+}
+
+function agentRunBridgeCapturePlan(config: UniDeskConfig, target: string, env: NodeJS.ProcessEnv = process.env): AgentRunBridgeCapturePlan {
+  const localBackendCore = detectLocalBackendCoreStatus();
+  const remoteHost = agentRunBridgeRemoteHost(config, env);
+  const runnerEnv = isAgentRunRunnerEnvironment(env);
+  if (runnerEnv && remoteHost !== null) {
+    return { backend: "remote-frontend-websocket", route: target, reason: "runner-environment", remoteHost, localBackendCore };
+  }
+  if (!localBackendCore.backendCoreContainer && remoteHost !== null) {
+    return { backend: "remote-frontend-websocket", route: target, reason: "local-backend-core-unavailable", remoteHost, localBackendCore };
+  }
+  return { backend: "local-backend-core-broker", route: target, reason: "main-server-local-backend-core", remoteHost: null, localBackendCore };
+}
+
+function detectLocalBackendCoreStatus(): LocalBackendCoreStatus {
+  if (localBackendCoreStatusCache !== null) return localBackendCoreStatusCache;
+  const result = spawnSync("docker", ["ps", "--format", "{{.Names}}"], { encoding: "utf8", timeout: 2000 });
+  if (result.error !== undefined) {
+    localBackendCoreStatusCache = {
+      dockerExecutable: false,
+      backendCoreContainer: false,
+      error: result.error.message,
+    };
+    return localBackendCoreStatusCache;
+  }
+  const output = `${result.stdout ?? ""}\n${result.stderr ?? ""}`.trim();
+  localBackendCoreStatusCache = {
+    dockerExecutable: result.status === 0,
+    backendCoreContainer: result.status === 0 && String(result.stdout ?? "").split(/\r?\n/u).includes("unidesk-backend-core"),
+    error: result.status === 0 ? null : output || `docker ps exited ${result.status ?? "unknown"}`,
+  };
+  return localBackendCoreStatusCache;
+}
+
+function isAgentRunRunnerEnvironment(env: NodeJS.ProcessEnv): boolean {
+  return Boolean(
+    env.AGENTRUN_BOOT_MODE
+      || env.AGENTRUN_RUN_ID
+      || env.AGENTRUN_K8S_JOB_NAME
+      || env.CODE_QUEUE_SERVICE_ROLE
+      || env.CODE_QUEUE_INSTANCE_ID
+      || env.KUBERNETES_SERVICE_HOST,
+  );
+}
+
+function agentRunBridgeRemoteHost(config: UniDeskConfig, env: NodeJS.ProcessEnv = process.env): string | null {
+  return normalizeRemoteHostHint(env.UNIDESK_MAIN_SERVER_IP)
+    ?? normalizeRemoteHostHint(env.UNIDESK_MAIN_SERVER_HOST)
+    ?? normalizeRemoteHostHint(env.CODE_QUEUE_DEV_CONTAINER_MASTER_HOST)
+    ?? normalizeRemoteHostHint(config.network.publicHost);
+}
+
+function normalizeRemoteHostHint(raw: string | undefined): string | null {
+  const value = raw?.trim() ?? "";
+  if (value.length === 0 || value === "localhost" || value === "127.0.0.1" || value === "::1") return null;
+  return value.replace(/\/+$/u, "");
+}
+
+function captureBridgeMetadata(result: SshCaptureResult): Record<string, unknown> | null {
+  const bridgeExecution = (result as AgentRunBridgeCaptureResult).bridgeExecution;
+  if (bridgeExecution === undefined) return null;
+  return {
+    backend: bridgeExecution.backend,
+    route: bridgeExecution.route,
+    reason: bridgeExecution.reason,
+    remoteHost: bridgeExecution.remoteHost,
+    localBackendCore: bridgeExecution.localBackendCore,
+  };
+}
+
+function bridgeExecutionFailureKind(result: SshCaptureResult): { degradedReason: string; failureKind: string; recoveryActions: string[] } | null {
+  if (result.exitCode === 0) return null;
+  const stderr = result.stderr;
+  if (/No such container: unidesk-backend-core|failed to start broker|Executable not found.*"docker"|docker: not found|Cannot connect to the Docker daemon/iu.test(stderr)) {
+    return {
+      degradedReason: "capture-backend-unavailable",
+      failureKind: "bridge-execution-environment",
+      recoveryActions: [
+        "Run the same command from a healthy UniDesk main-server CLI, or provide UNIDESK_MAIN_SERVER_IP/UNIDESK_MAIN_SERVER_HOST so the bridge can use the frontend WebSocket SSH backend.",
+        "For AgentRun runner jobs, request the unidesk-ssh tool credential SecretRef instead of embedding secret values in prompts or payloads.",
+        "After restoring the bridge backend, retry the same bun scripts/cli.ts agentrun ... command; do not use direct kubectl or GitHub API bypasses.",
+      ],
+    };
+  }
+  if (/frontend login failed|remote frontend ssh bridge|websocket error|timed out waiting for provider session|route-not-allowed/iu.test(stderr)) {
+    return {
+      degradedReason: "bridge-execution-environment-unavailable",
+      failureKind: "bridge-execution-environment",
+      recoveryActions: [
+        "Verify the UniDesk frontend/backend-core SSH bridge is reachable from this runner and that UNIDESK_MAIN_SERVER_IP or UNIDESK_MAIN_SERVER_HOST points at the main server.",
+        "Verify scoped UNIDESK_SSH_CLIENT_TOKEN route allowlist includes the requested AgentRun route, without printing token values.",
+        "Retry with the same bun scripts/cli.ts agentrun ... command after the controlled bridge path is restored.",
+      ],
+    };
+  }
+  return null;
 }

 function compactCapture(result: SshCaptureResult, options: { full?: boolean; stdoutTailChars?: number; stderrTailChars?: number } = {}): Record<string, unknown> {
@@ -1842,6 +2003,14 @@ function compactCapture(result: SshCaptureResult, options: { full?: boolean; std
    stdoutTailOmitted: !full && result.exitCode === 0,
    stderrTailOmitted: !full && result.exitCode === 0,
  };
+  const bridgeExecution = captureBridgeMetadata(result);
+  if (bridgeExecution !== null) payload.bridgeExecution = bridgeExecution;
+  const failureKind = bridgeExecutionFailureKind(result);
+  if (failureKind !== null) {
+    payload.degradedReason = failureKind.degradedReason;
+    payload.failureKind = failureKind.failureKind;
+    payload.recoveryActions = failureKind.recoveryActions;
+  }
  if (full || result.exitCode !== 0) {
    payload.stdoutTail = tail(result.stdout, stdoutTailChars);
    payload.stderrTail = tail(result.stderr, stderrTailChars);
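
The stderr-based classification added in this file can be exercised in isolation. The sketch below is a simplified stand-in, not the function from the diff: `classifyBridgeStderr` is a hypothetical name, the patterns are a subset of those in `bridgeExecutionFailureKind`, and recovery actions are omitted.

```typescript
interface BridgeFailure {
  degradedReason: "capture-backend-unavailable" | "bridge-execution-environment-unavailable";
  failureKind: "bridge-execution-environment";
}

// Subset of the patterns used in the diff; both are case-insensitive.
const localBackendPattern = /No such container: unidesk-backend-core|failed to start broker|docker: not found|Cannot connect to the Docker daemon/i;
const remoteBridgePattern = /frontend login failed|remote frontend ssh bridge|websocket error|route-not-allowed/i;

function classifyBridgeStderr(exitCode: number, stderr: string): BridgeFailure | null {
  if (exitCode === 0) return null;
  if (localBackendPattern.test(stderr)) {
    return { degradedReason: "capture-backend-unavailable", failureKind: "bridge-execution-environment" };
  }
  if (remoteBridgePattern.test(stderr)) {
    return { degradedReason: "bridge-execution-environment-unavailable", failureKind: "bridge-execution-environment" };
  }
  // A non-zero exit with no bridge signature is left to the caller to treat
  // as a failure reported by the official AgentRun CLI itself.
  return null;
}
```

Returning `null` for unrecognized failures is what keeps AgentRun errors from being relabeled as bridge failures: only stderr that matches a known bridge signature receives the `bridge-execution-environment` kind.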
@@ -535,6 +535,28 @@ function scopedSshFrontendSession(host: string, config: UniDeskConfig, token: st
  return { baseUrl: frontendBaseUrl(host, config), cookie: "", sshClientToken: token };
 }

+export async function runRemoteSshCommandCapture(
+  config: UniDeskConfig,
+  host: string,
+  target: string,
+  args: string[],
+  input?: string,
+  env: NodeJS.ProcessEnv = process.env,
+): Promise<SshCaptureResult> {
+  const token = sshClientTokenFromEnv(env);
+  const session = token === null
+    ? await loginFrontend(host, config)
+    : scopedSshFrontendSession(host, config, token);
+  const normalizedArgs = normalizeSshOperationArgs(args);
+  const invocation = parseSshInvocation(target, normalizedArgs);
+  const parsed = invocation.parsed;
+  if (parsed.remoteCommand === null) throw new Error(`remote ssh ${target} capture requires a non-interactive operation`);
+  const stdin = parsed.stdinPrefix !== undefined || parsed.stdinSuffix !== undefined
+    ? `${parsed.stdinPrefix ?? ""}${input ?? ""}${parsed.stdinSuffix ?? ""}`
+    : input;
+  return await runRemoteSshWebSocketCaptureRemoteCommand(session, invocation, parsed.remoteCommand, stdin);
+}
+
 async function frontendJson(session: FrontendSession, path: string, init?: RequestInit, timeoutMs = 8000, maxResponseBytes = 5_000_000): Promise<FetchJsonResult> {
  const headers = new Headers(init?.headers);
  headers.set("cookie", session.cookie);
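
The stdin handling in `runRemoteSshCommandCapture` wraps caller input with an optional prefix and suffix from the parsed invocation. A small sketch of that composition rule follows, assuming only the two optional fields shown; `StdinWrap` and `composeStdin` are hypothetical names introduced for illustration.

```typescript
// Hypothetical slice of the parsed invocation: only the fields this rule needs.
interface StdinWrap {
  stdinPrefix?: string;
  stdinSuffix?: string;
}

function composeStdin(parsed: StdinWrap, input?: string): string | undefined {
  // With no wrapper configured, the caller's input passes through unchanged,
  // including the "no stdin at all" case (undefined).
  if (parsed.stdinPrefix === undefined && parsed.stdinSuffix === undefined) return input;
  // Otherwise a string is always produced, with missing parts treated as empty.
  return `${parsed.stdinPrefix ?? ""}${input ?? ""}${parsed.stdinSuffix ?? ""}`;
}
```

The distinction between `undefined` and an empty string matters here: passing `undefined` through lets the transport skip sending stdin entirely, while any configured wrapper forces a concrete payload.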