docs: consolidate authoritative AgentRun SPEC references

2026-06-15 00:38:26 +00:00
parent af7baca595
commit e4bfaf1746
3 changed files with 12 additions and 12 deletions
@@ -1,6 +1,6 @@
 # AgentRun Development and Operations Reference

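-This document records only the UniDesk-side development and operations constraints for the standalone repository `pikasTech/agentrun`. AgentRun's architecture design, MVP scope, API contracts, runner/backend protocols, and runtime internal rules must be maintained in the AgentRun repository itself and must not be kept in UniDesk's long-term reference as a source of truth.
+This document records only the UniDesk-side development and operations constraints for the standalone repository `pikasTech/agentrun`. When AgentRun serves as the HWLAB Agent orchestration and execution infrastructure, the requirements specification body is managed by UniDesk OA, with [PJ2026-0102 Agent Orchestration](../../project-management/PJ2026-01/specs/PJ2026-0102-agent-orchestration.md) as the entry point. Inside the AgentRun repository, `docs/reference/spec-v01-*.md` keeps only cross-reference stubs pointing to the OA specs; implementation details, source organization, and repository-local run instructions are still maintained in the AgentRun repository itself.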

 ## Repository and Worktree

@@ -43,7 +43,7 @@ G14:/root/agentrun-v01/.worktree/{pr_branch}

 ## Document Commit Rules

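-AgentRun SPEC and long-term reference document changes do not create PRs. After local review, they must be committed and pushed directly to the corresponding target branch, for example `origin/v0.1`. Process plans, stage evidence, acceptance results, and blockers are written to the comment thread of the corresponding GitHub issue; a documentation PR must not be used in place of a direct commit.
+Changes to long-term references and `spec-v01-*` cross-reference stubs in the AgentRun repository do not create PRs. After local review, they must be committed and pushed directly to the corresponding target branch, for example `origin/v0.1`. Changes to the requirements specification body land in UniDesk OA under `project-management/PJ2026-01/specs/`; do not maintain a second copy of the body in the AgentRun repo. Process plans, stage evidence, acceptance results, and blockers are written to the comment thread of the corresponding GitHub issue; a documentation PR must not be used in place of a direct commit.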

 ## Deployment Targets

@@ -100,7 +100,7 @@ AgentRun resource/session client policy is also declared by `config/agentrun.yaml`. `

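 `cleanup-runs` is the AgentRun `v0.1` entry point for completed-state CI workspace retention. It cleans up only `agentrun-v01-ci-*` PipelineRuns in the `agentrun-ci` namespace that are older than `--min-age-minutes`, releasing temporary workspace PVCs through Tekton ownerRef. A dry-run must disclose the candidate PipelineRuns, owned PVCs, active mount protection, the actual estimated local-path bytes, and the confirm command. By default the most recently completed PipelineRun is protected, which preserves evidence of the current CI/CD state. `cleanup-released-pvs` is the second-pass reclamation entry point and handles only `Released` PVs matching `agentrun-ci`, `local-path`, and the `Delete` reclaim policy; it does not touch the `agentrun-v01` runtime namespace, business PVCs, Secrets, registry storage, or GitOps desired state. Disk governance and G14 safe-stop rules are in `docs/reference/gc.md`.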

-Any closeout involving AgentRun runner egress, `transientEnv`, or Secret non-disclosure must trigger an `agentrun-v01` runner Job with the real `create/apply/send` resource primitives, then inspect the runner job response, event/trace, and Kubernetes Pod spec through `describe runnerjob/...`, `events run/...`, `logs session/...`, or a compatibility bridge where necessary. Passing evidence should show whether the proxy env is present, whether `NO_PROXY` includes `hyueapi.com`/`.hyueapi.com`, whether `transientEnv` entries such as the short-lived `HWLAB_API_KEY` are injected through a per-job Secret's `valueFrom.secretKeyRef`, and that the response/event outputs only the env name, Secret metadata, and `valuesPrinted=false`. Secret values must not be output in issues, traces, or Pod spec summaries. The AgentRun internal SecretRef contract is governed by `docs/reference/spec-v01-secret-distribution.md` and `docs/reference/spec-v01-runtime-assembly.md` in the AgentRun repository; UniDesk records only the verification entry points and cross-repository attribution.
+Any closeout involving AgentRun runner egress, `transientEnv`, or Secret non-disclosure must trigger an `agentrun-v01` runner Job with the real `create/apply/send` resource primitives, then inspect the runner job response, event/trace, and Kubernetes Pod spec through `describe runnerjob/...`, `events run/...`, `logs session/...`, or a compatibility bridge where necessary. Passing evidence should show whether the proxy env is present, whether `NO_PROXY` includes `hyueapi.com`/`.hyueapi.com`, whether `transientEnv` entries such as the short-lived `HWLAB_API_KEY` are injected through a per-job Secret's `valueFrom.secretKeyRef`, and that the response/event outputs only the env name, Secret metadata, and `valuesPrinted=false`. Secret values must not be output in issues, traces, or Pod spec summaries. HWLAB-facing SecretRef and RuntimeAssembly requirements are governed by [Runtime Assembly](../../project-management/PJ2026-01/specs/PJ2026-010202-runtime-assembly.md) and [YAML Operations](../../project-management/PJ2026-01/specs/PJ2026-010603-yaml-first-ops.md); the AgentRun repository stubs only cross-reference these OA specs.

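 When verifying `codeload.github.com` through `g14-provider-egress-proxy.unidesk.svc.cluster.local:18789`, you must also confirm that the G14 runtime egress Service has a ready endpoint. If the Service/DNS exists but the Deployment is `0/1`, the Endpoint has only notReady addresses, or the Pod is in `ImagePullBackOff` or `ContainerStatusUnknown`, the problem is attributed to the UniDesk/G14 runtime egress infrastructure. A `connect refused` that occurs after the runner has already injected the proxy env must not be attributed to a failed AgentRun business fix, and an issue that requires "successful access to codeload through the controlled proxy" must not be closed.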

@@ -124,11 +124,11 @@ UniDesk must not serve as the source of truth for the following:
 - tenant policy model;
 - backend adapter design.

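-This content must be maintained in the AgentRun repository's own `AGENTS.md` and `docs/reference/`.
+This implementation-side content must be maintained in the AgentRun repository's own `AGENTS.md` and non-SPEC long-term references; the HWLAB-facing requirements specification body must be maintained in UniDesk OA.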

 ## Boundary Between AgentRun Queue and the Legacy Code Queue

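-The AgentRun `v0.1` commander task surface has completed real runtime acceptance under AgentRun issue #105 and can serve as the AgentRun-side standard path for new task dispatch, commander queue observation, events/logs/result, steer/send, ack, and cancel. For long-term use, the AgentRun repository's own SPEC remains the source of truth for capabilities; UniDesk records only that this path has been verified on the G14 `agentrun-v01` runtime with the `hy` profile + `gpt-5.5`.
+The AgentRun `v0.1` commander task surface has completed real runtime acceptance under AgentRun issue #105 and can serve as the AgentRun-side standard path for new task dispatch, commander queue observation, events/logs/result, steer/send, ack, and cancel. Long-term capability specifications are governed by UniDesk OA's [Queue Session](../../project-management/PJ2026-01/specs/PJ2026-010203-queue-session.md) and [AgentRun Core](../../project-management/PJ2026-01/specs/PJ2026-010201-agentrun-core.md); UniDesk records only that this path has been verified on the G14 `agentrun-v01` runtime with the `hy` profile + `gpt-5.5`.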

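 The UniDesk commander new-task entry point always uses the `bun scripts/cli.ts agentrun get|describe|events|logs|result|ack|cancel|dispatch|create|apply|steer|send` resource primitives. This entry point is a render-only client: the UniDesk client keeps k8s-style command parsing, human tables, lifecycle summaries, next-step commands, pagination, the stable `-o json|yaml` client schema, and error display; the AgentRun server provides only a stable RESTful API, authentication, and business facts, and does not host UniDesk CLI rendering. For routine dispatch, prefer `agentrun create task --aipod Artificer --prompt-stdin` or the quoted YAML/JSON heredoc/stdin form of `agentrun apply -f -`; dispatch tasks that have been created but not yet run with `agentrun dispatch task/<taskId>`; `--json-file`, `--prompt-file`, and `--runner-json-file` are only client input sources, intended for reviewed, reusable, controlled files. UniDesk does not implement the AgentRun queue protocol and does not double-write tasks back to the legacy Code Queue.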

@@ -136,7 +136,7 @@ The UniDesk commander new-task entry point always uses `bun scripts/cli.ts agentrun get|de

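 The AgentRun public HTTPS entry point, FRP/Caddy edge, direct REST base URL, and authentication source are all declared in UniDesk `config/agentrun.yaml`. The YAML-only lane does not allow writing these deployment choices back into `deploy/deploy.json` on the AgentRun source branch; the AgentRun source repo keeps only application code, build inputs, and AgentRun's own contracts. `bun scripts/cli.ts agentrun control-plane expose --confirm` is responsible only for adding the edge-side allow port and Caddy site according to the UniDesk YAML; it does not create an Ingress, NodePort, LoadBalancer, hostPort, or HWLAB forwarding layer in the AgentRun k3s.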

-If an AgentRun Queue task needs to call a UniDesk maintenance bridge, such as `trans` / `unidesk-ssh`, the long-term contract is governed by `docs/reference/spec-v01-runtime-assembly.md` and `docs/reference/spec-v01-secret-distribution.md` in the AgentRun repository: the caller requests the `UNIDESK_SSH_CLIENT_TOKEN` SecretRef through `executionPolicy.secretScope.toolCredentials[].tool=unidesk-ssh`; non-sensitive endpoints are provided explicitly by the runner-job `transientEnv` or filled in automatically from manager-controlled defaults. When the UniDesk bridge submits a Queue payload, it must not carry the token in the prompt, payload, or `transientEnv`, and must not use the HWLAB runtime Web entry point to impersonate the UniDesk frontend. If the dispatcher has correctly requested `unidesk-ssh` but the trace's `runner-job-created.transientEnv.names` lacks `UNIDESK_MAIN_SERVER_IP`, `UNIDESK_MAIN_SERVER_HOST`, or `UNIDESK_FRONTEND_URL`, attribute it to an AgentRun assembly problem; if the endpoint env is present but the route is denied or times out, investigate the UniDesk frontend/token scope or the provider session next.
+If an AgentRun Queue task needs to call a UniDesk maintenance bridge, such as `trans` / `unidesk-ssh`, the long-term contract is governed by UniDesk OA's [Runtime Assembly](../../project-management/PJ2026-01/specs/PJ2026-010202-runtime-assembly.md) and [YAML Operations](../../project-management/PJ2026-01/specs/PJ2026-010603-yaml-first-ops.md): the caller requests the `UNIDESK_SSH_CLIENT_TOKEN` SecretRef through `executionPolicy.secretScope.toolCredentials[].tool=unidesk-ssh`; non-sensitive endpoints are provided explicitly by the runner-job `transientEnv` or filled in automatically from manager-controlled defaults. When the UniDesk bridge submits a Queue payload, it must not carry the token in the prompt, payload, or `transientEnv`, and must not use the HWLAB runtime Web entry point to impersonate the UniDesk frontend. If the dispatcher has correctly requested `unidesk-ssh` but the trace's `runner-job-created.transientEnv.names` lacks `UNIDESK_MAIN_SERVER_IP`, `UNIDESK_MAIN_SERVER_HOST`, or `UNIDESK_FRONTEND_URL`, attribute it to an AgentRun assembly problem; if the endpoint env is present but the route is denied or times out, investigate the UniDesk frontend/token scope or the provider session next.

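 The legacy UniDesk Code Queue keeps only historical archives, read-only troubleshooting, and the entry point for stopping leftover legacy tasks. `codex submit/enqueue`, `codex steer`, `codex resume`, `codex queue create/merge`, `codex move`, the legacy Web submission form, legacy queue management, and legacy workdir management must all return a frozen status or be disabled; `codex task/tasks/output/read/unread/queues` may continue to read history, and `codex interrupt|cancel` is used only to stop leftover legacy tasks. Legacy Code Queue history is not migrated to AgentRun, and no adapter, legacy mode, fallback, or double-write path is provided.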

@@ -144,13 +144,13 @@ If an AgentRun Queue task needs to call a UniDesk maintenance bridge, such as `trans` / `un

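 When HWLAB integrates with AgentRun, problem ownership must first be determined from the public contract and runtime evidence before changes are made in the corresponding repository. Whoever owns the missing capability, incorrect semantics, or unfixed behavior makes the change. To keep the current integration moving, it is not permitted to accommodate the gap on the other side, fabricate semantics, substitute added observability for an implementation, or present a missing capability as complete.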

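-AgentRun owns the shared Agent execution infrastructure itself, including the run/command/runner-job lifecycle, bundle materialization, cancel, trace/result primitives, backend adapter event semantics, runner environment propagation, CLI result queries, and the capabilities already committed in the SPEC. If these capabilities are missing or behave incorrectly, they must be completed in the SPEC, source code, unit/self-tests, CI/CD, and `agentrun-v01` runtime of `pikasTech/agentrun`, after which HWLAB consumes the explicit contract through its adapter; HWLAB should not infer or fabricate, in the rendering layer, adapter layer, or prompt, facts that AgentRun did not emit.
+AgentRun owns the shared Agent execution infrastructure itself, including the run/command/runner-job lifecycle, bundle materialization, cancel, trace/result primitives, backend adapter event semantics, runner environment propagation, CLI result queries, and the capabilities already committed in the OA specs. If these capabilities are missing or behave incorrectly, first return to the UniDesk OA specs to confirm the requirement, then complete them in the source code, self-tests, CI/CD, and `agentrun-v01` runtime of `pikasTech/agentrun`; HWLAB should not infer or fabricate, in the rendering layer, adapter layer, or prompt, facts that AgentRun did not emit.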

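 HWLAB owns its own product and integration layer, including user authentication, the external Cloud Web/CLI API, conversation/session ownership, frontend display, device-pod business authorization, the HWLAB-to-AgentRun adapter mapping, and internal call switching that does not change the external API. If AgentRun already outputs correct semantics per the contract while HWLAB's consumption, mapping, rendering, or business path still has problems, the fix must be made in `pikasTech/HWLAB`; AgentRun must not be asked to add temporary compatibility for HWLAB's private UI or business model.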

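 Cross-repository issues and PRs must explicitly state ownership, the contract basis, and the verification entry point. When both sides need to cooperate, first complete the capability on the side that owns the public contract, then make the minimal adaptation on the consumer side; using dual paths, legacy mode, feature flags, fallbacks, or extra noisy observability to bypass a real gap over the long term is not allowed.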

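-A canary launched directly through the AgentRun manager, `dispatchHwlabAgentRun()`, or a hand-written runner job proves only that the AgentRun infrastructure and credential projection themselves work; it does not prove that the HWLAB Cloud Web/Cloud API product entry point requests these capabilities correctly. Issues involving the Cloud Web Workbench, user sessions, conversation/session/thread, AgentRun runtime assembly, or business authorization must be verified through HWLAB's original Web dispatcher entry point, or through a CLI that calls the same dispatcher. The current authority for HWLAB v0.2 to AgentRun resource assembly is HWLAB `docs/reference/agentrun-code-agent-dispatch.md` and AgentRun `docs/reference/spec-v01-runtime-assembly.md`: `ResourceBundleRef.kind="gitbundle"` assembles `tools/` and `.agents/skills` through `bundles[]`, and the legacy `toolAliases` / `skillRefs` / `workspaceFiles` are no longer valid integration points. If the consumer-side Web dispatcher does not pass the `gitbundle`, tool credential, or transient env per this contract, attribute it to the HWLAB integration layer; if the dispatcher requested correctly but the AgentRun runner did not assemble them, attribute it to the AgentRun execution infrastructure.
+A canary launched directly through the AgentRun manager, `dispatchHwlabAgentRun()`, or a hand-written runner job proves only that the AgentRun infrastructure and credential projection themselves work; it does not prove that the HWLAB Cloud Web/Cloud API product entry point requests these capabilities correctly. Issues involving the Cloud Web Workbench, user sessions, conversation/session/thread, AgentRun runtime assembly, or business authorization must be verified through HWLAB's original Web dispatcher entry point, or through a CLI that calls the same dispatcher. The current requirements authority for HWLAB v0.2 to AgentRun resource assembly is UniDesk OA's [Runtime Assembly](../../project-management/PJ2026-01/specs/PJ2026-010202-runtime-assembly.md) and [HWLAB Integration](../../project-management/PJ2026-01/specs/PJ2026-010205-hwlab-dispatch.md): `ResourceBundleRef.kind="gitbundle"` assembles `tools/` and `.agents/skills` through `bundles[]`, and the legacy `toolAliases` / `skillRefs` / `workspaceFiles` are no longer valid integration points. If the consumer-side Web dispatcher does not pass the `gitbundle`, tool credential, or transient env per this contract, attribute it to the HWLAB integration layer; if the dispatcher requested correctly but the AgentRun runner did not assemble them, attribute it to the AgentRun execution infrastructure.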

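 The `gitbundle` checkout authority for HWLAB and UniDesk/Artificer is the repo URL + workspace ref, not the cloud-api artifact revision, an AipodSpec mirror switch, or the runtime prompt. `ResourceBundleRef` / AipodSpec must continue to declare a GitHub repo URL without plaintext credentials; the Git mirror is a G14/AgentRun infrastructure capability, and the runner automatically rewrites the GitHub URL to the controlled mirror read URL during materialization. `gitMirror`, a mirror base URL, or a direct/mirror branching switch must not be declared in the AipodSpec, Queue task, prompt, or business adapter. After materialization, the AgentRun runner must record the original `repoUrl`, the actual `fetchRepoUrl`, `mirrorUsed`, `mirrorBaseUrl`, the requested ref/commit, and the actual `commitId`. The devops-infra mirror cache must cover the bundle repos commonly used by Artificer and HWLAB; a missing cache is an infrastructure gap and must not be bypassed by having the AipodSpec connect directly to GitHub. A `commitId` injected by cloud-api, CI/CD, or rollout may serve only as a requested hint or as input to an explicit pin, and must not be the default materialization source. When closing a related issue, the evidence must show `repoUrl`, `requestedRef`, the actual `commitId`, and a summary of `bundles/tools/promptRefs/skillDirs`; if the actual `commitId` still equals the old cloud-api rollout commit and is not an explicit pin, continue to attribute it to an AgentRun bundle materialization problem.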

@@ -170,7 +170,7 @@ When the Codex app-server/provider returns a tool-call argument JSON error, AgentRun should

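 The `finalResponse` of the AgentRun `command-result` / result API must come from the latest terminal assistant output of the current command and must not fall back to a stale response after a long trace, a steer, or multi-command queries. When the result API is found to be inconsistent with raw events, trace rows, or the terminal command sequence, closing a HWLAB/CaseRun problem must not rely on `command-result.finalResponse` alone; the decision should be based jointly on the AgentRun terminal status, the current command id, the last assistant output in the raw event/trace, and hardware evidence, and the stale result should be tracked as an AgentRun visibility/result contract problem.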

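-AgentRun result/session visibility must judge the running target command separately from subsequent steer commands. When investigating a stuck active turn, recovery, or closeout, read `liveness` from the target command's result/session status first, and use `liveness.phase` to distinguish `waiting-runner`, `waiting-model`, `waiting-tool`, `idle-after-tool`, `transport-disconnected`, `runner-heartbeat-stale`, and `terminal`; it is forbidden to infer that a turn has recovered or failed solely from a long absence of new events, an outer timeout, or the runner having reconnected. `steerDelivery` describes only the ack, forward, and backend accept status of the steer RPC on the runner/app-server link; `steer completed` cannot substitute for the target command's terminal state, nor serve as evidence that the target turn has resumed output. When closing a HWLAB/CaseRun problem, cite the target command id, the `liveness` of the target result/session, the raw trace/terminal command sequence, and the original entry-point evidence together; the field contract is governed by the AgentRun repository v0.1 spec, and UniDesk records only cross-repository attribution and acceptance criteria.
+AgentRun result/session visibility must judge the running target command separately from subsequent steer commands. When investigating a stuck active turn, recovery, or closeout, read `liveness` from the target command's result/session status first, and use `liveness.phase` to distinguish `waiting-runner`, `waiting-model`, `waiting-tool`, `idle-after-tool`, `transport-disconnected`, `runner-heartbeat-stale`, and `terminal`; it is forbidden to infer that a turn has recovered or failed solely from a long absence of new events, an outer timeout, or the runner having reconnected. `steerDelivery` describes only the ack, forward, and backend accept status of the steer RPC on the runner/app-server link; `steer completed` cannot substitute for the target command's terminal state, nor serve as evidence that the target turn has resumed output. When closing a HWLAB/CaseRun problem, cite the target command id, the `liveness` of the target result/session, the raw trace/terminal command sequence, and the original entry-point evidence together; field requirements are governed by UniDesk OA's [AgentRun Core](../../project-management/PJ2026-01/specs/PJ2026-010201-agentrun-core.md) and [Queue Session](../../project-management/PJ2026-01/specs/PJ2026-010203-queue-session.md), and UniDesk `docs/reference` records only cross-repository attribution and acceptance criteria.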

 ## Chinese Language Rules