docs: record agentrun egress closeout rules

Codex
2026-06-09 15:43:27 +00:00
parent c61a9af4f0
commit d8b4481cc9
3 changed files with 9 additions and 1 deletions
+3 -1
@@ -206,9 +206,11 @@ bun scripts/cli.ts agentrun v01 control-plane cleanup-released-pvs \
- `status`: read-only summary of the source commit, PipelineRun, Argo, manager image, git mirror and the `aligned` verdict
- `trigger-current`: fast-forward `G14:/root/agentrun-v01` → mirror sync → create the `agentrun-v01-ci-<short12>` PipelineRun
- `refresh`: Argo hard refresh (does not patch runtime workloads)
- `cleanup-runs`: clean up completed PipelineRuns and temporary PVCs
- `cleanup-runs`: clean up completed PipelineRuns and temporary PVCs in `agentrun-ci`; does not clean up `agentrun-v01` runtime runner Jobs/Pods/Secrets
- `cleanup-released-pvs`: reclaim Released PVs
The key fields of the compact JSON from AgentRun `control-plane status` are at `.data.sourceCommit`, `.data.expectedPipelineRun`, `.data.runtimeAlignment`, `.data.gitMirror.summary` and similar paths; do not assume `.data.status` exists. After triggering a deployment, if GitOps has already been promoted but the git mirror reports `pendingFlush=true`, first run `bun scripts/cli.ts agentrun v01 git-mirror flush --confirm --wait`, then `control-plane refresh --confirm`, and finally use `control-plane status --full` to prove `runtimeAlignment.localHeadMatchesOrigin=true`, `syncedToGitopsLatest=true` and `managerSourceMatchesExpected=true`.
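The alignment check and the flush-then-refresh order described above can be sketched as follows. This is a minimal illustration, not the CLI's implementation: the `ControlPlaneStatus` interface and both function names are hypothetical, and the location of `pendingFlush` under `.data.gitMirror.summary` is an assumption. Only the `runtimeAlignment` field names and the three commands come from the text.

```typescript
// Hypothetical shape; only the runtimeAlignment field names are taken from the docs.
interface ControlPlaneStatus {
  data: {
    sourceCommit?: string;
    runtimeAlignment?: {
      localHeadMatchesOrigin?: boolean;
      syncedToGitopsLatest?: boolean;
      managerSourceMatchesExpected?: boolean;
    };
    // Assumption: pendingFlush is reported under gitMirror.summary.
    gitMirror?: { summary?: { pendingFlush?: boolean } };
  };
}

const CLI = "bun scripts/cli.ts agentrun v01";

// True only when all three alignment flags are explicitly true.
// Note that there is deliberately no lookup of `.data.status`.
function isAligned(status: ControlPlaneStatus): boolean {
  const a = status.data.runtimeAlignment;
  return (
    a !== undefined &&
    a.localHeadMatchesOrigin === true &&
    a.syncedToGitopsLatest === true &&
    a.managerSourceMatchesExpected === true
  );
}

// Returns the documented command order when the mirror still has a pending flush,
// and an empty list otherwise.
function recoverySteps(status: ControlPlaneStatus): string[] {
  const summary = status.data.gitMirror ? status.data.gitMirror.summary : undefined;
  if (!summary || summary.pendingFlush !== true) {
    return [];
  }
  return [
    `${CLI} git-mirror flush --confirm --wait`,
    `${CLI} control-plane refresh --confirm`,
    `${CLI} control-plane status --full`,
  ];
}
```

A caller would run the returned commands in order and then re-evaluate `isAligned` on the output of the final `status --full` call.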
## AgentRun v0.1 Git Mirror
```bash
+4
@@ -81,6 +81,10 @@ bun scripts/cli.ts agentrun v01 control-plane cleanup-released-pvs --limit 200 -
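`cleanup-runs` is the completed-state CI workspace retention entry point for AgentRun `v0.1`. It only cleans up `agentrun-v01-ci-*` PipelineRuns in the `agentrun-ci` namespace that are older than `--min-age-minutes`, and releases the temporary workspace PVCs through Tekton ownerRefs. A dry-run must disclose the candidate PipelineRuns, owned PVCs, active-mount protection, the actual estimated local-path bytes and the confirm command. By default the most recently completed PipelineRun is protected, preserving evidence of the current CI/CD state. `cleanup-released-pvs` is the second-pass reclamation entry point. It only handles `Released` PVs that belong to `agentrun-ci`, use `local-path` and have the `Delete` reclaim policy; it does not touch the `agentrun-v01` runtime namespace, business PVCs, Secrets, registry storage or GitOps desired state. For disk governance and G14 safe-stop rules, see `docs/reference/gc.md`.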
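The candidate selection described above (namespace and name filter, minimum age, protection of the latest completed run) can be sketched like this. It is an illustrative sketch only: `PipelineRunInfo` and `selectCleanupCandidates` are hypothetical names, and measuring age from the completion time is an assumption, since the text does not say which timestamp `--min-age-minutes` is compared against.

```typescript
// Hypothetical, simplified view of a PipelineRun for illustration.
interface PipelineRunInfo {
  name: string;
  namespace: string;
  completedAt: Date | null; // null while the run is still active
}

const CI_NAMESPACE = "agentrun-ci";
const RUN_PREFIX = "agentrun-v01-ci-";

function selectCleanupCandidates(
  runs: PipelineRunInfo[],
  minAgeMinutes: number,
  now: Date,
): PipelineRunInfo[] {
  // Keep only completed agentrun-v01-ci-* runs in the CI namespace.
  const completed: { run: PipelineRunInfo; doneMs: number }[] = [];
  for (const run of runs) {
    if (run.namespace === CI_NAMESPACE && run.name.startsWith(RUN_PREFIX) && run.completedAt !== null) {
      completed.push({ run, doneMs: run.completedAt.getTime() });
    }
  }
  // Newest first, so index 0 is the latest completed run, which is always protected.
  completed.sort((a, b) => b.doneMs - a.doneMs);
  // Assumption: age is measured from completion time.
  const cutoffMs = now.getTime() - minAgeMinutes * 60000;
  const candidates: PipelineRunInfo[] = [];
  for (let i = 1; i < completed.length; i++) {
    if (completed[i].doneMs <= cutoffMs) {
      candidates.push(completed[i].run);
    }
  }
  return candidates;
}
```

Protecting index 0 after sorting keeps one completed run as current-state evidence even when every run exceeds the age threshold.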
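Any closeout involving AgentRun runner egress, `transientEnv` or Secret non-disclosure must create an `agentrun-v01` runner Job through a real `queue dispatch`, `sessions turn` or `runner-jobs` path, then inspect the runner job response, event/trace and the Kubernetes Pod spec. Passing evidence should show whether proxy env is present, whether `NO_PROXY` contains `hyueapi.com`/`.hyueapi.com`, whether the short-lived `HWLAB_API_KEY` or `transientEnv` is injected through `valueFrom.secretKeyRef` of a per-job Secret, and that the response/event outputs only env names, Secret metadata and `valuesPrinted=false`. Secret values must not appear in issues, traces or Pod spec summaries. The AgentRun-internal SecretRef contract is authoritatively defined by `docs/reference/spec-v01-secret-distribution.md` and `docs/reference/spec-v01-runtime-assembly.md` in the AgentRun repository; UniDesk only records the verification entry points and cross-repository attribution.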
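One way to turn a runner Pod's container `env` list into evidence that never contains a Secret value is sketched below. The `EnvVar` shape mirrors the Kubernetes container env fields (`name`, `value`, `valueFrom.secretKeyRef`); `EnvEvidence`, `summarizeRunnerEnv` and the `sensitiveNames` parameter are hypothetical names introduced for illustration and are not part of AgentRun.

```typescript
// Subset of the Kubernetes EnvVar fields relevant to this check.
interface EnvVar {
  name: string;
  value?: string;
  valueFrom?: { secretKeyRef?: { name: string; key: string } };
}

// Hypothetical evidence record: names and Secret metadata only, never values.
interface EnvEvidence {
  proxyEnvNames: string[];
  noProxyHasBareDomain: boolean;
  noProxyHasDotDomain: boolean;
  secretRefs: { envName: string; secretName: string; secretKey: string }[];
  literalSensitiveEnvNames: string[]; // non-empty means a value was inlined in the Pod spec
  valuesPrinted: false;
}

const PROXY_ENV_NAMES = ["HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY"];

function summarizeRunnerEnv(env: EnvVar[], sensitiveNames: string[]): EnvEvidence {
  const evidence: EnvEvidence = {
    proxyEnvNames: [],
    noProxyHasBareDomain: false,
    noProxyHasDotDomain: false,
    secretRefs: [],
    literalSensitiveEnvNames: [],
    valuesPrinted: false,
  };
  for (const e of env) {
    const upper = e.name.toUpperCase();
    if (PROXY_ENV_NAMES.indexOf(upper) !== -1) {
      evidence.proxyEnvNames.push(e.name);
    }
    if (upper === "NO_PROXY" && e.value !== undefined) {
      // NO_PROXY is not a secret, so its entries can be inspected directly.
      const entries = e.value.split(",").map((s) => s.trim());
      evidence.noProxyHasBareDomain = entries.indexOf("hyueapi.com") !== -1;
      evidence.noProxyHasDotDomain = entries.indexOf(".hyueapi.com") !== -1;
    }
    if (sensitiveNames.indexOf(e.name) !== -1) {
      const ref = e.valueFrom ? e.valueFrom.secretKeyRef : undefined;
      if (ref) {
        evidence.secretRefs.push({ envName: e.name, secretName: ref.name, secretKey: ref.key });
      } else if (e.value !== undefined) {
        // Record the name only; the literal value is never copied into the evidence.
        evidence.literalSensitiveEnvNames.push(e.name);
      }
    }
  }
  return evidence;
}
```

A caller would pass `HWLAB_API_KEY` plus the `transientEnv` names as `sensitiveNames` and treat a non-empty `literalSensitiveEnvNames` as a failed closeout.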
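When verifying `codeload.github.com` through `g14-provider-egress-proxy.unidesk.svc.cluster.local:18789`, you must also confirm that the G14 runtime egress Service has a ready endpoint. If the Service/DNS exists but the Deployment is `0/1`, the Endpoint has only notReady addresses, or the Pod is in `ImagePullBackOff` or `ContainerStatusUnknown`, the problem is attributed to UniDesk/G14 runtime egress infrastructure. A `connect refused` that occurs after the runner has been injected with proxy env must not be attributed to a failed AgentRun business fix, and an issue requiring "successful access to codeload through the controlled proxy" must not be closed.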
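The attribution rule above can be expressed as a small decision function. This is a sketch under stated assumptions: `EgressServiceState`, the attribution labels and both function names are hypothetical, and the inputs are assumed to have been collected beforehand from the Deployment, Endpoints and Pod status.

```typescript
// Hypothetical snapshot of the egress Service, gathered by the operator beforehand.
interface EgressServiceState {
  deploymentReadyReplicas: number;
  readyEndpointAddresses: number;
  notReadyEndpointAddresses: number;
  podStatusReasons: string[]; // e.g. "ImagePullBackOff", "ContainerStatusUnknown"
}

type Attribution = "unidesk-g14-runtime-egress-infrastructure" | "needs-further-investigation";

const INFRA_REASONS = ["ImagePullBackOff", "ContainerStatusUnknown"];

// Classifies a `connect refused` seen by a runner that already has proxy env injected.
function attributeConnectRefused(state: EgressServiceState): Attribution {
  const hasInfraReason = state.podStatusReasons.some((r) => INFRA_REASONS.indexOf(r) !== -1);
  if (state.deploymentReadyReplicas === 0 || state.readyEndpointAddresses === 0 || hasInfraReason) {
    return "unidesk-g14-runtime-egress-infrastructure";
  }
  // The egress path looks healthy, so the failure needs a closer look elsewhere.
  return "needs-further-investigation";
}

// An issue requiring codeload access through the controlled proxy may only be
// closed when the proxy is ready AND the fetch through it actually succeeded.
function canCloseCodeloadIssue(state: EgressServiceState, codeloadFetchSucceeded: boolean): boolean {
  return state.readyEndpointAddresses > 0 && codeloadFetchSucceeded;
}
```

Keeping the close decision separate from the attribution makes it explicit that a correct attribution alone never justifies closing the issue.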
## UniDesk Boundary
UniDesk is AgentRun's integrated distributed development and operations center. UniDesk may record:
+2
@@ -239,6 +239,8 @@ The backup proxy uses `HTTP_PROXY=http://127.0.0.1:11809`, `HTTPS_PROXY=http://1
This proxy is not a replacement for UniDesk runtime egress. k3s workloads such as Code Queue must still use the cataloged `g14-provider-egress-proxy` Kubernetes Service and `g14-tcp-egress-gateway` for normal runtime access to PostgreSQL, OA Event Flow and external APIs. The node-local VPN proxy is allowed only for G14 host-side bootstrap, image build, cache prewarm or recovery steps, and those steps should record the proxy choice in issue or deployment evidence.
Runtime egress readiness is not proven by Service DNS alone. Before closing HWLAB, AgentRun, Code Queue, CI/CD or user-service issues that require GitHub/codeload/npm/pip/API access through `g14-provider-egress-proxy`, validate that the Deployment is Ready and the Service has at least one ready endpoint. If the Service resolves but `curl` through `http://g14-provider-egress-proxy.unidesk.svc.cluster.local:18789` fails immediately with connection refused, inspect `unidesk/g14-provider-egress-proxy` and `unidesk/g14-tcp-egress-gateway` pods for `ImagePullBackOff`, `ContainerStatusUnknown` or notReady endpoints. A pod trying to pull `unidesk-code-queue:g14` from Docker Hub as `docker.io/library/unidesk-code-queue:g14` is an invalid runtime egress deployment state; restore the controlled image source/import path instead of redirecting long-lived workload proxy env to the node-local bootstrap proxy.
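The invalid state described above follows from how container runtimes expand short image names: a reference with no registry host is treated as a Docker Hub image, and a single-component name additionally gets the `library/` namespace. The sketch below reproduces that rule in simplified form to flag such references. `normalizeImageReference` and `pullsFromDockerHub` are hypothetical helper names, the registry host in the usage note is a placeholder, and the logic ignores digests, default tags and the further normalization of explicit `docker.io/` references.

```typescript
// Simplified version of the familiar-name expansion used for container image references.
function normalizeImageReference(ref: string): string {
  const slash = ref.indexOf("/");
  const first = slash === -1 ? "" : ref.slice(0, slash);
  // The first path component counts as a registry host only if it looks like one.
  const hasRegistry =
    slash !== -1 && (first.indexOf(".") !== -1 || first.indexOf(":") !== -1 || first === "localhost");
  if (hasRegistry) {
    return ref;
  }
  // No registry host: Docker Hub is assumed; single-component names go under library/.
  return slash === -1 ? `docker.io/library/${ref}` : `docker.io/${ref}`;
}

// True when the runtime would try to pull the image from Docker Hub.
function pullsFromDockerHub(ref: string): boolean {
  return normalizeImageReference(ref).startsWith("docker.io/");
}
```

For example, `normalizeImageReference("unidesk-code-queue:g14")` yields `docker.io/library/unidesk-code-queue:g14`, which matches the failing pull named above, while a reference that carries an explicit registry host such as the placeholder `registry.example.internal:5000/unidesk-code-queue:g14` is left unchanged.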
## v0.2 device-pod cloud-api architecture
`v0.2` device-pod integration is the cloud-api → executor → D601 Windows `device-host-cli.mjs` chain under `internal/cloud/access-control.ts`, `cmd/hwlab-device-pod/main.ts` and the host-side `F:\Work\ConStart\tools\device-host-cli.mjs`. PR #765 (selector cheat sheet + fail-fast) and PR #778 (output.text propagation + evidence selector + read-only sub-action `--reason` exemption) are the two anchor PRs; PR #779 tracks the still-open host-side ops work. Earlier work used raw `MUTATING_INTENTS.has(intent) && !reason` and a single-pass `textOr(output.text, …)` extractor; both are obsolete and must not be re-introduced.