docs: record D601 git mirror flush recovery
@@ -130,6 +130,8 @@ If PipelineRun `gitops-promote` reports git mirror control-plane drift, refs incon
The fetch remote of the D601 `v0.3` fixed worktree is the node-local git mirror. After a GitHub PR is merged, if `git fetch origin v0.3` in `D601:/home/ubuntu/workspace/hwlab-v03` still does not see the latest merge commit, first run `hwlab nodes git-mirror sync --node D601 --lane v03 --confirm --wait`, then run `git fetch origin v0.3 && git pull --ff-only origin v0.3` in the fixed worktree. `trigger-current --lane v03` performs a mirror pre-sync for the PipelineRun, but it does not replace fetch hygiene in the fixed worktree. After promotion, if the node-local `git-mirror status` shows `pendingFlush=true`, run a node-local flush and wait until `pendingFlush=false` and `githubInSync=true`.
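The refresh order above can be sketched as a small helper that only builds the command lists and does not execute anything. This is a minimal illustration: the function name `worktree_refresh_commands` is hypothetical, and the commands themselves are copied from the paragraph above.

```python
def worktree_refresh_commands(node: str, lane: str, branch: str) -> list[list[str]]:
    """Return the ordered argv lists: mirror sync first, then fetch, then fast-forward pull.

    Hypothetical helper for illustration; it only assembles the commands
    named in the text and never runs them.
    """
    return [
        # 1. Refresh the node-local mirror so it can see the latest merge commit.
        ["hwlab", "nodes", "git-mirror", "sync",
         "--node", node, "--lane", lane, "--confirm", "--wait"],
        # 2. Fetch from the mirror inside the fixed worktree.
        ["git", "fetch", "origin", branch],
        # 3. Fast-forward only, so a diverged worktree fails loudly instead of merging.
        ["git", "pull", "--ff-only", "origin", branch],
    ]


commands = worktree_refresh_commands("D601", "v03", "v0.3")
for argv in commands:
    print(" ".join(argv))
```

The git steps are meant to run inside the fixed worktree (`/home/ubuntu/workspace/hwlab-v03` on D601); how the commands are executed there is left out of the sketch.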
For D601/node-scoped mirror status, `githubGitops` comes from `refs/mirror-stage/...` in the local mirror cache. If the log of `hwlab nodes git-mirror flush --node D601 --lane v03 --confirm --wait` already shows that the `v0.3-gitops -> v0.3-gitops` push succeeded, but the command then exits with code 44 because of a GitHub SSH `kex_exchange_identification` error or a failed fetch confirmation, and status still shows `pendingFlush=true`, do not keep flushing blindly. First run `hwlab nodes git-mirror sync --node D601 --lane v03 --confirm --wait` to refresh mirror-stage, then use status to confirm `localGitops=githubGitops`, `pendingFlush=false`, and `githubInSync=true`.
---
## Secret Management
@@ -81,6 +81,8 @@ The `devops-infra` git mirror/relay remains manual and CLI-controlled, not CronJ
After a `v0.2` PipelineRun completes, treat runtime rollout and remote GitOps persistence as two separate checks. `hwlab g14 control-plane status --lane v02` is the runtime check: it must show the expected source commit, PipelineRun completed, Argo `Synced/Healthy`, public 19666/19667 probes passing, and Cloud Web asset probes such as `/app.js` readable. `hwlab g14 git-mirror status` is the persistence check: `cache.summary.pendingFlush` must be false and `cache.summary.githubInSync` true before declaring GitOps fully flushed back to GitHub. The PR monitor performs this flush automatically for its own merged PRs and records the result in the PR comment. Manual operators should run `bun scripts/cli.ts hwlab g14 git-mirror flush --confirm` and poll the returned job with `bun scripts/cli.ts job status <jobId> --tail-bytes 12000` only when they used lower-level manual trigger/status paths or when the monitor reports a flush failure; do not replace this with raw `kubectl`, native `git push`, or a long SSH wait.
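The persistence check can be sketched as a predicate over parsed status output. This is a minimal sketch that assumes the status is available as a nested dictionary mirroring the `cache.summary.*` field paths named above; the function name `gitops_fully_flushed` and the sample dictionaries are hypothetical.

```python
def gitops_fully_flushed(status: dict) -> bool:
    """True only when pendingFlush is false and githubInSync is true.

    Assumes `status` mirrors the `cache.summary.*` paths from the text.
    Missing fields count as "not flushed" so incomplete output never passes.
    """
    summary = status.get("cache", {}).get("summary", {})
    return summary.get("pendingFlush") is False and summary.get("githubInSync") is True


# Hypothetical sample payloads, not real command output.
flushed = {"cache": {"summary": {"pendingFlush": False, "githubInSync": True}}}
pending = {"cache": {"summary": {"pendingFlush": True, "githubInSync": False}}}

print(gitops_fully_flushed(flushed))
print(gitops_fully_flushed(pending))
```

Treating missing fields as a failure keeps the check conservative: a truncated or unexpected status payload never counts as fully flushed.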
For D601/node-scoped runtime lanes, `hwlab nodes git-mirror status --node <node> --lane <lane>` reports GitHub refs from the mirror cache's `refs/mirror-stage/...`, not from a live GitHub API request. A `flush --wait` run can push the GitOps ref successfully and still exit non-zero if the follow-up fetch/verification hits a transient GitHub SSH error such as `kex_exchange_identification`. When the flush log already shows the ref update was pushed but status still reports `pendingFlush=true`, do not keep repeating blind flush attempts; first run the corresponding controlled `hwlab nodes git-mirror sync --node <node> --lane <lane> --confirm --wait` to refresh `refs/mirror-stage/...`, then recheck status for `localGitops=githubGitops`, `pendingFlush=false`, and `githubInSync=true`.
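The decision described above can be sketched as a small function. This is a minimal sketch, assuming the node-scoped status fields are available as a flat dictionary; the function name, the return labels (`"done"`, `"sync"`, `"flush"`, `"inspect"`), and the sample values are hypothetical and not part of the CLI.

```python
def next_mirror_action(status: dict, flush_log_shows_push: bool) -> str:
    """Pick the next step after a flush attempt, following the rule in the text.

    `status` is assumed to hold localGitops, githubGitops, pendingFlush and
    githubInSync as top-level keys (an assumption about the parsed output).
    """
    local_ref = status.get("localGitops")
    in_sync = (
        status.get("pendingFlush") is False
        and status.get("githubInSync") is True
        and local_ref is not None
        and local_ref == status.get("githubGitops")
    )
    if in_sync:
        return "done"
    if status.get("pendingFlush") is True:
        # A push that already landed means status is reading a stale
        # refs/mirror-stage/... ref: refresh it with sync, do not flush again.
        return "sync" if flush_log_shows_push else "flush"
    # Any other combination is not covered by the text; look at it manually.
    return "inspect"


# Hypothetical status after a flush that pushed but failed verification.
stale = {"localGitops": "abc123", "githubGitops": "0ld000",
         "pendingFlush": True, "githubInSync": False}
print(next_mirror_action(stale, flush_log_shows_push=True))
```

The key design point is that evidence of a successful push in the flush log changes the remedy from another flush to a sync, because the stale value comes from the mirror cache rather than from GitHub.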
If `gitops-promote` fails because the git mirror control plane drifted, refs are inconsistent, or publish/flush did not complete, recover through the controlled mirror path: `hwlab g14 git-mirror apply --confirm` to reinstall the current hook/ConfigMap, `hwlab g14 git-mirror sync --confirm --wait` to realign source and GitOps refs, then a targeted `control-plane cleanup-runs --pipeline-run <failed-run> --confirm` before retriggering the same lane. The old branch/path allowlist gate has been removed; do not restore it, patch the hook inside the pod, delete PipelineRuns with raw kubectl, or bypass `git-mirror flush`. Closeout still requires the target PipelineRun status, Argo health, public probes, and `git-mirror status` with `pendingFlush=false`.
When closing an issue against a specific completed `v0.2` PipelineRun, use targeted status instead of the latest-head status if `origin/v0.2` has already advanced through a parallel task: