From 6a56234b875fb5a96601ffb5cbb0d9cb846237a6 Mon Sep 17 00:00:00 2001
From: Codex
Date: Fri, 29 May 2026 08:35:35 +0000
Subject: [PATCH] docs: strengthen distributed agile field validation rules
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 AGENTS.md                        |  2 ++
 docs/reference/devops-hygiene.md | 10 ++++++++++
 2 files changed, 12 insertions(+)

diff --git a/AGENTS.md b/AGENTS.md
index ff09f829..eb3ed576 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -35,7 +35,9 @@ UniDesk is a distributed work platform with the main server as its unified entry point; this document
 ## Critical Distributed Agile Validation Rule
 
 - P0: Distributed agile development defaults to running a minimal real closed-loop validation in the target provider/pod/host passthrough environment first, and only then entering the full CI/CD, GitOps or release pipeline; using the full CI/CD as a per-round compatibility exploration and trial-and-error tool is forbidden.
+- P0: For anomalies in third-party models, API providers, hardware, cross-platform bridges, CLI/tran or frequently used toolchains, do not classify the failure as externally unavailable before probing the target runtime surface in the field; first verify the current runtime config/Secret/env/proxy, reproduce the symptom on the real pod/host/provider through controlled passthrough, and compare against mature implementations such as UniDesk/HWLAB.
 - Fixes for friction with third-party models, hardware, cross-platform bridges, CLI/tran and frequently used toolchains must first prove that the core path works, using a minimal script, a temporary pod exec, a real port or a controlled passthrough command on the target runtime surface; only after the closed loop passes may the fix be solidified into source code, tests, long-term references and the formal release.
+- P0: When user feedback or field evidence overturns the initial blocker judgment, stop the original blocker narrative immediately and switch to the closed loop “passthrough probing -> single-variable experiments -> hotfix validation when necessary -> source PR -> CI/CD -> re-test from the original entry point”; a hotfix on the runtime surface serves only as experimental and recovery evidence and cannot replace the formal PR/CI/CD.
 - When external API or model behavior may have changed, first consult official/first-hand documentation and mature practice, then validate experimentally on the target runtime surface; do not repeatedly push CI/CD trial-and-error runs based on stale memory.
 
 ## Critical CI/CD CLI Control Rule
diff --git a/docs/reference/devops-hygiene.md b/docs/reference/devops-hygiene.md
index 1e8f519d..675fbc4c 100644
--- a/docs/reference/devops-hygiene.md
+++ b/docs/reference/devops-hygiene.md
@@ -48,6 +48,16 @@ If a manual repair is needed to unblock the platform, the durable fix must be co
 
 “分布式敏捷” (distributed agile) is UniDesk's fixed process name for distributed agile field repair. When a later issue, PR, command record or user feedback mentions “分布式敏捷”, it refers by default to the following process: first probe quickly and experiment with patches on the real distributed runtime surface, producing reproducible evidence and a retrospective issue; then converge the effective fix into durable delivery through Git/PR/CI/CD; finally re-test from the original user entry point. It allows fast on-site learning, but does not allow runtime-surface changes to become hidden deployment truth.
 
+Before classifying a failure as an external blocker, the operator must complete the field anti-misclassification check. This is P0 for model providers, API providers, hardware links, cross-platform bridges, CLI/tran paths and frequently used tooling:
+
+1. Confirm the exact runtime configuration used by the failing path: committed source ref, deployed image or script revision, redacted Secret names and key presence, env/proxy/NO_PROXY shape, endpoint identity and command args. Do not infer these values from memory or from a different workspace.
+2. Reproduce the symptom from the actual target provider, pod, host bridge or service port through UniDesk passthrough or the service entry that failed. A commander-machine-only check is supporting evidence, not classification evidence.
+3. Compare with the mature local implementation when one exists. For Codex/model-provider work, inspect the current UniDesk/HWLAB stdio, forwarder, proxy, env-stripping and config-loading paths before concluding the provider itself is broken.
+4. Run narrow one-variable experiments in the live target environment. Typical variables are explicit versus config-derived model, endpoint, proxy or NO_PROXY, env inheritance, secret mount shape, CLI version, protocol start parameters and request payload. Record the success case and the failure case with trace ids, run ids, job names, rollout objects or bounded logs.
+5. Only call the condition an external blocker after the current runtime config has been verified, the minimal real-path probe still fails, a mature reference path or equivalent cross-check also fails, and the evidence rules out local adapter/config mistakes.
+
+If user feedback or fresh evidence contradicts an initial blocker claim, the operator must stop repeating the blocker narrative and switch to field repair mode immediately. The expected sequence is passthrough probing, single-variable live experiments, a bounded hotfix experiment when needed, a source PR, CI/CD rollout and re-test from the original entry point. The hotfix proves direction or restores a live path; it does not complete the task.
+
 The standard flow is:
 
 1. Probe the real runtime surface first. Use structured UniDesk passthrough, service health endpoints, trace/result polling, bounded logs, object metadata and user-entry requests to reproduce the symptom on the actual target environment. Prefer short single-step commands that return promptly and can be repeated.