2026-06-11 10:57:15 +00:00

# Auth Broker P0 Contract
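This document finalizes the P0 plan for injecting GitHub issue/PR permissions. The P0 goal is not to build a full IAM system, nor to distribute secrets to runners. The goal is to first validate, on D601 dev, a single-binary Rust broker proxy that lets a Code Queue runner complete GitHub issue/PR access and PR preflight dry-runs through a controlled proxy even when it has no `GH_TOKEN` / `GITHUB_TOKEN`.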
## Existing Paths
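The current blocker is that several paths equate GitHub capability with runner environment variables:
- `scripts/src/gh.ts`: the UniDesk `gh` CLI is already a repo-native GitHub REST wrapper. It reads `GH_TOKEN` first, then `GITHUB_TOKEN`, then falls back to the system `gh auth token`. It never prints token values and already reports structured failures such as `missing-token`, `permission-denied`, `scope-insufficient`, `github-transient`, `network-proxy-failed` and `unsupported-command`. Of these, `github-transient` is a retryable GitHub DNS/API connection failure and is not the same as a missing token or insufficient permission.
- `scripts/code-queue-pr-preflight-example.ts`: the local example first checks for an env token, then runs `gh auth status`, `gh pr create --dry-run` and `gh pr comment --dry-run`. Without an env token the workflow fails immediately.
- `src/components/microservices/code-queue/src/runtime-preflight.ts`: the D601 scheduler preflight checks whether `GH_TOKEN` / `GITHUB_TOKEN` exists and uses curl/gh/git to probe the GitHub API, issues, PRs, SSH and an optional push dry-run.
- `scripts/src/code-queue.ts`: `codex pr-preflight` condenses the runtime preflight into a runner capability summary. The scheduler env token and the active runner/dev container token are independent scopes. When the scheduler env token or auth-broker is missing, it outputs `failureKind=auth-missing`, `degradedReason=auth-broker-needed` and `runnerDisposition=infra-blocked`, and uses `authScopeSummary`/`scopeBoundary`/`recommendedActions` to make clear that this does not mean the current active runner cannot create a PR. Inside the active task, verification still requires the repo-native `bun scripts/cli.ts gh auth status` and a PR create dry-run.
- `src/components/microservices/code-queue/src/index.ts`: `CODE_QUEUE_REMOTE_CODEX_ENV_KEYS` includes `GH_TOKEN`, `GITHUB_TOKEN`, `GH_HOST`, `GITHUB_API_URL` and `GH_REPO` by default, so that the scheduler env is passed on to the provider dev container. This is a credential distribution path and is not suitable as the only P0 solution.
- `src/components/microservices/code-queue/docker-compose.d601.yml`: D601 Code Queue reads its runtime environment from `.state/code-queue-d601.env` and configures the provider-gateway egress proxy. That file must not carry the real secret values for this task.
- `src/components/microservices/code-queue/Dockerfile`: the runner image already contains `git`, `gh`, `curl`, `jq`, `rg`, `cargo`, `rustc` and `rustfmt`, which is enough to run a lightweight broker client, preflight and end-to-end CLI interaction verification. This P0 does not need to install new system packages just for investigation.
The artifact registry / deploy path has a different problem: the D601 registry is a host-loopback artifact cache, and standard CD first verifies the commit-pinned manifest, then runs pull/import/retag/rollout. The P0 auth broker does not proxy Docker registry credentials, does not replace provider-gateway Host SSH, does not run deploy apply, and does not solve production registry publish authorization.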
## P0 Scope
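P0 only addresses the problem that GitHub REST permissions should not appear in the ordinary runner env:
- Add a single-binary Rust service skeleton at `src/components/microservices/auth-broker`, working name `auth-broker`.
- Add a dry-run CLI adapter with entry point `bun scripts/cli.ts auth-broker contract|health --dry-run|credential-request --dry-run|pr-preflight --dry-run`.
- Validate on D601 dev only at first. The entry point must be a k3s ClusterIP, a backend-core/microservice private proxy, or D601 loopback. No public port is opened.
- The broker holds the server-side GitHub credential reference and calls GitHub REST. The runner does not receive, read or print `GH_TOKEN` / `GITHUB_TOKEN`.
- The API accepts only structured operations. It does not accept shell, argv, arbitrary URLs or raw `gh api`.
- P0 does not implement real secret storage, rotation, user identity binding, fine-grained authorization, production rollout or registry/deploy credential proxying.
P0 work can be advanced in parallel by Code Queue, but the implementation must be split into non-conflicting lanes: the Rust API skeleton, the CLI client adapter, the runtime-preflight capability summary, the D601 dev manifest/dry-run, documentation, and end-to-end CLI interaction verification. Real credential configuration, D601 dev Secret creation, production enablement and the live write allowlist require manual confirmation.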
## API
The first skeleton lives at:
- `src/components/microservices/auth-broker/Cargo.toml`
- `src/components/microservices/auth-broker/src/main.rs`
- `src/components/microservices/auth-broker/Dockerfile`
- `config.json` microservice id `auth-broker`
- `deploy.json` prod/dev desired-state entries for `auth-broker`
- `docker-compose.yml` service `auth-broker` behind Compose profile `auth-broker`
- `scripts/src/auth-broker.ts`
The skeleton intentionally does not read `GH_TOKEN` or `GITHUB_TOKEN`. It uses only redacted readiness configuration such as `AUTH_BROKER_GITHUB_CONFIGURED`, `AUTH_BROKER_GITHUB_CREDENTIAL_REF`, `AUTH_BROKER_ALLOWED_REPOS` and optional `AUTH_BROKER_AUDIT_LOG`. Real secret mounting is outside this dry-run surface.
## P1 Source Registration
P1 keeps Auth Broker in source and dry-run only:
- `config.json` registers stable microservice id `auth-broker` on `main-server`, private backend `http://auth-broker:4291`, health path `/health`, and allowed proxy prefixes `/health` plus `/v1/github/`.
- `docker-compose.yml` defines service `auth-broker` with `profiles: ["auth-broker"]`, `restart: "no"`, no public `ports`, and redacted env names only. Default `server start` does not select this profile, so this source registration must not change current production runtime.
- `deploy.json` includes prod and dev desired-state entries so `deploy plan --env prod|dev --service auth-broker` has a stable identity. Live apply is supervisor-gated until credential mounting and private exposure are separately reviewed.
- `bun scripts/cli.ts auth-broker contract|health --dry-run|credential-request --dry-run|pr-preflight --dry-run` reports `serviceRegistration.config`, `serviceRegistration.compose`, `serviceRegistration.deploy`, and `serviceRegistration.runtimeCredentialRef` using presence/ref fields only.
- Runtime credential readiness is expressed by `UNIDESK_AUTH_BROKER_GITHUB_CONFIGURED` / `AUTH_BROKER_GITHUB_CONFIGURED` and `UNIDESK_AUTH_BROKER_GITHUB_CREDENTIAL_REF` / `AUTH_BROKER_GITHUB_CREDENTIAL_REF` presence. The CLI prints only the source key and a sanitized `github:<ref>` style preview, never a token or raw credential value.
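The presence-only readiness rule above can be sketched as follows. This is a minimal illustration, not the adapter's actual code: the helper names (`firstPresent`, `sanitizeRef`, `credentialReadiness`), the precedence of the `UNIDESK_`-prefixed names, and the character set kept in the preview are all assumptions. Only the env names and the `github:<ref>` preview style come from the text.

```typescript
// Hypothetical sketch: derive presence-only readiness from the env names in the text.
type Env = Record<string, string | undefined>;

interface CredentialReadiness {
  configured: boolean;
  sourceKey: string | null; // which env name supplied the ref
  refPreview: string | null; // sanitized `github:<ref>` style preview
  valuesPrinted: false;
}

// Assumption: the UNIDESK_-prefixed name wins when both are set.
const CONFIGURED_KEYS = ["UNIDESK_AUTH_BROKER_GITHUB_CONFIGURED", "AUTH_BROKER_GITHUB_CONFIGURED"];
const REF_KEYS = ["UNIDESK_AUTH_BROKER_GITHUB_CREDENTIAL_REF", "AUTH_BROKER_GITHUB_CREDENTIAL_REF"];

// Returns the first key whose value is present and non-blank, or null.
function firstPresent(env: Env, keys: string[]): string | null {
  for (const key of keys) {
    const value = env[key];
    if (value !== undefined && value.trim() !== "") return key;
  }
  return null;
}

// Assumption: a preview keeps only [A-Za-z0-9._-] after an optional "github:" prefix.
function sanitizeRef(raw: string): string {
  const name = raw.trim().replace(/^github:/, "").replace(/[^A-Za-z0-9._-]/g, "");
  return `github:${name.slice(0, 64)}`;
}

function credentialReadiness(env: Env): CredentialReadiness {
  const configuredKey = firstPresent(env, CONFIGURED_KEYS);
  const refKey = firstPresent(env, REF_KEYS);
  return {
    configured: configuredKey !== null && refKey !== null,
    sourceKey: refKey,
    refPreview: refKey === null ? null : sanitizeRef(env[refKey] ?? ""),
    valuesPrinted: false,
  };
}
```

The result carries a source key and a reference preview only. Nothing in it is a token, which is what makes it safe to print in dry-run output.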
P1 still does not start Auth Broker, mount real secrets, deploy to prod/dev, restart backend-core/provider-gateway/Code Queue, or proxy registry/deploy credentials.
### `GET /health`
Returns only service status and redacted capabilities, never secret values.
```json
{
  "ok": true,
  "service": "auth-broker",
  "phase": "p0",
  "github": {
    "configured": true,
    "credentialRef": "github:unidesk-dev",
    "valuesPrinted": false
  },
  "capabilities": ["github.issue.read", "github.pr.read", "github.pr.preflight.dry-run"]
}
```
### `POST /v1/github/gh`
General GitHub issue/PR proxy. The request must be structured JSON:
```json
{
  "requestId": "uuid",
  "caller": {
    "plane": "code-queue",
    "taskId": "codex_...",
    "queueId": "default"
  },
  "repo": "pikasTech/unidesk",
  "operation": "github.issue.read",
  "dryRun": false,
  "params": {
    "number": 59,
    "jsonFields": ["body", "title", "state"]
  }
}
```
P0 required operation allowlist:
| Operation | Upstream | Remote write | P0 status |
| --- | --- | --- | --- |
| `github.auth.status` | GitHub REST rate limit + repo probe | no | enabled |
| `github.issue.list` | `GET /repos/{owner}/{repo}/issues` | no | enabled |
| `github.issue.read` | `GET /repos/{owner}/{repo}/issues/{number}` | no | enabled |
| `github.pr.list` | `GET /repos/{owner}/{repo}/pulls` | no | enabled |
| `github.pr.read` | `GET /repos/{owner}/{repo}/pulls/{number}` | no | enabled |
| `github.pr.create` | planned request only when `dryRun=true` | no | enabled as dry-run |
| `github.pr.comment.create` | planned request only when `dryRun=true` | no | enabled as dry-run |
P0 explicitly blocks `gh pr merge`, issue/PR delete, arbitrary `gh api`, arbitrary HTTP proxying, raw git push, Docker registry login and deploy commands. Live GitHub writes can be added later only after a user-confirmed allowlist and identity/audit review; without that confirmation P0 mutation operations must return `dry-run-required`.
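The allowlist and the dry-run rule can be expressed as one small gate. The sketch below is illustrative only: `gate`, `GateRequest` and `GateResult` are hypothetical names, and checking the repo before the operation is an assumed ordering. The operation strings and failure kinds are the ones used in this contract.

```typescript
// Hypothetical sketch of the P0 gate: finite operation strings, dry-run-only mutations.
type GateFailure = "repo-not-allowed" | "operation-not-allowed" | "dry-run-required";

type GateResult =
  | { allowed: true; writesRemote: false }
  | { allowed: false; failureKind: GateFailure };

interface GateRequest {
  repo: string;
  operation: string;
  dryRun: boolean;
}

const READ_OPERATIONS: ReadonlySet<string> = new Set([
  "github.auth.status",
  "github.issue.list",
  "github.issue.read",
  "github.pr.list",
  "github.pr.read",
]);

const DRY_RUN_ONLY_OPERATIONS: ReadonlySet<string> = new Set([
  "github.pr.create",
  "github.pr.comment.create",
]);

function gate(req: GateRequest, allowedRepos: readonly string[] = ["pikasTech/unidesk"]): GateResult {
  // Assumption: the repo allowlist is evaluated before the operation allowlist.
  if (!allowedRepos.includes(req.repo)) return { allowed: false, failureKind: "repo-not-allowed" };
  if (READ_OPERATIONS.has(req.operation)) return { allowed: true, writesRemote: false };
  if (DRY_RUN_ONLY_OPERATIONS.has(req.operation)) {
    // Mutations are only ever planned in P0, never sent upstream.
    return req.dryRun
      ? { allowed: true, writesRemote: false }
      : { allowed: false, failureKind: "dry-run-required" };
  }
  // Anything else (merge, delete, raw `gh api`, arbitrary URLs) is not a known operation.
  return { allowed: false, failureKind: "operation-not-allowed" };
}
```

Because the gate matches on a closed set of strings, a caller has no way to smuggle in shell, argv or a URL: an unrecognized operation falls through to `operation-not-allowed`.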
### `POST /v1/github/pr-preflight`
Runner-facing PR preflight proxy. It must preserve the existing `codex pr-preflight` semantics while replacing env-token coverage with broker coverage:
```json
{
  "requestId": "uuid",
  "repo": "pikasTech/unidesk",
  "base": "master",
  "head": "feature/example",
  "issueNumber": 59,
  "includePrCreateDryRun": true,
  "includePushDryRun": false
}
```
Successful response shape:
```json
{
  "ok": true,
  "runnerDisposition": "ready",
  "failureKind": null,
  "degradedReason": null,
  "tokenCoverage": {
    "ok": true,
    "source": "auth-broker",
    "scope": "broker-held-github-credential",
    "runnerEnvTokenRequired": false,
    "valuesPrinted": false
  },
  "prCapabilityContract": {
    "targetBranch": "master",
    "systemGhBinaryRequiredForWrites": false,
    "preflightCreatesPr": false,
    "preflightMergesPr": false,
    "brokerProxy": {
      "ok": true,
      "operations": ["github.auth.status", "github.pr.create"],
      "writesRemote": false
    }
  }
}
```
Broker PR preflight can prove GitHub REST auth, repo visibility, issue/PR read, and PR create body/parameter shape. It cannot prove runner-local git push capability unless the runner still performs `git push --dry-run` with its own git credentials. Therefore `includePushDryRun=true` remains a runner-local check and may still fail as `git-remote-gap`.
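One way to keep the two checks separate is to merge them only at the reporting step. The following is a sketch under assumptions: `combinePreflight` and its input types are hypothetical, and the real `codex pr-preflight` output has more fields than shown here. Only the `git-remote-gap` failure kind and the runner-local nature of the push check come from the text.

```typescript
// Hypothetical sketch: the broker result and the runner-local push probe stay independent.
interface BrokerPreflight {
  ok: boolean;
  failureKind: string | null;
}

interface PushDryRun {
  requested: boolean; // mirrors includePushDryRun
  ok: boolean; // result of the runner's own `git push --dry-run`
}

interface CombinedPreflight {
  ok: boolean;
  failureKind: string | null;
  pushCheck: "not-requested" | "runner-local";
}

function combinePreflight(broker: BrokerPreflight, push: PushDryRun): CombinedPreflight {
  const pushCheck = push.requested ? "runner-local" : "not-requested";
  // A broker failure is reported as-is; the push probe cannot compensate for it.
  if (!broker.ok) return { ok: false, failureKind: broker.failureKind, pushCheck };
  // The broker can be fully ready while the runner's git credentials are not.
  if (push.requested && !push.ok) return { ok: false, failureKind: "git-remote-gap", pushCheck };
  return { ok: true, failureKind: null, pushCheck };
}
```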
## Permission Boundary
P0 permission is intentionally coarse because identity verification is not in scope:
- Allowed repo list defaults to `pikasTech/unidesk`; unknown repos return `repo-not-allowed`.
- Allowed operations are finite strings. The broker never executes caller-provided shell, argv or URLs.
- Request body size is bounded. Markdown bodies are accepted only for planned dry-run output in P0.
- Response redaction is mandatory. No response, log or audit field may contain token values, Authorization headers, cookie values or upstream credential material.
- Broker-held credential is identified only by `credentialRef`, `credentialKind` and boolean readiness.
- Network exposure is dev-only and private. Public frontend access, if exposed at all, must go through the existing authenticated UniDesk proxy.
- P0 does not grant production deploy, registry push/pull, provider token, database, k3s or host SSH permissions.
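The mandatory redaction rule could be enforced by passing every response, log and audit payload through a single function before it is serialized. The sketch below is an assumption-heavy illustration: `redact`, the sensitive key list and the `[redacted]` placeholder are hypothetical, and the token pattern only covers the publicly documented GitHub token prefixes (`ghp_`, `gho_`, `ghu_`, `ghs_`, `ghr_`, `github_pat_`).

```typescript
// Hypothetical sketch: recursive redaction applied before anything is logged or returned.
const SENSITIVE_KEYS: ReadonlySet<string> = new Set([
  "authorization",
  "cookie",
  "set-cookie",
  "token",
  "gh_token",
  "github_token",
]);

// Matches GitHub-style tokens embedded in free text (assumed minimum length of 20).
const TOKEN_PATTERN = /\b(?:gh[pousr]_[A-Za-z0-9]{20,}|github_pat_[A-Za-z0-9_]{20,})\b/g;

function redact(value: unknown): unknown {
  if (typeof value === "string") return value.replace(TOKEN_PATTERN, "[redacted]");
  if (Array.isArray(value)) return value.map((item) => redact(item));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      // Sensitive keys are replaced wholesale, whatever their value looks like.
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? "[redacted]" : redact(inner);
    }
    return out;
  }
  return value; // numbers, booleans, null and undefined pass through unchanged
}
```

Key-based redaction catches headers and cookies even when their values do not look like tokens, and the pattern catches token-shaped strings that end up in messages or error text.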
## Audit Fields
Every broker request must emit one JSONL audit record with these fields:
| Field | Meaning |
| --- | --- |
| `requestId` | Caller-provided or broker-generated id |
| `observedAt` | ISO timestamp |
| `caller.plane` | `code-queue`, `ci`, `commander`, `manual-cli` or `unknown` |
| `caller.taskId` / `caller.queueId` | Code Queue correlation, nullable |
| `operation` | One allowlisted operation string |
| `repo` | Owner/repo after allowlist validation |
| `resource` | Issue/PR number or branch names, no body text |
| `dryRun` | Whether the request was a dry-run, in which case no upstream mutation is possible |
| `credentialRef` | Stable redacted credential reference |
| `credentialValuePrinted` | Always false |
| `upstream.method` / `upstream.path` | GitHub REST method/path without query secrets |
| `status` | HTTP status returned to caller |
| `ok` | Boolean success |
| `failureKind` / `degradedReason` | Structured failure semantics |
| `runnerDisposition` | `ready`, `infra-blocked` or `business-failed` |
| `retryable` | Boolean retry hint |
| `durationMs` | End-to-end broker latency |
| `redaction.valuesPrinted` | Always false |
Audit records may include body length and SHA-256 for planned PR/issue bodies, but must not store full Markdown bodies by default.
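A JSONL audit line that records a body digest instead of the body might look like the sketch below. It covers only a subset of the fields in the table, and several details are assumptions: the `auditLine` and `digestBody` names, the `body` field name, and measuring length in UTF-8 bytes. The hash uses Node's built-in `node:crypto` module.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch: one JSONL audit line per request, with a digest instead of the body.
interface AuditInput {
  requestId: string;
  plane: string;
  taskId: string | null;
  queueId: string | null;
  operation: string;
  repo: string;
  dryRun: boolean;
  credentialRef: string;
  status: number;
  ok: boolean;
  durationMs: number;
  plannedBody?: string; // Markdown body of a planned PR/comment, never stored verbatim
}

// Assumption: "body length" means UTF-8 byte length.
function digestBody(markdown: string): { length: number; sha256: string } {
  return {
    length: Buffer.byteLength(markdown, "utf8"),
    sha256: createHash("sha256").update(markdown, "utf8").digest("hex"),
  };
}

function auditLine(input: AuditInput, now: Date = new Date()): string {
  const record = {
    requestId: input.requestId,
    observedAt: now.toISOString(),
    caller: { plane: input.plane, taskId: input.taskId, queueId: input.queueId },
    operation: input.operation,
    repo: input.repo,
    dryRun: input.dryRun,
    credentialRef: input.credentialRef,
    credentialValuePrinted: false,
    status: input.status,
    ok: input.ok,
    durationMs: input.durationMs,
    redaction: { valuesPrinted: false },
    // Hypothetical field name: only length and hash are kept, not the text.
    body: input.plannedBody === undefined ? null : digestBody(input.plannedBody),
  };
  return JSON.stringify(record) + "\n";
}
```

Storing the digest lets an operator confirm later that a planned body matched what a runner intended to send, without the audit log becoming a copy of issue or PR content.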
## Failure Semantics
P0 must use stable failure kinds so Code Queue can decide whether to retry, split blocker work or ask for manual action.
| Failure kind | HTTP | Runner disposition | Retryable | Meaning |
| --- | --- | --- | --- | --- |
| `auth-not-configured` | 503 | `infra-blocked` | false | Broker has no configured GitHub credential reference |
| `broker-unavailable` | 503 | `infra-blocked` | true | Broker service/proxy is unreachable |
| `unauthorized-caller` | 403 | `infra-blocked` | false | Caller is outside the private dev/proxy boundary |
| `repo-not-allowed` | 403 | `business-failed` | false | Repo is not in the broker allowlist |
| `operation-not-allowed` | 403 | `business-failed` | false | Operation is not in the finite allowlist |
| `dry-run-required` | 409 | `business-failed` | false | Mutation was requested but P0 only allows dry-run |
| `validation-failed` | 400 | `business-failed` | false | Missing or invalid structured parameters |
| `github-egress-failed` | 502 | `infra-blocked` | true | GitHub or proxy network path failed |
| `github-rate-limited` | 429 | `infra-blocked` | true | GitHub returned rate limiting |
| `github-permission-denied` | 403 | `infra-blocked` | false | Credential lacks repo or action access |
| `scope-insufficient` | 403 | `infra-blocked` | false | Accepted scopes do not satisfy the operation |
| `repo-not-found` | 404 | `business-failed` | false | Allowed repo/resource does not exist or is hidden |
| `upstream-invalid-response` | 502 | `infra-blocked` | true | GitHub response could not be parsed safely |
All failures must include `message`, `requestId`, `failureKind`, `degradedReason`, `runnerDisposition`, `retryable`, and a `next` array with bounded diagnostic commands or manual actions. None may include secret values.
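The table translates directly into a lookup that both the broker and its clients could share. The sketch below mirrors the rows above. The `FAILURE_TABLE`, `spec` and `failureBody` names are hypothetical, and `degradedReason` is passed in by the caller because this contract does not enumerate its values per failure kind.

```typescript
// Hypothetical sketch: the failure table as a typed lookup plus a response builder.
type RunnerDisposition = "infra-blocked" | "business-failed";

interface FailureSpec {
  http: number;
  runnerDisposition: RunnerDisposition;
  retryable: boolean;
}

function spec(http: number, runnerDisposition: RunnerDisposition, retryable: boolean): FailureSpec {
  return { http, runnerDisposition, retryable };
}

const FAILURE_TABLE = {
  "auth-not-configured": spec(503, "infra-blocked", false),
  "broker-unavailable": spec(503, "infra-blocked", true),
  "unauthorized-caller": spec(403, "infra-blocked", false),
  "repo-not-allowed": spec(403, "business-failed", false),
  "operation-not-allowed": spec(403, "business-failed", false),
  "dry-run-required": spec(409, "business-failed", false),
  "validation-failed": spec(400, "business-failed", false),
  "github-egress-failed": spec(502, "infra-blocked", true),
  "github-rate-limited": spec(429, "infra-blocked", true),
  "github-permission-denied": spec(403, "infra-blocked", false),
  "scope-insufficient": spec(403, "infra-blocked", false),
  "repo-not-found": spec(404, "business-failed", false),
  "upstream-invalid-response": spec(502, "infra-blocked", true),
};

type FailureKind = keyof typeof FAILURE_TABLE;

function failureBody(
  kind: FailureKind,
  requestId: string,
  message: string,
  degradedReason: string,
  next: string[],
) {
  const entry = FAILURE_TABLE[kind];
  return {
    status: entry.http,
    body: {
      ok: false,
      message,
      requestId,
      failureKind: kind,
      degradedReason,
      runnerDisposition: entry.runnerDisposition,
      retryable: entry.retryable,
      next, // bounded diagnostic commands or manual actions, never secret values
    },
  };
}
```

Deriving `FailureKind` from the table keys means an unknown failure string is a compile-time error rather than a silently unhandled case.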
## CLI Adapter
The local runner adapter is a dry-run surface only:
```bash
bun scripts/cli.ts auth-broker contract
bun scripts/cli.ts auth-broker health --dry-run
bun scripts/cli.ts auth-broker credential-request --operation github.pr.create --repo pikasTech/unidesk --dry-run
bun scripts/cli.ts auth-broker pr-preflight --repo pikasTech/unidesk --base master --head <head-branch> --issue 59 --dry-run
```
If no `UNIDESK_AUTH_BROKER_URL` / `AUTH_BROKER_URL` is configured, the adapter returns a structured failure instead of falling through to live GitHub or a shell fallback. `GH_TOKEN` / `GITHUB_TOKEN` presence is reported only as migration diagnostics and does not make the Auth Broker adapter ready:
```json
{
  "ok": false,
  "failureKind": "auth-missing",
  "degradedReason": "broker-needed",
  "runnerDisposition": "infra-blocked",
  "brokerNeeded": true,
  "tokenCoverage": {
    "ok": false,
    "presenceOnly": true,
    "valuesRead": false,
    "valuesPrinted": false
  }
}
```
When a broker endpoint is configured, the same command returns the P0 ready shape with `tokenCoverage.source=auth-broker`, `runnerEnvTokenRequired=false`, `valuesPrinted=false`, `preflightCreatesPr=false`, `preflightMergesPr=false` and `brokerProxy.writesRemote=false`. The adapter sanitizes endpoint URLs before printing and never reads token values.
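The adapter's two outcomes can be sketched as a single function over the environment. This is not the code in `scripts/src/auth-broker.ts`: the function names, the `envTokenPresent` diagnostics field, the precedence of `UNIDESK_AUTH_BROKER_URL`, and the definition of "sanitized" as dropping userinfo, query and fragment are all assumptions made for illustration.

```typescript
// Hypothetical sketch: no broker URL means a structured failure, never a fallback.
type Env = Record<string, string | undefined>;

type AdapterStatus =
  | {
      ok: false;
      failureKind: "auth-missing";
      degradedReason: "broker-needed";
      runnerDisposition: "infra-blocked";
      brokerNeeded: true;
      tokenCoverage: { ok: false; presenceOnly: true; valuesRead: false; valuesPrinted: false };
      envTokenPresent: boolean; // hypothetical migration diagnostic, presence only
    }
  | {
      ok: true;
      endpoint: string;
      tokenCoverage: { ok: true; source: "auth-broker"; runnerEnvTokenRequired: false; valuesPrinted: false };
    };

// Tests presence only; the value is never stored, returned or printed.
function present(env: Env, key: string): boolean {
  const value = env[key];
  return value !== undefined && value.trim() !== "";
}

// Assumption: sanitizing drops userinfo, query string and fragment.
function sanitizeEndpoint(raw: string): string {
  return raw.replace(/\/\/[^/@]*@/, "//").replace(/[?#].*$/, "");
}

function adapterStatus(env: Env): AdapterStatus {
  const urlKey = ["UNIDESK_AUTH_BROKER_URL", "AUTH_BROKER_URL"].find((key) => present(env, key));
  if (urlKey === undefined) {
    return {
      ok: false,
      failureKind: "auth-missing",
      degradedReason: "broker-needed",
      runnerDisposition: "infra-blocked",
      brokerNeeded: true,
      tokenCoverage: { ok: false, presenceOnly: true, valuesRead: false, valuesPrinted: false },
      // An env token does not make the adapter ready; it is only reported.
      envTokenPresent: present(env, "GH_TOKEN") || present(env, "GITHUB_TOKEN"),
    };
  }
  return {
    ok: true,
    endpoint: sanitizeEndpoint(env[urlKey] ?? ""),
    tokenCoverage: { ok: true, source: "auth-broker", runnerEnvTokenRequired: false, valuesPrinted: false },
  };
}
```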
## D601 Dev Acceptance
The minimum D601 dev verification is:
1. The Code Queue scheduler does not need `GH_TOKEN` / `GITHUB_TOKEN` for PR preflight to succeed when the broker is configured.
2. `codex pr-preflight --remote` reports `tokenCoverage.source=auth-broker`, `runnerEnvTokenRequired=false` and `valuesPrinted=false`.
3. Broker-backed issue/PR read returns structured GitHub data for `pikasTech/unidesk` through the stable proxy.
4. Broker-backed PR create dry-run returns a planned operation with `writesRemote=false`.
5. `includePushDryRun=true` is clearly reported as runner-local and can fail independently as `git-remote-gap`.
6. Logs and audit records contain request ids, operation names and failure semantics, but no token values.
7. No deploy, restart, production mutation, registry credential proxy or long-lived extra control plane is introduced by the verification itself.
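Items 2 and 4 are mechanical enough to check in code against a preflight response. The sketch below assumes the response follows the successful shape shown under `POST /v1/github/pr-preflight`. The `acceptanceViolations` name and the violation messages are hypothetical, and the remaining items still need manual or log-based verification.

```typescript
// Hypothetical sketch: check the machine-verifiable acceptance items on a preflight report.
interface PreflightReport {
  tokenCoverage: { source: string; runnerEnvTokenRequired: boolean; valuesPrinted: boolean };
  prCapabilityContract: {
    preflightCreatesPr: boolean;
    preflightMergesPr: boolean;
    brokerProxy: { writesRemote: boolean };
  };
}

// Returns a list of human-readable violations; an empty list means the checks passed.
function acceptanceViolations(report: PreflightReport): string[] {
  const violations: string[] = [];
  const coverage = report.tokenCoverage;
  const contract = report.prCapabilityContract;
  if (coverage.source !== "auth-broker") violations.push("tokenCoverage.source must be auth-broker");
  if (coverage.runnerEnvTokenRequired) violations.push("runnerEnvTokenRequired must be false");
  if (coverage.valuesPrinted) violations.push("valuesPrinted must be false");
  if (contract.preflightCreatesPr) violations.push("preflight must not create a PR");
  if (contract.preflightMergesPr) violations.push("preflight must not merge a PR");
  if (contract.brokerProxy.writesRemote) violations.push("brokerProxy.writesRemote must be false");
  return violations;
}
```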
## Manual Confirmation Points
These items are outside P0 automation and require user or operator confirmation:
- Where the broker-held GitHub credential reference is mounted in D601 dev.
- Whether P0 permits any live GitHub write. Default is read-only plus dry-run.
- Whether the Code Queue CLI should default to broker mode when the env token is absent, or require an explicit `--auth-broker` flag first.
- Production exposure, production credentials, rotation, identity binding and per-user authorization.
- Any deploy, restart, registry credential, provider token, database or k3s permission integration.