diff --git a/.agents/skills/unidesk-sub2api/SKILL.md b/.agents/skills/unidesk-sub2api/SKILL.md index e2291039..5550e0ab 100644 --- a/.agents/skills/unidesk-sub2api/SKILL.md +++ b/.agents/skills/unidesk-sub2api/SKILL.md @@ -5,7 +5,7 @@ description: UniDesk Sub2API platform operations skill. When the user mentions Sub2API, sub2api # UniDesk Sub2API -UniDesk operates Sub2API in the G14 k3s `platform-infra` namespace. Use the UniDesk CLI for all daily operations; do not write Kubernetes resources directly or call the Sub2API management API by hand. +UniDesk operates Sub2API in the k3s `platform-infra` namespace. G14 is the default active runtime; D601 serves only as a standby predeploy target under the same YAML/CLI control, and while the external DB is not ready both the app and the local Redis cache stay at replicas=0. Use the UniDesk CLI for all daily operations; do not write Kubernetes resources directly or call the Sub2API management API by hand. **Fixed entrypoint**: `cd /root/unidesk && bun scripts/cli.ts platform-infra sub2api ...` @@ -33,7 +33,8 @@ bun scripts/cli.ts platform-infra sub2api codex-pool trace --request-id ` / `auth.json.` and declare it in YAML. - Output may contain only the Secret path, length, and preview/fingerprint; never print a complete API key, admin password, JWT secret, or TOTP key. @@ -42,16 +43,21 @@ bun scripts/cli.ts platform-infra sub2api codex-pool trace --request-id --pipeline-run --confirm` for controlled cleanup and then retry; do not use raw `kubectl delete` or hand-edit the mirror hook. After the fix you must still use `control-plane status --pipeline-run ` and `git-mirror status` to confirm runtime closeout and GitHub flush respectively. -- `platform-infra sub2api plan|apply|status|validate|codex-pool` is the controlled entrypoint for Sub2API in the G14 `platform-infra` namespace. The image version is controlled by `config/platform-infra/sub2api.yaml`; the Codex upstream pool, unified API key Secret, FRP public port, and master `~/.codex` consumer are controlled by `config/platform-infra/sub2api-codex-pool.yaml`. The complete daily deployment, upstream add/remove, FRP exposure, local Codex configuration, acceptance, and troubleshooting procedures all live in `$unidesk-sub2api` (`.agents/skills/unidesk-sub2api/SKILL.md`). `docs/reference/platform-infra.md` keeps only the namespace, YAML-first, routing, Secret redaction, and probe development boundaries. +- `platform-infra sub2api plan|apply|status|validate|codex-pool` is the controlled entrypoint for Sub2API in the `platform-infra` namespace; `--target` selects the runtime target, with `G14` as the default active runtime and `D601` as a standby predeploy target controlled by the same YAML. The image version and target boundaries are controlled by `config/platform-infra/sub2api.yaml`; the Codex upstream pool, unified API key Secret, FRP public port, and master `~/.codex` 
consumer are controlled by `config/platform-infra/sub2api-codex-pool.yaml`. The complete daily deployment, upstream add/remove, FRP exposure, local Codex configuration, acceptance, and troubleshooting procedures all live in `$unidesk-sub2api` (`.agents/skills/unidesk-sub2api/SKILL.md`). `docs/reference/platform-infra.md` keeps only the namespace, YAML-first, routing, Secret redaction, and probe development boundaries. - `hwlab g14 observability status|apply|query|targets|boundary|closeout [--lane v02] [--promql ] [--expect-count N] [--expect-value V] [--dry-run|--confirm]` is the controlled entrypoint for the G14 `devops-infra` shared monitoring infrastructure and HWLAB v0.2 monitoring closeout. `apply` installs the pinned Prometheus Operator `v0.91.0`, Prometheus `v3.12.0`, the Prometheus discovery RBAC, the Prometheus instance inside `devops-infra`, and a ClusterIP query Service, and applies a low-risk label to the workload namespaces that are allowed to be discovered; it does not deploy Prometheus, Grafana, or Alertmanager into `hwlab-v02`, nor does it take over the HWLAB runtime Deployment/Service. `status` gives a read-only summary of the CRDs, operator Deployment, Prometheus CR/pod/service, `hwlab-v02` ServiceMonitor/PrometheusRule, and a bounded `up` query; `query` reaches Prometheus only through the Kubernetes service proxy and supports `--expect-count` / `--expect-value`, reporting `assertion`, bad values, and missing/extra series; `targets` summarizes ServiceMonitor/PrometheusRule, metrics sidecar readiness/restart, the three-layer metric values, and the current CPU/memory resource snapshot from `metrics.k8s.io`; `boundary` verifies that the workload namespace has no Prometheus/Alertmanager and runs negative checks against the public `/metrics` on `19666/19667`; `closeout` aggregates the semantic conclusions for platform ready, scrape reachable, sidecar serving, business health probe, resource snapshot, namespace boundary, and public metrics exposure. For the long-term boundaries see `docs/reference/g14-observability-infra.md`. - `hwlab g14 tools-image status|build --name ci-node-tools --tag [--dockerfile deploy/ci/hwlab-ci-node-tools.Dockerfile] [--dry-run|--confirm]` is the controlled host build/push entrypoint for the pinned G14 HWLAB CI tools image; build and push happen only on the G14 host and the local registry, never on the master server, and `apk add`/runtime installs are not pushed into a Tekton PipelineRun. - `trans gh:/owner/repo ...` maps GitHub issues/PRs into a virtual text directory with read-only/controlled-write access, suited to small-patch maintenance of daily reports, PR bodies, and issue bodies: `trans gh:/pikasTech/HWLAB ls` shows `pr/` and `issue/`; `trans gh:/pikasTech/HWLAB/pr ls [--limit N] [--full]` and `trans gh:/pikasTech/HWLAB/issue ls [--limit N] [--full]` show each entry's state, post count, body length, and title; `trans 
gh:/pikasTech/HWLAB/pr/507 ls` shows the first-post body file of a single PR; `trans gh:/pikasTech/HWLAB/505/1 cat|rg|patch-apply` stays compatible with the legacy issue/PR number route. `patch-apply` uses the virtual-file executor of UniDesk's default apply-patch v2, maps the first-post body to `body.md`, and still writes back through the guard/concurrency rules of `bun scripts/cli.ts gh issue/pr update`; `rm` on the first-post body is rejected with a structured error so that an issue/PR body is not deleted by mistake. Reading a large body must expand the UniDesk gh dump file, otherwise `cat/rg/patch-apply` would misread it as empty; this is the P0 visibility contract of the `gh:` virtual file interface. diff --git a/docs/reference/platform-infra.md b/docs/reference/platform-infra.md index b1029ec9..f2af95c0 100644 --- a/docs/reference/platform-infra.md +++ b/docs/reference/platform-infra.md @@ -1,6 +1,6 @@ -# G14 Platform Infra +# Platform Infra -`platform-infra` is the G14 k3s namespace for UniDesk-operated shared platform services. It is separate from HWLAB runtime lanes, AgentRun lanes, D601 user services, and legacy `devops-infra` control-plane helpers. New shared infra should land here first; old `devops-infra` resources migrate gradually only when a concrete owner and validation path exist. +`platform-infra` is the k3s namespace for UniDesk-operated shared platform services. G14 is the active default runtime for this namespace; D601 may host explicitly declared standby platform targets when the service needs node-local preparation or cutover capacity. It is separate from HWLAB runtime lanes, AgentRun lanes, D601 user services, and legacy `devops-infra` control-plane helpers. New shared infra should land here first; old `devops-infra` resources migrate gradually only when a concrete owner and validation path exist. ## Source Of Truth @@ -11,13 +11,15 @@ ## Sub2API Deployment Boundary -- Sub2API is a G14 platform service operated by UniDesk in namespace `platform-infra`. It is not a HWLAB lane workload, AgentRun workload, D601 service, or master server daemon. 
-- The canonical deployment entrypoint is `bun scripts/cli.ts platform-infra sub2api plan|apply|status|validate|codex-pool`; daily operation procedures live in `$unidesk-sub2api` at `.agents/skills/unidesk-sub2api/SKILL.md`. This reference keeps only development boundaries and project-specific source-of-truth rules. -- Raw `kubectl` through `trans G14:k3s` is only for bounded diagnosis and evidence, not a formal mutate path. +- Sub2API is a platform service operated by UniDesk in namespace `platform-infra`. It is not a HWLAB lane workload, AgentRun workload, D601 user service, or master server daemon. +- The canonical deployment entrypoint is `bun scripts/cli.ts platform-infra sub2api plan|apply|status|validate|codex-pool`. Runtime targets are selected with `--target`; `G14` is the active default target and `D601` is a standby target controlled by the same YAML. Daily operation procedures live in `$unidesk-sub2api` at `.agents/skills/unidesk-sub2api/SKILL.md`. This reference keeps only development boundaries and project-specific source-of-truth rules. +- Raw `kubectl` through `trans :k3s` is only for bounded diagnosis and evidence, not a formal mutate path. - The image version is controlled by `config/platform-infra/sub2api.yaml`. Image update procedures are daily operations owned by `$unidesk-sub2api`; the development boundary is that image choices remain YAML-controlled. - Sub2API should stay ClusterIP-only by default. Do not add Ingress, NodePort, LoadBalancer, or broad FRP exposure unless a YAML-controlled public exposure decision exists. - Sub2API currently has no resource limits by design. Do not add CPU or memory limits unless a later explicit decision changes that policy and stores the new policy in YAML. - Master server is a consumer/control host, not the runtime location. Do not deploy Sub2API, PostgreSQL, Redis, or heavy validation loops on master server. +- D601 Sub2API is a predeployment target, not a second active singleton. 
While the platform database handoff is pending, it must render without a local PostgreSQL StatefulSet, keep the Sub2API app and local Redis cache scaled to zero, and use only ephemeral Redis storage when Redis is later activated. After the external platform DB endpoint, Secret, and runtime images are ready, activation must be expressed by YAML and applied through the same `platform-infra sub2api --target D601` CLI path. +- Sub2API account sentinel and public FRP exposure remain singleton concerns. Do not create a second sentinel or public management surface for D601 unless a later YAML-controlled decision explicitly moves or splits that responsibility. ## Codex Pool Routing @@ -166,4 +168,4 @@ spec: This policy must be included in the `sub2api plan` / `apply` manifest rendering so that it is created as part of the normal deployment flow, not maintained as a manual one-off. -`platform-infra sub2api status` must report whether `NetworkPolicy/allow-all` exists and still has `podSelector: {}`, `policyTypes: [Ingress, Egress]`, `ingress: [{}]`, and `egress: [{}]`. `platform-infra sub2api validate` must also run temporary in-namespace probe pods that connect to `sub2api-postgres:5432` and `sub2api-redis:6379`; local `pg_isready` inside the PostgreSQL pod alone is insufficient because it does not exercise kube-router cross-pod policy evaluation. +`platform-infra sub2api status` must report whether `NetworkPolicy/allow-all` exists and still has `podSelector: {}`, `policyTypes: [Ingress, Egress]`, `ingress: [{}]`, and `egress: [{}]`. For active bundled targets, `platform-infra sub2api validate` must also run temporary in-namespace probe pods that connect to `sub2api-postgres:5432` and `sub2api-redis:6379`; local `pg_isready` inside the PostgreSQL pod alone is insufficient because it does not exercise kube-router cross-pod policy evaluation. 
For external-DB pending standby targets, `validate --target` checks the predeployment shape instead: no local PostgreSQL, app replicas zero, ClusterIP services, allow-all NetworkPolicy, and local Redis declared as ephemeral cache with readiness required only when Redis replicas are above zero. diff --git a/scripts/src/platform-infra.ts b/scripts/src/platform-infra.ts index 4f34e0a3..31ef3f5b 100644 --- a/scripts/src/platform-infra.ts +++ b/scripts/src/platform-infra.ts @@ -6,7 +6,7 @@ import { startJob } from "./jobs"; import type { RenderedCliResult } from "./output"; import { runSshCommandCapture, type SshCaptureResult } from "./ssh"; -const g14K3sRoute = "G14:k3s"; +const defaultTargetId = "G14"; const namespace = "platform-infra"; const serviceName = "sub2api"; const fieldManager = "unidesk-platform-infra"; @@ -15,6 +15,9 @@ const configPath = rootPath("config", "platform-infra", "sub2api.yaml"); const secretName = "sub2api-secrets"; const requiredSecretKeys = ["POSTGRES_PASSWORD", "ADMIN_PASSWORD", "JWT_SECRET", "TOTP_ENCRYPTION_KEY"] as const; +type DatabaseMode = "bundled" | "external-pending"; +type RedisMode = "bundled-persistent" | "local-ephemeral"; + interface Sub2ApiConfig { image: { repository: string; @@ -29,6 +32,48 @@ interface Sub2ApiConfig { upstreamHosts: string[]; }; }; + targets: Sub2ApiTargetConfig[]; + runtime: { + database: ExternalDatabaseConfig; + redis: RuntimeRedisConfig; + appData: { + mode: "persistent-pvc" | "empty-dir"; + }; + sentinel: { + mode: "singleton"; + enabledOnTargets: string[]; + }; + }; +} + +interface Sub2ApiTargetConfig { + id: string; + route: string; + namespace: string; + role: string; + enabled: boolean; + databaseMode: DatabaseMode; + redisMode: RedisMode; + appReplicas: number; + redisReplicas: number; +} + +interface ExternalDatabaseConfig { + mode: "external"; + sourceRef: string; + secretName: string; + passwordKey: string; + host: string; + port: number; + user: string; + dbName: string; + sslMode: string; + 
pendingAllowed: boolean; +} + +interface RuntimeRedisConfig { + serviceName: string; + persistence: boolean; } export function platformInfraHelp(): unknown { @@ -36,11 +81,11 @@ export function platformInfraHelp(): unknown { command: "platform-infra sub2api plan|apply|status|validate|codex-pool", output: "json", usage: [ - "bun scripts/cli.ts platform-infra sub2api plan", - "bun scripts/cli.ts platform-infra sub2api apply --dry-run", - "bun scripts/cli.ts platform-infra sub2api apply --confirm", - "bun scripts/cli.ts platform-infra sub2api status [--full|--raw]", - "bun scripts/cli.ts platform-infra sub2api validate [--full|--raw]", + "bun scripts/cli.ts platform-infra sub2api plan [--target G14|D601]", + "bun scripts/cli.ts platform-infra sub2api apply [--target G14|D601] --dry-run", + "bun scripts/cli.ts platform-infra sub2api apply [--target G14|D601] --confirm", + "bun scripts/cli.ts platform-infra sub2api status [--target G14|D601] [--full|--raw]", + "bun scripts/cli.ts platform-infra sub2api validate [--target G14|D601] [--full|--raw]", "bun scripts/cli.ts platform-infra sub2api codex-pool plan", "bun scripts/cli.ts platform-infra sub2api codex-pool sync --confirm", "bun scripts/cli.ts platform-infra sub2api codex-pool validate", @@ -48,9 +93,9 @@ export function platformInfraHelp(): unknown { "bun scripts/cli.ts platform-infra sub2api codex-pool sentinel-image status", "bun scripts/cli.ts platform-infra sub2api codex-pool sentinel-probe --account unidesk-codex-hy --confirm", ], - description: "Operate the G14 k3s internal-only Sub2API deployment in the shared platform-infra namespace. This entry creates no Ingress, NodePort, LoadBalancer, hostPort, hostNetwork, ResourceQuota, LimitRange, or CPU/memory resource requests/limits.", + description: "Operate YAML-controlled Sub2API platform-infra targets. 
G14 remains the active bundled deployment; D601 is a standby predeployment target with external DB pending and no local PostgreSQL.", target: { - route: g14K3sRoute, + default: defaultTargetId, namespace, service: serviceName, serviceDns: `${serviceName}.${namespace}.svc.cluster.local:8080`, @@ -74,7 +119,7 @@ export function platformInfraHelp(): unknown { export async function runPlatformInfraCommand(config: UniDeskConfig, args: string[]): Promise<Record<string, unknown> | RenderedCliResult> { const [target, action] = args; if (target !== "sub2api") return unsupported(args); - if (action === "plan" || action === undefined) return plan(); + if (action === "plan" || action === undefined) return plan(parseTargetOptions(args.slice(2))); if (action === "apply") return await apply(config, parseApplyOptions(args.slice(2))); if (action === "status") return await status(config, parseDisclosureOptions(args.slice(2))); if (action === "validate") return await validate(config, parseDisclosureOptions(args.slice(2))); @@ -86,16 +131,22 @@ export async function runPlatformInfraCommand(config: UniDeskConfig, args: strin } interface ApplyOptions { + targetId: string; dryRun: boolean; confirm: boolean; wait: boolean; } interface DisclosureOptions { + targetId: string; full: boolean; raw: boolean; } +interface TargetOptions { + targetId: string; +} + interface PolicyCheck { name: string; ok: boolean; @@ -112,9 +163,11 @@ function unsupported(args: string[]): Record<string, unknown> { } function parseApplyOptions(args: string[]): ApplyOptions { - validateOptions(args, new Set(["--dry-run", "--confirm", "--wait"])); + const target = parseTargetOptions(args); + validateOptions(args, new Set(["--dry-run", "--confirm", "--wait", "--target"])); if (args.includes("--dry-run") && args.includes("--confirm")) throw new Error("apply accepts only one of --dry-run or --confirm"); return { + targetId: target.targetId, dryRun: args.includes("--dry-run") || !args.includes("--confirm"), confirm: args.includes("--confirm"), wait: 
args.includes("--wait"), @@ -122,13 +175,37 @@ function parseApplyOptions(args: string[]): ApplyOptions { } function parseDisclosureOptions(args: string[]): DisclosureOptions { - validateOptions(args, new Set(["--full", "--raw"])); + const target = parseTargetOptions(args); + validateOptions(args, new Set(["--full", "--raw", "--target"])); const raw = args.includes("--raw"); - return { full: raw || args.includes("--full"), raw }; + return { targetId: target.targetId, full: raw || args.includes("--full"), raw }; +} + +function parseTargetOptions(args: string[]): TargetOptions { + let targetId = defaultTargetId; + for (let index = 0; index < args.length; index += 1) { + const arg = args[index]; + if (arg === "--target") { + const value = args[index + 1]; + if (value === undefined || value.startsWith("--")) throw new Error("--target requires a value"); + targetId = value; + index += 1; + } else if (arg.startsWith("--target=")) { + targetId = arg.slice("--target=".length); + } + } + if (!/^[A-Za-z0-9._-]+$/u.test(targetId)) throw new Error("--target must be a simple target id"); + return { targetId }; } function validateOptions(args: string[], booleanOptions: Set<string>): void { - for (const arg of args) { + for (let index = 0; index < args.length; index += 1) { + const arg = args[index]; + if (arg === "--target") { + index += 1; + continue; + } + if (arg.startsWith("--target=") && booleanOptions.has("--target")) continue; if (booleanOptions.has(arg)) continue; throw new Error(`unsupported option: ${arg}`); } @@ -137,6 +214,7 @@ function validateOptions(args: string[], booleanOptions: Set<string>): void { function readSub2ApiConfig(): Sub2ApiConfig { const parsed = Bun.YAML.parse(readFileSync(configPath, "utf8")) as unknown; if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) throw new Error(`${configPath} must contain a YAML object`); + const root = parsed as Record<string, unknown>; const image = (parsed as { image?: unknown }).image; if (typeof image !== "object" || image 
=== null || Array.isArray(image)) throw new Error(`${configPath}.image must be an object`); const record = image as Record<string, unknown>; @@ -152,6 +230,8 @@ function readSub2ApiConfig(): Sub2ApiConfig { const allowInsecureHttp = booleanField(urlAllowlist, "allowInsecureHttp", "security.urlAllowlist"); const allowPrivateHosts = booleanField(urlAllowlist, "allowPrivateHosts", "security.urlAllowlist"); const upstreamHosts = stringArrayField(urlAllowlist, "upstreamHosts", "security.urlAllowlist"); + const targets = parseTargets(root); + const runtime = parseRuntime(root); return { image: { repository, tag, pullPolicy }, security: { @@ -162,6 +242,119 @@ function readSub2ApiConfig(): Sub2ApiConfig { upstreamHosts, }, }, + targets, + runtime, + }; +} + +function parseTargets(root: Record<string, unknown>): Sub2ApiTargetConfig[] { + const value = root.targets; + if (value === undefined) return defaultTargets(); + if (!Array.isArray(value)) throw new Error(`${configPath}.targets must be an array`); + const targets = value.map((item, index) => { + if (typeof item !== "object" || item === null || Array.isArray(item)) throw new Error(`${configPath}.targets[${index}] must be an object`); + const record = item as Record<string, unknown>; + const path = `targets[${index}]`; + const id = stringField(record, "id", path); + const route = stringField(record, "route", path); + const targetNamespace = stringField(record, "namespace", path); + const role = stringField(record, "role", path); + const enabled = booleanField(record, "enabled", path); + const databaseMode = enumField(record, "databaseMode", path, ["bundled", "external-pending"] as const); + const redisMode = enumField(record, "redisMode", path, ["bundled-persistent", "local-ephemeral"] as const); + const appReplicas = integerField(record, "appReplicas", path); + const redisReplicas = record.redisReplicas === undefined + ? (databaseMode === "external-pending" && appReplicas === 0 ? 
0 : 1) + : integerField(record, "redisReplicas", path); + if (!/^[A-Za-z0-9._-]+$/u.test(id)) throw new Error(`${configPath}.${path}.id must be a simple target id`); + if (!/^[A-Za-z0-9:_./-]+$/u.test(route)) throw new Error(`${configPath}.${path}.route has an unsupported format`); + if (!isKubernetesName(targetNamespace)) throw new Error(`${configPath}.${path}.namespace must be a Kubernetes namespace name`); + if (appReplicas < 0 || appReplicas > 1) throw new Error(`${configPath}.${path}.appReplicas must be 0 or 1`); + if (redisReplicas < 0 || redisReplicas > 1) throw new Error(`${configPath}.${path}.redisReplicas must be 0 or 1`); + return { id, route, namespace: targetNamespace, role, enabled, databaseMode, redisMode, appReplicas, redisReplicas }; + }); + const ids = new Set(); + for (const target of targets) { + if (ids.has(target.id)) throw new Error(`${configPath}.targets contains duplicate id ${target.id}`); + ids.add(target.id); + } + if (!ids.has(defaultTargetId)) throw new Error(`${configPath}.targets must include ${defaultTargetId}`); + return targets; +} + +function defaultTargets(): Sub2ApiTargetConfig[] { + return [ + { + id: "G14", + route: "G14:k3s", + namespace, + role: "active", + enabled: true, + databaseMode: "bundled", + redisMode: "bundled-persistent", + appReplicas: 1, + redisReplicas: 1, + }, + ]; +} + +function parseRuntime(root: Record<string, unknown>): Sub2ApiConfig["runtime"] { + const value = root.runtime; + if (value === undefined) { + return { + database: { + mode: "external", + sourceRef: "platform-db/postgres-active.env", + secretName, + passwordKey: "POSTGRES_PASSWORD", + host: "pika01-postgres.pending.local", + port: 5432, + user: "sub2api", + dbName: "sub2api", + sslMode: "prefer", + pendingAllowed: true, + }, + redis: { serviceName: "sub2api-redis", persistence: false }, + appData: { mode: "persistent-pvc" }, + sentinel: { mode: "singleton", enabledOnTargets: ["G14"] }, + }; + } + if (typeof value !== "object" || value === null || 
Array.isArray(value)) throw new Error(`${configPath}.runtime must be an object`); + const runtime = value as Record<string, unknown>; + const database = objectField(runtime, "database", "runtime"); + const databaseMode = enumField(database, "mode", "runtime.database", ["external"] as const); + const sourceRef = stringField(database, "sourceRef", "runtime.database"); + const secret = stringField(database, "secretName", "runtime.database"); + const passwordKey = stringField(database, "passwordKey", "runtime.database"); + const host = stringField(database, "host", "runtime.database"); + const port = integerField(database, "port", "runtime.database"); + const user = stringField(database, "user", "runtime.database"); + const dbName = stringField(database, "dbName", "runtime.database"); + const sslMode = stringField(database, "sslMode", "runtime.database"); + const pendingAllowed = booleanField(database, "pendingAllowed", "runtime.database"); + if (!isKubernetesName(secret)) throw new Error(`${configPath}.runtime.database.secretName must be a Kubernetes Secret name`); + if (!/^[A-Z0-9_]+$/u.test(passwordKey)) throw new Error(`${configPath}.runtime.database.passwordKey must be an env key`); + if (!/^[A-Za-z0-9._-]+$/u.test(host)) throw new Error(`${configPath}.runtime.database.host has an unsupported format`); + if (!/^[A-Za-z0-9_][-A-Za-z0-9_]*$/u.test(user)) throw new Error(`${configPath}.runtime.database.user has an unsupported format`); + if (!/^[A-Za-z0-9_][-A-Za-z0-9_]*$/u.test(dbName)) throw new Error(`${configPath}.runtime.database.dbName has an unsupported format`); + if (!/^[A-Za-z0-9_-]+$/u.test(sslMode)) throw new Error(`${configPath}.runtime.database.sslMode has an unsupported format`); + if (port < 1 || port > 65535) throw new Error(`${configPath}.runtime.database.port must be a TCP port`); + if (databaseMode !== "external") throw new Error(`${configPath}.runtime.database.mode must be external`); + const redis = objectField(runtime, "redis", "runtime"); + const 
redisServiceName = stringField(redis, "serviceName", "runtime.redis"); + const redisPersistence = booleanField(redis, "persistence", "runtime.redis"); + if (!isKubernetesName(redisServiceName)) throw new Error(`${configPath}.runtime.redis.serviceName must be a Kubernetes Service name`); + const appData = objectField(runtime, "appData", "runtime"); + const appDataMode = enumField(appData, "mode", "runtime.appData", ["persistent-pvc", "empty-dir"] as const); + const sentinel = objectField(runtime, "sentinel", "runtime"); + const sentinelMode = enumField(sentinel, "mode", "runtime.sentinel", ["singleton"] as const); + const enabledOnTargets = stringArrayField(sentinel, "enabledOnTargets", "runtime.sentinel"); + if (sentinelMode !== "singleton") throw new Error(`${configPath}.runtime.sentinel.mode must be singleton`); + return { + database: { mode: "external", sourceRef, secretName: secret, passwordKey, host, port, user, dbName, sslMode, pendingAllowed }, + redis: { serviceName: redisServiceName, persistence: redisPersistence }, + appData: { mode: appDataMode }, + sentinel: { mode: "singleton", enabledOnTargets }, }; } @@ -192,21 +385,41 @@ function stringArrayField(obj: Record<string, unknown>, key: string, path: strin return value.map((item) => item.trim()); } +function enumField<T extends readonly string[]>(obj: Record<string, unknown>, key: string, path: string, values: T): T[number] { + const value = stringField(obj, key, path); + if (!(values as readonly string[]).includes(value)) throw new Error(`${configPath}.${path}.${key} must be one of ${values.join(", ")}`); + return value as T[number]; +} + +function integerField(obj: Record<string, unknown>, key: string, path: string): number { + const value = obj[key]; + if (typeof value !== "number" || !Number.isInteger(value)) throw new Error(`${configPath}.${path}.${key} must be an integer`); + return value; +} + +function isKubernetesName(value: string): boolean { + return /^[a-z0-9]([-a-z0-9]*[a-z0-9])?$/u.test(value); +} + +function resolveTarget(sub2api: Sub2ApiConfig, targetId: string): 
Sub2ApiTargetConfig { + const target = sub2api.targets.find((item) => item.id.toLowerCase() === targetId.toLowerCase()); + if (target === undefined) { + const known = sub2api.targets.map((item) => item.id).join(", "); + throw new Error(`unknown Sub2API target ${targetId}; known targets: ${known}`); + } + if (!target.enabled) throw new Error(`Sub2API target ${target.id} is disabled in ${configPath}`); + return target; +} + function imageRef(sub2api: Sub2ApiConfig): string { return `${sub2api.image.repository}:${sub2api.image.tag}`; } -function manifest(): string { - const sub2api = readSub2ApiConfig(); +function manifest(sub2api: Sub2ApiConfig, target: Sub2ApiTargetConfig): string { + if (target.databaseMode === "external-pending") return externalPendingManifest(sub2api, target); const template = readFileSync(manifestPath, "utf8"); const urlAllowlist = sub2api.security.urlAllowlist; - const configHash = createHash("sha256") - .update(JSON.stringify({ - image: sub2api.image, - security: sub2api.security, - })) - .digest("hex") - .slice(0, 16); + const configHash = configHashFor(sub2api, target); return template .replaceAll("__SUB2API_IMAGE__", imageRef(sub2api)) .replaceAll("__SUB2API_IMAGE_PULL_POLICY__", sub2api.image.pullPolicy) @@ -217,36 +430,382 @@ function manifest(): string { .replaceAll("__SUB2API_SECURITY_URL_ALLOWLIST_UPSTREAM_HOSTS__", urlAllowlist.upstreamHosts.join(",")); } -function plan(): Record { +function configHashFor(sub2api: Sub2ApiConfig, target: Sub2ApiTargetConfig): string { + return createHash("sha256") + .update(JSON.stringify({ + image: sub2api.image, + security: sub2api.security, + target, + runtime: sub2api.runtime, + })) + .digest("hex") + .slice(0, 16); +} + +function externalPendingManifest(sub2api: Sub2ApiConfig, target: Sub2ApiTargetConfig): string { + const urlAllowlist = sub2api.security.urlAllowlist; + const database = sub2api.runtime.database; + const redisService = sub2api.runtime.redis.serviceName; + const appReplicas = 
target.appReplicas; + const redisReplicas = target.redisReplicas; + return `apiVersion: v1 +kind: Namespace +metadata: + name: ${target.namespace} + labels: + app.kubernetes.io/name: platform-infra + app.kubernetes.io/managed-by: unidesk + unidesk.ai/runtime-node: ${target.id} +--- +apiVersion: networking.k8s.io/v1 +kind: NetworkPolicy +metadata: + name: allow-all + namespace: ${target.namespace} + labels: + app.kubernetes.io/name: platform-infra + app.kubernetes.io/part-of: platform-infra + app.kubernetes.io/managed-by: unidesk +spec: + podSelector: {} + policyTypes: + - Ingress + - Egress + ingress: + - {} + egress: + - {} +--- +apiVersion: v1 +kind: ConfigMap +metadata: + name: sub2api-config + namespace: ${target.namespace} + labels: + app.kubernetes.io/name: sub2api + app.kubernetes.io/part-of: platform-infra + app.kubernetes.io/managed-by: unidesk + unidesk.ai/runtime-node: ${target.id} +data: + AUTO_SETUP: "true" + SERVER_HOST: "0.0.0.0" + SERVER_PORT: "8080" + SERVER_MODE: "release" + RUN_MODE: "standard" + DATABASE_HOST: "${database.host}" + DATABASE_PORT: "${database.port}" + DATABASE_USER: "${database.user}" + DATABASE_DBNAME: "${database.dbName}" + DATABASE_SSLMODE: "${database.sslMode}" + DATABASE_MAX_OPEN_CONNS: "10" + DATABASE_MAX_IDLE_CONNS: "2" + DATABASE_CONN_MAX_LIFETIME_MINUTES: "30" + DATABASE_CONN_MAX_IDLE_TIME_MINUTES: "5" + REDIS_HOST: "${redisService}" + REDIS_PORT: "6379" + REDIS_PASSWORD: "" + REDIS_DB: "0" + REDIS_POOL_SIZE: "32" + REDIS_MIN_IDLE_CONNS: "2" + REDIS_ENABLE_TLS: "false" + ADMIN_EMAIL: "admin@sub2api.platform-infra.local" + JWT_EXPIRE_HOUR: "24" + TZ: "Asia/Shanghai" + SECURITY_URL_ALLOWLIST_ENABLED: "${urlAllowlist.enabled}" + SECURITY_URL_ALLOWLIST_ALLOW_INSECURE_HTTP: "${urlAllowlist.allowInsecureHttp}" + SECURITY_URL_ALLOWLIST_ALLOW_PRIVATE_HOSTS: "${urlAllowlist.allowPrivateHosts}" + SECURITY_URL_ALLOWLIST_UPSTREAM_HOSTS: "${urlAllowlist.upstreamHosts.join(",")}" + UPDATE_PROXY_URL: "" + 
GATEWAY_OPENAI_RESPONSE_HEADER_TIMEOUT: "0" + GATEWAY_OPENAI_HTTP2_ENABLED: "true" + GATEWAY_OPENAI_HTTP2_ALLOW_PROXY_FALLBACK_TO_HTTP1: "true" + GATEWAY_OPENAI_HTTP2_FALLBACK_ERROR_THRESHOLD: "2" + GATEWAY_OPENAI_HTTP2_FALLBACK_WINDOW_SECONDS: "60" + GATEWAY_OPENAI_HTTP2_FALLBACK_TTL_SECONDS: "600" + GATEWAY_IMAGE_STREAM_DATA_INTERVAL_TIMEOUT: "900" + GATEWAY_IMAGE_STREAM_KEEPALIVE_INTERVAL: "10" + GATEWAY_IMAGE_CONCURRENCY_ENABLED: "false" + GATEWAY_IMAGE_CONCURRENCY_MAX_CONCURRENT_REQUESTS: "0" + GATEWAY_IMAGE_CONCURRENCY_OVERFLOW_MODE: "reject" + GATEWAY_IMAGE_CONCURRENCY_WAIT_TIMEOUT_SECONDS: "30" + GATEWAY_IMAGE_CONCURRENCY_MAX_WAITING_REQUESTS: "100" +--- +apiVersion: v1 +kind: Service +metadata: + name: ${redisService} + namespace: ${target.namespace} + labels: + app.kubernetes.io/name: sub2api-redis + app.kubernetes.io/component: redis + app.kubernetes.io/part-of: platform-infra + app.kubernetes.io/managed-by: unidesk + unidesk.ai/runtime-node: ${target.id} + unidesk.ai/state-mode: ephemeral-cache +spec: + selector: + app.kubernetes.io/name: sub2api-redis + app.kubernetes.io/component: redis + ports: + - name: redis + port: 6379 + targetPort: redis +--- +apiVersion: v1 +kind: Service +metadata: + name: sub2api + namespace: ${target.namespace} + labels: + app.kubernetes.io/name: sub2api + app.kubernetes.io/component: app + app.kubernetes.io/part-of: platform-infra + app.kubernetes.io/managed-by: unidesk + unidesk.ai/runtime-node: ${target.id} +spec: + selector: + app.kubernetes.io/name: sub2api + app.kubernetes.io/component: app + ports: + - name: http + port: 8080 + targetPort: http +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: sub2api-redis + namespace: ${target.namespace} + labels: + app.kubernetes.io/name: sub2api-redis + app.kubernetes.io/component: redis + app.kubernetes.io/part-of: platform-infra + app.kubernetes.io/managed-by: unidesk + unidesk.ai/runtime-node: ${target.id} + unidesk.ai/state-mode: ephemeral-cache +spec: + replicas: 
${redisReplicas} + strategy: + type: Recreate + selector: + matchLabels: + app.kubernetes.io/name: sub2api-redis + app.kubernetes.io/component: redis + template: + metadata: + labels: + app.kubernetes.io/name: sub2api-redis + app.kubernetes.io/component: redis + app.kubernetes.io/part-of: platform-infra + spec: + securityContext: + fsGroup: 999 + containers: + - name: redis + image: redis:8-alpine + imagePullPolicy: IfNotPresent + command: + - sh + - -c + args: + - redis-server --save "" --appendonly no + ports: + - name: redis + containerPort: 6379 + env: + - name: TZ + value: Asia/Shanghai + readinessProbe: + exec: + command: + - redis-cli + - ping + initialDelaySeconds: 5 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 6 + livenessProbe: + exec: + command: + - redis-cli + - ping + initialDelaySeconds: 30 + periodSeconds: 20 + timeoutSeconds: 5 + failureThreshold: 6 + volumeMounts: + - name: redis-data + mountPath: /data + volumes: + - name: redis-data + emptyDir: {} +--- +apiVersion: apps/v1 +kind: Deployment +metadata: + name: sub2api + namespace: ${target.namespace} + labels: + app.kubernetes.io/name: sub2api + app.kubernetes.io/component: app + app.kubernetes.io/part-of: platform-infra + app.kubernetes.io/managed-by: unidesk + unidesk.ai/runtime-node: ${target.id} + unidesk.ai/database-status: pending-external-db +spec: + replicas: ${appReplicas} + strategy: + type: Recreate + selector: + matchLabels: + app.kubernetes.io/name: sub2api + app.kubernetes.io/component: app + template: + metadata: + annotations: + unidesk.ai/sub2api-config-hash: "${configHashFor(sub2api, target)}" + unidesk.ai/database-source-ref: "${database.sourceRef}" + unidesk.ai/database-status: "pending-external-db" + labels: + app.kubernetes.io/name: sub2api + app.kubernetes.io/component: app + app.kubernetes.io/part-of: platform-infra + spec: + securityContext: + fsGroup: 1000 + initContainers: + - name: wait-postgres + image: postgres:18-alpine + imagePullPolicy: IfNotPresent 
+ command: + - sh + - -c + - until pg_isready -h ${database.host} -p ${database.port} -U ${database.user} -d ${database.dbName}; do sleep 2; done + - name: wait-redis + image: redis:8-alpine + imagePullPolicy: IfNotPresent + command: + - sh + - -c + - until redis-cli -h ${redisService} ping | grep -q PONG; do sleep 2; done + containers: + - name: sub2api + image: ${imageRef(sub2api)} + imagePullPolicy: ${sub2api.image.pullPolicy} + ports: + - name: http + containerPort: 8080 + envFrom: + - configMapRef: + name: sub2api-config + env: + - name: DATABASE_PASSWORD + valueFrom: + secretKeyRef: + name: ${database.secretName} + key: ${database.passwordKey} + - name: ADMIN_PASSWORD + valueFrom: + secretKeyRef: + name: ${secretName} + key: ADMIN_PASSWORD + - name: JWT_SECRET + valueFrom: + secretKeyRef: + name: ${secretName} + key: JWT_SECRET + - name: TOTP_ENCRYPTION_KEY + valueFrom: + secretKeyRef: + name: ${secretName} + key: TOTP_ENCRYPTION_KEY + readinessProbe: + httpGet: + path: /health + port: http + initialDelaySeconds: 10 + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 6 + livenessProbe: + httpGet: + path: /health + port: http + initialDelaySeconds: 30 + periodSeconds: 20 + timeoutSeconds: 5 + failureThreshold: 6 + startupProbe: + httpGet: + path: /health + port: http + periodSeconds: 10 + timeoutSeconds: 5 + failureThreshold: 30 + volumeMounts: + - name: sub2api-data + mountPath: /app/data + volumes: + - name: sub2api-data + emptyDir: {} +`; +} + +function plan(options: TargetOptions): Record { const sub2api = readSub2ApiConfig(); - const yaml = manifest(); - const policy = policyChecks(yaml); + const target = resolveTarget(sub2api, options.targetId); + const yaml = manifest(sub2api, target); + const policy = policyChecks(yaml, target); return { ok: policy.every((check) => check.ok), action: "platform-infra-sub2api-plan", target: { - route: g14K3sRoute, - namespace, + id: target.id, + route: target.route, + namespace: target.namespace, + role: 
target.role, manifestPath, configPath, fieldManager, - serviceDns: `${serviceName}.${namespace}.svc.cluster.local:8080`, + serviceDns: `${serviceName}.${target.namespace}.svc.cluster.local:8080`, }, config: { image: imageRef(sub2api), pullPolicy: sub2api.image.pullPolicy, security: sub2api.security, + target: { + databaseMode: target.databaseMode, + redisMode: target.redisMode, + appReplicas: target.appReplicas, + redisReplicas: target.redisReplicas, + }, + externalDatabase: target.databaseMode === "external-pending" ? { + status: "pending-external-db", + sourceRef: sub2api.runtime.database.sourceRef, + secretName: sub2api.runtime.database.secretName, + passwordKey: sub2api.runtime.database.passwordKey, + host: sub2api.runtime.database.host, + port: sub2api.runtime.database.port, + user: sub2api.runtime.database.user, + dbName: sub2api.runtime.database.dbName, + sslMode: sub2api.runtime.database.sslMode, + pendingAllowed: sub2api.runtime.database.pendingAllowed, + } : null, }, decision: { owner: "UniDesk", - namespace, - reason: "Sub2API is an internal shared platform utility for G14 k3s workloads, so it belongs with platform infrastructure rather than a user workload namespace.", + namespace: target.namespace, + reason: target.id === "G14" + ? "Sub2API remains the active G14 platform-infra deployment." 
+ : "D601 is a standby Sub2API platform-infra target prepared through YAML; it does not run until the external Pika01/PK01 database secret is ready.", exposure: "ClusterIP only; no public ingress or node-level exposure.", resourcePolicy: "No Kubernetes CPU/memory requests or limits, matching issue #220.", imageVersionControl: "Sub2API image repository/tag/pullPolicy are controlled by config/platform-infra/sub2api.yaml in the UniDesk repository.", urlAllowlistControl: "Sub2API upstream URL validation options are controlled by config/platform-infra/sub2api.yaml and rendered to SECURITY_URL_ALLOWLIST_* env vars.", networkPolicy: "NetworkPolicy/allow-all is rendered with the deployment so kube-router cannot silently default-deny Sub2API cross-pod traffic.", - dataStores: ["PostgreSQL 18", "Redis 8"], + dataStores: target.databaseMode === "external-pending" + ? ["External PostgreSQL pending from platform-db/Pika01", target.redisReplicas === 0 ? "D601 local Redis 8 ephemeral cache, scaled to zero until activation" : "D601 local Redis 8 ephemeral cache"] + : ["PostgreSQL 18", "Redis 8"], appPoolCaps: { databaseMaxOpenConns: 10, databaseMaxIdleConns: 2, @@ -256,78 +815,104 @@ function plan(): Record { }, policy, next: { - dryRun: "bun scripts/cli.ts platform-infra sub2api apply --dry-run", - apply: "bun scripts/cli.ts platform-infra sub2api apply --confirm", - status: "bun scripts/cli.ts platform-infra sub2api status", - validate: "bun scripts/cli.ts platform-infra sub2api validate", + dryRun: `bun scripts/cli.ts platform-infra sub2api apply --target ${target.id} --dry-run`, + apply: target.databaseMode === "external-pending" && target.appReplicas === 0 + ? 
`bun scripts/cli.ts platform-infra sub2api apply --target ${target.id} --confirm # predeploy only; app replicas=0 until DB is ready` + : `bun scripts/cli.ts platform-infra sub2api apply --target ${target.id} --confirm`, + status: `bun scripts/cli.ts platform-infra sub2api status --target ${target.id}`, + validate: `bun scripts/cli.ts platform-infra sub2api validate --target ${target.id}`, }, }; } async function apply(config: UniDeskConfig, options: ApplyOptions): Promise> { - const yaml = manifest(); - const policy = policyChecks(yaml); + const sub2api = readSub2ApiConfig(); + const target = resolveTarget(sub2api, options.targetId); + const yaml = manifest(sub2api, target); + const policy = policyChecks(yaml, target); if (!policy.every((check) => check.ok)) { return { ok: false, action: "platform-infra-sub2api-apply", mode: "policy-blocked", + target: target.id, policy, }; } if (options.confirm && !options.wait) { const job = startJob( - "platform_infra_sub2api_apply", - ["bun", "scripts/cli.ts", "platform-infra", "sub2api", "apply", "--confirm", "--wait"], - "Apply G14 k3s platform-infra Sub2API manifests through the controlled UniDesk CLI", + `platform_infra_sub2api_apply_${target.id.toLowerCase()}`, + ["bun", "scripts/cli.ts", "platform-infra", "sub2api", "apply", "--target", target.id, "--confirm", "--wait"], + `Apply ${target.id} k3s platform-infra Sub2API manifests through the controlled UniDesk CLI`, ); return { ok: true, action: "platform-infra-sub2api-apply", mode: "async-job", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, job, statusCommand: `bun scripts/cli.ts job status ${job.id} --tail-bytes 12000`, next: { status: `bun scripts/cli.ts job status ${job.id} --tail-bytes 12000`, - rollout: "bun scripts/cli.ts platform-infra sub2api status", - validate: "bun scripts/cli.ts platform-infra sub2api validate", + rollout: `bun scripts/cli.ts platform-infra sub2api status --target ${target.id}`, + validate: `bun 
scripts/cli.ts platform-infra sub2api validate --target ${target.id}`, }, }; } if (options.dryRun) { - const result = await capture(config, g14K3sRoute, ["script"], dryRunScript(yaml)); + const result = await capture(config, target.route, ["script"], dryRunScript(yaml, target)); const parsed = parseJsonOutput(result.stdout); return { ok: result.exitCode === 0 && boolField(parsed, "ok", false), action: "platform-infra-sub2api-apply", mode: "dry-run", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, policy, remote: parsed ?? compactCapture(result, { full: true }), }; } - const result = await capture(config, g14K3sRoute, ["script"], applyScript(yaml)); + const result = await capture(config, target.route, ["script"], applyScript(yaml, target)); const parsed = parseJsonOutput(result.stdout); return { ok: result.exitCode === 0 && boolField(parsed, "ok", false), action: "platform-infra-sub2api-apply", mode: "confirmed", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, policy, remote: parsed ?? 
compactCapture(result, { full: true }), next: { - status: "bun scripts/cli.ts platform-infra sub2api status", - validate: "bun scripts/cli.ts platform-infra sub2api validate", + status: `bun scripts/cli.ts platform-infra sub2api status --target ${target.id}`, + validate: `bun scripts/cli.ts platform-infra sub2api validate --target ${target.id}`, }, }; } async function status(config: UniDeskConfig, options: DisclosureOptions): Promise> { const sub2api = readSub2ApiConfig(); - const result = await capture(config, g14K3sRoute, ["script"], statusScript(sub2api)); + const target = resolveTarget(sub2api, options.targetId); + const result = await capture(config, target.route, ["script"], statusScript(sub2api, target)); const parsed = parseJsonOutput(result.stdout); if (options.raw) { return { ok: result.exitCode === 0 && boolField(parsed, "ok", false), action: "platform-infra-sub2api-status", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, remote: compactCapture(result, { full: true }), parsed, }; @@ -335,18 +920,33 @@ async function status(config: UniDeskConfig, options: DisclosureOptions): Promis return { ok: result.exitCode === 0 && boolField(parsed, "ok", false), action: "platform-infra-sub2api-status", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, summary: parsed, remote: compactCapture(result, { full: options.full || result.exitCode !== 0 }), }; } async function validate(config: UniDeskConfig, options: DisclosureOptions): Promise> { - const result = await capture(config, g14K3sRoute, ["script"], validateScript()); + const sub2api = readSub2ApiConfig(); + const target = resolveTarget(sub2api, options.targetId); + const script = target.databaseMode === "external-pending" + ? 
validateExternalPendingScript(sub2api, target) + : validateScript(target); + const result = await capture(config, target.route, ["script"], script); const parsed = parseJsonOutput(result.stdout); if (options.raw) { return { ok: result.exitCode === 0 && boolField(parsed, "ok", false), action: "platform-infra-sub2api-validate", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, remote: compactCapture(result, { full: true }), parsed, }; @@ -354,13 +954,18 @@ async function validate(config: UniDeskConfig, options: DisclosureOptions): Prom return { ok: result.exitCode === 0 && boolField(parsed, "ok", false), action: "platform-infra-sub2api-validate", + target: { + id: target.id, + route: target.route, + namespace: target.namespace, + }, summary: parsed, remote: compactCapture(result, { full: options.full || result.exitCode !== 0 }), }; } -function policyChecks(yaml: string): PolicyCheck[] { - return [ +function policyChecks(yaml: string, target: Sub2ApiTargetConfig): PolicyCheck[] { + const checks: PolicyCheck[] = [ { name: "no-ingress", ok: !/^\s*kind:\s*Ingress\s*$/mu.test(yaml), @@ -393,22 +998,53 @@ function policyChecks(yaml: string): PolicyCheck[] { }, { name: "expected-namespace", - ok: new RegExp(`^\\s*name:\\s*${namespace}\\s*$`, "mu").test(yaml), - detail: `Manifest declares namespace ${namespace}.`, + ok: new RegExp(`^\\s*name:\\s*${escapeRegExp(target.namespace)}\\s*$`, "mu").test(yaml), + detail: `Manifest declares namespace ${target.namespace}.`, }, { name: "allow-all-network-policy", - ok: hasAllowAllNetworkPolicy(yaml), - detail: `Manifest must include NetworkPolicy/allow-all in ${namespace} to keep kube-router from blocking Sub2API cross-pod traffic.`, + ok: hasAllowAllNetworkPolicy(yaml, target.namespace), + detail: `Manifest must include NetworkPolicy/allow-all in ${target.namespace} to keep kube-router from blocking Sub2API cross-pod traffic.`, }, ]; + + if (target.databaseMode === "external-pending") { + 
checks.push( + { + name: "external-db-no-local-postgres", + ok: !/^\s*kind:\s*StatefulSet\s*$/mu.test(yaml) && !/\bsub2api-postgres\b/u.test(yaml), + detail: "External-pending targets must not deploy a local PostgreSQL StatefulSet, Service, or PVC.", + }, + { + name: "pending-db-app-scaled-to-zero", + ok: target.appReplicas === 0 && target.redisReplicas === 0 && hasDeploymentReplicas(yaml, serviceName, 0) && hasDeploymentReplicas(yaml, "sub2api-redis", 0), + detail: "External-pending predeployment keeps the Sub2API app and local Redis cache at replicas=0 until the external DB secret, endpoint, and runtime images are ready.", + }, + ); + } else { + checks.push({ + name: "bundled-db-present", + ok: /^\s*kind:\s*StatefulSet\s*$/mu.test(yaml) && /\bsub2api-postgres\b/u.test(yaml), + detail: "Bundled active targets render the local PostgreSQL StatefulSet.", + }); + } + + if (target.redisMode === "local-ephemeral") { + checks.push({ + name: "local-redis-ephemeral", + ok: !/\bsub2api-redis-data\b/u.test(yaml) && /^\s*emptyDir:\s*\{\}\s*$/mu.test(yaml), + detail: "D601 standby Redis is a local ephemeral cache and must not allocate persistent Redis storage.", + }); + } + + return checks; } -function hasAllowAllNetworkPolicy(yaml: string): boolean { +function hasAllowAllNetworkPolicy(yaml: string, namespaceName: string): boolean { return yaml.split(/^---\s*$/mu).some((document) => { return /^\s*kind:\s*NetworkPolicy\s*$/mu.test(document) && /^\s*name:\s*allow-all\s*$/mu.test(document) - && new RegExp(`^\\s*namespace:\\s*${namespace}\\s*$`, "mu").test(document) + && new RegExp(`^\\s*namespace:\\s*${escapeRegExp(namespaceName)}\\s*$`, "mu").test(document) && /^\s*podSelector:\s*\{\}\s*$/mu.test(document) && /^\s*-\s*Ingress\s*$/mu.test(document) && /^\s*-\s*Egress\s*$/mu.test(document) @@ -417,7 +1053,19 @@ function hasAllowAllNetworkPolicy(yaml: string): boolean { }); } -function dryRunScript(yaml: string): string { +function hasDeploymentReplicas(yaml: string, name: string, 
replicas: number): boolean { + return yaml.split(/^---\s*$/mu).some((document) => { + return /^\s*kind:\s*Deployment\s*$/mu.test(document) + && new RegExp(`^\\s*name:\\s*${escapeRegExp(name)}\\s*$`, "mu").test(document) + && new RegExp(`^\\s*replicas:\\s*${replicas}\\s*$`, "mu").test(document); + }); +} + +function escapeRegExp(value: string): string { + return value.replace(/[.*+?^${}()|[\]\\]/gu, "\\$&"); +} + +function dryRunScript(yaml: string, target: Sub2ApiTargetConfig): string { const encoded = Buffer.from(yaml, "utf8").toString("base64"); return ` set -u @@ -431,7 +1079,7 @@ server_out="$tmp/server.out" server_err="$tmp/server.err" kubectl apply --dry-run=client -f "$manifest" >"$client_out" 2>"$client_err" client_rc=$? -if kubectl get namespace ${namespace} >/dev/null 2>&1; then +if kubectl get namespace ${target.namespace} >/dev/null 2>&1; then namespace_exists=true kubectl apply --server-side --dry-run=server --field-manager=${fieldManager} -f "$manifest" >"$server_out" 2>"$server_err" server_rc=$? 
@@ -455,7 +1103,8 @@ def text(path): return "" payload = { "ok": client_rc == 0 and server_rc == 0, - "namespace": "${namespace}", + "target": "${target.id}", + "namespace": "${target.namespace}", "namespaceExistsBeforeDryRun": namespace_exists, "clientDryRun": { "exitCode": client_rc, @@ -475,8 +1124,9 @@ PY `; } -function applyScript(yaml: string): string { +function applyScript(yaml: string, target: Sub2ApiTargetConfig): string { const encoded = Buffer.from(yaml, "utf8").toString("base64"); + const managesSecret = target.databaseMode !== "external-pending"; return ` set -u tmp="$(mktemp -d)" @@ -489,12 +1139,16 @@ secret_out="$tmp/secret.out" secret_err="$tmp/secret.err" apply_out="$tmp/apply.out" apply_err="$tmp/apply.err" -kubectl create namespace ${namespace} --dry-run=client -o yaml | kubectl apply --server-side --force-conflicts --field-manager=${fieldManager} -f - >"$ns_out" 2>"$ns_err" +kubectl create namespace ${target.namespace} --dry-run=client -o yaml | kubectl apply --server-side --force-conflicts --field-manager=${fieldManager} -f - >"$ns_out" 2>"$ns_err" ns_rc=$? secret_action="unknown" secret_rc=0 if [ "$ns_rc" -eq 0 ]; then - if kubectl -n ${namespace} get secret ${secretName} >/dev/null 2>&1; then + if [ "${managesSecret ? 
"true" : "false"}" != "true" ]; then + secret_action="external-pending-not-managed" + : >"$secret_out" + printf '%s\\n' 'external DB target expects its DB credential Secret from the platform DB handoff; predeploy does not create placeholder secrets' >"$secret_err" + elif kubectl -n ${target.namespace} get secret ${secretName} >/dev/null 2>&1; then secret_action="kept-existing" : >"$secret_out" : >"$secret_err" @@ -507,7 +1161,7 @@ if [ "$ns_rc" -eq 0 ]; then dd if=/dev/urandom bs="$bytes" count=1 2>/dev/null | od -An -tx1 | tr -d ' \\n' fi } - kubectl -n ${namespace} create secret generic ${secretName} \\ + kubectl -n ${target.namespace} create secret generic ${secretName} \\ --from-literal=POSTGRES_PASSWORD="$(rand_hex 32)" \\ --from-literal=ADMIN_PASSWORD="$(rand_hex 16)" \\ --from-literal=JWT_SECRET="$(rand_hex 32)" \\ @@ -540,11 +1194,16 @@ def text(path): return "" payload = { "ok": ns_rc == 0 and secret_rc == 0 and apply_rc == 0, - "namespace": "${namespace}", + "target": "${target.id}", + "namespace": "${target.namespace}", + "databaseMode": "${target.databaseMode}", + "appReplicas": ${target.appReplicas}, + "redisReplicas": ${target.redisReplicas}, "secret": { "name": "${secretName}", "action": secret_action, "requiredKeys": ${JSON.stringify(requiredSecretKeys)}, + "managedByThisApply": ${managesSecret ? "True" : "False"}, "valuesPrinted": False, }, "steps": { @@ -559,9 +1218,10 @@ PY `; } -function statusScript(sub2api: Sub2ApiConfig): string { +function statusScript(sub2api: Sub2ApiConfig, target: Sub2ApiTargetConfig): string { const expectedImage = imageRef(sub2api); const expectedUrlAllowlist = sub2api.security.urlAllowlist; + const externalPending = target.databaseMode === "external-pending"; return ` set -u tmp="$(mktemp -d)" @@ -573,22 +1233,22 @@ capture_json() { rc=$? 
printf '%s' "$rc" >"$tmp/$name.rc" } -capture_json ns kubectl get namespace ${namespace} -capture_json deployments kubectl -n ${namespace} get deployments -l app.kubernetes.io/part-of=platform-infra -capture_json statefulsets kubectl -n ${namespace} get statefulsets -l app.kubernetes.io/part-of=platform-infra -capture_json pods kubectl -n ${namespace} get pods -l app.kubernetes.io/part-of=platform-infra -capture_json services kubectl -n ${namespace} get services -l app.kubernetes.io/part-of=platform-infra -capture_json pvc kubectl -n ${namespace} get pvc -l app.kubernetes.io/part-of=platform-infra -capture_json secrets kubectl -n ${namespace} get secret ${secretName} -capture_json configmap kubectl -n ${namespace} get configmap sub2api-config -capture_json networkpolicies kubectl -n ${namespace} get networkpolicy -capture_json ingresses kubectl -n ${namespace} get ingress -capture_json quotas kubectl -n ${namespace} get resourcequota -capture_json limitranges kubectl -n ${namespace} get limitrange -pod_name="$(kubectl -n ${namespace} get pod -l app.kubernetes.io/name=${serviceName},app.kubernetes.io/component=app -o jsonpath='{.items[0].metadata.name}' 2>"$tmp/pod-name.err" || true)" +capture_json ns kubectl get namespace ${target.namespace} +capture_json deployments kubectl -n ${target.namespace} get deployments -l app.kubernetes.io/part-of=platform-infra +capture_json statefulsets kubectl -n ${target.namespace} get statefulsets -l app.kubernetes.io/part-of=platform-infra +capture_json pods kubectl -n ${target.namespace} get pods -l app.kubernetes.io/part-of=platform-infra +capture_json services kubectl -n ${target.namespace} get services -l app.kubernetes.io/part-of=platform-infra +capture_json pvc kubectl -n ${target.namespace} get pvc -l app.kubernetes.io/part-of=platform-infra +capture_json secrets kubectl -n ${target.namespace} get secret ${secretName} +capture_json configmap kubectl -n ${target.namespace} get configmap sub2api-config +capture_json 
networkpolicies kubectl -n ${target.namespace} get networkpolicy +capture_json ingresses kubectl -n ${target.namespace} get ingress +capture_json quotas kubectl -n ${target.namespace} get resourcequota +capture_json limitranges kubectl -n ${target.namespace} get limitrange +pod_name="$(kubectl -n ${target.namespace} get pod -l app.kubernetes.io/name=${serviceName},app.kubernetes.io/component=app -o jsonpath='{.items[0].metadata.name}' 2>"$tmp/pod-name.err" || true)" printf '%s' "$pod_name" >"$tmp/pod-name.txt" if [ -n "$pod_name" ]; then - kubectl -n ${namespace} exec "$pod_name" -- sh -c 'printf "SECURITY_URL_ALLOWLIST_ENABLED=%s\\n" "$SECURITY_URL_ALLOWLIST_ENABLED"; printf "SECURITY_URL_ALLOWLIST_ALLOW_INSECURE_HTTP=%s\\n" "$SECURITY_URL_ALLOWLIST_ALLOW_INSECURE_HTTP"; printf "SECURITY_URL_ALLOWLIST_ALLOW_PRIVATE_HOSTS=%s\\n" "$SECURITY_URL_ALLOWLIST_ALLOW_PRIVATE_HOSTS"; printf "SECURITY_URL_ALLOWLIST_UPSTREAM_HOSTS=%s\\n" "$SECURITY_URL_ALLOWLIST_UPSTREAM_HOSTS"' >"$tmp/pod-env.out" 2>"$tmp/pod-env.err" + kubectl -n ${target.namespace} exec "$pod_name" -- sh -c 'printf "SECURITY_URL_ALLOWLIST_ENABLED=%s\\n" "$SECURITY_URL_ALLOWLIST_ENABLED"; printf "SECURITY_URL_ALLOWLIST_ALLOW_INSECURE_HTTP=%s\\n" "$SECURITY_URL_ALLOWLIST_ALLOW_INSECURE_HTTP"; printf "SECURITY_URL_ALLOWLIST_ALLOW_PRIVATE_HOSTS=%s\\n" "$SECURITY_URL_ALLOWLIST_ALLOW_PRIVATE_HOSTS"; printf "SECURITY_URL_ALLOWLIST_UPSTREAM_HOSTS=%s\\n" "$SECURITY_URL_ALLOWLIST_UPSTREAM_HOSTS"' >"$tmp/pod-env.out" 2>"$tmp/pod-env.err" printf '%s' "$?" >"$tmp/pod-env.rc" else : >"$tmp/pod-env.out" @@ -783,6 +1443,9 @@ configmap = load("configmap") configmap_data = (configmap or {}).get("data") or {} secret_keys = sorted(((secret or {}).get("data") or {}).keys()) missing_secret_keys = [key for key in ${JSON.stringify(requiredSecretKeys)} if key not in secret_keys] +external_pending = ${externalPending ? 
"True" : "False"} +expected_app_replicas = ${target.appReplicas} +expected_redis_replicas = ${target.redisReplicas} service_violations = [] for svc in services: spec = svc.get("spec") or {} @@ -795,6 +1458,9 @@ resource_violations = resource_findings("Deployment", deployments) + resource_fi expected_image = "${expectedImage}" expected_url_allowlist = json.loads(${JSON.stringify(JSON.stringify(expectedUrlAllowlist))}) sub2api_deployment = next((deployment_summary(item) for item in deployments if item["metadata"]["name"] == "${serviceName}"), None) +redis_deployment = next((deployment_summary(item) for item in deployments if item["metadata"]["name"] == "sub2api-redis"), None) +sub2api_desired_aligned = sub2api_deployment is not None and sub2api_deployment.get("desired") == expected_app_replicas +redis_desired_aligned = redis_deployment is not None and redis_deployment.get("desired") == expected_redis_replicas image_aligned = sub2api_deployment is not None and expected_image in sub2api_deployment.get("images", []) url_allowlist_runtime = { "enabled": configmap_data.get("SECURITY_URL_ALLOWLIST_ENABLED"), @@ -822,7 +1488,7 @@ expected_url_allowlist_strings = { } url_allowlist_configmap_aligned = url_allowlist_runtime == expected_url_allowlist_strings url_allowlist_pod_env_aligned = rc("pod-env") == 0 and url_allowlist_pod_env == expected_url_allowlist_strings -url_allowlist_aligned = url_allowlist_configmap_aligned and url_allowlist_pod_env_aligned +url_allowlist_aligned = url_allowlist_configmap_aligned and (url_allowlist_pod_env_aligned or (external_pending and expected_app_replicas == 0)) allow_all_network_policy = next((item for item in networkpolicies if item.get("metadata", {}).get("name") == "allow-all"), None) network_policy = { "requiredName": "allow-all", @@ -838,13 +1504,35 @@ boundary = { "limitRangeCount": len(items("limitranges")), "resourceViolations": resource_violations, } -workload_ready = all(d["ready"] for d in map(deployment_summary, deployments)) 
and all(s["ready"] for s in map(statefulset_summary, statefulsets)) +deployment_summaries = [deployment_summary(item) for item in deployments] +statefulset_summaries = [statefulset_summary(item) for item in statefulsets] +workload_ready = all(d["ready"] for d in deployment_summaries) and all(s["ready"] for s in statefulset_summaries) +local_postgres_present = any(item.get("metadata", {}).get("name") == "sub2api-postgres" for item in statefulsets + services + pvcs) +redis_pvc_present = any(item.get("metadata", {}).get("name") == "sub2api-redis-data" for item in pvcs) +secret_ready = len(missing_secret_keys) == 0 +secret_ok = secret_ready or external_pending +state_model_ok = (not external_pending) or ( + not local_postgres_present + and not redis_pvc_present + and sub2api_desired_aligned + and redis_desired_aligned + and expected_app_replicas == 0 + and expected_redis_replicas == 0 +) +status_label = "pending-external-db" if external_pending else "active" payload = { - "ok": rc("ns") == 0 and workload_ready and image_aligned and url_allowlist_aligned and network_policy["ok"] and boundary["internalOnly"] and len(resource_violations) == 0 and boundary["resourceQuotaCount"] == 0 and boundary["limitRangeCount"] == 0 and len(missing_secret_keys) == 0, - "namespace": "${namespace}", + "ok": rc("ns") == 0 and workload_ready and image_aligned and url_allowlist_aligned and network_policy["ok"] and boundary["internalOnly"] and len(resource_violations) == 0 and boundary["resourceQuotaCount"] == 0 and boundary["limitRangeCount"] == 0 and secret_ok and state_model_ok, + "target": "${target.id}", + "route": "${target.route}", + "namespace": "${target.namespace}", + "status": status_label, + "databaseMode": "${target.databaseMode}", + "redisMode": "${target.redisMode}", + "expectedAppReplicas": expected_app_replicas, + "expectedRedisReplicas": expected_redis_replicas, "namespaceExists": rc("ns") == 0, - "deployments": [deployment_summary(item) for item in deployments], - 
"statefulsets": [statefulset_summary(item) for item in statefulsets], + "deployments": deployment_summaries, + "statefulsets": statefulset_summaries, "pods": [pod_summary(item) for item in pods], "services": [service_summary(item) for item in services], "pvcs": [pvc_summary(item) for item in pvcs], @@ -854,8 +1542,17 @@ payload = { "exists": rc("secrets") == 0, "requiredKeys": ${JSON.stringify(requiredSecretKeys)}, "missingKeys": missing_secret_keys, + "ready": secret_ready, + "requiredForPredeployOk": not external_pending, "valuesPrinted": False, }, + "stateModel": { + "localPostgresPresent": local_postgres_present, + "redisPvcPresent": redis_pvc_present, + "sub2apiDesiredReplicasAligned": sub2api_desired_aligned, + "redisDesiredReplicasAligned": redis_desired_aligned, + "ok": state_model_ok, + }, "imageControl": { "desiredImage": expected_image, "configPath": "config/platform-infra/sub2api.yaml", @@ -871,6 +1568,7 @@ payload = { "aligned": url_allowlist_aligned, "configMapAligned": url_allowlist_configmap_aligned, "podEnvAligned": url_allowlist_pod_env_aligned, + "podEnvRequired": not (external_pending and expected_app_replicas == 0), "configMapExists": rc("configmap") == 0, "podEnvProbe": { "podName": text("pod-name.txt"), @@ -880,10 +1578,10 @@ payload = { } }, "boundary": boundary, - "serviceDns": "${serviceName}.${namespace}.svc.cluster.local:8080", + "serviceDns": "${serviceName}.${target.namespace}.svc.cluster.local:8080", "next": { - "apply": "bun scripts/cli.ts platform-infra sub2api apply --confirm", - "validate": "bun scripts/cli.ts platform-infra sub2api validate", + "apply": "bun scripts/cli.ts platform-infra sub2api apply --target ${target.id} --confirm", + "validate": "bun scripts/cli.ts platform-infra sub2api validate --target ${target.id}", }, } print(json.dumps(payload, ensure_ascii=False, indent=2)) @@ -892,7 +1590,7 @@ PY `; } -function validateScript(): string { +function validateScript(target: Sub2ApiTargetConfig): string { return ` set -u 
tmp="$(mktemp -d)" @@ -900,27 +1598,27 @@ probe_suffix="$(date +%s)-$$" pg_probe="unidesk-sub2api-netcheck-pg-$probe_suffix" redis_probe="unidesk-sub2api-netcheck-redis-$probe_suffix" cleanup() { - kubectl -n ${namespace} delete pod "$pg_probe" "$redis_probe" --ignore-not-found=true --wait=false >/dev/null 2>&1 || true + kubectl -n ${target.namespace} delete pod "$pg_probe" "$redis_probe" --ignore-not-found=true --wait=false >/dev/null 2>&1 || true rm -rf "$tmp" } trap cleanup EXIT -kubectl get --raw /api/v1/namespaces/${namespace}/services/${serviceName}:8080/proxy/health >"$tmp/health.body" 2>"$tmp/health.err" +kubectl get --raw /api/v1/namespaces/${target.namespace}/services/${serviceName}:8080/proxy/health >"$tmp/health.body" 2>"$tmp/health.err" health_rc=$? -kubectl get --raw /api/v1/namespaces/${namespace}/services/${serviceName}:8080/proxy/ >"$tmp/root.body" 2>"$tmp/root.err" +kubectl get --raw /api/v1/namespaces/${target.namespace}/services/${serviceName}:8080/proxy/ >"$tmp/root.body" 2>"$tmp/root.err" root_rc=$? -kubectl -n ${namespace} get networkpolicy allow-all -o json >"$tmp/network-policy.json" 2>"$tmp/network-policy.err" +kubectl -n ${target.namespace} get networkpolicy allow-all -o json >"$tmp/network-policy.json" 2>"$tmp/network-policy.err" network_policy_rc=$? 
-pg_pod="$(kubectl -n ${namespace} get pod -l app.kubernetes.io/name=sub2api-postgres -o jsonpath='{.items[0].metadata.name}' 2>"$tmp/pg-pod.err")"
-redis_pod="$(kubectl -n ${namespace} get pod -l app.kubernetes.io/name=sub2api-redis -o jsonpath='{.items[0].metadata.name}' 2>"$tmp/redis-pod.err")"
+pg_pod="$(kubectl -n ${target.namespace} get pod -l app.kubernetes.io/name=sub2api-postgres -o jsonpath='{.items[0].metadata.name}' 2>"$tmp/pg-pod.err")"
+redis_pod="$(kubectl -n ${target.namespace} get pod -l app.kubernetes.io/name=sub2api-redis -o jsonpath='{.items[0].metadata.name}' 2>"$tmp/redis-pod.err")"
 if [ -n "$pg_pod" ]; then
-  kubectl -n ${namespace} exec "$pg_pod" -- pg_isready -U sub2api -d sub2api -h 127.0.0.1 >"$tmp/pg.out" 2>"$tmp/pg.err"
+  kubectl -n ${target.namespace} exec "$pg_pod" -- pg_isready -U sub2api -d sub2api -h 127.0.0.1 >"$tmp/pg.out" 2>"$tmp/pg.err"
   pg_rc=$?
 else
   pg_rc=1
   printf '%s\\n' 'sub2api postgres pod not found' >"$tmp/pg.err"
 fi
 if [ -n "$redis_pod" ]; then
-  kubectl -n ${namespace} exec "$redis_pod" -- redis-cli ping >"$tmp/redis.out" 2>"$tmp/redis.err"
+  kubectl -n ${target.namespace} exec "$redis_pod" -- redis-cli ping >"$tmp/redis.out" 2>"$tmp/redis.err"
   redis_rc=$?
 else
   redis_rc=1
@@ -934,9 +1632,9 @@ if ! command -v timeout >/dev/null 2>&1; then
   pg_cross_rc=127
   redis_cross_rc=127
 else
-  timeout 35s kubectl -n ${namespace} run "$pg_probe" --restart=Never --rm -i --image=postgres:18-alpine --image-pull-policy=IfNotPresent --command -- pg_isready -h sub2api-postgres -U sub2api -d sub2api -t 5 >"$tmp/pg-cross.out" 2>"$tmp/pg-cross.err"
+  timeout 35s kubectl -n ${target.namespace} run "$pg_probe" --restart=Never --rm -i --image=postgres:18-alpine --image-pull-policy=IfNotPresent --command -- pg_isready -h sub2api-postgres -U sub2api -d sub2api -t 5 >"$tmp/pg-cross.out" 2>"$tmp/pg-cross.err"
   pg_cross_rc=$?
-  timeout 35s kubectl -n ${namespace} run "$redis_probe" --restart=Never --rm -i --image=redis:8-alpine --image-pull-policy=IfNotPresent --command -- redis-cli -h sub2api-redis -p 6379 ping >"$tmp/redis-cross.out" 2>"$tmp/redis-cross.err"
+  timeout 35s kubectl -n ${target.namespace} run "$redis_probe" --restart=Never --rm -i --image=redis:8-alpine --image-pull-policy=IfNotPresent --command -- redis-cli -h sub2api-redis -p 6379 ping >"$tmp/redis-cross.out" 2>"$tmp/redis-cross.err"
   redis_cross_rc=$?
 fi
 python3 - "$tmp" "$health_rc" "$root_rc" "$pg_rc" "$redis_rc" "$network_policy_rc" "$pg_cross_rc" "$redis_cross_rc" <<'PY'
@@ -980,13 +1678,14 @@ network_policy_obj = json_file("network-policy.json")
 network_policy_ok = network_policy_rc == 0 and is_allow_all_network_policy(network_policy_obj)
 payload = {
     "ok": health_rc == 0 and root_rc == 0 and pg_rc == 0 and redis_rc == 0 and network_policy_ok and pg_cross_rc == 0 and redis_cross_rc == 0,
-    "namespace": "${namespace}",
-    "serviceDns": "${serviceName}.${namespace}.svc.cluster.local:8080",
+    "target": "${target.id}",
+    "namespace": "${target.namespace}",
+    "serviceDns": "${serviceName}.${target.namespace}.svc.cluster.local:8080",
     "checks": {
         "allowAllNetworkPolicy": {
             "exitCode": network_policy_rc,
             "ok": network_policy_ok,
-            "method": "kubectl -n ${namespace} get networkpolicy allow-all -o json",
+            "method": "kubectl -n ${target.namespace} get networkpolicy allow-all -o json",
             "policyTypes": ((network_policy_obj or {}).get("spec") or {}).get("policyTypes") if isinstance(network_policy_obj, dict) else None,
             "podSelector": ((network_policy_obj or {}).get("spec") or {}).get("podSelector") if isinstance(network_policy_obj, dict) else None,
             "ingress": ((network_policy_obj or {}).get("spec") or {}).get("ingress") if isinstance(network_policy_obj, dict) else None,
@@ -995,13 +1694,13 @@ payload = {
         },
         "sub2apiHealthViaKubernetesServiceProxy": {
             "exitCode": health_rc,
-            "method": "kubectl get --raw /api/v1/namespaces/${namespace}/services/${serviceName}:8080/proxy/health",
+            "method": "kubectl get --raw /api/v1/namespaces/${target.namespace}/services/${serviceName}:8080/proxy/health",
             "bodyPreview": health_body,
             "stderr": text("health.err", 2000),
         },
         "sub2apiRootViaKubernetesServiceProxy": {
             "exitCode": root_rc,
-            "method": "kubectl get --raw /api/v1/namespaces/${namespace}/services/${serviceName}:8080/proxy/",
+            "method": "kubectl get --raw /api/v1/namespaces/${target.namespace}/services/${serviceName}:8080/proxy/",
             "bodyBytes": len(root_body.encode("utf-8")),
             "bodyPreview": root_body[:400],
             "stderr": text("root.err", 2000),
@@ -1036,6 +1735,160 @@ PY
 `;
 }
 
+function validateExternalPendingScript(sub2api: Sub2ApiConfig, target: Sub2ApiTargetConfig): string {
+  const expectedImage = imageRef(sub2api);
+  const database = sub2api.runtime.database;
+  const redisService = sub2api.runtime.redis.serviceName;
+  return `
+set -u
+tmp="$(mktemp -d)"
+trap 'rm -rf "$tmp"' EXIT
+capture_json() {
+  name="$1"
+  shift
+  "$@" -o json >"$tmp/$name.json" 2>"$tmp/$name.err"
+  rc=$?
+  printf '%s' "$rc" >"$tmp/$name.rc"
+}
+capture_json ns kubectl get namespace ${target.namespace}
+capture_json app kubectl -n ${target.namespace} get deployment ${serviceName}
+capture_json redis kubectl -n ${target.namespace} get deployment ${redisService}
+capture_json services kubectl -n ${target.namespace} get service
+capture_json statefulsets kubectl -n ${target.namespace} get statefulsets
+capture_json pvc kubectl -n ${target.namespace} get pvc
+capture_json configmap kubectl -n ${target.namespace} get configmap sub2api-config
+capture_json networkpolicy kubectl -n ${target.namespace} get networkpolicy allow-all
+python3 - "$tmp" <<'PY'
+import json
+import os
+import sys
+
+tmp = sys.argv[1]
+
+def rc(name):
+    try:
+        return int(open(os.path.join(tmp, f"{name}.rc"), encoding="utf-8").read() or "1")
+    except FileNotFoundError:
+        return 1
+
+def load(name):
+    path = os.path.join(tmp, f"{name}.json")
+    if not os.path.exists(path):
+        return None
+    try:
+        return json.load(open(path, encoding="utf-8"))
+    except json.JSONDecodeError:
+        return None
+
+def items(name):
+    data = load(name)
+    if not isinstance(data, dict):
+        return []
+    return data.get("items") or []
+
+def text(name, limit=2000):
+    path = os.path.join(tmp, name)
+    try:
+        return open(path, encoding="utf-8", errors="replace").read()[-limit:]
+    except FileNotFoundError:
+        return ""
+
+def is_allow_all_network_policy(item):
+    if not isinstance(item, dict):
+        return False
+    spec = item.get("spec") or {}
+    return (
+        item.get("metadata", {}).get("name") == "allow-all"
+        and spec.get("podSelector") == {}
+        and set(spec.get("policyTypes") or []) == {"Ingress", "Egress"}
+        and spec.get("ingress") == [{}]
+        and spec.get("egress") == [{}]
+    )
+
+def deployment_summary(item):
+    spec = (item or {}).get("spec") or {}
+    status = (item or {}).get("status") or {}
+    containers = ((spec.get("template") or {}).get("spec") or {}).get("containers") or []
+    volumes = ((spec.get("template") or {}).get("spec") or {}).get("volumes") or []
+    desired = spec.get("replicas", 1)
+    available = status.get("availableReplicas", 0)
+    return {
+        "exists": isinstance(item, dict),
+        "desired": desired,
+        "availableReplicas": available,
+        "readyReplicas": status.get("readyReplicas", 0),
+        "ready": available >= desired,
+        "images": [container.get("image") for container in containers],
+        "volumes": volumes,
+    }
+
+services = items("services")
+statefulsets = items("statefulsets")
+pvcs = items("pvc")
+configmap = load("configmap")
+configmap_data = (configmap or {}).get("data") or {}
+app = load("app")
+redis = load("redis")
+app_summary = deployment_summary(app)
+redis_summary = deployment_summary(redis)
+network_policy_obj = load("networkpolicy")
+network_policy_ok = rc("networkpolicy") == 0 and is_allow_all_network_policy(network_policy_obj)
+service_names = sorted(item.get("metadata", {}).get("name") for item in services)
+local_postgres_present = any(item.get("metadata", {}).get("name") == "sub2api-postgres" for item in services + statefulsets + pvcs)
+redis_pvc_present = any(item.get("metadata", {}).get("name") == "sub2api-redis-data" for item in pvcs)
+redis_ephemeral = any(volume.get("emptyDir") == {} for volume in redis_summary["volumes"])
+configmap_aligned = (
+    configmap_data.get("DATABASE_HOST") == "${database.host}"
+    and configmap_data.get("DATABASE_PORT") == "${database.port}"
+    and configmap_data.get("DATABASE_USER") == "${database.user}"
+    and configmap_data.get("DATABASE_DBNAME") == "${database.dbName}"
+    and configmap_data.get("DATABASE_SSLMODE") == "${database.sslMode}"
+    and configmap_data.get("REDIS_HOST") == "${redisService}"
+)
+app_scaled_to_zero = app_summary["exists"] and app_summary["desired"] == ${target.appReplicas} and ${target.appReplicas} == 0
+image_aligned = "${expectedImage}" in app_summary["images"]
+redis_ready = redis_summary["exists"] and redis_summary["desired"] == ${target.redisReplicas} and redis_ephemeral and (redis_summary["ready"] or ${target.redisReplicas} == 0)
+payload = {
+    "ok": rc("ns") == 0 and network_policy_ok and app_scaled_to_zero and image_aligned and redis_ready and configmap_aligned and not local_postgres_present and not redis_pvc_present and "${serviceName}" in service_names and "${redisService}" in service_names,
+    "target": "${target.id}",
+    "namespace": "${target.namespace}",
+    "status": "pending-external-db",
+    "databaseMode": "${target.databaseMode}",
+    "redisMode": "${target.redisMode}",
+    "expectedRedisReplicas": ${target.redisReplicas},
+    "externalDatabase": {
+        "sourceRef": "${database.sourceRef}",
+        "host": "${database.host}",
+        "port": ${database.port},
+        "user": "${database.user}",
+        "dbName": "${database.dbName}",
+        "sslMode": "${database.sslMode}",
+        "secretName": "${database.secretName}",
+        "passwordKey": "${database.passwordKey}",
+        "pendingAllowed": ${database.pendingAllowed ? "True" : "False"},
+        "passwordPrinted": False,
+    },
+    "checks": {
+        "namespaceExists": rc("ns") == 0,
+        "allowAllNetworkPolicy": {"ok": network_policy_ok, "stderr": text("networkpolicy.err")},
+        "appScaledToZero": {"ok": app_scaled_to_zero, "summary": app_summary},
+        "imageAligned": {"ok": image_aligned, "desiredImage": "${expectedImage}", "runningImages": app_summary["images"]},
+        "redisReadyEphemeral": {"ok": redis_ready, "summary": redis_summary, "redisPvcPresent": redis_pvc_present, "readinessRequired": ${target.redisReplicas} > 0},
+        "serviceBoundary": {"serviceNames": service_names, "servicePresent": "${serviceName}" in service_names, "redisServicePresent": "${redisService}" in service_names},
+        "externalDbOnly": {"ok": not local_postgres_present, "localPostgresPresent": local_postgres_present},
+        "configMapAligned": {"ok": configmap_aligned, "databaseHost": configmap_data.get("DATABASE_HOST"), "redisHost": configmap_data.get("REDIS_HOST")},
+    },
+    "next": {
+        "status": "bun scripts/cli.ts platform-infra sub2api status --target ${target.id}",
+        "activateAfterDbReady": "update config/platform-infra/sub2api.yaml target ${target.id} appReplicas and redisReplicas to 1 after ${database.sourceRef}, ${database.secretName}/${database.passwordKey}, and runtime images are ready, then apply --target ${target.id} --confirm",
+    },
+}
+print(json.dumps(payload, ensure_ascii=False, indent=2))
+sys.exit(0 if payload["ok"] else 1)
+PY
+`;
+}
+
 async function capture(config: UniDeskConfig, target: string, args: string[], input?: string): Promise {
   return await runSshCommandCapture(config, target, args, input);
 }
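Review note: the standby Redis gate added in `validateExternalPendingScript` can be exercised offline without a cluster. The following is a minimal sketch, assuming Deployment objects shaped like `kubectl get deployment -o json` output. `standby_redis_ok` and the sample manifest are hypothetical names introduced for illustration and are not part of the patch; `deployment_summary` is a trimmed copy of the helper in the diff.

```python
def deployment_summary(item):
    """Summarize a Deployment dict (trimmed copy of the helper in the patch)."""
    spec = (item or {}).get("spec") or {}
    status = (item or {}).get("status") or {}
    pod_spec = (spec.get("template") or {}).get("spec") or {}
    desired = spec.get("replicas", 1)
    available = status.get("availableReplicas", 0)
    return {
        "exists": isinstance(item, dict),
        "desired": desired,
        "availableReplicas": available,
        "ready": available >= desired,
        "images": [c.get("image") for c in pod_spec.get("containers") or []],
        "volumes": pod_spec.get("volumes") or [],
    }


def standby_redis_ok(summary, expected_replicas):
    """Hypothetical helper mirroring the `redis_ready` expression in the patch."""
    ephemeral = any(v.get("emptyDir") == {} for v in summary["volumes"])
    return (
        summary["exists"]
        and summary["desired"] == expected_replicas
        and ephemeral
        and (summary["ready"] or expected_replicas == 0)
    )


# Illustrative manifest (assumption): Redis scaled to zero, emptyDir-backed.
standby_redis = {
    "spec": {
        "replicas": 0,
        "template": {"spec": {
            "containers": [{"name": "redis", "image": "redis:8-alpine"}],
            "volumes": [{"name": "data", "emptyDir": {}}],
        }},
    },
    "status": {},
}

summary = deployment_summary(standby_redis)
print(standby_redis_ok(summary, expected_replicas=0))  # True
print(standby_redis_ok(summary, expected_replicas=1))  # False: desired != expected
```

Note that the `emptyDir == {}` comparison, as written in the patch, would treat an `emptyDir` carrying a `sizeLimit` or `medium` field as non-ephemeral; whether that strictness is intended is worth confirming with the author.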