# PK01 Provider Operations Reference

PK01 is a Tencent Cloud compute provider attached to UniDesk through `provider-gateway` with Provider ID `PK01`. This reference defines the long-term operating boundary for PK01 host access, provider-gateway bootstrap state, pikanode retention, and disk GC. General provider-gateway rules remain authoritative in `docs/reference/provider-gateway.md`; general GC safety rules remain authoritative in `docs/reference/gc.md`.

## Operating Entry Points

Use UniDesk SSH passthrough for PK01 host operations:

```bash
# Run a single command on the PK01 host.
trans PK01 argv hostname
# Run a multi-line script on the PK01 host.
trans PK01 script <<'SCRIPT'
df -h /
docker ps --format 'table {{.Names}}\t{{.Image}}\t{{.Status}}'
SCRIPT
```

Before closing an operation, verify both the provider channel and host workload state:

```bash
# Provider channel: PK01 should be reported as online.
bun scripts/cli.ts debug health
# Gateway container: restart policy, PID mode, state, and image.
trans PK01 argv bash -lc 'docker inspect --format "name={{.Name}} restart={{.HostConfig.RestartPolicy.Name}} pid={{.HostConfig.PidMode}} state={{.State.Status}} image={{.Config.Image}}" unidesk-provider-gateway-pk01'
# Host workloads: all running containers.
trans PK01 argv bash -lc 'docker ps --format "table {{.Names}}\t{{.Image}}\t{{.Status}}"'
```
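
The `docker inspect` line above can be compared mechanically with the boundary this reference sets for the gateway container (`restart=always`, `pid=host`, state `running`). The following is a minimal sketch rather than an existing UniDesk tool: `check_gateway_line` is a hypothetical helper name, and it assumes the space-separated `key=value` layout produced by the `--format` string above.

```bash
# Hypothetical helper: validate one line of the `docker inspect --format`
# output against the required gateway settings. Prints OK, or FAIL followed
# by the keys whose values do not match, and returns non-zero on failure.
check_gateway_line() {
  local line="$1" field problems=""
  for field in "restart=always" "pid=host" "state=running"; do
    case " $line " in
      *" $field "*) ;;                         # required key=value is present
      *) problems="$problems ${field%%=*}" ;;  # remember the failing key
    esac
  done
  if [ -n "$problems" ]; then
    echo "FAIL:$problems"
    return 1
  fi
  echo "OK"
}

# Example (assumes the inspect output was captured into $line beforehand):
#   check_gateway_line "$line" || echo "gateway container is outside its boundary"
```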

PK01 has no k3s control plane, so `trans PK01:k3s ...` is not a valid operating entry point. If a future PK01 k3s lane is introduced, it must get a separate runtime-lane reference and must not reuse the current pikanode host-data policy as a Kubernetes retention policy.

## Provider Gateway Bootstrap State

PK01 currently uses a direct Docker provider-gateway deployment rather than a full UniDesk source checkout. The node-local runtime bundle is:

| Item | Path / value | Boundary |
|---|---|---|
| Provider ID | `PK01` | Must stay unique in the UniDesk node registry. |
| Container | `unidesk-provider-gateway-pk01` | Must be `restart=always`, `pid=host`, and `running`. |
| Runtime bundle | `/home/ubuntu/unidesk-provider-pk01` | Minimal workspace mounted read-only into the gateway container. |
| Env file | `/home/ubuntu/.unidesk/state/provider-pk01/provider.env` | Contains provider token and must not be printed, copied into docs, or committed. |
| Host SSH key | `/home/ubuntu/.unidesk/host-ssh-pk01/id_ed25519` | Mounted read-only at `/run/host-ssh`; public key is authorized for `ubuntu`. |
| Logs | `/home/ubuntu/.unidesk/logs/provider-pk01` | Node-local runtime logs, not a Git source of truth. |
| Egress proxy | `127.0.0.1:18789` | Loopback only; never expose as a public endpoint. |

Long-term provider-gateway upgrades should converge to the standard `provider.upgrade mode=schedule` flow described in `docs/reference/provider-gateway.md`. If PK01 is still on the direct Docker bootstrap path, do not rebuild the gateway synchronously through the gateway's own `trans PK01` session. Use a detached node-local job or first move PK01 to the standard attach/upgrade bundle.

The minimal PK01 provider-gateway health contract is:

- `debug health` shows `providerId=PK01` as online.
- labels include `providerGatewayVersion`, `providerGatewayRuntimeGuardOk=true`, `providerGatewaySshDataTransport=tcp-pool`, and a nonzero ready SSH data pool.
- `trans PK01 argv hostname` reaches the Tencent Cloud host and returns the host name.

## Host Workloads

PK01 currently hosts existing Docker workloads:

| Container | Role | Protection boundary |
|---|---|---|
| `pikanode` | Public PikaPython/PikaNode service rooted at `/home/ubuntu/pikanode` | Do not delete source, `files/`, `html/download/`, `html/upload/`, certificates, or Git state without a service-owner retention decision. |
| `met_server` | Existing MET service | Treat as protected runtime unless a separate owner-approved retention plan exists. |
| `unidesk-provider-gateway-pk01` | UniDesk maintenance bridge | Must remain running; do not stop it as part of generic disk GC. |

`pikanode` mounts `/home/ubuntu/pikanode` read-write into the container. Static/generated download artifacts under `html/download/` and repository data under `files/` may be user-visible or needed by the service. They are not generic GC candidates.

## Host PostgreSQL

PK01 host-native PostgreSQL is declared by `config/platform-db/postgres-pk01.yaml` and managed through `bun scripts/cli.ts platform-db postgres plan|status|apply`; daily operation commands live in `$unidesk-ops` at `.agents/skills/unidesk-ops/SKILL.md`. It is a host systemd service, not a Docker container or k3s workload. The YAML is the source of truth for PostgreSQL version, TLS mode, listening addresses, `pg_hba` source CIDRs, generated Secret source files, exported `DATABASE_URL`, and backup timer settings.

Cross-node platform consumers must connect directly to the YAML-declared `postgres.network.connectionHost`. For consumers outside the PK01 private VPC, that value must be PK01's public endpoint, not the private `10.0.8.3` address and not a master-server tunnel. The master server may run control-plane CLI operations and secret sync, but it must not become the data-plane relay for D601, G14, Sub2API, HWLAB, AgentRun, or other PostgreSQL clients.

`postgres.network.publicDns` is an optional alias for operator readability and future `sslmode=verify-full` work. With the current PostgreSQL native TLS posture, clients use `sslmode=require`; DNS resolution is therefore not a cutover blocker when `connectionHost` is a reachable IP endpoint. If `publicDns` later becomes the connection host or `verify-full` is enabled, certificate common name/SAN and DNS must be promoted back into the cutover criteria.

The exported Sub2API connection string is written under the configured ignored Secret root and must never be committed or printed in full. CLI output may show key names, presence, fingerprints, selected host, SSL status, and source/target Secret references only. If a consumer's public egress IP changes, update the YAML `allowSources` and matching `pg_hba` rules, then use the `$unidesk-ops` PK01 Host PostgreSQL workflow to apply and recheck status.
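
To illustrate the output rule above, a status line can be derived from a connection string without echoing its credentials. This is a minimal sketch, not the `platform-db` implementation: `summarize_database_url` is a hypothetical name, the 12-character SHA-256 prefix is an assumed fingerprint format, and the parser assumes a `scheme://user:password@host:port/db?params` URL with a hostname or IPv4 host.

```bash
# Hypothetical helper: summarize a PostgreSQL connection URL without printing
# the user name or password. Bracketed IPv6 hosts are not handled.
summarize_database_url() {
  local url="$1" base rest hostport host params sslmode fp
  base="${url%%\?*}"            # drop the query string
  rest="${base#*://}"           # drop the scheme
  rest="${rest##*@}"            # drop credentials, if any
  hostport="${rest%%/*}"        # keep host[:port]
  host="${hostport%%:*}"
  sslmode="unset"
  case "$url" in
    *\?*)
      params="&${url#*\?}"
      case "$params" in
        *"&sslmode="*)
          sslmode="${params#*"&sslmode="}"
          sslmode="${sslmode%%&*}"
          ;;
      esac
      ;;
  esac
  # Assumed fingerprint format: first 12 hex characters of the SHA-256 digest.
  fp="$(printf '%s' "$url" | sha256sum | cut -c1-12)"
  printf 'host=%s sslmode=%s fingerprint=%s\n' "$host" "$sslmode" "$fp"
}

# Example with a documentation-only address and made-up credentials:
#   summarize_database_url 'postgres://app:secret@203.0.113.10:5432/db?sslmode=require'
```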
## Disk GC Policy

PK01 follows the same safe-stop principle as G14: first produce a bounded attribution, then clean only classified candidates, and stop when remaining pressure is in protected runtime data.

Default sequence for a high-water incident:

1. Run the generic remote GC plan and, if useful, a confirmed run:
   ```bash
   bun scripts/cli.ts gc remote PK01 plan --target-use-percent 60 --limit 100 --full
   bun scripts/cli.ts gc remote PK01 run --confirm --target-use-percent 60 --limit 100 --full
   ```
2. Inspect PK01-specific host data with short passthrough commands; avoid full-root `du` in one `trans` call because `trans` has a 60-second hard timeout.
3. For pikanode growth, clean only `html/temp` direct child directories that are older than the configured node-local retention window. Preserve direct files such as `stdout.log`, `update.log`, `accesstoken.json`, `pullrequest.json`, and any recent temp workspaces.
4. Re-check `df -h /`, provider health, Docker container state, and a pikanode local HTTPS probe.
5. If the target still cannot be reached without touching `html/download/`, `files/`, Docker images, or other protected runtime data, stop and make a retention/capacity decision instead of widening deletion scope.

PK01 pikanode temp directories are safe to remove only under this narrow definition:

- path is a direct child directory of `/home/ubuntu/pikanode/html/temp`;
- path is not a symlink;
- parent is exactly `/home/ubuntu/pikanode/html/temp`;
- mtime is older than the configured retention window;
- deletion uses `rm -rf --one-file-system` and never follows paths outside that root.

Never use `rm -rf /home/ubuntu/pikanode/html/temp/*` as an unbounded shell expansion. It risks deleting current generation workspaces and direct state/log files.
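
The narrow definition above maps onto a single bounded `find` call. The sketch below only lists candidates; it is not the installed `/usr/local/sbin/unidesk-pk01-pikanode-temp-gc` script, whose contents are not reproduced in this reference. `list_temp_gc_candidates` and the 7-day value in the usage comment are assumptions.

```bash
# Hypothetical dry-run helper: print, but never delete, html/temp entries that
# meet the narrow definition. Arguments: temp root, retention window in days.
list_temp_gc_candidates() {
  local root="$1" days="$2"
  # -xdev                    stay on the root's filesystem
  # -mindepth 1 -maxdepth 1  direct children only, so the parent is exactly $root
  # -type d                  real directories; symlinks and plain files never match
  # -mtime +N                mtime at least N+1 whole days old (find truncates ages)
  find "$root" -xdev -mindepth 1 -maxdepth 1 -type d -mtime "+$days" -print
}

# Example (7 days is an assumed window, not the configured value):
#   list_temp_gc_candidates /home/ubuntu/pikanode/html/temp 7
```

Each reviewed path would then be removed individually with `rm -rf --one-file-system`, never through a shell glob.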
## Long-Term Retention Mechanisms

PK01 has node-local retention controls installed so that pikanode temp output and logs do not grow without bound:

| Mechanism | Node-local path | Purpose |
|---|---|---|
| pikanode temp timer | `/etc/systemd/system/unidesk-pk01-pikanode-temp-gc.timer` | Runs pikanode temp retention on a daily timer. |
| pikanode temp service | `/etc/systemd/system/unidesk-pk01-pikanode-temp-gc.service` | Executes `/usr/local/sbin/unidesk-pk01-pikanode-temp-gc` as a one-shot cleanup. |
| pikanode temp script | `/usr/local/sbin/unidesk-pk01-pikanode-temp-gc` | Deletes only old direct temp directories under the protected root. |
| retention log | `/var/log/unidesk-pk01/pikanode-temp-gc.log` | Bounded operational evidence for the timer. |
| pikanode logrotate | `/etc/logrotate.d/unidesk-pk01-pikanode` | Rotates pikanode temp/runtime logs and the retention log. |
| journald cap | `/etc/systemd/journald.conf.d/99-unidesk-pk01.conf` | Caps systemd journal growth on PK01. |

Operational checks:

```bash
# Timer status.
trans PK01 argv bash -lc 'systemctl status unidesk-pk01-pikanode-temp-gc.timer --no-pager'
# Trigger one cleanup run and show the latest retention log lines.
trans PK01 argv bash -lc 'sudo systemctl start unidesk-pk01-pikanode-temp-gc.service && tail -n 40 /var/log/unidesk-pk01/pikanode-temp-gc.log'
# Dry-run the logrotate configuration (debug mode, no changes applied).
trans PK01 argv bash -lc 'sudo logrotate -d /etc/logrotate.d/unidesk-pk01-pikanode'
```
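
The node-local files listed in the table above can also be checked for presence in one pass. This is a hedged sketch: `report_retention_files` and its optional staging-root argument are hypothetical and not part of the UniDesk CLI; only the paths come from this reference.

```bash
# Hypothetical helper: report which retention files exist. The optional first
# argument is a prefix (for example a staging tree); empty means the live root.
# Returns non-zero when at least one file is missing.
report_retention_files() {
  local root="${1:-}" path missing=0
  for path in \
    /etc/systemd/system/unidesk-pk01-pikanode-temp-gc.timer \
    /etc/systemd/system/unidesk-pk01-pikanode-temp-gc.service \
    /usr/local/sbin/unidesk-pk01-pikanode-temp-gc \
    /etc/logrotate.d/unidesk-pk01-pikanode \
    /etc/systemd/journald.conf.d/99-unidesk-pk01.conf
  do
    if [ -e "$root$path" ]; then
      echo "present $path"
    else
      echo "MISSING $path"
      missing=$((missing + 1))
    fi
  done
  return "$(( missing > 0 ))"
}
```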

The timer and logrotate configuration are node-local operational state. If a future UniDesk CLI subcommand manages PK01 retention centrally, it must first render a dry-run plan, show the same protected paths, and then install/update these node-local files through a confirmed operation.

## Space Attribution Baseline

PK01 space attribution should use short, bounded commands. Recommended probes:

```bash
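# Root filesystem space and inode usage.
trans PK01 argv bash -lc 'df -h / && df -i /'
# Per-directory usage, one level deep, capped at 20 seconds.
trans PK01 argv bash -lc 'sudo timeout 20 du -xhd1 /var /home/ubuntu/pikanode /home/ubuntu/.vscode-server /var/lib/docker /var/log 2>/dev/null | sort -h | tail -80'
# Docker image, container, and volume usage (first 220 lines).
trans PK01 argv bash -lc 'docker system df -v | sed -n "1,220p"'
# Direct children of html/temp with mtime, sorted by time; tail keeps the 40 most recent.
trans PK01 argv bash -lc 'sudo find /home/ubuntu/pikanode/html/temp -xdev -mindepth 1 -maxdepth 1 -printf "%TY-%Tm-%Td %TH:%TM %p\n" | sort | tail -40'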
```
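
Because `trans` has a 60-second hard timeout, giving each path its own time budget keeps one slow directory from consuming the whole call. A minimal sketch, assuming GNU `timeout` and `du` on the host; `probe_sizes` and the `INCOMPLETE` marker are hypothetical names, and the function would need to be included in the script body sent through `trans PK01 script`.

```bash
# Hypothetical helper: size each path separately under its own time budget.
# Arguments: budget in seconds, then one or more paths.
probe_sizes() {
  local budget="$1" path
  shift
  for path in "$@"; do
    # timeout exits 124 when the budget is exceeded; du exits non-zero on
    # errors. Either case is flagged so the result is not read as complete.
    timeout "$budget" du -xsh -- "$path" 2>/dev/null || echo "INCOMPLETE $path"
  done
}

# Example: 15 seconds per path keeps three probes under the 60-second limit.
#   probe_sizes 15 /var/log /var/lib/docker /home/ubuntu/pikanode
```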

Interpretation guide:

| Path | Meaning | Default action |
|---|---|---|
| `/home/ubuntu/pikanode/html/temp` | Generated pikanode build workspaces | Managed by PK01 temp retention. |
| `/home/ubuntu/pikanode/html/download` | Generated ZIP downloads | Protected unless a separate download retention policy is approved. |
| `/home/ubuntu/pikanode/files` | pikanode repository/service data | Protected. |
| `/home/ubuntu/.vscode-server` | VS Code remote server, extensions, and cache | Do not delete installed servers/extensions by default; cached VSIX cleanup needs an explicit policy. |
| `/var/lib/docker` | Docker overlay/image/container state for PK01 workloads | Do not prune generically; inspect running containers first. |
| `/var/log/journal` | systemd journal | Managed by journald cap; use sudo when vacuuming manually. |