diff --git a/.agents/skills/unidesk-ops/SKILL.md b/.agents/skills/unidesk-ops/SKILL.md
index 2bbd3ef0..106a7bba 100644
--- a/.agents/skills/unidesk-ops/SKILL.md
+++ b/.agents/skills/unidesk-ops/SKILL.md
@@ -197,10 +197,18 @@ dscx exec --skip-git-repo-check 'Reply exactly: dscx-codex-ok'
 mxcx doctor
 mxcx bridge-smoke mxcx-bridge-ok
 mxcx exec --skip-git-repo-check 'Reply exactly: mxcx-codex-ok'
+
+# ACX GPT direct profiles
+acx status
+acx gpt-only exec --json 'Reply exactly: acx-only-ok'
+acx gpt-sub2api exec --json 'Reply exactly: acx-sub2api-ok'
+acx exec --json 'Reply exactly: acx-default-ok'
 ```
 
 `bridge-smoke` verifies the Moon Bridge → provider path. `exec` verifies the full Codex CLI → bridge → provider chain.
 
+The `acx` GPT aliases are Codex custom providers that connect directly to the Responses upstream and do not pass through the local `127.0.0.1:38448` router. In GPT mode, `acx status` should output `mode=gpt-direct`, `routerRequired=false`, and `portPids=[]`; a small real call should return the expected text, and repeated or resume traffic should show nonzero `cached_input_tokens`. OpenCode Zen Go aliases still go through `acx route-start|route-status` along the router → `gocx`/Moon Bridge path. For the long-term boundaries, see `docs/reference/master-server-ops.md`.
+
 ---
 
 ## MiniMax Session Recovery
diff --git a/docs/reference/master-server-ops.md b/docs/reference/master-server-ops.md
index edd111f1..e256ae21 100644
--- a/docs/reference/master-server-ops.md
+++ b/docs/reference/master-server-ops.md
@@ -14,20 +14,22 @@ This document records master-server architecture and decision rules. **Operation
 
 - `dscx` uses `CODEX_HOME=/root/.codex-deepseek-v4-pro`, model `deepseek-v4-pro`, Codex custom provider `deepseek`, and local Moon Bridge at `http://127.0.0.1:38440/v1`.
 - `mxcx` uses `CODEX_HOME=/root/.codex-minimax-m3`, model `MiniMax-M3`, Codex custom provider `minimax`, and local Moon Bridge at `http://127.0.0.1:38441/v1`.
-- `acx` uses `CODEX_HOME=/root/.codex-acx`, default model `gpt-5.5-only`, Codex custom provider `acx`, and local ACX router at `http://127.0.0.1:38448/v1`.
+- `acx` uses `CODEX_HOME=/root/.codex-acx` and default model `gpt-5.5-only`. GPT aliases use Codex custom providers that connect directly to their OpenAI-compatible Responses upstreams; OpenCode Zen Go aliases still use the local ACX router only to reach `gocx`/Moon Bridge.
 - `gocx` uses `CODEX_HOME=/root/.codex-opencode-go-all`, default model `glm-5.1`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38447/v1`.
 - `dscx-go` uses `CODEX_HOME=/root/.codex-opencode-go`, model `deepseek-v4-pro`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38443/v1`.
 - `dfcx-go` uses `CODEX_HOME=/root/.codex-opencode-flash`, model `deepseek-v4-flash`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38444/v1`.
 - `glcx-go` uses `CODEX_HOME=/root/.codex-opencode-glm`, model `glm-5.1`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38446/v1`.
 - `acx` includes all OpenCode Zen Go upstream slugs plus `gpt-5.5-only` and `gpt-5.5-sub2api` in one `model-catalog.json` so Codex can use `/model` or `-m ` within the same profile.
-- `acx` routes OpenCode Zen Go models to the existing `gocx` Moon Bridge, but GPT models do not go through Moon Bridge: `gpt-5.5-only` uses the direct `only` OpenAI-compatible Responses endpoint and `gpt-5.5-sub2api` uses the Sub2API pool endpoint. The ACX router rewrites both GPT model aliases to upstream model `gpt-5.5` before forwarding.
-- All wrappers read the upstream API key from the profile `auth.json`; generated Moon Bridge runtime configs live under the profile `.tmp/` directory with mode `0600`. Do not copy upstream keys into documentation.
+- `acx` routes OpenCode Zen Go models to the existing `gocx` Moon Bridge. GPT models must not pass through Moon Bridge or the ACX router: `gpt-5.5-only` uses the direct `only` OpenAI-compatible Responses endpoint and `gpt-5.5-sub2api` uses the Sub2API pool endpoint, both with upstream model `gpt-5.5`.
+- GPT direct providers must receive their API key through an environment-key path such as `ACX_GPT_DIRECT_API_KEY`, read from the matching `/root/.codex/auth.json.` file by the wrapper. Do not let direct GPT calls fall back to `/root/.codex-acx/auth.json`; that file may contain only the local-router dummy key.
+- All wrappers read upstream API keys from profile auth files or wrapper-injected environment variables; generated Moon Bridge or router runtime configs live under the profile `.tmp/` directory with mode `0600`. Do not copy upstream keys into documentation.
 - Each profile must include `model_catalog_json` in `config.toml` pointing to a profile-local `model-catalog.json` entry for its active model. Missing catalog metadata causes Codex to fall back to default metadata, which lowers the effective context window and prints `Model metadata ... not found`.
 - Profile context metadata must match the intended upstream limit closely enough for Codex auto-compact to fire before provider rejection. Keep the local profile metadata in sync with the actual model family you are routing to.
 - Current master-server profile baselines:
-  - `deepseek-v4-pro`, `deepseek-v4-flash`, and GPT profiles exposed through `acx` use `model_context_window = 1000000` and `model_auto_compact_token_limit = 900000`.
+  - GPT profiles exposed through `acx` use `model_context_window = 272000` and `model_auto_compact_token_limit = 240000`. This represents the Codex-facing input window for GPT-5.5, not the larger raw API model window.
+  - `deepseek-v4-pro` and `deepseek-v4-flash` use `model_context_window = 1000000` and `model_auto_compact_token_limit = 900000`.
   - Other local Moon Bridge profiles, including `glm-5.1`, `MiniMax-M3`, and the non-DeepSeek OpenCode models exposed through `acx`/`gocx`, use `model_context_window = 200000` and `model_auto_compact_token_limit = 180000`.
-- Keep the wrapper-generated Moon Bridge `models..context_window` aligned with the profile `config.toml` and `model-catalog.json`. If those three diverge, Codex and Moon Bridge may disagree about compaction and admission behavior.
+- Keep the wrapper-generated Moon Bridge/router metadata aligned with the profile `config.toml` and `model-catalog.json`. If these diverge, Codex and the local admission layer may disagree about compaction and request size behavior.
 - `hyueapi.com` / `.hyueapi.com` must remain in `NO_PROXY` / `no_proxy` for Codex API channels.
 
 ## Moon Bridge
@@ -40,7 +42,7 @@ Profile architecture:
 
 - `dscx bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38440`.
 - `mxcx bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38441`.
-- `acx route-start` renders the ACX router config and starts the local routing service on `127.0.0.1:38448`.
+- `acx route-start` renders the ACX router config and starts the local routing service on `127.0.0.1:38448` for non-GPT ACX aliases that still need the OpenCode Zen Go bridge path.
 - `gocx bridge-start` renders multi-model OpenCode Zen Go profile config and starts Moon Bridge on `127.0.0.1:38447`.
 - `dscx-go bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38443`.
 - `dfcx-go bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38444`.
@@ -50,7 +52,7 @@ Profile architecture:
 - `dscx` routes DeepSeek through Moon Bridge using Anthropic-compatible upstream + `deepseek_v4` extension.
 - `mxcx` routes MiniMax through Moon Bridge using `openai-response` upstream passthrough.
 - `acx` routes OpenCode Zen Go models to `gocx`/Moon Bridge. `gocx`, `dscx-go`, `dfcx-go`, and `glcx-go` route OpenCode Zen Go through Moon Bridge using `openai-chat` upstream at `https://opencode.ai/zen/go/v1/chat/completions`. The Codex side remains `wire_api = "responses"` against the local Moon Bridge URL.
-- `acx` routes GPT aliases directly to OpenAI-compatible Responses endpoints and must not send GPT traffic through Moon Bridge.
+- `acx` routes GPT aliases directly to OpenAI-compatible Responses endpoints and must not send GPT traffic through Moon Bridge or through the local ACX router. In GPT mode, `acx status` should report `mode=gpt-direct`, `routerRequired=false`, and no listener on `127.0.0.1:38448`.
 - OpenCode Zen Go model IDs must use the upstream slug, such as `glm-5.1`; display names such as `GLM-5.1` are not profile model identifiers.
 - Do not keep local handwritten bridge scripts, static alternate `moonbridge.config.yml` files, or other sidecar proxy paths for OpenCode Zen Go profiles. The only supported runtime path is wrapper-generated `.tmp/moonbridge.generated.yml` plus `/root/.local/bin/moonbridge`.
 - For OpenCode Zen Go profiles, set an explicit `user_agent` in the generated Moon Bridge provider config. The upstream may reject default client signatures.
@@ -62,7 +64,8 @@ Profile validation:
 - `*-go raw-smoke` verifies the upstream OpenCode Zen Go Chat Completions API directly.
 - `*-go bridge-smoke` verifies local Moon Bridge's `/v1/responses` translation path.
 - `*-go exec '在吗'` verifies the actual Codex profile. Passing output must not contain `Model metadata ... not found`; latest session records should show `model_context_window` derived from the profile catalog, not fallback metadata.
-- `acx doctor`, `acx route-status`, `acx models`, and `acx -m exec '在吗'` verify the unified ACX profile. Use a small real Responses request through `http://127.0.0.1:38448/v1/responses` to verify a GPT alias because GPT traffic intentionally bypasses Moon Bridge.
+- `acx status`, `acx models`, `acx gpt-only exec '在吗'`, `acx gpt-sub2api exec '在吗'`, and default `acx exec '在吗'` verify GPT direct mode. Passing GPT verification should show a real Codex Responses turn and, for repeated or resume traffic, nonzero `cached_input_tokens`; it should not require a listener on `127.0.0.1:38448`.
+- For OpenCode Zen Go aliases exposed through `acx`, use `acx route-start`, `acx route-status`, `acx models`, and `acx -m exec '在吗'` to verify the router-to-`gocx` path.
 - `gocx raw-smoke [model]`, `gocx bridge-smoke [model]`, and `gocx -m exec '在吗'` verify specific OpenCode Zen Go models. Omitting `[model]` uses the default `glm-5.1`.
 - `ReasoningSummaryDelta without active item` in Codex stderr is a separate adapter noise from reasoning summary events. It is not the same failure as missing model metadata and does not by itself prove the profile is unusable.
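
The GPT direct-mode expectations for `acx status` stated in this change (`mode=gpt-direct`, `routerRequired=false`, `portPids=[]`) could be checked mechanically. The following is a minimal sketch, assuming the command prints whitespace-separated `key=value` tokens; the diff does not show the exact output layout, so that format is an assumption. `parse_status` and `check_gpt_direct` are hypothetical helper names, not part of the `acx` wrapper.

```python
# Expected fields for GPT direct mode, taken from the documentation text above.
EXPECTED_GPT_DIRECT = {
    "mode": "gpt-direct",
    "routerRequired": "false",
    "portPids": "[]",
}


def parse_status(text: str) -> dict[str, str]:
    """Collect key=value tokens (assumed format: whitespace-separated)."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


def check_gpt_direct(text: str) -> list[str]:
    """Return human-readable mismatches; an empty list means all fields match."""
    fields = parse_status(text)
    problems = []
    for key, want in EXPECTED_GPT_DIRECT.items():
        got = fields.get(key)
        if got != want:
            problems.append(f"{key}: expected {want!r}, got {got!r}")
    return problems
```

On a host where the wrapper is installed, the input text could come from something like `subprocess.run(["acx", "status"], capture_output=True, text=True).stdout`. If the real output uses a different layout (for example JSON), only `parse_status` would need to change.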