docs: record acx gpt direct routing

2026-06-12 12:22:29 +00:00
parent 478981f87e
commit 6132bf0f26
2 changed files with 19 additions and 8 deletions
@@ -14,20 +14,22 @@ This document records master-server architecture and decision rules. **Operation

 - `dscx` uses `CODEX_HOME=/root/.codex-deepseek-v4-pro`, model `deepseek-v4-pro`, Codex custom provider `deepseek`, and local Moon Bridge at `http://127.0.0.1:38440/v1`.
 - `mxcx` uses `CODEX_HOME=/root/.codex-minimax-m3`, model `MiniMax-M3`, Codex custom provider `minimax`, and local Moon Bridge at `http://127.0.0.1:38441/v1`.
- `acx` uses `CODEX_HOME=/root/.codex-acx`, default model `gpt-5.5-only`, Codex custom provider `acx`, and local ACX router at `http://127.0.0.1:38448/v1`.
+- `acx` uses `CODEX_HOME=/root/.codex-acx` and default model `gpt-5.5-only`. GPT aliases use Codex custom providers that connect directly to their OpenAI-compatible Responses upstreams; OpenCode Zen Go aliases still use the local ACX router, but solely to reach `gocx`/Moon Bridge.
 - `gocx` uses `CODEX_HOME=/root/.codex-opencode-go-all`, default model `glm-5.1`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38447/v1`.
 - `dscx-go` uses `CODEX_HOME=/root/.codex-opencode-go`, model `deepseek-v4-pro`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38443/v1`.
 - `dfcx-go` uses `CODEX_HOME=/root/.codex-opencode-flash`, model `deepseek-v4-flash`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38444/v1`.
 - `glcx-go` uses `CODEX_HOME=/root/.codex-opencode-glm`, model `glm-5.1`, Codex custom provider `opencode`, and local Moon Bridge at `http://127.0.0.1:38446/v1`.
 - `acx` includes all OpenCode Zen Go upstream slugs plus `gpt-5.5-only` and `gpt-5.5-sub2api` in one `model-catalog.json` so Codex can use `/model` or `-m <model>` within the same profile.
- `acx` routes OpenCode Zen Go models to the existing `gocx` Moon Bridge, but GPT models do not go through Moon Bridge: `gpt-5.5-only` uses the direct `only` OpenAI-compatible Responses endpoint and `gpt-5.5-sub2api` uses the Sub2API pool endpoint. The ACX router rewrites both GPT model aliases to upstream model `gpt-5.5` before forwarding.
- All wrappers read the upstream API key from the profile `auth.json`; generated Moon Bridge runtime configs live under the profile `.tmp/` directory with mode `0600`. Do not copy upstream keys into documentation.
+- `acx` routes OpenCode Zen Go models to the existing `gocx` Moon Bridge. GPT models must not pass through Moon Bridge or the ACX router: `gpt-5.5-only` uses the direct `only` OpenAI-compatible Responses endpoint and `gpt-5.5-sub2api` uses the Sub2API pool endpoint, both with upstream model `gpt-5.5`.
+- GPT direct providers must receive their API key through an environment variable such as `ACX_GPT_DIRECT_API_KEY`, which the wrapper populates from the matching `/root/.codex/auth.json.<profile>` file. Do not let direct GPT calls fall back to `/root/.codex-acx/auth.json`; that file may contain only the local-router dummy key.
+- All wrappers read upstream API keys from profile auth files or wrapper-injected environment variables; generated Moon Bridge or router runtime configs live under the profile `.tmp/` directory with mode `0600`. Do not copy upstream keys into documentation.
 - Each profile must include `model_catalog_json` in `config.toml` pointing to a profile-local `model-catalog.json` entry for its active model. Missing catalog metadata causes Codex to fall back to default metadata, which lowers the effective context window and prints `Model metadata ... not found`.
 - Profile context metadata must match the intended upstream limit closely enough for Codex auto-compact to fire before provider rejection. Keep the local profile metadata in sync with the actual model family you are routing to.
 - Current master-server profile baselines:
-  - `deepseek-v4-pro`, `deepseek-v4-flash`, and GPT profiles exposed through `acx` use `model_context_window = 1000000` and `model_auto_compact_token_limit = 900000`.
+  - GPT profiles exposed through `acx` use `model_context_window = 272000` and `model_auto_compact_token_limit = 240000`. This represents the Codex-facing input window for GPT-5.5, not the larger raw API model window.
+  - `deepseek-v4-pro` and `deepseek-v4-flash` use `model_context_window = 1000000` and `model_auto_compact_token_limit = 900000`.
  - Other local Moon Bridge profiles, including `glm-5.1`, `MiniMax-M3`, and the non-DeepSeek OpenCode models exposed through `acx`/`gocx`, use `model_context_window = 200000` and `model_auto_compact_token_limit = 180000`.
- Keep the wrapper-generated Moon Bridge `models.<slug>.context_window` aligned with the profile `config.toml` and `model-catalog.json`. If those three diverge, Codex and Moon Bridge may disagree about compaction and admission behavior.
+- Keep the wrapper-generated Moon Bridge/router metadata aligned with the profile `config.toml` and `model-catalog.json`. If any of these diverge, Codex and the local admission layer may disagree about compaction and request-size behavior.
 - `hyueapi.com` / `.hyueapi.com` must remain in `NO_PROXY` / `no_proxy` for Codex API channels.
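
The alignment rule above can be sketched as a small check. This is a minimal illustration, not part of the wrappers: the `check_alignment` helper, the `BASELINES` table layout, and the way values are passed in are all hypothetical, and only the numeric baselines come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextMetadata:
    context_window: int
    auto_compact_limit: int


# Baselines copied from the profile list above; the grouping keys are illustrative.
BASELINES = {
    "gpt": ContextMetadata(272_000, 240_000),
    "deepseek": ContextMetadata(1_000_000, 900_000),
    "other": ContextMetadata(200_000, 180_000),
}


def check_alignment(config_window, catalog_window, bridge_window, auto_compact_limit):
    """Return a list of human-readable problems; an empty list means aligned.

    Hypothetical helper: the caller is assumed to have already extracted the
    three context-window values from config.toml, model-catalog.json, and the
    generated Moon Bridge/router config.
    """
    problems = []
    windows = {
        "config.toml": config_window,
        "model-catalog.json": catalog_window,
        "bridge/router": bridge_window,
    }
    # All three sources must agree on the context window.
    if len(set(windows.values())) != 1:
        problems.append(f"context windows diverge: {windows}")
    # Auto-compact must fire before the smallest advertised window is reached.
    if auto_compact_limit >= min(windows.values()):
        problems.append(
            f"auto-compact limit {auto_compact_limit} is not below the context window"
        )
    return problems


if __name__ == "__main__":
    gpt = BASELINES["gpt"]
    print(check_alignment(gpt.context_window, gpt.context_window,
                          gpt.context_window, gpt.auto_compact_limit))
```

Running the check against each profile after regenerating configs would surface a stale catalog or bridge value before Codex and the local admission layer disagree at request time.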

 ## Moon Bridge
@@ -40,7 +42,7 @@ Profile architecture:

 - `dscx bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38440`.
 - `mxcx bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38441`.
- `acx route-start` renders the ACX router config and starts the local routing service on `127.0.0.1:38448`.
+- `acx route-start` renders the ACX router config and starts the local routing service on `127.0.0.1:38448` for non-GPT ACX aliases that still need the OpenCode Zen Go bridge path.
 - `gocx bridge-start` renders multi-model OpenCode Zen Go profile config and starts Moon Bridge on `127.0.0.1:38447`.
 - `dscx-go bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38443`.
 - `dfcx-go bridge-start` renders profile config and starts Moon Bridge on `127.0.0.1:38444`.
@@ -50,7 +52,7 @@ Profile architecture:
 - `dscx` routes DeepSeek through Moon Bridge using Anthropic-compatible upstream + `deepseek_v4` extension.
 - `mxcx` routes MiniMax through Moon Bridge using `openai-response` upstream passthrough.
 - `acx` routes OpenCode Zen Go models to `gocx`/Moon Bridge. `gocx`, `dscx-go`, `dfcx-go`, and `glcx-go` route OpenCode Zen Go through Moon Bridge using `openai-chat` upstream at `https://opencode.ai/zen/go/v1/chat/completions`. The Codex side remains `wire_api = "responses"` against the local Moon Bridge URL.
- `acx` routes GPT aliases directly to OpenAI-compatible Responses endpoints and must not send GPT traffic through Moon Bridge.
+- `acx` routes GPT aliases directly to OpenAI-compatible Responses endpoints and must not send GPT traffic through Moon Bridge or through the local ACX router. In GPT mode, `acx status` should report `mode=gpt-direct`, `routerRequired=false`, and no listener on `127.0.0.1:38448`.
 - OpenCode Zen Go model IDs must use the upstream slug, such as `glm-5.1`; display names such as `GLM-5.1` are not profile model identifiers.
 - Do not keep local handwritten bridge scripts, static alternate `moonbridge.config.yml` files, or other sidecar proxy paths for OpenCode Zen Go profiles. The only supported runtime path is wrapper-generated `.tmp/moonbridge.generated.yml` plus `/root/.local/bin/moonbridge`.
 - For OpenCode Zen Go profiles, set an explicit `user_agent` in the generated Moon Bridge provider config. The upstream may reject default client signatures.
@@ -62,7 +64,8 @@ Profile validation:
 - `*-go raw-smoke` verifies the upstream OpenCode Zen Go Chat Completions API directly.
 - `*-go bridge-smoke` verifies local Moon Bridge's `/v1/responses` translation path.
 - `*-go exec '在吗'` verifies the actual Codex profile. Passing output must not contain `Model metadata ... not found`; latest session records should show `model_context_window` derived from the profile catalog, not fallback metadata.
- `acx doctor`, `acx route-status`, `acx models`, and `acx -m <model> exec '在吗'` verify the unified ACX profile. Use a small real Responses request through `http://127.0.0.1:38448/v1/responses` to verify a GPT alias because GPT traffic intentionally bypasses Moon Bridge.
+- `acx status`, `acx models`, `acx gpt-only exec '在吗'`, `acx gpt-sub2api exec '在吗'`, and default `acx exec '在吗'` verify GPT direct mode. A passing GPT verification should show a real Codex Responses turn and, for repeated or resume traffic, nonzero `cached_input_tokens`; it should not require a listener on `127.0.0.1:38448`.
+- For OpenCode Zen Go aliases exposed through `acx`, use `acx route-start`, `acx route-status`, `acx models`, and `acx -m <opencode-model> exec '在吗'` to verify the router-to-`gocx` path.
 - `gocx raw-smoke [model]`, `gocx bridge-smoke [model]`, and `gocx -m <model> exec '在吗'` verify specific OpenCode Zen Go models. Omitting `[model]` uses the default `glm-5.1`.
 - `ReasoningSummaryDelta without active item` in Codex stderr is separate adapter noise from reasoning summary events. It is not the same failure as missing model metadata and does not by itself prove the profile is unusable.
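
The GPT direct-mode expectations above (`mode=gpt-direct`, `routerRequired=false`, no listener on `127.0.0.1:38448`) could be checked with a sketch like the following. It assumes that `acx status` prints whitespace-separated `key=value` tokens, which this document does not confirm; `parse_status`, `port_is_listening`, and `gpt_direct_ok` are hypothetical helper names.

```python
import socket


def parse_status(text: str) -> dict[str, str]:
    """Collect key=value tokens from status output (assumed format)."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep and key:
            fields[key] = value
    return fields


def port_is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def gpt_direct_ok(status_text: str, host: str = "127.0.0.1", port: int = 38448) -> bool:
    """GPT direct mode passes only if status matches and the router port is closed."""
    fields = parse_status(status_text)
    return (
        fields.get("mode") == "gpt-direct"
        and fields.get("routerRequired") == "false"
        and not port_is_listening(host, port)
    )


if __name__ == "__main__":
    # Sample text stands in for real `acx status` output.
    print(gpt_direct_ok("mode=gpt-direct routerRequired=false"))
```

The port probe is deliberately part of the check: a matching status line with a live listener on the router port would suggest GPT traffic could still be reaching the local router.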