fix: default codex sentinel protect retries

Codex
2026-06-13 06:57:35 +00:00
parent 8ac952c7a6
commit e78cc9a16d
4 changed files with 28 additions and 18 deletions
+2 -2
@@ -134,9 +134,9 @@ The sentinel must not maintain separate classifiers for "private content", "main
`profiles.entries[].trustUpstream` is the durable account-level trust marker for sentinel success cadence, and the absence of the field means untrusted. Trusted and untrusted accounts use separate YAML cadence maximums after marker-matching probes; the values belong only in `config/platform-infra/sub2api-codex-pool.yaml`. This field must not change Sub2API scheduler priority, capacity, load factor, membership, built-in temporary-unschedulable settings, or the marker-only health contract. Its purpose is to keep intermittently unreliable 200-success providers under more frequent direct probes without adding provider-specific content classifiers.
-`profiles.entries[].sentinelProtect` is an optional account-level protection policy for sentinel freeze decisions, and the absence of the field means disabled. For protected accounts, the marker-only health contract still applies, but the sentinel must exhaust the configured consecutive marker confirmation attempts before treating the account as failed and entering the freeze state machine. The retry count, initial delay, maximum delay, and backoff multiplier are YAML values; long-term reference prose must not duplicate the current numbers. This policy exists only to absorb occasional marker/probe or gateway-failure confirmation jitter for selected accounts. It must not change Sub2API scheduler priority, capacity, load factor, membership, built-in temporary-unschedulable settings, or the recovery condition.
+`pool.defaultSentinelProtect` is the default protection policy for sentinel freeze decisions, and `profiles.entries[].sentinelProtect` may override it for a specific account. For protected accounts, the marker-only health contract still applies, but the sentinel must exhaust the configured consecutive marker confirmation attempts before treating the account as failed and entering the freeze state machine. The retry count, initial delay, maximum delay, and backoff multiplier are YAML values; long-term reference prose must not duplicate the current numbers. This policy exists only to absorb occasional marker/probe or gateway-failure confirmation jitter. It must not change Sub2API scheduler priority, capacity, load factor, membership, built-in temporary-unschedulable settings, or the recovery condition.
-When `codex-pool sync --confirm` creates a YAML-managed account or changes direct-probe-relevant account inputs such as the profile mapping, upstream base URL, API key fingerprint, upstream User-Agent, Responses WebSocket mode, `trustUpstream`, or `sentinelProtect`, sync records a pending sentinel probe from the pre-mutation runtime state, updates the account, restores `schedulable=true` unless an active sentinel quarantine already exists, and schedules the account probe immediately. New or changed accounts are not default-frozen; only an actual non-marker probe result or an existing active quarantine may remove an account from the scheduler. This avoids zero-available windows during sync while still ensuring that later marker failures enter the normal freeze/restore state machine. Unchanged accounts must not have their existing success or failure backoff reset by unrelated YAML syncs.
+When `codex-pool sync --confirm` creates a YAML-managed account or changes direct-probe-relevant account inputs such as the profile mapping, upstream base URL, API key fingerprint, upstream User-Agent, Responses WebSocket mode, `trustUpstream`, or pool/profile `sentinelProtect`, sync records a pending sentinel probe from the pre-mutation runtime state, updates the account, restores `schedulable=true` unless an active sentinel quarantine already exists, and schedules the account probe immediately. New or changed accounts are not default-frozen; only an actual non-marker probe result or an existing active quarantine may remove an account from the scheduler. This avoids zero-available windows during sync while still ensuring that later marker failures enter the normal freeze/restore state machine. Unchanged accounts must not have their existing success or failure backoff reset by unrelated YAML syncs.
If the YAML failure freeze maximum is lowered, `codex-pool sync --confirm` may migrate only currently active sentinel quarantines whose stored interval or next recovery time exceeds the current maximum. The migration keeps the account frozen, marks the next recovery probe due immediately, and lets the next marker result decide restore versus the new shorter failure backoff. It must not clear quarantine or restore schedulability merely because an older TTL has expired.
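
The default-plus-override policy and the "exhaust confirmation attempts before freezing" rule described above can be illustrated with a minimal Python sketch. This is not the project's implementation. The class name, field names, and helper functions are hypothetical, the numbers in the usage note are placeholders rather than the real YAML values, and the sketch assumes that a profile-level policy replaces the pool default as a whole and that any marker success during confirmation prevents the freeze.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SentinelProtect:
    # Hypothetical field names; the real YAML keys are not shown in this diff.
    retries: int
    initial_delay_s: float
    max_delay_s: float
    multiplier: float


def resolve_policy(pool_default: Optional[SentinelProtect],
                   profile_override: Optional[SentinelProtect]) -> Optional[SentinelProtect]:
    """Assumption: a profile-level policy replaces the pool default entirely."""
    return profile_override if profile_override is not None else pool_default


def confirmation_delays(policy: Optional[SentinelProtect]) -> List[float]:
    """Delay before each confirmation attempt, capped at the maximum delay."""
    if policy is None:
        return []
    delays = []
    delay = policy.initial_delay_s
    for _ in range(policy.retries):
        delays.append(min(delay, policy.max_delay_s))
        delay *= policy.multiplier
    return delays


def should_enter_freeze(marker_results: List[bool],
                        policy: Optional[SentinelProtect]) -> bool:
    """True only when the initial probe and every confirmation attempt missed the marker.

    marker_results[0] is the initial probe; later entries are confirmations.
    """
    attempts = 1 + (policy.retries if policy is not None else 0)
    if len(marker_results) < attempts:
        return False  # confirmation attempts are not exhausted yet
    return not any(marker_results[:attempts])
```

With a placeholder pool default of `SentinelProtect(3, 1.0, 4.0, 2.0)` and no profile override, `confirmation_delays` yields `[1.0, 2.0, 4.0]`, and the account is treated as failed only after four consecutive non-marker results. Nothing in this sketch touches scheduler priority, capacity, membership, or the recovery condition, which matches the constraint stated in the prose.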