docs: clarify code queue split-brain supervision
@@ -50,6 +50,8 @@ Use:
- `bun scripts/cli.ts codex task <taskId> --trace --limit N` or `codex output` only when the summary is insufficient.
- The liveness rules in `docs/reference/observability.md` when master control-plane state and D601 scheduler state appear split.
`split-brain` in queue diagnostics is a control-plane/execution-plane divergence signal, not automatic evidence that the work is dead. If the task's heartbeats are fresh and its trace is still advancing, treat the task as live and keep supervising it rather than interrupting or replacing it.
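As a minimal sketch of that rule, the check below treats a task as live whenever both its heartbeat and its trace are recent, regardless of the `split-brain` flag. The `TaskEvidence` shape, the field names, and the five-minute freshness window are hypothetical assumptions for illustration; the authoritative liveness rules are the ones in `docs/reference/observability.md`.

```typescript
// Hypothetical evidence shape; these field names are NOT the real diagnostics schema.
interface TaskEvidence {
  splitBrain: boolean;      // queue diagnostics reported control/execution divergence
  lastHeartbeatMs: number;  // epoch milliseconds of the most recent heartbeat
  lastTraceEventMs: number; // epoch milliseconds of the most recent trace entry
}

type Verdict = "live" | "suspect";

// Assumed freshness window, chosen only for this example.
const FRESH_WINDOW_MS = 5 * 60 * 1000;

function classifyTask(e: TaskEvidence, nowMs: number): Verdict {
  const heartbeatFresh = nowMs - e.lastHeartbeatMs <= FRESH_WINDOW_MS;
  const traceAdvancing = nowMs - e.lastTraceEventMs <= FRESH_WINDOW_MS;
  // e.splitBrain is deliberately not consulted: divergence alone is a signal
  // to look closer, not a reason to interrupt or replace the task.
  return heartbeatFresh && traceAdvancing ? "live" : "suspect";
}
```

A `suspect` verdict here only means the task deserves a closer look at its summary or trace; it is not a recommendation to kill it.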
Long-running tasks with fresh trace or heartbeat evidence should normally be left alone. Polling every few minutes is preferred over repeated interrupt/retry cycles.
For broad CI/CD migration waves, use a fixed supervision cadence unless an incident demands faster action. A five-minute poll loop is the default: read `codex queues`, read the summaries of terminal or suspect tasks, then accept, retry, split a blocker, or leave healthy tasks alone. The loop should keep the supervisor doing useful non-overlapping work, such as documentation or issue triage, but that side work must not take over a worker's assigned implementation.
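One pass of that loop can be sketched as a pure decision function. Everything here is an assumption made for illustration: the `TaskSummary` shape, the status names, and the mapping from status to action (including retrying an unhealthy running task) are hypothetical and do not describe the real `codex` CLI output.

```typescript
// Hypothetical summary shape; statuses and fields are illustrative only.
type Status = "running" | "succeeded" | "failed" | "blocked";
type Action = "accept" | "retry" | "split-blocker" | "leave-alone";

interface TaskSummary {
  id: string;
  status: Status;
  healthy: boolean; // fresh heartbeat and advancing trace
}

// Five-minute default cadence from the text above.
const POLL_INTERVAL_MS = 5 * 60 * 1000;

// Assumed mapping from a task summary to one of the four supervisor actions.
function decide(t: TaskSummary): Action {
  switch (t.status) {
    case "succeeded":
      return "accept";
    case "failed":
      return "retry";
    case "blocked":
      return "split-blocker";
    case "running":
      // Healthy long-running work is left alone rather than interrupted.
      return t.healthy ? "leave-alone" : "retry";
  }
}

// Plan one poll cycle: one action per task, keyed by task id.
function planCycle(tasks: TaskSummary[]): Map<string, Action> {
  return new Map(tasks.map((t): [string, Action] => [t.id, decide(t)]));
}
```

A supervisor would call `planCycle` once per `POLL_INTERVAL_MS` and spend the time between cycles on its own non-overlapping work.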