From e0d38d717230cc196dfb1dbc4259ca09aba93ec3 Mon Sep 17 00:00:00 2001
From: Codex
Date: Wed, 20 May 2026 01:24:38 +0000
Subject: [PATCH] docs: clarify code queue split-brain supervision

---
 docs/reference/code-queue-supervision.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/reference/code-queue-supervision.md b/docs/reference/code-queue-supervision.md
index 78d4b58b..a0fe9dd5 100644
--- a/docs/reference/code-queue-supervision.md
+++ b/docs/reference/code-queue-supervision.md
@@ -50,6 +50,8 @@ Use:
 - `bun scripts/cli.ts codex task --trace --limit N` or `codex output` only when the summary is insufficient.
 - The liveness rules in `docs/reference/observability.md` when master control-plane state and D601 scheduler state appear split.
 
+`split-brain` in queue diagnostics is a control-plane/execution-plane divergence signal, not automatic evidence that the work is dead. If the task heartbeats are fresh and the trace is still advancing, treat the task as live and keep supervising it rather than interrupting or replacing it.
+
 Long-running tasks with fresh trace or heartbeat evidence should normally be left alone. Polling every few minutes is preferred over repeated interrupt/retry cycles.
 
 For broad CI/CD migration waves, use a fixed supervision cadence unless an incident demands faster action. A five-minute poll loop is the default: read `codex queues`, read terminal or suspect task summaries, then either accept, retry, split a blocker, or leave healthy tasks alone. The loop should keep the supervisor doing useful non-overlapping work, such as documentation or issue triage, but that side work must not take over a worker's assigned implementation.
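
Reviewer note: the split-brain rule added by this patch can be read as a small decision function. The sketch below is only an illustration under stated assumptions. The `TaskSnapshot` shape, the `nextStep` helper, and the five-minute freshness threshold are all hypothetical; none of them come from the repository or from the output of `codex queues`.

```typescript
// Hypothetical snapshot of one task as a supervisor might assemble it.
// Field names are assumptions, not the real diagnostics schema.
interface TaskSnapshot {
  id: string;
  splitBrain: boolean;     // queue diagnostics reported control/execution divergence
  heartbeatAgeMs: number;  // time since the last task heartbeat
  traceAdvancing: boolean; // trace gained new entries since the previous poll
}

type Step = "leave-alone" | "read-summary";

// Assumed freshness window, chosen to match the five-minute poll cadence.
const FRESH_HEARTBEAT_MS = 5 * 60 * 1000;

function nextStep(task: TaskSnapshot, freshMs: number = FRESH_HEARTBEAT_MS): Step {
  const heartbeatFresh = task.heartbeatAgeMs <= freshMs;
  // The splitBrain flag is deliberately not consulted: on its own it is a
  // divergence signal, not evidence that the work is dead.
  if (heartbeatFresh && task.traceAdvancing) {
    return "leave-alone";
  }
  // Without liveness evidence, escalate to reading the task summary before
  // deciding whether to accept, retry, or split a blocker.
  return "read-summary";
}

const liveButSplit = nextStep({
  id: "task-a",
  splitBrain: true,
  heartbeatAgeMs: 30_000,
  traceAdvancing: true,
});

const staleAndSplit = nextStep({
  id: "task-b",
  splitBrain: true,
  heartbeatAgeMs: 20 * 60 * 1000,
  traceAdvancing: false,
});
```

In this sketch a split-brain task with a fresh heartbeat and an advancing trace is left alone, while a stale one is only routed to a summary read; no branch interrupts or replaces a task directly.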