Release Governance

This document owns UniDesk release-line, runtime-version, CI/CD control-plane and feature-flag governance. The decision record is GitHub issue #6.

For the strategic filter that decides whether a proposal should exist at all, and whether it is a short-term or long-term investment, see docs/reference/strategy-governance.md.

Decision Scope

The governance decision covers four boundaries:

  • stable maintenance work for the current usable UniDesk architecture;
  • high-risk integration work on master, including the Rust backend-core rewrite and next-generation infrastructure changes;
  • CI/CD server runtime pinning versus CLI compatibility;
  • short-lived feature flags versus long-lived release or service boundaries.

This document records the target policy. It does not by itself create a release/v1 branch, change deploy.json schema, or authorize production deployment.

Release Lines

master remains the normal integration branch for UniDesk source changes. New architecture work, backend-core Rust migration, provider-gateway reshaping, Code Agent sandbox work, and other high-risk development can continue there, but master must not be treated as the implicit production or stable-dev runtime truth.

release/v1 is the planned stable maintenance line for the existing usable architecture. Its baseline should be the last known-good TypeScript backend-core version or an equivalent verified stable commit. After it is enabled, it may accept only:

  • bug fixes for existing behavior;
  • high-availability, recoverability and observability fixes;
  • CI/CD reliability fixes;
  • security and compatibility fixes;
  • narrowly scoped deployment fixes that preserve the existing architecture.

release/v1 must not carry new product features, large architecture changes, the default Rust backend-core switch, or speculative Code Agent sandbox behavior. Any exception requires an explicit issue and a deployment rollback plan.
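
The acceptance rule above can be sketched as a small check. This is an illustration only: the category labels, the `Change` record and the `may_land_on_release_v1` helper are hypothetical names and are not part of any existing UniDesk tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels mirroring the five allowed change types listed above.
ALLOWED_ON_RELEASE_V1 = frozenset({
    "bug-fix",
    "availability-fix",              # high availability, recoverability, observability
    "cicd-reliability-fix",
    "security-or-compatibility-fix",
    "scoped-deploy-fix",
})


@dataclass(frozen=True)
class Change:
    category: str
    exception_issue: Optional[str] = None  # explicit issue approving an exception
    has_rollback_plan: bool = False


def may_land_on_release_v1(change: Change) -> bool:
    """Return True when the change fits the release/v1 policy."""
    if change.category in ALLOWED_ON_RELEASE_V1:
        return True
    # Everything else needs both an explicit issue and a rollback plan.
    return change.exception_issue is not None and change.has_rollback_plan
```

A reviewer or pre-merge script could call this with the category taken from the pull request label; the mapping from labels to categories is left open here.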

Until the release-line implementation is completed in CLI, CI, CD and documentation, the current repository rule still applies: UniDesk agent changes are developed on master and pushed to origin master. Creating or updating release/v1 is an explicit release operation, not a replacement for arbitrary feature or fix branches.

Stabilization Mode

UniDesk enters stabilization mode when core availability is threatened by control-plane, deployment, backend-core, provider-gateway, Code Queue, or CI/CD instability. During stabilization:

  • high availability, trace visibility, deploy reproducibility and rollback safety take priority over new features;
  • production and stable dev should run only from commit-pinned desired state;
  • live manual repairs are temporary and must be converted into pushed Git changes;
  • high-risk architecture work may continue only on the integration lane and must not be promoted into stable runtime by default.

Exit from stabilization requires CI/CD, deploy verification, Code Queue recoverability and dev/prod isolation to be demonstrably healthy.
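
As a rough sketch, the exit condition can be expressed as a list of named criteria that must all report healthy. The criterion keys and the `unmet_exit_criteria` helper are assumptions made for illustration; how each signal is actually measured is not defined by this document.

```python
from typing import List, Mapping

# Hypothetical keys, one per exit requirement named above.
EXIT_CRITERIA = (
    "ci_cd",
    "deploy_verification",
    "code_queue_recoverability",
    "dev_prod_isolation",
)


def unmet_exit_criteria(health: Mapping[str, bool]) -> List[str]:
    """List the criteria that are missing or unhealthy; empty means exit is allowed."""
    return [name for name in EXIT_CRITERIA if not health.get(name, False)]
```

Treating a missing signal as unhealthy keeps the check conservative: stabilization ends only when every criterion has been demonstrated, not merely when nothing has been reported as broken.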

V1 Stabilization Window

When release/v1 is being stabilized, the current dev environment should be temporarily treated as the v1 validation lane. Do not add dev-v1 and dev-master manifest lanes during the same window unless the absence of split lanes is the blocker for a v1 fix.

The temporary rule is:

  • pause high-risk master feature and architecture work that would compete for dev, CI, Code Queue or deploy capacity;
  • keep origin/master:deploy.json#environments.dev as the implemented command contract, but pin the relevant service commits to v1 stabilization commits when validating v1 fixes;
  • document any temporary v1 pin in the related issue before relying on it;
  • do not introduce dirty local manifests, hidden deploy refs, or implicit branch-based behavior;
  • do not change deploy.json schema only to express the future split-lane target.
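
One way to guard the pinning rule above is to reject any service entry that is not a full commit SHA. The sketch below assumes a manifest shape of `environments.<name>.services.<service>.commit`; only `environments.dev` is confirmed by this document, so the `services` and `commit` keys and the `unpinned_services` helper are hypothetical.

```python
import re
from typing import List

# A full 40-character lowercase hex SHA, as opposed to a branch or tag name.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def unpinned_services(manifest: dict, environment: str = "dev") -> List[str]:
    """Return the services whose commit is missing or not a full SHA."""
    # Assumed layout; the real deploy.json schema may differ.
    services = manifest["environments"][environment].get("services", {})
    return sorted(
        name
        for name, spec in services.items()
        if not FULL_SHA.fullmatch(str(spec.get("commit", "")))
    )
```

A branch name such as `master` fails the check, which matches the rule against implicit branch-based behavior.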

The v1 stabilization task should move through these phases:

  1. Establish the v1 baseline from the last known-good TypeScript backend-core commit, or an equivalent verified stable commit, and record which post-baseline fixes are candidates for backport.
  2. Backport user-visible and operability fixes that preserve the existing architecture, especially Code Queue prompt/trace visibility, /api/workdirs, gateway diagnostics, and CI image-build stability.
  3. Make the current dev lane validate v1 by pinning environments.dev service commits to the v1 fix set and running the existing dev/CI checks.
  4. Promote only the verified v1 commits to production desired state through the normal commit-pinned deploy path.
  5. After v1 is stable, implement explicit dev-v1 and dev-master lanes or an equivalent schema, with CLI, CI, frontend labels, diagnostics and documentation updated together.

Code Queue can implement bounded v1 bug fixes, tests and documentation patches when the task can run in an isolated worktree and does not require production mutation. Manual control is required for branch creation, v1 baseline selection, deploy.json pin changes, production deploys, recovery actions on active tasks, and any decision to resume high-risk master work.

CI/CD Runtime Versioning

CI/CD server and control-plane services are production-like infrastructure. Their runtime version must be pinned by deploy.json to the production desired commit, not implicitly follow the operator's local worktree or the latest master.

The CI/CD CLI may run from master because it is the command vocabulary and should evolve quickly. Running a newer CLI against the pinned server version is acceptable only when all of the following hold:

  • CLI changes are backward compatible with the pinned server version or fail with a clear unsupported-version error;
  • server-side policy remains authoritative for deploy boundaries, allowed environments and dangerous operations;
  • the CLI uses server capability data instead of guessing support from local code;
  • an unsupported server capability must not be bypassed through raw SSH, direct kubectl, direct SQL, or hidden fallback commands.

CI/CD services should expose their commit, API/schema capability, supported environments, supported services and supported operations through health or capability endpoints. The CLI must include the observed capability or server commit in diagnostics for failed operations.
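
A minimal sketch of the capability check, assuming the server returns its commit plus sets of supported environments, services and operations. The `ServerCapabilities` record, the `require_capability` helper and the error class are hypothetical names; the real endpoint and payload are not specified here.

```python
from dataclasses import dataclass
from typing import FrozenSet


class UnsupportedOperationError(RuntimeError):
    """Raised when the pinned server does not advertise a requested capability."""


@dataclass(frozen=True)
class ServerCapabilities:
    commit: str
    environments: FrozenSet[str]
    services: FrozenSet[str]
    operations: FrozenSet[str]


def require_capability(caps: ServerCapabilities, operation: str,
                       environment: str, service: str) -> None:
    """Fail clearly instead of guessing support from local CLI code."""
    missing = []
    if operation not in caps.operations:
        missing.append(f"operation={operation}")
    if environment not in caps.environments:
        missing.append(f"environment={environment}")
    if service not in caps.services:
        missing.append(f"service={service}")
    if missing:
        # Include the observed server commit so the failure is diagnosable.
        raise UnsupportedOperationError(
            f"server at commit {caps.commit} does not support " + ", ".join(missing)
        )
```

The helper raises rather than falling back to another path, in line with the rule that an unsupported capability must not be bypassed.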

Dev Environment Lanes

The target model is to separate a stable maintenance dev lane from a master integration dev lane, for example with explicit names such as dev-v1 and dev-master or an equivalent schema. The stable lane validates release/v1 fixes without disrupting production. The master lane validates high-risk work and may be less stable.

This split must be implemented explicitly in deploy.json, deploy planning, deploy apply, CI, frontend labels and diagnostics. Until that work is done, existing commands continue to read the currently documented manifest ref such as origin/master:deploy.json#environments.dev; operators must not simulate split lanes through dirty local manifests, hidden branches, or undocumented runtime edits.

During a v1 stabilization window, the future split-lane implementation is deliberately deferred. The current environments.dev lane is reserved for v1 validation until the v1 release criteria are met, then the explicit split-lane work can proceed as a separate infrastructure change.

CI/CD Infrastructure Repair

CI/CD infrastructure bootstrap, repair and upgrade must not depend on a new CI job that tests the CI/CD bootstrap path itself. When the CI/CD infrastructure is the broken component, use manual validation: runtime health, logs, source commit metadata, capability endpoints, bounded smoke commands and operator review.

Manual CI/CD infrastructure fixes may be promoted directly to production when production infrastructure is the intended target. The durable state still has to return to Git-backed desired state: any code, manifest, config or policy change that should survive must be committed and pushed after the repair.

Feature Flags

Feature flags are short-lived risk controls, not a long-term architecture partitioning mechanism. Prefer release lines, service boundaries and deployment boundaries for durable divergence.

Allowed feature flags:

  • release toggles with a named owner and removal condition;
  • kill switches for availability protection;
  • migration toggles that are observable in health output and removed after rollout;
  • compatibility toggles needed for a bounded transition.

Disallowed feature-flag patterns:

  • permanent flags with no owner or removal issue;
  • nested flag combinations that create untested behavior matrices;
  • flags that hide a second data, control, event, or deployment path;
  • flags that let production bypass the documented desired-state or server-side deploy policy.

Every feature flag must have a default, an owner, an observability signal, a removal condition, and tests for both the enabled and disabled states when both can run in production.
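
The required flag metadata can be sketched as a record with a validation helper. The field names and `flag_violations` are hypothetical; this document does not define where flag definitions are stored.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class FeatureFlag:
    name: str
    default: Optional[bool]
    owner: str
    observability_signal: str        # e.g. a health-output field name
    removal_condition: str
    tested_states: FrozenSet[bool]   # flag states covered by tests
    both_states_in_production: bool


def flag_violations(flag: FeatureFlag) -> List[str]:
    """Return the policy requirements this flag definition fails to meet."""
    problems = []
    if flag.default is None:
        problems.append("missing default")
    if not flag.owner.strip():
        problems.append("missing owner")
    if not flag.observability_signal.strip():
        problems.append("missing observability signal")
    if not flag.removal_condition.strip():
        problems.append("missing removal condition")
    if flag.both_states_in_production and not {True, False} <= flag.tested_states:
        problems.append("tests must cover both states")
    return problems
```

An empty result means the definition satisfies the stated minimum; it says nothing about whether the flag is one of the allowed kinds.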

Promotion And Backport Rules

Stable-line fixes must remain traceable. A fix may land first on master and then be cherry-picked to release/v1, or land on release/v1 and be forward-ported to master, but the chosen direction must be recorded in the related issue or pull request.
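
In addition to the issue or pull request record, the origin of a backport can be kept visible in Git itself: `git cherry-pick -x` appends a "(cherry picked from commit ...)" line to the new commit message. The sketch below reads that line back; the `cherry_pick_source` helper is a hypothetical name, and using `-x` is a suggestion rather than an established UniDesk rule.

```python
import re
from typing import Optional

# Line appended to the commit message by `git cherry-pick -x`.
_CHERRY_PICK_LINE = re.compile(r"\(cherry picked from commit ([0-9a-f]{40})\)")


def cherry_pick_source(message: str) -> Optional[str]:
    """Return the source commit SHA recorded by `git cherry-pick -x`, if any."""
    match = _CHERRY_PICK_LINE.search(message)
    return match.group(1) if match else None
```

A commit without the line returns `None`, which a review script could treat as a prompt to confirm that the direction is recorded elsewhere.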

Deployment truth remains commit-pinned. Neither release/v1 nor master may be represented by mutable local runtime files, dirty worktrees, copied source trees, copied images, or manual hotfixes that are not committed and pushed.