Management Fleet
Authoritative bounded-context reference for the Management Fleet (internal/provisioning/managementfleet/). It owns the durable record of which Kubernetes clusters host the provisioning substrate (Crossplane v2 + the External Secrets Operator) and which cluster owns each Project, and reconciles a dedicated RBAC-scoped namespace per assigned Project.
The context has no HTTP surface: callers reach it through the in-process Service facade, and the reconcile loop drives the per-Project namespace off live cluster state. The closed port set keeps the domain and application layers free of pgx and controller-runtime (ports.go). This milestone registers clusters, places Projects, reconciles namespaces, and verifies a cluster carries an Available substrate before assignment — installing the substrate stays with the deploy/ Helm chart.
This reference is a single page: the surface is narrow (two aggregates, one pure state machine, five ports, seven sentinels) and the pieces travel in lockstep. The Recovery runbook at the foot is the operator-facing companion.
Cross-references
- ../../contributing/layout.md: the bounded-context map row that locates internal/provisioning/managementfleet inside the codebase and enumerates the depguard rules that confine pgx to repo/, controller-runtime to reconcile/, and bar cross-context imports.
- ./credentials.md and ./credential-pool.md: the sibling provisioning contexts. The management fleet mirrors their module shape (a domain package, a repo/ adapter, a port set, an events subpackage) but owns inventory and reconcile rather than secret material.
- ../identity/tenancy.md: the Domain → Project → Resource → Node aggregate model. A Project is owned by the tenancy context; the management fleet observes it only as the key of a ProjectClusterAssignment and as the seed of the per-Project namespace name.
- ../../../internal/provisioning/managementfleet/doc.go: the package-level pin of the ubiquitous language and the reconcile-model DECISION block (why a boot-probe-plus-ticker closure rather than a watch-driven controller-runtime Manager).
- ../../../internal/platform/db/migrations/0023_management_fleet.sql: the persistence schema for plexsphere.management_clusters and plexsphere.project_cluster_assignments.
- ../../../cmd/plexsphere/management_fleet_factory_prod.go: the production composition root: the env loader, the in-cluster client, the reconcile sweep, and the boot-probe / steady-state-ticker pair.
- ../../../tests/e2e/provisioning/management-fleet/chainsaw-test.yaml: the Chainsaw e2e suite that stands up a kind cluster with digest-pinned Crossplane v2 and the External Secrets Operator, runs the reconcile, and asserts the per-Project Namespace, Role, RoleBinding, ServiceAccount, ResourceQuota, and a passing verify gate.
Ubiquitous language
The terms below travel together across the Go code, the SQL migration, the domain-event payloads, the structured-log attributes, and the recovery runbook. Names are preserved verbatim so a reader chasing a string from a log line finds it in the source without translation.
| Term | Definition | Code anchor |
|---|---|---|
| ManagementCluster | The aggregate root modelling one registered Kubernetes cluster in the fleet. Carries (id, name, slug, region, kubeconfigSecretRef, status, createdAt, updatedAt). The application mints a UUIDv7 id at registration; the slug is the unique human-facing key the assignment model shards on. The fields are unexported and reached through accessors, so a valid aggregate can be produced only through the two constructors. | cluster.go |
| ClusterID | The UUIDv7 identity of a ManagementCluster, a named wrapper over uuid.UUID. The String() projection is the canonical hyphenated lowercase form. The zero value is rejected by every creation invariant. | cluster.go |
| ProjectClusterAssignment | The aggregate root keyed by Project id, recording which management cluster owns a Project, the region the placement landed in, the per-Project namespace name, and that namespace's lifecycle phase. The (projectID, managementClusterID) pairing is immutable once the Project has provisioned resources. | assignment.go |
| ProjectID | The UUID identity of a Project as the fleet observes it — a named wrapper over uuid.UUID, distinct at compile time from ClusterID. It is the assignment key and the seed of the namespace name. | namespace.go |
| per-Project namespace | The Kubernetes Namespace carved out for one Project on its management cluster. Its name is plexsphere-project-<uuid> — the fixed prefix followed by the canonical hyphenated lowercase Project UUID. The name is a deterministic function of the Project id (a 55-character RFC 1123 label) so the persisted column and the live cluster name can never drift. | namespace.go |
| NamespacePhase | The closed-set lifecycle of the per-Project namespace: Pending, Provisioning, Ready, Degraded, Terminating, Deleted. The phase is advanced exclusively by the pure transition function over observed cluster facts, never from process memory. | phase.go |
| ObservedCluster | The snapshot of per-Project facts the reconcile loop reads live from a management cluster on a single tick: whether the Namespace, Role, RoleBinding, ServiceAccount, and ResourceQuota exist, and whether the verify gate passed. Every field is a fact the cluster reports, not a desired state. | phase.go |
| Action | The reconcile instruction the transition function emits for one tick: Noop, Converge, or Delete. | phase.go |
| Service | The in-process application-service facade. Methods: RegisterCluster, AssignProjectToCluster, LookupAssignment, SelectClusterForRegion, MarkTerminating, Unassign. It orchestrates the Repository and owns the transaction boundary; the aggregates enforce their own creation invariants. | service.go |
| Reconciler | The controller-runtime adapter that reconciles exactly one ProjectClusterAssignment per call: observe cluster facts, decide via the pure transition, apply the action, persist the phase, emit the lifecycle event. | reconcile/reconcile.go |
| verify gate | The read-only readiness check that confirms a management cluster serves the apiextensions.crossplane.io, pkg.crossplane.io, and external-secrets.io API groups and that the Crossplane and External Secrets Operator controller Deployments report Available. | reconcile/verify.go |
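
The namespace-name derivation described in the table can be sketched in a few lines. This is a minimal illustration, not the code in namespace.go: the ProjectID here is a raw 16-byte array standing in for the named wrapper over uuid.UUID, so the sketch needs only the standard library, and the String method is a hand-rolled assumption of the canonical hyphenated lowercase form.

```go
package main

import "encoding/hex"

// ProjectID is a stand-in for the named wrapper over uuid.UUID; a raw
// 16-byte array keeps this sketch free of third-party imports.
type ProjectID [16]byte

// namespacePrefix is the fixed prefix named in the ubiquitous-language table.
const namespacePrefix = "plexsphere-project-"

// String renders the canonical hyphenated lowercase form (8-4-4-4-12).
func (p ProjectID) String() string {
	h := hex.EncodeToString(p[:]) // 32 lowercase hex characters
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

// NamespaceName derives the per-Project namespace name from the id alone,
// so the persisted column and the live cluster name share one source.
func NamespaceName(p ProjectID) string {
	return namespacePrefix + p.String()
}
```

The prefix contributes 19 characters and the hyphenated UUID 36, which gives the 55-character label the table mentions, inside the 63-character RFC 1123 label limit.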
Aggregates
The context owns two aggregate roots. They are modified one per transaction; the application service coordinates the cross-aggregate read (the resource count) inside the same RunInTx as the assignment write so the immutability invariant holds against one consistent snapshot.
ManagementCluster — invariants
NewManagementCluster and HydrateManagementCluster are the only paths that produce a valid aggregate. Both enforce the same creation invariants, so a row that drifts from them in storage is rejected at the hydration boundary rather than surfacing later in the reconcile loop.
| Invariant | Layer | Failure mode |
|---|---|---|
| id is non-zero. | Aggregate constructor. | ErrInvalidInput: a zero ClusterID would collide with the "not yet assigned" sentinel. |
| slug is non-empty. | Aggregate constructor; SQL UNIQUE constraint management_clusters_slug_unique. | ErrInvalidInput on an empty slug; a duplicate slug surfaces from the repository adapter. |
| createdAt and updatedAt are non-zero, with createdAt ≤ updatedAt. | Aggregate constructor. | ErrInvalidInput: an unordered or zero timestamp pair is a corrupt row. |
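
A constructor that enforces the three invariants above might look like the following sketch. It is an illustration under assumptions, not the code in cluster.go: the field set is trimmed to the fields the invariants mention, ClusterID is a raw byte array rather than the uuid.UUID wrapper, and isInvalidInput and exampleCluster are hypothetical helpers added only to show usage.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrInvalidInput mirrors the sentinel named in the table; the message is illustrative.
var ErrInvalidInput = errors.New("managementfleet: invalid input")

// ClusterID stands in for the named wrapper over uuid.UUID.
type ClusterID [16]byte

// ManagementCluster keeps its fields unexported so the constructor is the only way in.
type ManagementCluster struct {
	id        ClusterID
	slug      string
	createdAt time.Time
	updatedAt time.Time
}

// NewManagementCluster rejects any input that violates a creation invariant.
func NewManagementCluster(id ClusterID, slug string, createdAt, updatedAt time.Time) (*ManagementCluster, error) {
	switch {
	case id == (ClusterID{}):
		return nil, fmt.Errorf("%w: cluster id is zero", ErrInvalidInput)
	case slug == "":
		return nil, fmt.Errorf("%w: slug is empty", ErrInvalidInput)
	case createdAt.IsZero() || updatedAt.IsZero():
		return nil, fmt.Errorf("%w: timestamp is zero", ErrInvalidInput)
	case createdAt.After(updatedAt):
		return nil, fmt.Errorf("%w: createdAt is after updatedAt", ErrInvalidInput)
	}
	return &ManagementCluster{id: id, slug: slug, createdAt: createdAt, updatedAt: updatedAt}, nil
}

// Slug and CreatedAt are read-only accessors over the unexported fields.
func (c *ManagementCluster) Slug() string         { return c.slug }
func (c *ManagementCluster) CreatedAt() time.Time { return c.createdAt }

// isInvalidInput shows how a caller branches on the sentinel through wrapping.
func isInvalidInput(err error) bool { return errors.Is(err, ErrInvalidInput) }

// exampleCluster builds a valid aggregate from hypothetical values.
func exampleCluster() (*ManagementCluster, error) {
	now := time.Unix(1700000000, 0).UTC()
	return NewManagementCluster(ClusterID{1}, "eu-central-a", now, now)
}
```

Running the same checks in a hydration constructor is what lets a drifted row be rejected at the storage boundary instead of inside the reconcile loop.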
ProjectClusterAssignment — invariants
| Invariant | Layer | Failure mode |
|---|---|---|
| projectID and managementClusterID are both non-zero; timestamps are non-zero and ordered. | Aggregate constructor. | ErrInvalidInput. |
| namespaceName equals NamespaceName(projectID). | Aggregate constructor: the field is derived, never caller-supplied, and re-derived on hydration. | A persisted name that drifts from the derivation cannot reach the reconcile loop: hydration recomputes it. |
| project_id is the PRIMARY KEY: a Project is assigned to at most one cluster at a time. | SQL PRIMARY KEY. | A second CreateAssignment for the same Project surfaces a constraint violation from the repository adapter. |
| The (projectID, managementClusterID) pairing is immutable once the Project owns at least one resource. | Application service: CountResourcesForProject inside the placement RunInTx. | ErrAssignmentImmutable: re-pointing would orphan the Composite Resources, ProviderConfigs, and Secrets in the old namespace. |
| namespace_phase is one of the six closed-set values. | SQL CHECK constraint project_cluster_assignments_phase_check; application validation. | The CHECK is defence-in-depth: an out-of-set phase string cannot reach a persisted row even on an application bug. |
| Neither a Project nor a management cluster can be deleted while a live assignment references it. | SQL FOREIGN KEY … ON DELETE RESTRICT on both project_id and management_cluster_id. | The delete fails closed at the database — a CASCADE would silently orphan a live namespace with no inventory record that it needs tearing down. |
Namespace-phase state machine
The per-Project namespace walks a six-phase lifecycle. The phase is advanced only by the pure, total transition function Next (phase.go), which consults no clock or process memory and performs no I/O: it derives the Action and the next NamespacePhase solely from the live ObservedCluster snapshot and the assignment's current phase. The same inputs always yield the same output, so the machine is fully unit-testable without a cluster.
The machine has two arms, selected by teardown intent:
```text
Converge arm: current phase ∈ {Pending, Provisioning, Ready, Degraded}

  ┌──────────┐   resources missing OR verify failing
  │ Pending  │ ──────────────────────────────────────┐
  └────┬─────┘                                       │
       │                                             ▼
       │                                   ┌────────────────────┐
       │  all resources present            │    Provisioning    │
       │  AND verify passed                │     (Converge)     │
       ▼                                   └─────────┬──────────┘
  ┌──────────────┐ ◀─────────────────────────────────┘
  │    Ready     │   all resources present AND verify passed
  │  (Converge,  │
  │  then Noop)  │ ──┐ a required resource went missing
  └──────┬───────┘   │ OR verify started failing
         │           ▼
         │     ┌────────────────────┐
         │     │      Degraded      │  keeps converging to repair
         │     │     (Converge)     │
         │     └────────────────────┘
         │
         │ Service.MarkTerminating (teardown requested)
         ▼
Teardown arm: current phase ∈ {Terminating, Deleted}; sticky

  ┌────────────────────┐   resources still remain
  │    Terminating     │ ──────────────────────────┐
  │      (Delete)      │ ◀─────────────────────────┘
  └─────────┬──────────┘
            │ no per-Project resources remain on the cluster
            ▼
  ┌────────────────────┐
  │      Deleted       │   terminal (Noop)
  └────────────────────┘
```

Key properties the transition guarantees:
- Teardown is sticky. Once the phase is Terminating or Deleted the namespace never converges again; it can only drain to Deleted.
- Degraded is distinct from Provisioning. A namespace that reached Ready and then lost a resource is marked Degraded, not Provisioning, so an operator can tell a first install apart from a regression. Both keep emitting Converge.
- Ready is event-worthy once. A namespace arriving at Ready from a lower phase emits Converge so the caller records the crossing; on the next tick it settles to Noop.
- The function is total. Every (ObservedCluster, NamespacePhase) pair maps to a defined result; an unrecognised phase string is treated as Pending, the safe converge-from-scratch entry point.
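
The properties above can be sketched as a small pure function. This is a reading of the diagram, not the code in phase.go: the per-resource field names on ObservedCluster are hypothetical (only VerifyPassed is named in this reference), and two details are assumptions the diagram leaves open, namely that the tick crossing from Terminating into Deleted emits Noop and that a repaired Degraded namespace returns to Ready.

```go
package main

// NamespacePhase is the closed-set lifecycle of the per-Project namespace.
type NamespacePhase string

const (
	Pending      NamespacePhase = "Pending"
	Provisioning NamespacePhase = "Provisioning"
	Ready        NamespacePhase = "Ready"
	Degraded     NamespacePhase = "Degraded"
	Terminating  NamespacePhase = "Terminating"
	Deleted      NamespacePhase = "Deleted"
)

// Action is the reconcile instruction emitted for one tick.
type Action string

const (
	Noop     Action = "Noop"
	Converge Action = "Converge"
	Delete   Action = "Delete"
)

// ObservedCluster holds facts read live from the cluster.
// The five *Exists field names are hypothetical.
type ObservedCluster struct {
	NamespaceExists, RoleExists, RoleBindingExists bool
	ServiceAccountExists, ResourceQuotaExists      bool
	VerifyPassed                                   bool
}

func (o ObservedCluster) allPresent() bool {
	return o.NamespaceExists && o.RoleExists && o.RoleBindingExists &&
		o.ServiceAccountExists && o.ResourceQuotaExists
}

func (o ObservedCluster) anyPresent() bool {
	return o.NamespaceExists || o.RoleExists || o.RoleBindingExists ||
		o.ServiceAccountExists || o.ResourceQuotaExists
}

// Next is pure and total: no clock, no I/O, a defined result for every input.
func Next(o ObservedCluster, current NamespacePhase) (Action, NamespacePhase) {
	// Teardown arm: sticky, can only drain towards Deleted.
	switch current {
	case Deleted:
		return Noop, Deleted
	case Terminating:
		if o.anyPresent() {
			return Delete, Terminating
		}
		return Noop, Deleted // assumption: nothing is left to delete
	}
	// Converge arm.
	healthy := o.allPresent() && o.VerifyPassed
	switch current {
	case Ready:
		if healthy {
			return Noop, Ready // already recorded the crossing
		}
		return Converge, Degraded // regression, not a first install
	case Degraded:
		if healthy {
			return Converge, Ready // assumption: repair re-crosses into Ready
		}
		return Converge, Degraded
	default: // Pending, Provisioning, or an unrecognised phase string
		if healthy {
			return Converge, Ready
		}
		return Converge, Provisioning
	}
}
```

Because the function takes only values and returns only values, a table-driven unit test can cover every phase without a cluster or a fake client.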
What the reconcile loop does with the result
Reconciler.Reconcile (reconcile/reconcile.go) runs one assignment per call in four steps:
- Observe: Get the five per-Project resources from the cluster and run the verify gate, building an ObservedCluster. A NotFound marks a resource absent; any other API error is an infrastructure failure that propagates. An ErrClusterUnhealthy from the verify gate collapses into VerifyPassed = false (the namespace keeps converging), while a non-unhealthy gate error propagates.
- Decide: call the pure Next transition.
- Apply: Converge creates or repairs the Namespace, then the RBAC objects, then the ResourceQuota, in that order; Delete tears the namespace down; Noop does nothing. Every step is idempotent.
- Persist + emit: when the phase changed, write it through the Repository and, on a crossing into Ready or Deleted, publish the matching lifecycle event.
Re-running Reconcile against an already-converged or already-torn-down cluster is success.
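
The error handling in the Observe step is the part most easily got wrong, so here is a minimal sketch of how a verify-gate result could be folded into the VerifyPassed fact. The function name foldVerify and the two sample errors are hypothetical; only ErrClusterUnhealthy and the collapse-versus-propagate rule come from this reference.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrClusterUnhealthy mirrors the domain sentinel; the message is illustrative.
var ErrClusterUnhealthy = errors.New("managementfleet: cluster unhealthy")

// Sample errors for demonstration only (hypothetical values).
var (
	errMissingGroup = fmt.Errorf("api group external-secrets.io not served: %w", ErrClusterUnhealthy)
	errOutage       = errors.New("apiserver: connection refused")
)

// foldVerify maps the verify gate's result onto the VerifyPassed fact.
// An unhealthy substrate is an observation, not a failure of the tick;
// anything else is an infrastructure error and propagates.
func foldVerify(gateErr error) (verifyPassed bool, err error) {
	switch {
	case gateErr == nil:
		return true, nil
	case errors.Is(gateErr, ErrClusterUnhealthy):
		return false, nil
	default:
		return false, fmt.Errorf("verify gate: %w", gateErr)
	}
}
```

With this shape, foldVerify(errMissingGroup) yields a false fact and no error, so the namespace keeps converging, while foldVerify(errOutage) returns an error that fails the tick.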
Region-assignment resolver
While Reconcile converges one assignment's namespace against live cluster facts, a RegionAssignmentResolver decides where a Project is placed in the first place and whether an existing placement still matches its tenant's region. It runs on each management-fleet sweep and has two halves, scheduling and migration, both governed by one exact-match contract: a placement is correct if and only if ManagementCluster.Region() == Domain.Region(). The match is exact string equality; there is no nearest-region fallback and no normalisation beyond the kebab-case validation ParseRegion already applied at construction.
A guiding rule frames both halves: a per-Project condition (no cluster for the region, a blocked migration) is a skip-and-WARN, never a sweep failure. Only an infrastructure error (a repository fault, an apiserver outage) fails a sweep. A single unplaceable Project therefore never stalls the placement of every other Project in the fleet.
Scheduling — Schedule()
Schedule() places each schedulable Project. A Project is schedulable when it owns at least one resource and has no assignment yet.
For each schedulable Project, the resolver looks for a registered management cluster whose region exactly matches the Project's Domain region:
- A cluster matches. The resolver creates a Pending assignment on it and emits a ProjectClusterAssigned event.
- No cluster matches. The resolver logs a WARN and skips the Project; no assignment row is created. The Project stays unplaced until a cluster for its region is registered; the next sweep retries. (An unpinned Domain, one with an empty region, is placed under the same free-placement path that predates region pinning.)
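
The exact-match lookup behind both outcomes can be sketched as follows. The Cluster struct and the function name are hypothetical, and two points are assumptions: that the first matching cluster wins when several share a region, and that the empty-region branch follows the "exactly one cluster" rule stated for ErrNoClusterForRegion in the error-sentinel table.

```go
package main

import "errors"

// ErrNoClusterForRegion mirrors the domain sentinel; the message is illustrative.
var ErrNoClusterForRegion = errors.New("managementfleet: no cluster for region")

// Cluster is a trimmed, hypothetical view of a registered management cluster.
type Cluster struct {
	Slug   string
	Region string
}

// selectClusterForRegion applies exact string equality: no nearest-region
// fallback and no case folding. An empty region resolves only when the
// fleet holds exactly one cluster.
func selectClusterForRegion(fleet []Cluster, region string) (Cluster, error) {
	if region == "" {
		if len(fleet) == 1 {
			return fleet[0], nil
		}
		return Cluster{}, ErrNoClusterForRegion
	}
	for _, c := range fleet {
		if c.Region == region {
			return c, nil // assumption: first match wins
		}
	}
	return Cluster{}, ErrNoClusterForRegion
}
```

A resolver built on this would treat ErrNoClusterForRegion as the skip-and-WARN case and reserve sweep failure for repository or apiserver errors.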
Migration — MigrateOutOfRegion()
Re-pinning a Domain's region (a PATCH, since the region is mutable unlike the immutable slug) re-targets every Project beneath it. The Project region is derived from the Domain, not stored on the Project, so one Domain re-pin can leave many assignments pointing at a cluster whose region no longer matches. MigrateOutOfRegion() reconciles each such assignment one step per sweep, and the path forks on whether the Project owns resources.
Zero-resource Project — migrated automatically. The resolver drives the move itself, advancing one step per sweep:
- It advances the current assignment's namespace phase into the teardown arm: from Pending, Provisioning, Ready, or Degraded to Terminating.
- While the namespace is Terminating, it waits; the reconcile loop drains the per-Project namespace, RBAC, and quota.
- Once the old assignment reaches Deleted, it re-points the Project to a fresh Pending assignment on a correct-region cluster, emitting ProjectClusterAssigned.
Because the state machine advances by a single step each sweep, a migration spans several sweeps; this is expected.
Resource-owning Project — blocked. When the Project owns resources, the (projectID, managementClusterID) pairing is immutable: the resolver does not move it. It logs a WARN, leaves the assignment untouched, and the sweep still returns success. Re-pointing a Project with live resources would orphan its Composite Resources, ProviderConfigs, and Secrets in the old region's namespace — the same fail-closed invariant ErrAssignmentImmutable guards in the application service. To migrate such a Project deliberately, drain its resources on the old cluster first (drive the namespace through Terminating → Deleted and Unassign), then let the next sweep place it in the new region.
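
The one-step-per-sweep decision for a single assignment can be sketched as a pure function. The MigrationStep type, its values, and the function name are hypothetical; the sketch also does not special-case an unpinned Domain, and it treats an unrecognised phase as "do nothing", which is an assumption.

```go
package main

// NamespacePhase is the closed-set lifecycle of the per-Project namespace.
type NamespacePhase string

const (
	Pending      NamespacePhase = "Pending"
	Provisioning NamespacePhase = "Provisioning"
	Ready        NamespacePhase = "Ready"
	Degraded     NamespacePhase = "Degraded"
	Terminating  NamespacePhase = "Terminating"
	Deleted      NamespacePhase = "Deleted"
)

// MigrationStep names what the resolver does for one assignment this sweep.
type MigrationStep string

const (
	StepNone            MigrationStep = "None"            // placement already matches
	StepBlocked         MigrationStep = "Blocked"         // WARN, leave untouched
	StepMarkTerminating MigrationStep = "MarkTerminating" // enter the teardown arm
	StepWait            MigrationStep = "Wait"            // reconcile loop is draining
	StepRepoint         MigrationStep = "Repoint"         // place on a correct-region cluster
)

// nextMigrationStep decides a single step; a full migration spans several sweeps.
func nextMigrationStep(clusterRegion, domainRegion string, phase NamespacePhase, ownsResources bool) MigrationStep {
	if clusterRegion == domainRegion {
		return StepNone
	}
	if ownsResources {
		return StepBlocked // the pairing is immutable while resources exist
	}
	switch phase {
	case Pending, Provisioning, Ready, Degraded:
		return StepMarkTerminating
	case Terminating:
		return StepWait
	case Deleted:
		return StepRepoint
	default:
		return StepNone // assumption: leave unrecognised phases alone
	}
}
```

Every branch returns a step rather than an error, which matches the rule that a blocked or pending migration never fails the sweep.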
Operators inspect the resulting placement — the owning cluster, the region the assignment landed in, and the namespace phase — through GET /v1/projects/{project_id}/management-cluster-assignment (see the Management Fleet HTTP API).
Per-Project resources
For every assignment the converge step provisions five objects on the management cluster. All five carry the Kubernetes recommended labels app.kubernetes.io/managed-by=plexsphere, app.kubernetes.io/part-of=management-fleet, app.kubernetes.io/component=<role>, and app.kubernetes.io/instance=<namespace name>, so one app.kubernetes.io/instance selector scopes to a single Project's full resource set (reconcile/labels.go).
When the assignment is region-pinned, the same five objects additionally carry the well-known Kubernetes topology label topology.kubernetes.io/region=<region>, set to the assignment's region. This label is distinct from the app.kubernetes.io/* recommended set: it locates a Project's resources in a placement region rather than describing the application. It is omitted for a region-less (unpinned) Domain, so a label selector on topology.kubernetes.io/region matches only region-pinned Projects.
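
The two label rules can be sketched as one small builder. The function name and the component value passed by the caller are hypothetical, and this is not the code in reconcile/labels.go; the label keys and fixed values are the ones listed in this section.

```go
package main

// resourceLabels returns the label set for one per-Project object.
// region is empty for an unpinned Domain, in which case the topology
// label is omitted entirely rather than set to an empty string.
func resourceLabels(namespaceName, component, region string) map[string]string {
	labels := map[string]string{
		"app.kubernetes.io/managed-by": "plexsphere",
		"app.kubernetes.io/part-of":    "management-fleet",
		"app.kubernetes.io/component":  component,
		"app.kubernetes.io/instance":   namespaceName,
	}
	if region != "" {
		labels["topology.kubernetes.io/region"] = region
	}
	return labels
}
```

Omitting the key, instead of writing an empty value, is what makes a selector on topology.kubernetes.io/region match only region-pinned Projects.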
| Resource | Name | Purpose |
|---|---|---|
| Namespace | plexsphere-project-<uuid> | The RBAC and quota boundary for the Project's resources. Carries a kubernetes.io/description annotation. |
| Role | plexsphere-project | A namespace-scoped, least-privilege Role. Its single rule grants get, list, watch on configmaps, secrets, events, serviceaccounts in the core API group — no create/update/delete, no cluster-admin, and never the * wildcard in apiGroups, resources, or verbs. |
| RoleBinding | plexsphere-project | Binds the Role to the per-Project ServiceAccount. |
| ServiceAccount | plexsphere-project | The dedicated identity for the Project's in-namespace provisioning workload. |
| ResourceQuota | plexsphere-project-quota | Caps the Project's footprint on the shared cluster — object counts (ConfigMaps, Secrets, Pods, Services) and compute (CPU/memory requests and limits) — so one Project cannot starve another. |
The RBAC grant is deliberately read-only at this milestone: no requirement asks the in-namespace workload to write, and an explicit, narrow grant keeps the bootstrap RBAC at the minimum the reconcile loop can audit. A future task that needs write verbs widens projectRoleRules deliberately and records the justification in the DECISION block on reconcile/rbac.go.
Ports
The context reaches every collaborator through one of five ports declared in framework-free terms — context, time, and the package's own aggregates — so the domain layer stays free of pgx, controller-runtime, and k8s.io (ports.go). The composition root wires concrete adapters; tests inject in-memory fakes.
| Port | Methods | Adapter | Test seam |
|---|---|---|---|
| Repository | CreateCluster, GetCluster, ListClusters, CreateAssignment, GetAssignment, ListAssignmentsForCluster, UpdateAssignmentPhase, DeleteAssignment, CountResourcesForProject, RunInTx | Postgres in repo/managementfleet_pg.go, wrapping the sqlc-generated queries. Constraint-name dispatch maps SQLSTATE collisions and FK violations onto the sentinels. depguard confines pgx to this subpackage. | In-memory fakes in the unit tests; the repository adapter has its own pgtype-conversion and classification tests, and the integration suite drives the real adapter against a testcontainers Postgres. |
| ClusterClientFactory | HandleFor | Mints an opaque ClusterHandle for a registered cluster. At this milestone the binary runs inside the single management cluster, so the production wiring uses one in-cluster controller-runtime client rather than a per-cluster fan-out. | Fakes return a stub handle. |
| FleetHealthChecker | Verify | The Verifier in reconcile/verify.go, the read-only verify gate. depguard confines controller-runtime to the reconcile/ subpackage. | A fake controller-runtime client exercises both the healthy and unhealthy arms; the envtest and Chainsaw suites drive it against a real apiserver. |
| AuditSink | Record | A composition-root shim translating the local AuditEntry value object onto internal/audit.Entry, keeping the module free of an internal/audit import the no-cross-context-imports rule denies. | An in-memory recording sink. |
| Clock | Now | A wall-clock implementation at the composition root. | A frozen clock pins a deterministic now in unit tests. |
The reconcile loop additionally declares a narrow EventSink port local to the reconcile/ package (reconcile/reconcile.go): the lifecycle events it emits are an integration concern, not an aggregate write, so per the interface-segregation principle the adapter declares the smallest dependency it needs rather than widening Repository with an outbox surface.
Lifecycle events
The context defines four typed domain events (events/events.go). The EventType discriminator string is stable and becomes part of the wire contract once an event is emitted; the set is closed and pinned by a package-local drift gate.
| Event type (discriminator) | Trigger | Payload |
|---|---|---|
| managementfleet.ManagementClusterRegistered | A cluster is first materialised in the fleet inventory. | Cluster identity, name, slug, region. |
| managementfleet.ProjectClusterAssigned | The region-assignment resolver places a Project onto the cluster that owns it (a fresh scheduling placement, or a re-point after a region migration). | Project identity, owning cluster identity, region, namespace name. |
| managementfleet.ProjectNamespaceReady | The per-Project namespace crosses into the Ready phase. | Project identity, namespace name. |
| managementfleet.ProjectNamespaceTerminated | The per-Project namespace crosses into the Deleted phase. | Project identity, namespace name. |
ProjectClusterAssigned was declared with the inventory model but stayed reserved, without an emitter, until the region-assignment resolver (see Region-assignment resolver) began firing it. The resolver is the event's emitter: it publishes ProjectClusterAssigned when it places a Project on the cluster matching its Domain region, and again when it re-points a migrated Project after the old assignment reaches Deleted. The low-level AssignProjectToCluster persist method deliberately does not emit the event inline. It is also called during internal region-migration bookkeeping, so emitting there would fire on writes that are not a fresh region-scheduling placement; the resolver owns the region-placement decision and is therefore the correct emitter.
The production composition root emits the reconcile events through a structured-slog sink — migration 0023 created only the two inventory tables, so there is no management-fleet outbox table to append to. Minting one is a schema change a dedicated story owns; until then the slog line is the operator-facing breadcrumb. The adapter rationale lives in the DECISION block on newManagementFleetEventSink in management_fleet_factory_prod.go.
Error sentinels
Every operation funnels through one of seven package-local sentinels. Callers branch on these via errors.Is; wrapping with fmt.Errorf("%w", …) is fine as long as the sentinel's identity stays intact. The set is closed: adding an eighth without updating the closed-set drift gate trips the build (errors.go).
| Sentinel | Layer | Trigger | Remediation |
|---|---|---|---|
| ErrManagementClusterNotFound | Repository / Service | GetCluster or a fleet lookup for a ClusterID with no inventory row. | Re-check the fleet inventory; register the cluster if it is genuinely absent. |
| ErrAssignmentNotFound | Repository / Service | GetAssignment for a Project that has not been placed. | Place the Project with AssignProjectToCluster, or accept the Project has no assignment. |
| ErrAssignmentImmutable | Service (resource-count gate inside the placement RunInTx) | A re-assignment to a different cluster while the Project already owns ≥ 1 resource. | Re-pointing would orphan live Composite Resources, ProviderConfigs, and Secrets. Tear the Project's resources down on the old cluster first, or keep the existing placement. |
| ErrClusterUnhealthy | FleetHealthChecker verify gate | A management cluster is missing a Crossplane / External Secrets Operator API group, or a substrate controller Deployment is not Available. | Install or repair the substrate via the deploy/ Helm chart; no Project may be assigned until the gate passes. |
| ErrNoClusterForRegion | Service (SelectClusterForRegion) | Placement is asked for a region with no matching registered cluster, or for an empty region against a fleet that does not have exactly one cluster. | Register a cluster for the region, or pass an explicit region when the fleet has more than one cluster. |
| ErrAssignmentTerminating | Service (Unassign) | Unassign is attempted while the namespace phase has not yet reached Deleted. | Let the reconcile loop drain the namespace to Deleted, then retry Unassign. |
| ErrInvalidInput | Aggregate constructors / port boundaries | A zero id, a zero or unordered timestamp, or an empty slug observed before any persistence call. | Programmer error at a boundary; surfaces in tests, not in steady-state production. |
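
A caller-side sketch of branching on the sentinels through a wrapped error follows. The sentinel names come from the table; their message strings, the remediation and wrapUnassign helpers, and the returned hint strings are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Three of the seven sentinels, with illustrative messages.
var (
	ErrAssignmentNotFound    = errors.New("managementfleet: assignment not found")
	ErrAssignmentImmutable   = errors.New("managementfleet: assignment immutable")
	ErrAssignmentTerminating = errors.New("managementfleet: assignment terminating")
)

// wrapUnassign adds context with %w, which keeps the sentinel identity
// reachable for errors.Is.
func wrapUnassign(projectID string, err error) error {
	return fmt.Errorf("unassign project %s: %w", projectID, err)
}

// remediation maps an error onto a short operator hint by sentinel identity,
// never by comparing message text.
func remediation(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrAssignmentNotFound):
		return "place the Project with AssignProjectToCluster"
	case errors.Is(err, ErrAssignmentImmutable):
		return "tear down the Project's resources first"
	case errors.Is(err, ErrAssignmentTerminating):
		return "retry Unassign after the namespace reaches Deleted"
	default:
		return "unclassified: treat as an infrastructure failure"
	}
}
```

Formatting a sentinel with %v instead of %w would flatten it to text, and the errors.Is branches would then fall through to the default case.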
Four further sentinels, in two groups, exist outside the closed domain set because they name wiring bugs, not observable domain failures, and a misconfigured composition root must fail fast at boot rather than on the first operation:
- ErrServiceRepositoryRequired: NewService was handed a nil Repository (service.go).
- ErrReconcilerClientRequired / ErrReconcilerRepositoryRequired / ErrReconcilerEventsRequired: NewReconciler was handed a nil collaborator (reconcile/reconcile.go).
The verify gate
The verify gate (reconcile/verify.go) is a read-only readiness check. This milestone ships the gate only; installing the substrate stays with the deploy/ Helm chart. The gate performs two checks and returns on the first failure:
- API groups present. Every provisioning API group is served by the cluster, resolved by mapping a representative GroupKind through the cluster RESTMapper. The gate probes one representative kind per group rather than the full CRD inventory; a partial CRD install is not a failure mode the upstream Helm charts produce:

  | API group | Representative kind | Substrate |
  |---|---|---|
  | apiextensions.crossplane.io | CompositeResourceDefinition | Crossplane composition |
  | pkg.crossplane.io | ProviderConfig | Crossplane packages |
  | external-secrets.io | ExternalSecret | External Secrets Operator |

- Controllers Available. Every substrate controller Deployment exists and reports the Available status condition True. The gate pins the canonical upstream install coordinates: the crossplane Deployment in the crossplane-system Namespace (one controller serves both Crossplane groups), and the external-secrets Deployment in the external-secrets Namespace.
On any failure the gate returns an error wrapping ErrClusterUnhealthy so callers branch with errors.Is. A nil return means the cluster is eligible to host Projects. Inside the reconcile loop an ErrClusterUnhealthy is not fatal — it folds into VerifyPassed = false and the namespace keeps converging — whereas a non-unhealthy infrastructure error (an apiserver outage, a RESTMapper fault) propagates and fails the tick.
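
The gate's control flow can be sketched without any Kubernetes client by injecting the two lookups as functions. ClusterProbe, its field names, and isUnhealthy are hypothetical seams introduced for this illustration; the real Verifier in reconcile/verify.go works against a controller-runtime client and a RESTMapper. The group names and Deployment coordinates are the ones listed above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrClusterUnhealthy mirrors the domain sentinel; the message is illustrative.
var ErrClusterUnhealthy = errors.New("managementfleet: cluster unhealthy")

// ClusterProbe abstracts the two read-only lookups the gate needs.
type ClusterProbe struct {
	ServesGroup         func(group string) (bool, error)
	DeploymentAvailable func(namespace, name string) (bool, error)
}

type deploymentRef struct{ Namespace, Name string }

var (
	requiredGroups = []string{
		"apiextensions.crossplane.io",
		"pkg.crossplane.io",
		"external-secrets.io",
	}
	requiredDeployments = []deploymentRef{
		{"crossplane-system", "crossplane"},
		{"external-secrets", "external-secrets"},
	}
)

// Verify runs both checks in order and returns on the first failure.
// A failed check wraps ErrClusterUnhealthy; a lookup error propagates as-is.
func Verify(p ClusterProbe) error {
	for _, g := range requiredGroups {
		ok, err := p.ServesGroup(g)
		if err != nil {
			return fmt.Errorf("probe api group %s: %w", g, err)
		}
		if !ok {
			return fmt.Errorf("%w: api group %s is not served", ErrClusterUnhealthy, g)
		}
	}
	for _, d := range requiredDeployments {
		ok, err := p.DeploymentAvailable(d.Namespace, d.Name)
		if err != nil {
			return fmt.Errorf("probe deployment %s/%s: %w", d.Namespace, d.Name, err)
		}
		if !ok {
			return fmt.Errorf("%w: deployment %s/%s is not Available", ErrClusterUnhealthy, d.Namespace, d.Name)
		}
	}
	return nil
}

// isUnhealthy is how a caller would branch on the gate's verdict.
func isUnhealthy(err error) bool { return errors.Is(err, ErrClusterUnhealthy) }
```

Keeping lookup errors unwrapped by the sentinel is what lets the reconcile loop tell an unhealthy substrate apart from an apiserver outage.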
Persistence
Migration 0023_management_fleet.sql introduces two tables in the plexsphere schema:
- plexsphere.management_clusters: one row per registered cluster. id is the application-minted UUID PRIMARY KEY; slug is held UNIQUE by management_clusters_slug_unique; region and kubeconfig_secret_ref are nullable (a cluster need not be region-pinned, and the kubeconfig material lives behind a secret store, never in this table).
- plexsphere.project_cluster_assignments: one row per Project. project_id is the PRIMARY KEY (a Project is assigned to at most one cluster) and references plexsphere.projects(id); management_cluster_id references plexsphere.management_clusters(id). Both FKs use ON DELETE RESTRICT. namespace_phase is held to the six-value closed set by project_cluster_assignments_phase_check. The project_cluster_assignments_cluster_idx index backs the ListAssignmentsForCluster range scan.
Because neither table holds secret bytes or hash-chained forensic rows, the migration's Down block performs a real DROP in reverse-FK order — it cannot resurrect compliance-sensitive plaintext on a subsequent Up.
Operational model
The management fleet is opt-in at the composition root. The single load-bearing knob is PLEXSPHERE_DSN: when it is empty the binary boots without a management-fleet reconcile probe (the early-boot posture for deployments that have not yet plumbed Postgres). The in-cluster Kubernetes access is ambient — the ServiceAccount the Pod runs under — and needs no env var.
| Env var | Effect | Default |
|---|---|---|
| PLEXSPHERE_DSN | The Postgres connection string. Empty disables management-fleet wiring entirely. | "" (inert). |
| PLEXSPHERE_MANAGEMENT_FLEET_RECONCILE_INTERVAL | The steady-state period between fleet reconcile sweeps. Parsed with time.ParseDuration; a non-positive value is rejected at boot. | 30s. |
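
The interval rule in the table can be sketched as a small parser. The function and constant names are hypothetical, and the sketch assumes an unset or empty variable falls back to the 30s default; the caller would pass in the raw value read from the environment.

```go
package main

import (
	"fmt"
	"time"
)

// defaultReconcileInterval is the documented default between sweeps.
const defaultReconcileInterval = 30 * time.Second

// parseReconcileInterval validates the raw env value at boot.
// Empty means "use the default"; anything unparsable or non-positive is an error.
func parseReconcileInterval(raw string) (time.Duration, error) {
	if raw == "" {
		return defaultReconcileInterval, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("PLEXSPHERE_MANAGEMENT_FLEET_RECONCILE_INTERVAL: %w", err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("PLEXSPHERE_MANAGEMENT_FLEET_RECONCILE_INTERVAL: must be positive, got %s", d)
	}
	return d, nil
}
```

Rejecting a zero or negative duration up front matters because time.NewTicker panics on a non-positive interval, which would otherwise surface only when the steady-state ticker starts.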
The reconcile runs on two cadences, driven by management_fleet_factory_prod.go and registered through internal/platform/bootstrap/managementfleet_reconcile.go:
- Boot sweep: synchronous, before the listener binds. RegisterManagementFleetReconcileProbe runs the sweep once; a failure here refuses startup, because a fleet that cannot be reconciled into its expected shape must not serve traffic.
- /readyz probe + steady-state ticker: after the boot sweep the same closure is registered as a /readyz probe under the stable name management-fleet-reconcile, and a goroutine re-runs the sweep every reconcile interval. A failure on a later probe tick flips /readyz to HTTP 503 so Kubernetes (and operators) catch drift after the binary has already come up. The ticker exits cleanly on context cancellation.
The sweep lists every registered cluster, lists each cluster's assignments, and reconciles each one, returning the first error it hits. It is idempotent because Reconcile is total and idempotent.
Recovery runbook
This section is the operator-facing companion to the reference above. Each entry follows the same shape (Symptom, Diagnostic, Remediation) and is scoped to a single failure mode. The reconcile loop is idempotent, so unless an entry says otherwise the safe baseline action is to let the next sweep retry and watch /readyz and the management-fleet-reconcile probe recover.
1. Management cluster unreachable
Symptom. /readyz reports the management-fleet-reconcile probe failing (HTTP 503), or the boot sweep refused startup. The structured log carries management fleet reconcile tick failed with an error that wraps a controller-runtime client transport failure (connection refused, TLS handshake timeout, context deadline exceeded) rather than ErrClusterUnhealthy.
Diagnostic.
- Confirm the apiserver of the management cluster the binary runs inside is reachable: kubectl get --raw=/healthz from the Pod, or inspect the in-cluster apiserver Service and endpoints.
- Confirm the Pod's ServiceAccount token is mounted and not expired; a rotated or revoked token surfaces as a 401/403 from every Get.
- Distinguish this from a substrate failure (entry 2): an unreachable cluster fails on the first resource Get; an unhealthy substrate reaches the verify gate and surfaces ErrClusterUnhealthy.
Remediation. Restore apiserver reachability or re-issue the ServiceAccount credentials. No fleet data is lost — the inventory and assignment rows are durable in Postgres. Once the apiserver is reachable the next sweep reconciles every assignment and /readyz returns to HTTP 200 with no operator action on the fleet records.
2. Crossplane or External Secrets Operator missing or unhealthy
Symptom. Per-Project namespaces never leave Provisioning (or sit in Degraded). The verify gate returns an error wrapping ErrClusterUnhealthy; the wrapped message names the specific substrate — a missing API group (apiextensions.crossplane.io, pkg.crossplane.io, external-secrets.io) or a controller Deployment that is not Available.
Diagnostic.
- kubectl get crd | grep -E 'crossplane|external-secrets' confirms whether the substrate CRDs are installed.
- kubectl -n crossplane-system get deploy crossplane and kubectl -n external-secrets get deploy external-secrets confirm the two controller Deployments exist and report Available.
- Read the wrapped error: the gate stops on the first failure, so fix that substrate and re-check rather than assuming the rest are healthy.
Remediation. Install or repair the substrate via the deploy/ Helm chart — this context verifies the substrate, it does not install it. The verify failure is non-fatal to a converging namespace: the reconcile keeps converging the per-Project resources and the namespace crosses into Ready automatically once the gate passes. No assignment needs re-creating.
3. Namespace stuck in Terminating
Symptom. After Service.MarkTerminating, a per-Project namespace stays in the Terminating phase across many sweeps and never reaches Deleted. Service.Unassign fails with ErrAssignmentTerminating.
Diagnostic.
- kubectl get ns plexsphere-project-<uuid>: a namespace stuck Terminating at the Kubernetes level almost always has a resource with a finalizer that has not been removed.
- kubectl get all,resourcequota,rolebinding -n plexsphere-project-<uuid> and inspect .metadata.finalizers on anything that remains.
- The transition is sticky: once Terminating, the namespace can only drain to Deleted; it will not converge back. The phase advances to Deleted only when the reconcile observes all five per-Project resources absent.
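The sticky teardown rule in the last bullet can be written as a pure transition. The sketch below is a hedged illustration: Phase, Observed, and NextTeardownPhase are hypothetical names and the string values are assumptions. Only the rule itself (Terminating drains to Deleted once all five resources are absent) comes from this page.

```go
package main

// Phase is a hypothetical stand-in for the namespace phase.
type Phase string

const (
	Ready       Phase = "Ready"
	Terminating Phase = "Terminating"
	Deleted     Phase = "Deleted"
)

// Observed records which of the five per-Project resources still exist
// on the cluster at the time of the sweep.
type Observed struct {
	Namespace, Role, RoleBinding, ServiceAccount, ResourceQuota bool
}

// AllAbsent reports whether none of the five resources remain.
func (o Observed) AllAbsent() bool {
	return !o.Namespace && !o.Role && !o.RoleBinding &&
		!o.ServiceAccount && !o.ResourceQuota
}

// NextTeardownPhase applies the teardown arm only. A Terminating namespace
// either stays Terminating or advances to Deleted; it never converges back.
// Every other phase is returned unchanged because this arm does not own it.
func NextTeardownPhase(current Phase, o Observed) Phase {
	if current != Terminating {
		return current
	}
	if o.AllAbsent() {
		return Deleted
	}
	return Terminating
}
```

A single leftover object, such as a ResourceQuota held by a finalizer, keeps the result at Terminating on every sweep, which matches the symptom described in this entry.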
Remediation. Clear the blocking finalizer on the offending object (typically a Crossplane managed resource awaiting external-API deletion). Once every per-Project resource is gone, the next sweep records Deleted and Service.Unassign succeeds. Do not delete the assignment row directly to "unstick" it — that orphans whatever is still live on the cluster and loses the teardown record.
4. Namespace deleted out of band
Symptom. An operator (or another controller) deleted a per-Project Namespace, Role, RoleBinding, ServiceAccount, or ResourceQuota directly on the cluster. The assignment row still shows Ready.
Diagnostic.
- The reconcile derives its action exclusively from live cluster facts every tick, never from cached desired state. On the next sweep the ObservedCluster snapshot reports the missing resource.
- A namespace that was Ready and lost a resource transitions to Degraded (not Provisioning), so Degraded in the log or the namespace_phase column is the fingerprint of an out-of-band deletion of a still-assigned Project.
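The Ready-to-Degraded rule above can be sketched as the converge arm of a pure state function. Everything here is an assumption made for illustration: the Phase values, the NextConvergePhase name, and in particular the choice to keep a Degraded namespace in Degraded until both the resources and the verify gate are healthy.

```go
package main

// Phase is a hypothetical stand-in for the namespace phase.
type Phase string

const (
	Provisioning Phase = "Provisioning"
	Ready        Phase = "Ready"
	Degraded     Phase = "Degraded"
	Terminating  Phase = "Terminating"
	Deleted      Phase = "Deleted"
)

// NextConvergePhase derives the next phase from live facts only:
// allPresent says whether all five per-Project resources were observed,
// gateOK says whether the substrate verify gate passed on this sweep.
func NextConvergePhase(current Phase, allPresent, gateOK bool) Phase {
	switch current {
	case Terminating, Deleted:
		// Teardown phases belong to the teardown arm and are left alone.
		return current
	}
	if allPresent && gateOK {
		return Ready
	}
	if current == Ready || current == Degraded {
		// A namespace that has been Ready falls back to Degraded, never to
		// Provisioning, which is what makes Degraded a useful fingerprint.
		return Degraded
	}
	return Provisioning
}
```

Under these assumptions, a Ready namespace that loses a Role evaluates to Degraded on the next sweep, and returns to Ready on the sweep after the resource is recreated and the gate passes.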
Remediation. None required — this is the case the reconcile loop exists to handle. The next sweep emits Converge, recreates the missing resources idempotently, and the namespace returns to Ready, emitting a fresh ProjectNamespaceReady event. If the namespace does not self-heal, fall through to entry 2 (the verify gate may be failing) or entry 1 (the cluster may be unreachable).
5. Assignment-immutability conflict
Symptom. Service.AssignProjectToCluster fails with ErrAssignmentImmutable. The wrapped message names the Project, the resource count it owns, and the cluster it is currently pinned to.
Diagnostic.
- The immutability gate is the Project's resource count, read inside the same transaction as the placement write. A re-assignment to a different cluster is rejected when the count is > 0; a re-assignment to the same cluster is idempotent and returns the existing assignment unchanged; a re-assignment to a different cluster with a count of 0 is permitted (it is a misplacement correction before any resource exists).
- errors.Is(err, managementfleet.ErrAssignmentImmutable) confirms the failure class; Service.LookupAssignment shows the current pinning.
Remediation. This sentinel is a fail-closed guard, not a bug — re-pointing a Project with live resources would orphan its Composite Resources, ProviderConfigs, and materialised Secrets in the old namespace. To move a Project to a different cluster: drive its namespace through MarkTerminating → reconcile-to-Deleted → Unassign on the old cluster (which tears the resources down), then AssignProjectToCluster to the new one. If the rejection was unexpected, the resource count is the source of truth — verify what the Project actually owns before forcing anything.
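The three outcomes of the immutability gate described in the diagnostic can be condensed into a decision function. This is a sketch under assumptions: Decision and Decide are hypothetical names, an empty currentCluster stands for "no assignment yet", and the real service reads the count inside the placement transaction rather than taking it as a plain argument.

```go
package main

// Decision is a hypothetical label for what a placement request should do.
type Decision string

const (
	Create         Decision = "create"           // no assignment exists yet
	ReturnExisting Decision = "return-existing"  // same cluster: idempotent
	Repoint        Decision = "repoint"          // misplacement correction
	Reject         Decision = "reject-immutable" // would orphan live resources
)

// Decide applies the gate. currentCluster is "" when the Project has no
// assignment; resourceCount is the number of resources the Project owns.
func Decide(currentCluster, targetCluster string, resourceCount int) Decision {
	switch {
	case currentCluster == "":
		return Create
	case currentCluster == targetCluster:
		return ReturnExisting
	case resourceCount > 0:
		return Reject
	default:
		return Repoint
	}
}
```

The same-cluster check comes before the count check so that a repeated request for the current cluster stays idempotent even when the Project owns resources.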
6. Rebuilding a cluster from the durable assignment records
Symptom. A management cluster was lost or rebuilt — every per-Project namespace and its RBAC and quota objects are gone — but the plexsphere.management_clusters and plexsphere.project_cluster_assignments rows in Postgres are intact.
Diagnostic.
- The Postgres rows are the durable record of the fleet; the cluster-side objects are derived, reconcilable state. After a cluster rebuild the inventory still knows every cluster and every Project assignment, including each namespace name and phase.
- Confirm the rebuilt cluster carries the substrate before relying on recovery: a rebuilt cluster with no Crossplane / External Secrets Operator install fails the verify gate (entry 2).
Remediation. Install the substrate on the rebuilt cluster, then let the reconcile sweep do the work. Each sweep lists every assignment for the cluster and reconciles it: every per-Project Namespace, Role, RoleBinding, ServiceAccount, and ResourceQuota is recreated idempotently from the derived namespace name, and each namespace crosses back into Ready. No assignment rows need re-creating and no Project needs re-placing — the inventory survived, so the fleet rebuilds itself from it. If a Project's old namespace had been mid-teardown, its row's namespace_phase still reflects that and the teardown arm resumes correctly.
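The rebuild behaviour described above can be illustrated with an in-memory stand-in for the cluster. This is a sketch, not the production sweep: Assignment, Cluster, and Sweep are hypothetical, the phases are plain strings, and skipping rows that are Terminating or Deleted is an assumption about how the teardown arm is kept separate. The plexsphere-project- prefix is the one used in the kubectl commands of entry 3.

```go
package main

// Assignment is a hypothetical projection of an assignment row.
type Assignment struct {
	ProjectID string
	Phase     string
}

// kinds lists the five per-Project resources the reconcile owns.
var kinds = []string{"Namespace", "Role", "RoleBinding", "ServiceAccount", "ResourceQuota"}

// NamespaceName derives the per-Project namespace name from the Project ID.
func NamespaceName(projectID string) string {
	return "plexsphere-project-" + projectID
}

// Cluster is an in-memory stand-in: namespace name -> kind -> present.
type Cluster map[string]map[string]bool

// ensure creates the object only if it is missing and reports whether it
// had to create it, which is what makes repeated sweeps idempotent.
func (c Cluster) ensure(ns, kind string) bool {
	if c[ns] == nil {
		c[ns] = map[string]bool{}
	}
	if c[ns][kind] {
		return false
	}
	c[ns][kind] = true
	return true
}

// Sweep recreates whatever is missing for every live assignment and returns
// the number of objects it created.
func Sweep(c Cluster, assignments []Assignment) int {
	created := 0
	for _, a := range assignments {
		if a.Phase == "Terminating" || a.Phase == "Deleted" {
			continue // assumed to be left to the teardown arm
		}
		ns := NamespaceName(a.ProjectID)
		for _, k := range kinds {
			if c.ensure(ns, k) {
				created++
			}
		}
	}
	return created
}
```

Running Sweep against an empty Cluster with one live assignment creates five objects; running it again creates none, which mirrors why a rebuilt cluster converges from the durable rows without any re-placement.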