Management Fleet HTTP API

This is the reference for the operator-facing Management Fleet HTTP surface. It maps each operation to its OpenAPI schema, ReBAC gate, audit relation, and the closed Problem.code taxonomy. The source of truth for the wire contract is api/openapi/plexsphere-v1.yaml; this document is a map of that contract, not a second copy of it.

The Management Fleet is otherwise driven by an in-process reconciliation loop that converges each Project's namespace, RBAC, and quota on its management cluster. This surface adds a thin operator layer on top: register a cluster, inspect the fleet and its assignments, and request a namespace teardown. Assignment creation and re-pointing have no HTTP surface and remain reconciliation/broker-only — the assignment-immutability invariant (a Project that already owns provisioned resources cannot be re-pointed to a different cluster) is enforced inside the domain application service and is deliberately not reachable from an operator endpoint that could bypass region selection. The only assignment mutation here is the idempotent terminate trigger, which advances the namespace lifecycle phase to Terminating and leaves the reconcile loop to drive the teardown to Deleted and the final unassign.

The ReBAC chain the gates below resolve against is the managementfleet and managementcluster definitions in schema/authz.zed. managementfleet is a singleton object (managementfleet:fleet) with owner-only manage, operate adding the operator relation, and observe adding the auditor relation; managementcluster:<id> derives its permissions from the fleet singleton via the schema parent relation, so a fleet observer transitively observes every cluster. There is no managementcluster#read; the read paths gate on observe, the lowest-privilege read-equivalent, and the cluster-creating / teardown paths gate on manage.
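The permission layering can be modelled in a few lines of Go. This is a minimal sketch, not the project's authorization code: in production the check is resolved by SpiceDB against schema/authz.zed, and the `Fleet` type, `Can`, `ClusterCan`, and the principal names are hypothetical. It assumes each permission includes the one above it (manage, then operate, then observe).

```go
package main

import "fmt"

// Fleet is a hypothetical in-memory stand-in for the managementfleet:fleet
// singleton; the three sets mirror the owner, operator, and auditor relations.
type Fleet struct {
	Owners    map[string]bool
	Operators map[string]bool
	Auditors  map[string]bool
}

// Can reports whether principal holds permission on the fleet.
// Assumed layering: manage = owner; operate = manage + operator;
// observe = operate + auditor.
func (f Fleet) Can(principal, permission string) bool {
	switch permission {
	case "manage":
		return f.Owners[principal]
	case "operate":
		return f.Can(principal, "manage") || f.Operators[principal]
	case "observe":
		return f.Can(principal, "operate") || f.Auditors[principal]
	default:
		return false // for example "read": no such permission exists
	}
}

// ClusterCan models managementcluster:<id> inheriting from its parent fleet:
// the answer does not depend on which cluster is addressed.
func ClusterCan(f Fleet, clusterID, principal, permission string) bool {
	return f.Can(principal, permission)
}

func main() {
	f := Fleet{
		Owners:    map[string]bool{"user:alice": true},
		Operators: map[string]bool{"user:bob": true},
		Auditors:  map[string]bool{"user:carol": true},
	}
	for _, p := range []string{"user:alice", "user:bob", "user:carol"} {
		fmt.Println(p, "manage:", f.Can(p, "manage"), "observe:", ClusterCan(f, "c-1", p, "observe"))
	}
}
```

In this model an auditor can observe any cluster but cannot register one, which matches the split between the observe-gated reads and the manage-gated register and terminate paths.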

Operations

| Method | Path | Operation ID | ReBAC gate | Audit relation | Body cap |
|---|---|---|---|---|---|
| GET | /v1/management-clusters | ListManagementClusters | managementfleet:fleet#observe (BEFORE the read) | management_cluster.list (granted, with item_count) | n/a |
| POST | /v1/management-clusters | RegisterManagementCluster | managementfleet:fleet#manage | management_cluster.register | 8 KiB |
| GET | /v1/management-clusters/{id} | GetManagementCluster | managementcluster:<id>#observe (BEFORE the read) | management_cluster.read | n/a |
| GET | /v1/management-clusters/{id}/assignments | ListManagementClusterAssignments | managementcluster:<id>#observe (BEFORE the read) | management_cluster.list_assignments (granted, with item_count) | n/a |
| GET | /v1/projects/{project_id}/management-cluster-assignment | GetProjectManagementClusterAssignment | managementfleet:fleet#observe | management_assignment.read | n/a |
| POST | /v1/projects/{project_id}/management-cluster-assignment/terminate | TerminateProjectManagementClusterAssignment | managementfleet:fleet#manage | management_assignment.terminate | n/a |

  • Every gate runs authz-before-read: the path id is itself the ReBAC object (managementcluster:<id>) or the gate is the fleet singleton, so no pre-authz existence-oracle read is required. This is the simpler ordering compared to the Cloud Credentials read paths, which must pre-read to learn the owning Cloud — see cloud-credentials.
  • body_cap = 8 KiB (MaxManagementFleetRequestBodyBytes in internal/transport/http/v1/managementfleet/wiring.go) is enforced before the JSON decoder runs on the register body; an oversized body surfaces as 413 request_body_too_large.
  • The per-Project assignment endpoints gate on the fleet singleton rather than the resolved managementcluster:<id> so a caller that does not yet know which cluster a Project landed on can still be authorised without leaking the assignment's existence through a pre-authz read.
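Enforcing the body cap before decoding can be sketched with the standard library's `http.MaxBytesReader`. This is an illustration under assumptions rather than the code in wiring.go: `maxBodyBytes` stands in for MaxManagementFleetRequestBodyBytes, and `registerRequest`, `decodeRegister`, and `ProblemFor` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes stands in for MaxManagementFleetRequestBodyBytes (8 KiB).
const maxBodyBytes = 8 << 10

// registerRequest is a hypothetical, trimmed-down register body.
type registerRequest struct {
	Name string `json:"name"`
	Slug string `json:"slug"`
}

// decodeRegister caps the body before the JSON decoder reads from it.
// It returns the status and Problem.code for a rejected body, or (0, "")
// when the body decoded cleanly.
func decodeRegister(w http.ResponseWriter, r *http.Request) (registerRequest, int, string) {
	var req registerRequest
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			return req, http.StatusRequestEntityTooLarge, "request_body_too_large"
		}
		return req, http.StatusBadRequest, "invalid_body"
	}
	return req, 0, ""
}

// ProblemFor runs decodeRegister against an in-memory request so the sketch
// can be tried without a live server.
func ProblemFor(body string) (int, string) {
	r := httptest.NewRequest(http.MethodPost, "/v1/management-clusters", strings.NewReader(body))
	_, status, code := decodeRegister(httptest.NewRecorder(), r)
	return status, code
}

func main() {
	bodies := []string{
		`{"name":"eu-1","slug":"eu-1"}`,
		`{"name":`,
		`{"name":"` + strings.Repeat("a", maxBodyBytes) + `"}`,
	}
	for _, b := range bodies {
		status, code := ProblemFor(b)
		fmt.Println(len(b), status, code)
	}
}
```

Wrapping the body before constructing the decoder is what keeps an oversized payload from being buffered in full; the decoder only ever sees a reader that fails once the cap is crossed.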

Projections

ManagementClusterResponse is shared by RegisterManagementCluster, ListManagementClusters, and GetManagementCluster. It is metadata-only: id, name, slug, region, status, created_at, updated_at. The kubeconfig Secret reference is structurally absent from the wire type — the storage location is a storage-internal detail the operator-facing surface has no reason to expose.

ProjectClusterAssignmentResponse is shared by GetProjectManagementClusterAssignment, ListManagementClusterAssignments, and TerminateProjectManagementClusterAssignment: project_id, management_cluster_id, region, namespace_name, namespace_phase, assigned_at, updated_at. namespace_phase is the NamespacePhase enum (Pending, Provisioning, Ready, Degraded, Terminating, Deleted) the reconcile loop advances; the terminate endpoint is the only HTTP path that moves it (to Terminating, idempotently). The region field is the region the placement landed in — it equals the Domain region the resolver matched the hosting cluster against.
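The idempotent terminate transition can be illustrated as a small pure function. This is a sketch under assumptions: the string values of the enum, the `Terminate` helper, and the choice to leave an already Deleted assignment unchanged are not confirmed by the contract and are here only to make the idempotency concrete.

```go
package main

import "fmt"

// NamespacePhase mirrors the enum named in the text; the string values are
// an assumption about the wire encoding.
type NamespacePhase string

const (
	Pending      NamespacePhase = "Pending"
	Provisioning NamespacePhase = "Provisioning"
	Ready        NamespacePhase = "Ready"
	Degraded     NamespacePhase = "Degraded"
	Terminating  NamespacePhase = "Terminating"
	Deleted      NamespacePhase = "Deleted"
)

// Terminate models the HTTP terminate trigger: it moves a live phase to
// Terminating and is a no-op when teardown has already started.
// Treating Deleted as terminal is an assumption of this sketch.
func Terminate(p NamespacePhase) NamespacePhase {
	if p == Terminating || p == Deleted {
		return p
	}
	return Terminating
}

func main() {
	for _, p := range []NamespacePhase{Pending, Ready, Terminating, Deleted} {
		fmt.Printf("%s -> %s\n", p, Terminate(p))
	}
}
```

Calling `Terminate` twice yields the same phase as calling it once, which is the property that lets an operator retry the terminate request safely.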

Region-assignment resolver

Assignment creation and re-pointing have no HTTP surface; placement is owned by the in-process region-assignment resolver, which runs on every management-fleet reconcile sweep. The operator-facing contract is:

  • Exact-match placement. A Project is placed on a management cluster whose region exactly string-matches the region of the Project's Domain — ManagementCluster.region == Domain.region. There is no fuzzy or nearest-region fallback. A schedulable Project (one that owns at least one resource and has no assignment) with no region-matching cluster is skipped with a WARN and stays unplaced until a cluster for its region is registered.
  • Re-pin drives migration. Re-pinning a Domain's region through PATCH /v1/domains/{id} re-targets its Projects. For a zero-resource Project the resolver migrates it automatically, one step per sweep: it terminates the old assignment (driving namespace_phase through the teardown arm to Deleted) and re-points the Project to a Pending assignment on a correct-region cluster, emitting a ProjectClusterAssigned event. For a Project that owns resources the assignment is immutable: the resolver leaves it untouched, logs a WARN, and the sweep still succeeds — moving it would orphan its live substrate.
  • Inspecting the result. Use GET /v1/projects/{project_id}/management-cluster-assignment to read the current management_cluster_id, region, and namespace_phase. During a migration the old assignment's phase walks Terminating → Deleted and the new one starts at Pending.
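The placement and migration rules in the list above can be sketched as a decision function. Everything here is hypothetical (`Cluster`, `Project`, `PickCluster`, `Decide`, and the action strings), and two behaviours are assumptions the text does not state: picking the first matching cluster when several share a region, and skipping a migration when no cluster exists for the new region.

```go
package main

import "fmt"

type Cluster struct {
	ID     string
	Region string
}

// Project is a simplified view: AssignedRegion is the region of the cluster
// the Project currently sits on, or "" when it has no assignment.
type Project struct {
	DomainRegion   string
	AssignedRegion string
	OwnsResources  bool
}

// PickCluster applies exact string matching on region: no case folding, no
// nearest-region fallback. Returning the first match is an assumption.
func PickCluster(clusters []Cluster, region string) (Cluster, bool) {
	for _, c := range clusters {
		if c.Region == region {
			return c, true
		}
	}
	return Cluster{}, false
}

// Decide returns the action one resolver sweep would take for p.
func Decide(p Project, clusters []Cluster) string {
	target, found := PickCluster(clusters, p.DomainRegion)
	switch {
	case p.AssignedRegion == "" && !p.OwnsResources:
		return "not-schedulable" // nothing to place yet
	case p.AssignedRegion == "" && !found:
		return "skip-warn" // stays unplaced until a cluster is registered
	case p.AssignedRegion == "":
		return "place:" + target.ID
	case p.AssignedRegion == p.DomainRegion:
		return "keep"
	case p.OwnsResources:
		return "immutable-warn" // moving it would orphan live substrate
	case !found:
		return "skip-warn" // assumption: no target, so no migration
	default:
		return "migrate:" + target.ID
	}
}

func main() {
	clusters := []Cluster{{"c-eu", "eu-west"}, {"c-us", "us-east"}}
	fmt.Println(Decide(Project{DomainRegion: "eu-west", OwnsResources: true}, clusters))
	fmt.Println(Decide(Project{DomainRegion: "EU-West", OwnsResources: true}, clusters))
	fmt.Println(Decide(Project{DomainRegion: "us-east", AssignedRegion: "eu-west"}, clusters))
}
```

The second call shows why region spelling matters: a Domain pinned to `EU-West` does not match a cluster registered as `eu-west`, so the Project stays unplaced.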

For the full state machine, the scheduling and migration halves, and the operator runbook, see the Management Fleet bounded-context reference and the multi-region operations guide.

Problem.code taxonomy

The surface emits a closed set of Problem.code values (pinned as constants in internal/transport/http/v1/managementfleet/errors.go):

| Problem.code | Status | Raised when |
|---|---|---|
| management_fleet_not_provisioned | 501 | The operator surface is not wired (no Postgres DSN / SpiceDB Authorizer); handlers fall through to the shared not-provisioned stub. |
| unauthenticated | 401 | The request carries no authenticated principal. |
| invalid_management_cluster_id | 400 | The path {id} is not a non-zero UUID. |
| invalid_project_id | 400 | The path {project_id} is not a non-zero UUID. |
| invalid_body | 400 | The register body is not a valid RegisterManagementClusterRequest JSON document. |
| invalid_management_cluster | 400 | name, slug, or kubeconfig_secret_ref is empty, or the aggregate rejects the input. |
| management_cluster_not_found | 404 | GetManagementCluster for an unregistered id. |
| management_assignment_not_found | 404 | The addressed Project has no management-cluster assignment. |
| management_cluster_conflict | 409 | RegisterManagementCluster with a slug that already exists (the fleet's unique shard key). |
| request_body_too_large | 413 | The register body exceeded the 8 KiB ceiling. |
| internal | 500 | An unexpected error; the underlying text is logged, never surfaced. |
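A client or test harness might encode the taxonomy as a lookup table. The sketch below copies the codes and statuses from the table; `StatusFor`, `IsNonZeroUUID`, and `ClusterIDProblem` are hypothetical helpers, and the UUID check (canonical 8-4-4-4-12 hex form, not all zeros) is one plausible reading of "non-zero UUID", not the validation in errors.go.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"net/http"
	"strings"
)

// statusByCode mirrors the closed Problem.code set documented above.
var statusByCode = map[string]int{
	"management_fleet_not_provisioned": http.StatusNotImplemented,
	"unauthenticated":                  http.StatusUnauthorized,
	"invalid_management_cluster_id":    http.StatusBadRequest,
	"invalid_project_id":               http.StatusBadRequest,
	"invalid_body":                     http.StatusBadRequest,
	"invalid_management_cluster":       http.StatusBadRequest,
	"management_cluster_not_found":     http.StatusNotFound,
	"management_assignment_not_found":  http.StatusNotFound,
	"management_cluster_conflict":      http.StatusConflict,
	"request_body_too_large":           http.StatusRequestEntityTooLarge,
	"internal":                         http.StatusInternalServerError,
}

// StatusFor returns the status for a known code; ok is false for any code
// outside the closed set.
func StatusFor(code string) (int, bool) {
	s, ok := statusByCode[code]
	return s, ok
}

// IsNonZeroUUID accepts the canonical hyphenated hex form and rejects the
// all-zero UUID. Accepting only this form is an assumption.
func IsNonZeroUUID(s string) bool {
	if len(s) != 36 || s[8] != '-' || s[13] != '-' || s[18] != '-' || s[23] != '-' {
		return false
	}
	raw, err := hex.DecodeString(strings.ReplaceAll(s, "-", ""))
	if err != nil || len(raw) != 16 {
		return false
	}
	for _, b := range raw {
		if b != 0 {
			return true
		}
	}
	return false
}

// ClusterIDProblem returns the Problem.code for a bad path id, or "".
func ClusterIDProblem(id string) string {
	if IsNonZeroUUID(id) {
		return ""
	}
	return "invalid_management_cluster_id"
}

func main() {
	fmt.Println(StatusFor("management_cluster_conflict"))
	fmt.Println(ClusterIDProblem("00000000-0000-0000-0000-000000000000"))
}
```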

A 403 denial renders the shared PermissionDenied problem and is audit-first: the denial row (outcome=permission_denied) is recorded through the audit sink before the response is flushed, exactly as the Cloud Credentials and Clouds surfaces do.
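The audit-first ordering can be made visible with a writer that logs when the status line is written. This is a sketch only: `AuditRow`, `orderedWriter`, `deny`, and `DenialOrder` are hypothetical, the response body is omitted because the shape of the shared PermissionDenied problem is not given here, and what happens when the sink itself fails is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// AuditRow is a hypothetical, minimal audit record.
type AuditRow struct {
	Relation string
	Outcome  string
}

// orderedWriter appends an event when the status line is written, so the
// relative order of audit and response can be observed.
type orderedWriter struct {
	http.ResponseWriter
	events *[]string
}

func (o orderedWriter) WriteHeader(status int) {
	*o.events = append(*o.events, fmt.Sprintf("response:%d", status))
	o.ResponseWriter.WriteHeader(status)
}

// deny records the denial row first and only then writes the 403.
func deny(w http.ResponseWriter, record func(AuditRow), relation string) {
	record(AuditRow{Relation: relation, Outcome: "permission_denied"})
	w.WriteHeader(http.StatusForbidden)
}

// DenialOrder runs deny against in-memory fakes and returns the events in
// the order they happened.
func DenialOrder(relation string) []string {
	var events []string
	record := func(row AuditRow) {
		events = append(events, "audit:"+row.Relation+":"+row.Outcome)
	}
	deny(orderedWriter{httptest.NewRecorder(), &events}, record, relation)
	return events
}

func main() {
	for _, e := range DenialOrder("management_cluster.read") {
		fmt.Println(e)
	}
}
```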

Audit emission

Every handler stamps a transport-local audit row on grant, denial, and invariant rejection. The relations are verb-style and pinned as constants alongside the gates. The grant rows carry the resolved identifiers in caveat_context (management_cluster_id, slug, project_id, namespace_phase, or item_count). At this iteration the sink is a structured-log adapter; a hash-chained Management Fleet audit sink lands in a later iteration, mirroring the Cloud Credentials slog-fallback posture.
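A structured-log sink of the kind described could be built on the standard `log/slog` package. The sketch below is not the project's adapter: the `AuditRow` fields, the `Emit` and `EmitToJSON` helpers, and the log message are hypothetical; only the relation names, the `granted` outcome, and the `caveat_context` keys come from this reference.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// AuditRow is a hypothetical transport-local audit record.
type AuditRow struct {
	Relation      string
	Outcome       string
	CaveatContext map[string]any
}

// Emit writes one audit row as a structured log record.
func Emit(logger *slog.Logger, row AuditRow) {
	logger.Info("audit",
		slog.String("relation", row.Relation),
		slog.String("outcome", row.Outcome),
		slog.Any("caveat_context", row.CaveatContext),
	)
}

// EmitToJSON emits row through a JSON handler into a buffer and decodes the
// resulting line, so the emitted fields can be inspected.
func EmitToJSON(row AuditRow) map[string]any {
	var buf bytes.Buffer
	Emit(slog.New(slog.NewJSONHandler(&buf, nil)), row)
	var out map[string]any
	if err := json.Unmarshal(buf.Bytes(), &out); err != nil {
		return nil
	}
	return out
}

func main() {
	out := EmitToJSON(AuditRow{
		Relation:      "management_cluster.list",
		Outcome:       "granted",
		CaveatContext: map[string]any{"item_count": 3},
	})
	fmt.Println(out["relation"], out["outcome"], out["caveat_context"])
}
```

Keeping the row a plain struct means a later hash-chained sink could accept the same value without the handlers changing.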

Composition-root wiring

The transport package declares narrow local ports (ReadService, Registrar, Terminator, Authorizer, AuditSink) and transport-local value types (ClusterView, AssignmentView, RegisterClusterInput). The production adapter at the composition root (cmd/plexsphere/management_fleet_factory_prod.go) wraps the in-process managementfleet.Service onto those ports and maps the domain + repository sentinels onto the transport sentinels. This keeps the transport HTTP module free of the management-fleet bounded-context module's dependency graph (controller-runtime, pgx) — the same split the Cloud Credentials surface takes. The operator surface dispatches only when the binary opens a Postgres pool (PLEXSPHERE_DSN) and the SpiceDB-backed Authorizer is wired; otherwise the six handlers stay on the 501 management_fleet_not_provisioned stub.
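The dispatch-or-stub decision can be sketched with two of the ports. The port names `ReadService` and `Authorizer` come from the text, but their method sets, the `Deps` struct, `NewListHandler`, the fakes, the fixed `user:demo` principal, and the plain-text bodies are all hypothetical; the real wiring lives in cmd/plexsphere/management_fleet_factory_prod.go.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Narrow transport-local ports; the method sets are assumptions.
type ReadService interface {
	ClusterCount() int
}

type Authorizer interface {
	Allowed(principal, object, permission string) bool
}

// Deps is what the composition root would hand to the transport package.
type Deps struct {
	Read  ReadService
	Authz Authorizer
}

// NewListHandler returns the real handler only when every dependency is
// wired; otherwise it returns the 501 not-provisioned stub.
func NewListHandler(d Deps) http.Handler {
	if d.Read == nil || d.Authz == nil {
		return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
			w.WriteHeader(http.StatusNotImplemented)
			fmt.Fprint(w, "management_fleet_not_provisioned")
		})
	}
	return http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		// Authorize before reading, against the fleet singleton.
		if !d.Authz.Allowed("user:demo", "managementfleet:fleet", "observe") {
			w.WriteHeader(http.StatusForbidden)
			return
		}
		fmt.Fprintf(w, "item_count=%d", d.Read.ClusterCount())
	})
}

// Fakes standing in for the production adapter.
type fakeRead struct{}

func (fakeRead) ClusterCount() int { return 2 }

type allowAll struct{}

func (allowAll) Allowed(_, _, _ string) bool { return true }

// StatusWith serves one list request with or without wired dependencies.
func StatusWith(wired bool) int {
	var d Deps
	if wired {
		d = Deps{Read: fakeRead{}, Authz: allowAll{}}
	}
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/v1/management-clusters", nil)
	NewListHandler(d).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(StatusWith(false), StatusWith(true))
}
```

Because the handler depends only on the two small interfaces, the transport package in this sketch imports nothing from the bounded context, which is the dependency split the paragraph above describes.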