
Approval Workflow — dual-control gating for proposed actions

This document is the authoritative bounded-context reference for the Approval Workflow — the generic dual-control sub-context of the Identity bounded context. It models a proposal to perform some action against a target resource that may require a second principal's decision before it is applied. The context owns the closed lifecycle state machine (proposed → pending-approval → approved | rejected | expired), the per-Domain ApprovalPolicy value object that decides whether a given action is gated, the break-glass emergency override, and the five HTTP operations an operator drives the queue through. The domain root that pins the ubiquitous language is ../../internal/identity/approvals/doc.go.

The Approval aggregate is generic: it does not know what action it gates. The proposer attaches an action_kind, a target_resource, and an opaque JSON payload; the workflow decides whether a second decision is required and records that it was approved, rejected, expired, or force-approved — never executing the action itself. The caller that raised the proposal is responsible for applying the payload once the proposal reaches approved. This keeps the context free of any coupling to the specific operations a Domain chooses to place under dual control.

This context relates to three siblings:

  • Identity / tenancy owns the Domain aggregate that carries the per-Domain ApprovalPolicy and the ChangeApprovalPolicy mutator an operator uses to enable dual-control. See identity / tenancy.
  • Audit is the downstream sink every decision writes a names-only audit row to; the break-glass reason value is routed to the PII-safe channel, never onto the hash-chain. See audit.
  • Provisioning credential-assignment is the closest structural cousin: its request / approve / reject / revoke ReBAC workflow is the template this aggregate mirrors (a closed-state value-type aggregate, a closed sentinel-error set, audit-first denial emission). See provisioning credential-assignment.

Ubiquitous language

The terms below travel verbatim across the domain root, the application services, the persistence layer, the HTTP surface, and the lifecycle outbox events. Internal code never paraphrases them; documentation, JSON fields, and database columns adopt the exact spelling.

| Term | Definition | Code anchor |
| --- | --- | --- |
| Approval | The aggregate root: one dual-control proposal (who proposed what action against which resource, the raw payload to be applied once approved, the lifecycle state, and the decision metadata once a terminal state is reached). A value-type aggregate whose transition methods return a new value rather than mutating the receiver. | ../../internal/identity/approvals/approval.go (Approval) |
| State | The lifecycle state of an Approval. A closed-enum string over the five-member roster proposed / pending-approval / approved / rejected / expired, pinned by a roster test so a sixth member trips the drift gate at build time. | ../../internal/identity/approvals/approval.go (State) |
| ApprovalPolicy | The per-Domain value object that declares which proposed actions require dual-control. JSON-serializable so it round-trips through a JSONB column whose default is the empty object. An empty policy gates nothing; it is the wired-but-empty default. | ../../internal/identity/approvals/approvalpolicy.go (ApprovalPolicy) |
| ApprovalRule | One matching rule: an action_kind, an optional target_resource (empty is a wildcard), and approvers_required (at least one). A proposal is gated when some rule matches its action kind and target. | ../../internal/identity/approvals/approvalpolicy.go (ApprovalRule) |
| ProposerSubject | The ReBAC subject string of the principal that raised the proposal. The dual-control invariant forbids this principal from approving their own proposal. | ../../internal/identity/approvals/approval.go (Approval.ProposerSubject) |
| ActionKind | The kind of action the proposal would perform once approved. Matched against the policy rules to decide whether the proposal is gated. | ../../internal/identity/approvals/approval.go (Approval.ActionKind) |
| TargetResource | The resource the proposed action targets. Matched against the optional target_resource of a policy rule. | ../../internal/identity/approvals/approval.go (Approval.TargetResource) |
| ObjectString | The canonical resource string identifying an approval within its Domain, in the form domain:<32hex>#approval:<32hex>, where each segment is the lower-case, UNDASHED hex of the underlying 16 bytes. The render (ObjectString) and the parse (Resolve) live in one place so the format cannot drift; the dashed UUID form is rejected. | ../../internal/identity/approvals/objectstring.go (ObjectString, Resolve) |
| DomainID / ID | The package-local, anti-corruption id types (a [16]byte Domain reference and the approval's own UUIDv7). The domain layer never imports the platform UUID type into its helper signatures; adapters at the composition root map these onto the rest of the platform. | ../../internal/identity/approvals/approval.go |
| BreakGlass | The emergency override that forces a pending-approval proposal to approved without a second-party decision, gated on a dedicated emergency relation and demanding a mandatory justification. | ../../internal/identity/approvals/approval.go (Approval.BreakGlass) |
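The ObjectString format can be illustrated with a minimal Go sketch. It assumes only what the table states (lower-case, undashed 32-hex segments, render and parse kept side by side); the sentinel ErrMalformedObjectString and the helper decodeSegment are hypothetical names, not the package's actual API.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// ErrMalformedObjectString is a hypothetical sentinel used only by this sketch.
var ErrMalformedObjectString = errors.New("approvals: malformed object string")

const (
	domainPrefix   = "domain:"
	approvalMarker = "#approval:"
)

// ObjectString renders domain:<32hex>#approval:<32hex> from two 16-byte ids.
func ObjectString(domainID, approvalID [16]byte) string {
	return domainPrefix + hex.EncodeToString(domainID[:]) +
		approvalMarker + hex.EncodeToString(approvalID[:])
}

// Resolve parses the canonical form back into the two ids.
func Resolve(s string) (domainID, approvalID [16]byte, err error) {
	var zero [16]byte
	if !strings.HasPrefix(s, domainPrefix) {
		return zero, zero, ErrMalformedObjectString
	}
	domHex, appHex, found := strings.Cut(strings.TrimPrefix(s, domainPrefix), approvalMarker)
	if !found {
		return zero, zero, ErrMalformedObjectString
	}
	if domainID, err = decodeSegment(domHex); err != nil {
		return zero, zero, err
	}
	if approvalID, err = decodeSegment(appHex); err != nil {
		return zero, zero, err
	}
	return domainID, approvalID, nil
}

// decodeSegment accepts exactly 32 lower-case hex characters, so the
// 36-character dashed UUID form and upper-case hex are both rejected.
func decodeSegment(seg string) ([16]byte, error) {
	var out [16]byte
	if len(seg) != 32 || seg != strings.ToLower(seg) {
		return out, ErrMalformedObjectString
	}
	raw, err := hex.DecodeString(seg)
	if err != nil {
		return out, ErrMalformedObjectString
	}
	copy(out[:], raw)
	return out, nil
}

func main() {
	var dom, id [16]byte
	dom[0], id[15] = 0xab, 0x01
	fmt.Println(ObjectString(dom, id))

	// The dashed UUID form fails the length check on the domain segment.
	_, _, err := Resolve("domain:01890a5d-ac96-774b-bcce-b302099a8057#approval:" + hex.EncodeToString(id[:]))
	fmt.Println(err)
}
```

Keeping the length and case checks in one helper is what lets both segments share a single definition of "canonical".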

State machine

The closed transition table the aggregate enforces. Every transition method returns a new Approval value and leaves the receiver unchanged; an illegal source state wraps the package's illegal-transition sentinel.

                          propose
                             |
                             v
                       +-----------+
                       | proposed  |
                       +-----------+
                        |    |    |
        policy matches? |    |    | empty / no-match policy
        (RequireApproval)|   |    | (Approve short-circuit)
                         |   |    |
                         v   |    v
              +------------------+ +----------+
              | pending-approval | | approved | (terminal)
              +------------------+ +----------+
                 |    |    |    |
         approve |    |    |    | expire (sweeper)
                 |    |    |    +-----------------> +----------+
                 |    |    | break-glass            | expired  | (terminal)
                 |    |    +---------------------+  +----------+
                 |    | reject                   |       ^
                 v    v                          v       | expire (sweeper)
            +----------+   +----------+    +----------+   | from proposed
            | approved |   | rejected |   | approved |---+
            +----------+   +----------+   +----------+
             (terminal)     (terminal)    (terminal)

The two non-obvious edges:

  • The empty-policy short-circuit. When no rule gates the action, Propose approves the proposal in place — proposed → approved — without ever entering pending-approval. There is nothing to wait for, and manufacturing a pending-approval event no human acted on would pollute the audit trail and any "proposals awaiting a human" query with phantom rows. No deciding subject is stamped on the short-circuit.
  • RequireApproval is the only edge that makes pending-approval reachable. It is the gating transition the application service calls when a policy rule matches.

approved, rejected, and expired are terminal. The expired edge is the unattended sweeper path: it stamps the decision timestamp but leaves the deciding subject and reason empty (no human decided).

rejected and expired are kept as two distinct terminal states rather than collapsed into one closed state with a reason field: an operator-driven rejection names a deciding subject and is a distinct domain event from an unattended timeout, and a downstream consumer auditing "which proposals a human declined" must tell the two apart.
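The value-type transition style described in this section can be sketched in Go as follows. This is a reduced illustration, not the aggregate itself: the struct keeps only the fields the transitions touch, timestamps are omitted, the helper move is hypothetical, and allowing Expire from proposed is an assumption read off the "from proposed" label in the diagram.

```go
package main

import (
	"errors"
	"fmt"
)

type State string

const (
	StateProposed        State = "proposed"
	StatePendingApproval State = "pending-approval"
	StateApproved        State = "approved"
	StateRejected        State = "rejected"
	StateExpired         State = "expired"
)

// ErrIllegalTransition stands in for the package's illegal-transition sentinel.
var ErrIllegalTransition = errors.New("approvals: illegal transition")

// Approval is reduced to the fields the transitions touch.
type Approval struct {
	State            State
	DecidedBySubject string
	DecisionReason   string
}

// move returns a copy in state `to` when the receiver is in one of `from`.
func (a Approval) move(to State, from ...State) (Approval, error) {
	for _, s := range from {
		if a.State == s {
			a.State = to // a is already a copy; the caller's value is untouched
			return a, nil
		}
	}
	return a, fmt.Errorf("%w: %s -> %s", ErrIllegalTransition, a.State, to)
}

// RequireApproval is the gating edge into pending-approval.
func (a Approval) RequireApproval() (Approval, error) {
	return a.move(StatePendingApproval, StateProposed)
}

// Approve covers the operator decision and the empty-policy short-circuit
// (the latter passes an empty subject, so no deciding subject is stamped).
func (a Approval) Approve(subject string) (Approval, error) {
	next, err := a.move(StateApproved, StateProposed, StatePendingApproval)
	if err != nil {
		return a, err
	}
	next.DecidedBySubject = subject
	return next, nil
}

// Reject is legal only from pending-approval and names the deciding subject.
func (a Approval) Reject(subject, reason string) (Approval, error) {
	next, err := a.move(StateRejected, StatePendingApproval)
	if err != nil {
		return a, err
	}
	next.DecidedBySubject, next.DecisionReason = subject, reason
	return next, nil
}

// BreakGlass forces pending-approval to approved and stamps subject and reason.
func (a Approval) BreakGlass(subject, reason string) (Approval, error) {
	next, err := a.move(StateApproved, StatePendingApproval)
	if err != nil {
		return a, err
	}
	next.DecidedBySubject, next.DecisionReason = subject, reason
	return next, nil
}

// Expire is the sweeper edge: no deciding subject and no reason.
func (a Approval) Expire() (Approval, error) {
	return a.move(StateExpired, StateProposed, StatePendingApproval)
}

func main() {
	p := Approval{State: StateProposed}
	pending, _ := p.RequireApproval()
	done, _ := pending.Approve("user:bob")
	_, err := done.Reject("user:carol", "too late")
	// prints: proposed pending-approval approved true
	fmt.Println(p.State, pending.State, done.State, errors.Is(err, ErrIllegalTransition))
}
```

Because every method has a value receiver, the original proposal stays in proposed even after the chain of transitions has produced an approved copy.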

ApprovalPolicy

The policy is a value object stored on the owning Domain's approval_policy JSONB column. It declares which proposed actions require dual-control:

  • The empty policy gates nothing. A nil or empty Rules slice is the wired-but-empty default the workflow ships with: the machinery is in place, but no action is held for approval until an operator adds a rule. A zero policy and a policy unmarshalled from the empty object round-trip identically — IsZero reports true for both, so the JSONB default is a no-gate policy.
  • A matching rule gates the proposal. Evaluate(actionKind, target) returns true when some rule's action_kind equals the proposed action kind and its target_resource is either empty (a wildcard matching any target) or equals the proposed target. A gated proposal enters pending-approval; a non-matching proposal short-circuits to approved.
  • approvers_required records how many distinct approvers a matched action requires; each rule must name a non-empty action_kind and require at least one approver. Validate enforces these per-rule invariants; an empty policy is valid (it simply gates nothing).
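The three behaviours listed above (IsZero, Evaluate, Validate) can be sketched in Go. The field names action_kind, target_resource, and approvers_required come from this document; the top-level "rules" JSON key, the ErrInvalidRule sentinel, the ParsePolicy helper, and the sample action kinds are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ApprovalRule is one matching rule; an empty TargetResource is a wildcard.
type ApprovalRule struct {
	ActionKind        string `json:"action_kind"`
	TargetResource    string `json:"target_resource,omitempty"`
	ApproversRequired int    `json:"approvers_required"`
}

// ApprovalPolicy is the per-Domain value object. The "rules" key is assumed.
type ApprovalPolicy struct {
	Rules []ApprovalRule `json:"rules,omitempty"`
}

// ErrInvalidRule is a hypothetical sentinel for this sketch.
var ErrInvalidRule = errors.New("approvals: invalid policy rule")

// IsZero reports true for a nil or empty rule set: the no-gate policy.
func (p ApprovalPolicy) IsZero() bool { return len(p.Rules) == 0 }

// Evaluate reports whether some rule gates the proposed action.
func (p ApprovalPolicy) Evaluate(actionKind, target string) bool {
	for _, r := range p.Rules {
		if r.ActionKind == actionKind && (r.TargetResource == "" || r.TargetResource == target) {
			return true
		}
	}
	return false
}

// Validate enforces the per-rule invariants; an empty policy is valid.
func (p ApprovalPolicy) Validate() error {
	for i, r := range p.Rules {
		if r.ActionKind == "" || r.ApproversRequired < 1 {
			return fmt.Errorf("%w: rule %d", ErrInvalidRule, i)
		}
	}
	return nil
}

// ParsePolicy decodes a stored JSON document and validates it.
func ParsePolicy(raw []byte) (ApprovalPolicy, error) {
	var p ApprovalPolicy
	if err := json.Unmarshal(raw, &p); err != nil {
		return ApprovalPolicy{}, err
	}
	return p, p.Validate()
}

func main() {
	empty, _ := ParsePolicy([]byte(`{}`))
	// prints: true false
	fmt.Println(empty.IsZero(), empty.Evaluate("credential.rotate", "vault:prod"))

	p, err := ParsePolicy([]byte(`{"rules":[{"action_kind":"credential.rotate","approvers_required":1}]}`))
	// prints: <nil> true false
	fmt.Println(err, p.Evaluate("credential.rotate", "vault:prod"), p.Evaluate("domain.delete", "vault:prod"))
}
```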

The policy lives on the Domain rather than on the Approval because it is a Domain-wide configuration axis an operator owns, orthogonal to any single proposal. The Approval aggregate cannot import the tenancy context, so the application service receives the resolved policy as input at propose time — the composition root resolves the owning Domain's policy across the tenancy boundary and passes it in, mirroring how the credential-assignment request flow receives its caller-resolved input rather than reaching across a bounded-context boundary.

Break-glass

Break-glass is the emergency escape hatch: it forces a pending-approval proposal to approved without the second-party decision. Its semantics are deliberately stricter than an ordinary approval:

  • It is gated on a dedicated relation, not the ordinary approve permission. The override requires the emergency_approver relation on the owning Domain — a grant the Domain owner extends only to break-glass operators. A principal holding the ordinary approve permission cannot break glass.
  • It demands a mandatory justification of at least 16 characters. A shorter reason fails fast before any row is read, any permission is checked, or anything is written.
  • It forces the transition to approved and stamps the deciding subject and the reason on the aggregate.
  • The reason value is PII and never reaches the audit hash-chain or the outbox. The audit row carries the reason field name on its names-only caveat surface so an auditor knows a justification was supplied, while the reason value is routed through a separate PII map to a PII-safe downstream sink. The lifecycle outbox payload carries no decision reason at all. Putting the value onto the names-only caveat surface would leak operator PII onto the channel the composition root forwards verbatim, breaking the names-only contract every other decision path upholds.
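The fail-fast length check and the name/value split can be sketched as below. This is an illustration under assumptions: the AuditRow shape, the function name PrepareBreakGlassAudit, and counting "characters" as Unicode code points are all hypothetical choices, not confirmed details of the service.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// MinBreakGlassReasonLen is the documented minimum justification length.
const MinBreakGlassReasonLen = 16

// ErrInvalidBreakGlassReason is a hypothetical sentinel for this sketch.
var ErrInvalidBreakGlassReason = errors.New("invalid_break_glass_reason")

// AuditRow is a hypothetical names-only row: it carries field NAMES, never values.
type AuditRow struct {
	Relation    string
	Outcome     string
	CaveatNames []string
}

// PrepareBreakGlassAudit validates the reason before doing anything else, then
// splits the field name (audit row) from the value (separate PII map).
func PrepareBreakGlassAudit(reason string) (AuditRow, map[string]string, error) {
	// Assumption: length is measured in code points, not bytes.
	if utf8.RuneCountInString(reason) < MinBreakGlassReasonLen {
		return AuditRow{}, nil, ErrInvalidBreakGlassReason
	}
	row := AuditRow{
		Relation:    "approval.break_glass",
		Outcome:     "granted",
		CaveatNames: []string{"reason"}, // the name only
	}
	pii := map[string]string{"reason": reason} // the value, for the PII-safe sink
	return row, pii, nil
}

func main() {
	_, _, err := PrepareBreakGlassAudit("too short")
	fmt.Println(err)

	row, pii, _ := PrepareBreakGlassAudit("prod outage, approver unreachable")
	fmt.Println(row.Relation, row.CaveatNames, len(pii))
}
```

Returning the two destinations as separate values makes it hard to forward the reason value to the names-only channel by accident.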

Audit relations

Each decision path emits exactly one audit row through the services-local audit sink, which the composition root adapts onto the canonical audit sink. The relation strings and outcomes below are the stable contract downstream auditors key on. The outcome is one of three strings the composition root maps onto the frozen canonical audit reason enum.

| Relation | Emitted when | Outcome |
| --- | --- | --- |
| approval.propose | A proposal is submitted, whether it gated into pending-approval or short-circuited to approved. | granted |
| approval.approve | A proposed or pending proposal is approved by an operator decision. | granted |
| approval.reject | A pending proposal is declined by an operator decision. | insufficient_relation |
| approval.break_glass | A pending proposal is force-approved via the emergency override. | granted |
| approval.expire | The background sweeper expires an un-decided proposal past its deadline. | out_of_scope |

The expire row is the unattended path: it stamps the system subject fallback (no deciding principal) and emits one audit row only when a row was actually expired this pass, never on the idempotent skip.

Invariants

| Invariant | Where enforced |
| --- | --- |
| A new approval always starts in proposed. | NewApproval forces State = proposed and clears decision fields regardless of input. |
| DomainID is non-zero; ProposerSubject, ActionKind, TargetResource are non-empty; ExpiresAt is non-zero. | buildApproval (shared by NewApproval / HydrateApproval). |
| State transitions follow the closed transition table. | The transition methods (RequireApproval / Approve / Reject / Expire / BreakGlass) reject an illegal source state. |
| HydrateApproval additionally requires ID, CreatedAt, and a roster-valid State. | buildApproval strict path: a corrupt persisted row is rejected rather than silently defaulted. |
| Each approval owns a UUIDv7 ID; a zero ID passed to construction is replaced with a freshly minted one. | NewApproval mints via NewID; HydrateApproval rejects a zero ID. |
| A proposer may not approve their own proposal (dual-control). | The approve service guards on Subject == ProposerSubject BEFORE the ReBAC check and BEFORE any write. |
| Each policy rule names a non-empty action_kind and requires at least one approver; the empty policy is valid. | ApprovalPolicy.Validate. |
| The break-glass reason is at least 16 characters and its value never reaches the audit hash-chain or outbox. | The break-glass service validates length fast, routes the field name to the caveat surface and the value to the PII map. |
| The object string is the undashed domain:<32hex>#approval:<32hex> form; the dashed UUID form is rejected. | ObjectString / Resolve. |
| One state mutation plus its outbox event commit atomically. | Every decision service wraps the UpdateState + outbox append in a single RunInTx. |
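The ordering in the dual-control invariant (self-approval guard first, ReBAC check second, write last) can be sketched in Go. The Checker port, the countingChecker fake, the Proposal struct, and the apply callback standing in for the transactional UpdateState + outbox append are all hypothetical; only the ordering is taken from the table.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	// Both sentinels are illustrative stand-ins for the closed error set.
	ErrSelfApproval     = errors.New("self_approval_denied")
	ErrPermissionDenied = errors.New("permission_denied")
)

// Checker is a hypothetical ReBAC port.
type Checker interface {
	Check(subject, relation, object string) bool
}

// countingChecker is a test double that records how often it was consulted.
type countingChecker struct {
	calls int
	allow bool
}

func (c *countingChecker) Check(_, _, _ string) bool {
	c.calls++
	return c.allow
}

// Proposal carries just what the guard needs.
type Proposal struct {
	ProposerSubject string
	DomainObject    string
}

// Approve runs the guards in the documented order. apply stands in for the
// single transaction that updates the state and appends the outbox event.
func Approve(p Proposal, subject string, rebac Checker, apply func() error) error {
	if subject == p.ProposerSubject {
		return ErrSelfApproval // before the ReBAC check and before any write
	}
	if !rebac.Check(subject, "approve", p.DomainObject) {
		return ErrPermissionDenied
	}
	return apply()
}

func main() {
	rebac := &countingChecker{allow: true}
	writes := 0
	apply := func() error { writes++; return nil }
	p := Proposal{ProposerSubject: "user:alice", DomainObject: "domain:example"}

	err := Approve(p, "user:alice", rebac, apply)
	// prints: self_approval_denied 0 0
	fmt.Println(err, rebac.calls, writes)

	err = Approve(p, "user:bob", rebac, apply)
	// prints: <nil> 1 1
	fmt.Println(err, rebac.calls, writes)
}
```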

API contract

The HTTP surface (../../internal/transport/http/v1/approvals/) exposes five operations defined by ../../api/openapi/plexsphere-v1.yaml. Every operation authenticates the principal (a 401 otherwise) and runs its ReBAC check BEFORE the persistence read. Error bodies follow RFC 9457 (application/problem+json).

ListApprovals — GET /v1/approvals

Returns a creation-ordered page of Approval metadata. Optional query parameters: status (a lifecycle-state filter), domain_id (an owning-Domain filter), cursor (an HMAC-signed continuation token), and limit (clamped to [1, 200], default 50). The handler runs a platform-global read gate against platform:plexsphere BEFORE the read, then layers a per-row read visibility filter on the owning Domain so the returned items are the subset the caller is authorised to see. Success is 200 with an ApprovalList. The pagination cursor is HMAC-signed and bound to the per-(caller, pepper) pseudonym: a cursor minted by one principal replayed by another surfaces as 403 cursor_binding_mismatch, while a tampered envelope or unknown version byte stays on 400 invalid_cursor. Problem codes: 400 (invalid cursor / out-of-range limit / malformed domain_id), 401, 403 (PermissionDenied or cursor_binding_mismatch), 500.
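One way a caller-bound, HMAC-signed cursor could separate the two failure modes is sketched below. The envelope layout (version byte, caller pseudonym, position, MAC), the derivation of the pseudonym as an HMAC of the caller under the pepper, and every function name here are assumptions for illustration; the document does not specify the actual codec.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"fmt"
)

const cursorVersion byte = 1

var (
	ErrInvalidCursor         = errors.New("invalid_cursor")
	ErrCursorBindingMismatch = errors.New("cursor_binding_mismatch")
)

// pseudonym derives a per-(caller, pepper) identifier (assumed construction).
func pseudonym(pepper []byte, caller string) []byte {
	m := hmac.New(sha256.New, pepper)
	m.Write([]byte(caller))
	return m.Sum(nil)
}

func sign(key, body []byte) []byte {
	m := hmac.New(sha256.New, key)
	m.Write(body)
	return m.Sum(nil)
}

// EncodeCursor builds version || pseudonym || position || MAC, base64url-encoded.
func EncodeCursor(key, pepper []byte, caller, position string) string {
	body := append([]byte{cursorVersion}, pseudonym(pepper, caller)...)
	body = append(body, position...)
	return base64.RawURLEncoding.EncodeToString(append(body, sign(key, body)...))
}

// DecodeCursor verifies integrity first (invalid_cursor), then the caller
// binding (cursor_binding_mismatch), and returns the position.
func DecodeCursor(key, pepper []byte, caller, token string) (string, error) {
	raw, err := base64.RawURLEncoding.DecodeString(token)
	if err != nil || len(raw) < 1+2*sha256.Size || raw[0] != cursorVersion {
		return "", ErrInvalidCursor
	}
	body, mac := raw[:len(raw)-sha256.Size], raw[len(raw)-sha256.Size:]
	if !hmac.Equal(mac, sign(key, body)) {
		return "", ErrInvalidCursor // tampered envelope
	}
	if !hmac.Equal(body[1:1+sha256.Size], pseudonym(pepper, caller)) {
		return "", ErrCursorBindingMismatch // minted for a different principal
	}
	return string(body[1+sha256.Size:]), nil
}

func main() {
	key, pepper := []byte("demo-key"), []byte("demo-pepper")
	tok := EncodeCursor(key, pepper, "user:alice", "row-42")

	pos, err := DecodeCursor(key, pepper, "user:alice", tok)
	fmt.Println(pos, err)

	_, err = DecodeCursor(key, pepper, "user:bob", tok)
	fmt.Println(err)
}
```

Verifying the MAC before comparing the pseudonym is what keeps a forged token on 400 while a genuine token replayed by another principal lands on 403.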

GetApproval — GET /v1/approvals/{id}

Returns the metadata projection of one Approval. The handler runs the read ReBAC check on the owning Domain BEFORE the read. Success is 200 with an Approval. Problem codes: 400 invalid_approval_id, 401, 403 (PermissionDenied), 404 approval_not_found, 500.

ApproveApproval — POST /v1/approvals/{id}/approve

Approves the proposal: the handler reads the row to resolve the owning Domain, runs the approve ReBAC check on that Domain, then delegates to the application service which moves the proposal to approved and appends the approval outbox event in one transaction. No request body. Success is 200 with the projection in state approved. Approval is legal only from pending-approval (or proposed under the empty-policy short-circuit). Problem codes: 400 invalid_approval_id, 401, 403 (PermissionDenied or self_approval_denied — a caller may not approve a proposal they themselves raised), 404 approval_not_found, 409 illegal_transition, 500.

RejectApproval — POST /v1/approvals/{id}/reject

Rejects the proposal: the handler resolves the owning Domain, runs the approve ReBAC check, then delegates to the application service which moves the proposal to rejected and appends the rejection outbox event in one transaction. Request body is a RejectApprovalRequest with a required reason (length 1–1024); the reason is recorded on the decision, not echoed into the audit caveat. Success is 200 with the projection in state rejected. Rejection is legal only from pending-approval. Problem codes: 400 (invalid_approval_id / invalid_body / invalid_decision_reason), 401, 403 (PermissionDenied), 404 approval_not_found, 409 illegal_transition, 500.

BreakGlassApproval — POST /v1/approvals/{id}/break-glass

Forces the proposal to approved via the emergency override: the handler resolves the owning Domain, runs the emergency_approver relation check (the ordinary approve permission is NOT sufficient), then delegates to the application service. Request body is a BreakGlassRequest with a required reason of at least 16 characters. The reason is recorded by field NAME only on the decision's audit caveat context — the field projected onto the response carries the names-only marker x-plexsphere-names-only; the value itself is PII routed to a PII-safe sink and never crosses the contract boundary verbatim. Success is 200 with the projection in state approved. The override is legal only from pending-approval. Problem codes: 400 (invalid_approval_id / invalid_body / invalid_break_glass_reason), 401, 403 (PermissionDenied), 404 approval_not_found, 409 illegal_transition, 500.

Events

Every lifecycle transition appends exactly one outbox event in the same transaction as the state mutation, so the row and its event commit atomically. The event_type strings are snake_case and must match the approval_workflow_request_outbox_token CHECK constraint:

| event_type | Emitted by |
| --- | --- |
| approval_proposed | A proposal that gated into pending-approval. |
| approval_approved | The empty-policy short-circuit AND the operator approve decision. |
| approval_rejected | The operator reject decision. |
| approval_break_glassed | The emergency override. |
| approval_expired | The background sweeper. |

The outbox payload denormalises the aggregate identity and the resulting state (approval_id, domain_id, action_kind, target_resource, state, the optional decided_by_subject, and occurred_at) so a relay consumer routes without re-reading the row. It carries no decision_reason: the break-glass reason is PII and never enters the event log. The per-(approval, event_type) idempotency token anchored on the outbox event id gives the expire sweeper an at-most-once guarantee on re-run.
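The payload shape described above might look like the following Go struct. The JSON field names are the ones listed in this section; the Go type name, the use of string ids, the omitempty handling of decided_by_subject, and the Encode and HasField helpers are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// OutboxPayload mirrors the documented field list. There is deliberately no
// decision_reason field, so a reason cannot be serialised by accident.
type OutboxPayload struct {
	ApprovalID       string    `json:"approval_id"`
	DomainID         string    `json:"domain_id"`
	ActionKind       string    `json:"action_kind"`
	TargetResource   string    `json:"target_resource"`
	State            string    `json:"state"`
	DecidedBySubject string    `json:"decided_by_subject,omitempty"` // absent on sweeper and short-circuit events
	OccurredAt       time.Time `json:"occurred_at"`
}

// Encode serialises the payload for the outbox row.
func Encode(p OutboxPayload) (string, error) {
	raw, err := json.Marshal(p)
	return string(raw), err
}

// HasField is a small helper for this sketch: it reports whether the encoded
// JSON object contains the named key.
func HasField(encoded, name string) bool {
	return strings.Contains(encoded, `"`+name+`":`)
}

func main() {
	expired := OutboxPayload{
		ApprovalID:     "00000000000000000000000000000001",
		DomainID:       "ab000000000000000000000000000000",
		ActionKind:     "credential.rotate", // hypothetical action kind
		TargetResource: "vault:prod",        // hypothetical target
		State:          "expired",
		OccurredAt:     time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
	}
	encoded, err := Encode(expired)
	fmt.Println(encoded, err)
	fmt.Println(HasField(encoded, "decided_by_subject"), HasField(encoded, "decision_reason"))
}
```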

Observability

The application service instruments every decision transition. The two metrics are:

  • plexsphere_approvals_decisions_total — a CounterVec with labels status and action_kind, incremented once per decision transition.
  • plexsphere_approvals_decision_duration_seconds — a HistogramVec with labels status and action_kind, observing the latency of one decision transition.

Each transition also emits a structured slog line carrying the fields operation, status, action_kind, approval_id, domain_id, and decided_by. No PII — in particular the break-glass reason value — and no trace identifiers appear in any metric help string, metric label value, or log message: the action_kind label is the operator-chosen action kind (a bounded vocabulary), not free text, and the reason value travels only on the PII-safe audit channel. The requirement that motivates each assertion lives in the code's doc-comments, never in a logged string.
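The structured log line can be sketched with the standard log/slog package. The six field names are the ones listed above; the Decision struct, the decisionAttrs helper, and the message text "approval decision" are assumptions. Building the attributes from a fixed struct means there is no slot through which a break-glass reason could reach the log.

```go
package main

import (
	"context"
	"log/slog"
	"os"
)

// Decision holds exactly the documented log fields and nothing else.
type Decision struct {
	Operation  string
	Status     string
	ActionKind string
	ApprovalID string
	DomainID   string
	DecidedBy  string
}

// decisionAttrs maps a Decision onto the documented slog field names.
func decisionAttrs(d Decision) []slog.Attr {
	return []slog.Attr{
		slog.String("operation", d.Operation),
		slog.String("status", d.Status),
		slog.String("action_kind", d.ActionKind),
		slog.String("approval_id", d.ApprovalID),
		slog.String("domain_id", d.DomainID),
		slog.String("decided_by", d.DecidedBy),
	}
}

// LogDecision emits one structured line per decision transition.
func LogDecision(ctx context.Context, l *slog.Logger, d Decision) {
	l.LogAttrs(ctx, slog.LevelInfo, "approval decision", decisionAttrs(d)...)
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
	LogDecision(context.Background(), logger, Decision{
		Operation:  "break_glass",
		Status:     "approved",
		ActionKind: "credential.rotate", // hypothetical action kind
		ApprovalID: "00000000000000000000000000000001",
		DomainID:   "ab000000000000000000000000000000",
		DecidedBy:  "user:bob",
	})
}
```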

Operator runbook

Enabling dual-control on a Domain

The workflow ships wired-but-empty: every Domain starts with an empty ApprovalPolicy that gates nothing, so the platform behaves exactly like a deployment that never opted in. To turn on dual-control for a Domain, set a non-empty policy via the tenancy ChangeApprovalPolicy mutator (../../internal/identity/tenancy/domain.go), which validates the policy (each rule names a non-empty action kind and requires at least one approver) and bumps the Domain's update timestamp. Add one ApprovalRule per action you want gated: name the action_kind, optionally narrow it to a single target_resource (leave it empty for a wildcard over every target of that action kind), and set approvers_required to at least one. From the moment the rule lands, a proposal whose action kind and target match it enters pending-approval instead of short-circuiting to approved, and the queue surfaces it to GET /v1/approvals?status=pending-approval. Re-setting the policy to the empty value disables gating again — the empty policy is accepted as the valid wired-but-empty default and is never silently replaced with a non-empty one.

Composition knobs

Two optional environment variables tune the production wiring in ../../cmd/plexsphere/approvals_factory_prod.go:

  • PLEXSPHERE_APPROVALS_EXPIRE_TICK sets the cadence of the background sweep that expires stale pending-approval proposals. It is parsed by time.ParseDuration, must be positive, and defaults to 60 seconds.
  • PLEXSPHERE_APPROVALS_CURSOR_HMAC_KEY is a hex-encoded HMAC key that binds the GET /v1/approvals list cursor to the presenting caller. When unset the list cursor falls back to the unsigned identity codec. It is secret material and should ride in a Secret, never a ConfigMap.
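Parsing the two knobs might look like the sketch below. The variable names, the time.ParseDuration call, the positivity rule, and the 60-second default come from the list above; the function names and the choice to return an error on an invalid value (rather than falling back to the default) are assumptions.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"time"
)

const defaultExpireTick = 60 * time.Second

// ParseExpireTick returns the sweep cadence; an empty value selects the default.
func ParseExpireTick(raw string) (time.Duration, error) {
	if raw == "" {
		return defaultExpireTick, nil
	}
	d, err := time.ParseDuration(raw)
	if err != nil {
		return 0, fmt.Errorf("PLEXSPHERE_APPROVALS_EXPIRE_TICK: %w", err)
	}
	if d <= 0 {
		return 0, fmt.Errorf("PLEXSPHERE_APPROVALS_EXPIRE_TICK: must be positive, got %s", d)
	}
	return d, nil
}

// ParseCursorKey decodes the hex HMAC key. A nil key with a nil error means the
// variable was unset and the list cursor uses the unsigned identity codec.
func ParseCursorKey(raw string) ([]byte, error) {
	if raw == "" {
		return nil, nil
	}
	key, err := hex.DecodeString(raw)
	if err != nil {
		// The decoder's error is not wrapped, so no part of the secret is echoed.
		return nil, errors.New("PLEXSPHERE_APPROVALS_CURSOR_HMAC_KEY: not valid hex")
	}
	return key, nil
}

func main() {
	tick, err := ParseExpireTick(os.Getenv("PLEXSPHERE_APPROVALS_EXPIRE_TICK"))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	key, err := ParseCursorKey(os.Getenv("PLEXSPHERE_APPROVALS_CURSOR_HMAC_KEY"))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("expire tick:", tick, "signed cursor:", key != nil)
}
```

With neither variable set, this prints a 1m0s tick and an unsigned cursor, matching the documented defaults.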

The break-glass escape hatch

When a gated proposal must be approved without waiting for a second-party decision (a production incident, an unavailable approver), a principal holding the emergency_approver relation on the owning Domain calls POST /v1/approvals/{id}/break-glass with a mandatory justification of at least 16 characters. The override forces the proposal to approved, records the reason on the decision, and routes the reason value to the PII-safe audit sink; the audit hash-chain carries only the field name, and the outbox payload carries no reason at all. Grant the emergency_approver relation narrowly: it is deliberately distinct from the ordinary approve permission so that the set of principals who can bypass dual control is auditable and small. The override is only legal from pending-approval; any other source state returns 409 illegal_transition.

Cross-references