Identity and Groups — manual and IdP-synced Group aggregates

This document is the authoritative bounded-context reference for the Group, GroupMembership, and GroupParent aggregates that ship under internal/identity/groups, the persistence layer under internal/identity/groups/repo, the groups.Syncer port wired into the sign-in flow, and the admin HTTP surface under internal/transport/http/v1/admin/groups.go.

This document also holds the ubiquitous language, the Source-dictated invariants, the Resolver semantics (including the MaxDepth = 32 cap), the IdP sync pipeline, the drift-triage runbook, the ReBAC-consumer contract for the forthcoming authorisation rollout, and the invariant-to-test matrix that ties every requirement to at least one automated test. It also owns the ERD of the three plexsphere.{groups,group_memberships,group_parents} tables the context writes through.

For the bounded-context siblings see idp.md (per-Domain IdP binding, User, UserSession, ServiceIdentity, APIToken aggregates) and tenancy.md (Domain, Project, Resource, Node). For the persistent schema see the Groups schema section of the internal/platform/db reference, and for the operator workflow see the how-to at ../../how-to/identity/manage-groups.md.

ERD — three groups tables plus the outbox

The three tables below are the persistent footprint of the groups aggregates. plexsphere.outbox_events is listed alongside because every aggregate-shaped write in this context appends exactly one row to it inside the same transaction.

text
                     +------------------------------+
                     |   plexsphere.groups          |
                     +------------------------------+
                     | PK id                uuid    |
                     | FK domain_id         uuid    |  -> domains.id (RESTRICT)
                     |    slug              text    |  kebab-case, 1..64
                     |    display_name      text    |
                     |    source            text    |  CHECK IN (manual, idp)
                     | FK idp_binding_id    uuid    |  NULLable -> idp_bindings.id
                     |    idp_claim_value   text    |  NULLable
                     |    created_at, updated_at    |
                     | UQ (domain_id, slug)         |
                     | CK source='idp' => binding   |
                     |    AND claim_value set       |
                     +--+---------------+-----------+
                        | 1             | 1
                        |               |
                        | *             | *
           +------------v--+          +-v-----------------------------+
           |  group_parents|          |  plexsphere.group_memberships |
           +---------------+          +-------------------------------+
           | FK parent_id  |          | FK group_id           uuid    | -> groups.id (CASCADE)
           | FK child_id   |          |    principal_user_id  uuid    | NULLable -> users.id
           | CK no-self    |          |    principal_service_ uuid    | NULLable
           | PK (p, c)     |          |      identity_id              |
           +---------------+          |    principal_group_id uuid    | NULLable
                                      |    source             text    | CHECK IN (manual, idp)
                                      |    created_at                 |
                                      | CK XOR on the three principal |
                                      |     columns                   |
                                      | UQ (group_id, principals)     |
                                      | TG source matches parent grp  |
                                      +-------------------------------+

                     +------------------------------+
                     |  plexsphere.outbox_events    |
                     +------------------------------+
                     | PK id                uuid    |
                     |    aggregate_type    text    |
                     |    aggregate_id      uuid    |
                     |    event_type        text    |
                     |    payload           jsonb   |
                     |    occurred_at       ts-tz   |
                     |    transaction_id    xid8    |  pg_current_xact_id()
                     +------------------------------+

Three schema-level invariants back aggregate-level contracts. The first two are worth calling out because they fail at pgerrcode 23514 (check_violation) rather than 23505 (unique_violation); the third does trip 23505, but the repo classifier discriminates it by constraint name:

  • groups_source_idp_requires_binding: a Group row with source = 'idp' MUST carry a non-null idp_binding_id and a non-empty idp_claim_value; a Group with source = 'manual' MUST carry neither. A direct INSERT bypassing the aggregate cannot produce a mis-sourced row.
  • group_memberships_source_matches_parent: a BEFORE INSERT OR UPDATE trigger joins the parent plexsphere.groups row and rejects any Membership whose source disagrees with the parent's source. The repo's classifier routes the resulting 23514 to ErrSourceMismatch alongside the XOR check.
  • groups_idp_binding_claim_uq: a partial UNIQUE INDEX on (domain_id, idp_binding_id, idp_claim_value) WHERE source = 'idp' guarantees that at most one source='idp' Group can exist per (domain_id, binding_id, claim_value) tuple. The sync path keys candidate Groups by idp_claim_value, so without this guard two IdP Groups sharing a claim value would silently collapse into a single candidate in the reconciler's map. A duplicate INSERT trips pgerrcode 23505, which the repo classifier discriminates by constraint name and surfaces as the dedicated ErrIdPBindingClaimConflict sentinel. That sentinel is intentionally NOT an errors.Is alias of the slug-race ErrConflict; this mirrors the tenancy ErrMeshCIDROverlap / ErrReservationOverlap split, so callers branching on a "claim-value race" do not accidentally also match a "slug race". The admin HTTP handlers map the sentinel onto a 409 group-idp-claim-conflict Problem, distinct from the generic 409 group-conflict returned for groups_domain_slug_uq.
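The classification described in the bullets above could look roughly like the following sketch. It is not the repo's classifier: the pgError type is a hypothetical stand-in for the driver's error type (only the SQLSTATE code and constraint name are modelled), the sentinel messages are illustrative, and treating every check_violation as a source mismatch is a simplification.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel names come from the text above; the messages are illustrative.
var (
	ErrSourceMismatch          = errors.New("groups: source mismatch")
	ErrConflict                = errors.New("groups: slug conflict")
	ErrIdPBindingClaimConflict = errors.New("groups: idp binding claim conflict")
)

// pgError is a hypothetical stand-in for the database driver's error type.
type pgError struct {
	Code           string // SQLSTATE, e.g. "23505"
	ConstraintName string
}

func (e *pgError) Error() string {
	return "SQLSTATE " + e.Code + " on " + e.ConstraintName
}

// classify maps a database error onto a domain sentinel.
// Errors it does not recognise are returned unchanged.
func classify(err error) error {
	var pe *pgError
	if !errors.As(err, &pe) {
		return err
	}
	switch pe.Code {
	case "23514": // check_violation
		// Simplification: a fuller classifier would presumably also
		// inspect the constraint name here.
		return ErrSourceMismatch
	case "23505": // unique_violation, discriminated by constraint name
		switch pe.ConstraintName {
		case "groups_idp_binding_claim_uq":
			return ErrIdPBindingClaimConflict
		case "groups_domain_slug_uq":
			return ErrConflict
		}
	}
	return err
}

func main() {
	claimRace := classify(&pgError{Code: "23505", ConstraintName: "groups_idp_binding_claim_uq"})
	fmt.Println(claimRace)
	// The two 23505 sentinels are distinct values, so a caller matching
	// the slug race does not also match the claim-value race.
	fmt.Println(errors.Is(claimRace, ErrConflict))
}
```

Keeping the two unique-violation sentinels as separate values, rather than wrapping one in the other, is what lets an HTTP layer choose between the two 409 Problem types with a plain errors.Is check.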

Ubiquitous-language glossary

Every term listed here appears verbatim in the aggregates, in the sqlc queries, in the migration, in the HTTP surface, and in the integration suite. The whole point of listing them in one place is that internal/identity/groups code and the operator-facing conversation use the same words.

| Term | Definition |
| --- | --- |
| Group | Domain-scoped aggregate identifying a named set of principals. Carries an immutable Source, a kebab-case Slug unique per Domain, and a DisplayName. Persisted in plexsphere.groups. |
| Source | String enum {manual, idp}. A Group's Source is immutable; the Memberships attached to it MUST share it. Manual Groups are edited by administrators; IdP Groups mirror a claim value and are authoritatively rewritten by the sync loop. |
| Membership | Value object linking a Group to exactly one Principal. Memberships are immutable once constructed; changes happen by revoking and re-issuing. Persisted in plexsphere.group_memberships. |
| PrincipalKind | String enum {user, service_identity, group}. Discriminates the column the Membership row populates (XOR CHECK in SQL). A group-kind Membership models a nested child and is walked via ParentEdge in the Resolver. |
| ParentEdge | Directed edge (parent_id -> child_id) describing nested Group inclusion. Self-parent edges are forbidden at the aggregate, the SQL CHECK, and the API handlers. Cycles are rejected by the Resolver's colored-DFS and by the SQL recursive CTE pre-check in ParentRepo.AddParent. Persisted in plexsphere.group_parents. |
| Resolver | Pure in-memory (for unit tests) or CTE-backed (for production) domain service that flattens a principal's transitive Group memberships up to MaxDepth = 32. Returns a deduplicated, lexicographically sorted []tenancy.ID. |
| MaxDepth | The shared MaxDepth = 32 cap on nested-group hierarchy depth. Defined in internal/identity/groups/resolver.go; mirrored by the ResolveTransitiveMembership CTE's recursion guard and by the integration test that plants a depth-33 chain. |
| Syncer | Domain-layer port under internal/identity/groups that reconciles a User's IdP-sourced Group memberships to match the upstream id_token groups claim. Invoked by the sign-in handler after the User upsert and sign-in record. Sync opens its own pgx.Tx inside MembershipRepo.SyncForUser; the sign-in handler does NOT share a tx across the three per-repo calls today (tracked as a follow-up; see DECISION block in callback.go). |
| Drift | An IdP-asserted group claim value that does not map to any local IdP-sourced Group in the User's Domain. Recorded as a GroupIdPSyncDrift outbox event so operator tooling can triage the unknown claim without blocking sign-in. |
| Outbox | plexsphere.outbox_events, the transactional outbox relay source. transaction_id is pg_current_xact_id() so consumers can order by commit time without a dedicated sequence. Every aggregate mutation appends exactly one matching row. |
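To make the Source, PrincipalKind, and Membership entries concrete, here is a minimal sketch of a constructor that enforces "exactly one principal" and "Membership Source matches the Group's Source". The type shapes, the newMembership signature, and ErrInvalidPrincipal are assumptions made for illustration; the real groups.NewMembership may look different.

```go
package main

import (
	"errors"
	"fmt"
)

type Source string

const (
	SourceManual Source = "manual"
	SourceIdP    Source = "idp"
)

type PrincipalKind string

const (
	KindUser            PrincipalKind = "user"
	KindServiceIdentity PrincipalKind = "service_identity"
	KindGroup           PrincipalKind = "group"
)

var (
	ErrSourceMismatch   = errors.New("groups: membership source must match group source")
	ErrInvalidPrincipal = errors.New("groups: invalid principal") // hypothetical sentinel
)

// Group carries only the fields this sketch needs.
type Group struct {
	ID     string
	Source Source
}

// Membership links a Group to exactly one principal. Modelling the
// principal as (Kind, ID) makes the SQL XOR hold by construction.
type Membership struct {
	GroupID     string
	Kind        PrincipalKind
	PrincipalID string
	Source      Source
}

// newMembership is a hypothetical constructor, not the package's API.
func newMembership(g Group, kind PrincipalKind, principalID string, src Source) (Membership, error) {
	switch kind {
	case KindUser, KindServiceIdentity, KindGroup:
	default:
		return Membership{}, ErrInvalidPrincipal
	}
	if principalID == "" {
		return Membership{}, ErrInvalidPrincipal
	}
	if src != g.Source {
		return Membership{}, ErrSourceMismatch
	}
	return Membership{GroupID: g.ID, Kind: kind, PrincipalID: principalID, Source: src}, nil
}

func main() {
	idpGroup := Group{ID: "g-eng", Source: SourceIdP}
	_, err := newMembership(idpGroup, KindUser, "u-alice", SourceManual)
	fmt.Println(err) // a manual Membership on an IdP Group is rejected
}
```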

Resolver semantics

groups.Resolver.ResolveMembership(ctx, userID) returns the set of Group IDs the userID belongs to, either directly or via ancestor ParentEdge walks. The contract is identical across the in-memory implementation used by unit tests and the CTE-backed implementation used in production:

  1. An unknown userID returns (nil, nil) — callers interpret "no groups" as non-membership, never as a lookup failure. This mirrors how SpiceDB's LookupSubjects handles unknown subjects and keeps the ReBAC consumer contract clean.
  2. The returned slice is deduplicated and lexicographically sorted by tenancy.ID.String() so callers may rely on a stable order across runs and across storage backends.
  3. Cycles produce a *groups.CycleError whose Path field carries the offending chain of group IDs. errors.Is(err, groups.ErrMembershipCycle) matches. The in-memory resolver detects cycles via three-colour DFS (white/gray/black); the SQL resolver's recursive CTE terminates via the NOT parent_id = ANY(d.path) cycle guard on the UNION ALL recursive step so a cycle cannot run forever.
  4. Chains longer than MaxDepth = 32 return groups.ErrHierarchyTooDeep. The cap exists in three places — the in-memory walk's depth counter, the SQL CTE's depth <= MaxDepth guard, and the aggregate-level pre-check in ParentRepo.AddParent — so a depth-33 chain is rejected whichever path the operator takes.
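The four rules above can be illustrated with a small in-memory walk. This is a sketch under stated assumptions, not the package's resolver: IDs are plain strings standing in for tenancy.ID, the Graph type and the resolve function are hypothetical, and depth is measured as the number of Groups on the longest chain starting at a directly joined Group.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

const MaxDepth = 32

var (
	ErrMembershipCycle  = errors.New("groups: membership cycle")
	ErrHierarchyTooDeep = errors.New("groups: hierarchy too deep")
)

// CycleError carries the offending chain; errors.Is matches
// ErrMembershipCycle through Unwrap.
type CycleError struct{ Path []string }

func (e *CycleError) Error() string { return fmt.Sprintf("groups: membership cycle via %v", e.Path) }
func (e *CycleError) Unwrap() error { return ErrMembershipCycle }

// Graph is a hypothetical in-memory model of the two edge tables.
type Graph struct {
	Direct  map[string][]string // user ID -> directly joined group IDs
	Parents map[string][]string // child group ID -> parent group IDs
}

const (
	white = iota // not visited yet (zero value of the colour map)
	gray         // on the current DFS stack
	black        // fully explored
)

func resolve(g Graph, userID string) ([]string, error) {
	direct, known := g.Direct[userID]
	if !known {
		return nil, nil // rule 1: unknown user is "no groups", not an error
	}
	colour := map[string]int{}
	height := map[string]int{} // groups on the longest upward chain from a node
	var path []string
	var visit func(id string) (int, error)
	visit = func(id string) (int, error) {
		switch colour[id] {
		case gray: // rule 3: back edge onto the stack means a cycle
			return 0, &CycleError{Path: append(append([]string{}, path...), id)}
		case black:
			return height[id], nil
		}
		colour[id] = gray
		path = append(path, id)
		h := 1
		for _, parent := range g.Parents[id] {
			ph, err := visit(parent)
			if err != nil {
				return 0, err
			}
			if ph+1 > h {
				h = ph + 1
			}
		}
		path = path[:len(path)-1]
		if h > MaxDepth { // rule 4
			return 0, ErrHierarchyTooDeep
		}
		colour[id], height[id] = black, h
		return h, nil
	}
	for _, id := range direct {
		if _, err := visit(id); err != nil {
			return nil, err
		}
	}
	out := make([]string, 0, len(colour)) // map keys are already deduplicated
	for id := range colour {
		out = append(out, id)
	}
	sort.Strings(out) // rule 2: stable lexicographic order
	return out, nil
}

func main() {
	g := Graph{
		Direct:  map[string][]string{"alice": {"ops-apac"}},
		Parents: map[string][]string{"ops-apac": {"ops"}},
	}
	fmt.Println(resolve(g, "alice"))
}
```

Memoising the height of each fully explored Group keeps the depth check correct even when a Group is reached a second time along a longer path, which a plain "skip black nodes" walk with a running depth counter would miss.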

Worked example — nested groups, direct plus transitive

text
ops (manual)    <---- parent_edge ----   ops-apac (manual)
                                               |
                                          +----+----+
                                          | member  |
                                          | alice   |
                                          +---------+

Given the direct Membership (alice -> ops-apac) and the ParentEdge (ops -> ops-apac), where ops is the parent and ops-apac is the child, ResolveMembership(ctx, alice) returns the IDs of ops-apac and ops, ordered lexicographically by their tenancy.ID strings rather than by slug. If ops becomes reachable along a second path, for example because the operator adds ops-eu -> ops-apac and ops-eu is itself a child of ops, the dedup step still yields a single ops entry: the Resolver never returns duplicates.

Sync pipeline

The IdP sync path is the single source of truth for IdP-sourced Group memberships. Manual memberships are opaque to the sync loop — they are never touched by Syncer.

text
  browser            callback handler             groups.Syncer
    |                       |                           |
    |  GET /v1/auth/callback|                           |
    |---------------------->|                           |
    |                       | UserRepo.UpsertWithBinding|
    |                       |-------+                   |
    |                       |       | RecordSignIn      |
    |                       |<------+                   |
    |                       |                           |
    |                       |  Sync(ctx, userID,        |
    |                       |        bindingID,         |
    |                       |        domainID, claims)  |
    |                       |-------------------------->|
    |                       |                           | MembershipRepo.SyncForUser
    |                       |                           |-------+  (opens its own tx,
    |                       |                           |       |   snapshot of current
    |                       |                           |       |   IdP-sourced rows)
    |                       |                           |<------+
    |                       |                           |-------+  (delta = add + remove,
    |                       |                           |       |   emits
    |                       |                           |       |   GroupMemberAdded /
    |                       |                           |       |   GroupMemberRemoved)
    |                       |                           |<------+
    |                       |                           |-------+  (for every claim value
    |                       |                           |       |   with no matching
    |                       |                           |       |   source='idp' Group:
    |                       |                           |       |   GroupIdPSyncDrift)
    |                       |                           |<------+
    |                       |<--------------------------|
    |                       | 302 Set-Cookie            |
    |<----------------------|                           |

Three outbox shapes enter the log inside the same sync call:

  • identity.GroupMemberAdded / identity.GroupMemberRemoved — one per delta edge. Payload carries the group_id, the user_id, the principal_kind (always user in the sync path), and the source (always idp). The add/remove pair is diff-symmetric so replay against a fresh state reconstructs the membership exactly.
  • identity.GroupIdPSyncDrift — emitted for every claim value that the upstream asserted but no local source='idp' Group resolves. Payload carries the user_id, domain_id, binding_id, and the unmatched_claim_value. The event is not fatal — sign-in proceeds — so the operator can backfill the missing Group without kicking users out.
  • None of the above fire for manual memberships. A Membership whose parent Group has source='manual' is invisible to MembershipRepo.SyncForUser; the query filters on groups.source = 'idp' before computing the delta. This is the invariant that the manual-group-survives step of the e2e suite anchors.
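The delta and drift computation described above can be sketched as a pure function. The names (Delta, computeDelta) and the input shapes are assumptions for illustration; the actual work happens inside MembershipRepo.SyncForUser against SQL rows. Manual memberships never appear in the inputs, which is how the sketch models the groups.source = 'idp' filter.

```go
package main

import (
	"fmt"
	"sort"
)

// Delta lists group IDs to add and remove, plus claim values with no
// matching source='idp' Group (one drift event each).
type Delta struct {
	Add, Remove, Drift []string
}

// computeDelta is a hypothetical helper.
//   current: group IDs of the user's existing IdP-sourced memberships
//   claims:  values of the id_token groups claim
//   byClaim: idp_claim_value -> group ID for the domain's IdP Groups
func computeDelta(current, claims []string, byClaim map[string]string) Delta {
	have := make(map[string]bool, len(current))
	for _, id := range current {
		have[id] = true
	}
	want := map[string]bool{}
	drift := map[string]bool{}
	for _, claim := range claims {
		if id, ok := byClaim[claim]; ok {
			want[id] = true
		} else {
			drift[claim] = true // asserted upstream, unknown locally
		}
	}
	var d Delta
	for id := range want {
		if !have[id] {
			d.Add = append(d.Add, id)
		}
	}
	for id := range have {
		if !want[id] {
			d.Remove = append(d.Remove, id)
		}
	}
	for claim := range drift {
		d.Drift = append(d.Drift, claim)
	}
	// Sort so the resulting event order is deterministic.
	sort.Strings(d.Add)
	sort.Strings(d.Remove)
	sort.Strings(d.Drift)
	return d
}

func main() {
	d := computeDelta(
		[]string{"g-ops"},
		[]string{"eng", "unknown", "eng"},
		map[string]string{"eng": "g-eng", "ops": "g-ops"},
	)
	fmt.Printf("%+v\n", d)
}
```

In this example the user gains g-eng, loses g-ops because the claim no longer asserts it, and the unmatched value "unknown" becomes a single drift entry even if the claim repeats it.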

Transactional rollback

groups.Syncer.Sync is tx-atomic within its own call: the default syncer delegates to MembershipRepo.SyncForUser which opens a single pgx.Tx, computes the membership delta, applies every add / remove / drift outbox append, and commits or rolls back that tx as a unit. A failure anywhere inside the Sync call (e.g. the injected outbox trigger in the integration test) rolls back every membership and drift row the Sync call would have written.

The sign-in handler in internal/transport/http/v1/auth/callback.go runs three separate transactions per sign-in:

  1. UserRepo.UpsertWithBinding (its own tx, emits UserProvisioned / UserBindingUpdated outbox rows).
  2. UserRepo.RecordSignIn (its own tx, emits UserSignedIn).
  3. groups.Syncer.Sync (its own tx, the one described above).

DECISION: the handler deliberately does NOT wrap these three calls in a single outer tx today (see the DECISION block in internal/transport/http/v1/auth/callback.go and internal/identity/groups/syncer.go). Lifting transaction management into a shared unit-of-work facade is tracked as a follow-up and is out of scope for this story. A failure inside Sync therefore leaves the User upsert and the sign-in record committed; only the membership / drift delta is rolled back, and the handler aborts the cookie / redirect response so the caller sees a 5xx. TestGroupsSync_TransactionalRollback in tests/integration/identity_groups_sync_test.go covers the inner-tx rollback contract — it asserts that no membership rows and no drift rows land when the Sync tx fails.
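The consequence of three separate transactions can be shown with an in-memory model. Everything here is a stand-in: store, inTx, and signIn are hypothetical, rows are plain strings, and no database is involved. The point is only that a failure in the third step discards that step's staged rows while the first two steps stay committed.

```go
package main

import (
	"errors"
	"fmt"
)

// store is an in-memory stand-in for the database.
type store struct{ rows []string }

// inTx stages the rows fn writes and commits them only if fn succeeds.
func (s *store) inTx(fn func(stage func(row string)) error) error {
	var staged []string
	if err := fn(func(row string) { staged = append(staged, row) }); err != nil {
		return err // rollback: staged rows are dropped
	}
	s.rows = append(s.rows, staged...) // commit as a unit
	return nil
}

var errOutbox = errors.New("injected outbox failure")

// signIn mirrors the three per-repo transactions of the callback handler.
func signIn(s *store, failSync bool) error {
	steps := []func(stage func(string)) error{
		func(stage func(string)) error { // UserRepo.UpsertWithBinding
			stage("user upserted")
			stage("outbox: UserProvisioned")
			return nil
		},
		func(stage func(string)) error { // UserRepo.RecordSignIn
			stage("sign-in recorded")
			stage("outbox: UserSignedIn")
			return nil
		},
		func(stage func(string)) error { // groups.Syncer.Sync
			stage("membership added")
			if failSync {
				return errOutbox
			}
			stage("outbox: GroupMemberAdded")
			return nil
		},
	}
	for _, step := range steps {
		if err := s.inTx(step); err != nil {
			return err // the handler would answer with a 5xx here
		}
	}
	return nil
}

func main() {
	s := &store{}
	err := signIn(s, true)
	fmt.Println(err, len(s.rows), s.rows)
}
```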

Drift triage runbook

A GroupIdPSyncDrift event means the upstream IdP asserted a group claim value that no local source='idp' Group resolves. Sign-in itself is unaffected; the User is provisioned and the session cookie lands. But the resulting authorisation state is subtly wrong — the User is missing a Group that operator intent says they should have. The runbook below turns the event stream into an action queue.

1. Surface the drift

Query the outbox for open drift events. The payload carries the binding and the unmatched claim so an operator can disambiguate between two IdPs that happen to emit the same label:

sql
SELECT id, occurred_at,
       payload ->> 'user_id'                AS user_id,
       payload ->> 'binding_id'             AS binding_id,
       payload ->> 'unmatched_claim_value'  AS claim
FROM   plexsphere.outbox_events
WHERE  event_type = 'identity.GroupIdPSyncDrift'
ORDER  BY occurred_at DESC
LIMIT  50;

Dashboards should also graph the drift-event rate alongside the sign-in rate. A jump in GroupIdPSyncDrift without a matching jump in sign-ins is an upstream-change signal — the IdP has started asserting a new claim value the tenant has not yet reflected locally.

2. Decide disposition

For each distinct (binding_id, unmatched_claim_value) pair:

  • Expected new Group — operator creates a source='idp' Group with idp_binding_id = binding_id and idp_claim_value = claim via POST /v1/admin/groups. The next sign-in attaches the User automatically; the drift event then stops recurring for that pair.
  • Typo / retired — operator asks the IdP admin to remove the claim value from the directory. Until that propagates, the drift events accumulate harmlessly; they do NOT block sign-in.
  • Intentional denial — operator does nothing. The User signs in without the Group, authorisation denies whatever the Group would have granted, and the drift row stays as evidence for the audit trail.
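Working "per distinct (binding_id, unmatched_claim_value) pair" means collapsing the raw event stream first. A minimal sketch of that grouping follows, assuming the payloads have already been read from the outbox as JSON bytes; the type and function names are hypothetical, while the JSON field names are the ones listed for the drift payload earlier in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// driftPayload mirrors the GroupIdPSyncDrift payload fields.
type driftPayload struct {
	UserID              string `json:"user_id"`
	DomainID            string `json:"domain_id"`
	BindingID           string `json:"binding_id"`
	UnmatchedClaimValue string `json:"unmatched_claim_value"`
}

type driftKey struct{ BindingID, Claim string }

// driftItem is one entry of the triage queue.
type driftItem struct {
	Key   driftKey
	Users []string // distinct affected users, sorted
}

// groupDrift collapses raw payloads into one item per (binding, claim).
func groupDrift(payloads [][]byte) ([]driftItem, error) {
	users := map[driftKey]map[string]bool{}
	for _, raw := range payloads {
		var p driftPayload
		if err := json.Unmarshal(raw, &p); err != nil {
			return nil, fmt.Errorf("decode drift payload: %w", err)
		}
		k := driftKey{BindingID: p.BindingID, Claim: p.UnmatchedClaimValue}
		if users[k] == nil {
			users[k] = map[string]bool{}
		}
		users[k][p.UserID] = true
	}
	items := make([]driftItem, 0, len(users))
	for k, set := range users {
		item := driftItem{Key: k}
		for u := range set {
			item.Users = append(item.Users, u)
		}
		sort.Strings(item.Users)
		items = append(items, item)
	}
	sort.Slice(items, func(i, j int) bool {
		if items[i].Key.BindingID != items[j].Key.BindingID {
			return items[i].Key.BindingID < items[j].Key.BindingID
		}
		return items[i].Key.Claim < items[j].Key.Claim
	})
	return items, nil
}

func main() {
	items, err := groupDrift([][]byte{
		[]byte(`{"user_id":"u1","domain_id":"d1","binding_id":"b1","unmatched_claim_value":"eng"}`),
		[]byte(`{"user_id":"u2","domain_id":"d1","binding_id":"b1","unmatched_claim_value":"eng"}`),
		[]byte(`{"user_id":"u1","domain_id":"d1","binding_id":"b2","unmatched_claim_value":"eng"}`),
	})
	fmt.Println(items, err)
}
```

Keying on the binding as well as the claim value keeps two IdPs that emit the same label in separate queue entries, which matches the disambiguation the query in step 1 is designed for.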

3. Silence the backlog

Drift events are append-only; they do not self-expire. Operators should periodically relay the outbox and purge consumed rows via the standard outbox-relay tooling. The drift events survive the purge for their full audit retention window because they are classified under the identity aggregate (aggregate_type = 'identity.Group').

ReBAC consumer contract

The authorisation context consumes Group membership via SpiceDB. The authz-side of this contract — the schema that references Groups, the zedtoken consistency flow, the caveat context the Authorizer feeds, and the audit shape every decision produces — is documented in ./rebac.md. This section freezes the contract the Groups context exposes so the authz consumer can design against a stable surface.

  • Resolution. The authorisation layer calls groups.Resolver.ResolveMembership(ctx, userID) and receives the transitive Group IDs in lexicographic order. The call is read-only and commutes with concurrent writes — the underlying CTE snapshots the graph inside the read's tx.
  • Identity. Group IDs are tenancy.ID values (UUIDv7). They are stable across renames; DisplayName changes emit GroupRenamed but do not re-issue the ID. SpiceDB schemas should therefore reference Groups by UUID, not by slug.
  • Caching. The Resolver does not cache — the caller owns the cache lifetime. Invalidation can be driven off the outbox: GroupMemberAdded, GroupMemberRemoved, GroupParentAdded, GroupParentRemoved, and GroupDeleted are the only event types that can flip a resolution; GroupCreated and GroupRenamed cannot change an existing result.
  • Manual vs IdP. The consumer does not distinguish — a User is either a member or not. The Source discriminator is an internal provenance record. This keeps the ReBAC schema source-agnostic: an "owner" relation does not care whether the "ops" Group was administered manually or mirrored from Okta.
  • Drift. GroupIdPSyncDrift is not a membership signal — it is an operator alert. The authorisation layer MUST NOT key any relation off it.
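For the Caching bullet, a consumer-side cache might be invalidated as in the sketch below. The cache type and its methods are hypothetical, the flush-everything strategy is deliberately coarse, and the "identity." prefix on the parent and delete event names is an assumption extrapolated from identity.GroupMemberAdded.

```go
package main

import "fmt"

// invalidating lists the event types that can flip a resolution.
// Assumption: all of them carry the "identity." prefix.
var invalidating = map[string]bool{
	"identity.GroupMemberAdded":   true,
	"identity.GroupMemberRemoved": true,
	"identity.GroupParentAdded":   true,
	"identity.GroupParentRemoved": true,
	"identity.GroupDeleted":       true,
}

func invalidatesResolution(eventType string) bool {
	return invalidating[eventType]
}

// membershipCache is a hypothetical caller-owned cache of resolved
// group IDs per user.
type membershipCache struct {
	byUser map[string][]string
}

func newMembershipCache() *membershipCache {
	return &membershipCache{byUser: map[string][]string{}}
}

func (c *membershipCache) Put(userID string, groupIDs []string) {
	c.byUser[userID] = groupIDs
}

func (c *membershipCache) Get(userID string) ([]string, bool) {
	ids, ok := c.byUser[userID]
	return ids, ok
}

// OnEvent drops every entry when a relevant event arrives. A parent
// edge change can affect many users, so the sketch does not try to
// work out which entries are still valid.
func (c *membershipCache) OnEvent(eventType string) {
	if invalidatesResolution(eventType) {
		c.byUser = map[string][]string{}
	}
}

func main() {
	c := newMembershipCache()
	c.Put("alice", []string{"g-ops", "g-ops-apac"})
	c.OnEvent("identity.GroupRenamed") // cannot change a resolution
	_, stillCached := c.Get("alice")
	c.OnEvent("identity.GroupParentRemoved")
	_, cachedAfter := c.Get("alice")
	fmt.Println(stillCached, cachedAfter)
}
```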

Invariant-to-test matrix

Every invariant the groups context enforces is backed by at least one automated test. Every requirement in the groups plan appears at least once below. When a row lists multiple enforcement layers, the later layers are belt-and-braces — the earlier one is authoritative.

| Invariant (REQ-id) | Enforced at | Test |
| --- | --- | --- |
| Group: DomainID non-zero, Slug kebab-case ^[a-z0-9]([a-z0-9-]{0,62}[a-z0-9])?$, DisplayName non-empty, Source ∈ {manual, idp} with matching binding / claim invariants | groups.NewGroup + groups_source_idp_requires_binding CHECK + CHECK slug ~ '...' on plexsphere.groups | internal/identity/groups/group_test.go + tests/integration/identity_groups_source_sql_check_test.go |
| Slug unique per Domain | UNIQUE (domain_id, slug) on plexsphere.groups (groups_domain_slug_uq) | internal/identity/groups/repo/group_repo_test.go + tests/integration/identity_groups_crud_test.go |
| Three tables + CHECK constraints + XOR + no-self-parent + source-matches-parent trigger | 0004_groups.sql schema | tests/integration/identity_groups_source_sql_check_test.go (pg_constraint probe + 23505/23514 trips) |
| ParentEdge: self-parent forbidden, cycles rejected, depth capped at MaxDepth=32 | groups.NewParentEdge + parent_repo.AddParent cycle pre-check CTE + resolver.MaxDepth + group_parents_no_self_parent CHECK | internal/identity/groups/parent_test.go + internal/identity/groups/resolver_test.go + tests/integration/identity_groups_resolver_test.go |
| Membership XOR on principal columns; Source must match parent Group Source | groups.NewMembership + group_memberships_principal_xor CHECK + group_memberships_source_matches_parent trigger | internal/identity/groups/membership_test.go + tests/integration/identity_groups_source_sql_check_test.go |
| Manual Membership survives every IdP sync | MembershipRepo.SyncForUser filters on groups.source = 'idp' | tests/integration/identity_groups_sync_test.go + tests/e2e/identity-groups/chainsaw-test.yaml::manual-group-survives |
| Syncer delta: add missing, remove stale, emit drift for unmatched claims | groups.Syncer + MembershipRepo.SyncForUser | internal/identity/groups/syncer_test.go + tests/integration/identity_groups_sync_test.go |
| Syncer.Sync is tx-atomic within its own call | MembershipRepo.SyncForUser opens a single pgx.Tx for the full delta + outbox appends; a failure inside that tx rolls back every membership and drift row. DECISION: the sign-in handler runs UpsertWithBinding, RecordSignIn, and Sync in three separate per-repo txs today; lifting them into a shared unit-of-work is tracked as a follow-up. | tests/integration/identity_groups_sync_test.go::TestGroupsSync_TransactionalRollback |
| Every aggregate mutation appends exactly one matching outbox event in the same transaction | GroupRepo.{Upsert,UpdateDisplayName,Delete}, MembershipRepo.{AddMember,RemoveMember,SyncForUser}, ParentRepo.{AddParent,RemoveParent} | tests/integration/identity_groups_crud_test.go + tests/integration/identity_groups_sync_test.go |
| Admin HTTP surface: 201 happy path, 400 invalid idp fields, 404 unknown, 409 slug conflict, 401 no principal, 409 source-mismatch on manual add to idp Group | internal/transport/http/v1/admin/groups.go + group_members.go | internal/transport/http/v1/admin/groups_test.go + group_members_test.go + tests/integration/identity_groups_api_test.go |
| OpenAPI spec carries X-Plexsphere-API-Version and paginated GroupListResponse | api/openapi/plexsphere-v1.yaml | make openapi-lint + tests/integration/identity_groups_api_test.go (cursor round-trip) |
| Sign-in flow invokes Syncer once per sign-in; Syncer's inner tx rolls back on failure and the handler aborts the response | internal/transport/http/v1/auth/callback.go invokes groups.Syncer.Sync after the user upsert + sign-in record; Sync opens its own tx. DECISION: callback.go does not share a tx across the three repo calls today (follow-up). | internal/transport/http/v1/auth/callback_test.go + tests/integration/identity_groups_sync_test.go |
| End-to-end: Dex-driven sign-in twice with different groups claims, assert membership state + outbox ordering + drift event + manual-group survival | tests/e2e/identity-groups/chainsaw-test.yaml | kind-loaded plexsphere:e2e-identity-groups image |
| CI wiring: chainsaw suite tests/e2e/identity-groups ships on the e2e job; tests/integration/identity_groups_*_test.go ships on the integration job | .github/workflows/ci.yaml + kind-load.sh | tests/workspace/ci_workflow_test.go |
| Bounded-context reference doc carries feature: PX-0010 front-matter, required headings, and cross-links | docs/contexts/identity/groups.md + docs/contexts/identity/idp.md + docs/contexts/identity/tenancy.md cross-links | tests/docs/groups_doc_test.go |
| All invariant errors carry the (PX-0010, REQ-xxx) suffix | groups.errInvariant, groups/repo.errors, groups/events constructors | Every *_test.go in internal/identity/groups/** asserts the suffix |
| Paginated admin list uses stable cursor (created_at, id) and cursor round-trip preserves ordering | GroupRepo.List + sqlc ListGroupsByDomain query | internal/identity/groups/repo/group_repo_test.go + tests/integration/identity_groups_api_test.go |
| ReBAC consumer contract: Resolver is read-only, returns sorted dedup'd IDs, unknown userID yields nil, MaxDepth=32 enforced | groups.Resolver (in-memory + CTE-backed) | internal/identity/groups/resolver_test.go + tests/integration/identity_groups_resolver_test.go |
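The Slug rule in the first row of the matrix can be exercised directly with Go's regexp package. The pattern is copied from that row; the validSlug helper is a hypothetical name, not the check inside groups.NewGroup.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// slugRE is the kebab-case pattern from the matrix: 1..64 characters,
// lowercase alphanumerics and hyphens, no leading or trailing hyphen.
var slugRE = regexp.MustCompile(`^[a-z0-9]([a-z0-9-]{0,62}[a-z0-9])?$`)

// validSlug is a hypothetical helper for illustration.
func validSlug(s string) bool {
	return slugRE.MatchString(s)
}

func main() {
	candidates := []string{
		"ops-apac",
		"a",
		"-ops",
		"ops-",
		"Ops",
		strings.Repeat("a", 64),
		strings.Repeat("a", 65),
	}
	for _, c := range candidates {
		fmt.Printf("%d chars, valid=%v\n", len(c), validSlug(c))
	}
}
```

The 64-character ceiling falls out of the pattern itself: one leading character, at most 62 middle characters, and one trailing character.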

Cross-references

  • ./idp.md — sibling bounded-context reference for the per-Domain IdP binding, User, UserSession, ServiceIdentity, and APIToken aggregates that live under internal/identity/{idp,users,services,tokens,authn}.
  • ./tenancy.md — sibling bounded-context reference for the Domain, Project, Resource, Node aggregates under internal/identity/tenancy.
  • ./rebac.md — sibling bounded-context reference for the SpiceDB-backed authorisation layer that consumes Group membership (schema walk-through, zedtoken consistency flow, caveat-context table, audit contract, auth posture) under internal/authz.
  • ../../contributing/layout.md — bounded-context map placing internal/identity inside the repo.
  • ../../reference/platform/db.md — pgx pool, goose migrations, sqlc workflow, and the Groups schema row the groups context writes through (see Groups schema).
  • ../../how-to/identity/manage-groups.md — operator how-to: create a manual Group, create an IdP-synced Group, read memberships, triage drift events.
  • ../../contributing/openapi.md — the OpenAPI spec that hosts /v1/admin/groups/**.
  • internal/identity/groups/doc.go — package doc for the domain layer.
  • internal/platform/db/migrations/0004_groups.sql — canonical schema for the three tables diagrammed above.
  • tests/e2e/identity-groups/chainsaw-test.yaml — end-to-end suite that seeds a Domain, registers an IdPBinding, seeds manual + IdP Groups, drives two sign-ins, and asserts membership state plus the manual-group-survives invariant.