Bridge Orchestrator — relay, user access, public ingress, site-to-site

This document is the authoritative bounded-context reference for the Bridge Orchestrator that ships under ../../../internal/bridge/. It covers the ubiquitous language, the four aggregates and their natural keys, the schema reference for the four Postgres tables, the value objects, the bridge-kind precondition, the closed seven-event outbox set, the verb-style audit relation namespace, the ReBAC ownership matrix, the opaque-SecretRef contract, and the downstream stories this context deliberately leaves to follow-ups.

The Bridge Orchestrator is the authoring and storage surface for the orchestrator configuration a Resource of kind bridge carries — and only that. It owns four independent configuration aggregates: the relay daemon configuration, the user-access providers end users dial into the mesh through, the SNI-routed public ingress rules, and the site-to-site tunnels to remote endpoints. It stores operator intent; it does not run a relay daemon, materialise a secret, build a mesh effective-config, or fan a change out to a Node. The follow-ups that consume this configuration are named below where they become relevant.

This is NOT a secret store, a mesh effective-config builder, or a relay control loop. Every auth secret the context handles is an opaque reference, never material (see Opaque SecretRef); resolving that reference to live bytes is a deferred story. The per-peer fallback-relay endpoint the mesh effective-config builder sources from a bridge's relay listen port is a downstream consumer, not part of this context. A reader looking for "how does a peer actually dial the relay" belongs in the mesh effective-config story, not here.

For the bounded-context siblings and upstream references see:

Ubiquitous language

The vocabulary travels verbatim across the Go code, the SQL schema, the OpenAPI surface, the outbox event payloads, and the audit trail. Internal code never paraphrases the terms; documentation and error prose adopt the exact spelling below. The package godoc at internal/bridge/doc.go is the ubiquitous-language pin; this document is its prose expansion.

| Term | Definition | Code anchor |
| --- | --- | --- |
| bridge Resource | A tenancy Resource whose kind is bridge. It is the aggregate scope every bridge-orchestrator aggregate is keyed within. The orchestrator never creates a Resource; it attaches configuration to an existing one. | ResourceID in internal/bridge/types.go |
| BridgeRelay | The singleton relay configuration of a bridge Resource: an enable flag plus the listen port the relay daemon binds. At most one exists per bridge, so the bridge ResourceID IS the natural key; there is no surrogate id. | internal/bridge/relay/relay.go |
| UserAccessProvider | A named WireGuard-family mesh-ingress provider (tailscale / netbird / wireguard) the bridge stands up so end users dial into the mesh. Many per Resource; the (resource_id, slug) pair is the natural key. | internal/bridge/useraccess/useraccess.go |
| PublicIngressRule | An SNI-routed public ingress termination that forwards to a target Node + port inside the mesh. Many per Resource; both (resource_id, slug) AND (resource_id, sni_host) are natural keys. | internal/bridge/ingress/ingress.go |
| SiteToSiteTunnel | A directional tunnel (wireguard / ipsec / openvpn) to a remote endpoint, carrying a non-empty allowed-subnets list and a routing policy. Many per Resource; the (resource_id, slug) pair is the natural key. | internal/bridge/sitetosite/sitetosite.go |
| ResourceID / NodeID | Opaque [16]byte external references to plexsphere.resources(id) and plexsphere.nodes(id). The composition root maps them from the tenancy identifiers so the bridge context stays free of an internal/identity import. | ResourceID, NodeID in internal/bridge/types.go |
| ID | The internally-minted UUIDv7 surrogate identifier the three many-per-Resource aggregates and the outbox event ids carry. Time-ordered so an event can carry the same id before the row is persisted. | ID in internal/bridge/types.go |
| Slug | A lowercase kebab-case handle (≤ 63 chars) naming an aggregate within its Resource. A value object so an invariant violation surfaces at parse time, not inside a Build call. | Slug in internal/bridge/types.go |
| SecretRef | The opaque reference to a provider's or tunnel's auth material, of the form secret:<domain>/<project>/<name>(:<version>)?. It carries the reference only, never the material. | SecretRef in internal/bridge/types.go |

Why four aggregates, not one

The Bridge Orchestrator is modelled as four separate aggregates rather than one mega-aggregate rooted on the bridge Resource. The DECISION block above the package declaration in internal/bridge/doc.go records the choice; the reasoning is load-bearing enough to restate here.

The alternative considered was a single Bridge aggregate holding the relay configuration, the providers, the ingress rules, and the tunnels under one transactional root. It was rejected. The four carry four independent invariants:

  • BridgeRelay — a one-per-bridge singleton whose listen port sits inside the inclusive 1..65535 range.
  • UserAccessProvider — per-Resource slug uniqueness and a strictly positive peer ceiling.
  • PublicIngressRule — per-Resource slug and SNI-host uniqueness, plus an in-Domain target Node.
  • SiteToSiteTunnel — a non-empty allowed-subnets list.

Burying them under one transactional gate would force every provider edit to lock the relay row and every tunnel edit to re-validate every ingress rule, and it would couple four lifecycles operators manage independently. The four-aggregate split keeps each invariant enforced inside its own boundary and lets the orchestrator modify one aggregate per transaction, coordinating across them via domain events. Each aggregate is scoped to its bridge Resource through the opaque ResourceID; the three many-per-Resource aggregates additionally carry an internally-minted UUIDv7 surrogate ID, while BridgeRelay is keyed on the bridge ResourceID alone because there is no second relay row a surrogate id could distinguish.

Schema reference

The Bridge Orchestrator attaches to four tables in the plexsphere schema. The migration that creates them is internal/platform/db/migrations/0042_bridge_orchestrator.sql. Every CHECK constraint, UNIQUE index, and ON DELETE behaviour below is sourced from that file.

| Table | Aggregate | Notable constraints |
| --- | --- | --- |
| plexsphere.bridge_relay | BridgeRelay | resource_id is the PRIMARY KEY (one relay per bridge, the singleton invariant is a structural fact); FK resource_id REFERENCES plexsphere.resources(id) ON DELETE RESTRICT; CHECK listen_port BETWEEN 1 AND 65535. |
| plexsphere.bridge_user_access_provider | UserAccessProvider | App-minted uuid PRIMARY KEY (no DB DEFAULT); FK resource_id … ON DELETE RESTRICT; CHECK kind IN ('tailscale', 'netbird', 'wireguard'); CHECK listen_port BETWEEN 1 AND 65535; CHECK max_peers > 0; routing_policy jsonb NOT NULL; UNIQUE index bridge_user_access_provider_resource_slug_uq on (resource_id, slug). |
| plexsphere.bridge_public_ingress_rule | PublicIngressRule | App-minted uuid PRIMARY KEY; FK resource_id … ON DELETE RESTRICT; FK target_node_id REFERENCES plexsphere.nodes(id) ON DELETE RESTRICT; CHECK target_port BETWEEN 1 AND 65535; acme_account_ref is NULLable; UNIQUE indexes …_resource_slug_uq on (resource_id, slug) AND …_resource_sni_uq on (resource_id, sni_host). |
| plexsphere.bridge_site_to_site_tunnel | SiteToSiteTunnel | App-minted uuid PRIMARY KEY; FK resource_id … ON DELETE RESTRICT; CHECK kind IN ('wireguard', 'ipsec', 'openvpn'); CHECK remote_port BETWEEN 1 AND 65535; allowed_subnets cidr[] with CHECK cardinality(allowed_subnets) >= 1; CHECK routing_policy IN ('bidirectional', 'ingress_only', 'egress_only'); UNIQUE index …_resource_slug_uq on (resource_id, slug). |

Why ON DELETE RESTRICT on every FK

All four tables FK resource_id to plexsphere.resources(id) with ON DELETE RESTRICT, never CASCADE, and the ingress rule's target_node_id likewise RESTRICTs against plexsphere.nodes(id). A bridge Resource carrying live orchestrator configuration cannot be deleted out from under that configuration: the application-layer teardown flow tears down the relay, providers, ingress rules, and tunnels first, and only then deletes the Resource. CASCADE would silently drop the operator-authored configuration alongside the Resource, and that configuration is the source of truth that cannot be reconstructed from anywhere else. The RESTRICT FK is the smallest invariant that pins the teardown ordering; the DECISION block in the migration records the rejected alternatives.

Why the surrogate ids are app-minted

The three many-per-Resource tables carry an app-minted uuid primary key with no database DEFAULT. The application mints a UUIDv7 so the row key is time-ordered and the same id can be carried on the domain event the orchestrator emits before the row is persisted. A DEFAULT gen_random_uuid() would mint a v4 id the application never sees, splitting the aggregate id from its outbox event. BridgeRelay needs no surrogate id at all — keying on resource_id makes the one-relay-per-bridge invariant a structural fact the SQL layer cannot violate.
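The minting step can be illustrated with a minimal, standard-library-only sketch. It assumes the RFC 9562 UUIDv7 layout (48-bit millisecond timestamp, version nibble 7, variant bits 10); the names newIDAt, NewID, providerConfigured, and providerRow are hypothetical stand-ins, not the types in internal/bridge/types.go.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// ID is a hypothetical stand-in for the surrogate ID value object.
type ID [16]byte

// newIDAt mints a UUIDv7-shaped id for the given Unix-millisecond timestamp.
func newIDAt(unixMilli uint64) (ID, error) {
	var id ID
	// Fill everything after the timestamp with random bits first.
	if _, err := rand.Read(id[6:]); err != nil {
		return ID{}, err
	}
	// Bytes 0..5 carry the 48-bit big-endian timestamp, so ids sort by time.
	for i := 0; i < 6; i++ {
		id[i] = byte(unixMilli >> (8 * (5 - i)))
	}
	id[6] = (id[6] & 0x0f) | 0x70 // version 7
	id[8] = (id[8] & 0x3f) | 0x80 // RFC 9562 variant
	return id, nil
}

// NewID mints an id for the current time.
func NewID() (ID, error) { return newIDAt(uint64(time.Now().UnixMilli())) }

// Version reports the UUID version nibble.
func (id ID) Version() int { return int(id[6] >> 4) }

// String renders the canonical 8-4-4-4-12 textual form.
func (id ID) String() string {
	h := hex.EncodeToString(id[:])
	return h[0:8] + "-" + h[8:12] + "-" + h[12:16] + "-" + h[16:20] + "-" + h[20:32]
}

// Hypothetical event payload and row, both keyed by the same app-minted id.
type providerConfigured struct{ ProviderID ID }
type providerRow struct{ ID ID }

func main() {
	id, err := NewID()
	if err != nil {
		panic(err)
	}
	// The id exists before any INSERT, so the event and the row can share it.
	event := providerConfigured{ProviderID: id}
	row := providerRow{ID: id}
	fmt.Println(id, event.ProviderID == row.ID)
}
```

Because the id is minted in the application, the event payload and the row agree on the key without a round-trip to the database, which a DEFAULT-generated id would not allow.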

Down-refusal — SQLSTATE 0A000

The migration's downgrade block refuses the rollback and raises SQLSTATE 0A000 (feature_not_supported). The four tables hold the operator-authored source of truth for how each bridge routes mesh ingress, public ingress, and site-to-site traffic; that configuration is not reconstructible from anywhere else, so dropping the tables would silently discard live operator intent. Operators performing a legitimate wipe-and-reinstall drop the Postgres database itself; the downgrade path is not a destructive teardown tool. The stance mirrors the prior config- and audit-bearing migrations (0006_signing.sql, 0011_audit_log.sql, 0012_peers.sql, 0029_peer_endpoint.sql, 0030_peer_relay_assignment.sql).

Aggregates and invariants

Each aggregate owns every shape invariant locally so the transport and service layers operate on already-shaped values. A constructor (New…) defaults timestamps and, for the three many-per-Resource aggregates, auto-assigns a UUIDv7 ID; a hydrator (Hydrate…) rejects a zero ID, CreatedAt, or UpdatedAt so a corrupt row never round-trips silently.

BridgeRelay

The singleton relay configuration. Identity IS the bridge ResourceID. Invariants enforced at the boundary (relay.go):

  • ResourceID is non-zero.
  • ListenPort is within the inclusive 1..65535 range, wrapping ErrRelayPortOutOfRange.
  • WithConfig bumps UpdatedAt unconditionally; the idempotent short-circuit lives at the service layer via SameConfigAs, so a configure command whose values already match the stored aggregate is a no-op that neither bumps UpdatedAt nor re-emits an event.

The configured listen port is the seam a downstream story uses: it lets a peer's fallback relay path source a per-bridge port instead of a package-level default constant.
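A minimal sketch of the BridgeRelay invariants and the service-layer short-circuit follows. The method names SameConfigAs and WithConfig and the sentinel ErrRelayPortOutOfRange come from the text above; every signature, the field set, the configure helper, errZeroResourceID, and the fixture t0 are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var ErrRelayPortOutOfRange = errors.New("bridge: relay listen port out of range")

// errZeroResourceID is a hypothetical sentinel for the non-zero ResourceID check.
var errZeroResourceID = errors.New("bridge: zero resource id")

// t0 is a fixed demo timestamp.
var t0 = time.Unix(1700000000, 0)

type ResourceID [16]byte

type BridgeRelay struct {
	ResourceID ResourceID
	Enabled    bool
	ListenPort int
	UpdatedAt  time.Time
}

func validatePort(port int) error {
	if port < 1 || port > 65535 {
		return fmt.Errorf("listen port %d: %w", port, ErrRelayPortOutOfRange)
	}
	return nil
}

func NewBridgeRelay(rid ResourceID, enabled bool, port int, now time.Time) (BridgeRelay, error) {
	if rid == (ResourceID{}) {
		return BridgeRelay{}, errZeroResourceID
	}
	if err := validatePort(port); err != nil {
		return BridgeRelay{}, err
	}
	return BridgeRelay{ResourceID: rid, Enabled: enabled, ListenPort: port, UpdatedAt: now}, nil
}

// SameConfigAs reports whether the requested values match the stored ones.
func (r BridgeRelay) SameConfigAs(enabled bool, port int) bool {
	return r.Enabled == enabled && r.ListenPort == port
}

// WithConfig re-validates and bumps UpdatedAt unconditionally.
func (r BridgeRelay) WithConfig(enabled bool, port int, now time.Time) (BridgeRelay, error) {
	if err := validatePort(port); err != nil {
		return BridgeRelay{}, err
	}
	r.Enabled, r.ListenPort, r.UpdatedAt = enabled, port, now
	return r, nil
}

// configure models the service layer: a matching command is a no-op
// (changed == false), so nothing is persisted and no event is emitted.
func configure(stored BridgeRelay, enabled bool, port int, now time.Time) (BridgeRelay, bool, error) {
	if stored.SameConfigAs(enabled, port) {
		return stored, false, nil
	}
	next, err := stored.WithConfig(enabled, port, now)
	if err != nil {
		return stored, false, err
	}
	return next, true, nil
}

func main() {
	relay, err := NewBridgeRelay(ResourceID{1}, true, 3478, t0)
	if err != nil {
		panic(err)
	}
	_, changed, _ := configure(relay, true, 3478, t0.Add(time.Minute))
	fmt.Println("identical config changed:", changed) // identical config changed: false
	_, err = NewBridgeRelay(ResourceID{1}, true, 70000, t0)
	fmt.Println(errors.Is(err, ErrRelayPortOutOfRange)) // true
}
```

Keeping the comparison in the service layer leaves WithConfig free of conditional behaviour, which matches the split described above.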

UserAccessProvider

A named WireGuard-family mesh-ingress provider. Invariants (useraccess.go):

  • ResourceID non-zero; Slug kebab-case; Kind one of the closed set {tailscale, netbird, wireguard}.
  • InterfaceName non-empty after trimming.
  • ListenPort within 1..65535; MaxPeers strictly positive.
  • AuthSecretRef is a well-formed opaque SecretRef — never dereferenced (see Opaque SecretRef).
  • RoutingPolicy is a non-empty, valid-JSON document the daemon applies to admitted peers; the aggregate stores a defensive copy.

Update re-validates the mutable configuration (Kind, InterfaceName, ListenPort, MaxPeers, AuthSecretRef, RoutingPolicy) and bumps UpdatedAt; the identity fields (ID, ResourceID, Slug) are taken from the receiver, so a caller cannot reslug or rekey a provider through Update.
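Two of the UserAccessProvider checks, the closed Kind set and the defensively copied RoutingPolicy, can be sketched as follows. The helper names validKind and copyRoutingPolicy and the sentinel errRoutingPolicyInvalid are hypothetical; the real aggregate may structure these checks differently.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// errRoutingPolicyInvalid is a hypothetical sentinel for this sketch.
var errRoutingPolicyInvalid = errors.New("bridge: routing policy must be a non-empty JSON document")

// providerKinds is the closed set named in the invariants above.
var providerKinds = map[string]bool{"tailscale": true, "netbird": true, "wireguard": true}

func validKind(kind string) bool { return providerKinds[kind] }

// copyRoutingPolicy validates the document and returns a private copy, so a
// caller mutating its own slice afterwards cannot change the stored value.
func copyRoutingPolicy(raw []byte) ([]byte, error) {
	if len(raw) == 0 || !json.Valid(raw) {
		return nil, errRoutingPolicyInvalid
	}
	return append([]byte(nil), raw...), nil
}

func main() {
	input := []byte(`{"allow":["10.0.0.0/24"]}`)
	stored, err := copyRoutingPolicy(input)
	if err != nil {
		panic(err)
	}
	input[2] = 'X' // mutate the caller's slice after the copy
	fmt.Println(string(stored))                            // {"allow":["10.0.0.0/24"]}
	fmt.Println(validKind("wireguard"), validKind("ipsec")) // true false
}
```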

PublicIngressRule

An SNI-routed public ingress termination. Invariants (ingress.go):

  • ResourceID non-zero; Slug kebab-case.
  • SNIHost non-empty after trimming and carrying no internal whitespace.
  • TargetNodeID non-zero; TargetPort within 1..65535.
  • ACMEAccountRef is a *string: nil means the rule terminates TLS with operator-supplied certificates; a non-nil pointer must point at a non-empty (after trimming) reference.

Two invariants are deliberately not enforced inside the aggregate: slug/SNI-host uniqueness within a bridge is a cross-row invariant the UNIQUE indexes enforce (surfaced through the repository as ErrSlugConflict), and the in-Domain target-Node check is a cross-aggregate read the ingress service performs through the ResourceReader and NodeReader ports before admitting the rule (surfaced as ErrTargetNodeNotInDomain). Pushing either into the aggregate would couple it to persistence and to a sibling context's read model.
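The aggregate-local PublicIngressRule checks on SNIHost and ACMEAccountRef might look like the sketch below. The function names and both sentinels are hypothetical, and returning the trimmed host is an assumption; the text above only states that the host must be non-empty after trimming and free of internal whitespace.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"unicode"
)

// Hypothetical sentinels for this sketch.
var (
	errSNIHostInvalid = errors.New("bridge: sni host empty or contains whitespace")
	errACMERefEmpty   = errors.New("bridge: acme account ref must be non-empty when set")
)

// normalizeSNIHost trims the host and rejects empty values and any
// whitespace remaining inside the trimmed value.
func normalizeSNIHost(host string) (string, error) {
	h := strings.TrimSpace(host)
	if h == "" || strings.IndexFunc(h, unicode.IsSpace) >= 0 {
		return "", errSNIHostInvalid
	}
	return h, nil
}

// validateACMEAccountRef treats nil as "operator-supplied certificates";
// a non-nil pointer must reference a non-empty string after trimming.
func validateACMEAccountRef(ref *string) error {
	if ref == nil {
		return nil
	}
	if strings.TrimSpace(*ref) == "" {
		return errACMERefEmpty
	}
	return nil
}

func main() {
	host, err := normalizeSNIHost(" app.example.com ")
	fmt.Println(host, err) // app.example.com <nil>

	_, err = normalizeSNIHost("app .example.com")
	fmt.Println(err != nil) // true

	blank := "   "
	fmt.Println(validateACMEAccountRef(nil) == nil, validateACMEAccountRef(&blank) != nil) // true true
}
```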

SiteToSiteTunnel

A directional tunnel to a remote endpoint. Invariants (sitetosite.go):

  • ResourceID non-zero; Slug kebab-case; Kind one of {wireguard, ipsec, openvpn}.
  • RemoteHost non-empty after trimming; RemotePort within 1..65535.
  • AuthSecretRef a well-formed opaque SecretRef.
  • AllowedSubnets is a non-empty []netip.Prefix where every entry is a valid prefix; an empty list wraps ErrAllowedSubnetEmpty. The aggregate accepts already-parsed prefixes (the transport boundary parses the operator strings once and fails fast on a malformed CIDR).
  • RoutingPolicy one of the closed directions {bidirectional, ingress_only, egress_only}.
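The split between transport-side parsing and the aggregate's AllowedSubnets check can be sketched with the standard net/netip package. The function names parseAllowedSubnets and validateAllowedSubnets are hypothetical; only ErrAllowedSubnetEmpty and the []netip.Prefix shape come from the text above.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

var ErrAllowedSubnetEmpty = errors.New("bridge: allowed subnets must not be empty")

// parseAllowedSubnets models the transport boundary: it parses the operator
// strings once and fails fast on the first malformed CIDR.
func parseAllowedSubnets(raw []string) ([]netip.Prefix, error) {
	out := make([]netip.Prefix, 0, len(raw))
	for _, s := range raw {
		p, err := netip.ParsePrefix(s)
		if err != nil {
			return nil, fmt.Errorf("allowed subnet %q: %w", s, err)
		}
		out = append(out, p)
	}
	return out, nil
}

// validateAllowedSubnets models the aggregate check on already-parsed values.
func validateAllowedSubnets(prefixes []netip.Prefix) error {
	if len(prefixes) == 0 {
		return fmt.Errorf("site-to-site tunnel: %w", ErrAllowedSubnetEmpty)
	}
	for i, p := range prefixes {
		if !p.IsValid() {
			return fmt.Errorf("allowed subnet at index %d is not a valid prefix", i)
		}
	}
	return nil
}

func main() {
	prefixes, err := parseAllowedSubnets([]string{"10.20.0.0/16", "fd00::/64"})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(prefixes), validateAllowedSubnets(prefixes)) // 2 <nil>

	_, err = parseAllowedSubnets([]string{"10.20.0.0/33"})
	fmt.Println(err != nil) // true

	fmt.Println(errors.Is(validateAllowedSubnets(nil), ErrAllowedSubnetEmpty)) // true
}
```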

Value objects and error sentinels

The context's value objects live in internal/bridge/types.go and its error sentinels in internal/bridge/errors.go.

| Value object | Shape | Notes |
| --- | --- | --- |
| ResourceID, NodeID | [16]byte | Opaque external references; canonical 8-4-4-4-12 textual form; zero value is "not yet assigned". |
| ID | UUIDv7 wrapper | Surrogate key for the many-per-Resource aggregates and event ids; time-ordered so an event can carry it before the row is written. |
| Slug | kebab-case string | ^[a-z0-9]+(-[a-z0-9]+)*$, ≤ 63 chars; whitespace is rejected, never trimmed; invalid input wraps ErrSlugInvalid. |
| SecretRef | secret:<domain>/<project>/<name>(:<version>)? | Shape-only validation; never dereferenced; malformed input wraps ErrSecretRefMalformed. |
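The two string-shaped value objects lend themselves to a short parsing sketch. The Slug pattern and length bound are taken from the table above, and ParseSecretRef is named elsewhere in this document; ParseSlug is a hypothetical name, and the character class allowed inside each SecretRef segment (anything except "/", ":", and whitespace) is an assumption, since the document only fixes the overall shape.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

var (
	ErrSlugInvalid        = errors.New("bridge: invalid slug")
	ErrSecretRefMalformed = errors.New("bridge: malformed secret ref")
)

type Slug string
type SecretRef string

var slugRE = regexp.MustCompile(`^[a-z0-9]+(-[a-z0-9]+)*$`)

// Assumed segment alphabet: any run of characters other than '/', ':' and whitespace.
var secretRefRE = regexp.MustCompile(`^secret:[^/:\s]+/[^/:\s]+/[^/:\s]+(:[^/:\s]+)?$`)

// ParseSlug rejects rather than trims: surrounding whitespace fails the pattern.
func ParseSlug(s string) (Slug, error) {
	if len(s) > 63 || !slugRE.MatchString(s) {
		return "", fmt.Errorf("slug %q: %w", s, ErrSlugInvalid)
	}
	return Slug(s), nil
}

// ParseSecretRef checks the shape only; it never resolves the reference.
func ParseSecretRef(s string) (SecretRef, error) {
	if !secretRefRE.MatchString(s) {
		return "", fmt.Errorf("secret ref: %w", ErrSecretRefMalformed)
	}
	return SecretRef(s), nil
}

func main() {
	slug, err := ParseSlug("edge-ingress-1")
	fmt.Println(slug, err) // edge-ingress-1 <nil>

	_, err = ParseSlug(" edge")
	fmt.Println(errors.Is(err, ErrSlugInvalid)) // true

	ref, err := ParseSecretRef("secret:acme/prod/wg-key:3")
	fmt.Println(ref, err) // secret:acme/prod/wg-key:3 <nil>

	_, err = ParseSecretRef("secret:acme/prod")
	fmt.Println(errors.Is(err, ErrSecretRefMalformed)) // true
}
```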

The seven domain error sentinels each map to a transport Problem code (the mapping table is restated under ReBAC and Problem codes):

| Sentinel | Meaning |
| --- | --- |
| ErrResourceNotBridge | The addressed Resource exists but its kind is not bridge. |
| ErrRelayPortOutOfRange | A relay listen port falls outside 1..65535. |
| ErrSlugInvalid | A slug is not lowercase kebab-case or exceeds the length bound. |
| ErrSlugConflict | A (resource_id, slug) or (resource_id, sni_host) uniqueness collision (classified from a pgerrcode 23505). |
| ErrTargetNodeNotInDomain | An ingress rule's target Node resolves to a Domain other than the bridge's. |
| ErrAllowedSubnetEmpty | A site-to-site tunnel carries an empty allowed-subnets list. |
| ErrSecretRefMalformed | An auth secret reference does not match the opaque SecretRef format. |

The bridge-kind precondition

Every mutation first reads the target Resource through the ResourceReader port and refuses the operation unless Resource.kind == "bridge", surfacing ErrResourceNotBridge. The check runs before any aggregate write, so a Resource of any other kind can never grow bridge-orchestrator configuration. The transport layer maps ErrResourceNotBridge onto HTTP 409 with the Problem code resource_not_bridge.

The same ResourceReader.GetResource call returns the bridge Resource's Domain and Project alongside its kind. The ingress flow compares the bridge Resource's Domain against the target Node's Domain (resolved via NodeReader.GetNodeDomain) and refuses the rule with ErrTargetNodeNotInDomain when they differ, so a misconfigured rule cannot punch public traffic into a foreign tenant's mesh. The returned Project id is not what the read operations gate on: reads gate observe on the addressed Resource itself, and a Project observer resolves through the Resource's parent->observe arm (see ReBAC and Problem codes).
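The precondition and the in-Domain check can be sketched against in-memory fakes. The port names ResourceReader and NodeReader, the method names GetResource and GetNodeDomain, and the two sentinels come from the text above; the method signatures, the Resource struct fields, admitIngressTarget, the "vm" kind, and all fixtures are assumptions for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

var (
	ErrResourceNotBridge     = errors.New("bridge: resource is not of kind bridge")
	ErrTargetNodeNotInDomain = errors.New("bridge: target node not in bridge domain")
)

type ResourceID [16]byte
type NodeID [16]byte

// Resource is a hypothetical read model returned by the port.
type Resource struct{ Kind, DomainID, ProjectID string }

type ResourceReader interface {
	GetResource(ctx context.Context, id ResourceID) (Resource, error)
}

type NodeReader interface {
	GetNodeDomain(ctx context.Context, id NodeID) (string, error)
}

// admitIngressTarget runs the kind check first, then the Domain comparison,
// before any aggregate write would happen.
func admitIngressTarget(ctx context.Context, rr ResourceReader, nr NodeReader, rid ResourceID, nid NodeID) error {
	res, err := rr.GetResource(ctx, rid)
	if err != nil {
		return err
	}
	if res.Kind != "bridge" {
		return ErrResourceNotBridge
	}
	nodeDomain, err := nr.GetNodeDomain(ctx, nid)
	if err != nil {
		return err
	}
	if nodeDomain != res.DomainID {
		return ErrTargetNodeNotInDomain
	}
	return nil
}

// In-memory fakes standing in for the tenancy read models.
type fakeResources map[ResourceID]Resource

func (f fakeResources) GetResource(_ context.Context, id ResourceID) (Resource, error) {
	r, ok := f[id]
	if !ok {
		return Resource{}, errors.New("resource not found")
	}
	return r, nil
}

type fakeNodes map[NodeID]string

func (f fakeNodes) GetNodeDomain(_ context.Context, id NodeID) (string, error) {
	d, ok := f[id]
	if !ok {
		return "", errors.New("node not found")
	}
	return d, nil
}

// Demo fixtures.
var (
	bg          = context.Background()
	bridgeID    = ResourceID{1}
	vmID        = ResourceID{2}
	nodeSame    = NodeID{1}
	nodeForeign = NodeID{2}
	resources   = fakeResources{
		bridgeID: {Kind: "bridge", DomainID: "dom-a", ProjectID: "proj-1"},
		vmID:     {Kind: "vm", DomainID: "dom-a", ProjectID: "proj-1"},
	}
	nodes = fakeNodes{nodeSame: "dom-a", nodeForeign: "dom-b"}
)

func main() {
	fmt.Println(admitIngressTarget(bg, resources, nodes, bridgeID, nodeSame))    // <nil>
	fmt.Println(admitIngressTarget(bg, resources, nodes, vmID, nodeSame))        // bridge: resource is not of kind bridge
	fmt.Println(admitIngressTarget(bg, resources, nodes, bridgeID, nodeForeign)) // bridge: target node not in bridge domain
}
```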

Opaque SecretRef

Every auth secret a bridge aggregate consumes — a UserAccessProvider's provider credentials, a SiteToSiteTunnel's tunnel key — is stored as an opaque SecretRef of the form secret:<domain>/<project>/<name>(:<version>)? and never as secret material. The DECISION block in internal/bridge/types.go records the choice; the alternative considered was storing the material inline (or a resolved handle the bridge could dereference), and it was rejected.

This iteration stores the reference only. ParseSecretRef validates the shape only and never dereferences or resolves the underlying value. Keeping the context reference-only means it carries no secret bytes, depends on no secret-store client, and an auth_secret_ref column leaking through a projection exposes a pointer, not a credential. Materialising a SecretRef into live auth material through the Secret Store is a deferred story (see Deferred downstream).

Closed outbox event set

The Bridge Orchestrator emits exactly seven outbox event_type strings and no others. Each is a past-tense state-change notification: a *Configured event announces a create-or-update of the corresponding aggregate, and a *Removed event announces its teardown. BridgeRelay, being a singleton, has only a Configured form.

| Event type literal | Payload struct | Emitted on |
| --- | --- | --- |
| bridge.RelayConfigured | BridgeRelayConfigured | Create-or-update of a bridge Resource's singleton relay configuration. |
| bridge.UserAccessProviderConfigured | UserAccessProviderConfigured | Create-or-update of a UserAccessProvider. |
| bridge.UserAccessProviderRemoved | UserAccessProviderRemoved | Teardown of a UserAccessProvider. |
| bridge.PublicIngressRuleConfigured | PublicIngressRuleConfigured | Create-or-update of a PublicIngressRule. |
| bridge.PublicIngressRuleRemoved | PublicIngressRuleRemoved | Teardown of a PublicIngressRule. |
| bridge.SiteToSiteTunnelConfigured | SiteToSiteTunnelConfigured | Create-or-update of a SiteToSiteTunnel. |
| bridge.SiteToSiteTunnelRemoved | SiteToSiteTunnelRemoved | Teardown of a SiteToSiteTunnel. |

The payload structs and their discriminator constants live next to their aggregates, one events sub-package each: relay/events, useraccess/events, ingress/events, and sitetosite/events. Each event denormalises just enough — ResourceID, the aggregate surrogate id, and the Slug (plus Kind or SNIHost where relevant) — so a downstream consumer can route and index without joining back to a row that may already be gone.

Wire-contract gate

The closed seven-event set is enforced by an AST workspace gate at tests/workspace/bridge_event_type_set_test.go. The gate parses this context's source, finds every event_type literal reaching an AppendOutboxEvent call, and fails the build if any literal escapes the closed set. A new bridge event type must be declared in the context's events package and added to the gate's allow-list first; silent additions are a programming error.
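How an AST gate of this kind can work is sketched below using the standard go/parser and go/ast packages. This is not the gate in tests/workspace; it assumes the event_type reaches AppendOutboxEvent as a string literal argument and that no other string literal is passed to that call. The helper names and the two sample sources are hypothetical.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"strconv"
)

// allowed is the closed seven-event set from the table above.
var allowed = map[string]bool{
	"bridge.RelayConfigured":              true,
	"bridge.UserAccessProviderConfigured": true,
	"bridge.UserAccessProviderRemoved":    true,
	"bridge.PublicIngressRuleConfigured":  true,
	"bridge.PublicIngressRuleRemoved":     true,
	"bridge.SiteToSiteTunnelConfigured":   true,
	"bridge.SiteToSiteTunnelRemoved":      true,
}

// calleeName returns the bare function or method name of a call.
func calleeName(call *ast.CallExpr) string {
	switch fn := call.Fun.(type) {
	case *ast.Ident:
		return fn.Name
	case *ast.SelectorExpr:
		return fn.Sel.Name
	}
	return ""
}

// escapedEventTypes parses src and returns every string literal passed to an
// AppendOutboxEvent call that is not in the allow-list.
func escapedEventTypes(src string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	var escaped []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || calleeName(call) != "AppendOutboxEvent" {
			return true
		}
		for _, arg := range call.Args {
			lit, ok := arg.(*ast.BasicLit)
			if !ok || lit.Kind != token.STRING {
				continue
			}
			if v, err := strconv.Unquote(lit.Value); err == nil && !allowed[v] {
				escaped = append(escaped, v)
			}
		}
		return true
	})
	return escaped, nil
}

// Hypothetical sources: parsing does not type-check, so Tx may stay undefined.
const sampleOK = `package p

func emit(tx Tx, payload any) error {
	return tx.AppendOutboxEvent("bridge.RelayConfigured", payload)
}`

const sampleBad = `package p

func emit(tx Tx, payload any) error {
	return tx.AppendOutboxEvent("bridge.RelayRemoved", payload)
}`

func main() {
	ok, _ := escapedEventTypes(sampleOK)
	bad, _ := escapedEventTypes(sampleBad)
	fmt.Println(len(ok), bad) // 0 [bridge.RelayRemoved]
}
```

A real gate would walk every file in the context and would also need to follow constants to their literal values; this sketch only covers the direct-literal case.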

Audit relation namespace

Separately from the outbox event set, every operation stamps a verb-style audit relation onto its audit row. These are audit-event operation namespaces in dotted-snake form — they name the OPERATION, not a SpiceDB schema relation. They are distinct from both the outbox event literals above and the ReBAC relations below; the distinction matters because the same operation carries an outbox event type, an audit operation namespace, and a ReBAC permission, and the three must not be conflated. The relations are pinned as constants in internal/transport/http/v1/bridge/wiring.go.

| Aggregate | Audit operation namespaces |
| --- | --- |
| BridgeRelay | bridge.relay.configure, bridge.relay.read |
| UserAccessProvider | bridge.user_access.configure, bridge.user_access.update, bridge.user_access.remove, bridge.user_access.read |
| PublicIngressRule | bridge.ingress.configure, bridge.ingress.update, bridge.ingress.remove, bridge.ingress.read |
| SiteToSiteTunnel | bridge.site_to_site.configure, bridge.site_to_site.update, bridge.site_to_site.remove, bridge.site_to_site.read |

The orchestrator emits one audit row per mutation and one per denial, distinguished by the row's Outcome (permission_denied, invariant_violation, conflict, internal_error, or a success outcome). A denial row is written before the response is flushed, so a flaky audit backend cannot turn a denial into a silent one.

ReBAC and Problem codes

Every operation is gated by a ReBAC permission check before it touches an aggregate. The split is deliberate and contract-faithful (the DECISION block in internal/transport/http/v1/bridge/wiring.go records it):

  • Writes gate the manage permission on the addressed Resource (resource:<resource_id>). The resource definition is the aggregate-scope object the mutation acts on.
  • Reads gate the observe permission on the addressed Resource (resource:<resource_id>), the SAME object the writes gate on. Binding the read decision to the object actually returned closes the cross-tenant BOLA (broken object-level authorization) that an earlier project:<project_id>-keyed gate exposed: a caller who could observe any Project could read any bridge Resource by pairing their own project_id with a victim's resource_id. The resource definition derives observe = owner + maintainer + operator + viewer + parent->observe, so a legitimate Project observer still resolves through the parent->observe arm and no access is lost. The project_id path parameter is consequently NOT load-bearing for authorization.

| Operation class | ReBAC object | Permission |
| --- | --- | --- |
| Configure / Update / Remove (all four aggregates) | resource:<resource_id> | manage |
| Get / List (all four aggregates) | resource:<resource_id> | observe |
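The gate selection in the table reduces to a small pure function. The sketch below is illustrative only: the opClass type, the gate struct, and gateFor are hypothetical names, and the actual permission check against the ReBAC backend is out of scope. The point it shows is that the project_id path parameter is not an input to the decision.

```go
package main

import "fmt"

type opClass int

const (
	opConfigure opClass = iota
	opUpdate
	opRemove
	opGet
	opList
)

// gate names the ReBAC object and the permission to check on it.
type gate struct{ Object, Permission string }

// gateFor derives the gate from the operation class and the addressed
// Resource alone; no project_id is consulted.
func gateFor(op opClass, resourceID string) gate {
	permission := "manage"
	if op == opGet || op == opList {
		permission = "observe"
	}
	return gate{Object: "resource:" + resourceID, Permission: permission}
}

func main() {
	fmt.Println(gateFor(opConfigure, "r-1")) // {resource:r-1 manage}
	fmt.Println(gateFor(opList, "r-1"))      // {resource:r-1 observe}
}
```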

The typed domain, repository, and service sentinels map onto the closed Problem-code taxonomy at the transport boundary (internal/transport/http/v1/bridge/errors.go):

| Sentinel / condition | HTTP status | Problem code |
| --- | --- | --- |
| ErrResourceNotBridge | 409 | resource_not_bridge |
| ErrSlugConflict | 409 | slug_conflict |
| ErrRelayPortOutOfRange | 400 | relay_port_out_of_range |
| ErrTargetNodeNotInDomain | 400 | target_node_not_in_domain |
| ErrAllowedSubnetEmpty | 400 | allowed_subnet_empty |
| ErrSecretRefMalformed | 400 | secret_ref_malformed |
| ErrSlugInvalid | 400 | invalid_slug |
| repo not-found (any of the four) | 404 | resource_not_found |
| ReBAC denial | 403 | rendered as a PermissionDenied body |
| unprovisioned handler (no wiring) | 501 | bridge_not_provisioned |

The 500 path never interpolates the underlying error text into the wire body — raw driver messages can carry SQL fragments or constraint names a caller has no right to see; the detail is logged internally and the body stays generic.
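The sentinel rows of the mapping table, plus the generic 500 fallback, can be sketched with errors.Is. The statuses and Problem codes for the seven sentinels and the not-found row come from the table; the problem struct, toProblem, errNotFound, errDriver, and the "internal_error" code for the fallback are assumptions. The ReBAC denial and unprovisioned-handler rows are conditions rather than sentinels and are left out.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrResourceNotBridge     = errors.New("resource not bridge")
	ErrSlugConflict          = errors.New("slug conflict")
	ErrRelayPortOutOfRange   = errors.New("relay port out of range")
	ErrTargetNodeNotInDomain = errors.New("target node not in domain")
	ErrAllowedSubnetEmpty    = errors.New("allowed subnet empty")
	ErrSecretRefMalformed    = errors.New("secret ref malformed")
	ErrSlugInvalid           = errors.New("slug invalid")

	errNotFound = errors.New("not found")                           // hypothetical repo sentinel
	errDriver   = errors.New(`pq: relation "bridge_relay" missing`) // demo driver error
)

type problem struct {
	Status int
	Code   string
}

// toProblem matches wrapped sentinels with errors.Is. Anything unrecognised
// becomes a generic 500 whose body never carries the underlying error text.
func toProblem(err error) problem {
	switch {
	case errors.Is(err, ErrResourceNotBridge):
		return problem{409, "resource_not_bridge"}
	case errors.Is(err, ErrSlugConflict):
		return problem{409, "slug_conflict"}
	case errors.Is(err, ErrRelayPortOutOfRange):
		return problem{400, "relay_port_out_of_range"}
	case errors.Is(err, ErrTargetNodeNotInDomain):
		return problem{400, "target_node_not_in_domain"}
	case errors.Is(err, ErrAllowedSubnetEmpty):
		return problem{400, "allowed_subnet_empty"}
	case errors.Is(err, ErrSecretRefMalformed):
		return problem{400, "secret_ref_malformed"}
	case errors.Is(err, ErrSlugInvalid):
		return problem{400, "invalid_slug"}
	case errors.Is(err, errNotFound):
		return problem{404, "resource_not_found"}
	default:
		// "internal_error" is an assumed code; the detail is logged, not returned.
		return problem{500, "internal_error"}
	}
}

func main() {
	wrapped := fmt.Errorf("insert provider: %w", ErrSlugConflict)
	fmt.Println(toProblem(wrapped))   // {409 slug_conflict}
	fmt.Println(toProblem(errDriver)) // {500 internal_error}
}
```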

Deferred downstream

Two downstream stories build on this context and are out of scope here; of the two, only secret materialisation is still deferred. Each is named so a reader who expects end-to-end behaviour today knows where it lands. A third historical follow-up has landed: sourcing a peer's fallback relay endpoint from the per-bridge BridgeRelay listen port and dispatching the effective config onto the bridge_config_updated mesh SSE wire literal. Its producer-side seam, the closed-outbox-vs-wire-literal split, and the per-Node fan-out are documented at ./events.md. bridge_config_updated is the mesh SSE wire event produced from the seven outbox literals by the publisher's translation seam; it is not an eighth outbox event this context emits.

  • Cross-aggregate validation. The validation that runs when several aggregates are weighed against one another — a candidate's host port against its sibling relay and providers, a tunnel's allowed subnets against the Domain mesh CIDR and sibling tunnels, an ingress rule's ACME issuance feasibility — has landed as a separate, stateless pre-persist application service that the four bridge services invoke after the ReBAC check authorises the caller and before the persist transaction opens. It is no longer deferred; its refusals, the total failure-precedence order, and the per-entry-point prefix are documented at ./validation.md.
  • Secret materialisation. A downstream story materialises a SecretRef into live auth material through the Secret Store — the resolution this context deliberately does not perform (see Opaque SecretRef).

What this context is not

To keep the boundary sharp, the Bridge Orchestrator is deliberately NOT:

  • A secret store. It stores opaque SecretRef references and never secret material; resolution is a deferred story.
  • A mesh effective-config builder. It exposes the relay listen port and the RelayAssignmentReader read seam; building the per-peer fallback-relay effective config and dispatching it onto the bridge_config_updated mesh SSE wire literal is the events surface documented at ./events.md.
  • A relay / tunnel control loop. It stores operator intent; it does not run a relay daemon, a provider daemon, or a tunnel.
  • A Resource lifecycle owner. It attaches configuration to an existing bridge Resource and refuses any other kind; creating or deleting the Resource itself belongs to tenancy.
  • A cross-aggregate validator. Each aggregate still validates its own shape invariants inside its own boundary; the orchestrator aggregates are not themselves the cross-aggregate validator. The coordinated checks that weigh one aggregate against its siblings and cross-Domain state now run as a separate pre-persist validation pipeline documented at ./validation.md.

Cross-references