Storage topology — plexsphere platform packages

This document maps each row of the README § Storage Topology table to the internal/platform/* package that owns it, the key types every bounded context consumes across that seam, and the name under which the package registers its readiness probe with internal/platform/health.Registry.

The README table is the canonical source for which stores plexsphere uses. This document is the canonical source for how plexsphere consumes those stores from Go: which package, which constructor, which probe name. Drift between the two is caught by the doctest in tests/docs/storage_topology_test.go.

Row-to-package map

| README row | Technology | internal/platform/* package | Key exported types | Probe name |
| --- | --- | --- | --- | --- |
| Relational DB | PostgreSQL 16+ | internal/platform/db | Config, New, ProbeFunc, migrations.Up/Down/Status | db-primary |
| ReBAC engine | SpiceDB on PostgreSQL datastore (see SpiceDB↔Postgres wiring below) | internal/platform/db/spicedb (bootstrap) + internal/authz (gRPC client, Authorizer) | SpiceDBFS, ApplySpiceDBBootstrap, ErrWrongTarget; authz.Client, authz.Authorizer, authz.Session | (bootstrap-only; no runtime probe today) |
| Secret backend | OpenBao (Vault-compatible) | internal/platform/secretstore | Config, NewClient, Auth, AuthAppRole, AuthKubernetes, KVSecret, ProbeFunc | secretstore |
| Management K8s fleet | Kubernetes + Crossplane v2 + ESO | (deferred to a provisioning story; no internal/platform seam yet) | | |
| Object store | SeaweedFS / AWS S3 (S3 API) | internal/platform/blobstore | Config, NewClient, ErrNotFound, ObjectInfo, Presign, ProbeFunc | blobstore |
| Pub/sub + SSE replay | NATS with JetStream | internal/platform/messaging | Config, NewClient, StreamConfig, EnsureStream, Publish, Replay, ErrStreamNotFound, ProbeFunc | jetstream |
| Metrics store | Grafana Mimir | (deferred to an observability story; no internal/platform seam yet) | | |
| Logs + audit store | Grafana Loki | (deferred to an observability story; no internal/platform seam yet) | | |

The three (deferred) rows exist in the README because the SaaS topology depends on them; they do not yet have an internal/platform/* seam because the foundational platform-storage work carries only the four data platforms the core binary touches directly. The drift gate in tests/docs/storage_topology_test.go asserts all eight rows are represented above.

Data flow

   +----------------------------------------------------------------+
   |                         cmd/plexsphere                         |
   |                      (core binary, stateless)                  |
   +------+---------------+-------------------+---------------+-----+
          |               |                   |               |
          v               v                   v               v
   +------+------+ +------+------+ +----------+-------+ +-----+------+
   | internal/   | | internal/   | |  internal/       | | internal/  |
   | platform/db | | platform/   | |  platform/       | | platform/  |
   |  (+migs +   | | messaging   | |  secretstore     | | blobstore  |
   |   spicedb)  | | (JetStream) | |  (OpenBao KV v2) | | (S3 API)   |
   +------+------+ +------+------+ +---------+--------+ +-----+------+
          |               |                  |                |
          | pgx pool      | nats.go /        | openbao-go     | aws-sdk-go-v2
          | goose         | jetstream        | api v2         | service/s3
          v               v                  v                v
   +----------+   +----------------+   +-------------+   +----------------+
   |          |   |                |   |             |   |                |
   | Postgres |   | NATS cluster   |   |  OpenBao    |   | SeaweedFS /    |
   | 16+      |   | (JetStream,    |   |  cluster    |   | AWS S3         |
   | (core +  |   | replicas per   |   |  (AppRole / |   | (path-style    |
   | spicedb  |   | STREAM_REPLICAS|   |   K8s auth) |   |  for Seaweed)  |
   | logical  |   |  default 3,    |   |             |   |                |
   |  DBs)    |   |  max-age 24h,  |   |             |   |                |
   |          |   |  max-bytes     |   |             |   |                |
   |          |   |   1 GiB/stream)|   |             |   |                |
   +----------+   +----------------+   +-------------+   +----------------+

Stream replica factor (STREAM_REPLICAS)

JetStream streams are provisioned with a replica factor taken from the STREAM_REPLICAS environment variable at publisher start-up (cmd/messaging-publisher/main.go). The default — messaging.DefaultStreamReplicas = 3 — is the production invariant and matches the durability claim the topology contract makes: no single broker loss should lose a committed message. Values other than the default are expected only when the deployment environment physically cannot satisfy the >=3 requirement.

The chainsaw e2e fixture tests/e2e/messaging/chainsaw-test.yaml sets STREAM_REPLICAS=1 because the kind cluster it targets has a single node, so a 3-replica request would block stream creation with an insufficient-resources error. This override is the only sanctioned case where STREAM_REPLICAS < 3. Production Helm charts MUST either leave the variable unset, so the publisher binary falls back to messaging.DefaultStreamReplicas, or set it to 3 explicitly.
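The fallback behaviour described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code in cmd/messaging-publisher/main.go: the function name streamReplicas is hypothetical, and rejecting malformed or non-positive values (rather than silently defaulting) is an assumed policy.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultStreamReplicas mirrors messaging.DefaultStreamReplicas (3) from the text.
const defaultStreamReplicas = 3

// streamReplicas (hypothetical) resolves the replica factor from a raw
// STREAM_REPLICAS value. An empty value falls back to the default; anything
// that is not a positive integer is rejected instead of being replaced.
func streamReplicas(raw string) (int, error) {
	if raw == "" {
		return defaultStreamReplicas, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("STREAM_REPLICAS=%q: %w", raw, err)
	}
	if n < 1 {
		return 0, fmt.Errorf("STREAM_REPLICAS=%d: must be at least 1", n)
	}
	return n, nil
}

func main() {
	n, err := streamReplicas(os.Getenv("STREAM_REPLICAS"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("stream replicas:", n)
}
```

Failing fast on a malformed value keeps a typo in a Helm values file from quietly producing a stream with an unintended replica count.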

Every arrow in the data-flow diagram above corresponds to exactly one package import. Bounded contexts under internal/{identity,mesh,policy,...} reach the stores only through these four packages: the no-direct-persistence-from-contexts depguard rule in .golangci.yml denies any direct import of the underlying drivers (github.com/jackc/pgx/**, github.com/nats-io/**, github.com/openbao/openbao/api, github.com/aws/aws-sdk-go-v2/service/s3) from any bounded-context path.

SpiceDB↔Postgres wiring

SpiceDB is a separate gRPC process that shares the plexsphere Postgres instance. The "spicedb logical DBs" box in the diagram above is the physical backing — SpiceDB writes relation tuples into its own logical database on the same cluster that hosts the plexsphere core schemas, and the spicedb role's grants are restricted by 0001_spicedb_bootstrap.sql (see docs/reference/platform/db.md#spicedb-tree).

 +---------------------+        gRPC (50051)         +-------------------+
 | cmd/plexsphere      | --------------------------> | spicedb process   |
 |   internal/authz    |   x-correlation-id metadata |   serve           |
 |   (authz.Client,    |   preshared-key bearer OR   |   --datastore-    |
 |    Authorizer,      |   mTLS per posture          |      engine=      |
 |    Session)         |                             |      postgres     |
 +---------+-----------+                             +---------+---------+
           | pgx pool                                          | pgx (own pool)
           | (plexsphere logical DBs)                          | (spicedb logical DB)
           v                                                   v
       +---+-----------------------------------------------+---+
       |                    Postgres 16+                       |
       |   plexsphere.{domains,projects,resources,groups,...}  |
       |   spicedb.{relation_tuple, ... }                      |
       +-------------------------------------------------------+

The plexsphere core binary never issues SQL against the spicedb.* schema — every relation read/write goes through authz.Client over gRPC. The no-authzed-go-outside-authz depguard rule enforces the gRPC surface's opacity, and the no-direct-persistence-from-contexts rule keeps bounded contexts off the pgx driver.

For the bounded-context explanation covering schema walk-through, zedtoken consistency, caveat semantics, audit contract, and authentication posture see docs/contexts/identity/rebac.md. For the operator runbook covering schema changes see docs/how-to/authorization/apply-the-rebac-schema.md.

Label Registry tables

The Label Registry bounded context (internal/labels) persists its aggregates on the same shared Postgres cluster as the rest of the plexsphere core schema. Migration 0005_labels.sql introduces two tables inside the plexsphere schema:

| Table | Aggregate | What it stores |
| --- | --- | --- |
| plexsphere.label_definition | LabelDefinition | Metadata for every declarable label: scope (platform, domain, or project), scope ID, local key, denormalised qualified key, JSONB value schema, applicable object kinds, cardinality policy, cloud-tag-propagation flag, immutability flag, and the on_delete policy (block, cascade, or orphan). Seeds three immutable platform-scoped rows (platform/origin, platform/mesh-ip, platform/domain) with created_by='system'. |
| plexsphere.label_assignment | LabelAssignment | One row per (object_kind, object_id, qualified_key) triple: the concrete value, a foreign key to the owning label_definition, and the acting identity. The qualified key is denormalised from the definition so the common "list every assignment on object X" path is a single index lookup without a join. |

Why label history is non-round-trippable

The migration's Down block drops both tables in reverse-dependency order. This is deliberate: the Label Registry records the before/after value of every assignment in the Platform Audit Log, not in a table-native history column, so rolling the migration back destroys the label history held in those two tables at the moment Down runs. The operator-facing contract is that Down is a destructive reset for the Label Registry, not a time-reversible migration; restoring label state across a Down/Up cycle requires a backup, not a re-run of the migration. The TestMigrations_UpThenDown_Idempotent gate in internal/platform/db/migrations/migrations_test.go asserts that Up and Down are each individually idempotent (repeat runs are no-ops); it does not claim that data survives a round-trip, and the DECISION block at the top of 0005_labels.sql names the reason.

For the bounded-context explanation covering the ubiquitous language, scope hierarchy, value-schema catalogue, reserved keys, selector grammar, and the SelectorPort contract see docs/contexts/labels/index.md.

Signed Event Bus stream — PLEXSPHERE_NODE_EVENTS

The Signed Event Bus bounded sub-context (internal/mesh/sse) publishes its envelopes onto a single JetStream stream whose canonical name is PLEXSPHERE_NODE_EVENTS. The constant lives in internal/mesh/sse/streamname.go and is the SINGLE SOURCE OF TRUTH consumed by both the publisher (EnsureStream call) and the SSE handler (messaging.Replay call): the stream name, the per-node subject layout plexsphere.node.events.<domain>.<node>, and the validator that rejects NATS-illegal subject tokens are co-located so any wire-format change is a single-file review.
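To make the subject layout concrete, here is a minimal sketch of a subject builder with token validation. It is not the code in internal/mesh/sse/streamname.go: the names validToken and nodeSubject are hypothetical, and the rejected character set (empty token, whitespace, '.', '*', '>') reflects the general NATS subject rules rather than whatever stricter policy the real validator may apply.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// streamName is the canonical stream name quoted in the text.
const streamName = "PLEXSPHERE_NODE_EVENTS"

// validToken (hypothetical) reports whether s can be used as a single NATS
// subject token: non-empty, no whitespace, and none of '.', '*' or '>'.
func validToken(s string) bool {
	if s == "" {
		return false
	}
	illegal := func(r rune) bool {
		return r == '.' || r == '*' || r == '>' || unicode.IsSpace(r)
	}
	return strings.IndexFunc(s, illegal) == -1
}

// nodeSubject (hypothetical) builds plexsphere.node.events.<domain>.<node>
// and refuses tokens that would change the subject's shape.
func nodeSubject(domain, node string) (string, error) {
	for _, tok := range []string{domain, node} {
		if !validToken(tok) {
			return "", fmt.Errorf("illegal NATS subject token %q", tok)
		}
	}
	return "plexsphere.node.events." + domain + "." + node, nil
}

func main() {
	subject, err := nodeSubject("acme", "node-1")
	fmt.Println(streamName, subject, err)

	_, err = nodeSubject("acme", "node.1")
	fmt.Println(err)
}
```

Rejecting '.' matters as much as rejecting the wildcards: a dotted node name would add an extra subject level and silently fall outside any consumer filter written against the five-token layout.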

The 24h MaxAge replay window pinned by messaging.DefaultStreamConfig is the source-of-truth bound for Last-Event-ID resume. A reconnect that quotes a Last-Event-ID below the stream's current low watermark receives HTTP 410 Gone with error code last_event_id_outside_replay_window; clients that hit this branch fall back to the reconciliation pull at GET /v1/nodes/{id}/state, which serves the same per-node state the event stream would have replayed. Operators tuning retention must move the README § Storage Topology row, the messaging package default, and this paragraph in lockstep — the 24h figure is the durability promise the SSE replay contract makes.
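The resume decision can be sketched as a small pure function. This is an illustration only: it assumes Last-Event-ID carries a numeric stream sequence and that the low watermark is the lowest sequence the stream still retains, neither of which the text above states explicitly. The function name resumeDecision is hypothetical; the status code and error code are the ones quoted in the paragraph.

```go
package main

import (
	"fmt"
	"net/http"
)

// errOutsideWindow is the error code named in the text.
const errOutsideWindow = "last_event_id_outside_replay_window"

// resumeDecision (hypothetical) maps a client's Last-Event-ID onto the HTTP
// outcome described above. firstSeq is the stream's current low watermark.
func resumeDecision(lastEventID, firstSeq uint64) (status int, errCode string) {
	if lastEventID < firstSeq {
		// The requested position has aged out of the replay window, so the
		// client must fall back to GET /v1/nodes/{id}/state.
		return http.StatusGone, errOutsideWindow
	}
	return http.StatusOK, ""
}

func main() {
	status, code := resumeDecision(41, 100)
	fmt.Println(status, code)

	status, code = resumeDecision(150, 100)
	fmt.Println(status, code)
}
```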

For the bounded-context explanation covering ubiquitous language, envelope contract, the no-cross-context-imports-mesh-sse allow-list, and the replay-window state machine see docs/contexts/mesh/sse.md. For the operator runbook covering nats stream info, nats stream view, backlog inspection, and the 410 Gone triage path see docs/how-to/mesh/inspect-the-event-bus.md.

Probe aggregation

All four probes register into the single internal/platform/health.Registry that backs /readyz. The aggregator returns 200 only when every registered probe succeeds within its per-probe timeout:

| Probe | Registered constant | Backing call |
| --- | --- | --- |
| db-primary | db.ProbeName | *pgxpool.Pool.Ping |
| jetstream | messaging.ProbeName | nats.Conn.Status() == CONNECTED |
| secretstore | secretstore.ProbeName | Sys().Health() (rejects sealed / uninitialised) |
| blobstore | blobstore.ProbeName | s3.HeadBucket against the probe bucket passed to blobstore.ProbeFunc(client, probeBucket) at registration time. There is no Config.ProbeBucket field; the bucket is deliberately not part of blobstore.Config so probe wiring stays callsite-local. |

The integration test tests/integration/probes_aggregation_test.go spins up all four containers, registers every probe, asserts /readyz == 200, stops the Postgres container, and asserts /readyz == 503 naming db-primary. The same test asserts that the 503 body does not contain any DSN password, OpenBao SecretID, or S3 SecretKey.

Evolution rules

Adding a new data platform to the topology is a four-step change that must land together:

  1. Add the row to README § Storage Topology.
  2. Add a sibling internal/platform/<name> package with Config, New*, and ProbeFunc. Register its driver's import path in the no-direct-persistence-from-contexts depguard allow-list so only that package may import it.
  3. Add a reference doc under docs/reference/platform/<name>.md with front-matter feature: PX-NNNN.
  4. Add the row to this document's Row-to-package map.

The drift tests in tests/docs refuse any partial landing of the four steps.