plexsphere architecture overview

This document is the authoritative entry point for the plexsphere codebase. It describes the two-binary control-plane runtime, the shared platform packages that both binaries compose, the health-endpoint contract, and the Domain-Driven Design (DDD) rationale that governs the directory layout under internal/. The repository also builds several supporting cmd/ binaries — the plexctl operator CLI, the bootstrap Job, and the messaging and e2e-fixture utilities — which docs/contributing/layout.md enumerates in full.

See also:

  • README.md — Runtime Topology for the deployment-level topology (HA minimums, SSE fan-out, rolling updates).
  • README.md — Runtime Topology — see the "Health and readiness" bullets inside that section for the operator-facing contract of /livez, /readyz, and /metrics (the bullets are rendered as bold text under the Runtime Topology heading and therefore share its anchor).
  • docs/contributing/layout.md for the bounded-context module map and the depguard enforcement story.

Topology

The plexsphere control-plane runtime is two cooperating Go binaries built from this repository — cmd/plexsphere and cmd/plexsphere-signer. They share no process state; they share only the packages under internal/platform/* (and, once subsequent stories land, the HTTP adapters under internal/transport/*). Every other bounded context is consumed through these shared seams, never by direct cross-context imports. The repository's other cmd/ binaries (operator CLI, bootstrap Job, messaging utilities, e2e fixtures) are supporting tooling, not part of this runtime topology.

text
                    +---------------------------+       +---------------------------+
                    |     cmd/plexsphere        |       |   cmd/plexsphere-signer   |
                    |  service="plexsphere"     |       | service="plexsphere-signer"|
                    |  default --addr :8080     |       |  default --addr :8081     |
                    |                           |       |                           |
    SIGTERM / SIGINT|  +---------------------+  |       |  +---------------------+  |SIGTERM / SIGINT
    ---------------->  | signal.NotifyContext|  |       |  | signal.NotifyContext|  <-----------------
                    |  +----------+----------+  |       |  +----------+----------+  |
                    |             |             |       |             |             |
                    |             v             |       |             v             |
                    |  +---------------------+  |       |  +---------------------+  |
                    |  |   app.Run(ctx,cfg)  |  |       |  |   app.Run(ctx,cfg)  |  |
                    |  +----------+----------+  |       |  +----------+----------+  |
                    |             |             |       |             |             |
                    |             v             |       |             v             |
                    |  chi router on *Addr*     |       |  chi router on *Addr*     |
                    |   GET /livez              |       |   GET /livez              |
                    |   GET /readyz             |       |   GET /readyz             |
                    |   GET /metrics            |       |   GET /metrics            |
                    +-----------+---------------+       +---------------+-----------+
                                |                                       |
                                |  compose the same platform packages   |
                                v                                       v
                    +------------------------------------------------------------+
                    |                   internal/platform/*                     |
                    |  bootstrap  telemetry  health  httpx  server  lint  make  |
                    +------------------------------------------------------------+

    Graceful shutdown (both binaries):
      SIGTERM -> ctx cancelled -> health.Registry.Drain() -> /readyz flips to 503
              -> http.Server.Shutdown(ctx, GracePeriod) -> in-flight requests finish
              -> process exits 0 (or 2 on server.ErrForceClose after grace deadline)

Both binaries are stateless. Any replica can serve any request, and a replica that has begun draining refuses readiness before the HTTP server starts rejecting new connections. The signer runs as a dedicated process because its access to signing material is the sharpest privilege edge in the deployment — see the README §Runtime Topology for the operational rationale.

Binaries

plexsphere (core)

  • Source: cmd/plexsphere/main.go, with the testable entrypoint in cmd/plexsphere/app.go (app.Run(ctx, cfg)).
  • Default listen address: :8080 (overridable via --addr).
  • Log field: service="plexsphere" — every structured log line emitted by this binary carries this value, so operators can filter per-binary output with a single predicate.
  • Flags:
    • --addr — TCP listen address (host:port). Defaults to :8080.
    • --addr-file — optional path that atomically receives the bound address after listen succeeds; intended for integration tests binding :0 and discovering the ephemeral port.
    • --version — print version=<semver> commit=<sha> date=<rfc3339> on stdout (one line, no binary name prefix) and exit 0.
    • --shutdown-grace — override the server's default graceful-shutdown deadline (~25 s); 0 keeps the default.
  • Build metadata: version, commit, and date are injected at link time via -ldflags "-X main.version=... -X main.commit=... -X main.date=...". The built-in defaults (0.0.0-dev, unknown, unknown) keep developer builds made without those -ldflags observable.
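
As a minimal sketch of the link-time injection described above: the variable names follow the -X main.version / main.commit / main.date targets, while the versionLine helper is a hypothetical name introduced only for illustration.

```go
package main

import "fmt"

// Package-level string variables are what -ldflags "-X main.<name>=<value>"
// overwrites at link time. Without the flags, the defaults below are used.
var (
	version = "0.0.0-dev"
	commit  = "unknown"
	date    = "unknown"
)

// versionLine (hypothetical helper) renders the single line that --version
// prints: no binary name prefix, three key=value pairs.
func versionLine() string {
	return fmt.Sprintf("version=%s commit=%s date=%s", version, commit, date)
}

func main() {
	fmt.Println(versionLine())
}
```

Building with go build -ldflags "-X main.version=1.2.3" would replace only the version field; the other two keep their defaults.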

plexsphere-signer (signer)

  • Source: cmd/plexsphere-signer/main.go.
  • Default listen address: :8081 (overridable via --addr) so both binaries can run side-by-side on a single developer workstation without collision.
  • Log field: service="plexsphere-signer".
  • Flags: identical surface to the core binary (--addr, --addr-file, --version, --shutdown-grace), and the same -ldflags injection of version, commit, and date.
  • Scope: only the signing surface (sign, public-key) lives in this process. Isolating it from the core binary means a memory-disclosure bug in the core cannot exfiltrate signing keys; this split is permanent, not an optimisation that later stories could undo.
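
The --addr-file flag shared by both binaries is described as atomically receiving the bound address. One common way to get that property is write-to-temp-then-rename. The sketch below assumes that technique; writeAddrFile and publishAndRead are hypothetical names, not the repository's actual helpers.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// writeAddrFile (hypothetical helper) writes addr to a temp file in the target
// directory, then renames it into place so a reader never sees a partial write.
func writeAddrFile(path, addr string) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".addr-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // harmless once the rename has succeeded
	if _, err := tmp.WriteString(addr); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

// publishAndRead writes addr into a fresh temp directory and reads it back,
// the way an integration test would discover the bound address.
func publishAndRead(addr string) (string, error) {
	dir, err := os.MkdirTemp("", "plexsphere-addr-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "addr")
	if err := writeAddrFile(path, addr); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	// Port 0 asks the kernel for an ephemeral port.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	got, err := publishAndRead(ln.Addr().String())
	if err != nil {
		fmt.Println("publish failed:", err)
		return
	}
	fmt.Println("bound address from file:", got)
}
```

Writing the file only after the listen call succeeds means a test that sees the file can connect immediately.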

Both binaries install the same signal handling (signal.NotifyContext(ctx, os.Interrupt, syscall.SIGTERM)) and translate server.ErrForceClose to exit code 2, every other server error to exit code 1, and clean shutdown to exit code 0.
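
The exit-code contract above can be sketched as follows. This is an illustration under assumptions: errForceClose is a local stand-in for server.ErrForceClose, and run is a placeholder for app.Run(ctx, cfg) that returns immediately.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// Local stand-ins for illustration; the real sentinel lives in the server package.
var (
	errForceClose = errors.New("server: force close after grace deadline")
	errListen     = errors.New("listen: address already in use")
)

// exitCode maps the error returned by the run loop to a process exit code.
func exitCode(err error) int {
	switch {
	case err == nil:
		return 0 // clean shutdown
	case errors.Is(err, errForceClose):
		return 2 // grace deadline elapsed
	default:
		return 1 // any other server error
	}
}

// run is a placeholder for app.Run(ctx, cfg). It returns at once here.
func run(ctx context.Context) error { return ctx.Err() }

func realMain() int {
	// The context is cancelled on SIGINT or SIGTERM.
	ctx, stop := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGTERM)
	defer stop()
	return exitCode(run(ctx))
}

func main() {
	fmt.Println(exitCode(fmt.Errorf("shutdown: %w", errForceClose)))
	fmt.Println(exitCode(errListen))
	os.Exit(realMain())
}
```

Using errors.Is rather than == keeps the mapping correct when the sentinel has been wrapped with additional context on its way up the call stack.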

Sharpest privilege edge — Signing Service

The signer is always a separate process because access to signing material is the sharpest privilege edge in the deployment. The private-key handle — whether it resolves to a file-backed developer key, a KMS client, or a PKCS#11 session against an HSM — lives exclusively in the plexsphere-signer process, so a memory-disclosure bug in the core binary cannot exfiltrate signing material. This split is a load-bearing invariant of the architecture, not an optimisation that later stories may undo; the core binary reaches the signer only through the narrow gRPC surface documented below.

See docs/contexts/signing.md for the ubiquitous language, the KeyProvider contract, rotation state machine, and operator debugging surface.

See the README runtime topology for the deployment rationale this callout materialises — lines 1588-1589 pin the "Signing Service is always a separate process" commitment.

The three load-bearing invariants that follow from this split:

Bootstrap-token enrolment flow

A fresh Node or Bridge joins a plexsphere deployment by exchanging a single-use BootstrapToken for a long-lived ServiceIdentity API token. The data flow below names the three actors, the credential boundary at each hop, and the dashed line for the out-of-band plaintext hand-off the operator owns:

text
                +--------------------+
                |  Operator (human)  |
                +----+----------+----+
       1 issue       |          | 2 plaintext (one-shot)
                     v          v
       +---------------------------------------------+
       |             cmd/plexsphere (core)           |
       |                                             |
       |   internal/identity/bootstraptokens/issuer  |
       |    -> Argon2id hash, persists aggregate     |
       |    -> emits one audit Entry per call        |
       |                                             |
       |   internal/identity/bootstraptokens/validator|
       |    -> consumes presented plaintext          |
       |    -> single-use UPDATE ... RETURNING       |
       +-----------------+------------------+--------+
              ^          |                  ^
              |          |                  |
   3 out-of-band         | 6 audit          | 5 POST /v1/register
   plaintext (- - - - -) | sink             |    (carries plaintext +
              |          v                  |     replay nonce)
              |  +----------------+         |
              +->|     plexd      +---------+
                 | (Node/Bridge)  | 4 receives plaintext, seals
                 +----------------+    nonce, redeems exactly once

The dashed (- - -) line marks the out-of-band hand-off the operator owns: the plexsphere API surface NEVER carries the plaintext to the redeeming workload. The Issuer returns it once on POST /v1/projects/{id}/bootstrap-tokens and the Validator only ever sees it on the redemption call from plexd. Every other surface (List, Get, Revoke, audit log, reconcile sweep) operates against the persisted Argon2id hash plus consumption metadata.

The bootstraptokens.Validator redemption surface is internal/transport/http/v1/handlers/register.go, which routes a POST /v1/register plaintext through the Registration Service described in the next section. The full lifecycle — issue, redeem, revoke, expire — and the threat model that names the attacker shapes this flow defends against live in ../contexts/identity/bootstrap-tokens.md.
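
To make the single-use property concrete, here is an in-memory sketch of redeem-exactly-once semantics. It is not the repository's implementation: the token id, the tokenStore type, and the error names are hypothetical, a mutex stands in for the single-use UPDATE ... RETURNING statement, and SHA-256 stands in for Argon2id only because Argon2id is not in the Go standard library.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"errors"
	"fmt"
	"sync"
)

var (
	errUnknownToken = errors.New("unknown or mismatched token")
	errAlreadyUsed  = errors.New("token already consumed")
)

type tokenRecord struct {
	hash     [32]byte // stand-in for the persisted Argon2id hash
	consumed bool
}

// tokenStore is a hypothetical in-memory substitute for the Postgres table.
type tokenStore struct {
	mu     sync.Mutex
	tokens map[string]*tokenRecord
}

func newTokenStore() *tokenStore {
	return &tokenStore{tokens: map[string]*tokenRecord{}}
}

// issue persists only the hash; the plaintext is never stored.
func (s *tokenStore) issue(id, plaintext string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tokens[id] = &tokenRecord{hash: sha256.Sum256([]byte(plaintext))}
}

// redeem checks the plaintext and flips consumed in one critical section,
// so two concurrent redemptions cannot both succeed.
func (s *tokenStore) redeem(id, plaintext string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	rec, ok := s.tokens[id]
	sum := sha256.Sum256([]byte(plaintext))
	if !ok || subtle.ConstantTimeCompare(rec.hash[:], sum[:]) != 1 {
		return errUnknownToken
	}
	if rec.consumed {
		return errAlreadyUsed
	}
	rec.consumed = true
	return nil
}

func main() {
	s := newTokenStore()
	s.issue("t1", "s3cret")
	fmt.Println(s.redeem("t1", "s3cret"))
	fmt.Println(s.redeem("t1", "s3cret"))
}
```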

Node Registration flow

Node Registration is the next step after the bootstrap-token enrolment flow above. A redeeming plexd (already enrolled with a BootstrapToken plaintext) presents that plaintext plus its X25519 public key to POST /v1/register, and the Registration Service in internal/identity/nodes/registration/service.go atomically (a) consumes the BootstrapToken via bootstraptokens.Validator, (b) allocates a mesh-IP from the Domain's mesh CIDR, (c) inserts the Node row, (d) issues the NSK plaintext plus its wrapped persistence form, and (e) appends a tenancy.NodeRegistered outbox row — all inside a single Postgres transaction in internal/identity/nodes/repo/repository_pg.go (RegistrationTx). The handler that drives the flow is internal/transport/http/v1/handlers/register.go, and the Registration Service composition is wired into Deps from cmd/plexsphere/app.go.

text
                +--------------------+
                |  plexd (enrolling) |
                |  (BootstrapToken   |
                |   plaintext +      |
                |   X25519 pubkey)   |
                +----+----------+----+
       1 POST /v1/register   |
                     v       |
       +---------------------------------------------+
       |             cmd/plexsphere (core)           |
       |                                             |
       |   internal/identity/nodes/registration      |
       |    (Registration Service)                   |
       |    -> validate X25519 public key            |
       |       (REQ-010: BEFORE token consume)       |
       |                                             |
       |    -> bootstraptokens.Validator.Consume     |
       |       (single-use, in-tx)                   |
       |                                             |
       |    -> Signing Service: resolve active       |
       |       signing public key + kid              |
       |                                             |
       |    -> mesh-IP allocate, Node insert,        |
       |       NSK issue (plaintext + wrap)          |
       |                                             |
       |    -> outbox append:                        |
       |       tenancy.NodeRegistered                |
       |       (S014 SSE consumer reads here)        |
       +-----------------+---------------------------+
                         |
                         | 2 response (one-shot):
                         |   node_id, mesh_ip,
                         |   NSK plaintext (one-shot),
                         |   signing_public_key + kid,
                         |   peer_snapshot,
                         |   domain_mesh_cidr
                         v
                +--------------------+
                |  plexd (enrolled)  |
                +--------------------+

Three load-bearing invariants follow from this flow:

  • Atomic five-step transaction — token consume, mesh-IP allocate, Node insert, NSK issue, outbox append all run inside a single RegistrationTx (see internal/identity/nodes/repo/repository_pg.go). Partial failure rolls the BootstrapToken back to un-consumed so a redeeming plexd may retry without burning the token.
  • NSK plaintext leaves the process exactly once — the Registration Service returns the plaintext to plexd in the response body and persists only its wrapped form. The wrap-key ledger lives in internal/platform/db/migrations/0008_node_secret_keys.sql, which records the wrap-key id alongside the wrapped NSK so a future unwrap can locate the key without re-deriving it.
  • Public-key validation runs BEFORE token consumption — a malformed X25519 key causes the request to fail with 400 before the bootstraptokens.Validator.Consume call, so a syntactically broken client cannot waste a single-use token. The check lives next to the command parser in internal/identity/nodes/registration/command.go.
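
The validate-before-consume ordering from the last invariant can be sketched as below. The consumer interface and countingConsumer are hypothetical stand-ins for bootstraptokens.Validator, and base64 as the wire encoding of the key is an assumption. The only real API relied on is crypto/ecdh (Go 1.20 or later), whose X25519 NewPublicKey rejects input that is not exactly 32 bytes.

```go
package main

import (
	"crypto/ecdh"
	"encoding/base64"
	"errors"
	"fmt"
)

var errBadPublicKey = errors.New("malformed X25519 public key")

// consumer is a hypothetical stand-in for bootstraptokens.Validator.
type consumer interface {
	Consume(plaintext string) error
}

// countingConsumer records how often Consume was reached.
type countingConsumer struct{ calls int }

func (c *countingConsumer) Consume(string) error { c.calls++; return nil }

// parseX25519 decodes the key (base64 is an assumed wire format) and lets
// crypto/ecdh enforce the 32-byte length.
func parseX25519(b64 string) (*ecdh.PublicKey, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, errBadPublicKey
	}
	pub, err := ecdh.X25519().NewPublicKey(raw)
	if err != nil {
		return nil, errBadPublicKey
	}
	return pub, nil
}

// register validates the key first; a failure here would map to HTTP 400
// and leaves the single-use token untouched.
func register(c consumer, tokenPlaintext, pubKeyB64 string) error {
	if _, err := parseX25519(pubKeyB64); err != nil {
		return err
	}
	return c.Consume(tokenPlaintext)
}

// sampleKey returns 32 arbitrary bytes in base64, enough to pass the length check.
func sampleKey() string {
	raw := make([]byte, 32)
	for i := range raw {
		raw[i] = byte(i + 1)
	}
	return base64.StdEncoding.EncodeToString(raw)
}

func main() {
	c := &countingConsumer{}
	fmt.Println(register(c, "token", "not-base64!!"), "consume calls:", c.calls)
	fmt.Println(register(c, "token", sampleKey()), "consume calls:", c.calls)
}
```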

The tenancy.NodeRegistered outbox row is the sole public consumption surface for downstream contexts: the SSE Signed Event Bus consumer, the reconciliation pull at GET /v1/nodes/{id}/state, and the Node deregistration handler all read from this seam rather than reaching into the nodes aggregate directly.

The bounded-context reference at ../contexts/identity/registration.md carries the ubiquitous language, the state machine for the Node aggregate, the threat model that names the attacker shapes this flow defends against, and the OpenAPI surface details for /v1/register.

Forensic substrate — Platform Audit Log

The Platform Audit Log is the forensic substrate of last resort: the tamper-evident record of every privileged plexsphere action a post-incident investigator reads when they need to answer "who did what, when, and from where?". Every ReBAC grant and denial, every bootstrap-token issuance / consumption / revocation, every signing-service Sign and rotation transition, every IdP-binding edit, every label-registry edit, every group-membership change, every approval-workflow transition, and the future Cloud-Credential, Secret-Store, and Session-issuance paths land here as one row on a per-Domain hash chain — no exceptions.

The substrate is split deliberately across two stores, each chosen for the failure mode it shoulders:

text
   Emitters (FROZEN ports)                                     │
     bootstraptokens/audit  signing/audit_middleware  authz    │
                       │                                       │
                       ▼  audit.Sink.Record(ctx, Entry) error  │
            +---------------------------+                      │
            |  audit.Service             |                     │
            |  (hash-chained sink)       |                     │
            |   · DomainResolver         |                     │
            |   · Pepper                 |                     │
            |   · canonical encoder      |                     │
            |   · sha256 chain           |                     │
            |   · advisory lock          |                     │
            +-------------+-------------+                      │
                          │ AppendAuditEntry (sqlc)            │
              +-----------▼-----------+ pull   +-------------+ │
              | Postgres (primary)    +------► |  archiver   | │
              |                       |        |  worker     | │
              |  audit_log_entry      |        +------+------+ │
              |  audit_log_chain_head |               │        │
              |  audit_subject_pii    |               ▼        │
              |  audit_tamper_quarant |        +-------------+ │
              +-----------+-----------+        | object store| │
                          │ verify             | (S3 / Sweed)| │
                          ▼                    | 7-yr cold   | │
                /readyz reconcile probe        +-------------+ │
                          │                                    │
                          ▼  audit/<domain>/<seq:020d>.json.zst
                  red on chain or mirror divergence            │

| Store | Backend | Role | Retention |
| --- | --- | --- | --- |
| Primary (hot) | PostgreSQL | Authoritative chain. Every append is a single transaction under a per-Domain pg_advisory_xact_lock. Reads (ListAuditEntries, GetAuditEntry, VerifyAuditChain) hit this store. | Full retention window — chain rows never leave Postgres until the operator-driven retention boundary is reached. |
| Mirror (cold) | S3-compatible object store (SeaweedFS self-hosted, AWS S3 SaaS) | Long-term archive. The audit.archiver worker drains archived_at IS NULL rows, zstd-compresses the canonical-JSON projection, and uploads to audit/<domain_id>/<seq:020d>.json.zst. Used by the audit-archive-restore Chainsaw suite to recover a dropped seq window. | 7-year cold default; P0 restore priority (README §Backup & Disaster Recovery). |
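
The archive object key audit/<domain_id>/<seq:020d>.json.zst can be produced with a single format verb. The sketch below uses a hypothetical archiveKey helper and a placeholder domain id. A general property of zero-padded keys is that lexicographic order matches numeric order, which makes listing a contiguous seq window straightforward.

```go
package main

import "fmt"

// archiveKey (hypothetical helper) renders the object key for one chain row.
// %020d zero-pads the sequence number to 20 digits.
func archiveKey(domainID string, seq uint64) string {
	return fmt.Sprintf("audit/%s/%020d.json.zst", domainID, seq)
}

func main() {
	fmt.Println(archiveKey("d1", 9))
	fmt.Println(archiveKey("d1", 10))
	// Zero padding keeps string order aligned with numeric order.
	fmt.Println(archiveKey("d1", 9) < archiveKey("d1", 10))
}
```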

Three load-bearing invariants follow from this split:

  • Per-Domain residency, no shared "system" chain. The DomainResolver port maps every (Subject, Object) to the Domain UUID that owns the resulting chain row; cross-Domain decisions fan out to one row per affected Domain with a shared correlation_id. An unresolvable Domain fails closed at the Sink rather than collapsing onto a default chain.
  • Pseudonymise from inception, never rewrite the chain. Chain rows reference a 32-byte deterministic pseudonym sha256(pepper(domain_id) ‖ subject_id); subject plaintext lives exactly once in audit_subject_pii. Right-to-erasure DROPs the PII row; the chain remains mathematically valid because the bytes that produced entry_hash were never the plaintext.
  • Operator-action stream separated from node-side audit ingest at the depguard layer. POST /v1/nodes/{id}/audit lands in Grafana Loki, NOT on the hash-chained Postgres surface. The no-node-audit-on-platform-chain depguard rule in ../../.golangci.yml refuses any import of internal/audit/repo from internal/observability/**, and tests/workspace/depguard_audit_isolation_test.go backs the rule at go test time so a future contributor cannot silently route the node-side stream through the operator-action chain.
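
The pseudonymisation invariant can be illustrated with a toy hash chain. The pseudonym function follows the sha256(pepper ‖ subject_id) formula stated above. Everything else is an assumption made for the sketch: the entry fields, the byte layout hashed by hashEntry, and the action strings do not reflect the project's canonical encoder.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// pseudonym computes sha256(pepper ‖ subject_id) for one Domain's pepper.
func pseudonym(pepper []byte, subjectID string) [32]byte {
	return sha256.Sum256(append(append([]byte{}, pepper...), subjectID...))
}

// entry is a simplified chain row (hypothetical field set).
type entry struct {
	Seq       uint64
	Subject   [32]byte // pseudonym, never the plaintext subject
	Action    string
	PrevHash  [32]byte
	EntryHash [32]byte
}

// hashEntry hashes an assumed layout: seq ‖ prev ‖ subject ‖ action.
func hashEntry(seq uint64, prev, subject [32]byte, action string) [32]byte {
	buf := binary.BigEndian.AppendUint64(nil, seq)
	buf = append(buf, prev[:]...)
	buf = append(buf, subject[:]...)
	buf = append(buf, action...)
	return sha256.Sum256(buf)
}

// appendEntry links a new row to the current chain head.
func appendEntry(chain []entry, subject [32]byte, action string) []entry {
	var prev [32]byte
	if n := len(chain); n > 0 {
		prev = chain[n-1].EntryHash
	}
	seq := uint64(len(chain) + 1)
	return append(chain, entry{
		Seq: seq, Subject: subject, Action: action,
		PrevHash: prev, EntryHash: hashEntry(seq, prev, subject, action),
	})
}

// verify recomputes every hash and checks each link to its predecessor.
func verify(chain []entry) bool {
	var prev [32]byte
	for i, e := range chain {
		if e.Seq != uint64(i+1) || e.PrevHash != prev ||
			e.EntryHash != hashEntry(e.Seq, e.PrevHash, e.Subject, e.Action) {
			return false
		}
		prev = e.EntryHash
	}
	return true
}

func main() {
	alice := pseudonym([]byte("placeholder-pepper"), "alice")
	pii := map[[32]byte]string{alice: "alice"} // stand-in for audit_subject_pii
	var chain []entry
	chain = appendEntry(chain, alice, "token.issued")
	chain = appendEntry(chain, alice, "token.revoked")
	delete(pii, alice) // right-to-erasure drops only the PII row
	fmt.Println("pseudonym prefix:", hex.EncodeToString(alice[:8]), "pii rows:", len(pii))
	fmt.Println("valid after erasure:", verify(chain))
	chain[0].Action = "tampered"
	fmt.Println("valid after tamper:", verify(chain))
}
```

Because the hashed bytes contain the pseudonym rather than the plaintext, deleting the PII mapping leaves every entry hash reproducible, while any edit to a chain row is detected.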

See docs/contexts/audit/index.md for the ubiquitous language, the canonical-byte encoder pin, the hash-chain state machine (Mermaid), the retention matrix, the right-to-erasure flow, the read-side ReBAC rule (domain#auditor), the OpenAPI surface (ListAuditEntries, GetAuditEntry, VerifyAuditChain, EraseIdentityFromAudit), the threat model (hostile DBA, panic dump, backup tamper, archive-mirror divergence), and the explicit "what this story is NOT" boundary against the downstream consumer stories that build on this substrate.

Heartbeat and reachability flow

Once a Node is enrolled, plexd sustains the per-Domain reachability view by sending periodic heartbeats. The flow lands a wall-clock timestamp on the Node row, runs the reachability state machine to detect transitions between healthy, stale, and unreachable, and emits a tenancy.NodeReachabilityChanged Domain Event onto the SSE event bus so the Dashboard updates without polling:

text
   plexd ── POST /v1/nodes/{id}/heartbeat ──► cmd/plexsphere (core)
                                                     │
                                                     ▼
                                       reachability evaluator
                                       (internal/mesh/reachability)
                                                     │
                                                     ▼
                                       outbox: tenancy.NodeReachabilityChanged
                                                     │
                                                     ▼
                                       SSE bus (envelope Type =
                                                node_state_updated)
                                                     │
                                                     ▼
                                              Dashboard / plexd

The reachability projection is also recoverable via the reconciliation pull at GET /v1/nodes/{id}/state, where it appears as the reachability block on NodeStateSnapshot — same source row, two channels, one shape — so a Dashboard cold-start and an SSE subscriber converge on the same to_state. The bounded-context reference at ../contexts/mesh/reachability.md carries the state-machine semantics, the heartbeat persistence layout, the evaluator's transition rules, and the SSE / reconciliation-pull convergence proof.
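
A minimal sketch of a heartbeat-age state machine over the three states named above follows. The thresholds (30 s and 90 s), the type names, and the rule that an event is emitted only on a state change are assumptions for illustration; the actual transition rules live in the reachability reference.

```go
package main

import (
	"fmt"
	"time"
)

type state string

const (
	healthy     state = "healthy"
	stale       state = "stale"
	unreachable state = "unreachable"
)

// Hypothetical thresholds; the real values are not stated in this document.
const (
	staleAfter       = 30 * time.Second
	unreachableAfter = 90 * time.Second
)

// classify derives the state from the age of the last heartbeat.
func classify(lastHeartbeat, now time.Time) state {
	age := now.Sub(lastHeartbeat)
	switch {
	case age >= unreachableAfter:
		return unreachable
	case age >= staleAfter:
		return stale
	default:
		return healthy
	}
}

// transition is what a NodeReachabilityChanged event would carry in this sketch.
type transition struct{ From, To state }

// evaluate reports a transition only when the state actually changes,
// so steady-state heartbeats produce no events.
func evaluate(prev state, lastHeartbeat, now time.Time) (transition, bool) {
	next := classify(lastHeartbeat, now)
	if next == prev {
		return transition{}, false
	}
	return transition{From: prev, To: next}, true
}

// atAge returns a (lastHeartbeat, now) pair separated by the given seconds.
func atAge(seconds int) (time.Time, time.Time) {
	now := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	return now.Add(-time.Duration(seconds) * time.Second), now
}

func main() {
	prev := healthy
	for _, age := range []int{5, 45, 60, 120} {
		last, now := atAge(age)
		if tr, changed := evaluate(prev, last, now); changed {
			fmt.Printf("age=%ds: %s -> %s\n", age, tr.From, tr.To)
			prev = tr.To
		} else {
			fmt.Printf("age=%ds: no change (%s)\n", age, prev)
		}
	}
}
```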

Health Endpoints

The HTTP handlers for /livez, /readyz, and /metrics are wired once, in internal/platform/bootstrap, and mounted on the chi router that both binaries serve. The contract is deliberately identical across the two processes so a Kubernetes probe configuration can be shared.

  • GET /livez — returns 200 as long as the process is alive. It does not consult the health.Registry; it exists to let an orchestrator kill a hung replica whose HTTP server has stopped making progress entirely.
  • GET /readyz — aggregates every probe registered with the health.Registry (see internal/platform/health/registry.go). Each probe runs in parallel with a per-probe bounded timeout (DefaultProbeTimeout, 2 s, overridable at registration via WithProbeTimeout). The endpoint returns 200 only when every probe returned nil; any probe failure — or the draining flag described below — flips the response to 503 with the per-probe detail on the JSON body.
  • GET /metrics — Prometheus exposition endpoint for the process-level and platform-level metrics emitted through internal/platform/telemetry, including the plexsphere_build_info gauge labelled with the injected version/commit/date.
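
The /readyz aggregation described above (parallel probes, a bounded per-probe timeout, 200 only when every probe returns nil) can be sketched as follows. The registry type here is a simplified stand-in, not the API of internal/platform/health, and the JSON body shape is an assumption. The demo uses a 50 ms timeout instead of the 2 s default so it finishes quickly.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

type probe func(ctx context.Context) error

// Sample probes: one healthy, one that blocks until its deadline.
var (
	okProbe   probe = func(context.Context) error { return nil }
	hangProbe probe = func(ctx context.Context) error { <-ctx.Done(); return ctx.Err() }
)

// registry is a simplified stand-in for health.Registry.
type registry struct {
	timeout time.Duration
	probes  map[string]probe
}

// check runs every probe concurrently, each under its own timeout.
// Probes are expected to honour context cancellation.
func (r *registry) check(ctx context.Context) map[string]string {
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]string, len(r.probes))
	)
	for name, p := range r.probes {
		wg.Add(1)
		go func(name string, p probe) {
			defer wg.Done()
			pctx, cancel := context.WithTimeout(ctx, r.timeout)
			defer cancel()
			status := "ok"
			if err := p(pctx); err != nil {
				status = err.Error()
			}
			mu.Lock()
			results[name] = status
			mu.Unlock()
		}(name, p)
	}
	wg.Wait()
	return results
}

// readyz returns 200 only if every probe reported ok, else 503 with detail.
func (r *registry) readyz(w http.ResponseWriter, req *http.Request) {
	results := r.check(req.Context())
	code := http.StatusOK
	for _, s := range results {
		if s != "ok" {
			code = http.StatusServiceUnavailable
		}
	}
	w.Header().Set("Content-Type", "application/json")
	w.WriteHeader(code)
	json.NewEncoder(w).Encode(results)
}

// statusFor exercises the handler in-process, without opening a socket.
func statusFor(probes map[string]probe) int {
	r := &registry{timeout: 50 * time.Millisecond, probes: probes}
	rec := httptest.NewRecorder()
	r.readyz(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code
}

func main() {
	fmt.Println(statusFor(map[string]probe{"db": okProbe}))
	fmt.Println(statusFor(map[string]probe{"db": okProbe, "signer": hangProbe}))
}
```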

Drain-on-SIGTERM

On SIGTERM (or SIGINT), the signal-notified context is cancelled, which drives the graceful-shutdown sequence in order:

  1. health.Registry.Drain() is called. /readyz now returns 503 with {"status":"draining"} regardless of probe results. Upstream load balancers observe the flip and stop routing new traffic to this replica.
  2. http.Server.Shutdown(ctx, GracePeriod) runs. In-flight requests are allowed to complete within the grace deadline (default ~25 s, overridable by --shutdown-grace).
  3. If the grace deadline elapses before in-flight requests finish, the server package surfaces server.ErrForceClose; the binary translates that to exit code 2 so an orchestrator records the forced close distinctly from a clean exit.

The sequence — flip readiness first, then stop accepting connections — is the load-balancer-friendly drain semantics the README §Runtime Topology "Health and readiness" bullets promise to operators.
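
The three-step drain can be sketched with the standard library alone. In this sketch the drainer type stands in for health.Registry, errForceClose stands in for server.ErrForceClose, and the grace period is applied through a context deadline passed to net/http's Server.Shutdown. All three are assumptions about how the wrapper might be built, not a description of the actual server package.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// errForceClose is a local stand-in for server.ErrForceClose.
var errForceClose = errors.New("force close after grace deadline")

// drainer is a minimal stand-in for the draining flag on health.Registry.
type drainer struct{ draining atomic.Bool }

func (d *drainer) Drain() { d.draining.Store(true) }

// readyz reports 503 with a draining body once Drain has been called.
func (d *drainer) readyz(w http.ResponseWriter, _ *http.Request) {
	if d.draining.Load() {
		w.WriteHeader(http.StatusServiceUnavailable)
		fmt.Fprint(w, `{"status":"draining"}`)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// shutdown runs the sequence in order: flip readiness, then stop the server.
func shutdown(srv *http.Server, d *drainer, grace time.Duration) error {
	d.Drain() // step 1: /readyz now returns 503
	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	if err := srv.Shutdown(ctx); err != nil { // step 2: wait for in-flight requests
		if errors.Is(err, context.DeadlineExceeded) {
			srv.Close()          // drop whatever is still open
			return errForceClose // step 3: surfaces as exit code 2
		}
		return err
	}
	return nil
}

// drainDemo shuts down an idle server and probes /readyz afterwards.
func drainDemo() (int, error) {
	d := &drainer{}
	mux := http.NewServeMux()
	mux.HandleFunc("/readyz", d.readyz)
	srv := &http.Server{Handler: mux}
	err := shutdown(srv, d, time.Second)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/readyz", nil))
	return rec.Code, err
}

func main() {
	code, err := drainDemo()
	fmt.Println("readyz after drain:", code, "shutdown error:", err)
}
```

Flipping the flag before calling Shutdown is what gives load balancers a chance to observe the 503 while the listener is still accepting connections.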

Cross-references

  • Engineering Principles — DDD as the primary ranking and the supporting SOLID / KISS / YAGNI / SoC principles the directory layout under internal/ materialises.
  • Repository layout and bounded-context map — the depguard rules (no-cross-context-imports, no-direct-persistence-from-contexts, no-default-http-client) that enforce the boundary at lint time.
  • CI pipeline — the per-job catalogue with the local make target that reproduces each gate, the runner caches, and the third-party-action SHA pin contract.