Capability Manifest Ingest — agent-side capability snapshots, persisted, diffed, and emitted
This document is the authoritative bounded-context reference for the capability manifest ingest surface — the seam between a plexd agent's per-Node snapshot of "what binary am I running, what hooks do I advertise, what host key am I presenting" and the durable projection downstream consumers (the integrity correlator and the dashboard projector named in the Out of scope section) observe through the NodeCapabilitiesUpdated outbox event. It covers the ubiquitous-language pins, the on-the-wire request and response shape, the exhaustive catalogue of Problem codes the handler emits, the NSK authentication model the transport inherits from the heartbeat sibling, the diff and event contract that translates a per-PUT snapshot into an outbox emission, the Postgres persistence shape migration 0035_node_capabilities.sql defines, and the four downstream consumers this slice deliberately leaves out of scope.
The capability ingest surface is a snapshot ingestor and a diff emitter — and only that. The agent publishes a self-describing manifest on every meaningful change; the control plane persists the canonical form, computes the diff against the prior persisted row, and on a non-empty diff emits a single outbox event carrying the field-level transition list and the security-load-bearing host-key flag. The ingest surface deliberately does NOT correlate against attestation, does NOT compare against the Artifact Registry, and does NOT participate in CRD-style hook scheduling; those concerns live in named follow-up stories.
Ubiquitous language
Five terms travel verbatim across the tenancy.CapabilityManifest value object, the PUT /v1/nodes/{id}/capabilities handler, the plexsphere.node_capability_manifest table, and the tenancy.NodeCapabilitiesUpdated outbox event. Internal code never paraphrases them; documentation and Problem detail strings adopt the exact spelling below.
| Term | Definition | Code anchor |
|---|---|---|
| CapabilityManifest | The value-object snapshot every plexd agent publishes through PUT /v1/nodes/{id}/capabilities. Carries the agent binary version + checksum, an optional SSH host-key fingerprint, and a list of DeclaredHook entries. Immutable: the receiver-side Node.RecordCapabilities returns a fresh Node rather than mutating the prior. | internal/identity/tenancy/capability_manifest.go (CapabilityManifest) |
| DeclaredHook | A single (name, sha256_checksum) tuple the agent advertises. The constructor rejects empty names and non-32-byte checksums; the parent CapabilityManifest rejects duplicate names case-sensitively and caps the list at CapabilitiesMaxDeclaredHooks (128). | internal/identity/tenancy/capability_manifest.go (DeclaredHook) |
| ManifestDiff | The pure-function output of DiffManifests(prev, next). Carries FieldsChanged (alphabetised name list — no before/after values, honouring the audit pseudonym contract) and HostKeyChanged (the dedicated SSH-fingerprint-moved flag downstream consumers branch on without parsing FieldsChanged). | internal/identity/tenancy/capability_manifest.go (ManifestDiff, DiffManifests) |
| NodeCapabilitiesUpdated | The closed outbox event emitted exactly once per non-empty diff. Carries the (Node, Resource, Project, Domain) ID quartet (denormalised so downstream consumers route per-Domain without an extra lookup), the FieldsChanged name list, and the HostKeyChanged flag. Stable event-type discriminator tenancy.NodeCapabilitiesUpdated. | internal/identity/tenancy/events/events.go (TypeNodeCapabilitiesUpdated, NodeCapabilitiesUpdated) |
| NSK (Node Secret Key) | The per-Node bearer token the agent presents on every authenticated request. The NSK middleware resolves the bearer plaintext to a Node, asserts the resolved Node id equals the URL path id, and attaches the Node to the request context. The capability ingest handler reads the attached Node out of the context and never re-validates the NSK itself. | internal/identity/authn/middleware/ |
The translation is one-directional: the agent emits a manifest, the handler emits a ManifestDiff on the response, and the repository emits a NodeCapabilitiesUpdated outbox event on a non-empty diff. The agent never reads the diff back through the ingest surface; the integrity correlator and the dashboard projector named in the Out of scope section consume the outbox event downstream.
Wire shape
The transport surface is one HTTP operation: PUT /v1/nodes/{id}/capabilities. The path id is a UUIDv7 Node id; the NSK presented in Authorization: Bearer MUST resolve to that same Node id (a cross-Node replay surfaces as 403 node_id_mismatch).
Request
The handler decodes the body with DisallowUnknownFields after capping the body at 32 KiB through http.MaxBytesReader. The decoded envelope is then canonicalised through tenancy.NewCapabilityManifest; the constructor is the single seam that enforces every manifest invariant.
```jsonc
{
  "binary_version": "plexd-v0.4.2-ge5f3a1c",
  "binary_checksum": "<base64 of 32-byte SHA-256>",
  "ssh_host_key_fingerprint": "SHA256:<base64>", // optional
  "declared_hooks": [
    { "name": "post-install", "checksum": "<base64 of 32 bytes>" }
  ]
}
```

Field rules:

- `binary_version` (string, required): the agent's semver. Empty and whitespace-only values are rejected with `400 binary_version_empty`.
- `binary_checksum` (string, base64, required): the raw SHA-256 digest of the running agent binary. Decodes to exactly 32 bytes through `encoding/json`'s transparent base64 handling; any other length is rejected with `400 binary_checksum_invalid`.
- `ssh_host_key_fingerprint` (string, optional): the OpenSSH SHA-256 host-key fingerprint in the canonical `SHA256:<base64>` form. Empty or absent values are accepted (a Node legitimately may not have a host key configured); a non-empty value not matching the pattern is rejected with `400 ssh_host_key_fingerprint_invalid`.
- `declared_hooks` (array, optional): at most `CapabilitiesMaxDeclaredHooks` (128) entries, no duplicate `name` values, every entry's `checksum` decoding to exactly 32 bytes. Violations route to `400 declared_hooks_too_many`, `400 declared_hook_duplicate`, or `400 declared_hook_invalid` respectively.
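The decode path described above (body cap first, strict decode second) can be sketched as follows. This is a minimal illustration, not the handler's actual code: `maxBodyBytes`, `manifestRequest`, `declaredHook`, `decodeManifest`, and the helper functions are hypothetical names, and the manifest invariants enforced by `tenancy.NewCapabilityManifest` are deliberately left out.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// maxBodyBytes mirrors the 32 KiB ceiling; the constant name is a stand-in.
const maxBodyBytes = 32 << 10

// declaredHook and manifestRequest are illustrative envelope types (assumed shape).
type declaredHook struct {
	Name     string `json:"name"`
	Checksum []byte `json:"checksum"` // encoding/json base64-decodes strings into []byte
}

type manifestRequest struct {
	BinaryVersion         string         `json:"binary_version"`
	BinaryChecksum        []byte         `json:"binary_checksum"`
	SSHHostKeyFingerprint string         `json:"ssh_host_key_fingerprint"`
	DeclaredHooks         []declaredHook `json:"declared_hooks"`
}

// decodeManifest caps the body, decodes strictly, and returns the Problem code
// the catalogue associates with the failure ("" on success).
func decodeManifest(w http.ResponseWriter, r *http.Request) (manifestRequest, string) {
	var req manifestRequest
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	dec := json.NewDecoder(r.Body)
	dec.DisallowUnknownFields()
	if err := dec.Decode(&req); err != nil {
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			return req, "capabilities_body_too_large"
		}
		return req, "malformed_capabilities_request"
	}
	return req, ""
}

// decodeString drives decodeManifest with an in-memory request.
func decodeString(body string) (manifestRequest, string) {
	r := httptest.NewRequest(http.MethodPut, "/v1/nodes/example/capabilities", strings.NewReader(body))
	return decodeManifest(httptest.NewRecorder(), r)
}

// validBody builds a request whose checksum is 32 zero bytes, base64-encoded.
func validBody() string {
	sum := base64.StdEncoding.EncodeToString(make([]byte, 32))
	return `{"binary_version":"plexd-v0.4.2-ge5f3a1c","binary_checksum":"` + sum + `"}`
}

// oversizedBody is syntactically valid JSON that exceeds the cap.
func oversizedBody() string {
	return `{"binary_version":"` + strings.Repeat("a", maxBodyBytes) + `"}`
}

func main() {
	req, code := decodeString(validBody())
	fmt.Println(len(req.BinaryChecksum), code == "")

	_, code = decodeString(`{"binary_version":"v1","extra":true}`)
	fmt.Println(code)

	_, code = decodeString(oversizedBody())
	fmt.Println(code)
}
```

Because the cap wraps the body before the decoder reads it, an oversized payload is rejected on size alone, and the decoder never buffers more than the cap allows.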
Response
```jsonc
{
  "accepted_at": "2026-04-27T10:15:30.123Z",
  "fields_changed": ["binary_checksum", "binary_version"],
  "host_key_changed": false
}
```

`accepted_at` is the server commit timestamp the handler stamped on the request (UTC, RFC 3339). `fields_changed` is the alphabetised list of manifest fields that transitioned versus the prior persisted snapshot; on an idempotent PUT (no diff) the field is the explicit empty array `[]` and `host_key_changed` is `false`. The JSON encoder is forced to emit `[]` rather than `null` so downstream consumers can rely on the field being present on every 200.
The valid fields_changed enum values are the four snake_case column-equivalent identifiers:
- `binary_version`
- `binary_checksum`
- `ssh_host_key_fingerprint`
- `declared_hooks`
host_key_changed is a dedicated boolean rather than a derived predicate over fields_changed because SSH host-key transitions are security-load-bearing — downstream consumers (the integrity correlator named in the Out of scope section) switch on the explicit flag without parsing the string list.
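One way to guarantee that `fields_changed` serialises as `[]` rather than `null` is to normalise a nil slice before marshalling. The sketch below assumes a hypothetical `capabilitiesResponse` struct and `newResponse` constructor; the millisecond layout is inferred from the example timestamp and is an assumption, not a documented contract.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// capabilitiesResponse is an illustrative response envelope (assumed shape).
type capabilitiesResponse struct {
	AcceptedAt     string   `json:"accepted_at"`
	FieldsChanged  []string `json:"fields_changed"`
	HostKeyChanged bool     `json:"host_key_changed"`
}

// rfc3339Millis renders UTC timestamps like 2026-04-27T10:15:30.123Z.
const rfc3339Millis = "2006-01-02T15:04:05.000Z07:00"

// newResponse normalises a nil field list to an empty slice, because
// encoding/json marshals a nil slice as null and an empty slice as [].
func newResponse(acceptedAt time.Time, fields []string, hostKeyChanged bool) capabilitiesResponse {
	if fields == nil {
		fields = []string{}
	}
	return capabilitiesResponse{
		AcceptedAt:     acceptedAt.UTC().Format(rfc3339Millis),
		FieldsChanged:  fields,
		HostKeyChanged: hostKeyChanged,
	}
}

// exampleJSON marshals a response stamped with the timestamp from the example above.
func exampleJSON(fields []string, hostKeyChanged bool) string {
	at := time.Date(2026, time.April, 27, 10, 15, 30, 123_000_000, time.UTC)
	out, err := json.Marshal(newResponse(at, fields, hostKeyChanged))
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(exampleJSON(nil, false)) // idempotent PUT
	fmt.Println(exampleJSON([]string{"binary_checksum", "binary_version"}, false))
}
```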
Error code catalog
Every reachable Problem code from capabilities.go is enumerated below; generated TypeScript / Go clients can exhaustively switch on code without a fall-through arm. The handler's gate ordering (authn → path-id → body cap → body decode → manifest invariants → aggregate write) determines which arm fires; earlier gates are cheaper than later ones so the cheapest rejection path always short-circuits the more expensive ones.
| HTTP status | code literal | Trigger |
|---|---|---|
| 400 | binary_version_empty | binary_version is missing or empty after trimming whitespace (tenancy.ErrCapabilityManifestVersion). |
| 400 | binary_checksum_invalid | binary_checksum is missing, empty, or not exactly 32 bytes after base64 decoding (tenancy.ErrCapabilityManifestChecksum). Distinct from heartbeat's binary_checksum_empty because the manifest invariant covers both empty AND wrong-length values in one branch. |
| 400 | ssh_host_key_fingerprint_invalid | ssh_host_key_fingerprint is non-empty and does not match SHA256:<base64> (tenancy.ErrCapabilityManifestHostKey). |
| 400 | declared_hook_invalid | A declared_hooks entry violates a per-entry invariant — empty name or non-32-byte checksum (tenancy.ErrCapabilityManifestHookInvalid). |
| 400 | declared_hook_duplicate | Two declared_hooks entries carry the same name (tenancy.ErrCapabilityManifestHookDuplicate). |
| 400 | declared_hooks_too_many | declared_hooks exceeds CapabilitiesMaxDeclaredHooks (128) (tenancy.ErrCapabilityManifestHooksTooMany). |
| 400 | malformed_capabilities_request | Body is not a valid CapabilityManifestRequest envelope — invalid JSON, unknown field (the decoder runs DisallowUnknownFields), or missing required field. |
| 401 | unauthorized | The NSK middleware did not attach a Node to the request context. This is the handler's defensive arm: in production the middleware runs upstream, so a missing Node is unreachable on a correctly mounted route. |
| 403 | node_id_mismatch | The defense-in-depth path-id gate: the resolved NSK Node id does not equal the URL path id. The NSK middleware runs the same check upstream; this handler-side double-check protects against a misconfigured router that mounts the handler without the middleware. Audit row stamps the dedicated node_capabilities.path_gate relation so dashboards can split admission versus ingestion outcomes. |
| 404 | capabilities_node_not_found | The recorder reported ErrNotFound — the Node row was concurrently deleted between the NSK middleware's admission and the aggregate write. |
| 413 | capabilities_body_too_large | Body exceeds the 32 KiB CapabilitiesMaxBodyBytes ceiling. http.MaxBytesReader caps the bytes before the JSON decoder is ever invoked. |
| 501 | capabilities_not_provisioned | The deferred-wiring posture: one or more of CapabilitiesRecorder, NSKResolver, or NodeRepo is nil on the Handlers struct. The dispatch shim in capabilities_dispatch.go fails closed so log scrapers can alert on the deferred-wiring state. |
Every reachable 4xx and 5xx arm emits an audit row through the shared AuditSink. Ingestion-phase entries (malformed body, invariant violation, recorder failure, granted) stamp the node_capabilities.record relation; the defense-in-depth path-id gate stamps node_capabilities.path_gate. Audit dashboards filter on the relation to detect "middleware was bypassed but the handler caught it" without conflating it with ingestion-phase entries.
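The sentinel-to-Problem dispatch in the table can be approximated with `errors.Is` over a lookup table. The sketch below uses local stand-in sentinels (`errVersion`, `errChecksum`, and so on) in place of the `tenancy.Err…` values, and the `problem`, `problemFor`, `wrapped`, and `unmapped` names are hypothetical. It covers only the error-driven rows; the body-cap, authn, and provisioning rows are decided before any sentinel exists.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Local stand-ins for the sentinels named in the catalogue.
var (
	errVersion       = errors.New("capability manifest: binary_version empty")
	errChecksum      = errors.New("capability manifest: binary_checksum invalid")
	errHostKey       = errors.New("capability manifest: host key fingerprint invalid")
	errHookInvalid   = errors.New("capability manifest: declared hook invalid")
	errHookDuplicate = errors.New("capability manifest: declared hook duplicate")
	errHooksTooMany  = errors.New("capability manifest: too many declared hooks")
	errNotFound      = errors.New("node not found")
)

// problem pairs an HTTP status with the code literal from the table.
type problem struct {
	Status int
	Code   string
}

var problemTable = []struct {
	err error
	p   problem
}{
	{errVersion, problem{http.StatusBadRequest, "binary_version_empty"}},
	{errChecksum, problem{http.StatusBadRequest, "binary_checksum_invalid"}},
	{errHostKey, problem{http.StatusBadRequest, "ssh_host_key_fingerprint_invalid"}},
	{errHookInvalid, problem{http.StatusBadRequest, "declared_hook_invalid"}},
	{errHookDuplicate, problem{http.StatusBadRequest, "declared_hook_duplicate"}},
	{errHooksTooMany, problem{http.StatusBadRequest, "declared_hooks_too_many"}},
	{errNotFound, problem{http.StatusNotFound, "capabilities_node_not_found"}},
}

// problemFor resolves an error (possibly wrapped) to its catalogue entry.
// The boolean is false for errors the table does not know about.
func problemFor(err error) (problem, bool) {
	for _, row := range problemTable {
		if errors.Is(err, row.err) {
			return row.p, true
		}
	}
	return problem{}, false
}

// wrapped simulates a lower layer adding context with %w.
func wrapped(err error) error { return fmt.Errorf("record capabilities: %w", err) }

// unmapped returns an error outside the catalogue.
func unmapped() error { return errors.New("boom") }

func main() {
	p, ok := problemFor(wrapped(errHookDuplicate))
	fmt.Println(p.Status, p.Code, ok)
	_, ok = problemFor(unmapped())
	fmt.Println(ok)
}
```

Matching with `errors.Is` keeps the mapping stable even when intermediate layers wrap the sentinel with additional context.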
NSK authentication model
The capability ingest surface inherits the NSK authentication seam from its heartbeat sibling — the same per-Node bearer credential governs both endpoints. There is no ReBAC participation on the ingest path: the operator-facing identity-and-relation authorisation surface lives in ../../../internal/authz/ and is documented under ../identity/rebac.md; the agent-facing capability ingest is a per-Node credential surface keyed on the NSK plaintext alone.
The authentication contract is structural and two-step:
- Middleware admission. The NSK middleware (`../../../internal/identity/authn/middleware/`) resolves the `Authorization: Bearer` plaintext to a Node, asserts the resolved Node id equals the URL path `id`, and attaches the Node to the request context. Missing, malformed, or revoked credentials surface as `401 nsk_revoked`; a cross-Node bearer surfaces as `403 node_id_mismatch`. The capability handler never re-validates the NSK itself.
- Handler-side double-check. The handler reads the attached Node off the context with `middleware.FromContextNode` and runs a defense-in-depth path-id comparison through `middleware.MatchesPathID`. A mismatch fires the `node_capabilities.path_gate` audit relation and refuses with `403 node_id_mismatch`. This arm is unreachable when the middleware is mounted correctly; it protects against misconfigured routes that bypass the middleware.
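The handler-side double-check can be pictured as a small gate over the request context. In the sketch below, `withNode`, `nodeFromContext`, and `pathGate` are hypothetical stand-ins for the middleware helpers named above (`middleware.FromContextNode`, `middleware.MatchesPathID`), whose real signatures are not shown in this document; the Node ids are placeholder strings rather than UUIDv7 values.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

// node is a minimal stand-in for the Node the middleware attaches.
type node struct{ ID string }

// nodeKey is an unexported context key type, so other packages cannot collide with it.
type nodeKey struct{}

// withNode plays the middleware's role: attach the resolved Node to the context.
func withNode(ctx context.Context, n node) context.Context {
	return context.WithValue(ctx, nodeKey{}, n)
}

// nodeFromContext reads the attached Node back; ok is false when none was attached.
func nodeFromContext(ctx context.Context) (node, bool) {
	n, ok := ctx.Value(nodeKey{}).(node)
	return n, ok
}

// pathGate is the defense-in-depth check: it returns (0, "") when the request
// may proceed, otherwise the status and Problem code to refuse with.
func pathGate(ctx context.Context, pathID string) (int, string) {
	n, ok := nodeFromContext(ctx)
	if !ok {
		return http.StatusUnauthorized, "unauthorized"
	}
	if n.ID != pathID {
		return http.StatusForbidden, "node_id_mismatch"
	}
	return 0, ""
}

// gateFor builds a context (with a Node only when attachedID is non-empty)
// and runs the gate against pathID.
func gateFor(attachedID, pathID string) (int, string) {
	ctx := context.Background()
	if attachedID != "" {
		ctx = withNode(ctx, node{ID: attachedID})
	}
	return pathGate(ctx, pathID)
}

func main() {
	fmt.Println(gateFor("", "node-a"))       // middleware bypassed
	fmt.Println(gateFor("node-b", "node-a")) // cross-Node bearer
	fmt.Println(gateFor("node-a", "node-a")) // admitted
}
```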
The capability ingest surface is wired through the shared NSK middleware instance installed by the composition root for both the heartbeat and the capabilities paths — see nskAuthenticatedPathRE in ../../../cmd/plexsphere/app.go. A future Node-facing endpoint that requires NSK admission appends its path suffix to the same regex; there is no per-handler NSK middleware instance.
Diff & event contract
The repository write is the single seam where a per-PUT snapshot becomes (a) a row in plexsphere.node_capability_manifest and (b), when the diff is non-empty, one outbox event in plexsphere.outbox_events. The write is one transaction; the diff, the UPSERT, and the outbox append all commit or all roll back together.
Single-transaction ingest
NodeRepo.RecordCapabilities (../../../internal/identity/tenancy/repo/node_capability_repo.go) runs the following inside runInTx:
1. `SelectCapabilityManifest(node_id)`: hydrate the prior manifest (returns the zero value when no row exists yet).
2. `DiffManifests(prev, next)`: pure function; the constructor on the input has already canonicalised the next manifest.
3. `UpsertCapabilityManifest(...)`: INSERT ... ON CONFLICT (node_id) DO UPDATE; the table is keyed on `node_id` so the UPSERT replaces the prior row in place.
4. If the diff is non-empty, `appendOutbox(...)` writes one `tenancy.NodeCapabilitiesUpdated` event to `plexsphere.outbox_events`. The event payload is the JSON form the `NodeCapabilitiesUpdated.Marshal` method produces, carrying the `(Node, Resource, Project, Domain)` ID quartet, the alphabetised `FieldsChanged` name list, and the `HostKeyChanged` flag.
The DECISION block on RecordCapabilities pins the diff-inside-the-transaction posture: two concurrent PUTs against the same Node would each compute a diff against an already-stale row if the SELECT happened at the handler boundary; the loser's diff would not reflect the winner's UPSERT and the two outbox rows would disagree on which fields actually moved. Folding the SELECT, the diff, the UPSERT, and the outbox append into one transaction guarantees the emitted FieldsChanged matches the row that landed.
Empty diff semantics
An idempotent PUT (the agent re-publishes a manifest that exactly matches the persisted row) yields an empty ManifestDiff. The repository UPSERTs the row anyway (so the updated_at trigger fires) but skips the outbox append — downstream consumers see no event because nothing observable moved. The handler still returns 200 with fields_changed: [] and host_key_changed: false so the agent observes a consistent successful response shape.
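The ordering of the ingest steps, including the "always UPSERT, emit only on a non-empty diff" rule, can be illustrated with an in-memory analogy. This is not the repository code: a mutex stands in for the database transaction (it models the serialisation of concurrent PUTs, not rollback), the manifest is reduced to two fields, and `store`, `event`, and `recordCapabilities` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// manifest is deliberately reduced to two fields for this sketch.
type manifest struct {
	BinaryVersion string
	HostKey       string
}

// event is a simplified stand-in for the outbox payload.
type event struct {
	NodeID         string
	FieldsChanged  []string
	HostKeyChanged bool
}

// store models the manifest table and the outbox behind one lock.
type store struct {
	mu     sync.Mutex
	rows   map[string]manifest
	outbox []event
}

func newStore() *store { return &store{rows: map[string]manifest{}} }

// recordCapabilities runs select, diff, upsert, and outbox append in one
// critical section, so the emitted diff always matches the row that landed.
func (s *store) recordCapabilities(nodeID string, next manifest) []string {
	s.mu.Lock()         // stands in for BEGIN
	defer s.mu.Unlock() // stands in for COMMIT

	prev := s.rows[nodeID] // zero value when no row exists yet

	fields := []string{}
	if prev.BinaryVersion != next.BinaryVersion {
		fields = append(fields, "binary_version")
	}
	hostKeyChanged := prev.HostKey != next.HostKey
	if hostKeyChanged {
		fields = append(fields, "ssh_host_key_fingerprint")
	}
	sort.Strings(fields)

	s.rows[nodeID] = next // the upsert happens even when nothing changed

	if len(fields) > 0 { // the outbox append is skipped on an empty diff
		s.outbox = append(s.outbox, event{nodeID, fields, hostKeyChanged})
	}
	return fields
}

// outboxLen reports how many events have been appended so far.
func (s *store) outboxLen() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.outbox)
}

func main() {
	s := newStore()
	fmt.Println(s.recordCapabilities("node-a", manifest{BinaryVersion: "v1"}))
	fmt.Println(s.recordCapabilities("node-a", manifest{BinaryVersion: "v1"}))
	fmt.Println(s.outboxLen())
}
```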
DiffManifests semantics
The pure-function comparison treats declared_hooks as a set keyed by (name → checksum). A re-order does NOT register as a diff — the constructor already rejects duplicate names so the set keying is unambiguous, and a re-order should not emit a spurious outbox row that the dashboard projector named in the Out of scope section would render as an "agent capabilities changed" alert. The DECISION block on DiffManifests pins the rationale.
A zero-value prev (first PUT, no prior row) is treated as "every field at its zero value". FieldsChanged therefore names every non-zero field on the first manifest a Node publishes — matching the canonical "fields_changed lists every field that moved from zero-value" contract pinned by the event constructor in internal/identity/tenancy/events/events.go.
FieldsChanged is sorted alphabetically so the on-the-wire response and the outbox event payload are byte-stable across runs. HostKeyChanged is a separate flag because the SSH host-key transition is the single field downstream consumers need to branch on without parsing the string list.
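The comparison rules above (set-keyed hooks, zero-value `prev`, alphabetised output, dedicated host-key flag) fit in a short pure function. The sketch below is an illustration under assumed types; `manifest`, `hook`, `manifestDiff`, `diffManifests`, and `sameHookSet` are hypothetical lower-case stand-ins, not the code in `capability_manifest.go`.

```go
package main

import (
	"fmt"
	"sort"
)

type hook struct {
	Name     string
	Checksum [32]byte
}

type manifest struct {
	BinaryVersion         string
	BinaryChecksum        [32]byte
	SSHHostKeyFingerprint string
	DeclaredHooks         []hook
}

type manifestDiff struct {
	FieldsChanged  []string
	HostKeyChanged bool
}

// sameHookSet compares hooks as a name -> checksum set, ignoring order.
// It relies on names being unique within each list (enforced upstream).
func sameHookSet(a, b []hook) bool {
	if len(a) != len(b) {
		return false
	}
	byName := make(map[string][32]byte, len(a))
	for _, h := range a {
		byName[h.Name] = h.Checksum
	}
	for _, h := range b {
		if sum, ok := byName[h.Name]; !ok || sum != h.Checksum {
			return false
		}
	}
	return true
}

// diffManifests names every field that differs between prev and next.
// A zero-value prev therefore yields every non-zero field of next.
func diffManifests(prev, next manifest) manifestDiff {
	var d manifestDiff
	if prev.BinaryVersion != next.BinaryVersion {
		d.FieldsChanged = append(d.FieldsChanged, "binary_version")
	}
	if prev.BinaryChecksum != next.BinaryChecksum {
		d.FieldsChanged = append(d.FieldsChanged, "binary_checksum")
	}
	if prev.SSHHostKeyFingerprint != next.SSHHostKeyFingerprint {
		d.FieldsChanged = append(d.FieldsChanged, "ssh_host_key_fingerprint")
		d.HostKeyChanged = true
	}
	if !sameHookSet(prev.DeclaredHooks, next.DeclaredHooks) {
		d.FieldsChanged = append(d.FieldsChanged, "declared_hooks")
	}
	sort.Strings(d.FieldsChanged) // byte-stable output across runs
	return d
}

func main() {
	a := hook{Name: "post-install", Checksum: [32]byte{1}}
	b := hook{Name: "pre-start", Checksum: [32]byte{2}}
	m := manifest{BinaryVersion: "v1", BinaryChecksum: [32]byte{9}, DeclaredHooks: []hook{a, b}}

	reordered := m
	reordered.DeclaredHooks = []hook{b, a}

	fmt.Println(diffManifests(manifest{}, m).FieldsChanged) // first PUT
	fmt.Println(len(diffManifests(m, reordered).FieldsChanged))
}
```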
Persistence
The schema lives in migration internal/platform/db/migrations/0035_node_capabilities.sql. A single table plexsphere.node_capability_manifest carries one row per Node; the row is overwritten in place when the agent re-publishes.
Table shape
| Column | Type | Notes |
|---|---|---|
| `node_id` | `uuid` | PRIMARY KEY and FOREIGN KEY to `plexsphere.node(id)` ON DELETE CASCADE. The aggregate boundary's "one CapabilityManifest per Node" invariant is enforced by the PK, not a synthetic id + UNIQUE constraint. |
| `binary_version` | `text NOT NULL` | `CHECK (trim(binary_version) <> '')`: the SQL-side mirror of the value-object invariant so a half-formed row cannot survive Hydrate. |
| `binary_checksum` | `bytea NOT NULL` | `CHECK (length(binary_checksum) = 32)`: the same 32-byte raw-SHA-256 length the constructor enforces. |
| `ssh_host_key_fingerprint` | `text NULL` | NULLable because a Node legitimately may not have a host key configured. The downstream event flags transitions on this column separately because SSH host-key changes are security-load-bearing. |
| `declared_hooks` | `jsonb NOT NULL DEFAULT '[]'` | Array of `{"name": "<name>", "checksum_base64": "<b64>"}` objects. The application layer is the authoritative parser; SQL treats the blob as opaque. |
| `created_at` | `timestamptz NOT NULL DEFAULT now()` | Set on first INSERT; never modified. |
| `updated_at` | `timestamptz NOT NULL DEFAULT now()` | Refreshed by an `updated_at` trigger function on every UPDATE. |
Down policy
The migration's Down arm DROPs the table. The DECISION block under the Down section pins the rationale: unlike the reachability and heartbeat tables (migrations 0010 and 0033) that hydrate ongoing state machines, the capability manifest is reconstructible from the next agent PUT — the agent re-publishes its current state on the next meaningful change and the table re-populates from a single round of ingest. A destructive Down is therefore safe and keeps the rollback semantics symmetric with the Up arm.
sqlc queries
The handler reaches Postgres through three sqlc-generated queries declared in internal/platform/db/queries/L0_node_capabilities.sql:
- `UpsertCapabilityManifest`: INSERT ... ON CONFLICT (node_id) DO UPDATE. Returns the persisted row.
- `SelectCapabilityManifest`: SELECT the prior row by `node_id`. Returns `pgx.ErrNoRows` when no row exists yet; the repository translates this into the zero-value `CapabilityManifest` so `DiffManifests` can treat the first PUT uniformly.
- `LoadNodeWithDomainIDs`: hydrates the `(Resource, Project, Domain)` IDs the outbox payload needs alongside the Node id, so the outbox row is emitted without a second round-trip.
Out of scope
The capability ingest surface deliberately does NOT correlate, does NOT compare, and does NOT schedule. The downstream concerns below are owned by named follow-up stories; this slice is the producer side of the data their consumer arms will read.
- Integrity correlator: consumes the `NodeCapabilitiesUpdated` outbox event and correlates the `host_key_changed: true` arm against attestation evidence to raise an integrity alert on a suspicious SSH host-key transition. The correlator owns the alerter and the correlation rules; the ingest surface is responsible only for emitting the flag.
- Artifact Registry: owns the canonical catalogue of agent-binary checksums and hook-payload checksums. A future cross-check arm will compare a freshly-PUT manifest's `binary_checksum` and per-hook `checksum` against the registry and raise a divergence alert. The capability ingest surface intentionally does NOT perform this lookup at PUT time; the registry is a separate aggregate with its own lifecycle and the cross-check belongs in the registry's consumer arm.
- CRD hooks: owns the Kubernetes-native scheduling and execution semantics of the `declared_hooks` advertised through this surface. The capability ingest surface persists the declarations and emits diff events; it does NOT plan, schedule, or execute any hook. The DeclaredHook value object is a metadata record only. Separately, the discovery-only PlexdHook projection in `./hooks.md` projects discovered Kubernetes PlexdHook custom resources onto the distinct `plexd_hooks` manifest field; it too stays strictly read-only, and scheduling and execution remain out of scope.
- Operator-facing read surface: owns the dashboard view of "what is each Node advertising right now?". A read API and the dashboard projection live in the operator-facing read surface; the capability ingest surface is intentionally write-only on the agent path. The dashboard projector consumes the same `NodeCapabilitiesUpdated` outbox event the integrity correlator reads.
Cross-references
- `../../../internal/identity/tenancy/capability_manifest.go`: `CapabilityManifest`, `DeclaredHook`, `ManifestDiff`, and the pure `DiffManifests` function.
- `../../../internal/identity/tenancy/node.go`: `Node.CapabilityManifest()` accessor and `Node.RecordCapabilities` receiver-side ingest.
- `../../../internal/identity/tenancy/events/events.go`: `TypeNodeCapabilitiesUpdated` discriminator, `NodeCapabilitiesUpdated` struct, and `NewNodeCapabilitiesUpdated` constructor.
- `../../../internal/identity/tenancy/repo/node_capability_repo.go`: single-transaction `RecordCapabilities` with diff, UPSERT, and outbox append.
- `../../../internal/transport/http/v1/handlers/capabilities.go`: `PUT /v1/nodes/{id}/capabilities` handler body, the seven-step gate ordering, and the exhaustive Problem code dispatch.
- `../../../internal/transport/http/v1/handlers/capabilities_deps.go`: transport ports, transport sentinels, `CapabilitiesInput` / `CapabilitiesResult`, body cap, and audit relation constants.
- `../../../internal/transport/http/v1/handlers/capabilities_dispatch.go`: fail-closed `501 capabilities_not_provisioned` shim while ports are unwired.
- `../../../internal/platform/db/migrations/0035_node_capabilities.sql`: `plexsphere.node_capability_manifest` table, the `updated_at` trigger, and the destructive-Down DECISION block.
- `../../../api/openapi/plexsphere-v1.yaml`: `PutNodeCapabilities` operation, `CapabilityManifestRequest` / `CapabilityManifestResponse` / `DeclaredHook` schemas, and the exhaustive `code` enum on every Problem response.
- `../identity/tenancy.md`: the Identity & Tenancy bounded-context reference the Node aggregate belongs to.
- `./hooks.md`: the discovery-only PlexdHook projection of discovered Kubernetes PlexdHook custom resources onto the separate `plexd_hooks` manifest field (name, image digest, parameters, timeout, sandbox); it is read-only: plexsphere never writes the resources back, resolves no registry digest, and never schedules or executes a hook.