Integrity Violation Ingest — agent-detected divergences, batched, persisted, and alerted
This document is the authoritative bounded-context reference for the integrity violation ingest surface — the seam between a plexd agent's on-Node detection of "this binary's checksum does not match what I declared, this hook's checksum does not match what I declared, the SSH host key I am presenting does not match what I declared" and the durable evidence trail downstream consumers (the alerter routing and the dashboard projector named in the Out of scope section) observe through the integrity_alert outbox event. It covers the ubiquitous-language pins, the violation taxonomy, the on-the-wire request and response shape, the exhaustive catalogue of Problem codes the handler emits, the NSK authentication model the transport inherits from the capability sibling, the batching contract that turns a per-POST batch into one alert, the persist-and-alert contract that pins the atomicity guarantee, the ReBAC fan-out quartet, the Postgres persistence shape, and the downstream consumers this slice deliberately leaves out of scope.
The integrity ingest surface is a divergence ingestor and an alert emitter — and only that. The agent publishes a batch of observed divergences whenever its on-Node detectors fire; the control plane persists every row as durable evidence and emits a single alert event carrying the per-kind summary and the security-load-bearing quartet. The ingest surface deliberately does NOT correlate against attestation, does NOT compare against the Artifact Registry, does NOT decide remediation policy, and does NOT gate dispatch on the violation; those concerns live in named follow-up stories.
Ubiquitous language
Four terms travel verbatim across the tenancy.IntegrityViolation value object, the POST /v1/nodes/{id}/integrity-violations handler, the plexsphere.node_integrity_violation table, and the integrity_alert outbox event. Internal code never paraphrases them; documentation and Problem detail strings adopt the exact spelling below.
| Term | Definition | Code anchor |
|---|---|---|
| IntegrityViolation | The value-object record of a single observed divergence — one of binary_checksum, hook_checksum, or ssh_host_key. Carries the ArtifactID, the observed digest or fingerprint, the optional expected digest or fingerprint, and the on-Node DetectedBy discriminator. Immutable: the receiver-side Node.RecordIntegrityViolations returns a canonicalised batch rather than mutating the prior. | internal/identity/tenancy/integrity_violation.go (IntegrityViolation) |
| DetectedBy | The closed enumeration of on-Node detectors that surface a violation: startup_scan (the agent's boot-time integrity sweep), inotify (a filesystem watcher firing on a post-boot artifact change), and pre_dispatch (a just-in-time verification immediately before the agent acts on the artifact). The value-object constructor rejects any other token at the aggregate boundary. | internal/identity/tenancy/integrity_violation.go (IntegrityViolationDetectedBy*) |
| IntegrityAlert | The closed outbox event emitted exactly once per accepted batch. Carries the (Node, Resource, Project, Domain) ID quartet (denormalised so downstream consumers fan out per-Domain without an extra lookup), the ViolationCount, the sorted unique Kinds list, and the operator-facing RecommendedAction. Stable event-type discriminator integrity_alert. | internal/identity/tenancy/events/events.go (TypeIntegrityAlert, IntegrityAlert) |
| NSK (Node Secret Key) | The per-Node bearer token the agent presents on every authenticated request. The NSK middleware resolves the bearer plaintext to a Node, asserts the resolved Node id equals the URL path id, and attaches the Node to the request context. The integrity ingest handler reads the attached Node out of the context and never re-validates the NSK itself. Same seam the Capability Manifest Ingest inherits. | internal/identity/authn/middleware/ |
The translation is one-directional: the agent emits a violation batch, the handler returns an accepted_at + violation_count receipt on the response, and the repository emits a single integrity_alert outbox event per accepted batch. The agent never reads the alert back through the ingest surface; the alerter routing and the dashboard projector named in the Out of scope section consume the outbox event downstream.
Violation taxonomy
The closed taxonomy spans three artifact kinds crossed with three detection sources. The aggregate boundary rejects any combination outside the closed product set; the database CHECK constraint pins the same closed set so a hand-rolled INSERT cannot smuggle an unknown discriminator through.
Kinds — what diverged
| `kind` literal | Artifact | Columns populated |
|---|---|---|
| `binary_checksum` | The running agent binary's SHA-256 digest. The detector compares the on-disk binary against the digest the agent's declared `CapabilityManifest.BinaryChecksum` advertises. | `observed_checksum` (required, 32 bytes), optional `expected_checksum` (0 or 32 bytes). Fingerprint columns MUST be empty. |
| `hook_checksum` | A declared hook payload's SHA-256 digest. The detector compares the on-disk hook against the digest the agent's `CapabilityManifest.DeclaredHooks` advertises. `ArtifactID` carries the hook name. | `observed_checksum` (required, 32 bytes), optional `expected_checksum` (0 or 32 bytes). Fingerprint columns MUST be empty. |
| `ssh_host_key` | The SSH host-key the Node is currently presenting. The detector compares the running sshd's `SHA256:<base64>` host-key fingerprint against the fingerprint the agent's `CapabilityManifest.SSHHostKeyFingerprint` advertises. | `observed_fingerprint` (required, matching `^SHA256:[A-Za-z0-9+/]+={0,2}$`), optional `expected_fingerprint` (same shape). Checksum columns MUST be empty. |
Sources — who reported it
| `detected_by` literal | Detector | Use case |
|---|---|---|
| `startup_scan` | The agent's boot-time integrity sweep that walks every declared artifact and digests it before the agent starts accepting traffic. | Catches tampering that happened while the agent was stopped. |
| `inotify` | The kernel inotify (or platform equivalent) watcher the agent registers on every declared artifact; fires on a post-boot modification. | Catches tampering that happens while the agent is running. |
| `pre_dispatch` | A just-in-time verification the agent runs immediately before it acts on a declared artifact (executes a hook, advertises a host key, restarts). | Catches tampering that happens in the window between the prior watcher event and the dispatch. |
The DECISION block on the value-object constructor pins the kind/column-pair invariant: a binary_checksum row carrying a fingerprint is rejected with integrity_violation_kind_mismatch rather than the more generic integrity_violation_checksum_invalid, so a mixed-column entry surfaces as the precise misuse it is rather than collapsing onto a vaguer arm.
Wire shape
The transport surface is one HTTP operation: POST /v1/nodes/{id}/integrity-violations. The path id is a UUIDv7 Node id; the NSK presented in Authorization: Bearer MUST resolve to that same Node id (a cross-Node replay surfaces as 403 node_id_mismatch).
Request
The handler decodes the body with DisallowUnknownFields after capping the body at 32 KiB through http.MaxBytesReader. Each decoded element is then canonicalised through tenancy.NewIntegrityViolation; the constructor is the single seam that enforces every per-entry invariant. The batch-level invariants (non-empty, at most 128 entries) are enforced by Node.RecordIntegrityViolations.
```jsonc
{
  "violations": [
    {
      "kind": "binary_checksum",
      "detected_by": "startup_scan",
      "artifact_id": "plexd",
      "observed_checksum": "<base64 of 32 bytes>",
      "expected_checksum": "<base64 of 32 bytes>" // optional
    },
    {
      "kind": "ssh_host_key",
      "detected_by": "pre_dispatch",
      "artifact_id": "ssh_host_ed25519_key",
      "observed_fingerprint": "SHA256:<base64>",
      "expected_fingerprint": "SHA256:<base64>" // optional
    }
  ]
}
```

Field rules:
- `violations` (array, required): at least 1, at most `IntegrityViolationsMaxBatch` (128) entries. Empty array is rejected with `400 integrity_violations_empty`; over-cap is rejected with `400 integrity_violations_too_many`.
- `kind` (string, required): one of `binary_checksum`, `hook_checksum`, `ssh_host_key`. Any other value is rejected with `400 integrity_violation_kind_invalid`.
- `detected_by` (string, required): one of `startup_scan`, `inotify`, `pre_dispatch`. Any other value is rejected with `400 integrity_violation_detected_by_invalid`.
- `artifact_id` (string, required): the stable label of the affected artifact (hook name, binary path label, host-key file label). Empty and whitespace-only values are rejected with `400 integrity_violation_artifact_id_empty`.
- `observed_checksum` (string, base64, required for checksum kinds, forbidden for `ssh_host_key`): raw 32-byte SHA-256 digest. Any other length is rejected with `400 integrity_violation_checksum_invalid`.
- `expected_checksum` (string, base64, optional for checksum kinds, forbidden for `ssh_host_key`): raw 32-byte SHA-256 digest when present. Same length rule.
- `observed_fingerprint` (string, required for `ssh_host_key`, forbidden for checksum kinds): must match `^SHA256:[A-Za-z0-9+/]+={0,2}$`. Any other shape is rejected with `400 integrity_violation_host_key_fingerprint_invalid`.
- `expected_fingerprint` (string, optional for `ssh_host_key`, forbidden for checksum kinds): same pattern rule.
A checksum-kind row carrying any fingerprint field, or an ssh_host_key row carrying any checksum field, is rejected with 400 integrity_violation_kind_mismatch — the dedicated branch fires BEFORE the generic checksum / fingerprint arms so a mixed-column entry surfaces as the precise misuse it is.
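The ordering of the per-entry checks can be illustrated with a minimal Go sketch. It is not the real `tenancy.NewIntegrityViolation` constructor: the `entry` type, the `validate` function, and the `err*` variables are hypothetical stand-ins, and the relative order of the kind, `detected_by`, and `artifact_id` checks is an assumption. The one property taken from the text above is that the kind/column mismatch check runs before the checksum and fingerprint shape checks.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical stand-ins for the tenancy sentinels; the messages reuse the Problem codes.
var (
	errKind         = errors.New("integrity_violation_kind_invalid")
	errDetectedBy   = errors.New("integrity_violation_detected_by_invalid")
	errArtifactID   = errors.New("integrity_violation_artifact_id_empty")
	errKindMismatch = errors.New("integrity_violation_kind_mismatch")
	errChecksum     = errors.New("integrity_violation_checksum_invalid")
	errFingerprint  = errors.New("integrity_violation_host_key_fingerprint_invalid")
)

// Pattern quoted from the field rules.
var fingerprintRE = regexp.MustCompile(`^SHA256:[A-Za-z0-9+/]+={0,2}$`)

// entry is an illustrative decoded request element (checksums already base64-decoded).
type entry struct {
	Kind, DetectedBy, ArtifactID             string
	ObservedChecksum, ExpectedChecksum       []byte
	ObservedFingerprint, ExpectedFingerprint string
}

func validate(e entry) error {
	checksumKind := e.Kind == "binary_checksum" || e.Kind == "hook_checksum"
	if !checksumKind && e.Kind != "ssh_host_key" {
		return errKind
	}
	switch e.DetectedBy {
	case "startup_scan", "inotify", "pre_dispatch":
	default:
		return errDetectedBy
	}
	if strings.TrimSpace(e.ArtifactID) == "" {
		return errArtifactID
	}
	hasChecksum := len(e.ObservedChecksum)+len(e.ExpectedChecksum) > 0
	hasFingerprint := e.ObservedFingerprint != "" || e.ExpectedFingerprint != ""
	// The mismatch arm runs before the shape checks so a mixed-column entry
	// is reported as the precise misuse.
	if (checksumKind && hasFingerprint) || (!checksumKind && hasChecksum) {
		return errKindMismatch
	}
	if checksumKind {
		if len(e.ObservedChecksum) != 32 {
			return errChecksum
		}
		if n := len(e.ExpectedChecksum); n != 0 && n != 32 {
			return errChecksum
		}
		return nil
	}
	if !fingerprintRE.MatchString(e.ObservedFingerprint) {
		return errFingerprint
	}
	if e.ExpectedFingerprint != "" && !fingerprintRE.MatchString(e.ExpectedFingerprint) {
		return errFingerprint
	}
	return nil
}

func main() {
	// A checksum-kind entry with a short digest AND a stray fingerprint.
	mixed := entry{Kind: "binary_checksum", DetectedBy: "inotify", ArtifactID: "plexd",
		ObservedChecksum: make([]byte, 31), ObservedFingerprint: "SHA256:AAAA"}
	fmt.Println(validate(mixed)) // the mismatch arm wins over the bad digest length
}
```

In this sketch the mixed entry reports `integrity_violation_kind_mismatch` even though its digest is also the wrong length, which mirrors the precedence described above.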
Response
```jsonc
{
  "accepted_at": "2026-05-28T10:15:30.123Z",
  "violation_count": 2
}
```

`accepted_at` is the server commit timestamp the handler stamped on the request (UTC, RFC 3339). `violation_count` is the persisted-row count, which equals the input batch size on a successful write. The handler is all-or-nothing: every row in the batch lands together with the alert, or no row and no alert lands. The response carries no per-row identifiers; the operator UI consumes those from the dedicated audit and integrity surfaces.
The success status is 202 Accepted rather than 200 OK because the alert downstream of the persist is consumed asynchronously by the alerter routing named in the Out of scope section; the synchronous receipt confirms the evidence row landed and the alert was enqueued, not that any operator-visible action has happened yet.
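From the agent's side, the request and receipt shapes above might be handled as in the following sketch. The struct and function names (`violation`, `request`, `receipt`, `buildRequest`, `parseReceipt`) are hypothetical, and the sketch assumes the wire uses standard padded base64 for checksums, which is what Go's `encoding/json` produces for `[]byte` fields; the source only says "base64".

```go
package main

import (
	"crypto/sha256"
	"encoding/json"
	"fmt"
	"time"
)

// violation mirrors the documented request fields; []byte marshals as standard base64.
type violation struct {
	Kind                string `json:"kind"`
	DetectedBy          string `json:"detected_by"`
	ArtifactID          string `json:"artifact_id"`
	ObservedChecksum    []byte `json:"observed_checksum,omitempty"`
	ExpectedChecksum    []byte `json:"expected_checksum,omitempty"`
	ObservedFingerprint string `json:"observed_fingerprint,omitempty"`
	ExpectedFingerprint string `json:"expected_fingerprint,omitempty"`
}

type request struct {
	Violations []violation `json:"violations"`
}

// receipt mirrors the documented 202 response body.
type receipt struct {
	AcceptedAt     time.Time `json:"accepted_at"`
	ViolationCount int       `json:"violation_count"`
}

// buildRequest digests the observed binary and wraps it in a one-entry batch.
func buildRequest(binary []byte) ([]byte, error) {
	sum := sha256.Sum256(binary)
	return json.Marshal(request{Violations: []violation{{
		Kind:             "binary_checksum",
		DetectedBy:       "startup_scan",
		ArtifactID:       "plexd",
		ObservedChecksum: sum[:],
	}}})
}

// parseReceipt decodes the accepted_at + violation_count receipt.
func parseReceipt(body []byte) (receipt, error) {
	var r receipt
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	body, err := buildRequest([]byte("example agent binary"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))

	r, err := parseReceipt([]byte(`{"accepted_at":"2026-05-28T10:15:30.123Z","violation_count":2}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(r.AcceptedAt.Format(time.RFC3339Nano), r.ViolationCount)
}
```

The `omitempty` tags keep the forbidden column pair off the wire entirely, so a checksum-kind entry never carries an empty fingerprint field.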
Error code catalog
Every reachable Problem code from integrity_violations.go is enumerated below; generated TypeScript / Go clients can exhaustively switch on code without a fall-through arm. The handler's gate ordering (authn → path-id → body cap → body decode → per-entry invariant → batch canonicalisation → aggregate write) determines which arm fires; earlier gates are cheaper than later ones so the cheapest rejection path always short-circuits the more expensive ones.
| HTTP status | code literal | Trigger |
|---|---|---|
| 400 | malformed_integrity_violations_request | Body is not a valid IntegrityViolationsRequest envelope — invalid JSON, unknown field (the decoder runs DisallowUnknownFields), or missing required field. |
| 400 | integrity_violations_empty | The violations array is empty after decoding (tenancy.ErrIntegrityViolationsEmpty). |
| 400 | integrity_violations_too_many | The violations array exceeds IntegrityViolationsMaxBatch (128) (tenancy.ErrIntegrityViolationsTooMany). |
| 400 | integrity_violation_kind_invalid | An entry's kind is not in the closed set {binary_checksum, hook_checksum, ssh_host_key} (tenancy.ErrIntegrityViolationKind). |
| 400 | integrity_violation_detected_by_invalid | An entry's detected_by is not in the closed set {startup_scan, inotify, pre_dispatch} (tenancy.ErrIntegrityViolationDetectedBy). |
| 400 | integrity_violation_artifact_id_empty | An entry's artifact_id is missing or empty after trimming whitespace (tenancy.ErrIntegrityViolationArtifactID). |
| 400 | integrity_violation_kind_mismatch | A binary_checksum / hook_checksum entry carries a fingerprint field, or an ssh_host_key entry carries a checksum field (tenancy.ErrIntegrityViolationKindMismatch). The branch fires BEFORE the checksum / fingerprint arms so the precise misuse surfaces. |
| 400 | integrity_violation_checksum_invalid | An entry's observed_checksum or expected_checksum is missing or does not decode to exactly 32 bytes (tenancy.ErrIntegrityViolationChecksum). |
| 400 | integrity_violation_host_key_fingerprint_invalid | An entry's observed_fingerprint or expected_fingerprint is non-empty and does not match SHA256:<base64> (tenancy.ErrIntegrityViolationHostKeyFingerprint). |
| 401 | unauthorized | The NSK middleware did not attach a Node to the request context. This is the handler's defensive arm: in production the middleware runs upstream, so a missing Node is unreachable on a correctly mounted route. |
| 403 | node_id_mismatch | The defense-in-depth path-id gate: the resolved NSK Node id does not equal the URL path id. The NSK middleware runs the same check upstream; this handler-side double-check protects against a misconfigured router that mounts the handler without the middleware. Audit row stamps the dedicated node_integrity_violations.path_gate relation so dashboards can split admission versus ingestion outcomes. |
| 404 | integrity_violations_node_not_found | The recorder reported ErrIntegrityViolationsNodeNotFound — the Node row was concurrently deleted between the NSK middleware's admission and the aggregate write. |
| 413 | integrity_violations_body_too_large | Body exceeds the 32 KiB IntegrityViolationsMaxBodyBytes ceiling. http.MaxBytesReader caps the bytes before the JSON decoder is ever invoked. |
| 501 | integrity_violations_not_provisioned | The deferred-wiring posture: one or more of IntegrityViolationsRecorder, NSKResolver, or NodeRepo is nil on the Handlers struct. The dispatch shim in integrity_violations_dispatch.go fails closed so log scrapers can alert on the deferred-wiring state. |
Every reachable 4xx and 5xx arm emits an audit row through the shared AuditSink. Ingestion-phase entries (malformed body, invariant violation, recorder failure, granted) stamp the node_integrity_violations.record relation; the defense-in-depth path-id gate stamps node_integrity_violations.path_gate. Audit dashboards filter on the relation to detect "middleware was bypassed but the handler caught it" without conflating it with ingestion-phase entries.
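One way to picture the sentinel-to-Problem dispatch for the recorder-side arms is the sketch below. The `problem` type, `problemFor`, and the local `err*` variables are hypothetical stand-ins for the `tenancy` and transport sentinels named in the catalogue, and only a few arms are shown. The fallback `internal_error` code is a placeholder invented for the sketch; the catalogue above does not define it.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// Hypothetical local stand-ins for the sentinels named in the catalogue.
var (
	errEmpty        = errors.New("integrity violations: empty batch")
	errTooMany      = errors.New("integrity violations: too many entries")
	errKindMismatch = errors.New("integrity violation: kind mismatch")
	errNodeNotFound = errors.New("integrity violations: node not found")
)

type problem struct {
	Status int
	Code   string
}

// problemFor matches with errors.Is so wrapped sentinels still reach the right arm.
func problemFor(err error) problem {
	switch {
	case errors.Is(err, errEmpty):
		return problem{http.StatusBadRequest, "integrity_violations_empty"}
	case errors.Is(err, errTooMany):
		return problem{http.StatusBadRequest, "integrity_violations_too_many"}
	case errors.Is(err, errKindMismatch):
		return problem{http.StatusBadRequest, "integrity_violation_kind_mismatch"}
	case errors.Is(err, errNodeNotFound):
		return problem{http.StatusNotFound, "integrity_violations_node_not_found"}
	default:
		// Placeholder only: not a code from the catalogue.
		return problem{http.StatusInternalServerError, "internal_error"}
	}
}

func main() {
	wrapped := fmt.Errorf("record batch: %w", errNodeNotFound)
	p := problemFor(wrapped)
	fmt.Println(p.Status, p.Code)
}
```

Matching on sentinels rather than on error strings is what lets the repository wrap errors with context without changing the code the client sees.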
NSK authentication model
The integrity ingest surface inherits the NSK authentication seam from its capability sibling — the same per-Node bearer credential governs the capabilities, heartbeat, and integrity endpoints. There is no ReBAC participation on the ingest path: the operator-facing identity-and-relation authorisation surface lives in ../../../internal/authz/ and is documented under ../identity/rebac.md; the agent-facing integrity ingest is a per-Node credential surface keyed on the NSK plaintext alone.
The authentication contract is structural and two-step:
- Middleware admission. The NSK middleware (`../../../internal/identity/authn/middleware/`) resolves the `Authorization: Bearer` plaintext to a Node, asserts the resolved Node id equals the URL path `id`, and attaches the Node to the request context. Missing, malformed, or revoked credentials surface as `401 nsk_revoked`; a cross-Node bearer surfaces as `403 node_id_mismatch`. The integrity handler never re-validates the NSK itself.
- Handler-side double-check. The handler reads the attached Node off the context with `middleware.FromContextNode` and runs a defense-in-depth path-id comparison through `middleware.MatchesPathID`. A mismatch fires the `node_integrity_violations.path_gate` audit relation and refuses with `403 node_id_mismatch`. This arm is unreachable when the middleware is mounted correctly; it protects against misconfigured routes that bypass the middleware.
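The handler-side half of the two steps above can be sketched as follows. The `node` type, `newRequestContext`, `nodeFromContext`, and `admit` are hypothetical; they only imitate the roles of `middleware.FromContextNode` and `middleware.MatchesPathID`, whose real signatures are not shown in this document.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
)

type node struct{ ID string }

// nodeCtxKey is an unexported key type so no other package can collide with it.
type nodeCtxKey struct{}

// newRequestContext imitates what the NSK middleware leaves behind:
// a context carrying the resolved Node, or a bare context when it never ran.
func newRequestContext(resolved *node) context.Context {
	ctx := context.Background()
	if resolved != nil {
		ctx = context.WithValue(ctx, nodeCtxKey{}, *resolved)
	}
	return ctx
}

func nodeFromContext(ctx context.Context) (node, bool) {
	n, ok := ctx.Value(nodeCtxKey{}).(node)
	return n, ok
}

// admit returns (0, "") when the request may proceed, else the Problem status and code.
func admit(ctx context.Context, pathID string) (int, string) {
	n, ok := nodeFromContext(ctx)
	if !ok {
		// Middleware was not mounted: the defensive 401 arm.
		return http.StatusUnauthorized, "unauthorized"
	}
	if n.ID != pathID {
		// Defense-in-depth path-id gate.
		return http.StatusForbidden, "node_id_mismatch"
	}
	return 0, ""
}

func main() {
	ctx := newRequestContext(&node{ID: "node-a"})
	fmt.Println(admit(ctx, "node-a"))
	fmt.Println(admit(ctx, "node-b"))
	fmt.Println(admit(newRequestContext(nil), "node-a"))
}
```

Neither arm inspects the bearer token; the sketch relies entirely on what the middleware attached, which matches the statement that the handler never re-validates the NSK itself.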
The integrity ingest surface is wired through the shared NSK middleware instance the composition root installs for the capabilities, heartbeat, and integrity paths together — see nskAuthenticatedPathRE in ../../../cmd/plexsphere/app.go. A future Node-facing endpoint that requires NSK admission appends its path suffix to the same regex; there is no per-handler NSK middleware instance.
Batching contract
The repository write is the single seam where a per-POST batch becomes (a) N rows in plexsphere.node_integrity_violation and (b) one row in plexsphere.outbox_events. The write is one transaction; the per-row INSERT loop and the outbox append all commit or all roll back together.
Cap
The batch carries at most IntegrityViolationsMaxBatch (128) entries; the cap is enforced at the aggregate boundary (Node.RecordIntegrityViolations → ErrIntegrityViolationsTooMany) rather than the transport so every code path agrees on the limit. The transport's body cap (IntegrityViolationsMaxBodyBytes = 32 KiB) is the cheaper outer ring: each entry is at most ~200 bytes (enum discriminators, base64 checksums, a short artifact_id), so a full batch of 128 entries comes to roughly 128 × 200 B = 25 KiB and fits the byte cap with headroom.
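The outer ring can be sketched with the standard library alone. The `decode`, `classify`, and `oversizedBody` helpers and the `violationsRequest` type are hypothetical; the sketch only shows how `http.MaxBytesReader` and `DisallowUnknownFields` can be told apart when the decode fails, assuming Go 1.19 or later for `*http.MaxBytesError`.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

const maxBodyBytes = 32 << 10 // 32 KiB, the documented body cap

// violationsRequest keeps entries raw; per-entry validation happens later.
type violationsRequest struct {
	Violations []json.RawMessage `json:"violations"`
}

// decode caps the body, decodes strictly, and returns a Problem code ("" on success).
func decode(w http.ResponseWriter, r *http.Request) (violationsRequest, string) {
	r.Body = http.MaxBytesReader(w, r.Body, maxBodyBytes)
	dec := json.NewDecoder(r.Body)
	dec.DisallowUnknownFields()
	var req violationsRequest
	if err := dec.Decode(&req); err != nil {
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			return req, "integrity_violations_body_too_large"
		}
		return req, "malformed_integrity_violations_request"
	}
	return req, ""
}

// classify runs decode against an in-memory request (demo helper).
func classify(body string) string {
	r := httptest.NewRequest(http.MethodPost, "/v1/nodes/example/integrity-violations", strings.NewReader(body))
	_, code := decode(httptest.NewRecorder(), r)
	return code
}

// oversizedBody builds syntactically valid JSON well past the cap (demo helper).
func oversizedBody() string {
	return `{"violations":[` + strings.Repeat(`{},`, maxBodyBytes) + `{}]}`
}

func main() {
	fmt.Printf("%q\n", classify(`{"violations":[{"kind":"binary_checksum"}]}`))
	fmt.Printf("%q\n", classify(`{"violations":[],"extra":1}`))
	fmt.Printf("%q\n", classify(oversizedBody()))
}
```

Because the reader fails once the cap is crossed, an oversized body is rejected after at most 32 KiB has been read, regardless of how many entries it claims to carry.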
Atomic persist + emit
NodeRepo.RecordIntegrityViolations (../../../internal/identity/tenancy/repo/node_integrity_violation_repo.go) runs the following inside runInTx:
- `LoadNodeWithDomainIDs(node_id)`: the "does this Node still exist" gate. A `pgx.ErrNoRows` or a `23503` FK violation on the subsequent INSERT both surface as `ErrIntegrityViolationsNodeNotFound` (the Node was deleted out from under the call).
- Per-row `InsertNodeIntegrityViolation(...)`: the repository allocates a UUIDv7 id per row and INSERTs every violation in the batch.
- `appendOutbox(...)`: writes one `integrity_alert` event to `plexsphere.outbox_events`. The event payload is the JSON form the `IntegrityAlert.Marshal` method produces, carrying the `(Node, Resource, Project, Domain)` ID quartet, the `ViolationCount`, the sorted unique `Kinds` list, and the `RecommendedAction`.
The DECISION block on RecordIntegrityViolations pins the single-transaction posture: a successful violation persist without a corresponding alert would silently swallow the integrity signal, and a successful alert without persisted rows would leave the operator UI with no evidence to drill into. Folding the per-row INSERT loop and the outbox append into one transaction guarantees the alert and the evidence land together or neither lands.
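The all-or-nothing property can be shown with a toy model. The sketch below is not the repository code and does not touch Postgres: `store`, `tx`, `runInTx`, and `record` are hypothetical in-memory stand-ins in which writes are staged and only reach the store when the callback returns nil.

```go
package main

import (
	"errors"
	"fmt"
)

// store is an in-memory stand-in for the two tables.
type store struct {
	violations []string // stands in for plexsphere.node_integrity_violation
	outbox     []string // stands in for plexsphere.outbox_events
}

// tx stages writes; nothing is visible in the store until commit.
type tx struct {
	violations []string
	outbox     []string
}

// runInTx commits the staged writes only when fn returns nil.
func (s *store) runInTx(fn func(*tx) error) error {
	t := &tx{}
	if err := fn(t); err != nil {
		return err // rollback: staged writes are simply dropped
	}
	s.violations = append(s.violations, t.violations...)
	s.outbox = append(s.outbox, t.outbox...)
	return nil
}

var errNodeNotFound = errors.New("integrity violations: node not found")

// record follows the three documented steps: existence gate, per-row insert, one outbox append.
// failOutbox simulates a failure on the last step.
func record(s *store, nodeExists bool, artifactIDs []string, failOutbox bool) error {
	return s.runInTx(func(t *tx) error {
		if !nodeExists {
			return errNodeNotFound
		}
		for _, id := range artifactIDs {
			t.violations = append(t.violations, id)
		}
		if failOutbox {
			return errors.New("outbox append failed")
		}
		t.outbox = append(t.outbox, "integrity_alert")
		return nil
	})
}

func main() {
	s := &store{}
	err := record(s, true, []string{"plexd", "pre-start"}, true)
	fmt.Println(err != nil, len(s.violations), len(s.outbox))

	err = record(s, true, []string{"plexd", "pre-start"}, false)
	fmt.Println(err != nil, len(s.violations), len(s.outbox))
}
```

In the failing run neither the rows nor the alert become visible; in the successful run two rows and exactly one alert land together.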
One alert per batch
Every accepted batch emits exactly one integrity_alert event regardless of batch size. A 5-violation batch mixing binary_checksum and hook_checksum produces five rows in plexsphere.node_integrity_violation and one row in the outbox; the alert payload's kinds is the sorted unique slice ["binary_checksum", "hook_checksum"] and violation_count is 5.
The DECISION block on IntegrityViolationBatch.PerKindCounters pins the in-aggregate counter computation: the aggregate already walks the batch to defensive-copy it, so computing the per-kind counters in the same pass keeps the canonicalisation atomic and means the outbox event constructor consumes a ready-made summary rather than re-deriving it from the row list.
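A single-pass summary of this kind might look like the sketch below. The `summary` type and `summarise` function are hypothetical and are not the `IntegrityViolationBatch.PerKindCounters` implementation; they only show how one walk over the batch can yield the count, the per-kind counters, and the sorted unique kinds together.

```go
package main

import (
	"fmt"
	"sort"
)

// summary holds what the alert payload needs from the batch.
type summary struct {
	ViolationCount int
	Kinds          []string       // sorted, unique
	PerKind        map[string]int // per-kind counters
}

// summarise walks the batch once, then sorts the distinct kinds.
func summarise(kinds []string) summary {
	perKind := make(map[string]int)
	for _, k := range kinds {
		perKind[k]++
	}
	unique := make([]string, 0, len(perKind))
	for k := range perKind {
		unique = append(unique, k)
	}
	sort.Strings(unique)
	return summary{ViolationCount: len(kinds), Kinds: unique, PerKind: perKind}
}

func main() {
	// The five-violation example from the text: a mix of two kinds.
	batch := []string{
		"hook_checksum", "binary_checksum", "hook_checksum",
		"binary_checksum", "binary_checksum",
	}
	s := summarise(batch)
	fmt.Println(s.ViolationCount, s.Kinds, s.PerKind["binary_checksum"], s.PerKind["hook_checksum"])
}
```

Sorting the distinct kinds makes the payload deterministic for a given batch, independent of the order in which the agent listed the entries.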
Recommended action
The alert payload's recommended_action is the operator-facing remediation hint. The handler stamps a constant "reprovision" at admission time:
DECISION: the recommended action is a constant rather than a configurable per-batch field on the request body. Alternative considered: accept a `recommended_action` field in `IntegrityViolationsRequest` so the agent could carry its own guidance. REJECTED because the recommended action is an operational decision that belongs on the control plane (the agent reports observations; the control plane dispatches remediation). Hard-coding `"reprovision"` keeps the transport surface stable; a future change to a richer remediation taxonomy can introduce a per-kind lookup without breaking the existing wire contract.
ReBAC scoping
The IntegrityAlert event payload carries the (Node, Resource, Project, Domain) ID quartet so the operator-facing alerter can fan out per-Domain without an extra database lookup. The quartet's role is twofold:
- Per-Domain isolation. The alerter routing iterates per `domain_id` so an integrity-alert subscriber bound to Domain A cannot observe Domain B's alerts even if the routing topology accidentally fans both onto a shared channel. The Domain id on the event is the authoritative isolation key.
- ReBAC fan-out via `LookupSubjects`. The operator surface resolves "who is allowed to see this Node's alert?" through the shared authz package's `LookupSubjects` API keyed on the `resource_id` (or `project_id`, depending on the operator relation the dashboard projector queries). The event payload carries every id the authz layer needs so the projector never has to JOIN back to a row that may have been deleted between ingest and projection.
The integrity ingest surface deliberately does NOT participate in ReBAC on the ingest path itself — admission is the per-Node NSK plaintext alone. ReBAC governs the operator-facing read surface that consumes the alert; the agent-facing write surface is authenticated by the per-Node credential the agent already holds.
The DECISION block on IntegrityViolationBatch pins the quartet's denormalisation onto the batch rather than re-resolution at the repository boundary: the Node aggregate already guarantees the four ids are non-zero (see Node invariants), so the batch can carry them without a second SELECT and the batch stays self-contained for the outbox event payload.
Persistence
The schema lives in migration internal/platform/db/migrations/0036_node_integrity_violations.sql. A single table plexsphere.node_integrity_violation carries one row per reported violation; rows are append-only and ordered for operator-facing listings by (node_id, reported_at DESC, id DESC).
Table shape
| Column | Type | Notes |
|---|---|---|
| `id` | `uuid` | PRIMARY KEY. Application-allocated UUIDv7 the repository sets via `tenancy.NewID()` at INSERT time so ordering by id stays stable across writers without a central sequence. The DECISION block on the migration enumerates the rejected alternatives (composite PK, BIGSERIAL surrogate). |
| `node_id` | `uuid NOT NULL` | FOREIGN KEY to `plexsphere.nodes(id)` ON DELETE CASCADE. A Node delete atomically clears its violation history in the same transaction. |
| `kind` | `text NOT NULL` | `CHECK (kind IN ('binary_checksum', 'hook_checksum', 'ssh_host_key'))`: the SQL-side mirror of the value-object closed set so a hand-rolled INSERT cannot smuggle an unknown discriminator through. |
| `artifact_id` | `text NOT NULL` | `CHECK (length(trim(artifact_id)) > 0)`: the SQL-side mirror of the value-object invariant. |
| `observed_checksum` | `bytea NULL` | `CHECK (observed_checksum IS NULL OR length(observed_checksum) = 32)`: raw 32-byte SHA-256 digest. NULL for the `ssh_host_key` kind. |
| `expected_checksum` | `bytea NULL` | Same length CHECK. NULL when the agent did not know the expected digest, or for the `ssh_host_key` kind. |
| `observed_fingerprint` | `text NULL` | NULL for checksum kinds. The aggregate boundary rejects malformed `SHA256:<base64>` strings before the row is built, so no regex CHECK is needed at the SQL layer. |
| `expected_fingerprint` | `text NULL` | Same rule. |
| `detected_by` | `text NOT NULL` | `CHECK (detected_by IN ('startup_scan', 'inotify', 'pre_dispatch'))`. |
| `reported_at` | `timestamptz NOT NULL` | The server-stamped admission instant. |
| `created_at` | `timestamptz NOT NULL DEFAULT now()` | Set on INSERT. |
The compound index node_integrity_violation_node_id_reported_at_idx on (node_id, reported_at DESC, id DESC) lets the per-Node operator-facing listing endpoint walk the most-recent-first range on the index without a sort step.
Down policy
The migration's Down arm DROPs the index and the table. The DECISION block under the Down section pins the rationale: the integrity-alert evidence chain that lives in plexsphere.outbox_events (the alert event the repository appends inside the same transaction as the violation INSERTs) is anchored to its own aggregate identifier and survives this DROP. The per-violation detail rows are reconstructible from the next agent report after a downgrade-and-reup cycle, so dropping them is not a regulatory regression and the rollback semantics stay symmetric with the Up arm.
sqlc queries
The handler reaches Postgres through six sqlc-generated queries declared in `internal/platform/db/queries/M0_node_integrity_violations.sql`:

- `InsertNodeIntegrityViolation`: INSERT one row, returning the persisted row. The repository runs this in a loop inside the transaction so a single batch lands atomically.
- `ListNodeIntegrityViolationsByNodeID`: paginated listing in `(reported_at DESC, id DESC)` order with a `LIMIT` parameter; the legacy per-Node read path consumes this query.
- `GetNodeIntegrityViolation`: single-row lookup by `id` for the per-Node detail view.
- `ListIntegrityViolationsPage`: the Phase-3 cross-Domain keyset listing (see Triage lifecycle and operator read service). JOINs `node_integrity_violation -> nodes -> resources` to surface `domain_id` + `project_id`, applies optional `domain_id` / `project_id` / `node_id` / `kind` / `status` filters, and orders `(reported_at DESC, id DESC)` with a `limit+1` peek-ahead.
- `AcknowledgeNodeIntegrityViolation`: the `:execrows` UPDATE that flips a row `open -> acknowledged` under a `WHERE id = $1 AND status = 'open'` guard; a 0-rows-affected result lets the repository distinguish a missing row from an already-decided one.
- `GetIntegrityViolationTriageRow`: id-keyed read of the full triage projection (same JOIN as the list query) backing the Acknowledge re-read and the 0-rows-affected probe.
Triage lifecycle and operator read service
The Phase-3 operator surface adds a triage lifecycle on top of the append-only ingest rows described above. Where the ingest surface is write-only on the agent path, the triage surface is the operator-facing read-and-acknowledge seam: it lists persisted violations across Domains under filters and keyset pagination, and it records an operator's acknowledgement of a divergence.
The migration internal/platform/db/migrations/0040_node_integrity_violation_triage.sql adds four columns to plexsphere.node_integrity_violation: status (text NOT NULL DEFAULT 'open', CHECK-constrained to the closed set {open, acknowledged, resolved}), and the NULLable acknowledgement trio acknowledged_at / acknowledged_by_subject / acknowledge_reason. Every pre-Phase-3 row backfills to open on the ADD COLUMN.
Triage aggregate
The triage lifecycle is modelled as a separate read-side aggregate, TriagedIntegrityViolation (internal/identity/tenancy/triaged_integrity_violation.go), distinct from the ingest-time IntegrityViolation value object. The DECISION block on the aggregate pins the rationale: IntegrityViolation is an immutable, identity-free batch element on the write path, so bolting a mutable triage status onto it would conflate "what the agent reported" with "how an operator has triaged the persisted row". TriagedIntegrityViolation carries the violation's UUIDv7 identity, the denormalised owning DomainID, the persisted ingest kind, the triage status, the artifact_id, the detected_at (= reported_at), and the acknowledgement metadata.
The aggregate exposes one pure transition, Acknowledge(subject, reason, now):
- The current `status` MUST be `open`; acknowledging an already-terminal row wraps `tenancy.ErrIntegrityViolationNotOpen` (mapped to a 409 by the transport).
- `subject` MUST be non-empty after trimming.
- `reason` MUST be non-empty after trimming and at most 1024 characters (`integrityAcknowledgeReasonMaxLen`); both arms wrap `tenancy.ErrIntegrityViolationAcknowledgeReason`.
The transition is pure and returns a new aggregate value; persistence is the repository's concern.
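A pure transition with these three guards might be sketched as below. The `triaged` type, its field names, `demoNow`, and the `err*` variables are hypothetical stand-ins; `errSubject` in particular is invented for the sketch, since the text does not name the sentinel for a blank subject. Counting the 1024-character limit in runes rather than bytes is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
	"unicode/utf8"
)

const reasonMaxLen = 1024 // mirrors the documented 1024-character ceiling

var (
	errNotOpen = errors.New("integrity violation: not open")
	errReason  = errors.New("integrity violation: acknowledge reason invalid")
	errSubject = errors.New("integrity violation: acknowledge subject empty") // invented for this sketch
)

// demoNow is a fixed instant so the example is deterministic.
var demoNow = time.Date(2026, 5, 28, 10, 15, 30, 0, time.UTC)

type triaged struct {
	Status                string
	AcknowledgedAt        time.Time
	AcknowledgedBySubject string
	AcknowledgeReason     string
}

// acknowledge has a value receiver: it returns a new value and never mutates the original.
func (t triaged) acknowledge(subject, reason string, now time.Time) (triaged, error) {
	if t.Status != "open" {
		return t, fmt.Errorf("acknowledge: %w", errNotOpen)
	}
	subject = strings.TrimSpace(subject)
	if subject == "" {
		return t, fmt.Errorf("acknowledge: %w", errSubject)
	}
	reason = strings.TrimSpace(reason)
	if reason == "" || utf8.RuneCountInString(reason) > reasonMaxLen {
		return t, fmt.Errorf("acknowledge: %w", errReason)
	}
	t.Status = "acknowledged"
	t.AcknowledgedAt = now
	t.AcknowledgedBySubject = subject
	t.AcknowledgeReason = reason
	return t, nil
}

func main() {
	open := triaged{Status: "open"}
	acked, err := open.acknowledge("operator:alice", "  hook rebuilt by release pipeline ", demoNow)
	fmt.Println(open.Status, acked.Status, err)

	_, err = acked.acknowledge("operator:alice", "second attempt", demoNow)
	fmt.Println(errors.Is(err, errNotOpen))
}
```

The original value keeps its `open` status after the call, which is the property that lets the repository decide separately whether and how to persist the new value.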
Kind vocabularies — ingest versus wire
The triage surface exposes a shorter operator-facing wire kind (binary / hook / host_key) distinct from the snake_case ingest literal the persisted kind column carries (binary_checksum / hook_checksum / ssh_host_key). The single documented DB→wire mapping lives in tenancy.WireKindFromIngestKind; the read service maps the inbound wire kind filter back onto the ingest literal before binding the query so the SQL predicate always compares against the stored value.
| Ingest literal (`kind` column) | Operator wire kind |
|---|---|
| `binary_checksum` | `binary` |
| `hook_checksum` | `hook` |
| `ssh_host_key` | `host_key` |
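The table translates directly into a pair of lookups. The sketch below is illustrative: `wireKindFromIngestKind` imitates the role of `tenancy.WireKindFromIngestKind`, whose real signature is not shown here, and the inverse `ingestKindFromWireKind` is a hypothetical name for the wire-to-ingest mapping the read service applies to the inbound filter.

```go
package main

import "fmt"

// ingestToWire is the single source of truth; the inverse is derived from it.
var ingestToWire = map[string]string{
	"binary_checksum": "binary",
	"hook_checksum":   "hook",
	"ssh_host_key":    "host_key",
}

// wireKindFromIngestKind maps a stored kind literal to its operator-facing form.
func wireKindFromIngestKind(ingest string) (string, bool) {
	wire, ok := ingestToWire[ingest]
	return wire, ok
}

// ingestKindFromWireKind maps an inbound wire filter back onto the stored literal.
func ingestKindFromWireKind(wire string) (string, bool) {
	for ingest, w := range ingestToWire {
		if w == wire {
			return ingest, true
		}
	}
	return "", false
}

func main() {
	fmt.Println(wireKindFromIngestKind("ssh_host_key"))
	fmt.Println(ingestKindFromWireKind("hook"))
	fmt.Println(ingestKindFromWireKind("ssh_host_key")) // an ingest literal is not a valid wire kind
}
```

Deriving the inverse from one map keeps the two directions from drifting apart when a kind is added.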
Read service
The application service IntegrityReadService (internal/identity/tenancy/services/integrity_read_service.go) composes a persistence port (IntegrityViolationReadRepo, satisfied by *repo.NodeIntegrityViolationRepo) and an audit sink:
- `List` validates and maps the optional `kind` (wire→ingest) and `status` filters, clamps `limit` to `[1, 200]` (default 50), forwards the optional `domain_id` / `project_id` / `node_id` filters and the raw keyset cursor to the repo, and maps every returned row's ingest `kind` back onto the wire form. Per-row visibility filtering (the per-Domain authz check) lives in the transport layer, not the service.
- `Acknowledge` trims and rejects a blank reason and a zero id before the repo call, delegates the transition, and emits an audit row with the stable relation `integrity_violation.acknowledge` on success. No audit row is emitted on a failed transition.
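The limit handling in `List` can be read as the small function below. This is a sketch under an assumption the text does not spell out: an unset limit is taken to be the zero value and receives the default of 50, while any other out-of-range value is clamped to the nearest bound. The `clampLimit` name and constants are hypothetical.

```go
package main

import "fmt"

const (
	defaultLimit = 50
	minLimit     = 1
	maxLimit     = 200
)

// clampLimit applies the default to an unset (zero) limit and clamps the rest to [1, 200].
func clampLimit(limit int) int {
	switch {
	case limit == 0:
		return defaultLimit
	case limit < minLimit:
		return minLimit
	case limit > maxLimit:
		return maxLimit
	default:
		return limit
	}
}

func main() {
	for _, in := range []int{0, -5, 1, 75, 200, 10000} {
		fmt.Println(in, "->", clampLimit(in))
	}
}
```

Clamping rather than rejecting keeps an over-eager client working while still bounding the page size the repository has to fetch.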
Cursor value object
The list pagination cursor is the raw keyset value `{ReportedAt, ID}` for the `(reported_at DESC, id DESC)` ordering, with an `IsEmpty()` first-page signal. The service and repo traffic in this plain value object only; the HMAC signing and per-(caller, pepper) binding the OpenAPI `next_cursor` promises are a transport concern owned by the HTTP layer, which wraps the raw cursor in the signed envelope on the way out and unwraps it on the way in. The empty cursor maps to a max-sentinel keyset bound (a far-future timestamp + the all-ones UUID) so the `(reported_at, id) < (sentinel)` comparison returns every row under the DESC ordering.
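The keyset comparison and the empty-cursor sentinel can be modelled in a few lines. The `cursor` type, `bound`, `before`, and the `at` helper are hypothetical, and the year-9999 sentinel timestamp is an assumption; the text only says "a far-future timestamp". The `before` function imitates the SQL row-value comparison `(reported_at, id) < (cursor)`, comparing ids bytewise.

```go
package main

import (
	"bytes"
	"fmt"
	"time"
)

// cursor is the raw keyset value: the last row of the previous page.
type cursor struct {
	ReportedAt time.Time
	ID         [16]byte
}

// IsEmpty is the first-page signal.
func (c cursor) IsEmpty() bool {
	return c.ReportedAt.IsZero() && c.ID == [16]byte{}
}

// bound maps the empty cursor to a max sentinel so every stored row compares below it.
func (c cursor) bound() cursor {
	if !c.IsEmpty() {
		return c
	}
	var id [16]byte
	for i := range id {
		id[i] = 0xff // all-ones UUID
	}
	// Far-future timestamp; the exact value is an assumption of this sketch.
	return cursor{ReportedAt: time.Date(9999, 12, 31, 23, 59, 59, 0, time.UTC), ID: id}
}

// before reports whether (reportedAt, id) < (b.ReportedAt, b.ID) as a row-value comparison.
func before(reportedAt time.Time, id [16]byte, b cursor) bool {
	if !reportedAt.Equal(b.ReportedAt) {
		return reportedAt.Before(b.ReportedAt)
	}
	return bytes.Compare(id[:], b.ID[:]) < 0
}

// at builds a UTC instant from Unix seconds (demo helper).
func at(unixSec int64) time.Time {
	return time.Unix(unixSec, 0).UTC()
}

func main() {
	var lo, hi [16]byte
	lo[15], hi[15] = 1, 2

	first := cursor{}.bound()
	fmt.Println(before(at(100), hi, first)) // first page: every row qualifies

	next := cursor{ReportedAt: at(100), ID: hi}
	fmt.Println(before(at(100), lo, next)) // same instant, smaller id: on the next page
	fmt.Println(before(at(100), hi, next)) // the cursor row itself is excluded
	fmt.Println(before(at(101), lo, next)) // newer rows were already served
}
```

The strict comparison is what prevents the cursor row from being served twice, and the id tiebreak is what keeps pages stable when several rows share one `reported_at` instant.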
Out of scope
The integrity ingest surface deliberately does NOT correlate, does NOT compare, and does NOT gate. The downstream concerns below are owned by named follow-up stories; this slice is the producer side of the data their consumer arms will read.
- Operator-facing HTTP transport, dashboard, and alerter UI: the triage domain aggregate, the keyset read service, and the Acknowledge transition (see Triage lifecycle and operator read service) are landed; the HTTP transport that signs/binds the pagination cursor and maps the triage sentinels onto Problem codes, the dashboard view of "what integrity violations have been reported?", the per-Domain alerter routing, and the remediation workflow remain owned by sibling Phase-3 slices. The agent ingest surface stays write-only on the agent path.
- Control-plane comparison against the Artifact Registry: owns the canonical catalogue of agent-binary checksums and hook-payload checksums. A future cross-check arm will correlate a freshly-PUT capability manifest's declared checksums (via the capability ingest surface) against the registry and raise a divergence alert through this same `integrity_alert` event. The integrity ingest surface intentionally does NOT perform this lookup at ingest time; the registry is a separate aggregate with its own lifecycle and the cross-check belongs in the registry's consumer arm.
- Dispatch gating on outstanding integrity alerts: owns the policy decision of "should this Node still receive dispatched policy projections / hook commands while an integrity alert is outstanding?". A future dispatch gate will read the most-recent unresolved `integrity_alert` event and pause dispatch per-`node_id` until the operator marks the alert acknowledged or remediated. The integrity ingest surface intentionally does NOT make this policy decision; its job is to record the observation and emit the alert, and the dispatch gate is a separate consumer of the same outbox event.
Cross-references
- `../../../internal/identity/tenancy/integrity_violation.go`: `IntegrityViolation` value object, the closed Kind / DetectedBy enumerations, `IntegrityViolationBatch`, and the `IntegrityViolationsMaxBatch` cap.
- `../../../internal/identity/tenancy/node.go`: `Node.RecordIntegrityViolations` receiver-side ingest that enforces the batch-level invariants and stamps the quartet onto the returned batch.
- `../../../internal/identity/tenancy/events/events.go`: `TypeIntegrityAlert` discriminator, `IntegrityAlert` struct, and `NewIntegrityAlert` constructor.
- `../../../internal/identity/tenancy/repo/node_integrity_violation_repo.go`: single-transaction `RecordIntegrityViolations` with per-row INSERT loop and outbox append, plus the Phase-3 `NodeIntegrityViolationRepo` read adapter (`List` keyset page + `Acknowledge` transition) and the raw keyset cursor value object.
- `../../../internal/identity/tenancy/triaged_integrity_violation.go`: the `TriagedIntegrityViolation` read-side aggregate, the `open -> acknowledged -> resolved` status set, the `Acknowledge` transition, and the `WireKindFromIngestKind` DB→wire kind mapping.
- `../../../internal/identity/tenancy/services/integrity_read_service.go`: the `IntegrityReadService` application service (`List` with the filter/cursor/clamp contract, `Acknowledge` with audit emission) and its `IntegrityViolationReadRepo` persistence port.
- `../../../internal/platform/db/migrations/0040_node_integrity_violation_triage.sql`: the triage columns (`status` + acknowledgement trio) added to `plexsphere.node_integrity_violation`.
- `../../../internal/transport/http/v1/handlers/integrity_violations.go`: `POST /v1/nodes/{id}/integrity-violations` handler body, the eight-step gate ordering, and the exhaustive Problem code dispatch.
- `../../../internal/transport/http/v1/handlers/integrity_violations_deps.go`: transport ports, transport sentinels, `IntegrityViolationsInput` / `IntegrityViolationsResult`, body cap, and audit relation constants.
- `../../../internal/transport/http/v1/handlers/integrity_violations_dispatch.go`: fail-closed `501 integrity_violations_not_provisioned` shim while ports are unwired.
- `../../../internal/platform/db/migrations/0036_node_integrity_violations.sql`: `plexsphere.node_integrity_violation` table, the `(node_id, reported_at DESC, id DESC)` index, and the DROP-on-Down DECISION block.
- `../../../api/openapi/plexsphere-v1.yaml`: `PostNodeIntegrityViolations` operation, `IntegrityViolationsRequest` / `IntegrityViolationRequest` / `IntegrityViolationsResponse` schemas, and the exhaustive `code` enum on every Problem response.
- `./capabilities.md`: the sibling Capability Manifest Ingest surface whose declared checksums and host-key fingerprint the on-Node detectors compare against to raise an `IntegrityViolation`.
- `../identity/tenancy.md`: the Identity & Tenancy bounded-context reference the Node aggregate belongs to.