
Audit log storage topology

This page covers the two persistence backends the Platform Audit Log writes to and the deterministic byte projection the chain hash is computed over. For the chain state machine itself see Hash chain and residency model; for the entry point and ubiquitous language see the index.

Two-store split — Postgres primary, object-store mirror

The Platform Audit Log is split deliberately across two persistence backends, each chosen for the failure mode it shoulders. The README's storage topology table pins the split as a deployment commitment; this section reproduces the split with code anchors so a reader can navigate from the diagram into the SQL and back.

   ┌────────────────────────────────────────────────────────────────┐
   │  Emitters (FROZEN ports — Sink.Record(ctx, Entry))             │
   │   bootstraptokens/audit  signing/audit_middleware  authz       │
   │            │                    │                    │         │
   │            └────────── audit.Sink.Record ────────────┘         │
   │                            │                                    │
   │            ┌───────────────▼───────────────────┐               │
   │            │  audit.Service (hash-chained)      │              │
   │            │   · DomainResolver  · Pepper       │              │
   │            │   · canonical.Canonical            │              │
   │            │   · sha256 chain · advisory lock   │              │
   │            └───────────────┬───────────────────┘               │
   │                            │ AppendAuditEntry (sqlc)            │
   │            ┌───────────────▼───────────────────┐               │
   │            │  Postgres (primary, hot)           │              │
   │            │   plexsphere.audit_log_entry       │              │
   │            │   plexsphere.audit_log_chain_head  │              │
   │            │   plexsphere.audit_subject_pii     │              │
   │            │   plexsphere.audit_tamper_quarantine                │
   │            └────────┬───────────────┬───────────┘              │
   │                     │ pull          │ verify                    │
   │       audit.archiver           audit.ReconcileChain (/readyz)   │
   │                     │                                            │
   │            ┌────────▼────────┐                                  │
   │            │ blobstore (S3)  │ audit/<domain>/<seq:020d>.json.zst │
   │            │ 7-yr cold       │                                  │
   │            └─────────────────┘                                  │
   └────────────────────────────────────────────────────────────────┘

| Store | Backend | Role | Schema / layout |
| --- | --- | --- | --- |
| Primary (hot) | PostgreSQL | Authoritative chain. Every append is a single transaction under a per-Domain advisory lock; reads (ListAuditEntries, GetAuditEntry, VerifyAuditChain) hit this store. The chain remains queryable for the full retention window. | 0011_audit_log.sql: four tables (audit_log_entry, audit_log_chain_head, audit_subject_pii, audit_tamper_quarantine). |
| Mirror (cold) | S3-compatible object store (SeaweedFS self-hosted, AWS S3 SaaS) | Long-term retention. The archiver drains archived_at IS NULL rows, zstd-compresses the canonical-JSON projection, and uploads to audit/<domain_id>/<seq:020d>.json.zst. Used by the audit-archive-restore Chainsaw suite to recover a dropped seq window. | internal/audit/archiver/upload.go: archiveRow JSON projection + zstd compressor + per-Domain bucket key. |
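
The per-Domain advisory lock on the primary can be pictured with a small Go sketch. It is an illustration only: the FNV-1a key derivation, the names domainLockKey and appendLockSQL, and the choice of pg_advisory_xact_lock are assumptions, not the service's actual code. It shows how a Domain identifier could be mapped to the bigint key a PostgreSQL advisory lock expects, so appends for one Domain serialise while other Domains proceed independently.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// appendLockSQL (hypothetical name) takes a transaction-scoped advisory lock.
// PostgreSQL releases it automatically at COMMIT or ROLLBACK, which matches
// the "single transaction" shape described in the table.
const appendLockSQL = `SELECT pg_advisory_xact_lock($1)`

// domainLockKey maps a Domain identifier to a 64-bit lock key.
// ASSUMPTION: FNV-1a is used here only for illustration; the real service
// may derive its key differently.
func domainLockKey(domainID string) int64 {
	h := fnv.New64a()
	h.Write([]byte(domainID)) // hash.Hash.Write never returns an error
	return int64(h.Sum64())
}

func main() {
	for _, d := range []string{"domain-a", "domain-b"} {
		fmt.Printf("%s: %s with key %d\n", d, appendLockSQL, domainLockKey(d))
	}
}
```

Inside the append transaction, the lock statement would run first, followed by the chain-head read and the insert, so two writers to the same Domain can never observe the same head.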

The split is not a hot/cold tier with a TTL on the hot side: a chain row never leaves PostgreSQL during the retention window. The mirror is a second copy, written by the archiver alongside the primary rather than in place of it, that gives the operator a recoverable snapshot if the Postgres cluster suffers catastrophic loss. The Chainsaw audit-archive-restore suite proves the round trip end to end against a real SeaweedFS testcontainer.
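
The mirror's key layout, audit/<domain_id>/<seq:020d>.json.zst, can be sketched in a few lines of Go. The function name archiveKey and the uint64 type for seq are assumptions made for this example; only the key pattern comes from the topology table.

```go
package main

import "fmt"

// archiveKey builds the object key audit/<domain_id>/<seq:020d>.json.zst.
// ASSUMPTION: seq is modelled as uint64; the real column type may differ.
func archiveKey(domainID string, seq uint64) string {
	// %020d zero-pads to 20 digits, which is wide enough for any uint64.
	return fmt.Sprintf("audit/%s/%020d.json.zst", domainID, seq)
}

func main() {
	fmt.Println(archiveKey("dom-1", 9))
	fmt.Println(archiveKey("dom-1", 10))
}
```

Zero-padding to a fixed width means a lexicographic listing of a Domain's prefix returns objects in seq order, which is convenient when a restore has to locate a contiguous seq window.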

Loki carries the node-side audit stream (POST /v1/nodes/{id}/audit) entirely separately. The two streams MUST NOT silently merge. The depguard rule no-node-audit-on-platform-chain in .golangci.yml refuses any import of internal/audit/repo from internal/observability/**, and the workspace test tests/workspace/depguard_audit_isolation_test.go backs the rule at go test time, so a future contributor cannot silently route the node-side stream through the operator-action chain.
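
A test-time backstop for an import ban of this kind can be sketched with the standard go/parser package. This is not the contents of depguard_audit_isolation_test.go: the function name forbiddenImports, the module path example.com/plexsphere, and the suffix-matching rule are all assumptions for illustration. The sketch parses only the import block of a source file and reports any import path ending in the forbidden package.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbiddenImports returns every import path in src that ends with
// forbiddenSuffix. Only the import declarations are parsed.
func forbiddenImports(filename, src, forbiddenSuffix string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var hits []string
	for _, imp := range f.Imports {
		// imp.Path.Value is the quoted literal, e.g. "\"context\"".
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		if strings.HasSuffix(path, forbiddenSuffix) {
			hits = append(hits, path)
		}
	}
	return hits, nil
}

func main() {
	// Hypothetical file under internal/observability/ that breaks the rule.
	src := `package observability

import (
	"context"
	"example.com/plexsphere/internal/audit/repo"
)
`
	hits, err := forbiddenImports("node_audit.go", src, "/internal/audit/repo")
	if err != nil {
		panic(err)
	}
	fmt.Println("forbidden imports:", hits)
}
```

A real workspace test would walk every file under internal/observability/ and fail if any hit is reported; the linter rule catches the same violation earlier, at lint time.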

Canonical-byte encoder pin

The chain hash is computed over canonical.Canonical(entry), a deterministic length-prefixed binary form whose magic prefix is PXA1 (the four ASCII bytes 'P','X','A','1' — see internal/audit/chain/canonical.go). The encoder is pinned: any drift retroactively invalidates every downstream row's entry_hash. Two contracts protect the pin:

  1. Golden corpus. Every encoder change must round-trip byte-for-byte against internal/audit/chain/testdata/canonical/*.golden. The current corpus covers genesis_minimal, full_fields, empty_slices, and unicode_relation cases; new cases are added alongside their golden file. The test canonical_diff_test.go diffs the live encoder output against each golden.
  2. Versioned magic. A future encoder revision MUST bump the magic prefix (e.g. PXA2) so re-encoded rows are not byte-equal to legacy ones. The verifier uses the magic to refuse a cross-version mix in the same chain segment.
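
To make the two contracts concrete, here is a minimal sketch of a length-prefixed encoding with a magic prefix, plus a verifier check that refuses mixed magics. Only the PXA1 bytes come from this page. The entry type, the 4-byte big-endian length prefixes, and the names encode and checkSegment are assumptions; the real layout lives in internal/audit/chain/canonical.go and may differ.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// magicV1 is the four-byte prefix named in the text.
var magicV1 = []byte("PXA1")

// entry is a hypothetical two-field stand-in for the real audit entry.
type entry struct {
	Actor     string
	Relations []string
}

// putUint32 appends n as 4 big-endian bytes (an assumed width and byte order).
func putUint32(buf *bytes.Buffer, n uint32) {
	var b [4]byte
	binary.BigEndian.PutUint32(b[:], n)
	buf.Write(b[:])
}

// encode emits magic, then Actor, then a count followed by each relation.
// Every string is preceded by its byte length, so ["ab","c"] and ["a","bc"]
// can never collapse into the same bytes.
func encode(e entry) []byte {
	var buf bytes.Buffer
	buf.Write(magicV1)
	putUint32(&buf, uint32(len(e.Actor)))
	buf.WriteString(e.Actor)
	putUint32(&buf, uint32(len(e.Relations)))
	for _, r := range e.Relations {
		putUint32(&buf, uint32(len(r)))
		buf.WriteString(r)
	}
	return buf.Bytes()
}

var errMixedVersions = errors.New("chain segment mixes encoder versions")

// checkSegment refuses a segment whose rows do not all carry the magic of
// the first row.
func checkSegment(rows [][]byte) error {
	if len(rows) == 0 {
		return nil
	}
	if len(rows[0]) < 4 {
		return errMixedVersions
	}
	want := rows[0][:4]
	for _, row := range rows {
		if len(row) < 4 || !bytes.Equal(row[:4], want) {
			return errMixedVersions
		}
	}
	return nil
}

func main() {
	v1 := encode(entry{Actor: "op", Relations: []string{"ab", "c"}})
	v2 := append([]byte("PXA2"), v1[4:]...) // simulated future revision
	fmt.Printf("%x\n", v1)
	fmt.Println(checkSegment([][]byte{v1, v1}))
	fmt.Println(checkSegment([][]byte{v1, v2}))
}
```

A golden-file test in this style would compare encode's output byte-for-byte against a checked-in file, so any change to field order, width, or byte order fails loudly instead of silently shifting every later hash.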

Field-level limits (maxStringBytes = 64 KiB, maxSliceElements = 1024, maxCaveatNameLen = 64) are defence in depth: oversized inputs are rejected at encode time so they never reach storage. These ceilings are exercised by the fuzz harness at internal/audit/chain/chain_fuzz_test.go, which CI runs for ≥ 30 s on every PR.
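
Encode-time rejection can be sketched as a validation pass that runs before any bytes are produced. The three constants mirror the limits quoted above; everything else is assumed for illustration, including the input type, the validate function, the errOversized sentinel, the choice to measure lengths in bytes, and the choice to treat each limit as an inclusive maximum.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	maxStringBytes   = 64 * 1024 // 64 KiB
	maxSliceElements = 1024
	maxCaveatNameLen = 64
)

var errOversized = errors.New("canonical: field exceeds encoder limit")

// input is a hypothetical subset of the fields an encoder might see.
type input struct {
	Subject     string
	Relations   []string
	CaveatNames []string
}

// validate rejects oversized inputs so they never reach the encoder or storage.
// ASSUMPTION: limits are inclusive and lengths are counted in bytes.
func validate(in input) error {
	if len(in.Subject) > maxStringBytes {
		return fmt.Errorf("subject is %d bytes: %w", len(in.Subject), errOversized)
	}
	if len(in.Relations) > maxSliceElements {
		return fmt.Errorf("relations has %d elements: %w", len(in.Relations), errOversized)
	}
	for i, r := range in.Relations {
		if len(r) > maxStringBytes {
			return fmt.Errorf("relations[%d] is %d bytes: %w", i, len(r), errOversized)
		}
	}
	for i, name := range in.CaveatNames {
		if len(name) > maxCaveatNameLen {
			return fmt.Errorf("caveat name %d is %d bytes: %w", i, len(name), errOversized)
		}
	}
	return nil
}

func main() {
	ok := input{Subject: "user:alice", Relations: []string{"viewer"}}
	tooBig := input{Subject: string(make([]byte, maxStringBytes+1))}
	fmt.Println(validate(ok))
	fmt.Println(errors.Is(validate(tooBig), errOversized))
}
```

Wrapping a single sentinel error lets callers test for the whole class of rejection with errors.Is while the message still names the offending field.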